Latest update Dec 19, 2022
One or two persons, level of ambition accordingly. No prior knowledge of Cryptography is required.
In this project you will be given a set encrypted texts (ciphertexts). Your task is to decrypt them to find the corresponding plaintext.
This is one of the oldest and most known ciphers. It is a substitution cipher where each letter of the alphabet is shifted some c steps to the right.
Example: say we want to encrypt the plaintext: "THIS IS NOT VERY SECURE". First, pick an integer 0<c<26 (assuming we have 26 letters in the alphabet). Next, shift all letters c steps to the right. Doing this for c=7 yields:
Plaintext: THIS IS NOT VERY SECURE Ciphertext: AOPZ PZ UVA CLYF ZLJBYL
To decrypt we simply shift the ciphertext c steps to the left and we get the plaintext.
Your task is to write a program in F# that decodes a text file that has been encrypted using a Caesar cipher. You can assume that only the ASCII letters (A-Z) are shifted in the encrypted text, and that the text contains only capital letters. (Hint: consider how you can find c. How many different c's are there? Is it feasible to use brute force?)
Obviously your solution must involve the user. Your program should offer the user to check proposed decodings of the text, and to pick the correct one.
Test your solution using the file TEXT_1.txt. In this file, the text has been encrypted by shifting letters only: no other characters are encrypted.
The security of the Caesar cipher in 1 can be increased by instead of simply shifting the alphabet c positions, each letter l is replaced by the letter p(l) according to the permutation p. (A permutation is a function from a set to itself that is 1-1.) This means that an encrypted letter e is decoded into p-1(e), where p-1 is the inverse of p.
Example: we substitute letters in the regular alphabet (plaintext alphabet) according to the permutation (ciphertext alphabet) specified by the table below:
Plaintext alphabet: ABCDEFGHIJKLMNOPQRSTUVWXYZ_ (_ is whitespace) Ciphertext alphabet: HCIJBDFXGSYEWUNL_PQRAKZMOTV
Using this ciphertext alphabet we encrypt the plaintext: "THIS IS A BIT MORE SECURE":
Plaintext: THIS IS A BIT MORE SECURE Ciphertext: RXGQVGQVHVCGRVWNPBVQBIAPB
The problem with this kind of encryption is that, although it scrambles the letters, it doesn't change the frequencies of which they appear. Therefore, if we analyze the frequency of each letter and compare it to the known letter frequency of the English language, we are given good hints as to the character mapping between the plaintext alphabet and the ciphertext alphabet. For instance, using knowledge about how often whitespace tends to appear in English texts, one could figure out from the above ciphertext that whitespace most likely is mapped to the letter 'V'.
Your task is to write a program that decrypts a text that has been encrypted using a mixed alphabet of the type described above. (Hint: search the web for information about the relative frequencies of letters in the English language).
Test your solution using the file TEXT_2.txt, which contains a cipher text that is encrypted using a mixed alphabet. In this file the text has been encrypted by permuting all characters in the file (not just the letters), including parentheses, periods, commas, etc.
Be aware: there can be random variations in the character frequencies in a given text, that make them deviate from them in a table. Also, frequency tables can be slightly different depending on what mix of texts they are based upon. Thus, one cannot expect a solution that is based just on execution frequencies to return a perfect decryption. Rather, one will have to work in an iterative manner like the following:
Thus, the software that provides your solution will work as a support for this process rather than providing the decrypted message fully automatically. In your report you should give a brief description how you worked your way to obtain the final clear text, and how your software supported this process.
The above is sufficient for one person. For two persons also implement:
Both the Caesar cipher in 1, and the mixed alphabet in 2 are vulnerable to simple frequency analysis. The Vigenère cipher tries to combat this by combining several Caesar ciphers. Recall the right shift offset c used in the Caesar cipher above. In the Vigenère cipher, we construct a 'key' K by selecting a sequence of these offsets. Let |K| be the key length (how many Caesar ciphers that are combined).
To encrypt plaintext we apply the first Caesar cipher to the first |K| characters. The next Caesar cipher to the next |K| characters, etc. After applying the last cipher the process restarts with the first.
Example: we select the key K = {4 1 7} of length 3. For each offset in the key, we construct a ciphertext alphabet by shifting the plaintext alphabet that many steps to the right.
K = {4 1 7}, |K| = 3
(Offset in-between parentheses)
Plaintext alphabet (0): ABCDEFGHIJKLMNOPQRSTUVWXYZ_ (_ is whitespace) Ciphertext alphabet (4): EFGHIJKLMNOPQRSTUVWXYZ_ABCD Ciphertext alphabet (1): BCDEFGHIJKLMNOPQRSTUVWXYZ_A Ciphertext alphabet (7): HIJKLMNOPQRSTUVWXYZ_ABCDEFG
Say we want to encrypt the plaintext: "THIS IS THE MOST SECURE YET", using the above key. The process would be:
Offsets: 444111777444111777444111777 Plaintext: THIS IS THE MOST SECURE YET Ciphertext: XLMTAJZG LIDNPT GZIGYSFAEL
Your task is to implement this cipher (not to break it). That is, implement the encrypt and decrypt functions. The input to the encryption function is a plaintext and a key. It outputs a ciphertext. The input to the decryption function is a ciphertext and a key. It outputs the plaintext.
Note: there are several variations of the Vigenère cipher. You are free to implement one of your own choice. However, you need to describe your reasoning (in the report) if you decide on another variation than the above.