Lab 4-2: Caesar Cipher - Encrypting and Decrypting¶
Note: Part of this lab came from Al Sweigart’s great book, Hacking Secret Ciphers with Python: A beginner’s Guide to cryptography and computer programming with Python, available online here at Invent With Python, among his other works. Feel free to check them out if they interest you!
The Caesar Cipher¶
The Caesar Cipher, named after Julius Caesar of Ancient Rome, is a type of substitution cipher where each letter of the original (plaintext) message is substituted with another letter.
Encrypting with the Caesar Cipher¶
But how do we decide what letter is replaced by what? That’s where the key comes into play. Found in almost every encryption algorithm, the key determines how the data is encrypted.
In the Caesar cipher, the key is a number from 0 to 25, because there are 26 letters in the alphabet. This means that for any given message, there are 26 different ways we can encrpyt the message.
For each letter, the key determines which letter is replacing the current letter, by counting down the alphabet. In the following example, let’s say we wanted to encrypt the letter B
with a key of 3
, we would find the 3rd letter that appears after B
- which is C, D, then finally E
.
In the Caesar cipher with a key of 3, A becomes D, B becomes E, C becomes F, and so on...
This can easily be shown by lining up two alphabets, one on top of each other, with the second alphabet starting at the letter shifted 3 letters down. This means that the bottom row should be shifted to the left.
ABCDEFGHIJKLMNOPQRSTUVWXYZ
DEFGHIJKLMNOPQRSTUVWXYZABC
You might notice that the second row starts has the letters ABC
right after Z
. This is because when you are shifting down the alphabet, if you reach the end (Z), it will loop around to the beginning (A).
This type of visual makes it much more clear what letter should become what. If we wanted to encrypt the word BILLY
, we would simply take each unique plaintext letter (top row) and find its matching ciphertext letter (bottom row), or:
A [B] CDEFGH [I] JK [L] MNOPQRSTUVWX [Y] Z
D [E] FGHIJK [L] MN [O] PQRSTUVWXYZA [B] C
So we clearly see that:
B
becomesE
I
becomesL
L
becomesO
Y
becomesB
(Note that we only have to find unique letters, like the L
because each L
will always turn into the same letter)
So the final ciphertext is ELOOB
. Note that when we reach the end of the alphabet, we keep counting starting with A, B, etc...
Decrypting with the Caesar Cipher¶
Decrypting works in a very similar way - except this time, instead of counting “down” the alphabet, you count “up”! As an example, let’s try to decrypt ELOOB
using a key of 3
- because we know the result should be our original plaintext, BILLY
. Let’s start by aligning our alphabets:
ABCDEFGHIJKLMNOPQRSTUVWXYZ
XYZABCDEFGHIJKLMNOPQRSTUVW
Notice how this time, instead of shifting the bottom row to the left, now we are shifting it to the right. We can then find our ciphertext letters (top row) E,L,O,B
and find their corresponding plaintext letters (bottom row).
A [B] CD [E] FGHIJK [L] MN [O] PQRSTUVWXYZ
X [Y] ZA [B] CDEFGH [I] JK [L] MNOPQRSTUVW
We can now see that:
E
becomesB
L
becomesI
O
becomesL
B
becomesY
Which gives us our original plaintext, BILLY
.
Next, we’re going to learn about a python implementation of the Caesar cipher.
Encryption Walkthrough¶
To begin, create a file called FILN_caesar.py
, where FILN is your first initial and last name, no space.
This lab will work a little differently than previous ones. The purpose of this lab is to explain an algorithm, walking you through the algorithm, as well as highlighting thought processes behind good code design. Please pay special attention and read through all the text, as it does provide some strategies and insight into how you can write code.
In this lab, you should be following along and building this step-by-step in your preferred python IDE. Changed lines in code will be highlighted.
Let’s start with our encryption function. In this function, we’ll be taking in key
and message
as arguments. The first thing I’m going to do is to convert the entire message to uppercase, which we can do using the .upper()
string method. Then I’m going to create a variable called alpha
that will hold the alphabet as a long string
1 2 3 | def encrypt(key, message):
message = mesage.upper()
alpha = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
|
Next, I know that I will have to operate on every letter in the message. I can write a for
loop that will loop through every letter in a string by using for var in str
, where var
will cycle through being every character in str
. I also know that since I am going through letter by letter, I will need to initialize an empty string as the result so I can build upon it.
1 2 3 4 5 6 | def encrypt(key, message):
message = mesage.upper()
alpha = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
result = ""
for letter in message:
|
One thing to note about the Caesar cipher is that it doesn’t really handle characters that aren’t letters, such as punctuation, or spaces. So we want to make sure that if we’re going to encrypt something, we’re only going to encrypt letters.
Let’s add in a conditional to handle that. The logic here is, “if it’s a letter, encrypt it. otherwise, just add it to the result (i.e. don’t change it)”. We can check whether a letter var
is in a string str
by using var in str
, which returns True
if var
is found in str
.
1 2 3 4 5 6 7 8 9 10 | def encrypt(key, message):
message = mesage.upper()
alpha = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
result = ""
for letter in message:
if letter in alpha: #if the letter is actually a letter
#encrypt it
else:
result = result + letter
|
Now we’re going to get into the actual encryption. This happens letter-by-letter since we are using the for loop to iterate through the message, which makes it simpler for us. The first thing we want to do is find out where is the letter in the alphabet? What index is the letter in? To do this, we can use the .find()
string method, which will give us the first occurrence of the letter.
1 2 3 4 5 6 7 8 9 10 11 | def encrypt(key, message):
message = mesage.upper()
alpha = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
result = ""
for letter in message:
if letter in alpha: #if the letter is actually a letter
#find the letter in the alphabet
letter_index = alpha.find(letter)
else:
result = result + letter
|
Now that we know the plaintext letter index... we have to find the corresponding ciphertext letter! While we could technically create a new alphabet string that has been shifted, it would be much easier to compute the new letter index using the key. We can “shift” the alphabet to the left by adding the value of the key to the index. In our old example where we turned B
into E
with a key of 3
, you can see here that the index of B
is 1
, and the index of E
is 4
, so indeed, +3
is the operation here. More generally, it’s just +key
.
1 2 3 4 5 6 7 8 9 10 11 | def encrypt(key, message):
message = mesage.upper()
alpha = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
result = ""
for letter in message:
if letter in alpha: #if the letter is actually a letter
#find the corresponding ciphertext letter in the alphabet
letter_index = alpha.find(letter) + key
else:
result = result + letter
|
Now that we have the index of the ciphertext letter, we now just need to add that letter to the result. Right now, letter_index
represents the position of the letter in the alphabet. We want to get the letter itself. We do this just by using letter_index
as an index to the string, alpha[letter_index]
. Then we add this to the result.
1 2 3 4 5 6 7 8 9 10 11 12 13 | def encrypt(key, message):
message = mesage.upper()
alpha = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
result = ""
for letter in message:
if letter in alpha: #if the letter is actually a letter
#find the corresponding ciphertext letter in the alphabet
letter_index = alpha.find(letter) + key
result = result + alpha[letter_index]
else:
result = result + letter
|
We did forget one thing, however. We forgot to handle loop-arounds! How do we handle what happens when we reach the end of the alphabet? Well, whenever we hit a index of 26, we want that index to becomes zero (especially since alpha
doesn’t have a 26th index!). If our index goes to 28, we want our index to actually be 2. The consistent thing here is that if our index is 26 or over, we want to subtract 26. While we could use an if statement for this, there exists a better way.
This is a prime example of where modulo would be helpful. We can use modulo to handle loop-arounds. This works because any number that is greater than 26 will be reduced to a number from 0 to 25, since the remainder of anything divided by 26 can only be from 0-25.
We can know that this works for loop-arounds by observing the following behavior:
- \(25 \% 26 = (0*26 + 25) \% 26 = 25\)
- \(26 \% 26 = (1*26 + 0) \% 26 = 0\)
- \(27 \% 26 = (1*26 + 1) \% 26 = 1\)
- \(28 \% 26 = (1*26 + 2) \% 26 = 2\)
- and so on...
We can then just use the following line to handle loop-arounds:
letter_index = (alpha.find(letter) + key) % 26
A good practice in programming, however, is to avoid hard-coding numbers if possible. We are working with the number 26 because there are 26 letters in the alphabet, but also because alpha
has a length of 26 (len(alpha) == 26
). Instead of using 26 in our algorithm, we should use len(alpha)
.
1 2 3 4 5 6 7 8 9 10 11 12 13 | def encrypt(key, message):
message = mesage.upper()
alpha = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
result = ""
for letter in message:
if letter in alpha: #if the letter is actually a letter
#find the corresponding ciphertext letter in the alphabet
letter_index = (alpha.find(letter) + key) % len(alpha)
result = result + alpha[letter_index]
else:
result = result + letter
|
Finally, what we’ve written will repeat for every character in the message. Let’s return our final string.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | def encrypt(key, message):
message = mesage.upper()
alpha = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
result = ""
for letter in message:
if letter in alpha: #if the letter is actually a letter
#find the corresponding ciphertext letter in the alphabet
letter_index = (alpha.find(letter) + key) % len(alpha)
result = result + alpha[letter_index]
else:
result = result + letter
return result
|
Decryption Walkthrough¶
Again, this is very similar to our encryption function, so let’s go ahead and rewrite our encryption function and call it “decrypt”.
As you’re rewriting the function, keep in mind what we are changing:
- Instead of
+ key
, we want to shift in the other direction, using- key
instead.
Our final program is then:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 | def encrypt(key, message):
message = mesage.upper()
alpha = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
result = ""
for letter in message:
if letter in alpha: #if the letter is actually a letter
#find the corresponding ciphertext letter in the alphabet
letter_index = (alpha.find(letter) + key) % len(alpha)
result = result + alpha[letter_index]
else:
result = result + letter
return result
def decrypt(key, message):
message = mesage.upper()
alpha = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
result = ""
for letter in message:
if letter in alpha: #if the letter is actually a letter
#find the corresponding ciphertext letter in the alphabet
letter_index = (alpha.find(letter) - key) % len(alpha)
result = result + alpha[letter_index]
else:
result = result + letter
return result
|
Testing our Program¶
Finally, we get to test our program. Let’s write a script to test our code. The following code should be added to the end of your program above.
def main():
word = "BILLY"
#encrypt "BILLY" with a key of 3
encrypted = encrypt(3,word)
print(encrypted) #should print "ELOOB"
#decrypt "ELOOB" with a key of 3
decrypted = decrypt(3,encrypted)
print(decrypted) #should print "BILLY"
if __name__ == "__main__":
main()
Run the program and compare its output to what it should be (it should print ELOOB
then BILLY
again). If it works, then great! We should test it one more time, but this time with a phrase/word with non-letter characters mixed in:
def main():
word = "HELLO WORLD?!"
#encrypt "HELLO WORLD?!" with a key of 20
encrypted = encrypt(20,word)
print(encrypted) #should print "BYFFI QILFX?!"
#decrypt "BYFFI QILFX?!" with a key of 20
decrypted = decrypt(20,encrypted)
print(decrypted) #should print "HELLO WORLD?!"
if __name__ == "__main__":
main
When we run this program, we should notice that when it prints its ciphertext form, the space, question mark, and exclamation point all stay in the same place. A good sign!
Refactoring Our Code¶
Per Wikipedia, refactoring code is the process of restructuring existing computer code without changing its external behavior. This is a very important process to learn because it makes our code easier to maintain, as you are about to see.
- There are two major flaws in our code right now:
- Our code repeats itself a lot.
- Our algorithm can be broken down into more dedicated functions
By looking at our final solution, we can notice that the for
loops in encrypt()
and decrypt()
are almost identical, except one adds the key and one subtracts the key (#1). We also notice that our process of encrypting/decrypting an entire word can be broken down further into encrypting/decrypting individual letters (as we do in our for
loop) (#2). These two flaws can be fixed by simply creating a new function.
Let’s create a new function, get_cipherletter()
.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 | def get_cipherletter(new_key, letter):
def encrypt(key, message):
message = mesage.upper()
alpha = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
result = ""
for letter in message:
if letter in alpha: #if the letter is actually a letter
#find the corresponding ciphertext letter in the alphabet
letter_index = (alpha.find(letter) + key) % len(alpha)
result = result + alpha[letter_index]
else:
result = result +
return result
def decrypt(key, message):
message = mesage.upper()
alpha = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
result = ""
for letter in message:
if letter in alpha: #if the letter is actually a letter
#find the corresponding ciphertext letter in the alphabet
letter_index = (alpha.find(letter) - key) % len(alpha)
result = result + alpha[letter_index]
else:
result = result + letter
return result
|
The purpose of this function will be to simply return the the new letter given a new key. We can pull code from our existing program to help us build this:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 | def get_cipherletter(new_key, letter):
#still need alpha to find letters
alpha = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
if letter in alpha:
return alpha[new_key]
else:
return letter
def encrypt(key, message):
message = mesage.upper()
alpha = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
result = ""
for letter in message:
if letter in alpha: #if the letter is actually a letter
#find the corresponding ciphertext letter in the alphabet
letter_index = (alpha.find(letter) + key) % len(alpha)
result = result + alpha[letter_index]
else:
result = result + letter
return result
def decrypt(key, message):
message = mesage.upper()
alpha = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
result = ""
for letter in message:
if letter in alpha: #if the letter is actually a letter
#find the corresponding ciphertext letter in the alphabet
letter_index = (alpha.find(letter) - key) % len(alpha)
result = result + alpha[letter_index]
else:
result = result + letter
return result
|
Next, we remove the bits of code from our encrypt
and decrypt
functions that are now redundant and replace it with code that uses the function:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 | def get_cipherletter(new_key, letter):
#still need alpha to find letters
alpha = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
if letter in alpha:
return alpha[new_key]
else:
return letter
def encrypt(key, message):
message = mesage.upper()
alpha = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
result = ""
for letter in message:
new_key = (alpha.find(letter) + key) % len(alpha)
result = result + get_cipherletter(new_key, letter)
return result
def decrypt(key, message):
message = mesage.upper()
alpha = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
result = ""
for letter in message:
new_key = (alpha.find(letter) - key) % len(alpha)
result = result + get_cipherletter(new_key, letter)
return result
|
“But wait!” - you might be saying - “What if we have a symbol? It won’t be found in alpha
when we use alpha.find()
!”
And you are correct! And that is fine. The .find()
function returns -1
, which is still a valid number, and the check for whether the letter exists in alpha
or not will still happen within the function. So it is still good. Good eye!
At this point, we are ready to move on to the next lab, where we will be crack the cipher through brute forcing!