Section 11.1 What is cryptography?

Cryptography is not just the science of making (and breaking) codes, as a dictionary might have it. It is the mathematical analysis of the tools of secrecy, from both the perspective of someone keeping a secret and that of the person trying to figure it out. Sometimes it is also called cryptology, while sometimes that term is reserved for a wider meaning.

There are two kinds of codes.

  • Ones intended to remain secret!
  • Ones encapsulating information in a convenient format.

Mathematicians use the word code to indicate that information is being stored, reserving the term cipher for a way to protect that information. So what we do in learning about this is some of each, though mostly ciphers.

Subsection 11.1.1 Encoding and Decoding

There are many ways to encode a message; the easiest one for us will be to simply represent each letter of the English alphabet by an integer from 1 to 26. It is just as easy to distinguish upper- and lowercase letters by using the integers from 1 to 52.

We'll use the following embedded cell to turn messages into numbers and vice versa. You encode a plaintext message (no spaces, in quotes, for our examples) and decode a positive integer.

Let's try to encode the letter “q”.
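A minimal Python sketch of such an encoder, assuming the convention A=1, ..., Z=26 with the letter in position i weighted by a power of 26. The function body is an illustration under that assumption, not necessarily what the embedded cell contains:

```python
def encode(message):
    """Turn a string of letters into a positive integer (assumed scheme).

    Each letter becomes a number from 1 to 26 (A=1, ..., Z=26; case is
    ignored), and the letter in position i is weighted by 26**i.
    """
    total = 0
    for i, ch in enumerate(message.upper()):
        if not "A" <= ch <= "Z":
            raise ValueError("only the letters A-Z are supported")
        total += (ord(ch) - ord("A") + 1) * 26**i
    return total


print(encode("q"))  # q is the 17th letter, so this prints 17
```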

Remark 11.1.1

Sage note:
If this cell doesn't work, then you may need to evaluate the previous one again. If anything on this sheet ever gives a NameError about a global name “encode”, you probably need to reevaluate some previous cells - most likely, the one with encode!

Decoding is similar.
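A matching decoder might look like the following sketch, again assuming A=1, ..., Z=26 with the letter in position i weighted by a power of 26. Because the digits run from 1 to 26 rather than 0 to 25, a remainder of 0 has to be read as Z:

```python
def decode(n):
    """Turn a positive integer back into letters (assumed scheme:
    A=1, ..., Z=26, with the letter in position i weighted by 26**i)."""
    if n <= 0:
        raise ValueError("decode expects a positive integer")
    letters = []
    while n > 0:
        digit = n % 26
        if digit == 0:  # a remainder of 0 stands for Z (= 26)
            digit = 26
        letters.append(chr(ord("A") + digit - 1))
        n = (n - digit) // 26
    return "".join(letters)


print(decode(17))  # prints Q
```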

This should be straightforward. Too straightforward, perhaps.

  • Notice that I didn't bother separating lower and uppercase letters.
  • In fact, no matter how complicated you get, with just a one-to-one correspondence, there are only a few possibilities for each letter. So if you know the human language in question, you can just start guessing which encrypted number stands for its most common letter.
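The guessing strategy can be sketched in a few lines of Python. The sample message below is hypothetical, chosen only because it contains many Es, and the letter numbering A=1, ..., Z=26 is the one assumed throughout:

```python
from collections import Counter

message = "seethebeesinthetrees"  # hypothetical sample message

# Letter-by-letter encoding, assuming A=1, ..., Z=26 and ignoring case.
numbers = [ord(ch) - ord("A") + 1 for ch in message.upper()]

# Count how often each number occurs; the most frequent one is a natural
# first guess for E, the most common letter in English.
counts = Counter(numbers)
most_common_number, occurrences = counts.most_common(1)[0]
print(most_common_number, occurrences)  # prints 5 8
```

Here the number 5 stands out at once, and 5 is indeed E.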

That means that, in practice, we need to do a few other things.

  • For instance, one thing that is commonly done is to make longer blocks of letters, and then turn those into numbers.
  • Presumably there are far too many possible three-letter (or longer) blocks of letters in English for anyone to decrypt things so easily.

For pairs, we represent the first letter as a number from 1 to 26 and the second letter as 26 times its letter number, and then add the two (think of it as base 26). Remember that A=1, B=2, etc., so that, for example, TH becomes \(20+26\cdot 8=228\).

Now compare the following two encodings of “The best day of the year” and see which one might be easier to figure out.

Whereas there are many 5s in the first one, which you could guess were Es, the second one has only one repeat (though knowing English, one might guess it was 'Th'). And indeed, it's important to point out we haven't encrypted yet, just encoded.
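The comparison can be reproduced with a short Python sketch. The helper name encode_blocks is hypothetical, and it assumes the base-26 convention described above (A=1, ..., Z=26, with the letter in position i of a block weighted by a power of 26):

```python
def encode_blocks(message, block_size):
    """Split a message into blocks and encode each block in base 26."""
    letters = message.upper()
    blocks = [letters[i:i + block_size]
              for i in range(0, len(letters), block_size)]
    return [
        sum((ord(ch) - ord("A") + 1) * 26**i for i, ch in enumerate(block))
        for block in blocks
    ]


message = "thebestdayoftheyear"
singles = encode_blocks(message, 1)
pairs = encode_blocks(message, 2)
print(singles)
# [20, 8, 5, 2, 5, 19, 20, 4, 1, 25, 15, 6, 20, 8, 5, 25, 5, 1, 18]
print(pairs)
# [228, 57, 499, 124, 651, 171, 228, 655, 31, 18]
```

Letter by letter, the number 5 shows up four times; in pairs, only 228 (TH) repeats.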

With three-letter blocks, there are then already \(26^3=17576\) possibilities.

One could use this to encode INT HEB EGI NNI GWA STH EWO RDX. In this case, an extra X fills out the last block of a famous quote.
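Padding and splitting into three-letter blocks might be sketched as follows. The function names and the sample message are hypothetical, and the encoding again assumes A=1, ..., Z=26 with base-26 weights inside each block:

```python
def pad_and_split(message, block_size=3, filler="X"):
    """Pad the message with a filler letter, then cut it into equal blocks."""
    letters = message.upper()
    shortfall = -len(letters) % block_size  # letters missing from the last block
    letters += filler * shortfall
    return [letters[i:i + block_size]
            for i in range(0, len(letters), block_size)]


def encode_block(block):
    """Encode one block in base 26 (assumed scheme: A=1, ..., Z=26)."""
    return sum((ord(ch) - ord("A") + 1) * 26**i for i, ch in enumerate(block))


blocks = pad_and_split("secretcodes")
print(blocks)                             # ['SEC', 'RET', 'COD', 'ESX']
print([encode_block(b) for b in blocks])  # [2177, 13668, 3097, 16723]
```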

To be fair, when filler of this type is used, it would more often be used in the middle to confuse things. In addition, one might recombine the message in various ways. We will, however, usually keep our whole message together as one item, since we want to understand the mathematical aspects most, rather than real cryptography.