Le chiffre indéchiffrable

(Apologies for the lateness of this post — recent bad weather was making my internet connection rather unreliable)

They say that the mark of understanding for any concept is one’s ability to teach that concept to others. On that note, this post is going to be me rambling about the workings of the Vigenère cipher, with notes and a couple of challenges for the reader. The rest of the post is under a ‘read more’ tag.

First, some background. The Vigenère cipher is based on the Caesar shift cipher, a mono-alphabetic substitution cipher. The Caesar shift is so called because it was used by Julius Caesar in his private correspondence. The working of the cipher is very simple: each letter of the plain text is replaced with a letter a certain number of places away from it in the alphabet. For example, in a Caesar shift of 3 places to the right, ‘a’ becomes ‘d’, ‘b’ becomes ‘e’, and so on. The shifted alphabet is the sole cipher alphabet used to encode the message (hence ‘mono-alphabetic’ — one alphabet is used — and ‘substitution’ — letters are replaced in the process of encoding). So, to continue the previous example, the full cipher alphabet for a right-shift of 3 is:

Plain: ABCDEFGHIJKLMNOPQRSTUVWXYZ

Cipher: DEFGHIJKLMNOPQRSTUVWXYZABC

This type of cipher is very vulnerable to frequency analysis; certain letters are more commonly used than others, so the letter which appears most often in a message encoded with a Caesar shift can be reasonably assumed to represent the letter ‘e’. The next-most-common letter is likely to represent ‘t’, and so on. Also worth looking out for are words of particular lengths; one-, two- or three-letter words are relatively uncommon in English,so identifying them can provide an easy way in to the cipher. Of course, this relies somewhat on the message being shorter than about 25-50 characters; given advancements in cryptanalysis, the Caesar shift is now so cryptographically feeble as to be of purely historical interest.

To illustrate what I have been saying, here is a passage encoded in a Caesar shift; attempt to decipher it (with paper and pencil before trying online decoders; this is intended as an intellectual challenge):

frzdugv glh pdqb wlphv ehiruh wkhlu ghdwkv tkh ydoldqw qhyhu wdvwh ri ghdwk exw rqfh ri doo wkh zrqghuv wkdw l bhw kdyh khdug lw vhhpv wr ph prvw vwudqjh wkdw phq vkrxog ihdu vhhlqj wkdw ghdwk d qhfhvvdub hqg zloo frph zkhq lw zloo frph

Post any solutions in the comments; I will post the actual solution in a couple of weeks or so.

Vigenère created his cipher by building on the work of others who had tried to create a stronger form of substitution cipher. The result, known for three centuries as le chiffre indéchiffrable (lit.: ‘the indecipherable figure’) was a quite fiendish poly-alphabetic substitution cipher. As you might have guessed, ‘poly-alphabetic’ means that more than one cipher alphabet is used to scramble a message. “But how do you decide which alphabet to use?”, I hear you cry. The answer is simple: you use a keyword. The keyword is written out above your plaintext, repeating until each letter of your plaintext has a corresponding key letter. You would then look to a Vigenère square, such as this…

Plain

a

b

c

d

e

f

g

h

i

j

k

l

m

n

o

p

q

r

s

t

u

v

w

x

y

z

1

B

C

D

E

F

G

H

I

J

K

L

M

N

O

P

Q

R

S

T

U

V

W

X

Y

Z

A

2

C

D

E

F

G

H

I

J

K

L

M

N

O

P

Q

R

S

T

U

V

W

X

Y

Z

A

B

3

D

E

F

G

H

I

J

K

L

M

N

O

P

Q

R

S

T

U

V

W

X

Y

Z

A

B

C

4

E

F

G

H

I

J

K

L

M

N

O

P

Q

R

S

T

U

V

W

X

Y

Z

A

B

C

D

5

F

G

H

I

J

K

L

M

N

O

P

Q

R

S

T

U

V

W

X

Y

Z

A

B

C

D

E

6

G

H

I

J

K

L

M

N

O

P

Q

R

S

T

U

V

W

X

Y

Z

A

B

C

D

E

F

7

H

I

J

K

L

M

N

O

P

Q

R

S

T

U

V

W

X

Y

Z

A

B

C

D

E

F

G

8

I

J

K

L

M

N

O

P

Q

R

S

T

U

V

W

X

Y

Z

A

B

C

D

E

F

G

H

9

J

K

L

M

N

O

P

Q

R

S

T

U

V

W

X

Y

Z

A

B

C

D

E

F

G

H

I

10

K

L

M

N

O

P

Q

R

S

T

U

V

W

X

Y

Z

A

B

C

D

E

F

G

H

I

J

11

L

M

N

O

P

Q

R

S

T

U

V

W

X

Y

Z

A

B

C

D

E

F

G

H

I

J

K

12

M

N

O

P

Q

R

S

T

U

V

W

X

Y

Z

A

B

C

D

E

F

G

H

I

J

K

L

13

N

O

P

Q

R

S

T

U

V

W

X

Y

Z

A

B

C

D

E

F

G

H

I

J

K

L

M

14

O

P

Q

R

S

T

U

V

W

X

Y

Z

A

B

C

D

E

F

G

H

I

J

K

L

M

N

15

P

Q

R

S

T

U

V

W

X

Y

Z

A

B

C

D

E

F

G

H

I

J

K

L

M

N

O

16

Q

R

S

T

U

V

W

X

Y

Z

A

B

C

D

E

F

G

H

I

J

K

L

M

N

O

P

17

R

S

T

U

V

W

X

Y

Z

A

B

C

D

E

F

G

H

I

J

K

L

M

N

O

P

Q

18

S

T

U

V

W

X

Y

Z

A

B

C

D

E

F

G

H

I

J

K

L

M

N

O

P

Q

R

19

T

U

V

W

X

Y

Z

A

B

C

D

E

F

G

H

I

J

K

L

M

N

O

P

Q

R

S

20

U

V

W

X

Y

Z

A

B

C

D

E

F

G

H

I

J

K

L

M

N

O

P

Q

R

S

T

21

V

W

X

Y

Z

A

B

C

D

E

F

G

H

I

J

K

L

M

N

O

P

Q

R

S

T

U

22

W

X

Y

Z

A

B

C

D

E

F

G

H

I

J

K

L

M

N

O

P

Q

R

S

T

U

V

23

X

Y

Z

A

B

C

D

E

F

G

H

I

J

K

L

M

N

O

P

Q

R

S

T

U

V

W

24

Y

Z

A

B

C

D

E

F

G

H

I

J

K

L

M

N

O

P

Q

R

S

T

U

V

W

X

25

Z

A

B

C

D

E

F

G

H

I

J

K

L

M

N

O

P

Q

R

S

T

U

V

W

X

Y

26

A

B

C

D

E

F

G

H

I

J

K

L

M

N

O

P

Q

R

S

T

U

V

W

X

Y

Z

You then find the first letter of your plaintext along the top, find its corresponding key letter in the first column, and find the letter where the row and column meet. This is the first letter of your ciphertext. You would follow the same procedure with the other letters of your plaintext until you had it fully enciphered.

A worked example: suppose you wanted to encipher the message ‘cryptography is fun’ in the Vigenère cipher using the keyword ‘proofread’. You would write out your message and the keyword thus (squared paper is very helpful):

P

R

O

O

F

R

E

A

D

P

R

O

O

F

R

E

A

C

R

Y

P

T

O

G

R

A

P

H

Y

I

S

F

U

N

You would then look for ‘c’ on the top row of your Vigenère square, and the row which has ‘p’ in the first space (the 15th row). You would trace down the column and along the row until they met, which in this case would give you the letter ‘r’ for your ciphertext. Follow the same procedure with plaintext ‘r’ and key letter ‘r’ to get ciphertext ‘i’. Continue using the same method for each letter of your message until you get the complete ciphertext:

RIMDYFKRDEYMWXWYN

To decipher the text, you would write the keyword over your ciphertext in the manner described above and run the encipherment procedure essentially in reverse — find the row that begins with the letter of the keyword that you are working on, run along the row to find the corresponding letter of ciphertext, then run up to the top of that column to find the plaintext letter.

The great advantage of the Vigenère cipher is that traditional frequency analysis is next to useless; each letter, depending on the length of the keyword and the message, could be encoded in as many different ways as there are unique letters in the keyword. In the worked example above, the keyword ‘proofread’ contains 7 unique letters: ‘p’, ‘r’, ‘o’, ‘f’, ‘e’, ‘a’, ‘d’ and has 9 letters in total. This means that a message in Vigenère cipher using ‘proofread’ for a keyword essentially uses 7 different Caesar shifts.

The mention of Caesar shifts might lead you to spot a weakness in the Vigenère cipher. Unless you use a keyword which is the same length as your message, it is likely that there will be clusters of repeated letters in your ciphertext. The distances between these clusters can indicate the likely length of your keyword, which in turn indicates how many Caesar shift cipher alphabets were used. Once the length of the keyword has been determined (for example, 9 letters), then the ciphertext can be divided up into separate collections of letters, taking for example, every 9th letter, with each collection starting at letters 1-9 of the ciphertext: the 1st, 10th, 19th … letters, the 2nd, 11th, 20th … letters, and so on. Traditional frequency analysis can then be applied to each individual collection of letters, giving the keyword and from there the plaintext message. This is called Kasiski examination, after Friedrich Kasiski, who published an account of lines of attack on poly-alphabetic substitution ciphers, including the Vigenère cipher, in 1863. (Historical note: Charles Babbage may well have come to the same conclusions a decade earlier, but this was around the time that the Crimean War broke out, so his work would have had national security applications which meant that he couldn’t publish).

As an interesting side note, the Vigenère cipher is the basis of one form of the provably unbreakable one-time pad cipher. The trick is to use as your ‘keyword’ a random string of letters (with as few repetitions as possible) which is the same length as your message (and, crucially, is only used for that message, hence ‘one-time’ pad). You would go through the substitution process as normal and then destroy that substitution string. Once the intended recipient of the message had deciphered it using a duplicate pad, they would then destroy their copy of that substitution string. This makes one-time pad messages a colossal pain to crack; opposing cryptanalysts would have to sort through all possible messages of (say) 23 characters in length, unable to tell which one is correct. There are other forms of one-time pad that involve modular arithmetic in the encoding/decoding process; I cannot say too much about these as I’m no great shakes at modular arithmetic.

Anyway, here’s a challenge to apply the knowledge gained from this post: a piece of text in the Vigenère cipher. The keyword is NOT the same length as the text. See if you can decipher it before I post the solution in a couple of weeks!

VVHPJTSDJOWMIOJOKGDPMHVKVWKEWSOXFHKIMWTOVVDXJVOFGWVCPIFC

VVHPPJSDJOWMIOJOQTWLFZWPGHKEUWVKXSLWZCIBUOQHZCIBUOQHZCIBU

OVPFSDSUVDPMVOFGOUITHWCJOOPIOJOASWHFOHRYWOPCSPEVOSEVGS

PQFWLFDSKESRJNMMOCFVMOHVONCQKHFSOPUUETGKSNZEIZCIBUOQHZCIBUOQHZCIBU

Resources used/further reading

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s