Play fair, Playfair

Spoiler warning: this post contains mid-to-major-level spoilers for Have His Carcase, both Dorothy L. Sayers’ original 1932 novel and the 1987 TV adaptation starring Edward Petherbridge and Harriet Walter.

For more cryptography, see below the cut.

The Playfair cipher is, I have to admit, one of my favourite ciphers (cryptographically feeble though it ultimately is). It’s fairly simple to use and very elegant.

Unlike the Vigenère cipher, which I wrote about last week, the Playfair cipher is a mono-alphabetic digraph substitution cipher. In layman’s terms, it works by substituting pairs of letters (digraphs) in the plaintext with pairs of letters taken from a single cipher alphabet. This particular cipher was invented in 1854 by the scientist and prolific inventor Charles Wheatstone, but it was promoted by Lyon Playfair (the first Baron Playfair) and hence bears his name.

The process of creating the cipher alphabet for a Playfair encryption is based around a 5×5 grid containing letters of the alphabet. A keyword — which does not have to actually be a word — is written into the grid, one square per letter, with repeated letters omitted. Once all of the keyword has been inserted, the rest of the alphabet is written, in order, into the blank spaces in the grid. There are varying protocols for getting 25 letters in the cipher alphabet; some cryptographers omit uncommon letters such as ‘z’ or ‘q’, while others count ‘i’ and ‘j’ as one letter. Personally, I prefer the latter method.

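To illustrate the above, here is a nicely long keyword: antidisestablishmentarianism. At 28 letters, it simply cannot fit into a 25-cell Playfair square unchanged. However, removing repeated letters (of which this word has its fair share) gives A N T I D S E B L H M R. Putting this string into a Playfair square, and filling the remaining cells with the unused letters in alphabetical order (with ‘I’ standing for both ‘i’ and ‘j’), gives us:

A N T I D
S E B L H
M R C F G
K O P Q U
V W X Y Z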

Obviously, keywords with more distinct letters are preferable, as they scramble more of the square and leave less of the alphabet in its usual order, which makes cryptanalysis that much more difficult (this is also the reason why I prefer the ‘i/j as one’ method).
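The construction described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming the ‘i/j as one’ convention; the function name build_square is my own invention for the example.

```python
import string

def build_square(keyword: str) -> list[str]:
    """Return the Playfair square as five 5-letter rows (I and J share a cell)."""
    seen = []
    # Keyword letters come first, then the rest of the alphabet in order.
    for ch in keyword.upper() + string.ascii_uppercase:
        if ch not in string.ascii_uppercase:
            continue  # skip spaces, digits, punctuation
        ch = "I" if ch == "J" else ch  # assumption: J is folded into I
        if ch not in seen:  # repeated letters are omitted
            seen.append(ch)
    letters = "".join(seen)  # always 25 letters once J is folded away
    return [letters[i:i + 5] for i in range(0, 25, 5)]

for row in build_square("antidisestablishmentarianism"):
    print(" ".join(row))
```

Running it with the keyword above prints the square row by row, starting with A N T I D.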

Now that we have a completed Playfair square, it’s time to explain how to use it. First, you take the text you wish to encipher, for example:

“The vulture’s maw/ Shall have his carcase, and the dogs his bones.” (from William Cowper’s translation of Homer’s Iliad)

You then ignore all punctuation and split the text into digraphs (pairs of letters). If you get a digraph consisting of two identical letters, put a different letter (such as ‘X’, ‘Q’, ‘V’, ‘Z’) after the first letter of that digraph and continue pairing. If the final result contains an odd number of letters, add dummy letters to get an even number. Applying this process to our example text gives us:
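The splitting step can be sketched as follows. The choice of ‘X’ as the dummy letter (with ‘Q’ as a fallback when the letter being padded is itself ‘X’) is an assumption made for this example, as is the function name to_digraphs; any of the dummy letters mentioned above would do.

```python
def to_digraphs(text: str, filler: str = "X") -> list[str]:
    """Split text into Playfair digraphs, padding doubled and trailing letters."""
    # Keep letters only, upper-cased, with J folded into I (assumed convention).
    letters = ["I" if c == "J" else c for c in text.upper() if "A" <= c <= "Z"]
    pairs = []
    i = 0
    while i < len(letters):
        first = letters[i]
        second = letters[i + 1] if i + 1 < len(letters) else None
        if second is None or second == first:
            # Doubled letter or lone final letter: pad with a dummy
            # and carry on pairing from the next letter.
            pad = filler if first != filler else "Q"
            pairs.append(first + pad)
            i += 1
        else:
            pairs.append(first + second)
            i += 2
    return pairs

quote = "The vulture's maw/ Shall have his carcase, and the dogs his bones."
print(" ".join(to_digraphs(quote)))
```

With ‘X’ as the dummy, the 51 letters of the quotation come out as 26 digraphs, the last of which is the final S plus the dummy.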


Now comes the encipherment. There are three ways to encipher a digraph with Playfair; the method used depends on the relative positions of the two letters of the digraph in the Playfair square. If the letters are at diagonally opposite corners of a rectangle, each letter is encoded with the letter in the corner opposite to it either horizontally or vertically. If the letters are in the same row, they are encoded with the letter one space to their left or right, wrapping to the other end of the row as necessary. If the letters are in the same column, they are encoded with the letter one space above or below them, wrapping to the top or bottom of the column as necessary. Mixing these shifting protocols within one message would certainly provide an obstacle to cryptanalysis, but it would also be quite confusing and time-consuming for the encoder to keep track of. I therefore prefer to stick to one set of protocols: horizontally opposite corner, right along the row, and down the column.

Applying these protocols to our plaintext, the first digraph we have is TH. These letters are at opposite corners of a rectangle within the square, and the letters horizontally opposite to them are D and B. The first digraph of our ciphertext is therefore DB. Moving along the text, we come to the digraph RE. These letters are in the same column of the square; the letter below R is O and the letter below E is R. This digraph is therefore encoded as OR. Further along, we have SH. These letters are on the same row, at opposite ends. The letter immediately to the right of S is E, while the letter to the right of H (wrapping around to the start of the row) is S. SH in plaintext therefore becomes ES. Applying the appropriate protocol to each digraph of our plaintext, the resulting ciphertext is:
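The three worked digraphs above can be checked with a short sketch. It hard-codes the square built from antidisestablishmentarianism and uses my preferred protocols (horizontally opposite corner, right along the row, down the column); the names SQUARE, POS and encipher_pair are made up for the example.

```python
# Square for the keyword "antidisestablishmentarianism", with I/J as one letter.
SQUARE = ["ANTID", "SEBLH", "MRCFG", "KOPQU", "VWXYZ"]
# Map each letter to its (row, column) position in the square.
POS = {ch: (r, c) for r, row in enumerate(SQUARE) for c, ch in enumerate(row)}

def encipher_pair(pair: str) -> str:
    """Encipher one digraph of two different letters."""
    (r1, c1), (r2, c2) = POS[pair[0]], POS[pair[1]]
    if r1 == r2:
        # Same row: one step to the right, wrapping round to the start.
        return SQUARE[r1][(c1 + 1) % 5] + SQUARE[r2][(c2 + 1) % 5]
    if c1 == c2:
        # Same column: one step down, wrapping round to the top.
        return SQUARE[(r1 + 1) % 5][c1] + SQUARE[(r2 + 1) % 5][c2]
    # Rectangle: each letter keeps its row and takes the other's column.
    return SQUARE[r1][c2] + SQUARE[r2][c1]

for pair in ("TH", "RE", "SH"):
    print(pair, "->", encipher_pair(pair))
```

The function expects upper-case digraphs that have already been through the splitting step, so a doubled letter or a ‘J’ should never reach it.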


To decipher the text, the recipient would simply divide it into digraphs, consult the Playfair square, and reverse whichever shifting protocols the sender used. With my preferred protocols, that means taking the horizontally opposite corner again (the rectangle rule undoes itself), moving left along the row, and moving up the column.
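As a sketch of that reversal, assuming the sender used horizontally opposite corner, right along the row and down the column, the recipient's side might look like this. The square is again the one built from antidisestablishmentarianism, and the name decipher_pair is hypothetical.

```python
# Square for the keyword "antidisestablishmentarianism", with I/J as one letter.
SQUARE = ["ANTID", "SEBLH", "MRCFG", "KOPQU", "VWXYZ"]
POS = {ch: (r, c) for r, row in enumerate(SQUARE) for c, ch in enumerate(row)}

def decipher_pair(pair: str) -> str:
    """Undo the 'right along the row, down the column' encipherment of one digraph."""
    (r1, c1), (r2, c2) = POS[pair[0]], POS[pair[1]]
    if r1 == r2:
        # Same row: one step to the left, wrapping round to the end.
        return SQUARE[r1][(c1 - 1) % 5] + SQUARE[r2][(c2 - 1) % 5]
    if c1 == c2:
        # Same column: one step up, wrapping round to the bottom.
        return SQUARE[(r1 - 1) % 5][c1] + SQUARE[(r2 - 1) % 5][c2]
    # Rectangle: swapping columns a second time restores the plaintext.
    return SQUARE[r1][c2] + SQUARE[r2][c1]

print("".join(decipher_pair(p) for p in ("DB", "OR", "ES")))
```

Dummy letters come back out along with the real ones, so the recipient still has to strip those by eye.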

For some time after its creation, the Playfair cipher was devilishly difficult to crack, for several reasons. Firstly, basing the cipher on digraphs means that traditional letter-by-letter frequency analysis is useless. Secondly, frequency analysis for digraphs is considerably harder: instead of 26 single letters, there are around 600 possible digraphs to keep track of (25 × 24 = 600 in a Playfair square, since a letter is never paired with itself). This means that, provided your ciphertext is sufficiently short, anyone who intercepts it will have a harder time finding out the contents of the plaintext.

However, as I noted at the beginning of this post, the Playfair cipher is presently considered to be fairly feeble. Firstly, while it is slightly less vulnerable to frequency analysis, it is still vulnerable. Digraphs in English have varying frequencies; the most common are ‘th’, ‘he’, ‘an’, ‘in’, ‘er’, ‘re’, and ‘es’. The most commonly appearing digraphs in the ciphertext, provided it is long enough, can be reasonably assumed to match up to one of these plaintext digraphs. Secondly, the use of a Playfair cipher can be given away by characteristics such as an even number of letters in the ciphertext, only 25 different letters being present, a suspicious lack of doubled letters, reversed digraphs such as ‘re’ and ‘er’ retaining this reversal through the encipherment process, and repeated sequences of letters being an even number of letters long and/or being separated by an even number of letters. Thirdly, if you have used keywords with as few repeated letters as possible, and if someone finds a list of those keywords, it is likely to tip them off as to which cipher you are using.
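Some of these giveaways are easy to check mechanically. The sketch below is a rough illustration rather than a real cryptanalysis tool: the function name and the keys of the returned dictionary are invented for the example, and passing every check only makes Playfair plausible, not certain.

```python
from collections import Counter

def playfair_telltales(ciphertext: str) -> dict:
    """Collect a few signs that a ciphertext may have been made with Playfair."""
    letters = [c for c in ciphertext.upper() if "A" <= c <= "Z"]
    # Pair the letters off from the start; a lone trailing letter is ignored.
    pairs = ["".join(letters[i:i + 2]) for i in range(0, len(letters) - 1, 2)]
    counts = Counter(pairs)
    return {
        "even_length": len(letters) % 2 == 0,
        "at_most_25_letters": len(set(letters)) <= 25,
        "no_doubled_digraphs": all(p[0] != p[1] for p in pairs),
        # Digraphs that also appear reversed, listed once each.
        "reversed_pairs": sorted(p for p in counts if p[0] < p[1] and p[::-1] in counts),
        # Candidates to match against 'th', 'he', 'an' and so on.
        "most_common": counts.most_common(3),
    }

print(playfair_telltales("DBORESRO"))
```

On a message long enough for the counts to mean anything, the most common ciphertext digraphs are the ones to try matching against the common English digraphs listed above.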

At this point, I would recommend going away and reading the following books in the order given: Whose Body?, Clouds of Witness, Unnatural Death, The Unpleasantness at the Bellona Club, Strong Poison, Five Red Herrings, Have His Carcase, Murder Must Advertise, The Nine Tailors, Gaudy Night, and Busman’s Honeymoon. These novels are the backbone of Dorothy L. Sayers’ Lord Peter Wimsey detective series; there are also a number of short stories. I recommend reading them in publication order to get the full flavour and progression of Wimsey’s character development.

I mention this series because I feel that one part of it in particular will help to illustrate some of what I have been saying in this post. In any case, I’m relishing the chance to talk about something I like. ^_^

Finished? Enjoyed them? I hope so! If you’ve decided to finish reading this post first, be warned, for there be spoilers ahead. Yarrrr!


Anyway, this concerns a part of the climax of Have His Carcase, where Wimsey (whose backstory contains a certain amount of intelligence work in the First World War) and Harriet Vane (a writer of detective novels, who is therefore very experienced with language and its manipulation) go through a blow-by-blow cracking of a Playfair cipher.

In the novel, Wimsey twigs to the use of a Playfair cipher between the conspirators and their victim when he finds a list of words which contain very few repeated letters; he already knew that a cipher had been used, but did not know which one until that point. He then takes a letter, in cipher, which was found on the corpse of the victim. With the aid of Harriet Vane, Wimsey then picks out a flaw: a sequence of letters separate from the rest of the letter. Wimsey works out that some of them represent a date and Harriet, bearing in mind certain aspects of the wider case, suggests that ‘XNATNX’ is ‘WARSAW’. This provides a ‘crib’ on which she and Wimsey can hang a highly directed trial-and-error attack on the cipher. Through a series of deductions as to the locations and relations of letters in the cipher square, starting with ‘W’ and ‘X’ being in the same row and ‘R’ and ‘S’ being at diagonally opposite corners of a rectangle, they discover that the keyword is ‘MONARCHY’.

This sequence is considerably simplified in the TV adaptation, for pacing reasons; the conspirators are sufficiently ‘rank amateurs’ (as Wimsey calls them) that they commit the cardinal cryptographic sin of indicating the keyword on the letter which is encoded with that word. The indication is a pair of numbers giving a page of a book (specifically, a dictionary) and a word on that page. The keyword is still ‘MONARCHY’, for plot reasons. This modification illustrates two things: the importance of concealing one’s keywords, and the book-cipher-at-one-remove element of the code in question, which is pretty darn ingenious.
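For the curious, the crib can be checked against the keyword with the rules described earlier in this post. The sketch below assumes the ‘i/j as one’ convention when building the square; the function names are my own. Handily, all three digraphs of WARSAW fall under the rectangle rule in the MONARCHY square, so the result does not depend on which direction one prefers for rows and columns.

```python
import string

def build_square(keyword: str) -> list[str]:
    """Five 5-letter rows: keyword letters first, then the unused alphabet."""
    letters = []
    for ch in keyword.upper() + string.ascii_uppercase:
        ch = "I" if ch == "J" else ch  # assumption: J is folded into I
        if ch in string.ascii_uppercase and ch not in letters:
            letters.append(ch)
    return ["".join(letters[i:i + 5]) for i in range(0, 25, 5)]

def encipher(text: str, square: list[str]) -> str:
    """Encipher upper-case text that is already of even length with no doubled digraphs."""
    pos = {ch: (r, c) for r, row in enumerate(square) for c, ch in enumerate(row)}
    out = []
    for i in range(0, len(text), 2):
        (r1, c1), (r2, c2) = pos[text[i]], pos[text[i + 1]]
        if r1 == r2:    # same row: step right, wrapping
            out.append(square[r1][(c1 + 1) % 5] + square[r2][(c2 + 1) % 5])
        elif c1 == c2:  # same column: step down, wrapping
            out.append(square[(r1 + 1) % 5][c1] + square[(r2 + 1) % 5][c2])
        else:           # rectangle: keep the row, swap the columns
            out.append(square[r1][c2] + square[r2][c1])
    return "".join(out)

square = build_square("monarchy")
print("\n".join(" ".join(row) for row in square))
print(encipher("WARSAW", square))  # XNATNX
```

The first digraph, WA, becomes XN because W and X share the bottom row of the square while A and N share the top one, which is exactly the kind of relation Wimsey and Harriet start from.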


Moving on…

To conclude this post, I would like to set a challenge. Below is a piece of text enciphered in Playfair; see if you can decipher it before I post the solution in a couple of weeks’ time.

