The Vigenere Cipher

 
Objective: Demonstrate a polyalphabetic cipher using a tableau or matrix to aid in the encoding/decoding of messages.

Level: Finite mathematics, linear algebra, or a computer science course that includes a unit on cryptography.

Prerequisites: Brief discussion of coding and row-column look-ups in a table or matrix.

Platform: Coding routines in visual basic and MATLAB are discussed. Links to java applets are included.

Instructor's Notes:

Background

A cipher is a secret method of writing, as by code. Cryptography, in a very broad sense, is the study of techniques related to aspects of information security. Hence cryptography is concerned with the writing (ciphering or encoding) and deciphering (decoding) of messages in secret code. Such considerations emphasize privacy and confidentiality of information. However, in today's electronic society cryptography has developed to encompass other features such as data integrity and authentication. Cryptography has evolved into a mathematical discipline that plays a major role in business and government.

Cryptography has a long and rich history dating from the Egyptians some 4000 years ago. A complete non-technical account of cryptography from antiquity through the early 1960's is given in Kahn's The Codebreakers [1]. It relates historical aspects which were most significant to the development of modern cryptography, including developments related to the two world wars. For a concise summary of important developments in the 1970's and their relation to cryptography today see pp. 1-6, in [2].

An overview of transposition and substitution ciphers

A transposition cipher is an encoding process that does not change any of  the letters of the original message, but changes the position of the letters. (Cryptographers use the term "plaintext" for the original message.) One simple transposition cipher reverses the order of the letters. For example, the message THE GAME IS AFOOT becomes encoded as EHT EMAG SI TOOFA. Such "backward writing" is easy to recognize and decode. There are a number of other easy transposition ciphers discussed in [3] and [4]. By analogy, transposition ciphers are like jigsaw puzzles. All the pieces are present, its just a matter of putting them in the correct order.

A substitution cipher is an encoding process that maintains the order of the letters in the message, but changes their identity. Each letter of the message is replaced by another letter or symbol. For example, Morse code is a substitution cipher in which each letter is replaced by a specific set of dots and dashes. Many substitution ciphers use only one alphabet, and are called "monoalphabetic". This means that we substitute one and only one  letter for a particular letter in the message. For example,  every T in the message is replaced by the same substitute letter or symbol. Such a cipher scheme is easy to remember, but is also vulnerable to "cracking" using frequency analysis (letter counting). Given a sufficiently large encoded message derived using a monoalphabetic substitution cipher, it can readily be "cracked" by comparing the frequency of letter occurrences in the coded message with the frequency of letter occurrences in the language used for the message.

Example 1. A well-known type of substitution cipher is called a shift cipher or Caesar cipher. A key number k is agreed upon by the sender and the receiver. Then the standard alphabet is shifted k positions so that the k-th letter in the alphabet is substituted for letter A, the k+1st for B, etc and we wrap the alphabet to maintain a one-to-one correspondence. Suppose k = 7, then the resulting shifted cipher is given by the correspondence that starts with A replaced by G as shown in Figure 1.

                                                           Figure 1.

Using this cipher the message CONVOY PROCEED becomes IUTBUE VXUIKKJ.

___________________________________________________________

Polyalphabetic ciphers

In order to make substitution ciphers more secure, more than one alphabet can be used. Such ciphers are called polyalphabetic, which means that the same letter of a message can be represented by different letters when encoded. Such a one-to-many correspondence makes the use of frequency analysis much more difficult in order to crack the code. We describe one such cipher named for Blaise de Vigenere a 16-th century Frenchman. 

The Vigenere Cipher

The Vigenere cipher is a polyalphabetic cipher based on using successively shifted alphabets, a different shifted alphabet for each of the 26 English letters. The procedure is based on the tableau shown in Figure 2 and the use of a keyword. The letters of the keyword determine the shifted alphabets used in the encoding process.

                                                              Figure 2.

Example 2. For the message COMPUTING GIVES INSIGHT and keyword LUCKY we proceed by repeating the keyword as many times as needed above the message, as follows.

Method #1. The Vigenere tableau in Figure 2 can be used directly. For each letter of the message use the letter of the keyword to determine a row and go across the row to the column headed by the corresponding letter of the message. We illustrate this row-column look-up in Figure 3 for the first two letters of the message. The red arrows indicate the encoding of the first letter C and the green arrows the encoding of the second letter O. It follows that the first two letters "CO" in the message are encoded as "NI".

                                                             Figure 3.

Method #2. We use the shifted alphabets that start with the letters of the keyword. We extract these shifted alphabets from Figure 2 in the order indicated by the numbered arrows in Figure 4a. This group of alphabets is repeated as many times as needed to assign an alphabet to each letter of the message. We append a standard alphabet at the top of this set of shifted alphabets to produce the Code Table that appears in Figure 4b.

                                                    Figure 4a.

                                                     Figure 4b.

Using the letters of the message in order we substitute the letter in the shifted alphabet associated with the letter of the keyword that appears immediately above it by going across to the column headed by that letter in the code table. To encode the first letter C we use the row of the code indicated by the arrow and the column indicated by the arrow; see Figure 5. Hence the letter N is substituted for C.

                                                         Figure 5.

The next letter to be encoded is O. We now use the second shifted alphabet from the code table and the column headed by the letter O. See the arrows in Figure 6.

                                                           Figure 6.

Continuing in this way using either method we find the encoded message that appears in  Figure 7.

                                                           Figure 7.

Note that the letter I in the message correspond to four different letters in the encoded message. Also in the encoded message the letter E was substituted for two different letters of the original message. Such many-to-one substitutions make letter frequency counting much more difficult.

____________________________________________________________

When the encoded message is received it must be decoded. We start the same way we did to encode; repeat the keyword as many times as needed above the encoded message. The decoding procedure is just the reverse of encoding. Either the Vigenere tableau in Figure 2 or the Coding Table in Figure 4b can be used to decode. To decode code a message using the Vigenere tableau we proceed as follows.

Go across the top row to the column headed by the letter in the keyword. Now go down that column until you come to the corresponding letter in the encoded message. The correct letter in the original message is found by proceeding to the left across the row containing the letter in the encoded message.
Example 3. Again let our keyword be LUCKY and let the encoded message be XUVRGDYXOPJQJOPP. Using Figure 2 we column-row look-up as described above to decode the first four letters a MATH. What is the remainder of the message?

___________________________________________________________

To decode a message using the Coding Table in Figure 5 we use a letter of the keyword to determine a row, move across that row until we find the letter of the encoded message and then go up the column to find the letter in the original message. Try this procedure with the encoded message in Example 3.

Cracking the Vigenere Cipher

For 300 years the Vigenere cipher was considered to be practically unbreakable. Then in 1863 a Prussian military officer devised a method to determine the length of the keyword and then divide the message into a simpler form to which letter frequency analysis could be applied. For further information see the URL given in [5] . We will not pursue this topic here.

Software for the Vigenere Cipher

A visual basic routine for encoding and decoding is available for downloading by clicking on VBVigenere.ZIP. Figure 8 shows the nice interface for this routine which was supplied by Faith Chao (Golden Gate University). Faith uses this routine as part of a unit on 'Codes, Coding, and Cryptography' in a course for Computer and Information Science majors. The course is 15 weeks and each week she gives the students a set of PowerPoint slides which summarize the material covered. Her visual basic routine can be accessed from PowerPoint. Figure 9 shows the visual basic screen once keyword has been entered; note the shifted alphabets as used in the Code Table in Figure 4b.

                                                         Figure 8.
                                                          Figure 9.

There are a number of Java applet implementations available for the Vigenere Cipher. See [6].

An interactive Vigenere cipher routine is available in MATLAB, just click on ventable.ZIP. In this routine the code table of Figure 4b is used. The user must position and click a mouse over the correct substitution for each letter of the message. The interactive encoding screen is shown in Figure 10. This routine was written by David R. Hill.

                                                        Figure 10.
References:

[1] D. Kahn, The Codebreakers, Macmillan Publishing Company, 1976.

[2] A. Menezes, P. van Oorschot, and S. Vanstone, Handbook of Applied Cryptography, CRC Press, 1997.

[3] M. Gardner, Codes, Ciphers, and Secret Writing, Dover Publications, Inc., 1972.

[4] L. Smith, Cryptography the Science of Secret Writing, Dover Publications, Inc., 1943.

[5] http://math.ucsd.edu/~crypto/java/EARLYCIPHERS/Vigenere.html

[6] http://islab.oregonstate.edu/koc/ece575/02Project/Mun+Lee/VigenereCipher.html#demo

Credits:  The idea for this demo was submitted by 

Faith Chao
Department of Mathematics 
Golden Gate University
The visual basic routine is included in Demos with Positive Impact with her permission.

DRH 7/05/01   Last updated 9/15/2010 DRH

Since 3/1/2002