Chapter 27

Encrypting Data

by Mark Wutka


CONTENTS

Data encryption is a touchy subject. Many cryptography algorithms and their implementations are restricted to use within the United States by the National Security Agency (NSA).

It is illegal to export certain kinds of encryption software outside the U.S. There has recently been a push to lift some of these export restrictions, however, because of the greater need for security on the Internet. Many American companies can't sell their software overseas because of the encryption software embedded in their code.

The increase in commerce over the Internet has made encryption a need for some businesses. When you type your credit card number on an order form, you want to be sure that no one can see the information as it passes through the Internet. Since you can't prevent people from snooping on your data, your only hope of safely keeping your credit card number away from prying eyes is to encrypt the data before you send it.

The purpose of encryption is to turn ordinary data into completely random-looking bits. The notion of "completely random-looking" is a real science.

You may think you have some clever little way of encrypting your data but unless you really know cryptography, your scheme can probably be broken easily. Unless you really know what you're doing, stick to the known encryption algorithms and you'll be pretty safe.

Data is encrypted with a key that can be a random string of bits, or some word or phrase that you pick. The key is like a password. It needs the same degree of secrecy and the same care in creating a word or phrase that can't be guessed. The key is used by the encryption algorithm to scramble your data and to unscramble it on the other side.

A data encryption algorithm is called a cipher. The data you are encrypting is called plaintext, whereas the encrypted version of the data is called ciphertext. The process of converting ciphertext back to plaintext is called decryption.

Figure 27.1 illustrates a simple use of encryption to pass a coded message.

Figure 27.1 : Encryption can be used to pass coded messages.

Data encryption ciphers are grouped into two categories: block ciphers and stream ciphers. The stream cipher is a simple single-character-in, single-character-out cipher. That is, it does the encryption one character at a time.

Each time a stream cipher reads a character, it uses the key and accumulated data from the other characters it has processed to figure out how to scramble the next byte of data. Unlike some of the simple ciphers you may be familiar with, a good stream cipher does not just map one character to another.

If you feed two A's in a row to a stream cipher, chances are you will not get two identical characters in a row in the encrypted text. Figure 27.2 illustrates a stream cipher in action.

Figure 27.2 : A stream cipher encodes a single character at a time.

A block cipher, on the other hand, encrypts whole blocks of data at a time. Unlike a stream cipher, the block cipher can scramble all the bits in a block so that the bits for the first byte of the block can be scrambled and placed in strange places.

Of course, the key and the actual values of the bits determine what the encoded block looks like. The first bit in a block may end up in one position using a certain key and in a different position using a different key. Figure 27.3 illustrates a block cipher in action.

Figure 27.3 : A block cipher scrambles whole blocks of data at one time.

There is another way to classify encryption algorithms, based on the kind of key used. Some algorithms use a private key, also called a symmetric key, whereas others use a public/private key pair, called an asymmetric pair.

Private key encryption is probably the one you are most familiar with. Two parties agree on a secret key. The sender encrypts the data with the secret key, and the receiver decrypts the data with the same key.

If anyone else finds out the secret key, he or she can spy on the data being exchanged. Figure 27.4 illustrates a data exchange using a private key.

Figure 27.4 : Both parties agree on a private key and use that key for encryption and decryption.

One of the problems with private keys is that you have to find some way of agreeing on the key ahead of time. How do two people exchange encrypted communications if they have no way to exchange keys to begin with? Public key encryption provides a neat solution to this problem.

With public key encryption, everyone who wants to get encrypted data creates a private decryption key and a public encryption key. This is called an asymmetric key cipher because the encryption key and the decryption key are different. The important part of this scheme is that although you can determine the public key based on the private key, you cannot figure out the private key from the public key.

Anyone wanting to send you encrypted data would look up your public key, which can be published in a number of ways, and use it to encrypt a message to you. You would receive this message and decrypt it with your private key. Figure 27.5 illustrates a data exchange using a public key.

Figure 27.5 : The data is encrypted with the public key and decrypted with the private key.

Choosing the Right Kind of Encryption

Given that a stream cipher handles one byte or even one bit at a time, you might think that a stream cipher is better for encrypting computer login sessions, which are more byte-oriented. As it turns out, block ciphers aren't really too bad for login sessions, since the blocks are often only 64 bits (8 bytes) in size. Essentially, your choice of cipher shouldn't depend on whether it is a block or stream cipher.

Some of the factors that will influence your choice of encryption are:

Guarding Against Malicious Attacks

Your choice of private keys versus private/public key pairs depends on what kind of communications you are doing and what are the possible ways someone might attack your communications.

Presumably, the reason you are encrypting your data is to hide something in the data from prying eyes. This hidden information might be a credit card number, it might be the password to another system somewhere, or it might just be personal information.

When you create secure applications, you need to have some idea of the ways someone can attack your application. In general, the two kinds of attack are a simple eavesdropping and an impersonation.

An eavesdropping attack is a passive attack that can be made by monitoring network traffic. You should assume that anyone who might want to listen in on a conversation will have access to such a monitor. Because so many computers can act as network monitors with only a small amount of programming, this is a safe assumption.

An impersonation attack is very insidious: Someone has the ability of impersonating another person or computer. The person sending confidential information sends the information to the impostor rather than to the real recipient. This is typically the hardest attack to defend against and also the hardest attack to use.

You can see how encryption solves various attacks by starting first with a conversation that doesn't involve encryption. Then, as encryption is added to solve various problems, you can see how the attackers respond. Once you are able to see how devious an attacker can be, you can design your applications appropriately.

Figure 27.6 shows a simple conversation between two people with an eavesdropper watching the conversation.

Figure 27.6 : Unencrypted conversations are easy prey for an eavesdropper.

To solve the simple eavesdropping problem, the participants in the conversation agree on a secret key and encrypt their conversation. Now, the eavesdropper doesn't know what they're saying.

In certain cases, the eavesdropper may not need to know what they're saying to make an attack. Suppose, for instance, that the participants in the conversation are an automatic teller machine and a bank.

As shown in Figure 27.7, the eavesdropper records the conversation between the teller machine and the bank as someone at the automatic teller makes a withdrawal.

Figure 27.7 : The eavesdropper records an encrypted conversation.

Some time later, the eavesdropper taps back into the connection between the bank and the teller machine, and replays the encrypted conversation. This causes the bank to make another cash withdrawal from the customer's account and the teller machine to spit out the cash, as shown in Figure 27.8.

Figure 27.8 : The eavesdropper can duplicate a transaction by playing it back at a later time.

Resisting a Playback Attack

This sort of attack is often useful when encryption is used to set up a login connection or similar session, but the rest of the session is unencrypted. An eavesdropper could play back the login sequence and make a login connection some time after the original login, without ever knowing the secret key.

You can counter the playback attack in the case of a login like this but you should encrypt the entire login session anyway. If someone wants to infiltrate your system, they could interfere with your unencrypted session and send bogus commands, intercepting the responses so you never know they are there.

A simple trick for thwarting the playback attack is to ensure that the messages change slightly from one session to the next-not just that they change, but that they have to change.

A simple solution for this problem is:

  1. The receiver generates a random number and sends it to the sender.
  2. The sender must use the randomly generated number inside the message it sends to the receiver.
  3. The receiver ignores any messages with an invalid number.

Now the eavesdropper cannot play back the messages later because it is highly unlikely that the receiver would generate the same random number the next time. In the worst case, the eavesdropper could get the sender to send a message by replaying a previous number sent by the receiver, but the receiver would ignore the message because it has the wrong number.

The eavesdropper could still wreak havoc by replaying previous messages during the current session. In other words, the eavesdropper waits for the session to begin and starts recording the conversation. At some point, the eavesdropper interferes and plays back an earlier part of the conversation, possibly causing some kind of security breach.

If each message has a sequence number associated with it, however, even that playback attack does not work. Each time a message is sent, it is given a sequence number that is greater than the previous sequence number.

Any out-of-sequence message is ignored. If the eavesdropper plays back a message from earlier in the session, it has a sequence number that is out of order and the message is ignored.

Don't Store Keys in Your Applets

Java makes simple, symmetric communication almost impossible for applets. The problem here is that someone could get your applet, find the secret key, and decrypt a conversation-or impersonate either or both of the participants. You probably still want to use symmetric keys for communication, but the sender needs a way to generate a random session key and send it to the receiver.

Using Public Key Encryption to Exchange Session Keys

One of the reasons you don't see public keys used for encrypting communication channels is that the asymmetric encryption algorithms tend to take a lot longer than symmetric key encryption. Still, public keys are quite useful for passing a random session key to a potential receiver.

The sequence for this is simple:

  1. The sender generates a random session key and encrypts it with the receiver's public key.
  2. The receiver decrypts the session key using the private key and the two are ready to talk.

Unfortunately, you can't just store the public key in your applet, either. If you do, you open yourself up to an impersonation, or "man-in-the-middle" attack. Java is particularly vulnerable to this kind of attack because the code itself is downloaded over the network.

An impersonator impersonates the receiver to the sender and the sender to the receiver, as shown in Figure 27.9.

Figure 27.9 : An impersonator sits between two communicating parties and impersonates them both.

Normally, this kind of attack doesn't work with public key encryption unless you can somehow make the sender think the public key is something other than it is. In Java, this turns out to be a simpler task, although digital signatures throw an extra wrench into the works.

If an impersonator wants to set up shop, it first impersonates the receiver. When the server downloads an applet from the receiver, it really downloads the applet from the impersonator. As shown in Figure 27.10, the impersonator has cleverly substituted its own public key in place of the receiver's public key in the applet.

Figure 27.10: The impersonator gives a phony applet to the sender.

When the sender generates a random session key, it encrypts it with what is now the impersonator's public key and sends the encrypted session key to the impersonator. The impersonator, wanting to spy but not be found out, impersonates the sender by generating a random session key, encrypting it with the receiver's public key, and sending it to the receiver.

As shown in Figure 27.11, the sender now thinks it is talking to the receiver, while the receiver thinks it is talking to the sender. The impersonator, however, is in the middle, watching all the data go by.

Figure 27.11: The impersonator is now playing the part of the sender and the receiver.

Anytime the sender sends information to the receiver, the impersonator intercepts it and then passes it on to the real receiver-but not before saving any interesting parts for later use.

As discussed in Chapter 26, "Securing Applets with Digital Signatures," digital signatures can put a damper on this kind of activity but they cannot prevent all attacks. In particular, if the impersonator is able to sign the phony applet with some other valid signature, the sender thinks the applet is okay.

Using Secure HTTP to Thwart Impersonations

The problem with thwarting an impersonation attack is that it needs some piece of secret data on the sender side that the impersonator cannot change. If the server is running an applet that is downloaded from another site, every part of the applet is subject to impersonation.

Secure HTTP pages are not subject to impersonation because the browser uses a certification authority to verify the Web page, much the same way that the digital signature mechanism verifies an applet. The nice thing is that not only is the Web page downloaded securely, everything on the page is downloaded with the same secure mechanism, including applets.

Thus, you can assure your customers that if they connect to your Web page using Secure HTTP, you can guarantee the safety of any information they give to your applets. That is, assuming you are wise enough to encrypt the data before you send it to your server!

Getting Encryption Software

Sun's Security API will eventually provide a number of different encryption schemes, as well as a general interface for encrypting and decrypting information. If you can't wait for Sun, however, you still have a few options.

Caution
The U.S. has some severe export restrictions on encryption software. If you are a U.S. resident, take special note of the warnings on any encryption software you get about whether or not it is restricted. If you don't know whether something is restricted, assume it is and it's probably a safe bet.

Getting SSLava, the Secure Sockets Library

Phaos Technologies (http://www.phaos.com) has created an implementation of the secure sockets layer (SSL) protocol. SSL is like the regular network socket implementation you are familiar with but it adds a nice degree of security.

When a client makes an SSL connection with a server, the server provides the client with a digitally signed certificate verifying that it is the intended server. Through a series of checks, the client can determine whether it is the real server or an impostor.

The specification for SSL is publicly available from Netscape Communications.The SSL Version 2 document is located at http://home.netscape.com/newsref/std/SSL.html. The proposal for the new SSL Version 3 standard is at http://home.netscape.com/eng/ssl3/index.html.

The SSLava library is available for free for noncommercial use, but commercial users must buy a license. SSLava is written entirely in Java, allowing you to use it on any Java-enabled platform. Unfortunately, according to the non-commercial license for SSLava, you cannot redistribute the binary copy of their libraries along with your applet. In other words, if you want someone to run your SSLava applet, they must download SSLava themselves. You do not have these restrictions if you purchase a commercial license.

Caution
If you use SSLava within an applet, make sure that the applet itself is downloaded with a secure protocol. Otherwise, an impostor could replace the real SSLava classes with phony ones and breach your security.

The SSL protocol does solve a number of encryption headaches and it will probably be more widely available in the future. The SSLava product gives you a good way to get your feet wet with SSL.

Note
The SSL protocol is a part of Java 1.1, giving you encryption capabilities on any Java 1.1 platform. While you are waiting for Java 1.1 to be available everywhere, you can use the SSLava libraries.

Getting the Cryptix Library

Systemics, Ltd. (http://www.systemics.com) has created an excellent library of encryption classes for Java. Unlike SSLava, some of the Systemics code is used as a native library, making it hard to write applets that use the code.

It does, however, provide a variety of encryption algorithms and is available free for commercial and noncommercial use. Since Systemics is not in the U.S., their software is not subject to the U.S. export restrictions. Still, there may be some patent problems since some of the algorithms used in the library are patented in the U.S.

Systemics recommends that you seek professional legal advice about whether or not you can legally use their software in your country.

Cryptix is handy when you need to do large amounts of encryption. Since it is implemented as a native library, it is very fast. While Cryptix may not be as desirable for applets, it works very well for stand-alone applications.

All of the block ciphers in the Cryptix library are subclasses of the BlockCipher class, which has methods to encrypt and decrypt blocks of data. One of the things that appears to be lacking from the library is a way to automatically encrypt or decrypt a multi-block buffer.

This is, in part, because it is hard to handle partial blocks. When the data doesn't fill a complete block, you usually pad the block with zeroes. Unfortunately, you have to figure out where the padding starts when you decrypt the data, which requires that you embed the information in the encryption. If you are willing to accept the padding at the end of an encrypted buffer, it is simple just to break up a buffer into multiple blocks and encrypt each block.

Listing 27.1 shows an input stream filter that decrypts data using the Cryptix library. When you create the stream, you pass it the cipher you want to use.


Listing 27.1  Source Code for BlockCipherInputStream
import java.io.*;

// This class wraps an input stream filter around the decrypt
// method from the Cryptix java.crypt.BlockCipher
// class. It allows you to decrypt an input stream. You
// can use it to read in an encrypted file as if it were
// unencrypted.

public class BlockCipherInputStream extends FilterInputStream
{
// cipher is the cipher we use to decrypt
     protected java.crypt.BlockCipher cipher;

// currentBlock is the most recent block of data we decrypted. The
// block cipher must work on blocks of data. We build up blocks from
// the input, decrypt them, and let them out byte by byte

     byte[] currentBlock;

// currentChar is the position in currentBlock of the next
// character we return in the read method.

     int currentChar;

// Create a cipher input stream on InputStream in, using the BlockCipher
// cipher.

     public BlockCipherInputStream(InputStream in,
          java.crypt.BlockCipher cipher)
     {
          super(in);

// save the cipher
          this.cipher = cipher;

// Create a block that is the cipher's block size
          currentBlock = new byte[cipher.blockLength()];

// Flag that we have no characters in the block right now
          currentChar = -1;
     }

// read gets a character from the block we've decrypted,
// If we've exhausted the current block, go read another block.

     public int read()
     throws IOException
     {

// If we've used up all the chars in the block, get another block
          if (currentChar < 0) {
               readNextBlock();
          }

// If we got another block and we're still out of chars, there's
// no chars left to get.

          if (currentChar < 0) return -1;

// Fetch the next char in the block

          int ch = currentBlock[currentChar++];

// If that was the last char in the block, set currentChar to -1
// to show we need to read a new block next time.

          if (currentChar >= currentBlock.length) {
               currentChar = -1;
          }

          return ch;
     }

// readNextBlock reads in another block from input, decrypts it, and
// resets the index for the current character.

     protected void readNextBlock()
     throws IOException
     {

// Read the next character from input.

          for (int i=0; i < currentBlock.length; i++) {
               int ch = in.read();

// If we hit EOF and we only have a partial block, flag it as an error.

               if (ch < 0) {

// If we hit EOF and the current block is empty, we don't need to
// decrypt. Just mark the current character as end-of-block and return.

                    if (i == 0) {
                         currentChar = -1;
                         return;
                    }

// If we get here, it means that we hit EOF in the middle of a block.
// This shouldn't happen because we only encrypt whole blocks.

                    throw new IOException("Incomplete block.");
               }
               currentBlock[i] = (byte)ch;
          }

// Reset the index of the current character
          currentChar = 0;

// Decrypt the current block

          cipher.decrypt(currentBlock);
          return;
     }
}

All the BlockCipherInputStream does is read whole blocks of encrypted data, decrypt them, and then dole the unencrypted data out one byte at a time through the read method. This class is quite handy because you can now use encryption on any data stream, including a network socket.

The companion to this class is the BlockCipherOutputStream, which takes one byte at a time, stores the byte in a block, and encrypts each block as it fills up. It then writes out the block onto the output stream it is filtering.

Listing 27.2 shows the BlockCipherOutputStream.


Listing 27.2  Source Code for BlockCipherOutputStream
import java.io.*;

// This class implements an output stream filter that encrypts
// data using an BlockCipher from the Cryptix java.crypt libraries.

public class BlockCipherOutputStream extends FilterOutputStream
{
// The actual cipher we're using
     protected java.crypt.BlockCipher cipher;

// The current block of data we're writing
     byte[] currentBlock;

// The index of the current character we're writing.
     int currentChar;

// Create an output stream filte rfor a particular cipher
     public BlockCipherOutputStream(OutputStream out,
          java.crypt.BlockCipher cipher)
     {
          super(out);

          this.cipher = cipher;

          currentBlock = new byte[cipher.blockLength()];

          currentChar = 0;
     }
          
// write adds a character to the output block, and when the buffer
// is full, it encrypts the block and sends it up the filter
     public void write(int ch)
     throws IOException
     {

// Add the character to the block
          currentBlock[currentChar++] = (byte) ch;

// If we've filled the block, encrypt the block and write it out
          if (currentChar >= currentBlock.length) {
               cipher.encrypt(currentBlock);

               out.write(currentBlock);

               currentChar = 0;
          }
     }

// Flush fills out the remainder of the current block with 0's, then
// encrypts the block and writes it out.

     public void flush()
     throws IOException
     {

// If there's a partial block, fill it out with 0's
          if (currentChar > 0) {
               while (currentChar < currentBlock.length) {
                    currentBlock[currentChar++] = 0;
               }

// Encrypt the block and write it out
               cipher.encrypt(currentBlock);

               out.write(currentBlock, 0, currentBlock.length);

               currentChar = 0;
          }

// Do whatever else we have to do to flush the stream
          super.flush();
     }


     public void close()
     throws IOException
     {
// Before closing the stream, flush it.
          flush();

          super.close();
     }
}

These classes are fairly small. All they really do is arrange data into a form more suitable for the encryption and decryption routines, which do the hard work.

Listing 27.3 shows a program that tests out both the encryption and decryption streams using a pipe. It uses the IDEA cipher for encryption, which is a patented cipher and must be licensed for commercial use. It can be used noncommercially for free.

Tip
The technique used in the IdeaCrypt program of testing input and output stream filters by using a pipe is a handy technique. You can also use it to test out networking protocols if you are writing both sides of the protocol. The pipe takes only two lines of code to create, compared to the amount of time it takes to set up socket connections.


Listing 27.3  Source Code for IdeaCrypt.java
import java.io.*;

//
// This program encrypts a block of data using the IDEA
// cipher, then unencrypts it.

public class IdeaCrypt
{
     public static void main(String[] args)
     {

// encodeMe is the string we want to encode
          String encodeMe = "This is a string to be encoded!!!";

// The IDEA cipher implemented in the Cryptix library uses a 16-byte
// key - that's 128 bits!
          String key = "<Sixteen>ByteKey";

          try {

// The Cryptix library likes the keys and the data to be in byte
// arrays, not strings. We allocate an array for the key and
// copy the key into the array.

               byte[] keyBytes = new byte[key.length()];

// Copy the key string to a byte array.
               key.getBytes(0, key.length(), keyBytes, 0);

// Create an IDEA cipher for our key.
               java.crypt.IDEA cipher = new java.crypt.IDEA(keyBytes);

// In order to demonstrate the encryption and decryption filters, we
// just set up a pipe, so we can write to the pipe and read from it,
// testing the encryption and decryption in one simple program.

               PipedInputStream pipeIn = new PipedInputStream();
               PipedOutputStream pipeOut = new PipedOutputStream(pipeIn);
     
// Create a decryption filter in the input side of the pipe. It will
// decrypt the data we wrote to the pipe.

               BlockCipherInputStream decrypter =
                    new BlockCipherInputStream(pipeIn, cipher);

// Create an encryption filter on the output side of the pipe. This will
// encrypt the data we write to the pipe.

               BlockCipherOutputStream encrypter =
                    new BlockCipherOutputStream(pipeOut, cipher);

// Now create a print stream on top of the encryption stream so we can
// write stuff to the pipe using println.

               PrintStream encryptPrint = new PrintStream(encrypter);

// Write out the string to be encrypted
               encryptPrint.println(encodeMe);

               System.out.println("Wrote encrypted string: "+encodeMe);

// Go ahead and flush the print stream and close it.
               encryptPrint.flush();
               encryptPrint.close();

// Create a DataInputStream on the decryption stream. This allows us to
// use readLine to read the data back in.
               DataInputStream decryptIn = new DataInputStream(decrypter);

// Read the unencrypted string
               String unencrypted = decryptIn.readLine();

               System.out.println("Read unencrypted string: "+unencrypted);

          } catch (Exception e) {
               e.printStackTrace();
          }

     }
}

Getting the Acme Crypto Package

Jef Poskanzer has written a number of useful Java classes, including an encryption package. You can find his excellent Java software at http://www.acme.com/java. The beauty of the Acme package is that it is written entirely in Java. This means that your applets can use encryption and not need the installation of native libraries on the local host.

The Acme Crypto package makes a good complement to the SSLava package. You can use SSLava for a normal encrypted data exchange, and then use Crypto for various areas where the SSL protocol doesn't make sense or is too cumbersome. For instance, you might use the Crypto package to create encrypted data files. The Crypto package is also advantageous if you don't want to buy a commercial license for SSLava but don't want to require that your users download the encryption library themselves.

Caution
The unfortunate aspect of downloadable encryption code is that you may violate the U.S. export restrictions if your site is based in the U.S. and the encryption method you use is restricted. If you are in the U.S. and plan to make your applet available internationally, you should seek professional legal advice about the steps you need to take.

The Crypto package includes several block ciphers, such as DES, DES3, and IDEA. The data encryption standard (DES) algorithm has been around for a number of years and is used heavily around the U.S.

Because of the relatively small key length of DES, information is sometimes encrypted three times with DES, using two different keys. This is called DES3.

The IDEA cipher is a relatively new block encryption scheme that is still undergoing some analysis of its security. The IDEA cipher is patented, and you have to get a license if you want to use it commercially.

The owners of the patent, Ascom Systec AG, are based in Switzerland and can be reached over the Net at idea@ascom.ch.

In addition to block ciphers, the Crypto package has two stream ciphers. One cipher, the rot13, is a simple substitution cipher that is completely insecure. Rot13 is an alphabetic shift in which A is replaced with M, B is replaced with N, and so on.

It is popular on the UseNet for hiding "spoilers" to games and movies, or for hiding potentially offensive material. Most newsreaders can decrypt rot13 so you can view the hidden information if you choose to. Don't even think of using rot13 as a security tool. It is mainly used for testing.

The RC4 stream cipher is a secure cipher that is both restricted for export and patented. The patent is owned by RSA Data Security Inc., who originally kept the design of RC4 a secret.

Unfortunately for RSA, someone anonymously posted the algorithm to the UseNet in 1994, exposing it to the world. RSA still owns the patent on the algorithm, however, so if you plan to use it in a commercial product, you should contact RSA for licensing information.

The Crypto package also includes EncryptedInputStream and EncryptedOutputStream filters. You can filter any input or output stream with these filters, using any stream or block cipher in the Crypto package. You can also create new ciphers by subclassing the BlockCipher and StreamCipher classes.

The Crypto package is used in a remote system-access application in Chapter 28, "Accessing Remote Systems Securely."