Previous Table of Contents Next


Coding

The final step in the JPEG process is coding the quantized images. The JPEG coding phase combines three different steps to compress the image. The first changes the DC coefficient at 0,0 from an absolute value to a relative value. Since adjacent blocks in an image exhibit a high degree of correlation, coding the DC element as the difference from the previous DC element typically produces a very small number. Next, the coefficients of the image are arranged in the “zig-zag sequence.” Then they are encoded using two different mechanisms. The first is run-length encoding of zero values. The second is what JPEG calls “Entropy Coding.” This involves sending out the coefficient codes, using either Huffman codes or arithmetic coding depending on the choice of the implementer.

The Zig-Zag Sequence

One reason the JPEG algorithm compresses so effectively is that a large number of coefficients in the DCT image are truncated to zero values during the coefficient quantization stage. So many values are set to zero that the JPEG committee elected to handle zero values differently from other coefficient values.

Instead of relying on Huffman or arithmetic coding to compress the zero values, they are coded using a Run-Length Encoding (RLE) algorithm. A simple code is developed that gives a count of consecutive zero values in the image. Since over half of the coefficients are quantized to zero in many images, this gives an opportunity for outstanding compression.

One way to increase the length of runs is to reorder the coefficients in the zig-zag sequence. Instead of compressing the coefficient in row-major order, as a programmer would probably do, the JPEG algorithm moves through the block along diagonal paths, selecting what should be the highest value elements first, and working its way toward the values likely to be lowest.

The actual path of the zig-zag sequence is shown in Figure 11.12. In the code used in this chapter, the diagonal sequences of quantization steps follow exactly the same lines, so the zig-zag sequence should be optimal for our purposes.


Figure 11.12  The path of the zig-zag sequence.

Implementing the zig-zag sequence in C is probably done best using a simple lookup table. In our sample code for this chapter, the sequence is coded as part of a structure that can be accessed sequentially to determine which row and column to encode:

struct zigzag {
 int row;
 int col;
} ZigZag[ N * N ] =

{
  {0, 0},
  {0, 1}, {1, 0},
  {2, 0}, {1, 1}, {0, 2},
  {0, 3}, {1, 2}, {2, 1}, {3, 0},
  {4, 0}, {3, 1}, {2, 2}, {1, 3}, {0, 4},
  {0, 5}, {1, 4}, {2, 3}, {3, 2}, {4, 1}, {5, 0},
  {6, 0}, {5, 1}, {4, 2}, {3, 3}, {2, 4}, {1, 5}, {0, 6},
  {0, 7}, {1, 6}, {2, 5}, {3, 4}, {4, 3}, {5, 2}, {6, 1}, {7, 0},
  {7, 1}, {6, 2}, {5, 3}, {4, 4}, {3, 5}, {2, 6}, {1, 7},
  {2, 7}, {3, 6}, {4, 5}, {5, 4}, {6, 3}, {7, 2},
  {7, 3}, {6, 4}, {5, 5}, {4, 6}, {3, 7},
  {4, 7}, {5, 6}, {6, 5}, {7, 4},
  {7, 5}, {6, 6}, {5, 7},
  {6, 7}, {7, 6},
  {7, 7}
};

The C code that sends each of the DCT results to the compressor follows. Note that instead of directly looking up each result, we instead determine which row and column to use next by looking it up in the zig-zag structure. We then encode the element determined by the row and column from the zig-zag structure.

 for ( i = 0 ; i < ( N * N ) ; i ++ ) {
  row = ZigZag[ i ].row;
  col = ZigZag[ i ].col;
  result = DCT[ row ][ col ] / Quantum[ row ][ col ];
  OutputCode( output_file, ROUND( result ) );
 }

Entropy Encoding

After converting the DC element to a difference from the last block, then reordering the DCT block in the zig-zag sequence, the JPEG algorithm outputs the elements using an “entropy encoding” mechanism. The output has RLE built into it as an integral part of the coding mechanism. Basically, the output of the entropy encoder consists of a sequence of three tokens, repeated until the block is complete. The three tokens look like this:

Run Length: The number of consecutive zeros that preceded the current element in the DCT output matrix.
Bit Count: The number of bits to follow in the amplitude number.
Amplitude: The amplitude of the DCT coefficient.

The coding sequence used in this chapter’s test program is a combination of Run Length Encoding and variable-length integer coding. The run-length and bit-count values are combined to form a code that is output. The bit count refers to the number of bits used to encode the amplitude as a variable-length integer.

The variable-length integer coding scheme takes advantage of the fact that the DCT output should consist of mostly smaller numbers, which we want to encode with smaller numbers of bits. The bit counts and the amplitudes which the encode follow.


Bit Count Amplitudes
1 -1, 1
2 -3 to -2, 2 to 3
3 7 to -4, 4 to 7
4 -15 to -8, 8 to 15
5 -31 to -16, 16 to 31
6 -63 to -32, 32 to 64
7 -127 to -64, 64 to 127
8 -255 to -128, 128 to 255
9 -511 to -256, 256 to 511
10 -1023 to -512, 512 to 1023

Note that each bit count encodes a symmetrical set of high and low values. The values skipped over in the middle will be encoded with a lower bit count from one in the table.

While this form of variable-bit coding is not quite as efficient as Huffman coding, it works fairly well, particularly if the data performs as expected, which means smaller values dominate and larger values are rare.


Previous Table of Contents Next