Previous Table of Contents Next


Chapter 11
Lossy Graphics Compression

Desktop computers communicate information primarily via their screens, so graphics are a major concern for computer programmers and designers. Programmers spend enormous amounts of time and effort trying to accommodate the proliferation of the Graphical User Interface (GUI). Millions of man hours and billions of dollars worth of equipment are being allocated just to make improvements in the way programs display data.

The money being spent on computers equipped to perform properly under GUIs such as Microsoft Windows or Motif has created a vast array of computers capable of displaying complex graphical images, with resolution approaching that of conventional media, such as television or magazines. In turn, this capability has spawned new software designed to exploit these capabilities.

Programs using complex graphics are showing up in virtually every area of computing applications: games, education, desktop publishing, and graphical design, just to mention a few. These programs have one factor in common. The images they use consume prodigious amounts of disk storage.

In the IBM world, for example, the VGA display is probably the current lowest common denominator for high-quality color graphics. The VGA can display 256 simultaneous colors selected from a palette of 262,144 colors. This lets the VGA display continuous tone images, such as color photographs, with a reasonable amount of fidelity.

The problem with using images of photographic quality is the amount of storage required to use them in a program. For the previously mentioned VGA, a 256-color screen image has 200 rows of 320 pixels, each consuming a single byte of storage. This means that a single screen image consumes a minimum of 64K! It isn’t hard to imagine applications that would require literally hundreds of these images to be accessed. An on-line catalog for a retail sales outlet, for example, could easily have 1,000 images stored for immediate access. The problem is that 1,000 images of this quality would consume 64MB of storage. And this is not an unreasonable number: We are just beginning to see game programs being distributed on CD-ROM, due to the enormous amounts of storage required by screen images.

This chapter discusses the use of lossy compression techniques to achieve very high levels of data compression of continuous tone graphical images, such as digitized images of photographs.

Enter Compression

There has been an explosion of research into graphics storage during the last decade, and many interesting results have been published. In the late 1970s and early 1980s, most graphics compression concentrated on using conventional lossless techniques. Popular PC file formats now use techniques discussed earlier in the book to achieve savings ranging from 10 to 90 percent on graphics images. Well-known formats using compression include the PCX, GIF, and BMP standards.

As the use of stored graphical images increased, file formats such as PCX began to appear inadequate. Cutting file sizes in half certainly is a worthwhile thing to do, but developers and users Ywere filling their storage space up so fast that system requirements for multimedia systems appeared prohibitively expensive. Worse yet, the promise of full motion video on the desktop was simply not possible until some method was developed for radically reducing storage needs. Clearly, compression capabilities needed to improve, perhaps by orders of magnitude.

Statistical and Dictionary Compression Methods

Conventional programs and data on computers respond well to compression based on exploiting statistical variations in the frequency of both individual symbols and strings of symbols or phrases. Dictionary-based systems are in fact just statistical programs in disguise. Unfortunately, these types of compression don’t tend to do very well on continuous tone images.

The primary problem these programs have stems from the fact that pixels in photographic images tend to be well spread out over their entire range. If the colors in an image are plotted as a histogram based on frequency, the histogram is not as “spiky” as we would like for statistical compression to succeed. In fact, over the long run, histograms for live images from sources such as television tend to be flat. This means that each pixel code has approximately the same chance of appearing as any other, negating any opportunity for exploiting entropy differences.

Dictionary-based compression programs run into similar problems. Images based on scanned photographs just don’t have the right kind of data characteristics to create multiple occurrences of the same phrase. In a rasterized image, for example, a vertical structure such as the side of a house may give similar strings in many consecutive rows of a picture. Unfortunately, because of the vagaries of the real world, the same feature in each row will tend to be slightly different from the one before. Out of a string of twenty pixels, one or two will vary by a single step from the scans before and after. And while these differences are small enough that they are either undetectable or meaningless to the human eye, they throw a monkey wrench into the works of dictionary-based compression. Strings have to match exactly for this compression method to work. Because of minute variations, the length of matching strings tends to be small, which limits the effectiveness of compression.


Previous Table of Contents Next