

Programs to Know

General-purpose data-compression programs have been available only for the past ten years or so. It wasn’t until around 1980 that machines with the power to do the analysis needed for effective compression started to become commonplace.

In the Unix world, one of the first general-purpose compression programs was COMPACT. COMPACT is a relatively straightforward implementation of an order-0 compression program that uses adaptive Huffman coding. COMPACT produced good enough compression to make it useful, but it was slow. COMPACT was also a proprietary product, so it was not available to all Unix users.

A somewhat improved program, compress, became available to Unix users a few years later. It is a straightforward implementation of the LZW dictionary-based compression scheme. It gave significantly better compression than COMPACT, and it ran faster. Even better, the source code to compress was readily available as a public-domain program, and it proved quite portable. The program is still in wide use among Unix users, though its continued use is questionable due to the LZW patent held by Unisys.
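To make the dictionary-based idea concrete, here is a minimal sketch of LZW in Python. It is an illustration only, not the compress source: the function names are hypothetical, codes are kept as a list of integers rather than packed into variable-width bit fields, and the dictionary is allowed to grow without limit, whereas a practical program must cap its size.

```python
def lzw_compress(data: bytes) -> list[int]:
    """Return a list of LZW codes for data (illustrative, unpacked codes)."""
    # Start with one dictionary entry per possible byte value.
    table = {bytes([i]): i for i in range(256)}
    next_code = 256
    current = b""
    codes = []
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            # Keep extending the phrase while it is already known.
            current = candidate
        else:
            # Emit the longest known phrase and learn the new one.
            codes.append(table[current])
            table[candidate] = next_code
            next_code += 1
            current = bytes([value])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes: list[int]) -> bytes:
    """Rebuild the original bytes from a list of LZW codes."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            # The code refers to the phrase that is being defined right now.
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        # Mirror the entry the compressor added at this step.
        table[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return b"".join(out)
```

The decompressor never receives the dictionary. It rebuilds the same entries the compressor created, one step behind, which is why no table needs to be stored in the compressed file.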

In the early 1980s, desktop users of CP/M and MS-DOS systems were first exposed to data compression through the SQ program. SQ performed order-0 compression using a static Huffman tree passed in the file. SQ gave compression comparable to that of the COMPACT program, and it was widely used by early pioneers in desktop telecommunications.
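The order-0 static Huffman approach can be sketched in a few lines of Python. This is a hedged illustration rather than a reconstruction of SQ: the function names are hypothetical, the output is a string of '0' and '1' characters instead of packed bits, and the code table is returned alongside the bits where a real program would write the tree into the file header.

```python
import heapq
from collections import Counter


def huffman_code(data: bytes) -> dict[int, str]:
    """Build a Huffman code from whole-file byte counts (an order-0 model)."""
    freq = Counter(data)
    if not freq:
        return {}
    if len(freq) == 1:
        # A lone symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}
    # Heap entries are (weight, unique tie-breaker, {symbol: partial code}).
    heap = [(weight, sym, {sym: ""}) for sym, weight in freq.items()]
    heapq.heapify(heap)
    tie = 256
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Merging two subtrees prepends one bit to every code beneath them.
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


def huffman_encode(data: bytes) -> tuple[dict[int, str], str]:
    """Return the code table and the encoded bit string."""
    code = huffman_code(data)
    return code, "".join(code[b] for b in data)


def huffman_decode(code: dict[int, str], bits: str) -> bytes:
    """Decode a bit string using the code table that accompanied it."""
    lookup = {c: s for s, c in code.items()}
    out = bytearray()
    current = ""
    for bit in bits:
        current += bit
        if current in lookup:
            out.append(lookup[current])
            current = ""
    return bytes(out)
```

Because the code is built from counts taken over the entire file before any output is written, the encoder makes two passes over the data, and the decoder cannot work without the table. That is the reason a static scheme has to pass the tree in the file.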

As in the Unix world, Huffman coding soon gave way to LZW compression with the advent of ARC. ARC is a general-purpose program that performs both file compression and archiving, two features that often go hand in hand. (Unix users typically archive files first using TAR, then they compress the entire archive.) ARC could originally compress files using run-length encoding, order-0 static Huffman coding, or LZW compression. The original LZW code for ARC appears to be a derivative of the Unix compress code.

Due to the rapid distribution possible using shareware and telecommunications, ARC quickly became a de facto standard and began spawning imitators right and left. ARC underwent many revisions but has faded in popularity in recent years. Today, if there is a compression standard in the DOS world, it is the shareware program PKZIP, written by Phil Katz.

PKZIP is a relatively inexpensive program that offers both superior compression ratios and superior compression speed. At this writing, the current shareware version is PKZIP V2.04g, which can be found on many bulletin boards and online forums. Katz’s company, PKWARE, also sells a commercial version. Note that V2.04g of PKZIP can create ZIP files that are not backward compatible with previous versions. On CompuServe, many forums have switched to the new format for files kept in the forum libraries. Usually, a copy of the distribution file PKZ204.EXE is also found in the forum library. For example, you can find this file on 23 different forums on CompuServe. Because Phil Katz has placed the file format in the public domain, there are many other archiving/compression utilities that support the ZIP format. A search on CompuServe, using the File Finder facility with the keyword "PKZIP," found 580 files, most of which were utilities rather than data files. Programs like WinZip, which integrates with the Windows File Manager, provide a modern interface to a venerable file format.

In DOS, two strong alternatives to PKZIP are LHArc and ARJ. LHArc comes from Japan and has several advantages over other archiving/compression programs. First, the source to LHArc is freely available and has been ported to numerous operating systems and hardware platforms. Second, the author of LHArc, Haruyasu Yoshizaki (Yoshi), has explicitly granted the right to use his program for any purpose, personal or commercial.

ARJ is a program written by Robert Jung (robjung@world.std.com) and is free for non-commercial use. It has managed to achieve compression ratios slightly better than the best LHArc can offer. It is available for DOS, Windows, Amiga, Macintosh, and OS/2, and it includes source code.

On the Macintosh platform, there are also many archiving/compression programs that support file formats found on DOS and Unix. In addition to LHArc and ARJ, there are programs like ZipIt V1.2 that let you work with ZIP files. However, the predominant archiving/compression program is StuffIt, a shareware program written by Raymond Lau. On bulletin boards and online services that are geared to Macintosh users, you will find more SIT files (StuffIt files) than any other format. Another popular Macintosh format is CPT (created by the Compact Pro program), but it is not as widespread as StuffIt.

In general, the trend is toward greater interoperability among platforms and formats. Jeff Gilchrist (jeffg@mi.net) distributes a monthly Archive Comparison Test (ACT) that compares sixty different DOS programs for speed and efficiency, working on a variety of files (text, binary executables, graphics). If you have Internet access, you can view the current copy of ACT by fingering s0b8@jupiter.sun.csd.unb.ca. You can also view ACT using the World-Wide Web at http://www.mi.net/act/act.html. At this writing, one promising new archiver on Gilchrist’s ACT list is X1, written by Stig Valentini (sv@id.dtu.dk). The current version is 0.90, still in beta stage. This program supports thirteen different archive formats, including ZIP, LHA, ARJ, HA, PUT, TAR+GZIP (TGZ), and ZOO.

As mentioned earlier, you can find archive programs on CompuServe, America Online, and other online services and bulletin boards. On the Internet, there are several FTP repositories. One is at oak.oakland.edu, in the directory /SimTel/msdos/archiver. Another is at garbo.uwasa.fi, in the directory /pc/arcers.

