The Data Compression Book-:Index

Table of Contents

Index

ACT (Archive Comparison Test), 25

adaptive compression, 76-77, 527. See also adaptive Huffman coding

dictionary-based compression, 203-207

example (QIC-122), 204-207

with graphic files, 325-326

Adaptive Differential Pulse Code Modulation (ADPCM), 319, 527

adaptive Huffman coding, 77-112

algorithm for, 82

code for, 88-89, 101-112

compression program, 91, 101-112

EncodeSymbol routine, 92-95

escape code in, 83-84

expand program, 91-92, 101

incoming symbols, 100-101

initialization of array, 89-90

overflow problem with, 84-87

rescaling, benefit of, 87-88

swapping in, 81, 96-97

symbols not used, coding space for, 82-83

updating tree in, 77-80, 95-100

adaptive modeling, 18-20, 155-162, 527. See also

adaptive compression; adaptive Huffman coding

compression in, 19

decompression in, 19

ADC. See analog-to-digital converter

AddFileListToArchive() routine (CARMAN), 399-400

Add files command (CARMAN), 382-383

AddString() routine (LZSS), 235-238

Adobe, 213

ADPCM (Adaptive Differential Pulse Code Modulation), 319, 527

affine functions, 462-463

affine maps, 476-477

AHUFF.C, 101-112

statistics for, 512-514

alphabets, 527

American National Standards Institute (ANSI), 3

analog-to-digital converter (ADC), 292-295, 311, 312

ANSI (American National Standards Institute), 3

ANSI C, 3-6

prototyping in, 43-44

Apple, 213-214, 298

ARC (compression program), 24, 209, 210, 528

Archive Comparison Test (ACT), 25

archives, 527

ARITH.C, 123, 136-152

statistics for, 512-514

ARITH-1.C., 156-162

code for, 156-158

escape code as fallback with, 159-161

improvements to, 161-162

ARITH-N.C, 163-200

code for, 172-200

escape possibilities, 164-165

flush code for, 170

memory and implementation of, 170

order(-1) table for, 169

order(-2) table for, 169-170

potential improvements to, 171

redesign of data structures for, 166-169

scoreboarding with, 165-166

statistics for, 512-514

updating ARITH-1.C. for, 163-164

arithmetic coding, 16-17, 114-152, 528

code for, 123, 136-152

compression program, 124-125, 136-152

decoding process in, 116-117, 121-122, 133-136

encoding process in, 114-116, 120-121, 130-133

expansion program, 125-126

floating-point math with, 118

flushing encoder in, 133

Huffman coding, advantages over, 113-114, 122-123

initialization

of encoder, 130

of model, 126-130

integer registers, using, 118-120

output from, 114

totals[] array, building, 130

underflow, handling, 120-121

ARJ (compression program), 24, 209, 210, 299-302, 381, 528

Association for Shareware Professionals (ASP), 534-535

AT&T, 290

audio, digital. See digital audio

average_boundaries() function (FRAC.C), 477

Barnsley, Michael, 458-459, 462

BITIO.C, 35-43

BITIO.H, 36-38

black cube, 298

block coding, 528

“blocky” images, 469

Boss, Roger, 458, 510

BuildFileList() routine (CARMAN), 393-394

build_model() routine (arithmetic coding), 124, 127-128

build_totals() routine (arithmetic coding), 130

build_tree() (HUFF.C), 58

C (programming language)

issues in writing portable, 3-7

reasons for using, 2-3

versions of, 3

CAR files, 382

structure of, 384

CARMAN, 381-455

archive files, opening, 396-398

CAR files in, 382, 384

code, 408-455

command-line processing in, 390-392

command set, 382-383

file list, generating, 392-396

headers, 384-390

CRC, header, 388-390

storing, 385-388

main processing loop of, 399-408

extraction, file, 406-408

input file, skipping/copying, 404-405

insertion, file, 405-406

workspace cleanup, 408

CCITT (Consultive Committee for International Telegraph and Telephone), 22, 326, 528

Central Point Software, 210

CHEETAH.GS, 340, 375-379

CHURN.C, 517-526

classify_domains() function (FRAC.C), 474-475

Cleary, John, 162

codecs, 312-314

codes, 528

coding, 15-17, 528

adaptive, 76-77

modeling vs., 12-13

Collage Theorem, 462

color images, 344-345

fractal compression of, 461

COMPACT (compression program), 23, 529

COMPAND.C, 314-318

companding, 310-318

code, 314-318

codecs with, 312-314

and linear conversion, 311-312

and minimum resolution, 310-311

compilers, 4

COMPRESS (compression program), 23, 208, 529

and LZW, 262

compressed archive files, 381-382

compress_init() function (FRAC.C), 473-474

compression, lossy vs. lossless, 11-12

compression algorithms

adaptive Huffman coding, 91, 101-112

arithmetic coding, 124-125, 136-152

DCT.C, 346-348

fractal compression, 461-467, 472

LZ77, 216-218

LZSS, 227, 231-242

AddString() routine, 235-238

DeleteString() routine, 238-240

exit code, 235

initialization, 232-233

main loop, 233-235

support routines, binary tree, 240-242

LZW, 262-263, 270-271

compression ratios, 7, 529

compress_range() function (FRAC.C), 475-476

Compton’s, 213

Compuserve Information Service, 24, 210

constants, in LZSS, 228-229

Consultive Committee for International Telegraph and Telephone (CCITT), 22, 326, 528

Contractive Mapping theorem, 460

ContractNode() routine (LZSS), 240-241

convert_int_to_symbol() routine

arithmetic coding, 124-125, 131, 135-136

statistical modeling, 160-161

Cosine Transform Matrix, 333-335

count_bytes() routine

arithmetic coding, 128

HUFF.C, 56

CPT (compression program), 25

CPU, and arithmetic coding, 16, 17

Cyclical Redundancy Check (CRC), 529

DAC. See digital-to-analog converter

data structures, in highest-order modeling, 166-169

DCLZ, 211

DCT. See Discrete Cosine Transform

DCT.C, 345-370

code for, 357-370

compression module, 346-348

expansion algorithm, 353-354

forward DCT routine, 349-350

initialization of, 348-349

input codes, 355-356

OutputCode() routine, 351-353

ReadDCTData() routine, 354-355

WriteDCTData() routine, 350-351

decode_string() routine (LZW), 272-273

DecodeSymbol routine (adaptive Huffman coding), 91

decompression algorithms

adaptive Huffman coding, 91-92, 101

arithmetic coding, 125-126

DCT.C, 353-354

fractal images, 467-469, 477

LZ77, 218-219

LZSS, 242-244

potential improvements to, 243-244

LZW, 264-270, 271-273

implementation, 267

tree maintenance and navigation, 268-270

undefined code, problem with, 266-267

Delete files command (CARMAN), 383

DeleteString() routine (LZSS), 238-240

dictionaries, 530

dictionary-based compression, 20, 201-214

adaptive methods with, 203-207

example (QIC-122), 204-207

ARC, 209

COMPRESS (program), 208

example of, 201-202

general-purpose programs, 210-211

with graphics files, 322-323

hardware-specific programs, 211-212

history of, 207-208

patented software, 212-214

static dictionaries, 202-203

differentiation modulation, 324-325

digital audio

low-pass filters in, 293-294

PC-based sound, 297-299

and range of hearing, 290

replay of sound, 293

and sampling, 291-292, 294-297

waveforms, 290-291

Digital Signal Processor (DSP), 22

digital-to-analog converter (DAC), 293, 295, 297, 311-312

Discrete Cosine Transform (DCT), 22-23, 327-331, 529

code, 357-370

and entropy encoding, 343-344

implementing, 332-333

matrix multiplication, 333-335

inverse, 330, 356-357

output of, 335-336

and quantization, 337-340

sample program, utilizing, 345-370

compression module, 346-348

expansion algorithm, 353-354

forward DCT routine, 349-350

initialization, 348-349

input DCT codes, 355-356

input format, 346

Inverse DCT, 356-357

OutputCode() routine, 351-352

ReadDCTData() routine, 354-355

WriteDCTData() routine, 350-351

two-dimensional, 330

usefulness of, 331-332

and zig-zag sequence, 341-343

Discrete Fourier Transform, 335

Disk Doubler, 211

DOS, compression programs in, 23-24

Dr. Dobb’s Journal, 7

DSP. See Digital Signal Processor

8-bit characters, 4

encode_ symbol() routine

adaptive Huffman coding, 91-95

arithmetic coding, 131

encoding, 528

entropy, 13-14, 530

entropy encoding, 343-344

ERRHAND.C, 53-54

ERRHAND.H, 53

errors, quantization, 295

escape codes, 530

adaptive Huffman coding, 83-84

ARITH-1.C., 159-161

ARITH-N.C, 164-165

expansion algorithms. See decompression algorithms

Extract() routine (CARMAN), 405-408

Fano, R. M., 27

Fast Fourier Transform (FFT), 22, 327-329

Fibonacci function, 84-85

FIF (Fractal Image Format), 459

filters, low-pass, 293-294

find_map() function (FRAC.C), 476-477

FindNextNode() routine (LZSS), 241-242

finite-context modeling, 154

first pass, 155

Fisher, Yuval, 458, 510

floating-point math, with arithmetic coding, 118

flush_arithmetic_encoder() routine (arithmetic coding), 133

flushing, with highest-order modeling, 170

Forward DCT() routine, 349-350

FRAC.C, 471-505

classify_domains() function of, 474-475

code, 477-505

compress_init() function, 473-474

compression results, 506-509

compress_range() function of, 475-476

find_map() function of, 476-477

main compression module, 472-473

traverse_image() function of, 477

fractal image compression, 457-458

decoding, 467-469

history of, 457-459

iterated function systems, 459-463

image compression with, 461-467

mathematics of, 460-461

partitioned iterated function systems, 463-467

patent concerns, 509-510

and resolution independence, 469-471

sample program, 471-509

code, 477-505

decompression module, 477

domain classification in, 474-475

image partitioning in, 475-476

initialization function, 473-474

main compression module, 472-473

optimal affine maps, finding, 476-477

results, compression, 506-509

Fractal Image Format (FIF), 459

FractInt program, 458

freeware, 298-299, 530

function body, 6-7

function declarations, parameters in, 43-44

function prototypes, 4-6

G.721 algorithm, 318-319

ghost buffers, 244

Gibb’s phenomenon, 457

GIF format, 210, 212-213

Gilchrist, Jeff, 25

global variables, in LZSS, 229-231

Graphical User Interface (GUI), 321

graphics compression

fractal compression. See Fractal image compression

lossy. See Lossy graphics compression

Group 3 FAX, 530

GS.C, 346

code, 370-372

GSDIFF.C, 372-375, 466-467

GUI (Graphical User Interface), 321

hard disk, and dictionary-based compression, 211

hashing function (LZW), 268-269

header CRC, 388-390

headers, in CARMAN, 384-390

hearing, range of, 290

Hewlett-Packard, 211

higher-order modeling, 153-154

highest-order modeling, 162-170

escape possibilities, 164-165

flush code for, 170

memory and implementation of, 170

order(-1) and (-2) tables for, 169-170

and redesign of data structures, 166-169

scoreboarding with, 165-166

updating ARITH-1.C. code for, 163-164

HUFF.C

code, 60-73

command lines for, 74

statistics for, 512-514

HUFF-E, command lines for, 74

Huffman, D. A., 15, 16, 531

Huffman coding, 15, 54, 530-531. See also Adaptive Huffman coding

advantages of arithmetic coding over, 122-123

arithmetic coding vs., 16-17

BITIO.C as illustration of, 35-43

difficulties with, 113-114

model of, 12-13

probability table in, 75-76

Shannon-Fano coding vs., 34-35

Huffman table, 18

Huffman tree, 18

algorithm for building, 31-35

building, 55-58

counts, saving, 56-57

symbols, counting, 56

using, 59-60

Hutchinson, J., 458

IBM, 213-214, 297

IDCT (Inverse DCT), 329-330, 356-357

IFS. See Iterated Function Systems

information theory, 13-15, 27, 531

initialization

adaptive Huffman coding, 89-90

arithmetic coding, 126-130

DCT.C, 348-349

Discrete Cosine Transform, 348-349

fractal image compression, 473-474

LZSS, 232-233

initialize_model() routine (adaptive Huffman coding), 76

InitializeTree(tree) (adaptive Huffman coding), 90

InitTree() (LZSS), 233

input_counts() (HUFF.C), 57

integer registers, with arithmetic coding, 118-120

Intel, 298

International Standards Organization (ISO), 22, 326, 338, 531

International Telegraph and Telephone Consultative Committee (CCITT), 528

Internet, 25, 298

Inverse DCT (IDCT), 329-330, 356-357

ISO. See International Standards Organization

Iterated Function Systems (IFS), 459-467. See also Partitioned Iterated Function Systems

image compression with, 461-467

mathematics of, 460-461

Iterated Systems, Inc., 458, 509-510

Jacobs, Bill, 458, 510

Jacquin, Arnaud, 458

Joint Photographic Experts Group (JPEG), 22, 23, 326, 531. See also JPEG standard

JPEG. See Joint Photographic Experts Group

JPEG standard, 326-345

coding, 340-345

compression algorithm, 327

Discrete Cosine Transform with, 327-340

drawbacks of, 457

Jung, Robert, 209

Katz, Phil, 24

K&R C, 3-7

Lau, Raymond, 25

League for Programming Freedom, 213

Lehman, Bruce, 214

Lempel, Abraham, 21, 207, 532, 536

LHArc (compression program), 24, 209, 210, 531

library function calls, 3-4

LIFS (Local Iterated Function Systems), 459

Linear Predictive Coding (LPC), 319, 532

LISAW.GS, 464-471

List files command (CARMAN), 383

Local Iterated Function Systems (LIFS), 459

lossless compression, 11-12, 532

of sound, 299-302

lossy compression (generally), 11, 22-23, 532

of sound, 302-303

lossy graphics compression, 321-379

and adaptive coding, 325-326

and differential modulation, 324-325

JPEG standard for, 326-345

coding, 340-345

compression algorithm, 327

Discrete Cosine Transform with, 327-340

and quantization, 337-340

results, analysis of, 375-379

sample program, 345-370

code, 357-370

compression module, 346-348

expansion algorithm, 353-354

forward DCT routine, 349-350

initialization, 348-349

input DCT codes, 355-356

input format, 346

Inverse DCT, 356-357

OutputCode() routine, 351-352

ReadDCTData() routine, 354-355

WriteDCTData() routine, 350-351

statistical and dictionary compression vs., 322-323

support programs, 370

GS.C, 370-372

GSDIFF.C, 372-375

low-pass filters, 293-294

LPC (Linear Predictive Coding), 319, 532

LZ77, 21, 207-208, 209, 215-221, 532

compression algorithm, 216-218

decompression algorithm, 218-219

as “greedy” algorithm, 226

inefficiency of, 221

limitations of, 255

LZ78 vs., 256-257

LZSS vs., 262

problems with, 220-221

and QIC-122, 205

text window in, 215-216

LZ78, 21-22, 207-208, 209, 255-262, 287-288, 532. See also LZW

functioning of, 257-260

implementation of, 260-262

LZ77 vs., 256-257

tree in, 260-261

LZSS, 221-253, 532

and ARJ 2.10, 300-301

balanced trees, maintaining, 225-226

binary tree in, 230-231

and CARMAN, 383, 405

code for, 244-253

compression program, 227, 231-242

AddString() routine, 235-238

DeleteString() routine (LZSS), 238-240

exit code, 235

initialization, 232-233

main loop, 233-235

support routines, binary tree, 240-242

constants in, 228

data structures in, 222-225

expansion program, 242-244

potential improvements to, 243-244

global variables in, 229-231

as “greedy” algorithm, 226-226

LZ77 vs., 262

macros in, 228-229

text window in, 221-222

LZSS.C, 244-253

statistics for, 512-514

LZW, 208, 210, 212, 213, 262-288, 532

code, 273-278

compression program, 262-263, 270-271

decompression program, 264-270, 271-273

and LZW implementation, 267

tree maintenance and navigation, 268-270

undefined code, problem with, 266-267

LZW15V.C as improvement of, 279

patent concerns, 287-288

LZW12.C

code, 273-278

decompression routine in, 271-273

hashing function of, 268-269

“special” nodes in, 270

statistics for, 512-514

LZW15V.C

code, 280-287

as improvement over LZW, 279

statistics for, 512-514

Macintosh, compression programs for, 24-25

macros, in LZSS, 228-229

MAIN-C.C, 45-50

MAIN-E.C, 50-53

MAIN.H, 45-46

Mandelbrot, Benoit, 457, 458

Mandelbrot set, 457-458

MassageMSDOSFileName() routine (CARMAN), 394-396

measurement of compression, 7

Microcom Corp., 210

Microsoft, 211, 213

MNP-5, 210

modeling, 17-18. See also adaptive modeling; highest-order modeling; Statistical modeling

coding vs., 12-13

finite-context, 154

higher-order, 153-154

models, 532

static, 535

modulation, differentiation, 324-325

Moving Pictures Experts Group (MPEG), 22, 23, 532-533

MPEG-II, 22

MPEG (Moving Pictures Experts Group), 22, 23, 532-533

MS-DOS, CARMAN with, 392-396

NeXT Computer, 298

nodes[] array, 88-89

Nyquist, Harry, 295

Nyquist Rate, 295, 533

on-line services, 298

OpenArchiveFiles() command, 397-398

Oracle, 213

order-0 modeling, 154

order(-1) table (ARITH-N.C), 169

order(-2) table (ARITH-N.C), 169

order (of model), 533

OutputBits routine (adaptive Huffman coding), 93

OutputCode() routine (DCT.C), 351-353

output_counts() routine

arithmetic coding, 129

HUFF.C, 57

overflow, in adaptive Huffman coding, 84-87

p x 64, 533-534

P6 processor (Intel), 298

parameters, in function declarations, 43-44

ParseArguments() routine (CARMAN), 390-391

Partitioned Iterated Function Systems (PIFS), 458, 467, 509

compression with, 463-467

patent concerns

dictionary-based compression programs, 212-214

fractal image compression, 509-510

LZW algorithm, 287-288

PC Backup, 210

PC-based sound, 297-299

PCX format, 322

phrases, 533

PIFS. See Partitioned Iterated Function Systems

PKWare, 24

PKZ204.EXE, 24

PKZIP, 24, 209, 210, 381, 533

PNG format, 209, 213

Print files command (CARMAN), 383

probability table (Huffman coding), 75-76

prototypes, 43-44

pseudocode, 2

QIC-122, 205-207, 211, 212

quality factor, 345

quantization, 337-340

errors in, 295

matrix for, 338-340

Radford, Neal, 161

ReadDCTData() routine, 354-355

redundancy, 13

redundant information, 13

refine_image() function (FRAC.C), 477

Replace files command (CARMAN), 383

rescaling, in adaptive Huffman coding, 87-88

resolution independence, 469-471

RLE (Run-Length Encoding), 341, 534

root mean square (RMS), 346, 466-467, 507

Run-Length Encoding (RLE), 341, 534

SAMPLE-3.RAW., 301

sampling, 291-292

of variables, 294-297

scale_counts() routine (arithmetic coding), 128-129

scoreboarding, with highest-order modeling, 165-166

second pass, 155

Shannon, Claude, 13, 14, 27, 202, 534

Shannon-Fano coding, 15, 27-29, 534

Huffman coding vs., 34-35

Shannon-Fano tree, 28-29, 32

algorithm for building, 29-31

shareware, 298-299, 534-535

SILENCE.C, 305-309

silence compression, 303-310

code, 305-309

effectiveness of, 310

16-bit machines, 4

sliding window compression. See LZ77; LZSS

Sloan, Alan, 458-459

Sound Blaster, 298

speech compression, 289-319

ADPCM algorithm, 319

companding, 310-318

code, 314-318

codecs with, 312-314

linear conversion with, 311-312

and minimum resolution, 310-311

and digital audio, 289-299

low-pass filters, 293-294

PC-based sound, 297-299

range of hearing, 290

replay, 293

sampling, 291-292

variables, sampling, 294-297

waveforms, 290-291

G.721 algorithm, 318-319

Linear Predictive Coding, 319

lossless compression, 299-302

lossy compression, 302-303

silence compression, 303-310

code, 305-309

effectiveness of, 310

SQ program, 23, 535

Stac Electronics, 204-205, 210, 211, 213

Stacker, 211

static dictionary-based compression, 20, 202-203

static models, 535

statistical modeling, 14, 18-20, 153-200

and adaptive compression, 155-162

escape code as fallback, using, 159-161

example (ARITH-1.C.), 156-158

improvements to, 161-162

and finite-context modeling, 154

with graphics files, 322

and higher-order modeling, 153-154

highest-order modeling (ARITH-N.C), 162-170

code, 172-200

data structures, redesign of, 166-169

escape probabilities, 164-165

flush code for, 170

implementation, memory and, 170

order(-1) table for, 169

order(-2) table for, 169-170

scoreboarding with, 165-166

updating ARITH-1.C for, 163-164

potential improvements to, 171

statistics, compression, 7, 511-515

__STDC__ macro, 74

StuffIt (compression program), 25

swap_nodes() routine (adaptive Huffman coding), 82

swapping, in adaptive Huffman coding, 81, 96-97

switching equipment, 290

symbols, 535

TAR, 24, 535-536

Test files command (CARMAN), 383

test program (CHURN.C), 517-526

text windows

LZ77, 215-216

LZSS, 221-222

32-bit machines, 4

time domain representation, 328

tokens, 536

traverse_image() function (FRAC.C), 477

tree(s). See also Huffman tree

in LZ78, 260-261

in LZSS, 225-226, 230-231

in LZW, 268-270

updating, in adaptive Huffman coding, 77-80, 95-100

underflow, 120-121, 132-133

Unisys, 209, 212, 287

UNIX, 4

CARMAN with, 392-393

compression programs in, 23

dictionary-based compression with, 208, 209

update exclusion, 163-164

UpdateModel() routine (adaptive Huffman coding), 76, 77, 91-92

U.S. Patent Office, 213-214

utility programs, 298-299

V.4bis, 212

V.42bis, 211

variables, sampling, 294-297

vector quantization, 457, 463-467

VGA display, 321, 346

VOC-HDR.EXE, 298

waveforms, 290-291

Welch, Terry, 208, 209, 212, 287, 536

WinZIP, 24

Witten, Ian H., 161

World-Wide Web, 298

WriteDCTData() routine, 350-351

WriteFileHeader() routine (CARMAN), 388

Xtract files command (CARMAN), 383

Yoshizaki, Haruyasu, 24, 209

zig-zag sequence, 341-343

ZIP files, 24, 25

ZipIt V1.2, 25

Ziv, Jacob, 21, 207, 532, 536

Table of Contents