The Data Compression Book: Speech Compression


Lossy compression is not necessarily an end in itself. We frequently use it in a two-phase process: a lossy stage followed by a lossless stage. Lossy compression often smooths out the data, which makes it even more suitable for lossless compression. So we get an extra benefit from lossy compression, above and beyond the compression itself.

Silence Compression

Silence compression on sound files is the equivalent of run-length encoding on normal data files. In this case, however, the runs we encode are sequences of relative silence in a sound file. This is a lossy technique because we replace the sequences of relative silence with absolute silence.

Figure 10.11 shows a typical sound sample that contains a long sequence of silence; the first two-thirds of it is silence. Note that although we call it “silence,” there are actually very small “blips” in the waveform. These are normal background noise and can be considered inconsequential.

Figure 10.11 A typical sound sample with a long sequence of silence.

A compression program for a sample like this needs to work with a few parameters. First, it needs a threshold value for what can be considered silence. With our eight-bit samples, for example, 80H is considered “pure” silence. We might want to consider any sample value within a range of plus or minus three from 80H to be silence.
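The threshold test described above can be sketched as a small function. This is only an illustration under stated assumptions: the names `SILENCE_CENTER`, `SILENCE_RANGE`, and `is_silent_sample` are hypothetical, and the tolerance of three is the example value from the text, not necessarily the one a real program would use.

```c
#include <assert.h>
#include <stdlib.h>

#define SILENCE_CENTER 0x80  /* "pure" silence for unsigned eight-bit samples */
#define SILENCE_RANGE  3     /* assumed tolerance: plus or minus three */

/*
 * Returns 1 if an eight-bit sample lies within SILENCE_RANGE of the
 * center point, and 0 otherwise.
 */
int is_silent_sample( int sample )
{
    assert( sample >= 0 && sample <= 255 );
    return abs( sample - SILENCE_CENTER ) <= SILENCE_RANGE;
}
```

With these values, samples from 7DH through 83H count as silence, and anything outside that band is treated as signal.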

Second, it needs a way to encode a run of silence. The sample program that follows uses a special SILENCE_CODE with a value of FFH to mark a run of silence. The SILENCE_CODE is followed by a single byte that gives the number of consecutive silent samples in the run.
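As a rough sketch of this encoding, the function below writes a run into a memory buffer as two-byte pairs. Because the count occupies a single byte, it assumes that a run longer than 255 samples is split into several pairs. The function name `encode_silence_run` and the in-memory buffer interface are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define SILENCE_CODE 0xff

/*
 * Writes a run of run_length silent samples to out as a series of
 * ( SILENCE_CODE, count ) pairs, with each count limited to 255.
 * Returns the number of bytes written.
 */
size_t encode_silence_run( unsigned char *out, size_t out_size,
                           size_t run_length )
{
    size_t written = 0;

    while ( run_length > 0 ) {
        size_t chunk = run_length > 255 ? 255 : run_length;
        assert( written + 2 <= out_size );   /* caller must supply room */
        out[ written++ ] = SILENCE_CODE;
        out[ written++ ] = (unsigned char) chunk;
        run_length -= chunk;
    }
    return written;
}
```

A run of 300 silent samples, for example, becomes the four bytes FF FF FF 2D: one pair with a count of 255 and one with a count of 45.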

Third, it needs a parameter that gives a threshold for recognizing the start of a run of silence. We wouldn’t want to start encoding silence after seeing just a single byte of silence. Because each encoded run costs two bytes (the SILENCE_CODE plus the count), encoding doesn’t even become economical until three bytes of silence are seen. We may want to experiment with values higher than three to see how they affect the fidelity of the recording.
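The break-even point can be checked with a little arithmetic. The sketch below assumes each run is stored as one or more two-byte pairs (a code byte plus a count byte, with counts limited to 255); the function name `run_savings` is hypothetical.

```c
/*
 * Returns the net number of bytes saved by replacing a run of
 * run_length silent samples with two-byte ( code, count ) pairs.
 * A negative result means the encoded form is larger than the input.
 */
long run_savings( long run_length )
{
    long pairs = ( run_length + 254 ) / 255;   /* pairs needed, rounded up */
    return run_length - 2 * pairs;
}
```

Under these assumptions a run of one byte loses a byte, a run of two breaks even, and a run of three is the first to save anything.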

Finally, we need another parameter that indicates how many consecutive non-silence codes need to be seen in the input stream before we declare the silence run to be over. Setting this parameter to a value greater than one filters out anomalous spikes in the input data. This can also cut back on noise in the recording.
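To illustrate how a stop threshold filters out spikes, the sketch below measures a silence run in a flat array of samples. It is a simplified model, not a buffered file reader: the names `is_silent` and `silence_run_length` are hypothetical, the tolerance of three is an assumed example value, and trailing non-silent samples at the end of the data are assumed not to belong to the run.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed silence test: within plus or minus three of 80H. */
static int is_silent( unsigned char s )
{
    return s >= 0x80 - 3 && s <= 0x80 + 3;
}

/*
 * Returns the length of the silence run that starts at samples[ 0 ].
 * The run ends only when stop_threshold consecutive non-silent samples
 * are seen, so shorter spikes are absorbed into the run.
 */
size_t silence_run_length( const unsigned char *samples, size_t count,
                           size_t stop_threshold )
{
    size_t pos;
    size_t loud = 0;   /* consecutive non-silent samples seen so far */

    assert( stop_threshold > 0 );
    for ( pos = 0 ; pos < count ; pos++ ) {
        if ( is_silent( samples[ pos ] ) )
            loud = 0;
        else if ( ++loud == stop_threshold )
            return pos + 1 - stop_threshold;
    }
    return count - loud;
}
```

For the samples 80 81 90 7F 80 A0 A1 80 (hex), a stop threshold of two gives a run of five samples, because the lone spike 90 is swallowed. A stop threshold of one ends the run after only two samples.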

The code to implement this silence compression follows. It incorporates a start threshold of five and a stop threshold of two, so we have to see five consecutive silence codes before we consider a run started, and two consecutive non-silence codes before we consider it over.

SILENCE.C by definition spends a lot of time looking ahead at upcoming input data. For example, to see if a silence run has really started, the program must look at the next five input values. To simplify this, the program keeps a look-ahead buffer full of input data. It never directly examines the data as it is read in via getc(). Instead, it looks at the bytes already read into the buffer. This makes it easy to write functions to determine if a silence run has been started or if one is now over.
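One common way to build such a look-ahead buffer is as a small circular array whose size is a power of two, so that indices can be wrapped with a bit mask instead of a modulo operation. The sketch below shows only the index arithmetic. The names `LOOKAHEAD_SIZE`, `LOOKAHEAD_MASK`, `peek_ahead`, and `advance` are hypothetical, and the caller is assumed to supply each new input value.

```c
#include <assert.h>

#define LOOKAHEAD_SIZE 8                       /* must be a power of two */
#define LOOKAHEAD_MASK ( LOOKAHEAD_SIZE - 1 )

/*
 * Returns the value that lies offset positions ahead of head,
 * wrapping around the end of the buffer.
 */
int peek_ahead( const int buffer[], int head, int offset )
{
    assert( offset >= 0 && offset < LOOKAHEAD_SIZE );
    return buffer[ ( head + offset ) & LOOKAHEAD_MASK ];
}

/*
 * Consumes the value at head by overwriting it with the next input
 * value, and returns the new head index.
 */
int advance( int buffer[], int head, int next_value )
{
    buffer[ head ] = next_value;
    return ( head + 1 ) & LOOKAHEAD_MASK;
}
```

Because the slot just consumed is immediately refilled, the buffer always holds the next LOOKAHEAD_SIZE input values, and the newest value sits LOOKAHEAD_SIZE - 1 positions ahead of the new head.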

/************************ Start of SILENCE.C ************************
*
* This is the silence compression coding module used in chapter 10.
* Compile with BITIO.C, ERRHAND.C, and either MAIN-C.C or MAIN-E.C
*/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include "bitio.h"
#include "errhand.h"
#include "main.h"

/*
* These two strings are used by MAIN-C.C and MAIN-E.C to print
* messages of importance to the user of the program.
*/
char *CompressionName = "Silence compression";
char *Usage = "infile outfile\n";

/*
* These macros define the parameters used to compress the silent
* sequences. SILENCE_LIMIT is the maximum size of a signal that can
* be considered silent, in terms of offset from the center point.
* START_THRESHOLD gives the number of consecutive silent codes that
* have to be seen before a run is started. STOP_THRESHOLD tells how
* many non-silent codes need to be seen before a run is considered
* to be over. SILENCE_CODE is the special code output to the
* compressed file to indicate that a run has been detected.
* SILENCE_CODE is always followed by a single byte indicating how
* many consecutive silence bytes are to follow.
*/

#define SILENCE_LIMIT   4
#define START_THRESHOLD 5
#define STOP_THRESHOLD  2
#define SILENCE_CODE    0xff
#define IS_SILENCE( c ) ( (c) > ( 0x7f - SILENCE_LIMIT ) && \
                        (c) < ( 0x80 + SILENCE_LIMIT ) )

/*
* BUFFER_SIZE is the size of the look-ahead buffer. BUFFER_MASK is
* the mask applied to a buffer index when performing index math.
*/
#define BUFFER_SIZE 8
#define BUFFER_MASK 7

/*
* Local function prototypes.
*/

#ifdef __STDC__

int silence_run( int buffer[], int index );
int end_of_silence( int buffer[], int index );

#else

int silence_run();
int end_of_silence();

#endif

/*
* The compression routine has the hard job here. It has to detect
* when a silence run has started and when it is over. It does this
* by keeping upcoming bytes in a look-ahead buffer. The buffer
* and the current index are passed to routines that check to
* see if a run has started or if it has ended.
*/

void CompressFile( input, output, argc, argv )
FILE *input;
BIT_FILE *output;

int argc;
char *argv[];
{
  int look_ahead[ BUFFER_SIZE ];
  int index;
  int i;
  int run_length;

  for ( i = 0 ; i < BUFFER_SIZE ; i++ )
   look_ahead[ i ] = getc( input );
  index = 0;
  for ( ; ; ) {
   if ( look_ahead[ index ] == EOF )
    break;
/*
* If run has started, I handle it here. I sit in the do loop until
* the run is complete, loading new characters all the while.
*/
   if ( silence_run( look_ahead, index ) ) {
    run_length = 0;
    do {
     look_ahead[ index++ ] = getc( input );
     index &= BUFFER_MASK;
     if ( ++run_length == 255 ) {
      putc( SILENCE_CODE, output->file );
      putc( 255, output->file );
      run_length = 0;
     }
    } while ( !end_of_silence( look_ahead, index ) );
    if ( run_length > 0 ) {
     putc( SILENCE_CODE, output->file );
     putc( run_length, output->file );
    }
/*
* If the run lasted until the end of the input, stop here so that EOF
* is not written to the output as a stray 0xff byte.
*/
    if ( look_ahead[ index ] == EOF )
     break;
   }
/*
* Eventually, any run of silence is over, and I output some plain codes.
* Any code that accidentally matches the silence code gets silently
* changed.
*/
   if ( look_ahead[ index ] == SILENCE_CODE )
    look_ahead[ index ]--;
   putc( look_ahead[ index ], output->file );
   look_ahead[ index++ ] = getc( input );
   index &= BUFFER_MASK;
  }
  while ( argc-- > 0 )
   printf( "Unused argument: %s\n", *argv++ );
}

/*
* The expansion routine used here has a very easy time of it. It just
* has to check for the run code, and when it finds it, pad out the
* output file with some silence bytes.
*/
void ExpandFile( input, output, argc, argv )
BIT_FILE *input;
FILE *output;
int argc;
char *argv[];
{
  int c;
  int run_count;

  while ( ( c = getc( input->file ) ) != EOF ) {
   if ( c == SILENCE_CODE ) {
    run_count = getc( input->file );
    while ( run_count-- > 0 )
     putc( 0x80, output );
   } else
    putc( c, output );
  }
  while ( argc-- > 0 )
   printf( "Unused argument: %s\n", *argv++ );
}

/*
* This support routine checks to see if the look-ahead buffer
* contains the start of a run, which by definition is
* START_THRESHOLD consecutive silence characters.
*/

int silence_run( buffer, index )
int buffer[];
int index;
{
  int i;

  for ( i = 0 ; i < START_THRESHOLD ; i++ )
   if ( !IS_SILENCE( buffer[ ( index + i ) & BUFFER_MASK ] ) )
    return( 0 );
  return( 1 );
}

/*
* This support routine is called while we are in the middle of a
* run of silence. It checks to see if we have reached the end of
* the run. By definition this occurs when we see STOP_THRESHOLD
* consecutive non-silence characters.
*/

int end_of_silence( buffer, index )
int buffer[];
int index;
{
  int i;

  for ( i = 0 ; i < STOP_THRESHOLD ; i++ )
   if ( IS_SILENCE( buffer[ ( index + i ) & BUFFER_MASK ] ) )
    return( 0 );
  return( 1 );
}
/************************ End of SILENCE.C ************************/
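
For experimenting with the format outside the file-based framework, the expansion step can also be sketched over memory buffers. This is a minimal illustration, not part of SILENCE.C: the function name `expand_silence` and its buffer interface are hypothetical, and a SILENCE_CODE with no count byte after it is assumed to mark truncated input and is simply dropped.

```c
#include <assert.h>
#include <stddef.h>

#define SILENCE_CODE 0xff

/*
 * Expands in_len bytes of silence-compressed data from in into out.
 * Each ( SILENCE_CODE, count ) pair becomes count bytes of 0x80, and
 * every other byte is copied unchanged. Returns the bytes written.
 */
size_t expand_silence( const unsigned char *in, size_t in_len,
                       unsigned char *out, size_t out_size )
{
    size_t i = 0;
    size_t written = 0;

    while ( i < in_len ) {
        unsigned char c = in[ i++ ];
        if ( c == SILENCE_CODE ) {
            unsigned int count;
            if ( i >= in_len )
                break;                   /* truncated: no count byte */
            count = in[ i++ ];
            assert( written + count <= out_size );
            while ( count-- > 0 )
                out[ written++ ] = 0x80;
        } else {
            assert( written < out_size );
            out[ written++ ] = c;
        }
    }
    return written;
}
```

Expanding the four bytes 10 FF 03 20 (hex), for instance, yields the five bytes 10 80 80 80 20.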
