## Exploiting spatial and temporal redundancies for vector quantization of speech and images

##### Abstract

The objective of the proposed research is to compress data such as speech, audio, and
images using a new re-ordering vector quantization approach that exploits the transition
probability between consecutive code vectors in a signal. Vector quantization is the process
of encoding blocks of samples from a data sequence by replacing each input vector with the
closest vector from a dictionary of reproduction vectors. Shannon's rate-distortion theory
states that signals encoded as blocks of samples achieve better rate-distortion performance
than signals encoded on a sample-by-sample basis. As such, vector quantization achieves a
lower coding rate for a given distortion than scalar quantization for any given signal.
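As a minimal sketch of the encoding step described above, the following NumPy fragment maps each input vector to the index of its nearest reproduction vector; the function names and the squared-Euclidean distance measure are illustrative choices, not part of the proposal:

```python
import numpy as np

def vq_encode(blocks, codebook):
    """Map each input vector to the index of its nearest reproduction vector."""
    # squared Euclidean distance from every input block to every codebook entry
    d = ((blocks[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def vq_decode(indices, codebook):
    """Replace each transmitted index with its reproduction vector."""
    return codebook[indices]
```

Only the indices are transmitted; the decoder recovers an approximation of the signal by looking each index up in the same dictionary.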
Standard vector quantization, however, does not take advantage of the inter-vector correlation
between successive input vectors in a data sequence. It has been demonstrated that real
signals exhibit significant inter-vector correlation, and this correlation has motivated vector
quantization approaches that encode each input vector based on previously encoded vectors.
Several methods have been proposed in the literature to exploit the dependence between
successive code vectors. Predictive vector quantization, dynamic codebook re-ordering, and
finite-state vector quantization are examples of vector quantization schemes that use inter-vector
correlation. Predictive vector quantization and finite-state vector quantization predict
the reproduction vector for a given input vector from past input vectors. Dynamic
codebook re-ordering vector quantization has the same reproduction vectors as standard
vector quantization. The dynamic codebook re-ordering algorithm is based on the concept
of re-ordering indices, whereby existing reproduction vectors are assigned new channel indices
according to a structure that orders the reproduction vectors by increasing
dissimilarity. Hence, an input vector encoded by the standard vector quantization method
is transmitted through the channel with new indices: index 0 is assigned to the reproduction
vector closest to the previous reproduction vector, and larger index values are assigned to
reproduction vectors that lie farther from the previous reproduction vector.
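The distance-based re-indexing step can be sketched as follows; this is an illustrative NumPy fragment under the assumptions just stated (indices re-ranked by distance from the previously transmitted reproduction vector), with hypothetical function names:

```python
import numpy as np

def reorder_by_distance(codebook, prev_index):
    """Return, for each original codebook index, its new channel index.

    Index 0 goes to the reproduction vector closest to the previously
    transmitted reproduction vector; larger indices go to farther ones.
    """
    prev = codebook[prev_index]
    d = ((codebook - prev) ** 2).sum(axis=1)     # distances to previous vector
    order = d.argsort(kind="stable")             # original indices, nearest first
    new_index = np.empty_like(order)
    new_index[order] = np.arange(len(codebook))  # rank of each original index
    return new_index
```

Because the previous reproduction vector is at distance zero from itself, it always maps to channel index 0, and nearby code vectors receive small indices.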
Dynamic codebook re-ordering assumes that the reproduction vectors of two successive
vectors of a real signal are typically close to each other under a distance metric. Sometimes,
however, two successively encoded vectors lie relatively far apart. Our likelihood codebook
re-ordering vector quantization algorithm exploits the structure within a signal through the
non-uniformity of the reproduction-vector transition probabilities in a data sequence. Code
vectors with a higher probability of following the prior reproduction vector are assigned
smaller index values: the vectors most likely to follow a given vector receive indices closest
to 0, while less likely vectors receive higher-valued indices. This re-ordering gives the
reproduction dictionary a structure suitable for entropy coding, such as Huffman and arithmetic
coding. Since such transitions are common in real signals, we expect that our
proposed algorithm, when combined with entropy coding algorithms such as binary arithmetic
and Huffman coding, will result in lower bit rates at the same distortion as a standard
vector quantization algorithm.
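The likelihood re-ordering idea can be sketched in NumPy as below. This is a simplified illustration, not the proposal's implementation: it assumes transition probabilities are estimated by counting consecutive index pairs in a training stream, and the function names are hypothetical:

```python
import numpy as np

def transition_counts(index_stream, codebook_size):
    """Count transitions between consecutive quantizer indices."""
    counts = np.zeros((codebook_size, codebook_size))
    for a, b in zip(index_stream, index_stream[1:]):
        counts[a, b] += 1
    return counts

def reorder_by_likelihood(counts, prev_index):
    """New channel index for each code vector: 0 for the most likely
    successor of prev_index, larger values for less likely successors."""
    order = (-counts[prev_index]).argsort(kind="stable")  # most frequent first
    new_index = np.empty(len(order), dtype=int)
    new_index[order] = np.arange(len(order))
    return new_index
```

Because likely successors are concentrated near index 0, the resulting index stream is skewed toward small values, which is the structure an entropy coder such as Huffman or arithmetic coding can exploit.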
The re-ordering vector quantization approach on quantized indices can be useful in
speech, image, and audio transmission. By applying our re-ordering approach to these data
types, we expect to achieve lower coding rates for a given distortion or perceptual quality.
This reduced coding rate makes our proposed algorithm useful for the transmission and
storage of large image and speech streams over their respective communication channels.
Applying truncation to the likelihood codebook re-ordering scheme yields substantially
lower bit rates without significantly degrading the perceptual quality of the signals. Today,
text and other multimedia signals may benefit from this additional layer of likelihood
re-ordering compression.