Algorithmic Rate-Distortion Theory
N.K. Vereshchagin and P.M.B. Vitányi
(2005)
CWI, Amsterdam.
Abstract: We propose and develop rate-distortion theory in the Kolmogorov
complexity setting. This gives the ultimate limits of lossy compression of
individual data objects, taking all effective regularities
of the data into account.
EPrint Type:  Other 

Additional Information: Kolmogorov complexity is the accepted absolute measure of
the information content of an individual finite object. It gives the
ultimate limit on the number of bits resulting from lossless
compression of the object; more precisely, the number of bits from
which effective lossless decompression of the object is possible.
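While Kolmogorov complexity itself is uncomputable, the length of any concrete lossless encoding upper-bounds it up to the constant size of the decompressor. A minimal sketch (our own illustration, not from the paper), using zlib compressed length as such an upper bound:

```python
import os
import zlib

def complexity_upper_bound(data: bytes) -> int:
    # The length of any lossless encoding of `data` upper-bounds the
    # Kolmogorov complexity K(data), up to the constant-size decompressor.
    return len(zlib.compress(data, 9))

structured = b"ab" * 5000          # highly regular: compresses well
random_ish = os.urandom(10000)     # incompressible with high probability

print(complexity_upper_bound(structured))   # far below 10000
print(complexity_upper_bound(random_ish))   # close to (or above) 10000
```

The gap between the two bounds reflects exactly the "effective regularities" that lossless compression can exploit; for random data there are none, so the bound stays near the raw length.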
A similar absolute approach is needed for lossy compression, that is,
a rate-distortion theory giving the ultimate effective limits
for individual finite data objects. We give natural definitions
of the rate-distortion functions of individual data (independent of a
random source producing those data). We analyze the possible shapes
of the rate-distortion graphs for all data and all computable distortions.
The classic Shannon rate-distortion curve corresponds approximately
to the individual curves of typical (random) data from the postulated
random source, while nonrandom data have completely different curves.
In practice one is generally interested in the behavior
of lossy compression on complex, structured, nonrandom data, like
pictures, movies, and music, while typical unstructured random data,
like noise (represented by the Shannon curve),
is discarded (we are not likely to want to store it).
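The individual rate-distortion function can be illustrated by a toy computation. Since Kolmogorov complexity is uncomputable, the sketch below (our own illustration, not from the paper) substitutes a crude run-length proxy for it, and brute-forces, under Hamming distortion, the cheapest code word within each distortion radius of a short binary string:

```python
import itertools

def hamming(a: str, b: str) -> int:
    return sum(ca != cb for ca, cb in zip(a, b))

def crude_complexity(s: str) -> int:
    # Run-length code size: a crude, purely illustrative stand-in
    # for the (uncomputable) Kolmogorov complexity of s.
    runs = sum(1 for _ in itertools.groupby(s))
    return 2 * runs  # each run costs two symbols: the bit and its length

def rate_distortion_curve(x: str) -> list:
    """Minimal proxy-complexity of any y within Hamming distance d of x,
    for every d from 0 to len(x)."""
    n = len(x)
    cheapest_at = [float("inf")] * (n + 1)
    for bits in itertools.product("01", repeat=n):
        y = "".join(bits)
        d = hamming(x, y)
        cheapest_at[d] = min(cheapest_at[d], crude_complexity(y))
    curve, best = [], float("inf")
    for d in range(n + 1):
        best = min(best, cheapest_at[d])  # allow distortion at most d
        curve.append(best)
    return curve

# A highly regular string: its curve drops to the trivial rate quickly.
print(rate_distortion_curve("0000000011111111"))
# → [4, 4, 4, 4, 4, 4, 4, 4, 2, 2, 2, 2, 2, 2, 2, 2, 2]
```

For this structured string the curve is flat and low: already at distortion 8 a single-run code word ("0" * 16) suffices. A random 16-bit string would instead start near the maximal proxy cost and decline gradually, mirroring the paper's point that individual curves of nonrandom data differ completely from the Shannon curve.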
Finally, we formulate a new problem related to the practice of lossy
compression. Is it the case that a code word that realizes least
distortion of the source word at a given rate also captures the
most properties of that source word that are possible at this rate?
Clearly, this question cannot be well posed in the Shannon setting,
where we deal with expected distortion and where the notion
of capturing a certain amount of the properties of the data cannot
be well expressed either. We show that in our setting this question is
answered in the affirmative for every distortion measure that satisfies
a certain parsimony-of-covering property.


Subjects: Learning/Statistics & Optimisation; Information Retrieval & Textual Information Access

ID Code:  1837 

Deposited By:  Paul Vitányi 

Deposited On:  29 December 2005 
