|
|
|
|
|
Redundancy of Lossless Data Compression for Known Sources by Analytic Methods
Author(s): Michael Drmota;Wojciech Szpankowski
Source: Journal:Foundations and Trends® in Communications and Information Theory ISSN Print:1567-2190, ISSN Online:1567-2328 Publisher:Now Publishers Volume 13 Number 4, Pages: 144(277-417) DOI: 10.1561/0100000090
Abstract:
Lossless data compression is a facet of source coding and a well studied
problem of information theory. Its goal is to find a shortest possible
code that can be unambiguously recovered. Here, we focus on
rigorous analysis of code redundancy for known sources. The redundancy
rate problem determines by how much the actual code length
exceeds the optimal code length. We present precise analyses of three
types of lossless data compression schemes, namely fixed-to-variable
(FV) length codes, variable-to-fixed (VF) length codes, and variableto-
variable (VV) length codes. In particular, we investigate the average
redundancy of Shannon, Huffman, Tunstall, Khodak and Boncelet
codes. These codes have succinct representations as trees, either as
coding or parsing trees, and we analyze here some of their parameters
(e.g., the average path from the root to a leaf). Such trees are
precisely analyzed by analytic methods, known also as analytic combinatorics,
in which complex analysis plays decisive role. These tools
include generating functions, Mellin transform, Fourier series, saddle
point method, analytic poissonization and depoissonization, Tauberian
theorems, and singularity analysis. The term analytic information the-
ory has been coined to describe problems of information theory studied
by analytic tools. This approach lies on the crossroad of information
theory, analysis of algorithms, and combinatorics.
|
|
|
|