7/28/2019 Imc12 03 Huffman Adaptive Lec
Huffman Coding
Huffman Codes
Optimum prefix code developed by D. Huffman in a class assignment.

Construction of Huffman codes is based on two ideas:
- In an optimum code, symbols with higher probability should have shorter codewords.
- In an optimum prefix code, the two symbols that occur least frequently will have codewords of the same length (otherwise, truncating the longer codeword to the same length would still produce a decodable code).
Principle of Huffman Codes
Starting with the two least probable symbols, say β and γ, of an alphabet A: if the codeword for β is [m]0, the codeword for γ would be [m]1, where [m] is a string of 1s and 0s.

Now the two symbols can be combined into a group, which represents a new symbol in the alphabet set. The new symbol has the probability P(β) + P(γ).

Recursively determine the bit pattern [m] using the new alphabet set.
Example: Huffman Code
Let A = {a1, ..., a5}, P(ai) = {0.2, 0.4, 0.2, 0.1, 0.1}.

Symbol  Step 1    Step 2    Step 3    Step 4    Codeword
a2      0.4       0.4       0.4       0.6 (0)   1
a1      0.2       0.2       0.4 (0)   0.4 (1)   01
a3      0.2       0.2 (0)   0.2 (1)             000
a4      0.1 (0)   0.2 (1)                       0010
a5      0.1 (1)                                 0011

(Each column lists the sorted probabilities after each merge; the bits in parentheses are assigned to the two lowest entries when they are merged.)
Combine the two symbols with the lowest probabilities, and use one bit (the last bit in the codeword) to differentiate between them!
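The construction above can be sketched with a min-heap. This is an illustrative helper, not code from the lecture; tie-breaking among equal probabilities may yield different (equally optimal) codewords than the table, but the average length comes out the same.

```python
import heapq
import itertools

def huffman(probs):
    """Build a binary Huffman code for a dict {symbol: probability}.

    Each heap entry holds a subtree's total probability, a tie-breaking
    counter, and the partial codewords of the symbols in that subtree.
    """
    counter = itertools.count()
    heap = [(p, next(counter), {sym: ""}) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, codes0 = heapq.heappop(heap)  # least probable subtree
        p1, _, codes1 = heapq.heappop(heap)  # second least probable
        # Prepend one distinguishing bit to every codeword in each subtree.
        merged = {sym: "0" + c for sym, c in codes0.items()}
        merged.update({sym: "1" + c for sym, c in codes1.items()})
        heapq.heappush(heap, (p0 + p1, next(counter), merged))
    return heap[0][2]

source = {"a1": 0.2, "a2": 0.4, "a3": 0.2, "a4": 0.1, "a5": 0.1}
codes = huffman(source)
avg = sum(p * len(codes[s]) for s, p in source.items())
# avg is 2.2 bits/symbol, matching the table above
```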
Efficiency of Huffman Codes
Redundancy: the difference between the average length of a code and the entropy of the source.

The average codeword length for this code is
l_avg = 0.4 x 1 + 0.2 x 2 + 0.2 x 3 + 0.1 x 4 + 0.1 x 4 = 2.2 bits/symbol.
The entropy of the source is about 2.122 bits/symbol, so the redundancy is 0.078 bits/symbol.
For a Huffman code, the redundancy is zero when the probabilities are negative powers of two.
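As a quick check of these numbers (a small illustrative computation, not part of the original slides):

```python
import math

probs = [0.4, 0.2, 0.2, 0.1, 0.1]   # a2, a1, a3, a4, a5
lengths = [1, 2, 3, 4, 4]           # codeword lengths from the table above

entropy = -sum(p * math.log2(p) for p in probs)
l_avg = sum(p * l for p, l in zip(probs, lengths))
redundancy = l_avg - entropy
# l_avg = 2.2 bits/symbol, entropy ~ 2.122, redundancy ~ 0.078 bits/symbol
```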
Minimum Variance Huffman Codes
When more than two symbols in a Huffman tree
have the same probability, different merge orders
produce different Huffman codes.
Symbol  Step 1    Step 2    Step 3    Step 4    Codeword
a2      0.4       0.4       0.4       0.6 (0)   00
a1      0.2       0.2       0.4 (0)   0.4 (1)   10
a3      0.2       0.2 (0)   0.2 (1)             11
a4      0.1 (0)   0.2 (1)                       010
a5      0.1 (1)                                 011

(Here, when a merged node ties with existing entries, it is placed as high in the sorted list as possible.)

[Figure: two code trees with the same symbol probabilities]

The average codeword length is still 2.2 bits/symbol, but the length variances of the two codes are different! We prefer the code with the smaller length variance: it keeps the instantaneous transmission rate closer to the average rate, so smaller encoder and decoder buffers suffice.
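The difference can be quantified directly. A small illustrative check (not from the slides), using the codeword lengths of the two codes:

```python
probs = [0.4, 0.2, 0.2, 0.1, 0.1]   # a2, a1, a3, a4, a5
len_first = [1, 2, 3, 4, 4]         # first Huffman code
len_minvar = [2, 2, 2, 3, 3]        # minimum-variance Huffman code

def length_variance(p, lengths):
    """Variance of the codeword length under the symbol probabilities."""
    avg = sum(pi * li for pi, li in zip(p, lengths))
    return sum(pi * (li - avg) ** 2 for pi, li in zip(p, lengths))

# Both codes average 2.2 bits/symbol, but the length variances are
# about 1.36 and 0.16 respectively.
```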
Length of Huffman Codes (1/2)
Given a sequence of positive integers {l1, l2, ..., lk} satisfying

    sum_{i=1}^{k} 2^(-l_i) <= 1,

there exists a uniquely decodable code whose codeword lengths are given by {l1, l2, ..., lk} (the Kraft-McMillan inequality).

The optimal code for a source S has an average code length l_avg with the following bounds:

    H(S) <= l_avg < H(S) + 1.
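The inequality is easy to test for a given set of lengths (a hypothetical helper for illustration):

```python
def kraft_sum(lengths, base=2):
    """Sum of base**(-l) over the codeword lengths. A uniquely decodable
    code with these lengths exists iff this sum is at most 1."""
    return sum(base ** (-l) for l in lengths)

# The lengths {1, 2, 3, 4, 4} of the earlier Huffman code sum to exactly 1,
# while {1, 1, 2} violates the inequality, so no uniquely decodable code
# can have those lengths.
```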
Non-binary Huffman Codes
Huffman codes can be generalized to an n-ary code alphabet. For example, with codewords composed of {0, 1, 2}, we have a ternary Huffman code.
Let A = {a1, ..., a5}, P(ai) = {0.25, 0.25, 0.2, 0.15, 0.15}.

Symbol  Step 1     Step 2     Codeword
a1      0.25       0.5 (0)    1
a2      0.25       0.25 (1)   2
a3      0.20 (0)   0.25 (2)   00
a4      0.15 (1)              01
a5      0.15 (2)              02
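A heap-based sketch of the n-ary construction (hypothetical code, not from the lecture): zero-probability dummy symbols are padded in so that every merge combines exactly n nodes, and tie-breaking may relabel digits relative to the table above, though the codeword lengths agree.

```python
import heapq
import itertools

def nary_huffman(probs, n):
    """Build an n-ary Huffman code for a dict {symbol: probability}."""
    counter = itertools.count()
    heap = [(p, next(counter), {sym: ""}) for sym, p in probs.items()]
    # Pad with zero-probability dummies so (num_nodes - 1) % (n - 1) == 0,
    # which lets every merge step combine exactly n nodes.
    while (len(heap) - 1) % (n - 1) != 0:
        heap.append((0.0, next(counter), {}))
    heapq.heapify(heap)
    while len(heap) > 1:
        merged, total = {}, 0.0
        for digit in range(n):
            p, _, codes = heapq.heappop(heap)
            total += p
            for sym, c in codes.items():
                merged[sym] = str(digit) + c
        heapq.heappush(heap, (total, next(counter), merged))
    return heap[0][2]

codes = nary_huffman({"a1": 0.25, "a2": 0.25, "a3": 0.2,
                      "a4": 0.15, "a5": 0.15}, n=3)
# codeword lengths: a1 and a2 get 1 digit, a3-a5 get 2 digits
```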
Adaptive Huffman Coding
Huffman codes require an exact probability model of the source to compute the optimal codewords. For messages of unknown duration, this is not possible.

One could re-compute the probability model after every received symbol and regenerate a new set of codewords for the next symbol from scratch, but that is too expensive!

Adaptive Huffman coding achieves the same goal at a lower cost.
Adaptive Huffman Coding Tree
Adaptive Huffman coding maintains a dynamic code tree. The tree is updated synchronously on both the transmitter side and the receiver side. If the alphabet size is m, the total number of nodes is 2m - 1.
[Figure: adaptive Huffman code tree; the nodes shown are numbered 43-51, with the root numbered 51 = 2m - 1, where m is the alphabet size]

Weight of a node: the number of occurrences of the symbol, or of all the symbols in the subtree.

NYT node: represents all symbols Not Yet Transmitted.

Node number: a unique ID for each node, assigned in decreasing order from top to bottom and left to right.
Initial Codewords
Before the transmission of any symbols, all symbols in the source alphabet {a1, a2, ..., am} belong to the NYT (Not Yet Transmitted) list.

Each symbol in the alphabet has an initial fixed-length binary codeword of either floor(log2 m) or floor(log2 m) + 1 bits.

When a symbol ai is transmitted for the first time, the code for NYT is transmitted, followed by the fixed code for ai. A new node is created for ai, and ai is taken out of the NYT list.

From this point on, we follow the update procedure to maintain the Huffman code tree.
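A sketch of one common fixed-code assignment (an assumption here, since the slide does not spell it out): write m = 2^e + r with 0 <= r < 2^e; the first 2r symbols get (e + 1)-bit codes and the remaining symbols get e-bit codes.

```python
def initial_code(k, m):
    """Fixed initial codeword for symbol a_k (1-indexed) in an alphabet
    of size m, written as m = 2**e + r with 0 <= r < 2**e.
    (Assumed scheme, for illustration.)"""
    e = m.bit_length() - 1
    r = m - (1 << e)
    if k <= 2 * r:
        # (e + 1)-bit binary representation of k - 1
        return format(k - 1, "0{}b".format(e + 1))
    # e-bit binary representation of k - r - 1
    return format(k - r - 1, "0{}b".format(e))

# For m = 26 (e = 4, r = 10): a1 -> "00000" (5 bits), a21 -> "1010" (4 bits)
```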
Update Procedure
[Figure: update-procedure flowchart; a node's block is the set of nodes with the same weight]
Encoding Procedure
[Figure: encoding-procedure flowchart]
Decoding Procedure
[Figure: decoding-procedure flowchart]
Applications: Image Compression
Direct application of Huffman coding to image data achieves only a limited compression ratio.

[Figure: compression results without model prediction vs. with model prediction]
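The benefit of model prediction can be seen on a toy example (illustrative only; synthetic data, not the lecture's images): on smooth data, first-difference residuals have far smaller empirical entropy than the raw samples, so a Huffman code for the residuals compresses much better.

```python
import math
from collections import Counter

def entropy(values):
    """Empirical entropy (bits/sample) of a sequence."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

# A smooth synthetic scanline: 256 samples ramping slowly from 0 to 63.
pixels = [i // 4 for i in range(256)]
# Model prediction: predict each pixel by its left neighbor, code the residual.
residuals = [pixels[0]] + [pixels[i] - pixels[i - 1]
                           for i in range(1, len(pixels))]

# entropy(pixels) = 6.0 bits/sample; entropy(residuals) < 1 bit/sample
```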