Imc12 03 Huffman Adaptive Lec


    Huffman Coding


    Huffman Codes

Optimum prefix code, developed by D. Huffman in a class assignment.

Construction of Huffman codes is based on two ideas:

In an optimum code, symbols with higher probability should have shorter codewords.

In an optimum prefix code, the two symbols that occur least frequently will have the same codeword length (otherwise, truncating the longer codeword to the length of the shorter one would still produce a decodable code).



    Principle of Huffman Codes

Starting with the two least probable symbols, γ and δ, of an alphabet A: if the codeword for γ is [m]0, the codeword for δ would be [m]1, where [m] is a string of 1s and 0s.

Now the two symbols can be combined into a group, which represents a new symbol in the alphabet set. The new symbol has probability P(γ) + P(δ).

Recursively determine the bit pattern [m] using the new alphabet set.



    Example: Huffman Code

Let A = {a1, ..., a5}, P(ai) = {0.2, 0.4, 0.2, 0.1, 0.1}.

Symbol   Step 1    Step 2    Step 3    Step 4    Codeword
a2       0.4       0.4       0.4       0.6 (0)   1
a1       0.2       0.2       0.4 (0)   0.4 (1)   01
a3       0.2       0.2 (0)   0.2 (1)             000
a4       0.1 (0)   0.2 (1)                       0010
a5       0.1 (1)                                 0011

(Each column lists the current probabilities in decreasing order; (0)/(1) mark the bits assigned to the two groups merged at that step.)

Combine the two symbols with the lowest probabilities, and use one bit (the last bit in the codeword) to differentiate between them!
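As an aside, this merge procedure is easy to prototype. Below is a minimal sketch (not from the slides) using Python's heapq: it repeatedly pops the two least probable groups and prepends a distinguishing bit. Tie-breaking among equal probabilities is arbitrary, so the exact bit patterns may differ from the table, but the average length is the same optimal 2.2 bits/symbol.

```python
import heapq
from itertools import count

def huffman_code(probs):
    """Build a binary Huffman code by repeatedly merging the two
    least probable groups; one bit differentiates the merged pair."""
    tiebreak = count()  # unique tag so the heap never compares dicts
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, grp0 = heapq.heappop(heap)   # least probable group
        p1, _, grp1 = heapq.heappop(heap)   # second least probable group
        merged = {s: "0" + c for s, c in grp0.items()}
        merged.update({s: "1" + c for s, c in grp1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]

P = {"a1": 0.2, "a2": 0.4, "a3": 0.2, "a4": 0.1, "a5": 0.1}
code = huffman_code(P)
print(code)
print(sum(P[s] * len(c) for s, c in code.items()))   # 2.2 bits/symbol
```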



    Efficiency of Huffman Codes

Redundancy - the difference between the entropy and the average length of a code.

The average codeword length for this code is

l_avg = 0.4 x 1 + 0.2 x 2 + 0.2 x 3 + 0.1 x 4 + 0.1 x 4 = 2.2 bits/symbol.

The entropy is H = 2.122 bits/symbol, so the redundancy is 0.078 bits/symbol.

For a Huffman code, the redundancy is zero when the probabilities are negative powers of two.
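These numbers are quick to verify with a few lines of Python (a check of my own, not part of the slides):

```python
from math import log2

P = [0.4, 0.2, 0.2, 0.1, 0.1]          # symbol probabilities
lengths = [1, 2, 3, 4, 4]              # codeword lengths from the table

entropy = -sum(p * log2(p) for p in P)           # H(S)
l_avg = sum(p * l for p, l in zip(P, lengths))   # average codeword length
print(f"H = {entropy:.3f}, l_avg = {l_avg:.1f}, "
      f"redundancy = {l_avg - entropy:.3f}")
# -> H = 2.122, l_avg = 2.2, redundancy = 0.078 bits/symbol
```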



    Minimum Variance Huffman Codes

When more than two symbols in a Huffman tree have the same probability, different merge orders produce different Huffman codes.

Symbol   Step 1    Step 2    Step 3    Step 4    Codeword
a2       0.4       0.4       0.4       0.6 (0)   00
a1       0.2       0.2       0.4 (0)   0.4 (1)   10
a3       0.2       0.2 (0)   0.2 (1)             11
a4       0.1 (0)   0.2 (1)                       010
a5       0.1 (1)                                 011

The average codeword length is still 2.2 bits/symbol, but the variances of the two codes are different!

Two code trees with the same symbol probabilities: we prefer the code with the smaller length-variance. Why? With a fixed transmission rate, a smaller variance means the encoder's bit output is more uniform, so smaller buffers suffice.
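To make "variances are different" concrete, a small computation (my own illustration) comparing the two sets of codeword lengths:

```python
P = [0.4, 0.2, 0.2, 0.1, 0.1]   # P(a2), P(a1), P(a3), P(a4), P(a5)
code1 = [1, 2, 3, 4, 4]         # lengths from the earlier table
code2 = [2, 2, 2, 3, 3]         # minimum-variance lengths above

for lengths in (code1, code2):
    l_avg = sum(p * l for p, l in zip(P, lengths))
    var = sum(p * (l - l_avg) ** 2 for p, l in zip(P, lengths))
    print(f"l_avg = {l_avg:.1f}, variance = {var:.2f}")
# -> l_avg = 2.2, variance = 1.36   (first code)
# -> l_avg = 2.2, variance = 0.16   (minimum-variance code)
```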



    Length of Huffman Codes (1/2)

Given a sequence of positive integers {l1, l2, ..., lk} that satisfies the Kraft-McMillan inequality

$\sum_{i=1}^{k} 2^{-l_i} \le 1,$

there exists a uniquely decodable code whose codeword lengths are given by {l1, l2, ..., lk}.

The optimal code for a source S has an average code length l_avg with the following bounds:

$H(S) \le l_{avg} < H(S) + 1.$
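A two-line check of the condition; the lengths {1, 2, 3, 4, 4} are from the earlier example (the counterexample lengths are my own):

```python
def kraft_sum(lengths, radix=2):
    """Kraft-McMillan sum; a uniquely decodable code with these
    codeword lengths exists iff the sum is <= 1."""
    return sum(radix ** -l for l in lengths)

print(kraft_sum([1, 2, 3, 4, 4]))  # 1.0  -> the Huffman code uses the budget fully
print(kraft_sum([1, 2, 2, 2]))     # 1.25 -> no decodable code with these lengths
```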


    Non-binary Huffman Codes

Huffman codes can be applied to an n-ary code space. For example, with codewords composed of {0, 1, 2}, we have a ternary Huffman code.

Let A = {a1, ..., a5}, P(ai) = {0.25, 0.25, 0.2, 0.15, 0.15}.

Symbol   Step 1     Step 2      Codeword
a1       0.25       0.5  (0)    1
a2       0.25       0.25 (1)    2
a3       0.20 (0)   0.25 (2)    00
a4       0.15 (1)               01
a5       0.15 (2)               02

(Each step merges the three lowest entries, using one ternary digit to differentiate them.)
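A sketch of the n-ary generalization (not from the slides). One detail the slides don't spell out: if (k - 1) is not a multiple of (n - 1), zero-probability dummy symbols are padded in so every merge can take exactly n groups; for k = 5 and n = 3 no padding is needed. Digit assignments may differ from the table under ties, but the codeword lengths match.

```python
import heapq
from itertools import count

def nary_huffman(probs, n=3, digits="012"):
    """n-ary Huffman: pad with dummy symbols so every merge can take
    exactly n groups, then repeatedly merge the n least probable."""
    tb = count()
    heap = [(p, next(tb), {s: ""}) for s, p in probs.items()]
    while (len(heap) - 1) % (n - 1) != 0:
        heap.append((0.0, next(tb), {}))   # dummies carry no real symbols
    heapq.heapify(heap)
    while len(heap) > 1:
        merged, total = {}, 0.0
        for d in range(n):                 # one code digit per merged group
            p, _, grp = heapq.heappop(heap)
            total += p
            merged.update({s: digits[d] + c for s, c in grp.items()})
        heapq.heappush(heap, (total, next(tb), merged))
    return heap[0][2]

P = {"a1": 0.25, "a2": 0.25, "a3": 0.2, "a4": 0.15, "a5": 0.15}
print(nary_huffman(P))  # two 1-digit and three 2-digit codewords, as in the table
```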



    Adaptive Huffman Coding

Huffman codes require an exact probability model of the source to compute the optimal codewords. For messages of unknown duration, this is not possible.

One can try to re-compute the probability model after every received symbol and re-generate a new set of codewords, based on the new model, for the next symbol. Doing this from scratch every time is too expensive!

Adaptive Huffman coding tries to achieve this goal at a lower cost.



    Adaptive Huffman Coding Tree

Adaptive Huffman coding maintains a dynamic code tree. The tree is updated synchronously on both the transmitter side and the receiver side. If the alphabet size is m, the total number of nodes is 2m - 1; in the example tree, the root is node 51 = 2m - 1 for an alphabet of m = 26 symbols.

Weight of a node: the number of occurrences of the symbol (for a leaf), or of all the symbols in its subtree (for an internal node).

Node number: the unique ID of each node, sorted in decreasing order from top to bottom and left to right.

A special NYT node stands for all symbols Not Yet Transmitted.

[Figure: example code tree with root node 51 and internal nodes numbered 43-50.]



    Initial Codewords

Before transmission of any symbols, all symbols in the source alphabet {a1, a2, ..., am} belong to the NYT list.

Each symbol in the alphabet has an initial fixed-length binary codeword of either ⌊log2 m⌋ or ⌊log2 m⌋ + 1 bits.

When a symbol ai is transmitted for the first time, the code for NYT is transmitted, followed by the fixed code for ai. A new node is created for ai, and ai is taken out of the NYT list.

From this point on, we follow the update procedure to maintain the Huffman code tree.
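The slide does not say how the ⌊log2 m⌋ vs. ⌊log2 m⌋ + 1 split is made. A sketch of one standard convention (writing m = 2^e + r with 0 <= r < 2^e and giving the first 2r symbols e + 1 bits, as in Sayood's textbook treatment; this specific split is an assumption here, not stated on the slide):

```python
def initial_code(k, m):
    """Fixed-length code for the k-th symbol (1-based) of an m-symbol
    alphabet, sent after the NYT code on a symbol's first occurrence.
    Convention: m = 2**e + r; the first 2r symbols get e + 1 bits,
    the remaining m - 2r symbols get e bits."""
    e = m.bit_length() - 1          # e = floor(log2 m)
    r = m - 2 ** e
    if k <= 2 * r:
        return format(k - 1, f"0{e + 1}b")
    return format(k - r - 1, f"0{e}b")

# m = 26 letters: e = 4, r = 10, so 20 symbols use 5 bits, 6 use 4 bits
print(initial_code(1, 26))   # 'a'  -> 00000
print(initial_code(22, 26))  # 'v'  -> 1011  (22 > 2r, so 4 bits: 22-10-1 = 11)
```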



    Update Procedure

[Flowchart of the update procedure; a block is the set of nodes with the same weight.]
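The flowchart itself did not survive extraction. As a substitute, here is a sketch of the classic FGK-style update this slide family describes: walk from the changed leaf to the root, and before incrementing each node's weight, swap it with the highest-numbered node of the same weight (its block leader) unless that node is its parent. The class names and list-based node numbering are my own scaffolding, not the slides':

```python
class Node:
    def __init__(self, weight=0, symbol=None, parent=None):
        self.weight, self.symbol, self.parent = weight, symbol, parent
        self.left = self.right = None

class AdaptiveTree:
    def __init__(self):
        self.nyt = self.root = Node()   # tree starts as a lone NYT node
        self.order = [self.root]        # nodes in increasing node number;
                                        # sibling property: weights non-decreasing

    def add_symbol(self, symbol):
        """Split NYT into a new NYT (left child) and a new leaf (right)."""
        old = self.nyt
        old.left = self.nyt = Node(parent=old)
        old.right = leaf = Node(symbol=symbol, parent=old)
        i = self.order.index(old)            # the two new nodes take the
        self.order[i:i] = [self.nyt, leaf]   # numbers just below the old NYT
        return leaf

    def update(self, node):
        """Increment weights from `node` up to the root, swapping each
        node with its block leader first."""
        while node is not None:
            i = j = self.order.index(node)   # O(n) here; real code keeps
            while (j + 1 < len(self.order)   # an explicit number field
                   and self.order[j + 1].weight == node.weight):
                j += 1                       # leader = highest number, same weight
            leader = self.order[j]
            if leader is not node and leader is not node.parent:
                self._swap(i, j)
            node.weight += 1
            node = node.parent

    def _swap(self, i, j):
        """Exchange two nodes' positions in the numbering and the tree."""
        a, b = self.order[i], self.order[j]
        self.order[i], self.order[j] = b, a
        pa, pb = a.parent, b.parent
        a_left, b_left = pa.left is a, pb.left is b   # decide before mutating
        if a_left:
            pa.left = b
        else:
            pa.right = b
        if b_left:
            pb.left = a
        else:
            pb.right = a
        a.parent, b.parent = pb, pa

def code_of(node):
    """Codeword of a node: the 0/1 path from the root down to it."""
    bits = ""
    while node.parent is not None:
        bits = ("0" if node.parent.left is node else "1") + bits
        node = node.parent
    return bits

tree, leaves = AdaptiveTree(), {}
for s in "abb":                        # after two b's, b outweighs a ...
    leaf = leaves.get(s) or tree.add_symbol(s)
    leaves[s] = leaf
    tree.update(leaf)
print(code_of(leaves["b"]), code_of(leaves["a"]))  # ... so b's code shrinks: 1 01
```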



    Encoding Procedure

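The encoding flowchart is likewise missing. A sketch of the loop it presumably describes, reusing the hypothetical AdaptiveTree/code_of scaffolding and the initial_code helper from the previous blocks (so this fragment assumes those definitions rather than standing alone):

```python
def encode(symbols, alphabet):
    """First occurrence of a symbol: send the NYT code, then its
    fixed-length index code. Later occurrences: send its current
    codeword. Update the tree after every symbol."""
    tree, seen, out = AdaptiveTree(), {}, []
    for s in symbols:
        if s in seen:
            out.append(code_of(seen[s]))
        else:
            out.append(code_of(tree.nyt))    # empty for the very first symbol
            out.append(initial_code(alphabet.index(s) + 1, len(alphabet)))
            seen[s] = tree.add_symbol(s)
        tree.update(seen[s])
    return "".join(out)
```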


    Decoding Procedure
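Decoding mirrors encoding: the receiver maintains the identical tree. A sketch (again assuming the scaffolding above, and the index-code convention from the Initial Codewords block) walks bits from the root; reaching the NYT leaf means a fixed-length index follows:

```python
def decode(bits, alphabet):
    """Walk the tree bit by bit; at a leaf emit its symbol, at the
    NYT node read a fixed-length index instead. Update after each."""
    tree, out, i = AdaptiveTree(), [], 0
    m = len(alphabet)
    e = m.bit_length() - 1               # floor(log2 m)
    r = m - 2 ** e
    while i < len(bits):
        node = tree.root
        while node.left is not None:     # descend until a leaf
            node = node.left if bits[i] == "0" else node.right
            i += 1
        if node is tree.nyt:             # a new symbol's index follows
            v = int(bits[i:i + e], 2); i += e
            if v < r:                    # long (e+1)-bit index code
                v = 2 * v + int(bits[i]); i += 1
                k = v + 1
            else:                        # short e-bit index code
                k = v + r + 1
            node = tree.add_symbol(alphabet[k - 1])
        out.append(node.symbol)
        tree.update(node)
    return "".join(out)

# round trip, assuming encode() from the previous sketch:
abc = [chr(c) for c in range(97, 123)]   # 26-letter alphabet, m = 26
assert decode(encode("aardvark", abc), abc) == "aardvark"
```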



    Applications: Image Compression

Direct application of Huffman coding to image data has a limited compression ratio.

[Figures compare coding the pixel values directly (no model prediction) with coding prediction residuals (with model prediction).]
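The figures are gone, but the effect of prediction is easy to demonstrate numerically. A toy illustration of my own, not from the slides: on a smooth signal, the first-order entropy of left-neighbor prediction residuals is far lower than that of the raw values, so a Huffman coder does far better on the residuals.

```python
from collections import Counter
from math import log2

def entropy(samples):
    """First-order entropy in bits/symbol."""
    n = len(samples)
    return -sum(c / n * log2(c / n) for c in Counter(samples).values())

# synthetic smooth scanline: neighboring pixels are strongly correlated
row = [(i // 3) % 256 for i in range(3000)]
residuals = [row[0]] + [row[i] - row[i - 1] for i in range(1, len(row))]

print(f"raw pixels:           {entropy(row):.2f} bits/symbol")        # ~ 8 bits
print(f"prediction residuals: {entropy(residuals):.2f} bits/symbol")  # ~ 1 bit
```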

