Practical Implementations of
Arithmetic Coding
Paul G. Howard and Jeffrey Scott Vitter
吳浩庠 R99944019 楊鈞傑 R99922150 黃信博 B96902039 吳彥緯 D98922013 蔡佩真 B96901012
李枝新 D99945016 姚甯之 R99944014
朱民晃 R96943077 李佳憲 R99945042
Arithmetic Coding
Advantage
◦Flexibility
◦Optimality
Disadvantage
◦Slowness
Overview
Section 2: Tutorial on arithmetic coding
◦Basic algorithm
◦Dynamic Interval expansion
◦Integer arithmetic coding
Section 3
◦Improving the speed of arithmetic coding
Basic Algorithm
1. Begin with a “current interval” [L,H) initialized to [0,1).
Basic Algorithm
2. For each symbol of the file, we perform:
◦(a) Subdivide the current interval into subintervals, one for each symbol. For symbol $a_i$ with probability $p_i$:
$P_C = \sum_{k=1}^{i-1} p_k$
$P_N = \sum_{k=1}^{i} p_k$
The new subinterval: $[L + P_C (H - L),\ L + P_N (H - L))$
◦(b) Select the subinterval corresponding to the next symbol to be read (e.g. $a_i$).
Basic Algorithm
3. Output enough bits to distinguish the final current interval from all other possible final intervals.
◦ Length of the final subinterval
= product of the probabilities of the individual symbols
= probability p of the sequence of symbols in the file.
◦ The final step uses almost exactly -log2 p bits.
Encoding algorithm for arithmetic coding
L = 0.0; H = 1.0;
while not EOF do
    range = H - L;
    read(ai);
    H = L + range * H(ai);
    L = L + range * L(ai);
end while
(H(ai) and L(ai) are the upper and lower bounds of the range assigned to symbol ai.)
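The loop above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the name `encode_exact`, the `MODEL` dictionary, and the use of exact fractions instead of floating point are all assumptions made for clarity. The model is the a / b / EOF example used in these slides.

```python
from fractions import Fraction as F

# Hypothetical model: symbol -> (lower bound, upper bound) of its range in [0, 1).
MODEL = {
    "a": (F(0), F(4, 10)),
    "b": (F(4, 10), F(9, 10)),
    "EOF": (F(9, 10), F(1)),
}

def encode_exact(message, model=MODEL):
    """Return the final interval [low, high) after coding every symbol."""
    low, high = F(0), F(1)
    for sym in message:
        span = high - low
        sym_low, sym_high = model[sym]
        high = low + span * sym_high   # computed from the old low
        low = low + span * sym_low
    return low, high

low, high = encode_exact(["b", "b", "b", "EOF"])
print(float(low), float(high))
```

Exact fractions sidestep the precision problem, at the cost of numerators and denominators that grow with the message length.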
Arithmetic Coding
Example
Symbol | Probability | Range
a | 0.4 | [0.00, 0.40)
b | 0.5 | [0.40, 0.90)
EOF | 0.1 | [0.90, 1.00)
Suppose that we want to encode the following message:
b b b EOF
Arithmetic Coding
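Example
(Figure: the current interval is subdivided into a, b and EOF parts at each step. For the input b b b EOF it narrows from [0, 1) to [0.4, 0.9), [0.6, 0.85), [0.7, 0.825) and finally [0.8125, 0.825).)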
Arithmetic Coding
Example
Current interval [L,H) | Action | Subinterval a | Subinterval b | Subinterval EOF | Input
[0.000, 1.000) | Subdivide | [0.000, 0.400) | [0.400, 0.900) | [0.900, 1.000) | b
[0.400, 0.900) | Subdivide | [0.400, 0.600) | [0.600, 0.850) | [0.850, 0.900) | b
[0.600, 0.850) | Subdivide | [0.600, 0.700) | [0.700, 0.825) | [0.825, 0.850) | b
[0.700, 0.825) | Subdivide | [0.700, 0.750) | [0.750, 0.8125) | [0.8125, 0.825) | EOF
[0.8125, 0.825) | | | | |
Arithmetic Coding
Example
Final interval = [0.8125, 0.825)
= [0.1101000000, 0.1101001100) (binary form)
We can uniquely identify this interval by 1101000.
Probability p = (0.5) × (0.5) × (0.5) × (0.1) = 0.0125, so -log2 p ≈ 6.32 bits; the 7-bit code is close to this bound.
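One way to find such a code is sketched below. It is an illustration under assumptions, not the method prescribed by the slides: the helper name `shortest_code` is hypothetical, and it searches for the shortest bit string c whose dyadic interval [c, c + 2^-k) lies entirely inside the final interval.

```python
from fractions import Fraction as F
from math import ceil, log2

def shortest_code(low, high):
    """Shortest bit string whose interval [c, c + 2**-k) fits inside [low, high)."""
    k = 1
    while True:
        scale = 2 ** k
        c = ceil(low * scale)            # smallest k-bit numerator at or above low
        if F(c + 1, scale) <= high:      # every extension of the code stays inside
            return format(c, "0{}b".format(k))
        k += 1

code = shortest_code(F(13, 16), F(33, 40))   # [0.8125, 0.825)
print(code, len(code), -log2(0.0125))
```

For the example interval, the search stops at seven bits, one bit above the ideal -log2 p rounded down.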
Dynamic Interval expansion
The problem with basic arithmetic coding: the shrinking current interval requires high-precision arithmetic.
IEEE 754 standard:
Single precision => about 10^-7
Double precision => about 10^-16
Fewer than 30 symbols can be coded!
We need dynamic interval expansion.
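The precision limit can be observed directly. The sketch below is an assumption-laden illustration: the function name is hypothetical, and it repeatedly codes one symbol whose range is [0.9, 1.0) in double precision until the interval can no longer be subdivided.

```python
def symbols_until_collapse(sym_low, sym_high):
    """Count coded symbols until the floating-point interval [low, high) is empty."""
    low, high = 0.0, 1.0
    n = 0
    while low < high:
        span = high - low
        high = low + span * sym_high
        low = low + span * sym_low
        n += 1
    return n

# A symbol of probability 0.1 shrinks the interval tenfold per step.
print(symbols_until_collapse(0.9, 1.0))
```

The exact count depends on the symbol probabilities; less probable symbols exhaust the available precision sooner.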
Dynamic Interval expansion
Keep the current interval length a little larger than 1/2
Dynamic Interval expansion
An example :
What’s Arithmetic Coding for?
It’s for compression.
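(Figure: Encoder → Decoder. The file to be sent, bbb, is encoded as the magic number 0.8125, i.e. the bits 1101000, and the decoder recovers bbb as the received file. Uncompressed, each b is the byte 0110 0010, hex 62.)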
What’s Arithmetic Coding for?
Compression
Compression is usually fulfilled by making good use of symbol probabilities.
◦Unbalanced symbol probabilities imply better compression ratio.
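The effect of unbalanced probabilities can be checked with the entropy, which is the best achievable average code length in bits per symbol. This is a small sketch; the function name `entropy` and the two sample distributions are illustrative assumptions.

```python
from math import log2

def entropy(probs):
    """Shannon entropy in bits per symbol; zero-probability symbols contribute nothing."""
    return -sum(p * log2(p) for p in probs if p > 0)

balanced = entropy([0.5, 0.5])      # 1 bit per symbol: no compression possible
unbalanced = entropy([0.9, 0.1])    # well under half a bit per symbol
print(balanced, unbalanced)
```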
(Figure: the same encoder/decoder diagram. The uncompressed file stores each symbol as the byte 01100010, while the encoder sends only 0.8125, i.e. 1101000.)
Integer Arithmetic Coding
◦In practice, arithmetic coding is slow.
Too many floating-point operations
Solution 1: Buy a powerful floating-point processor
Solution 2: Integer arithmetic coding
Overview
(Figure: Encoder → Decoder, as before: bbb → 0.8125 / 1101000 → bbb. The encoder and decoder maintain integer intervals internally; the magic number that is sent is still a real number.)
New interval calculation
General Arithmetic Coding
◦New interval calculation requires FP operations
Integer Arithmetic Coding
◦New interval calculation requires only INT operations
Current interval (FP / INT) | Action | Subinterval a (Pa = 0.4) | Subinterval b (Pb = 0.5) | Subinterval EOF (PEOF = 0.1) | Input
[0.00, 1.00) / [0000, 9999) | Subdivide | [0.00, 0.40) / [0000, 4000) | [0.40, 0.90) / [4000, 9000) | [0.90, 1.00) / [9000, 9999) | b
[0.40, 0.90) / [4000, 9000) | Subdivide | [0.40, 0.60) / [4000, 6000) | [0.60, 0.85) / [6000, 8500) | [0.85, 0.90) / [8500, 9000) | b
[0.60, 0.85) / [6000, 8500) | Output 1; expand [1/2, 1) | | | |
[0.20, 0.70) / [2000, 7000) | Subdivide | [0.20, 0.40) / [2000, 4000) | [0.40, 0.65) / [4000, 6500) | [0.65, 0.70) / [6500, 7000) | b
[0.40, 0.65) / [4000, 6500) | Follow; expand [1/4, 3/4) | | | |
[0.30, 0.80) / [3000, 8000) | Subdivide | [0.30, 0.50) / [3000, 5000) | [0.50, 0.75) / [5000, 7500) | [0.75, 0.80) / [7500, 8000) | EOF
[0.75, 0.80) / [7500, 8000) | Output 10; expand [1/2, 1) | | | |
[0.50, 0.60) / [5000, 6000) | Output 1; expand [1/2, 1) | | | |
[0.00, 0.20) / [0000, 2000) | Output 0; expand [0, 1/2) | | | |
[0.00, 0.40) / [0000, 4000) | Output 0; expand [0, 1/2) | | | |
[0.00, 0.80) / [0000, 8000) | Output 0 | | | |
Integer computation of subinterval b of [3000, 8000): [3000+5000*4/10, 3000+5000*9/10)
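The table can be reproduced with a short integer-only encoder. The sketch below is an illustration under stated assumptions: the names `CUM` and `encode_int` are hypothetical, the scale N = 10000 mirrors the decimal intervals of the table, and the one-bit termination shortcut is a simplification chosen to match the table's final "Output 0" rather than a general termination rule.

```python
# Hypothetical cumulative counts out of TOTAL = 10: a 0.4, b 0.5, EOF 0.1.
CUM = {"a": (0, 4), "b": (4, 9), "EOF": (9, 10)}
TOTAL = 10

def encode_int(message, cum=CUM, total=TOTAL, N=10000):
    half, quarter = N // 2, N // 4
    low, high = 0, N            # half-open integer interval [low, high)
    follow = 0                  # pending opposite bits from "follow" steps
    out = []

    def emit(bit):
        nonlocal follow
        out.append(bit)
        out.extend(("1" if bit == "0" else "0") * follow)
        follow = 0

    for sym in message:
        span = high - low
        c_lo, c_hi = cum[sym]
        high = low + span * c_hi // total
        low = low + span * c_lo // total
        while True:
            if high <= half:                      # expand [0, 1/2)
                emit("0")
            elif low >= half:                     # expand [1/2, 1)
                emit("1")
                low -= half
                high -= half
            elif low >= quarter and high <= 3 * quarter:
                follow += 1                       # expand [1/4, 3/4)
                low -= quarter
                high -= quarter
            else:
                break
            low *= 2
            high *= 2

    # Simplified termination (assumption): one bit if a whole half fits.
    if follow == 0 and low == 0:
        emit("0")
    elif follow == 0 and high == N:
        emit("1")
    else:
        follow += 1
        emit("0" if low < quarter else "1")
    return "".join(out)

print(encode_int(["b", "b", "b", "EOF"]))
```

Only additions, subtractions, doublings and integer divisions appear in the loop, which is the point of the integer variant.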
Drawback of Integer Arithmetic
Where there is gain, there is also loss.
Approximation leads to longer code length
◦Optimal code length is obtained only with accurate probabilities
Current interval (INT) | Action | Subinterval a (Pa = 0.88) | Subinterval b (Pb = 0.02) | Subinterval EOF (PEOF = 0.1) | Input
[000, 999) | Subdivide | [000, 880) | [880, 900) | [900, 999) | a
[000, 880) | Subdivide | exact [000, 774.4), integer [000, 774) | exact [774.4, 792), integer [774, 792) | [792, 880) | b
Fortunately, the loss is limited.
Event probabilities
-Generalized symbol probabilities
Happy Birthday to You Happy Birthday to You Happy Birthday to You Happy Birthday to You
Step 1: Apply other methods to recognize events
Step 2: Collect probabilities of events
Step 3: Use arithmetic coding
[Advanced] Adaptive Model
Take advantage of locality
(Figure: the input is split into local blocks such as bbbbaabbb, bbbbaabbc and aaaaaabba. Blocks dominated by b use the code b:0 a:10 c:11, while the block dominated by a uses a:0 b:10 c:11.)
[Advanced] Scaling
Maintaining symbol counts is a problem
◦The counts can grow arbitrarily large
By periodically reducing all symbols' counts by the same factor, we can keep the relative frequencies approximately the same as before.
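Count scaling can be sketched as below. Everything here is an illustrative assumption: the class name `AdaptiveModel`, the starting count of 1 per symbol, the `limit` parameter, and halving with rounding up so that no count drops to zero.

```python
class AdaptiveModel:
    """Adaptive frequency counts with periodic scaling (hypothetical sketch)."""

    def __init__(self, symbols, limit=8):
        self.counts = {s: 1 for s in symbols}   # start every symbol at 1
        self.limit = limit                      # maximum allowed total count

    def probability(self, sym):
        return self.counts[sym] / sum(self.counts.values())

    def update(self, sym):
        self.counts[sym] += 1
        if sum(self.counts.values()) > self.limit:
            # Halve all counts, rounding up so every symbol stays codable.
            for s in self.counts:
                self.counts[s] = (self.counts[s] + 1) // 2

model = AdaptiveModel("abc")
for sym in "bbbbaabbb":
    model.update(sym)
print(model.counts, model.probability("b"))
```

Besides bounding the counts, halving gives recent symbols more weight than old ones, which also helps exploit locality.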
[Advanced] High Order Models
P(i) > P(萄)
P(萄 | last character = 葡) is almost 100% (葡萄 is the Chinese word for "grape")
REDUCED-PRECISION
ARITHMETIC CODING
3-1
Reduced-Precision Arithmetic Coding
Arithmetic operations → table lookups
Reduce the number of possible states
◦Reduce N in [0,N)
◦N must be even; a multiple of 4 is preferred
Still completely reversible
The decoder makes the same assignments
Only the average code length is affected (it increases slightly)
Definitions and Assumptions
Definitions
◦Follow: follow-on case
The process is described in Dynamic Interval expansion
◦α: Cutoff probability between 1/2 and 3/4
Excess code length is not very sensitive to α
◦“-”: no output
Assumptions
◦Prob{0} is uniformly distributed on (0,1)
Simplest Non-Trivial Coder (N=4)
(Figure: the interval [0,4) with states 0 to 4, probability boundaries 1/4, 1/2 and 3/4, the cutoffs α and 1-α, and the outputs 00, 01, 10 and 11.)
Eliminate the need of “follow”
(Figure: the N=4 diagram from the previous slide, with probability boundaries 1/4, 1/2 and 3/4 and cutoffs α and 1-α, modified so that the follow case never arises.)
More/Less Probable Symbol Idea
More/Less Probable Symbol (MPS/LPS): 1/0
Consider Prob{MPS} in [1/2, 1) only
Combine transitions and eliminate states
(Figure: the N=4 coder redrawn with LPS: 0 and MPS: 1, probability boundaries 1/4, 1/2 and 3/4, cutoff α, and outputs 0, 00 and 1.)
State | Prob{MPS} | LPS output | LPS next state | MPS output | MPS next state
[0, 4) | ½ ≤ p ≤ α | 0 | [0, 4) | 1 | [0, 4)
[0, 4) | α < p < 1 | 00 | [0, 4) | - | [1, 4)
[1, 4) | | 01 | [0, 4) | 1 | [0, 4)

State | LPS output | LPS next state | MPS output | MPS next state
[0, 4) | 00 | [0, 4) | - | [1, 4)
[1, 4) | 01 | [0, 4) | 1 | [0, 4)
Maximally Unbalanced Subdivision
State | LPS output | LPS next state | MPS output | MPS next state
[0, 8) | 000 | [0, 8) | - | [1, 8)
[1, 8) | 001 | [0, 8) | - | [2, 8)
[2, 8) | 010 | [0, 8) | - | [3, 8)
Elias Code
State | LPS output | LPS next state | MPS output | MPS next state
[0, 2)/2 | 0 | STOP | 1 | [0, 4)/4
[0, 4)/4 | 00 | STOP | - | [1, 4)/4
[1, 4)/4 | 01 | STOP | 1 | [0, 8)/8
[0, 8)/8 | 000 | STOP | - | [1, 8)/8
[1, 8)/8 | 001 | STOP | - | [2, 8)/8
N=8
A SIX-STATE CODER
N=8, a six-state coder
(Figure: state diagram of the six-state coder, including the maximally unbalanced subdivision; the labelled transition outputs 0.)
N=8, a six-state coder
State [0,8), Prob{MPS} = 7/8:
LPS: 111
MPS: [0,7)
(Figure: the interval [0,8) on a number line from 0 to 8 with cells 000 to 111; the LPS takes the last cell, 111, and the MPS keeps [0,7).)
N=8, a six-state coder
From state [0,8) with Prob{MPS} = 7/8, the MPS leads to [0,7). In state [0,7) there are three possible subdivisions:
Prob{MPS} | LPS | MPS
4/7 | 1, [0,6) | 0
5/7 | 1f | [0,5)
6/7 | 110 | [0,6)
(Figure: the interval [0,7) on a number line with cells 000 to 110, with each of the three subdivisions highlighted in turn.)
FLEXIBLE CODER DESIGN
A class of reduced-precision coders
N= any power of 2
All states are of the form [k,N)
Denote state [k,N) by k
(Figure: the interval [0,8) on a number line from 0 to 8, with cells labelled 000 to 111.)
N= any power of 2
Number of states is N/2
A state with k ≥ N/2 produces output, and the interval is expanded
In every state [k,N)
Maximally unbalanced subdivision (at k+1)
-Prob{MPS} between (N-2)/N and (N-1)/N
(Figure: repeated MPS inputs on the [0,8) number line, ending with Output: 1.)
In every state [k,N)
Also include a nearly balanced subdivision
-So that we do not lose efficiency when Prob{MPS} ≈ 1/2
(Figure: the [0,8) number line split near the middle into an LPS part and an MPS part, with the outputs 0 and 1 marked for the two halves.)
In state k
Divide at k+1
LPS: output the lg N bits of k, and move to state 0
MPS: move to state k+1
If the next state is N/2: output 1, and move to state 0
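The rules above translate directly into a lookup table. The sketch below covers only the maximally unbalanced subdivision and is an illustration under assumptions: the function names `build_table` and `encode_mps_lps` and the table layout are hypothetical.

```python
def build_table(N):
    """Transition table for states k = 0 .. N/2 - 1, each meaning the interval [k, N)."""
    bits = N.bit_length() - 1                  # lg N, for N a power of 2
    half = N // 2
    table = {}
    for k in range(half):
        lps = (format(k, "0{}b".format(bits)), 0)       # output k in lg N bits, go to 0
        mps = ("1", 0) if k + 1 == half else ("", k + 1)
        table[k] = {"LPS": lps, "MPS": mps}
    return table

def encode_mps_lps(symbols, N=8):
    """Encode a sequence of 'MPS'/'LPS' events using table lookups only."""
    table, state, out = build_table(N), 0, []
    for sym in symbols:
        output, state = table[state][sym]
        out.append(output)
    return "".join(out)

print(build_table(4))
print(encode_mps_lps(["MPS", "MPS", "LPS"]))
```

For N = 4 the table matches the two-state table shown earlier: state 0 emits 00 on LPS and nothing on MPS, and state 1 emits 01 on LPS and 1 on MPS.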
Example
Additional output and expansion may be possible.
(Figure: three steps on the [0,8) number line, with cells 000 to 111, showing LPS and MPS subdivisions followed by further output and expansion.)
N= any power of 2
◦Small number of states
◦In every state, Prob{MPS} can be:
near 1
near ½
in between
◦We can choose a large N:
highly probable events require negligible code length
the number of states stays small enough to allow table lookups rather than arithmetic
PARTITIONS - ρ and ε
3-2
Partitions - ρ and ε
We know that it is possible to design a binary arithmetic coder that uses only a small number of possible probabilities.
Now we want to give a theoretical basis for selecting those probabilities.
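Excess code length
Average code length of a given model (true probabilities $p_i$, model probabilities $q_i$)
◦$L = -\sum_i p_i \log_2 q_i$
Optimal average code length
◦$H = -\sum_i p_i \log_2 p_i$
Excess code length
◦$E = L - H$
If we let $d_i = q_i - p_i$, we have
$E = -\sum_i p_i \log_2\left(1 + \frac{d_i}{p_i}\right) = \frac{1}{\ln 2} \sum_i \left( \frac{d_i^2}{2 p_i} + O\left(\frac{d_i^3}{p_i^2}\right) \right)$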
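The definitions can be checked numerically. The sketch below is illustrative only: the function names and the two sample distributions are assumptions, and `second_order` is the leading term of the series expansion in d_i = q_i - p_i (using the fact that the d_i sum to zero).

```python
from math import log, log2

def code_lengths(p, q):
    """Return (L, H, E): model code length, optimal code length, and excess."""
    L = -sum(pi * log2(qi) for pi, qi in zip(p, q) if pi > 0)
    H = -sum(pi * log2(pi) for pi in p if pi > 0)
    return L, H, L - H

def second_order(p, q):
    """Leading term of E: (1 / ln 2) * sum of d_i^2 / (2 p_i)."""
    return sum((qi - pi) ** 2 / (2 * pi) for pi, qi in zip(p, q) if pi > 0) / log(2)

p = [0.5, 0.3, 0.2]      # hypothetical true probabilities
q = [0.5, 0.25, 0.25]    # hypothetical approximate probabilities used by the coder
L, H, E = code_lengths(p, q)
print(L, H, E, second_order(p, q))
```

The excess is quadratic in the error, which is why fairly coarse probability approximations cost little.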
ε-partitions
We can partition the space of possible probabilities to guarantee that the use of approximate probabilities never adds more than ε to the code length of any event.
ε-partitions
1. Set $i := 0$ and $Q_0 := 1/2$.
2. Find $P_{i+1} > Q_i$ such that $E(P_{i+1}, Q_i) = \varepsilon$.
3. Find $Q_{i+1} > P_{i+1}$ such that $E(P_{i+1}, Q_{i+1}) = \varepsilon$.
4. Increment $i$ and repeat steps 2 and 3 until $P_{i+1}$ or $Q_{i+1}$ reaches 1.
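The four steps can be sketched with a simple bisection search. Several things here are assumptions rather than statements from the slides: the helper names, the use of bisection, the cap `top` just below 1, and the value ε = 0.05 bits, which is chosen only because it appears to reproduce the boundaries 0.6309 and 0.7499 in the table of this section.

```python
from math import log2

def excess(p, q):
    """Excess bits per event when a binary event of true probability p is coded with q."""
    return p * log2(p / q) + (1 - p) * log2((1 - p) / (1 - q))

def bisect(f, lo, hi, iters=100):
    """Root of an increasing function f on [lo, hi], assuming f(lo) < 0 < f(hi)."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def epsilon_partition(eps, top=1 - 1e-12):
    """Return boundaries P_1, P_2, ... and probabilities to use Q_0, Q_1, ... above 1/2."""
    qs, ps = [0.5], []
    while True:
        q = qs[-1]
        if excess(top, q) <= eps:        # Q already serves every P up to 1: stop
            break
        p = bisect(lambda x: excess(x, q) - eps, q, top)     # step 2
        ps.append(p)
        qs.append(bisect(lambda x: excess(p, x) - eps, p, top))   # step 3
    return ps, qs

ps, qs = epsilon_partition(0.05)
print([round(x, 4) for x in ps])
print([round(x, 4) for x in qs])
```

The lower half of the partition follows by symmetry, replacing each value x with 1 - x.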
ε-partitions
Range of actual probabilities | Probability to use
[0.0000, 0.0130) | 0.0003
[0.0130, 0.1421) | 0.0676
[0.1421, 0.3691) | 0.2501
[0.3691, 0.6309) | 0.5000
[0.6309, 0.8579) | 0.7499
[0.8579, 0.9870) | 0.9324
[0.9870, 1.0000) | 0.9997
ρ-partitions
We might wish to limit the relative error so that the code length can never exceed the optimal by more than a factor of 1+ρ.
The procedure is similar to that for ε-partitions.
The ρ-partitions are not finite.
◦As P approaches 1, the optimal average code length grows very small, so to obtain a small relative loss Q must be very close to P.
ρ-partitions
Range of actual probabilities | Probability to use
… | …
[0.0033, 0.0154) | 0.0069
[0.0154, 0.0573) | 0.0291
[0.0573, 0.1670) | 0.0982
[0.1670, 0.3722) | 0.2555
[0.3722, 0.6278) | 0.5000
[0.6278, 0.8330) | 0.7445
[0.8330, 0.9427) | 0.9018
[0.9427, 0.9846) | 0.9709
[0.9846, 0.9967) | 0.9931
COMPRESSED TREES
3-3
To apply the fast binary arithmetic coding algorithms from the previous sections, input symbols must be transformed into binary sequences.
Compressed trees are proposed for this task.
(Figure: symbol A mapped to the bit sequence 0110.)
Tree construction
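Symbol | a | b | c | d | e | f | g | h
Probability | 0 | 0 | 1/8 | 1/4 | 1/8 | 0 | 1/8 | 3/8
(Figure: binary tree over the leaves a to h. Each internal node stores the percentage probabilities of its left and right branches: root 38/62; second level 0/100 and 20/80; third level -/-, 33/67, 100/0 and 25/75.)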
Linearize the tree
(Figure: the same tree over the leaves a to h, with node labels 38/62, 0/100, 20/80, -/-, 33/67, 100/0 and 25/75, read level by level.)
Compress the tree
38 0 20 - 33 100 25
38 0 20 33 100 25
Because the probabilities of a and b are both 0, the omitted node (the a/b node, marked - -) can be recovered.
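The construction, linearization and compression steps can be sketched as follows. This is an illustration under assumptions: the function names are hypothetical, the tree is stored in heap order (node i has children 2i+1 and 2i+2), and exact fractions are kept instead of the rounded percentages shown on the slides.

```python
from fractions import Fraction as F

def linearize(probs):
    """Left-branch probability of each internal node, level by level.

    probs holds the leaf probabilities (a power-of-two count). A node whose
    subtree has total probability 0 is never visited and is reported as None.
    """
    n = len(probs)
    weight = [F(0)] * (n - 1) + [F(p) for p in probs]    # internal nodes, then leaves
    for i in range(n - 2, -1, -1):
        weight[i] = weight[2 * i + 1] + weight[2 * i + 2]
    return [weight[2 * i + 1] / weight[i] if weight[i] else None
            for i in range(n - 1)]

def compress(nodes):
    """Drop the unreachable nodes; a decoder can infer them from the zero branches."""
    return [x for x in nodes if x is not None]

# Leaf probabilities for a..h from the slide.
probs = [0, 0, F(1, 8), F(1, 4), F(1, 8), 0, F(1, 8), F(3, 8)]
linear = linearize(probs)
print([None if x is None else round(100 * x) for x in linear])
print([round(100 * x) for x in compress(linear)])
```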