Practical Implementations of Arithmetic Coding

(1)


Paul G. Howard and Jeffrey Scott Vitter

吳浩庠 R99944019 楊鈞傑 R99922150 黃信博 B96902039 吳彥緯 D98922013 蔡佩真 B96901012

李枝新 D99945016 姚甯之 R99944014

朱民晃 R96943077 李佳憲 R99945042


(2)

Arithmetic Coding

• Advantages

◦ Flexibility

◦ Optimality

• Disadvantage

◦ Slowness

(3)

Overview

• Section 2: Tutorial on arithmetic coding

◦ Basic algorithm

◦ Dynamic interval expansion

◦ Integer arithmetic coding

• Section 3

◦ Improving the speed of arithmetic coding

(4)

Basic Algorithm

• 1. Begin with a “current interval” [L,H) initialized to [0,1).

(5)

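Basic Algorithm

• 2. For each symbol of the file, we perform two steps:

◦ (a) Subdivide the current interval into subintervals, one for each possible symbol. For symbol a_i with probability p_i, the cumulative probabilities are

P_C = \sum_{k=1}^{i-1} p_k

P_N = \sum_{k=1}^{i} p_k

and the new subinterval is [L + P_C (H − L), L + P_N (H − L)).

◦ (b) Select the subinterval corresponding to the next symbol to be read (e.g. a_i) and make it the new current interval.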

(6)

Basic Algorithm

(7)

Basic Algorithm

• 3. Output enough bits to distinguish the final current interval from all other possible final intervals.

◦ Length of the final subinterval

= product of the probabilities of the individual symbols

= probability p of the particular sequence of symbols in the file.

◦ The final step uses almost exactly −log2 p bits.

(8)

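Encoding algorithm for arithmetic coding

L = 0.0; H = 1.0;
while not EOF do
    range = H - L;
    read(a_i);
    H = L + range × H(a_i);   // H(a_i): upper end of the range assigned to a_i
    L = L + range × L(a_i);   // L(a_i): lower end of the range assigned to a_i
end while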
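The loop above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function name `encode_interval` and the `MODEL` table are assumptions, with the ranges taken from the three-symbol example used in this deck. Exact fractions are used so that no rounding hides the result.

```python
from fractions import Fraction

# Assumed model: symbol -> [low, high) range, as in the deck's a / b / EOF example.
MODEL = {
    "a": (Fraction(0), Fraction(4, 10)),
    "b": (Fraction(4, 10), Fraction(9, 10)),
    "EOF": (Fraction(9, 10), Fraction(1)),
}

def encode_interval(symbols, model=MODEL):
    """Return the final interval [low, high) after narrowing once per symbol."""
    low, high = Fraction(0), Fraction(1)
    for s in symbols:
        width = high - low
        sym_low, sym_high = model[s]
        # Compute the new high first: both updates must use the old low.
        high = low + width * sym_high
        low = low + width * sym_low
    return low, high

low, high = encode_interval(["b", "b", "b", "EOF"])
print(float(low), float(high))  # 0.8125 0.825
```

The final width, 0.0125, equals the product of the symbol probabilities 0.5 × 0.5 × 0.5 × 0.1.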

(9)

Arithmetic Coding

- Example

Symbol   Probability   Range
a        0.4           [0.00, 0.40)
b        0.5           [0.40, 0.90)
EOF      0.1           [0.90, 1.00)

Suppose that we want to encode the following message:

b b b EOF

(10)

Arithmetic Coding

- Example

[Figure: successive subdivision of the current interval for the input b, b, b, EOF: [0, 1) → [0.4, 0.9) → [0.6, 0.85) → [0.7, 0.825) → [0.8125, 0.825).]

(11)

Arithmetic Coding

- Example

Current interval [L,H) | Action    | Subinterval a  | Subinterval b   | Subinterval EOF | Input
[0.000, 1.000)         | Subdivide | [0.000, 0.400) | [0.400, 0.900)  | [0.900, 1.000)  | b
[0.400, 0.900)         | Subdivide | [0.400, 0.600) | [0.600, 0.850)  | [0.850, 0.900)  | b
[0.600, 0.850)         | Subdivide | [0.600, 0.700) | [0.700, 0.825)  | [0.825, 0.850)  | b
[0.700, 0.825)         | Subdivide | [0.700, 0.750) | [0.750, 0.8125) | [0.8125, 0.825) | EOF
[0.8125, 0.825)        |           |                |                 |                 |

(12)

Arithmetic Coding

- Example

• Final interval = [0.8125, 0.825)

= [0.1101000000, 0.1101001100) (binary form, upper end truncated to 10 bits)

• We can uniquely identify this interval by the bits 1101000.

• Probability p = (0.5) × (0.5) × (0.5) × (0.1) = 0.0125, so −log2 p ≈ 6.32 bits and the 7 output bits are close to optimal.
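One way to see why 7 bits are enough is to search for the shortest bit string whose whole dyadic interval fits inside the final interval. The helper `shortest_code` below is a hypothetical name and a sketch of that search only; it is not claimed to be the termination rule of any particular coder.

```python
from fractions import Fraction
from math import ceil

def shortest_code(low, high):
    """Shortest bit string b such that every number 0.b... lies in [low, high)."""
    k = 1
    while True:
        m = ceil(low * 2**k)                 # smallest k-bit value m/2^k that is >= low
        if Fraction(m + 1, 2**k) <= high:    # the whole cell [m/2^k, (m+1)/2^k) fits
            return format(m, f"0{k}b")
        k += 1

print(shortest_code(Fraction(13, 16), Fraction(33, 40)))  # 1101000
```

With 6 bits the cell starting at 0.8125 would end at 0.828125, which is outside the interval, so 7 bits is the minimum here.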

(13)

Dynamic Interval expansion

• The problem with basic arithmetic coding: the shrinking current interval requires high-precision arithmetic.

• IEEE 754 standard:

Single precision => about 10^-7; double precision => about 10^-16

Roughly, fewer than 30 symbols can be coded before the interval collapses!

• We need dynamic interval expansion.
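The precision problem is easy to reproduce. The sketch below (function name and defaults are assumptions) repeatedly encodes the symbol b from the example model using ordinary double-precision floats and reports after how many symbols the interval either collapses or stops changing. The exact count depends on the symbol probabilities: less probable symbols shrink the interval faster.

```python
def symbols_until_stuck(sym_low=0.4, sym_high=0.9, limit=1000):
    """Count how many identical symbols fit before a float interval stops shrinking."""
    low, high = 0.0, 1.0
    for n in range(1, limit + 1):
        width = high - low
        new_high = low + width * sym_high
        new_low = low + width * sym_low
        # Stop when the interval is empty or rounding leaves it unchanged.
        if new_high <= new_low or (new_low, new_high) == (low, high):
            return n
        low, high = new_low, new_high
    return None

print(symbols_until_stuck())
```

Once this point is reached, different continuations of the message map to the same pair of floats and can no longer be told apart by the decoder.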

(14)

Dynamic Interval expansion

• Keep the current interval length larger than 1/4: whenever the interval falls entirely within [0,1/2), [1/2,1), or [1/4,3/4), double it.

(15)

Dynamic Interval expansion

• An example:

(16)

What’s Arithmetic Coding for?

• It’s for compression.

[Figure: Encoder → Decoder. The file to be sent, bbb, is transmitted as the magic number 0.8125 (bits 1101000) and recovered as the received file bbb.]

(17)

[Figure: 0110 0010 (hex digits 6 and 2), the 8-bit ASCII code of b.]

(18)

What’s Arithmetic Coding for?

• Compression

• Compression is usually achieved by exploiting symbol probabilities.

◦ The more unbalanced the symbol probabilities, the better the achievable compression ratio.

[Figure: Encoder → Decoder diagram (bbb → 0.8125 / 1101000 → bbb), shown together with the uncompressed 8-bit pattern 01100010 on both sides.]

(19)

Integer Arithmetic Coding

◦ In practice, arithmetic coding is slow.

• Too many floating-point operations

• Solution 1: Buy more powerful floating-point processors

• Solution 2: Integer arithmetic coding

• Overview

[Figure: Encoder → Decoder (bbb → 0.8125 / 1101000 → bbb). The encoder and decoder maintain integer intervals internally; the transmitted code is still a real number.]

(20)

New interval calculation

• General Arithmetic Coding

◦ New interval calculation requires floating-point operations

• Integer Arithmetic Coding

◦ New interval calculation requires only integer operations

(21)

Current interval (FP) | Current interval (INT) | Action | Subinterval a (P_a = 0.4) | Subinterval b (P_b = 0.5) | Subinterval EOF (P_EOF = 0.1) | Input
[0.00, 1.00) | [0000, 9999) | Subdivide | [0.00, 0.40) [0000, 4000) | [0.40, 0.90) [4000, 9000) | [0.90, 1.00) [9000, 9999) | b
[0.40, 0.90) | [4000, 9000) | Subdivide | [0.40, 0.60) [4000, 6000) | [0.60, 0.85) [6000, 8500) | [0.85, 0.90) [8500, 9000) | b
[0.60, 0.85) | [6000, 8500) | Output 1; expand [1/2, 1) | | | |
[0.20, 0.70) | [2000, 7000) | Subdivide | [0.20, 0.40) [2000, 4000) | [0.40, 0.65) [4000, 6500) | [0.65, 0.70) [6500, 7000) | b
[0.40, 0.65) | [4000, 6500) | Follow; expand [1/4, 3/4) | | | |
[0.30, 0.80) | [3000, 8000) | Subdivide | [0.30, 0.50) [3000, 5000) | [0.50, 0.75) [5000, 7500) | [0.75, 0.80) [7500, 8000) | EOF
[0.75, 0.80) | [7500, 8000) | Output 10; expand [1/2, 1) | | | |
[0.50, 0.60) | [5000, 6000) | Output 1; expand [1/2, 1) | | | |
[0.00, 0.20) | [0000, 2000) | Output 0; expand [0, 1/2) | | | |
[0.00, 0.40) | [0000, 4000) | Output 0; expand [0, 1/2) | | | |
[0.00, 0.80) | [0000, 8000) | Output 0 | | | |

Note: the b subinterval of [3000, 8000) is computed with integer operations only, as [3000 + 5000*4/10, 3000 + 5000*9/10) = [5000, 7500).
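The trace in the table can be reproduced with a short integer-only sketch. Everything named here is an assumption made for illustration: the function `encode_integer`, the cumulative table `CUM` in tenths, and the use of a half-open integer range [0, N) with N = 10000 (the slide writes the upper end as 9999). The termination rule at the end is one simple choice that matches the single final 0 in the table; other coders terminate differently.

```python
def encode_integer(symbols, cum, total, n=10000):
    """Integer arithmetic coder sketch with interval expansion and follow bits."""
    half, quarter = n // 2, n // 4
    low, high, follow = 0, n, 0
    out = []

    def emit(bit):
        # Output a bit, then the pending follow bits, which are its opposite.
        nonlocal follow
        out.append(str(bit))
        out.append(str(1 - bit) * follow)
        follow = 0

    for s in symbols:
        width = high - low
        c_lo, c_hi = cum[s]
        high = low + width * c_hi // total   # integer operations only
        low = low + width * c_lo // total
        while True:
            if high <= half:                                # inside [0, 1/2)
                emit(0)
                low, high = 2 * low, 2 * high
            elif low >= half:                               # inside [1/2, 1)
                emit(1)
                low, high = 2 * (low - half), 2 * (high - half)
            elif low >= quarter and high <= 3 * quarter:    # inside [1/4, 3/4)
                follow += 1
                low, high = 2 * (low - quarter), 2 * (high - quarter)
            else:
                break

    # Termination (assumed rule): one bit suffices if a whole half fits.
    if follow == 0 and low == 0:
        out.append("0")
    elif follow == 0 and high == n:
        out.append("1")
    else:
        follow += 1
        emit(0 if low < quarter else 1)
    return "".join(out)

CUM = {"a": (0, 4), "b": (4, 9), "EOF": (9, 10)}
print(encode_integer(["b", "b", "b", "EOF"], CUM, 10))  # 1101000
```

The output is built from the same steps as the table: 1, then 10 (a 1 plus one follow bit), then 1, 0, 0, and the terminating 0.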

(22)

Drawback of Integer Arithmetic

• Every gain comes with a loss.

• Approximation leads to a longer code length

◦ The optimal code length is obtained only with exact probabilities

Current interval (INT) | Action | Subinterval a (P_a = 0.88) | Subinterval b (P_b = 0.02) | Subinterval EOF (P_EOF = 0.1) | Input
[000, 999) | Subdivide | [000, 880) | [880, 900) | [900, 999) | a
[000, 880) | Subdivide | [000, 774.4) → [000, 774) | [774.4, 792) → [774, 792) | [792, 880) | b
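The effect of truncating the boundaries can be quantified with a small sketch. The helper `effective_probs` is a hypothetical name; it recomputes the second row of the table with integer division and returns the probabilities the coder effectively uses.

```python
from math import log2

def effective_probs(low, high, cum_percent):
    """Subdivide [low, high) at integer-truncated boundaries; return effective probabilities."""
    width = high - low
    bounds = [low + width * c // 100 for c in cum_percent]
    return [(b - a) / width for a, b in zip(bounds, bounds[1:])]

# Cumulative percentages for a (0.88), b (0.02), EOF (0.10).
true_probs = [0.88, 0.02, 0.10]
used_probs = effective_probs(0, 880, [0, 88, 90, 100])

# Average extra bits per symbol caused by the approximation.
excess = sum(p * log2(p / q) for p, q in zip(true_probs, used_probs))
print(used_probs, excess)
```

Here a is coded with 774/880 instead of 0.88 and b with 18/880 instead of 0.02, so the average code length is slightly above the optimum.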

(23)

Fortunately, the loss is limited

(24)

Event probabilities

- Generalized symbol probabilities

Happy Birthday to You
Happy Birthday to You
Happy Birthday to You
Happy Birthday to You

Step 1: Apply other methods to recognize events
Step 2: Collect the probabilities of the events
Step 3: Use arithmetic coding

(25)

[Advanced] Adaptive Model

• Take advantage of locality

[Figure: the input bbbbaabbb bbbbaabbc aaaaaabba is coded block by block. The b-heavy blocks use the code b:0 a:10 c:11, while the a-heavy block uses a:0 b:10 c:11.]

(26)

[Advanced] Scaling

• Maintaining symbol counts is a problem

◦ The counts can become arbitrarily large

• By periodically reducing all symbol counts by the same factor, we keep the relative frequencies approximately unchanged while bounding the counts.
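A minimal sketch of count scaling, under assumptions of my own: the class name `ScaledCounts`, the halving factor, the threshold `limit`, and the rule that no count drops below 1 are illustrative choices, not details given on the slide.

```python
from fractions import Fraction

class ScaledCounts:
    """Adaptive symbol counts with periodic halving (hypothetical helper)."""

    def __init__(self, symbols, limit=8):
        self.counts = {s: 1 for s in symbols}   # start at 1 so no symbol has probability 0
        self.limit = limit

    def probability(self, symbol):
        return Fraction(self.counts[symbol], sum(self.counts.values()))

    def update(self, symbol):
        self.counts[symbol] += 1
        if sum(self.counts.values()) > self.limit:
            # Halve every count, keeping each at least 1.
            for s in self.counts:
                self.counts[s] = max(1, self.counts[s] // 2)

model = ScaledCounts("abc", limit=8)
for _ in range(6):
    model.update("b")
print(model.counts)  # {'a': 1, 'b': 3, 'c': 1}
```

Halving also makes recent symbols weigh more than old ones, which fits the locality idea of the adaptive model.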

(27)

[Advanced] High Order Models

• P(i) > P(萄)

• P(萄 | previous character = 葡) is almost 100% (葡萄 is the Chinese word for “grape”, and 萄 almost always follows 葡)

(28)

REDUCED-PRECISION

ARITHMETIC CODING

3-1

(29)

Reduced-Precision Arithmetic Coding

• Arithmetic operations → table lookups

• Reduce the number of possible states

◦ Reduce N in [0,N)

◦ N must be even; a multiple of 4 is preferred

• Still completely reversible

• The decoder makes the same assignments

• Only the average code length is affected (it becomes slightly longer)

(30)

Definitions and Assumptions

• Definitions

◦ Follow: the follow-on case

• The process is described under Dynamic Interval expansion

◦ α: cutoff probability, between 1/2 and 3/4

• The excess code length is not very sensitive to α

◦ “-”: no output

• Assumptions

◦ Prob{0} is uniformly distributed on (0,1)

(31)

Simplest Non-Trivial Coder (N=4)

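[Figure: the N = 4 coder. The interval [0,4) is labeled with the codewords 00, 01, 10, 11; possible subdivision points lie at probabilities 1/4, 1/2, and 3/4, with cutoff α.]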

(32)

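Eliminating the need for “follow”

[Figure: the N = 4 coder diagram (codewords 00, 01, 10, 11; subdivision points at probabilities 1/4, 1/2, 3/4; cutoff α), modified so that the follow case is no longer needed.]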

(33)

More/Less Probable Symbol Idea

• More/Less Probable Symbol (MPS/LPS): 1/0

• Consider Prob{MPS} in [1/2, 1) only

• Combine transitions and eliminate states

[Figure: the simplified N = 4 diagram with LPS = 0 and MPS = 1, subdivision points at probabilities 1/4, 1/2, 3/4, and cutoff α.]

(34)

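State  | Prob{MPS}    | LPS input: Output | LPS input: Next state | MPS input: Output | MPS input: Next state
[0, 4) | 1/2 ≤ p ≤ α  | 0                 | [0, 4)                | 1                 | [0, 4)
[0, 4) | α < p < 1    | 00                | [0, 4)                | -                 | [1, 4)
[1, 4) |              | 01                | [0, 4)                | 1                 | [0, 4)

State  | LPS input: Output | LPS input: Next state | MPS input: Output | MPS input: Next state
[0, 4) | 00                | [0, 4)                | -                 | [1, 4)
[1, 4) | 01                | [0, 4)                | 1                 | [0, 4)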

(35)


(36)

Maximally Unbalanced Subdivision

State  | LPS input: Output | LPS input: Next state | MPS input: Output | MPS input: Next state
[0, 8) | 000               | [0, 8)                | -                 | [1, 8)
[1, 8) | 001               | [0, 8)                | -                 | [2, 8)
[2, 8) | 010               | [0, 8)                | -                 | [3, 8)

(37)


(38)

Elias Code

State     | LPS input: Output | LPS input: Next state | MPS input: Output | MPS input: Next state
[0, 2)/2  | 0                 | STOP                  | 1                 | [0, 4)/4
[0, 4)/4  | 00                | STOP                  | -                 | [1, 4)/4
[1, 4)/4  | 01                | STOP                  | 1                 | [0, 8)/8
[0, 8)/8  | 000               | STOP                  | -                 | [1, 8)/8
[1, 8)/8  | 001               | STOP                  | -                 | [2, 8)/8

(39)

N=8

A SIX-STATE CODER


(40)

N=8 , a six-state coder

(41)

N=8 , a six-state coder

[Figure: subdivisions of the current interval, including the maximally unbalanced subdivision and a subdivision that produces Output: 0.]

(42)

N=8 , a six-state coder

State [0,8), Prob{MPS} = 7/8: LPS: 111; MPS: [0,7)

(43)

N=8 , a six-state coder

State [0,8), Prob{MPS} = 7/8: LPS: 111; MPS: [0,7)

[Figure: the interval [0,8) labeled with the codewords 000 to 111; the LPS takes [7,8) and the MPS takes [0,7).]

(44)

N=8 , a six-state coder

Subdivisions in state [0,7):

Prob{MPS} = 4/7: LPS: 1, [0,6); MPS: 0
Prob{MPS} = 5/7: LPS: 1f; MPS: [0,5)

(45)

N=8 , a six-state coder

Prob{MPS} = 4/7: LPS: 1, [0,6); MPS: 0
Prob{MPS} = 5/7: LPS: 1f; MPS: [0,5)
Prob{MPS} = 6/7: LPS: 110; MPS: [0,6)

[Figure: the interval [0,7) under the codewords 000 to 110, subdivided for Prob{MPS} = 4/7 (MPS: 0, LPS: 1). The state [0,7) is reached from [0,8) with Prob{MPS} = 7/8.]

(46)

N=8 , a six-state coder

[Figure: the interval [0,7) under the codewords 000 to 110, subdivided for Prob{MPS} = 5/7 (MPS: [0,5), LPS: 1f).]

(47)

N=8 , a six-state coder

[Figure: the interval [0,7) under the codewords 000 to 110, subdivided for Prob{MPS} = 6/7 (MPS: [0,6), LPS: 110).]

(48)

FLEXIBLE CODER DESIGN

A class of reduced-precision coders

(49)

N = any power of 2

• All states are of the form [k,N)

• Denote state [k,N) by k

[Figure: the interval [0,8) with positions 0 to 8 and the codewords 000 to 111.]

(50)

N = any power of 2

• The number of states is N/2

• A state with k ≥ N/2 produces output, and the interval is expanded

(51)

• In every state [k,N)

• Maximally unbalanced subdivision (at k+1)

- Prob{MPS} between (N−2)/N and (N−1)/N

[Figure: the interval [0,8) with codewords 000 to 111, marking the MPS subintervals and the case that produces Output: 1.]

(52)

• In every state [k,N)

• Also include a nearly balanced subdivision

- So that we do not lose efficiency when Prob{MPS} ≈ 1/2

[Figure: the interval [0,8) split near the middle into an LPS part and an MPS part, with outputs 1 and 0.]

(53)

• In state k

• Divide at k+1

• LPS: output the lg N bits of k, and move to state 0

• MPS: move to state k+1

• If the next state is N/2: output 1, and move to state 0
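The transition rules for the maximally unbalanced subdivision can be written as a tiny state machine. This is a sketch under assumptions: the function name `encode_unbalanced` is hypothetical, the input is a list of booleans (True for MPS, False for LPS), and flushing the final state at the end of the input is deliberately left out.

```python
def encode_unbalanced(events, n=8):
    """Encode MPS/LPS events using only the subdivision at k+1 in each state k."""
    lg_n = n.bit_length() - 1        # lg N, valid when N is a power of 2
    state, out = 0, []
    for is_mps in events:
        if not is_mps:
            # LPS: output the lg N bits of k and return to state 0.
            out.append(format(state, f"0{lg_n}b"))
            state = 0
        else:
            # MPS: move to state k+1; reaching N/2 outputs 1 and resets.
            state += 1
            if state == n // 2:
                out.append("1")
                state = 0
    return "".join(out), state

print(encode_unbalanced([True, True, False], n=8))  # ('010', 0)
print(encode_unbalanced([True, True], n=4))         # ('1', 0)
```

For N = 4 and N = 8 these transitions agree with the state tables shown earlier: an LPS in state [k,N) emits the binary code of k, and a run of N/2 MPS events costs a single bit.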

(54)

Example

Additional output and expansion may be possible

[Figure: the interval [0,8) with codewords 000 to 111, divided into an LPS part and an MPS part.]

(55)

Example

[Figure: the interval [0,8) with codewords 000 to 111; after an MPS the current interval becomes [2,8).]

(56)

Example

[Figure: the interval [0,8) with codewords 000 to 111; the LPS is marked between positions 3 and 4.]

(57)

N = any power of 2

◦ Small number of states

◦ In every state, Prob{MPS} can be:

• near 1

• near ½

• in between

◦ We can choose a large N

• highly probable events require negligible code length

• the number of states stays small enough to allow table lookups rather than arithmetic

(58)

PARTITIONS - ρ and ε

3-2

(59)

Partitions - ρ and ε

• We know that it is possible to design a binary arithmetic coder that uses only a small number of possible probabilities.

• Now we want to give a theoretical basis for selecting those probabilities.

(60)

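Excess code length

• Average code length of a given model, which uses probabilities q_i when the true probabilities are p_i

◦ L = -\sum_i p_i \log_2 q_i

• Optimal average code length

◦ H = -\sum_i p_i \log_2 p_i

• Excess code length

◦ E = L - H

• If we let d_i = q_i - p_i, we have

E = \sum_i p_i \log_2 (p_i / q_i) = -\sum_i p_i \log_2 (1 + d_i / p_i)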

(61)

ε-partitions

• We can partition the space of possible probabilities to guarantee that the use of approximate probabilities will never add more than ε to the code length of any event.

(62)

ε-partitions

• 1. Set i := 0 and Q_0 := 1/2

• 2. Find P_{i+1} > Q_i such that E(P_{i+1}, Q_i) = ε

• 3. Find Q_{i+1} > P_{i+1} such that E(P_{i+1}, Q_{i+1}) = ε

• 4. Set i := i + 1 and repeat steps 2 and 3 until P_{i+1} or Q_{i+1} reaches 1
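The four steps can be sketched numerically with bisection. Several things here are assumptions rather than facts from the slides: the function names, the use of the binary excess code length E(P, Q) = P log2(P/Q) + (1 − P) log2((1 − P)/(1 − Q)), and the value ε = 0.05, which was chosen because it gives first boundaries close to the 0.6309 and 0.7499 that appear in the table of this section.

```python
import math

def excess(p, q):
    """Excess bits per binary event when true probability p is coded with q."""
    total = 0.0
    for pi, qi in ((p, q), (1 - p, 1 - q)):
        if pi > 0:                       # a zero-probability outcome contributes nothing
            total += pi * math.log2(pi / qi)
    return total

def solve_increasing(f, lo, hi, target, iters=100):
    """Bisection for f(x) = target, assuming f is increasing on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def epsilon_partition(eps):
    """Return boundaries P_1, P_2, ... and probabilities Q_0, Q_1, ... for [1/2, 1)."""
    qs, ps = [0.5], []
    # A further boundary P < 1 exists only while coding P = 1 with Q costs more than eps.
    while excess(1.0, qs[-1]) > eps:
        q = qs[-1]
        p = solve_increasing(lambda x: excess(x, q), q, 1.0, eps)
        ps.append(p)
        qs.append(solve_increasing(lambda x: excess(p, x), p, 1 - 1e-12, eps))
    return ps, qs

ps, qs = epsilon_partition(0.05)
print([round(p, 4) for p in ps])
print([round(q, 4) for q in qs])
```

Only the upper half [1/2, 1) is computed; the lower half follows by symmetry (replace each value x by 1 − x).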

(63)

ε-partitions

Range of actual probabilities   Probability to use
[0.0000, 0.0130)                0.0003
[0.0130, 0.1421)                0.0676
[0.1421, 0.3691)                0.2501
[0.3691, 0.6309)                0.5000
[0.6309, 0.8579)                0.7499
[0.8579, 0.9870)                0.9324
[0.9870, 1.0000)                0.9997


(71)

ρ-partitions

• We might instead wish to limit the relative error, so that the code length never exceeds the optimal by more than a factor of 1 + ρ.

• The procedure is similar to that for ε-partitions.

• The ρ-partitions are not finite.

◦ As P approaches 1, the optimal average code length becomes very small, so to keep the relative loss small Q must be very close to P.

(72)

ρ-partitions

Range of actual probabilities

Probability to use

… …

[0.0033, 0.0154) 0.0069

[0.0154, 0.0573) 0.0291

[0.0573, 0.1670) 0.0982

[0.1670, 0.3722) 0.2555

[0.3722, 0.6278) 0.5000

[0.6278, 0.8330) 0.7445

[0.8330, 0.9427) 0.9018

[0.9427, 0.9846) 0.9709

[0.9846, 0.9967) 0.9931

(73)

COMPRESSED TREES

3-3


(74)

• To apply the fast binary arithmetic coding algorithms of the previous sections, input symbols must first be transformed into binary sequences.

• Compressed trees are proposed here for this task.

[Figure: a symbol such as A is mapped to a binary sequence such as 0110.]

(75)

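Tree construction

Symbol        a   b   c     d     e     f   g     h
Probability   0   0   1/8   1/4   1/8   0   1/8   3/8

[Figure: binary tree over the leaves a to h. Each internal node stores the left/right branch probabilities in percent: 38/62 at the root; 0/100 and 20/80 on the second level; -/-, 33/67, 100/0, and 25/75 above the leaf pairs (a,b), (c,d), (e,f), (g,h).]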

(76)

Linearize the tree

[Figure: the same tree (root 38/62; second level 0/100 and 20/80; third level -/-, 33/67, 100/0, 25/75), whose nodes are written out level by level.]

(77)

Compress the tree

Linearized:  38  0  20  -  33  100  25

Compressed:  38  0  20  33  100  25

The omitted node “- -” is the parent of a and b. Because the probabilities of a and b are both 0, the position of the omitted node can still be determined.
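The construction, linearization, and compression steps can be sketched as follows. The function names are hypothetical, and two details are assumptions made to match the numbers on the slides: each node stores the left-branch probability as a percentage rounded half up, and nodes are listed level by level from the root.

```python
from fractions import Fraction as F

def linearize(leaf_probs):
    """List the left-branch percentage of every internal node, level by level."""
    # Build subtree totals bottom-up: levels[0] are the leaves, levels[-1] the root.
    levels = [list(leaf_probs)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([prev[i] + prev[i + 1] for i in range(0, len(prev), 2)])
    nodes = []
    for depth in range(len(levels) - 1, 0, -1):      # root level first
        children = levels[depth - 1]
        for j, total in enumerate(levels[depth]):
            if total == 0:
                nodes.append(None)                   # never visited: the "-" node
            else:
                left = children[2 * j]
                nodes.append(int(100 * left / total + F(1, 2)))
    return nodes

def compress(nodes):
    """Drop the nodes that can never be reached."""
    return [x for x in nodes if x is not None]

# Leaf probabilities for a..h, as in the table on the tree construction slide.
leaves = [F(0), F(0), F(1, 8), F(1, 4), F(1, 8), F(0), F(1, 8), F(3, 8)]
print(linearize(leaves))            # [38, 0, 20, None, 33, 100, 25]
print(compress(linearize(leaves)))  # [38, 0, 20, 33, 100, 25]
```

The dropped node is recoverable because its parent stores a left-branch probability of 0, so a decoder walking the tree knows that branch is never taken.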

(78)