Motivation - A High-Throughput Context-Based Adaptive Binary Arithmetic Codec for H.264/AVC 1080HD

Chapter 1 Introduction

1.1 Motivation

H.264/AVC is the state-of-the-art video coding standard developed by ITU-T Video Coding Experts Group and ISO/IEC Moving Picture Experts Group (MPEG).

The new standard provides gains in compression efficiency of up to 50% over a wide range of bit rates and video resolutions compared with former standards such as H.263 and MPEG-4. It achieves this by employing many innovative technologies such as multiple reference frames, variable block size motion estimation, an in-loop de-blocking filter and context-based adaptive binary arithmetic coding (CABAC). Because of its outstanding performance in quality and compression gain, H.264/AVC is being adopted as the video standard in more and more consumer products such as digital video recorders and players, portable video devices, etc.

H.264/AVC contains two alternative entropy coding schemes: context-based adaptive variable length coding (CAVLC) and context-based adaptive binary arithmetic coding (CABAC). CAVLC is the simpler entropy coding method, intended for the simpler profiles. Compared with CABAC, it saves about 10% of the execution time at the cost of a bit-rate increase of about 7%. Because of its bit-rate saving, CABAC is the superior scheme for the massive capacity demands of the newest video applications.

However, CABAC brings an inevitable complexity overhead. The results of a software-based complexity analysis are presented in [3], which reports that switching from CAVLC to CABAC usually increases complexity by 25% to 30% for encoding and 12% for decoding, in terms of access frequency (total number of memory transfers per second). Therefore, both faster coding and better cost efficiency are required of CABAC.

We propose three throughput-enhancing methods that enable the CABAC encoder to meet the 1080HD requirement of level 4.1 stipulated in the H.264/AVC standard [1].

For the CABAC decoder, we continue to use the high-throughput CABAC decoder [2] which we presented in August 2006. It can meet the 1080HD requirement of level 4.0. Our CABAC codec is therefore sufficient to support real-time encoding and decoding of 1080HD video at 30 fps. In addition, we introduce some low-cost methods, such as finite state machine sharing and table reuse, to improve the cost efficiency of the CABAC codec.

1.2 Organization of this thesis

This thesis is organized as follows. In Chapter 2, we present the CABAC algorithm, which is a two-level coding procedure. For encoding, the first level is the binarization engine and the second level is the arithmetic encoder. For decoding, the order of the two levels is reversed. Chapter 3 focuses on throughput improvement. We introduce the three arithmetic encoding modes and present the three proposed high-throughput methods. Chapter 4 focuses on cost-efficient design. We present the proposed low-cost methods and the memory requirement. At the end of that chapter, we show the proposed CABAC codec system architecture. Simulation and chip implementation are shown in Chapter 5. We give a brief conclusion and discuss future work in the last chapter.

Chapter 2 Algorithm of CABAC for H.264/AVC

In this chapter, we introduce the algorithm of CABAC encoding and decoding respectively. Both CABAC encoding and CABAC decoding are composed of three parts: the binarization process, the arithmetic coding process and the context model.

For CABAC encoding, the binarization process reads the syntax elements (SEs) and computes the bins, which the arithmetic encoding process encodes into the corresponding bit-stream. For CABAC decoding, the arithmetic decoding process reads the input bit-stream generated by the H.264 encoder and computes the bins, which the binarization process decodes into the corresponding SEs. Both arithmetic encoding and arithmetic decoding have to look up the context model, which records the historical probability, to compute the bit-stream in encoding and the bin value in decoding.

The description of the CABAC algorithm in this chapter is based on [1] and [5]. The latter focuses only on the decoding aspect, so the encoding aspect is additionally introduced in this chapter.

This chapter is organized as follows. In Section 2.1, we present overviews of the CABAC encoding and decoding flows respectively, and show the two-level coding process. In Section 2.2, we introduce all kinds of the binarization process such as the

organization. In Section 2.3, the basic binary arithmetic coding algorithm is introduced briefly, in terms of encoding in Section 2.3.1 and decoding in Section 2.3.2. In Section 2.4, we present the advanced binary arithmetic coding for H.264/AVC and relate it to the arithmetic coding process. Section 2.5 shows the context models related to the different SEs. In the final section, we show how to obtain the neighboring SEs used to index the suitable context model.

2.1 Overview of CABAC encoding/decoding flow


Figure 1 H.264/AVC encoder/decoder system block diagram

Figure 1 shows the system block diagram of H.264/AVC encoder and decoder.

Both the entropy encoder and the entropy decoder support three entropy coding strategies: universal variable length coding (UVLC), context-based adaptive variable length coding (CAVLC) and context-based adaptive binary arithmetic coding (CABAC).

The H.264/AVC baseline profile adopts only the two variable length coding (VLC) strategies, UVLC and CAVLC, to code the macroblock (MB) information and the pixel coefficients. UVLC codes not only MB information such as mb_type, coded_block_pattern and intra_prediction_mode, but also MB data such as mvd. Because residual data coding occupies over 50% of the entire execution time, the residual coefficients are coded by the CAVLC architecture for higher efficiency.

The H.264/AVC main profile offers an advanced alternative to VLC: CABAC can be used in place of UVLC and CAVLC. Thus, an H.264 system needs only CABAC to code all MB information and pixel data if the entropy coding flag selects CABAC.

In this section, we introduce the block diagrams of the CABAC encoder and decoder, together with their respective execution flows.

Figure 2 CABAC encoder block diagram (figure labels: Binarization, Context, bin value for context model update, bit stream)

Figure 2 shows the block diagram of the CABAC encoder. Consider first the left side of this figure. All syntax elements (SEs) of H.264/AVC are converted into binary symbols, called bins, when entering the CABAC encoding process. Except for SEs of the fixed-length coding type, all SEs have to be converted by the binarization process, and the resulting bins are encoded by the binary arithmetic encoder. The binary arithmetic encoder has three encoding modes: normal, bypass and terminal. The terminal encoding process is rarely applied in the CABAC system; it is executed only once per macroblock (MB), when the current MB is complete, so we ignore its influence. The normal and bypass encoding processes are the two main binary arithmetic coding modes. The bypass encoding process does not need to refer to the context model, because the bin values “1” and “0” are equally probable (probability = 0.5).

The normal encoding process has to refer to the associated context model, which depends on the SE type and the bin index.
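The difference between the two main modes can be illustrated with the ideal code length of a bin, which is -log2(p) bits for a bin value of probability p. The sketch below is only an illustration under that textbook assumption; `ideal_bits` is a hypothetical helper and not part of the CABAC specification.

```python
import math


def ideal_bits(p: float) -> float:
    """Ideal code length in bits of a bin value that occurs with probability p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    return -math.log2(p)


# Bypass mode: both bin values have p = 0.5, so each bin costs one bit
# regardless of its value and no probability estimate has to be stored.
print(ideal_bits(0.5))  # 1.0

# Normal mode: a skewed estimate from the context model makes the likely
# value cheaper than one bit and the unlikely value more expensive.
print(ideal_bits(0.9), ideal_bits(0.1))
```

With equal probabilities there is nothing for a context model to learn, which is why only the normal mode consults one.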

Figure 3 shows the block diagram of the CABAC decoder. In the H.264/AVC decoder, the decoding flow is the reverse of the CABAC encoder's. First, the binary arithmetic decoder reads the bit-stream and converts it into a bin string. The binarization process then reads the bin string and decodes it into an SE by one of five decoding flows. The execution sequences of the CABAC encoder and decoder are thus exactly reversed. But the

and CABAC decoder.

2.2 Binarization process

In this section, we focus on the binarization process. In the CABAC encoder, it reads a syntax element and converts it into a bin string. In the CABAC decoder, it reads the bin string and looks up the corresponding syntax element. For H.264/AVC, both the CABAC encoder and the CABAC decoder adopt five binarization methods. This section is organized as follows. Section 2.2.1 shows the flow of the unary code, which is the basic coding method. Section 2.2.2 shows the truncated unary code, which is an extension of the unary code. Section 2.2.3 introduces the fixed-length code, which is the ordinary binary integer representation. Section 2.2.4 covers the Exp-Golomb code, which is used only for the residual data and the motion vector difference (mvd). Section 2.2.5 covers the special binarizations that are defined by tables. Specifically, we focus on the binary trees of the macroblock type (mb_type) and the sub-macroblock type (sub_mb_type).

2.2.1 Unary (U) binarization process

Input to this process is a request for a U binarization for a syntax element. Output of this process is the U binarization of the syntax element.

The bin string of a syntax element having value synElVal is a bit string of length synElVal + 1 indexed by binIdx. The bins for binIdx less than synElVal are equal to “1”.

The bin with binIdx equal to synElVal is equal to “0”.

If the syntax element is equal to “0”, the bin string is the single bit “0”. Otherwise, the bin string consists of numSE “1” bits followed by one “0” bit at the end. The value of numSE is equal to the syntax element. Therefore, the bin string length of the current syntax element is numSE + 1.

Table 1 bin string of the unary binarization [1]

Syntax element   bin string
0                0
1                1 0
2                1 1 0
3                1 1 1 0
4                1 1 1 1 0
5                1 1 1 1 1 0
…                …
binIdx           0 1 2 3 4 5

Based on the unary bin string format shown above, we summarize the encoding and decoding algorithms in Eq. 1 and Eq. 2. These two equations give the pseudo code of the unary encoding and decoding flows.

For unary encoding, binIdx, which is the index into the bin string in Table 1, is obtained from the value of the input syntax element (SEVal). The for loop in Eq. 1 generates the “1” bins of the bin string up to binIdx. After the for loop finishes, one “0” bin is generated as the end bit of the bin string, which completes the unary encoding step.

For unary decoding, binIdx is set to zero at the initial step. The while loop in Eq. 2 checks the current bin of the bin string selected by binIdx, and binIdx is incremented if the bin is “1”. When a “0” bin is read, the process arrives at the end of the decoding step. binIdx is then assigned to SEVal, which is defined as the value of the syntax element.

Start unary (U) encoding process
binIdx = SEVal;
for (i = 0; i < binIdx; i++) {
    bin[i] = 1;
}
bin[binIdx] = 0;                                    (Eq. 1)

Start unary (U) decoding process
binIdx = 0;
while (bin[binIdx] == 1) {
    binIdx = binIdx + 1;
}
SEVal = binIdx;                                     (Eq. 2)
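As a minimal runnable sketch of Eq. 1 and Eq. 2, the two flows could be written in Python as follows. The function names `unary_encode` and `unary_decode` and the representation of a bin string as a list of integers are our own illustrative assumptions, not part of the standard.

```python
from typing import List


def unary_encode(se_val: int) -> List[int]:
    """Eq. 1: emit se_val bins equal to 1, then one terminating 0 bin."""
    if se_val < 0:
        raise ValueError("unary binarization needs a non-negative value")
    return [1] * se_val + [0]


def unary_decode(bins: List[int]) -> int:
    """Eq. 2: count bins equal to 1 until the terminating 0 bin is read."""
    bin_idx = 0
    while bins[bin_idx] == 1:
        bin_idx += 1
    return bin_idx


print(unary_encode(3))              # [1, 1, 1, 0]
print(unary_decode([1, 1, 1, 0]))   # 3
```

The output for the value 3 matches the corresponding row of Table 1, and the bin string length is SEVal + 1.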

2.2.2 Truncated unary (TU) binarization process

Input to this process is a request for a TU binarization for a syntax element and cMax. Output of this process is the TU binarization of the syntax element.

For syntax element values less than cMax, the U binarization process mentioned in Section 2.2.1 is invoked. For the syntax element value equal to cMax the bin string is a bit string of length cMax with all bins being equal to “1”. TU binarization is always invoked with a cMax value equal to the largest possible value of the syntax element.

The truncated unary binarization process is based on the unary one and has the additional parameter cMax, which is defined as the maximum length of the current bin string. If the value of the syntax element (SEVal) is less than cMax, the truncated unary bin string is the same as the unary one. If SEVal is equal to cMax, the number of “1” bits in the bin string is equal to cMax and there is no “0” bit at the end of the string. For example, assume SEVal = 4. If the value of cMax is 5, the resulting bin string is “11110”. If the value of cMax is also 4, the resulting bin string is “1111”, where the end bit “0” is truncated.

Eq. 3 is the truncated unary encoding flow, which is modified from Eq. 1. Besides checking the value of the syntax element (SEVal), it also takes cMax into consideration. If SEVal is less than cMax, it works as the unary encoding process. If SEVal is not less than cMax, no “0” is generated at the end of the bin string when the encoding of the current syntax element completes.

Start truncated unary (TU) encoding process
binIdx = SEVal;
for (i = 0; i < binIdx; i++) {
    bin[i] = 1;
}
if (SEVal < cMax) bin[binIdx] = 0;                  (Eq. 3)

Eq. 4 is the truncated unary decoding flow, which is modified from Eq. 2. Besides checking the bin value, it additionally takes cMax into account. While binIdx is less than cMax, it works as the unary decoding process and stops when it reads the end bit “0”. If binIdx reaches cMax, decoding completes without reading an end bit “0”, because that bit was truncated by the encoder.

Start truncated unary (TU) decoding process
binIdx = 0;
while ((binIdx < cMax) && bin[binIdx] == 1) {
    binIdx = binIdx + 1;
}
SEVal = binIdx;                                     (Eq. 4)
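The truncated unary flows of Eq. 3 and Eq. 4 can be sketched in Python in the same way. This is an illustrative sketch only: the names `tu_encode` and `tu_decode` are hypothetical, and the decoder tests binIdx against cMax before reading a bin so that it never reads past a truncated string.

```python
from typing import List


def tu_encode(se_val: int, c_max: int) -> List[int]:
    """Eq. 3: unary code whose terminating 0 is dropped when se_val == c_max."""
    if not 0 <= se_val <= c_max:
        raise ValueError("se_val must lie in [0, c_max]")
    bins = [1] * se_val
    if se_val < c_max:
        bins.append(0)
    return bins


def tu_decode(bins: List[int], c_max: int) -> int:
    """Eq. 4: stop at the first 0 bin or after c_max bins equal to 1."""
    bin_idx = 0
    while bin_idx < c_max and bins[bin_idx] == 1:
        bin_idx += 1
    return bin_idx


print(tu_encode(4, 5))  # [1, 1, 1, 1, 0]
print(tu_encode(4, 4))  # [1, 1, 1, 1]
```

The two printed cases reproduce the example with SEVal = 4 and cMax equal to 5 and 4 respectively.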

2.2.3 Fixed-length (FL) binarization process

Input to this process is a request for a FL binarization for a syntax element and cMax. Output of this process is the FL binarization of the syntax element.

FL binarization is constructed by using a fixedLength-bit unsigned integer bin string of the syntax element value, where fixedLength = Ceil( Log2( cMax + 1 ) ). The indexing of bins for the FL binarization is such that binIdx = 0 relates to the least significant bit, with increasing values of binIdx towards the most significant bit.

The fixed-length code is the simplest format of the binarization coding process: it is the ordinary unsigned integer, and the coding rule is the ordinary binary number representation. For example, the decimal value 5 is written as 101 in binary, and this binary form is the required fixed-length code. Table 2 shows the fixed-length code definition.

Table 2 bin string of the fixed-length code

Syntax element   bin string
0                0 0 0 0 0 0

For the fixed-length encoding process, the value of the input syntax element defines binIdx. The value of (binIdx + 1) is the number of bits required for the bin string of the current syntax element.

For fixed-length decoding process, it has to refer to the value of cMax which

because the maximum value of binIdx is five. All syntax elements which are decoded by the fixed-length format are always represented with six binary bits.
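A minimal sketch of the fixed-length binarization is given below. It assumes the bin string length fixedLength = Ceil( Log2( cMax + 1 ) ) defined by the H.264/AVC standard [1], which for a non-negative integer cMax equals Python's `int.bit_length()`. The function names are hypothetical, and the bin list is indexed by binIdx, so index 0 holds the least significant bit.

```python
from typing import List


def fl_encode(se_val: int, c_max: int) -> List[int]:
    """Return the FL bins of se_val, with bins[0] the least significant bit."""
    if not 0 <= se_val <= c_max:
        raise ValueError("se_val must lie in [0, c_max]")
    fixed_length = c_max.bit_length()  # equals Ceil(Log2(c_max + 1))
    return [(se_val >> bin_idx) & 1 for bin_idx in range(fixed_length)]


def fl_decode(bins: List[int]) -> int:
    """Rebuild the unsigned integer from bins ordered least significant first."""
    return sum(bit << bin_idx for bin_idx, bit in enumerate(bins))


print(fl_encode(5, 7))    # [1, 0, 1]
print(fl_encode(5, 63))   # [1, 0, 1, 0, 0, 0]
print(fl_decode([1, 0, 1, 0, 0, 0]))  # 5
```

With cMax = 63 every syntax element is represented with six bins, as in the six-bit case described above.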

2.2.4 Unary/k-th order Exp_Golomb (UEGk) binarization process

Input to this process is a request for a UEGk binarization for a syntax element, signedValFlag and uCoff. Output of this process is the UEGk binarization of the syntax element.

A UEGk bin string is a concatenation of a prefix bit string and a suffix bit string.

The prefix of the binarization is specified by invoking the TU binarization process for the prefix part Min( uCoff, Abs( synElVal ) ) of a syntax element value synElVal, as specified in Section 2.2.2, with cMax = uCoff, where uCoff > 0. Namely, the prefix part is bounded by cMax. The suffix part of this code is not always present; it is omitted in two cases.

The UEGk bin string is derived as follows:

- If one of the following is true, the bin string of a syntax element having value synElVal consists only of a prefix bit string. Namely, the UEGk doesn’t enter the suffix coding step.

  1. signedValFlag is equal to 1 and the prefix bin string consists of only a single 0 bit. The value of the syntax element is then decided by the prefix bin string alone, using the truncated unary (TU) code.

  2. signedValFlag is equal to 0 and the prefix bin string is not equal to the bit string of length cMax with all bits equal to 1.

- Otherwise, the bin string of the UEGk suffix part of a syntax element value synElVal is specified by a process equivalent to the following pseudo-code:

if (Abs(synElVal) >= uCoff) {
    sufS = Abs(synElVal) - uCoff
    ...
}
if (signedValFlag && synElVal != 0)
    if (synElVal > 0)
        put(0)
    else
        put(1)                                      (Eq. 5)

The initial value of k is the order of the Exp-Golomb code, which gives the name UEGk. The CABAC binarization applies only two such codes, UEG0 and UEG3. UEG0 is used for the residual data and UEG3 is used for the motion vector difference.
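The complete UEGk flow, including the k-th order Exp-Golomb suffix, can be sketched in Python as follows. This is a sketch under stated assumptions: the suffix loop follows our reading of the UEGk binarization in the H.264/AVC standard [1], the function name `uegk_encode` is hypothetical, and the uCoff values 14 (with UEG0) and 9 (with UEG3) used in the example calls are the ones we understand the standard to assign to residual levels and mvd.

```python
from typing import List


def uegk_encode(syn_el_val: int, k: int, u_coff: int,
                signed_val_flag: bool) -> List[int]:
    """Concatenate the TU prefix, the Exp-Golomb suffix and the sign bin."""
    abs_val = abs(syn_el_val)

    # Prefix: truncated unary code of min(uCoff, |synElVal|) with cMax = uCoff.
    prefix_val = min(u_coff, abs_val)
    bins = [1] * prefix_val
    if prefix_val < u_coff:
        bins.append(0)

    # Suffix: k-th order Exp-Golomb code of the remainder, present only
    # when the prefix is saturated.
    if abs_val >= u_coff:
        suf_s = abs_val - u_coff
        while suf_s >= (1 << k):
            bins.append(1)
            suf_s -= 1 << k
            k += 1
        bins.append(0)
        for shift in range(k - 1, -1, -1):
            bins.append((suf_s >> shift) & 1)

    # Sign bin: 0 for positive and 1 for negative, skipped for zero.
    if signed_val_flag and syn_el_val != 0:
        bins.append(0 if syn_el_val > 0 else 1)
    return bins


print(uegk_encode(3, 3, 9, True))     # [1, 1, 1, 0, 0]
print(uegk_encode(-3, 3, 9, True))    # [1, 1, 1, 0, 1]
print(uegk_encode(16, 0, 14, False))  # fourteen 1 bins, then 1, 0, 1
```

A value of 0 with signedValFlag equal to 1 yields the single bin 0, which is the first prefix-only case listed above.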

2.2.5 Special binarization process

Input to this process is a request for a binarization for syntax element mb_type or sub_mb_type. Output of this process is the binarization of these two syntax elements.

All the general formats of the binarization coding process have been introduced above, but there is still a special coding flow that has not been described yet. To achieve higher video quality, macroblocks and sub-macroblocks are divided into many types, according to the slice type (I, P, B and SI) and, within each of these four basic types, according to variable block sizes. These two syntax elements are difficult to define by means of the aforementioned coding flows, so H.264/AVC adopts a table-based method to define the macroblock and sub-macroblock types. The binarization engine reads the bin string and checks whether it matches an entry in these tables. If the bin string is found in the tables, the current macroblock type can be looked up.

The binarization scheme for coding of the macroblock type in I slices is specified in Table 3 [1]. For example, if the bin string is equal to “1001011” in an I slice, the mapped macroblock type is equal to “8” by looking up Table 3. We observe that macroblock types with a higher probability of appearance are assigned shorter bin strings.

For macroblock types in SI slices, the binarization consists of bin strings specified as a concatenation of a prefix and a suffix bit string as follows.

The prefix bit string consists of a single bit, which is specified by b0 = ( ( mb_type == SI ) ? 0 : 1 ). For the syntax element value for which b0 is equal to 1, the binarization is given by concatenating the prefix b0 and the suffix bit string as specified in Table 3 for macroblock type in I slices indexed by subtracting 1 from the value of

Table 3 Binarization for macroblock types in I slice

Value (name) of mb_type   Bin string
0 (I_4x4)                 0

In other words, the macroblock type in an SI slice is an extended format of the macroblock type in an I slice. The bin string of the SI slice macroblock type is composed of a prefix part and a suffix part. If the prefix bit is equal to 0, there is no suffix part and the syntax element is equal to 0. If the prefix bit is equal to 1, the suffix part is defined in Table 3, and 1 has to be added to the mapped macroblock type value in the SI slice.
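The table-based lookup and the SI slice prefix rule can be sketched as follows. This is an illustration only: the dictionary holds just the two I slice rows quoted in this section (mb_type 0 and mb_type 8), not the complete Table 3, and the names `I_SLICE_MB_TYPE`, `decode_i_mb_type` and `decode_si_mb_type` are hypothetical.

```python
from typing import List

# Partial stand-in for Table 3: only the two rows quoted in the text.
I_SLICE_MB_TYPE = {
    "0": 0,
    "1001011": 8,
}


def decode_i_mb_type(bins: List[int]) -> int:
    """Look up a complete I slice bin string in the (partial) table."""
    key = "".join(str(bit) for bit in bins)
    if key not in I_SLICE_MB_TYPE:
        raise KeyError("bin string " + key + " is not in the partial table")
    return I_SLICE_MB_TYPE[key]


def decode_si_mb_type(bins: List[int]) -> int:
    """SI slice: prefix bit 0 gives mb_type 0; otherwise 1 + I slice lookup."""
    if bins[0] == 0:
        return 0
    return 1 + decode_i_mb_type(bins[1:])


print(decode_i_mb_type([1, 0, 0, 1, 0, 1, 1]))      # 8
print(decode_si_mb_type([0]))                       # 0
print(decode_si_mb_type([1, 1, 0, 0, 1, 0, 1, 1]))  # 9
```

A hardware binarization engine would match the bins one at a time while walking the binary tree; the dictionary lookup above only shows the mapping itself.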

Besides the macroblock type in SI slices, there are two more cases in which the bin string is generated through a suffix process. The binarization schemes for P macroblock types in P and SP slices and for B macroblock types in B slices are specified in Table 4 [1].

Table 4 gives the prefix definitions of mb_type. The suffix parts are decided by the formats in Table 3, because these entries are the intra macroblock types in P, SP and B slices. However, an offset value has to be added to the macroblock type value of these last entries in P and B slices.

The bin string for I macroblock types in P and SP slices corresponding to mb_type values 5 to 30 consists of a concatenation of a prefix, which consists of a single bit with value equal to 1 as specified in Table 4, and a suffix as specified in Table 3, indexed by subtracting 5 from the value of mb_type. Namely in the P and SP slices, the value of
