Simplified RFGS Prediction Scheme - Stack Robust Fine Granularity Scalability (SRFGS)

CHAPTER 4 STACK ROBUST FINE GRANULARITY SCALABILITY (SRFGS)

4.2 Simplified RFGS Prediction Scheme

Figure 4.1 shows the original RFGS encoder architecture as proposed in [8] and [24].

The enhancement layer bitstream is generated with the following process. The motion compensation module of the enhancement layer uses the base layer motion vectors and the high quality reference image HQRI stored in the enhancement layer frame buffer to generate the high quality prediction image ELPI. The enhancement layer motion compensated frame difference $MCFD_{EL}$ is computed by subtracting ELPI from the original signal $F$:

$$MCFD_{EL,i} = F_i - ELPI_i = F_i - (HQRI_{i-1})_{mc} \tag{4.1}$$

where the subscripts $i$ and $i-1$ denote the current frame time and the previous frame time, respectively, and the subscript $mc$ indicates that $(y)_{mc}$ is the motion compensated version of $y$. The signal $\hat{D}$ is computed by subtracting the reconstructed base layer DCT coefficients $\hat{B}$ from $MCFD_{EL}$:

$$\hat{D}_i = MCFD_{EL,i} - \hat{B}_i \tag{4.2}$$

The signal $\hat{D}$ is entropy encoded to generate the enhancement layer bitstream.

Note that, for simplicity and because the DCT is linear, in this chapter we use the same notation for a symbol in the spatial domain and in the transform domain.

The high quality reference image HQRI at the enhancement layer is generated as follows. The first $\beta$ bit planes of the difference signal $\hat{D}$ are added to $\hat{B}$. The resultant signal is converted back to the spatial domain using the IDCT and added to ELPI to obtain the enhancement layer reconstructed image ELRI:

$$ELRI_i = (HQRI_{i-1})_{mc} + \hat{B}_i + \hat{D}_i \tag{4.3}$$
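The selection of the first $\beta$ bit planes can be illustrated with a short Python sketch. It assumes integer coefficients stored in sign-magnitude form, with the number of bit planes taken from the largest magnitude in the block; the function name `keep_top_bit_planes` and this block-wise convention are illustrative assumptions, not something specified in the text.

```python
def keep_top_bit_planes(coeffs, beta):
    """Keep the `beta` most significant bit planes of integer coefficients.

    Assumption: magnitudes are bit-plane coded separately from the signs,
    and the plane count is set by the largest magnitude in `coeffs`.
    """
    if beta < 0:
        raise ValueError("beta must be non-negative")
    # Number of bit planes needed for the largest magnitude.
    n_planes = max((abs(c) for c in coeffs), default=0).bit_length()
    # Number of least significant planes that are discarded.
    drop = max(n_planes - beta, 0)
    out = []
    for c in coeffs:
        mag = (abs(c) >> drop) << drop  # zero the discarded planes
        out.append(-mag if c < 0 else mag)
    return out


print(keep_top_bit_planes([13, -6, 3, 0], 2))  # [12, -4, 0, 0]
```

With four bit planes in total and $\beta = 2$, the two least significant planes are dropped, so 13 becomes 12 and -6 becomes -4. With $\beta$ equal to the plane count, the coefficients pass through unchanged.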

Note that, for simplicity, we assume that all of the bit planes in $\hat{D}_i$ are used in the enhancement layer prediction loop. The base layer reconstructed signal $B$ is subtracted from ELRI to obtain the signal $D$, which contains only enhancement layer information. The signal $D$ is then attenuated by a leak factor $\alpha$ and added back to $B$ before being stored in the enhancement layer reference frame buffer. Thus, we have the following relationship:

$$HQRI_i = B_i + \alpha D_i \tag{4.4}$$

Figure 4.1 The original RFGS encoder

The rationale for attenuating the signal $D$ is that we want the errors from all past frames to be attenuated recursively. If the attenuation is applied only to the first few bit planes of $\hat{D}$, only the errors that occur in the current frame are attenuated. Errors that occurred earlier still accumulate in subsequent frames through the motion prediction loop without attenuation.

Although the RFGS prediction architecture efficiently reduces the drift error, it is quite complex. The base layer needs to store the reconstructed DCT coefficients $\hat{B}$. The enhancement layer first subtracts $\hat{B}$ from the prediction error $MCFD_{EL}$ to reduce the entropy of the signal $\hat{D}$, and then uses $\hat{B}$ to form ELRI. The enhancement layer further accesses the base layer reconstructed image $B$ to generate the signal $D$, which contains only enhancement layer information, and to generate the HQRI stored in the enhancement layer frame buffer. This prediction scheme increases the requirements for both memory and memory access bandwidth. Furthermore, this complex prediction architecture makes the prediction concept of RFGS difficult to grasp and to improve.

Thus, we will simplify the prediction scheme while maintaining the same coding efficiency. From equations (4.3) and (4.4), we get the following relationship:

$$ELRI_i = (B_{i-1} + \alpha D_{i-1})_{mc} + \hat{B}_i + \hat{D}_i \tag{4.5}$$

By grouping the base layer information and the enhancement layer information, equation (4.5) becomes

$$ELRI_i = (B_{i-1})_{mc} + \hat{B}_i + (\alpha D_{i-1})_{mc} + \hat{D}_i = B_i + D_i \tag{4.6}$$

where

$$B_i = (B_{i-1})_{mc} + \hat{B}_i \tag{4.7}$$

and

$$D_i = (\alpha D_{i-1})_{mc} + \hat{D}_i. \tag{4.8}$$
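The recursion in (4.8) can be unrolled to show how the leak factor acts on past frames. The expansion below is a sketch under stated assumptions: motion compensation is treated as a linear operator, so that it commutes with addition and with scaling by $\alpha$; all bit planes of $\hat{D}$ are used in the loop; and the enhancement layer buffer is empty before the first frame, $D_0 = 0$. The shorthand $(\cdot)_{mc^k}$ for $k$ successive motion compensations is introduced here only for illustration.

$$
\begin{aligned}
D_i &= \hat{D}_i + \alpha (D_{i-1})_{mc} \\
    &= \hat{D}_i + \alpha (\hat{D}_{i-1})_{mc} + \alpha^2 (D_{i-2})_{mc^2} \\
    &= \sum_{k=0}^{i-1} \alpha^k (\hat{D}_{i-k})_{mc^k}
\end{aligned}
$$

Under these assumptions, an error introduced in $\hat{D}_{i-k}$ reaches frame $i$ scaled by $\alpha^k$, so its contribution decays geometrically whenever the leak factor has magnitude below 1.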

Figure 4.2 The simplified RFGS encoder

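From (4.8) we know that the residue $D$ can be derived simply by accumulating the signal $\hat{D}$ over all the previous frames. From equations (4.1), (4.2), and (4.4), we can write

$$\hat{D}_i = F_i - (B_{i-1} + \alpha D_{i-1})_{mc} - \hat{B}_i \tag{4.9}$$

Again, by grouping the base layer information and the enhancement layer information, equation (4.9) becomes

$$\hat{D}_i = F_i - \left[(B_{i-1})_{mc} + \hat{B}_i\right] - (\alpha D_{i-1})_{mc} = F_i - B_i - (\alpha D_{i-1})_{mc} \tag{4.10}$$

The difference between the original frame $F$ and the base layer reconstructed image $B$ is actually the quantization error QE at the base layer,

$$QE_i = F_i - B_i \tag{4.11}$$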

Thus, equation (4.10) becomes

$$\hat{D}_i = QE_i - (\alpha D_{i-1})_{mc} \tag{4.12}$$

From (4.8) and (4.12), we see that the only signal the enhancement layer acquires from the base layer is the base layer quantization error QE; all the other signals can be generated by the enhancement layer itself. With this analysis, we can derive a simplified RFGS prediction scheme, shown in Figure 4.2, that provides functionality identical to the original RFGS prediction scheme shown in Figure 4.1. In the simplified architecture, the base layer quantization error QE is predicted from the reference frame stored in the enhancement layer frame buffer EFB. This step performs the operation of equation (4.12) in Figure 4.1. The prediction error $\hat{D}_i$ is transformed and bit plane coded as the FGS bitstream. The first $\beta$ bit planes are inversely transformed and added back to the prediction to generate the signal $D$. This step performs the operation of equation (4.8) in Figure 4.1. The resultant signal $D$ is multiplied by $\alpha$ for leaky prediction before it is stored in the frame buffer.

The simplified RFGS architecture significantly reduces the complexity of RFGS. The base layer encoder need not store the reconstructed base layer DCT coefficients $\hat{B}$. The enhancement layer encoder need not access, or compute with, the base layer signals $\hat{B}$ and $B$. The enhancement layer encoder architecture is just like the base layer encoder, with the original signal $F$ replaced by the base layer quantization error QE.
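The equivalence of the two prediction loops can be checked numerically with a small Python sketch. Several simplifications are assumptions made only for illustration: frames are short 1-D signals, motion compensation is a cyclic shift (a linear operator), uniform quantizers stand in for the base layer DCT/Q stage and for the bit plane truncation, and exact rational arithmetic is used so that the comparison is not affected by rounding. All function names are hypothetical.

```python
from fractions import Fraction
from math import floor


def mc(x, shift=1):
    """Toy linear motion compensation: cyclic shift by `shift` samples."""
    s = shift % len(x)
    return x[-s:] + x[:-s] if s else list(x)


def quantize(x, step):
    """Uniform quantizer, a stand-in for DCT/Q or bit plane truncation."""
    return [floor(v / step) * step for v in x]


def add(a, b):
    return [p + q for p, q in zip(a, b)]


def sub(a, b):
    return [p - q for p, q in zip(a, b)]


def scale(c, a):
    return [c * p for p in a]


def original_rfgs(frames, alpha, base_step, enh_step):
    """Enhancement layer loop following equations (4.1) to (4.4)."""
    n = len(frames[0])
    B_prev = [Fraction(0)] * n
    HQRI_prev = [Fraction(0)] * n
    out = []
    for F in frames:
        B_hat = quantize(sub(F, mc(B_prev)), base_step)
        B = add(mc(B_prev), B_hat)                      # (4.7)
        ELPI = mc(HQRI_prev)
        D_hat = sub(sub(F, ELPI), B_hat)                # (4.1), (4.2)
        D_hat_kept = quantize(D_hat, enh_step)          # planes kept in loop
        ELRI = add(add(ELPI, B_hat), D_hat_kept)        # (4.3)
        D = sub(ELRI, B)
        HQRI_prev = add(B, scale(alpha, D))             # (4.4)
        B_prev = B
        out.append((D_hat, D))
    return out


def simplified_rfgs(frames, alpha, base_step, enh_step):
    """Enhancement layer loop following equations (4.8), (4.11), (4.12)."""
    n = len(frames[0])
    B_prev = [Fraction(0)] * n
    EFB = [Fraction(0)] * n                             # holds alpha * D
    out = []
    for F in frames:
        B_hat = quantize(sub(F, mc(B_prev)), base_step)
        B = add(mc(B_prev), B_hat)                      # base layer only
        QE = sub(F, B)                                  # (4.11)
        pred = mc(EFB)
        D_hat = sub(QE, pred)                           # (4.12)
        D = add(pred, quantize(D_hat, enh_step))        # (4.8)
        EFB = scale(alpha, D)
        B_prev = B
        out.append((D_hat, D))
    return out


frames = [[Fraction(v) for v in row]
          for row in ([10, 52, 33, 7], [12, 50, 30, 9], [15, 47, 28, 11])]
alpha = Fraction(9, 10)
a = original_rfgs(frames, alpha, base_step=8, enh_step=2)
b = simplified_rfgs(frames, alpha, base_step=8, enh_step=2)
print(a == b)
```

In this sketch the simplified loop reads only QE from the base layer, whereas the original loop reads both $\hat{B}$ and $B$. The match relies on the toy motion compensation being linear, which is the property used when grouping terms in (4.6) and (4.10).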

4.3 Enhanced Prediction Architecture Using
