
The problem that we would like to solve is how to construct a better enhancement-layer predictor that improves coding efficiency while minimizing the degradation from drifting errors.

When our EMFGS, MPEG-4 FGS [15], and other advanced FGS schemes compress video into a base layer and an enhancement layer, three assumptions hold:

1. The base layer is guaranteed to be received without error.

Chapter 3. Enhanced Mode-Adaptive Fine Granularity Scalability

Figure 3.1: Comparison of rate-distortion performance (quality in PSNR vs. bit rate) for a non-scalable codec, MPEG-4 FGS, and advanced FGS algorithms. The curves illustrate the coding-efficiency loss of MPEG-4 FGS and the drifting loss of advanced FGS relative to the non-scalable codec and our goal.

2. The base layer has a low bit rate and low quality. Thus, the residue at the enhancement layer is large.

3. The enhancement layer is not guaranteed to be received in the manner the encoder/server expects. Thus, there could be predictor mismatch if we create a prediction loop at the enhancement layer.

Based on these assumptions, this section describes (a) the formulation for minimizing prediction residue at the enhancement layer, (b) the problem when the decoder receives less enhancement layer than expected, (c) the formulation of predictor mismatch at the enhancement layer, and (d) the target of constraining mismatch errors.

3.2.1 Predictor for Enhancement Layer

While MPEG-4 FGS [15] predicts the enhancement layer from the base layer, we can exploit the available reconstructed frames at time t to form a better enhancement-layer predictor. Currently, the enhancement-layer predictor $P_E(t)$ is the following function:

$P_E(t) = f(I_B(t))$. (3.1)

To construct a better enhancement-layer predictor $P_E(t)$, we can optimally exploit all the available reconstructed frames at time t, as illustrated below:

$P_E(t) = f(I_B(t), I_B(t-1), \ldots, I_E(t-1), I_E(t-2), \ldots)$. (3.2)

Because Eq. (3.2) offers more choices for constructing the predictor than Eq. (3.1) does, it is easier to minimize the prediction residue at the enhancement layer, as in Eq. (3.3):

$\min \|I_o(t) - P_E(t)\|$. (3.3)

Because the residue contains less energy, the reconstructed enhancement-layer frame $I_E(t)$ will have better quality.

While the optimal predictor requires multiple frame buffers and motion-compensation loops, for lower complexity our predictor is restricted to being constructed from the current base-layer frame and the previous enhancement-layer frame; that is,

$P_E(t) = f(I_B(t), I_E(t-1))$. (3.4)

In this case, we only need two frame buffers and motion-prediction loops. To further improve prediction efficiency, one can introduce more frame buffers and adaptively select reference frames, as in long-term prediction [31]. For simplicity of presentation, we will use Eq. (3.4) instead of Eq. (3.2) for the rest of the theoretical framework; one can easily replace Eq. (3.4) with Eq. (3.2) for a more detailed theoretical derivation.
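As a toy illustration of Eqs. (3.1)-(3.4), the sketch below compares the residue left by a base-only predictor against a predictor built from the previous enhancement-layer frame. The 1-D "frames", the noise levels, and the blending weight `w` are all hypothetical stand-ins, and motion compensation is assumed to be perfect (identity):

```python
import numpy as np

# Toy 1-D "frames"; a real codec would use 2-D blocks plus motion compensation.
rng = np.random.default_rng(0)
I_o = rng.normal(size=64)                          # original frame at time t
I_B = I_o + rng.normal(scale=0.5, size=64)         # coarse base-layer reconstruction
I_E_prev = I_o + rng.normal(scale=0.1, size=64)    # high-quality previous enhancement frame

def predictor_base_only(I_B):
    """Eq. (3.1): MPEG-4 FGS predicts the enhancement layer from the base layer."""
    return I_B

def predictor_two_buffer(I_B, I_E_prev, w=1.0):
    """Eq. (3.4) sketch: blend the base frame with the previous enhancement
    frame; w is a hypothetical weight (w = 1 -> pure enhancement prediction)."""
    return (1.0 - w) * I_B + w * I_E_prev

r_base = np.linalg.norm(I_o - predictor_base_only(I_B))            # Eq. (3.3) for Eq. (3.1)
r_two = np.linalg.norm(I_o - predictor_two_buffer(I_B, I_E_prev))  # Eq. (3.3) for Eq. (3.4)
print(r_base > r_two)  # the two-buffer predictor leaves a smaller residue
```

Here the previous enhancement-layer frame is a much closer match to the original than the coarse base layer, which is exactly assumption 2 above.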

3.2.2 Predictor Mismatch

Although we can construct a better predictor from the reconstructed enhancement-layer frames, using them as the predictor can create a mismatch problem, because the decoder may not receive the enhancement layer in the expected manner.

When the decoder receives less enhancement layer, a distorted enhancement-layer predictor is reconstructed at the decoder side. Because the enhancement-layer predictor is built from the reconstructed base-layer frame as well as the reconstructed previous enhancement-layer frame, the enhancement-layer predictor at the decoder side becomes $\tilde{P}_E(t)$ instead of $P_E(t)$, as shown below:

$\tilde{P}_E(t) = f(I_B(t), \tilde{I}_E(t-1))$. (3.5)

The difference between $P_E(t)$ in Eq. (3.4) and $\tilde{P}_E(t)$ in Eq. (3.5) is the predictor mismatch between the encoder and the decoder.

The predictor mismatch will create errors in the decoded pictures. This is because the output


picture equals the sum of the predictor and the residue, as follows:

$\tilde{I}_o(t) = \tilde{P}_E(t) + \epsilon(t)$. (3.6)

In this case, even if we receive the correct residue at time t, we cannot reconstruct perfect pictures without errors:

$\mathrm{Error} = I_o(t) - \tilde{I}_o(t) = (\epsilon(t) + P_E(t)) - (\epsilon(t) + \tilde{P}_E(t)) = P_E(t) - \tilde{P}_E(t)$. (3.7)
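The identity in Eq. (3.7) is easy to verify numerically: if the decoder applies the correct residue to a mismatched predictor, the reconstruction error is exactly the predictor mismatch. A minimal sketch with hypothetical random signals:

```python
import numpy as np

rng = np.random.default_rng(1)
I_o = rng.normal(size=16)                        # original picture I_o(t)
P_E = rng.normal(size=16)                        # encoder-side predictor P_E(t)
P_E_dec = P_E + rng.normal(scale=0.2, size=16)   # distorted decoder-side predictor

eps = I_o - P_E                                  # residue, Eq. (3.8)
I_o_dec = P_E_dec + eps                          # decoder output with residue intact, Eq. (3.6)

error = I_o - I_o_dec                            # Eq. (3.7), left-hand side
mismatch = P_E - P_E_dec                         # Eq. (3.7), right-hand side
print(np.allclose(error, mismatch))              # -> True
```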

3.2.3 Drifting and Accumulation Errors

To give more details, we further illustrate the mismatch problem using the end-to-end transmission model shown in Figure 3.2. As illustrated, the enhancement-layer residue at the encoder is

$\epsilon(t) = I_o(t) - P_E(t), \quad \forall t \ge 0$, (3.8)

and its reconstructed frame for the construction of the future predictor is

$I_E(t) = \mathrm{Trun}_n\langle \epsilon(t) \rangle + P_E(t) = \hat{\epsilon}(t) + P_E(t)$. (3.9)

Through an erasure channel, the enhancement layer received by the decoder is modeled as the subtraction of an error term $d(t)$ from the original enhancement-layer residue $\epsilon(t)$:

$\tilde{\epsilon}(t) = \epsilon(t) - d(t)$. (3.10)

Therefore, at the decoder, the reconstructed enhancement-layer frame for the construction of the future predictor is

$\tilde{I}_E(t) \triangleq \mathrm{Trun}_n\langle \tilde{\epsilon}(t) \rangle + \tilde{P}_E(t) = \hat{\tilde{\epsilon}}(t) + \tilde{P}_E(t), \quad \forall t \ge 0$. (3.11)
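The encoder and decoder recursions of Eqs. (3.8)-(3.11) can be simulated with scalar "frames". In this sketch `trun` is an assumed coarse-quantization stand-in for the bit-plane truncation $\mathrm{Trun}_n$, motion compensation is omitted (identity), and a single channel loss d(3) is injected as in Eq. (3.10):

```python
import numpy as np

def trun(x, step=0.25):
    """Hypothetical stand-in for bit-plane truncation Trun_n: coarse quantization."""
    return np.round(x / step) * step

T = 8
I_o = np.linspace(1.0, 2.0, T)   # original scalar "frames"
P_enc = P_dec = I_o[0]           # first intra frame: predictor comes from the base layer
d = np.zeros(T)
d[3] = 0.4                       # one transmission error at t = 3

for t in range(1, T):
    eps = I_o[t] - P_enc                 # residue formed at the encoder, Eq. (3.8)
    P_enc = trun(eps) + P_enc            # I_E(t) feeds the next predictor, Eq. (3.9)
    P_dec = trun(eps - d[t]) + P_dec     # decoder sees eps - d(t), Eqs. (3.10)-(3.11)

print(P_enc - P_dec)  # -> 0.5: the loss at t = 3 leaves a persistent mismatch
```

Even though every residue after t = 3 arrives intact, the encoder and decoder predictors never re-converge, which is the drifting behavior analyzed next.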

To illustrate the worst case of the mismatch effect, we define the enhancement-layer predictor

Figure 3.2: An end-to-end transmission model for the analysis of drifting error in the enhanced mode-adaptive FGS algorithm. (The enhancement layer passes through an unreliable channel that injects d(t); the base layer is transmitted reliably.)

as the previously reconstructed enhancement-layer frame:

$P_E(t) = f(I_B(t), I_E(t-1)) \triangleq 0 \times I_B(t) + 1 \times \mathrm{MC}_t\langle I_E(t-1) \rangle$. (3.12)

Recall that the enhancement layer is not guaranteed to be received in the expected manner. Thus, constructing the predictor purely from the enhancement layer produces the worst case for the mismatch problem. From the definition in Eq. (3.12), the equivalent predictor at the decoder can be written as follows:

$\tilde{P}_E(t) \triangleq \mathrm{MC}_t\langle \tilde{I}_E(t-1) \rangle$. (3.13)

To represent the predictor at the decoder as a function of the received enhancement layer, we substitute Eq. (3.11) into Eq. (3.13). After the recursive substitution, and suppressing the nested motion-compensation operators for notational simplicity, we have the following expression:

$\tilde{P}_E(t) = \sum_{i=0}^{t-1} \hat{\tilde{\epsilon}}(i) + \tilde{P}_E(0)$. (3.14)

By further substituting Eq. (3.10) into Eq. (3.14), we can group all the transmission errors together as follows:

$\tilde{P}_E(t) = \left( \sum_{i=0}^{t-1} \hat{\epsilon}(i) + P_E(0) \right) - \sum_{i=0}^{t-1} \hat{d}(i)$, (3.15)

where $\hat{d}(i) \triangleq \hat{\epsilon}(i) - \hat{\tilde{\epsilon}}(i)$ and $\tilde{P}_E(0) = P_E(0) = I_B(0)$ because the enhancement-layer predictor for the first intra-frame comes from the base layer. The first term in Eq. (3.15) is the enhancement-layer predictor $P_E(t)$ at the encoder, and the grouped error terms become the equivalent predictor-mismatch error:

$\mathrm{MismatchError} = P_E(t) - \tilde{P}_E(t) = \sum_{i=0}^{t-1} \hat{d}(i)$, (3.16)

where we suppress the expressions of the motion compensations for notational simplicity.

From Eq. (3.16), transmission errors further create two kinds of errors:

1. Drifting error: For a single transmission error at time $j$, i.e., $d(i) = d(j)\,\delta[i-j]$, the mismatch error in Eq. (3.16) becomes $\hat{d}(j)\,\mu[t-1-j]$, where $\mu[\cdot]$ denotes the unit step function. In other words, the transmission error at time $j$ drifts into the enhancement-layer predictors after time $j$, i.e., $\{P_E(t) \mid t > j\}$.

2. Accumulation error: The equivalent predictor-mismatch error at frame $j$ is the accumulation of the transmission errors before frame $j$, i.e., $\sum_{i=0}^{j-1} \hat{d}(i)$. It is a consequence of the drifting error and temporal prediction.
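Suppressing truncation and motion compensation, Eq. (3.16) states that the predictor mismatch at time t is the running sum of all earlier transmission errors. The sketch below (the loss times and magnitudes are arbitrary) exhibits both effects: a single error d(3) drifts into every later frame, and a second error d(6) accumulates on top of it:

```python
import numpy as np

T = 10
d = np.zeros(T)
d[3], d[6] = 0.5, 0.25            # transmission errors at t = 3 and t = 6

# mismatch[t] models P_E(t) - P~_E(t); temporal prediction carries old errors forward.
mismatch = np.zeros(T)
for t in range(1, T):
    mismatch[t] = mismatch[t - 1] + d[t - 1]   # Eq. (3.16): sum of d(i) for i < t

print(mismatch)
# drifting: d(3) appears in mismatch[t] for every t > 3
# accumulation: mismatch[7:] = d(3) + d(6) = 0.75
```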

3.2.4 Constraining Predictor Mismatch

Although a better predictor can bring coding gain at high bit rates, it can introduce drifting errors at low bit rates. While optimizing the performance at different bit rates, a dilemma may therefore occur. As a result, our goal is to find the predictor function $f(\cdot)$ in Eq. (3.2) that minimizes the prediction residue at the enhancement layer, as described in Eq. (3.17),

$\min \|I_o(t) - P_E(t)\|$, (3.17)

and constrains the predictor mismatch as described by the following:

$\|P_E(t) - \tilde{P}_E(t)\| \le \mathrm{Threshold}$. (3.18)

To find the best trade-off between the prediction residue and the mismatch error, we employ a Lagrange multiplier, as in traditional rate-distortion optimization problems. We observe that the prediction residue, $\|I_o(t) - P_E(t)\|$, is inversely related to the mismatch error, $\|P_E(t) - \tilde{P}_E(t)\|$, as depicted in Figure 3.3. In other words, as more of the enhancement layer is used for prediction, the better predictor reduces the prediction residue; however, if the enhancement layer is not received, a more serious mismatch error may occur. Therefore, according to the Lagrange principle, the optimal predictor function is the one that minimizes the Lagrange cost:

$\min\big( \lambda \times \|I_o(t) - P_E(t)\| + \|P_E(t) - \tilde{P}_E(t)\| \big)$. (3.19)

In practice, we find that the convex property in Figure 3.3 is not guaranteed; that is, the Lagrange solution may be sub-optimal. Even with such imperfection, however, our heuristic solution still follows the Lagrange principle: the determination of the predictor function should consider both the coding gain and the drifting loss.
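As a hypothetical illustration of the Lagrange cost in Eq. (3.19), assume the predictor blends base- and enhancement-layer information with a weight `alpha`, that the prediction residue shrinks linearly in `alpha`, and that the potential mismatch grows quadratically; both models and all constants are invented for illustration. The minimizer then lands strictly between pure base prediction and pure enhancement prediction:

```python
import numpy as np

def lagrange_cost(alpha, lam=0.5):
    """Eq. (3.19) sketch with invented residue/mismatch models.
    alpha = 0 -> base-only predictor; alpha = 1 -> pure enhancement predictor."""
    residue = 1.0 - 0.8 * alpha      # more enhancement info -> smaller residue
    mismatch = alpha ** 2            # ...but larger potential drifting error
    return lam * residue + mismatch

alphas = np.linspace(0.0, 1.0, 101)
costs = [lagrange_cost(a) for a in alphas]
best = float(alphas[int(np.argmin(costs))])
print(best)  # an interior optimum: the trade-off picks neither extreme
```

Changing `lam` shifts the operating point: a larger `lam` penalizes the residue more and pushes the predictor toward the enhancement layer, mirroring the coding-gain versus drifting-loss trade-off above.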