Multi-layer Encoder Control - A Fast Decision Algorithm for Multi-layer Encoder Control in Scalable Video Coding

Explicit bit allocation (EBA), which can trade off motion information against residual information, has been well studied in image and video coding [5]. The defining feature of EBA is that bits can be shifted among different regions according to their relative importance so that the overall visual quality is optimized. From this point of view, the R-D performance of the BL and that of the EL can be treated as two different regions, and the tradeoff between them can be seen as an EBA problem in SVC. This EBA problem is introduced in Subsection 2.2.2.1, which closes with the reasons for developing our proposed MLEC formulation.

Furthermore, there is a different type of bit allocation criterion, called implicit bit allocation (IBA). The defining feature of IBA is that the bits allocated to each region are "fixed", and only a "control policy" can be set according to the relative importance of each region. For the tradeoff between the BL and the ELs, the "control policy" establishes the correlations between the Lagrangian factors λ of the layers through the weighting factor assigned to each region. The IBA method that has been proposed for SVC is introduced in Subsection 2.2.2.2.

Finally, Section 2.3 compares and summarizes the properties of the EBA and IBA methods.

2.2.2.1 Explicit Bit Allocation

Chapter 2. Scalable Video Coding and its Encoder Control

In order to overcome the disadvantages of the BUEC, an EBA encoder control for fidelity scalable coding, based on the joint optimization of BL and EL coding parameter selection, has been developed in [3]. Without loss of generality, the modifications of the encoder control are described for a simple two-layer configuration, but they can easily be generalized to a multi-layer scenario. In the two-layer scenario, all BL decisions are based on the minimization of the weighted cost function:

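$$
\min_{\{p_0,\, p_1 | p_0\}} \; (1-w)\cdot\bigl(D^{0}(p_0) + \lambda_0\cdot R^{0}(p_0)\bigr) + w\cdot\bigl(D^{1}(p_1|p_0) + \lambda_1\cdot\bigl(R^{0}(p_0) + R^{1}(p_1|p_0)\bigr)\bigr) \tag{2.4}
$$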

The first and second terms of Eq. (2.4), which are adopted from BUEC, represent the weighted costs for the BL and the EL, respectively. The weighting factor $w \in [0, 1]$ controls the tradeoff between BL and EL coding efficiency. So that the R-D performance for $w = 0$ matches that of BUEC, the BL decisions are based on the minimization of Eq. (2.4), while the EL decisions are refined afterwards by the minimization of Eq. (2.3). When $w$ is equal to 1, the BL parameters are optimized only for the EL coding, without taking the reconstruction quality of the BL into account. The coding parameters $p_i$ can be read as the current MB type of layer $i$ in the mode decision process, or as the current motion vector of layer $i$ in the motion estimation process. The motion estimation process of MLEC is simplified by Schwarz et al. in [3]. Under the general concept of Eq. (2.4), the minimization proceeds over the Cartesian product space of $p_0$ and $p_1$. Working backwards from Eq. (2.4), the MLEC problem can be written as the following multi-objective optimization problem:

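$$
\min \; (1-w)\,D^{0}(p_0) + w\,D^{1}(p_1|p_0)
\quad \text{s.t.} \quad
\begin{cases}
(1)\;\; (1-w)\cdot R^{0}(p_0) \le R^{B} \\
(2)\;\; w\cdot\bigl(R^{0}(p_0) + R^{1}(p_1|p_0)\bigr) \le R^{E}
\end{cases}
\tag{2.5}
$$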

Sec 2.2. SVC Encoder Control

$R^{B}$ is the maximum target bit-rate of the BL and $R^{E}$ is the maximum target bit-rate of the EL. Constraint (2) of Eq. (2.5) imposes no limit when $w = 0$, and tightens to $R^{E}$ as $w$ increases to 1. Both the objective and the constraints therefore vary with the choice of the weighting factor. In the context of MLEC, however, it makes more sense to alter the objective function while leaving the constraints unchanged, so these weight-dependent constraints are not meaningful. As a consequence, the expected R-D performance of the EL cannot be obtained for some high weighting factors, and the results may be unpredictable. Hence, this problem formulation of MLEC is a heuristic solution.
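Dividing both constraints of Eq. (2.5) by their weights (for $0 < w < 1$) makes the dependence of the constraints on the weighting factor explicit. The following is only a restatement of Eq. (2.5), not an additional result from [3]:

$$
R^{0}(p_0) \le \frac{R^{B}}{1-w}, \qquad R^{0}(p_0) + R^{1}(p_1|p_0) \le \frac{R^{E}}{w}
$$

For example, $w = 0.5$ doubles both effective limits to $2R^{B}$ and $2R^{E}$; as $w \to 0$ the EL limit $R^{E}/w$ grows without bound, and as $w \to 1$ the BL limit $R^{B}/(1-w)$ does.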

For the above reasons, we have reformulated the problem of MLEC and, based on the problem formulation, proposed a new decision criterion. These details are presented in Chapter 3.
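The joint decision of Eq. (2.4) can be illustrated with a minimal Python sketch that exhaustively searches the Cartesian product of BL and EL candidates. The function names, the dictionary layout, and all distortion and rate numbers below are hypothetical and serve only as an illustration; they are not taken from [3], whose encoder additionally simplifies the motion estimation search.

```python
def joint_cost(w, d0, r0, d1, r1, lam0, lam1):
    """Weighted two-layer cost of Eq. (2.4) for one (p0, p1) pair."""
    bl_cost = d0 + lam0 * r0           # D0(p0) + lambda0 * R0(p0)
    el_cost = d1 + lam1 * (r0 + r1)    # D1(p1|p0) + lambda1 * (R0 + R1)
    return (1.0 - w) * bl_cost + w * el_cost


def select_parameters(w, bl_candidates, el_candidates, lam0, lam1):
    """Search every (p0, p1) pair and return the one with minimum cost.

    bl_candidates: dict  p0 -> (D0, R0)
    el_candidates: dict  p0 -> dict p1 -> (D1, R1), i.e. EL results given p0
    """
    best_pair, best_cost = None, float("inf")
    for p0, (d0, r0) in bl_candidates.items():
        for p1, (d1, r1) in el_candidates[p0].items():
            cost = joint_cost(w, d0, r0, d1, r1, lam0, lam1)
            if cost < best_cost:       # ties keep the first pair found
                best_pair, best_cost = (p0, p1), cost
    return best_pair, best_cost


# Hypothetical toy data: two BL modes, two EL modes per BL mode.
BL_TOY = {"SKIP": (40.0, 1.0), "INTER16x16": (20.0, 6.0)}
EL_TOY = {
    "SKIP": {"BL_PRED": (12.0, 5.0), "INTRA": (9.0, 9.0)},
    "INTER16x16": {"BL_PRED": (6.0, 2.0), "INTRA": (7.0, 9.0)},
}
LAM0_TOY, LAM1_TOY = 5.0, 1.0
```

With these toy numbers, `select_parameters(0.0, BL_TOY, EL_TOY, LAM0_TOY, LAM1_TOY)` picks the BL mode with the lowest BL cost ("SKIP"), whereas `w = 1.0` switches the BL choice to "INTER16x16" because that choice benefits the EL. At `w = 0` the EL parameter does not affect the cost, which is consistent with the EL decision being refined separately in that case.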

2.2.2.2 Implicit Bit Allocation

An implicit bit allocation (IBA) method has been proposed in [6] as a tradeoff criterion for the coding efficiency between different regions under combined coarse granular scalability (CGS) and spatial scalability. The IBA is formulated as a multi-objective optimization problem in which a region is a quality level at a spatial resolution, and the weighting factor of each region is an input determined by customers' interests. The distinguishing feature of IBA is that the bits allocated to each region are fixed; only the tradeoff between motion and residual information within each region is adjusted, so that the coding efficiency of the regions is guaranteed in the order given by the weighting factors.

Normally, different weighting factors can be assigned to different regions according to their importance. A "control policy" is in effect a type of "implicit" bits: if a "control policy" favors a region, more "implicit" bits are allocated to it and its coding efficiency becomes relatively higher. In this section, the derivation of the IBA formula is briefly introduced.

Let a Lagrangian factor $\lambda_{l,j,i}$ ($= 0.85\,Q_{l,j,i}^{2}$) represent the region corresponding to the $l$th temporal level, the $j$th spatial resolution and the $i$th SNR level, i.e., $\lambda_{l,j,i}$ corresponds to the target bit rate $\hat{\gamma}_{l,j,i}$. Normally, three SNR layers are sufficient for the CGS at a given spatial level, so three SNR layers are assumed here. Due to rate distortion optimization, the tradeoff between motion and residual information in a region is determined by two quantization parameters, one for ME/MC and the other for the quantization of residual information. When the CGS range at the $j$th spatial level is wide, two pairs of Lagrangian multipliers, $(\lambda^{mv}_{lo}(l, j), \lambda_{l,j,1})$ and $(\lambda^{mv}_{hi}(l, j), \lambda_{l,j,2})$, are required to generate two motion vector fields (MVFs), which correspond to $(QP^{mv}_{lo}(l, j), QP_{l,j,1})$ and $(QP^{mv}_{hi}(l, j), QP_{l,j,2})$, respectively. Otherwise, one MVF generated by $(\lambda^{mv}_{lo}(l, j), \lambda_{l,j,1})$ is enough at spatial level $j$.
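As a small illustration of the relation λ = 0.85·Q² quoted above, the following Python sketch computes the multiplier from a quantization step. The helper `q_from_qp` is an assumption that is not stated in the text: it maps a QP to Q = 2^((QP−12)/6), chosen so that the result reduces to the commonly used H.264/AVC mode-decision multiplier 0.85·2^((QP−12)/3). Both function names are hypothetical.

```python
def lagrangian_from_q(q):
    """Lagrangian factor lambda = 0.85 * Q^2, as given in the text."""
    return 0.85 * q * q


def q_from_qp(qp):
    """ASSUMPTION (not from the text): Q = 2 ** ((QP - 12) / 6).

    With this mapping, lagrangian_from_q(q_from_qp(qp)) equals
    0.85 * 2 ** ((qp - 12) / 3).
    """
    return 2.0 ** ((qp - 12) / 6.0)


def region_lambdas(qps):
    """Map each region key (l, j, i) to its Lagrangian factor.

    qps: dict (l, j, i) -> QP of that region (hypothetical layout).
    """
    return {key: lagrangian_from_q(q_from_qp(qp)) for key, qp in qps.items()}
```

Under the assumed mapping, raising the QP of a region by 6 doubles Q and therefore quadruples its Lagrangian factor, which shifts that region's tradeoff towards spending fewer bits.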

Using the Lagrangian optimization method, the implicit solution to the optimization problem can be derived. Since the weighting factor of the $(i \times j)$th region is $w_{l,j,i}$, the corresponding Lagrangian function is

h =

and, given the objective map between $\hat{\gamma}_{l,j,i}$ and $\lambda_{l,j,i}$, the optimal solution is obtained with the simplified "Divide and Conquer" approach presented in [6]. The key idea of "Divide and Conquer" is to simplify the solution of a complex problem by ignoring certain correlations among different regions. The author first uses this idea to compute two auxiliary values, $\lambda^{opt}_{lo}(l, j)$ and $\lambda^{opt}_{hi}(l, j)$, which serve to determine the values of $\lambda^{mv}_{lo}(l, j)$ and $\lambda^{mv}_{hi}(l, j)$ at spatial level $j$. After the "Divide and Conquer" process, the two auxiliary values $\lambda^{opt}_{lo}(l, j)$ and $\lambda^{opt}_{hi}(l, j)$ are obtained as follows:

Eq. (2.6) and Eq. (2.7) reveal the qualitative insight brought by the choices of $\lambda^{opt}_{lo}(l, j)$ and $\lambda^{opt}_{hi}(l, j)$ according to the weighting factor $w_{l,j,i}$. The author then uses the philosophy of "think globally" to determine the values of $\lambda^{mv}_{lo}(l, j)$ and $\lambda^{mv}_{hi}(l, j)$. The customer-oriented scalable tradeoff is achieved by applying, in order, the following rules:

• Rule 1: The ROI corresponding to $\lambda^{opt}_{lo}(l, \phi(l))$ has the highest priority to guarantee its coding efficiency ($\phi(l)$ is the most important spatial layer in the temporal layer $l$).


• Rule 2: Subsequently, the ROI that corresponds to $\lambda^{opt}_{lo}(l, j)$ has the second highest priority to guarantee its coding efficiency at the spatial level $j$.

• Rule 3: Finally, other regions have the lowest priority to guarantee their coding efficiency.
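The ordering implied by Rules 1 to 3 can be sketched in Python as a simple priority sort. The region representation (a tuple of temporal level, spatial level and a tag naming which auxiliary value the region corresponds to), the tag strings and the function names are all hypothetical; the text only specifies the order in which the rules are applied, not a data structure.

```python
def rule_priority(region, most_important_spatial):
    """Return 0, 1 or 2 for Rule 1, Rule 2 or Rule 3 respectively.

    region: (l, j, tag) with temporal level l, spatial level j, and a
            hypothetical tag such as "opt_lo" for the ROI tied to
            lambda^opt_lo(l, j).
    most_important_spatial: dict l -> phi(l), the most important
            spatial layer of temporal layer l.
    """
    l, j, tag = region
    if tag == "opt_lo" and j == most_important_spatial[l]:
        return 0   # Rule 1: ROI of lambda^opt_lo(l, phi(l))
    if tag == "opt_lo":
        return 1   # Rule 2: ROI of lambda^opt_lo(l, j) at other levels j
    return 2       # Rule 3: all remaining regions


def order_regions(regions, most_important_spatial):
    """Sort regions by rule priority; sorted() is stable, so regions
    with equal priority keep their original relative order."""
    return sorted(
        regions, key=lambda r: rule_priority(r, most_important_spatial)
    )
```

For instance, with `phi = {0: 1}`, calling `order_regions([(0, 0, "other"), (0, 0, "opt_lo"), (0, 1, "opt_lo")], phi)` places `(0, 1, "opt_lo")` first, `(0, 0, "opt_lo")` second and the remaining region last.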

The above IBA solution is further adopted to support two cross-layer motion estimation/motion compensation (ME/MC) schemes for the CGS and spatial scalability, which are also presented in the remainder of [6]; these two cross-layer schemes are not discussed in this section. From the preceding presentation, IBA can be seen as a method that fixes the target bitrate and modifies the Lagrangian multiplier in each coding layer (region). In this way, two things are achieved:

• The tradeoff of coding efficiency between different coding layers.

• The tradeoff of motion information and residual information in one coding layer.

However, the encoding process of IBA is still the same as that described in Section 2.2.1.

In the document: A Fast Decision Algorithm for Multi-layer Encoder Control in Scalable Video Coding (pp. 21-25)