

4.4 Parameter Estimation

4.4.1 Discrete Laplacian Parameter

For calculating the expectations of ∆D and ∆R, we need the probability distribution of each transform coefficient. Since the actual distribution is only available at the encoder side, we adopt a model-based approach to minimize the overhead. Specifically, we model each 4x4 integer transform coefficient with a discrete Laplacian distribution, as defined in Eq. (4.2), where $X_{n,k}$ denotes the n-th zigzag-ordered coefficient of block k and $x_{n,k}$ stands for its outcome.

Particularly, we assume that the co-located coefficients are independently and identically distributed (i.i.d.). Thus, the Laplacian parameter $\sigma_n$ depends only on the zigzag index n. Although the i.i.d. assumption is not exactly true, it simplifies the derivation and provides a good approximation.

$$P[X_{n,k} = x_{n,k}] \triangleq \frac{1-\sigma_n}{1+\sigma_n} \times (\sigma_n)^{|x_{n,k}|}. \quad (4.2)$$

To estimate the Laplacian parameter, we use the maximum likelihood principle [30], since the maximum likelihood estimator offers many desirable properties, such as asymptotic unbiasedness and minimum variance. Given a set of M observed data and a presumed joint probability with an unknown parameter, the maximum likelihood estimator of the unknown parameter is the one that maximizes the joint probability. For an enhancement-layer frame having M 4x4 blocks, the joint probability of the n-th coefficients can be written as in Eq. (4.3). According to the i.i.d. assumption, the joint probability factors into a product of M terms. Further, by substituting Eq. (4.2) into Eq. (4.3), we obtain a closed-form formula for the joint probability as below:

$$\prod_{k=1}^{M} P[X_{n,k} = x_{n,k}] = \left(\frac{1-\sigma_n}{1+\sigma_n}\right)^{M} \times (\sigma_n)^{\sum_{k=1}^{M}|x_{n,k}|}. \quad (4.3)$$
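As a numerical aid, the following minimal Python sketch evaluates the PMF of Eq. (4.2) and the logarithm of the joint probability of Eq. (4.3); the function names are ours, and the code is illustrative rather than part of the codec.

```python
import numpy as np

def laplacian_pmf(x, sigma):
    """Discrete Laplacian PMF of Eq. (4.2): P[X = x] = (1 - sigma)/(1 + sigma) * sigma^|x|."""
    return (1.0 - sigma) / (1.0 + sigma) * sigma ** np.abs(x)

def log_likelihood(coeffs, sigma):
    """Log of the joint probability of Eq. (4.3) for M i.i.d. co-located
    coefficients: M * log((1 - s)/(1 + s)) + sum(|x|) * log(s)."""
    coeffs = np.asarray(coeffs)
    M = coeffs.size
    return (M * (np.log(1.0 - sigma) - np.log(1.0 + sigma))
            + np.abs(coeffs).sum() * np.log(sigma))
```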

By definition, the maximum likelihood estimator of $\sigma_n$ is the one that maximizes Eq. (4.3). To find the solution, we take the derivative with respect to $\sigma_n$ and solve for the root, as in Eq. (4.4):

$$\frac{d}{d\sigma_n}\left[\left(\frac{1-\sigma_n}{1+\sigma_n}\right)^{M} \times (\sigma_n)^{\sum_{k=1}^{M}|x_{n,k}|}\right] = 0. \quad (4.4)$$

For simplification, in Eq. (4.5) we further incorporate a logarithm function to transform the exponential terms into additions/subtractions. Since the logarithm function is monotonic, the root of Eq. (4.5) is also the one that maximizes Eq. (4.3):

$$\frac{d}{d\sigma_n}\left[M\ln(1-\sigma_n) - M\ln(1+\sigma_n) + \sum_{k=1}^{M}|x_{n,k}|\,\ln\sigma_n\right] = 0. \quad (4.5)$$

After taking the derivative with respect to $\sigma_n$, Eq. (4.5) can be rewritten as Eq. (4.6), which can be further rearranged as Eq. (4.7):

$$-\frac{M}{1-\sigma_n} - \frac{M}{1+\sigma_n} + \frac{1}{\sigma_n}\sum_{k=1}^{M}|x_{n,k}| = 0, \quad (4.6)$$

$$\mu_x\,\sigma_n^2 + 2\sigma_n - \mu_x = 0, \quad (4.7)$$

where $\mu_x = \frac{1}{M}\sum_{k=1}^{M}|x_{n,k}|$ denotes the mean of the absolute values of the co-located coefficients. Since $\sigma_n$ is a real number between 0 and 1, we exclude the root of Eq. (4.7) that does not meet such a constraint. Eq. (4.8) shows the remaining solution for $\sigma_n$, which is our maximum likelihood estimator for the Laplacian parameter:

$$\sigma_n = -\mu_x^{-1} + \sqrt{\mu_x^{-2} + 1}. \quad (4.8)$$
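To make the derivation concrete, the sketch below (our own sanity check, not from the thesis) evaluates Eq. (4.8) on synthetic stand-in data and confirms that it matches a brute-force grid maximization of the log-likelihood of Eq. (4.5).

```python
import numpy as np

def sigma_ml(coeffs):
    """Closed-form ML estimator of Eq. (4.8): sigma_n = -1/mu_x + sqrt(1/mu_x^2 + 1),
    where mu_x is the mean absolute value of the co-located coefficients."""
    mu = np.abs(coeffs).mean()
    return -1.0 / mu + np.sqrt(1.0 / mu ** 2 + 1.0)

rng = np.random.default_rng(0)
coeffs = rng.integers(-8, 9, size=1000)          # synthetic stand-in coefficients

# Brute force: maximize the log-likelihood over a fine grid on (0, 1).
grid = np.linspace(1e-4, 1.0 - 1e-4, 10000)
M, S = coeffs.size, np.abs(coeffs).sum()
ll = M * (np.log(1.0 - grid) - np.log(1.0 + grid)) + S * np.log(grid)
assert abs(sigma_ml(coeffs) - grid[np.argmax(ll)]) < 1e-3
```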

According to Eq. (4.8), to estimate the $\sigma_n$ for a coefficient, we first calculate the mean of the absolute values of the co-located coefficients, $\mu_x$, and then substitute it into Eq. (4.8) to obtain the estimator. Specifically, the estimation is done at the encoder and the estimated parameters are transmitted to the decoder. For each enhancement-layer frame, we have 16 parameters for the luminance component and another 16 parameters for the chrominance part. Particularly, each parameter is quantized and coded with an 8-bit syntax at the frame level. Thus, for each enhancement-layer frame, an additional 256 bits are coded as overhead. Compared to the entire bit-stream, this overhead is just a minor portion.
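The text does not spell out the quantizer design, so the sketch below simply assumes a uniform 8-bit quantizer on (0, 1) to illustrate the overhead accounting; quantize_sigma and its resolution are our assumptions.

```python
def quantize_sigma(sigma, bits=8):
    """Map sigma in (0, 1) to an integer index; a uniform quantizer is assumed
    here, since the exact design is not specified in the text."""
    levels = (1 << bits) - 1                     # 255 steps for 8 bits
    return int(round(sigma * levels))

def dequantize_sigma(index, bits=8):
    """Decoder-side reconstruction of the transmitted parameter."""
    return index / ((1 << bits) - 1)

# 16 luma + 16 chroma parameters, 8 bits each: 32 * 8 = 256 overhead bits per frame.
overhead_bits = (16 + 16) * 8
```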

Figure 4.6 compares the estimated distributions with the actual ones. As shown, the actual distributions are close to Laplacian, and the estimated models preserve the relative distributions of the different coefficients. To show the accuracy of our estimation, we calculate the Kullback-Leibler distance² (KLD) [7], a common measure of the difference between an actual distribution and its estimate. A KLD of 0 means that the estimated distribution is identical to the actual one. As shown in Figure 4.6(b), the KLD between our estimated distributions and the actual ones approaches zero; that is, our estimated distributions are close to the actual ones.

²$\mathrm{KLD}(P(x), \tilde{P}(x)) = \sum_{x} P(x)\,\log_2\!\big(P(x)/\tilde{P}(x)\big)$, where $P(x)$ is the actual probability distribution and $\tilde{P}(x)$ is its estimation.
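The KLD of footnote 2 is straightforward to compute; a minimal sketch (our naming) follows, assuming both distributions are supplied as arrays over the same support.

```python
import numpy as np

def kld_bits(p, p_est, eps=1e-12):
    """Kullback-Leibler distance sum_x p(x) * log2(p(x) / p_est(x)), in bits.
    p is the actual distribution, p_est its estimate; zero-probability terms vanish."""
    p, p_est = np.asarray(p, float), np.asarray(p_est, float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log2(p[mask] / np.maximum(p_est[mask], eps))))
```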


Figure 4.6: Probability distributions of the 4x4 integer transform coefficients. The legend ZZn denotes the zigzag index of a coefficient and KLD stands for the Kullback-Leibler distance. (a) Actual probability distributions. (b) Estimated probability models.

Figure 4.7: Examples of ∆D estimation for the significant bit (left) and the refinement bit (right).

4.4.2 Estimation of ∆D

To estimate the ∆D for a coefficient bit, we use the reduction of expected squared error. Since decoding a coefficient bit reduces the uncertainty about the coefficient's value, we can calculate the reduction of expected squared error from the decrease of the uncertainty interval.

In Figure 4.7 we depict the estimated distribution of a 4-bit coefficient and give two examples illustrating the decrease of the uncertainty interval. Without loss of generality, the left-hand side shows an example of a significant bit, where the first bit is coded as zero. On the other hand, the right-hand side illustrates the case of a refinement bit, where the first bit is non-zero.

From the coded bits, we can identify the uncertainty interval in which the actual value is located. For instance, in the example of the significant bit, we know that the actual value is confined within the interval $B$. Similarly, for the case of the refinement bit, we learn that the actual value falls in the interval $A^+$. Given the interval derived from the previously coded bits, the next bit to be coded further decreases the uncertainty interval. For example, the significant bit to be coded determines whether the actual value is in the subinterval $B_0$ or $\{B_1^+ \cup B_1^-\}$. Particularly, for a non-zero significant bit, an additional sign bit is coded to further decide whether the actual value is in the subinterval $B_1^+$ or $B_1^-$. By the same token, the refinement bit to be coded determines whether the actual value is in the subinterval $A_1^+$ or $A_0^+$.

From the decrease of the uncertainty interval, we can calculate the reduction of expected squared error. At the decoder side, the expected squared error in an interval is the variance within that interval. Thus, we can express our ∆D estimation as the reduction of variance. Eq. (4.9) formulates our estimation for the significant bit in Figure 4.7, where $\mathrm{Var}[X_{n,k} \mid X_{n,k} \in B]$ denotes the conditional variance of $X_{n,k}$ given that $X_{n,k}$ is in the interval $B$. Similarly, we have the variances for the subintervals $B_1^+$, $B_1^-$, and $B_0$. Since we do not know in which subinterval the actual value is located, the variance of each subinterval is weighted by its probability. To simplify the expression, we note that the variances of $B_1^+$ and $B_1^-$ are identical because the Laplacian distribution is symmetric. Thus, we can merge the second and the third terms in Eq. (4.9) by factorization.

Appendix A gives the detailed derivations of the conditional probability and conditional variance in terms of the interval range and $\sigma_n$.

$$
\begin{aligned}
E[\widetilde{\Delta D}_{n,k,B,\mathrm{significant}}]
&\triangleq \mathrm{Var}[X_{n,k} \mid X_{n,k} \in B] \\
&\quad - P\big[X_{n,k} \in B_1^+ \mid X_{n,k} \in B\big] \times \mathrm{Var}[X_{n,k} \mid X_{n,k} \in B_1^+] \\
&\quad - P\big[X_{n,k} \in B_1^- \mid X_{n,k} \in B\big] \times \mathrm{Var}[X_{n,k} \mid X_{n,k} \in B_1^-] \\
&\quad - P[X_{n,k} \in B_0 \mid X_{n,k} \in B] \times \mathrm{Var}[X_{n,k} \mid X_{n,k} \in B_0] \quad (4.9) \\
&= \mathrm{Var}[X_{n,k} \mid X_{n,k} \in B] \\
&\quad - P\big[X_{n,k} \in \{B_1^+ \cup B_1^-\} \mid X_{n,k} \in B\big] \times \mathrm{Var}[X_{n,k} \mid X_{n,k} \in B_1^+] \\
&\quad - P[X_{n,k} \in B_0 \mid X_{n,k} \in B] \times \mathrm{Var}[X_{n,k} \mid X_{n,k} \in B_0].
\end{aligned}
$$
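Since the coefficients are integers, the conditional probabilities and variances in Eq. (4.9) can be evaluated by direct summation of the model PMF over each interval (Appendix A gives closed forms; the brute-force version below is our illustrative equivalent). Intervals are passed as plain lists of integer values, and the 4-bit example intervals reflect our reading of the left-hand side of Figure 4.7.

```python
import numpy as np

def _pmf(x, sigma):
    # Discrete Laplacian of Eq. (4.2).
    return (1.0 - sigma) / (1.0 + sigma) * sigma ** np.abs(x)

def cond_prob(sub, whole, sigma):
    """P[X in sub | X in whole] by direct summation over integer intervals."""
    return _pmf(np.asarray(sub), sigma).sum() / _pmf(np.asarray(whole), sigma).sum()

def cond_var(interval, sigma):
    """Var[X | X in interval] under the discrete Laplacian model."""
    xs = np.asarray(interval, dtype=float)
    w = _pmf(xs, sigma)
    w /= w.sum()
    mean = (w * xs).sum()
    return float((w * (xs - mean) ** 2).sum())

def delta_d_significant(B, B1_pos, B1_neg, B0, sigma):
    """Eq. (4.9): variance of B minus the probability-weighted subinterval
    variances; Var[B1+] = Var[B1-] by symmetry, so those terms are merged."""
    return (cond_var(B, sigma)
            - cond_prob(B1_pos + B1_neg, B, sigma) * cond_var(B1_pos, sigma)
            - cond_prob(B0, B, sigma) * cond_var(B0, sigma))

# 4-bit example (our reading of Figure 4.7, left): the first bit was 0, so
# |x| <= 7; the next significant bit splits B into B0 versus B1+ or B1-.
B, B1_pos, B1_neg, B0 = (list(range(-7, 8)), list(range(4, 8)),
                         list(range(-7, -3)), list(range(-3, 4)))
print(delta_d_significant(B, B1_pos, B1_neg, B0, sigma=0.6))
```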

In Eq. (4.9), the co-located significant bits within the same interval have the same estimated ∆D because the co-located coefficients share an identical Laplacian model. As a result, the priorities of the co-located significant bits may not be distinguishable. To perform the reshuffling in a content-aware manner, so that the regions containing more energy receive higher priority, the subinterval probabilities in Eq. (4.9) are replaced with the context probability models. Recall that the context model of the significant bit refers to the significance status of the adjacent and co-located coefficients. Using the context probability model for substitution makes the ∆D estimation content aware and energy dependent.

For the substitution, we note that $P\big[X_{n,k} \in \{B_1^+ \cup B_1^-\} \mid X_{n,k} \in B\big]$ actually denotes the probability that the significant bit to be coded is non-zero, and $P[X_{n,k} \in B_0 \mid X_{n,k} \in B]$ represents its probability of being zero. Hence, we use the associated context probability models to substitute for these two terms. Eq. (4.10) shows the content-aware ∆D estimation for the example of the significant bit in Figure 4.7, where SigCtxP(CtxIdx(n, k, B), 1) denotes the context probability model of non-zero for the significant bit of $X_{n,k}$ located in the interval $B$. Correspondingly, SigCtxP(CtxIdx(n, k, B), 0) represents its probability of zero.

$$
\begin{aligned}
E[\widetilde{\Delta D}_{n,k,B,\mathrm{significant}}]
&\cong \mathrm{Var}[X_{n,k} \mid X_{n,k} \in B] \quad (4.10) \\
&\quad - \mathrm{SigCtxP}(\mathrm{CtxIdx}(n,k,B), 1) \times \mathrm{Var}[X_{n,k} \mid X_{n,k} \in B_1^+] \\
&\quad - \mathrm{SigCtxP}(\mathrm{CtxIdx}(n,k,B), 0) \times \mathrm{Var}[X_{n,k} \mid X_{n,k} \in B_0].
\end{aligned}
$$
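Reusing cond_var from the sketch above, Eq. (4.10) only swaps the model-based subinterval probabilities for the context probability models. SigCtxP belongs to the coder's context machinery, so the sketch below simply takes its two outputs as plain numbers.

```python
def delta_d_significant_ctx(B, B1_pos, B0, p_ctx_one, p_ctx_zero, sigma):
    """Eq. (4.10): p_ctx_one stands for SigCtxP(CtxIdx(n, k, B), 1) and
    p_ctx_zero for SigCtxP(CtxIdx(n, k, B), 0), both supplied by the coder."""
    return (cond_var(B, sigma)
            - p_ctx_one * cond_var(B1_pos, sigma)
            - p_ctx_zero * cond_var(B0, sigma))
```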

Following the same procedure, one can estimate the ∆D for the other bits. Eq. (4.11) shows the estimated ∆D for the refinement bit in Figure 4.7. Since the refinement bit does not have any context probability model, the conditional probabilities in Eq. (4.11) are derived from the estimated Laplacian model.

$$
\begin{aligned}
E[\widetilde{\Delta D}_{n,k,A^+,\mathrm{refinement}}]
&\triangleq \mathrm{Var}[X_{n,k} \mid X_{n,k} \in A^+] \quad (4.11) \\
&\quad - P(X_{n,k} \in A_1^+ \mid X_{n,k} \in A^+) \times \mathrm{Var}[X_{n,k} \mid X_{n,k} \in A_1^+] \\
&\quad - P(X_{n,k} \in A_0^+ \mid X_{n,k} \in A^+) \times \mathrm{Var}[X_{n,k} \mid X_{n,k} \in A_0^+].
\end{aligned}
$$
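The refinement case of Eq. (4.11) follows the same pattern with model-derived probabilities. The sketch below reuses cond_prob and cond_var from the earlier block, with the intervals taken from our reading of the right-hand side of Figure 4.7.

```python
def delta_d_refinement(A, A1, A0, sigma):
    """Eq. (4.11): no context model exists for refinement bits, so the
    subinterval probabilities come from the estimated Laplacian itself."""
    return (cond_var(A, sigma)
            - cond_prob(A1, A, sigma) * cond_var(A1, sigma)
            - cond_prob(A0, A, sigma) * cond_var(A0, sigma))

# Figure 4.7 (right), as we read it: first bit 1 and sign +, so x lies in
# A+ = {8..15}; the refinement bit narrows this to A1+ = {12..15} or A0+ = {8..11}.
print(delta_d_refinement(list(range(8, 16)), list(range(12, 16)),
                         list(range(8, 12)), sigma=0.6))
```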

4.4.3 Estimation of ∆R

To estimate the ∆R for a coefficient bit, we use the binary entropy function³, which represents the minimum expected coding bit rate for an input bit. Eq. (4.12) defines our ∆R estimation for the significant bit in Figure 4.7. The first term represents the binary entropy of the significant bit, using the context probability model as an argument, while the second term denotes the cost of a sign bit. The sign bit is considered part of the cost of the significant bit because the decoder can only perform the reconstruction after the sign is received. Recall that each sign bit consumes one bit on average. Also, the sign bit is only coded after a non-zero significant bit. Thus, the cost of the sign bit is weighted by the context probability model of non-zero.

³$H_b(P(1)) = -P(1) \log_2 P(1) - (1 - P(1)) \log_2 (1 - P(1))$, where $P(1)$ is the probability of non-zero for an input bit.

$$
\begin{aligned}
E[\widetilde{\Delta R}_{n,k,B,\mathrm{significant}}]
&\triangleq H_b\big(\mathrm{SigCtxP}(\mathrm{CtxIdx}(n,k,B), 1)\big) \quad (4.12) \\
&\quad + \mathrm{SigCtxP}(\mathrm{CtxIdx}(n,k,B), 1) \times 1.
\end{aligned}
$$
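A small sketch of Eq. (4.12) and footnote 3 (function names ours): the entropy of the significant bit plus the probability-weighted one-bit sign cost.

```python
import numpy as np

def binary_entropy(p_one):
    """H_b of footnote 3, in bits; p_one is the probability of a non-zero bit."""
    if p_one <= 0.0 or p_one >= 1.0:
        return 0.0
    return float(-p_one * np.log2(p_one) - (1.0 - p_one) * np.log2(1.0 - p_one))

def delta_r_significant(p_ctx_one):
    """Eq. (4.12): p_ctx_one plays the role of SigCtxP(CtxIdx(n, k, B), 1);
    the sign bit costs one bit whenever the significant bit is non-zero."""
    return binary_entropy(p_ctx_one) + p_ctx_one * 1.0
```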

In addition, Eq. (4.13) illustrates our ∆R estimation for the example of the refinement bit. To calculate the binary entropy, we use the conditional probability of a subinterval as the argument. For instance, we use $P(X_{n,k} \in A_1^+ \mid X_{n,k} \in A^+)$ as the argument in Eq. (4.13). Particularly, as mentioned in Section 4.2, such an estimated probability is used not only for the calculation of binary entropy but also for the CABIC coding. By the same methodology, one can estimate the ∆R for the other bits.

$$E[\widetilde{\Delta R}_{n,k,A^+,\mathrm{refinement}}] \triangleq H_b\big(P(X_{n,k} \in A_1^+ \mid X_{n,k} \in A^+)\big). \quad (4.13)$$
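For completeness, Eq. (4.13) in the same style; p_sub would be the model-derived conditional probability, e.g. cond_prob(A1, A, sigma) from the earlier sketch.

```python
def delta_r_refinement(p_sub):
    """Eq. (4.13): the expected rate of a refinement bit is the binary entropy
    of P(X in A1+ | X in A+), supplied here as p_sub."""
    return binary_entropy(p_sub)   # binary_entropy as defined above
```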