Effect of Prediction Block Size - Weighting Function

3. Mode-Dependent Pixel-Based Weighted Intra Prediction (MPWIP)

3.2. Weighting Function

3.2.3. Effect of Prediction Block Size

This last investigation shows how the prediction block size affects the weighting functions. The effect is largely predictable (a smaller block size should give a higher weight to the EL’s components); however, the texture information in the BL may alter this expectation. We therefore need to account for the interaction with the BL texture information, which makes an analysis of the effect of the prediction block size necessary.

Figure 13: Waveforms of weighting functions for DC component in the EL of Vertical and Planar mode at different block size levels.

To this end, Figure 13 contrasts the weighting function of the EL’s DC component for 4x4 and 16x16 block sizes. Results are given for Vertical and Planar modes. As expected, the simulation results show that the EL predictor tends to have a higher weight value across the entire prediction block when the block size is smaller.

This is understandable given that directional intra prediction usually performs more efficiently with a smaller block size and that the EL reference pixels are subject to less coding error.

3.2.4. Summary

We can summarize our findings so far as follows:

• The weighting functions depend on the direction of the prediction mode taken by the EL intra predictor, i.e. they are mode dependent.

• The DC and AC components from the same layer have sets of weighting functions that are similar in waveform but differ in magnitude, i.e. the separation of DC and AC components seems beneficial.

• The weighting functions of Horizontal and Vertical modes are related to each other mainly by a transpose operation.

• The effect of the QP setting on the weighting functions is insignificant in terms of bit rate savings in the common test conditions.

• The EL tends to be weighted more heavily in the case of smaller block sizes.
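To make the findings above concrete, the following is a minimal sketch of how a mode-dependent, position-dependent weighting of DC and AC components could form a prediction. It assumes that the DC component is the block mean, that the AC component is the zero-mean remainder, and that the BL block has already been upsampled to the EL block size; the function names and the layout of the `weights` dictionary are hypothetical and not taken from the SHM software.

```python
def split_dc_ac(block):
    """Split a square block into a DC value (block mean, an assumption)
    and an AC block (the zero-mean remainder)."""
    n = len(block)
    dc = sum(sum(row) for row in block) / (n * n)
    ac = [[px - dc for px in row] for row in block]
    return dc, ac


def mpwip_predict(el_pred, bl_recon, weights):
    """Combine the EL intra predictor and the (upsampled) BL reconstruction.

    weights maps a hypothetical component name ('el_dc', 'el_ac', 'bl_dc',
    'bl_ac') to an n x n table, i.e. one weight per pixel position. In the
    scheme described in the text, a separate set of tables would exist per
    prediction mode, block size and QP setting.
    """
    el_dc, el_ac = split_dc_ac(el_pred)
    bl_dc, bl_ac = split_dc_ac(bl_recon)
    n = len(el_pred)
    return [
        [
            weights["el_dc"][y][x] * el_dc
            + weights["el_ac"][y][x] * el_ac[y][x]
            + weights["bl_dc"][y][x] * bl_dc
            + weights["bl_ac"][y][x] * bl_ac[y][x]
            for x in range(n)
        ]
        for y in range(n)
    ]
```

With all EL weights set to one and all BL weights set to zero, the sketch reduces to the plain EL intra predictor, which is a convenient sanity check.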

CHAPTER 4 Experimental Results

4.1. Test Conditions

Experimental results provided in this chapter are produced mainly following the All Intra (AI) common test conditions specified in [8]. In particular, only the results for the mandatory tests, in which the base layer is HEVC coded, are presented. Moreover, for the 2-layer spatial scalability, the ratio of the EL’s resolution to that of the BL is limited to 2x and 1.5x, although in principle any resolution ratio between the layers, including a ratio of 1, can be considered. Table 1 details the test sequences used. It is noteworthy that the weight values used in our pixel-based weighted intra prediction scheme are obtained from a separate set of training sequences, as given in Table 2.

Generally, it is important to test a model against data which is outside of the samples used to develop it.

Table 1: Test set of video sequences

Class Sequence name Frame count

Table 2: Training set of video sequences

Class Sequence name Frame count
A SteamLocomotive 300 60fps 10 1280x800 2560x1600 Spatial 2x
B Blue Sky 217 25fps 8 960x540

Table 3: Test set of quantization parameter values

Scalability     QP of BL          ΔQP = QP of EL - QP of BL
Spatial 2x      22, 26, 30, 34    0, 2
Spatial 1.5x    22, 26, 30, 34    0, 2

Table 3 defines the quantization parameter values used for the I-frames in the base and enhancement layers of a sequence for the HEVC base layer case.

Figure 14: An example of rate-distortion curves for two sets of encoding configurations: (a) ΔQP=0, (b) ΔQP=2. The four QP values of the BL in each set are 22, 26, 30, and 34.

4.2. Coding Performance

The coding performance of the proposed scheme is quantified by measuring the BD-rate savings [2] relative to the SHM 1.0 anchor. In particular, the results presented in Table 4 correspond to the average numbers taken over two sets of encoding configurations, as suggested by the common test conditions [8]. The method used to calculate the values reported in all performance tables is described below.

First, the BD-rate calculation for single-layer coding is used to calculate the BD-rate savings in each set of QP configurations (ΔQP=0, ΔQP=2), as depicted in Figure 14. The numbers presented in every coding performance table are the averages of the BD-rate savings obtained from these two sets.
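The procedure above can be sketched as follows. This is an illustration only: it follows the classic Bjøntegaard formulation (a cubic through four rate-distortion points in the log-rate domain, integrated over the overlapping PSNR range), and the exact spreadsheet used with the common test conditions may differ in detail. The function names and all rate and PSNR values are hypothetical.

```python
import math


def _cubic_through_points(xs, ys, x):
    """Evaluate the cubic that passes through four (x, y) points (Lagrange form)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def _mean_log_rate(points, lo, hi):
    """Average of log(rate) over the PSNR interval [lo, hi]."""
    rates, psnrs = zip(*points)
    log_rates = [math.log(r) for r in rates]

    def f(x):
        return _cubic_through_points(psnrs, log_rates, x)

    # Simpson's rule integrates a cubic exactly.
    integral = (hi - lo) / 6.0 * (f(lo) + 4.0 * f((lo + hi) / 2.0) + f(hi))
    return integral / (hi - lo)


def bd_rate(anchor, test):
    """BD-rate in percent; anchor and test are lists of four (bitrate, psnr) pairs.
    Negative values mean that the test codec needs fewer bits than the anchor."""
    lo = max(min(p for _, p in anchor), min(p for _, p in test))
    hi = min(max(p for _, p in anchor), max(p for _, p in test))
    diff = _mean_log_rate(test, lo, hi) - _mean_log_rate(anchor, lo, hi)
    return (math.exp(diff) - 1.0) * 100.0


def average_bd_rate(per_set_values):
    """Average the BD-rate values obtained from the individual delta-QP sets."""
    return sum(per_set_values) / len(per_set_values)
```

A table entry would then be obtained by calling `bd_rate` once for the ΔQP=0 set and once for the ΔQP=2 set, each with the four BL QP points, and passing both results to `average_bd_rate`.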

From Table 4, the overall Y-BD-rate gain over the anchor is 1.0% for the AI-2x case and 0.5% for the AI-1.5x case. The coding gain achieved, however, is highly variable over the test sequences. As an example, the smallest gain is in the ‘ParkScene’ sequence, with only a 0.3% and 0.1% gain for AI-2x and AI-1.5x, respectively, whereas a much higher improvement is observed in the ‘BasketballDrive’ sequence, reaching up to 2.4% and 1.5%, respectively. The observation above can be explained by the statistical study on mode distribution and the characteristics of each sequence.

Table 4: Performance of the MPWIP with respect to SHM 1.0

                              AI HEVC 2x                AI HEVC 1.5x
Class     Sequence            Y       U       V         Y       U       V
Class A   Traffic             -0.7%   -0.6%   -0.6%     N/A
          PeopleOnStreet      -0.6%   -0.7%   -0.4%     N/A
Class B   Kimono              -0.4%   -0.4%   -0.5%     -0.2%   -0.3%   -0.3%
          ParkScene           -0.3%   -0.3%   -0.4%     -0.1%   -0.2%   -0.3%
          Cactus              -1.0%   -1.2%   -1.3%     -0.3%   -0.6%   -0.6%
          BasketballDrive     -2.4%   -2.7%   -2.2%     -1.5%   -1.6%   -1.3%
          BQTerrace           -1.3%   -1.6%   -1.8%     -0.4%   -0.6%   -0.7%
Overall (EL+BL)               -1.0%   -1.1%   -1.0%     -0.5%   -0.7%   -0.6%

Figure 15: The statistical mode distribution of two test sequences at the enhancement layer in SHM-1.0: (a) The ‘ParkScene’ sequence, (b) The ‘BasketballDrive’ sequence

Figure 16: The statistical mode distribution of two test sequences at the enhancement layer in the proposed design: (a) The ‘ParkScene’ sequence, (b) The ‘BasketballDrive’ sequence

Conventionally, in the AI configuration of the common test conditions, there are two prediction modes at the EL: Intra and Intra-BL. In the proposed design, the proposed algorithm is applied to create a new prediction mode, called MPWIP; this prediction mode has to compete with the conventional modes in the rate distortion optimization (RDO) process, which selects the mode with the lowest rate distortion (RD) cost. Therefore, generally speaking, the more pixels are coded in this new mode, the higher the expected bit rate savings.
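As an illustration of this competition, the sketch below selects the mode with the lowest Lagrangian cost J = D + λR, which is the usual form of the RD cost in HEVC-based encoders. The function name and the distortion, rate and λ values are hypothetical and serve only to show the selection step.

```python
def select_mode(candidates, lagrange_multiplier):
    """Return the mode with the lowest RD cost J = D + lambda * R.

    candidates maps a mode name to a (distortion, rate_in_bits) pair.
    """
    def rd_cost(mode):
        distortion, rate = candidates[mode]
        return distortion + lagrange_multiplier * rate

    return min(candidates, key=rd_cost)


# Hypothetical numbers for one block.
block_candidates = {
    "Intra": (1200.0, 96),
    "Intra-BL": (1500.0, 40),
    "MPWIP": (1000.0, 64),
}
best_mode = select_mode(block_candidates, lagrange_multiplier=10.0)
```

With these made-up numbers the new mode wins at λ = 10, whereas a larger λ (which penalizes rate more heavily) would favour Intra-BL; the new mode is therefore only chosen where it actually lowers the cost.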

The statistical results are analyzed in this section. Figure 15 shows the mode distribution diagram of the reference software (SHM-1.0), which includes only the two conventional prediction modes. It can be seen that the percentage of intra-coded pixels in the ‘BasketballDrive’ sequence is considerably higher than that in the ‘ParkScene’ sequence. Specifically, 26.54% of the pixels are intra-coded in the ‘BasketballDrive’ sequence, while only about half of that percentage (13.87%) is intra-coded in the ‘ParkScene’ sequence. This can be explained by the characteristics of each sequence: the ‘ParkScene’ sequence is highly textured, while the ‘BasketballDrive’ sequence contains more homogeneous regions.

Figure 16 shows the mode distribution diagram of the proposed design. The results show that the percentage of pixels coded in the new mode in the ‘ParkScene’ sequence is lower than that in the ‘BasketballDrive’ sequence. As a result, higher bit rate savings can be achieved in the latter sequence.

Table 5: Performance of the IDCC with respect to SHM 1.0

AI HEVC 2x AI HEVC 1.5x

Table 6: Performance of the WIP with respect to SHM 1.0

AI HEVC 2x AI HEVC 1.5x

In particular, for both test sequences, most of the pixels coded in the new mode were previously intra-coded; in other words, the proposed algorithm mostly improves the intra-frame prediction. Therefore, it can be concluded that the proposed algorithm works best in sequences that contain more homogeneous regions.

For comparison with the IDCC and WIP algorithms, their respective BD-rate savings relative to the anchor are shown in Table 5 and Table 6. It can be seen that both schemes offer much smaller BD-rate savings than our MPWIP. Specifically, the savings of the IDCC range from 0.0% to 0.5% in all test cases, with an overall savings of no more than 0.2%. The WIP, although performing relatively better in some sequences (such as ‘BasketballDrive’ and ‘BQTerrace’), shows a similar performance to the IDCC in terms of Y-BD-rate.

4.3. Simplification

The superior coding performance of our MPWIP comes at the cost of additional memory requirements in both the encoder and decoder for storing weight tables. In our current implementation, the luminance and chrominance components use separate weight tables for different block sizes (7 block sizes in total). Moreover, according to our analysis in Chapter 3, the weight tables for Horizontal, Vertical, Planar and DC modes are distinct from each other. The situation is further complicated by the QP setting, as we found the weighting functions also vary with the QP setting of the BL and the EL. As a result, a total of 840 weight tables are needed, which calls for a memory space of 795 kilobytes (assuming that each weight is stored in single-precision floating-point format, occupying 32 bits [4 bytes]). Clearly, 795 kB of memory is a significant cost for an on-chip memory design.
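The figures of 840 tables and 795 kilobytes can be reproduced with a short calculation. The breakdown below is an assumption chosen because it matches both totals, not something stated explicitly in the text: luma block sizes of 4 to 32 and chroma block sizes of 4 to 16 (7 sizes in total), four weighting functions per mode except DC mode with three, and 8 QP settings (the four BL QP values times the two ΔQP values of Table 3).

```python
# Assumed breakdown (chosen to be consistent with the totals quoted in the text).
LUMA_BLOCK_SIZES = (4, 8, 16, 32)
CHROMA_BLOCK_SIZES = (4, 8, 16)
FUNCTIONS_PER_MODE = {"Planar": 4, "Horizontal": 4, "Vertical": 4, "DC": 3}
QP_SETTINGS = 4 * 2      # four BL QP values x two delta-QP values (Table 3)
BYTES_PER_WEIGHT = 4     # single-precision float, as stated in the text


def count_tables(functions_per_mode, qp_settings):
    """One table per weighting function, block size and QP setting."""
    block_sizes = len(LUMA_BLOCK_SIZES) + len(CHROMA_BLOCK_SIZES)
    return sum(functions_per_mode.values()) * block_sizes * qp_settings


def storage_bytes(functions_per_mode, qp_settings):
    """Each table holds one weight per pixel position of its block size."""
    weights_per_function = sum(
        n * n for n in LUMA_BLOCK_SIZES + CHROMA_BLOCK_SIZES
    )
    return (weights_per_function * sum(functions_per_mode.values())
            * qp_settings * BYTES_PER_WEIGHT)


print(count_tables(FUNCTIONS_PER_MODE, QP_SETTINGS))           # 840
print(storage_bytes(FUNCTIONS_PER_MODE, QP_SETTINGS) / 1024)   # 795.0
```

Under these assumptions the storage is 814,080 bytes, i.e. 795 kilobytes when one kilobyte is taken as 1024 bytes.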

4.3.1. The Unification to MPWIP

In an attempt to reduce the number of weight tables, we began to analyze the effect of the QP settings on the weighting functions in terms of bit rate savings; two experiments were carried out.

The first experiment is to examine the effect of the QP setting within the delta QP group and the second one is to examine the effect between two QP setting groups.

Specifically, in the former case, we first took out a set of weighting functions obtained [...] savings compared to when each QP setting has its own set of weighting functions. In the latter experiment, all the weighting functions obtained within one delta QP group are used by the other group, with the constraint that QP settings with the same QP value at the BL share the same set of weighting functions. We refer to the experiments that unify the weighting functions within a delta QP group as “Test Group A” and to those that unify the weighting functions between the delta QP groups as “Test Group B”. All the possible test cases in both test groups were examined; however, only one test case in each group is presented in this section, as further described in the captions of Table 7 and Table 8.

Table 7: Comparison performance of Y-BD-rate savings of an experiment in Test Group A vs. MPWIP with respect to SHM 1.0. Specifically, the set of weighting functions of QP(22,22) is used by the other QP settings within delta QP=0, and the set of weighting functions of QP(22,24) is used by the other QP settings within delta QP=2

AI HEVC 2x AI HEVC 1.5x

Table 8: Comparison performance of Y-BD-rate savings of an experiment in Test Group B vs. MPWIP with respect to SHM 1.0. Specifically, all the sets of weighting functions in delta QP = 0 are used by the QP settings in delta QP = 2, with the constraint that QP settings with the same QP value of the BL share the same set of weighting functions

AI HEVC 2x AI HEVC 1.5x

Table 9: Performance of the QP setting unification with respect to SHM 1.0

AI HEVC 2x AI HEVC 1.5x

Table 10: Performance of the Horizontal and Vertical unification with respect to SHM 1.0

AI HEVC 2x AI HEVC 1.5x

Table 11: Performance of MPWIP-U (the combination of two unifications) with respect to SHM 1.0

AI HEVC 2x AI HEVC 1.5x

As can be seen from Table 7 and Table 8, the Y-BD-rate savings vary only slightly between the two unifications and the proposed design. In particular, the bit rate savings drop by at most 0.3% in all of the cases.

This provides an opportunity to simplify the weighting functions of the QP settings within a delta QP group and/or those of the QP settings between the two delta QP groups. The following subsections therefore develop the first simplification, which unifies the weighting functions.

Unifying Weight Tables for Different QP Settings

From the above analysis, we propose a simplification in which there is only one set of weighting functions for all the QP settings. Table 9 presents the coding performance of the proposed scheme with only one set of weight tables for all QP settings. In this case, the number of weight tables is reduced to only 105. As before, the least-squares method is employed to compute the unified weights for different QP settings. Comparing Table 9 with the results in Table 4, the Y-BD-rate savings decrease by 0.2%, declining from 1.0% and 0.5% to 0.8% and 0.3% in the AI-2x and AI-1.5x test cases, respectively.
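One plausible way to write the unified least-squares problem is given below. It assumes that training samples from all QP settings are simply pooled for each pixel position; the notation is introduced here for illustration and is not taken from the thesis. For a pixel position $p$, let $\mathbf{c}_{n,q}(p)$ be the vector of the EL and BL DC and AC components of training block $n$ coded at QP setting $q$, and let $o_{n,q}(p)$ be the corresponding original pixel value.

```latex
\mathbf{w}^{*}(p)
  = \arg\min_{\mathbf{w}} \sum_{q}\sum_{n}
    \bigl( o_{n,q}(p) - \mathbf{w}^{\mathsf{T}} \mathbf{c}_{n,q}(p) \bigr)^{2}
\quad\Longrightarrow\quad
\Bigl( \sum_{q}\sum_{n} \mathbf{c}_{n,q}(p)\, \mathbf{c}_{n,q}(p)^{\mathsf{T}} \Bigr)\,
\mathbf{w}^{*}(p)
  = \sum_{q}\sum_{n} \mathbf{c}_{n,q}(p)\, o_{n,q}(p)
```

Restricting the outer sum to a single QP setting $q$ gives the per-QP weights of the original design, so under this reading the unification only changes which samples enter the normal equations.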

Unifying Weight Tables for Horizontal and Vertical Modes

We also observed that the weighting functions for the Horizontal and Vertical modes are related to each other mainly through a transpose operation, which is intuitively reasonable considering the symmetric properties of signal statistics in natural images. Thus, the other simplification we made is to unify their weight tables. That is, for MPWIP, we apply the same weight tables to the Horizontal and Vertical modes, and the tables only need to be transposed for one of them prior to application. As a result of this unification, the number of weight tables is reduced to 616. Table 10 shows the resulting coding performance from this simplification. As expected, the coding gain in the AI-2x case is reduced slightly from 1.0% to 0.9% (and in the AI-1.5x case from 0.5% to 0.4%), and a similar performance decline is observed for both luminance and chrominance components.
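The table sharing described above can be sketched as a simple lookup. The choice of storing the shared table in the Horizontal orientation, as well as the function names and weight values, are assumptions made for illustration.

```python
def transpose(table):
    """Swap rows and columns of a square weight table."""
    return [list(column) for column in zip(*table)]


def weight_table_for(mode, shared_table):
    """Return the table to apply for a Horizontal or Vertical block.

    shared_table is assumed to be stored in the Horizontal orientation;
    the Vertical mode reuses it after a transpose.
    """
    if mode == "Horizontal":
        return shared_table
    if mode == "Vertical":
        return transpose(shared_table)
    raise ValueError("only Horizontal and Vertical share a table: " + mode)


# Hypothetical 2x2 table whose weights decay away from the left reference column.
shared = [[0.9, 0.6],
          [0.9, 0.6]]
vertical = weight_table_for("Vertical", shared)
# For the Vertical mode, the weights now decay away from the top reference row.
```

Only one of the two tables has to be stored; the transpose is a pure re-indexing and adds no arithmetic to the prediction itself.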

The previous two unifications can be applied simultaneously to further reduce the memory requirements. The results in Table 11 show that the coding loss due to the application of both simplifications is moderate compared to the design without any unification of weight tables. In this case, a coding loss of 0.2% and 0.1% was observed in the Y-BD-rate of AI-2x and AI-1.5x, respectively. The total number of weight tables, however, is reduced from 840 to 77, with a corresponding decrease in storage space.
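The table counts quoted for the two unifications (105, 616 and 77) are consistent with the following bookkeeping. As an assumption not spelled out in the text, each mode is taken to have four weighting functions except DC mode, which is taken to have three, with 7 block sizes and 8 QP settings in the original design.

```python
BLOCK_SIZES = 7  # stated in the text (luma and chroma together)

# Assumed number of weighting functions per stored mode.
SEPARATE_MODES = {"Planar": 4, "Horizontal": 4, "Vertical": 4, "DC": 3}
SHARED_HV_MODES = {"Planar": 4, "Horizontal/Vertical": 4, "DC": 3}


def count_tables(functions_per_mode, qp_settings):
    """One table per weighting function, block size and QP setting."""
    return sum(functions_per_mode.values()) * BLOCK_SIZES * qp_settings


original = count_tables(SEPARATE_MODES, qp_settings=8)     # 840
qp_unified = count_tables(SEPARATE_MODES, qp_settings=1)   # 105
hv_unified = count_tables(SHARED_HV_MODES, qp_settings=8)  # 616
mpwip_u = count_tables(SHARED_HV_MODES, qp_settings=1)     # 77
```

Under this bookkeeping, unifying the QP settings accounts for most of the reduction (a factor of 8), while sharing the Horizontal and Vertical tables removes 4 of the 15 weighting functions per block size.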

4.3.2. Constrained MPWIP

Based on the observation that the weighting functions of the DC and AC components of the same layer have similar waveforms (although their magnitudes can differ considerably), in this section we propose a Constrained MPWIP, in which the weight values associated with the DC and AC components from the same layer are restricted to be identical; in other words, the texture information of the EL intra predictor and that of the reconstructed BL block are each weighted as a whole. There is an additional constraint: the weight values for these two textures must add up to one, which is referred to as the unit-gain constraint [9]. This constraint is used because each of these textures can be considered a prediction candidate; without it, in the worst case, the weighted sum of these textures could exceed the predefined range of sample values. Essentially, this Constrained MPWIP forms a prediction of the EL by linearly combining the reconstructed BL block and the EL (directional) intra predictor with an adaptive weighting at the pixel level, as illustrated in Figure 17, and is therefore similar to the scheme proposed in [6].

Table 12: Performance of the Constrained-MPWIP with respect to SHM 1.0

AI HEVC 2x AI HEVC 1.5x

Figure 17: The Constrained MPWIP Scheme

Figure 18: Waveform of weighting functions for all test modes at block size of 16x16, QP(30,30). (a) Planar mode, (b) DC mode, (c) Horizontal mode, (d) Vertical mode

Furthermore, it is observed that all the unifying techniques found in the previous section can be applied to this scheme. Therefore, when viewed as a simplified version of MPWIP, this scheme can be considered an extension of the MPWIP-U scheme. It requires only 21 weight tables: one weighting function (rather than four, as in the proposed design) for each EL prediction mode, a single set of weight tables for all QP settings, and unified weight tables for the Horizontal and Vertical modes, across the different block sizes of the luma and chroma components. As an example, Figure 18 depicts the weighting function for the EL intra predictor produced with the four test modes for the luminance component at the block size of 16x16.
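The unit-gain combination described in this section can be sketched as a per-pixel blend of the two prediction candidates. The function name and the weight values are hypothetical, and the BL block is assumed to be already upsampled to the EL block size.

```python
def constrained_mpwip_predict(el_pred, bl_recon, el_weight):
    """Blend the EL intra predictor and the BL reconstruction per pixel.

    el_weight[y][x] is the weight of the EL predictor at position (x, y);
    the BL weight is 1 - el_weight[y][x], so the two weights add up to one
    (the unit-gain constraint).
    """
    n = len(el_pred)
    return [
        [
            el_weight[y][x] * el_pred[y][x]
            + (1.0 - el_weight[y][x]) * bl_recon[y][x]
            for x in range(n)
        ]
        for y in range(n)
    ]


# Hypothetical 2x2 example: the EL predictor is trusted most in the top-left corner.
el_block = [[100, 100], [100, 100]]
bl_block = [[60, 60], [60, 60]]
weights = [[1.0, 0.5], [0.5, 0.0]]
prediction = constrained_mpwip_predict(el_block, bl_block, weights)
# prediction == [[100.0, 80.0], [80.0, 60.0]]
```

As long as each weight lies between 0 and 1, every predicted sample lies between the two candidate values, which illustrates why the unit-gain constraint keeps the prediction inside the valid sample range. Only one table per mode and block size is needed here, because the BL weight follows from the EL weight.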

Table 12 shows the coding performance of the Constrained MPWIP, as compared to the SHM-1.0 anchor. As expected, it incurs a moderate coding loss of 0.5% and 0.3% in the AI-2x and AI-1.5x cases, respectively, when compared to the original design. This confirms that separating the signals into DC and AC components is beneficial for forming a better EL intra predictor.

To examine the effect of the unit-gain constraint in terms of bit rate savings, supplementary experiments were carried out with all the test conditions being the same as those of the Constrained MPWIP, except that the unit-gain constraint was relaxed. The bit rate savings of this supplementary experiment differ only marginally from those of the Constrained MPWIP. In addition, relaxing the constraint comes at a higher cost, as it doubles the number of weight tables.

Conceptually, the Constrained MPWIP scheme is very similar to the WIP algorithm, in which the texture information of the BL and the EL are combined adaptively according to the pixel’s position. Therefore, it is worthwhile to compare the Constrained MPWIP with the WIP algorithm. From Table 6 and Table 12, it can be seen that our simplified mode provides a 0.5% coding gain for the AI-2x case and 0.3% for the AI-1.5x case, while the WIP algorithm achieves gains of 0.3% and 0.1%, respectively. There are two major differences between our scheme and the WIP algorithm that enable our algorithm to provide a higher gain: 1) we created four additional modes, and the best mode for each block is determined by the RDO process, while the WIP algorithm is applied to all intra prediction modes; 2) it seems that the WIP algorithm uses a unified weighting scheme for all the intra prediction modes, even though there are cases where it is not necessary to weight some neighboring reference pixels in the EL; in our mode, we have different weighting functions for each intra mode, which leads to a more appropriate weighting scheme to form the final prediction. However, each mode in our scheme has its own weighting functions, and the number of weight tables also depends on the prediction block size, so our scheme requires more memory to store those weight tables.

4.3.3. Summary

The findings of this chapter can be summarized as follows:

• It turns out that the performance of the proposed algorithm (MPWIP) depends strongly on the characteristics of the video sequences. Specifically, the proposed scheme works best in sequences that contain more homogeneous regions.

• For the Unification to MPWIP simplifications, it can be concluded that the number of weight tables is reduced considerably with moderate R-D losses, compared to MPWIP.

• For the Constrained MPWIP, even though the number of weight tables can be further reduced, the losses seem to be significant, compared to MPWIP. Therefore, it justifies the benefits of separating the texture information into the AC and DC components.

CHAPTER 5 Conclusions

5.1. Summary

In this thesis, we have introduced an algorithm that combines the EL intra predictor and the BL reconstructed block, aimed at improving the EL intra prediction in the framework of the TextureRL; this algorithm does bring a coding gain. The algorithm first separates those textures into the AC and DC components, and then weights them by different weight values; the weight value associated with each component is a function of the prediction pixel’s position in the block. In addition, the parameters (e.g. the intra prediction mode of the EL intra predictor, the QP setting, and
