CHAPTER 3 ADAPTIVE MOTION ESTIMATION WITH VARIABLE
pass filter (HPF) of Eq.3.1 to extract edges. An HPF passes the higher frequency band and filters out the lower frequency band. After edge extraction, we obtain the gradient value of every pixel in a frame, and we use Eq.3.2 to calculate ThEC. The algorithm then uses ThEC as the condition for picking the edge pixels according to Eq.3.3. Finally, we sum all the edge pixels in a frame to get the EC value.
After obtaining the two indices, ZMBC and EC, we analyze the relation between quality degradation and these two indices in the next section.
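The edge-count procedure described above (high-pass filtering, thresholding with ThEC, counting edge pixels) can be sketched as follows. This is only an illustration under stated assumptions: the 3×3 kernel is a generic high-pass kernel standing in for Eq.3.1, the mean gradient stands in for the ThEC of Eq.3.2, and the names `high_pass` and `edge_count` are hypothetical.

```python
import numpy as np

# Assumption: a generic 3x3 high-pass kernel; the actual filter is the one in Eq.3.1.
HPF_KERNEL = np.array([[-1.0, -1.0, -1.0],
                       [-1.0,  8.0, -1.0],
                       [-1.0, -1.0, -1.0]])


def high_pass(frame, kernel=HPF_KERNEL):
    """Return the absolute high-pass response of a 2-D luma frame."""
    frame = np.asarray(frame, dtype=np.float64)
    h, w = frame.shape
    # Replicate the border so flat regions give a zero response everywhere.
    padded = np.pad(frame, 1, mode="edge")
    out = np.zeros_like(frame)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return np.abs(out)


def edge_count(frame, threshold=None):
    """Count the pixels whose gradient exceeds the threshold (the EC value)."""
    gradient = high_pass(frame)
    if threshold is None:
        # Assumption: the mean gradient stands in for ThEC (Eq.3.2).
        threshold = gradient.mean()
    return int(np.count_nonzero(gradient > threshold))
```

With the real kernel and the real ThEC formula substituted, the same three steps would apply unchanged.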
3.3 Analyze Visual Quality Degradation with Spatiotemporal Condition
The H.264/MPEG-4 AVC [4] encoder in the JM9.2 [23] software provides a function named rate control. If we turn on this function, we can set a fixed target bit-rate, and the encoder varies Qp to meet it. This is a good approach for many portable devices and storage mechanisms. However, a varying Qp makes the frame quality vary with it. In other words, with rate control enabled, the frame quality cannot accurately reflect the behavior of the frame content itself. To solve this problem, we disable rate control and sweep Qp from 2 to 42. At each Qp, we obtain the quality and the encoded bit-rate of each frame.
We then use these two quantities to draw the RD (rate-distortion) curve shown in Fig.3.4; every frame has its own RD curve. We must fit a curve to obtain the quality at a constant bit-rate for each of the four subsample ratios. The sample data, however, are not distributed linearly; their distribution resembles a logarithmic scale. We therefore perform the curve fitting on a logarithmic scale, as in Fig.3.5. In our simulation, we keep a constant bit-rate of 128k bits/sec and calculate the quality degradation difference between full search and the other subsample ratios using Eq.3.4. This gives the quality degradation curves of a video sequence, as in Fig.3.7 and Fig.3.10.
$\Delta Q_{i\text{th frame}} = \mathrm{PSNRY}_{\mathrm{FSME}} - \mathrm{PSNRY}_{\mathrm{SSR}}$ (Eq.3.4)
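The logarithmic curve fitting and the ΔQ calculation can be sketched as below. This is a minimal sketch, assuming the RD samples of one frame are modeled as PSNR = a·ln(bit-rate) + b and fitted by least squares; the exact fitting model is not given in the text, and the function names `fit_log_rd`, `psnr_at` and `delta_q` are hypothetical.

```python
import numpy as np


def fit_log_rd(bitrates, psnrs):
    """Least-squares fit of PSNR = a * ln(bitrate) + b; returns (a, b)."""
    log_rate = np.log(np.asarray(bitrates, dtype=np.float64))
    a, b = np.polyfit(log_rate, np.asarray(psnrs, dtype=np.float64), 1)
    return float(a), float(b)


def psnr_at(bitrate, coeffs):
    """Evaluate the fitted curve at one bit-rate."""
    a, b = coeffs
    return a * np.log(bitrate) + b


def delta_q(rd_fsme, rd_ssr, target_bitrate=128_000.0):
    """Eq.3.4 at a constant bit-rate: PSNRY of full search minus PSNRY of a subsample ratio.

    rd_fsme and rd_ssr are (bitrates, psnrs) pairs collected by sweeping Qp.
    """
    q_fsme = psnr_at(target_bitrate, fit_log_rd(*rd_fsme))
    q_ssr = psnr_at(target_bitrate, fit_log_rd(*rd_ssr))
    return q_fsme - q_ssr
```

Fitting each subsample ratio separately and then evaluating all the fits at the same 128k bits/sec is what allows qualities measured at different Qp values to be compared.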
We take the video sequences “Table” and “Foreman” as examples. To analyze in detail how visual quality degrades with different subsample ratios, both sequences are simulated in the H.264/MPEG-4 AVC [4] coder with JM9.2 [23]. Here, one group of pictures (GOP) is defined as fifteen frames, the sequence type is IPPP…, the frame rate is 30 frames/sec, and the bit-rate is 128k bits/sec. The subsample ratios are 16:8, 16:4 and 16:2, and can be generated from Eq.2.3. We analyze the “Table” sequence first. Fig.3.7 shows the quality degradation for these subsample ratios.
Fig.3.8 shows the ZMBC value of every frame in the “Table” sequence, and Fig.3.9 shows the EC value. Fig.3.8 shows strong temporal variance between the 20th and the 105th frames; hence, higher subsample ratios cause noticeably higher quality degradation there. Furthermore, the 132nd frame has the maximum quality degradation because of a scene change (Fig.3.6). The temporal index can also handle this case, because ZMBC is extremely small at a scene change. Based on our observation, in most CIF clips ZMBC is smaller than 10 when a scene change occurs. Fig.3.2 (a) and (b) show the ZMBC of the “Table” clip and of the “Foreman” clip. We therefore declare a scene change whenever ZMBC is smaller than 10.

Among the D1 clips, “Football” is a very fast-motion sequence whose motion is irregular (Fig.3.3). Its motion is sometimes fast and sometimes slow, so its ZMBC value changes sharply, and we treat this phenomenon as a kind of scene change as well. The D1 resolution (720×480) is larger than CIF (352×288) and has 1350 MBs per frame, so we choose a scene change threshold of 100 for all D1 clips. Accordingly, we apply a low subsample ratio to encode such a frame.

Fig.3.9 shows the spatial variance increasing gradually between the 60th and the 105th frames, where the degradation caused by the higher subsample ratios also grows steadily. For the “Foreman” test sequence (Fig.3.10), Fig.3.11 shows strong temporal variance between the 170th and the 195th frames and between the 225th and the 255th frames; hence, higher subsample ratios cause noticeably higher quality degradation. Fig.3.12 shows strong spatial variance between the 240th and the 300th frames. Hence, higher subsample ratios again result in higher quality degradation, and we choose a low subsample ratio to encode these frames.
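The scene change rule above can be written down compactly. The sketch below assumes 16×16 macroblocks and a strict "smaller than" comparison for both resolutions (the text states it explicitly only for CIF); the names `macroblock_count`, `SCENE_CHANGE_THRESHOLD` and `is_scene_change` are hypothetical.

```python
def macroblock_count(width, height, mb_size=16):
    """Number of whole macroblocks in one frame (16x16 MBs assumed)."""
    return (width // mb_size) * (height // mb_size)


# Thresholds quoted in the text, keyed by (width, height).
SCENE_CHANGE_THRESHOLD = {
    (352, 288): 10,   # CIF
    (720, 480): 100,  # D1
}


def is_scene_change(zmbc, width, height):
    """Flag a frame as a scene change when its ZMBC is below the threshold."""
    return zmbc < SCENE_CHANGE_THRESHOLD[(width, height)]
```

For example, `macroblock_count(720, 480)` gives the 1350 MBs quoted for D1, against 396 for CIF, which is why one absolute ZMBC threshold cannot serve both resolutions.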
The simulations above were carried out to observe the relation between quality degradation and the spatiotemporal condition. To simulate the adaptive algorithm, we need thresholds on which to base its decisions. The next section presents a threshold decision for variable subsample ratios.
Fig.3.2 (a) The ZMBC of “Table” (CIF) clip (b) The ZMBC of “Foreman” (CIF) clip
Fig.3.3 The ZMBC of “Football” (D1) clip
Fig.3.4 The RD curves of the four subsample ratios at the 132nd frame of the “Table” sequence
Fig.3.5 Logarithm scale curve fitting of the four RD curves
Fig.3.6 The scene change occurrence (a) the 131st frame of “Table” sequence (b) the 132nd frame of “Table” sequence
Fig.3.7 The diagram of ΔQ with 16:8, 16:4 and 16:2 subsample ratios for the “Table” sequence with rate control disabled
Fig.3.8 The ZMBC of the “Table” sequence
Fig.3.9 The EC of the “Table” sequence
Fig.3.10 The diagram of ΔQ with 16:8, 16:4 and 16:2 subsample ratios for the “Foreman” sequence with rate control disabled
Fig.3.11 The ZMBC of the “Foreman” sequence
Fig.3.12 The EC of the “Foreman” sequence