Introduction - Design of Correction Priority for Distributed Video Coding Based on Decoder-Side Image Error Estimation

Distributed video coding (DVC) is a new video coding paradigm that allows a more flexible distribution of coding complexity between the encoder and the decoder.

Traditional video codecs, such as H.264, are designed for situations where video is encoded once and decoded many times. Their coding efficiency is mainly determined by the computational power of the encoder. However, many new applications, such as sensor networks and security camera systems, have encoders (sensors or cameras) with less computational power than the decoder (the central receiver of the recorded videos). These applications need a reversed paradigm, and distributed video coding suits this situation.

In traditional closed-loop video coding systems, motion estimation at the encoder side is used to remove the temporal correlation in the video data, and the correlation information is transmitted to the decoder as motion vectors. The main idea of DVC is to estimate and reconstruct the inter-frame correlation at the decoder side with little help from the encoder; the computational burden of motion estimation is therefore shifted to the decoder [1]. In a typical DVC system, the video source is divided into two interleaved subsequences: key frames and Wyner-Ziv (W-Z) frames. Key frames are encoded using a traditional encoder (such as a Motion JPEG encoder or any video encoder).

For W-Z frames, the encoder takes the original video frame as input and applies a low-complexity algorithm to generate some data (referred to as W-Z bits) that help the decoder correct errors in its own estimate of the target frames (referred to as the side information). On the decoder side, the side information (SI) is generated first, using any reconstruction technique based on the temporal correlation of neighboring key frames. A W-Z decoder then uses the W-Z bits from the encoder to correct potential errors in the SI, so that the resulting W-Z frames are close to the original frames at the encoder side. The key components of a DVC framework are the W-Z bits generator and the SI generator.
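The frame split and the role of the side information can be illustrated with a minimal sketch. It assumes a group size of 2 (every other frame is a key frame) and uses a per-pixel average of the two neighboring key frames as the SI; both choices are assumptions made for illustration. Practical systems typically use motion-compensated interpolation instead, and the function names below are hypothetical.

```python
def split_sequence(frames, gop=2):
    """Split frames into key frames (every gop-th frame) and W-Z frames (the rest).

    Returns two dicts that map the original frame index to the frame.
    The group size `gop` is an assumption of this sketch.
    """
    key = {i: f for i, f in enumerate(frames) if i % gop == 0}
    wz = {i: f for i, f in enumerate(frames) if i % gop != 0}
    return key, wz


def average_side_information(prev_key, next_key):
    """Simplest stand-in for SI generation: per-pixel average of two key frames."""
    return [(a + b) / 2 for a, b in zip(prev_key, next_key)]


# Toy "frames" with two pixels each.
frames = [[10, 10], [12, 14], [14, 18], [15, 19], [16, 20]]
key, wz = split_sequence(frames)

# The decoder estimates each W-Z frame from its two neighboring key frames.
si = {
    i: average_side_information(key[i - 1], key[i + 1])
    for i in wz
    if i - 1 in key and i + 1 in key
}

for i in sorted(si):
    print(f"W-Z frame {i}: original={wz[i]}, side information={si[i]}")
```

In this toy sequence the average happens to reproduce the W-Z frames exactly. With real video the SI contains errors, and correcting those errors is the job of the W-Z bits.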

The main source of encoder complexity in the traditional video codec architecture is motion estimation. Distributed video coding calls for a simple, low-power encoder and a powerful decoder, so the encoder can no longer perform motion estimation, and the job of predictive coding must be shifted to the decoder side. Whether this new paradigm can achieve the same coding performance as traditional video codecs is an important research topic. According to results from information theory, the compression efficiency of distributed video coding should match that of traditional video coding techniques. The two most fundamental information-theoretic results related to distributed video coding are the Slepian-Wolf theorem [2] and the Wyner-Ziv theorem [3]. The latter is a lossy version of the former.

Consider encoding two statistically dependent variables, X and Y. According to information theory, fewer bits (H(X, Y) rather than H(X) + H(Y)) are needed to describe the two variables if we encode them jointly. For video coding, two successive frames can be coded more efficiently if we exploit their predictable relationship through motion estimation and then encode only the unpredictable residuals. But if the two variables X and Y are encoded separately, how many bits are required to describe them? For video coding, what bit rate is required if frames are intra coded instead of inter coded?

The Slepian-Wolf theorem tells us that even if the two variables X and Y are encoded separately, a total of only H(X, Y) bits is required as long as they are decoded jointly.
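To make the rate comparison concrete, the entropies can be computed for a small example. The sketch below assumes a hypothetical joint distribution in which X is a fair bit and Y equals X with probability 0.9; this distribution is chosen for illustration only and is not taken from the thesis.

```python
import math


def entropy(probs):
    """Shannon entropy in bits of a list of probabilities (zero entries are skipped)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Hypothetical joint distribution P(X = x, Y = y) of two correlated binary variables.
joint = {(0, 0): 0.45, (0, 1): 0.05, (1, 0): 0.05, (1, 1): 0.45}

# Marginal distributions of X and Y.
p_x = [sum(p for (x, _), p in joint.items() if x == v) for v in (0, 1)]
p_y = [sum(p for (_, y), p in joint.items() if y == v) for v in (0, 1)]

h_x = entropy(p_x)
h_y = entropy(p_y)
h_xy = entropy(list(joint.values()))
h_x_given_y = h_xy - h_y  # chain rule: H(X, Y) = H(Y) + H(X | Y)

print(f"H(X) + H(Y) = {h_x + h_y:.3f} bits (independent encoding and decoding)")
print(f"H(X, Y)     = {h_xy:.3f} bits (joint decoding)")
print(f"H(X | Y)    = {h_x_given_y:.3f} bits (rate for X when Y is known at the decoder)")
```

For this distribution, independent coding costs 2 bits per pair, while the joint entropy is about 1.47 bits. In the video analogy, Y plays the role of the side information available at the decoder and X the frame to be transmitted.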

For video coding, this means that even when frames are intra coded, the same coding efficiency can be reached if the decoder decodes them jointly. So, theoretically, in distributed video coding, if the decoder can jointly decode frames, that is, if the decoder knows the relationship between the frames and uses it to decode them, then only H(X, Y) bits need to be sent to the decoder even though the encoder encodes the frames in intra mode. The theorem tells us that, in theory, the performance of distributed video coding should be as good as that of traditional video coding. To date, however, no technique based on the distributed video coding principle performs close to traditional video coding.

In this thesis, a macroblock rearranging method is proposed to enhance the rate-distortion (R-D) performance of distributed video coding. Existing distributed video coding methods perform particularly poorly at the low bit-rate end of the R-D curve. For QCIF sequences at a frame rate of 15 Hz, the proposed technique improves on an existing method by about 0.5 dB in PSNR when the bit rate is below 200 kbps.

This bit-rate range is reasonable for QCIF sequences. The proposed technique uses cues available at the decoder side to detect which parts of the predicted side information are likely to have high prediction errors, and provides the encoder with this information.

The encoder then rearranges the macroblocks so that the channel-code-based W-Z bits generation process becomes more efficient. As a result, the decoder can request W-Z bits to correct the hard-to-predict macroblocks first. Experimental results show that this technique outperforms current distributed video coding techniques, particularly at the low bit-rate end.
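The rearranging idea can be pictured with a minimal sketch. It assumes that the decoder has already produced one estimated-error score per macroblock (how those scores are obtained is not modeled here) and that the encoder simply reorders macroblocks from highest to lowest score. The function names and the scores are hypothetical, and the sketch is not the thesis's actual algorithm.

```python
def priority_order(error_scores):
    """Return macroblock indices sorted from highest to lowest estimated error.

    Python's sort is stable, so macroblocks with equal scores keep their index order.
    """
    return sorted(range(len(error_scores)), key=lambda i: -error_scores[i])


def rearrange(blocks, order):
    """Encoder side: place the hardest-to-predict macroblocks first."""
    return [blocks[i] for i in order]


def restore(rearranged, order):
    """Decoder side: undo the permutation to recover the original block layout."""
    restored = [None] * len(order)
    for position, index in enumerate(order):
        restored[index] = rearranged[position]
    return restored


# Hypothetical per-macroblock error estimates fed back by the decoder.
scores = [0.2, 3.5, 1.1, 0.0]
blocks = ["mb0", "mb1", "mb2", "mb3"]

order = priority_order(scores)
sent = rearrange(blocks, order)
print(order)  # [1, 2, 0, 3]
print(sent)   # ['mb1', 'mb2', 'mb0', 'mb3']
assert restore(sent, order) == blocks
```

Because both sides know the same ordering, the decoder can map corrected data back to the original macroblock positions after the W-Z bits have been applied.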

This thesis is organized as follows. Chapter 2 presents a literature survey of previous work on distributed video coding. Chapter 3 analyzes current techniques to identify their weaknesses. Chapter 4 describes the proposed block rearranging method and system architecture. Chapter 5 presents the experimental results. Finally, Chapter 6 gives discussions and future work.
