
The video coding standard MPEG-4 Part 10 AVC/H.264 [1] was developed by the Joint Video Team (JVT) of ISO/IEC (MPEG) and ITU-T (VCEG). The standard is based on the traditional hybrid coding scheme but adds several methods to attain high coding efficiency, such as adaptive motion compensation with variable block sizes, multiple reference frames, and intra coding with various spatial prediction directions. These technologies are important to many networked multimedia services such as multipoint video conferencing, distance learning, video on demand, and digital TV. Transmission of compressed video over heterogeneous networks with different transmission bandwidths may require a reduction in bit rate.

To reduce the bit rate, the compressed video bit stream is often converted to a bit stream with a reduced frame rate. Video transcoding provides not only format conversion but also resolution scaling (spatial transcoding), bit-rate conversion (quality transcoding), and frame-rate conversion (temporal transcoding).

Because different networks may have different bandwidths, a gateway or receiver that includes a transcoder to adapt the video bit rate allows video services to be provided across these networks. When the bandwidth of a wireless network is very limited, quality transcoding alone can severely degrade the transcoded video if the frame rate is held constant.

The most straightforward way to implement a transcoder is to cascade a decoder and an encoder. The basic transcoding architecture is shown in Fig. 1-1. In pixel-domain transcoding [2][3], the incoming video bitstream is fully decoded into the pixel domain, and the decoded video frames are then re-encoded at the desired output bit rate. This technique, however, is computationally expensive. DCT-domain transcoding [4] reduces this computational complexity to some degree by decoding the incoming bitstream only to the intermediate discrete cosine transform (DCT) domain and then re-encoding a new bitstream from this DCT-domain information.

[Figure: Front Encoder (original encoder at the transmitter end) -> incoming bitstream -> Transcoder, consisting of a Decoder (incoming bitstream is decoded to either the pixel or DCT domain) and an Encoder (decoded bitstream is re-encoded to form the outgoing bitstream) -> outgoing bitstream -> End Decoder (decoder at the receiving end)]

Fig. 1-1 Basic Transcoding Architecture

In both of the transcoding methods mentioned above, bit rate reduction is achieved primarily by re-encoding the DCT coefficients with coarser quantization. This approach suffers from two problems. First, quantization error accumulates because the re-quantization levels differ from those used in the front encoder. This causes poor video quality, especially for DCT-domain transcoding. Second, re-quantization alone does not reduce the output bit rate significantly.
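The first problem can be illustrated with a small numeric sketch. It assumes a plain uniform quantizer with rounding, which is a simplification and not the actual H.264/AVC quantizer, and the step sizes and coefficient values are hypothetical. It compares quantizing a coefficient once at the coarse step with quantizing it at the front encoder's step and then re-quantizing the reconstructed value.

```python
import math


def quantize(value, step):
    """Uniform quantizer (assumed for illustration): reconstruct value at this step size."""
    return step * math.floor(value / step + 0.5)


def direct_error(coeff, coarse_step):
    """Error when the original coefficient is quantized once at the coarse step."""
    return abs(coeff - quantize(coeff, coarse_step))


def cascaded_error(coeff, front_step, coarse_step):
    """Error when the coefficient is quantized by the front encoder first,
    then re-quantized at the coarse step by the transcoder."""
    front_recon = quantize(coeff, front_step)
    return abs(coeff - quantize(front_recon, coarse_step))


if __name__ == "__main__":
    front_step, coarse_step = 6, 10  # hypothetical step sizes
    for coeff in (26, 34, 44):       # hypothetical DCT coefficient values
        print(coeff,
              direct_error(coeff, coarse_step),
              cascaded_error(coeff, front_step, coarse_step))
```

For the coefficient 26, direct quantization at step 10 gives an error of 4, while the cascade (step 6, then step 10) gives an error of 6. Not every coefficient is affected: 44 yields the same error on both paths.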

Frame-skipping transcoding [5][6] is often used to reduce the output bit rate by skipping some of the incoming frames at regular or dynamic intervals while maintaining acceptable image quality. When incoming frames are dropped for frame-rate conversion, the incoming motion vectors that point to the dropped frames become invalid in the transcoded bitstream. The most straightforward solution is to re-estimate all the invalid motion vectors with a full-search algorithm, using the non-skipped frames as reference frames.
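A minimal sketch of full-search re-estimation is given here, under stated assumptions: frames are plain 2-D lists of luma samples, the matching cost is the sum of absolute differences (SAD), motion vectors have integer-pixel precision, and the names `sad` and `full_search` are illustrative rather than taken from any codec. Here `ref` stands for the last non-skipped frame.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between the block at (bx, by) in cur
    and the block displaced by (dx, dy) in ref."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, size, search_range):
    """Test every displacement within +/- search_range and return
    (dx, dy, cost) for the candidate with the lowest SAD."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates whose reference block leaves the frame.
            if not (0 <= bx + dx and bx + dx + size <= width
                    and 0 <= by + dy and by + dy + size <= height):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


# Synthetic 12x12 reference frame; the current frame shows the same
# content displaced by 2 pixels horizontally and 1 pixel vertically.
ref = [[7 * x + 13 * y for x in range(12)] for y in range(12)]
cur = [[ref[y + 1][x + 2] if y + 1 < 12 and x + 2 < 12 else 0
        for x in range(12)] for y in range(12)]

if __name__ == "__main__":
    print(full_search(cur, ref, bx=4, by=4, size=4, search_range=3))
```

With a search range of R, the search evaluates up to (2R + 1)^2 candidates per block, each costing one SAD over the whole block.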

However, motion estimation is the most computationally expensive stage of the encoding process. To speed up the operation, a video transcoder usually reuses the decoded motion vectors from the incoming bitstream. Incoming motion vectors can be reused through techniques such as bilinear interpolation, forward dominant vector selection (FDVS) [7], activity dominant vector selection (ADVS) [6], and parametric activity dominant vector selection (PADVS) [8].
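As a rough illustration of motion vector reuse, the sketch below composes a motion vector across one skipped frame in two ways: FDVS, as it is commonly described (take the motion vector of the skipped-frame macroblock that overlaps most with the referenced area), and bilinear interpolation (weight the overlapped macroblocks' vectors by overlap area). The text here does not spell out these details, so the sketch rests on assumptions: 16x16 macroblocks only, integer-pixel vectors, and zero motion for macroblocks with no stored vector. The function names and the sample vectors are hypothetical.

```python
MB = 16  # macroblock size in pixels


def overlaps(mb_x, mb_y, mv):
    """Map each skipped-frame macroblock index (col, row) to the area it
    shares with the 16x16 region that the current motion vector points to."""
    left, top = mb_x * MB + mv[0], mb_y * MB + mv[1]
    areas = {}
    for row in {top // MB, (top + MB - 1) // MB}:
        for col in {left // MB, (left + MB - 1) // MB}:
            w = min(left + MB, (col + 1) * MB) - max(left, col * MB)
            h = min(top + MB, (row + 1) * MB) - max(top, row * MB)
            areas[(col, row)] = w * h
    return areas


def fdvs(mb_x, mb_y, mv, skipped_mvs):
    """Add the vector of the macroblock with the largest overlap (the dominant one)."""
    areas = overlaps(mb_x, mb_y, mv)
    dominant = max(areas, key=areas.get)
    dmv = skipped_mvs.get(dominant, (0, 0))  # assumption: zero motion if missing
    return (mv[0] + dmv[0], mv[1] + dmv[1])


def bilinear(mb_x, mb_y, mv, skipped_mvs):
    """Add the overlap-weighted average of the overlapped macroblocks' vectors."""
    areas = overlaps(mb_x, mb_y, mv)
    total = sum(areas.values())  # always MB * MB
    ix = sum(a * skipped_mvs.get(k, (0, 0))[0] for k, a in areas.items()) / total
    iy = sum(a * skipped_mvs.get(k, (0, 0))[1] for k, a in areas.items()) / total
    return (mv[0] + round(ix), mv[1] + round(iy))


# Hypothetical motion vectors of four macroblocks in the skipped frame.
skipped_mvs = {(0, 1): (4, 0), (1, 1): (-2, 1), (0, 2): (0, 0), (1, 2): (8, 8)}

if __name__ == "__main__":
    print(fdvs(1, 1, (-5, 3), skipped_mvs))
    print(bilinear(1, 1, (-5, 3), skipped_mvs))
```

In this example the referenced area overlaps macroblock (1, 1) by 143 of 256 pixels, so FDVS composes (-5, 3) with (-2, 1). ADVS and PADVS use other selection criteria and are not sketched here.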

In this thesis, in addition to applying the original FDVS, ADVS, and PADVS(n) to the 16x16 modes in H.264/AVC, we apply our proposed enhanced FDVS and ADVS methods to all variable block sizes in the H.264/AVC video coding standard.

The remainder of this thesis is organized as follows. Section 2 introduces the background of H.264/AVC video coding, including some important H.264/AVC features and video transcoding technologies, and then discusses related work on block mode decision and motion vector composition. Section 3 presents the system architecture flow chart and the proposed methods. Section 4 shows the experimental results of the proposed methods. Finally, Section 5 concludes the thesis.

Motivation

Motivated by the demands of many multimedia applications and the expectation of good video quality under limited network bandwidth, this thesis builds on the H.264/AVC video coding standard and the frame-rate conversion form of video transcoding. We propose macroblock mode decision methods and motion vector composition methods. The proposed methods select which information to extract from the compressed H.264/AVC video stream and then reuse that information to decide the block mode types and the motion vectors in the retained frames, so that transcoding time is reduced while acceptable video quality is maintained.

In addition, few studies discuss how to decide block modes and motion vectors for frame skipping in H.264/AVC; most existing methods target MPEG-2 or H.263. Moreover, after frame skipping the temporal distance between the remaining frames increases, so a motion vector that referred to the previous frame may become invalid or imprecise, and the original macroblock modes may no longer be suitable. H.264/AVC provides variable block sizes and quarter-pixel motion vector precision. The issues we are interested in are how to change the block modes and motion vectors in order to reduce encoding time and improve the compression ratio.

Chapter 2 Background and Related
