
Chung Hua University

Master's Thesis

Title: Fast Coding Mode Decision for H.264/AVC

A Fast Mode Decision Method for H.264/AVC Coding Using the Spatial-Temporal Prediction Scheme

Department: Master's Program, Department of Computer Science and Information Engineering
Student ID and name: M09202011, Chung-Ping Yu (喻仲平)
Advisor: Dr. Cheng-Chang Lien (連振昌)

July 2005 (Republic of China year 94)


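Abstract

The H.264/AVC compression standard offers highly flexible motion estimation modes, multiple reference frames, intra prediction modes for I-frames, motion estimation refinement, entropy coding, and other tools to obtain the best R-D cost. In particular, for each macroblock, H.264/AVC provides seven motion estimation block sizes (from 4×4 to 16×16) to find the minimum compensation error. However, the full search algorithm used in the reference software JM-9.3 incurs a heavy computational load that makes encoding inefficient. Some methods therefore use the SAD (sum of absolute differences), homogeneous region analysis, or edge detection to decide on a more suitable motion estimation mode. Many of these fast methods, however, spend additional computation on image processing. This thesis therefore proposes using the spatial-temporal correlation between frames to reduce motion estimation time. Furthermore, the concept of drift compensation is applied to avoid prediction drift. The experimental results show that the method saves about 60% of the motion estimation computation, while the average PSNR drops by only about 0.05 dB.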


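Acknowledgments

I am fortunate to have had the opportunity to study at Chung Hua University, to complete my master's degree, and to spend two memorable years of research life there. First, I would like to thank my advisor, Professor Cheng-Chang Lien (連振昌). Over these two years he not only taught me professional knowledge with great care, but also taught me a great deal about attitude toward life, personal conduct, and the spirit of learning. It is thanks to his patient guidance that I was able to complete my graduate studies.

Next, I thank my many companions in the Intelligent Multimedia Laboratory, especially my seniors 勝政, 秋龍, and 振宇 for their great guidance and help with my thesis research; my classmates 禮潭, 嘉宏, 衍毅, 堯弘, 士棻, 志豪, and 學偉 for their mutual support and care; and 志強, 揚凱, 鬱婷, and the many other junior lab members for their help and company. You made these two years colorful and left me with wonderful memories.

Finally, I thank my parents, whose constant reminders and care allowed me to finish my thesis on schedule.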


Contents

Abstract
Acknowledgments
Contents
Chapter 1 Introduction
Chapter 2 H.264/AVC and the JM-9.3 Reference Software
Chapter 3 The Proposed Fast Mode Decision Method
Chapter 4 Experimental Results
Chapter 5 Conclusion
English Appendix


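Chapter 1

Introduction

H.264/AVC, the latest generation of compression technology, offers highly flexible motion estimation modes, multiple reference frames, intra prediction modes for I-frames, motion estimation refinement, entropy coding, and other tools to obtain the best R-D cost. In particular, for each macroblock, H.264/AVC provides seven motion estimation block sizes (from 4×4 to 16×16) to find the minimum compensation error. Compared with earlier compression technologies such as H.263 CHC (Conversational High Compression), H.263 Baseline, and MPEG-4 SP (Simple Profile), it improves the compression rate by 27%, 40%, and 29%, respectively. However, the full search algorithm used in the reference software JM-9.3 incurs a heavy computational load that makes encoding inefficient. This thesis therefore proposes using the spatial-temporal correlation between frames to reduce motion estimation time. Furthermore, the concept of drift compensation is applied to avoid prediction drift. Chapter 2 describes the architecture of H.264/AVC and the JM-9.3 reference software, Chapter 3 presents the proposed fast search algorithm, Chapter 4 gives the experimental results, and Chapter 5 concludes.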


Chapter 2

H.264/AVC and the JM-9.3 Reference Software

The H.264/AVC compression standard was developed jointly by MPEG and ITU-T. Its main goals are to improve compression efficiency, network friendliness, and error robustness. The overall compression framework of H.264/AVC does not differ greatly from that of earlier standards, so this chapter gives an overview of the H.264/AVC architecture and of the important compression procedures in the reference software JM-9.3.


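Chapter 3

The Proposed Fast Mode Decision Method

We first analyze the correlation of the mode decision results between frames. From this observation we derive our fast search mechanism: the modes already decided in the past are used to estimate statistically which modes are likely to occur again, and unnecessary modes are skipped so that no additional search is performed for them. To do this, we must first know approximately where the current macroblock was located in the previous frame, which is obtained by motion vector prediction. We then compute the mode distribution of the 3×3 macroblocks surrounding the predicted macroblock and take the two most frequent modes as the candidate modes for encoding. Together with drift compensation, these are all the modes that may need to be evaluated. Finally, the mode with the minimum cost is the one used to encode the macroblock.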


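Chapter 4

Experimental Results

The experiments use the following settings:

1. Each sequence is encoded for 200 frames.

2. Five reference frames are used.

3. The motion estimation search range is 16 pixels.

4. The Hadamard transform for DC is enabled.

5. Rate-distortion optimization is enabled.

6. The GOP length is 13 frames.

7. Nine QCIF YUV sequences are tested.

The experiments are carried out in three stages. In the first stage, QP is fixed at 28, 32, 36, and 40 to compare efficiency. In the second stage, the rate is varied to examine the compression efficiency at different rates. Finally, the proposed method is compared with two previously proposed fast mode decision algorithms.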


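Chapter 5

Conclusion

The proposed method uses spatial-temporal correlation to reduce motion estimation time. The experimental results show an average saving of about 60% of the motion estimation computation with an average PSNR drop of only about 0.05 dB, without any additional image processing operations. The method is quick to implement within H.264/AVC and greatly improves encoding performance.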

English Appendix

A Fast Mode Decision Method for H.264/AVC Coding Using the Spatial-Temporal Prediction Scheme

Prepared by Chung-Ping Yu

Directed by Dr. Cheng-Chang Lien

Computer Science and Information Engineering

Chung Hua University

Hsin-Chu, Taiwan, R.O.C.

July, 2005


ABSTRACT

The H.264/AVC coding standard provides great flexibility through its motion estimation modes, multiple reference frames, intra prediction modes for I-frames, motion estimation refinements, entropy coding, and other tools in order to obtain the optimum R-D cost. In particular, seven motion estimation modes, with block sizes from 4×4 to 16×16, are used to find the minimum motion compensation error for each macroblock. However, the high computation cost of the full search method in the reference software JM-9.3 makes the encoding process inefficient. Methods based on the SAD (sum of absolute differences), homogeneous region analysis, and edge detection have therefore been developed to determine the optimum motion estimation mode, but the additional image processing they require reduces the efficiency of the motion compensation process. In this thesis, the spatial-temporal correlations between the current frame and the reference frame are analyzed to develop a fast mode decision method that needs no extra image processing. Furthermore, the concept of drift compensation is adopted to avoid error accumulation during the mode decision process. The experimental results show that the computation cost is reduced by more than 60% while the average PSNR drops by only about 0.05 dB.

Keywords: H.264, JM-9.3, mode decision, R-D Cost


CONTENTS

Abstract
Contents
List of Tables
List of Figures
Chapter 1 Introduction
Chapter 2 The H.264/AVC and JM-9.3 Reference Software
    2.1 The H.264/AVC Baseline Profile
    2.2 The JM-9.3 Encoder Structure
    2.3 H.264/AVC Performance Analysis
Chapter 3 The Fast Mode Decision Method
    3.1 The Spatial-Temporal Mode Correlation
    3.2 The Fast Mode Decision Method Using Spatial-Temporal Mode Correlation
Chapter 4 Experimental Results
    4.1 The Efficiency and PSNR Analyses for Fixed QP Parameters
    4.2 The Rate-Distortion Analyses
Chapter 5 Conclusion
Reference


LIST OF TABLES

Table 2.1 The related parameters in the H.264/AVC for the testing process.
Table 2.2 The efficiency analysis using the Akiyo test sequence.
Table 2.3 The efficiency analysis using the Container test sequence.
Table 2.4 The efficiency analysis using the Hall_Monitor test sequence.
Table 2.5 The efficiency analysis using the Mother_and_daughter test sequence.
Table 2.6 The efficiency analysis using the News test sequence.
Table 2.7 The efficiency analysis using the Salesman test sequence.
Table 2.8 The efficiency analysis using the Carphone test sequence.
Table 2.9 The efficiency analysis using the Coastgrd test sequence.
Table 2.10 The efficiency analysis using the Foreman test sequence.
Table 2.11 The color definition.
Table 3.1 The probabilities that the mode of a macroblock on P frame #2 is similar to the modes of the neighboring macroblocks centered at the same position on P frame #1.
Table 3.2 The probability that the estimation mode of a macroblock is similar to the top one.
Table 3.3 The probability that the estimation mode of a macroblock is similar to the top two.
Table 3.4 The number of the available motion vectors in Fig. 3.2.
Table 3.5 Five categories for block modes.
Table 4.1 The computation times (secs) of the motion estimation process for nine video sequences using the JM-9.3 reference software.
Table 4.2 The computation times (secs) of the motion estimation process for nine video sequences using our proposed method.
Table 4.3 The speed up analysis.
Table 4.4 The PSNR (dB) of the motion estimation process for nine video sequences using the JM-9.3 reference software.
Table 4.5 The PSNR (dB) of the motion estimation process for nine video sequences using our proposed method.
Table 4.6 The PSNR (dB) analysis.
Table 4.7 The bit-rate (kbits) of the motion estimation process for nine video sequences using the JM-9.3 reference software.
Table 4.8 The bit-rate (kbits) of the motion estimation process for nine video sequences using our proposed method.
Table 4.9 The bit-rate analysis.
Table 4.10 The efficiency and PSNR comparisons for the proposed method and Wu's method.
Table 4.11 The efficiency and PSNR comparisons for the proposed method and Jing's method.


LIST OF FIGURES

Fig. 1.1 Macroblock and sub-macroblock partitions in H.264.
Fig. 2.1 The structure of H.264 encoder.
Fig. 2.2 Macroblock and sub-macroblock partitions in H.264 [1].
Fig. 2.3 The flowchart of the main part of JM-9.3 encoder.
Fig. 2.4 The flowchart of function “encode_one_macroblock”.
Fig. 2.5 The video sequences of bird flying (a) and moving bus (b) are used to illustrate that the best reference frame is not the previous one.
Fig. 2.6 The illustration of mode decision with the encoding parameters defined in Table 2.1 for Akiyo sequence.
Fig. 2.7 The illustration of mode decision with the encoding parameters defined in Table 2.1 for Coastgrd sequence.
Fig. 2.8 The illustration of mode decision with the encoding parameters defined in Table 2.1 for Foreman sequence.
Fig. 3.1 The mode correlation for a macroblock between adjacent frames.
Fig. 3.2 Distribution of motion vectors for a macroblock on P-frame #2.
Fig. 3.3 The prediction motion vector corresponding to macroblock.
Fig. 3.4 The block diagram of fast mode decision method.
Fig. 4.1 Rate-distortion curves for Akiyo video sequence obtained by JM-9.3, our proposed method, and JM-9.3 with only macroblock.
Fig. 4.2 Rate-distortion curves for Carphone video sequence obtained by JM-9.3, our proposed method, and JM-9.3 with only macroblock.
Fig. 4.3 Rate-distortion curves for Coastgrd video sequence obtained by JM-9.3, our proposed method, and JM-9.3 with only macroblock.
Fig. 4.4 Rate-distortion curves for Container video sequence obtained by JM-9.3, our proposed method, and JM-9.3 with only macroblock.
Fig. 4.5 Rate-distortion curves for Foreman video sequence obtained by JM-9.3, our proposed method, and JM-9.3 with only macroblock.
Fig. 4.6 Rate-distortion curves for Hall_Monitor video sequence obtained by JM-9.3, our proposed method, and JM-9.3 with only macroblock.
Fig. 4.7 Rate-distortion curves for Mother_and_daughter video sequence obtained by JM-9.3, our proposed method, and JM-9.3 with only macroblock.
Fig. 4.8 Rate-distortion curves for News video sequence obtained by JM-9.3, our proposed method, and JM-9.3 with only macroblock.
Fig. 4.9 Rate-distortion curves for Salesman video sequence obtained by JM-9.3, our proposed method, and JM-9.3 with only macroblock.
Fig. 4.10 The computation time of the motion estimation process for Akiyo (left) and Carphone (right) video sequences using the JM-9.3 reference software, our proposed method, and JM-9.3 with only macroblock.
Fig. 4.11 The computation time of the motion estimation process for Coastgrd (left) and Container (right) video sequences using the JM-9.3 reference software, our proposed method, and JM-9.3 with only macroblock.
Fig. 4.12 The computation time of the motion estimation process for Foreman (left) and Hall_Monitor (right) video sequences using the JM-9.3 reference software, our proposed method, and JM-9.3 with only macroblock.
Fig. 4.13 The computation time of the motion estimation process for Mother_and_daughter (left) and News (right) video sequences using the JM-9.3 reference software, our proposed method, and JM-9.3 with only macroblock.
Fig. 4.14 The computation time of the motion estimation process for Salesman video sequence using the JM-9.3 reference software, our proposed method, and JM-9.3 with only macroblock.


CHAPTER 1

Introduction

Recently, the new video coding standard H.264/AVC [1] was proposed by the Joint Video Team (JVT) [1] to develop new low bit-rate video compression technology. Compared with conventional coding standards, e.g., H.263 CHC (Conversational High Compression) [3], H.263 Baseline [3], and MPEG-4 SP (Simple Profile) [4], the compression rate may be improved by about 27%, 40%, and 29%, respectively [8]. The H.264/AVC coding standard [1] provides great flexibility through its motion estimation modes, entropy coding, multiple reference frames, intra prediction modes for I-frames, motion estimation refinements, and other tools in order to obtain the optimum R-D cost. In particular, seven motion estimation modes with block sizes from 4×4 to 16×16 are used to find the minimum motion compensated error for each macroblock. In the current JVT reference software [9], the seven modes (the various block sizes) are applied in the motion compensation process such that the R-D cost defined in (1) is minimized.

$$J_{Mode} = D + \lambda_{Mode} \times R \qquad (1)$$

where $D$ denotes the motion compensated error of a macroblock, $\lambda_{Mode}$ is the Lagrange multiplier, and $R$ represents the required bit-rate. To obtain the optimum motion estimation mode, the R-D cost must be calculated for every motion estimation mode, which involves time-consuming processes such as motion estimation, DCT transformation, and quantization. Because all seven motion estimation modes from 4×4 to 16×16 (see Fig. 1.1) are evaluated for each macroblock, the high computation cost of the full search method used in the reference software JM-9.3 [9] makes the encoding process inefficient.
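As a minimal sketch of how Eq. (1) drives mode selection, the following Python fragment computes the Lagrangian cost for a few candidate modes and keeps the cheapest one. The function names and the (distortion, bits) values are hypothetical and chosen only for illustration; they are not taken from JM-9.3.

```python
from typing import Dict, Tuple


def rd_cost(distortion: float, rate_bits: float, lagrange: float) -> float:
    """Lagrangian cost J = D + lambda * R, following Eq. (1)."""
    return distortion + lagrange * rate_bits


def best_mode(candidates: Dict[str, Tuple[float, float]], lagrange: float) -> str:
    """Return the mode whose (D, R) pair gives the smallest cost J."""
    return min(candidates, key=lambda mode: rd_cost(*candidates[mode], lagrange))


# Hypothetical (distortion, bits) measurements for one macroblock.
measurements = {
    "16x16": (1200.0, 40.0),
    "16x8": (1100.0, 50.0),
    "8x8": (900.0, 140.0),
}

# A larger multiplier penalizes bits more strongly, so larger partitions win.
chosen_high_lambda = best_mode(measurements, lagrange=5.0)
chosen_low_lambda = best_mode(measurements, lagrange=1.0)
```

In a full search every one of the seven modes would need its own (D, R) measurement, which is why pruning the candidate set saves time.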

[Figure: macroblock partitions 16×16, 16×8, 8×16, 8×8; sub-macroblock partitions 8×8, 8×4, 4×8, 4×4]

Fig. 1.1 Macroblock and Sub-macroblock partitions in H.264

Recently, many studies [10-13] [18-19] on fast mode decision methods have been proposed. In [10], the SAD (sum of absolute differences) between the current frame and the previous frame is computed for each macroblock to evaluate which block sizes are appropriate for the motion estimation process. The decision rule in [10] is briefly as follows. First, the SAD is calculated and compared with a predefined threshold. If the SAD is smaller than the threshold, the motion estimation mode is selected among the 16×16, 16×8, and 8×16 block types. Otherwise, the motion estimation mode is determined by evaluating all seven block types. In [11], edge detection [12] is applied to classify regions as homogeneous or non-homogeneous. If a macroblock belongs to a homogeneous region, the motion estimation modes 4, 5, 6, and 7 (8×8, 8×4, 4×8, and 4×4) are removed. Otherwise, the macroblock is divided into sub-macroblocks of size 8×8 and each sub-macroblock is tested for homogeneity. If an 8×8 sub-macroblock is homogeneous, the motion estimation modes 5, 6, and 7 (8×4, 4×8, and 4×4) are removed. This process continues until all sub-macroblocks have been evaluated. Finally, rate-distortion optimization is used to determine which of the remaining motion estimation modes is best. In [12], Pan et al. applied the edge direction histogram to reduce the computation cost of the intra-prediction mode decision. In [13], Zhu et al. applied a low-resolution image and edge detection to determine the motion estimation modes (8×4, 4×8, and 4×4). However, the extra image processing needed to determine the appropriate motion estimation mode reduces the efficiency of the motion compensation process.
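The SAD-based rule attributed to [10] above can be sketched as follows. This is only an illustration of the rule as summarized in this chapter: the function names, the list-of-rows block representation, and the threshold value are assumptions, not details taken from [10].

```python
LARGE_MODES = ["16x16", "16x8", "8x16"]
ALL_MODES = LARGE_MODES + ["8x8", "8x4", "4x8", "4x4"]


def sad(current, previous):
    """Sum of absolute differences between two equally sized pixel blocks."""
    return sum(
        abs(c - p)
        for row_c, row_p in zip(current, previous)
        for c, p in zip(row_c, row_p)
    )


def candidate_modes(current_mb, previous_mb, threshold):
    """Keep only the large block types when the macroblock changed little."""
    if sad(current_mb, previous_mb) < threshold:
        return LARGE_MODES
    return ALL_MODES


# Hypothetical 16x16 luma blocks: one unchanged, one with a uniform offset.
static_block = [[10] * 16 for _ in range(16)]
changed_block = [[30] * 16 for _ in range(16)]

modes_static = candidate_modes(static_block, static_block, threshold=1000)
modes_changed = candidate_modes(changed_block, static_block, threshold=1000)
```

The cost of this kind of pre-analysis is the extra per-macroblock pixel pass, which is the overhead the thesis aims to avoid.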

In this thesis, the spatial-temporal correlations among the reference frames and neighboring macroblocks are analyzed to develop a fast mode decision method that uses no extra image processing. Furthermore, the concept of drift compensation [17] is adopted to avoid error accumulation during the mode decision process. Compared with the JM-9.3 reference software, the experimental results show that the computation cost is reduced by more than 60% while the average PSNR drops by only about 0.05 dB.

The rest of the thesis is organized as follows. The H.264/AVC structure and the JM-9.3 encoder are reviewed in Chapter 2. The proposed fast mode decision method is described in Chapter 3. Chapter 4 presents the simulation results for nine video sequences, and Chapter 5 concludes the thesis.


CHAPTER 2

The H.264/AVC and JM-9.3 Reference Software

Recently, the new video coding standard entitled “Advanced Video Coding (AVC)” was published jointly as Part 10 of MPEG-4 and as ITU-T Recommendation H.264. The main objectives of the H.264/AVC coding standard are to improve compression efficiency, network friendliness, error robustness, and video representation for interactive and non-interactive applications. The H.264/AVC coding standard provides great flexibility through its motion estimation modes, multiple reference frames, intra prediction modes for I-frames, motion estimation refinements, entropy coding, and other tools in order to obtain the optimum R-D cost. The H.264/AVC coding structure [2] is shown in Fig. 2.1.

Fig. 2.1 The structure of H.264 Encoder


2.1 The H.264/AVC Baseline Profile

Generally, H.264/AVC defines four encoding profiles: baseline, main, extended, and fidelity range extensions (FRExt) [5]. In this thesis, only the baseline profile is considered in developing the fast mode decision method. In the baseline profile, only I- and P-slices are supported. I-slices consist of intra-coded macroblocks, while P-slices consist of intra-coded, inter-coded, or skipped macroblocks. The inter-coded macroblocks in P-slices are encoded by motion compensated prediction from a number of reference frames (previously coded pictures). The residual data are then transformed with the 4×4 integer transform and quantized [2]. The transform coefficients are coded using context-adaptive variable length coding (CAVLC) [6] [7].

In conventional coding standards, the reference frame for a P-frame is fixed as the previous frame. In H.264/AVC, however, multiple reference frames may be used. In addition, the block size for motion compensated prediction is no longer limited to 16×16. In H.264, the block size may vary from 16×16 to 4×4 (Fig. 2.2), for a total of seven block sizes.

[Figure: Macroblock partitions: 1 macroblock partition of 16×16 luma samples and associated chroma samples; 2 macroblock partitions of 16×8 luma samples and associated chroma samples; 2 macroblock partitions of 8×16 luma samples and associated chroma samples; 4 sub-macroblocks of 8×8 luma samples and associated chroma samples. Sub-macroblock partitions: 1 sub-macroblock partition of 8×8 luma samples and associated chroma samples; 2 sub-macroblock partitions of 8×4 luma samples and associated chroma samples; 2 sub-macroblock partitions of 4×8 luma samples and associated chroma samples; 4 sub-macroblock partitions of 4×4 luma samples and associated chroma samples.]

Fig. 2.2 Macroblock and Sub-macroblock partitions in H.264 [1]

Hence, a motion vector is estimated for each macroblock or sub-macroblock partition. If the encoder chooses a large partition size (16×16, 16×8, or 8×16), only a small number of bits is needed to encode the motion vectors. Conversely, if the encoder chooses a small partition size (8×8, 8×4, 4×8, or 4×4), a larger number of bits is required to encode the motion vectors.
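The relation between partition size and motion vector overhead can be made concrete by counting the partitions that tile one 16×16 macroblock. The sketch below assumes, for simplicity, that the whole macroblock uses a single partition size and that each partition carries one motion vector; in the standard each 8×8 sub-macroblock may choose its own sub-partitioning. The function name is hypothetical.

```python
MACROBLOCK_SIZE = 16


def vectors_per_macroblock(width: int, height: int) -> int:
    """Number of width x height partitions needed to tile a 16x16 macroblock.

    Assumes one motion vector per partition and a uniform partition size
    across the whole macroblock (a simplification for illustration).
    """
    return (MACROBLOCK_SIZE // width) * (MACROBLOCK_SIZE // height)


# Partition sizes from Fig. 2.2, largest to smallest.
partition_sizes = [(16, 16), (16, 8), (8, 16), (8, 8), (8, 4), (4, 8), (4, 4)]
vector_counts = {
    f"{w}x{h}": vectors_per_macroblock(w, h) for w, h in partition_sizes
}
```

Under these assumptions a 16×16 partition needs a single vector while a uniform 4×4 split needs sixteen, which is the source of the extra motion vector bits mentioned above.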

In general, each chroma component (Cr and Cb) of a 16×16 macroblock has half the horizontal and vertical resolution of the luminance component. Each chroma component is partitioned in the same way as the luminance component. For example, a 16×16 luminance partition corresponds to an 8×8 chroma partition, and a 16×8 luminance partition corresponds to an 8×4 chroma partition.


2.2 The JM-9.3 Encoder Structure

In this thesis, the H.264/AVC reference software JM-9.3 [9] is used to develop the fast mode decision method. Figure 2.3 shows the flowchart of the main part of the JM-9.3 encoder. In the JM-9.3 reference software, the configuration (encoder_baseline.cfg) is initialized at the start of the “main” function. Then, the frame_picture function is used to define the interlaced or progressive video. The field_picture function is used for interlaced video only. Next, the encode_one_macroblock function is used to encode each macroblock. Within encode_one_macroblock, the PartitionMotionSearch function is used to partition the frame into macroblocks for the BlockMotionSearch function. Finally, the BlockMotionSearch function selects the motion estimation method used for the motion compensation process. The selection logic in BlockMotionSearch for integer-pixel and sub-pixel searching is as follows.

Integer pixel motion search selection:

    If (FastMotionEstimation enable)
        FastIntegerPelBlockMotionSearch()
    Else
    #ifndef _FAST_FULL_ME
        FullPelBlockMotionSearch()
    #else
        FastFullPelBlockMotionSearch()
    #endif

Sub pixel motion search selection:

    If (FastMotionEstimation enable)
        FastSubPelBlockMotionSearch()
    Else
        SubPelBlockMotionSearch()


Fig. 2.3 The flowchart of the main part of JM-9.3 encoder.

The flowchart of the motion estimation and rate-distortion optimization for mode decision in the reference software JM-9.3 is shown in Fig. 2.4.


Fig. 2.4 The flowchart of function “encode_one_macroblock”.

2.3 H.264/AVC Performance analysis

In this section, the performance of the H.264/AVC coding system is illustrated. All experiments are performed on a PC with a P4-2.8G processor. The test sequences are Akiyo, Container, Hall_Monitor, Mother_and_daughter, News, Salesman, Carphone, Coastgrd, and Foreman. The H.264/AVC parameters used for the tests are shown in Table 2.1.


Table 2.1 The related parameters in the H.264/AVC for the testing process.

Type categories   QP   Video      Ref.   Frame No.   Inter mode block type

A                 28   qcif yuv   5      200         All the block type

B                 28   qcif yuv   5      200         16×16

C                 28   qcif yuv   5      200         16×16, 16×8, 8×16

D                 28   qcif yuv   2      200         All the block type

The tests analyze how the number of modes used for motion estimation and the number of reference frames affect the encoding efficiency. Tables 2.2 to 2.10 list the results.

Table 2.2 The efficiency analysis using the Akiyo test sequence.

Type categories Time (sec) PSNR (dB) Bit rate (kbps)

A 471 38.77 63.61

B 190 38.55 66.11

C 304 38.70 63.72

D 257 38.75 63.76

Table 2.3 The efficiency analysis using the Container test sequence.

Type categories Time (sec) PSNR (dB) Bit rate (kbps)

A 423 36.59 84.79

B 205 36.43 88.59

C 322 36.56 85.34

D 323 36.52 86.92

Table 2.4 The efficiency analysis using the Hall_Monitor test sequence.

Type categories Time (sec) PSNR (dB) Bit rate (kbps)

A 400 37.81 99.23

B 201 37.54 107.92

C 315 37.72 103.09

D 310 37.72 99.08


Table 2.5 The efficiency analysis using the Mother_and_daughter test sequence.

Type categories Time (sec) PSNR (dB) Bit rate (kbps)

A 428 37.78 81.40

B 196 37.49 86.40

C 318 37.70 82.10

D 327 37.71 81.41

Table 2.6 The efficiency analysis using the News test sequence.

Type categories Time (sec) PSNR (dB) Bit rate (kbps)

A 429 37.25 125.21

B 209 36.95 136.36

C 312 37.12 128.49

D 293 37.24 125.62

Table 2.7 The efficiency analysis using the Salesman test sequence.

Type categories Time (sec) PSNR (dB) Bit rate (kbps)

A 465 36.07 110.98

B 209 35.81 119.99

C 331 35.98 113.60

D 307 36.07 110.89

Table 2.8 The efficiency analysis using the Carphone test sequence.

Type categories Time (sec) PSNR (dB) Bit rate (kbps)

A 516 37.53 140.69

B 199 37.16 151.76

C 336 37.39 142.61

D 285 37.35 144.29


Table 2.9 The efficiency analysis using the Coastgrd test sequence.

Type categories Time (sec) PSNR (dB) Bit rate (kbps)

A 578 34.32 271.33

B 220 34.13 294.38

C 375 34.24 276.93

D 331 34.27 272.61

Table 2.10 The efficiency analysis using the Foreman test sequence.

Type categories Time (sec) PSNR (dB) Bit rate (kbps)

A 536 36.80 179.53

B 200 36.38 205.45

C 389 36.64 185.47

D 306 36.70 180.15

Tables 2.2 to 2.10 show that using a large number of block types and reference frames makes the encoding process inefficient. For the Akiyo sequence, for example, type A (all block types, five reference frames) takes 471 s, whereas type B (16×16 only) takes 190 s and type D (two reference frames) takes 257 s, with PSNR differences of at most 0.22 dB.

Many methods for improving the efficiency of multiple reference frames have been proposed. In [14], the authors use the SATD (Sum of Absolute Transformed Differences) and motion vector compactness to decide how many reference frames are needed. For example, the video sequences of a flying bird (Fig. 2.5a) and a moving bus (Fig. 2.5b) illustrate that the best reference frame is not always the previous one.

[Figure: panels (a) and (b)]

Fig. 2.5 The video sequences of bird flying (a) and moving bus (b) are used to illustrate that the best reference frame is not the previous one.

Finally, the mode selection results for some typical video sequences in H.264/AVC are shown in Figs. 2.6 to 2.8. The blocks encoded with Inter Mode and Intra Mode are labeled with the colors defined in Table 2.11.

Table 2.11 The color definition.

Block type Color

MB_COPY ■

Inter Mode ■

Intra Mode ■

[Figure panels: I frame, P frame 1 to P frame 11]

Fig. 2.6 The illustration of mode decision with the encoding parameters defined in Table 2.1 for Akiyo sequence.

[Figure panels: I frame, P frame 1 to P frame 11]

Fig. 2.7 The illustration of mode decision with the encoding parameters defined in Table 2.1 for Coastgrd sequence.

[Figure panels: I frame, P frame 1 to P frame 11]

Fig. 2.8 The illustration of mode decision with the encoding parameters defined in Table 2.1 for Foreman sequence.

CHAPTER 3

The Fast Mode Decision Method

The full search over all block sizes (16×16, 16×8, …, 4×4) for determining the estimation mode in the reference software JM-9.3 makes the encoding process inefficient. This chapter describes the proposed fast mode decision method. First, we analyze the mode correlations among spatially and temporally neighboring macroblocks. The fast mode decision method is then constructed on the basis of this spatial-temporal mode correlation.

3.1 The Spatial-Temporal Mode Correlation

Careful observation of the mode decision process in the JM-9.3 reference software shows that the motion estimation mode of a macroblock is highly correlated with the modes of the macroblocks neighboring the same position on the previous reference frames (multiple reference frames). The mode correlation is illustrated in Fig. 3.1.


Fig. 3.1 The mode correlation for a macroblock between adjacent frames.

The probability that the mode of a macroblock on P frame #2 is similar to the modes of the neighboring macroblocks centered at the same position on P frame #1 (numbered 1 to 9 on P frame #1 in Fig. 3.1) is high. Using the JM reference software, the mode correlation was analyzed and is listed in Table 3.1. Here, the estimation modes 16×16, 16×8, 8×16, 8×8, and skip are used to analyze the mode correlation.

Table 3.1 The probabilities that the mode of a macroblock on P frame #2 is similar to the modes of the neighboring macroblocks centered at the same position on P frame #1.

Sequences      QP28     QP32     QP36     QP40
Akiyo          95.11%   96.02%   97.22%   98.26%
Container      93.35%   94.88%   96.40%   97.55%
Hall_Monitor   95.28%   96.62%   97.16%   97.51%
Moth&Daug      91.77%   93.59%   95.11%   96.42%
News           92.90%   93.63%   94.52%   95.73%
Salesman       94.63%   95.10%   96.00%   97.31%
Carphone       88.54%   90.16%   92.32%   94.40%
Coastgrd       86.05%   87.31%   89.17%   92.70%
Foreman        87.28%   88.29%   89.22%   90.90%

In Table 3.1, nine video sequences with a length of 200 frames are used to analyze the mode correlation, and the quantization parameter is set to QP=28, 32, 36, and 40, respectively. For the Foreman video sequence with QP=28, the probability that the estimation mode of a macroblock is close to the estimation modes of the neighboring macroblocks centered at the same position on the previous frame is 87.28%. The results in Table 3.1 show that the mode correlation for a macroblock between adjacent frames is high. Two additional experiments on the mode correlation are given in Tables 3.2 and 3.3, in which the skip and 16×16 modes are regarded as the same class. Tables 3.2 and 3.3 give the probability that the estimation mode of a macroblock is similar to the top one and the top two estimation modes, respectively, among the neighboring macroblocks centered at the same position on the previous frame.

Table 3.2 The probability that the estimation mode of a macroblock is similar to the top one.

Sequences      QP28     QP32     QP36     QP40
Akiyo          90.10%   93.78%   96.76%   98.57%
Container      89.18%   93.38%   96.46%   98.49%
Hall_Monitor   91.58%   92.65%   93.74%   95.60%
Moth&Daug      78.31%   87.71%   93.86%   97.30%
News           81.11%   85.39%   90.04%   93.76%
Salesman       85.52%   87.99%   92.64%   96.69%
Carphone       64.70%   76.14%   85.99%   92.74%
Coastgrd       53.23%   65.31%   78.79%   89.42%
Foreman        54.29%   63.22%   74.78%   84.00%

Table 3.3 The probability that the estimation mode of a macroblock is similar to the top two.

Sequences      QP28     QP32     QP36     QP40
Akiyo          93.48%   95.74%   97.76%   98.94%
Container      93.08%   95.37%   97.30%   98.70%
Hall_Monitor   96.01%   96.48%   96.41%   97.22%
Moth&Daug      86.42%   92.06%   95.95%   97.85%
News           87.86%   90.41%   93.16%   95.51%
Salesman       91.85%   92.35%   95.19%   97.80%
Carphone       78.22%   84.90%   90.51%   94.81%
Coastgrd       73.61%   79.11%   86.36%   93.32%
Foreman        72.11%   77.30%   84.31%   89.66%

From Tables 3.1, 3.2, and 3.3, we found that the motion estimation mode of a macroblock is highly correlated with the motion estimation modes of the macroblocks neighboring the same position on the previous reference frame. Based on this mode correlation analysis, we propose a new fast mode decision method to reduce the computation cost of the mode decision process in the H.264/AVC encoding system.

3.2 The Fast Mode Decision Method using Spatial-Temporal Mode Correlation

For each GOP, the modes of the first P-frame are determined using the full search method in the JM reference software, and the modes determined for each macroblock are used to predict the modes of the next P-frame. The algorithm of the fast mode decision method is described as follows.

Step 1. Tracking of the macroblock

To find the mode histogram for each macroblock, the corresponding position of the macroblock on the previous frame must be tracked. The position of each macroblock on the current P-frame is tracked with the weighted motion vector of the macroblocks neighboring the same position on the previous reference frame, as shown in Fig. 3.2. The weighted motion vector is calculated as:

$$MV(x, y) = \sum_{i=x-1}^{x+1} \sum_{j=y-1}^{y+1} m(i, j) \, / \, T \qquad (2)$$

where MV is the predicted motion vector, x and y denote the block coordinates on the current frame, and m(i, j) is the motion vector of block (i, j) on the previous frame (the blocks numbered 1 to 9 in Fig. 3.2). A more precise prediction of the motion vector may be obtained by weighting each block according to the ratio of its area to that of a 4×4 block. For example, Table 3.4 lists these area ratios for the macroblock shown in Fig. 3.2. The value of T in Eq. (2) is the sum of all the weighting factors.

Fig. 3.2 Distribution of motion vectors for a macroblock on P-frame #2.

Table 3.4 The number of the available motion vectors in Fig. 3.2

No.1  No.2  No.3  No.4  No.5  No.6  No.7  No.8  No.9  T
16    0     4     8     16    0     0     0     0     44
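As a worked example of Eq. (2), the following Python sketch computes the predicted motion vector from the nine neighboring blocks with the weights of Table 3.4. It assumes that each motion vector m(i, j) is multiplied by its weight before the sum is divided by T, which is one reading of the weighting described above. The motion vector values, the function name, and the zero-motion fallback are hypothetical and serve only as an illustration.

```python
from typing import List, Tuple

Vector = Tuple[float, float]


def weighted_motion_vector(mvs: List[Vector], weights: List[int]) -> Vector:
    """Predict a motion vector as the weighted mean of neighboring vectors.

    Assumption: each vector is scaled by its weight (block area in units
    of 4x4 blocks) and the sum is divided by T, the sum of all weights.
    """
    if len(mvs) != len(weights):
        raise ValueError("one weight is required per motion vector")
    total = sum(weights)  # T in Eq. (2)
    if total == 0:
        return (0.0, 0.0)  # no available motion vector: assume zero motion
    mv_x = sum(w * mv[0] for mv, w in zip(mvs, weights)) / total
    mv_y = sum(w * mv[1] for mv, w in zip(mvs, weights)) / total
    return (mv_x, mv_y)


# Weights of blocks No.1 to No.9 from Table 3.4 (T = 44).
TABLE_3_4_WEIGHTS = [16, 0, 4, 8, 16, 0, 0, 0, 0]

# Hypothetical motion vectors (in pixels) for blocks No.1 to No.9.
example_mvs = [(2.0, 0.0), (0.0, 0.0), (4.0, 4.0), (1.0, -1.0), (2.0, 1.0),
               (0.0, 0.0), (0.0, 0.0), (0.0, 0.0), (0.0, 0.0)]

predicted = weighted_motion_vector(example_mvs, TABLE_3_4_WEIGHTS)
print(predicted)
```

Blocks with a weight of 0 have no available motion vector and therefore do not influence the prediction.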

Step 2. Calculation of the mode histogram

Once each macroblock on the current frame has been tracked, we can find its corresponding macroblock, as shown in Fig. 3.3. At the tracked position, the mode histogram is obtained by counting the estimation modes of the neighboring macroblocks centered at the position of the tracked macroblock.


Fig. 3.3 The prediction motion vector corresponding to macroblock.

The mode histogram is calculated by the following steps. All the modes in Fig. 1.1 are classified into the five categories listed in Table 3.5 according to their block size. Based on these categories, we count the number of partitioned blocks in each category, sort the categories by count, and choose the top two categories as the candidate motion estimation modes.

Table 3.5 Five categories for block modes.

Mode Category

1 SKIP / 16×16

2 16×8 / 8×16

3 8×8

4 8×4 / 4×8

5 4×4
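The category ranking of Step 2 can be sketched as follows. The mapping from partition modes to categories follows Table 3.5. The function name, the string labels of the modes, the example input, and the rule of breaking ties in favor of the lower category number are assumptions made for illustration, since the text does not specify them.

```python
from collections import Counter
from typing import Iterable, List

# Category of each partition mode, following Table 3.5.
MODE_CATEGORY = {
    "SKIP": 1, "16x16": 1,
    "16x8": 2, "8x16": 2,
    "8x8": 3,
    "8x4": 4, "4x8": 4,
    "4x4": 5,
}


def candidate_categories(neighbor_modes: Iterable[str], top: int = 2) -> List[int]:
    """Return the `top` most frequent mode categories among the given modes.

    Ties are broken in favor of the lower category number (an assumption;
    the text does not state a tie-breaking rule).
    """
    histogram = Counter(MODE_CATEGORY[mode] for mode in neighbor_modes)
    ranked = sorted(histogram, key=lambda cat: (-histogram[cat], cat))
    return ranked[:top]


# Hypothetical partition modes of the blocks around the tracked position.
modes = ["16x16", "16x16", "8x8", "16x8", "8x8", "8x8", "SKIP", "4x4", "8x16"]
candidates = candidate_categories(modes)
print(candidates)
```

In this example, categories 1 and 3 each contain three blocks, so they are selected as the two candidate categories.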

Step 3. Select the motion estimation mode in the candidate categories.

Step 4. In order to prevent the drift phenomenon, the candidate categories need to be refined when the R-D cost of a macroblock is larger than a predefined threshold. First, we record the R-D cost of each macroblock in the first P-frame obtained from the JM mode decision process (full search). Then, the R-D cost of each macroblock in the successive P-frames is compared with the R-D cost of the corresponding macroblock in the first P-frame. If the R-D cost is greater than a predefined value, the mode decision process is refined by the following rule:

If the 8×8 mode is not chosen, then the 8×8 mode is considered as a candidate mode.

Else if the 16×16 mode is not chosen, then the 16×16 mode is considered as a candidate mode.

Else the 4×4 mode is considered as a candidate mode.
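The refinement rule of Step 4 can be sketched as below. The sketch expresses the candidates as the category numbers of Table 3.5 (category 3 for 8×8, category 1 for 16×16, and category 5 for 4×4). The form of the threshold test, a ratio against the R-D cost recorded in the first P-frame, and the parameter `ratio_threshold` with its default value are assumptions, since the text only states that a predefined value is used.

```python
from typing import List

CAT_16x16, CAT_8x8, CAT_4x4 = 1, 3, 5  # category numbers from Table 3.5


def refine_candidates(candidates: List[int], rd_cost: float,
                      reference_rd_cost: float,
                      ratio_threshold: float = 1.5) -> List[int]:
    """Add one extra candidate category when the R-D cost has drifted.

    `reference_rd_cost` is the cost of the corresponding macroblock in the
    first P-frame (full search). The ratio test and the default value of
    `ratio_threshold` are hypothetical.
    """
    if rd_cost <= ratio_threshold * reference_rd_cost:
        return list(candidates)  # no drift detected: keep the candidates
    if CAT_8x8 not in candidates:
        extra = CAT_8x8
    elif CAT_16x16 not in candidates:
        extra = CAT_16x16
    else:
        extra = CAT_4x4
    return list(candidates) + [extra]


# Hypothetical costs: the current cost is twice the reference cost.
print(refine_candidates([1, 2], rd_cost=200.0, reference_rd_cost=100.0))
```

Because only the top two categories are kept in Step 2, at most one of the three branches adds a category that is already present, so the refined list never contains duplicates.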

Step 5. Once the motion estimation modes in each macroblock are determined, the

partition scheme and corresponding motion vectors are recorded.

The block diagram of the fast mode decision method using the spatial-temporal correlation is illustrated in Fig. 3.4.


Fig. 3.4 The block diagram of fast mode decision method.


CHAPTER 4 Experimental Results

Here, all the efficiency and rate-distortion analyses are conducted on the basis of the JM-9.3 reference software, and the simulation environment is defined as follows:

1. The length of video frames for the simulation is 200;

2. The number of reference frames is 5;

3. The search range for the motion estimation is 16 pixels;

4. The Hadamard transform for encoding DC components is used;

5. The rate-distortion optimization is applied;

6. The length of GOP is 13;

7. Nine video sequences in QCIF format are used as the test videos.

Under these settings, the efficiency and rate-distortion analyses of the proposed system are presented below.

4.1 The Efficiency and PSNR Analyses for Fixed QP Parameters

In this section, the efficiency and rate-distortion analyses for fixed QP parameters are presented. The QP parameter in H.264/AVC is fixed at 28, 32, 36, and 40 in turn. The computation efficiency is measured with Eq. (3) and the change in the transmission bit-rate is measured with Eq. (4). First, the efficiency analyses are given in Tables 4.1 to 4.3. The computation times of the motion estimation process for the nine video sequences using the JM-9.3 reference software are listed in Table 4.1, and the computation times using our proposed method are listed in Table 4.2. The speed-up analysis is given in Table 4.3. The fast mode decision using the spatial-temporal correlation reduces the computation time by about 62.94% on average. In particular, the computation time is reduced by about 67.33% on average for the low-motion video sequences: Akiyo, Container, Hall_Monitor, and Moth&Daug. The PSNR analyses are given in Tables 4.4 to 4.6. The PSNR values for the nine video sequences using the JM-9.3 reference software are listed in Table 4.4, and the PSNR values using our proposed method are listed in Table 4.5. The PSNR differences are given in Table 4.6. With the fast mode decision using the spatial-temporal correlation, the PSNR decreases by only about 0.05 dB. The bit-rate analyses are given in Tables 4.7 to 4.9. The bit-rates for the nine video sequences using the JM-9.3 reference software are listed in Table 4.7, and the bit-rates using our proposed method are listed in Table 4.8. The bit-rate differences are given in Table 4.9. With the fast mode decision using the spatial-temporal correlation, the bit-rate increases by only about 1.32%. In general, the efficiency improvement for the video sequences with high motion (Carphone, Coastgrd, Foreman) is smaller than that for the video sequences with low motion (Akiyo, Container, Hall_Monitor, Moth&Daug).

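$$\Delta T = \frac{\mathrm{Time}[\mathrm{JM}] - \mathrm{Time}[\mathrm{proposed}]}{\mathrm{Time}[\mathrm{JM}]} \times 100\% \qquad (3)$$

$$\Delta R = \frac{\mathrm{BitRate}[\mathrm{proposed}] - \mathrm{BitRate}[\mathrm{JM}]}{\mathrm{BitRate}[\mathrm{proposed}]} \times 100\% \qquad (4)$$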
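As a numerical check of Eq. (3) and Eq. (4), the sketch below evaluates both measures for the Akiyo sequence at QP 28 with the values reported in Tables 4.1, 4.2, 4.7, and 4.8. The bit-rate change is taken relative to the bit-rate of the proposed method, which is the form that reproduces the Akiyo entries of Tables 4.3 and 4.9 at QP 28. The function names are illustrative.

```python
def time_saving(time_jm: float, time_proposed: float) -> float:
    """Eq. (3): percentage of motion estimation time saved relative to JM."""
    return (time_jm - time_proposed) / time_jm * 100.0


def bitrate_increase(bitrate_jm: float, bitrate_proposed: float) -> float:
    """Eq. (4): percentage bit-rate change relative to the proposed method."""
    return (bitrate_proposed - bitrate_jm) / bitrate_proposed * 100.0


# Akiyo, QP 28 (Tables 4.1, 4.2, 4.7, and 4.8).
delta_t = time_saving(287.314, 111.302)
delta_r = bitrate_increase(63.61, 64.49)
print(f"Time saving: {delta_t:.2f}%, bit-rate increase: {delta_r:.2f}%")
# prints "Time saving: 61.26%, bit-rate increase: 1.36%"
```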

Table 4.1 The computation times (secs) of the motion estimation process for nine video sequences using the JM-9.3 reference software.

Group  Sequence      QP 28    QP 32    QP 36    QP 40
A      Akiyo         287.314  312.212  289.464  282.45
A      Container     269.474  266.356  264.932  259.672
A      Hall_Monitor  249.655  250.843  254.847  257.433
A      Moth&Daug     280.982  282.236  275.035  265.734
B      News          267.245  269.825  271.863  270.774
B      Salesman      271.056  282.485  288.397  287.349
C      Carphone      321.541  315.659  310.295  301.799
C      Coastgrd      371.431  372.472  359.324  343.851
C      Foreman       318.449  316.305  315.779  310.682

Table 4.2 The computation times (secs) of the motion estimation process for nine video sequences using our proposed method.

Group  Sequence      QP 28    QP 32    QP 36    QP 40
A      Akiyo         111.302  103.608  101.118  89.628
A      Container     90.484   81.677   73.227   67.697
A      Hall_Monitor  80.144   78.406   79.468   76.155
A      Moth&Daug     121.263  109.707  86.865   75.448
B      News          103.086  101.961  87.118   80.992
B      Salesman      101.996  101.530  96.495   81.251
C      Carphone      188.059  154.296  116.575  92.569
C      Coastgrd      189.512  166.412  140.789  116.430
C      Foreman       179.554  165.550  143.719  121.728


Table 4.3 The speed up analysis.

Group  Sequence      QP 28   QP 32   QP 36   QP 40
A      Akiyo         61.26%  66.81%  65.07%  68.27%
A      Container     66.42%  69.34%  72.36%  73.93%
A      Hall_Monitor  67.90%  68.74%  68.82%  70.42%
A      Moth&Daug     56.84%  61.13%  68.42%  71.61%
B      News          61.43%  62.21%  67.96%  70.09%
B      Salesman      62.37%  64.06%  66.54%  71.72%
C      Carphone      41.51%  51.12%  62.43%  69.33%
C      Coastgrd      48.98%  55.32%  60.82%  66.14%
C      Foreman       43.62%  47.66%  54.49%  60.82%
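The average time savings quoted in the text can be reproduced from the entries of Table 4.3. The following sketch, given only as a check of the arithmetic, computes the mean over all 36 entries and the mean over the four low-motion sequences (Akiyo, Container, Hall_Monitor, and Moth&Daug). The variable names are illustrative.

```python
from statistics import mean

# Time savings (%) from Table 4.3 at QP 28, 32, 36, and 40.
SPEED_UP = {
    "Akiyo":        [61.26, 66.81, 65.07, 68.27],
    "Container":    [66.42, 69.34, 72.36, 73.93],
    "Hall_Monitor": [67.90, 68.74, 68.82, 70.42],
    "Moth&Daug":    [56.84, 61.13, 68.42, 71.61],
    "News":         [61.43, 62.21, 67.96, 70.09],
    "Salesman":     [62.37, 64.06, 66.54, 71.72],
    "Carphone":     [41.51, 51.12, 62.43, 69.33],
    "Coastgrd":     [48.98, 55.32, 60.82, 66.14],
    "Foreman":      [43.62, 47.66, 54.49, 60.82],
}
LOW_MOTION = ["Akiyo", "Container", "Hall_Monitor", "Moth&Daug"]

overall = mean(v for row in SPEED_UP.values() for v in row)
low_motion = mean(v for name in LOW_MOTION for v in SPEED_UP[name])
print(f"overall: {overall:.2f}%, low motion: {low_motion:.2f}%")
# prints "overall: 62.94%, low motion: 67.33%"
```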

Table 4.4 The PSNR (dB) of the motion estimation process for nine video sequences using the JM-9.3 reference software.

Group  Sequence      QP 28  QP 32  QP 36  QP 40
A      Akiyo         38.77  35.91  33.32  30.85
A      Container     36.59  34.05  31.48  29.01
A      Hall_Monitor  37.81  35.03  32.09  29.29
A      Moth&Daug     37.78  35.07  32.72  30.59
B      News          37.25  34.25  31.46  28.75
B      Salesman      36.07  33.15  30.43  28.04
C      Carphone      37.53  34.81  32.36  29.83
C      Coastgrd      34.32  31.41  29.02  26.97
C      Foreman       36.80  34.09  31.48  28.84


Table 4.5 The PSNR (dB) of the motion estimation process for nine video sequences using our proposed method.

Group  Sequence      QP 28  QP 32  QP 36  QP 40
A      Akiyo         38.73  35.83  33.27  30.79
A      Container     36.56  34.00  31.42  28.95
A      Hall_Monitor  37.78  35.04  32.03  29.24
A      Moth&Daug     37.74  35.01  32.65  30.54
B      News          37.20  34.16  31.36  28.68
B      Salesman      36.05  33.11  30.40  28.01
C      Carphone      37.48  34.73  32.25  29.73
C      Coastgrd      34.30  31.38  28.97  26.91
C      Foreman       36.76  34.01  31.37  28.77

Table 4.6 The PSNR (dB) analysis.

Group  Sequence      QP 28  QP 32  QP 36  QP 40
A      Akiyo         -0.04  -0.08  -0.05  -0.06
A      Container     -0.03  -0.05  -0.06  -0.06
A      Hall_Monitor  -0.03   0.01  -0.06  -0.05
A      Moth&Daug     -0.04  -0.06  -0.07  -0.05
B      News          -0.05  -0.09  -0.10  -0.07
B      Salesman      -0.02  -0.04  -0.03  -0.03
C      Carphone      -0.05  -0.08  -0.11  -0.10
C      Coastgrd      -0.02  -0.03  -0.05  -0.06
C      Foreman       -0.04  -0.08  -0.11  -0.07


Table 4.7 The bit-rate (kbits) of the motion estimation process for nine video sequences using the JM-9.3 reference software.

Group  Sequence      QP 28   QP 32   QP 36  QP 40
A      Akiyo         63.61   42.86   29.47  20.66
A      Container     84.79   53.74   35.28  24.13
A      Hall_Monitor  99.23   66.08   43.84  29.28
A      Moth&Daug     81.40   48.74   30.31  19.17
B      News          125.21  82.76   54.76  35.96
B      Salesman      110.98  68.51   41.59  25.30
C      Carphone      140.69  85.41   54.07  34.88
C      Coastgrd      271.33  134.35  69.78  39.51
C      Foreman       179.53  113.69  72.89  47.48

Table 4.8 The bit-rate (kbits) of the motion estimation process for nine video sequences using our proposed method.

Group  Sequence      QP 28   QP 32   QP 36  QP 40
A      Akiyo         64.49   42.81   29.35  20.74
A      Container     85.56   53.76   35.13  24.00
A      Hall_Monitor  102.93  68.15   44.52  29.66
A      Moth&Daug     83.17   48.71   30.26  19.15
B      News          130.20  84.51   55.52  36.31
B      Salesman      114.17  69.90   41.77  25.31
C      Carphone      145.14  86.38   53.88  34.95
C      Coastgrd      280.52  138.28  70.75  39.89
C      Foreman       187.65  117.04  73.78  47.91


Table 4.9 The bit-rate analysis.

Group  Sequence      QP 28  QP 32   QP 36   QP 40
A      Akiyo         1.36%  -0.12%  -0.41%  0.39%
A      Container     0.90%  0.04%   -0.43%  -0.54%
A      Hall_Monitor  3.59%  3.04%   1.53%   1.28%
A      Moth&Daug     2.13%  -0.06%  -0.17%  -0.10%
B      News          3.83%  2.07%   1.37%   0.96%
B      Salesman      2.79%  1.99%   0.43%   0.04%
C      Carphone      3.07%  1.12%   -0.35%  0.20%
C      Coastgrd      3.28%  2.84%   1.37%   0.95%
C      Foreman       4.33%  2.86%   1.21%   0.90%

4.2 The Rate-Distortion Analyses

In this section, the rate-distortion analyses for bit-rates from 100K to 3000K bits/sec are presented. The analyses are performed with the following two settings: (1) all the motion estimation modes in JM-9.3 are used; (2) only the motion estimation modes 16×16, 16×8, and 8×16 are used. As shown in Fig. 4.1 to Fig. 4.9, the PSNR values of the proposed method are close to the optimal values obtained from the JM-9.3 reference software.

[Plot: SNR vs. bit-rate for sequence-akiyo.qcif; x-axis: Bit rate (kbit/s) @ 30.00 Hz; y-axis: SNR Y (dB); curves: JM-9.3, Proposed, JM-9.3 only MB]

Fig. 4.1 Rate-distortion curves for Akiyo video sequence obtained by JM-9.3, our proposed method, and JM-9.3 with only Macroblock.

[Plot: SNR vs. bit-rate for sequence-carphone.qcif; x-axis: Bit rate (kbit/s) @ 30.00 Hz; y-axis: SNR Y (dB); curves: JM-9.3, Proposed, JM-9.3 only MB]

Fig. 4.2 Rate-distortion curves for Carphone video sequence obtained by JM-9.3, our proposed method, and JM-9.3 with only Macroblock.

[Plot: SNR vs. bit-rate for sequence-coastgrd.qcif; x-axis: Bit rate (kbit/s) @ 30.00 Hz; y-axis: SNR Y (dB); curves: JM-9.3, Proposed, JM-9.3 only MB]

Fig. 4.3 Rate-distortion curves for Coastgrd video sequence obtained by JM-9.3, our proposed method, and JM-9.3 with only Macroblock.

[Plot: SNR vs. bit-rate for sequence-container.qcif; x-axis: Bit rate (kbit/s) @ 30.00 Hz; y-axis: SNR Y (dB); curves: JM-9.3, Proposed, JM-9.3 only MB]

Fig. 4.4 Rate-distortion curves for Container video sequence obtained by JM-9.3, our proposed method, and JM-9.3 with only Macroblock.

[Plot: SNR vs. bit-rate for sequence-foreman.qcif; x-axis: Bit rate (kbit/s) @ 30.00 Hz; y-axis: SNR Y (dB); curves: JM-9.3, Proposed, JM-9.3 only MB]

Fig. 4.5 Rate-distortion curves for Foreman video sequence obtained by JM-9.3, our proposed method, and JM-9.3 with only Macroblock.

[Plot: SNR vs. bit-rate for sequence-hall-monitor.qcif; x-axis: Bit rate (kbit/s) @ 30.00 Hz; y-axis: SNR Y (dB); curves: JM-9.3, Proposed, JM-9.3 only MB]

Fig. 4.6 Rate-distortion curves for Hall_monitor video sequence obtained by JM-9.3, our proposed method, and JM-9.3 with only Macroblock.

[Plot: SNR vs. bit-rate for sequence-mother-and-daughter.qcif; x-axis: Bit rate (kbit/s) @ 30.00 Hz; y-axis: SNR Y (dB); curves: JM-9.3, Proposed, JM-9.3 only MB]

Fig. 4.7 Rate-distortion curves for Mother_and_daughter video sequence obtained by JM-9.3, our proposed method, and JM-9.3 with only Macroblock.

[Plot: SNR vs. bit-rate for sequence-news.qcif; x-axis: Bit rate (kbit/s) @ 30.00 Hz; y-axis: SNR Y (dB); curves: JM-9.3, Proposed, JM-9.3 only MB]

Fig. 4.8 Rate-distortion curves for News video sequence obtained by JM-9.3, our proposed method, and JM-9.3 with only Macroblock.

[Plot: SNR vs. bit-rate for sequence-salesman.qcif; x-axis: Bit rate (kbit/s) @ 30.00 Hz; y-axis: SNR Y (dB); curves: JM-9.3, Proposed, JM-9.3 only MB]

Fig. 4.9 Rate-distortion curves for Salesman video sequence obtained by JM-9.3, our proposed method, and JM-9.3 with only Macroblock.

[Plots: ME time vs. bit-rate for sequence-akiyo.qcif (left) and sequence-carphone.qcif (right); x-axis: Bit rate (kbit/s) @ 30.00 Hz; y-axis: ME time (secs); curves: JM-9.3, Proposed, JM-9.3 only MB]

Fig. 4.10 The computation time of the motion estimation process for Akiyo (left) and Carphone (right) video sequences using the JM-9.3 reference software, our proposed method, and JM-9.3 with only Macroblock.
