Predictively Encoded Techniques with Edge-look-ahead
for Lossless Compression of Images

Student: Lih-Jen Kau

Advisor: Yuan-Pei Lin

National Chiao Tung University

Department of Electrical and Control Engineering

Doctoral Dissertation

A Dissertation Submitted to the Department of Electrical and Control Engineering, College of Electrical Engineering, National Chiao Tung University, in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy in Electrical and Control Engineering

September 2008

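Predictively Encoded Techniques with Edge-look-ahead for Lossless Compression of Images

Student: Lih-Jen Kau Advisor: Dr. Yuan-Pei Lin

Doctoral Program, Department (Institute) of Electrical and Control Engineering, National Chiao Tung University

Abstract (translated from Chinese)

Lossless image coding is needed in many applications, such as medical image coding, remote sensing, and image compression. In lossless image coding, removing statistical redundancy effectively remains a major challenge in source coding research, and many lossless image coding methods have therefore been proposed. Some of them use reversible wavelet transforms. The literature shows, however, that the results of transform coding are often not as good as those obtained by predictive coding with context modeling in the spatial domain.

In this dissertation, we give a brief introduction to recent developments in lossless image coding. We also propose a framework based on linear prediction in which the predictor coefficients are adapted by the least-squares method. Because least-squares adaptation of the predictor coefficients has a so-called edge-directed characteristic, it predicts pixels near image edges very well. Adapting the coefficients by least squares over the entire image, however, would lead to very high complexity, so we propose to perform the least-squares adaptation only when the coding process encounters an edge; the computational complexity can thereby be reduced substantially. To detect in advance whether an edge is present during coding, we propose a very simple and effective edge detector that uses only already scanned/coded pixels (causal pixels). The system can thus foresee whether an edge is present and, when one is encountered, adapt the predictor coefficients by least squares beforehand to prevent large prediction errors. The proposed method uses only causal pixels for predictive coding, so no additional information needs to be transmitted. Experiments show that the proposed method strikes a good balance between prediction performance and computational complexity. We also demonstrate its usefulness through extensive experiments and comparisons with state-of-the-art predictors and coders in lossless image compression.

In addition to the lossless predictive coding framework with edge-look-ahead capability described above, we observe that large prediction errors usually occur where the image has edges. We therefore propose in this dissertation a novel concept: using control engineering techniques to improve the prediction of pixels near edges. The idea arises because the purpose of a control system is precisely to make the system output follow the input command accurately, which is the same goal as that of predictive coding. Moreover, an edge in an image (that is, a rapid change in pixel values) can be regarded as a step command in a control system. These observations led us to try control methods for improving predictive coding performance. To realize the idea, we implemented an adaptive predictive coding framework using the fuzzy neural network proposed by Takagi and Sugeno. We also implemented the proportional controller (P controller), widely used in control, within this fuzzy neural network to strengthen the prediction of pixels at edges. We find that this does help the performance of predictive coding; although the improvement is not yet very pronounced, the concept opens up an entirely different way of thinking about and solving the problem in lossless predictive image coding.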

Abstract

Lossless image coding is required by many applications, such as medical imaging, remote sensing, and image archiving. It remains a major challenge for the source coding community because statistical redundancy is difficult to remove both effectively and efficiently. Many approaches have therefore been proposed for lossless compression of images. Some of them are based on reversible transform coding, such as the integer wavelet transform. However, the literature shows that the results obtained with transform coding are typically inferior to those obtained by predictive coding techniques with context modeling in the spatial domain.

In this dissertation, an introduction to recent advances in lossless image coding is given. Moreover, we propose an approach based on linear predictive coding with least-squares (LS) optimization for the adaptation of predictor coefficients. The LS-based adaptive predictor, owing to its edge-directed characteristic, has been shown to be useful for the prediction of pixels around boundaries. Instead of performing LS adaptation in a pixel-by-pixel manner, we adapt the predictor coefficients only when an edge is detected, so that the computational complexity can be significantly reduced. For this, we propose a simple yet effective edge detector that uses only causal pixels. This way, the proposed system can look ahead to determine whether the coding pixel is around an edge and initiate the LS adaptation in advance to prevent the occurrence of a large prediction error. Furthermore, only causal pixels are used for estimating the coding pixels in the proposed encoder; no additional side information needs to be transmitted. As shown by experiments, a good trade-off between the prediction results and the computational complexity can be obtained with the proposed approach. In addition, extensive experiments as well as comparisons with existing state-of-the-art predictors and coders are given to demonstrate the usefulness of the proposed approach.

In addition to the proposed edge-look-ahead approach, we observe that large prediction errors usually occur for pixels around boundaries. Therefore, we also propose in this dissertation a novel concept of using control technologies to improve the prediction result for pixels around boundaries. This idea comes from the fact that the purpose of a control system is to follow the input command as precisely as possible, which is the same objective as that of predictive coding. Moreover, an edge or a boundary can be regarded as a step command in a control system. These observations lead to the idea of solving this problem using control technologies. To realize this idea, we implement an adaptive predictor using a Takagi-Sugeno fuzzy neural network (TS-FNN). Moreover, the proportional controller widely used in control theory is applied implicitly in the consequent part of the network as a compensator to enhance the prediction result around edges. The effectiveness of the proposed approach, though not very conspicuous at present, can be further improved if a more sophisticated compensator is applied; what is more, it offers a quite different way of approaching the problem of lossless compression of images.

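Acknowledgments

Looking back to 1987, when my mother brought me to register at the Department of Control Engineering of National Chiao Tung University, the name of that department has accompanied me on an important journey of my life ever since. The scene is still vivid, yet twenty-one years have passed in the blink of an eye. During these twenty-one years I went through several stages in the department: my undergraduate years, the most important period of my life; my work as a full-time teaching assistant; my master's degree; and the doctoral degree I have just completed. All along, many teachers in the department have looked after and supported me, and I remember every bit of it with gratitude.

During my doctoral studies, I was fortunate that Professor Yuan-Pei Lin accepted me as a student and supervised my dissertation research. Professor Lin is an extremely rigorous scholar; under Professor Lin's supervision I developed a careful research attitude and method. Without Professor Lin's tireless and repeated guidance on experiments and writing, I could never have reached today's goal. During this period I also had to travel between Hualien and Hsinchu because of my job, and I often failed to keep my research on schedule; yet Professor Lin never reproached me and instead showed concern for my family and work, being both a teacher and a friend. This tolerance and support leave me both ashamed and grateful.

During my research, the mutual encouragement and support of the junior members of the laboratory over the years carried me through the low points. I especially thank 建樟, whom I troubled so often. Because the members of the laboratory are too many to list individually, I offer them all my thanks and best wishes here.

In the final stage, the oral defense, Professors 鄧清政, 林進燈, and 吳炳飛 of our department, Professor 陳博現 of the Department of Electrical Engineering at National Tsing Hua University, and Professor 黃文良 of the Institute of Information Science at Academia Sinica kindly agreed to serve on the committee and took the time to guide me and correct the dissertation, making it more complete; I offer them my deepest thanks. The high evaluation they gave after the defense left me humbled. I know well that my efforts have not been enough, and this recognition means that I must work even harder in the future.

I returned to Hualien in 1998 to work at Dahan Institute of Technology (大漢技術學院), where I have remained since. The former and current presidents, Professor 張國照 and Professor 康自立, entrusted me with important duties time and again despite my shortcomings; I keep their support in my heart. Their wisdom and gentlemanly scholarly bearing are models for those of us who come after.

My father-in-law, Mr. 林添龍, and my mother-in-law, Ms. 張春崇, have given their fullest support since my marriage to my wife, Ms. 林純雯, allowing us to build a warm home of our own. They have also had to put up with their two little grandsons' endless demands for toys, which has truly been hard on them.

Although my brother 樹人 and I lost our father when we were young, our mother, Ms. 黃牡丹, shouldered the burden alone and raised the two of us to adulthood, giving us a life free of want and worry and a higher education; her kindness is as high as the mountains and as long as the rivers. … representative of good people and good deeds, and was awarded the Eight Virtues Award (八德獎). As the old saying goes, “A family that accumulates good deeds will have blessings to spare”; I believe this is also why my brother and I have met so many teachers and friends who supported us over the years. My mother, now in her seventies, could not bear to see me busy with administrative and teaching duties at school while under the pressure of pursuing a degree. At an age when she should be enjoying her grandchildren and a life of ease, she still has to coax our two little devils every day, seeing to their baths, meals, and bedtime, and putting up with their tempers. For all of this, as an unfilial son I want to say to you: “Mom, thank you for all your hard work. Mom, I love you.”

Although I had no father as a model while growing up, the guidance of my mother and my elder brother 樹人 was like a guiding lamp whenever I faced major challenges and difficulties. As the saying goes, “An elder brother is like a father”; the eldest brother's burden is always especially heavy. You have been my model since childhood. Thank you.

In these eight years, besides pursuing the doctoral degree, I also got married. Our elder son 高暉 and younger son 高旭 arrived in 2003 and 2006, respectively. My wife 純雯 took on the care of our two boys while I held administrative duties and pursued the degree, leaving me free of worries at home. Often playing both parental roles, she also taught the children that a loving mother has a stern side and is not to be trifled with. Thank you for your hard work, 純雯, and thank you for giving me a complete home.

Although my life as a student has come to a close, the real challenge, research, is only beginning. Facing the open sky ahead, I will carry everyone's good wishes with me as I continue to walk, and fly, along the road of academic research. Finally, I dedicate this dissertation to the teachers and friends who care about me and to my family. Thank you all.

Lih-Jen Kau

September 9, 2008, Hsinchu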

List of Tables

4.1 First-order entropies of uncompensated prediction errors and percentage of pixels that activate the LS adaptation when different context is used for the edge detection. (Run on a P4-1.4GHz machine; without doing error compensation and entropy coding).

4.2 Compression ratio and the running time (in seconds, on a P3-1.06GHz machine) of the constructed coder vary with different prediction order using the proposed approach.

4.3 First-order entropies of prediction errors. (Only the regular mode is used in the proposed algorithm; the run mode is disabled.)

4.4 Percentage of pixels performing LS adaption and the resulting first-order entropy by varying the variance threshold γ₁ in the proposed approach (The image “Lennagrey” is used for the test with γ₂ = 10, θ = 10 for all cases).

4.5 Comparisons with existing lossless image coders (in bits/sample). The fifth column is the execution time of the proposed approach (on a P3-1.06GHz machine).

4.6 Percentage of pixels performing LS adaption and the number of pixels performing Cholesky decomposition and Singular Value Decomposition (SVD). (Only the regular mode is used; the run mode is disabled.)

4.7 Operation counts for edge detector in (2.4).

5.1 Initial parameters for the six Gaussian membership functions in Layer2.

5.2 Initial parameters for matrix A. (i.e., coefficients of the nine sixth-order predictors)

5.3 Initial parameters for matrix B. (i.e., parameters of the nine “P-controller” compensators)

5.4 Learning rates and momentum for network adaptation process.

5.5 The usefulness of the proposed “P-controller” compensator. (i.e., the term B_k u)

5.6 Comparisons on first-order entropy with existing state-of-the-art predictors.

5.7 Comparisons on actual bit rates with existing state-of-the-art lossless image coders. (run on a P4-1.4GHz machine with

List of Figures

1.1 Basic block diagram of a lossless differential encoding system.

2.1 Proposed RALP coding system.

2.2 The ordering of pixels for prediction inputs.

2.3 Area that contains a vertical edge.

2.4 A typical histogram of an area that contains an edge.

2.5 The online training regions for the proposed predictor.

2.6 The image “Lennagrey”.

3.1 Histogram of errors in quantization bins for image “Lennagrey”. (using a sixth-order LS-based predictor with γ₁ = 100, γ₂ = 10)

4.1 (a) The image “Shapes”. (b) Pixels for which (2.4) is satisfied in the image “Shapes”.

4.2 (a) The image “Noisesquare”. (b) Pixels for which (2.4) is satisfied.

4.3 Pixels for which (2.4) is satisfied in the image “Lennagrey” (γ₁ = 100, γ₂ = 10).

4.4 Masks used in “Sobel” operator. (a) The mask G_x for the computation of vertical derivatives. (b) The mask G_y for the computation of horizontal derivatives.

4.5 Pixels detected as around an edge in image “Lennagrey” when the Sobel operator is used. (a) threshold = 50. (b) threshold = 100.

4.6 Pixels detected as around an edge in image “Lennagrey” when the Sobel operator is used. (a) threshold = 150. (b) threshold = 200.

4.7 4-point edge detector for image “Lennagrey”. (a) Pixels detected as around an edge. (b) Pixels for which LS adaptation is used.

… detected as around an edge. (b) Pixels for which LS adaptation is used.

4.10 100-point edge detector for image “Lennagrey”. (a) Pixels detected as around an edge. (b) Pixels for which LS adaptation is used.

4.11 (a) Image of refined errors using the proposed approach for “Lennagrey”. (using a sixth-order LS-based predictor with γ₁ = 100, γ₂ = 10) (b) Histogram of prediction errors for image “Lennagrey”.

4.12 (a) The image “Airplane”. (b) Histogram of prediction errors for image “Airplane”.

4.13 (a) The image of uncompensated errors for “Airplane”. (b)

4.14 (a) The image “Goldhill”. (b) Histogram of prediction errors for image “Goldhill”.

4.15 (a) The image of uncompensated errors for “Goldhill”. (b) Image of refined errors for “Goldhill”.

4.16 (a) The image “Peppers”. (b) Histogram of prediction errors for image “Peppers”.

4.17 (a) The image of uncompensated errors for “Peppers”. (b) Image of refined errors for “Peppers”.

4.18 (a) Pixels for which LS adaption is used in the proposed edge-look-ahead predictor for the image “Lennagrey”. (using a sixth-order predictor with γ₁ = 100, γ₂ = 10) (b) Image of uncompensated prediction errors using the proposed edge-look-ahead approach for “Lennagrey”.

4.19 Histogram of uncompensated prediction errors for the proposed approach and that of a pixel-by-pixel adaptation. (both using a sixth-order predictor)

4.20 The X-ray image “Neck”.

4.21 (a) Pixels for which the LS adaptation is used in the proposed edge-look-ahead predictor for the X-ray image “Neck”. (using a sixth-order predictor) (b) Image of uncompensated prediction error for the X-ray image “Neck”.

4.22 (a) Histogram of uncompensated prediction errors for the pixels in Fig. 4.3. (b) Histogram of uncompensated prediction errors for those pixels in Fig. 4.1(b).

5.1 Proposed TS-FNN based coding system.

5.2 Proposed TS-FNN predictor.

5.4 Histogram of refined errors in quantization bins for image “Lennagrey.”

5.5 The six membership functions in layer2. (a) associated with input variable z₁. (b) associated with input variable z₂.

5.6 The image “Barb.”

5.7 Uncompensated prediction error for image “Barb.” (a) with B matrix absent. (b) with B matrix in presence.

5.8 Histogram of uncompensated prediction error for the image “Barb.” (both with the four-pixel online training area)

5.9 Prediction errors for the image “Lennagrey.” (a) Uncompensated. (b) Refined.

5.10 Histogram of uncompensated and refined prediction errors for

Chapter 1 Introduction

Lossless image coding is required by many applications, such as medical imaging, remote sensing, and image archiving. It remains a major challenge for the source coding community because statistical redundancy is difficult to remove both effectively and efficiently. In this chapter, we give a short description of the differential encoding technique, which is widely used in image and speech encoding systems. A review of recent advances in lossless image coding follows, and then an overview of the proposed predictive coding scheme. Finally, the chapter outline of the dissertation is given.

[Diagram labels: Compression, Predictor, Entropy Coding, Channel, Entropy Decoding, Predictor, Decompression; signals X, X̂, e.]

Figure 1.1: Basic block diagram of a lossless differential encoding system.

1.1 Advances in Lossless Image Coding

There have been great advances in lossless image coding recently [1]-[31]. Some of them are based on the reversible wavelet transform using a lifting structure [6]-[10]. With the integer wavelet transform, lossless to near-lossless compression as well as progressive reconstruction of image data can be achieved [6]-[10]. However, the compression results obtained with the integer wavelet transform are typically inferior to those obtained by predictive coding techniques [17].

The predictive coding scheme, known as differential pulse code modulation (DPCM), is used in a wide variety of applications, such as image and speech compression, for its ease of implementation [1]. Because successive image samples are highly correlated, the differential encoding technique removes the inter-pixel redundancy by encoding the difference between successive image samples rather than the samples themselves. Since the difference between samples is expected to be smaller than the actual sampled amplitudes, fewer bits are required to represent the difference. Thus, the differential encoding removes the inter-pixel redundancies and encodes only

For lossless compression of images, Fig. 1.1 shows the basic block diagram of a differential encoding system (predictive coding system). As can be seen in Fig. 1.1, the lossless predictive coding system is composed of two major blocks: the predictor and the entropy coder. To obtain a lower first-order entropy and hence a lower actual bit rate, much research has been devoted to the design of effective and efficient predictors for removing the statistical redundancy among coding pixels. Among these, adaptive predictors with context modeling are often used to accommodate the varying statistics of coding images [11]-[31]. In most coders, adaptive prediction is achieved by using multi-predictor structures [11]-[22]. For example, the CALIC coding system [14], a state-of-the-art lossless coder proposed for JPEG-LS, uses a gradient adjusted predictor (GAP): based on the gradient of neighboring pixels, one out of a set of seven predictors is chosen. The LOCO-I coder [15], an algorithm motivated by CALIC [14] and standardized into JPEG-LS, uses a median edge detector (MED) to choose one of three predictors for the current prediction. In [16], adaptive prediction is achieved by choosing, from a set of predictors, the one that minimizes the energy of prediction errors in a specified cluster of causal pixels; the coefficients of the selected predictor are then updated by applying the gradient descent rule.
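As an illustration of a multi-predictor structure, the MED rule of LOCO-I mentioned above can be sketched in a few lines of Python. The function name `med_predict` and the argument names `a` (left neighbor), `b` (upper neighbor), and `c` (upper-left neighbor) are chosen here for illustration and do not come from this dissertation.

```python
def med_predict(a: int, b: int, c: int) -> int:
    """Median edge detector (MED) prediction as used in LOCO-I / JPEG-LS.

    a: left neighbor, b: upper neighbor, c: upper-left neighbor
    (argument names are illustrative assumptions).
    """
    if c >= max(a, b):
        return min(a, b)   # first predictor: the smaller of the two neighbors
    if c <= min(a, b):
        return max(a, b)   # second predictor: the larger of the two neighbors
    return a + b - c       # third predictor: planar prediction for smooth areas


# Upper row is bright (200), current row is dark (50): the left neighbor is chosen.
print(med_predict(50, 200, 200))
```

The three return statements correspond to the three predictors among which MED selects; the choice depends only on causal pixels, so a decoder can repeat it without side information.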

In [17]-[20], multi-pass prediction is introduced. With multiple passes, progressive transmission with lossless and near-lossless coding of image data can be achieved. In addition, by using multi-pass prediction the encoder can form a 360 degree prediction [19] or perform a global image analysis [20]. A highly complex two-pass coder called TMW has been proposed in [20]. Using multiple linear predictors and global image analysis, the TMW system can achieve lower bit rates than existing coders for most images. Although TMW achieves very low bit rates, its computational cost is regarded as prohibitive [20]. Recently, a fuzzy logic-based adaptive DPCM algorithm called FMP [21] has been proposed. FMP gives results that are competitive with, and in some cases superior to, those of TMW at a lower computational cost. Though FMP is effective in removing the statistical redundancy, its running time is still measured in minutes.

In the context of optimal predictors, the minimum mean square error estimate of $Y$ given observations $X_1, X_2, \ldots, X_n$ is $E\{Y \mid X_1, X_2, \ldots, X_n\}$, which is generally a nonlinear function. Therefore, there have been many results using neural networks as nonlinear estimators [22]-[24]. Neural network based predictors perform well in slowly varying areas. However, there can be large prediction errors around boundaries [32]. The result can be improved by using additional hidden layers or hidden neurons, but this incurs a drastic increase in complexity [23], [33].

The performance of a predictive image coding scheme depends heavily on the effectiveness of the predictor used in the coding process. Most image predictors perform very well in slowly varying areas. However, large prediction errors can occur around edges and boundaries, and this remains a major problem. Intuitively, the prediction results can be improved if we can foresee the existence of an edge and then predict along the edge orientation. However, the design of a robust edge detector and the analysis of edge orientation are difficult problems in themselves, let alone prediction along the edge orientation. Recently, linear predictors adapted by least-squares (LS) optimization have been proposed as an efficient approach to accommodate the varying statistics of coding images [25]-[31]. Among them, the EDP [26] pointed out that the superiority of LS adaptation lies in its edge-directed property. That is, the LS-based predictor can adjust the prediction support along the edge orientation automatically during the adaptation process. With the edge-directed property, the LS-based adaptive predictor performs very well for pixels around boundaries. In terms of complexity, however, performing the LS adaptation in a pixel-by-pixel manner is regarded as prohibitive. Therefore, the EDP [26] proposed initiating the LS optimization process only when the prediction error exceeds a preselected threshold, so that the computational complexity can be reduced. The EDP [26] has made a noticeable improvement over the state-of-the-art lossless coder CALIC [14]. On the other hand, the normal equations provide the key for LS adaptation, and fast algorithms, Cholesky decomposition for example, can be applied to solve them. Therefore, the complexity of solving the normal equations is not itself a problem. Nevertheless, the computational cost of constructing the normal equations is rather high. Thus, an algorithm for the fast construction of normal equations has been proposed in [25] so that the computational cost of the LS adaptation process can be reduced significantly.
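To make the two cost components concrete, the following is a minimal Python sketch, not the fast algorithm of [25], that constructs the normal equations from a training window and solves them by Cholesky decomposition. The function names, the list-based data layout, and the toy training data are assumptions made for illustration; the sketch also assumes the matrix is positive definite and does not handle the singular case.

```python
import math


def normal_equations(rows, targets):
    """Build (A^T A, A^T y) from training rows of neighbor values."""
    n = len(rows[0])
    ata = [[0.0] * n for _ in range(n)]
    aty = [0.0] * n
    for row, y in zip(rows, targets):
        for i in range(n):
            aty[i] += row[i] * y
            for j in range(n):
                ata[i][j] += row[i] * row[j]
    return ata, aty


def cholesky_solve(m, b):
    """Solve m x = b for a symmetric positive-definite m via m = L L^T."""
    n = len(b)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(m[i][i] - s)
            else:
                low[i][j] = (m[i][j] - s) / low[j][j]
    # Forward substitution: L z = b
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i]
    # Back substitution: L^T x = z
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


# Toy training data: each target is the average of its two inputs.
rows = [[1, 2], [2, 1], [3, 5], [4, 3]]
targets = [1.5, 1.5, 4.0, 3.5]
ata, aty = normal_equations(rows, targets)
print(cholesky_solve(ata, aty))  # both coefficients are close to 0.5
```

In this sketch the construction step visits every training row and every pair of coefficients, while the solve works only on the small order-by-order matrix, which is consistent with the observation that construction dominates the cost.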

1.2 The Proposed Approach

It is known that many coding methods are more efficient with some images than others. In particular, run-length coding is very useful for coding areas with little change, whereas adaptive predictive coding achieves high coding efficiency for fast changing areas such as edges. In this dissertation, we propose a switching coding scheme that combines the advantages of both Run-length and Adaptive Linear Predictive coding (RALP). There are other switching methods that achieve very low bit rates [20]-[23]. However, their results are usually obtained at a very high computational complexity [20]-[23]. In contrast, the proposed RALP coder achieves very good coding efficiency at a moderate computational complexity. In the proposed approach, the run-length encoder is used for pixels in slowly varying areas; otherwise an LS-based adaptive predictor is used. The LS-based predictor has been shown to be very useful for the prediction of pixels around an edge [26], [27]. Moreover, we adapt the predictor coefficients only when an edge is detected or when the prediction error exceeds a preselected threshold, so that the computational cost can be significantly reduced [27]. To do this, we use a simple and efficient edge detector that uses only causal pixels, i.e., pixels that have already been coded. This way, the predictor can look ahead to determine whether the coding pixel is around an edge and initiate the LS adaptation process beforehand to prevent the occurrence of a large prediction error. With the proposed switching structure, very good prediction results can be obtained both in slowly varying areas and for pixels around boundaries. Some preliminary results regarding the proposed LS-based predictor with edge-look-ahead can be found in [27].

narrower histogram and hence a lower first-order entropy. As we will see in the experiments, the switching structure, combined with edge-look-ahead prediction and automatic error modeling, renders the proposed RALP highly adaptable and very feasible under limited resources. A very good trade-off between coding efficiency and computational complexity can be achieved. Comparisons with existing state-of-the-art LS-based predictors can also be found in our experiments.
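The link between a narrower error histogram and a lower first-order entropy can be checked numerically. The sketch below is only an illustration: it uses a previous-pixel predictor (a simplifying assumption, not the predictor proposed in this dissertation), and the names `first_order_entropy` and `previous_pixel_errors` are hypothetical.

```python
import math
from collections import Counter


def first_order_entropy(symbols):
    """First-order entropy, in bits per symbol, of a sequence."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def previous_pixel_errors(row):
    """Toy DPCM: keep the first pixel, then code differences to the previous pixel."""
    return [row[0]] + [row[i] - row[i - 1] for i in range(1, len(row))]


row = [10, 12, 14, 16, 18, 20, 22, 24]     # a smooth ramp with eight distinct values
errors = previous_pixel_errors(row)        # mostly the same small difference

print(first_order_entropy(row))            # entropy of the raw samples
print(first_order_entropy(errors))         # lower entropy of the prediction errors
```

For this ramp the raw samples are all distinct, whereas the errors concentrate on a single value, so the error histogram is much narrower and its entropy is correspondingly lower.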

1.3 Outline of the Dissertation

The rest of the dissertation is organized as follows. Chapter 2 introduces the proposed LS-adaptive predictor with edge-look-ahead. The entropy coding of the prediction error is addressed in Chapter 3. Extensive experiments with the proposed method and comparisons with existing predictors and coders are given in Chapter 4. Chapter 5 investigates the use of control technologies to enhance the prediction result for pixels around boundaries. A conclusion is given in Chapter 6.

Chapter 2 Least-squares Based Adaptive Predictor with Edge-look-ahead

In this chapter, the proposed least-squares (LS) based adaptive predictor is described in detail. First, we give an overview of the proposed predictive coding system. Second, an illustrative example is given to demonstrate the edge-directed property of LS adaptation. So that the proposed system can foresee the existence of an edge, we then introduce the proposed causal edge detector. After that, a detailed description of the LS adaptation process is given. Finally, the so-called bias cancellation technique for prediction error refinement is addressed.

[Block diagram labels: Mode Selection (Regular Mode, Run Mode, switch SW, “Run mode enable?”); Edge Detector; ALP: Adaptive Linear Predictor; Context Modeling; Error Estimate; Error Compensation Mechanism; Run Length Encodings (1. Run Length Counter, 2. Unsuccessful Run Counter); Entropy Coder; Code Stream. Signals: x_n, x_n(1) to x_n(4), x_p, x_cpd, e_p, ε.]

Figure 2.1: Proposed RALP coding system.

2.1 An Overview of the Proposed System

Many coding methods are more efficient with some images than others. In particular, run-length coding is very useful for coding areas with little change, whereas adaptive predictive coding achieves high coding efficiency for fast changing areas such as edges. In this dissertation, we propose a switching coding scheme (as shown in Fig. 2.1) that combines the advantages of both Run-length and Adaptive Linear Predictive coding (RALP). For pixels in slowly varying areas, run-length coding is used; otherwise least-squares (LS) based adaptive predictive coding is used. Instead of performing LS adaptation in a pixel-by-pixel manner, we adapt the predictor coefficients only when an edge is detected, so that the computational complexity can be significantly reduced. For this, we use a simple yet effective edge detector that uses only causal pixels. This way, the proposed system can look ahead to determine whether the coding pixel is around an edge and initiate the LS adaptation in advance to prevent the occurrence of a large prediction error. With the proposed switching structure, very good prediction results can be obtained both in slowly varying areas and for pixels around boundaries. Furthermore, only causal pixels are used for estimating the coding pixels in the proposed encoder; no additional side information needs to be transmitted.

The proposed RALP system, as shown in Fig. 2.1, has two operation modes: run mode and regular mode. The “mode selection” block (Fig. 2.1) determines whether the current pixel is in a local area with little change. If it is, the run mode is triggered and the current pixel is encoded using run-length encoding. If not, the regular mode is assumed and the pixel is encoded using predictive coding.

Run mode

Run-length coding is most efficient for encoding consecutive pixels with identical grey values. Runs of identical pixels usually occur in artificial images or in slowly varying areas of natural images. Therefore, we use run-length coding in the proposed RALP system to encode pixels in areas with little change. If the four pixels $x_n(1), \ldots, x_n(4)$ in Fig. 2.2 are identical, the run mode is switched on and the run length is encoded using an arithmetic coder with the alphabet $\{0, 1, \ldots, 20\}$. The symbol “0”, called the escape symbol in the proposed approach, indicates an unsuccessful run; it is encoded when the grey value of the coding pixel $x_n$ differs from that of $x_n(1), \ldots, x_n(4)$, so that the decoder can make the same decision and quit the run mode automatically. In that case, the regular mode is used to encode the current pixel. The run mode can also be broken by the end of a line, in which case the encoder returns to the regular mode; i.e., the regular mode is assumed for the first pixel of every line. Moreover, encoding an escape symbol incurs a penalty and degrades the coding efficiency. Therefore, we record the number of times the run mode is triggered and the number of unsuccessful runs. If the percentage of unsuccessful runs is greater than a predefined threshold, the run mode is disabled and not used for the rest of the coding process. Note that all the pixels used for mode selection are causal, so the decoder can reproduce the same decisions without any side information.
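The run-mode bookkeeping described above can be sketched as follows. This is an illustrative reading of the text, not the dissertation's implementation: the names `run_mode_triggered`, `run_symbol`, and `RunModeMonitor`, the failure-ratio threshold of 0.5, and the `min_trials` guard are all assumptions introduced here, since the source does not give the threshold value or the exact counting rule.

```python
MAX_RUN = 20  # alphabet {0, 1, ..., 20}; the symbol 0 is the escape symbol


def run_mode_triggered(context):
    """True when the four causal neighbors x_n(1)..x_n(4) are identical."""
    return len(set(context)) == 1


def run_symbol(line, start, value):
    """Count pixels equal to `value` from `start`, stopping at the end of the
    line or at MAX_RUN. A result of 0 is the escape (unsuccessful run)."""
    length = 0
    while (start + length < len(line) and length < MAX_RUN
           and line[start + length] == value):
        length += 1
    return length


class RunModeMonitor:
    """Disable the run mode when unsuccessful runs become too frequent."""

    def __init__(self, max_fail_ratio=0.5, min_trials=8):  # assumed values
        self.triggered = 0
        self.failed = 0
        self.max_fail_ratio = max_fail_ratio
        self.min_trials = min_trials
        self.enabled = True

    def record(self, symbol):
        self.triggered += 1
        if symbol == 0:
            self.failed += 1
        if (self.triggered >= self.min_trials
                and self.failed / self.triggered > self.max_fail_ratio):
            self.enabled = False  # stays disabled for the rest of the coding


line = [5, 5, 5, 7, 7]
if run_mode_triggered([5, 5, 5, 5]):
    print(run_symbol(line, 0, 5))  # three pixels match before the run breaks
```

Because both functions look only at already coded pixels and at the symbol being transmitted, a decoder running the same monitor would reach the same enable or disable decision.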

[Pixel layout: the current pixel x_n and its causal neighbors, labeled x_n(1) through x_n(12).]

Figure 2.2: The ordering of pixels for prediction inputs.

Regular mode

In the regular mode, pixels are encoded using predictive coding. Predictive coding can be very efficient at removing the statistical redundancy between neighboring pixels in slowly varying areas. However, there can be a large prediction error around boundaries. In this dissertation, we use LS optimization to update the predictor coefficients on the fly so that the predictor can adapt itself to the varying statistics [25]-[28]. The LS-based adaptive predictor is known to be an efficient way to improve the prediction result around boundaries because of its edge-directed property [26]-[28]. However, a pixel-by-pixel adaptation of the predictor coefficients is computationally expensive and often unnecessary [25]-[28]. Therefore, we initiate adaptation only when the prediction is inadequate, which is the case around an edge. For this, an “edge detector” is used to look ahead and determine whether the coding pixel is around an edge, so that the predictor can adapt itself beforehand to prevent the occurrence of a large prediction error. In the regular mode, the prediction error is further refined using error compensation. That is, a correction term $e_p$ is added to the predictor output $x_p$ (as shown in Fig. 2.1) to get a compensated prediction $x_{cpd} = x_p + e_p$. The amount of compensation $e_p$ is determined through an error modeling mechanism. The refined error signal $\varepsilon = x_n - x_{cpd}$ can then be entropy encoded using conditional arithmetic coding to produce the coded bit stream.

In the proposed encoder, only causal pixels, i.e., pixels that have already been coded, are used for estimating the coding pixels; no additional side information needs to be transmitted. Moreover, the proposed RALP coder is symmetric, meaning that the decoder has the same predictor switch as the encoder and performs prediction and error compensation just as the encoder does. Therefore, the actual pixel value can be reconstructed in the decoder from the received bit stream of refined errors. Details of the individual components of the system are introduced in subsequent sections.
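The symmetry between encoder and decoder can be illustrated with a toy one-dimensional coder. In this sketch the predictor is simply the previous pixel and the error model is a single running mean of past uncompensated errors; both are simplifying assumptions that stand in for the LS-based predictor and the context-based error modeling of this dissertation, and the names `predict`, `BiasEstimator`, `encode`, and `decode` are hypothetical.

```python
def predict(prev):
    """Toy causal predictor (assumption): the previous pixel."""
    return prev


class BiasEstimator:
    """Toy error model (assumption): running mean of past uncompensated errors."""

    def __init__(self):
        self.total = 0
        self.count = 0

    def estimate(self):
        return round(self.total / self.count) if self.count else 0

    def update(self, err):
        self.total += err
        self.count += 1


def encode(row):
    model, out = BiasEstimator(), [row[0]]   # first pixel is sent as is
    for n in range(1, len(row)):
        x_p = predict(row[n - 1])
        e_p = model.estimate()
        x_cpd = x_p + e_p                    # compensated prediction
        out.append(row[n] - x_cpd)           # refined error
        model.update(row[n] - x_p)           # learn from the uncompensated error
    return out


def decode(stream):
    model, row = BiasEstimator(), [stream[0]]
    for eps in stream[1:]:
        x_p = predict(row[-1])
        e_p = model.estimate()
        x_n = x_p + e_p + eps                # same x_cpd as the encoder, plus the error
        row.append(x_n)
        model.update(x_n - x_p)
    return row


row = [10, 13, 16, 19, 22, 25]
stream = encode(row)
print(stream)
print(decode(stream) == row)
```

The decoder updates its error model with the same values as the encoder because each pixel is reconstructed exactly before it is used, which is why no side information is needed in this sketch.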

[Pixel grid with gray values s and r; the current pixel x_n and the coefficients a1, a2 are marked.]

Figure 2.3: Area that contains a vertical edge.

2.2 Edge-directed Characteristic of LS Adaptation

In predictive coding schemes, the effectiveness of any adaptive predictor depends upon its capability of adapting from slowly varying areas to edge regions [26]. Intuitively, the prediction results can be improved if we can foresee the existence of an edge and then predict along the edge orientation. However, the design of a robust edge detector and the analysis of edge orientation are difficult problems in themselves, let alone prediction along the edge orientation. In contrast, LS-based adaptation provides an elegant way of approximating the optimal orientation-adaptive prediction due to its edge-directed property [26].

To illustrate the edge-directed property of LS adaptation, we give a simple example in which the relationship between the prediction support and the edge orientation can be analyzed quantitatively. Fig. 2.3 shows an example in which the current pixel lies along a sharp vertical edge, i.e., $|r - s| \gg 0$. For simplicity, we only consider the second-order predictor … 12 elements (Fig. 2.3). To find the least-squares solution for the second-order predictor coefficients, i.e., $a_1$ and $a_2$, we start with the following equation, which is constructed from the 12 training pixels in raster scan order.

$$
\begin{bmatrix}
r & r \\ r & r \\ r & s \\ s & s \\ s & s \\ r & r \\ r & r \\ r & s \\ s & s \\ s & s \\ r & r \\ r & r
\end{bmatrix}
\begin{bmatrix} a_1 \\ a_2 \end{bmatrix}
=
\begin{bmatrix}
r \\ r \\ s \\ s \\ s \\ r \\ r \\ s \\ s \\ s \\ r \\ r
\end{bmatrix},
\tag{2.1}
$$

where $a_1$ and $a_2$ are the predictor coefficients to be determined. Obviously, (2.1) is an overdetermined system. To find the LS solution of (2.1), both sides are left-multiplied by the transpose of the coefficient matrix in (2.1), which gives

$$
\begin{bmatrix}
8r^2 + 4s^2 & 6r^2 + 2rs + 4s^2 \\
6r^2 + 2rs + 4s^2 & 6r^2 + 6s^2
\end{bmatrix}
\begin{bmatrix} a_1 \\ a_2 \end{bmatrix}
=
\begin{bmatrix}
6r^2 + 2rs + 4s^2 \\
6r^2 + 6s^2
\end{bmatrix}.
\tag{2.2}
$$

After some simple operations, we find that the optimal LS solution for the predictor coefficients is given by

$$
\begin{bmatrix} a_1 \\ a_2 \end{bmatrix}
=
\begin{bmatrix} 0 \\ 1 \end{bmatrix}.
\tag{2.3}
$$

The edge-directed characteristic of LS adaptation can be best observed from (2.3), where the prediction support has been adjusted along the edge orientation (i.e., the vertical orientation in this case) automatically. For a horizontal or an arbitrarily oriented edge, the edge-directed property can also be verified by using similar mathematical derivations [26]. As we will see later, the LS-based predictor, owing to its edge-directed property, can have a very good performance for pixels around boundaries, and the edge-directed characteristic of LS adaptation will be utilized in the proposed approach.
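The vertical-edge example can be checked numerically. The following is a minimal sketch, not the dissertation's code: it copies the two columns of the coefficient matrix and the right-hand side of (2.1), picks arbitrary gray levels r = 50 and s = 200 (an assumption for illustration), and solves the 2×2 normal equations by Cramer's rule with exact rational arithmetic. The function name `ls_second_order` and the variable names are hypothetical.

```python
from fractions import Fraction


def ls_second_order(col1, col2, target):
    """Least-squares fit of target ~ a1*col1 + a2*col2 via the normal equations."""
    g11 = sum(u * u for u in col1)
    g12 = sum(u * v for u, v in zip(col1, col2))
    g22 = sum(v * v for v in col2)
    b1 = sum(u * t for u, t in zip(col1, target))
    b2 = sum(v * t for v, t in zip(col2, target))
    det = g11 * g22 - g12 * g12
    if det == 0:
        raise ValueError("normal equations are singular")
    # Cramer's rule on the symmetric 2x2 system
    a1 = Fraction(b1 * g22 - g12 * b2, det)
    a2 = Fraction(g11 * b2 - g12 * b1, det)
    return a1, a2


r, s = 50, 200  # arbitrary gray levels on the two sides of the edge (assumption)
target = [r, r, s, s, s, r, r, s, s, s, r, r]  # right-hand side of (2.1)
col1 = [r, r, r, s, s, r, r, r, s, s, r, r]    # first column of the matrix in (2.1)
col2 = [r, r, s, s, s, r, r, s, s, s, r, r]    # second column of the matrix in (2.1)

a1, a2 = ls_second_order(col1, col2, target)
print(a1, a2)
```

Because the second column coincides with the right-hand side, all weight goes to a₂, which agrees with (2.3) for any r ≠ s.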


2.3 Proposed Edge Detector

In the regular mode, we use an LS-based adaptive predictor to accommodate the varying statistics of the image. To save computations, the predictor is adapted only when the prediction error is large or likely to be large. For a pixel around an edge, the prediction error is usually large and adaptation is needed. To determine whether the coding pixel is around an edge, we use a simple yet effective edge detector in this paper. It should be noted that conventional edge detectors, e.g., the “Sobel” operator, cannot be applied here because they use non-causal pixels, i.e., pixels yet to be encoded.


Figure 2.4: A typical histogram of an area that contains an edge.

We observe that an area that contains an edge usually has a large variance. Furthermore, the histogram of such an area tends to have two peaks, one on each side of the mean value (Fig. 2.4). We will use these two observations to detect edges. The four causal pixels κ = {x_n(1), . . . , x_n(4)} in Fig. 2.2 are used for the detection. The mean x̄ and variance σ² of the set κ are calculated. Furthermore, the four pixels are divided into two groups: the pixels with gray levels higher than x̄ form group κ_h and the rest form group κ_l. We also compute the respective variances σ_h² and σ_l² of the pixels in κ_h and κ_l (Fig. 2.4).

A pixel around an edge is likely to have a large σ² but small σ_h² and σ_l². We declare the coding pixel to be around an edge if the following two conditions are both satisfied:

$$
\sigma^2 \ge \gamma_1, \quad \text{and} \quad \sigma^2 \ge \gamma_2\,(\sigma_h^2 + \sigma_l^2).
\tag{2.4}
$$

Note that the second condition in (2.4) is included because a region with uniformly distributed gray values also results in a large σ². Therefore, the switch first examines whether σ² ≥ γ₁; only when a large σ² is detected does the switch check the second inequality in (2.4). In this paper, the LS adaptation process in the regular mode is activated whenever the two conditions in (2.4) are satisfied. We have found through experiments that γ₁ = 100 and γ₂ = 10 work very well, and these values will be used throughout this dissertation.

It should also be noted that the run mode is triggered when σ² = 0, i.e., when x_n(1), . . . , x_n(4) are identical. In this case, we do not have to check the conditions in (2.4). As we will see later in the experiments, the proposed detector is very effective in detecting edges although only four pixels are used. Moreover, since we use only causal pixels for the detection of an edge, the decoder can perform the same edge detection operation and switch on the LS adaptation process.
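The decision rule in (2.4), together with the run-mode trigger, can be sketched as follows. This is an illustrative sketch only: the text does not say whether the variances are normalized by the number of samples, so the population variance (division by the count) is assumed here, and a group with a single pixel is given zero variance. The function names `variance` and `classify` are hypothetical.

```python
GAMMA1 = 100  # variance threshold reported in the text
GAMMA2 = 10   # ratio threshold reported in the text


def variance(values):
    """Population variance (assumption); 0 for an empty group."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def classify(neighbors, gamma1=GAMMA1, gamma2=GAMMA2):
    """Return 'run', 'edge', or 'smooth' for the causal pixels x_n(1..4)."""
    var = variance(neighbors)
    if var == 0:
        return "run"        # all four pixels identical: run mode
    if var < gamma1:
        return "smooth"     # first condition of (2.4) fails
    mean = sum(neighbors) / len(neighbors)
    high = [v for v in neighbors if v > mean]    # group kappa_h
    low = [v for v in neighbors if v <= mean]    # group kappa_l
    if var >= gamma2 * (variance(high) + variance(low)):
        return "edge"       # both conditions of (2.4) hold
    return "smooth"         # large variance but no two-peak structure


print(classify([50, 50, 200, 200]))   # two tight clusters
print(classify([10, 80, 150, 220]))   # evenly spread gray values
```

The second call illustrates why the second inequality is needed: the gray values are spread evenly, so σ² is large, yet the two group variances are also large and the pixel is not flagged.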


Figure 2.5: The online training regions for the proposed predictor.

2.4 Least-squares Based Adaptive Prediction

In the regular mode, the LS adaptation process is activated whenever the two conditions in (2.4) are both satisfied or when the prediction error is greater than a predefined threshold θ. The corresponding predictor inputs for different prediction orders are shown in Fig. 2.2, where the pixels are ordered by their distance to the pixel to be encoded. In this paper, the predicted value x_p of the coding pixel x_n is a linear combination of its causal neighbors given by

$$
x_p = \sum_{k=1}^{N} a(k)\, x_n(k),
\tag{2.5}
$$

where N is the prediction order, x_n(k) is the kth nearest neighbor of x_n, and a(k) is the corresponding predictor coefficient. Note that we do not use any training set for the optimization of the initial predictor coefficients in this paper. The initial coefficients for the proposed predictor are equally weighted, i.e., each coefficient a(k) of the Nth-order predictor is 1/N.

The training area for the LS adaptation process of the coding pixel is shown in Fig. 2.5. Suppose we have M pixels in the training area; our objective is to find a least-squares solution for the system

$$
P a = y,
\tag{2.6}
$$

where

$$
P =
\begin{bmatrix}
x_{n-1}(1) & x_{n-1}(2) & \cdots & x_{n-1}(N) \\
x_{n-2}(1) & x_{n-2}(2) & \cdots & x_{n-2}(N) \\
\vdots & \vdots & \ddots & \vdots \\
x_{n-M}(1) & x_{n-M}(2) & \cdots & x_{n-M}(N)
\end{bmatrix},
\quad
a =
\begin{bmatrix} a(1) \\ a(2) \\ \vdots \\ a(N) \end{bmatrix},
\quad
y =
\begin{bmatrix} x_{n-1} \\ x_{n-2} \\ \vdots \\ x_{n-M} \end{bmatrix}.
$$

The optimal a that minimizes the squared error ‖y − Pa‖₂² can be obtained by solving the normal equations [35]

$$
P^{T} P a = P^{T} y.
\tag{2.7}
$$

There are well-developed numerical approaches to solving (2.7). When P has full rank, i.e., rank N, PᵀP is nonsingular and positive definite [35], [36], and the normal equations have the unique solution a = (PᵀP)⁻¹Pᵀy. In this case, the Cholesky decomposition, which requires only half the number of multiplications needed by alternative methods, can be used to solve (2.7) [35], [36].

If P is rank deficient, i.e., rank < N, PᵀP fails to be positive definite and the singular value decomposition (SVD) can be used to solve (2.7) [35], [36]. The positive definiteness of PᵀP can be easily checked during the process of Cholesky decomposition [36].
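A minimal sketch of solving (2.7) by Cholesky decomposition is given below, in plain Python so that the pivot test is visible. It is not the implementation used in the experiments. The helper names and the relative tolerance `tol` are assumptions; when the factorization fails, the sketch only raises an error, whereas the text above falls back on the SVD in that case.

```python
import math


def normal_equations(P, y):
    """Return (P^T P, P^T y) for an M x N matrix P given as a list of rows."""
    n = len(P[0])
    A = [[sum(row[i] * row[j] for row in P) for j in range(n)] for i in range(n)]
    b = [sum(row[i] * t for row, t in zip(P, y)) for i in range(n)]
    return A, b


def cholesky_solve(A, b, tol=1e-12):
    """Solve A a = b with A = L L^T; raise ValueError if A is not positive definite."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                # A non-positive pivot reveals that P^T P is not positive definite.
                if acc <= tol * max(abs(A[i][i]), 1.0):
                    raise ValueError("P^T P is not positive definite")
                L[i][i] = math.sqrt(acc)
            else:
                L[i][j] = acc / L[j][j]
    z = [0.0] * n
    for i in range(n):                      # forward substitution: L z = b
        z[i] = (b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    a = [0.0] * n
    for i in reversed(range(n)):            # back substitution: L^T a = z
        a[i] = (z[i] - sum(L[k][i] * a[k] for k in range(i + 1, n))) / L[i][i]
    return a


def ls_coefficients(P, y):
    """Least-squares predictor coefficients for the system P a = y."""
    A, b = normal_equations(P, y)
    return cholesky_solve(A, b)


coeffs = ls_coefficients([[1, 0], [0, 1], [1, 1]], [1, 2, 3])
print(coeffs)
```

The pivot check inside the factorization is where the positive definiteness of PᵀP is examined, so no separate rank computation is needed before deciding whether a fallback is required.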

In the proposed approach, updated predictor coefficients are used for the current prediction and passed on to the next coding pixel. For non-edge pixels, we use the stored prediction coefficients of the four nearest causal neighbors to generate four prediction values and take their average as the final prediction result. In this manner, the predictor is robust to moderate salt-and-pepper noise. To summarize, the pseudo code of the proposed algorithm is given in Appendix A.

Figure 2.6: The image “Lennagrey”.
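The prediction in (2.5), the equally weighted initialization, and the averaging used for non-edge pixels can be sketched as follows. This is a hedged illustration: how the coefficient sets are stored and indexed is not described in this section, so `stored_coeffs` is simply assumed to be a list holding the four coefficient vectors of the nearest causal neighbors, and no rounding or clipping of the result is applied.

```python
def predict(coeffs, neighbors):
    """Linear prediction x_p = sum_k a(k) * x_n(k), as in (2.5)."""
    return sum(a * x for a, x in zip(coeffs, neighbors))


def initial_coeffs(order):
    """Equally weighted starting coefficients: a(k) = 1/N for every k."""
    return [1.0 / order] * order


def predict_non_edge(stored_coeffs, neighbors):
    """Average the predictions produced by the stored coefficient sets."""
    predictions = [predict(c, neighbors) for c in stored_coeffs]
    return sum(predictions) / len(predictions)


neighbors = [100, 104, 96, 100]          # x_n(1..4), example values
x_first = predict(initial_coeffs(4), neighbors)
x_avg = predict_non_edge(
    [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]], neighbors
)
print(x_first, x_avg)
```

Averaging several predictions limits the influence of any single coefficient set that was adapted on a noisy neighborhood.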


2.5 Prediction Error Reﬁnement

It is known that the prediction error x_n − x_p in the regular mode can be further refined by learning from previous predictions, i.e., the so-called bias cancellation technique [14], [15], [19]. To do this, we define the compound context v of a coding pixel as

$$
v = \{x_n(1), \ldots, x_n(6),\; e_n(1), \ldots, e_n(4)\},
\tag{2.8}
$$

where x_n(i), i = 1, . . . , 6, are as shown in Fig. 2.2 and e_n(i), i = 1, . . . , 4, are the uncompensated prediction errors corresponding to x_n(1), . . . , x_n(4), respectively. We have incorporated prediction errors in error modeling because the amount of compensation is likely to be related to the prediction errors of neighboring pixels.

In the proposed RALP system, error modeling is achieved by performing context clustering with a fixed number of contexts. For this, a set of initial contexts is generated off-line using the image “Lennagrey” (Fig. 2.6) for the clustering process. For each coding pixel, the compound context v is assigned to one of the existing contexts using the mean absolute error (MAE) distance measure, and the corresponding context is then modified accordingly in the coding process. By classifying coding pixels with similar contexts into the same group, the amount of compensation e_p for the current prediction can be estimated by calculating the sample mean of the prediction errors in that group. Therefore, the value e_p to be used in compensating the current prediction is given by

$$
e_p = \frac{S}{N},
\tag{2.9}
$$

where S is the sum of the prediction errors accumulated in the context to which v belongs, and N is the number of members in that context. With the correction term e_p, we now form a more refined prediction x_cpd = x_p + e_p. The compensated error ε = x_n − x_cpd has a narrower histogram and hence a lower first-order entropy. In the regular mode, the refined error ε is then entropy encoded using a conditional arithmetic coder to produce the bit stream [28], [34].
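The bias cancellation step in (2.9) can be sketched as a small context model. The sketch rests on several assumptions that the text leaves open: the class name `BiasModel` is hypothetical, a context with no accumulated errors returns e_p = 0, and "modified accordingly" is interpreted as a running-mean update of the context centroid. The initial contexts would come from the off-line clustering, which is not reproduced here.

```python
class BiasModel:
    """Context clustering with per-context error statistics (illustrative only)."""

    def __init__(self, initial_contexts):
        self.centroids = [list(c) for c in initial_contexts]
        self.members = [1] * len(self.centroids)      # weight of each centroid
        self.err_sum = [0.0] * len(self.centroids)    # S in (2.9)
        self.err_count = [0] * len(self.centroids)    # N in (2.9)

    def nearest(self, v):
        """Index of the context closest to v under the MAE distance."""
        def mae(c):
            return sum(abs(a - b) for a, b in zip(c, v)) / len(v)
        return min(range(len(self.centroids)), key=lambda i: mae(self.centroids[i]))

    def estimate(self, v):
        """Return (context index, e_p) using only past errors."""
        i = self.nearest(v)
        if self.err_count[i] == 0:
            return i, 0.0
        return i, self.err_sum[i] / self.err_count[i]

    def update(self, i, v, error):
        """Accumulate the uncompensated error and move the centroid toward v."""
        self.err_sum[i] += error
        self.err_count[i] += 1
        self.members[i] += 1
        n = self.members[i]
        self.centroids[i] = [c + (x - c) / n for c, x in zip(self.centroids[i], v)]


model = BiasModel([[0] * 10, [100] * 10])   # two toy contexts of dimension 10
v = [98] * 10
idx, e_p = model.estimate(v)                # e_p is 0.0 before any update
model.update(idx, v, 4)
model.update(idx, v, 2)
print(model.estimate(v))
```

Calling `estimate` before `update` keeps the estimate causal, so a decoder that repeats the same sequence of calls obtains the same e_p.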


2.6 Concluding Remarks

In this chapter, we have introduced the proposed switching structure for lossless compression of images. Moreover, a detailed description of the proposed LS-based adaptive predictor with edge-look-ahead, i.e., the core of the regular mode, has been given. So that the adaptive predictor in the regular mode can look ahead and tell whether the coding pixel is around an edge, we have also proposed an edge detector that uses only causal pixels. In summary, the proposed approach has the following properties:

1. The run mode is triggered if the four pixels x_n(1), x_n(2), . . . , x_n(4) in Fig. 2.2 are identical. The run length is then entropy encoded using an arithmetic coder to produce the bit stream.

2. An escape symbol has to be encoded to indicate an unsuccessful run if the value of the coding pixel x_n differs from that of x_n(1), . . . , x_n(4). However, encoding an escape symbol incurs a penalty and degrades the coding efficiency. Therefore, the run mode is disabled once the percentage of unsuccessful runs exceeds a predefined threshold.

3. The proposed LS-based adaptive predictor, owing to its edge-directed property, is very useful for the prediction of pixels around boundaries.

4. The proposed edge detector uses only causal pixels, i.e., pixels that have been coded. Therefore, the decoder can perform the same operation as the encoder and determine whether the coding pixel is around an edge without any side information.

5. The LS-based adaptation process in the regular mode is initiated only when an edge is detected or when the prediction error is greater than a predefined threshold, so that the computational complexity of the LS adaptation process can be significantly reduced.

6. With the proposed approach, the results obtained are very close to those with pixel-by-pixel LS adaptation, but with much reduced complexity. A very good trade-off between the prediction result and the computational complexity can be obtained, as we will see later in Chapter 4.


Chapter 3 Entropy Coding of Prediction Errors

In the proposed approach, the refined error in regular mode is then entropy encoded through a conditional arithmetic coder to produce the bit stream. For entropy coding of the refined error signal ε, we borrow the concepts of “error sign flipping”, “error remapping”, and “histogram tail truncation” in [14], [15], [19] so that the coding efficiency in actual bit rates can be further improved. All the details on how the entropy coder works will be addressed in this chapter.


3.1 Conditional Entropy Coding

It is known that the coding efficiency can be further improved with the use of conditional probability models. By classifying similar prediction errors into the same group, the error histogram in each group is sharpened and thus a smaller conditional entropy can be obtained. To perform the classification, we have to define an error strength estimate ∆ so that both the encoder and the decoder know exactly which group the prediction error belongs to, i.e., which probability model should be used. In this dissertation, we define the error strength estimate of the coding pixel to be |e_p|, that is,

∆ = |e_p|. (3.1)

Note that e_p, introduced in Section 2.5, is used in compensating the prediction error. Moreover, it is available in both the encoder and the decoder. Therefore, we use |e_p| as the error strength estimate.

By conditioning on the error strength estimate ∆, we can quantize refined errors into classes of different variances [14], [19], [26]. Moreover, the error histogram in each quantization bin can be sharpened and thus a smaller conditional entropy can be obtained. Therefore, we use the conditional probability model P(ε|∆) instead of P(ε) for entropy coding of refined errors. Furthermore, to find the optimal number of quantization bins as well as an optimal quantization for ∆ so that the conditional entropy can be minimized, we define a cost function C(Q, ε, ∆) as

$$
C(Q, \varepsilon, \Delta) = E\{ H(\varepsilon \mid Q(\Delta)) \}
= -\sum_{Q(\Delta)=1}^{N} P(Q(\Delta)) \sum_{\varepsilon} P(\varepsilon \mid Q(\Delta)) \log_2 P(\varepsilon \mid Q(\Delta)),
\tag{3.2}
$$

where Q(·), which maps ∆ into one of the N quantization bins, is the quantizer to be designed, and N is the number of quantization bins to be decided. With such a cost function, we calculate the first-order entropy of the errors ε assigned to each quantization bin, i.e., H(ε | Q(∆)), and then take the expectation of the entropy H(ε | Q(∆)) over the bins.

To minimize the cost function in (3.2), we use a set of training pairs (ε, ∆) recorded in the prediction process of the fourteen test images [20] in Chapter 4. By performing dynamic programming off-line, the final number of quantization bins is found to be three and the approximately optimized quantization of ∆ is found to be [0, 1], (1, 55], (55, ∞). Though the quantization may be suboptimal, the compression result is satisfactory, as the simulation results in Chapter 4 will demonstrate. Using the above quantization bins, one out of a set of three probability models is chosen for the entropy coding of ε based on the value of the error strength estimate ∆ as follows:

$$
\begin{cases}
\text{use pmf 1}, & \text{if } 0 \le \Delta \le 1 \\
\text{use pmf 2}, & \text{if } 1 < \Delta \le 55 \\
\text{use pmf 3}, & \text{otherwise}
\end{cases}
\tag{3.3}
$$
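The bin selection in (3.3) and the cost in (3.2) can be evaluated empirically as sketched below. This is only the cost evaluation for a given set of thresholds; the off-line dynamic programming search over thresholds is not reproduced. The function names and the toy training pairs are assumptions for illustration.

```python
import math
from collections import Counter, defaultdict

THRESHOLDS = (1, 55)  # bin edges reported in the text: [0, 1], (1, 55], (55, inf)


def bin_index(delta, thresholds=THRESHOLDS):
    """Map the error strength estimate to a 0-based bin (pmf k is index k-1)."""
    for i, t in enumerate(thresholds):
        if delta <= t:
            return i
    return len(thresholds)


def conditional_entropy(pairs, thresholds=THRESHOLDS):
    """Empirical E{H(eps | Q(delta))} in bits for (eps, delta) training pairs."""
    groups = defaultdict(Counter)
    for eps, delta in pairs:
        groups[bin_index(delta, thresholds)][eps] += 1
    total = len(pairs)
    cost = 0.0
    for hist in groups.values():
        n = sum(hist.values())
        h = -sum((c / n) * math.log2(c / n) for c in hist.values())
        cost += (n / total) * h   # weight each bin entropy by P(Q(delta))
    return cost


pairs = [(0, 0.5), (0, 0.5), (3, 10), (-3, 10)]    # toy (eps, delta) pairs
with_bins = conditional_entropy(pairs)             # errors split by delta
single_bin = conditional_entropy(pairs, ())        # no conditioning at all
print(with_bins, single_bin)
```

On the toy data the conditioned cost is lower than the unconditioned one, which is the effect the quantizer design aims to maximize.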


3.2 Further Manipulations on the Refined Errors

To obtain a further coding gain in the actual bit rates, the following techniques are applied to the refined errors before they are entropy encoded.

3.2.1 Error Sign Flipping

The concept of error sign flipping in [14], [15], [19] is also used in this paper for coding the refined error signal ε. Since e_p, the amount for prediction error compensation, is an average result of past history or experiences, it is very likely that the refined error ε has the same sign as e_p. Therefore, the refined error ε is encoded according to the following equation,

$$
\begin{cases}
\text{encode } -\varepsilon, & \text{if } e_p < 0 \\
\text{encode } \varepsilon, & \text{otherwise.}
\end{cases}
\tag{3.4}
$$

Since all the pixels used in the proposed coding system are causal, the decoder can calculate the error estimate e_p just like the encoder. Therefore, the decoder can recover the sign-flipped errors successfully when the image is decompressed.
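Sign flipping as in (3.4) and its inverse at the decoder can be sketched in a few lines. The function names are hypothetical; the only requirement is that the decoder evaluates the same e_p as the encoder.

```python
def flip_sign(eps, e_p):
    """Symbol actually coded for the refined error eps, following (3.4)."""
    return -eps if e_p < 0 else eps


def unflip_sign(symbol, e_p):
    """Inverse of flip_sign; e_p is recomputed by the decoder from causal data."""
    return -symbol if e_p < 0 else symbol


coded = flip_sign(-3, -1.5)          # e_p < 0, so the sign is flipped
restored = unflip_sign(coded, -1.5)
print(coded, restored)
```

The mapping is its own inverse for a fixed e_p, so no side information is needed to undo it.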

3.2.2 Error Remapping

The range of the refined error ε is [−255, 255]. In general, a probability model with a set of 511 symbols would be needed for the entropy coding of prediction errors. However, for a given x_cpd, the errors can only take on values in the range [−x_cpd, 255 − x_cpd] [14], [15], [19]. Therefore, we can apply the following error remapping before entropy encoding so that the symbols to be encoded fall in the range [−128, 127], i.e., a set with 256 symbols.

$$
\begin{cases}
\text{encode } \varepsilon \text{ by } (\varepsilon + 256), & \text{if } \varepsilon < -128 \\
\text{encode } \varepsilon \text{ by } (\varepsilon - 256), & \text{if } \varepsilon \ge 128
\end{cases}
\tag{3.5}
$$

In this case, the number of symbols used is reduced to 256, which further improves the coding gain in the entropy coding stage. Since all the pixels used are causal and the decoder performs prediction and compensation just like the encoder, the predicted value x_p, the error estimate e_p and thus the compensated prediction, i.e., x_cpd = x_p+ e_p, can be calculated in the decoder. Therefore, the decoder can reconstruct the remapped ε by

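$$
\begin{cases}
\text{reconstruct } \varepsilon \text{ by } (\varepsilon + 256), & \text{if } \varepsilon < -x_{cpd} \\
\text{reconstruct } \varepsilon \text{ by } (\varepsilon - 256), & \text{if } \varepsilon > (255 - x_{cpd}).
\end{cases}
\tag{3.6}
$$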
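The remapping in (3.5) and its inverse in (3.6) can be sketched and checked exhaustively. The sketch assumes that x_cpd is an integer in [0, 255] (how x_p + e_p is rounded or clipped is not stated in this section) and treats the remapping on its own, without sign flipping. The function names are hypothetical.

```python
def remap(eps):
    """Fold the refined error into [-128, 127], following (3.5)."""
    if eps < -128:
        return eps + 256
    if eps >= 128:
        return eps - 256
    return eps


def unmap(symbol, x_cpd):
    """Undo the remapping with the decoder-side x_cpd, following (3.6)."""
    if symbol < -x_cpd:
        return symbol + 256
    if symbol > 255 - x_cpd:
        return symbol - 256
    return symbol


# Exhaustive round trip over all 8-bit pixel values and predictions.
lossless = all(
    unmap(remap(x - x_cpd), x_cpd) == x - x_cpd
    for x_cpd in range(256)
    for x in range(256)
)
print(lossless)
```

The round trip works because the valid interval [−x_cpd, 255 − x_cpd] contains exactly 256 values, so each remapped symbol has a single preimage inside it.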

3.2.3 Histogram Tail Truncation

With the quantization bins in (3.3), the error histograms in each quantization bin for the image “Lennagrey” (Fig. 2.6) are plotted in Fig. 3.1. The curve of Bin 3 is not shown because no error strength estimate falls in that region. As can be seen, most of the refined errors fall in the region around 0. Although errors far from 0 seldom occur, their occurrence counts are initialized to 1 in case they do occur, which degrades the performance of entropy coding. To overcome this problem, we use the concept of histogram tail truncation in [14].

The probability distribution of the prediction error is usually a two-sided Laplacian distribution [14], [19]. Instead of remapping the error into a one-sided monotonically decreasing probability distribution as in [14], the error histogram tails are truncated symmetrically in each quantization bin. The cut-off regions for the quantization bins are chosen to be [−25, 25], [−48, 48], and [−128, 127] such that over 99% of the refined errors in each quantization bin are within the truncated regions. With histogram tail truncation, an error of 30 in Bin 1, for example, is first encoded as 25 using pmf 1, followed by 5 using pmf 2.

Figure 3.1: Histogram of errors in quantization bins for image “Lennagrey” (using a sixth-order LS-based predictor with γ₁ = 100, γ₂ = 10). Horizontal axis: refined error; vertical axis: probability; curves: Bin#1 and Bin#2.
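The escape mechanism of histogram tail truncation can be sketched as follows. The cut-off magnitudes are those given above; everything else is an assumption made for illustration: an error whose magnitude reaches the cut-off is coded as the boundary value and the remainder is passed to the next pmf, so a boundary symbol always signals that another symbol follows. The function names `truncate` and `rebuild` are hypothetical.

```python
# Cut-off magnitude per pmf from the text; None marks the untruncated last bin.
CUTOFF = {1: 25, 2: 48, 3: None}


def truncate(err, start_pmf):
    """Split err into (pmf, symbol) pairs, escaping to the next pmf at a cut-off."""
    pieces = []
    pmf, rest = start_pmf, err
    while CUTOFF[pmf] is not None and abs(rest) >= CUTOFF[pmf]:
        step = CUTOFF[pmf] if rest > 0 else -CUTOFF[pmf]
        pieces.append((pmf, step))   # boundary value doubles as an escape (assumption)
        rest -= step
        pmf += 1
    pieces.append((pmf, rest))
    return pieces


def rebuild(pieces):
    """Decoder side: the error is the sum of the received symbols."""
    return sum(symbol for _, symbol in pieces)


example = truncate(30, 1)
print(example, rebuild(example))
```

For the error of 30 in Bin 1 this yields 25 under pmf 1 followed by 5 under pmf 2, matching the example in the text.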


3.3 Concluding Remarks

In this chapter, we have introduced the entropy coding stage of the proposed system. To obtain a lower actual bit rate, we use a conditional arithmetic coder in the proposed system. To do this, we define an error strength estimate ∆ = |e_p|, and then we find an approximately optimal quantization for ∆ by using dynamic programming off-line. In this dissertation, the number of quantization bins for ∆ is found to be three, and the approximately optimal quantization is shown in (3.3). By conditioning on the error strength estimate ∆, the refined error is then entropy encoded using one of the three probability models as in (3.3). The entropy decoder likewise knows which probability model should be used when the received bit stream is decoded.

In addition to the conditional arithmetic coder, the concepts of “error sign flipping”, “error remapping”, and “histogram tail truncation” are also applied in the proposed system before the refined error is entropy encoded so that a further coding gain in actual bit rates can be obtained. We would like to mention again that all the pixels used in the encoder are causal. Therefore, the decoder can reconstruct the remapped and sign-flipped error just like the encoder without any side information. The proposed entropy coder determines the coding performance in actual bit rates, and the usefulness of the proposed entropy coding stage can be seen in Chapter 4, where comparisons with existing state-of-the-art lossless image coders will be given.


Chapter 4 Experiments

In this chapter, we evaluate the performance of the proposed lossless image codec. Extensive experiments as well as comparisons to existing state-of-the-art predictors and coders will be given to demonstrate the usefulness of the proposed system. All the test images used in the experiments are from TMW [20]. We will ﬁrst demonstrate the usefulness of the proposed edge detector and then the error compensation mechanism in the regular mode. After that, we will demonstrate the usefulness of the proposed edge-look-ahead approach in the regular mode. Then, the entropy and bit rate performance of the system is presented. Finally, the computational complexity of the proposed system will be discussed.


Figure 4.1: (a) The image “Shapes”. (b) Pixels for which (2.4) is satisﬁed in the image “Shapes”.

4.1 The Edge Detector

To demonstrate the effectiveness of the proposed edge detector, we use the image “Shapes” (Fig. 4.1(a)), an artificial image with many edges and lines. The pixels that satisfy the two conditions in (2.4) are marked in Fig. 4.1(b). We can see from Fig. 4.1(b) that the edge detector has successfully picked out the pixels around edges. To test the robustness of the detector, we apply the edge detector to the image “Noisesquare” (Fig. 4.2(a)), an image with salt-and-pepper noise. The pixels picked out by the edge detector are as shown in Fig. 4.2(b). We see from Fig. 4.2(b) that the edge detector is robust to moderate salt-and-pepper noise. In addition to artificial images, we also apply the edge detector to “Lennagrey”, a natural image that is shown in Fig. 2.6. As can be seen in Fig. 4.3, the pixels around the edges have been picked out successfully.

Figure 4.2: (a) The image “Noisesquare”. (b) Pixels for which (2.4) is satisfied.

Figure 4.3: Pixels for which (2.4) is satisﬁed in the image “Lennagrey” (γ₁ = 100, γ₂ = 10).


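$$
\text{(a)}\quad G_x =
\begin{bmatrix}
-1 & -2 & -1 \\
0 & 0 & 0 \\
1 & 2 & 1
\end{bmatrix},
\qquad
\text{(b)}\quad G_y =
\begin{bmatrix}
-1 & 0 & 1 \\
-2 & 0 & 2 \\
-1 & 0 & 1
\end{bmatrix}
$$

Figure 4.4: Masks used in the “Sobel” operator. (a) The mask G_x for the computation of vertical derivatives. (b) The mask G_y for the computation of horizontal derivatives.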

4.1.1 Comparison with “Sobel” Operator

In the proposed approach, we use raster scan for the encoding of image pixels, i.e., the pixels are encoded in sequence from left to right and top to bottom. For the decoder to perform the same edge detection as the encoder, non-causal pixels, i.e., pixels that lie below and to the right of the coding pixel, cannot be used because they are not yet available in the decoder. Therefore, we have indicated in Section 2.3 that conventional edge detectors, e.g., the “Sobel” operator, cannot be applied here because they use non-causal pixels. That is also why we have to propose an edge detector that uses only causal pixels in this paper.

Note that the “Sobel” operator defines two masks G_x and G_y (as in Fig. 4.4) for the computation of vertical and horizontal derivatives, respectively. Moreover, the gradient of the pixel is defined by

∇f = |G_x| + |G_y|. (4.1)

A pixel is detected as being around an edge by the “Sobel” operator if the following inequality is satisfied:

∇f ≥ threshold. (4.2)

As can be seen in (4.2), we have to define a gradient threshold for the edge detection when using the “Sobel” operator. To compare the effectiveness of the proposed edge detector with that of the “Sobel” operator, we use the image “Lennagrey” as the test image. Fig. 4.5 and Fig. 4.6 show the pixels detected as being around an edge by the “Sobel” operator with the threshold varying from 50 to 200. The pixels detected as being around an edge by the proposed edge detector are shown in Fig. 4.3. As can be seen in Fig. 4.6(b) and Fig. 4.3, the set of pixels detected by the proposed edge detector is very similar to that obtained with the non-causal “Sobel” operator. The proposed edge detector is very effective in detecting edges although only four causal pixels are used. The Matlab program for the “Sobel” operator in this experiment is given in Appendix B.

Figure 4.5: Pixels detected as around an edge in image “Lennagrey” when the Sobel operator is used. (a) threshold= 50. (b) threshold= 100.


Figure 4.6: Pixels detected as around an edge in image “Lennagrey” when the Sobel operator is used. (a) threshold= 150. (b) threshold= 200.
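The Sobel test in (4.1) and (4.2) can be sketched in plain Python as follows. This is an illustrative sketch and not the Matlab program of Appendix B. The masks are those of Fig. 4.4; leaving the one-pixel image border unmarked is an assumption, and the function names are hypothetical.

```python
GX = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))   # mask (a) of Fig. 4.4
GY = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))   # mask (b) of Fig. 4.4


def apply_mask(img, row, col, mask):
    """Correlate a 3x3 mask with the neighborhood centered at (row, col)."""
    return sum(
        mask[i][j] * img[row - 1 + i][col - 1 + j]
        for i in range(3)
        for j in range(3)
    )


def sobel_edges(img, threshold):
    """Mark interior pixels whose gradient |Gx| + |Gy| reaches the threshold."""
    rows, cols = len(img), len(img[0])
    edges = [[False] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            grad = abs(apply_mask(img, r, c, GX)) + abs(apply_mask(img, r, c, GY))
            edges[r][c] = grad >= threshold
    return edges


step = [[0, 0, 255, 255] for _ in range(4)]    # vertical step edge
flat = [[128] * 4 for _ in range(4)]           # constant image
print(sobel_edges(step, 200)[1], sobel_edges(flat, 200)[1])
```

Each mask reads the row below and the column to the right of the current pixel, which is why this detector cannot be reproduced at the decoder during raster-scan decoding.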

4.1.2 Texture Context for Edge Detection

In this experiment, we will show that a larger context for the detection of an edge is not necessary. For this, we construct four sixth-order LS-based predictors, each with an edge detector that uses 4, 6, 8, and 10 pixels, respectively. The pixels used in the four edge detectors are selected in the order defined in Fig. 2.2. Again, the image “Lennagrey” is used for the experiment. To show that the proposed 4-point edge detector is sufficient for the detection of an edge, we compare the improvement in the prediction result with the increase in complexity as the number of pixels used by the edge detector varies from 4 to 10. In this experiment, the run mode is disabled; only the regular mode is used. Moreover, the LS adaptation process is activated when an edge is detected or when the prediction error is greater than a predefined threshold.

Fig. 4.7(a) to Fig. 4.10(a) show the pixels detected as being around an edge by the four edge detectors. The corresponding pixels for which the LS adaptation is activated are shown in Fig. 4.7(b) to Fig. 4.10(b), respectively. Moreover, Table 4.1 lists the first-order entropies of the uncompensated prediction errors and the percentage of pixels that activate the LS adaptation when different contexts are used for edge detection. The execution times of the proposed edge-look-ahead predictor with the different edge detection contexts are recorded in the last column of Table 4.1. As can be seen in Table 4.1, the percentage of pixels that activate the LS adaptation process is 17.90% with the proposed 4-point edge detector and 21.13% with a 10-point edge detector. This is an increase of about 3.2 percentage points in the pixels for which LS adaptation is performed, which means an increased computational complexity. However, the prediction results (in terms of first-order entropy) remain almost unchanged, as can be seen in Table 4.1; only a marginal improvement is obtained. Therefore, a larger context for the detection of an edge is not necessary and the proposed 4-point context is sufficient for the edge detection process.
