Digital Image Processing 3/e ─ Gonzalez

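(1) 2010/9/10. Digital Image Processing. Instructor: Feng-Yang Hsieh, Assist. Prof., Ta Hwa Institute of Technology. Sep. 2009 ~ Jan. 2010.
Course Information
• Textbook
– Digital Image Processing 3/e ─ Gonzalez
– 數位影像處理 (Digital Image Processing, Chinese edition) ─ 繆紹剛
• References
– Book: 數位影像處理 (Digital Image Processing) ─ 儒林圖書公司
– Slides: Image Processing ─ Dr. 胡育誠, Department of Computer Science and Information Engineering, Providence University
– Slides: Image Processing with MATLAB ─ Dr. 繆紹綱, Department of Electronic Engineering, Chung Yuan Christian University
– Slides: Digital Image Processing ─ Lucia Ballerini, Per Zetterberg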

(2) Course Information
• E-mail: [email protected]
• Web: http://120.105.186.3
• How to grade
– Regular – 20% (Tests)
– Exercises and Reports – 40%
– Final – 40%

(6) Positron emission tomography (PET).

(7) 2010/9/10. Infrared. 13. 14. 7.

(13) 2010/9/10. Image Compression. source image. 12x compression. 25. Image Enhancement. 26. 13.

(22) 2010/9/10. Image Sensing and Acquisition. 43. Image Storage. 44. 22.

(23) 2010/9/10. Digital Image Processing – Image Enhancement. 46. 23.

(26) 2010/9/10. Neighborhood. 51. Bit-plane MSB. LSB. 52. 26.

(29) Gray-level Transforms. Binarization.

(30) 2010/9/10. 59. The Inverse Transform. 60. 30.

(32) Gamma Correction. $s = c\,r^{\gamma}$

(36) 2010/9/10. 71. Logic Operations. 72. 36.

(39) 2010/9/10. Convolution. 77. 78. 39.

(40) 2010/9/10. Averaging Masks. 79. 80. 40.

(42) 2010/9/10. 83. Digital Image Processing – Spatial Filters. 42.

(43) 2010/9/10. Convolution • Convolution serves as a basic mechanism in spatial filtering. 85. 86. 43.

(44) 2010/9/10. 87. Low-pass filtering. High-pass filtering. 88. 44.

(46) 2010/9/10. Mean filter. 91. 92. 46.

(47) 2010/9/10. Mean filter. 93. 94. 47.

(48) 2010/9/10. Gaussian filter. 2D. 1D. 95. Gaussian filter. σ=1.0. 96. 48.

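(50) Gradient.
$$\nabla f = G[f(x,y)] = \begin{bmatrix} \partial f/\partial x \\ \partial f/\partial y \end{bmatrix}$$
Magnitude: $M = |G| = \sqrt{\left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2}$
For efficiency: $M \approx \left|\frac{\partial f}{\partial x}\right| + \left|\frac{\partial f}{\partial y}\right|$
Direction: $\theta = \tan^{-1}\left(\frac{\partial f/\partial y}{\partial f/\partial x}\right)$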

(52) Second order derivative.
$$\frac{\partial^2 f}{\partial x^2} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) = [f(x+1) - f(x)] - [f(x) - f(x-1)] = f(x+1) + f(x-1) - 2f(x)$$
Mask: [1, -2, 1]

(53) Laplace filter.
$$\frac{\partial^2 f}{\partial x^2} = f(x+1,y) + f(x-1,y) - 2f(x,y)$$
$$\frac{\partial^2 f}{\partial y^2} = f(x,y+1) + f(x,y-1) - 2f(x,y)$$
$$\nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} = f(x+1,y) + f(x-1,y) + f(x,y+1) + f(x,y-1) - 4f(x,y)$$
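As a rough illustration, the five-point Laplacian on this slide can be written with NumPy array slices. This is only a sketch: the function name `laplacian` and the choice to leave border pixels at zero are assumptions, not part of the slides.

```python
import numpy as np

def laplacian(f):
    """Five-point Laplacian: f(x+1,y) + f(x-1,y) + f(x,y+1) + f(x,y-1) - 4 f(x,y).

    Border pixels are left at 0 (an assumption made for this sketch).
    """
    f = np.asarray(f, dtype=float)
    out = np.zeros_like(f)
    out[1:-1, 1:-1] = (f[2:, 1:-1] + f[:-2, 1:-1]
                       + f[1:-1, 2:] + f[1:-1, :-2]
                       - 4.0 * f[1:-1, 1:-1])
    return out

# A single bright pixel: the response is -4 at the pixel and +1 at its 4 neighbours.
img = np.zeros((5, 5))
img[2, 2] = 1.0
lap = laplacian(img)
```

On a linear ramp the interior response is zero, which matches the idea that the second derivative only reacts to changes in slope.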

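(54) Figure: an edge profile $f(x)$, its first derivative $\frac{df(x)}{dx}$ and its second derivative $\frac{d^2 f(x)}{dx^2}$, each fed to a comparator. Labels: inner side of the edge, outer side of the edge, zero-crossing point.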

(56) 2010/9/10. Sharpening. 111. 112. 56.

(57) Figure: $f(x)$; $\frac{d^2 f(x)}{dx^2}$; $f(x) + a\,\frac{d^2 f(x)}{dx^2}$.

(59) 2010/9/10. Median filter. 117. Median filtering. 118. 59.

(60) 2010/9/10. Mean filtering. 119. 120. 60.

(61) 2010/9/10. 121. Temporal Filtering. 122. 61.

(62) 2010/9/10. 2. 4. 16. 123. 124. 62.

(63) 2010/9/10. Impulsive noise!. 125. Digital Image Processing – Fourier Transform Convolution Theorem Filtering in The Frequency Domain. 126. 63.

(64) Fourier’s contribution
• Any function that periodically repeats itself can be expressed as the sum of sines and/or cosines of different frequencies, each multiplied by a different coefficient (Fourier series).
Forward transform: $F(u) = \int_{-\infty}^{\infty} f(x)\, e^{-j2\pi ux}\, dx$, where $e^{j\theta} = \cos\theta + j\sin\theta$
Inverse transform: $f(x) = \int_{-\infty}^{\infty} F(u)\, e^{j2\pi ux}\, du$

(65) $j = \sqrt{-1}$. $F(0,0) = \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x,y)$

(69) 2010/9/10. Separability. 137. shift. 138. 69.

(70) Conjugate Symmetry
$$F(u,v) = \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x,y)\, e^{-j2\pi(ux/M + vy/N)}$$
$$F^{*}(-u,-v) = \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x,y)\, e^{-j2\pi(ux/M + vy/N)}$$
$$e^{-j\theta} = \cos(-\theta) + j\sin(-\theta) = \cos\theta - j\sin\theta$$
$$\mathrm{conj}(e^{-j\theta}) = \cos(-\theta) - j\sin(-\theta) = \cos\theta + j\sin\theta = e^{j\theta}$$

(72) 2010/9/10. x0 1 δ(x) =  0 elsewhere 143. 144. 72.

(76) 2010/9/10. ILPF. 151. 152. 76.

(78) 2010/9/10. Butterworth Lowpass Filters. 155. 156. 78.

(79) 2010/9/10. 157. Gaussian Lowpass Filters. 158. 79.

(81) 2010/9/10. 161. Highpass Filters. 162. 81.

(82) 2010/9/10. 163. IHPF. 164. 82.

(83) 2010/9/10. BHPF. 165. GHPF. 166. 83.

(84) 2010/9/10. Laplacian in Frequency Domain. 167. 168. 84.

(85) 2010/9/10. 169. Homomorphic Filtering • Use illumination-reflectance model to develop a frequency domain procedure for contrast enhancement. • Detailed in Section 4.5. 170. 85.

(86) 2010/9/10. Homomorphic Filtering. 171. Homomorphic Filtering. 172. 86.

(87) 2010/9/10. Implementation • What is in Section 4.6: – 2-D Discrete Fourier Transform – Compute Inverse FT Using Forward Transform Algorithm – Fast Fourier Transform (FFT). 173. 174. 87.

(88) Digital Image Processing – Image Restoration.
Image Restoration
• We want to restore an image that has been degraded in some way.
• We make a model of the degradation process and use inverse methods.
• In comparison to image enhancement, which is a subjective way to present the image in a “better” way, image restoration is a more objective method where a priori information about the degradation is used.

(89) Degradation / Restoration.
$$g(x,y) = h(x,y) * f(x,y) + \eta(x,y)$$
$$G(u,v) = H(u,v)\,F(u,v) + N(u,v)$$

(90) Noise Models
• Additive noise
$$g(x,y) = h(x,y) * f(x,y) + \eta(x,y)$$
$$G(u,v) = H(u,v)\,F(u,v) + N(u,v)$$
– Gaussian
– Rayleigh
– Gamma
– Exponential
– Uniform
– Impulse
Gaussian Noise. PDF of Gaussian Noise.

(91) 2010/9/10. Uniform Noise. 181. Impulse Noise b>a: gray-level b will appear as a light dot, gray-level a will appear as a dark dot. If positive: salt If negative: pepper. 182. 91.

(94) 2010/9/10. Mean Filters • Arithmetic mean filter. • Geometric mean filter. – It tends to lose less image detail. 187. 188. 94.

(95) 2010/9/10. Order-Statistics Filters • Median filter. • Max and min filters. 189. Median Filter. 190. 95.

(96) Max and Min Filters.
Alpha-Trimmed Mean Filter
• The d/2 lowest and d/2 highest gray-level values of g(s,t) in the window are deleted, and the remaining mn – d values are averaged.
– d = 0: arithmetic mean filter
– d = mn – 1: median filter
• Useful in situations involving multiple types of noise

(97) Figure panels: (a) uniform noise; (b) salt-and-pepper; (c) arithmetic mean filter; (d) geometric mean filter; (e) median filter; (f) alpha-trimmed mean filter (d=5).
Adaptive Filters
• The filters discussed above are applied without regard for how image characteristics vary from one point to another!

(98) Adaptive Median Filter
• The median filter performs well as long as the density of the impulse noise is not large.
• More probabilities…
• Unlike other spatial filters, the adaptive median filter changes the size of the window Sxy.
Adaptive Median Filter
• Algorithm
Zmin = min in Sxy
Zmax = max in Sxy
Zmed = med in Sxy
Zxy = g(x,y)
Smax = max size of Sxy
Level A:
  A1 = Zmed – Zmin
  A2 = Zmed – Zmax
  If A1 > 0 AND A2 < 0, go to Level B
  Else increase the window size
  If window size ≤ Smax, repeat Level A
  Else output Zmed
Level B:
  B1 = Zxy – Zmin
  B2 = Zxy – Zmax
  If B1 > 0 AND B2 < 0, output Zxy
  Else output Zmed
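The Level A / Level B procedure on this slide might be sketched as follows. This is a minimal, unoptimized sketch: the function name `adaptive_median`, the starting window size of 3x3, and the edge-replicating padding are assumptions that the slide does not specify.

```python
import numpy as np

def adaptive_median(g, s_max=7):
    """Adaptive median filter; s_max is the largest (odd) window size Smax."""
    g = np.asarray(g)
    pad = s_max // 2
    padded = np.pad(g, pad, mode="edge")   # assumption: replicate border pixels
    out = g.copy()
    rows, cols = g.shape
    for x in range(rows):
        for y in range(cols):
            size = 3                       # assumption: start from a 3x3 window
            while True:
                r = size // 2
                cx, cy = x + pad, y + pad
                win = padded[cx - r:cx + r + 1, cy - r:cy + r + 1]
                z_min, z_max = win.min(), win.max()
                z_med = np.median(win)
                z_xy = g[x, y]
                if z_min < z_med < z_max:  # Level A: A1 > 0 and A2 < 0
                    # Level B: keep the pixel unless it is itself an extreme value
                    out[x, y] = z_xy if z_min < z_xy < z_max else z_med
                    break
                size += 2                  # otherwise grow the window
                if size > s_max:
                    out[x, y] = z_med
                    break
    return out

# A flat image with one "salt" pixel: the outlier is replaced by the median.
noisy = np.full((7, 7), 100, dtype=np.uint8)
noisy[3, 3] = 255
cleaned = adaptive_median(noisy)
```

The double loop keeps the sketch close to the slide's wording; a practical implementation would vectorize or use a compiled routine.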

(100) 2010/9/10. 199. Notch Filters. 200. 100.

(103) Estimating the Degradation
• Observation
• Experimentation
• Mathematical Modeling
By Observation (Heuristic)
• Suppose we have a degraded image but no knowledge of the degradation:
– One way to estimate the degradation is to gather information from the image itself!
– We can look at a small section with simple structures.
– We would look for areas of strong signal.
– Estimating from a subimage

(104) 2010/9/10. By Experimentation • If equipment used to acquire the degraded image is available…. 207. 208. 104.

(106) 2010/9/10. Wiener Filtering • To minimize the mean square error. 211. 212. 106.

(107) 2010/9/10. Digital Image Processing – Geometric Transformations & Interpolation. Geometric correction • Why? – Correct for distortions due to • Tilted surface in satellite and aerial images (rectification) • Inhomogeneous magnetic field in MR images • Motion of object between successive images (perspective correction) • Lens distortion (ex. fish-eye photography) • Etc…. 214. 107.

(110) 2010/9/10. 219. Interpolation • When images are re-sampled (geometric transformation, zooming etc.) interpolation is needed to calculate the gray scale values in the new grid. • Strategies – Nearest neighbor interpolation – Bi-linear interpolation – Cubic spline 220. 110.
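To make the first two strategies concrete, here is a small sketch of nearest-neighbor and bi-linear sampling at a real-valued position. The function names and the (row, column) coordinate convention are assumptions, and the coordinates are assumed to lie inside the image.

```python
import numpy as np

def nearest(img, x, y):
    """Nearest-neighbor: take the value of the closest grid point."""
    img = np.asarray(img)
    return img[int(np.floor(x + 0.5)), int(np.floor(y + 0.5))]

def bilinear(img, x, y):
    """Bi-linear: weight the four surrounding grid points by distance."""
    img = np.asarray(img, dtype=float)
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1 = min(x0 + 1, img.shape[0] - 1)     # clamp at the last row
    y1 = min(y0 + 1, img.shape[1] - 1)     # clamp at the last column
    dx, dy = x - x0, y - y0
    top = (1 - dy) * img[x0, y0] + dy * img[x0, y1]
    bottom = (1 - dy) * img[x1, y0] + dy * img[x1, y1]
    return (1 - dx) * top + dx * bottom

grid = [[0, 10],
        [20, 30]]
center = bilinear(grid, 0.5, 0.5)          # average of the four corners
```

Nearest-neighbor keeps the original gray levels but produces blocky results; bi-linear introduces new intermediate values and smoother transitions.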

(111) 2010/9/10. 221. Nearest Neighbor. 222. 111.

(114) 2010/9/10. Digital Image Processing – Image Coding and Compression. 227. Data and Information • Data is not the same thing as information Data is the means with which information is expressed. The amount of data can be much larger than the amount of information. • Data that provide no relevant information = redundant data or redundancy. • Image coding or compression has as a goal to reduce the amount of data by reducing the amount of redundancy. 228. 114.

(115) 2010/9/10. 229. Definitions n1=data stream 1 (ex. before compression) n2=data stream 2 (ex. after compression) Compression ratio: CR=n1/n2 Relative redundancy: RD=1-1/CR. 230. 115.

(117) Image Compression can be
• Reversible (lossless): no loss of information
– New image is identical to the original image (after decoding)
– Necessary in most image analysis
– Compression ratio typically 2-10x
• Non-reversible (lossy): loss of some information
– Often used in image communication, video, WWW
– Important that the image looks visually “nice”
– Compression ratio typically 10-30x

(118) PSNR
• Peak Signal-to-Noise Ratio.
• Typical values for the PSNR in lossy image and video compression are between 30 and 50 dB, where higher is better.
Subjective Measures of Image Quality
• Let a number of test persons grade the images as bad/OK/good etc.
• Very important for modeling psycho-visual (PV) redundancy, as we do not have a complete model of human vision
• Quality factors
– Overall appearance
– Color
– Resolution (small details)
– Artifacts
– Application dependent
– …
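The slide does not spell out the PSNR formula; the usual definition is $10\log_{10}(\mathrm{peak}^2 / \mathrm{MSE})$, where MSE is the mean squared difference between the two images. A minimal sketch, assuming 8-bit images (peak value 255) and a hypothetical function name `psnr`:

```python
import numpy as np

def psnr(original, compressed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two same-sized images."""
    a = np.asarray(original, dtype=float)
    b = np.asarray(compressed, dtype=float)
    mse = np.mean((a - b) ** 2)            # mean squared error
    if mse == 0:
        return float("inf")                # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

# Every pixel off by one gray level gives MSE = 1.
ref = np.zeros((4, 4))
value = psnr(ref, ref + 1)
```

With an MSE of 1 the result is about 48.13 dB, near the top of the 30 to 50 dB range quoted above.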

(119) How much information is present in the image?
• If p(E) is the probability of an event, then I(E) = -log[p(E)] is a measure of the information that the event provides (the examples below use the base-10 logarithm).
• If E always happens, i.e. p(E)=1, the event E does not provide any information, and I(E)=0.
• If p(E)=0.99, some information is provided, I(E)=0.0044. The event that E does not happen, with p(not E)=0.01, contains much more information: I(not E)=2.
• The average information is called entropy (Shannon entropy).
Coding Redundancy
• Basic idea: different gray-levels occur with different probability (non-uniform histogram). Use shorter code words for the more common gray-levels and longer code words for the less common gray-levels. This is called Variable Length Coding.
• The amount of data in an MxN image with L gray-levels = MxNxLavg, where
$$L_{avg} = \sum_{k=0}^{L-1} l(r_k)\, p(r_k)$$
l(rk) is the number of bits used to represent gray-level rk
p(rk) is the probability of gray-level rk in the image

(120) 2010/9/10. 239. This will however NOT work since the code is ambiguous. What does for example the code 010 mean? Use Huffman coding! 240. 120.

(121) Huffman Coding
• One way to tend towards the smallest possible number of code symbols per source symbol.
• Step 1
– sort the gray-levels by decreasing probability
– add the two smallest probabilities
– sort the new value into the list
– repeat until only two probabilities remain
• Step 2
– give the code 0 to the highest probability, and the code 1 to the lowest probability in the present node
– go backwards through the tree and add 0 to the highest and 1 to the lowest probability in each node until all gray-levels have a unique code
Huffman Coding
• Build the Huffman Tree
• Generate the Huffman Code
• Calculate CR & RD: CR=n1/n2, RD=1-1/CR
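The two steps above can be sketched with a priority queue. This is an illustrative sketch only: the symbol names and probabilities are made up, the helper name `huffman_code` is hypothetical, and the comparison baseline for CR is assumed to be a fixed 2-bit code.

```python
import heapq
from itertools import count

def huffman_code(probs):
    """Return {symbol: bitstring} for a dict {symbol: probability}."""
    tie = count()                          # tie-breaker so dicts are never compared
    heap = [(p, next(tie), {sym: ""}) for sym, p in probs.items()]
    heapq.heapify(heap)
    if len(heap) == 1:
        return {sym: "0" for sym in probs}
    while len(heap) > 1:
        p_lo, _, lo = heapq.heappop(heap)  # lowest probability in this node
        p_hi, _, hi = heapq.heappop(heap)  # next lowest probability
        merged = {s: "1" + c for s, c in lo.items()}       # 1 to the lowest
        merged.update({s: "0" + c for s, c in hi.items()})  # 0 to the highest
        heapq.heappush(heap, (p_lo + p_hi, next(tie), merged))
    return heap[0][2]

probs = {"a": 0.4, "b": 0.3, "c": 0.2, "d": 0.1}   # made-up example
code = huffman_code(probs)
l_avg = sum(probs[s] * len(code[s]) for s in probs)  # average bits per symbol
cr = 2 / l_avg                                       # versus a fixed 2-bit code
rd = 1 - 1 / cr
```

For this distribution the code lengths come out as 1, 2, 3 and 3 bits, so the average is 1.9 bits per symbol and the relative redundancy removed is 5%.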

(122) 2010/9/10. Huffman Coding • The Huffman code results in an unambiguous code, i.e. no code can be created by combining other codes • The code is reversible without loss. • The table for the translation of the code has to be stored together with the coded image (dictionary) • The Huffman code does not take correlation between adjacent pixels into consideration. 243. Huffman Coding • Decoding – Side Information • Dictionary. – Coded Bits. • Coding Rate = Uncompressed Size / (Side Information + Coded Bits). 244. 122.

(126) 2010/9/10. 251. Bit-plane Coding • Divide the grayscale/color image into a series of binary images (one image per bit). Code each image separately using the above described methods. An 8-bit image will be represented by 8 coded binary images. • PNG. 252. 126.

(127) Psycho-Visual Redundancy
• If the image will only be used for visual observation (e.g. illustrations on the web), a lot of the information is usually psycho-visually redundant. It can be removed without changing the visual quality of the image. This kind of compression is usually irreversible.
Psycho-visual redundancy is often reduced by quantization. Example:
► Uniform quantization of gray-levels
– remove the least significant bits of the data
– causes edge effects
► The edge effects can be reduced by "Improved Gray Scale" (IGS)
– Remove the least significant bits and add a “random number” based on the sum of the least significant bits of the present and the previous pixel.
– Special case: if the gray-level of a pixel in an 8-bit image is 1111xxxx, add 0000.
– IGS reduces edge effects but at the same time smooths true edges.

(128) 2010/9/10. Grey-level Quantization. 255. IGS. 256. 128.

(129) 2010/9/10. More Quantification Methods • Motion pictures ► method 1: 1. transfer the first image to the observer 2. find the changes from the previous image 3. transfer only the changes ► method 2: 1. transfer the most important information (e.g. the lowest frequencies) first 2. send the less important information later 257. Transform coding 1. Divide the image into nxn sub-images 2. Transform each sub-image using a reversible transform (e.g. the Hotelling transform, the Discrete Fourier Transform (DFT), the Discrete Cosine Transform (DCT)). 3. Quantize, i.e. truncate the transformed image, (for example with DFT and DCT frequencies with small amplitude can be removed without much information loss). The quantification can be either image dependent (IDP) or image independent (IIP). 4. Code the resulting data, normally using some kind of "variable length coding", for example Huffman code. ► The coding is not reversible (unless step 3 is skipped) 258. 129.

(130) Image/Video Formats
• JPEG (Joint Photographic Experts Group) – exists in many different versions but is always some kind of transform coding. JPEG is not reversible due to quantization.
• MPEG (Motion Pictures Experts Group) – Similar to JPEG, but the motion in comparison to the previous image is calculated and used in the compression.
Choice of Image Format
• Images to be used for image analysis should always be saved in a lossless format!
• Images for the WWW have to be either GIF/PNG or JPEG
• Rule of thumb
– Choose GIF/PNG for graphs and hand-drawn figures with few color shades (JPEG transform coding and truncation can cause artifacts around sharp edges)
– Choose JPEG for photos and figures with many colors and smooth transitions between colors (GIF reduces the number of colors to 256).

(131) JPEG2000
• ISO group Joint Photographic Experts Group, ISO/IEC JTC 1/SC 29/WG 1
• Wavelets instead of DCT
• Compression up to 1:50 without noticeable artifacts (depends on the image, of course!).

(132) JPEG Scheme. Input Image (24 bit) → RGB to YUV → Downsampling → 8x8 partitions → Discrete Cosine Transform → Quantization → Zig-Zag Scan → Entropy Coding → Output Bit Stream (1101010100010011101...).

(133) 2010/9/10. RGB to YUV • Y: Luminance • U, V: Chrominance. 265. Downsampling • The human eye can see more detail in the Y component (brightness) than in U and V • 4:2:0 downsampling. Y. U. V 266. 133.

(134) 2010/9/10. Block Splitting • Each channel is split into 8x8 blocks. 267. Discrete Cosine Transform • Like any Fourier-related transform, discrete cosine transforms (DCTs) express a function or a signal in terms of a sum of sinusoids with different frequencies and amplitudes.. 268. 134.

(135) 2010/9/10. DCT • Example. 269. DCT - [128]8x8 =. 270. 135.

(136) 2010/9/10. DCT. Aj,k 271. Quantization. Qj,k 272. 136.

(137) 2010/9/10. Quantization. Bj,k 273. Zig-Zag Scan. 274. 137.

(138) Entropy Coding
• Run-Length Encoding (RLE)
• Huffman Coding (VLC)
• Arithmetic Coding is supported
– 5% smaller than Huffman Coding
– Slower in encoding and decoding
– Patent
Encoding & Decoding.

(139) 2010/9/10. Decoding. Dequantization. Inverse DCT. +128 277. Decoding PSNR = 32.56. Difference Matrix:. Original. Compressed 278. 139.

(140) Arithmetic Coding
• Unlike Variable Length Coding, in arithmetic coding a one-to-one correspondence between source symbols and code words does not exist.
• An entire sequence of source symbols is assigned a single arithmetic code word.
• The code word itself defines an interval between 0 and 1.
Arithmetic Coding. Input sequence: a1 a2 a3 a3 a4
Set low to 0.0
Set high to 1.0
While there are still input symbols do
  get an input symbol
  range = high - low
  high = low + range * high_range(symbol)
  low = low + range * low_range(symbol)
End of While
output a number in [low, high)

(141) Arithmetic Coding.
Input    Symbol range   Low       High
(start)                 0         1
a1       [0.0, 0.2)     0         0.2
a2       [0.2, 0.4)     0.04      0.08
a3       [0.4, 0.8)     0.056     0.072
a3       [0.4, 0.8)     0.0624    0.0688
a4       [0.8, 1.0)     0.06752   0.0688
Output: 0.068
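The interval narrowing in this table can be reproduced with a few lines of code. The sketch below uses exact fractions to avoid floating-point drift; the names `RANGES`, `encode` and `decode` are hypothetical, and the symbol ranges are the ones listed on the slide. The decoder is the inverse walk: find the symbol whose range contains the number, then rescale.

```python
from fractions import Fraction as F

# Symbol ranges from the slide: a1 [0, 0.2), a2 [0.2, 0.4), a3 [0.4, 0.8), a4 [0.8, 1.0)
RANGES = {
    "a1": (F(0), F(2, 10)),
    "a2": (F(2, 10), F(4, 10)),
    "a3": (F(4, 10), F(8, 10)),
    "a4": (F(8, 10), F(1)),
}

def encode(symbols, ranges=RANGES):
    """Return the final [low, high) interval for a symbol sequence."""
    low, high = F(0), F(1)
    for s in symbols:
        width = high - low
        lo_s, hi_s = ranges[s]
        high = low + width * hi_s
        low = low + width * lo_s
    return low, high

def decode(number, n_symbols, ranges=RANGES):
    """Recover n_symbols symbols from a number inside the final interval."""
    out = []
    for _ in range(n_symbols):
        for s, (lo_s, hi_s) in ranges.items():
            if lo_s <= number < hi_s:
                out.append(s)
                number = (number - lo_s) / (hi_s - lo_s)
                break
    return out

low, high = encode(["a1", "a2", "a3", "a3", "a4"])
message = decode(F(68, 1000), 5)           # 0.068 lies inside [low, high)
```

Any number inside the final interval identifies the sequence, which is why the slide can output the short value 0.068 rather than either end point.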

(142) Arithmetic Coding
• Decoding
get encoded number
Do
  find symbol whose range straddles the encoded number
  output the symbol
  range = symbol high value - symbol low value
  subtract symbol low value from encoded number
  divide encoded number by range
until no more symbols
Arithmetic Coding
Number   Output   Low   High   Next number
0.068    a1       0     0.2    (n – 0) / 0.2 = 0.34
0.34     a2       0.2   0.4    (n – 0.2) / 0.2 = 0.7
0.7      a3       0.4   0.8    (n – 0.4) / 0.4 = 0.75
0.75     a3       0.4   0.8    (n – 0.4) / 0.4 = 0.875
0.875    a4       0.8   1.0
Use EOF or a given symbol length to stop decoding.

(143) Arithmetic Coding
• Where’s the beef?
• Consider the sequence “AAAAAAAAAB”, where the probability of “A” is 90%.
• The Huffman code will be 10 bits…
Arithmetic Coding
Input    Symbol range   Low            High
(start)                 0              1
A        [0, 0.9)       0              0.9
A        [0, 0.9)       0              0.81
A        [0, 0.9)       0              0.729
A        [0, 0.9)       0              0.6561
A        [0, 0.9)       0              0.59049
A        [0, 0.9)       0              0.531441
A        [0, 0.9)       0              0.4782969
A        [0, 0.9)       0              0.43046721
A        [0, 0.9)       0              0.387420489
B        [0.9, 1.0)     0.3486784401   0.387420489
Output: 0.35

(144) Arithmetic Coding
• For the sequence “AAAAAAAAAB”:
– 0.35 can uniquely represent the sequence.
– 0.35 will need less than 7 bits to specify!
– Each symbol costs 0.7 bits on average, which is not possible in Huffman Coding.
• Side Information
– Also, frequency table.
– Symbol length
Practice
• Encode the sequence (write the output number): AAAAB, and decode the number to 5 digits: 0.91.

(145) Shannon Entropy
• Represents a fundamental mathematical limit on the best possible lossless data compression.
$$H(X) = -\sum_{i=1}^{n} p(x_i) \log_2 p(x_i)$$
n = 256 for 8-bit data; entropy ≤ arithmetic coding ≤ Huffman coding
Shannon Entropy
• For the sequence “AAAAABBBBB”: H(X) = -0.5 * Log2(0.5) – 0.5 * Log2(0.5) = 1 (bit per symbol)
• For the sequence “AAAAAAAAAB”: H(X) = -0.9 * Log2(0.9) – 0.1 * Log2(0.1) = 0.468995594 (bit per symbol). Total bits = 4.68995594 (theoretical limit)
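The two entropy values on this slide can be checked numerically. A small sketch, assuming symbol probabilities are estimated from their relative frequencies in the sequence (the function name `entropy_bits` is hypothetical):

```python
import math
from collections import Counter

def entropy_bits(sequence):
    """Shannon entropy in bits per symbol, from relative symbol frequencies."""
    counts = Counter(sequence)
    n = len(sequence)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

h_even = entropy_bits("AAAAABBBBB")        # two equally likely symbols
h_skew = entropy_bits("AAAAAAAAAB")        # 90% / 10% split
total_bits = h_skew * len("AAAAAAAAAB")    # theoretical limit for the sequence
```

The skewed sequence needs under half a bit per symbol in theory, which is the gap that arithmetic coding can exploit and a one-bit-minimum Huffman code cannot.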

(146) Lossless Predictive Coding
• An error-free compression approach
• Based on eliminating the inter-pixel redundancies of closely spaced pixels.
• The new information of a pixel is defined as the difference between the actual and predicted value of that pixel.
Predictive Coding
• Input: $f_n$, predictor output: $\hat{f}_n$, prediction error: $e_n = f_n - \hat{f}_n$

(147) 2010/9/10. Sunset Algorithm • The state-of-the-art LIC scheme – A prediction step. – The determination of a context – A probabilistic model for the prediction residual (or errors) 293. Predictor: MED • Median Edge Detector • Adopted in JPEG-LS. 294. 147.

(148) Predictor: MED.
• It tends to pick b in cases where a vertical edge exists to the left of the current pixel, and a in cases where a horizontal edge exists above it.
MED Example. input image. MED prediction. prediction error.
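The predictor formula itself was on the slide image and did not survive extraction. The sketch below follows the MED rule as commonly stated for JPEG-LS (LOCO-I), with a = left, b = above and c = upper-left neighbor; treat that naming and the function name `med_predict` as assumptions.

```python
def med_predict(a, b, c):
    """Median edge detector: a = left, b = above, c = upper-left neighbor."""
    if c >= max(a, b):
        return min(a, b)
    if c <= min(a, b):
        return max(a, b)
    return a + b - c                       # smooth region: planar prediction

# Vertical edge to the left (left column dark, rest bright): predictor picks b.
left_edge = med_predict(a=10, b=100, c=10)
# Horizontal edge above (upper row dark, current row bright): predictor picks a.
top_edge = med_predict(a=100, b=10, c=10)
```

The prediction error is then the actual pixel value minus this prediction, and it is the errors that get entropy coded.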

(149) 2010/9/10. MED Example. Prediction errors H = 2.065451402. Original image H = 2.346819315. 297. Comparison • Difference method. input image. Prediction errors H = 2.917405816. 298. 149.

(150) 2010/9/10. CALIC • A Context Based Adaptive Lossless Image Codec • Predictor: GAP (Gradient Adjusted Prediction). 299. GAP / MED. GAP H=4.39 bpp. MED H=4.56 bpp 300. 150.

(151) Predictor: MMSE
• An optimal single predictor
• High computational complexity
• Tends to be slow in both encoding and decoding.
Predictor: MMSE
• Minimum Mean Square Error:
– Ordering the causal neighbors.
– N-th order linear predictor.

(152) Predictor: MMSE. Let. Hence.
Predictor: MMSE
• To minimize the L2-norm of an error vector:. The solution can be carried out through LS-optimization.

(153) Summary
• In this chapter, we learned:
– To compress images in different ways
– How to measure our results (CR, PSNR)
– Some general-purpose compression algorithms (Huffman, RLE, arithmetic)
– Transform coding
– JPEG
– Lossless predictive image compression
Digital Image Processing – Image Segmentation.

(154) 2010/9/10. 307. Segmentation • Full segmentation: Individual objects are separated from the background and given individual ID numbers (labels). • Partial segmentation: The amount of data is reduced (usually by separating objects from background) to speed up the further processing. • Segmentation is often the most difficult problem to solve in the process; there is no universal solution! • The problem can be made much easier if solved in cooperation with the constructor of the imaging system (choice of sensors, illumination, background etc) .. 308. 154.

(155) 2010/9/10. Types of Segmentation • Classification – Based on some similarity measure between pixel values. The simplest form is thresholding. • Edge-based – Search for edges in the image. They are then used as borders between regions • Region-based – Region growing, merge & split. Common idea: search for discontinuities or/and similitudes in the image 309. Thresholding global or local • global: based on some kind of histogram: greylevel, edge, feature etc. – Lighting conditions are extremely important, and it will only work under very controlled circumstances.. • Fixed thresholds: the same value is used in the whole image • local (or dynamic thresholding): depends on the position in the image. The image is divided into overlapping sections which are thresholded one by one. 310. 155.

(156) Classical Automatic Thresholding Algorithm
1. Select an initial estimate for T (usually 128)
2. Segment the image using T. This produces 2 groups: G1, pixels with value > T, and G2, pixels with value ≤ T
3. Compute μ1 and μ2, the average pixel values of G1 and G2
4. New threshold: T = 0.5(μ1 + μ2)
5. Repeat steps 2 to 4 until T stabilizes.
Very easy + very fast. Assumptions: normal dist. + low noise
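The five steps above translate almost directly into code. In this sketch the stopping tolerance `eps`, the iteration cap `max_iter` and the function name are assumptions; the slide only says to repeat until T stabilizes.

```python
import numpy as np

def iterative_threshold(img, t0=128.0, eps=0.5, max_iter=100):
    """Iterate T = 0.5 * (mean above T + mean at or below T) until it settles."""
    img = np.asarray(img, dtype=float)
    t = float(t0)
    for _ in range(max_iter):
        g1 = img[img > t]                  # group G1
        g2 = img[img <= t]                 # group G2
        if g1.size == 0 or g2.size == 0:
            break                          # one group is empty: keep current T
        t_new = 0.5 * (g1.mean() + g2.mean())
        if abs(t_new - t) < eps:
            return t_new
        t = t_new
    return t

# Two well-separated populations: T ends up midway between their means.
pixels = np.array([10, 10, 10, 200, 200, 200])
t_final = iterative_threshold(pixels)
```

For this bimodal example the threshold settles at 105, halfway between the two group means.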

(157) 2010/9/10. 313. Optimal Thresholding • Based on the shape of the current image histogram. Search for valleys, Gaussian distributions etc.. 314. 157.

(158) 2010/9/10. Histograms. 315. Thresholding and Illumination • Solutions: – Calibration of the imaging system – Percentile filter with very large mask – Morphological operators. 316. 158.

(159) 2010/9/10. MR non-uniformity. 317. Basic Adaptive Thresholding • Use subimages. 318. 159.

(160) 2010/9/10. 319. More Thresholding • Can also be used on other kinds of histogram: grey-level, edge, feature etc. – Multivariate data. • Problems: – Only considers the gray-level pixel value, so it can leave “holes” in segmented objects. • Solution: post-processing with morphological operators. – Requires strong assumptions to be efficient – Local thresholding is better ⇒ see region growing techniques 320. 160.

(161) 2010/9/10. Point Detection. 321. Line Detection. 322. 161.

(162) Line Detection.
Edge-Based Segmentation
• Based on finding discontinuities (local variations of image intensity)
1. Apply an edge detector, e.g. a gradient operator (Sobel) or a second derivative (Laplace)
2. Threshold the edge image to get a binary image
3. Depending on the type of edge detector:
– Link edges together to close shapes (using edge direction, for example)
– Remove spurious edges

(163) 2010/9/10. 325. Model of Edges. 326. 163.

(165) 2010/9/10. 329. Gradient Based Procedure. 330. 165.

(167) 2010/9/10. Diagonal Edges. 333. Zero-Crossing Based Procedure. 334. 167.

(168) 2010/9/10. Laplacian of Gaussian. 335. Sobel Gradient. LoG. Thresholding. Zero-Crossing. 336. 168.

(169) 2010/9/10. Edge-Based Segmentation. 337. Edge Linking • Local Processing – Analyze the characteristics of pixels in a small neighborhood (say, 3x3 or 5x5) – Two principal properties used for establishing similarity of edge pixels: • The strength of gradient • The direction. 338. 169.

(170) 2010/9/10. Edge Linking Local Processing. 339. • • •. The Hough Transform. Global Processing A method for finding global relationships between pixels. Example: We want to find straight lines in an image 1. Apply edge enhancing filter (ex: Laplace) 2. Set a threshold for what filter response is considered a true ”edge pixel” 3. Extract the pixels that are on a straight line using the Hough transform. 340. 170.

(171) The Hough Transform.
• Finding straight lines:
1. Consider a pixel in position (xi, yi).
2. Equation of a straight line: yi = a xi + b.
3. Set b = -a xi + yi and draw this (single) line in “ab-space” (also called parameter space).
4. Consider the next pixel with position (xj, yj) and draw the line b = -a xj + yj in ab-space. The point (a’, b’) where the two lines intersect represents the line y = a’x + b’ in xy-space, which goes through both (xi, yi) and (xj, yj).
5. Draw the line in ab-space corresponding to each pixel in xy-space.
6. Divide ab-space into accumulator cells and find the most common (a’, b’), which gives the line connecting the largest number of pixels.
The Hough Transform.

(172) 2010/9/10. The Hough Transform. 343. The Hough Transform • In reality we have a problem with y=ax+b because a reaches infinity for vertical lines.. 344. 172.
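A common way around the infinite-slope problem is the normal parameterization ρ = x cos θ + y sin θ, where every line, including vertical ones, has finite parameters. The sketch below accumulates votes in (ρ, θ) space; the function name `hough_lines`, the 1-degree angular step and the 1-pixel ρ step are assumptions chosen for illustration.

```python
import numpy as np

def hough_lines(edges, n_theta=180):
    """Vote in (rho, theta) space for every edge pixel of a binary image."""
    edges = np.asarray(edges, dtype=bool)
    rows, cols = edges.shape
    diag = int(np.ceil(np.hypot(rows, cols)))
    thetas = np.deg2rad(np.arange(n_theta) * 180.0 / n_theta)  # [0, 180) degrees
    acc = np.zeros((2 * diag + 1, n_theta), dtype=int)         # rho in [-diag, diag]
    ys, xs = np.nonzero(edges)
    for x, y in zip(xs, ys):
        # one sinusoid per pixel: rho as a function of theta
        rho = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        acc[rho + diag, np.arange(n_theta)] += 1
    return acc, thetas, diag

# A vertical line x = 7, which y = ax + b cannot represent.
edges = np.zeros((20, 20), dtype=bool)
edges[:, 7] = True
acc, thetas, diag = hough_lines(edges)
votes = acc[7 + diag, 0]                   # cell for rho = 7, theta = 0
```

All 20 pixels of the vertical line vote for the cell ρ = 7, θ = 0. Neighboring angles may collect equally many votes at this coarse resolution, which is one reason peak detection in the accumulator usually involves some smoothing or non-maximum suppression.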

(173) 2010/9/10. The Hough Transform. 345. The Hough Transform • It is common to use ”filters” for finding the intersection: ”butterfly filters” • Different variations of the Hough transform can also be used for finding other shapes of the form g(v,c)=0, v is a vector of coordinates, c is a vector of coefficients. • Possible to find any kind of simple shape ex. circle: (3D parameter space). 346. 173.

(174) The Hough Transform & Edge Linking.
Canny Edge Detector
“A Computational Approach to Edge Detection”, John Canny, Member, IEEE. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PAMI-8, no. 6, November 1986.

(175) 2010/9/10. Canny Edge Detector • The performance criteria – Good detection – Good localization – Only one response to a single edge. 349. Good Detection • Apply 2D Gaussian smoothing operator：. • Convolve the image with an operator Gn. where n is the direction, And f is the image 350. 175.

(176) 2010/9/10. Good Detection. 351. Good Localization • The edge location is at the local maximum of the image f convolved with Gn :. (non-maximal suppression) 352. 176.

(177) Edge Strength.
Canny Edge Detector
• How to find local-maximum edges:
• Non-maximum suppression algorithm
– Quantize edge directions to eight directions (8-connectivity).
– For each pixel (x, y) with non-zero edge magnitude M(x,y), inspect its two adjacent pixels indicated by the direction of its edge.
– If neither of the two adjacent pixels has a magnitude larger than M(x,y), record the pixel (x,y) as an edge candidate.

(178) One Response
• Streaking problem – breaking up of an edge
• Streaking can be eliminated by thresholding with hysteresis.
– high threshold
– low threshold
Canny Edge Detector
• How to threshold an edge:
• Algorithm to filter out spurious edges.
1. Mark all edge pixels with magnitudes greater than t1 as correct.
2. Scan all pixels with edge magnitude in the range [t0, t1].
3. If such a pixel borders another already marked as an edge, then mark it too.
4. Repeat from Step 2 until stability.
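The hysteresis procedure on this slide can be sketched as a small fixed-point loop. Assumptions in this sketch: "borders" is taken to mean 8-connectivity, the input is an array of edge magnitudes, and the function name `hysteresis` is hypothetical.

```python
import numpy as np

def hysteresis(mag, t0, t1):
    """Keep pixels above t1, plus pixels in [t0, t1] connected to them."""
    mag = np.asarray(mag, dtype=float)
    edge = mag > t1                        # strong pixels, marked as correct
    weak = (mag >= t0) & ~edge             # candidates in the range [t0, t1]
    changed = True
    while changed:
        # near[p] is True if any of the 8 neighbours of p is already an edge
        padded = np.pad(edge, 1, mode="constant", constant_values=False)
        near = np.zeros_like(edge)
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                if dx or dy:
                    near |= padded[1 + dx:1 + dx + edge.shape[0],
                                   1 + dy:1 + dy + edge.shape[1]]
        grow = weak & near & ~edge
        changed = bool(grow.any())
        edge |= grow                       # repeat until nothing new is marked
    return edge

# One strong pixel (10) pulls in the chain of weak pixels next to it;
# the isolated weak pixel at the end is rejected.
row = np.array([[0, 5, 5, 10, 0, 5]])
result = hysteresis(row, t0=4, t1=8)
```

The isolated weak response is dropped while the weak pixels attached to a strong one survive, which is how hysteresis avoids streaking without admitting noise.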

(179) 2010/9/10. Canny Edge Detector. 357. Canny – Examples. Lena (512x512, gray level). sigma=1.0, LT=0, HT=255 358. 179.

(180) 2010/9/10. Canny – Examples. sigma=0.5, LT=0, HT=255. sigma=0.75, LT=0, HT=255 359. Canny – Examples. sigma=1.5, LT=0, HT=255. sigma=2.0, LT=0, HT=255 360. 180.

(181) 2010/9/10. Region Based Segmentation • Work by extending some region based on local similarities between pixels – region growing (bottom-up method) – region splitting and merging (top-down method). • Bottom-up: from data to representation • Top-down: from model to data. 361. Basic Formulation • R: entire image region • Segmentation: R is partitioned into R1,...,Rn, such that. P(Rk) is a logical predicate over Rk 362. 181.

(182) 2010/9/10. Region Growing (bottom-up method) 1. Find starting points (seeds) 2. Include neighboring pixels with similar features (grey-level, texture, color). 3. Continue until all pixels have been included with one of the starting points. •. Problems: – Not trivial to find good starting points, difficult to automate – Need good criteria for similarity. 363. 364. 182.

(183) 2010/9/10. Region Splitting & Merging A region splitting algorithm – Let R represent the entire image and select a predicate P – if P(R)=FALSE, we divide R into quadrants – Repeat the previous step for each quadrant, respectively until all P(R)=TRUE. 365. Region Splitting & Merging. 366. 183.

(184) Region Splitting & Merging
• Splitting & Merging
– Split into four disjoint quadrants any region R for which P(R)=FALSE
– Merge any adjacent regions for which P(merged)=TRUE
– Stop when no further merging or splitting is possible.
Region Splitting & Merging.

(185) Morphological Watershed
• A well-known region-based segmentation method.
• Often produces more stable segmentation results.

Watershed – Basic Concepts
[figure labels: watershed lines; catchment basins]

(186) Watershed – Basic Concepts [two figure slides]

(187) Watershed – Basic Concepts
• The watershed is computed on the (morphological) gradient of the image. You can use Sobel operators to approximate it.
• [figure panels: Image; Gradient; Watershed; Segmentation result]

Watershed – Dam Construction [figure]
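As a small illustration of approximating the gradient with Sobel operators, the following Python sketch computes the gradient magnitude of a grey-level image stored as a list of rows. Leaving the one-pixel border at zero is an assumption made to keep the example short, and the function name is hypothetical.

```python
import math

SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def sobel_magnitude(image):
    """Return sqrt(gx^2 + gy^2) per pixel; border pixels are left at 0."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            gx = sum(SOBEL_X[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(SOBEL_Y[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            out[y][x] = math.hypot(gx, gy)
    return out


# A vertical step edge: dark on the left, bright on the right.
step = [
    [0, 0, 10, 10],
    [0, 0, 10, 10],
    [0, 0, 10, 10],
]
gradient = sobel_magnitude(step)
```

The gradient is large along the step and zero elsewhere, so the step becomes a ridge in the gradient image, which is where watershed lines tend to form.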

(188) Watershed [two figure slides]

(189) Watershed – Use of Markers [two figure slides]

(190) Labeling Algorithm
• Not segmentation itself, but similar to region-growing segmentation.
• A common technique in segmentation post-processing.
• Also known as “connected component” labeling.

Labeling Algorithm
[figure: 4-way vs. 8-way labeling]

(191) Labeling Algorithm
• Counting objects in an image.

Labeling Algorithm
Recursion (4-way)
    n := 0;
    visited(I) := false;
    for each P in image I do
        if not_visited(P) then
            LabelCC(P, n);
            n := n+1;
    end

    function LabelCC(P, n)
        label(P) := n;
        visited(P) := true;
        if f(P)=f(Pleft) and not_visited(Pleft) then LabelCC(Pleft, n);
        if f(P)=f(Pright) and not_visited(Pright) then LabelCC(Pright, n);
        if f(P)=f(Ptop) and not_visited(Ptop) then LabelCC(Ptop, n);
        if f(P)=f(Pbottom) and not_visited(Pbottom) then LabelCC(Pbottom, n);

Deep recursion will easily cause a stack overflow!
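One way around the stack overflow is to replace the recursive calls with an explicit stack. The Python sketch below is one possible rendering of 4-way labeling, with hypothetical names. As in the pseudocode, every pixel (background included) receives a label. Unlike the pseudocode, a pixel is marked when it is pushed rather than when it is popped, so that no pixel is pushed twice.

```python
def label_components(image):
    """Return (labels, count) for 4-connected regions of equal value."""
    rows, cols = len(image), len(image[0])
    labels = [[None] * cols for _ in range(rows)]   # None means "not visited"
    n = 0
    for sy in range(rows):
        for sx in range(cols):
            if labels[sy][sx] is not None:
                continue
            labels[sy][sx] = n
            stack = [(sy, sx)]
            while stack:
                y, x = stack.pop()
                # left, right, top, bottom neighbours
                for ny, nx in ((y, x - 1), (y, x + 1), (y - 1, x), (y + 1, x)):
                    if 0 <= ny < rows and 0 <= nx < cols \
                            and labels[ny][nx] is None \
                            and image[ny][nx] == image[y][x]:
                        labels[ny][nx] = n
                        stack.append((ny, nx))
            n += 1
    return labels, n


binary = [
    [1, 1, 0],
    [0, 1, 0],
    [1, 0, 0],
]
labels, count = label_components(binary)
```

Counting objects then amounts to counting the labels whose pixels have the object value (here, the components made of 1s).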

(192) Labeling Algorithm
Iteration (4-way), using a stack to simulate recursion
    n := 0;
    visited(I) := false;
    for each P in image I do
        if not_visited(P) then
            LabelCC(P, n);
            n := n+1;
    end

    function LabelCC(P, n)
        push(P);
        while not_empty(stack)
            P := pop();
            label(P) := n;
            visited(P) := true;
            if f(P)=f(Pleft) and not_visited(Pleft) then push(Pleft);
            if f(P)=f(Pright) and not_visited(Pright) then push(Pright);
            if f(P)=f(Ptop) and not_visited(Ptop) then push(Ptop);
            if f(P)=f(Pbottom) and not_visited(Pbottom) then push(Pbottom);
        end

Digital Image Processing – Wavelets and Multiresolution Processing

(193) Preview
• Besides the Fourier transform, the wavelet transform is now making it even easier to compress, transmit, and analyze images.
• Unlike the Fourier transform, whose basis functions are sinusoids, wavelet transforms are based on small waves, called wavelets.

Preview
• In 1987, wavelets were first shown to be the foundation of a powerful new approach to signal processing and analysis called multiresolution theory.
• The appeal of such an approach is obvious: features that might go undetected at one resolution may be easy to spot at another.

(194) Preview [figure]
[figure labels: require lower resolution; require higher resolution]

(195) Continuous Wavelet Transform
• Mathematical Background
  – L2(R) denotes the space of all square-integrable functions.

Multiresolution Expansions
• Multiresolution Analysis (MRA)
• In MRA, a scaling function is used to create a series of approximations of a function (or image).
  – Lowpass
• Additional functions, called wavelets, are then used to encode the difference in information between adjacent approximations.
  – Highpass
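To make the lowpass/highpass idea concrete, here is a minimal one-level sketch in Python. It assumes the Haar averaging and differencing pair as the simplest possible choice of filters, and the function names are hypothetical. The approximation holds scaled pairwise sums (lowpass), and the detail holds scaled pairwise differences (highpass), which together allow exact reconstruction.

```python
import math


def haar_step(signal):
    """Split an even-length signal into (approximation, detail) halves."""
    s = math.sqrt(2)
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) / s for a, b in pairs]   # lowpass: coarse approximation
    detail = [(a - b) / s for a, b in pairs]   # highpass: what the approximation lost
    return approx, detail


def haar_inverse(approx, detail):
    """Rebuild the finer signal from one approximation/detail pair."""
    s = math.sqrt(2)
    out = []
    for a, d in zip(approx, detail):
        out += [(a + d) / s, (a - d) / s]
    return out


signal = [4.0, 6.0, 10.0, 12.0]
approx, detail = haar_step(signal)
restored = haar_inverse(approx, detail)
```

Applying `haar_step` again to `approx` would give the next coarser approximation, which is the sense in which the approximations form a series.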

(196) Series Expansions
• A signal or function f(x) can be analyzed as a linear combination of expansion functions:
  $f(x) = \sum_k \alpha_k \varphi_k(x)$
  αk: coefficients; φk: expansion functions.
• If the expansion is unique, the φk(x) are called basis functions, and the set {φk(x)} is called a basis.

Series Expansions
• The closed span of the basis:
  $V = \overline{\operatorname{Span}_k \{\varphi_k(x)\}}$
• For an expansion set {φk(x)}, there is a set of dual functions $\{\tilde{\varphi}_k(x)\}$ that can be used to compute the αk coefficients:
  $\alpha_k = \langle \tilde{\varphi}_k(x), f(x) \rangle = \int \tilde{\varphi}_k^*(x)\, f(x)\, dx$

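(197) Series Expansions
• Case 1: If the expansion functions form an orthonormal basis for V, meaning that
  $\langle \varphi_j(x), \varphi_k(x) \rangle = \delta_{jk} = \begin{cases} 0, & j \neq k \\ 1, & j = k \end{cases}$
  the basis and its dual are equivalent, i.e.
  $\varphi_k(x) = \tilde{\varphi}_k(x)$, so that $\alpha_k = \langle \varphi_k(x), f(x) \rangle$

Series Expansions
• Case 2: If the expansion functions are not orthonormal but are orthogonal, then
  $\langle \varphi_j(x), \varphi_k(x) \rangle = 0, \quad j \neq k$
  and the basis functions and their duals are called biorthogonal. The biorthogonal basis and its dual are such that
  $\langle \varphi_j(x), \tilde{\varphi}_k(x) \rangle = \delta_{jk}$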
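A small numeric illustration of Case 1, using vectors in place of functions so that the inner product is a plain sum. The two-vector basis below is an assumed example and the helper names are hypothetical. Because the basis is orthonormal, each coefficient is simply the inner product of the signal with the corresponding basis vector.

```python
import math


def inner(u, v):
    """Inner product of two real vectors."""
    return sum(a * b for a, b in zip(u, v))


def expand(f, basis):
    """Coefficients alpha_k = <phi_k, f> (valid because the basis is orthonormal)."""
    return [inner(phi, f) for phi in basis]


def reconstruct(alphas, basis):
    """f = sum_k alpha_k * phi_k."""
    size = len(basis[0])
    return [sum(a * phi[i] for a, phi in zip(alphas, basis)) for i in range(size)]


s = 1 / math.sqrt(2)
basis = [(s, s), (s, -s)]          # orthonormal: unit length, mutually orthogonal
f = (3.0, 1.0)
alphas = expand(f, basis)
rebuilt = reconstruct(alphas, basis)
```

If the basis were not orthonormal, `expand` would have to use the dual vectors instead of the basis vectors themselves.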

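(198) Scaling Functions
• Now consider the set of expansion functions {φj,k(x)}, where
  $\varphi_{j,k}(x) = 2^{j/2}\, \varphi(2^j x - k)$
  for all j, k ∈ Z. Here k determines the position of φj,k(x) and j determines its width (how broad or narrow it is).
• By choosing φ(x) wisely, {φj,k(x)} can be made to span L2(R).

Scaling Functions
• If we restrict j to a specific value j = j0, the resulting set {φj0,k(x)} will not span L2(R), but only a subspace of it. We define that subspace as
  $V_{j_0} = \overline{\operatorname{Span}_k \{\varphi_{j_0,k}(x)\}}$
• If f(x) belongs to Vj0, then
  $f(x) = \sum_k \alpha_k \varphi_{j_0,k}(x)$
• More generally, we define
  $V_j = \overline{\operatorname{Span}_k \{\varphi_{j,k}(x)\}}$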

(199) Example – Haar
• Consider the scaling function (Haar):
  $\varphi(x) = \begin{cases} 1, & 0 \le x < 1 \\ 0, & \text{otherwise} \end{cases}$

Example – Haar
[figure labels: coarse; fine]
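The Haar example can be explored numerically with a short Python sketch. It assumes the standard unit-height, unit-width Haar scaling function (1 on [0, 1), 0 elsewhere) and the usual scaled and translated family φj,k(x) = 2^(j/2) φ(2^j x − k). The function names are hypothetical.

```python
import math


def haar_phi(x):
    """Haar scaling function: 1 on [0, 1), 0 elsewhere."""
    return 1.0 if 0 <= x < 1 else 0.0


def phi_jk(x, j, k):
    """Scaled and translated version: k shifts the function, j sets its width."""
    return 2 ** (j / 2) * haar_phi(2 ** j * x - k)


# At scale j = 1 each function is half as wide and sqrt(2) times as tall.
left_half = phi_jk(0.25, 1, 0)    # inside the support of phi_{1,0}
outside = phi_jk(0.75, 1, 0)      # outside the support of phi_{1,0}
right_half = phi_jk(0.75, 1, 1)   # inside the support of phi_{1,1}
```

One property worth checking by hand is that the coarse function φ0,0 equals (φ1,0 + φ1,1)/√2 at every x, which is what lets a coarse approximation be written in terms of the finer one.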

(200) Example – Haar [two figure slides]

(201)