Introduction - Compression of 3D Range Data with Adaptive Thresholding Using the Integral Image

Despite decades of development, computer vision still has many opportunities for growth. A long-standing goal is for computers to interact with the natural world. To reach this goal, various instruments have been used to actively collect environmental data, and range data is a common format for such data.

Range data can be collected by many methods. The set of distances between a sensor and the objects around it is often called 3D range data. However, the amount of data collected by such instruments is usually too large to process, store, or transmit effectively.

Such data must therefore be compressed before processing. Because features in the natural world are likely to contain redundant information, 3D range data can be compressed according to the needs of the application.

This work extends “Resampling and compression on 3D range data from a LIDAR” [1]. It focuses on the compression algorithm applied to 3D range data.

We have developed an algorithm called “compression of 3D range data with adaptive thresholding using the integral image”, hereafter referred to as “C3RDATII”. C3RDATII is a lossy compression method whose compression rate can be adjusted to the demands of the application.

Adaptive thresholding is commonly used to handle images with varying illumination, and it has many applications [2]-[11]. The C3RDATII algorithm extends “Adaptive thresholding using the Integral Image” [11], proposed in 2007 and hereafter referred to as “ATII”. ATII in turn extends an earlier method, “Adaptive thresholding for the digital desk” [12], proposed in 1993 and hereafter referred to as “ATDD”.

1.1 Adaptive thresholding for the digital desk (ATDD)

ATDD decides whether a point is white or black by comparing it with a running average of nearby values computed during a serial scan. Let n be the index of a grid cell, starting at 1, let p_n be the gray value of cell n, and let s be the number of cells in the range from cell n - s + 1 to cell n, as shown in Figure 1.1.

Figure 1.1: p_n and s.

g_s(n) is the weighted value of a grid cell and is used to calculate the threshold. The formula is given in (1.1). The closer a cell is to cell n, the larger its weight. In other words, it has more influence on the threshold of cell n.

g_s(n) depends only on the values of horizontally neighboring points.

Equation (1.2) also incorporates the value of the vertically neighboring point, which gives a more balanced weighted value G_s(n). This method is used in the ATII and C3RDATII algorithms below. In the rest of this article, w is the width of the two-dimensional picture or of the two-dimensional mask array.

$$G_s(n) = \frac{g_s(n) + g_s(n - w)}{2} \tag{1.2}$$

Binarization is then performed with G_s(n) according to (1.3). B_s(n) is the binarization result for a grid cell. A value of 1 means foreground, and the data are kept. A value of 0 means background, and the data are deleted. t is a constant in the range 0 ≤ t < 100 that allows the binarization to be adjusted quickly. This formula is used in the ATII and C3RDATII algorithms.

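$$B_s(n) = \begin{cases} 1 \ (\text{foreground}) & \text{if } p_n < \left(\dfrac{G_s(n)}{s}\right)\left(\dfrac{100 - t}{100}\right) \\ 0 \ (\text{background}) & \text{otherwise} \end{cases} \tag{1.3}$$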

Evaluating B_s(n) for every grid cell yields a binarization mask. However, the ATDD method does not reconstruct edges well and relies too heavily on the progressive scan order, and vertical neighbors have only a weak influence on its result.
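The ATDD steps above can be sketched in Python as follows. This is a minimal illustration, not the original implementation. Equation (1.1) is not reproduced in this text, so the sketch assumes the recurrence g_s(n) = g_s(n - 1)(1 - 1/s) + p_n, which matches the description that nearer cells weigh more. The function name `atdd_binarize`, the start-up value of the running sum, and the handling of the first row (which has no cell above it) are also assumptions.

```python
def atdd_binarize(pixels, width, s, t):
    """Return the ATDD mask B_s(n) for a row-major list of gray values.

    Indices are 0-based here, whereas the text counts grid cells from 1.
    """
    scale = (100 - t) / 100
    g = []      # g_s(n): decaying running sum along the scan
    mask = []   # B_s(n): 1 = foreground (kept), 0 = background (deleted)
    for n, p in enumerate(pixels):
        # Assumption: the scan starts as if s cells equal to the first
        # pixel had already been seen.
        previous = g[n - 1] if n > 0 else s * pixels[0]
        # Assumed form of (1.1): older cells decay by (1 - 1/s) per step.
        g.append(previous * (1 - 1 / s) + p)
        # Assumption: the first row has no cell above, so reuse g_s(n).
        above = g[n - width] if n >= width else g[n]
        big_g = (g[n] + above) / 2                          # (1.2)
        mask.append(1 if p < (big_g / s) * scale else 0)    # (1.3)
    return mask


# A 4 x 2 picture of gray value 100 with one dark cell.
picture = [100, 100, 100, 100,
           100, 10, 100, 100]
print(atdd_binarize(picture, width=4, s=4, t=15))
# [0, 0, 0, 0, 0, 1, 0, 0]
```

Only the dark cell falls below its local threshold, so it is the only cell marked as foreground.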

1.2 Adaptive thresholding using the Integral Image (ATII)

In recent years, many improvements have been made to make the ATDD method more efficient. One of these improvements is the ATII method.

To remove ATDD's dependence on the horizontal scan order, the ATII method uses a “summed-area mask” instead of a serial scan. The value of each point in the summed-area mask is given by (1.4).

$$S(x, y) = \sum_{i=1}^{x} \sum_{j=1}^{y} p_{i,j} \tag{1.4}$$

Each value is obtained by summing all values above and to the left of the point, including the point itself.

Figure 1.2 shows an example of converting an original mask into a summed-area mask. In the ATII method, the original mask is a two-dimensional array of the color intensity values of a picture. In this example, p_1,1 = 1, p_2,1 = 2, and p_3,1 = 3, so S(3,1) = p_1,1 + p_2,1 + p_3,1 = 6.

Figure 1.2: An example of converting an original mask into a summed-area mask.
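The conversion to a summed-area mask can be sketched in Python as follows. This is an illustrative sketch under assumptions, not the ATII reference code: the function name `summed_area_table` is hypothetical, the mask is stored as a list of rows so that `mask[y][x]` corresponds to p_{x+1,y+1}, and only the first row of the sample mask comes from the text. The second row is made up for illustration.

```python
def summed_area_table(mask):
    """Return S where S[y][x] is the sum of mask[j][i] for i <= x, j <= y."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    table = [[0] * width for _ in range(height)]
    for y in range(height):
        row_sum = 0  # running sum of the current row up to column x
        for x in range(width):
            row_sum += mask[y][x]
            above = table[y - 1][x] if y > 0 else 0
            table[y][x] = row_sum + above
    return table


original = [[1, 2, 3],   # p_1,1, p_2,1, p_3,1 from the text
            [4, 5, 6]]   # hypothetical second row
print(summed_area_table(original))
# [[1, 3, 6], [5, 12, 21]]
```

The entry `table[0][2]` is 6, which agrees with S(3,1) in the example above. Each entry is computed in constant time from the entry above it and a running row sum, so the whole table needs a single pass over the mask.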

1.3 Lossless compression algorithms - DEFLATE

Part of the C3RDATII algorithm is a widely used lossless compression step. The algorithm chosen is DEFLATE, a common algorithm invented by Phil Katz and specified in RFC 1951 [13][14][15]. DEFLATE is a lossless data compression algorithm that combines the LZ77 algorithm with Huffman coding, and there is no patent restriction on its use. In this paper, data processed with the DEFLATE algorithm are stored in the zip format.
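As a rough illustration of this step, the sketch below writes a byte string into an in-memory zip archive with DEFLATE and reads it back using Python's standard `zipfile` module. The function names, the member name `ranges.bin`, and the sample bytes are hypothetical, and nothing here reflects how the original work invoked DEFLATE.

```python
import io
import zipfile


def to_zip_bytes(name, data):
    """Store data under the given member name in a DEFLATE-compressed zip."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(name, data)
    return buffer.getvalue()


def from_zip_bytes(name, blob):
    """Read one member back from zip bytes produced by to_zip_bytes."""
    with zipfile.ZipFile(io.BytesIO(blob)) as archive:
        return archive.read(name)


# Hypothetical, highly redundant sample standing in for masked range data.
sample = bytes([0, 0, 1, 0]) * 1000
blob = to_zip_bytes("ranges.bin", sample)
restored = from_zip_bytes("ranges.bin", blob)
print(restored == sample, len(blob) < len(sample))
```

Because DEFLATE is lossless, the restored bytes are identical to the input. The size reduction depends on how much redundancy remains in the data after the lossy thresholding stage.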

1.4 Velodyne HDL-64E S2

The range sensor used to collect data in this work is the Velodyne HDL-64E S2.

The Velodyne HDL-64E S2 was state-of-the-art range-sensing equipment at the time of writing. Figure 1.3 shows the device. Its sensor head can rotate at up to 15 revolutions per second to collect 3D range data from the surroundings. It sends raw data over Ethernet using the User Datagram Protocol (UDP). The device's specifications are listed in Table 1.1.

Figure 1.3: Velodyne HDL-64E S2 Laser Rangefinder.

Table 1.1: Specification of Velodyne HDL-64E S2

Sensor:
- 64 lasers/detectors
- 0.09 degree angular resolution (azimuth)
- 26.8 degree vertical field of view (elevation): +2° up to -24.8° down, with 64 equally spaced angular subdivisions (approximately 0.4°)
- 1.5 cm (1 sigma) distance resolution
- 5-15 Hz rotation rate update (user selectable)
- 50 meter range for pavement (~0.10 reflectivity)
- 120 meter range for cars and foliage (~0.80 reflectivity)
- >1.333 M points per second

Laser:
- 4 x 16 laser block assemblies
- 905 nm wavelength
- 5 nanosecond pulse

Mechanical:
- Size: H 257.3 mm x W 223.5 mm x D 231.1 mm
- 300 RPM - 900 RPM spin rate (user selectable)

1.5 HDL-64E S2 Data Packet

The HDL-64E S2 outputs raw data in UDP Ethernet packets. Each packet has a data payload of 1206 bytes, which contains 12 laser blocks of 100 bytes of firing data each, followed by a 6-byte attachment. The attachment consists of a spin counter and the firmware version information. The first two bytes of each laser block are header information used to identify the block as an upper or lower laser block. The next two bytes give the rotation angle of the device. The firing data in each angle block are obtained at the same rotation angle. Because the laser heads are mounted at different positions on the top of the device, the firing data do not lie on the same azimuth even at the same rotation angle. The next 96 bytes are the return values of 32 laser heads. Each return value is 3 bytes. The first two bytes are an integer length value, and the distance is obtained by multiplying this value by 2 mm. The last byte is intensity information, which is not used in this study.

An angle block is composed of an upper laser block followed by a lower laser block. Each packet therefore contains 6 angle blocks followed by the six trailing bytes, as shown in Figure 1.4.

Figure 1.4: Structure for data payload of packet.
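The payload layout described above can be read with a short Python sketch such as the following. It is a hedged illustration only. The function and constant names are hypothetical, and little-endian byte order is an assumption, since the text does not state it. The header and rotation fields are returned as raw integers because their encodings are not given here, and the intensity byte is skipped as in this study.

```python
import struct

PAYLOAD_SIZE = 1206       # bytes per data payload
BLOCK_SIZE = 100          # bytes per laser block
BLOCKS_PER_PACKET = 12    # 6 angle blocks = 6 upper + 6 lower laser blocks
RETURNS_PER_BLOCK = 32    # one 3-byte return value per laser head
MM_PER_UNIT = 2           # distance = length value x 2 mm


def parse_payload(payload):
    """Split a payload into laser blocks and the 6-byte attachment."""
    if len(payload) != PAYLOAD_SIZE:
        raise ValueError("expected a 1206-byte payload")
    blocks = []
    for b in range(BLOCKS_PER_PACKET):
        offset = b * BLOCK_SIZE
        # Assumption: 16-bit fields are little-endian.
        header, rotation = struct.unpack_from("<HH", payload, offset)
        distances_mm = []
        for r in range(RETURNS_PER_BLOCK):
            # Two length bytes, then one intensity byte that is ignored.
            (length,) = struct.unpack_from("<H", payload, offset + 4 + 3 * r)
            distances_mm.append(length * MM_PER_UNIT)
        blocks.append({"header": header,
                       "rotation_raw": rotation,
                       "distances_mm": distances_mm})
    attachment = payload[BLOCKS_PER_PACKET * BLOCK_SIZE:]
    return blocks, attachment
```

Consecutive pairs of entries in `blocks` (indices 0 and 1, 2 and 3, and so on) form the angle blocks, with the upper laser block first.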
