
Chapter 6 Long-Range View Construction and Display

6.4 Merging 3D image with background image

In this section, we introduce the proposed method for mapping the stitched image into 3D space and rendering it together with the nearby object models constructed as discussed in the previous chapter.

First, we establish a polar coordinate system as illustrated in Fig. 6.12. Each point p on a sphere can be described by the polar coordinates (r, θ, φ). Then, we map the panoramic image resulting from image stitching onto a sphere. In the coordinates (r, θ, φ) of each point transformed from an image point, the value r is a fake depth, and the orientation values θ and φ are determined by the vertical and the horizontal viewing angles of the KINECT device, respectively.


Fig. 6.12 A 3D point on a sphere expressed by polar coordinates.

More specifically, the coordinates (x, y, z) of a 3D image point are obtained from its polar coordinates (r, θ, φ), where θ is the vertical angle and φ is the horizontal angle, by the following equations:

x = r cos θ sin φ; (6.2)

y = r sin θ; (6.3)

z = r cos θ cos φ. (6.4)

An example of the results of applying the above coordinate transformation to a color image acquired by a KINECT device is shown in Fig. 6.13. As can be seen from Fig. 6.13(c), the 2D image has become part of a spherical shape.

Fig. 6.13 A result of polar coordinate transformation applied to a color image acquired by a KINECT device. (a) Seen from the front. (b) Seen from the top with a little slant. (c) Seen right from the top.

Now, we describe the proposed method for merging an extracted 3D image with a color background image. Firstly, each object in a given color image is removed. The removal scheme is implemented by the object detection method described by Algorithm 5.7. Specifically, if an object existing in the depth image is detected, then we remove it from the color image, so that a background image is obtained. After that, a merge of the extracted object in 3D form with the resulting background image is conducted. An algorithm implementing this idea of merging is proposed in the following.

Algorithm 6.3: Merging a 3D image with a 2D background image.

Input: A 3D image I_3D and a set of background images I_B.

Output: A 2D-3D-mixed image I_M resulting from merging I_3D and I_B.

Steps:

Step 1 Stitch the background images in I_B into a panoramic image I_P using Algorithm 6.2.

Step 2 Transform the panoramic image I_P into the 3D polar coordinate system using Equations (6.2), (6.3), and (6.4) to get an image I_P' which becomes part of a spherical surface.

Step 3 Render I_P' and I_3D in the same scene to obtain I_M.

After the above algorithm is performed, the 2D-3D-mixed image with objects in front of a panoramic background image is obtained. Some experimental results of this algorithm are shown in the next section.
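The final step of Algorithm 6.3 can be illustrated by the following Python sketch, which places the background points and the object points in one colored point set. It is a simplified stand-in and not the rendering code of this study: the function name `merge_scene` is hypothetical, the viewer is assumed to sit at the center of the sphere, and the check that the fake depth exceeds the distance of the farthest object point is our own assumption about how the fake depth would be chosen so that the objects appear in front of the background.

```python
import numpy as np

def merge_scene(bg_points, bg_colors, obj_points, obj_colors):
    """Combine background-sphere points and nearby-object points.

    bg_points, bg_colors: (H, W, 3) arrays (sphere points and their colors).
    obj_points, obj_colors: (N, 3) arrays (object points and their colors).
    Returns (xyz, rgb), each of shape (H * W + N, 3).
    """
    bg_xyz = bg_points.reshape(-1, 3)
    bg_rgb = bg_colors.reshape(-1, 3)

    # Assumption: the viewer is at the origin, so an object is drawn in
    # front of the background only if it is closer than the sphere surface.
    r_background = np.linalg.norm(bg_xyz, axis=1).min()
    r_object = np.linalg.norm(obj_points, axis=1).max()
    if r_object >= r_background:
        raise ValueError("fake depth must exceed the farthest object point")

    xyz = np.vstack([bg_xyz, obj_points])
    rgb = np.vstack([bg_rgb, obj_colors])
    return xyz, rgb
```

The resulting arrays can be handed to any point-based renderer. Keeping the two sources in one array means that a change of viewpoint moves the objects and the background consistently.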

6.5 Experimental Results

The first part of the experimental results shown here is the removal of the nearby object from the color image. An example of such results is shown in Fig. 6.14, where Fig. 6.14(a) shows a color image with a nearby object (a red car) removed, and Fig. 6.14(b) shows the result of rendering the removed part as a 3D image. The result of merging the 3D image and the background image is shown in Fig. 6.15, and Fig. 6.16 shows the result of merging the 3D image onto a background stitched from three background images.

Fig. 6.14 The images before merging. (a) The color image with the nearby object removed. (b) Rendering of the removed object part in 3D space.

Fig. 6.15 The result of merging. (a) Seen from the front. (b) Seen from the top.

Fig. 6.16 The result of merging the 3D image onto a background stitched from three background images.

Another 2D-3D-mixed stitching result is shown in Fig. 6.17, where Fig. 6.17(a) shows the result of removing a car in front from a panoramic image constructed by stitching three background images using Algorithm 6.2. The car was not removed cleanly because the KINECT device senses no reflected signal from the glass portions of the car (the windows and light covers) or from the portions of the car that are too far away. The removed part is rendered as a 3D image and shown in Fig. 6.17(b). Furthermore, we merge the 3D image of Fig. 6.17(b) into the panoramic image of Fig. 6.17(a) to yield Fig. 6.18, in which three views of the result are shown: a front view, a side view, and a top view. These views are useful for judging the position of the car relative to the background scene.
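Why glass and distant parts survive the removal can be seen from a simple depth-threshold sketch in Python. It is not Algorithm 5.7, which is not reproduced here, but a simplified stand-in under our own assumptions: the function name `remove_nearby` is hypothetical, the color and depth images are assumed to be registered pixel by pixel, and a depth value of 0 is assumed to mean that no reading was obtained.

```python
import numpy as np

def remove_nearby(color, depth_mm, max_depth_mm):
    """Blank out color pixels whose depth reading is valid and nearby.

    color: (H, W, 3) color image; depth_mm: (H, W) depth image.
    Returns the background image and the boolean mask of removed pixels.
    """
    valid = depth_mm > 0                        # assumed: 0 means no reading
    nearby = valid & (depth_mm < max_depth_mm)  # close enough to be an object
    background = color.copy()
    background[nearby] = 0                      # removed pixels become a hole
    return background, nearby
```

A pixel on a car window has no valid depth reading, so it is never marked as nearby and stays in the background image. A pixel beyond `max_depth_mm` is kept for the same reason.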

Fig. 6.17 Removal of a nearby car in a panoramic image. (a) The color image with the nearby car removed. (b) Rendering of the removed car part in 3D space.

Fig. 6.18 Different views of the result of merging the 3D image and the 2D panoramic background. (a) A front view. (b) A side view. (c) A top view.

Chapter 7 Experimental Results and Discussions

7.1 Experimental Results

In this chapter, we present more experimental results obtained in this study. First, we describe some results of our experiment on KINECT device calibration. In the calibration steps, we place a box in the overlap region of the views of two neighboring KINECT devices, as illustrated in Fig. 7.1. The result of calibration is shown in Fig. 7.2.

Fig. 7.1 A calibration box placed in the overlap portion of the views of two neighboring KINECT devices. (a) Illustration of the overlap area. (b) and (c) The box in the two color images acquired by the KINECT devices.

Fig. 7.2 Result of calibration. (a) Two 3D images seen before calibration. (b) A single 3D image seen after calibration.

Furthermore, we use a trolley to simulate a car driven on the road, as shown in Fig. 7.3. Three KINECT devices are affixed to a board placed on the trolley. We then wheeled the trolley straight forward while a toy car was made to pass by. Three images acquired from the KINECT devices are shown in Fig. 7.4. The nearby objects in the images were then removed, with the result shown in Fig. 7.5. Then, we used the images in Fig. 7.5 as input to the stitching algorithm (Algorithm 6.2) to obtain a panoramic image, as shown in Fig. 7.6.

Fig. 7.3 Simulation using a trolley as a car and a toy car as a passing car. (a) Three KINECT devices affixed on a board. (b) The board put on the trolley.

Fig. 7.4 Three images acquired from the three KINECT devices.

Fig. 7.5 Background images resulting from nearby object removal.

Fig. 7.6 A stitched image which will be the background.

On the other hand, the removed objects may be merged by using the obtained calibration information and proper coordinate transformations, with the result shown in Fig. 7.7. Finally, the background image of Fig. 7.6 was merged with the nearby objects of Fig. 7.7 as shown in Fig. 7.8. Another similar result of merging 3D images into panoramic background images is shown in Fig. 7.9.
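The coordinate transformation used when merging objects seen by two neighboring KINECT devices can be sketched as follows. The sketch assumes that the calibration result can be expressed as a rotation matrix `rotation` and a translation vector `translation` that take points from the second device's frame into the first device's frame. This representation, the function names, and the numbers in the usage note are illustrative assumptions of ours and are not taken from the calibration data of this study.

```python
import numpy as np

def to_reference_frame(points, rotation, translation):
    """Apply the rigid transform p' = R p + t to an (N, 3) array of points."""
    return points @ rotation.T + translation

def merge_clouds(points_ref, points_other, rotation, translation):
    """Express both point sets in the reference frame and stack them."""
    moved = to_reference_frame(points_other, rotation, translation)
    return np.vstack([points_ref, moved])
```

For example, with a rotation of 90 degrees about the y axis, `rotation = np.array([[0, 0, 1], [0, 1, 0], [-1, 0, 0]])`, and `translation = np.array([0.2, 0.0, 0.0])`, the point (0, 0, 1) in the second device's frame is moved to (1.2, 0, 0) in the reference frame.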

Fig. 7.7 Merged objects in the scene.

Fig. 7.8 The nearby object merged with the panoramic background image.

Fig. 7.9 Another example of merging a panoramic background image with a 3D image (a car on the lateral side). (a) The panoramic background image. (b) 3D model of the detected car from images of two KINECT devices. (c) Merge result of (a) and (b). (d) A slightly shifted view of the result of (c). (e) A top view of the result of (c).

Finally, we show another result of stitching all the images acquired by the KINECT devices on the vehicle used in our experiment into a panoramic image, as shown in Figs. 7.12 and 7.13.

Fig. 7.12 A panoramic image.

Fig. 7.13 A well-cut version of Fig. 7.9.

lead to bad results because of a lack of sufficiently prominent features for matching. As a solution, an extension of the distance-weighted correlation (DWC) to a 3D version may be used.

2. In the image recording step, the system handles the KINECT devices one by one, so there is a small delay between the images acquired from two neighboring KINECT devices. Moreover, the overall frame rate (FPS) is low due to the use of 14 KINECT devices. To solve this problem, more computers are needed to speed up image acquisition and processing, but then a computer communication algorithm more complex than the one currently used must be designed.

3. In outdoor environments, the infrared light emitted by the KINECT device is interfered with by sunlight. Consequently, the time during which the KINECT devices can be used well is restricted. The proposed system can still work in environments such as airports and big shopping malls, where the buildings are covered by huge roofs and the interior space is not affected by sunlight. Furthermore, the proposed system can still work outdoors in the afternoon, and it helps us to conduct off-line inspections of accidents occurring around the car.

4. The panoramic image constructed from images with nearby objects removed does not look good because the removed objects leave “holes” in the image. This can be solved by learning the background images in advance when no car appears nearby. This is feasible if the EDR is used in local areas such as school campuses, large parks, or fixed routes through cities or the countryside. Also, to place extracted 3D nearby objects such as cars at the correct positions on the constructed panoramic image, we can use a navigation technique to find the current vehicle position on the map, or use the GPS.
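The idea in item 4 of filling the holes from a background learned in advance can be sketched as below. The sketch is only illustrative and rests on a strong assumption of ours, namely that the learned background image has already been aligned with the current image (same viewpoint and size). Achieving that alignment is the part that needs the vehicle position mentioned above. The function name `fill_holes` is hypothetical.

```python
import numpy as np

def fill_holes(current, hole_mask, learned_background):
    """Replace hole pixels of `current` with pixels of a learned background.

    current, learned_background: (H, W, 3) color images, assumed aligned.
    hole_mask: (H, W) boolean array, True where an object was removed.
    """
    if current.shape != learned_background.shape:
        raise ValueError("images must have the same shape")
    filled = current.copy()
    filled[hole_mask] = learned_background[hole_mask]
    return filled
```

The mask can be the one produced when the nearby objects were removed, so no separate hole detection is needed.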

Chapter 8 Conclusions and Suggestions for Future Works

8.1 Conclusions

In this study, we have proposed methods for the construction of a 3D EDR (event data recorder) using multiple KINECT devices on a car. In this chapter, we draw conclusions about our study and make some suggestions for future work to improve our system or enrich the views it acquires. The conclusions are as follows.

1. A 3D imaging system is proposed and constructed by affixing multiple KINECT devices around the car with different views to cover the 360° surroundings.

2. A method for transforming 2D images to 3D images and a reverse version of the method are proposed based on the pinhole camera model.

3. A method for calibration of the relationship between neighboring KINECT devices using a calibration target is proposed, which is sped up by the use of some learned information about the relative positions of the KINECT devices.

4. A method for 3D image merging is proposed, which maps the images acquired by each KINECT device into the 3D space according to the relationship obtained from the calibration step.

5. A method for speeding up rendering of the 3D image for display is proposed, which utilizes the mesh data structure and the QEM algorithm.

6. A method for panoramic image creation is proposed, which stitches multiple color images into a single view using dynamic adjustment of the parameters.

7. A method for combining 3D images with 2D panoramic scene images is proposed, which can be used to render nearby objects and long-range views together.

8.2 Suggestions for Future Works

According to our experiences in the study, some suggestions for future works are made as follows.

1. It is worth studying automation of the calibration task required for the proposed imaging system as well as adaptation of the calibration results to different environments.

2. It is interesting to improve the KINECT depth sensor for daytime imaging because the KINECT devices used in this study can so far only be used in environments with little sunlight.

3. It is a challenge to embed the system into the car electronics system to improve the system processing speed.

4. A more precise transformation from the 2D coordinate system to a 3D one is needed.

5. Improvement on the proposed method of stitching images and combining nearby objects into panoramic images is needed to deal with more complicated environments.

6. The problem that removing nearby objects creates big “holes” in the resulting color images, so that stitching of the images becomes impossible, should be solved.

