Chapter 4. EXPERIMENTAL RESULTS
4.2 Results of Proposed Method
4.2.3 Objective Results
In this section, we compare the synthesized view with the ground truth from the multiview video sequences. In Fig. 25, the synthesized view is camera 6 of lovebird1.
The red line is synthesized from camera 5 of lovebird1 by the proposed method.
According to the intrinsic parameters of the camera, we set the focal length to 2017.8074. We then set the distance between camera 5 and camera 6 to 38.66 according to the translation parameters.
Although inaccuracies in the moving objects can affect the result, the background view synthesis maintains an average PSNR of 27.93 dB.
The blue line is synthesized from camera 5 and camera 8 of lovebird1 by the multiview synthesis tool VSRS [3]. However, VSRS requires a preprocessing step that estimates the depth map [1] from the left and right views, as shown in Fig. 2. Six views are therefore used to estimate depth and synthesize the virtual view. The average PSNR of the proposed method, which uses only the monocular video, is lower than that of multiview synthesis by only about 3 to 4 dB.
Figure 25- Synthesis of camera 6 of lovebird1 from a single view and by multiview synthesis.
The red line is synthesized by VSRS, and the blue line is synthesized by the proposed method.
The average PSNR of the red line is 31.88 dB, and the average PSNR of the blue line is 27.93 dB.
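The objective comparison above relies on per-frame PSNR against the ground-truth camera view, averaged over the sequence. The following is a minimal sketch of that measurement, assuming 8-bit samples (peak value 255) and frames flattened into plain sequences. The function names `psnr` and `average_psnr` and the toy sample values are hypothetical and are not taken from our implementation or from the test sequences.

```python
import math


def psnr(reference, synthesized, peak=255.0):
    """Return the PSNR in dB between two equally sized sequences of samples."""
    if len(reference) != len(synthesized) or not reference:
        raise ValueError("frames must be non-empty and of the same size")
    # Mean squared error over all samples of the frame.
    mse = sum((r - s) ** 2 for r, s in zip(reference, synthesized)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak ** 2 / mse)


def average_psnr(reference_frames, synthesized_frames):
    """Average the per-frame PSNR over a sequence (one point per frame in the plot)."""
    values = [psnr(r, s) for r, s in zip(reference_frames, synthesized_frames)]
    return sum(values) / len(values)


# Toy data (hypothetical): two tiny "frames" of four samples each.
ground_truth = [[100, 120, 140, 160], [90, 110, 130, 150]]
synthesized = [[101, 119, 141, 159], [92, 108, 132, 148]]

per_frame = [psnr(g, s) for g, s in zip(ground_truth, synthesized)]
mean_value = average_psnr(ground_truth, synthesized)
print([round(v, 2) for v in per_frame], round(mean_value, 2))
```

Averaging the per-frame values, rather than computing one PSNR over the whole sequence, matches the way the curves are summarized here by a single average value.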
In Fig. 26, the synthesized view is camera 8 of Alt Moabit. The red line is synthesized from camera 7 of Alt Moabit. The focal length and the distance between camera 7 and camera 8 are set from the camera parameters to 1382.4 and 62.05, respectively. In this video, the quality of the synthesized video is unsteady. The main factor affecting PSNR is missed moving object detections. Because the video contains some large moving objects, missing one of them decreases PSNR substantially. At frame 36, a large bus moves through the frame. Its transparent windows mislead the moving object detection and are classified as static background, so PSNR drops substantially from frame 36 onward. Apart from that, the PSNR stays at about 29.44 dB.
The blue line is synthesized from camera 7 and camera 10 of Alt Moabit by the multiview synthesis tool VSRS [3]. As in the previous case, six views are used to estimate depth and synthesize the virtual view. The average PSNR of the proposed method, which uses only the monocular video, is lower than that of multiview synthesis by only about 4 dB.
Figure 26- Synthesis of camera 8 of Alt Moabit from a single view and by multiview synthesis.
The red line is synthesized by VSRS, and the blue line is synthesized by the proposed method.
The average PSNR of the red line is 33.7 dB, and the average PSNR of the blue line is 29.44 dB.
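The gap between multiview and monocular synthesis can be checked directly from the average values reported in the captions of Fig. 25 and Fig. 26. The short sketch below only repeats that subtraction. The dictionary layout and key names are illustrative choices, not part of our tools.

```python
# Average PSNR values (dB) as reported in the captions of Fig. 25 and Fig. 26.
averages = {
    "lovebird1, camera 6": {"vsrs": 31.88, "proposed": 27.93},
    "Alt Moabit, camera 8": {"vsrs": 33.7, "proposed": 29.44},
}

# Difference between multiview synthesis (VSRS) and the proposed method.
gaps = {
    name: round(values["vsrs"] - values["proposed"], 2)
    for name, values in averages.items()
}

for name, gap in gaps.items():
    print(f"{name}: {gap:.2f} dB")
```

For these averages the differences are 3.95 dB for lovebird1 and 4.26 dB for Alt Moabit, which is the basis for describing the gap as about 4 dB.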
Chapter 5. CONCLUSION
5.1 Conclusion
In this study, we propose transform algorithms that synthesize a stereoscopic view from monocular video. The fundamental idea is based on a stereoscopic model constructed from the vanishing point. Therefore, preprocessing steps are necessary before view synthesis.
To improve moving object detection, we modify the method in [8].
The improvement from this modification is shown in Fig. 6: the result of background registration is better. Another preprocessing step is vanishing point detection. We adopt the algorithm in [6] to search for the vanishing lines and the vanishing point.
View synthesis consists of two parts: background projection and moving object projection. The effect of the proposed transform algorithms can be shown subjectively, by constructing a red-cyan anaglyph that exhibits the stereoscopic effect, and objectively, by comparing the synthesized view with the ground truth in PSNR. This differs from other works on view synthesis from a single view. Although the PSNR is lower than that of the view synthesized from multiview video, the proposed method keeps the PSNR difference to about 4 dB even though monocular video contains much less information than multiview video.
The proposed method provides novel transform algorithms whose results can be evaluated both objectively and subjectively, rather than by subjective results alone.
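The red-cyan anaglyph used for the subjective evaluation can be illustrated with the common channel-mixing construction: the red channel is taken from the left view, and the green and blue channels from the right view. The sketch below assumes this standard construction and is not necessarily the exact procedure used in our experiments. The function name `red_cyan_anaglyph`, the tuple-based image layout, and the pixel values are hypothetical, and which of the two views is the original and which is the synthesized one is left to the caller.

```python
def red_cyan_anaglyph(left, right):
    """Combine two RGB views into a red-cyan anaglyph.

    left, right: images given as rows of (r, g, b) tuples of equal size.
    The result takes red from the left view and green/blue from the right view.
    """
    if len(left) != len(right) or any(
        len(lrow) != len(rrow) for lrow, rrow in zip(left, right)
    ):
        raise ValueError("both views must have the same size")
    return [
        [(lpix[0], rpix[1], rpix[2]) for lpix, rpix in zip(lrow, rrow)]
        for lrow, rrow in zip(left, right)
    ]


# Toy 1x2 images (hypothetical pixel values).
left_view = [[(200, 10, 20), (180, 30, 40)]]
right_view = [[(50, 60, 70), (55, 65, 75)]]

anaglyph = red_cyan_anaglyph(left_view, right_view)
print(anaglyph)
```

Viewed through red-cyan glasses, each eye then receives mostly one of the two views, which is what makes the stereoscopic effect visible.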
REFERENCES
[1] Masayuki Tanimoto, Toshiaki Fujii, Kazuyoshi Suzuki, “Multi-view depth map of Rena and Akko & Kayo”, ISO/IEC JTC1/SC29/WG11, M14888, 2008.
[2] Masayuki Tanimoto, Toshiaki Fujii, Kazuyoshi Suzuki, “Experiment of view synthesis using multi-view depth”, ISO/IEC JTC1/SC29/WG11, M14889, 2008.
[3] Cheon Lee, Yo-Sung Ho, “View Synthesis Tools for 3D Video”, ISO/IEC JTC1/SC29/WG11, M15851, 2008.
[4] Guofeng Zhang, Wei Hua, Xueying Qin, Tien-Tsin Wong, and Hujun Bao, “Stereoscopic Video Synthesis from a Monocular Video”, IEEE Transactions on Visualization and Computer Graphics, Vol. 13, No. 4, 2007.
[5] Yu-Lin Chang, Chih-Ying Fang, Li-Fu Ding, Shao-Yi Chen, and Liang-Gee Chen, “Depth Map Generation for 2D-to-3D Conversion by Short-Term Motion Assisted Color Segmentation”, IEEE International Conference on Multimedia and Expo, pp. 1958-1961, 2007.
[6] S. Battiato, A. Capra, S. Curti, M. La Cascia, “3D Stereoscopic Image Pairs by Depth-Map Generation”, Proceedings of the 2nd International Symposium on 3D Data Processing, Visualization, and Transmission, pp. 124-131, 2004.
[7] Shao-Yi Chien, Yu-Wen Huang, Bing-Yu Hsieh, Shyh-Yih Ma, and Liang-Gee Chen, “Fast Video Segmentation Algorithm With Shadow Cancellation, Global Motion Compensation, and Adaptive Threshold Technique”, IEEE Transactions on Multimedia, Vol. 6, No. 5, October 2004.
[8] Shao-Yi Chien, Shyh-Yih Ma, and Liang-Gee Chen, “Efficient Moving Object Segmentation Algorithm Using Background Registration Technique”, IEEE Transactions on Circuits and Systems for Video Technology, Vol. 12, No. 7, July 2002.
[9] Sohaib Khan, Mubarak Shah, “Object Based Segmentation of Video Using Color, Motion and Spatial Information”, IEEE Computer Society Conference on CVPR, pp. II-746-751, 2001.
[10] Andreas Krutz, Matthias Kunter, Mrinal Mandal, Michael Frater, “Motion-based Object Segmentation using Sprites and Anisotropic Diffusion”, Image Analysis for Multimedia Interactive Services, WIAMIS ’07 Eighth International Workshop on, pp. 35, 2007.
[11] Youichi Horry, Ken-ichi Anjyo, Kiyoshi Arai, “Tour Into the Picture: Using a Spidery Mesh Interface to Make Animation from a Single Image”, Proceedings of the 24th annual conference on Computer graphics and interactive techniques, pp. 225-232, 1997.
[12] Karl Kral, “Side-to-Side head movements to obtain motion depth cues: A short review of research on the praying mantis”, Behavioural Processes, Vol. 43, Issue 1, pp. 71-77, April 1998.
[13] D. Comaniciu, P. Meer, “Robust Analysis of Feature Spaces: Color Image Segmentation”, In Proc. of IEEE Conference on Computer Vision and Pattern Recognition, pp. 750-755, June 1997.