Conclusions and Future Work - Free Viewpoint Video Synthesis Using Two-Dimensional Camera Array Signals

6.1 Conclusions

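Multi-view video has gradually become a major research topic in the multimedia field, and free viewpoint video is one of its key issues. This thesis therefore takes free viewpoint video as its subject, extends the image capture setup from a one-dimensional camera array to a two-dimensional camera array, and exploits spatial and temporal correlation to improve the quality of the synthesized virtual-view images.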

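For spatial correlation, we use four disparity maps obtained from different search directions to correct the errors caused by occlusion, together with a voting mechanism tied to the correspondence search direction to improve the accuracy of the disparity map. Because a two-dimensional camera array contains a very large number of cameras, we propose checkerboard disparity estimation to reduce the number of disparity maps the overall system must compute. This algorithm exploits the translation relationship between adjacent disparity maps: four neighboring primary disparity maps are used to synthesize the secondary disparity map they surround. Checkerboard disparity estimation halves the number of disparity map computations, and the synthesized secondary disparity maps still retain high accuracy.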
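The checkerboard layout can be illustrated with a minimal sketch. It only shows which cameras would need a full disparity estimation and which primary neighbors surround a secondary camera; it does not reproduce the synthesis step itself. The function names and the choice of which parity is "primary" are assumptions made for illustration, not details taken from the thesis.

```python
def checkerboard_roles(rows, cols):
    """Label each camera of a rows x cols array as 'primary' or 'secondary'.

    Assumption: cameras whose row + column index is even are primary
    (disparity estimated directly); the others are secondary (synthesized).
    """
    return [
        ["primary" if (r + c) % 2 == 0 else "secondary" for c in range(cols)]
        for r in range(rows)
    ]


def surrounding_primaries(r, c, rows, cols):
    """Return the in-bounds 4-connected neighbors of camera (r, c).

    For a secondary camera these neighbors are all primary cameras.
    """
    candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    return [(i, j) for i, j in candidates if 0 <= i < rows and 0 <= j < cols]


def count_estimations(rows, cols):
    """Return (number of primary cameras, number of secondary cameras)."""
    roles = checkerboard_roles(rows, cols)
    primary = sum(row.count("primary") for row in roles)
    return primary, rows * cols - primary


if __name__ == "__main__":
    for row in checkerboard_roles(4, 4):
        print(" ".join(role[0].upper() for role in row))
    print(count_estimations(4, 4))
```

In a 4 x 4 array this labeling leaves 8 primary and 8 secondary cameras, which matches the halving described above. Secondary cameras on the border of the array have fewer than four primary neighbors, so a real system would need a rule for those positions.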

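For temporal correlation, we use motion vectors to reduce the time required for disparity estimation. To obtain more accurate motion vectors, we adopt full-search motion estimation and add the known disparity maps as a reference. Experiments show that a weight ratio of 1:0.1 between disparity map information and original image information works well in motion estimation. Comparing the time needed to synthesize one image against the number of skipped frames, skipping three frames gives the best result, reducing computation time to one third of the original. When more than three frames are skipped, the motion estimation search range becomes too large to reduce computation time effectively, and computation time can even increase. As the number of skipped frames grows, the quality of the synthesized image gradually declines, but the change is small: in the worst case in our experiments, skipping one frame lowers PSNR by only 0.02 dB, and skipping three frames lowers it by only 0.05 dB. Considering both time savings and image quality, skipping three frames is the preferable choice.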
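The weighted matching cost can be sketched as a small full-search block matcher. This is an illustrative sketch only: it assumes a sum-of-absolute-differences (SAD) cost, assumes disparity maps are available for both frames being matched, and uses hypothetical names (`sad`, `full_search`, `w_disp`, `w_img`). Only the 1:0.1 weighting and the use of full search come from the text.

```python
def sad(a, b, ay, ax, by, bx, size):
    """Sum of absolute differences between two size x size blocks.

    The block in `a` starts at (ay, ax); the block in `b` starts at (by, bx).
    """
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(a[ay + dy][ax + dx] - b[by + dy][bx + dx])
    return total


def full_search(cur_img, ref_img, cur_disp, ref_disp, y, x,
                size=4, radius=2, w_disp=1.0, w_img=0.1):
    """Return the motion vector (dy, dx) with the lowest weighted cost.

    Every displacement within +/- radius is tested (full search). The cost
    combines the disparity-map SAD and the image SAD with weights
    w_disp : w_img, defaulting to the 1 : 0.1 ratio reported in the text.
    """
    height, width = len(ref_img), len(ref_img[0])
    best_cost, best_mv = None, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            ry, rx = y + dy, x + dx
            # Skip candidate blocks that fall outside the reference frame.
            if ry < 0 or rx < 0 or ry + size > height or rx + size > width:
                continue
            cost = (w_disp * sad(cur_disp, ref_disp, y, x, ry, rx, size)
                    + w_img * sad(cur_img, ref_img, y, x, ry, rx, size))
            if best_cost is None or cost < best_cost:
                best_cost, best_mv = cost, (dy, dx)
    return best_mv
```

The number of candidates grows with the square of `radius`, which is consistent with the observation that a larger search range erodes the time savings when more frames are skipped.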


6.2 Future Work

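Because this thesis uses computer-generated images as test data, it does not address problems that commonly arise in real camera arrays, such as camera calibration and brightness differences between images captured by different cameras. To move closer to practical applications, future experiments should use test images captured by an actual camera array.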

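To obtain accurate motion vectors, our experiments used full-search motion estimation. However, the search range of this method is too large, so the overall time cannot be reduced effectively. Now that video compression technology is mature, we could evaluate a variety of motion estimation methods to find an algorithm that balances image quality and time performance.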



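Autobiography

徐崇毓 was born in 1984 in Fengyuan City, Taichung County. After graduating from the Department of Electrical Engineering, National Chung Cheng University, in 2007, the author entered the Institute of Electronics, National Chiao Tung University, in the same year to pursue a master's degree under the supervision of Dr. Hsueh-Ming Hang (杭學鳴), with research focused on multi-view video. The master's degree was awarded in 2008.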
