
Chapter 6 The Avoidance of Degeneracy in Texture Matching


function to measure the similarity, our method makes it easy to identify the best 3D position and gives more credible results.

6.4.3 Accurate matching

Figure 6-6 Top: the first ground truth data set obtained from [90]. Bottom left: the number of points in each error range. Bottom right: the cumulative number of points in each error range.

[Figure 6-6 plots. Left panel: "Error distribution"; right panel: "Cumulative number of points". The x-axis is the error in meters, from 0 to 0.084 in steps of 0.007; the series are MSP, Patch, and Block.]

In this section, we use two ground truth data sets obtained from [90] to evaluate the three methods.

As shown in Figure 6-6, the first data set contains eleven images. We randomly select about 627 ground truth 3D points for the evaluation. For each point, we interpolate 75 candidate 3D points between 0.97 and 1.03 times the distance between the reference camera center and the 3D point. The average distance between two adjacent interpolated points is about 0.007 meters. The experiment on the ground truth data obtained from [36], shown in Figure 6-7, gives a similar result. Mutually supported patch matching produces more correct matches than traditional block matching and patch-based matching. In each figure, the left plot shows the distribution of the matching points over the error ranges: the y-axis is the number of matching points and the x-axis is the error range. The right plot shows the cumulative distribution of the points over the error ranges.
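The candidate generation described above can be sketched as follows. This is a minimal illustration, not the implementation used in the experiments: it assumes that the 75 candidates are evenly spaced on the ray from the reference camera center through the ground truth point, and the function names `sample_depth_candidates` and `mean_spacing` are hypothetical.

```python
import math


def sample_depth_candidates(camera_center, point, n=75, lo=0.97, hi=1.03):
    """Return n candidate 3D points on the ray from camera_center through point.

    Assumption: candidates are evenly spaced between lo and hi times the
    distance from the camera center to the point.
    """
    direction = [p - c for p, c in zip(point, camera_center)]
    candidates = []
    for i in range(n):
        scale = lo + (hi - lo) * i / (n - 1)
        candidates.append([c + scale * d for c, d in zip(camera_center, direction)])
    return candidates


def mean_spacing(candidates):
    """Average Euclidean distance between consecutive candidates."""
    gaps = [math.dist(a, b) for a, b in zip(candidates, candidates[1:])]
    return sum(gaps) / len(gaps)


# Example: a point 8 m in front of a camera at the origin.
cands = sample_depth_candidates((0.0, 0.0, 0.0), (0.0, 0.0, 8.0))
print(len(cands), mean_spacing(cands))
```

Under this even-spacing assumption, the step between candidates is 0.06 d / 74 for a point at distance d from the reference camera center, so the spacing grows linearly with the depth of the point.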

Figure 6-7 Top: the ground truth data obtained from [36]. Bottom left: the number of points in each error range. Bottom right: the cumulative number of points in each error range.

[Figure 6-7 plots. Left panel: "Matching point distribution in different errors"; right panel: "Cumulative point distribution" (y-axis: cumulative number of points). The series are MSP, Patch, and Block.]
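The two bottom plots in these figures are a histogram of the matching errors and its running sum. A minimal sketch of how such counts could be computed is given below. It assumes that the bin width equals the interpolation step (for example 0.007 m), that errors are non-negative distances, and that errors beyond the last bin are collected in the last bin; the binning actually used for the figures is not stated, and the function names are hypothetical.

```python
from itertools import accumulate


def error_histogram(errors, bin_width, n_bins):
    """Count how many errors fall into each bin of width bin_width.

    Errors at or beyond n_bins * bin_width are clamped into the last bin.
    """
    counts = [0] * n_bins
    for e in errors:
        index = min(int(e / bin_width), n_bins - 1)
        counts[index] += 1
    return counts


def cumulative_counts(counts):
    """Running total of the histogram counts."""
    return list(accumulate(counts))


# Example with hypothetical errors (in meters) for one method.
errors = [0.001, 0.003, 0.008, 0.02, 0.5]
counts = error_histogram(errors, bin_width=0.007, n_bins=12)
print(counts)
print(cumulative_counts(counts))
```

Running the same two functions on the errors of each method (MSP, Patch, Block) gives one series per method for the left and right plots.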

The statistical results show that our method is more likely to produce matches with small errors than the block and patch methods. The other ground truth data set contains eight images, as shown in Figure 6-8. It has 782 ground truth 3D points, and the average distance between two adjacent interpolated points is about 0.018 meters. Because this data set has only eight images and the angles between the views are not large, the patch scaling problem is not apparent, and our method performs almost the same as the patch method. The experimental results show that our method can solve the patch scaling problem and that, when there is no patch scaling problem, it performs at the same level as the patch method.

Figure 6-8 Top: the second ground truth data set obtained from [90]. Bottom left: the number of points in each error range. Bottom right: the cumulative number of points in each error range.

[Figure 6-8 plots. Left panel: "Error distribution"; right panel: "Cumulative number of points". The x-axis is the error in meters, from 0 to 0.216 in steps of 0.018; the series are MSP, Patch, and Block.]

6.5 Summary

In this chapter, we proposed a new method that improves the accuracy of correspondence matching based on multiple view geometry.

We analyzed and compared regular block matching and patch-based matching. We found that with both methods the similarity measurement easily becomes inaccurate when the camera views and the captured depths differ. Therefore, we proposed an algorithm that modifies the patch-based method with a mutually supported patch, dynamic Gaussian filtering, and a photometric similarity function.

國立政治大學 National Chengchi University

In the experiments, we found that the similarity function lets our method judge clearly whether a patch is close to the surface. In addition, the method solves the patch scaling problem.

Mutually supported patch matching also helps to obtain accurate patches with fewer outliers.


Chapter 7 Conclusions

In this dissertation, the degeneracy problems discussed cover a wide scope. They include not only the traditional degeneracies described in mathematics but also the corresponding point matching errors caused by improper models in the geometry and in the texture. Most researchers ignore these degeneracy problems, yet they are important because they affect the accuracy of the estimated parameters and the computed 3D models. We analyze and classify the multi-view degeneracy problems into three categories and provide methods and guidelines for avoiding them in multi-view image processing. More specifically, we provide guidelines for avoiding degeneracies when capturing the multiple views that are used to estimate parameters and construct 3D models. We conclude the dissertation in this chapter.

7.1 Multi-view critical configurations

As indicated previously, the first category of degeneracies occurs when the camera parameters and geometries are unknown after the images are captured and the corresponding points are used to estimate the parameters or geometries in the multi-view images. Certain configurations of camera poses and 3D points may lead to non-unique solutions of these estimations. These are referred to as critical configurations [38][40]. In particular, the well-known critical configuration occurs when all 3D points and the two camera centers lie on a ruled quadric.

However, we find that when the geometries are estimated from a large number of corresponding points, robust methods can handle this problem. We present the parameter estimation algorithms and the robust methods, and analyze how they handle the critical configurations, in Chapter 3 and Chapter 4. A special case arises when all 3D points lie on a plane, which also occurs easily; in this case, the robust methods must be combined with constraint checks that avoid points near a plane. In addition, when the texture of the image is clean and there are few corresponding points, all the 3D points and cameras may lie on a ruled quadric. In such cases, the critical configurations must also be handled. In Section 2.4.1.2, we show that the quadric can be estimated from seven 3D points and two camera centers. However, judging whether a quadric is a ruled quadric is still a difficult problem. Fortunately, the probability of this situation is low. As long as there are enough corresponding points, it is easy to filter out the critical configuration with robust parameter estimation. The experiments in Chapter 4 show that the proposed robust methods estimate the parameters more accurately than the traditional methods and are more likely to find the optimal solution.
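The statement that seven 3D points and two camera centers determine a quadric follows from a count of unknowns: a quadric has ten coefficients defined up to scale, so nine points in general position fix it. The sketch below illustrates this with a plain null-space computation. It is not the procedure of Section 2.4.1.2, it does not attempt to decide whether the quadric is ruled, and the function names are hypothetical.

```python
def quadric_monomials(p):
    """Row of the linear system for a*x^2 + b*y^2 + c*z^2 + d*xy + e*xz + f*yz + g*x + h*y + i*z + j = 0."""
    x, y, z = p
    return [x * x, y * y, z * z, x * y, x * z, y * z, x, y, z, 1.0]


def null_vector(rows, eps=1e-12):
    """Return one nonzero v with rows * v ~ 0, using Gauss-Jordan elimination."""
    a = [[float(v) for v in row] for row in rows]
    n_cols = len(a[0])
    pivot_cols = []
    r = 0
    for c in range(n_cols):
        if r == len(a):
            break
        best = max(range(r, len(a)), key=lambda k: abs(a[k][c]))
        if abs(a[best][c]) < eps:
            continue  # no pivot here, so this column stays free
        a[r], a[best] = a[best], a[r]
        pivot = a[r][c]
        a[r] = [v / pivot for v in a[r]]
        for k in range(len(a)):
            if k != r:
                factor = a[k][c]
                a[k] = [vk - factor * vr for vk, vr in zip(a[k], a[r])]
        pivot_cols.append(c)
        r += 1
    free = [c for c in range(n_cols) if c not in pivot_cols]
    v = [0.0] * n_cols
    v[free[0]] = 1.0  # with 9 rows and 10 columns a free column always exists
    for row, c in zip(a, pivot_cols):
        v[c] = -row[free[0]]
    return v


def fit_quadric(points):
    """Fit a quadric through nine 3D points; return a symmetric 4x4 matrix Q."""
    v = null_vector([quadric_monomials(p) for p in points])
    scale = max(abs(x) for x in v)
    a, b, c, d, e, f, g, h, i, j = [x / scale for x in v]
    return [[a, d / 2, e / 2, g / 2],
            [d / 2, b, f / 2, h / 2],
            [e / 2, f / 2, c, i / 2],
            [g / 2, h / 2, i / 2, j]]


def quadric_residual(Q, p):
    """Value of X^T Q X for the homogeneous point X = (x, y, z, 1)."""
    X = [p[0], p[1], p[2], 1.0]
    return sum(X[r] * Q[r][c] * X[c] for r in range(4) for c in range(4))
```

In use, the nine inputs would be the seven 3D points followed by the two camera centers, and `quadric_residual` should be close to zero at each of them. If the nine points are not in general position, more than one quadric passes through them and this sketch returns only one of them.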

7.2 Degeneracy in geometry transfer

In the second category of degeneracies, we find that even when the camera parameters or geometries are known, degeneracy problems are caused by the homography, the epipolar transfer, or the transfer by the trifocal tensor. We show that the degeneracy in the homography transfer is easy to handle by checking whether the estimated points are collinear. The probability of degeneracy in the transfer by the trifocal tensor is also low, because it occurs only when the 3D points are close to the baseline. On the other hand, the influence of the epipolar transfer is quite large: the degeneracy problem occurs whenever the points are close to the trifocal plane or the three cameras are nearly collinear. Based on the analysis in Section 5.2, we strongly suggest not using the epipolar transfer in multi-view images; the trifocal tensor can be used to replace it. An application to multi-view reconstruction by SfM is discussed in Chapter 5, where we discuss how to use the trifocal tensor for this problem. The experimental results show that we successfully improve the accuracy of the SfM estimation: the average reprojection errors are reduced from dozens of pixels to less than 1 pixel.
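The instability of the epipolar transfer can be seen in a small sketch. The transferred point is the intersection of two epipolar lines in the third image, and when the 3D point is near the trifocal plane these two lines become nearly identical, so their intersection is poorly defined. The code assumes the convention that `F31 x1` and `F32 x2` are the epipolar lines in the third image; the function names and the threshold `min_sin` are hypothetical.

```python
import math


def mat_vec(m, v):
    """Product of a 3x3 matrix (list of rows) and a 3-vector."""
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]


def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def intersect_lines(l1, l2, min_sin=1e-3):
    """Intersect two homogeneous image lines.

    Returns (point, sin_angle). The point is None when the lines are
    nearly parallel or identical, i.e. when the transfer is degenerate.
    """
    n1 = math.hypot(l1[0], l1[1])
    n2 = math.hypot(l2[0], l2[1])
    if n1 == 0.0 or n2 == 0.0:
        return None, 0.0
    p = cross(l1, l2)
    sin_angle = abs(p[2]) / (n1 * n2)  # sine of the angle between the lines
    if sin_angle < min_sin:
        return None, sin_angle
    return (p[0] / p[2], p[1] / p[2]), sin_angle


def epipolar_transfer(F31, F32, x1, x2, min_sin=1e-3):
    """Transfer the match (x1, x2) from images 1 and 2 into image 3."""
    return intersect_lines(mat_vec(F31, x1), mat_vec(F32, x2), min_sin)
```

The returned sine gives a simple conditioning measure: values close to zero indicate that the match should be transferred by another method, such as the trifocal tensor, instead.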

7.3 Degeneracy in the geometry and texture matching

For the third category of degeneracies, the texture is, in addition to the geometry, another important kind of information in multi-view images. The popular patch-based method considers both kinds of information when estimating the similarity of the corresponding points in multi-view images. However, we show that the patch plane and the object surface may be inconsistent under certain conditions. The similarity is affected when the distance from the views to the object is large. We call this the patch-based matching scaling problem, and we propose a mutually supported patch to handle it. The rotation of the camera may also affect the similarity measurement, so we suggest not selecting images in which the camera is rotated by nearly 45 degrees. The similarity is also affected when the object surface is curved rather than planar. We suggest two guidelines for handling this type of problem. First, select the images with small angles. Second, ignore such a matching point. The experiments show that our method can solve these problems, and mutually supported patch matching helps to obtain accurate patches with fewer outliers.
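The first guideline can be sketched as a simple view filter. The sketch below is only one possible reading: it assumes that "small angles" refers to the angle between the viewing ray of the reference camera and the viewing ray of each candidate camera toward the same 3D point, and the 30 degree threshold and function names are hypothetical.

```python
import math


def angle_between_deg(u, v):
    """Angle in degrees between two nonzero 3-vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cosine))


def select_views(point, reference_center, camera_centers, max_angle_deg=30.0):
    """Return the indices of cameras whose viewing ray to the point makes
    an angle of at most max_angle_deg with the reference viewing ray."""
    ref_ray = [p - c for p, c in zip(point, reference_center)]
    selected = []
    for index, center in enumerate(camera_centers):
        ray = [p - c for p, c in zip(point, center)]
        if angle_between_deg(ref_ray, ray) <= max_angle_deg:
            selected.append(index)
    return selected


# Example: the second camera views the point at 45 degrees and is rejected.
centers = [(1.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
print(select_views((0.0, 0.0, 10.0), (0.0, 0.0, 0.0), centers))
```

A matching point for which no view passes the filter would then fall under the second guideline and be ignored.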


Chapter 8 Future Work

8.1 Degeneracy problems

In Section 2.4.3 and Chapter 6, we discussed the degeneracy in patch-based matching. The mutually supported patch was proposed to solve the matching scaling problem. However, it may have limitations in solving this problem, and finding these limitations is an interesting research topic. In addition, the α-invariant and the depth in multiple views can also be explored for patch-based matching.

In Chapter 5, we discussed how to use the trifocal tensor to handle the degeneracy in SfM applications. However, when there are many multi-view images, the computation time may grow exponentially. Finding efficient algorithms for this kind of problem is also of interest.

In Section 2.4.1, we discussed the first category of degeneracy problems in multi-view images. When the texture of the image is clean and there are few corresponding points, the critical configuration cannot be handled, because judging whether a quadric is a ruled quadric is itself a difficult problem. Therefore, finding a solution to this problem is also important.


8.2 Parameter estimation

In Chapter 3 and Chapter 4, we also discussed several parameter estimation algorithms and proposed some robust methods. The fitness function in the proposed parameter estimation methods is determined by a geometric distance. However, one may use multi-objective functions as the fitness function to gain more robustness. We believe that the similarity of the corresponding points can be one of the objectives. This could be explored in the future.
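One simple way to combine the two objectives is a weighted sum of a geometric term and a photometric term. The sketch below is an assumption for illustration only, not a method evaluated in this dissertation: it uses normalized cross-correlation (NCC) as the similarity measure, takes the geometric distances as already computed, and the weight and function names are hypothetical. A weighted sum is only one form of multi-objective fitness; a Pareto-based formulation would be another.

```python
import math


def ncc(a, b):
    """Normalized cross-correlation of two equal-length intensity lists, in [-1, 1]."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    if denom == 0.0:
        return 0.0  # a constant patch carries no texture information
    return sum(x * y for x, y in zip(da, db)) / denom


def combined_fitness(geometric_distances, patch_pairs, weight=0.5):
    """Weighted sum of two objectives; lower is better.

    geometric_distances: one distance per correspondence (e.g. in pixels).
    patch_pairs: one (patch_a, patch_b) pair of intensity lists per correspondence.
    weight: hypothetical trade-off between the two terms.
    """
    geometric = sum(geometric_distances) / len(geometric_distances)
    photometric = sum(1.0 - ncc(p, q) for p, q in patch_pairs) / len(patch_pairs)
    return geometric + weight * photometric
```

Because the geometric term is in pixels and the photometric term is unitless, the weight would need to be tuned for each data set.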

Furthermore, many variants of parameter estimation algorithms have been proposed in recent years, each with its own advantages and disadvantages. Finding a more robust algorithm by combining existing methods, or by combining them with a new method, is also challenging and interesting work.

8.3 Intelligence tools for more applications

The proposed guidelines help to avoid degeneracies in multi-view image processing. They are useful for improving the estimation of the corresponding points and the geometries in multi-view images. We can use the guidelines to analyze and improve multi-view image applications such as 3D reconstruction, image inpainting, and object tracking. One may also explore other problems that affect the accuracy. The degeneracies and the parameter estimation methods could be considered together for multi-view image applications in the future.

Moreover, multi-view geometry plays an important role in much research and many applications in the field of robotics. For example, simultaneous localization and mapping (SLAM) [21][63] needs accurate 3D reconstruction technologies to construct 3D environments. Figure 8-1 shows an example of a map built by SLAM. The constructed environment is also helpful for surveillance applications. Some applications in robotics use SfM to build such environments. Many challenges and further research issues of this kind remain.

Figure 8-1 An example of the simultaneous localization and mapping [63].


References

[1] Agarwal, S., Y. Furukawa, N. Snavely, B. Curless, S. M. Seitz, and R. Szeliski, Reconstructing Rome. IEEE Computer, Vol. 43, No. 6, pp. 40-47, 2010.

[2] Agarwal, S., Y. Furukawa, N. Snavely, B. Curless, S. M. Seitz, and R. Szeliski, Building Rome in a Day. Communications of the ACM, Vol. 54, No. 10, pp. 105-112, 2011.

[3] Armangué, X. and J. Salvi, Overall View Regarding Fundamental Matrix Estimation. Image and Vision Computing, Vol. 21, No. 2, pp. 205-220, 2003.

[4] Barjatya, A., Block Matching Algorithm for Motion Estimation. DIP 6620 Final Project Paper, 2004.

[5] Barnard, S. T. and M.A. Fischler, Computational Stereo. ACM Computing Surveys, Vol. 14, pp. 553-572, 1982.

[6] Beck, J.V. and K.J. Arnold, Parameter Estimation in Engineering and Science. Wiley series in probability and mathematical statistics, Wiley, New York, 1977.

[7] Bouguet, J.Y., Camera Calibration Toolbox for Matlab, 2010. Online URL <http://www.vision.caltech.edu/bouguetj/calib_doc/>.

[8] Bradley, D., T. Boubekeur, and W. Heidrich, Accurate Multi-View Reconstruction Using Robust Binocular Stereo and Surface Meshing. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1-8, 2008.


[9] Brahmachari, A.S. and S. Sarkar, BLOGS: Balanced Local and Global Search for Non-Degenerate Two View Epipolar Geometry. In Proceedings of the Twelfth International Conference on Computer Vision, Kyoto, Japan, 2009.

[10] Brown, M. and D. Lowe, Automatic Panoramic Image Stitching Using Invariant Features. International Journal of Computer Vision, Vol. 74, No. 1, pp. 59-77, 2007.

[11] Brown, M. and D. G. Lowe, Unsupervised 3D Object Recognition and Reconstruction in Unordered Datasets. International Conference on 3-D Digital Imaging and Modeling, 2005.

[12] Campbell, N.D.F., G. Vogiatzis, C. Hernández, and R. Cipolla, Automatic 3D Object Segmentation in Multiple Views Using Volumetric Graph-Cuts. Image and Vision Computing, Vol. 28, pp. 14-25, 2008.

[13] Campbell, N.D.F., G. Vogiatzis, C. Hernández, and R. Cipolla, Using Multiple Hypotheses to Improve Depth-Maps for Multi-View Stereo. In Proceedings of the European Conference of Computer Vision, pp. 766-779, 2008.

[14] Cernuschi-Frias, B., D. B. Cooper, Y.-P. Hung, and P. N. Belhumeur, Toward a Model-Based Bayesian Theory for Estimating and Recognizing Parameterized 3-D Objects Using Two or More Images Taken from Different Positions. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 11, pp. 1028-1052, 1989.

[15] Chai, J. and S.D. Ma, Robust epipolar geometry estimation using genetic algorithm. Pattern Recognition Letters, Vol. 19, No. 9, pp. 829-838, 1998.


[16] Chan, K.-H., C.-Y. Tang, M.-K. Hor, and Y.-L. Wu, Robust trifocal tensor constraints for structure from motion estimation. Pattern Recognition Letters, Vol. 34, No. 6, pp. 627-636, 2013.

[17] Chan, K.-H., C.-Y. Tang, Y.-L. Wu, and M.-K. Hor, Robust Orthogonal Particle Swarm Optimization for Estimating the Fundamental Matrix, In Proceedings of IEEE Visual Communications and Image Processing, Tainan, Taiwan, November 2011.

[18] Chum, O. and J. Matas, Matching with PROSAC - Progressive Sample Consensus. In Conference on Computer Vision and Pattern Recognition, pp. 220-226, 2005.

[19] Connor, K. and I. Reid, Novel view specification and synthesis. In Proceeding of the British Machine Vision Conference, 2002.

[20] Dhond, U. R. and J. K. Aggarwal, Structure from Stereo: a Review. IEEE Transactions on System, Man, and Cybernetics, Vol. 19, pp. 1489-1510, 1989.

[21] Durrant-Whyte, H. and T. Bailey, Simultaneous localization and mapping: part I. IEEE Robotics and Automation Magazine, Vol. 13, No. 2, pp. 99-108, 2006.

[22] Eberhart, R. and Y. Shi, Particle Swarm Optimization: Developments, Applications and Resources. In Proceedings of the 2001 Congress on Evolutionary Computation, pp. 81-86, 2001.

[23] Faugeras, O. and B. Mourrain, About the Correspondences of Points Between N Images. In Proceedings of the IEEE Workshop on Representation of Visual Scenes, pp. 37–44, 1995.


[24] Faugeras, O.D., L. Quan, and P. Sturm, Self-calibration of a 1D Projective Camera and its Application to the Self-calibration of a 2D Projective Camera. In Proceeding of the European Conference on Computer Vision, Vol. 1, pp. 36-52, 1998.

[25] Faugeras, O. D., Q. Luong, and S. Maybank, Camera self-calibration: Theory and experiments. In Proceedings of the European Conference on Computer Vision, pp. 321-334, 1992.

[26] Feng, C.L. and Y.S. Hung, A Robust Method for Estimating the Fundamental Matrix. In Proceedings of Conference on Digital Image Computing: Techniques and Applications, pp. 633-642, 2003.

[27] Fischler, M. A. and R. C. Bolles, Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, Vol. 24, pp. 381-395, 1981.

[28] Forsyth, D. A. and J. Ponce, Computer Vision: A Modern Approach. Second Edition, Prentice Hall, New Jersey, 2011.

[29] Förstner, W., Reliability Analysis of Parameter Estimation in Linear Models with Application to Mensuration Problems in Computer Vision. Computer Vision, Graphics and Image Processing, Vol. 40, No. 3, pp. 273-310, 1987.

[30] Frahm, J.-M., P. Fite-Georgel, D. Gallup, T. Johnson, R. Raguram, C. Wu, Y.-H. Jen, E. Dunn, B. Clipp, S. Lazebnik, and M. Pollefeys, Building Rome on a Cloudless Day. In Proceedings of the European Conference on Computer Vision, pp. 368–381, 2010.


[31] Furukawa, Y. and J. Ponce, Accurate Camera Calibration from Multi-view Stereo and Bundle Adjustment. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1-8, 2008.

[32] Furukawa, Y. and J. Ponce, Accurate, Dense, and Robust Multi-view Stereopsis. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2007.

[33] Furukawa, Y. and J. Ponce, High-fidelity image-based modeling. Technical Report, UIUC, 2006.

[34] Furukawa, Y., B. Curless, S.M. Seitz, and R. Szeliski, Towards Internet-scale Multi-view Stereo. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1434-1441, 2010.

[35] Furukawa, Y., Patch-based Multi-view Stereo Software, 2010. Online URL <http://grail.cs.washington.edu/software/pmvs/>.

[36] Gao, W., Robot Vision Group Data Set, 2011. Online URL <http://vision.ia.ac.cn/index.htm>.

[37] Goesele, M., N. Snavely, B. Curless, H. Hoppe, and S.M. Seitz, Multi-View Stereo for Community Photo Collections. In Proceedings of the IEEE International Conference on Computer Vision, pp. 1-8, 2007.

[38] Hartley, R. I. and A. Zisserman, Multiple View Geometry in Computer Vision, Second Edition, Cambridge University Press, 2004.


[39] Hartley, R. I. and F. Kahl, A Critical Configuration for Reconstruction from Rectilinear Motion. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Vol. 1, pp. 511–517, 2003.

[40] Hartley, R. I. and F. Kahl, Critical Configurations for Projective Reconstruction from Multiple Views. International Journal of Computer Vision, Vol. 71, No. 1, pp. 5-47, 2006.

[41] Hedayat, A. S., N. J. A. Sloane, and J. Stufken, Orthogonal Arrays: Theory and Applications. New York: Springer-Verlag, 1999.

[42] Heinrich, S. B. and W.E. Snyder, Internal Constraints of the Trifocal Tensor. In Proceedings of CoRR, 2011.

[43] Hiep, V.H., R. Keriven, P. Labatut, and J.P. Pons, Towards High-resolution Large-scale Multi-view Stereo. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1430-1437, 2009.

[44] Ho, S.-Y., H.-S. Lin, W.-H. Liauh, and S.-J. Ho, OPSO: Orthogonal Particle Swarm Optimization and Its Application to Task Assignment Problems. IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans, Vol. 38, No. 2, pp. 288-298, 2008.

[45] Hor, M.-K., C.-Y. Tang, Y.-L. Wu, K.-H. Chan, and J.-J. Tsai, Robust Refinement Methods for Camera Calibration and 3D Reconstruction from Multiple Images. Pattern Recognition Letters, Vol. 32, No. 8, pp. 1210-1221, 2011.


[46] Hor, M.-K., K.-H. Chan, and C.-Y. Tang, 3D Model Reconstruction Refinement from Multiple Images. International Conference on Multimedia Technology, 2011.

[47] Hor, M.-K., W.-C. Chen, C.-Y. Tang, Y.-L. Wu, K.-H. Chan, and K.-S. Wu, Refinement of 3D Models Reconstructed from Visual Hull. International Display Manufacturing Conference/3D Display System and Application/Asia Display, 2009.

[48] Hor, M.-K., W.-C. Chen, C.-Y. Tang, Y.-L. Wu, K.-H. Chan, and J.-J. Tsai, Using 3D Patches for Refinement of 3D Reconstruction from Multiple Images. International Display Manufacturing Conference/3D Display System and Application/Asia Display, 2009.

[49] Hor, M.-K., W.-C. Chen, C.-Y. Tang, Y.-L. Wu, K.-H. Chan, and J.-Y. Tsai, Generation of Dense Image Matching Using Epipolar Geometry. International Display Manufacturing Conference/3D Display System and Application/Asia Display, 2009.

[50] Hu, M., B. Yuan and X. Tang, Robust estimation of trifocal tensor using messy genetic algorithm. Chinese Journal of Electronics, Vol. 12, pp. 174-178, 2003.

[51] Huang, J.-F., S.-H. Lai, and C.-M. Cheng, Robust Fundamental Matrix Estimation with Accurate Outlier Detection. Journal of Information Science and Engineering, Vol. 23, No. 4, pp. 1213-1226, 2007.

[52] Hung, Y.-P., D. B. Cooper, and B. Cernuschi-Frias, Asymptotic Bayesian Surface Estimation Using an Image Sequence. International Journal of Computer Vision, Vol. 6, pp. 105-132, 1991.


[53] Kennedy, J. and R. C. Eberhart, Particle Swarm Optimization. IEEE International Conference on Neural Networks, pp. 1942-1948, 1995.

[54] Kim, K. and L.S. Davis, Multi-camera Tracking and Segmentation of Occluded People on Ground Plane Using Search-guided Particle Filtering. In Proceedings of European Conference on Computer Vision, Vol. 3953, pp. 98-109, 2006.

[55] Levoy, M., Stanford Spherical Gantry, 2002. Online URL <http://graphics.stanford.edu/projects/gantry/>.

[56] Li, J., E. Li, Y. Chen, L. Xu, and Y. Zhang, Bundled Depth-map Merging for Multi-view Stereo. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2769-2776, 2010.

[57] Li, R., B. Zeng, and M. L. Liou, A New Three-Step Search Algorithm for Block Motion Estimation. IEEE Transactions on Circuits and Systems for Video Technology, Vol. 4, No. 4, pp. 438-442, 1994.
