Conclusion and Future Work - Unsupervised Monocular Depth Estimation Excluding Hard-to-Match and Unmatchable Points

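Since deep learning became popular, a growing body of research has applied it to 3D reconstruction. Among the many approaches to depth estimation, we consider unsupervised methods especially worth studying. Unlike earlier deep learning methods, which are supervised end to end, unsupervised methods take a different approach. Anyone who has worked on deep learning knows that obtaining labeled data is one of its main difficulties, whereas training data for unsupervised methods is easy to collect. We regard this as one of their major advantages.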

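In this thesis we examined, from several angles, how to improve the results of unsupervised depth estimation: we modified the network architecture, excluded the regions that cause erroneous estimates, and incorporated stereo data into training. The final experimental results show that each of these changes is effective and that our results are better than those of the other methods.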

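Finally, we considered where our work could be improved in the future. First, we demonstrated that restricting the method to static scene content is effective; if moving objects in the scene could be detected directly and excluded, for example with optical flow, the results could improve further. Second, network architecture is very important in deep learning. Our experiments showed that our architectural changes are indeed effective, so finding an architecture better suited to this problem should improve the results further. Third, the method should be applied to other datasets to verify that it can be used flexibly across a variety of scenes.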
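The first direction above, excluding moving objects with the help of optical flow, can be illustrated with a minimal sketch. This is not the method implemented in this thesis. It assumes that an observed optical flow field and a "rigid" flow field (the flow that the predicted depth and camera motion alone would produce) are already available, and it treats pixels where the two disagree by more than a hypothetical `threshold` as moving and drops them from the photometric loss. The function names, the threshold value, and the toy data are all illustrative assumptions.

```python
from math import hypot


def static_mask(flow, rigid_flow, threshold=1.0):
    """Return 1.0 where observed flow agrees with rigid flow, else 0.0.

    flow, rigid_flow: 2D grids (lists of rows) of (dx, dy) tuples.
    threshold: hypothetical tolerance in pixels, chosen for illustration.
    """
    return [
        [1.0 if hypot(f[0] - r[0], f[1] - r[1]) <= threshold else 0.0
         for f, r in zip(flow_row, rigid_row)]
        for flow_row, rigid_row in zip(flow, rigid_flow)
    ]


def masked_photometric_loss(target, warped, mask):
    """Mean absolute intensity difference over the pixels kept by the mask."""
    total = 0.0
    kept = 0.0
    for target_row, warped_row, mask_row in zip(target, warped, mask):
        for t, w, m in zip(target_row, warped_row, mask_row):
            total += m * abs(t - w)
            kept += m
    # If every pixel is masked out, report zero loss instead of dividing by zero.
    return total / kept if kept > 0 else 0.0


# Toy 2x2 example: the bottom-right pixel belongs to a moving object,
# so its observed flow differs strongly from the rigid flow.
flow = [[(1.0, 0.0), (1.0, 0.0)],
        [(1.0, 0.0), (5.0, 0.0)]]
rigid_flow = [[(1.0, 0.0), (1.0, 0.0)],
              [(1.0, 0.0), (1.0, 0.0)]]
target = [[0.2, 0.4],
          [0.6, 0.8]]
warped = [[0.2, 0.5],
          [0.6, 0.1]]

mask = static_mask(flow, rigid_flow)
loss = masked_photometric_loss(target, warped, mask)
all_ones = [[1.0, 1.0], [1.0, 1.0]]
unmasked_loss = masked_photometric_loss(target, warped, all_ones)
```

In this toy case the large reconstruction error at the moving pixel dominates the unmasked loss, while the masked loss reflects only the static pixels. A real pipeline would compute both flow fields with learned or classical estimators and would typically use a differentiable tensor library rather than nested lists.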

