Conclusion - Training a Deep Light Field Synthesis Network in the Refocused Image Domain

In this thesis, we have described a novel loss, the refocused image error, for light field synthesis.

It drives a deep network to minimize the loss in the 4D light field domain and in the refocused image domain at the same time, resulting in high-quality refocused images. The superior performance of the proposed loss is supported by a theoretical analysis showing that the refocused image error is related to the sum of the inner products of the spectral errors between all view pairs of a synthesized light field. In effect, our technique takes the whole light field into consideration.
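The relation between the refocused image error and the view-pair inner products can be checked numerically. The following is a minimal NumPy sketch, not the implementation used in this thesis. It assumes a light field stored as an array of shape (U, V, H, W), whole-pixel shifts with wrap-around borders, a squared-error norm, and a discrete list of refocus shifts; the function names and the `weight` parameter are hypothetical.

```python
import numpy as np


def shifted_views(light_field, r):
    """Shift every view by r times its angular offset from the central view.

    Assumptions of this sketch: shape (U, V, H, W), shifts rounded to whole
    pixels, and wrap-around borders (np.roll).
    """
    U, V, _, _ = light_field.shape
    cu, cv = (U - 1) / 2.0, (V - 1) / 2.0
    views = []
    for u in range(U):
        for v in range(V):
            dy = int(round(r * (u - cu)))
            dx = int(round(r * (v - cv)))
            views.append(np.roll(light_field[u, v], (dy, dx), axis=(0, 1)))
    return np.stack(views)  # shape (U * V, H, W)


def refocus(light_field, r):
    """Shift-and-add refocusing: average of the shifted views."""
    return shifted_views(light_field, r).mean(axis=0)


def refocused_image_error(pred, target, r):
    """Squared error between the two refocused images for one shift r."""
    diff = refocus(pred, r) - refocus(target, r)
    return float(np.sum(diff ** 2))


def pairwise_spectral_form(pred, target, r):
    """Same quantity, written as a sum over all view pairs (s, t) of the
    inner products between the spectra of the shifted per-view errors."""
    err = shifted_views(pred - target, r)
    n_views, h, w = err.shape
    spectra = np.fft.fft2(err).reshape(n_views, -1)
    # gram[s, t] = <E_s, E_t> / (H * W); by Parseval's theorem this equals
    # the spatial inner product of the two shifted error images.
    gram = (spectra.conj() @ spectra.T).real / (h * w)
    return float(gram.sum() / n_views ** 2)


def combined_loss(pred, target, shifts, weight=1.0):
    """View-wise error plus the refocused image error averaged over shifts."""
    view_loss = float(np.mean((pred - target) ** 2))
    refocus_loss = float(
        np.mean([refocused_image_error(pred, target, r) for r in shifts])
    )
    return view_loss + weight * refocus_loss


rng = np.random.default_rng(0)
target = rng.random((3, 3, 8, 8))
pred = target + 0.1 * rng.standard_normal((3, 3, 8, 8))
for r in (-1, 0, 2):
    print(r, refocused_image_error(pred, target, r),
          pairwise_spectral_form(pred, target, r))
```

For each shift the two printed values agree up to floating-point rounding. Because the cross terms between different views enter the sum, errors that cancel or reinforce across views change the refocused image error even when the per-view errors are unchanged, which a purely view-wise loss cannot capture.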

Experimental results on the INRIA (real-world) and HCI (software-rendered) datasets clearly show that the proposed regularization is more effective than the conventional one, which considers only the quality of the individual views of a light field. The proposed loss is potentially useful for other light-field-related tasks such as light field compression [34] and super-resolution [35]. These topics are worth further investigation.

Appendix

For simplicity, let $\hat{L}$ be shorthand for $G_\theta(S)$ and let $L^h_s = L_s(x+h)$. The definitions of UCRIE2 and the shift-and-add operator in Eqs. (6) and (1) establish the equation:

Because the light fields are finite-valued, we can interchange the order of summation:

The final step is to interchange the summation and the integration again,

For CRIE, we only need to replace D with infinity and add g(r) to the equation. Therefore, we have

References

[1] R. Ng et al., “Light Field Photography with a Hand-held Plenoptic Camera,” p. 11.

[2] T. Georgiev, K. C. Zheng, B. Curless, D. Salesin, S. Nayar, and C. Intwala, “Spatio-Angular Resolution Tradeoff in Integral Photography,” p. 10.

[3] M. Levoy and P. Hanrahan, “Light field rendering,” in Proceedings of the 23rd annual conference on Computer graphics and interactive techniques - SIGGRAPH ’96, 1996, pp. 31–42, doi: 10.1145/237170.237199.

[4] G. Chaurasia, O. Sorkine, and G. Drettakis, “Silhouette-Aware Warping for Image-Based Rendering,” Comput. Graph. Forum, vol. 30, no. 4, pp. 1223–1232, Jun. 2011, doi: 10.1111/j.1467-8659.2011.01981.x.

[5] G. Chaurasia, S. Duchene, O. Sorkine-Hornung, and G. Drettakis, “Depth synthesis and local warps for plausible image-based navigation,” ACM Trans. Graph., vol. 32, no. 3, pp. 1–12, Jun. 2013, doi: 10.1145/2487228.2487238.

[6] S. Wanner and B. Goldluecke, “Variational Light Field Analysis for Disparity Estimation and Super-Resolution,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 36, no. 3, pp. 606–619, Mar. 2014, doi: 10.1109/TPAMI.2013.147.

[7] M. Goesele, J. Ackermann, S. Fuhrmann, C. Haubold, R. Klowsky, and T. Darmstadt, “Ambient point clouds for view interpolation,” ACM Trans. Graph., vol. 29, no. 4, p. 1, Jul. 2010, doi: 10.1145/1778765.1778832.

[8] P. P. Srinivasan, T. Wang, A. Sreelal, R. Ramamoorthi, and R. Ng, “Learning to Synthesize a 4D RGBD Light Field from a Single Image,” ArXiv170803292 Cs, Aug. 2017.

[9] N. K. Kalantari, T.-C. Wang, and R. Ramamoorthi, “Learning-based view synthesis for light field cameras,” ACM Trans. Graph., vol. 35, no. 6, pp. 1–10, Nov. 2016, doi: 10.1145/2980179.2980251.

[10] Y. Wang, F. Liu, Z. Wang, G. Hou, Z. Sun, and T. Tan, “End-to-End View Synthesis for Light Field Imaging with Pseudo 4DCNN,” in Computer Vision – ECCV 2018, vol. 11206, V. Ferrari, M. Hebert, C. Sminchisescu, and Y. Weiss, Eds. Cham: Springer International Publishing, 2018, pp. 340–355.

[11] G. Lippmann, “La photographie integrale,” Comptes Rendus Acad. Sci., 1908.

[12] “Lytro.” [Online]. Available: https://www.lytro.com/.

[13] “Raytrix,” 2018. [Online]. Available: https://raytrix.de/.

[14] W. Lin and C.-C. Jay Kuo, “Perceptual visual quality metrics: A survey,” J. Vis. Commun. Image Represent., vol. 22, no. 4, pp. 297–312, May 2011, doi: 10.1016/j.jvcir.2011.01.005.

[15] J. Johnson, A. Alahi, and L. Fei-Fei, “Perceptual Losses for Real-Time Style Transfer and Super-Resolution,” in Computer Vision – ECCV 2016, vol. 9906, B. Leibe, J. Matas, N. Sebe, and M. Welling, Eds. Cham: Springer International Publishing, 2016, pp. 694–711.

[16] S. Bosse, D. Maniry, K.-R. Müller, T. Wiegand, and W. Samek, “Deep Neural Networks for No-Reference and Full-Reference Image Quality Assessment,” IEEE Trans. Image Process., vol. 27, no. 1, pp. 206–219, Jan. 2018, doi: 10.1109/TIP.2017.2760518.

[17] N. Meng, T. Zeng, and E. Y. Lam, “Perceptual loss for light field reconstruction in high-dimensional convolutional neural networks,” in Imaging and Applied Optics 2019 (COSI, IS, MATH, pcAOP), Munich, 2019, p. CW1A.5, doi: 10.1364/COSI.2019.CW1A.5.

[18] J. Bruna, P. Sprechmann, and Y. LeCun, “Super-Resolution with Deep Convolutional Sufficient Statistics,” ArXiv151105666 Cs, Mar. 2016.

[19] A. Lamb, V. Dumoulin, and A. Courville, “Discriminative Regularization for Generative Models,” ArXiv160203220 Cs Stat, Feb. 2016.

[20] K. Simonyan and A. Zisserman, “Very Deep Convolutional Networks for Large-Scale Image Recognition,” ArXiv14091556 Cs, Apr. 2015.

[21] M. Goesele, N. Snavely, B. Curless, H. Hoppe, and S. M. Seitz, “Multi-View Stereo for Community Photo Collections,” in 2007 IEEE 11th International Conference on Computer Vision, Rio de Janeiro, Brazil, 2007, pp. 1–8, doi: 10.1109/ICCV.2007.4408933.

[22] Y. Furukawa and J. Ponce, “Accurate, Dense, and Robust Multi-View Stereopsis,” p. 8.

[23] A. Levin and F. Durand, “Linear view synthesis using a dimensionality gap light field prior,” in 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2010, pp. 1831–1838, doi: 10.1109/CVPR.2010.5539854.

[24] L. Shi, H. Hassanieh, A. Davis, D. Katabi, and F. Durand, “Light Field Reconstruction Using Sparsity in the Continuous Fourier Domain,” ACM Trans. Graph., vol. 34, no. 1, pp. 1–13, Dec. 2014, doi: 10.1145/2682631.

[25] D. Tran, L. Bourdev, R. Fergus, L. Torresani, and M. Paluri, “Learning Spatiotemporal Features with 3D Convolutional Networks,” in 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 2015, pp. 4489–4497, doi: 10.1109/ICCV.2015.510.

[26] P. R. Clement, “The Chebyshev approximation method,” Q. Appl. Math., vol. 11, no. 2, pp. 167–183, Jul. 1953, doi: 10.1090/qam/58024.

[27] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image Quality Assessment: From Error Visibility to Structural Similarity,” IEEE Trans. Image Process., vol. 13, no. 4, pp. 600–612, Apr. 2004, doi: 10.1109/TIP.2003.819861.

[28] W. Xue and L. Zhang, “Gradient Magnitude Similarity Deviation: A Highly Efficient Perceptual Image Quality Index,” p. 12.

[29] V. K. Adhikarla et al., “Towards a Quality Metric for Dense Light Fields,” in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, 2017, pp. 3720–3729, doi: 10.1109/CVPR.2017.396.

[30] K. Honauer, O. Johannsen, D. Kondermann, and B. Goldluecke, “A Dataset and Evaluation Methodology for Depth Estimation on 4D Light Fields,” in Computer Vision – ACCV 2016, vol. 10113, S.-H. Lai, V. Lepetit, K. Nishino, and Y. Sato, Eds. Cham: Springer International Publishing, 2017, pp. 19–34.

[31] X. Jiang, M. Le Pendu, R. A. Farrugia, and C. Guillemot, “Light Field Compression With Homography-Based Low-Rank Approximation,” IEEE J. Sel. Top. Signal Process., vol. 11, no. 7, pp. 1132–1145, Oct. 2017, doi: 10.1109/JSTSP.2017.2747078.

[32] D. P. Kingma and J. Ba, “Adam: A Method for Stochastic Optimization,” ArXiv14126980 Cs, Dec. 2014.

[33] H. Zhao, O. Gallo, I. Frosio, and J. Kautz, “Loss Functions for Image Restoration with Neural Networks,” IEEE Trans. Comput. Imaging, vol. 3, no. 1, p. 11, 2017.

[34] M. Gupta, A. Jauhari, K. Kulkarni, S. Jayasuriya, A. Molnar, and P. Turaga, “Compressive Light Field Reconstructions Using Deep Learning,” in 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2017, pp. 1277–1286, doi: 10.1109/CVPRW.2017.168.

[35] Y. Yoon, H.-G. Jeon, D. Yoo, J.-Y. Lee, and I. S. Kweon, “Learning a Deep Convolutional Network for Light-Field Image Super-Resolution,” in 2015 IEEE International Conference on Computer Vision Workshop (ICCVW), Santiago, Chile, 2015, pp. 57–65, doi: 10.1109/ICCVW.2015.17.

Publications

[1] Liu, C. L., Fu, S. W., Lee, Y. J., Tsao, Y., Huang, J. W., & Wang, H. M. (2019). Multichannel Speech Enhancement by Raw Waveform-mapping using Fully Convolutional Networks. IEEE Transactions on Audio, Speech and Language Processing, February 2020

[2] Liu, C. L., Shih, K. T., & Chen, H. H. (2019). Light Field Synthesis by Training Deep Network in the Refocused Image Domain. arXiv preprint arXiv:1910.06072. (Accepted by IEEE Transactions on Image Processing)

[3] Liu, C. L., Shih, K. T., & Chen, H. H. Color Enhancement for See-Through Display with Motion Compensation. (To be submitted to IEEE Transactions on Image Processing)
