Chapter 4 Research Process and Analysis of Results

National Chengchi University

4.3.3. Feasibility of Real-Time Object Detection

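According to the analysis and detection results in Section 4.2.3, the original ckpt cannot detect jellyfish or fish in underwater images. After retraining with transfer learning, the model detects both jellyfish and fish, which demonstrates the feasibility of the approach.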

4.4. Application of the Research Results

Figure 4.14. Schematic of an object detection system that applies the research results

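Building on the results above, the system receives underwater images captured in real time, applies color correction, runs object detection on the corrected images, and outputs the results to a display, as shown in the schematic in Figure 4.14. Because the system includes color correction, adding detection classes in the future only requires collecting everyday (in-air) images; following the idea of transfer learning, only the object detection model then needs to be retrained.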

[Figure 4.14 labels: Embedded; Image input; Color correction model; Object detection model; Result in image (Location, Class)]
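The flow in Figure 4.14 can be sketched as a small pipeline. This is a minimal sketch under stated assumptions, not the thesis implementation: `run_pipeline`, `Detection`, `fake_color_correction`, and `fake_detector` are hypothetical names introduced for illustration. In a real system the color-correction step would be the trained pix2pix generator and the detection step the retrained SSD model.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List, Tuple

Pixel = Tuple[int, int, int]          # (R, G, B), each 0-255
Image = List[List[Pixel]]             # rows of pixels


@dataclass
class Detection:
    label: str                        # class, e.g. "fish" or "jellyfish"
    score: float                      # confidence in [0, 1]
    box: Tuple[int, int, int, int]    # location as (x_min, y_min, x_max, y_max)


def run_pipeline(
    frames: Iterable[Image],
    correct_color: Callable[[Image], Image],
    detect: Callable[[Image], List[Detection]],
    min_score: float = 0.5,
) -> Iterator[Tuple[Image, List[Detection]]]:
    """Color-correct each frame, detect objects, and yield results for display."""
    for frame in frames:
        corrected = correct_color(frame)
        kept = [d for d in detect(corrected) if d.score >= min_score]
        yield corrected, kept


# Stand-ins for the trained models (assumptions for illustration only).
def fake_color_correction(image: Image) -> Image:
    # Boost the red channel, which water attenuates most strongly.
    return [[(min(255, r + 40), g, b) for (r, g, b) in row] for row in image]


def fake_detector(image: Image) -> List[Detection]:
    height, width = len(image), len(image[0])
    return [
        Detection("fish", 0.9, (0, 0, width, height)),
        Detection("jellyfish", 0.2, (0, 0, width, height)),
    ]


if __name__ == "__main__":
    frame = [[(10, 120, 200)] * 4 for _ in range(3)]   # one bluish 4x3 frame
    results = run_pipeline([frame], fake_color_correction, fake_detector)
    for corrected, detections in results:
        for d in detections:
            print(f"{d.label} {d.score:.2f} at {d.box}")
```

Passing the two models in as callables keeps the pipeline unchanged when the detector is retrained for new classes, which mirrors the point that only the object detection model needs retraining.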

Chapter 5

Conclusions and Future Research Directions

5.1. Conclusions

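Because color changes differently in different media, we first studied color correction in depth. After comparing and analyzing physics-based and learning-based methods, we chose deep learning for color correction. We therefore used pix2pix, which is based on the conditional GAN, to convert images captured underwater into images that look as if they were captured in air. The paired images required for training were produced with CycleGAN after analyzing the quality of the training set. In addition, to help training converge, we examined the pix2pix loss function in depth so that the images generated by pix2pix met expectations.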
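For reference, the standard pix2pix objective from Isola et al. [14] combines a conditional adversarial term with an L1 reconstruction term. It is reproduced here as background only; the loss variant and the weight $\lambda$ adopted in this study are those discussed in the earlier chapters and may differ. In the notation assumed here, $x$ is the underwater input image, $y$ is the paired in-air target, and $z$ is the noise input.

```latex
\begin{align}
\mathcal{L}_{cGAN}(G, D) &= \mathbb{E}_{x,y}\big[\log D(x, y)\big]
  + \mathbb{E}_{x,z}\big[\log\big(1 - D(x, G(x, z))\big)\big] \\
\mathcal{L}_{L1}(G) &= \mathbb{E}_{x,y,z}\big[\lVert y - G(x, z) \rVert_1\big] \\
G^{*} &= \arg\min_{G}\max_{D}\; \mathcal{L}_{cGAN}(G, D) + \lambda\,\mathcal{L}_{L1}(G)
\end{align}
```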

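For object detection, we examined the differences between two-stage and one-stage processing to shortlist suitable models, evaluated each model's pre-trained ckpt, and selected the SSD model as the object detector based on detection performance and processing speed. We also analyzed the shooting style and image size of the SSD training set to find the best training result.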

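Through the process above, we established the training models for this study and identified the training parameters for both color correction and object detection. This included determining the type of training set, selecting the loss function and setting its parameters, choosing the number of training iterations, analyzing image quality, and even comparing the results against image-editing software. As a result, fish and jellyfish are detected at a rate above 70%, and applying color correction raises mAP from 0.47 to 0.607.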
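The size of the reported mAP change can be worked out directly. The sketch below assumes the usual definition of mAP as the unweighted mean of per-class AP; the function names are illustrative, and only the two mAP values (0.47 and 0.607) come from the text.

```python
from typing import Mapping, Tuple


def mean_average_precision(ap_per_class: Mapping[str, float]) -> float:
    """mAP as the unweighted mean of the per-class average precision (AP)."""
    if not ap_per_class:
        raise ValueError("need at least one class")
    return sum(ap_per_class.values()) / len(ap_per_class)


def improvement(before: float, after: float) -> Tuple[float, float]:
    """Return (absolute gain, relative gain) when a metric goes from before to after."""
    absolute = after - before
    return absolute, absolute / before


if __name__ == "__main__":
    # mAP values reported in the text: without and with color correction.
    absolute, relative = improvement(0.47, 0.607)
    print(f"absolute gain: {absolute:.3f}")   # 0.137
    print(f"relative gain: {relative:.1%}")   # 29.1%
```

In other words, color correction adds 0.137 to mAP, a relative improvement of roughly 29% over the uncorrected baseline.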

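We sought a simpler way to train object detection. In contrast to the time and money usually spent preparing a training set before training an object detector, this study builds a color correction model that can be applied to scenes with varying lighting. Once the color correction model is built, existing datasets and transfer learning make it possible to reuse previously collected datasets, which substantially reduces the time needed to collect training data and train models. This satisfies the need to detect objects in different scenes and makes object detection training comparatively simple.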

5.2. Future Research Directions

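In underwater photography, the distance between the photographer and the dive buddy is hard to control and moving underwater is difficult, so the captured images of divers vary widely. Movement along three axes (up-down, left-right, forward-backward) means that divers are captured at different distances and from different viewing angles. Divers' equipment also differs from one diver to another, and whether a diver is captured partially or fully affects the training result. Together these factors lead to a lower AP. Future work could address this situation to improve the accuracy of diver detection.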

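As noted in Chapter 2 of this thesis, light scattering helps to some extent with illuminating objects, but it also reduces the contrast between objects and the background, which blurs the resulting image. This problem is independent of the intensity of the light source and does not improve with a stronger light; it also lowers mAP. Future work could therefore start from aspects such as the signal-to-noise ratio and apply corresponding processing to blurred images to improve image quality.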
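The contrast argument can be illustrated numerically. This is a toy sketch, not a model taken from the thesis: it assumes scattered light adds a constant offset (a "veil") to every pixel and that a stronger source scales the object signal and the scattered light by the same factor. Michelson contrast and the mean-over-standard-deviation SNR are common definitions chosen here for illustration, and all function names are hypothetical.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def michelson_contrast(pixels: Sequence[float]) -> float:
    """(I_max - I_min) / (I_max + I_min) over a grayscale patch."""
    lo, hi = min(pixels), max(pixels)
    return 0.0 if hi + lo == 0 else (hi - lo) / (hi + lo)


def patch_snr(pixels: Sequence[float]) -> float:
    """Mean over standard deviation of a nominally uniform patch."""
    spread = pstdev(pixels)
    return float("inf") if spread == 0 else mean(pixels) / spread


def add_backscatter(pixels: Sequence[float], veil: float) -> List[float]:
    """Scattered light modeled as a constant offset added to every pixel."""
    return [p + veil for p in pixels]


def scale_light(pixels: Sequence[float], gain: float) -> List[float]:
    """A stronger source scales the object signal and the scattered light alike."""
    return [p * gain for p in pixels]


if __name__ == "__main__":
    scene = [20.0, 80.0]                      # dark background, bright object
    hazy = add_backscatter(scene, veil=50.0)  # [70.0, 130.0]
    brighter = scale_light(hazy, gain=2.0)    # [140.0, 260.0]
    print("clear scene:   ", michelson_contrast(scene))
    print("with scatter:  ", michelson_contrast(hazy))
    print("stronger light:", michelson_contrast(brighter))
```

Under these assumptions the contrast falls from 0.6 to 0.3 once the veil is added, and doubling the illumination leaves it at 0.3, which matches the observation that a stronger light source does not fix the blur.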

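In addition, this thesis is based on the pix2pix model, which is severely limited in image resolution. It typically outputs 256x256 images (512x512 in this study). A low resolution makes processing faster but also affects the detection rate, whereas forcing a high-resolution output (e.g., 2048x1024) degrades the quality of the output image and makes training insufficiently stable. Future work could therefore study high-resolution output, for example by examining the feasibility of the pix2pixHD [20] model.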
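To put the three resolutions in perspective, the short calculation below compares their pixel counts. It is only arithmetic on the sizes quoted in the text; treating per-image work as roughly proportional to pixel count is an assumption, not a measurement from this study.

```python
def pixel_count(width: int, height: int) -> int:
    """Number of pixels in a width x height image."""
    return width * height


# Resolutions quoted in the text.
resolutions = {
    "typical pix2pix output": (256, 256),
    "this study": (512, 512),
    "high-resolution example": (2048, 1024),
}

if __name__ == "__main__":
    base = pixel_count(*resolutions["this study"])
    for name, (w, h) in resolutions.items():
        n = pixel_count(w, h)
        print(f"{name}: {w}x{h} = {n} pixels ({n / base:g}x this study)")
```

A 2048x1024 output has 8 times as many pixels as the 512x512 images used here and 32 times as many as a 256x256 image.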

[3] Klemm et al., “Exploring Our Fluid Earth is based on the nationally recognized Fluid Earth/Living Ocean (FELO) aquatic science curriculum.” 1995.

[4] Girshick, Ross, et al. "Rich feature hierarchies for accurate object detection and semantic segmentation." Proceedings of the IEEE conference on computer vision and pattern recognition. 2014.

[5] Ren, Shaoqing, et al. "Faster r-cnn: Towards real-time object detection with region proposal networks." Advances in neural information processing systems. 2015.

[6] Redmon, Joseph, et al. "You only look once: Unified, real-time object detection." Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.

[7] Steiner, A. "Understanding the Basics of Underwater Lighting." Ocean News & Technology 19.4 (2013): 10-12.

[8] Liu, Wei, et al. "Ssd: Single shot multibox detector." European conference on computer vision. Springer, Cham, 2016.

[9] Radford, Alec, Luke Metz, and Soumith Chintala. "Unsupervised representation learning with deep convolutional generative adversarial networks." arXiv preprint arXiv:1511.06434 (2015).

[10] Kuznetsova, Alina, et al. "The Open Images Dataset V4: Unified image classification, object detection, and visual relationship detection at scale." (2020).

[11] Zhu, Jun-Yan, et al. "Unpaired image-to-image translation using cycle-consistent adversarial networks." Proceedings of the IEEE international conference on computer vision. 2017.

[12] Sato, Kazunori. "An inside look at google bigquery." White paper, URL: https://cloud.google.com/files/BigQueryTechnicalWP.pdf (2012).

[13] Mirza, Mehdi, and Simon Osindero. "Conditional generative adversarial nets." arXiv preprint arXiv:1411.1784 (2014).

[14] Isola, Phillip, et al. "Image-to-image translation with conditional adversarial networks." Proceedings of the IEEE conference on computer vision and pattern recognition. 2017.

[15] Ronneberger, Olaf, Philipp Fischer, and Thomas Brox. "U-net: Convolutional networks for biomedical image segmentation." International Conference on Medical image computing and computer-assisted intervention. Springer, Cham, 2015.

[16] Panaretos, Victor M., and Yoav Zemel. "Statistical aspects of Wasserstein distances." Annual review of statistics and its application 6 (2019): 405-431.

[17] Arjovsky, Martin, Soumith Chintala, and Léon Bottou. "Wasserstein gan." arXiv preprint arXiv:1701.07875 (2017).

[18] Gulrajani, Ishaan, et al. "Improved training of wasserstein gans." Advances in neural information processing systems. 2017.

[19] Wu, N., and V. Rathod. "Tensorflow detection model zoo." (2017).

[20] Wang, Ting-Chun, et al. "High-resolution image synthesis and semantic manipulation with conditional gans." Proceedings of the IEEE conference on computer vision and pattern recognition. 2018.

[21] Pedersen, Malte, et al. "Detection of marine animals in a new underwater dataset with varying visibility." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. 2019.

[22] Li, Xirong, et al. "COCO-CN for Cross-Lingual Image Tagging, Captioning, and Retrieval." IEEE Transactions on Multimedia 21.9 (2019): 2347-2360.

[23] Kornblith, Simon, Jonathon Shlens, and Quoc V. Le. "Do better imagenet models transfer better?." Proceedings of the IEEE conference on computer vision and pattern recognition. 2019.

[24] Teama, Tarek, et al. "Real Time Object Detection Based on Deep Neural Network." International Conference on Intelligent Robotics and Applications. Springer, Cham, 2019.

[25] Lin, Tsung-Yi, et al. "Focal loss for dense object detection." IEEE Trans. Pattern Anal. Mach. Intell. 42.2 (2020): 318-327.

[26] Akkaynak, Derya, and Tali Treibitz. "Sea-thru: A method for removing water from underwater images." Proceedings of the IEEE conference on computer vision and pattern recognition. 2019.

[27] Kolsur, Anoop, Sandeep Awale, and Nagraj Ullagaddi. "POV: Persistence of Vision."

[28] Mittal, Anish, Rajiv Soundararajan, and Alan C. Bovik. "Making a “completely blind” image quality analyzer." IEEE Signal processing letters 20.3 (2012): 209-212.

[29] Bochkovskiy, Alexey, Chien-Yao Wang, and Hong-Yuan Mark Liao. "YOLOv4: Optimal Speed and Accuracy of Object Detection." arXiv preprint arXiv:2004.10934 (2020).

Appendix

1. Dataset examples: fish

2. Dataset examples: jellyfish

3. Dataset examples: divers

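Source document: 應用於水下休閒活動之即時物件偵測系統 (Real-Time Object Detection System for Underwater Recreational Activities), 政大學術集成 (NCCU Academic Hub) (page 55-0)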
