(1)

Learning from Simulated and Unsupervised Images through Adversarial Training

Brian Huang

(2)

Ruslan Salakhutdinov

@NIPS2016

(3)

Outline

– Paper's contents
    – Intro
    – S+U learning with SimGAN
    – Experiment
    – Conclusion
– Discussion

(4)

Intro

(5)

Intro

– Synthetic images are useful but not good enough
    – Improve the simulator
    – Improve the realism of synthetic images using U
– S: labeled synthetic images (generated by computer)
– U: unlabeled real data
– Train a network to do this
    – The network uses an adversarial loss

(6)

SimGAN Architecture

(7)

S+U learning with SimGAN

– The algorithm
– The loss functions
– Training tricks
    – Local adversarial loss
    – Update D using the history of refined images

(8)

Some notations

(9)

Algorithm

(10)

Algorithm
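
The algorithm figure on this slide did not survive extraction. As a rough sketch of the alternating schedule described in the referenced paper (a few refiner updates, then a few discriminator updates, repeated for a fixed number of steps), the loop might look like this. The names `train_simgan`, `k_g`, `k_d`, `refiner_step` and `discriminator_step` are hypothetical placeholders, and the two step functions stand in for real SGD updates.

```python
def train_simgan(num_steps, k_g, k_d, refiner_step, discriminator_step):
    """Alternate refiner and discriminator updates (illustrative skeleton only)."""
    for step in range(num_steps):
        for _ in range(k_g):
            refiner_step(step)          # would update the refiner's weights
        for _ in range(k_d):
            discriminator_step(step)    # would update the discriminator's weights


# Record the order of calls instead of training anything.
log = []
train_simgan(
    num_steps=2, k_g=2, k_d=1,
    refiner_step=lambda step: log.append(("R", step)),
    discriminator_step=lambda step: log.append(("D", step)),
)
```

Passing the update steps in as callables keeps the schedule separate from whatever framework performs the actual gradient updates.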

(11)

Loss functions - Discriminator

Cross entropy over two terms: the probability of being synthetic and the probability of being real

D's target: 0 for every real image y, 1 for every refined image x̃
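
The equation on this slide was an image and is lost. Assuming the slide followed the referenced paper, the cross-entropy loss can be written as below. The symbols $D_\phi$ (the discriminator, which outputs the probability that its input is synthetic) and $R_\theta$ (the refiner, with $\tilde{x}_i = R_\theta(x_i)$) are taken from that paper rather than from the surviving slide text.

```latex
\mathcal{L}_D(\phi)
  = -\sum_i \log D_\phi(\tilde{x}_i)
    \;-\; \sum_j \log\bigl(1 - D_\phi(y_j)\bigr)
```

The first sum pushes $D_\phi$ toward 1 on refined images $\tilde{x}_i$ and the second pushes it toward 0 on real images $y_j$, which matches the targets stated on the slide.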

(12)

Loss functions - Refiner network

Two terms: one adds realism to the synthetic images, the other preserves the annotation information

D's target: 0 for every y, 1 for every x̃

(13)

Loss functions - Refiner network

– Realism term: adds realism to the synthetic images; the refiner needs to fool D
– L1 norm term: preserves the annotation information

D's target: 0 for every y, 1 for every x̃
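
A minimal sketch of how the two refiner terms could be combined for a single image, assuming D outputs the probability of "synthetic" and the annotation-preserving term is the L1 distance between the refined and the original synthetic image. The function name `refiner_loss` and the weight `lam` are hypothetical; the slide gives no value for the weight.

```python
import math


def refiner_loss(d_prob_synthetic, refined, synthetic, lam, eps=1e-12):
    """Per-image refiner loss: realism term plus lam * L1 term.

    d_prob_synthetic: D's output for the refined image (probability of "synthetic").
    refined, synthetic: flat sequences of pixel values of equal length.
    lam: weight of the L1 term (hypothetical name, not taken from the slides).
    """
    p = min(max(d_prob_synthetic, eps), 1.0 - eps)   # keep log() finite
    realism = -math.log(1.0 - p)                     # small when D says "real"
    l1 = sum(abs(r - s) for r, s in zip(refined, synthetic))
    return realism + lam * l1


synthetic = [0.0, 0.5, 1.0]
refined = [0.1, 0.5, 0.8]
fooled = refiner_loss(0.1, refined, synthetic, lam=1.0)   # D believes it is real
caught = refiner_loss(0.9, refined, synthetic, lam=1.0)   # D believes it is synthetic
```

The realism term is lower when D is fooled, while the L1 term penalizes the refiner for moving pixels away from the synthetic input, which is what keeps the annotations valid.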

(14)

Local adversarial loss

The final D loss is the sum of all the local adversarial losses
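
As an illustration of that sum, the sketch below assumes D returns a 2-D map of probabilities, one per local patch, each being the probability that the patch is synthetic. The function name `local_adversarial_loss` and the nested-list map representation are assumptions made for the example.

```python
import math


def local_adversarial_loss(prob_map, is_refined, eps=1e-12):
    """Sum of per-patch cross-entropy losses for one image.

    prob_map: 2-D list; each entry is D's probability that one local patch
              is synthetic.
    is_refined: True for a refined image (target 1), False for a real one (target 0).
    """
    total = 0.0
    for row in prob_map:
        for p in row:
            p = min(max(p, eps), 1.0 - eps)   # keep log() finite
            total += -math.log(p) if is_refined else -math.log(1.0 - p)
    return total


patch_probs = [[0.9, 0.8], [0.7, 0.6]]
loss_if_refined = local_adversarial_loss(patch_probs, is_refined=True)
loss_if_real = local_adversarial_loss(patch_probs, is_refined=False)
```

Because every patch contributes its own term, each image yields many training signals for D instead of one.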

(15)

Update D using a history of refined images

B: the size of the buffer (fixed)
b: the size of the mini-batch

– Sample b/2 images from the current refiner network and b/2 images from the buffer
– Then randomly replace b/2 images in the buffer with the newly generated refined images
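
The buffer procedure can be sketched as a small class. This is an illustrative sketch, not the authors' implementation: the class name `RefinedImageHistory`, the handling of a not-yet-full buffer (topping up with extra fresh images) and the assumptions that b is even and b/2 is at most B are all mine. Images are represented by arbitrary Python objects.

```python
import random


class RefinedImageHistory:
    """Fixed-size pool of previously refined images (hypothetical helper)."""

    def __init__(self, capacity, seed=None):
        self.capacity = capacity              # B on the slide
        self.images = []
        self.rng = random.Random(seed)

    def make_batch(self, fresh):
        """Mix b freshly refined images with history; returns b images for D."""
        half = len(fresh) // 2                # b/2; assumes b is even and b/2 <= B
        current = fresh[:half]
        n_old = min(half, len(self.images))
        old = self.rng.sample(self.images, n_old)
        # While the buffer is still short, top up with extra fresh images.
        top_up = fresh[half:half + (half - n_old)]
        self._store(current)
        return current + old + top_up

    def _store(self, new_images):
        room = self.capacity - len(self.images)
        self.images.extend(new_images[:room])          # fill free slots first
        rest = new_images[room:]
        # Overwrite randomly chosen slots, so the buffer never exceeds B.
        slots = self.rng.sample(range(len(self.images)), len(rest))
        for slot, img in zip(slots, rest):
            self.images[slot] = img


history = RefinedImageHistory(capacity=4, seed=0)
first = history.make_batch([0, 1, 2, 3])     # buffer empty: all four are fresh
second = history.make_batch([4, 5, 6, 7])    # two fresh, two from the buffer
```

Mixing in older refined images means D keeps seeing outputs from earlier versions of the refiner, not only the latest ones.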

(16)

Experiment

– Gaze estimation

– Hand pose estimation

– Analysis of modifications of adversarial training

(17)

Gaze estimation - dataset (UnityEyes)

UnityEyes: synthetic images
MPIIGaze: real images

(18)

Gaze estimation - dataset (MPIIGaze)

(19)

Gaze estimation - Qualitative result

(20)

Gaze estimation - visual Turing test

Refined images: subjects chose the correct label 517 times out of 1000 trials, not significantly better than chance (p = 0.148)

Synthetic images: subjects chose correctly 162 times out of 200 trials (p < 10^(-8))
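
The slide does not say which test produced these p-values. One plausible reading, and only an assumption, is a one-sided exact binomial test against random guessing (success probability 0.5). A minimal sketch, where `binomial_tail` is a hypothetical helper name:

```python
from math import comb


def binomial_tail(successes, trials):
    """P(X >= successes) for X ~ Binomial(trials, 0.5), computed exactly."""
    favourable = sum(comb(trials, k) for k in range(successes, trials + 1))
    return favourable / 2 ** trials


p_refined = binomial_tail(517, 1000)    # 517 correct out of 1000 trials
p_synthetic = binomial_tail(162, 200)   # 162 correct out of 200 trials
```

Under that assumption the first value comes out close to the reported 0.148 and the second falls far below 10^(-8), consistent with the slide.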

(21)

Gaze estimation - Quantitative result

Train on Refined Images and test on MPIIGaze

(22)

Gaze estimation - Quantitative result

(23)

Hand pose estimation - dataset

NYU Hand Pose Dataset: contains real and synthetic images

(24)

Hand pose estimation - dataset

NYU Hand Pose Dataset: contains real and synthetic images; the real images were collected by Kinect

(25)

Hand pose estimation - Qualitative result

Images captured by Kinect have depth discontinuities, so the real images have ragged edges

(26)

Hand pose estimation - Quantitative result

(27)

Analysis of modifications of adversarial training

- local adversarial loss

(28)

Analysis of modifications of adversarial training - History of refined images

Reduces the artificial, faked look

(29)

Let's look back at the intro

– Synthetic images are not good enough
    – Improve the simulator
    – Improve the realism of synthetic images using U (Simulate)
– Train a refiner network to do this
    – The network uses an adversarial loss (Generative Adversarial Network)

(30)

Conclusion

– Refine a simulator's output with unlabeled data
– S+U learning can add realism to synthetic images
– S+U learning can also preserve the annotations of synthetic images
– Refined images really help improve the test results
– Generate more than one image for each synthetic image

(31)

Discussion

– A signal from Apple

– Some future applications of SimGAN

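Applying SimGANs to autonomous driving based on unsupervised learning

Imagine you are a member of Comma.AI and have a large amount of real, unlabeled driving data collected by Dash. Your current labeling method is great, but you only have a small amount of labeled data. With SimGAN, you can train a refiner neural network to refine data from Grand Theft Auto (developers use the Grand Theft Auto game to simulate real vehicles driving) so that it looks as if it came from your real dataset, while preserving the annotations. You can then train your production model on this almost unlimited, finely labeled dataset and use the small real labeled dataset for validation.

I have not taken any autonomous-driving courses, but I know they use Grand Theft Auto and simulated environments to train their models. With a technique like this, their software can get closer to the real world. SimGAN seems to have many possible real-world applications; autonomous driving is just one interesting example I chose.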

(32)

Reference

– Paper: https://arxiv.org/pdf/1612.07828.pdf

– SimGAN implemented in TensorFlow: https://github.com/carpedm20/simulated-unsupervised-tensorflow

– Comments from experts: http://mp.weixin.qq.com/s/2Ltb249M71lMWrTbhnYPEQ

– Translation of the paper: http://tech.163.com/16/1227/07/C99CBP7P00097U80.html

(33)

Thanks for listening

Any Questions?
