PART II: Variants of Neural Networks
Convolutional Neural Network (CNN)
Recurrent Neural Network (RNN)
Widely used in image processing
Why CNN for Image
Can the network be simplified by considering the properties of images?
Represented as pixels: x1, x2, ……, xN
The most basic classifiers
Use 1st layer as module to build classifiers
Use 2nd layer as module ……
(Zeiler, M. D., ECCV 2014)
Why CNN for Image
Some patterns are much smaller than the whole image
A neuron does not have to see the whole image to discover the pattern.
“beak” detector
Connecting to a small region requires fewer parameters
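The saving can be made concrete with a rough count. This is a minimal sketch: the 100 x 100 image and the 5 x 5 region are hypothetical sizes chosen for illustration, not values from the slides.

```python
# Hypothetical sizes chosen for illustration; they are not from the slides.
image_height, image_width = 100, 100
region_height, region_width = 5, 5

# A neuron that sees the whole image needs one weight per pixel, plus a bias.
fully_connected_params = image_height * image_width + 1

# A neuron that sees only a small region needs weights for that region, plus a bias.
local_params = region_height * region_width + 1

print(fully_connected_params)  # 10001
print(local_params)            # 26
```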
Why CNN for Image
The same patterns appear in different regions.
“upper-left beak” detector
“middle beak” detector
They do almost the same thing, so they can use the same set of parameters.
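Sharing parameters can be sketched as one detector applied at several positions. The 2 x 2 weights and the 4 x 4 toy image below are hypothetical; the point is only that the same weights respond to the same pattern wherever it appears.

```python
def response(image, kernel, top, left):
    """Weighted sum of the image patch whose upper-left corner is (top, left)."""
    return sum(
        kernel[i][j] * image[top + i][left + j]
        for i in range(len(kernel))
        for j in range(len(kernel[0]))
    )

# One set of weights (hypothetical) that detects a small diagonal pattern.
kernel = [[1, -1],
          [-1, 1]]

# Toy image: the pattern appears in the upper-left and in the lower-right corner.
image = [[1, 0, 0, 0],
         [0, 1, 0, 0],
         [0, 0, 1, 0],
         [0, 0, 0, 1]]

upper_left = response(image, kernel, 0, 0)    # pattern present
lower_right = response(image, kernel, 2, 2)   # same pattern, different region
elsewhere = response(image, kernel, 0, 2)     # no pattern here

print(upper_left, lower_right, elsewhere)  # 2 2 0
```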
Why CNN for Image
Subsampling the pixels will not change the object
[Figure: a bird image and its subsampled version, which is still recognizable as a bird]
We can subsample the pixels to make the image smaller
Fewer parameters for the network to process the image
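One simple way to subsample is to keep every second pixel in each direction. The slides do not say which subsampling scheme is meant, so treat this as an illustrative assumption.

```python
def subsample(image, step=2):
    """Keep every `step`-th row and every `step`-th column of the image."""
    return [row[::step] for row in image[::step]]

# Toy 4x4 "image" (hypothetical pixel values).
image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]

small = subsample(image)
print(small)  # [[1, 3], [9, 11]]
```

The result has a quarter of the pixels, so any layer that processes it sees a correspondingly smaller input.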
Three Steps for Deep Learning
Step 1: define a set of functions
Step 2: goodness of function
Step 3: pick the best function
Deep Learning is so simple ……
Convolutional Neural Network
Image Recognition
http://www.cs.toronto.edu/~fritz/absps/imagenet.pdf
The Whole CNN
Convolution
Max Pooling
Convolution
Max Pooling
(Convolution and Max Pooling can repeat many times)
Flatten
Fully Connected Feedforward network
cat dog ……
The Whole CNN
Convolution
Property 1: Some patterns are much smaller than the whole image
Property 2: The same patterns appear in different regions
Max Pooling
Property 3: Subsampling the pixels will not change the object
(Convolution and Max Pooling can repeat many times)
Flatten
Local Connectivity
Neurons connect to a small region
Parameter Sharing
The same feature in different positions
Neurons share the same weights
Parameter Sharing
Different features in the same position
Neurons have different weights
Convolutional Layers
[Figure: two volumes, each labeled with width, height, and depth, connected by weights; the weights are shared]
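Because the weights are shared across positions, the parameter count of a convolutional layer depends on the filter size and the depths, not on the width or height of the input. A minimal sketch, with a hypothetical helper name and hypothetical sizes:

```python
def conv_layer_params(filter_height, filter_width, input_depth, num_filters):
    """Learnable parameters of a convolutional layer whose weights are shared."""
    weights_per_filter = filter_height * filter_width * input_depth
    return num_filters * (weights_per_filter + 1)  # +1 for each filter's bias

# Hypothetical example: two 3x3 filters over an input of depth 3.
params = conv_layer_params(3, 3, 3, 2)
print(params)  # 56
```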
Convolutional Layers
[Figure: units c1, c2, b1, b2 (depth = 2) connected to units a1, a2, a3 (depth = 1)]
Convolutional Layers
[Figure: units c1, c2, d1, d2 (depth = 2) connected to units a1, a2, a3, b1, b2, b3 (depth = 2)]
Convolutional Layers
[Figure: panels labeled A, B, C and A, B, C, D]
Hyper-parameters of CNN
Stride: Stride = 1, Stride = 2
Padding: Padding = 0, Padding = 1 (zeros added around the border)
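The effect of stride and padding on the output size can be sketched with the usual relation (input + 2 x padding - filter) / stride + 1 along each spatial dimension. The function name is hypothetical, and the sizes in the usage lines are illustrative.

```python
def conv_output_size(input_size, filter_size, stride=1, padding=0):
    """Output size of a convolution along one spatial dimension."""
    span = input_size + 2 * padding - filter_size
    if span < 0 or span % stride != 0:
        raise ValueError("the filter does not tile the padded input evenly")
    return span // stride + 1

# A 5-wide input with a 3-wide filter (illustrative sizes).
print(conv_output_size(5, 3, stride=1, padding=0))  # 3
print(conv_output_size(5, 3, stride=1, padding=1))  # 5
print(conv_output_size(5, 3, stride=2, padding=1))  # 3
```

With Padding = 1 and Stride = 1, a 3-wide filter leaves the size unchanged, which is a common reason for padding.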
Example
Input Volume (7x7x3 including the padding, 5x5x3 without it)
Filter (3x3x3), two filters
Stride = 2
Padding = 1
Output Volume (3x3x2)
http://cs231n.github.io/convolutional-networks/
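The shapes in this example can be checked with a small, dependency-free sketch of a convolutional layer. The all-ones input and filters and the two bias values are hypothetical stand-ins (the cs231n demo uses other numbers), and the code assumes square inputs and square filters.

```python
def pad2d(channel, padding):
    """Surround one 2-D channel with `padding` rows and columns of zeros."""
    width = len(channel[0]) + 2 * padding
    padded = [[0] * width for _ in range(padding)]
    for row in channel:
        padded.append([0] * padding + list(row) + [0] * padding)
    padded.extend([0] * width for _ in range(padding))
    return padded

def conv_layer(volume, filters, biases, stride, padding):
    """volume: list of channels; filters: one list of per-channel kernels per filter."""
    padded = [pad2d(channel, padding) for channel in volume]
    size = len(padded[0])            # assumes square channels
    f = len(filters[0][0])           # assumes square kernels
    out_size = (size - f) // stride + 1
    output = []
    for kernels, bias in zip(filters, biases):
        feature_map = []
        for r in range(out_size):
            row = []
            for c in range(out_size):
                total = bias
                # Sum over all input channels and all kernel positions.
                for channel, kernel in zip(padded, kernels):
                    for i in range(f):
                        for j in range(f):
                            total += kernel[i][j] * channel[r * stride + i][c * stride + j]
                row.append(total)
            feature_map.append(row)
        output.append(feature_map)
    return output

# Hypothetical data: 5x5x3 input of ones, two 3x3x3 filters of ones.
volume = [[[1] * 5 for _ in range(5)] for _ in range(3)]
filters = [[[[1] * 3 for _ in range(3)] for _ in range(3)] for _ in range(2)]
biases = [0, 1]

output = conv_layer(volume, filters, biases, stride=2, padding=1)
print(len(output), len(output[0]), len(output[0][0]))  # 2 3 3
print(output[0][1][1])  # 27: the centre window covers 9 ones in each of 3 channels
print(output[0][0][0])  # 12: the corner window overlaps the zero padding
```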
Convolutional Layers
http://cs231n.github.io/convolutional-networks/
Pooling Layer
Input (depth = 1):
1 3 2 4
5 7 6 8
0 0 3 3
5 5 0 0
Maximum Pooling (no overlap, no weights):
7 8
5 3
Max(1,3,5,7) = 7, Max(0,0,5,5) = 5
Average Pooling (no overlap, no weights):
4 5
2.5 1.5
Avg(1,3,5,7) = 4
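The pooling example can be reproduced with a short sketch. The helper names are hypothetical; the input is the 4 x 4 map from the slide, pooled over non-overlapping 2 x 2 windows.

```python
def pool(feature_map, size, reduce):
    """Apply `reduce` to each non-overlapping size x size window."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            reduce([feature_map[r + i][c + j] for i in range(size) for j in range(size)])
            for c in range(0, cols, size)
        ]
        for r in range(0, rows, size)
    ]

def mean(values):
    return sum(values) / len(values)

feature_map = [[1, 3, 2, 4],
               [5, 7, 6, 8],
               [0, 0, 3, 3],
               [5, 5, 0, 0]]

max_pooled = pool(feature_map, 2, max)
avg_pooled = pool(feature_map, 2, mean)

print(max_pooled)  # [[7, 8], [5, 3]]
print(avg_pooled)  # [[4.0, 5.0], [2.5, 1.5]]
```

Neither operation has weights to learn; only the window size is chosen.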
Why “Deep” Learning?
Visual Perception of Human
http://www.nature.com/neuro/journal/v8/n8/images/nn0805-975-F1.jpg
Visual Perception of Computer
[Figure: Input Layer, Convolutional Layer, Pooling Layer, Convolutional Layer, Pooling Layer; receptive fields are marked between the layers]
Visual Perception of Computer
Input Layer: Input Image
Convolutional Layer with Receptive Fields: Filter Responses
Max-pooling Layer with Width = 3, Height = 3: Filter Responses
Fully-Connected Layers: Global feature extraction
Softmax Layer: Classifier
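The softmax layer turns the scores from the last fully-connected layer into class probabilities. A minimal sketch with hypothetical scores:

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

scores = [2.0, 1.0, 0.1]  # hypothetical outputs of the last fully-connected layer
probabilities = softmax(scores)
predicted_class = probabilities.index(max(probabilities))

print(predicted_class)  # 0
```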
[Figure: Input Image, Input Layer, Convolutional Layer, Pooling Layer, Convolutional Layer, Pooling Layer, Fully-Connected Layer, Softmax Layer, Class Label (5, 7)]
Convolutional Neural Network
Step 1: define a set of functions
Step 2: goodness of function
Step 3: pick the best function
Convolutional Neural Network
CNN: Convolution, Max Pooling, fully connected
Output classes: “monkey”, “cat”, “dog”, ……
target: 1, 0, 0, ……
What CNN Learned
AlexNet
http://www.cs.toronto.edu/~fritz/absps/imagenet.pdf
http://vision03.csail.mit.edu/cnn_art/data/single_layer.png
DNNs are easily fooled
Nguyen et al., “Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images,” arXiv:1412.1897.
Visualizing CNN
[Figure: a flower image and random noise are each passed through the CNN to obtain a filter response]
filter response:
Gradient Ascent
Magnify the filter response
random noise:
score:
lower score
higher score
gradient:
filter response:
Gradient Ascent
Magnify the filter response
random noise:
gradient:
lower score
higher score
update
learning rate
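The update can be sketched on a toy case. Here a single linear filter stands in for the CNN, so the gradient of the response with respect to each pixel is simply the corresponding weight; in a real CNN this gradient would come from backpropagation. The weights, learning rate, and step count are hypothetical.

```python
import random

def filter_response(image, kernel):
    """Response of one linear filter: the sum of weight * pixel."""
    return sum(
        kernel[i][j] * image[i][j]
        for i in range(len(kernel))
        for j in range(len(kernel[0]))
    )

random.seed(0)
kernel = [[1.0, -1.0],
          [-1.0, 1.0]]  # hypothetical filter weights
# Start from random noise.
image = [[random.uniform(-0.1, 0.1) for _ in range(2)] for _ in range(2)]
learning_rate = 0.1

response_before = filter_response(image, kernel)
for _ in range(10):
    for i in range(2):
        for j in range(2):
            # Gradient ascent: move each pixel along the gradient of the response.
            # For this linear filter, d(response)/d(image[i][j]) == kernel[i][j].
            image[i][j] += learning_rate * kernel[i][j]
response_after = filter_response(image, kernel)

print(response_after > response_before)  # True
```

Each step raises the response by the learning rate times the squared norm of the gradient, which is why the score moves from lower to higher.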
Gradient Ascent
Different Layers of Visualization
CNN
Multiscale Image Generation
visualize, resize, visualize, resize, visualize
Deep Dream
Given a photo, the machine adds what it sees ……
http://deepdreamgenerator.com/
[Figure: the CNN turns the photo into a vector of responses (3.9, −1.5, 2.3, ⋮); the image is then modified]
CNN exaggerates what it sees
Deep Style
Given a photo, make its style like famous paintings
http://deepdreamgenerator.com/
Deep Style
A Neural Algorithm of Artistic Style, https://arxiv.org/abs/1508.06576
[Figure: CNN (content), CNN (style), CNN (?)]
Neural Art Mechanism
[Figure: an artist's brain turns a scene and a style into an artwork; a computer does the same with neural networks]
More Application: Playing Go
Network input: the board as a 19 x 19 vector (Black: 1, White: -1, none: 0)
Network output: the next move (19 x 19 positions)
A fully-connected feedforward network can be used on the 19 x 19 vector,
but a CNN performs much better on the 19 x 19 matrix (image).
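The board encoding (Black: 1, White: -1, none: 0) can be sketched as follows. The helper name, the zero-indexed coordinates, and the stone positions are hypothetical.

```python
BLACK, WHITE, EMPTY = 1, -1, 0
BOARD_SIZE = 19

def encode_board(black_stones, white_stones):
    """Return the board as a 19 x 19 matrix of 1 (black), -1 (white), 0 (empty)."""
    board = [[EMPTY] * BOARD_SIZE for _ in range(BOARD_SIZE)]
    for row, col in black_stones:
        board[row][col] = BLACK
    for row, col in white_stones:
        board[row][col] = WHITE
    return board

# Hypothetical position, zero-indexed: one black stone, one white stone in the centre.
board = encode_board(black_stones=[(4, 4)], white_stones=[(9, 9)])

# A fully-connected network would take the flattened vector;
# a CNN keeps the 19 x 19 matrix and treats it like an image.
flat = [value for row in board for value in row]
print(len(flat))  # 361
```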
Training: record of previous plays
Black: 5-5 point (5之五), White: Tengen (天元), Black: 5-5 point (五之5) …
CNN target: “Tengen” (天元) = 1, else = 0
CNN target: “5-5 point” (五之5) = 1, else = 0
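The training target for one position is a vector with a 1 at the move that was actually played and 0 elsewhere. A minimal sketch, assuming zero-indexed coordinates and a row-major flattening (both are illustrative choices, not from the slides):

```python
def move_target(row, col, size=19):
    """One-hot target over all board positions for the move that was played."""
    target = [0] * (size * size)
    target[row * size + col] = 1
    return target

# Tengen is the centre point of the board: (9, 9) when counting from zero.
tengen_target = move_target(9, 9)
print(len(tengen_target), tengen_target.index(1))  # 361 180
```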
Why CNN for playing Go?
Some patterns are much smaller than the whole image
The same patterns appear in different regions
AlphaGo uses 5 x 5 filters in the first layer
Why CNN for playing Go?
Subsampling the pixels will not change the object
How to explain this???
PART II: Variants of Neural Networks
Convolutional Neural Network (CNN)
Recurrent Neural Network (RNN)
Neural Network with Memory
Example Application
Slot Filling
Input to the ticket booking system: “I would like to arrive Taipei on November 2nd.”
Slot
Destination: Taipei
time of arrival: November 2nd
Example Application
Solving slot filling by feedforward network?
Input: a word, e.g. “Taipei” (each word is represented as a vector)
[Figure: feedforward network with inputs x1, x2 and outputs y1, y2]
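One common way to represent each word as a vector is a one-hot (1-of-N) encoding over a vocabulary. The slides do not specify the encoding at this point, so this is an assumption, and the vocabulary below is just the words of the example sentence.

```python
def one_hot(word, vocabulary):
    """Vector with a 1 at the word's index in the vocabulary and 0 elsewhere."""
    vector = [0] * len(vocabulary)
    vector[vocabulary.index(word)] = 1
    return vector

sentence = "I would like to arrive Taipei on November 2nd".split()
vocabulary = sorted(set(sentence))  # toy vocabulary built from the example sentence

taipei_vector = one_hot("Taipei", vocabulary)
print(len(taipei_vector), sum(taipei_vector))  # 9 1
```

A feedforward network would receive one such vector at a time, with no access to the other words in the sentence.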