Deep Learning of Binary Hash Codes for Fast Image Retrieval
Kevin Lin, Huei-Fang Yang, Jen-Hao Hsiao, Chu-Song Chen
Institute of Information Science (IIS), Academia Sinica, Taipei, Taiwan
Research Center for Information Technology Innovation (CITI), Academia Sinica, Taipei, Taiwan
Yahoo! Taiwan
IEEE Conference on Computer Vision and Pattern Recognition, DeepVision Workshop, 2015.
Extended version in arXiv pre-print 1507.00101
Large-scale Image Search
Query
Top Retrieved Image
Database
Search Strategy
• Images are represented by features.
• Nearest neighbor search: neighbors of a point are determined by Euclidean distance.
• Challenge: How to efficiently search over millions or billions of images?
[Figure: a query image and database images, each represented by a feature vector]
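The exhaustive search strategy above can be sketched in a few lines. This is a minimal illustration with synthetic stand-in features (the array shapes and random data are assumptions, not the paper's setup):

```python
import numpy as np

# Hypothetical setup: a database of 4096-d deep features and one query.
rng = np.random.default_rng(0)
database = rng.standard_normal((1000, 4096))   # 1000 database images
query = rng.standard_normal(4096)              # one query image

# Exhaustive nearest-neighbor search by Euclidean distance:
# compute the distance from the query to every database feature,
# then take the closest one as the top retrieved image.
distances = np.linalg.norm(database - query, axis=1)
nearest = int(np.argmin(distances))
```

At millions or billions of images, this linear scan over high-dimensional real-valued features is the bottleneck that motivates binary codes.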
Solution: Binary Codes
• Images are represented by binary codes.
• Fast search can be carried out via Hamming distance measurement. (XOR operation)
[Figure: query and database images represented by binary codes; e.g., query code 111000 vs. database codes 111000 and 111110]
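Hamming distance between two binary codes reduces to an XOR followed by a popcount, which is why binary codes make search fast. A small sketch using the 6-bit codes from the slide:

```python
# Hamming distance between two binary codes via XOR + popcount.
def hamming(a: int, b: int) -> int:
    # XOR leaves a 1 bit wherever the two codes differ; count those bits.
    return bin(a ^ b).count("1")

q  = 0b111000  # query code
d1 = 0b111000  # identical database code
d2 = 0b111110  # database code differing in two bits

print(hamming(q, d1))  # 0
print(hamming(q, d2))  # 2
```

On real hardware the XOR and popcount of a 64-bit code each take a single instruction, versus thousands of floating-point operations for a 4096-d Euclidean distance.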
Related works: Learning Binary Codes (1/2)
• Unsupervised learning
– Use only training data, no label info.
– Locality-Sensitive Hashing (LSH) [1]
– Iterative quantization (ITQ) [2]
• Supervised learning
– Use supervised info. (labels, pairwise similarities, ...)
– Binary reconstructive embedding (BRE) [3]
– Minimal loss hashing (MLH) [4]
[1] A. Andoni and P. Indyk, “Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions,” in Proc. FOCS, 2006.
[2] Y. Gong, S. Lazebnik, A. Gordo, and F. Perronnin, “Iterative quantization: A procrustean approach to learning binary codes for large-scale image retrieval,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 35, no. 12, pp. 2916–2929, 2013.
[3] B. Kulis and T. Darrell, “Learning to hash with binary reconstructive embeddings,” in Proc. NIPS, 2009.
[4] M. Norouzi and D. J. Fleet, “Minimal loss hashing for compact binary codes,” in Proc. ICML, 2011.
Related works: Learning Binary Codes (2/2)
• Supervised deep learning
– Take advantage of deep neural network
– Convolutional Neural Network Hashing (CNNH) [5]
– Deep Neural Network Hashing (DNNH) [6]
[5] R. Xia, Y. Pan, H. Lai, C. Liu, and S. Yan. Supervised hashing for image retrieval via image representation learning. In Proc.
AAAI Conference on Artificial Intelligence (AAAI), 2014.
[6] H. Lai, Y. Pan, Y. Liu, and S. Yan. Simultaneous feature learning and hash coding with deep neural networks. In Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
Goal
• Can we take advantage of deep CNNs to achieve hashing?
• Can we generate compact binary codes directly from a deep CNN?
• Solution: Supervised Semantic-preserving Deep Hashing (SSDH)
Approach
• Assume the classification outputs rely on a set of hidden binary attributes, each either on or off.
[Figure: an input image is mapped to binary attributes (0/1 activations), which determine the classification output]
Approach
• We add a fully connected latent layer H between F7 and F8.
• The neurons in H are activated by sigmoid functions.
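The forward pass through the added layer can be sketched with NumPy. This is a simplified illustration, not the actual network: the weight matrices, the 48-bit code length, and the 10-class output are hypothetical stand-ins for the learned parameters.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
n_bits, n_classes = 48, 10   # assumed code length and class count

# Hypothetical weights; in SSDH these are learned jointly with the network.
W_H  = rng.standard_normal((4096, n_bits)) * 0.01     # F7 -> latent layer H
W_F8 = rng.standard_normal((n_bits, n_classes)) * 0.01  # H -> classification F8

f7 = rng.standard_normal(4096)   # stand-in for an F7 feature vector

h = sigmoid(f7 @ W_H)            # latent activations, each in (0, 1)
scores = h @ W_F8                # classification scores computed from H
```

Because F8 sees only H, the classification loss forces the latent activations to carry the label semantics, which is what makes them usable as hash bits.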
Approach
• Overall learning objective
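The objective equation on the slide did not survive extraction. In the extended arXiv version, the objective combines the classification loss with terms that encourage binary-like and balanced latent activations; a hedged sketch (the weights and notation below are mine, not verbatim from the slide):

```latex
\min_{W}\; \alpha E_1(W) + \beta E_2(W) + \gamma E_3(W)
```

where $E_1$ is the classification (e.g., softmax) loss, $E_2(W) = -\frac{1}{N}\sum_n \lVert a_n - 0.5\,\mathbf{e} \rVert^2$ pushes each latent sigmoid activation vector $a_n$ toward 0 or 1, and $E_3(W) = \lVert \frac{1}{N}\sum_n a_n - 0.5\,\mathbf{e} \rVert^2$ encourages each bit to fire about half the time, for discriminative codes.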
Approach
• Compute binary codes for fast image retrieval
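Since the latent activations are sigmoid outputs in (0, 1), a binary code follows by thresholding at 0.5. A minimal sketch (the example activation values are made up):

```python
import numpy as np

# Binarize latent-layer activations (sigmoid outputs in (0, 1))
# by thresholding at 0.5, yielding the binary hash code.
def binarize(h: np.ndarray) -> np.ndarray:
    return (h >= 0.5).astype(np.uint8)

h = np.array([0.9, 0.1, 0.7, 0.4, 0.51])  # example latent activations
code = binarize(h)
print(code)  # [1 0 1 0 1]
```

Retrieval then ranks database images by Hamming distance between these codes.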
Experiments
• Datasets
CIFAR10 MNIST SUN397 Yahoo-1M
Experiments
• CIFAR10
• MNIST
[Figures: retrieval results on CIFAR10 and MNIST]
Experiments
• SUN397
[7] G. Lin, C. Shen, Q. Shi, A. van den Hengel, and D. Suter, “Fast supervised hashing with decision trees for high-dimensional data,” in Proc. CVPR, 2014.
[8] Y. Gong, S. Lazebnik, A. Gordo, and F. Perronnin, “Iterative quantization: A procrustean approach to learning binary codes for large-scale image retrieval,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 35, no. 12, pp. 2916–2929, 2013.
[9] A. Andoni and P. Indyk, “Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions,” in Proc. FOCS, 2006.
Experiments
• Yahoo-1M
Experiments
[Figure: queries and their top-5 retrieved images, comparing AlexNet features with ours]
Experiments
• Image classification results
Experiments
• Search with 64-bit binary codes is approximately 982,600× faster than traditional exhaustive search with 4096-dimensional deep features.

Descriptor      Measurement         Time
CNN-fc7-4096    Euclidean distance  22.6 μs
SSDH-64         Hamming distance    23.0 ps
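The ~982,600× figure follows directly from the per-comparison times in the table:

```python
euclidean = 22.6e-6   # 22.6 us per 4096-d Euclidean distance
hamming   = 23.0e-12  # 23.0 ps per 64-bit Hamming distance

speedup = euclidean / hamming
print(round(speedup))  # 982609
```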
Conclusion
• SSDH constructs hash functions as a latent layer between the feature layer and the classification layer of a network.
• SSDH jointly learns binary codes, features, and classification by optimizing the network parameters with the proposed objective function.
• SSDH is scalable to large-scale search.
Application: Mobile clothing search
• The technology has been integrated in
• Download Apps
Thank you!
Download our codes and models at
https://github.com/kevinlin311tw/caffe-cvprw15