Deep Learning of Binary Hash Codes for Fast Image Retrieval
Kevin Lin, Huei-Fang Yang, Jen-Hao Hsiao, Chu-Song Chen
Institute of Information Science (IIS), Academia Sinica, Taipei, Taiwan
Research Center for Information Technology Innovation (CITI), Academia Sinica, Taipei, Taiwan
Yahoo! Taiwan
IEEE Conference on Computer Vision and Pattern Recognition, DeepVision Workshop, 2015.
Extended version in arXiv pre-print 1507.00101
Large-scale Image Search
Query
Top Retrieved Image
Database
Search Strategy
• Images are represented by features.
• Nearest neighbor search: neighbors of a point are determined by Euclidean distance.
• Challenge: How to efficiently search over millions or billions of images?
[Figure: a query image and database images, each represented by a feature vector]
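The exhaustive search strategy above can be sketched in a few lines. This is a minimal illustration with synthetic stand-in features (the array shapes and random data are assumptions, not the paper's setup):

```python
import numpy as np

# Hypothetical setup: a database of 4096-d deep features and one query.
rng = np.random.default_rng(0)
database = rng.standard_normal((1000, 4096))   # 1000 database images
query = rng.standard_normal(4096)              # one query image

# Exhaustive nearest-neighbor search by Euclidean distance:
# compute the distance from the query to every database feature,
# then take the closest one as the top retrieved image.
distances = np.linalg.norm(database - query, axis=1)
nearest = int(np.argmin(distances))
```

At millions or billions of images, this linear scan over high-dimensional real-valued features is the bottleneck that motivates binary codes.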
Solution: Binary Codes
• Images are represented by binary codes.
• Fast search can be carried out via Hamming distance measurement. (XOR operation)
[Figure: query and database images represented by binary codes; e.g., query code 111000 vs. database codes 111000 and 111110]
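Hamming distance between two binary codes reduces to an XOR followed by a popcount, which is why binary codes make search fast. A small sketch using the 6-bit codes from the slide:

```python
# Hamming distance between two binary codes via XOR + popcount.
def hamming(a: int, b: int) -> int:
    # XOR leaves a 1 bit wherever the two codes differ; count those bits.
    return bin(a ^ b).count("1")

q  = 0b111000  # query code
d1 = 0b111000  # identical database code
d2 = 0b111110  # database code differing in two bits

print(hamming(q, d1))  # 0
print(hamming(q, d2))  # 2
```

On real hardware the XOR and popcount of a 64-bit code each take a single instruction, versus thousands of floating-point operations for a 4096-d Euclidean distance.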
Related works: Learning Binary Codes (1/2)
• Unsupervised learning
– Use only training data, no label info.
– Locality-Sensitive Hashing (LSH) [1]
– Iterative quantization (ITQ) [2]
• Supervised learning
– Use supervised info. (labels, pairwise similarities, ...)
– Binary reconstructive embedding (BRE) [3]
– Minimal loss hashing (MLH) [4]
[1] A. Andoni and P. Indyk, “Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions,” in Proc. FOCS, 2006.
[2] Y. Gong, S. Lazebnik, A. Gordo, and F. Perronnin, “Iterative quantization: A procrustean approach to learning binary codes for large-scale image retrieval,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 35, no. 12, pp. 2916–2929, 2013.
[3] B. Kulis and T. Darrell, “Learning to hash with binary reconstructive embeddings,” in Proc. NIPS, 2009.
[4] M. Norouzi and D. J. Fleet, “Minimal loss hashing for compact binary codes,” in Proc. ICML, 2011.
Related works: Learning Binary Codes (2/2)
• Supervised deep learning
– Take advantage of deep neural network
– Convolutional Neural Network Hashing (CNNH) [5]
– Deep Neural Network Hashing (DNNH) [6]
[5] R. Xia, Y. Pan, H. Lai, C. Liu, and S. Yan. Supervised hashing for image retrieval via image representation learning. In Proc.
AAAI Conference on Artificial Intelligence (AAAI), 2014.
[6] H. Lai, Y. Pan, Y. Liu, and S. Yan. Simultaneous feature learning and hash coding with deep neural networks. In Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
Goal
• Can we take advantage of deep CNNs to achieve hashing?
• Can we generate compact binary codes directly from a deep CNN?
• Solution: Supervised Semantic-preserving Deep Hashing (SSDH)
Approach
• Assume the classification outputs rely on a set of hidden binary attributes, each either on or off.
[Figure: an input image is mapped to binary attributes (0/1 activations), which determine the classification output]
Approach
• We add a fully connected latent layer H between F7 and F8.
• The neurons in H are activated by sigmoid functions.
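The forward pass through the added layer can be sketched with NumPy. This is a simplified illustration, not the actual network: the weight matrices, the 48-bit code length, and the 10-class output are hypothetical stand-ins for the learned parameters.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
n_bits, n_classes = 48, 10   # assumed code length and class count

# Hypothetical weights; in SSDH these are learned jointly with the network.
W_H  = rng.standard_normal((4096, n_bits)) * 0.01     # F7 -> latent layer H
W_F8 = rng.standard_normal((n_bits, n_classes)) * 0.01  # H -> classification F8

f7 = rng.standard_normal(4096)   # stand-in for an F7 feature vector

h = sigmoid(f7 @ W_H)            # latent activations, each in (0, 1)
scores = h @ W_F8                # classification scores computed from H
```

Because F8 sees only H, the classification loss forces the latent activations to carry the label semantics, which is what makes them usable as hash bits.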
Approach
• Overall learning objective
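The objective equation on the slide did not survive extraction. In the extended arXiv version, the objective combines the classification loss with terms that encourage binary-like and balanced latent activations; a hedged sketch (the weights and notation below are mine, not verbatim from the slide):

```latex
\min_{W}\; \alpha E_1(W) + \beta E_2(W) + \gamma E_3(W)
```

where $E_1$ is the classification (e.g., softmax) loss, $E_2(W) = -\frac{1}{N}\sum_n \lVert a_n - 0.5\,\mathbf{e} \rVert^2$ pushes each latent sigmoid activation vector $a_n$ toward 0 or 1, and $E_3(W) = \lVert \frac{1}{N}\sum_n a_n - 0.5\,\mathbf{e} \rVert^2$ encourages each bit to fire about half the time, for discriminative codes.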
Approach
• Compute binary codes for fast image retrieval
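Since the latent activations are sigmoid outputs in (0, 1), a binary code follows by thresholding at 0.5. A minimal sketch (the example activation values are made up):

```python
import numpy as np

# Binarize latent-layer activations (sigmoid outputs in (0, 1))
# by thresholding at 0.5, yielding the binary hash code.
def binarize(h: np.ndarray) -> np.ndarray:
    return (h >= 0.5).astype(np.uint8)

h = np.array([0.9, 0.1, 0.7, 0.4, 0.51])  # example latent activations
code = binarize(h)
print(code)  # [1 0 1 0 1]
```

Retrieval then ranks database images by Hamming distance between these codes.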
Experiments
• Datasets
CIFAR10 MNIST SUN397 Yahoo-1M
Experiments
• CIFAR10
• MNIST
[Figures: retrieval results on CIFAR10 and MNIST]
Experiments
• SUN397
[7] G. Lin, C. Shen, Q. Shi, A. van den Hengel, and D. Suter, “Fast supervised hashing with decision trees for high-dimensional data,” in Proc. CVPR, 2014.
[8] Y. Gong, S. Lazebnik, A. Gordo, and F. Perronnin, “Iterative quantization: A procrustean approach to learning binary codes for large-scale image retrieval,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 35, no. 12, pp. 2916–2929, 2013.
[9] A. Andoni and P. Indyk, “Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions,” in Proc. FOCS, 2006.
Experiments
• Yahoo-1M
Experiments
[Figure: queries and their top-5 retrieved images, comparing AlexNet features with ours]
Experiments
• Image classification results
Experiments
• Search with 64-bit binary codes is approximately 982,600× faster than traditional exhaustive search with 4096-dimensional deep features.

Descriptor      Measurement         Time
CNN-fc7-4096    Euclidean distance  22.6 μs
SSDH-64         Hamming distance    23.0 ps
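The ~982,600× figure follows directly from the per-comparison times in the table:

```python
euclidean = 22.6e-6   # 22.6 us per 4096-d Euclidean distance
hamming   = 23.0e-12  # 23.0 ps per 64-bit Hamming distance

speedup = euclidean / hamming
print(round(speedup))  # 982609
```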
Conclusion
• SSDH constructs hash functions as a latent layer between the feature layer and the classification layer of a network.
• SSDH jointly learns binary codes, features, and classification by optimizing the network parameters with the proposed objective function.
• SSDH is scalable to large-scale search.
Application: Mobile clothing search
• The technology has been integrated in
• Download Apps
Thank you!
Download our codes and models at
https://github.com/kevinlin311tw/caffe-cvprw15