
(1)

Deep Learning of Binary Hash Codes for Fast Image Retrieval

Kevin Lin, Huei-Fang Yang, Jen-Hao Hsiao, Chu-Song Chen

Institute of Information Science (IIS), Academia Sinica, Taipei, Taiwan

Research Center for Information Technology Innovation (CITI), Academia Sinica, Taipei, Taiwan

Yahoo! Taiwan

IEEE Conference on Computer Vision and Pattern Recognition, DeepVision Workshop, 2015.

Extended version in arXiv pre-print arXiv:1507.00101.

(2)

Large-scale Image Search

[Figure: a query image is matched against the image database to return the top retrieved images.]

(3)

Search Strategy

• Images are represented by features.

• Nearest neighbor search: neighbors of a point are determined by Euclidean distance.

• Challenge: How to efficiently search over millions or billions of images?


[Figure: a feature vector is extracted from the query image and from each database image, and the query's features are compared against all of them.]
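To make the baseline concrete, here is a minimal brute-force sketch of the exhaustive Euclidean nearest-neighbor search described above (illustrative only; the function names and data are ours, not the paper's code):

```python
import numpy as np

def nearest_neighbors(query, db, k=5):
    """Exhaustive search: rank every database feature by Euclidean distance."""
    d = np.linalg.norm(db - query, axis=1)  # one distance per database image
    return np.argsort(d)[:k]                # indices of the k closest images

# Toy setup: 4096-d deep features for 10,000 database images.
rng = np.random.default_rng(0)
db = rng.standard_normal((10_000, 4096)).astype(np.float32)
print(nearest_neighbors(db[42], db, k=3))   # image 42 ranks first (distance 0)
```

Every query touches every database vector, so the cost grows linearly with both database size and feature dimension, which is exactly what becomes prohibitive at millions or billions of images.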

(4)

Solution: Binary Codes

• Images are represented by binary codes.

• Fast search can be carried out by measuring Hamming distance, computed with XOR operations.

[Figure: the query image and database images are each represented by short binary codes, e.g., 111000 for the query and 111000, 111110 for the database images.]
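As a hedged illustration of the XOR-based matching above, using the example codes from the figure (the byte-packing scheme here is our own choice, not the paper's):

```python
import numpy as np

# Pack each 6-bit code into bytes so XOR and popcount operate on whole bytes.
db_codes = np.packbits(np.array([[1, 1, 1, 0, 0, 0],
                                 [1, 1, 1, 1, 1, 0]], dtype=np.uint8), axis=1)
query = np.packbits(np.array([[1, 1, 1, 0, 0, 0]], dtype=np.uint8), axis=1)

# Hamming distance = number of set bits in the XOR of two codes.
xor = np.bitwise_xor(db_codes, query)          # byte-wise XOR
dist = np.unpackbits(xor, axis=1).sum(axis=1)  # popcount of differing bits
print(dist)  # [0 2]: 111000 matches exactly, 111110 differs in 2 bits
```

On real hardware the XOR and popcount each map to a single instruction, which is why Hamming-distance search is so much cheaper than Euclidean distance over floating-point features.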

(5)

Related works: Learning Binary Codes (1/2)

• Unsupervised learning

– Use only training data, no label info.

– Locality-Sensitive Hashing (LSH) [1]

– Iterative quantization (ITQ) [2]

• Supervised learning

– Use supervised info (labels, pairwise similarities, ...)

– Binary reconstructive embedding (BRE) [3]

– Minimal loss hashing (MLH) [4]

[1] A. Andoni and P. Indyk, “Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions,” in Proc. FOCS, 2006.

[2] Y. Gong, S. Lazebnik, A. Gordo, and F. Perronnin, “Iterative quantization: A procrustean approach to learning binary codes for large-scale image retrieval,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 35, no. 12, pp. 2916–2929, 2013.

[3] B. Kulis and T. Darrell, “Learning to hash with binary reconstructive embeddings,” in Proc. NIPS, 2009.

[4] M. Norouzi and D. J. Fleet, “Minimal loss hashing for compact binary codes,” in Proc. ICML, 2011.


(6)

Related works: Learning Binary Codes (2/2)

• Supervised deep learning

– Takes advantage of deep neural networks

– Convolutional Neural Network Hashing (CNNH) [5]

– Deep Neural Network Hashing (DNNH) [6]

[5] R. Xia, Y. Pan, H. Lai, C. Liu, and S. Yan, “Supervised hashing for image retrieval via image representation learning,” in Proc. AAAI, 2014.

[6] H. Lai, Y. Pan, Y. Liu, and S. Yan, “Simultaneous feature learning and hash coding with deep neural networks,” in Proc. CVPR, 2015.

(7)

Goal

• Can we take advantage of deep CNNs to achieve hashing?

• Can we generate compact binary codes directly from a deep CNN?

• Solution: Supervised Semantic-preserving Deep Hashing (SSDH)

7

(8)

Approach

• Assume the classification outputs rely on a set of hidden binary attributes, each either on or off.

[Figure: an input image is mapped to a set of binary attributes (0/1 activations), which in turn produce the classification output.]

(9)

Approach

• We add a fully connected latent layer H between F7 and F8.

• The neurons in H are activated by sigmoid functions.
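A minimal sketch of this architecture in PyTorch, assuming torchvision's pretrained AlexNet as a stand-in for the Caffe model the authors actually use (class and variable names are ours; the released implementation is in Caffe):

```python
import torch
import torch.nn as nn
from torchvision import models

class DeepHashNet(nn.Module):
    """AlexNet with a sigmoid latent layer H inserted between F7 and F8."""
    def __init__(self, num_bits=48, num_classes=10):
        super().__init__()
        backbone = models.alexnet(weights="IMAGENET1K_V1")
        self.features = backbone.features
        self.avgpool = backbone.avgpool
        # Keep fc6 and fc7 from the pretrained classifier; drop the original fc8.
        self.fc6_fc7 = nn.Sequential(*list(backbone.classifier.children())[:-1])
        # Latent layer H: its sigmoid activations become the hash bits.
        self.latent = nn.Sequential(nn.Linear(4096, num_bits), nn.Sigmoid())
        # New classification layer F8 sits on top of H.
        self.fc8 = nn.Linear(num_bits, num_classes)

    def forward(self, x):
        f7 = self.fc6_fc7(torch.flatten(self.avgpool(self.features(x)), 1))
        h = self.latent(f7)   # activations in (0, 1); threshold at 0.5 for codes
        return self.fc8(h), h
```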


(10)

Approach

• Overall learning objective
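The equation on this slide did not survive extraction. As a hedged reconstruction: in this workshop version, the whole network, including the latent layer H, is fine-tuned end-to-end with the standard softmax cross-entropy classification loss, with no separate hashing loss:

```latex
% f_c(x_n; W): c-th output of F8 for image x_n under network weights W,
% y_n: class label of x_n, N: number of training images, C: number of classes.
\min_{W} \; -\frac{1}{N} \sum_{n=1}^{N} \log
  \frac{\exp\big(f_{y_n}(x_n; W)\big)}{\sum_{c=1}^{C} \exp\big(f_c(x_n; W)\big)}
```

Because F8 sees the image only through H, minimizing classification error forces the latent activations to encode the semantics needed to separate the classes.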

(11)

Approach

• Compute binary codes for fast image retrieval
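A sketch of the retrieval procedure, assuming the coarse-to-fine strategy from the paper: binarize the latent activations at 0.5, shortlist candidates by Hamming distance, then re-rank the shortlist by Euclidean distance on F7 features (function names and the pool size are illustrative):

```python
import numpy as np

def binarize(h):
    """Threshold sigmoid activations of latent layer H at 0.5 to get hash bits."""
    return (h >= 0.5).astype(np.uint8)

def coarse_to_fine_search(query_h, query_f7, db_codes, db_f7, pool=200, k=10):
    # Coarse level: rank the whole database by Hamming distance on binary codes.
    q = binarize(query_h)
    hamming = np.count_nonzero(db_codes != q, axis=1)
    candidates = np.argsort(hamming)[:pool]       # small candidate pool
    # Fine level: re-rank the pool by Euclidean distance on F7 features.
    d = np.linalg.norm(db_f7[candidates] - query_f7, axis=1)
    return candidates[np.argsort(d)[:k]]
```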


(12)

Experiments

• Datasets

CIFAR10 MNIST SUN397 Yahoo-1M

(13)

Experiments

• CIFAR10


• MNIST


(14)

Experiments

(15)

Experiments


(16)

Experiments

• SUN397

[7] G. Lin, C. Shen, Q. Shi, A. van den Hengel, and D. Suter, “Fast supervised hashing with decision trees for high-dimensional data,” in Proc. CVPR, 2014.

[8] Y. Gong, S. Lazebnik, A. Gordo, and F. Perronnin, “Iterative quantization: A procrustean approach to learning binary codes for large-scale image retrieval,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 35, no. 12, pp. 2916–2929, 2013.

[9] A. Andoni and P. Indyk, “Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions,” in Proc. FOCS, 2006.


(17)

Experiments

• Yahoo-1M


(18)

Experiments

[Figure: query images and their top 5 retrieved images, with alternating rows comparing AlexNet features and ours.]

(19)

Experiments

• Image classification results


(20)

Experiments

• Image classification results

(21)

Experiments

• Image classification results

[64] M. D. Zeiler and R. Fergus, “Stochastic pooling for regularization of deep convolutional neural networks,” in Proc. ICLR, 2013.

[65] J. Snoek, H. Larochelle, and R. P. Adams, “Practical Bayesian optimization of machine learning algorithms,” in Proc. NIPS, 2012.

[66] M. Lin, Q. Chen, and S. Yan, “Network in network,” in Proc. ICLR, 2014.

[67] Z. Jie and S. Yan, “Robust scene classification with cross-level LLC coding on CNN features,” in Proc. ACCV, 2014.

[68] Y. Gong, L. Wang, R. Guo, and S. Lazebnik, “Multi-scale orderless pooling of deep convolutional activation features,” in Proc. ECCV, 2014.

(22)

Experiments

• Search with 64-bit codes is approximately 982,600× faster than traditional exhaustive search with 4096-dimensional deep features.

Descriptor      Measurement          Time
CNN-fc7-4096    Euclidean distance   22.6 μs
SSDH-64         Hamming distance     23.0 ps
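The two timings in the table are consistent with the speedup claim above:

```latex
\frac{22.6\ \mu\text{s}}{23.0\ \text{ps}}
  = \frac{22.6 \times 10^{-6}\ \text{s}}{23.0 \times 10^{-12}\ \text{s}}
  \approx 9.826 \times 10^{5} \approx 982{,}600\times
```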

(23)

Conclusion

• SSDH constructs hash functions as a latent layer between the feature layer and the classification layer of a network.

• SSDH jointly learns binary codes, features, and classification by optimizing the network parameters with the proposed objective function.

• SSDH is scalable to large-scale search.


(24)

Application: Mobile clothing search

• The technology has been integrated into a mobile clothing-search app.

• Download the apps.

(25)

Thank you!

Download our code and models at

https://github.com/kevinlin311tw/caffe-cvprw15

