8. Conclusion and Future Work
8.1. Conclusion
This thesis proposes a framework for region-based image retrieval. Region-based image retrieval, a special type of content-based image retrieval, approaches image understanding through a region-based representation so that retrieval can be more accurate. Generally speaking, a good region-based image retrieval system must address three major issues: (i) how to represent images according to the segmented regions and extracted visual features, (ii) how to compare and match images based on the image representation, and (iii) how to interactively estimate the user intention from user feedback. In particular, the semantic gap between the low-level visual features of images and the high-level concepts in human intention is one of the most challenging issues in region-based image retrieval. Many researchers have studied different aspects of region-based image retrieval. Unfortunately, the problem is still far from solved.
Our work aims to handle the problem of the semantic gap through these three issues in region-based image retrieval. A literature review is provided in Chapter 2 to show the state-of-the-art approaches to these issues. We first propose, in Chapter 3, the color-size features, which integrate both color and region-size information for images. Two types of region-based image representation are then presented: one using visual-word-based image features and one using semantic-based image features, in Chapters 4 and 5, respectively. For the former, the visual-word-based image feature is built from the low-level features and can be categorized as middle-level information of images. For the latter, the semantic-based image feature is generated from the results of image annotation. Moreover, we propose an interactive approach, in Chapter 7, to estimate the user intention from the positive examples in relevance feedback, and then perform image matching and ranking using the similarity measure between two images.
In this thesis, we try to address the problem of the semantic gap in two ways. The first is to construct a scheme for region-based image representation. On the one hand, we design the visual-word-based image feature to provide the representing units for the user intention in the visual feature space; on the other hand, the semantic-based image feature can discover the semantic content of images by use of image annotation. The second is to estimate what the user intends the query to involve in a query session. We design an interactive approach that estimates the user intention using these two types of image representation.
8.2. Future Work
The framework for region-based image retrieval proposed in this thesis can be applied to image retrieval applications. For example, TRECVID and CLEF are two well-known contests for image indexing and retrieval. TRECVID aims to promote progress in content-based retrieval from digital video via open, metrics-based evaluation [TREC], and we plan to participate in TRECVID next year. The Cross-Language Evaluation Forum (CLEF) offers a series of evaluation tracks to test different aspects of cross-language information retrieval system development [CLEF].
Besides, in our work, the semantic gap in image retrieval may be reduced, but not