Previous Relevance Feedback Works - A Study of Medical Image Databases


4.1 Previous Relevance Feedback Works

The original relevance feedback method, in which the vector model [8] [59] [61] is used for document retrieval, can be illustrated by Rocchio's formula [54]. Given the sets of relevant documents D'_R and nonrelevant documents D'_N marked by the user, the optimal new query Q' is the original query Q moved toward the positive example points and away from the negative example points. In its standard form,

$$ Q' = \alpha Q + \beta \frac{1}{N_{R'}} \sum_{i \in D'_R} D_i - \gamma \frac{1}{N_{N'}} \sum_{i \in D'_N} D_i , $$

where α, β, and γ are suitable constants and N_{R'} and N_{N'} are the numbers of documents in D'_R and D'_N. This technique is also implemented in many content-based image retrieval systems [30] [36]. Experiments show that the retrieval performance can be improved considerably by using this approach.
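As a minimal sketch of this query-movement idea, the following Python function shifts a query vector toward the mean of the relevant examples and away from the mean of the nonrelevant ones. The weights alpha, beta, and gamma, their default values, and the function names are illustrative assumptions, not values taken from [54].

```python
def mean_vector(vectors, dim):
    """Component-wise mean of a list of vectors; the zero vector if the list is empty."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[k] for v in vectors) / len(vectors) for k in range(dim)]


def rocchio_update(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.25):
    """Return a new query moved toward relevant and away from nonrelevant examples.

    alpha, beta, gamma are assumed weights chosen only for illustration.
    """
    dim = len(query)
    pos = mean_vector(relevant, dim)
    neg = mean_vector(nonrelevant, dim)
    return [alpha * query[k] + beta * pos[k] - gamma * neg[k] for k in range(dim)]


# Two relevant examples along the second axis, one nonrelevant along the first.
new_query = rocchio_update([1.0, 0.0], [[0.0, 1.0], [0.0, 3.0]], [[4.0, 0.0]])
print(new_query)  # [0.0, 1.5]
```

In this toy case the first component is cancelled by the nonrelevant example, while the second component grows toward the relevant examples.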

The weighting method (e.g., [30] [57] [58]) assigns larger weights to more important dimensions and smaller weights to less important ones. For example, [58] generalizes the low-level, feature-based relevance feedback methods into a single framework. The ideal query vector q_i for each feature i is described by the weighted sum of all positive feedback images as

$$ q_i^T = \frac{\pi^T Y_i}{\sum_{j=1}^{n} \pi_j} , $$

where Y_i is the n × K_i training sample matrix for feature i (K_i is the length of feature i), obtained by stacking the n positive feedback training vectors X_i^+ into a matrix. The n-element vector π = [π_1, π_2, …, π_n] represents the degree of relevance of each of the n positive feedback images, which can be determined by the user at each feedback interaction. The system then uses q_i as the optimal query to evaluate the relevance of the images in the database. This strategy is widely used by many other image retrieval and relevance feedback systems [30] [57] [58].
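As a minimal sketch of this weighted-sum query, the following Python function computes the relevance-weighted mean of the positive feedback vectors for a single feature. Dividing by the sum of the relevance scores is assumed here as the normalization; the names ideal_query, samples, and relevance, and the example values, are hypothetical.

```python
def ideal_query(samples, relevance):
    """Relevance-weighted mean of the positive feedback vectors of one feature.

    samples:   n rows, each a feature vector of length K_i (the matrix Y_i)
    relevance: n user-assigned relevance scores (the vector pi)
    """
    total = sum(relevance)
    if total <= 0:
        raise ValueError("at least one positive relevance score is required")
    length = len(samples[0])
    return [
        sum(p * row[k] for p, row in zip(relevance, samples)) / total
        for k in range(length)
    ]


# Two positive feedback images; the first is judged three times as relevant.
Y_color = [[0.2, 0.8], [0.6, 0.4]]
pi = [3.0, 1.0]
q_color = ideal_query(Y_color, pi)
print(q_color)
```

Because the first image carries the larger relevance score, the resulting query lies closer to its feature vector than to the second one.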

In Quicklook [11], the innovative part of the system is the relevance feedback method. After the relevant images are selected, each of their features contributes to the new query feature vector if its distance to the average of that feature over the relevant images is sufficiently large (three times the standard deviation). The new query feature vector is the average of the contributing features.

In ImageRETRO [71], let I_s be the image set after s reductions (filtering steps) and let F denote the set of 10 color features described. The image set is clustered based on an automatically selected feature subset F_s of F. The images from I_s are ranked independently for each feature in F_s; each such ranking is divided into 4 clusters (corresponding to a reduction of 25%), and each cluster centroid is chosen as the cluster representative. The union of these representatives over all rankings forms the representative set of I_s, which is shown to the user for the next reduction. The choice of the feature subset F_s at stage s of the retrieval process is based on statistical analysis: for each feature, the variance of the feature values over all images is computed, and F_s is made up of the features with the highest variances that do not correlate highly with each other.
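The variance-based choice of the feature subset can be sketched as a greedy selection. The sketch below is not the ImageRETRO implementation: the greedy order, the Pearson correlation measure, the threshold max_corr, and all feature names and values are assumptions made for illustration.

```python
import math


def variance(values):
    """Population variance of one feature over all images."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def correlation(a, b):
    """Pearson correlation of two features; 0.0 if either one is constant."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    return cov / norm if norm > 0 else 0.0


def select_features(columns, max_features=3, max_corr=0.9):
    """Pick high-variance features that are not highly correlated with each other.

    columns maps a feature name to its values over all images in the current set.
    """
    ranked = sorted(columns, key=lambda name: variance(columns[name]), reverse=True)
    chosen = []
    for name in ranked:
        if all(abs(correlation(columns[name], columns[c])) < max_corr for c in chosen):
            chosen.append(name)
        if len(chosen) == max_features:
            break
    return chosen


features = {
    "hue": [0.1, 0.9, 0.2, 0.8],
    "hue_copy": [0.2, 1.8, 0.4, 1.6],  # perfectly correlated with "hue"
    "saturation": [0.5, 0.4, 0.9, 0.1],
    "brightness": [0.5, 0.5, 0.5, 0.6],
}
print(select_features(features, max_features=2))  # ['hue_copy', 'saturation']
```

Here "hue" is skipped although its variance is the second highest, because it duplicates the information already carried by "hue_copy".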

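WebSEEk [65] lets the user select positive and negative examples from the result of a query in order to reformulate the query. If the set of relevant images is denoted by I_r and the set of nonrelevant images by I_n, then the new query histogram at the (k+1)th feedback iteration is computed by

$$ h_q^{k+1} = \left\| \alpha h_q^{k} + \beta \sum_{i \in I_r} h_i - \gamma \sum_{j \in I_n} h_j \right\| , $$

where h_q^k is the query histogram at iteration k, h_i is the histogram of image i, α, β, and γ are weighting constants, and ‖·‖ denotes normalization.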

Bayesian estimation methods have been used in the probabilistic approaches to relevance feedback. Cox et al. [40], Vasconcelos and Lippman [70], and Meilhac and Nastar [38] all used Bayesian learning to incorporate user feedback and update the probability distribution over all the images in the database. They treat the feedback examples as a sequence of independent queries and try to minimize the retrieval error with Bayes' rule. That is, given a sequence of queries x_1, …, x_t, they minimize the probability of retrieval error by choosing

$$ g^*(x_1, \dots, x_t) = \arg\max_i P(S_i = 1 \mid x_1, \dots, x_t) = \arg\max_i \{ P(x_t \mid S_i = 1)\, P(S_i = 1 \mid x_1, \dots, x_{t-1}) \} , $$

where S_i = 1 indicates that the ith image class is the one being sought and P(S_i = 1 | x_1, …, x_{t-1}) is a prior belief about the ability of the ith image class to explain the queries.

PicHunter [13] implements a probabilistic relevance feedback mechanism, which tries to predict the target image the user wants based on his actions (the images he selects as similar to the target in each iteration of a query session). A vector is used for retaining each image's probability of being the target. The vector is updated during each iteration of the relevance feedback, based on the history of the session (images displayed by the system and user's actions in previous iterations).

The updating formula is based on Bayes' rule. If the n database images are denoted T_j, j = 1, …, n, and the history of the session through iteration t is denoted H_t = {D_1, A_1, D_2, A_2, …, D_t, A_t}, with D_j and A_j being, respectively, the images displayed by the system and the action taken by the user at iteration j, then the iterative update of the probability estimate of an image T_i being the target, given the history H_t, is:

$$ P(T = T_i \mid H_t) = \frac{P(A_t \mid T = T_i, D_t, H_{t-1})\, P(T = T_i \mid H_{t-1})}{\sum_{j=1}^{n} P(A_t \mid T = T_j, D_t, H_{t-1})\, P(T = T_j \mid H_{t-1})} . $$

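In computing the probability that the user takes a certain action A_t given the history so far and the fact that the target is indeed T_i, namely P(A_t | T = T_i, D_t, H_{t-1}), a few models were tested. One approach is to estimate the probability that the user picks an image X_a from X_1, …, X_{n_t} by a softmin over the distances to the target,

$$ p_{softmin}(A = a \mid X_1, \dots, X_{n_t}, T) = \frac{\exp(-d(X_a, T)/\sigma)}{\sum_{i=1}^{n_t} \exp(-d(X_i, T)/\sigma)} , $$

where d is a distance between images and σ is a scale parameter, and, in the case of choosing any number of images, to assume that each image is selected independently according to p_softmin.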
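One feedback iteration of such a Bayesian scheme can be sketched in a few lines of Python. This is a simplified illustration, not the PicHunter implementation: images are reduced to single numbers, the user model is a softmin over distances to the hypothesized target, the dependence on earlier history is carried only by the prior, and the distance function, the scale sigma, and all names are assumptions.

```python
import math


def softmin_choice_prob(displayed, chosen_index, target, distance, sigma=1.0):
    """Probability that the user picks displayed[chosen_index] if `target` is the target."""
    scores = [math.exp(-distance(x, target) / sigma) for x in displayed]
    return scores[chosen_index] / sum(scores)


def bayes_update(prior, database, displayed, chosen_index, distance, sigma=1.0):
    """Posterior over targets: likelihood of the user's action times the prior, normalized."""
    unnormalized = [
        softmin_choice_prob(displayed, chosen_index, target, distance, sigma) * p
        for target, p in zip(database, prior)
    ]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def abs_distance(a, b):
    """Toy distance between two one-dimensional image descriptors."""
    return abs(a - b)


database = [0.0, 1.0, 5.0, 6.0]   # hypothetical one-dimensional image features
prior = [0.25] * len(database)    # uniform belief before any feedback

# The system shows the images with features 1.0 and 5.0; the user picks the first.
posterior = bayes_update(prior, database, [1.0, 5.0], 0, abs_distance)
print(posterior)
```

After this single action, most of the probability mass moves to the two database images near the selected one, and the returned vector can serve as the prior for the next iteration.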

Efforts have also been made to address the slow response time of content-based image retrieval, which is caused mainly by the high dimensionality of the feature space, typically hundreds to thousands of dimensions. Ng and Sedighian [43] made direct use of eigenimages, a method from face recognition [34], to carry out the dimension reduction. Faloutsos and Lin [18], Chandrasekaren et al. [10], and Brunelli and Mich [53] used principal component analysis (PCA) to perform the dimension reduction in feature spaces. Experimental results in these works show that most real image feature sets can be considerably reduced in dimension without significant degradation in retrieval quality.

Previous work only allows the user to mark positive and/or negative examples among the results. This has been used successfully in document retrieval, which relies on keyword vectors. Image descriptions, however, combine a variety of different features, such as color, shape, and texture. If the user concentrates on similar color, for example, color should be given a more significant weight. In this paper we propose a more robust relevance feedback mechanism that adjusts the weighting of the different features.
