
ARTICLE NO. VC980377

Similarity Retrieval by 2D C-Trees Matching in Image Databases

Fang-Jung Hsu* and Suh-Yin Lee

Institute of Computer Science and Information Engineering, National Chiao Tung University, HsinChu, Taiwan

and Bao-Shuh Lin

Computer & Communication Research Laboratories, Industrial Technology Research Institute, HsinChu, Taiwan

Received June 18, 1997; accepted February 26, 1998

The image retrieval based on spatial content is an attractive task in many image database applications. The 2D strings provide a natural way of constructing spatial indexing for images and support effective picture query. Nevertheless, the 2D string is deficient in describing the spatial knowledge of nonzero sized objects with overlapping. In this paper, we use an ordered labeled tree, a 2D C-tree, to be the spatial representation for images and propose the tree-matching algorithm for similarity retrieval. The distance between 2D C-trees is used to measure the similarity of images. The proposed tree comparison algorithm is also modified to compute the partial tree distance for subpicture query. Experimental results for verifying the effectiveness of similarity retrieval by 2D C-trees matching are presented. © 1998 Academic Press

* Corresponding author. E-mail: [email protected].

1047-3203/98 $25.00 Copyright © 1998 by Academic Press. All rights of reproduction in any form reserved.

1. INTRODUCTION

Content-based image retrieval plays a principal role in many application areas, such as picture archiving and communication systems, geographic information systems, biomedical, education, and home entertainment systems [1]. In a content-based image retrieval system, it is required to effectively and efficiently retrieve information from the image repositories. Such a system helps users retrieve relevant images based on their content [8, 20]. Current approaches [6, 9, 18, 21] to content-based image retrieval differ in terms of image features extracted, the level of abstraction manifested in the features, and the desired degree of domain independence. There are two major categories of features: primitive and logical. Primitive, or low level, image features such as object colors [6] and boundaries can be extracted automatically or semi-automatically. Logical features are abstract representations of images at various levels of detail. Some logical features such as spatial-location and spatial-relation [18] may be synthesized from primitive features, whereas others can only be obtained through considerable human involvement. Spatial constraint is a significant logical feature and is our focus in this article.

The intelligent image database system (IIDS) [4] provides high-level object-oriented search and supports spatial query. The spatial reasoning is based on a data structure called 2D string [3] which preserves the objects' spatial knowledge embedded in images. Each symbolic picture can be represented by a 2D string and a picture query can also be specified by a 2D string. The problem of image retrieval then becomes a problem of 2D string subsequence matching [15]. Lee and Hsu [13] proposed 2D C-string representation for nonzero sized objects with a set of spatial operators and a more efficient cutting mechanism. All the spatial relations among objects with efficient segmentation are preserved with 2D C-string representation. The problems of how to infer the spatial relations between pictorial objects from a given 2D C-string in spatial reasoning and similarity retrieval are solved by using the ranking mechanism [14].

In general, similarity retrieval is needed when users cannot express queries in a precise way. The target of similarity retrieval for images is to retrieve the images that are most similar to the query image. The similarity between two patterns or pictures can be measured on the basis of the maximum-likelihood or minimum-distance criterion [12]. The similarity based upon the minimum-distance criterion has been proposed using the techniques of 2D string matching defined in terms of longest common subsequence [3]. However, the 2D string representation is deficient in describing the spatial knowledge of the nonzero sized objects with overlapping. The similarity retrieval based on 2D C-strings investigated by Lee and Hsu instead adopts the maximum-likelihood approach in terms of maximum degree of object-pairs clique [14]. All the spatial relationships between object pairs, which is O(N^2) for N objects in an image, need to be reasoned first. The algorithm for similarity retrieval based on 2D C-strings actually finds a maximum clique and becomes an NP-complete problem.

In this paper, we use a transformed structure of the 2D C-string, called 2D C-tree [10], to be the spatial representation for images and propose the tree-matching algorithm for similarity retrieval. The 2D C-tree is an ordered labeled tree, which preserves the complete spatial knowledge among objects without spatial operators. The structural tree representation plays a significant role in retrieving images by tree-matching. We briefly review the 2D string indexing approach in the next section. In Section 3 the basic structure of a 2D C-tree and a sample image representation are introduced. The metric for tree distance computation is defined in Section 4. Then we propose a specific tree-matching algorithm to solve the problem of image retrieval in Section 5. The image retrieval algorithm is modified to compute the partial tree distance for subpicture query. This work is explored in Section 6. Simulation results for verifying the effectiveness of similarity retrieval by 2D C-trees matching are presented in Section 7. Also, an experimental project applying the proposed algorithms to video information system is described in Section 8. Finally, conclusions are summarized in the last section.

2. 2D STRING INDEXING APPROACHES

Given a physical image at the pixel level, the objects and their relative positions within the image can be extracted by using various image processing and understanding techniques [17]. Although this task is computationally expensive, it is performed only at the time of inserting images into the database. Moreover, this task may be carried out in an automated fashion or in a human-assisted semi-automated fashion, depending on the domain and the complexity of the images. A symbolic image is then obtained by associating a name with each of the domain objects thus identified. An image composed of a set of graphic icons that represents the symbolic objects is named an iconic image. The idea of representing physical images by iconic images is similar to the representation of documents by index terms in bibliographic information systems. We use the terms image, symbolic image, and iconic image interchangeably in this article.

The 2D string approach for spatial indexing was initially proposed by Chang et al. [3] to represent iconic images. Three spatial relation operators "<", "=", and ":" are employed in 2D strings. The operator "<" denotes the "left-right" or "below-above" spatial relation. The operator "=" denotes the "at the same spatial location as" relation. The operator ":" denotes the "in the same set as" relation. The symbolic picture f1 in Fig. 1a may be represented as the 2D string (A = D : E < B < C, A < B = C < D : E) or as (A = DE < B < C, A < B = C < DE), where the symbol ":" can be omitted and is omitted.

FIG. 1. A symbolic image and a query sketch.

The 2D string representation is also suitable for formulating picture queries. In fact, we can imagine that the query can be specified graphically, by drawing an iconic image on the screen of a computer. The graphic representation, called an icon sketch, can be translated into 2D string representation. For example, we may want to retrieve images satisfying a certain icon sketch q1 as in Fig. 1b. Then q1 can be translated into the 2D string (A = E < C, A < C < E). This query string is a substring of the 2D string representation of the example image f1. The problem of image retrieval then becomes the problem of 2D string subsequence matching.

However, the spatial operators of 2D strings are not sufficient to give a complete description of spatial knowledge for images of arbitrary complexity. The 2D G-string representation was proposed to handle more types of relations between pictorial objects [2], but it is not economical for complex images in terms of storage space efficiency and navigation complexity. Lee and Hsu [13] proposed 2D C-string representation with a set of spatial operators and a more efficient cutting mechanism. They employed a characteristic set of spatial operators illustrated in Table 1 to give a complete description for images of arbitrary complexity.

Basically, the 2D C-string approach performs a cut to handle the cases of objects with partly overlapping. The global operators "<" and "|", which are employed in the original 2D string approach, handle the cases of nonoverlapping. The extended operators "=", "[", "%", and "]", called the local operators, and a pair of separators "( )" handle the cases of overlapping. The picture f2 in Fig. 2a is similar to f1 in Fig. 1a, except that the objects in f2 are nonzero sized objects as opposed to point objects in f1. The 2D C-string representation of the picture f2 is (A]D[EzBzC,


FIG. 3. The signed 2D C-trees of image f2.

TABLE 1
The Definition of Characteristic Spatial Operators

Notation   Condition               Meaning
A < B      Ae < Bb                 A is disjoint from B.
A | B      Ae = Bb                 A is edge to edge with B.
A = B      Ab = Bb and Ae = Be     A is the same as B.
A [ B      Ab = Bb and Ae > Be     A contains B and they have the same begin-bound.
A ] B      Ab < Bb and Ae = Be     A contains B and they have the same end-bound.
A % B      Ab < Bb and Ae > Be     A contains B and they do not have the same bound.
A / B      Ab < Bb < Ae < Be       A is partly overlapping with B.

Note. The notations Ab and Ae (Bb and Be) denote the values of the begin- and end-bounds of object A (B), respectively.
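The conditions in Table 1 can be read as a decision procedure on the begin- and end-bounds of two objects along one axis. The following is a minimal sketch, assuming each object is given as a `(begin, end)` pair with `begin < end` and that A begins no later than B; the function name `spatial_operator` is hypothetical and not part of the original work.

```python
def spatial_operator(a, b):
    """Return the Table 1 operator that relates interval a to interval b.

    a and b are (begin, end) pairs along one coordinate axis.
    Returns None when no row of Table 1 applies (for example, when a
    begins after b, in which case the arguments should be swapped).
    """
    a_begin, a_end = a
    b_begin, b_end = b
    if a_end < b_begin:
        return "<"   # disjoint
    if a_end == b_begin:
        return "|"   # edge to edge
    if a_begin == b_begin and a_end == b_end:
        return "="   # same projection
    if a_begin == b_begin and a_end > b_end:
        return "["   # contains, same begin-bound
    if a_begin < b_begin and a_end == b_end:
        return "]"   # contains, same end-bound
    if a_begin < b_begin and a_end > b_end:
        return "%"   # contains, no shared bound
    if a_begin < b_begin < a_end < b_end:
        return "/"   # partly overlapping
    return None
```

For instance, `spatial_operator((0, 5), (3, 8))` yields `"/"`, the partly overlapping case that the cutting mechanism has to resolve.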

intact without cutting because the case of partly overlapping does not happen.

The 2D C-string is efficient in the representation and manipulation of images, but it is not suitable in image retrieval based on 2D string subsequence matching. For example, we use a query sketch q2 as in Fig. 2b, which is a subpicture of f2. The 2D C-string of the query image, (A%E < C, A < C < E) of q2, is quite different in the format from the 2D C-string of f2, due to the spatial operators. The string q2 is not a substring of the string f2 any longer. The operators are needed to handle the global and local relations among symbolic objects in a 2D C-string and cannot be omitted.

FIG. 2. A symbolic image with nonzero sized objects and a [...]

Although the inference of the spatial relations between objects from a given 2D C-string in spatial reasoning can be solved by using the ranking mechanism [14], the computation of object ranks in a 2D C-string is somewhat complicated. Moreover, all the spatial relationships of object pairs, which is O(N^2) for N objects in an image, are required to be reasoned first by adopting the 2D longest common subsequence algorithm [15]. The algorithm for similarity retrieval actually finds a maximum clique of the corresponding association graph and becomes an NP-complete problem although there are some polynomial time algorithms for the average case. Therefore we explore a more efficient representation and a matching algorithm to solve the problem of image similarity retrieval.

3. 2D C-TREE

The 2D C-tree is an ordered labeled tree. We first introduce the basic structure of a 2D C-tree. The 2D C-tree representation still employs the sparse cutting mechanism to handle the case of symbolic images with partly overlapping objects [10]. The cutting mechanism performs only essential cuttings to get rid of the ambiguity incurred due to partly overlapping. After cutting, an image is partitioned to some portions between two cuttings. All the portions are sequentially linked to a root, R, which is initialized to represent the margin or boundary of the area covered by a given image.

The original 2D C-tree representation, called the signed 2D C-tree, is proposed with associated spatial operators. Each node with label, or symbol name, represents an object in the image. The link connecting two nodes, called the signed link, is signed with the relation operator. For the ordered subtree rooted at node S with n immediate descendants in the ordering s1, s2, ..., sn, S, being the parent, actually contains the local body consisting of all its immediate child-nodes s1, s2, ..., sn. The relation operator between node S and its first child-node s1 is surely a local operator that indicates the ensemble relationship between S and the local body consisting of all its child-nodes s1, s2, ..., sn. The relation operators between node S and other child-nodes si (2 ≤ i ≤ n) are definitely global operators that indicate the sibling relation between the child-node si (2 ≤ i ≤ n) and the prior child-node s_{i-1} of node S. The signed 2D C-trees of f2 are constructed as shown in Fig. 3.

However, a tree with signed links is somewhat unusual for general applications. The empty-node and set-node are employed in order to remove the relation operator from


operators. An empty-node is a pseudo node which is labeled "ε" and can be of various sizes. The relation operator of the signed link can be removed by inserting some suitable empty-nodes. When the relation operators are stripped off from a signed 2D C-tree, each node of the transformed tree has at least two child-nodes, except the node that originally connects to its single child-node by the "=" operator. The "=" operator possesses the commutation law, which is different from other relation operators of the 2D C-tree. The objects which are connected with the "=" operator have the same begin-bound and end-bound. For the reasoning of spatial relationship among the nodes of the 2D C-tree, a special set-node is introduced for treating a set of lineage that each node has single child-node. A set-node is a multilabel node consisting of objects that have the same begin-bound and end-bound. The detailed transformation rules are investigated in [10]. The sample symbolic image f2 in Fig. 2a is represented in a General 2D C-tree as shown in Fig. 4.

FIG. 4. The General 2D C-trees of image f2.

4. TREE METRIC

Ordered labeled trees are trees whose nodes are labeled and in which the left to right ordering among siblings is significant. The distance and/or similarity of such trees have many applications in computer vision, pattern recognition, programming compilation, and natural language processing [7]. The distance between two ordered trees is considered to be the weighted number of editing operations required to transform one tree to another. Many algorithms have been developed for ordered labeled tree matching and comparison [23]. Currently the best algorithm for computing the editing distance was presented by Zhang and Shasha [22]. In this section we introduce the distance metric between trees to be the basis for presenting an elegant tree-matching algorithm in image retrieval.

Three kinds of editing operations of a labeled tree [23] are considered and illustrated in Fig. 5. Relabeling a node s means changing the label on s. Deleting a node s means making the children of s become the children of the parent [...] deleting. Inserting s as a child of r will make s become the parent of a consecutive subsequence of the current children of r.

FIG. 5. Three editing operations on labeled tree.

These editing operations can be represented as a → b, where a is either Λ (null) or a label in tree T1 and b is either Λ or a label in tree T2. We call a → b a relabel operation if a ≠ Λ and b ≠ Λ, a delete operation if b = Λ, and an insert operation if a = Λ. Let D be a cost function which assigns a nonnegative real number, referred as D(a → b), to each editing operation a → b. The cost can vary in different operations on different nodes. For example, a higher node in a tree has a greater weight than a lower one. Nevertheless, the cost of each editing operation on any node is set equal for simplicity in this paper.

The cost function D of each editing operation is constrained to be a distance metric [5]. That is,

(i) D(a → b) ≥ 0, D(a → a) = 0, positivity;
(ii) D(a → b) = D(b → a), symmetry;
(iii) D(a → c) ≤ D(a → b) + D(b → c), triangle inequality.
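The three editing operations on an ordered labeled tree can be made concrete with a small sketch. It assumes a tree is stored as nested `Node` objects with an ordered child list; the names `Node`, `relabel`, `delete`, `insert`, and `to_string` are hypothetical helpers chosen for illustration, not the authors' implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node of an ordered labeled tree (children are kept left to right)."""
    label: str
    children: list = field(default_factory=list)


def relabel(node, new_label):
    """Relabel: change the label on the node."""
    node.label = new_label


def delete(parent, index):
    """Delete the child at `index`; its children take its place in order."""
    node = parent.children[index]
    parent.children[index:index + 1] = node.children


def insert(parent, label, start, stop):
    """Insert a new child that adopts the consecutive children [start, stop)."""
    new_node = Node(label, parent.children[start:stop])
    parent.children[start:stop] = [new_node]
    return new_node


def to_string(node):
    """Render a tree in the form label(child child ...)."""
    if not node.children:
        return node.label
    return node.label + "(" + " ".join(to_string(c) for c in node.children) + ")"
```

Starting from the tree `R(A(D E) B)`, deleting `A` gives `R(D E B)`, and inserting a node `S` over the first two children then gives `R(S(D E) B)`. Sibling order is preserved in both cases, which is what the mapping conditions on ordered trees rely on.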


Let E be a sequence e1, e2, ..., ek of editing operations. An E-derivation from tree A to tree B is a sequence of trees A0, A1, ..., Ak such that A = A0, B = Ak, and A_{i-1} → A_i via editing operation e_i for 1 ≤ i ≤ k. Then the cost function D can be extended to the sequence of editing operations by letting

$$D(E) = \sum_{i=1}^{|E|} D(e_i). \tag{1}$$

The editing distance between two trees is defined as minimum cost of the editing sequence that transforms one tree to the other. Formally the editing distance between trees T1 and T2 is defined as

$$d(T_1, T_2) = \min\{\, D(E) \mid E \text{ is an editing sequence from } T_1 \text{ to } T_2 \,\}. \tag{2}$$

The editing sequence can be treated as a mapping that is a graphical specification of editing operations applied to the nodes in the two ordered trees. Suppose that we have a numbering mechanism, for example, the postorder numbering for a tree. Let T[i] be the ith node of tree T in the postorder numbering. Formally, we define a triple (M, T1, T2) to be a mapping from T1 to T2, where mapping M is the set of integer pairs (i, j) satisfying:

(1) 1 ≤ i ≤ |T1|, 1 ≤ j ≤ |T2|;
(2) For two pairs of (i1, j1) and (i2, j2) in M,
    (a) i1 = i2 if and only if j1 = j2 (one-to-one),
    (b) T1[i1] is to the left of T1[i2] if and only if T2[j1] is to the left of T2[j2] (sibling order preserved),
    (c) T1[i1] is an ancestor of T1[i2] if and only if T2[j1] is an ancestor of T2[j2] (ancestor order preserved).
(3) Let lca(i1, i2) represent the least common ancestor node of i1 and i2. For three pairs of (i1, j1), (i2, j2), and (i3, j3) in M, T1[lca(i1, i2)] is a proper ancestor of T1[i3] if and only if T2[lca(j1, j2)] is a proper ancestor of T2[j3].

Let M be a mapping from T1 to T2. Let I and J be the sets of unmatched nodes in T1 and T2, respectively. We will use M instead of (M, T1, T2) if there is no confusion. Then we can define the cost of M:

$$D(M) = \sum_{(i,j) \in M} D(T_1[i] \to T_2[j]) + \sum_{i \in I} D(T_1[i] \to \Lambda) + \sum_{j \in J} D(\Lambda \to T_2[j]). \tag{3}$$

Hence,

$$d(T_1, T_2) = \min\{\, D(M) \mid M \text{ is a mapping from } T_1 \text{ to } T_2 \,\}. \tag{4}$$

5. STRUCTURAL IMAGE RETRIEVAL

Now, we begin to introduce the tree matching algorithm [...] tree. Since the nodes of a 2D C-tree may be empty-nodes or set-nodes, we must make a small but significant modification of the tree-matching algorithm developed by Zhang and Shasha [22].

Suppose that A is a node in tree T1. N(A) denotes the number of labels of node A. If A is an empty-node, A has a special label "ε" and N(A) is one. If A is a set-node, N(A) must be more than one. For an editing operation A → B, where B is a node in tree T2, the cost function needs to be re-examined:

(1) The cost of a delete operation A → Λ, D(A → Λ), is defined as N(A). That is, D(A → Λ) = N(A).
(2) The cost of an insert operation Λ → B, D(Λ → B), is defined as N(B). That is, D(Λ → B) = N(B).
(3) The cost of a relabel operation A → B, D(A → B), is defined as the larger of two numbers N(A/B) and N(B/A). Let N(A/B) represent the number of labels of node A which are differentiated from those of node B. That is, D(A → B) = max{N(A/B), N(B/A)}. If one of them is an empty-node, the cost is the number of labels of the other node.

FIG. 6. The 2D C-tree f2X of Fig. 4a with postorder numbering.

These cost functions defined above are still under the constraints of distance metric. In the following some notations on trees are illustrated in Fig. 6 using the tree f2X of Fig. 4a with postorder numbering in the parenthesis as an example:

(1) T[i]. The ith node in the tree T according to the left-to-right postorder numbering. (Ex. The label of T[2] is E.)
(2) u(i). The number of the leftmost leaf descendant of the subtree rooted at T[i]. When T[i] is a leaf node, u(i) = i. (Ex. u(4) = 2; i.e., E is the leftmost leaf descendant of D.)
(3) depth(i). The depth of T[i]; it is the number of nodes on the path from the root of tree T to node T[i], excluding T[i]. (Ex. depth(2) = 3; i.e., the depth of E is 3.)


(4) P(i). Pk(i) denotes the kth level predecessor of T[i], where the level is counted from node T[i] backward to the root. P(i) = {Pk(i) | 1 ≤ k ≤ depth(i)}. (Ex. P(2) = {4, 5, 8}; i.e., the predecessors of E are D, A, and R.)
(5) T[i..j]. An ordered subforest of tree T induced by the nodes numbered from i to j inclusive. If i > j, then T[i..j] = ∅. (Ex. T[2..6] includes E, ε, D, A, and B.)
(6) Forest(i). An ordered subforest T[1..i]. (Ex. Forest(4) includes ε, E, ε, and D.)
(7) Tree(i). A subtree of T rooted at T[i]. T[u(i)..i] will be referred to as Tree(i). (Ex. Tree(4) is equivalent to T[2..4].)
(8) Size(i). The number of nodes in Tree(i). (Ex. Size(4) = 3.)
(9) ForestDist(i'..i, j'..j). The distance between two subforests T1[i'..i] in T1 and T2[j'..j] in T2. We use an abbreviated notation ForestDist(i, j) for the distance between T1[1..i] and T2[1..j] (see below).
(10) TreeDist(i, j). The distance between the subtree Tree(i) rooted at i in T1 and the subtree Tree(j) rooted at j in T2 (see below).

The following three lemmas are necessary for the tree distance computation algorithm.

LEMMA 1. Let ip ∈ P(i) and jp ∈ P(j). Then

(i) $$\mathrm{ForestDist}(\emptyset, \emptyset) = 0. \tag{5}$$
(ii) $$\mathrm{ForestDist}(u(i_p)..i,\ \emptyset) = \mathrm{ForestDist}(u(i_p)..i-1,\ \emptyset) + D(T_1[i] \to \Lambda). \tag{6}$$
(iii) $$\mathrm{ForestDist}(\emptyset,\ u(j_p)..j) = \mathrm{ForestDist}(\emptyset,\ u(j_p)..j-1) + D(\Lambda \to T_2[j]). \tag{7}$$

Case (i) requires no editing operation and is assigned 0 for initialization. In (ii), the distance corresponds to the cost of deleting a node T1[i] from a forest T1[u(ip)..i]. The forest T1[u(ip)..i] is led by the leftmost leaf node of a tree Tree(ip) containing node T1[i] in T1 and concluded at T1[i]. In (iii), the distance corresponds to the cost of inserting a node T2[j] into a forest T2[u(jp)..j-1] in T2.

LEMMA 2.

$$\begin{aligned}
\mathrm{ForestDist}(u(i)..i,\ u(j)..j) = \mathrm{TreeDist}(i, j) = \min\{\ &\mathrm{ForestDist}(u(i)..i-1,\ u(j)..j) + D(T_1[i] \to \Lambda),\\
&\mathrm{ForestDist}(u(i)..i,\ u(j)..j-1) + D(\Lambda \to T_2[j]),\\
&\mathrm{ForestDist}(u(i)..i-1,\ u(j)..j-1) + D(T_1[i] \to T_2[j])\ \}.
\end{aligned} \tag{8}$$

Lemma 2 computes the distance between two subtrees rooted at T1[i] in T1 and T2[j] in T2, respectively. Consider the mapping of two roots, T1[i] and T2[j]. The distance between Tree(i) of T1 and Tree(j) of T2 is the minimum of the three possible editing mapping costs: delete T1[i] [...]

LEMMA 3. Let ip ∈ P(i) and jp ∈ P(j). If u(ip) ≠ u(i), or u(jp) ≠ u(j), then

$$\begin{aligned}
\mathrm{ForestDist}(u(i_p)..i,\ u(j_p)..j) = \min\{\ &\mathrm{ForestDist}(u(i_p)..i-1,\ u(j_p)..j) + D(T_1[i] \to \Lambda),\\
&\mathrm{ForestDist}(u(i_p)..i,\ u(j_p)..j-1) + D(\Lambda \to T_2[j]),\\
&\mathrm{ForestDist}(u(i_p)..u(i)-1,\ u(j_p)..u(j)-1) + \mathrm{TreeDist}(i, j)\ \}.
\end{aligned} \tag{9}$$

Lemma 3 considers the distance between two forests. The forest T1[u(ip)..i] (T2[u(jp)..j]) is led by the leftmost leaf node of a tree Tree(ip) (Tree(jp)) containing the considered node T1[i] (T2[j]) and concluded at T1[i] (T2[j]). The distance between T1[u(ip)..i] of T1 and T2[u(jp)..j] of T2 is the minimum of the three possible editing mapping costs: delete T1[i] from T1, insert T2[j] into T2, or substitute the subtree Tree(i) of T1 by Tree(j) of T2.

For proofs of Lemmas 1, 2, and 3, refer to [22].

The algorithm to compute the tree distance uses a dynamic programming style [16]. From Lemma 3 we observe that to compute TreeDist(ip, jp) we need in advance almost all values of TreeDist(i, j) where T1[ip] is the root of a subtree containing T1[i] and T2[jp] is the root of a subtree containing T2[j]. This suggests a bottom-up procedure for computing all subtree pairs.

ALGORITHM 1. The computation of TreeDist(x, y).

Input: Two subtrees, Tree(x) rooted at x in tree T1 and Tree(y) rooted at y in tree T2.
Output: The distance TreeDist(x, y).

Begin
  ForestDist(∅, ∅) = 0;
  for i := u(x) to x
    ForestDist(u(x)..i, ∅) = ForestDist(u(x)..i-1, ∅) + N(i);
  for j := u(y) to y
    ForestDist(∅, u(y)..j) = ForestDist(∅, u(y)..j-1) + N(j);
  for i := u(x) to x
    for j := u(y) to y
      if u(i) = u(x) and u(j) = u(y), then
        ForestDist(u(x)..i, u(y)..j) = min{
          ForestDist(u(x)..i-1, u(y)..j) + N(i),
          ForestDist(u(x)..i, u(y)..j-1) + N(j),
          ForestDist(u(x)..i-1, u(y)..j-1) + max{N(A/B), N(B/A)}};
        TreeDist(i, j) = ForestDist(u(x)..i, u(y)..j);
      else
        ForestDist(u(x)..i, u(y)..j) = min{
          ForestDist(u(x)..i-1, u(y)..j) + N(i),
          ForestDist(u(x)..i, u(y)..j-1) + N(j),
          ForestDist(u(x)..u(i)-1, u(y)..u(j)-1) + TreeDist(i, j)};
End;
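Algorithm 1 and the label-count costs can be combined into a short runnable sketch. It assumes a tree is written as a nested pair `(labels, children)`, where `labels` is a set of symbol names (one name for an ordinary node, `{"ε"}` for an empty-node, several names for a set-node), and it visits subtree pairs in the keyroot order of Zhang and Shasha [22] so that every needed TreeDist value is available. The function names and the 0-based indexing are assumptions made for this illustration, not the authors' code.

```python
def postorder(tree):
    """Flatten a (labels, children) tree into postorder arrays.

    Returns (labels, leftmost), where leftmost[i] is the postorder
    index (0-based) of the leftmost leaf under node i.
    """
    labels, leftmost = [], []

    def visit(node):
        node_labels, children = node
        first = None
        for child in children:
            idx = visit(child)
            if first is None:
                first = leftmost[idx]
        labels.append(frozenset(node_labels))
        leftmost.append(len(labels) - 1 if first is None else first)
        return len(labels) - 1

    visit(tree)
    return labels, leftmost


def keyroots(leftmost):
    """Highest-numbered node for each distinct leftmost leaf, ascending."""
    return sorted({leaf: i for i, leaf in enumerate(leftmost)}.values())


def relabel_cost(a, b):
    """D(A -> B) = max{N(A/B), N(B/A)} on label sets."""
    return max(len(a - b), len(b - a))


def tree_distance(t1, t2):
    """Editing distance with delete cost N(A) and insert cost N(B)."""
    l1, lm1 = postorder(t1)
    l2, lm2 = postorder(t2)
    tree_dist = [[0] * len(l2) for _ in l1]
    for x in keyroots(lm1):
        for y in keyroots(lm2):
            ox, oy = lm1[x], lm2[y]
            rows, cols = x - ox + 2, y - oy + 2
            fd = [[0] * cols for _ in range(rows)]   # fd[0][0]: empty forests
            for i in range(1, rows):                 # Lemma 1 (ii)
                fd[i][0] = fd[i - 1][0] + len(l1[ox + i - 1])
            for j in range(1, cols):                 # Lemma 1 (iii)
                fd[0][j] = fd[0][j - 1] + len(l2[oy + j - 1])
            for i in range(1, rows):
                for j in range(1, cols):
                    a, b = ox + i - 1, oy + j - 1
                    delete = fd[i - 1][j] + len(l1[a])
                    insert = fd[i][j - 1] + len(l2[b])
                    if lm1[a] == ox and lm2[b] == oy:    # Lemma 2
                        fd[i][j] = min(delete, insert,
                                       fd[i - 1][j - 1] + relabel_cost(l1[a], l2[b]))
                        tree_dist[a][b] = fd[i][j]
                    else:                                # Lemma 3
                        fd[i][j] = min(delete, insert,
                                       fd[lm1[a] - ox][lm2[b] - oy] + tree_dist[a][b])
    return tree_dist[-1][-1]
```

With `t1 = ({"R"}, [({"A"}, []), ({"B"}, [])])` and `t2 = ({"R"}, [({"A"}, [])])`, the call `tree_distance(t1, t2)` accounts for a single delete of `B`. Representing a node as a label set keeps the relabel cost between a set-node and an ordinary node a matter of set differences.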


For two 2D C-trees, T1 and T2, rooted at R1 and R2, respectively, the distance between them, denoted by d(T1, T2), is computed as the value of TreeDist(R1, R2). We can use the algorithm to compute the distance of 2D C-trees for solving the picture query problem. Consider two pictures, P1 and P2. Two 2D C-tree representations of P1 (P2), T1X (T2X), and T1Y (T2Y) along x-coordinate and y-coordinate, respectively, are constructed. We define the distance between P1 and P2 as follows.

DEFINITION 1. The distance between two pictures P1 and P2, d(P1, P2), is d(T1X, T2X) * d(T1Y, T2Y). If d(T1X, T2X) is zero, then define d(P1, P2) = d(T1Y, T2Y). On the contrary, if d(T1Y, T2Y) is zero, then define d(P1, P2) = d(T1X, T2X).

We use the example picture f2 in Fig. 2a and query sketch q2 in Fig. 2b to demonstrate the computation of picture distance. The 2D C-trees of f2 and q2 along x-coordinate axis are in Figs. 4a and 7a correspondingly. The editing distance between these two trees is the cost of editing operations required to transform f2X to q2X. At least two editing operations are needed. That is, D(D → Λ) and D(B → ε). So the tree distance of d(f2X, q2X) is 2. Three delete operations, D(B → Λ), D(D → Λ), and D(ε → Λ), are needed between f2Y in Fig. 4b and q2Y in Fig. 7b along y-coordinate. That is, the cost of d(f2Y, q2Y) is 3. Finally, the distance of d(f2, q2) is 6.

FIG. 7. The 2D C-trees of query sketch q2.

Moreover, in [22] Zhang and Shasha had defined an LR-keyroots set of tree T, LR-keyroots(T), to efficiently reduce the computation time of tree distance. The complexity of the algorithm is O(|T1| * |T2| * min(depth(T1), leaves(T1)) * min(depth(T2), leaves(T2))). Let depth(T1) denote the depth of the tree T1 and leaves(T1) denote the number of leaf nodes of the tree T1. In general, the fast algorithm takes O(n^4) for computing the editing distance between two trees consisting of n nodes. The parallel algorithm is the time of complexity O(|T1| * |T2|) by using O(min(|T1|, |T2|) * leaves(T1) * leaves(T2)) processors.

While all the tree distances between the query image and the images in the database have been computed, the [...] are n images in the database, P1, P2, ..., Pn, and a query image Q. The most similar image(s) to Q is

{Pi | d(Pi, Q) is the minimum of d(Pk, Q), 1 ≤ k ≤ n}.

6. SUBPICTURE QUERY

Subpicture query is useful when a user cannot express queries in a precise way [5]. Sometimes we may ask "please retrieve images that contain this specific subpicture," or "I want some images that have some part like this query sketch." For example, the query image q2 in Fig. 2b is a subpicture of f2 in Fig. 2a. An approximate-tree-by-example (ATBE) system [19] developed by Wang et al. manipulates the inexact query by approximate tree matching. But the cutting and pruning operations that remove all the descendants of a node are somehow not suitable for subpicture query. The tree distance computation algorithm proposed in the previous section not only can support a simple measure for similarity retrieval, but also it can be modified for a subpicture query. In essence, we adopt the tree-matching algorithm and modify the cost functions of editing operations as required.

(1) The cost of delete operation is weighted zero. Deleting a symbol from a reference image means that this symbol does not appear in the query image. For subpicture query, the symbols existing in the reference image may not be expressed in the query image or specified subpicture. In such a case, the superfluous symbols in reference image can be ignored on purpose when they do not appear in the query image and are allowed to delete with zero cost. The cost of editing operation A → Λ is weighted zero; i.e., D(A → Λ) = 0.
(2) The cost of an insert operation Λ → B, i.e., D(Λ → B), is not changed because all the symbols of the query image should be considered. That is, D(Λ → B) = N(B).
(3) The cost of a relabel operation A → B, i.e., D(A → B), is slightly changed and is defined as the number of unmatched symbols in B, which do not appear in A. That is, D(A → B) = N(B/A). One special case is for B = ε. The cost of D(A → ε) is redefined to be 0 because the symbol(s) in A can be viewed as an empty-node in B.

Obviously, the newly defined cost functions, called partial cost functions, of editing operations do not obey the symmetry constraint of distance metric. Although the delete operation is not the inverse function of insert operation any more, the constraint of a distance metric is not our major concern for subpicture query. The partial cost functions directly affect computation of the tree distance, based upon the lemmas in the previous section. In Lemma 1, the second sentence ForestDist(u(ip)..i, ∅) is always zero because D(T1[i] → Λ) = 0. In Lemma 2 and Lemma 3, the


second value of the first statement in the minimum group, i.e., D(T1[i] → Λ), is removed because the delete operation is weighted zero also. Consequently, the algorithm of partial tree distance is modified as follows.

ALGORITHM 2. The computation of PartialTreeDist(x, y).

Input: Two subtrees, Tree(x) rooted at x in tree T1 and Tree(y) rooted at y in tree T2.
Output: The distance PartialTreeDist(x, y).

Begin
  ForestDist(∅, ∅) = 0;
  for i := u(x) to x
    ForestDist(u(x)..i, ∅) = 0;
  for j := u(y) to y
    ForestDist(∅, u(y)..j) = ForestDist(∅, u(y)..j-1) + N(j);
  for i := u(x) to x
    for j := u(y) to y
      if u(i) = u(x) and u(j) = u(y), then
        ForestDist(u(x)..i, u(y)..j) = min{
          ForestDist(u(x)..i-1, u(y)..j),
          ForestDist(u(x)..i, u(y)..j-1) + N(j),
          ForestDist(u(x)..i-1, u(y)..j-1) + N(j/i)};
        PartialTreeDist(i, j) = ForestDist(u(x)..i, u(y)..j);
      else
        ForestDist(u(x)..i, u(y)..j) = min{
          ForestDist(u(x)..i-1, u(y)..j),
          ForestDist(u(x)..i, u(y)..j-1) + N(j),
          ForestDist(u(x)..u(i)-1, u(y)..u(j)-1) + PartialTreeDist(i, j)};
End;

Then, we can use the partial tree-matching algorithm to compute the distance of 2D C-trees for solving the subpicture query problem. Let c(T1, T2) represent the partial tree distance between trees T1 and T2. The partial distance between P1 and P2 is defined as follows.

DEFINITION 2. The partial distance between two pictures P1 and P2, c(P1, P2), is c(T1X, T2X) * c(T1Y, T2Y). If c(T1X, T2X) is zero, then define c(P1, P2) = c(T1Y, T2Y). On the contrary, if c(T1Y, T2Y) is zero, then define c(P1, P2) = c(T1X, T2X).

We use the example picture f2 in Fig. 2a and query sketch q2 in Fig. 2b to demonstrate the computation of partial distance. The 2D C-trees of f2 and q2 are in Fig. 4 and Fig. 7, correspondingly. For computing the partial tree distance between f2X and q2X, the first editing D(D → Λ) is a delete operation having zero weight. The second editing D(B → ε) is a special case of relabel operation also weighted zero. So the tree distance of c(f2X, q2X) is 0 for x-coordinate direction. The costs of three delete operations, D(B → Λ), D(D → Λ), and D(ε → Λ), needed for transforming from f2Y to q2Y along y-coordinate are all weighted zero. That is, the cost of c(f2Y, q2Y) is 0 also. Finally, the distance of c(f2, q2) is 0. It means that q2 is a subpicture of f2 with distance zero.

Analogously, the most similar image(s) that contains a query subpicture Q from P1, P2, ..., and Pn is

{Pi | c(Pi, Q) is the minimum of c(Pk, Q), 1 ≤ k ≤ n}.

7. SIMULATION RESULTS

For verifying the effectiveness of similarity retrieval by 2D C-trees matching, a test consisting of 10 simulation pictures is evaluated. A symbolic image with random spatial relationship among objects can be generated by random generation of quadruple-values. Table 2 shows 10 randomly generated objects A through J with the bounds on x-axis and y-axis, respectively.

We construct 10 simulation pictures, P1, P2, ..., and P10. Without loss of generality, assume P1 contains a single object, the first object (A). P2 contains the first two objects (A and B), and so on. The tenth picture P10 contains all 10 objects. Fig. 8 depicts the symbolic images P3 and P4. It is interesting to find out that P3 is a subpicture of P4. It could be foreseen that a picture with fewer objects is a subpicture of a picture with more objects in this experiment; i.e., Pi is always a subpicture of Pj, for i ≤ j.

Then these 10 pictures are represented in 2D C-trees, respectively. The 2D C-trees of P3, referred to as T3X and T3Y, are shown in Fig. 9 and the 2D C-trees of P4 in Fig. 10. For illustrating the computations of tree distances among these 10 pictures, the 2D C-tree representation is expressed by a recursive sentence a(a1 a2 a3 ... an) for a node a, which has n immediate descendants in the ordering a1, a2, a3, ..., and an. For example, T3X in Fig. 9 is represented

TABLE 2
The Simulation Data of Ten Objects

Label   x-axis (begin, end)   y-axis (begin, end)
A       131, 347              167, 415
B       165, 358              20, 332
C       31, 641               217, 363
D       192, 572              9, 241
E       5, 358                36, 156
F       80, 346               431, 466
G       119, 470              208, 509
H       213, 584              48, 50
I       420, 610              355, 467
J       364, 600              14, 545
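The construction of the nested simulation pictures from Table 2 can be sketched as follows. This is a minimal illustration, not the authors' generator: the names `OBJECTS`, `random_object`, `build_pictures`, `is_subpicture`, and `overlaps` are hypothetical, and the coordinate range used by `random_object` (0 to 649) is an assumption, since the text does not state the image size.

```python
import random

# Bounds copied from Table 2: label -> ((x_begin, x_end), (y_begin, y_end)).
OBJECTS = {
    "A": ((131, 347), (167, 415)),
    "B": ((165, 358), (20, 332)),
    "C": ((31, 641), (217, 363)),
    "D": ((192, 572), (9, 241)),
    "E": ((5, 358), (36, 156)),
    "F": ((80, 346), (431, 466)),
    "G": ((119, 470), (208, 509)),
    "H": ((213, 584), (48, 50)),
    "I": ((420, 610), (355, 467)),
    "J": ((364, 600), (14, 545)),
}


def random_object(rng, extent=650):
    """Draw one quadruple (x_begin, x_end, y_begin, y_end) with begin < end.

    `extent` is an assumed image size; the paper does not give one.
    """
    x = sorted(rng.sample(range(extent), 2))
    y = sorted(rng.sample(range(extent), 2))
    return (x[0], x[1]), (y[0], y[1])


def build_pictures(objects):
    """Return [P1, ..., Pn], where Pk holds the first k objects."""
    labels = list(objects)
    return [{name: objects[name] for name in labels[:k]}
            for k in range(1, len(labels) + 1)]


def is_subpicture(small, big):
    """True if every object of `small` occurs in `big` with identical bounds."""
    return all(big.get(name) == box for name, box in small.items())


def overlaps(a, b):
    """True if two bounding rectangles overlap on both axes."""
    (ax, ay), (bx, by) = a, b
    return ax[0] < bx[1] and bx[0] < ax[1] and ay[0] < by[1] and by[0] < ay[1]


PICTURES = build_pictures(OBJECTS)
```

With this layout, `PICTURES[2]` and `PICTURES[3]` correspond to the pictures $P_3$ and $P_4$ discussed in the text, and `is_subpicture(PICTURES[2], PICTURES[3])` reflects the nesting by construction.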

as $R(C(\varepsilon A(\varepsilon B)B\varepsilon))$. Note that $R$, $\varepsilon$, and the brackets [ ] denote the root of the tree, an empty-node, and a set-node, respectively. The 2D C-trees of the 10 simulation pictures are constructed and listed in Fig. 11.

FIG. 8. $P_3$ is a subpicture of $P_4$.

FIG. 10. The 2D C-trees of $P_4$.

FIG. 11. The 2D C-tree representations of 10 simulation pictures.

In the sample database, there are 10 generated pictures, $P_1, P_2, \ldots, P_{10}$, as reference pictures. We also use $P_i$, $1 \le i \le 10$, as the query picture to validate the correctness of the matching algorithm. Table 3 lists the exact query, for which we compute the tree distance $\delta$(reference, query). Table 4 shows the subpicture query, for which we compute the partial tree distance $\gamma$(reference, query).

There are some interesting observations in the simulation results:

(1) Table 3 shows that the tree distance computation strictly obeys the constraints of a distance metric. That is, for any $P_i$, $P_j$, and $P_k$,
  (i) $\delta(P_i, P_j) \ge 0$, and $\delta(P_i, P_i) = 0$ (positivity),
  (ii) $\delta(P_i, P_j) = \delta(P_j, P_i)$ (symmetry),
  (iii) $\delta(P_i, P_k) \le \delta(P_i, P_j) + \delta(P_j, P_k)$ (triangle inequality).

(2) For any $P_j$,
  (i) if $i < j$, then $\delta(P_i, P_j) < \delta(P_{i-1}, P_j)$;
  (ii) if $k > j$, then $\delta(P_k, P_j) < \delta(P_{k+1}, P_j)$.
$P_j$ contains the first $j$ objects. $P_i$ contains the first $i$ objects and is a subset of the objects of $P_j$ if $i < j$. $P_{i-1}$ contains all the objects of $P_i$ excluding the $i$th object. The distance between $P_i$ and $P_j$ is always smaller than the distance between $P_{i-1}$ and $P_j$. For two pictures $P_k$ and $P_{k+1}$ containing more objects than $P_j$, the distance between $P_k$ and $P_j$ is always smaller than the distance between $P_{k+1}$ and $P_j$. The above statements confirm that the computation of tree distances is suitable for measuring the similarity between two pictures. The smaller the distance between two pictures is, the more similar the two pictures are.

(3) It seems apparent that the partial tree distances in Table 4 are not symmetric. The values of the lower-triangle

TABLE 3
The Tree Distance Result of Comparing Ten Simulation Pictures (entries are $\delta$; rows: query, columns: reference)

Query \ Reference     P1    P2    P3    P4    P5    P6    P7    P8    P9   P10
P1                     0     9    42   110   169   255   418   594   750  1044
P2                     9     0    12    56   100   168   304   456   594   858
P3                    42    12     0    16    54    96   204   330   475   713
P4                   110    56    16     0    16    40   120   220   345   551
P5                   169   100    54    16     0     8    54   126   221   391
P6                   255   168    96    40     8     0    20    70   143   285
P7                   418   304   204   120    54    20     0    15    63   165
P8                   594   456   330   220   126    70    15     0    20    88
P9                   750   594   475   345   221   143    63    20     0    24
P10                 1044   858   713   551   391   285   165    88    24     0
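The properties claimed for Table 3 can be checked mechanically against the tabulated values. The sketch below is only an illustration; the helper names are hypothetical and the matrix is copied from Table 3 as printed. Positivity, symmetry, and the monotonic behaviour in observation (2) all hold for these numbers. The same sketch also lists triples for which (iii) does not hold on the printed values; for instance, $\delta(P_1, P_3) = 42$ exceeds $\delta(P_1, P_2) + \delta(P_2, P_3) = 9 + 12 = 21$, so the triangle-inequality claim may deserve a closer look for a distance defined as a product of per-axis distances.

```python
# Tree distances copied from Table 3; index k corresponds to picture P(k+1).
TABLE3 = [
    [0, 9, 42, 110, 169, 255, 418, 594, 750, 1044],
    [9, 0, 12, 56, 100, 168, 304, 456, 594, 858],
    [42, 12, 0, 16, 54, 96, 204, 330, 475, 713],
    [110, 56, 16, 0, 16, 40, 120, 220, 345, 551],
    [169, 100, 54, 16, 0, 8, 54, 126, 221, 391],
    [255, 168, 96, 40, 8, 0, 20, 70, 143, 285],
    [418, 304, 204, 120, 54, 20, 0, 15, 63, 165],
    [594, 456, 330, 220, 126, 70, 15, 0, 20, 88],
    [750, 594, 475, 345, 221, 143, 63, 20, 0, 24],
    [1044, 858, 713, 551, 391, 285, 165, 88, 24, 0],
]


def positivity_and_symmetry(d):
    """Properties (1)(i) and (1)(ii): non-negative, zero diagonal, symmetric."""
    n = len(d)
    nonneg = all(v >= 0 for row in d for v in row)
    zero_diag = all(d[i][i] == 0 for i in range(n))
    symmetric = all(d[i][j] == d[j][i] for i in range(n) for j in range(n))
    return nonneg and zero_diag and symmetric


def grows_away_from_diagonal(d):
    """Observation (2): distances shrink as the picture index approaches j."""
    n = len(d)
    for j in range(n):
        left = all(d[i][j] < d[i - 1][j] for i in range(1, j))
        right = all(d[k][j] < d[k + 1][j] for k in range(j + 1, n - 1))
        if not (left and right):
            return False
    return True


def triangle_violations(d):
    """Triples (i, j, k) with d[i][k] > d[i][j] + d[j][k]."""
    n = len(d)
    return [(i, j, k)
            for i in range(n) for j in range(n) for k in range(n)
            if d[i][k] > d[i][j] + d[j][k]]
```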

are almost the same as those of the tree distances in Table 3, and the values of the upper-triangle are almost zero because the costs of delete operations are weighted zero. Basically, the partial tree distance computation obeys the distance metric except for the symmetry constraint, owing to the partial cost functions defined. Note that some very small nonzero values appear in the upper-triangle when the cutting causes some objects to be segmented.

(4) The value in the upper-triangle of Table 4 represents the partial tree distance between one (reference) picture with more objects and another (query) picture with fewer objects, i.e., $\gamma(P_j, P_i)$ for two pictures $P_j$ and $P_i$, $j > i$. Since this value is zero or very close to zero, $P_i$ is viewed as a subpicture of $P_j$. For example, $\gamma(P_4, P_3) = 0$ implies that $P_3$ is a subpicture of $P_4$ with zero cost. The result complies with the facts of the simulation.

The above evidence supports the validity of our tree-matching algorithms and the effectiveness of similarity retrieval by 2D C-trees matching.

8. PROTOTYPE SYSTEM

We apply the above mechanisms to implement an interactive video information system in our experimental project [11]. We capture 48 streams from "The Lion King" cartoon produced by The Walt Disney Company and store the video data in AVI file format. Each stream takes about 99 s and consists of about 1500 frames. Some key image frames are identified in a human-assisted fashion for each of the video streams. Note that this work can benefit from motion analysis of the recorded scene. These key images become representative of the streams. There are 351 key images in our experiment and some are listed in Fig. 12. For this popular animation, 78 roles are chosen to be the objects, which are also extracted in a human-assisted fashion. These objects are represented by a set of designed icons in the system. The objects and their bounding rectangles within images are also extracted after capturing the image from the source video. Each image, containing about five objects on average, is constructed into two 2D C-trees

TABLE 4
The Partial Tree Distance Result of Comparing Ten Simulation Pictures (entries are $\gamma$; rows: query, columns: reference)

Query \ Reference     P1    P2    P3    P4    P5    P6    P7    P8    P9   P10
P1                     0     0     0     0     0     0     0     0     0     0
P2                     9     0     0     0     0     0     0     0     0     0
P3                    42    12     0     0     1     1     1     1     1     1
P4                   110    56    16     0     3     2     2     2     2     2
P5                   169   100    48    12     0     0     0     0     1     1
P6                   255   168    96    36     8     0     0     0     1     1
P7                   418   304   204   112    54    20     0     0     1     1
P8                   594   456   330   209   126    70    15     0     2     2
P9                   750   594   475   345   221   143    63    20     0     0
P10                 1044   858   713   532   391   285   165    88    24     0
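The asymmetry visible in Table 4 follows from the zero-weight deletion in Algorithm 2, and it can be reproduced on toy trees. The sketch below follows the recurrence of Algorithm 2 under stated assumptions: trees are nested `(label, children)` tuples, and unit insertion and relabel costs stand in for N(j) and N(j/i), whose exact definitions are not restated in this section. The function names are hypothetical, and the code is an illustration rather than the authors' implementation. `picture_partial_distance` combines the two per-axis values as in Definition 2.

```python
def postorder(tree):
    """Return (labels, lml): node labels in postorder and, for each node,
    the postorder index of its leftmost leaf descendant."""
    labels, lml = [], []

    def visit(node):
        label, children = node
        first = None
        for child in children:
            leaf = visit(child)
            if first is None:
                first = leaf
        idx = len(labels)
        labels.append(label)
        lml.append(idx if first is None else first)
        return lml[idx]

    visit(tree)
    return labels, lml


def keyroots(lml):
    """Highest-numbered node for each distinct leftmost leaf."""
    return sorted({leaf: i for i, leaf in enumerate(lml)}.values())


def partial_tree_distance(reference, query, insert_cost=1, relabel_cost=1):
    """Edit distance in which deleting reference nodes is free (assumed costs)."""
    a, la = postorder(reference)
    b, lb = postorder(query)
    td = [[0] * len(b) for _ in a]
    for x in keyroots(la):
        for y in keyroots(lb):
            m, n = x - la[x] + 1, y - lb[y] + 1
            fd = [[0] * (n + 1) for _ in range(m + 1)]  # column 0 stays 0: free deletes
            for q in range(1, n + 1):
                fd[0][q] = fd[0][q - 1] + insert_cost
            for p in range(1, m + 1):
                for q in range(1, n + 1):
                    i, j = la[x] + p - 1, lb[y] + q - 1
                    delete = fd[p - 1][q]
                    insert = fd[p][q - 1] + insert_cost
                    if la[i] == la[x] and lb[j] == lb[y]:
                        change = 0 if a[i] == b[j] else relabel_cost
                        fd[p][q] = min(delete, insert, fd[p - 1][q - 1] + change)
                        td[i][j] = fd[p][q]
                    else:
                        sub = fd[la[i] - la[x]][lb[j] - lb[y]] + td[i][j]
                        fd[p][q] = min(delete, insert, sub)
    return td[-1][-1]


def picture_partial_distance(gx, gy):
    """Definition 2: product of the per-axis distances, falling back to the
    other axis when one of them is zero."""
    if gx == 0:
        return gy
    if gy == 0:
        return gx
    return gx * gy
```

For a reference tree `R(A(B), C)` and a query `R(C)`, the distance from reference to query is zero because only deletions are needed, whereas swapping the roles charges for the two nodes that must be inserted. This mirrors the near-zero upper triangle and the large lower triangle of Table 4.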


FIG. 13. An example query image.

along the x- and y-axis directions independently, and these 2D C-trees are stored in association with the source AVI file. The number of objects within each image implies the number of nodes in its corresponding 2D C-trees.

The system supports single image query and frame sequence query. The system allows users to draw a query image by assembling the designed object icons or to use a previously prepared query template consisting of a sequence of query frames. An example query image is shown in Fig. 13. Two 2D C-trees of each image in the query sequence are constructed first. Then, for each stream of the video database, we compute the partial distances between the stream and the query sequence. An approximate sequence matching (ASM) mechanism can compute the subsequence matching distance. The stream with minimum distance represents the most similar stream for the query sequence. A result of the query template in Fig. 13 is shown in Fig. 14. Our initial results validate the effectiveness of similarity retrieval by 2D C-trees matching.

9. CONCLUSIONS

Similarity retrieval is one of the attractive functions of an image database system that distinguishes it from traditional database systems. The goal is to retrieve the images that are similar to the query image. Similarity retrieval based upon the minimum-distance criterion had been proposed in the techniques of 2D string matching, defined in terms of the longest common subsequence. However, the 2D string is deficient in describing the spatial knowledge of nonzero sized objects with overlapping. The similarity retrieval of images using 2D C-strings adopted the maximum-likelihood approach, defined in terms of the maximum degree of object-pairs clique. The algorithm for similarity retrieval based on 2D C-strings actually finds a maximum clique and becomes an NP-complete problem, though a polynomial algorithm for the average case is available.

In this paper, we use an ordered labeled tree, the 2D C-tree, as the spatial representation for an image and propose the tree-matching algorithm for similarity retrieval. The algorithm provides a simple fast comparison for computing the tree distance among images. The computation of the distance between 2D C-trees can be used to measure the similarity of images with spatial constraints. This approach provides an effective and efficient mechanism for similarity retrieval in image databases. The tree distance comparison algorithm is also modified to compute the partial tree distance for subpicture query. We also validate our tree-matching algorithms for similarity retrieval by simulation results. Moreover, the methodology of similarity retrieval is utilized in video sequence matching in our ongoing video information retrieval project.

APPENDIX: LIST OF SYMBOLS

$A_b$   the begin-bound of object A
$A_e$   the end-bound of object A

$\varepsilon$   empty-node
$T$   a rooted tree
$T[i]$   the $i$th node of tree $T$ in the postorder numbering
$e_1 \ldots e_k$   the editing operations
$\Delta$   the cost function of editing operation
$\delta(T_1, T_2)$   the tree distance between tree $T_1$ and tree $T_2$
lca(i, j)   the least common ancestor node of $T[i]$ and $T[j]$
$N(T)$   the number of symbols in tree $T$
$l(i)$   the postorder number of the leftmost leaf descendant in a subtree rooted at $T[i]$
$\partial(i)$   the depth of $T[i]$
$P(i)$   the set of predecessors of $T[i]$
$P^k(i)$   the $k$th level predecessor of $T[i]$
$T[i..j]$   the ordered subforest of tree $T$ induced by the nodes numbered from $T[i]$ to $T[j]$ inclusive
Forest(i)   an ordered subforest $T[1..i]$
Size(i)   the number of nodes in Tree(i)
$s_i$   the $i$th immediate descendant of node $S$
$|T|$   the number of nodes in the tree $T$
depth(T)   the depth of the tree $T$
leaves(T)   the number of leaf nodes in the tree $T$
$P_i$   the $i$th picture in the database
$\delta(P_1, P_2)$   the tree distance between two pictures $P_1$ and $P_2$
$\gamma(P_1, P_2)$   the partial distance between two pictures $P_1$ and $P_2$

FIG. 14. A result of the query template in Fig. 13.

REFERENCES

1. S. K. Chang, Principles of Pictorial Information Systems Design, Prentice–Hall, Englewood Cliffs, NJ, 1989.

2. S. K. Chang, E. Jungert, and Y. Li, Representation and retrieval of symbolic pictures using generalized 2D strings, in SPIE Proc. on Visual Communications and Image Processing, Philadelphia, 1989, pp. 1360–1372.

3. S. K. Chang, Q. Y. Shi, and C. W. Yan, Iconic indexing by 2-D strings, IEEE Trans. Pattern Anal. Mach. Intell. PAMI-9, 1987, 413–428.

4. S. K. Chang, C. W. Yan, D. C. Dimitrof, and T. Arndt, Intelligent image database system, IEEE Trans. Software Eng. 14, 1988, 681–688.

5. P. Ciaccia, F. Rabitti, and P. Zezula, Similarity search in multimedia database systems, in Proceedings, The First International Conference on Visual Information Systems (VISUAL'96), Melbourne, Australia, 1996, pp. 107–115.

6. M. Flickner, H. Sawhney, W. Niblack, J. Ashley, Q. Huang, B. Dom, M. Gorkani, J. Hafner, D. Lee, D. Petkovic, D. Steele, and P. Yanker, Query by image and video content: The QBIC system, IEEE Comput. 44, 1995, 23–32.

7. K. S. Fu, Syntactic Pattern Recognition and Application, Prentice–Hall, Englewood Cliffs, NJ, 1982.

8. V. N. Gudivada and V. V. Raghavan, Content-based image retrieval systems, IEEE Comput. 44, 1995, 18–22.

9. A. Gupta, T. Weymouth, and R. Jain, Semantic queries with pictures: The VIMSYS model, in Proceedings, The 17th International Conference on Very Large Data Bases, Barcelona, Spain, 1991, pp. 69–79.

10. F. J. Hsu, S. Y. Lee, and P. S. Lin, 2D C-tree spatial representation for iconic image, in Proceedings, The 2nd International Conference on Visual Information Systems, San Diego, CA, 1997, pp. 287–294.

11. F. J. Hsu, S. Y. Lee, and P. S. Lin, Video data indexing by 2D C-trees, J. Vis. Lang. Comput., submitted.

12. H. V. Jagadish, A. O. Mendelzon, and T. Milo, Similarity-based queries, in Proceedings, The 14th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, San Jose, CA, 1995, pp. 36–45.

13. S. Y. Lee and F. J. Hsu, 2D C-string: A new spatial knowledge representation for image database systems, Pattern Recognit. 23, 1990, 1077–1087.

14. S. Y. Lee and F. J. Hsu, Spatial reasoning and similarity retrieval of images using 2D C-string knowledge representation, Pattern Recognit. 25, 1992, 305–318.

15. S. Y. Lee, M. K. Shan, and W. P. Yang, Similarity retrieval of iconic image database, Pattern Recognit. 22, 1989, 675–682.

16. U. Manber, Introduction To Algorithms: A Creative Approach, Addison–Wesley, Reading, MA, 1989.

17. W. Rickert, Extracting area objects from raster image data, IEEE Comput. Graphics Appl. 13, 1993, 68–73.

18. A. Soffer and H. Samet, Pictorial queries by image similarity, in Proceedings, The 13th International Conference on Pattern Recognition, Vienna, Austria, 1996, pp. 114–119.

19. J. S. Wang, K. Zhang, K. Jeong, and D. Shasha, A system for approximate tree matching, IEEE Trans. Knowledge Data Eng. 6, 1994, 559–571.

20. K. Wakimoto, M. Shima, S. Tanaka, and A. Maeda, Content-based retrieval applied to drawing image database, SPIE 1908, 1993, 74–84.

21. J. K. Wu, A. D. Narasimhalu, B. M. Mehtre, C. P. Lam, and Y. J. Gao, CORE: A content-based retrieval engine for multimedia information systems, ACM Multimedia Syst. 3, 1995, 25–41.

22. K. Zhang and D. Shasha, Simple fast algorithms for the editing distance between trees and related problems, SIAM J. Comput. 18, 1989, 1245–1262.

23. K. Zhang, Algorithms for the constrained editing distance between ordered labeled trees and related problems, Pattern Recognit. 28, 1995, 465–474.

FANG-JUNG HSU received his B.S. degree in computer science from the Soochow University and his M.S. degree in information science from the Chiao Tung University, Taiwan in 1982 and 1989, respectively. He is currently a Ph.D. candidate in computer science and information engineering at Chiao Tung University. He has been a senior engineer at the Computer Communication Research Laboratories (CCL) of the Industrial Technology Research Institute (ITRI), Taiwan, since 1984. His current research interests include multimedia information systems, image/spatial databases, and object-oriented databases.

SUH-YIN LEE received her B.S.E.E. degree from the National Chiao Tung University, Taiwan in 1972 and her M.S. degree in computer science from the University of Washington, Seattle in 1975. She joined the faculty of the Department of Computer Engineering at Chiao Tung University in 1976 and received the Ph.D. degree in electronic engineering there in 1982. Dr. Lee is now a professor in the Department of Computer Science and Information Engineering at Chiao Tung University. She chaired the department from 1991 to 1993. Her current research interests include multimedia information systems, object-oriented databases, image/spatial databases, and computer networks. Dr. Lee is a member of Phi Tau Phi, the ACM, and the IEEE Computer Society.

BAO-SHUH LIN (S'76-M'79-SM'89) received the Ph.D. degree in computer science from the University of Illinois, Urbana, IL in 1980. He had been working as an R&D Manager of computer communication-related projects for AT&T Bell Laboratories, Racal Data Communications, Boeing, and Teknekron Communication Systems. He is currently the Deputy General Director of the Computer Communication Research Laboratories (CCL) of the Industrial Technology Research Institute (ITRI), Taiwan, R.O.C. He is the director of many advanced projects in CCL, including high-performance computer systems, multimedia systems, Chinese information technologies, and advanced information technologies. Dr. Lin is a senior member of IEEE Computer Society.
