ARTICLE NO. VC980377
Similarity Retrieval by 2D C-Trees Matching in Image Databases
Fang-Jung Hsu* and Suh-Yin Lee
Institute of Computer Science and Information Engineering, National Chiao Tung University, HsinChu, Taiwan
and Bao-Shuh Lin
Computer & Communication Research Laboratories, Industrial Technology Research Institute, HsinChu, Taiwan
Received June 18, 1997; accepted February 26, 1998
Image retrieval based on spatial content is an attractive task in many image database applications. The 2D strings provide a natural way of constructing spatial indexing for images and support effective picture query. Nevertheless, the 2D string is deficient in describing the spatial knowledge of nonzero sized objects with overlapping. In this paper, we use an ordered labeled tree, a 2D C-tree, to be the spatial representation for images and propose the tree-matching algorithm for similarity retrieval. The distance between 2D C-trees is used to measure the similarity of images. The proposed tree comparison algorithm is also modified to compute the partial tree distance for subpicture query. Experimental results for verifying the effectiveness of similarity retrieval by 2D C-trees matching are presented. © 1998 Academic Press

* Corresponding author. E-mail: [email protected].
1047-3203/98 $25.00 Copyright © 1998 by Academic Press. All rights of reproduction in any form reserved.

1. INTRODUCTION

Content-based image retrieval plays a principal role in many application areas, such as picture archiving and communication systems, geographic information systems, biomedical, education, and home entertainment systems [1]. In a content-based image retrieval system, it is required to effectively and efficiently retrieve information from the image repositories. Such a system helps users retrieve relevant images based on their content [8, 20]. Current approaches [6, 9, 18, 21] to content-based image retrieval differ in terms of image features extracted, the level of abstraction manifested in the features, and the desired degree of domain independence. There are two major categories of features: primitive and logical. Primitive, or low level, image features such as object colors [6] and boundaries can be extracted automatically or semi-automatically. Logical features are abstract representations of images at various levels of detail. Some logical features such as spatial-location and spatial-relation [18] may be synthesized from primitive features, whereas others can only be obtained through considerable human involvement. Spatial constraint is a significant logical feature and is our focus in this article.

The intelligent image database system (IIDS) [4] provides high-level object-oriented search and supports spatial query. The spatial reasoning is based on a data structure called 2D string [3] which preserves the objects' spatial knowledge embedded in images. Each symbolic picture can be represented by a 2D string and a picture query can also be specified by a 2D string. The problem of image retrieval then becomes a problem of 2D string subsequence matching [15]. Lee and Hsu [13] proposed 2D C-string representation for nonzero sized objects with a set of spatial operators and a more efficient cutting mechanism. All the spatial relations among objects with efficient segmentation are preserved with 2D C-string representation. The problems of how to infer the spatial relations between pictorial objects from a given 2D C-string in spatial reasoning and similarity retrieval are solved by using the ranking mechanism [14].

In general, similarity retrieval is needed when users cannot express queries in a precise way. The target of similarity retrieval for images is to retrieve the images that are most similar to the query image. The similarity between two patterns or pictures can be measured on the basis of the maximum-likelihood or minimum-distance criterion [12]. The similarity based upon the minimum-distance criterion has been proposed using the techniques of 2D string matching defined in terms of longest common subsequence [3]. However, the 2D string representation is deficient in describing the spatial knowledge of the nonzero sized objects with overlapping. The similarity retrieval based on 2D C-strings investigated by Lee and Hsu instead adopts the maximum-likelihood approach in terms of maximum degree of object-pairs clique [14]. All the spatial relationships between object pairs, which number $O(N^2)$ for $N$ objects in an image, need to be reasoned first. The algorithm for similarity retrieval based on 2D C-strings actually finds a maximum clique and becomes an NP-complete problem.
In this paper, we use a transformed structure of the 2D C-string, called 2D C-tree [10], to be the spatial representation for images and propose the tree-matching algorithm for similarity retrieval. The 2D C-tree is an ordered labeled tree, which preserves the complete spatial knowledge among objects without spatial operators. The structural tree representation plays a significant role in retrieving images by tree-matching. We briefly review the 2D string indexing approach in the next section. In Section 3 the basic structure of a 2D C-tree and a sample image representation are introduced. The metric for tree distance computation is defined in Section 4. Then we propose a specific tree-matching algorithm to solve the problem of image retrieval in Section 5. The image retrieval algorithm is modified to compute the partial tree distance for subpicture query. This work is explored in Section 6. Simulation results for verifying the effectiveness of similarity retrieval by 2D C-trees matching are presented in Section 7. Also, an experimental project applying the proposed algorithms to video information system is described in Section 8. Finally, conclusions are summarized in the last section.

2. 2D STRING INDEXING APPROACHES

Given a physical image at the pixel level, the objects and their relative positions within the image can be extracted by using various image processing and understanding techniques [17]. Although this task is computationally expensive, it is performed only at the time of inserting images into the database. Moreover, this task may be carried out in an automated fashion or in a human-assisted semi-automated fashion, depending on the domain and the complexity of the images. A symbolic image is then obtained by associating a name with each of the domain objects thus identified. An image composed of a set of graphic icons that represents the symbolic objects is named an iconic image. The idea of representing physical images by iconic images is similar to the representation of documents by index terms in bibliographic information systems. We use the terms image, symbolic image, and iconic image interchangeably in this article.

The 2D string approach for spatial indexing was initially proposed by Chang et al. [3] to represent iconic images. Three spatial relation operators "<", "=", and ":" are employed in 2D strings. The operator "<" denotes the "left-right" or "below-above" spatial relation. The operator "=" denotes the "at the same spatial location as" relation. The operator ":" denotes the "in the same set as" relation. The symbolic picture f1 in Fig. 1a may be represented as the 2D string (A = D : E < B < C, A < B = C < D : E) or as (A = DE < B < C, A < B = C < DE), where the symbol ":" is omitted.

FIG. 1. A symbolic image and a query sketch.

The 2D string representation is also suitable for formulating picture queries. In fact, we can imagine that the query can be specified graphically, by drawing an iconic image on the screen of a computer. The graphic representation, called an icon sketch, can be translated into 2D string representation. For example, we may want to retrieve images satisfying a certain icon sketch q1 as in Fig. 1b. Then q1 can be translated into the 2D string (A = E < C, A < C < E). This query string is a substring of the 2D string representation of the example image f1. The problem of image retrieval then becomes the problem of 2D string subsequence matching.

However, the spatial operators of 2D strings are not sufficient to give a complete description of spatial knowledge for images of arbitrary complexity. The 2D G-string representation was proposed to handle more types of relations between pictorial objects [2], but it is not economic for complex images in terms of storage space efficiency and navigation complexity. Lee and Hsu [13] proposed 2D C-string representation with a set of spatial operators and a more efficient cutting mechanism. They employed a characteristic set of spatial operators illustrated in Table 1 to give a complete description for images of arbitrary complexity.

Basically, the 2D C-string approach performs a cut to handle the cases of objects with partly overlapping. The global operators "<" and "|", which are employed in the original 2D string approach, handle the cases of nonoverlapping. The extended operators "=", "[", "%", and "]", called the local operators, and a pair of separators "( )" handle the cases of overlapping. The picture f2 in Fig. 2a is similar to f1 in Fig. 1a, except that the objects in f2 are nonzero sized objects as opposed to point objects in f1. The 2D C-string representation of the picture f2 is (A]D[EzBzC,
FIG. 3. The signed 2D C-trees of image f2.

TABLE 1
The Definition of Characteristic Spatial Operators

Notation   Condition                        Meaning
A < B      $A_e < B_b$                      A is disjoint from B.
A | B      $A_e = B_b$                      A is edge to edge with B.
A = B      $A_b = B_b$ and $A_e = B_e$      A is the same as B.
A [ B      $A_b = B_b$ and $A_e > B_e$      A contains B and they have the same begin-bound.
A ] B      $A_b < B_b$ and $A_e = B_e$      A contains B and they have the same end-bound.
A % B      $A_b < B_b$ and $A_e > B_e$      A contains B and they do not have the same bound.
A / B      $A_b < B_b < A_e < B_e$          A is partly overlapping with B.

Note. The notations $A_b$ and $A_e$ ($B_b$ and $B_e$) denote the values of the begin- and end-bounds of objects A and B, respectively.
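The conditions of Table 1 translate directly into comparisons on begin- and end-bounds. The following is a minimal sketch, assuming each object is given along one axis as a `(begin, end)` pair; the function name `spatial_operator` is an illustrative choice and not part of the original paper.

```python
def spatial_operator(a, b):
    """Return the Table 1 operator relating interval a to interval b.

    a and b are (begin, end) pairs along one axis. None is returned when
    no row of Table 1 applies in this order (for example when b begins
    before a), in which case the caller may swap the arguments.
    """
    ab, ae = a
    bb, be = b
    if ae < bb:
        return "<"   # A is disjoint from B
    if ae == bb:
        return "|"   # A is edge to edge with B
    if ab == bb and ae == be:
        return "="   # A is the same as B
    if ab == bb and ae > be:
        return "["   # A contains B, same begin-bound
    if ab < bb and ae == be:
        return "]"   # A contains B, same end-bound
    if ab < bb and ae > be:
        return "%"   # A contains B, no common bound
    if ab < bb < ae < be:
        return "/"   # A is partly overlapping with B
    return None
```

Only the partly overlapping case "/" requires the cutting mechanism; the other six rows can be written as operators without cutting.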
intact without cutting because the case of partly overlapping does not happen.

The 2D C-string is efficient in the representation and manipulation of images, but it is not suitable in image retrieval based on 2D string subsequence matching. For example, we use a query sketch q2 as in Fig. 2b, which is a subpicture of f2. The 2D C-string of the query image, (A % E < C, A < C < E) of q2, is quite different in the format from the 2D C-string of f2, due to the spatial operators. The string q2 is not a substring of the string f2 any longer. The operators are needed to handle the global and local relations among symbolic objects in a 2D C-string and cannot be omitted.

Although the inference of the spatial relations between objects from a given 2D C-string in spatial reasoning can be solved by using the ranking mechanism [14], the computation of object ranks in a 2D C-string is somewhat complicated. Moreover, all the spatial relationships of object pairs, which number $O(N^2)$ for $N$ objects in an image, are required to be reasoned first by adopting the 2D longest common subsequence algorithm [15]. The algorithm for similarity retrieval actually finds a maximum clique of the corresponding association graph and becomes an NP-complete problem although there are some polynomial time algorithms for the average case. Therefore we explore a more efficient representation and a matching algorithm to solve the problem of image similarity retrieval.

3. 2D C-TREE

The 2D C-tree is an ordered labeled tree. We first introduce the basic structure of a 2D C-tree. The 2D C-tree representation still employs the sparse cutting mechanism to handle the case of symbolic images with partly overlapping objects [10]. The cutting mechanism performs only essential cuttings to get rid of the ambiguity incurred due to partly overlapping. After cutting, an image is partitioned to some portions between two cuttings. All the portions are sequentially linked to a root, R, which is initialized to represent the margin or boundary of the area covered by a given image.

The original 2D C-tree representation, called the signed 2D C-tree, is proposed with associated spatial operators. Each node with label, or symbol name, represents an object in the image. The link connecting two nodes, called the signed link, is signed with the relation operator. For the ordered subtree rooted at node $S$ with $n$ immediate descendants in the ordering $s_1, s_2, \ldots, s_n$, $S$, being the parent, actually contains the local body consisting of all its immediate child-nodes $s_1, s_2, \ldots, s_n$. The relation operator between node $S$ and its first child-node $s_1$ is surely a local operator that indicates the ensemble relationship between $S$ and the local body consisting of all its child-nodes $s_1, s_2, \ldots, s_n$. The relation operators between node $S$ and other child-nodes $s_i$ ($2 \le i \le n$) are definitely global operators that indicate the sibling relation between the child-node $s_i$ ($2 \le i \le n$) and the prior child-node $s_{i-1}$ of node $S$. The signed 2D C-trees of f2 are constructed as shown in Fig. 3.
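The rule that the first signed link of a node carries a local operator and every later link carries a global operator can be captured in a small data structure. This is only a sketch under stated assumptions: the class name `SignedNode`, the method `add_child`, and the example labels are hypothetical and do not reproduce the trees of Fig. 3.

```python
from dataclasses import dataclass, field

LOCAL_OPS = {"=", "[", "%", "]"}   # relate a parent to its local body
GLOBAL_OPS = {"<", "|"}            # relate a child to its prior sibling


@dataclass
class SignedNode:
    label: str
    # (operator on the signed link, child node), kept in sibling order
    links: list = field(default_factory=list)

    def add_child(self, op, child):
        """Attach child with a signed link, enforcing the operator rule."""
        expected = LOCAL_OPS if not self.links else GLOBAL_OPS
        if op not in expected:
            position = len(self.links) + 1
            raise ValueError(f"operator {op!r} not allowed for child {position}")
        self.links.append((op, child))
        return child
```

For instance, a node whose first link is signed "%" and whose second link is signed "<" is accepted, while a second link signed "=" is rejected.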
However, a tree with signed links is somewhat unusual for general applications. The empty-node and set-node are employed in order to remove the relation operators. An empty-node is a pseudo node which is labeled "ε" and can be of various sizes. The relation operator of the signed link can be removed by inserting some suitable empty-nodes. When the relation operators are stripped off from a signed 2D C-tree, each node of the transformed tree has at least two child-nodes, except the node that originally connects to its single child-node by the "=" operator. The "=" operator possesses the commutation law, which is different from other relation operators of the 2D C-tree. The objects which are connected with the "=" operator have the same begin-bound and end-bound. For the reasoning of spatial relationship among the nodes of the 2D C-tree, a special set-node is introduced for treating a lineage in which each node has a single child-node. A set-node is a multilabel node consisting of objects that have the same begin-bound and end-bound. The detailed transformation rules are investigated in [10]. The sample symbolic image f2 in Fig. 2a is represented in a General 2D C-tree as shown in Fig. 4.

FIG. 2. A symbolic image with nonzero sized objects and a query sketch.

FIG. 4. The General 2D C-trees of image f2.

4. TREE METRIC

Ordered labeled trees are trees whose nodes are labeled and in which the left to right ordering among siblings is significant. The distance and/or similarity of such trees have many applications in computer vision, pattern recognition, programming compilation, and natural language processing [7]. The distance between two ordered trees is considered to be the weighted number of editing operations required to transform one tree to another. Many algorithms have been developed for ordered labeled tree matching and comparison [23]. Currently the best algorithm for computing the editing distance was presented by Zhang and Shasha [22]. In this section we introduce the distance metric between trees to be the basis for presenting an elegant tree-matching algorithm in image retrieval.

Three kinds of editing operations of a labeled tree [23] are considered and illustrated in Fig. 5. Relabeling a node $s$ means changing the label on $s$. Deleting a node $s$ means making the children of $s$ become the children of the parent of $s$ and then removing $s$. Inserting $s$ as a child of $r$ will make $s$ become the parent of a consecutive subsequence of the current children of $r$.

FIG. 5. Three editing operations on labeled tree.

These editing operations can be represented as $\alpha \to \beta$, where $\alpha$ is either $\Lambda$ (null) or a label in tree $T_1$ and $\beta$ is either $\Lambda$ or a label in tree $T_2$. We call $\alpha \to \beta$ a relabel operation if $\alpha \ne \Lambda$ and $\beta \ne \Lambda$, a delete operation if $\beta = \Lambda$, and an insert operation if $\alpha = \Lambda$. Let $\Delta$ be a cost function which assigns a nonnegative real number, referred to as $\Delta(\alpha \to \beta)$, to each editing operation $\alpha \to \beta$. The cost can vary in different operations on different nodes. For example, a higher node in a tree has a greater weight than a lower one. Nevertheless, the cost of each editing operation on any node is set equal for simplicity in this paper.

The cost function $\Delta$ of each editing operation is constrained to be a distance metric [5]. That is,

(i) $\Delta(\alpha \to \beta) \ge 0$, $\Delta(\alpha \to \alpha) = 0$, positivity;
(ii) $\Delta(\alpha \to \beta) = \Delta(\beta \to \alpha)$, symmetry;
(iii) $\Delta(\alpha \to \gamma) \le \Delta(\alpha \to \beta) + \Delta(\beta \to \gamma)$, triangle inequality.
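The three editing operations of this section can be illustrated on a simple ordered tree. The sketch below is an assumption-laden illustration rather than the authors' implementation: the `Node` class and the helper names `relabel`, `delete_child`, and `insert_child` are hypothetical, and children are stored as an ordered Python list.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)


def relabel(node, new_label):
    """Relabel: change the label on the node."""
    node.label = new_label


def delete_child(parent, index):
    """Delete parent.children[index]; its children take its place in order."""
    removed = parent.children[index]
    parent.children[index:index + 1] = removed.children
    return removed


def insert_child(parent, label, start, end):
    """Insert a new child of parent that adopts parent.children[start:end]."""
    new_node = Node(label, parent.children[start:end])
    parent.children[start:end] = [new_node]
    return new_node
```

Deleting a node and then inserting a node with the same label over the same consecutive children restores the original tree, which is why delete and insert are treated as inverse operations under a symmetric cost function.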
Let $E$ be a sequence $e_1, e_2, \ldots, e_k$ of editing operations. An $E$-derivation from tree $A$ to tree $B$ is a sequence of trees $A_0, A_1, \ldots, A_k$ such that $A = A_0$, $B = A_k$, and $A_{i-1} \to A_i$ via editing operation $e_i$ for $1 \le i \le k$. Then the cost function $\Delta$ can be extended to the sequence of editing operations by letting

$$\Delta(E) = \sum_{i=1}^{|E|} \Delta(e_i). \tag{1}$$

The editing distance between two trees is defined as the minimum cost of the editing sequence that transforms one tree to the other. Formally the editing distance between trees $T_1$ and $T_2$ is defined as

$$\delta(T_1, T_2) = \min\{\Delta(E) \mid E \text{ is an editing sequence from } T_1 \text{ to } T_2\}. \tag{2}$$

The editing sequence can be treated as a mapping that is a graphical specification of editing operations applied to the nodes in the two ordered trees. Suppose that we have a numbering mechanism, for example, the postorder numbering for a tree. Let $T[i]$ be the $i$th node of tree $T$ in the postorder numbering. Formally, we define a triple $(M, T_1, T_2)$ to be a mapping from $T_1$ to $T_2$, where mapping $M$ is the set of integer pairs $(i, j)$ satisfying:

(1) $1 \le i \le |T_1|$, $1 \le j \le |T_2|$;
(2) For two pairs $(i_1, j_1)$ and $(i_2, j_2)$ in $M$,
    (a) $i_1 = i_2$ if and only if $j_1 = j_2$ (one-to-one),
    (b) $T_1[i_1]$ is to the left of $T_1[i_2]$ if and only if $T_2[j_1]$ is to the left of $T_2[j_2]$ (sibling order preserved),
    (c) $T_1[i_1]$ is an ancestor of $T_1[i_2]$ if and only if $T_2[j_1]$ is an ancestor of $T_2[j_2]$ (ancestor order preserved).
(3) Let $\mathrm{lca}(i_1, i_2)$ represent the least common ancestor node of $i_1$ and $i_2$. For three pairs $(i_1, j_1)$, $(i_2, j_2)$, and $(i_3, j_3)$ in $M$, $T_1[\mathrm{lca}(i_1, i_2)]$ is a proper ancestor of $T_1[i_3]$ if and only if $T_2[\mathrm{lca}(j_1, j_2)]$ is a proper ancestor of $T_2[j_3]$.

Let $M$ be a mapping from $T_1$ to $T_2$. Let $I$ and $J$ be the sets of unmatched nodes in $T_1$ and $T_2$, respectively. We will use $M$ instead of $(M, T_1, T_2)$ if there is no confusion. Then we can define the cost of $M$:

$$\Delta(M) = \sum_{(i,j) \in M} \Delta(T_1[i] \to T_2[j]) + \sum_{i \in I} \Delta(T_1[i] \to \Lambda) + \sum_{j \in J} \Delta(\Lambda \to T_2[j]). \tag{3}$$

Hence,

$$\delta(T_1, T_2) = \min\{\Delta(M) \mid M \text{ is a mapping from } T_1 \text{ to } T_2\}. \tag{4}$$

5. STRUCTURAL IMAGE RETRIEVAL

Now, we begin to introduce the tree matching algorithm for the 2D C-tree. Since the nodes of a 2D C-tree may be empty-nodes or set-nodes, we must make a small but significant modification of the tree-matching algorithm developed by Zhang and Shasha [22].

Suppose that $A$ is a node in tree $T_1$. $N(A)$ denotes the number of labels of node $A$. If $A$ is an empty-node, $A$ has a special label "ε" and $N(A)$ is one. If $A$ is a set-node, $N(A)$ must be more than one. For an editing operation $A \to B$, where $B$ is a node in tree $T_2$, the cost function needs to be re-examined:

(1) The cost of a delete operation $A \to \Lambda$, $\Delta(A \to \Lambda)$, is defined as $N(A)$. That is, $\Delta(A \to \Lambda) = N(A)$.
(2) The cost of an insert operation $\Lambda \to B$, $\Delta(\Lambda \to B)$, is defined as $N(B)$. That is, $\Delta(\Lambda \to B) = N(B)$.
(3) The cost of a relabel operation $A \to B$, $\Delta(A \to B)$, is defined as the larger of two numbers $N(A/B)$ and $N(B/A)$. Let $N(A/B)$ represent the number of labels of node $A$ which are differentiated from those of node $B$. That is, $\Delta(A \to B) = \max\{N(A/B), N(B/A)\}$. If one of them is an empty-node, the cost is the number of labels of the other node.

These cost functions defined above are still under the constraints of distance metric. In the following some notations on trees are illustrated in Fig. 6 using the tree $f_{2X}$ of Fig. 4a with postorder numbering in the parenthesis as an example:

FIG. 6. The 2D C-tree $f_{2X}$ of Fig. 4a with postorder numbering.

(1) $T[i]$. The $i$th node in the tree $T$ according to the left-to-right postorder numbering. (Ex. The label of $T[2]$ is E.)
(2) $l(i)$. The number of the leftmost leaf descendant of the subtree rooted at $T[i]$. When $T[i]$ is a leaf node, $l(i) = i$. (Ex. $l(4) = 2$; i.e., E is the leftmost leaf descendant of D.)
(3) $\mathrm{depth}(i)$. The depth of $T[i]$; it is the number of nodes on the path from the root of tree $T$ to node $T[i]$, excluding $T[i]$. (Ex. $\mathrm{depth}(2) = 3$; i.e., the depth of E is 3.)
(4) $P(i)$. $P_k(i)$ denotes the $k$th level predecessor of $T[i]$, where the level is counted from node $T[i]$ backward to the root. $P(i) = \{P_k(i) \mid 1 \le k \le \mathrm{depth}(i)\}$. (Ex. $P(2) = \{4, 5, 8\}$; i.e., the predecessors of E are D, A, and R.)
(5) $T[i\,..\,j]$. An ordered subforest of tree $T$ induced by the nodes numbered from $i$ to $j$ inclusive. If $i > j$, then $T[i\,..\,j] = \emptyset$. (Ex. $T[2\,..\,6]$ includes E, ε, D, A, and B.)
(6) Forest($i$). An ordered subforest $T[1\,..\,i]$. (Ex. Forest(4) includes ε, E, ε, and D.)
(7) Tree($i$). A subtree of $T$ rooted at $T[i]$. $T[l(i)\,..\,i]$ will be referred to as Tree($i$). (Ex. Tree(4) is equivalent to $T[2\,..\,4]$.)
(8) Size($i$). The number of nodes in Tree($i$). (Ex. Size(4) = 3.)
(9) ForestDist($i'\,..\,i$, $j'\,..\,j$). The distance between two subforests $T_1[i'\,..\,i]$ in $T_1$ and $T_2[j'\,..\,j]$ in $T_2$. We use an abbreviated notation ForestDist($i$, $j$) for the distance between $T_1[1\,..\,i]$ and $T_2[1\,..\,j]$ (see below).
(10) TreeDist($i$, $j$). The distance between the subtree Tree($i$) rooted at $i$ in $T_1$ and the subtree Tree($j$) rooted at $j$ in $T_2$ (see below).

The following three lemmas are necessary for the tree distance computation algorithm.

LEMMA 1. Let $i_p \in P(i)$ and $j_p \in P(j)$. Then

$$\text{(i)}\quad \mathrm{ForestDist}(\emptyset, \emptyset) = 0. \tag{5}$$
$$\text{(ii)}\quad \mathrm{ForestDist}(l(i_p)\,..\,i, \emptyset) = \mathrm{ForestDist}(l(i_p)\,..\,i-1, \emptyset) + \Delta(T_1[i] \to \Lambda). \tag{6}$$
$$\text{(iii)}\quad \mathrm{ForestDist}(\emptyset, l(j_p)\,..\,j) = \mathrm{ForestDist}(\emptyset, l(j_p)\,..\,j-1) + \Delta(\Lambda \to T_2[j]). \tag{7}$$

Case (i) requires no editing operation and is assigned 0 for initialization. In (ii), the distance corresponds to the cost of deleting a node $T_1[i]$ from a forest $T_1[l(i_p)\,..\,i]$. The forest $T_1[l(i_p)\,..\,i]$ is led by the leftmost leaf node of a tree Tree($i_p$) containing node $T_1[i]$ in $T_1$ and concluded at $T_1[i]$. In (iii), the distance corresponds to the cost of inserting a node $T_2[j]$ into a forest $T_2[l(j_p)\,..\,j-1]$ in $T_2$.

LEMMA 2.

$$\mathrm{ForestDist}(l(i)\,..\,i, l(j)\,..\,j) = \mathrm{TreeDist}(i, j) = \min \begin{cases} \mathrm{ForestDist}(l(i)\,..\,i-1, l(j)\,..\,j) + \Delta(T_1[i] \to \Lambda), \\ \mathrm{ForestDist}(l(i)\,..\,i, l(j)\,..\,j-1) + \Delta(\Lambda \to T_2[j]), \\ \mathrm{ForestDist}(l(i)\,..\,i-1, l(j)\,..\,j-1) + \Delta(T_1[i] \to T_2[j]). \end{cases} \tag{8}$$

Lemma 2 computes the distance between two subtrees rooted at $T_1[i]$ in $T_1$ and $T_2[j]$ in $T_2$, respectively. Consider the mapping of two roots, $T_1[i]$ and $T_2[j]$. The distance between Tree($i$) of $T_1$ and Tree($j$) of $T_2$ is the minimum of the three possible editing mapping costs: delete $T_1[i]$ from $T_1$, insert $T_2[j]$ into $T_2$, or relabel $T_1[i]$ to $T_2[j]$.

LEMMA 3. Let $i_p \in P(i)$ and $j_p \in P(j)$. If $l(i_p) \ne l(i)$, or $l(j_p) \ne l(j)$, then

$$\mathrm{ForestDist}(l(i_p)\,..\,i, l(j_p)\,..\,j) = \min \begin{cases} \mathrm{ForestDist}(l(i_p)\,..\,i-1, l(j_p)\,..\,j) + \Delta(T_1[i] \to \Lambda), \\ \mathrm{ForestDist}(l(i_p)\,..\,i, l(j_p)\,..\,j-1) + \Delta(\Lambda \to T_2[j]), \\ \mathrm{ForestDist}(l(i_p)\,..\,l(i)-1, l(j_p)\,..\,l(j)-1) + \mathrm{TreeDist}(i, j). \end{cases} \tag{9}$$

Lemma 3 considers the distance between two forests. The forest $T_1[l(i_p)\,..\,i]$ ($T_2[l(j_p)\,..\,j]$) is led by the leftmost leaf node of a tree Tree($i_p$) (Tree($j_p$)) containing the considered node $T_1[i]$ ($T_2[j]$) and concluded at $T_1[i]$ ($T_2[j]$). The distance between $T_1[l(i_p)\,..\,i]$ of $T_1$ and $T_2[l(j_p)\,..\,j]$ of $T_2$ is the minimum of the three possible editing mapping costs: delete $T_1[i]$ from $T_1$, insert $T_2[j]$ into $T_2$, or substitute the subtree Tree($i$) of $T_1$ by Tree($j$) of $T_2$.

For proofs of Lemmas 1, 2, and 3, refer to [22].

The algorithm to compute the tree distance uses a dynamic programming style [16]. From Lemma 3 we observe that to compute TreeDist($i_p$, $j_p$) we need in advance almost all values of TreeDist($i$, $j$) where $T_1[i_p]$ is the root of a subtree containing $T_1[i]$ and $T_2[j_p]$ is the root of a subtree containing $T_2[j]$. This suggests a bottom-up procedure for computing all subtree pairs.

ALGORITHM 1. The computation of TreeDist(x, y).

Input: Two subtrees, Tree(x) rooted at x in tree T1 and Tree(y) rooted at y in tree T2.
Output: The distance TreeDist(x, y).
Begin
    ForestDist(∅, ∅) = 0;
    for i := l(x) to x
        ForestDist(l(x)..i, ∅) = ForestDist(l(x)..i−1, ∅) + N(i);
    for j := l(y) to y
        ForestDist(∅, l(y)..j) = ForestDist(∅, l(y)..j−1) + N(j);
    for i := l(x) to x
        for j := l(y) to y
            if l(i) = l(x) and l(j) = l(y), then
                ForestDist(l(x)..i, l(y)..j) = min{
                    ForestDist(l(x)..i−1, l(y)..j) + N(i),
                    ForestDist(l(x)..i, l(y)..j−1) + N(j),
                    ForestDist(l(x)..i−1, l(y)..j−1) + max{N(i/j), N(j/i)}};
                TreeDist(i, j) = ForestDist(l(x)..i, l(y)..j);
            else
                ForestDist(l(x)..i, l(y)..j) = min{
                    ForestDist(l(x)..i−1, l(y)..j) + N(i),
                    ForestDist(l(x)..i, l(y)..j−1) + N(j),
                    ForestDist(l(x)..l(i)−1, l(y)..l(j)−1) + TreeDist(i, j)};
End;
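Algorithm 1 and the label-count cost functions can be tried out with a compact dynamic program. The sketch below follows the keyroot organization of Zhang and Shasha [22] and uses 0-based postorder indices; the names `Node`, `postorder`, `relabel_cost`, and `tree_distance` are illustrative assumptions, and the trees in the usage note are made-up examples, not the trees of Figs. 4 and 7.

```python
class Node:
    def __init__(self, *labels, children=()):
        # One label for an ordinary node, "ε" for an empty-node,
        # several labels for a set-node.
        self.labels = frozenset(labels)
        self.children = list(children)


def relabel_cost(a, b):
    """max{N(A/B), N(B/A)}: the larger count of labels not shared."""
    return max(len(a.labels - b.labels), len(b.labels - a.labels))


def postorder(root):
    """Return nodes in postorder and, per node, its leftmost leaf index l(i)."""
    nodes, leftmost = [], []

    def visit(node):
        first = None
        for child in node.children:
            leaf = visit(child)
            if first is None:
                first = leaf
        nodes.append(node)
        leftmost.append(len(nodes) - 1 if first is None else first)
        return leftmost[-1]

    visit(root)
    return nodes, leftmost


def tree_distance(t1, t2):
    n1, l1 = postorder(t1)
    n2, l2 = postorder(t2)
    # Keyroots: for each leftmost-leaf value, the highest node having it.
    keys1 = sorted({l: i for i, l in enumerate(l1)}.values())
    keys2 = sorted({l: j for j, l in enumerate(l2)}.values())
    td = [[0] * len(n2) for _ in n1]          # TreeDist(i, j)
    for x in keys1:
        for y in keys2:
            rows, cols = x - l1[x] + 2, y - l2[y] + 2
            fd = [[0] * cols for _ in range(rows)]   # ForestDist table
            for a in range(1, rows):                 # Lemma 1 (ii)
                fd[a][0] = fd[a - 1][0] + len(n1[l1[x] + a - 1].labels)
            for b in range(1, cols):                 # Lemma 1 (iii)
                fd[0][b] = fd[0][b - 1] + len(n2[l2[y] + b - 1].labels)
            for a in range(1, rows):
                i = l1[x] + a - 1
                for b in range(1, cols):
                    j = l2[y] + b - 1
                    delete = fd[a - 1][b] + len(n1[i].labels)
                    insert = fd[a][b - 1] + len(n2[j].labels)
                    if l1[i] == l1[x] and l2[j] == l2[y]:   # Lemma 2
                        change = fd[a - 1][b - 1] + relabel_cost(n1[i], n2[j])
                        fd[a][b] = min(delete, insert, change)
                        td[i][j] = fd[a][b]
                    else:                                    # Lemma 3
                        p, q = l1[i] - l1[x], l2[j] - l2[y]
                        fd[a][b] = min(delete, insert, fd[p][q] + td[i][j])
    return td[-1][-1]
```

As a usage example, `tree_distance(Node("R", children=[Node("A"), Node("B")]), Node("R", children=[Node("A")]))` evaluates to 1, the cost of deleting the single-label node B.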
For two 2D C-trees, $T_1$ and $T_2$, rooted at $R_1$ and $R_2$, respectively, the distance between them, denoted by $\delta(T_1, T_2)$, is computed as the value of TreeDist($R_1$, $R_2$). We can use the algorithm to compute the distance of 2D C-trees for solving the picture query problem. Consider two pictures, $P_1$ and $P_2$. Two 2D C-tree representations of $P_1$ ($P_2$), $T_{1X}$ ($T_{2X}$) and $T_{1Y}$ ($T_{2Y}$) along x-coordinate and y-coordinate, respectively, are constructed. We define the distance between $P_1$ and $P_2$ as follows.

DEFINITION 1. The distance between two pictures $P_1$ and $P_2$, $\delta(P_1, P_2)$, is $\delta(T_{1X}, T_{2X}) * \delta(T_{1Y}, T_{2Y})$. If $\delta(T_{1X}, T_{2X})$ is zero, then define $\delta(P_1, P_2) = \delta(T_{1Y}, T_{2Y})$. On the contrary, if $\delta(T_{1Y}, T_{2Y})$ is zero, then define $\delta(P_1, P_2) = \delta(T_{1X}, T_{2X})$.

FIG. 7. The 2D C-trees of query sketch q2.

We use the example picture f2 in Fig. 2a and query sketch q2 in Fig. 2b to demonstrate the computation of picture distance. The 2D C-trees of f2 and q2 along x-coordinate axis are in Figs. 4a and 7a correspondingly. The editing distance between these two trees is the cost of editing operations required to transform $f_{2X}$ to $q_{2X}$. At least two editing operations are needed. That is, $\Delta(D \to \Lambda)$ and $\Delta(B \to \varepsilon)$. So the tree distance $\delta(f_{2X}, q_{2X})$ is 2. Three delete operations, $\Delta(B \to \Lambda)$, $\Delta(D \to \Lambda)$, and $\Delta(\varepsilon \to \Lambda)$, are needed between $f_{2Y}$ in Fig. 4b and $q_{2Y}$ in Fig. 7b along y-coordinate. That is, the cost $\delta(f_{2Y}, q_{2Y})$ is 3. Finally, the distance $\delta(f_2, q_2)$ is 6.

Moreover, in [22] Zhang and Shasha had defined an LR-keyroots set of tree $T$, LR-keyroots($T$), to efficiently reduce the computation time of tree distance. The complexity of the algorithm is $O(|T_1| * |T_2| * \min(\mathrm{depth}(T_1), \mathrm{leaves}(T_1)) * \min(\mathrm{depth}(T_2), \mathrm{leaves}(T_2)))$. Let $\mathrm{depth}(T_1)$ denote the depth of the tree $T_1$ and $\mathrm{leaves}(T_1)$ denote the number of leaf nodes of the tree $T_1$. In general, the fast algorithm takes $O(n^4)$ for computing the editing distance between two trees consisting of $n$ nodes. The parallel algorithm has time complexity $O(|T_1| * |T_2|)$ by using $O(\min(|T_1|, |T_2|) * \mathrm{leaves}(T_1) * \mathrm{leaves}(T_2))$ processors.

Once all the tree distances between the query image and the images in the database have been computed, the most similar image(s) can be identified. Suppose there are $n$ images in the database, $P_1, P_2, \ldots, P_n$, and a query image $Q$. The most similar image(s) to $Q$ is

$$\{P_i \mid \delta(P_i, Q) \text{ is the minimum of } \delta(P_k, Q),\ 1 \le k \le n\}.$$

6. SUBPICTURE QUERY

Subpicture query is useful when a user cannot express queries in a precise way [5]. Sometimes we may ask "please retrieve images that contain this specific subpicture," or "I want some images that have some part like this query sketch." For example, the query image q2 in Fig. 2b is a subpicture of f2 in Fig. 2a. An approximate-tree-by-example (ATBE) system [19] developed by Wang et al. manipulates the inexact query by approximate tree matching. But the cutting and pruning operations that remove all the descendants of a node are somehow not suitable for subpicture query. The tree distance computation algorithm proposed in the previous section not only can support a simple measure for similarity retrieval, but also can be modified for a subpicture query. In essence, we adopt the tree-matching algorithm and modify the cost functions of editing operations as required.

(1) The cost of delete operation is weighted zero. Deleting a symbol from a reference image means that this symbol does not appear in the query image. For subpicture query, the symbols existing in the reference image may not be expressed in the query image or specified subpicture. In such a case, the superfluous symbols in the reference image can be ignored on purpose when they do not appear in the query image and are allowed to be deleted with zero cost. The cost of editing operation $A \to \Lambda$ is weighted zero; i.e., $\Delta(A \to \Lambda) = 0$.
(2) The cost of an insert operation $\Lambda \to B$, i.e., $\Delta(\Lambda \to B)$, is not changed because all the symbols of the query image should be considered. That is, $\Delta(\Lambda \to B) = N(B)$.
(3) The cost of a relabel operation $A \to B$, i.e., $\Delta(A \to B)$, is slightly changed and is defined as the number of unmatched symbols in $B$, which do not appear in $A$. That is, $\Delta(A \to B) = N(B/A)$. One special case is for $B = \varepsilon$. The cost of $\Delta(A \to \varepsilon)$ is redefined to be 0 because the symbol(s) in $A$ can be viewed as an empty-node in $B$.

Obviously, the newly defined cost functions, called partial cost functions, of editing operations do not obey the symmetry constraint of distance metric. Although the delete operation is not the inverse function of insert operation any more, the constraint of a distance metric is not our major concern for subpicture query. The partial cost functions directly affect computation of the tree distance, based upon the lemmas in the previous section. In Lemma 1, the second statement $\mathrm{ForestDist}(l(i_p)\,..\,i, \emptyset)$ is always zero because $\Delta(T_1[i] \to \Lambda) = 0$. In Lemma 2 and Lemma 3, the
second value of the first statement in the minimum group, We use the example picture f2in Fig. 2a and query sketch
q2 in Fig. 2b to demonstrate the computation of partial
i.e.,D(T1[i]R L), is removed because the delete operation
is weighted zero also. Consequently, the algorithm of par- distance. The 2D C-trees of f2and q2are in Fig. 4 and Fig.
7, correspondingly. For computing the partial tree distance tial tree distance is modified as follows.
between f2xand q2x, the first editingD(D R L) is a delete
operation having zero weight. The second editing D(B → ε) is a special case of the relabel operation and is also weighted zero. So the tree distance c(f2X, q2X) is 0 in the x-coordinate direction. The costs of the three delete operations, D(B → Λ), D(D → Λ), and D(ε → Λ), needed for transforming f2Y into q2Y along the y-coordinate are all weighted zero. That is, the cost c(f2Y, q2Y) is also 0. Finally, the distance c(f2, q2) is 0, which means that q2 is a subpicture of f2 with distance zero.

Analogously, the most similar image(s) containing a query subpicture Q among P1, P2, . . . , Pn is

    {Pi | c(Pi, Q) is the minimum of c(Pk, Q), 1 ≤ k ≤ n}.

7. SIMULATION RESULTS

To verify the effectiveness of similarity retrieval by 2D C-trees matching, a test consisting of 10 simulation pictures is evaluated. A symbolic image with random spatial relationships among objects can be generated by random generation of quadruple-values. Table 2 shows 10 randomly generated objects, A through J, with their bounds on the x-axis and y-axis, respectively.

We construct 10 simulation pictures, P1, P2, . . . , P10. Without loss of generality, assume that P1 contains a single object, the first object (A), that P2 contains the first two objects (A and B), and so on. The tenth picture, P10, contains all 10 objects. Fig. 8 depicts the symbolic images P3 and P4. Note that P3 is a subpicture of P4.

ALGORITHM 2. The computation of PartialTreeDist(x, y).
Input: Two subtrees, Tree(x) rooted at x in tree T1 and Tree(y) rooted at y in tree T2.
Output: The distance PartialTreeDist(x, y).
Begin
  ForestDist(∅, ∅) = 0;
  for i := u(x) to x
    ForestDist(u(x)..i, ∅) = 0;
  for j := u(y) to y
    ForestDist(∅, u(y)..j) = ForestDist(∅, u(y)..j − 1) + N(j);
  for i := u(x) to x
    for j := u(y) to y
      if u(i) = u(x) and u(j) = u(y), then
        ForestDist(u(x)..i, u(y)..j) = min{
          ForestDist(u(x)..i − 1, u(y)..j),
          ForestDist(u(x)..i, u(y)..j − 1) + N(j),
          ForestDist(u(x)..i − 1, u(y)..j − 1) + N(j/i)};
        PartialTreeDist(i, j) = ForestDist(u(x)..i, u(y)..j);
      else
        ForestDist(u(x)..i, u(y)..j) = min{
          ForestDist(u(x)..i − 1, u(y)..j),
          ForestDist(u(x)..i, u(y)..j − 1) + N(j),
          ForestDist(u(x)..u(i) − 1, u(y)..u(j) − 1) + PartialTreeDist(i, j)};
End;
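The recurrence of Algorithm 2 can be tried out with a short Python sketch. Several points are assumptions and not taken from the paper: the `Node` class, the helper names, the driver loop over key roots (borrowed from the Zhang and Shasha scheme cited as [22]), and the cost choices. In particular, N(j) is assumed to be the number of symbols in query node j, and N(j/i) the number of those symbols that are absent from reference node i. Deleting a reference node costs nothing, as in the algorithm.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    symbols: frozenset                      # object symbols; empty set = empty-node
    children: list = field(default_factory=list)


def index_tree(root):
    """Number nodes in postorder (1-based); u[i] is the leftmost leaf under node i."""
    symbols, u = [None], [0]

    def visit(node):
        leftmost = None
        for child in node.children:
            child_leftmost = visit(child)
            if leftmost is None:
                leftmost = child_leftmost
        symbols.append(node.symbols)
        idx = len(symbols) - 1
        u.append(idx if leftmost is None else leftmost)
        return u[idx]

    visit(root)
    return symbols, u


def keyroots(u):
    """Highest-numbered node for each distinct leftmost leaf (assumed driver order)."""
    last = {}
    for i in range(1, len(u)):
        last[u[i]] = i
    return sorted(last.values())


def partial_tree_dist(reference, query):
    a, ua = index_tree(reference)
    b, ub = index_tree(query)

    def insert_cost(j):                     # assumed meaning of N(j)
        return len(b[j])

    def relabel_cost(i, j):                 # assumed meaning of N(j/i)
        return len(b[j] - a[i])

    ptd = [[0] * len(b) for _ in a]         # PartialTreeDist(i, j)
    for x in keyroots(ua):
        for y in keyroots(ub):
            ex, ey = ua[x] - 1, ub[y] - 1   # indices that stand for the empty forest
            fd = {(ex, ey): 0}
            for i in range(ua[x], x + 1):
                fd[i, ey] = 0               # deleting reference nodes is free
            for j in range(ub[y], y + 1):
                fd[ex, j] = fd[ex, j - 1] + insert_cost(j)
            for i in range(ua[x], x + 1):
                for j in range(ub[y], y + 1):
                    delete = fd[i - 1, j]
                    insert = fd[i, j - 1] + insert_cost(j)
                    if ua[i] == ua[x] and ub[j] == ub[y]:
                        fd[i, j] = min(delete, insert,
                                       fd[i - 1, j - 1] + relabel_cost(i, j))
                        ptd[i][j] = fd[i, j]
                    else:
                        fd[i, j] = min(delete, insert,
                                       fd[ua[i] - 1, ub[j] - 1] + ptd[i][j])
    return ptd[len(a) - 1][len(b) - 1]


def leaf(symbol):
    return Node(frozenset(symbol))


# Hypothetical toy trees, not taken from the paper's figures.
big = Node(frozenset("R"), [leaf("A"), leaf("B"), leaf("C")])
small = Node(frozenset("R"), [leaf("A"), leaf("C")])
print(partial_tree_dist(big, small))   # 0: small is contained in big
print(partial_tree_dist(small, big))   # 1: B has to be inserted
```

The two calls show the asymmetry that the zero-weight delete introduces: a query that is contained in the reference costs nothing, while the reverse direction pays for every missing symbol.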
Then, we can use the partial tree-matching algorithm to compute the distance of 2D C-trees for solving the subpicture query problem. Let c(T1, T2) represent the partial tree distance between trees T1 and T2. The partial distance between P1 and P2 is defined as follows.

DEFINITION 2. The partial distance between two pictures P1 and P2, c(P1, P2), is c(T1X, T2X) * c(T1Y, T2Y). If c(T1X, T2X) is zero, then define c(P1, P2) = c(T1Y, T2Y). Conversely, if c(T1Y, T2Y) is zero, then define c(P1, P2) = c(T1X, T2X).

It could be foreseen that a picture with fewer objects is a subpicture of a picture with more objects in this experiment; i.e., Pi is always a subpicture of Pj for i ≤ j.

These 10 pictures are then represented by 2D C-trees. The 2D C-trees of P3, referred to as T3X and T3Y, are shown in Fig. 9, and the 2D C-trees of P4 in Fig. 10. To illustrate the computation of tree distances among these 10 pictures, the 2D C-tree representation is expressed by a recursive sentence a(a1a2a3 . . . an) for a node a that has n immediate descendants in the order a1, a2, a3, . . . , an. For example, T3X in Fig. 9 is represented
TABLE 2
The Simulation Data of Ten Objects

Label   A         B         C         D         E         F         G         H         I         J
x-axis  131, 347  165, 358  31, 641   192, 572  5, 358    80, 346   119, 470  213, 584  420, 610  364, 600
y-axis  167, 415  20, 332   217, 363  9, 241    36, 156   431, 466  208, 509  48, 50    355, 467  14, 545

FIG. 8. P3 is a subpicture of P4.
FIG. 10. The 2D C-trees of P4.

as R(C(εA(εB)Bε)). Note that R, ε, and the brackets [ ] denote the root of the tree, an empty-node, and a set-node, respectively. The 2D C-trees of the 10 simulation pictures are constructed and listed in Fig. 11.

In the sample database, the 10 generated pictures, P1, P2, . . . , P10, serve as reference pictures. We also use each Pi, 1 ≤ i ≤ 10, as a query picture to validate the correctness of the matching algorithm. Table 3 lists the exact query, for which we compute the tree distance d(reference, query). Table 4 shows the subpicture query, for which we compute the partial tree distance c(reference, query).

There are some interesting observations in the simulation results:

(1) Table 3 shows that the tree distance computation obeys the positivity and symmetry constraints of a distance metric. That is, for any Pi and Pj,
    (i) d(Pi, Pj) ≥ 0, and d(Pi, Pi) = 0 (positivity),
    (ii) d(Pi, Pj) = d(Pj, Pi) (symmetry).
The third constraint, (iii) d(Pi, Pk) ≤ d(Pi, Pj) + d(Pj, Pk) (triangle inequality), is not met by every triple in Table 3: for example, d(P1, P3) = 42 exceeds d(P1, P2) + d(P2, P3) = 9 + 12 = 21.

(2) For any Pj,
    (i) if i < j, then d(Pi, Pj) < d(Pi−1, Pj);
    (ii) if k > j, then d(Pk, Pj) < d(Pk+1, Pj).
Pj contains the first j objects. Pi contains the first i objects, which form a subset of the objects of Pj if i < j. Pi−1 contains all the objects of Pi excluding the ith object. The distance between Pi and Pj is always smaller than the distance between Pi−1 and Pj. For two pictures Pk and Pk+1 containing more objects than Pj, the distance between Pk and Pj is always smaller than the distance between Pk+1 and Pj. The above statements confirm that the computation of tree distances is suitable for measuring the similarity between two pictures. The smaller the distance between two pictures is, the more similar the two pictures are.

(3) It seems apparent that the partial tree distances in Table 4 are not symmetric. The values of the lower-triangle
FIG. 11. The 2D C-tree representations of 10 simulation pictures.

TABLE 3
The Tree Distance Result of Comparing Ten Simulation Pictures

Rows: query picture. Columns: reference picture. Entries: d.

         P1    P2    P3    P4    P5    P6    P7    P8    P9   P10
P1        0     9    42   110   169   255   418   594   750  1044
P2        9     0    12    56   100   168   304   456   594   858
P3       42    12     0    16    54    96   204   330   475   713
P4      110    56    16     0    16    40   120   220   345   551
P5      169   100    54    16     0     8    54   126   221   391
P6      255   168    96    40     8     0    20    70   143   285
P7      418   304   204   120    54    20     0    15    63   165
P8      594   456   330   220   126    70    15     0    20    88
P9      750   594   475   345   221   143    63    20     0    24
P10    1044   858   713   551   391   285   165    88    24     0
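The properties discussed for Table 3 can be checked mechanically. The following is a minimal sketch in Python that copies the table values as printed above; the function names are hypothetical. It tests positivity, symmetry, and observation (2), and it lists the triples (i, j, k) for which d(Pi, Pk) > d(Pi, Pj) + d(Pj, Pk), so that the triangle inequality can be inspected and not just assumed.

```python
TABLE_3 = [
    [0, 9, 42, 110, 169, 255, 418, 594, 750, 1044],
    [9, 0, 12, 56, 100, 168, 304, 456, 594, 858],
    [42, 12, 0, 16, 54, 96, 204, 330, 475, 713],
    [110, 56, 16, 0, 16, 40, 120, 220, 345, 551],
    [169, 100, 54, 16, 0, 8, 54, 126, 221, 391],
    [255, 168, 96, 40, 8, 0, 20, 70, 143, 285],
    [418, 304, 204, 120, 54, 20, 0, 15, 63, 165],
    [594, 456, 330, 220, 126, 70, 15, 0, 20, 88],
    [750, 594, 475, 345, 221, 143, 63, 20, 0, 24],
    [1044, 858, 713, 551, 391, 285, 165, 88, 24, 0],
]


def positivity(d):
    """d(Pi, Pj) >= 0 everywhere and d(Pi, Pi) = 0 on the diagonal."""
    n = len(d)
    return all(d[i][i] == 0 and min(d[i]) >= 0 for i in range(n))


def symmetry(d):
    n = len(d)
    return all(d[i][j] == d[j][i] for i in range(n) for j in range(n))


def triangle_violations(d):
    """1-based triples (i, j, k) with d(Pi, Pk) > d(Pi, Pj) + d(Pj, Pk)."""
    n = len(d)
    return [(i + 1, j + 1, k + 1)
            for i in range(n) for j in range(n) for k in range(n)
            if d[i][k] > d[i][j] + d[j][k]]


def observation_2(d):
    """Distances to Pj grow strictly as the other picture moves away from Pj."""
    n = len(d)
    above = all(d[i][j] < d[i - 1][j] for j in range(n) for i in range(1, j))
    below = all(d[k][j] < d[k + 1][j] for j in range(n) for k in range(j + 1, n - 1))
    return above and below


print(positivity(TABLE_3), symmetry(TABLE_3), observation_2(TABLE_3))  # True True True
print(triangle_violations(TABLE_3)[0])                                 # (1, 2, 3)
```

The first reported triple corresponds to d(P1, P3) = 42 against d(P1, P2) + d(P2, P3) = 21, so on these values the tree distance is better read as a dissimilarity score that respects positivity, symmetry, and the ordering of observation (2).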
are almost the same as those of the tree distances in Table 3, and the values of the upper-triangle are almost zero because the costs of delete operations are weighted zero. Basically, the partial tree distance behaves like the tree distance of Table 3 except for the symmetry constraint, owing to the partial cost functions defined. Note that some very small nonzero values appear in the upper-triangle when the cutting causes some objects to be segmented.

(4) The value in the upper-triangle of Table 4 represents the partial tree distance between one (reference) picture with more objects and another (query) picture with fewer objects, i.e., c(Pj, Pi) for two pictures Pj and Pi, j > i. Since this value is zero or very close to zero, Pi is viewed as a subpicture of Pj. For example, c(P4, P3) = 0 implies that P3 is a subpicture of P4 with zero cost. The result complies with the facts of the simulation.

The above evidence supports the validity of our tree-matching algorithms and the effectiveness of similarity retrieval by 2D C-trees matching.

8. PROTOTYPE SYSTEM

We apply the above mechanisms to implement an interactive video information system in our experimental project [11]. We capture 48 streams from the cartoon "The Lion King" produced by The Walt Disney Company and store the video data in AVI file format. Each stream takes about 99 s and consists of about 1500 frames. Some key image frames are identified in a human-assisted fashion for each of the video streams. Note that this work can benefit from the motion analysis of the recorded scene. These key images become representative of the streams. There are 351 key images in our experiment, and some are listed in Fig. 12. For this popular animation, 78 roles are chosen to be the objects, which are also extracted in a human-assisted fashion. These objects are represented by a set of designed icons in the system. The objects and their bounding rectangles within images are also extracted after capturing the image from the source video. Each image, containing about five objects on average, is constructed into two 2D C-trees
TABLE 4
The Partial Tree Distance Result of Comparing Ten Simulation Pictures

Rows: query picture. Columns: reference picture. Entries: c.

         P1    P2    P3    P4    P5    P6    P7    P8    P9   P10
P1        0     0     0     0     0     0     0     0     0     0
P2        9     0     0     0     0     0     0     0     0     0
P3       42    12     0     0     1     1     1     1     1     1
P4      110    56    16     0     3     2     2     2     2     2
P5      169   100    48    12     0     0     0     0     1     1
P6      255   168    96    36     8     0     0     0     1     1
P7      418   304   204   112    54    20     0     0     1     1
P8      594   456   330   209   126    70    15     0     2     2
P9      750   594   475   345   221   143    63    20     0     0
P10    1044   858   713   532   391   285   165    88    24     0
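As an illustration of the retrieval rule {Pi | c(Pi, Q) is the minimum}, the sketch below looks up the best references for a query directly in the Table 4 values. It assumes that rows are query pictures and columns are reference pictures, which is the orientation implied by c(P4, P3) = 0; the function name `most_similar` is hypothetical and stands in for a real computation of c over the stored 2D C-trees.

```python
TABLE_4 = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [9, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [42, 12, 0, 0, 1, 1, 1, 1, 1, 1],
    [110, 56, 16, 0, 3, 2, 2, 2, 2, 2],
    [169, 100, 48, 12, 0, 0, 0, 0, 1, 1],
    [255, 168, 96, 36, 8, 0, 0, 0, 1, 1],
    [418, 304, 204, 112, 54, 20, 0, 0, 1, 1],
    [594, 456, 330, 209, 126, 70, 15, 0, 2, 2],
    [750, 594, 475, 345, 221, 143, 63, 20, 0, 0],
    [1044, 858, 713, 532, 391, 285, 165, 88, 24, 0],
]


def most_similar(table, query):
    """Return the reference pictures with minimum partial distance to P<query>."""
    row = table[query - 1]            # row = query picture (assumed orientation)
    best = min(row)
    return [f"P{ref + 1}" for ref, dist in enumerate(row) if dist == best]


print(most_similar(TABLE_4, 3))    # ['P3', 'P4']
print(most_similar(TABLE_4, 10))   # ['P10']
```

For the query P3, both P3 and P4 are returned with distance zero, while the later pictures carry the small nonzero values that the text attributes to segmented objects.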
FIG. 13. An example query image.
along the x- and y-axis directions independently, and these 2D C-trees are stored in association with the source AVI file. The number of objects within each image implies the number of nodes in its corresponding 2D C-trees.

The system supports single image query and frame sequence query. The system allows users to draw a query image by assembling the designed object icons or to use a predefined query template consisting of a sequence of query frames. An example query image is shown in Fig. 13. The two 2D C-trees of each image in the query sequence are constructed first. Then, for each stream of the video database, we compute the partial distances between the stream and the query sequence. An approximate sequence matching (ASM) mechanism can compute the subsequence matching distance. The stream with the minimum distance represents the most similar stream for the query sequence. A result of the query template in Fig. 13 is shown in Fig. 14. Our initial results validate the effectiveness of similarity retrieval by 2D C-trees matching.

FIG. 14. A result of the query template in Fig. 13.

9. CONCLUSIONS

Similarity retrieval is one of the attractive functions of an image database system that distinguishes it from traditional database systems. The goal is to retrieve the images that are similar to the query image. Similarity retrieval based upon the minimum-distance criterion has been proposed in the techniques of 2D string matching, defined in terms of the longest common subsequence. However, the 2D string is deficient in describing the spatial knowledge of the nonzero sized objects with overlapping. The similarity retrieval of images using 2D C-strings adopted the maximum-likelihood approach, defined in terms of the maximum degree of object-pairs clique. The algorithm for similarity retrieval based on 2D C-strings actually finds a maximum clique and becomes an NP-complete problem, though a polynomial algorithm for the average case is available.

In this paper, we use an ordered labeled tree, the 2D C-tree, as the spatial representation for an image and propose the tree-matching algorithm for similarity retrieval. The algorithm provides a simple, fast comparison for computing the tree distance among images. The computation of the distance between 2D C-trees can be used to measure the similarity of images with spatial constraints. This approach provides an effective and efficient mechanism for similarity retrieval in image databases. The tree distance comparison algorithm is also modified to compute the partial tree distance for subpicture query. We also verify the validity of our tree-matching algorithms for similarity retrieval by simulation results. Moreover, the methodology of similarity retrieval is utilized in video sequence matching in our ongoing video information retrieval project.

APPENDIX: LIST OF SYMBOLS

Ab            the begin-bound of object A
Ae            the end-bound of object A
si            the ith immediate descendant of node S
ε             empty-node
T             a rooted tree
T[i]          the ith node of tree T in the postorder numbering
e1 . . ek     the editing operations
D             the cost function of editing operation
d(T1, T2)     the tree distance between tree T1 and tree T2
lca(i, j)     the least common ancestor node of T[i] and T[j]
N(T)          the number of symbols in tree T
u(i)          the postorder number of the leftmost leaf descendant in a subtree rooted at T[i]
(i)           the depth of T[i]
P(i)          the set of predecessors of T[i]
Pk(i)         the kth level predecessor of T[i]
T[i..j]       the ordered subforest of tree T induced by the nodes numbered from T[i] to T[j] inclusive
Forest(i)     an ordered subforest T[1..i]
Size(i)       the number of nodes in Tree(i)
|T|           the number of nodes in the tree T
depth(T)      the depth of the tree T
leaves(T)     the number of leaf nodes in the tree T
Pi            the ith picture in the database
d(P1, P2)     the tree distance between two pictures P1 and P2
c(P1, P2)     the partial distance between two pictures P1 and P2

REFERENCES

1. S. K. Chang, Principles of Pictorial Information Systems Design, Prentice–Hall, Englewood Cliffs, NJ, 1989.
2. S. K. Chang, E. Jungert, and Y. Li, Representation and retrieval of symbolic pictures using generalized 2D strings, in SPIE Proc. on Visual Communications and Image Processing, Philadelphia, 1989, pp. 1360–1372.
3. S. K. Chang, Q. Y. Shi, and C. W. Yan, Iconic indexing by 2-D strings, IEEE Trans. Pattern Anal. Mach. Intell. PAMI-9, 1987, 413–428.
4. S. K. Chang, C. W. Yan, D. C. Dimitrof, and T. Arndt, Intelligent image database system, IEEE Trans. Software Eng. 14, 1988, 681–688.
5. P. Ciaccia, F. Rabitti, and P. Zezula, Similarity search in multimedia database systems, in Proceedings, The First International Conference on Visual Information Systems (VISUAL'96), Melbourne, Australia, 1996, pp. 107–115.
6. M. Flickner, H. Sawhney, W. Niblack, J. Ashley, Q. Huang, B. Dom, M. Gorkani, J. Hafner, D. Lee, D. Petkovic, D. Steele, and P. Yanker, Query by image and video content: The QBIC system, IEEE Comput. 44, 1995, 23–32.
7. K. S. Fu, Syntactic Pattern Recognition and Application, Prentice–Hall, Englewood Cliffs, NJ, 1982.
8. V. N. Gudivada and V. V. Raghavan, Content-based image retrieval systems, IEEE Comput. 44, 1995, 18–22.
9. A. Gupta, T. Weymouth, and R. Jain, Semantic queries with pictures: The VIMSYS model, in Proceedings, The 17th International Conference on Very Large Data Bases, Barcelona, Spain, 1991, pp. 69–79.
10. F. J. Hsu, S. Y. Lee, and P. S. Lin, 2D C-tree spatial representation for iconic image, in Proceedings, The 2nd International Conference on Visual Information Systems, San Diego, CA, 1997, pp. 287–294.
11. F. J. Hsu, S. Y. Lee, and P. S. Lin, Video data indexing by 2D C-trees, J. Vis. Lang. Comput., submitted.
12. H. V. Jagadish, A. O. Mendelzon, and T. Milo, Similarity-based queries, in Proceedings, The 14th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, San Jose, CA, 1995, pp. 36–45.
13. S. Y. Lee and F. J. Hsu, 2D C-string: A new spatial knowledge representation for image database systems, Pattern Recognit. 23, 1990, 1077–1087.
14. S. Y. Lee and F. J. Hsu, Spatial reasoning and similarity retrieval of images using 2D C-string knowledge representation, Pattern Recognit. 25, 1992, 305–318.
15. S. Y. Lee, M. K. Shan, and W. P. Yang, Similarity retrieval of iconic image database, Pattern Recognit. 22, 1989, 675–682.
16. U. Manber, Introduction To Algorithms: A Creative Approach, Addison–Wesley, Reading, MA, 1989.
17. W. Rickert, Extracting area objects from raster image data, IEEE Comput. Graphics Appl. 13, 1993, 68–73.
18. A. Soffer and H. Samet, Pictorial queries by image similarity, in Proceedings, The 13th International Conference on Pattern Recognition, Vienna, Austria, 1996, pp. 114–119.
19. J. S. Wang, K. Zhang, K. Jeong, and D. Shasha, A system for approximate tree matching, IEEE Trans. Knowledge Data Eng. 6, 1994, 559–571.
20. K. Wakimoto, M. Shima, S. Tanaka, and A. Maeda, Content-based retrieval applied to drawing image database, SPIE 1908, 1993, 74–84.
21. J. K. Wu, A. D. Narasimhalu, B. M. Mehtre, C. P. Lam, and Y. J. Gao, CORE: A content-based retrieval engine for multimedia information systems, ACM Multimedia Syst. 3, 1995, 25–41.
22. K. Zhang and D. Shasha, Simple fast algorithms for the editing distance between trees and related problems, SIAM J. Comput. 18, 1989, 1245–1262.
23. K. Zhang, Algorithms for the constrained editing distance between ordered labeled trees and related problems, Pattern Recognit. 28, 1995, 465–474.

FANG-JUNG HSU received his B.S. degree in computer science from the Soochow University and his M.S. degree in information science from the Chiao Tung University, Taiwan in 1982 and 1989, respectively. He is currently a Ph.D. candidate in computer science and information engineering at Chiao Tung University. He has been a senior engineer at the Computer Communication Research Laboratories (CCL) of the Industrial Technology Research Institute (ITRI), Taiwan, since 1984. His current research interests include multimedia information systems, image/spatial databases, and object-oriented databases.

SUH-YIN LEE received her B.S.E.E. degree from the National Chiao Tung University, Taiwan in 1972 and her M.S. degree in computer science from the University of Washington, Seattle in 1975. She joined the faculty of the Department of Computer Engineering at Chiao Tung University in 1976 and received the Ph.D. degree in electronic engineering there in 1982. Dr. Lee is now a professor in the Department of Computer Science and Information Engineering at Chiao Tung University. She chaired the department from 1991 to 1993. Her current research interests include multimedia information systems, object-oriented databases, image/spatial databases, and computer networks. Dr. Lee is a member of Phi Tau Phi, the ACM, and the IEEE Computer Society.

BAO-SHUH LIN (S'76-M'79-SM'89) received the Ph.D. degree in computer science from the University of Illinois, Urbana, IL in 1980. He had been working as an R&D Manager of computer communication-related projects for AT&T Bell Laboratories, Racal Data Communications, Boeing, and Teknekron Communication Systems. He is currently the Deputy General Director of the Computer Communication Research Laboratories (CCL) of the Industrial Technology Research Institute (ITRI), Taiwan, R.O.C. He is the director of many advanced projects in CCL, including high-performance computer systems, multimedia systems, Chinese information technologies, and advanced information technologies. Dr. Lin is a senior member of IEEE Computer Society.