with x and w being the vectors of the membership grades. Furthermore, by admitting this form of generalization, we allow more than two information granules to form the respective mechanism of generalization or specialization (note that our previous construct was quite restrictive in this regard).
— A logic-based transformation of the membership grades involving more advanced constructs such as compensative operators, weighted means, a family of OWAs operators and the like.
This study can indicate a useful possibility of experimenting with fuzzy sets. It is quite evident that experimental studies concerning the use of real-world data are scarce. One can name only a few studies along this line, e.g., [14] and [15]. Practically, there are no experimental data sets available to experiment with, and this situation leads to an evident shortage of the ensuing experiment-oriented research. The proposed methodology of exploiting information granulation through fuzzy clustering and developing information granules of different size may be of some help by providing synthetic membership data to experiment with. Moreover, it could be used in a synthetic description of clustering results. This may pertain to the same clustering algorithm and results obtained for different numbers of clusters; in this sense we are interested in learning how these information granules generalize (or specialize) some other elements in the family of the granules. One can also envision another scenario where the results of clustering are generated by different clustering methods and one is looking for the relationship of generalization or specialization between the results produced by these methods.
REFERENCES
[1] C. Blake, E. Keogh, and C. J. Merz, UCI Repository of Machine Learning Databases. Irvine, CA: Univ. Calif., Dept. Inform. Comput. Sci.
[2] J. C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms. New York: Plenum, 1981.
[3] D. Butnariu and E. P. Klement, Triangular Norm-Based Measures and Games with Fuzzy Coalitions. Norwell, MA: Kluwer, 1993.
[4] D. Dubois and H. Prade, “Unfair coins and necessity measures: Toward a possibilistic interpretation of histograms,” Fuzzy Sets Syst., vol. 10, pp. 15–20, 1983.
[5] F. Höppner, F. Klawonn, R. Kruse, and T. Runkler, Fuzzy Cluster Analysis. New York: Wiley, 1999.
[6] G. J. Klir and B. Yuan, Fuzzy Sets and Fuzzy Logic: Theory and Applications. Englewood Cliffs, NJ: Prentice-Hall, 1995.
[7] S. Medasani, J. Kim, and R. Krishnapuram, “Estimation of membership functions for pattern recognition and computer vision,” in Fuzzy Logic and Its Applications to Engineering, Information Sciences, and Intelligent Systems, K. C. Min and Z. Bien, Eds. Norwell, MA: Kluwer, 1995, pp. 45–54.
[8] W. Pedrycz and F. Gomide, An Introduction to Fuzzy Sets. Analysis and Design. Cambridge, MA: MIT Press, 1998.
[9] W. Pedrycz, “Fuzzy equalization in the construction of fuzzy sets,” Fuzzy Sets Syst., submitted for publication.
[10] W. Pedrycz and G. Vukovich, “Data-based design of fuzzy sets,” J. Fuzzy Log. Intell. Syst., vol. 9, no. 3, 1999.
[11] T. L. Saaty, The Analytic Hierarchy Process. New York: McGraw-Hill, 1975.
[12] B. Schweizer and A. Sklar, Probabilistic Metric Spaces. Amsterdam, The Netherlands: North-Holland, 1983.
[13] S. Weber, “A general concept of fuzzy connectives, negations and implications based on t-norms,” Fuzzy Sets Syst., vol. 11, pp. 115–134, 1983.
[14] H. J. Zimmermann, Fuzzy Set Theory and Its Applications, 2nd ed. Norwell, MA: Kluwer, 1991.
[15] H. J. Zimmermann and P. Zysno, “Latent connectives in human decision making,” Fuzzy Sets Syst., vol. 4, pp. 37–51, 1980.
Document Retrieval Using Fuzzy-Valued Concept Networks
Shyi-Ming Chen, Yih-Jen Horng, and Chia-Hoang Lee
Abstract—This paper presents a new method for document retrieval using fuzzy-valued concept networks, where the relevant degrees between the concepts in a fuzzy-valued concept network are represented by arbitrary shapes of fuzzy numbers. There are two kinds of relevant relationships between any two concepts in a fuzzy-valued concept network, i.e., fuzzy positive association and fuzzy negative association. The relevant matrices and the relationship matrices are used to model the fuzzy-valued concept network. The elements in a relevant matrix represent the relevant degrees between concepts. The elements in a relationship matrix represent the relevant relationships between concepts. Furthermore, we also allow users’ queries to be represented by arbitrary shapes of fuzzy numbers and to use fuzzy positive association relationship and fuzzy negative association relationship for formulating their queries for increasing the flexibility of fuzzy information retrieval systems. We also present an information retrieval method in the Internet environment based on the network-type fuzzy-valued concept network architecture.
Index Terms—Document retrieval, fuzzy information retrieval, fuzzy numbers, fuzzy-valued concept networks, network-type fuzzy-valued concept networks, relationship matrices, relevant matrices.
I. INTRODUCTION
Most of the existing information retrieval systems are based on the traditional Boolean logic model [19]. Information retrieval systems based on the Boolean logic model all assume that the documents and the users’ queries are represented by precise index terms. This restricts these systems in practical applications, especially where the information is uncertain or fuzzy. In order to overcome the drawbacks of the traditional Boolean logic model, other models such as the probability model, the fuzzy set model, and the vector space model have been proposed [19]. Since the fuzzy set model can properly represent the inexact and uncertain knowledge of human beings, much research has been devoted to using fuzzy set theory in the design of fuzzy information retrieval systems, and many fuzzy information retrieval techniques have been presented, such as [1], [3]–[5], [8], [9], [11], [14]–[18], and [20].
In [15], Lucarella et al. presented an information retrieval method that uses fuzzy concept networks for knowledge representation. A fuzzy concept network consists of nodes and links. Each node in a fuzzy concept network represents a document or a concept, i.e., an index item or a topic of documents. Each link in a fuzzy concept network connects two concepts and is associated with a real value between 0 and 1 which represents the relevant degree between two concepts. By means of the fuzzy inference through fuzzy concept networks, the information retrieval systems are developed. Since the fuzzy inference through the fuzzy concept network is time consuming, in [3] we used concept matrices to model fuzzy concept networks and perform fuzzy inference through concept matrices instead of fuzzy concept network. Since the fuzzy inference through concept matrices
Manuscript received May 10, 1999; revised September 23, 2000. This work was supported by the National Science Council, R.O.C., under Grant NSC 88-2213-E-011-084. This paper was recommended by Associate Editor Teodorescu.
S.-M. Chen is with the Department of Electronic Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan, R.O.C. (e-mail: [email protected]).
Y.-J. Horng and C.-H. Lee are with the Department of Computer and Information Science, National Chiao Tung University, Hsinchu, Taiwan, R.O.C.
Publisher Item Identifier S 1083-4419(01)00092-9. 1083–4419/01$10.00 © 2001 IEEE
flexibility. In [5], we presented a method for fuzzy query processing for document retrieval based on extended fuzzy concept networks. However, the method presented in [5] also only allows the relevant degrees between concepts to be represented by real values between 0 and 1. Furthermore, the method presented in [5] is also restricted because users can only use real values between 0 and 1, rather than fuzzy numbers, to formulate their queries. In [4], we presented a method for fuzzy query processing for document retrieval, where the relevant degrees between concepts are restricted to trapezoidal fuzzy numbers, and the relevant relationships between concepts are restricted to the fuzzy positive relationship.
In this paper, we use fuzzy-valued concept networks to properly represent fuzzy knowledge for fuzzy information retrieval. A fuzzy-valued concept network consists of nodes and links. Each node represents a document or a concept, and each link between two nodes is associated with a tuple $(\tilde{\mu}, FR)$ that represents the relevance between the two nodes, where $\tilde{\mu}$ is a fuzzy number of arbitrary shape representing the relevant degree between the two nodes and $FR$ represents the relevant relationship between them. The relevant degree between any two nodes can be not only a real value between 0 and 1 but also a fuzzy number of arbitrary shape. Moreover, the relevant relationship between any two concepts can be not only a fuzzy positive association relationship but also a fuzzy negative association relationship. In order to reduce the time of fuzzy inference, we use relevant matrices and relationship matrices to model fuzzy-valued concept networks. The elements in a relevant matrix represent the relevant degrees between concepts. The elements in a relationship matrix represent the relevant relationships between concepts. Furthermore, we also allow users’ queries to be represented by arbitrary shapes of fuzzy numbers and to use the fuzzy positive association relationship and the fuzzy negative association relationship, which increases the flexibility of fuzzy information retrieval systems.
Furthermore, because of the Internet, the documents required by the users should not be bound to a single host computer. An intelligent information retrieval system must be able to help users get documents from different computers through the Internet when the required documents cannot be found on the computers where the users submit their query expressions. Thus, in this paper we also extend the proposed fuzzy-valued concept network architecture to the network-type fuzzy-valued concept network architecture and present a fuzzy information retrieval method based on network-type fuzzy-valued concept networks in the Internet environment.
The rest of this paper is organized as follows. In Section II, we present the fuzzy-valued concept network architecture for knowledge representation. In Section III, we present a method to model fuzzy-valued concept networks using relevant matrices and relationship matrices. In Section IV, we present an information retrieval method based on the fuzzy-valued concept networks. In Section V, we present a network-type fuzzy-valued concept network architecture for knowledge representation and present a fuzzy information retrieval method in the Internet environment based on the network-type fuzzy-valued concept network architecture. The conclusions are discussed in Section VI.
Fig. 1. Triangular fuzzy number.
Fig. 2. Trapezoidal fuzzy number.
II. FUZZY-VALUED CONCEPT NETWORKS
In this section, we briefly review the definition of fuzzy numbers [7] and the concepts of fuzzy positive association relationship and fuzzy negative association relationship from [13].
Definition 2.1: A fuzzy number $\tilde{A}$ is a fuzzy set defined in the universe of discourse $U$ that is both convex and normal. A fuzzy set $\tilde{A}$ is convex if and only if for all $u_1, u_2$ in $U$
$$f_{\tilde{A}}(\lambda u_1 + (1 - \lambda) u_2) \ge \min\left(f_{\tilde{A}}(u_1), f_{\tilde{A}}(u_2)\right) \qquad (1)$$
where $f_{\tilde{A}}$ is the membership function of the fuzzy set $\tilde{A}$, $f_{\tilde{A}}: U \to [0, 1]$, and $\lambda \in [0, 1]$. A fuzzy set $\tilde{A}$ is normal if there exists $u_i \in U$ such that $f_{\tilde{A}}(u_i) = 1$.
From Definition 2.1, we can see that a fuzzy number can have an arbitrary shape. For example, the triangular fuzzy number shown in Fig. 1 and the trapezoidal fuzzy number shown in Fig. 2 are the most often used fuzzy numbers. From Fig. 1, we can see that a triangular fuzzy number $\tilde{A}$ can be represented by a triplet $(a_1, a_2, a_3)$, i.e., $\tilde{A} = (a_1, a_2, a_3)$. From Fig. 2, we can see that a trapezoidal fuzzy number $\tilde{B}$ can be represented by a quadruple $(b_1, b_2, b_3, b_4)$, i.e., $\tilde{B} = (b_1, b_2, b_3, b_4)$.
In this paper, for convenience of explanation, we assume that the fuzzy numbers used in the fuzzy-valued concept network all have the “close to” shape. However, fuzzy numbers of arbitrary shapes are also allowed in the fuzzy-valued concept network. According to [7], a “close to $\alpha$” fuzzy number is shown in Fig. 3, where $\alpha$ is a real number or an integer.
The membership function of the fuzzy number “close to $\alpha$” is defined by
$$f_{\text{close to } \alpha}(u) = \frac{1}{1 + \left(\dfrac{u - \alpha}{\beta}\right)^2} \qquad (2)$$
where the crossover points are at $u = \alpha \pm \beta$, and the value of $\beta$ is the “half-width” of the curve at the crossover points. The larger the value of $\beta$, the wider the curve is. In this paper, we assume that the value of $\beta$ is 0.1.
Fig. 3. “Close to $\alpha$” fuzzy number.
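The membership function in (2) can be evaluated directly. The following is a minimal sketch, assuming the reconstruction of (2) with centre $\alpha$ and half-width $\beta$; the helper name `close_to` is hypothetical and not part of the paper.

```python
def close_to(alpha: float, beta: float = 0.1):
    """Return the membership function of the fuzzy number 'close to alpha'.

    beta is the half-width at the crossover points; 0.1 is the value
    assumed throughout the paper. The function name is illustrative only.
    """
    def membership(u: float) -> float:
        return 1.0 / (1.0 + ((u - alpha) / beta) ** 2)
    return membership


f = close_to(0.6)               # "close to 0.6"
peak = f(0.6)                   # membership grade at the centre
left, right = f(0.5), f(0.7)    # grades at the crossover points alpha -/+ beta
```

At the centre the grade is 1, and at $u = \alpha \pm \beta$ it drops to one half, which is what makes these the crossover points.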
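According to [12], a fuzzy number $\tilde{A}$ can be decomposed into its level sets (or $\lambda$-cuts), i.e.,
$$\tilde{A} = \int_0^1 \lambda \tilde{A}_\lambda \qquad (3)$$
where $\tilde{A}_\lambda = [a_1^{(\lambda)}, a_2^{(\lambda)}]$ is the $\lambda$-cut of $\tilde{A}$ and $\lambda \in [0, 1]$. Assume that there is another fuzzy number $\tilde{B}$
$$\tilde{B} = \int_0^1 \lambda \tilde{B}_\lambda \qquad (4)$$
where $\tilde{B}_\lambda = [b_1^{(\lambda)}, b_2^{(\lambda)}]$ is the $\lambda$-cut of $\tilde{B}$ and $\lambda \in [0, 1]$. Then, according to [12], the “OR” operation and the “AND” operation of the fuzzy numbers $\tilde{A}$ and $\tilde{B}$ are defined by
$$\tilde{A} \vee \tilde{B} = \int_0^1 \lambda \left[a_1^{(\lambda)} \vee b_1^{(\lambda)},\ a_2^{(\lambda)} \vee b_2^{(\lambda)}\right] \qquad (5)$$
$$\tilde{A} \wedge \tilde{B} = \int_0^1 \lambda \left[a_1^{(\lambda)} \wedge b_1^{(\lambda)},\ a_2^{(\lambda)} \wedge b_2^{(\lambda)}\right] \qquad (6)$$
where, applied to fuzzy numbers, “$\vee$” and “$\wedge$” are the “OR” operator and the “AND” operator, respectively; applied to the interval endpoints, “$\wedge$” is the minimum operator and “$\vee$” is the maximum operator; and $\lambda \in [0, 1]$.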
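The level-wise definitions in (5) and (6) can be sketched for “close to” fuzzy numbers, whose cuts have a closed form: solving $f(u) \ge \lambda$ in (2) gives $|u - \alpha| \le \beta\sqrt{1/\lambda - 1}$. The function names below are hypothetical, and the level 0 is skipped because the cut is unbounded there.

```python
import math


def close_to_cut(alpha, level, beta=0.1):
    """Cut [a1, a2] of 'close to alpha' at a level in (0, 1]."""
    half_width = beta * math.sqrt(1.0 / level - 1.0)
    return (alpha - half_width, alpha + half_width)


def fuzzy_or(cut_a, cut_b):
    """Endpoint-wise maximum of two cuts, as on the right-hand side of (5)."""
    return (max(cut_a[0], cut_b[0]), max(cut_a[1], cut_b[1]))


def fuzzy_and(cut_a, cut_b):
    """Endpoint-wise minimum of two cuts, as on the right-hand side of (6)."""
    return (min(cut_a[0], cut_b[0]), min(cut_a[1], cut_b[1]))


levels = [0.25, 0.5, 0.75, 1.0]
or_cuts = {lv: fuzzy_or(close_to_cut(0.6, lv), close_to_cut(0.7, lv)) for lv in levels}
and_cuts = {lv: fuzzy_and(close_to_cut(0.6, lv), close_to_cut(0.7, lv)) for lv in levels}
```

Because both numbers share the same $\beta$, every cut of the OR result coincides with the cut of “close to 0.7” and every cut of the AND result with that of “close to 0.6”. Under that assumption, OR and AND of “close to” numbers reduce to the maximum and minimum of their centres.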
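Definition 2.2: Let $\tilde{A}$ and $\tilde{B}$ be two fuzzy numbers of the universe of discourse $U$ with membership functions $f_{\tilde{A}}$ and $f_{\tilde{B}}$, respectively, where $f_{\tilde{A}}: U \to [0, 1]$ and $f_{\tilde{B}}: U \to [0, 1]$. If $\forall u_i \in U$, $f_{\tilde{A}}(u_i) = f_{\tilde{B}}(u_i)$, then the fuzzy numbers $\tilde{A}$ and $\tilde{B}$ are called equal, i.e., $\tilde{A} = \tilde{B}$.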
In the following, we briefly review the concepts of fuzzy positive association relationship and fuzzy negative association relationship from [13].
1) Fuzzy positive association: It relates concepts that have, in some contexts, a similar meaning, e.g., person $\leftrightarrow$ individual, or that are typically used in the same context, e.g., person $\leftrightarrow$ address.
2) Fuzzy negative association: It relates concepts that are complementary, e.g., male $\leftrightarrow$ female; incompatible, e.g., unemployed $\leftrightarrow$ freelance; or antonyms, e.g., small $\leftrightarrow$ large.
Definition 2.3: A fuzzy-valued concept network can be represented as $EFCN(N, L)$, where $N$ is a set of nodes, each node stands for a concept or a document, and $L$ is a set of directed edges between nodes. If $\ell \in L$, then $\ell$ is associated with a tuple $(\tilde{\mu}, FR)$, where $\tilde{\mu}$ represents the degree of linking strength between the nodes and its value is a fuzzy number. $FR$ is the relationship between the two nodes linked by the directed edge $\ell$, and $FR \in \{P, N\}$, where $P$ stands for the fuzzy positive association relationship and $N$ stands for the fuzzy negative association relationship.
Example 2.1: Assume that there is a fuzzy-valued concept network as shown in Fig. 4.
Fig. 4. Example of fuzzy-valued concept network.
From Fig. 4, we can see that the relationship between concept $c_4$ and concept $c_1$ is a fuzzy positive association relationship with the relevant degree $\widetilde{0.6}$, i.e., close to 0.6; the relationship between concept $c_1$ and concept $c_3$ is a fuzzy negative association relationship with the relevant degree $\widetilde{0.7}$, i.e., close to 0.7; concept $c_5$ and concept $c_2$ are linked by a fuzzy positive association relationship with the relevant degree $\widetilde{0.9}$, i.e., close to 0.9. Document $d_2$ contains concept $c_4$ with the relevant degree $\widetilde{0.9}$, i.e., close to 0.9, and contains concept $c_5$ with the relevant degree $\widetilde{0.6}$, i.e., close to 0.6.
III. RELEVANT MATRICES AND RELATIONSHIP MATRICES
In this section, we present a method to model fuzzy-valued concept networks using relevant matrices and relationship matrices. The definitions of the transitive closure of relevant matrices and the transitive closure of relationship matrices are also presented in this section.
Definition 3.1: The relevant matrix $V$ is a fuzzy matrix, where the element $v_{ij}$ represents the relevant degree between concept $c_i$ and concept $c_j$ in a fuzzy-valued concept network, and $v_{ij}$ is a fuzzy number. If $v_{ij} = \tilde{0}$, then the relevant degree between concept $c_i$ and concept $c_j$ is not given by the experts in the fuzzy-valued concept network.
Definition 3.2: Let $P$ and $Q$ be two relevant matrices with elements denoted by $p_{ij}$ and $q_{ij}$, respectively, where $p_{ij}$ and $q_{ij}$ are fuzzy numbers and $1 \le i, j \le n$. If $\forall i\, \forall j$, $p_{ij} = q_{ij}$, then the relevant matrices $P$ and $Q$ are called equal, i.e., $P = Q$.
Definition 3.3: Assume that $V$ is a relevant matrix
$$V = \begin{bmatrix} v_{11} & v_{12} & \cdots & v_{1n} \\ v_{21} & v_{22} & \cdots & v_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ v_{n1} & v_{n2} & \cdots & v_{nn} \end{bmatrix}$$
where $n$ is the number of concepts in a fuzzy-valued concept network. The product $V^2 = V \otimes V$ is given by (7), where $\vee$ and $\wedge$ are the “OR” operator and “AND” operator of fuzzy numbers, respectively. Then there exists a positive integer $p$, $p \le n - 1$, such that $V^p = V^{p+1} = V^{p+2} = \cdots$. Let $T = V^p$; then $T$ is called the transitive closure of the relevant matrix $V$.
Definition 3.4: The relationship matrix $R$ is a fuzzy matrix, where the element $r_{ij}$ represents the relationship between concept $c_i$ and concept $c_j$ in a fuzzy-valued concept network and $r_{ij} \in \{P, N, Z\}$, where $P$ stands for the fuzzy positive association relationship, $N$ stands for the fuzzy negative association relationship, and $Z$ stands for the unknown relationship. If $r_{ij} = Z$, then the relationship between concept $c_i$ and concept $c_j$ is not given by the experts in the fuzzy-valued concept network.
Definition 3.5: Let $R$ and $S$ be two relationship matrices with elements represented by $r_{ij}$ and $s_{ij}$, respectively, where $r_{ij} \in \{P, N, Z\}$, $s_{ij} \in \{P, N, Z\}$, and $1 \le i, j \le n$. If $r_{ij} = s_{ij}$ for all $i$ and $j$, then the relationship matrices $R$ and $S$ are called equal, i.e., $R = S$.
Definition 3.6: Assume that $R$ is a relationship matrix and
$$R = \begin{bmatrix} r_{11} & r_{12} & \cdots & r_{1n} \\ r_{21} & r_{22} & \cdots & r_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ r_{n1} & r_{n2} & \cdots & r_{nn} \end{bmatrix}$$
where $n$ is the number of concepts in the fuzzy-valued concept network, $r_{ij} \in \{P, N, Z\}$, $1 \le i \le n$, and $1 \le j \le n$. The product $R^2 = R * R$ is given by (8), where “$\vee$” is the operator that chooses the fuzzy relationship whose priority is the highest. In this paper, we give the first priority to the fuzzy negative association relationship ($N$), the fuzzy positive association relationship ($P$) gets the second priority, and the unknown relationship ($Z$) gets the lowest priority, i.e., $N > P > Z$. “$\wedge$” is the operator that chooses the combination of two relationships according to Table I. Then, there exists a positive integer $p$, $p \le n - 1$, such that $R^p = R^{p+1} = R^{p+2} = \cdots$. Let $L = R^p$; then $L$ is called the transitive closure of the relationship matrix $R$.
Example 3.1: Assume that there is a fuzzy-valued concept network as shown in Fig. 5.
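Fig. 5. Fuzzy-valued concept network used in Example 3.1.
Then, we can use the relevant matrix $V$ and the relationship matrix $R$ shown as follows to model the fuzzy-valued concept network:
$$V = \begin{array}{c|ccccc} & c_1 & c_2 & c_3 & c_4 & c_5 \\ \hline c_1 & \tilde{1} & \tilde{0} & \widetilde{0.7} & \widetilde{0.6} & \tilde{0} \\ c_2 & \tilde{0} & \tilde{1} & \tilde{0} & \tilde{0} & \widetilde{0.9} \\ c_3 & \widetilde{0.7} & \tilde{0} & \tilde{1} & \tilde{0} & \tilde{0} \\ c_4 & \widetilde{0.6} & \tilde{0} & \tilde{0} & \tilde{1} & \tilde{0} \\ c_5 & \tilde{0} & \widetilde{0.9} & \tilde{0} & \tilde{0} & \tilde{1} \end{array}$$
$$R = \begin{array}{c|ccccc} & c_1 & c_2 & c_3 & c_4 & c_5 \\ \hline c_1 & P & Z & N & N & Z \\ c_2 & Z & P & Z & Z & P \\ c_3 & N & Z & P & Z & Z \\ c_4 & N & Z & Z & P & Z \\ c_5 & Z & P & Z & Z & P \end{array}$$
According to Definition 3.3 and Definition 3.6, we can obtain the transitive closure $T$ of the relevant matrix $V$ and the transitive closure $L$ of the relationship matrix $R$ shown as follows:
$$T = \begin{array}{c|ccccc} & c_1 & c_2 & c_3 & c_4 & c_5 \\ \hline c_1 & \tilde{1} & \tilde{0} & \widetilde{0.7} & \widetilde{0.6} & \tilde{0} \\ c_2 & \tilde{0} & \tilde{1} & \tilde{0} & \tilde{0} & \widetilde{0.9} \\ c_3 & \widetilde{0.7} & \tilde{0} & \tilde{1} & \widetilde{0.6} & \tilde{0} \\ c_4 & \widetilde{0.6} & \tilde{0} & \widetilde{0.6} & \tilde{1} & \tilde{0} \\ c_5 & \tilde{0} & \widetilde{0.9} & \tilde{0} & \tilde{0} & \tilde{1} \end{array}$$
$$L = \begin{array}{c|ccccc} & c_1 & c_2 & c_3 & c_4 & c_5 \\ \hline c_1 & P & Z & N & N & Z \\ c_2 & Z & P & Z & Z & P \\ c_3 & N & Z & P & P & Z \\ c_4 & N & Z & P & P & Z \\ c_5 & Z & P & Z & Z & P \end{array}$$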
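The closures in Example 3.1 can be checked numerically. The sketch below rests on two assumptions that are not stated in this form in the text. First, every fuzzy number is a “close to” number with the same half-width, so it is represented by its centre and the fuzzy OR and AND reduce to `max` and `min`. Second, Table I is not reproduced here, so `rel_and` is a stand-in inferred from the example: $Z$ with anything gives $Z$, equal signs give $P$, and different signs give $N$. All function names are hypothetical.

```python
PRIORITY = {"Z": 0, "P": 1, "N": 2}  # priority order N > P > Z


def rel_or(a, b):
    """Keep the relationship with the higher priority."""
    return a if PRIORITY[a] >= PRIORITY[b] else b


def rel_and(a, b):
    """ASSUMPTION: stand-in for Table I, inferred from Example 3.1."""
    if "Z" in (a, b):
        return "Z"
    return "P" if a == b else "N"


def compose(A, B, or_op, and_op, zero):
    """Entry (i, j) is the OR over k of (A[i][k] AND B[k][j])."""
    result = []
    for row in A:
        new_row = []
        for j in range(len(B[0])):
            acc = zero
            for k, value in enumerate(row):
                acc = or_op(acc, and_op(value, B[k][j]))
            new_row.append(acc)
        result.append(new_row)
    return result


def transitive_closure(M, or_op, and_op, zero):
    """Multiply by M until the power stops changing (at most n - 1 products)."""
    current = M
    for _ in range(len(M) - 1):
        nxt = compose(current, M, or_op, and_op, zero)
        if nxt == current:
            break
        current = nxt
    return current


# Centres of the "close to" numbers in V, and the labels in R (Example 3.1).
V = [[1.0, 0.0, 0.7, 0.6, 0.0],
     [0.0, 1.0, 0.0, 0.0, 0.9],
     [0.7, 0.0, 1.0, 0.0, 0.0],
     [0.6, 0.0, 0.0, 1.0, 0.0],
     [0.0, 0.9, 0.0, 0.0, 1.0]]
R = [list("PZNNZ"), list("ZPZZP"), list("NZPZZ"), list("NZZPZ"), list("ZPZZP")]

T = transitive_closure(V, max, min, 0.0)
L = transitive_closure(R, rel_or, rel_and, "Z")
```

Under these assumptions the only new entries are those between $c_3$ and $c_4$: a degree close to 0.6 and a positive relationship, both obtained through $c_1$.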
IV. FUZZY QUERY PROCESSING FOR DOCUMENT RETRIEVAL USING FUZZY-VALUED CONCEPT NETWORKS
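In this section, we present a method for fuzzy query processing for document retrieval using fuzzy-valued concept networks. First, we introduce the definitions of document descriptor relevant matrices and document descriptor relationship matrices.
$$V^2 = V \otimes V = \begin{bmatrix} \bigvee\limits_{i=1,\ldots,n}(v_{1i} \wedge v_{i1}) & \bigvee\limits_{i=1,\ldots,n}(v_{1i} \wedge v_{i2}) & \cdots & \bigvee\limits_{i=1,\ldots,n}(v_{1i} \wedge v_{in}) \\ \bigvee\limits_{i=1,\ldots,n}(v_{2i} \wedge v_{i1}) & \bigvee\limits_{i=1,\ldots,n}(v_{2i} \wedge v_{i2}) & \cdots & \bigvee\limits_{i=1,\ldots,n}(v_{2i} \wedge v_{in}) \\ \vdots & \vdots & \ddots & \vdots \\ \bigvee\limits_{i=1,\ldots,n}(v_{ni} \wedge v_{i1}) & \bigvee\limits_{i=1,\ldots,n}(v_{ni} \wedge v_{i2}) & \cdots & \bigvee\limits_{i=1,\ldots,n}(v_{ni} \wedge v_{in}) \end{bmatrix} \qquad (7)$$
$$R^2 = R * R = \begin{bmatrix} \bigvee\limits_{i=1,\ldots,n}(r_{1i} \wedge r_{i1}) & \bigvee\limits_{i=1,\ldots,n}(r_{1i} \wedge r_{i2}) & \cdots & \bigvee\limits_{i=1,\ldots,n}(r_{1i} \wedge r_{in}) \\ \bigvee\limits_{i=1,\ldots,n}(r_{2i} \wedge r_{i1}) & \bigvee\limits_{i=1,\ldots,n}(r_{2i} \wedge r_{i2}) & \cdots & \bigvee\limits_{i=1,\ldots,n}(r_{2i} \wedge r_{in}) \\ \vdots & \vdots & \ddots & \vdots \\ \bigvee\limits_{i=1,\ldots,n}(r_{ni} \wedge r_{i1}) & \bigvee\limits_{i=1,\ldots,n}(r_{ni} \wedge r_{i2}) & \cdots & \bigvee\limits_{i=1,\ldots,n}(r_{ni} \wedge r_{in}) \end{bmatrix} \qquad (8)$$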
Definition 4.1: Let $D$ be a set of documents in a fuzzy-valued concept network, $D = \{d_1, d_2, \ldots, d_m\}$, and let $C$ be a set of concepts in a fuzzy-valued concept network, $C = \{c_1, c_2, \ldots, c_n\}$. Then, the document descriptor relevant matrix $E$ is shown as follows:
$$E = \begin{array}{c|cccc} & c_1 & c_2 & \cdots & c_n \\ \hline d_1 & e_{11} & e_{12} & \cdots & e_{1n} \\ d_2 & e_{21} & e_{22} & \cdots & e_{2n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ d_m & e_{m1} & e_{m2} & \cdots & e_{mn} \end{array}$$
where $m$ is the number of documents, $n$ is the number of concepts, $e_{ij}$ stands for the relevant degree between document $d_i$ and concept $c_j$, $e_{ij}$ is a fuzzy number, $1 \le i \le m$, and $1 \le j \le n$.
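Definition 4.2: The document descriptor relationship matrix $F$ is shown as follows:
$$F = \begin{array}{c|cccc} & c_1 & c_2 & \cdots & c_n \\ \hline d_1 & f_{11} & f_{12} & \cdots & f_{1n} \\ d_2 & f_{21} & f_{22} & \cdots & f_{2n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ d_m & f_{m1} & f_{m2} & \cdots & f_{mn} \end{array}$$
where $f_{ij}$ stands for the fuzzy relationship between document $d_i$ and concept $c_j$, and $f_{ij} \in \{P, N, Z\}$.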
However, the experts may forget to set the relevant degrees and relationships between some documents and some concepts. Since the implicit relevant degrees and relationships between concepts can be obtained from the transitive closure $T$ of the relevant matrix $V$ and the transitive closure $L$ of the relationship matrix $R$, we can use $T$ and $L$ to get the implicit relevant degrees and relationships between documents and concepts. Let $E^* = E \otimes T$; then $E^*$ includes the implicit relevant degrees between documents and concepts. Let $F^* = F * L$; then $F^*$ includes the implicit relevant relationships between documents and concepts. $E^*$ and $F^*$ will then be used as a basis for similarity measures between queries and documents. Each row of $E^*$ can be thought of as a document descriptor relevant vector, and each row of $F^*$ can be thought of as a document descriptor relationship vector.
The user’s query $Q$ can be represented by a query descriptor relevant vector $qv$ and a query descriptor relationship vector $qr$ shown as follows:
$$qv = \langle \tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_n \rangle$$
$$qr = \langle y_1, y_2, \ldots, y_n \rangle$$
where $\tilde{x}_i$ is the relevant degree between the desired documents and concept $c_i$, $\tilde{x}_i$ is a fuzzy number, and $1 \le i \le n$; $y_i$ is the relationship between the desired documents and concept $c_i$, and $y_i \in \{P, N\}$. If $y_i = P$, then the desired documents should contain concept $c_i$; if $y_i = N$, then the desired documents should not contain concept $c_i$. Moreover, if the user does not set the values of $\tilde{x}_i$ and $y_i$, then concept $c_i$ is regarded as neglected by the user, and $\tilde{x}_i$ and $y_i$ are labeled “-”. That is, the user “does not care” whether the retrieved documents contain concept $c_i$ or not.
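Assume that there are two tuples, i.e., $\langle \tilde{A}, B \rangle$ and $\langle \tilde{C}, D \rangle$, where $\tilde{A}$ and $\tilde{C}$ are fuzzy numbers, $B \in \{P, N, Z\}$, and $D \in \{P, N, Z\}$. Then the degree of similarity between $\langle \tilde{A}, B \rangle$ and $\langle \tilde{C}, D \rangle$ can be calculated by
$$Y(\langle \tilde{A}, B \rangle, \langle \tilde{C}, D \rangle) = \begin{cases} 0, & \text{if } B \ne D \\ 1 - \sup\limits_{\lambda} \dfrac{\left|a_1^{(\lambda)} - c_1^{(\lambda)}\right| + \left|a_2^{(\lambda)} - c_2^{(\lambda)}\right|}{2}, & \text{if } B = D \end{cases} \qquad (9)$$
where $[a_1^{(\lambda)}, a_2^{(\lambda)}]$ and $[c_1^{(\lambda)}, c_2^{(\lambda)}]$ are the $\lambda$-cuts of $\tilde{A}$ and $\tilde{C}$, respectively, $\lambda \in [0, 1]$, and $Y(\langle \tilde{A}, B \rangle, \langle \tilde{C}, D \rangle) \in [0, 1]$.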
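The similarity measure (9) can be approximated numerically for “close to” fuzzy numbers. The sketch below assumes that both numbers share the half-width $\beta = 0.1$ and that the supremum is taken over a finite grid of levels in $(0, 1]$; level 0 is excluded because the cut of a “close to” number is unbounded there. The function names are hypothetical.

```python
import math


def close_to_cut(center, level, beta=0.1):
    """Cut [lo, hi] of the fuzzy number 'close to center' at a level in (0, 1]."""
    half_width = beta * math.sqrt(1.0 / level - 1.0)
    return (center - half_width, center + half_width)


def similarity(a_center, a_rel, c_center, c_rel, beta=0.1, steps=100):
    """Grid approximation of Y in (9) for two <fuzzy number, relationship> tuples."""
    if a_rel != c_rel:
        return 0.0                       # different relationships: no similarity
    worst = 0.0
    for s in range(1, steps + 1):
        level = s / steps                # levels 0.01, 0.02, ..., 1.0
        a1, a2 = close_to_cut(a_center, level, beta)
        c1, c2 = close_to_cut(c_center, level, beta)
        worst = max(worst, (abs(a1 - c1) + abs(a2 - c2)) / 2.0)
    return 1.0 - worst


same = similarity(0.6, "P", 0.6, "P")        # identical tuples
near = similarity(0.9, "P", 0.8, "P")        # centres 0.1 apart, same relationship
opposite = similarity(0.6, "N", 0.6, "P")    # same degree, different relationship
```

With equal half-widths both endpoints of every cut are shifted by the same amount, so for matching relationships the measure reduces to one minus the distance between the centres.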
Assume that the document descriptor relevant vector $dv_i$ and the document descriptor relationship vector $dr_i$ are represented as follows:
$$dv_i = \langle \tilde{v}_{i1}, \tilde{v}_{i2}, \ldots, \tilde{v}_{in} \rangle$$
$$dr_i = \langle r_{i1}, r_{i2}, \ldots, r_{in} \rangle.$$
Then, the degree of satisfaction that document $d_i$ satisfies the user’s query $Q$ can be evaluated by (10), where $qv(j)$ is the $j$th element of the query descriptor relevant vector $qv$; $qr(j)$ is the $j$th element of the query descriptor relationship vector $qr$; $1 \le j \le n$; $RS(d_i) \in [0, 1]$; and $k$ is the number of concepts not neglected by the user’s query.
The information retrieval system displays every document whose degree of satisfaction is greater than a threshold value in $[0, 1]$, in sequential order from the document with the highest degree of satisfaction to that with the lowest one.
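Example 4.1: Assume that we have the same fuzzy-valued concept network as shown in Example 3.1, and there are four documents $d_1, d_2, d_3, d_4$ as shown in Fig. 5. Then, the document descriptor relevant matrix $E$ and the document descriptor relationship matrix $F$ are as follows:
$$E = \begin{array}{c|ccccc} & c_1 & c_2 & c_3 & c_4 & c_5 \\ \hline d_1 & \tilde{0} & \tilde{0} & \widetilde{0.6} & \tilde{0} & \tilde{0} \\ d_2 & \tilde{0} & \tilde{0} & \widetilde{0.9} & \widetilde{0.5} & \tilde{0} \\ d_3 & \tilde{0} & \tilde{0} & \tilde{0} & \widetilde{0.6} & \widetilde{0.9} \\ d_4 & \tilde{0} & \tilde{0} & \tilde{0} & \tilde{0} & \tilde{1} \end{array}$$
$$F = \begin{array}{c|ccccc} & c_1 & c_2 & c_3 & c_4 & c_5 \\ \hline d_1 & Z & Z & N & Z & Z \\ d_2 & Z & Z & P & P & Z \\ d_3 & Z & Z & Z & P & P \\ d_4 & Z & Z & Z & Z & P \end{array}$$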
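The transitive closure $T$ of the relevant matrix $V$ and the transitive closure $L$ of the relationship matrix $R$ have been obtained as shown in Example 3.1. Because $E^* = E \otimes T$ and $F^* = F * L$, we can obtain $E^*$ and $F^*$ shown as follows:
$$E^* = \begin{array}{c|ccccc} & c_1 & c_2 & c_3 & c_4 & c_5 \\ \hline d_1 & \widetilde{0.6} & \tilde{0} & \widetilde{0.6} & \widetilde{0.6} & \tilde{0} \\ d_2 & \widetilde{0.7} & \tilde{0} & \widetilde{0.9} & \widetilde{0.6} & \tilde{0} \\ d_3 & \widetilde{0.6} & \widetilde{0.9} & \widetilde{0.6} & \widetilde{0.6} & \widetilde{0.9} \\ d_4 & \tilde{0} & \widetilde{0.9} & \tilde{0} & \tilde{0} & \tilde{1} \end{array}$$
$$F^* = \begin{array}{c|ccccc} & c_1 & c_2 & c_3 & c_4 & c_5 \\ \hline d_1 & P & Z & N & N & Z \\ d_2 & N & Z & P & P & Z \\ d_3 & N & P & P & P & P \\ d_4 & Z & P & Z & Z & P \end{array}$$
$$RS(d_i) = \frac{\sum\limits_{\substack{j = 1, \ldots, n \\ qv(j) \ne \text{“-”} \text{ and } qr(j) \ne \text{“-”}}} Y(\langle \tilde{v}_{ij}, r_{ij} \rangle, \langle \tilde{x}_j, y_j \rangle)}{k} \qquad (10)$$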
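The products $E^* = E \otimes T$ and $F^* = F * L$ of Example 4.1 can be reproduced with the same OR/AND composition used for the closures. As before, this is a sketch under assumptions: fuzzy numbers are represented by the centres of equal-width “close to” numbers, and `rel_and` is a stand-in for Table I inferred from the examples. All function names are hypothetical.

```python
PRIORITY = {"Z": 0, "P": 1, "N": 2}  # priority order N > P > Z


def rel_or(a, b):
    """Keep the relationship with the higher priority."""
    return a if PRIORITY[a] >= PRIORITY[b] else b


def rel_and(a, b):
    """ASSUMPTION: stand-in for Table I, inferred from the examples."""
    if "Z" in (a, b):
        return "Z"
    return "P" if a == b else "N"


def compose(A, B, or_op, and_op, zero):
    """Entry (i, j) is the OR over k of (A[i][k] AND B[k][j])."""
    return [[_fold(row, [B_row[j] for B_row in B], or_op, and_op, zero)
             for j in range(len(B[0]))]
            for row in A]


def _fold(row, column, or_op, and_op, zero):
    acc = zero
    for a, b in zip(row, column):
        acc = or_op(acc, and_op(a, b))
    return acc


# Transitive closures T and L from Example 3.1 (centres and labels).
T = [[1.0, 0.0, 0.7, 0.6, 0.0],
     [0.0, 1.0, 0.0, 0.0, 0.9],
     [0.7, 0.0, 1.0, 0.6, 0.0],
     [0.6, 0.0, 0.6, 1.0, 0.0],
     [0.0, 0.9, 0.0, 0.0, 1.0]]
L = [list("PZNNZ"), list("ZPZZP"), list("NZPPZ"), list("NZPPZ"), list("ZPZZP")]

# Document descriptor matrices E and F from Example 4.1 (rows d1..d4).
E = [[0.0, 0.0, 0.6, 0.0, 0.0],
     [0.0, 0.0, 0.9, 0.5, 0.0],
     [0.0, 0.0, 0.0, 0.6, 0.9],
     [0.0, 0.0, 0.0, 0.0, 1.0]]
F = [list("ZZNZZ"), list("ZZPPZ"), list("ZZZPP"), list("ZZZZP")]

E_star = compose(E, T, max, min, 0.0)
F_star = compose(F, L, rel_or, rel_and, "Z")
```

For instance, document $d_1$ is explicitly linked only to $c_3$ (negatively, degree close to 0.6); through the closure it acquires a positive link to $c_1$ and a negative link to $c_4$, each with degree close to 0.6.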
Fig. 6. Architecture of the network-type fuzzy-value concept network.
Assume that the user’s query $Q$ is represented by the query descriptor relevant vector $qv$ and the query descriptor relationship vector $qr$ shown as follows:
$$qv = \langle \widetilde{0.6}, -, \widetilde{0.8}, \widetilde{0.6}, \widetilde{0.7} \rangle$$
$$qr = \langle P, -, P, N, P \rangle.$$
Then, based on (10), we can get
$$RS(d_1) = \frac{1 + 0 + 1 + 0}{4} = \frac{2}{4} = 0.5$$
$$RS(d_2) = \frac{0 + 0.9 + 0 + 0}{4} = \frac{0.9}{4} = 0.225$$
$$RS(d_3) = \frac{0 + 0.8 + 0 + 0.8}{4} = \frac{1.6}{4} = 0.4$$
$$RS(d_4) = \frac{0 + 0 + 0 + 0.7}{4} = \frac{0.7}{4} = 0.175.$$
Assume that the information retrieval threshold value is 0.2. Then the documents in sequential order from the highest retrieval status value to the lowest are $d_1 > d_3 > d_2$. In this case, document $d_1$ is the best choice for the user’s query, and document $d_4$ will not be retrieved in this example because its degree of satisfaction is smaller than 0.2.
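The retrieval status values of Example 4.1 can be recomputed from $E^*$, $F^*$, and the query vectors. The sketch assumes equal-width “close to” numbers, for which (9) reduces to one minus the distance between the centres when the relationships match; neglected query entries (“-”) are encoded as `None`. The function names are hypothetical.

```python
def similarity(doc_degree, doc_rel, query_degree, query_rel):
    """Y from (9) for equal-width 'close to' numbers given by their centres."""
    if doc_rel != query_rel:
        return 0.0
    return 1.0 - abs(doc_degree - query_degree)


def satisfaction(doc_degrees, doc_rels, qv, qr):
    """RS from (10): average similarity over the concepts the query does not neglect."""
    active = [j for j in range(len(qv)) if qv[j] is not None and qr[j] is not None]
    total = sum(similarity(doc_degrees[j], doc_rels[j], qv[j], qr[j]) for j in active)
    return total / len(active)


# Rows of E* and F* from Example 4.1, keyed by document.
documents = {
    "d1": ([0.6, 0.0, 0.6, 0.6, 0.0], list("PZNNZ")),
    "d2": ([0.7, 0.0, 0.9, 0.6, 0.0], list("NZPPZ")),
    "d3": ([0.6, 0.9, 0.6, 0.6, 0.9], list("NPPPP")),
    "d4": ([0.0, 0.9, 0.0, 0.0, 1.0], list("ZPZZP")),
}
qv = [0.6, None, 0.8, 0.6, 0.7]      # None marks a neglected concept ("-")
qr = ["P", None, "P", "N", "P"]
threshold = 0.2

scores = {name: satisfaction(deg, rel, qv, qr) for name, (deg, rel) in documents.items()}
ranked = sorted((name for name in scores if scores[name] > threshold),
                key=lambda name: scores[name], reverse=True)
```

With these inputs the scores agree with the values listed in the example, and `ranked` contains $d_1$, $d_3$, $d_2$ in that order, with $d_4$ filtered out by the threshold.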
V. FUZZY QUERY PROCESSING USING FUZZY-VALUED CONCEPT NETWORKS IN THE INTERNET ENVIRONMENT
Since the Internet became prevalent [6], [21], the information about the documents needed by the user should not be bound to a single host computer. When the users’ queries cannot be satisfied on the local computer, the information retrieval system should expand its searching capability to other computers on the Internet until the required documents are either found or shown not to exist.
In this section, we present the network-type fuzzy-valued concept networks architecture as the basis for fuzzy information retrieval in the Internet environment. The architecture of the network-type fuzzy-value concept network is shown in Fig. 6.
From Fig. 6, we can see that each host links to the Internet by the bold black lines. Each host has its local fuzzy-valued concept network as the knowledge base of the documents and concepts. Substantially, the local fuzzy-valued concept networks inside these hosts are the same as the ones presented in the previous sections. That is, the fuzzy-valued concept networks inside these hosts allow the values of the relevant degrees between concepts to be arbitrary shapes of fuzzy numbers, and the relevant relationships between nodes to be not only the fuzzy positive association relationship but also the fuzzy negative association relationship.
Since the local fuzzy-valued concept networks inside these hosts are the same as the ones presented in the previous sections, we can also model these local fuzzy-valued concept networks by using relevant matrices and relationship matrices. Furthermore, we can get the transitive closures of the relevant matrices and the transitive closures of the relationship matrices when the relevant matrices and relationship matrices are known. The implicit relevant degrees and implicit relationships between concepts can then be found in the transitive closures of the relevant matrices and the transitive closures of the relationship matrices, respectively.
The document descriptor relevant matrices and document descriptor relationship matrices can model the relevant degrees and fuzzy relationships between documents and concepts in each local fuzzy-valued concept network inside each host in the Internet environment. However, the experts may forget to set the relevant degrees or fuzzy relationships between some documents and concepts. Because all associated concepts are linked together, we can get the implicit relevant degrees and fuzzy relationships between documents and concepts by the transitive closures of the relevant matrices and the transitive closures of the relationship matrices. Assume that the document descriptor relevant matrix is $E$ and the transitive closure of the relevant matrix is $T$. Let $E^* = E \otimes T$; then $E^*$ includes all the implicit relevant degrees between documents and concepts. Assume that the document descriptor relationship matrix is $F$ and the transitive closure of the relationship matrix is $L$. Let $F^* = F * L$; then $F^*$ includes all the implicit relationships between documents and concepts.
By the previous discussions, we know that the fuzzy-valued concept networks contain nodes and links. These nodes stand for either documents or concepts. In the network-type fuzzy-valued concept network architecture, we assume that each local fuzzy-valued concept network may have an identical number of concept nodes and a different number of document nodes. Therefore, the relevant matrices and relationship matrices used to model the local fuzzy-valued concept networks on each host are identical, but the document descriptor relevant matrices and the document descriptor relationship matrices are different on different hosts.
Example 5.1: Assume that Figs. 7 and 8 are two local fuzzy-valued concept networks on host 1 and host 2, respectively, which are linked by the Internet. From Figs. 7 and 8, we can see that concepts $c_1, c_2, c_3, c_4, c_5$ and documents $d_1, d_2, d_3$ are located on host 1, and that concepts $c_2, c_3, c_5, c_6, c_7$ and documents $d_4, d_5, d_6$ are located on host 2.
Fig. 7. Fuzzy-valued concept network on host 1.
Fig. 8. Fuzzy-valued concept network on host 2.
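By the previous discussions, we can see that the relevant matrices and relationship matrices on these two hosts are $V$ and $R$, respectively, where
$$V = \begin{array}{c|ccccc} & c_1 & c_2 & c_3 & c_4 & c_5 \\ \hline c_1 & \tilde{1} & \tilde{0} & \widetilde{0.7} & \widetilde{0.6} & \tilde{0} \\ c_2 & \tilde{0} & \tilde{1} & \tilde{0} & \tilde{0} & \widetilde{0.9} \\ c_3 & \widetilde{0.7} & \tilde{0} & \tilde{1} & \tilde{0} & \tilde{0} \\ c_4 & \widetilde{0.6} & \tilde{0} & \tilde{0} & \tilde{1} & \tilde{0} \\ c_5 & \tilde{0} & \widetilde{0.9} & \tilde{0} & \tilde{0} & \tilde{1} \end{array}$$
$$R = \begin{array}{c|ccccc} & c_1 & c_2 & c_3 & c_4 & c_5 \\ \hline c_1 & P & Z & N & P & Z \\ c_2 & Z & P & Z & Z & P \\ c_3 & N & Z & P & Z & Z \\ c_4 & P & Z & Z & P & Z \\ c_5 & Z & P & Z & Z & P \end{array}$$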
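Let the transitive closure of the relevant matrix $V$ be $T$, and let the transitive closure of the relationship matrix $R$ be $L$. Then, according to Definition 3.3 and Definition 3.6, we can get $T$ and $L$ shown as follows:
$$T = \begin{array}{c|ccccc} & c_1 & c_2 & c_3 & c_4 & c_5 \\ \hline c_1 & \tilde{1} & \tilde{0} & \widetilde{0.7} & \widetilde{0.6} & \tilde{0} \\ c_2 & \tilde{0} & \tilde{1} & \tilde{0} & \tilde{0} & \widetilde{0.9} \\ c_3 & \widetilde{0.7} & \tilde{0} & \tilde{1} & \widetilde{0.6} & \tilde{0} \\ c_4 & \widetilde{0.6} & \tilde{0} & \widetilde{0.6} & \tilde{1} & \tilde{0} \\ c_5 & \tilde{0} & \widetilde{0.9} & \tilde{0} & \tilde{0} & \tilde{1} \end{array}$$
$$L = \begin{array}{c|ccccc} & c_1 & c_2 & c_3 & c_4 & c_5 \\ \hline c_1 & P & Z & N & P & Z \\ c_2 & Z & P & Z & Z & P \\ c_3 & N & Z & P & N & Z \\ c_4 & P & Z & N & P & Z \\ c_5 & Z & P & Z & Z & P \end{array}$$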
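Let the document descriptor relevant matrix and the document descriptor relationship matrix used to model the local fuzzy-valued concept network on host 1 be $E_1$ and $F_1$, respectively, and let the document descriptor relevant matrix and the document descriptor relationship matrix used to model the local fuzzy-valued concept network on host 2 be $E_2$ and $F_2$, respectively. Then, from Figs. 7 and 8, we can get $E_1$, $F_1$, $E_2$, and $F_2$, shown as follows:
$$E_1 = \begin{array}{c|ccccc} & c_1 & c_2 & c_3 & c_4 & c_5 \\ \hline d_1 & \tilde{0} & \tilde{0} & \widetilde{0.8} & \tilde{1} & \tilde{0} \\ d_2 & \tilde{0} & \tilde{0} & \tilde{0} & \widetilde{0.9} & \widetilde{0.6} \\ d_3 & \tilde{0} & \tilde{0} & \tilde{0} & \tilde{0} & \widetilde{0.8} \end{array}$$
$$F_1 = \begin{array}{c|ccccc} & c_1 & c_2 & c_3 & c_4 & c_5 \\ \hline d_1 & Z & Z & N & P & Z \\ d_2 & Z & Z & Z & P & P \\ d_3 & Z & Z & Z & Z & P \end{array}$$
$$E_2 = \begin{array}{c|ccccc} & c_1 & c_2 & c_3 & c_4 & c_5 \\ \hline d_4 & \tilde{0} & \tilde{0} & \widetilde{0.9} & \widetilde{0.8} & \tilde{0} \\ d_5 & \tilde{0} & \tilde{0} & \tilde{0} & \widetilde{0.6} & \widetilde{0.9} \\ d_6 & \tilde{0} & \tilde{0} & \tilde{0} & \tilde{0} & \tilde{1} \end{array}$$
$$F_2 = \begin{array}{c|ccccc} & c_1 & c_2 & c_3 & c_4 & c_5 \\ \hline d_4 & Z & Z & P & N & Z \\ d_5 & Z & Z & Z & P & P \\ d_6 & Z & Z & Z & Z & P \end{array}$$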
LetE13= E1 T , then E13contains the implicit relevant degrees be-tween documents and concepts of the local fuzzy-valued concept net-works in host 1. LetF13 = F13 L, then F13contains the implicit re-lationships between documents and concepts of the local fuzzy-valued concept networks in host 1. LetE32 = E2 T , then E23contains the implicit relevant degrees between documents and concepts of the local fuzzy-valued concept networks in host 2. LetF23 = F23 L, then F23 contains the implicit relationships between documents and concepts of the local fuzzy-valued concept networks in host 2, whereE13; F13; E23, andF23are shown as follows:
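\[
E_1^{*} =
\begin{array}{c|ccccc}
 & c_1 & c_2 & c_3 & c_4 & c_5 \\ \hline
d_1 & 0.7 & \tilde{0} & 0.8 & \tilde{1} & \tilde{0} \\
d_2 & 0.6 & 0.6 & 0.6 & 0.9 & 0.6 \\
d_3 & \tilde{0} & 0.8 & \tilde{0} & \tilde{0} & 0.8
\end{array}
\qquad
F_1^{*} =
\begin{array}{c|ccccc}
 & c_1 & c_2 & c_3 & c_4 & c_5 \\ \hline
d_1 & P & Z & N & P & Z \\
d_2 & P & P & N & P & P \\
d_3 & Z & P & Z & Z & P
\end{array}
\]

\[
E_2^{*} =
\begin{array}{c|ccccc}
 & c_1 & c_2 & c_3 & c_4 & c_5 \\ \hline
d_4 & 0.7 & \tilde{0} & 0.9 & 0.8 & \tilde{0} \\
d_5 & 0.6 & 0.9 & 0.6 & 0.6 & 0.9 \\
d_6 & \tilde{0} & 0.9 & \tilde{0} & \tilde{0} & \tilde{1}
\end{array}
\qquad
F_2^{*} =
\begin{array}{c|ccccc}
 & c_1 & c_2 & c_3 & c_4 & c_5 \\ \hline
d_4 & N & Z & P & N & Z \\
d_5 & P & P & N & P & P \\
d_6 & Z & P & Z & Z & P
\end{array}
\]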
Then, $E_1^{*}$, $F_1^{*}$, $E_2^{*}$, and $F_2^{*}$ form the basis for computing the similarities between documents and users' queries.
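The implicit relevant degrees can be checked with a short Python sketch. It rests on assumptions not stated in this section: we take the composition of a document descriptor relevant matrix with $T$ to be the max-min composition, and we replace each fuzzy number by a crisp representative value. The function name `max_min_compose` is hypothetical, and the composition of the relationship matrices with $L$ is not covered here.

```python
def max_min_compose(a, b):
    """Max-min composition: result[i][j] = max_k min(a[i][k], b[k][j])."""
    b_columns = list(zip(*b))
    return [
        [max(min(x, y) for x, y in zip(row, col)) for col in b_columns]
        for row in a
    ]


# Transitive closure T of the relevant matrix (concepts c1..c5),
# with crisp stand-ins for the fuzzy numbers (an assumption).
T = [
    [1.0, 0.0, 0.7, 0.6, 0.0],
    [0.0, 1.0, 0.0, 0.0, 0.9],
    [0.7, 0.0, 1.0, 0.6, 0.0],
    [0.6, 0.0, 0.6, 1.0, 0.0],
    [0.0, 0.9, 0.0, 0.0, 1.0],
]

# Explicit document-to-concept degrees on host 1 (rows d1..d3)
# and on host 2 (rows d4..d6).
E1 = [
    [0.0, 0.0, 0.8, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.9, 0.6],
    [0.0, 0.0, 0.0, 0.0, 0.8],
]
E2 = [
    [0.0, 0.0, 0.9, 0.8, 0.0],
    [0.0, 0.0, 0.0, 0.6, 0.9],
    [0.0, 0.0, 0.0, 0.0, 1.0],
]

E1_star = max_min_compose(E1, T)
E2_star = max_min_compose(E2, T)

for name, matrix in (("E1*", E1_star), ("E2*", E2_star)):
    print(name)
    for row in matrix:
        print("  ", row)
```

For instance, the entry for $d_1$ and $c_1$ is $\max(\min(0.8, 0.7), \min(1, 0.6)) = 0.7$: document $d_1$ reaches $c_1$ through $c_3$ and through $c_4$, and the stronger of the two paths is kept.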
Assume that a user formulates his/her query expression in the fuzzy information retrieval system based on the network-type fuzzy-valued concept network on host 1 shown in Fig. 6. First, the user's query expression is handled by the method presented in Section IV. If the desired documents are not found in host 1, the system can choose other hosts from a list of hosts. Then, the user's query is sent to the other
2 (the degree of strength is about 0.8) and concept 4 (the degree of strength is about 0.9), but should not contain concept 5 (the degree of strength is about 1). Then, the user's query $Q$ can be represented by a query descriptor relevant vector $qv_1$ and a query descriptor relationship vector $qr_1$, shown as follows:
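\[
qv_1 = \langle -,\ 0.8,\ -,\ 0.9,\ 1 \rangle
\]
\[
qr_1 = \langle -,\ P,\ -,\ P,\ N \rangle
\]

where the components correspond to the concepts $c_1, c_2, c_3, c_4, c_5$, in that order.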
Then, based on the results of Example 5.1 and (10), we can get

\[
RS(d_1) = \frac{0 + 0.9 + 0}{3} = \frac{0.9}{3} = 0.3
\]
\[
RS(d_2) = \frac{0.8 + 1 + 0}{3} = \frac{1.8}{3} = 0.6
\]
\[
RS(d_3) = \frac{1 + 0 + 0}{3} = \frac{1}{3} = 0.333.
\]
Assume that the information retrieval threshold value is 0.3. Then the documents, ordered from the highest retrieval status value to the lowest, are $d_2 > d_3 > d_1$. In this case, document $d_2$ is the best choice for the user's query.
Assume that the desired documents are not found on host 1 (although they exist on host 1 in this example), and assume that the user’s query is sent to host 2. Then, based on the results of Example 5.1 and (10), we can get
\[
RS(d_4) = \frac{0 + 0 + 0}{3} = \frac{0}{3} = 0
\]
\[
RS(d_5) = \frac{0.9 + 0.7 + 0}{3} = \frac{1.6}{3} = 0.533
\]
\[
RS(d_6) = \frac{0.9 + 0 + 0}{3} = \frac{0.9}{3} = 0.3.
\]
Assume that the information retrieval threshold value is 0.3. Then the documents, ordered from the highest retrieval status value to the lowest, are $d_5 > d_6 > d_4$. In this case, document $d_5$ is the best choice for the user's query, and document $d_4$ will not be retrieved in this example because its degree of satisfaction is smaller than 0.3.
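The retrieval status values on both hosts can be reproduced with the following Python sketch. Equation (10) is not restated in this section, so the per-concept similarity used here is an assumption inferred from the worked numbers: a queried concept contributes $1 - |x - y|$ when the document's relationship to that concept matches the queried relationship, and 0 otherwise, and the contributions are averaged over the queried concepts. Fuzzy numbers are replaced by crisp representative values, and the names `retrieval_status` and `rank` are hypothetical.

```python
CONCEPTS = ["c1", "c2", "c3", "c4", "c5"]

# Query: concept -> (degree of strength, relationship); unqueried concepts omitted.
QUERY = {"c2": (0.8, "P"), "c4": (0.9, "P"), "c5": (1.0, "N")}

# Implicit degrees and relationships (rows of E1*, F1*, E2*, F2*).
HOST_1 = {
    "d1": ([0.7, 0.0, 0.8, 1.0, 0.0], "PZNPZ"),
    "d2": ([0.6, 0.6, 0.6, 0.9, 0.6], "PPNPP"),
    "d3": ([0.0, 0.8, 0.0, 0.0, 0.8], "ZPZZP"),
}
HOST_2 = {
    "d4": ([0.7, 0.0, 0.9, 0.8, 0.0], "NZPNZ"),
    "d5": ([0.6, 0.9, 0.6, 0.6, 0.9], "PPNPP"),
    "d6": ([0.0, 0.9, 0.0, 0.0, 1.0], "ZPZZP"),
}


def retrieval_status(degrees, relations, query):
    """Average per-concept similarity (assumed rule, see the text above)."""
    total = 0.0
    for concept, (q_degree, q_relation) in query.items():
        j = CONCEPTS.index(concept)
        if relations[j] == q_relation:
            total += 1.0 - abs(degrees[j] - q_degree)
    # Round to three decimals so the threshold test is not upset
    # by floating-point noise.
    return round(total / len(query), 3)


def rank(host, query, threshold):
    """Documents at or above the threshold, best first."""
    scored = [
        (doc, retrieval_status(degrees, relations, query))
        for doc, (degrees, relations) in host.items()
    ]
    kept = [(doc, rs) for doc, rs in scored if rs >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


print(rank(HOST_1, QUERY, 0.3))
print(rank(HOST_2, QUERY, 0.3))
```

With the threshold value 0.3, this sketch keeps $d_2$, $d_3$, and $d_1$ on host 1 and $d_5$ and $d_6$ on host 2, and drops $d_4$, matching the orderings given in the example.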
VI. CONCLUSION
In this paper, we have presented a new method for document retrieval using fuzzy-valued concept networks. Fuzzy-valued concept networks allow the relevant degrees between concepts to be fuzzy numbers of arbitrary shape, and allow the relationships between concepts to be fuzzy negative associations as well as fuzzy positive associations. Fuzzy information retrieval systems based on fuzzy-valued concept networks can therefore be designed in a more flexible and more intelligent manner. Moreover, we also allow the users' queries to be represented by arbitrary shapes of fuzzy numbers and to use fuzzy positive association relationship and
15–26, 1991.
[2] S. M. Chen, “A fuzzy reasoning technique based on the-cuts opera-tions of fuzzy numbers,” in Proc. 2nd Int. Conf. Automation Technology, vol. 3, Taipei, Taiwan, R.O.C., July 1992, pp. 147–154.
[3] S. M. Chen and J. Y. Wang, “Document retrieval using knowledge-based fuzzy information retrieval techniques,” IEEE Trans. Syst., Man, Cy-bern., vol. 25, pp. 793–803, May 1995.
[4] S. M. Chen, W. H. Hsiao, and Y. J. Horng, “A knowledge-based method for fuzzy query processing for document retrieval,” Cybern. Syst., vol. 28, pp. 99–119, Jan./Feb. 1997.
[5] S. M. Chen and Y. J. Horng, “Fuzzy query processing for document retrieval based on extended fuzzy concept networks,” IEEE Trans. Syst., Man, Cybern. B, vol. 29, pp. 126–135, Feb. 1999.
[6] D. E. Comer, Computer Networks and Internets. Englewood Cliffs, NJ: Prentice-Hall, 1997.
[7] J. Giarratano and G. Riley, Expert Systems: Principles and Program-ming. Boston, MA: PWS, 1994.
[8] G. T. Her and J. S. Ke, “A fuzzy information retrieval system model,” in Proc. 1983 Nat. Computer Symp., Taiwan, R.O.C., Dec. 1983, pp. 147–1551.
[9] Y. J. Horng and S. M. Chen, “Document retrieval based on extended fuzzy concept networks,” in Proc. 4th Nat. Conf. Defense Management, vol. 2, Taipei, Taiwan, R.O.C., Mar. 1996, pp. 1039–1050.
[10] , “Finding inheritance hierarchies in fuzzy-valued concept net-works,” IEEE Trans. Syst., Man, Cybern. B, vol. 29, pp. 126–135, Feb. 1999.
[11] Y. J. Horng, S. M. Chen, and C. H. Lee, “A fuzzy information retrieval method using fuzzy-valued concept networks,” in Proc. 10th Int. Conf. Tolls Artificial Intelligence, Taipei, Taiwan, R.O.C., Dec. 1998, pp. 104–111.
[12] A. Kandel, Fuzzy Mathematical Techniques with Applica-tions. Reading, MA: Addison-Wesley, 1986.
[13] M. Kracker, “A fuzzy concept network model and its applications,” in Proc. 1st IEEE Int. Conf. Fuzzy Systems, Mar. 1992, pp. 761–768. [14] M. Kamel, B. Hadfield, and M. Ismail, “Fuzzy query processing
using clustering techniques,” Inf. Process. Manage., vol. 26, no. 2, pp. 279–293, 1990.
[15] D. Lucarella and R. Morara, “FIRST: Fuzzy information retrieval system,” J. Inf. Sci., vol. 17, no. 1, pp. 81–91, 1991.
[16] S. Miyamoto, “Information retrieval based on fuzzy associations,” Fuzzy Sets Syst., vol. 38, pp. 191–205, 1990.
[17] T. Murai, M. Miyakoshi, and M. Shimbo, “A fuzzy document retrieval method based on two-valued indexing,” Fuzzy Sets Syst., vol. 30, pp. 103–120, 1989.
[18] T. Radechi, “Generalized Boolean methods of information retrieval,” Int. J. Man–Mach. Stud., vol. 18, no. 5, pp. 409–439, 1983.
[19] G. Salton and M. J. Mcgill, Introduction to Modern Information Re-trieval. New York: McGraw-Hill, 1983.
[20] V. Tahani, “A fuzzy model of document retrieval system,” Inf. Process. Manage., vol. 12, pp. 177–187, 1976.
[21] A. S. Tanenbaum, Computer Networks. Englewood Cliffs, NJ: Pren-tice-Hall, 1989.