
基於自動查詢語句擴展之主題地圖智慧型新聞搜尋引擎


An Intelligent News Search Engine with Topic Map User Interface Based on Automatic Query Expansion

Chih-Ming Chen
Associate Professor, Graduate Institute of Library, Information and Archival Studies, National Chengchi University
E-mail: chencm@nccu.edu.tw

Mei-Hua Chang
Master Student, Graduate Institute of Learning Technology, National Dong Hwa University (Meilun Campus)
E-mail: cutesandra15@yahoo.com.tw

Wei-Chia Chiu
Master Student, Graduate Institute of Computer Science, National Chengchi University
E-mail: chiu.wei.jia@gmail.com

Keywords: Topic Maps; Information Retrieval; Ontology; Query Expansion; News Search Engine

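Journal of Library and Information Science 34(2): 19-41 (October, 2008)

[Summary]

The rapid growth of news websites in recent years has created a strong demand for information querying. Many current news sites (for example, Google News) automatically classify news events gathered from the Internet, so that readers obtain aggregated information about the same news event, and they also offer keyword-based news search. They cannot, however, retrieve the developing context of an entire related news event, nor present results as a map of topically related news. This study exploits the ability of Google News to classify news events automatically and goes a step further, using Modified Hopfield Neural Networks (MHNN) to generate a news query-expansion ontology automatically. Based on the generated ontology, it proposes a query expansion mechanism that expands news events according to the news categories the user is interested in, in order to improve news search performance. In addition, topic maps offer a presentation that differs from the traditional way of displaying news: query results are shown by time stamp and by topic association, so that users can grasp the development of a whole news topic more clearly.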

[Abstract]

With the rapid development of computer and Internet technologies, news aggregator sites holding large numbers of news articles have appeared on the Internet, creating a need for more advanced information retrieval. Some news sites, such as Google News, provide readers with automatically classified news and a keyword-based search mechanism for retrieving news events of interest. However, most news sites currently offer no way to retrieve the developing clues of a news event or to display search results through a topic map with a visual user interface. This study therefore presents a novel news search engine with an automatic query expansion mechanism and a friendly topic map user interface, based on a scheme that automatically generates a news ontology using Modified Hopfield Neural Networks. The experimental results indicate that the proposed query expansion mechanism, based on the generated news ontology, efficiently helps users retrieve news events of interest. In addition, the topic map of news events ranked by time stamp helps users observe the developing context of a news event.

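Introduction

With the growth of the Internet, online news has become one of the information sources people use most often. News also carries much important information, and some major news stories are enough to affect political, economic, and social developments. The occurrence of news events reflects the pulse of current social development, and grasping the news promptly helps people make correct judgments and decisions. Developing better news retrieval techniques is therefore a pressing research issue (Chen & Liu, 2008).

News events emerge endlessly every day, and how to use effective information retrieval techniques to find the information a user wants in a huge news database is a very important research issue. Commonly used information retrieval models fall broadly into three kinds: the Boolean model, the vector space model, and the probabilistic model (Baeza-Yates & Ribeiro-Neto, 1999). Boolean retrieval is currently the most widely used in search engines. It lets query terms (or keywords) be combined across documents with intersection (AND), union (OR), or difference (NOT) operations. For queries with a clearly defined need it is fast and very efficient, but for queries with an unclear need, or for semantic queries that are hard to express, it often yields a low precision rate and results containing much noise.

As the need for semantic querying received more and more attention, the idea of the semantic web arose (Berners-Lee & Fischetti, 1999). It attempts to turn the data on the World Wide Web into a machine-readable language that computers can understand, so that computers can grasp what people really intend to express. Realizing the semantic web depends on building ontologies. An ontology is a hierarchical structure used to describe the knowledge of a particular domain (Swartout, Patil, Knight, & Russ, 1996). If an ontology is built for a specific domain, describing the knowledge categories of that domain, the attributes of the categories, and the relations between categories, concept-level semantic information retrieval becomes possible: for searches in which the user's query intent is unclear, the system can automatically offer expansions with related query terms to raise recall. Information changes rapidly, however, and building an ontology by hand is both time-consuming and laborious; a painstakingly built ontology also goes out of date easily and loses its timeliness. How to build a query-term ontology automatically, and thereby support effective news retrieval, is therefore the main research issue of this study.

In interface design, general search engines mainly present matching results as a list. Visualization search engines that depart from this traditional presentation have also been developed. The Mooter search engine, for example, uses clustering to present search results visually, classified into multiple groups for the user to browse. The Kartoo search engine analyzes the types of services offered by the commercial websites it collects in order to group them, and uses the keywords of those services to display a navigation map of the associations between sites. Search engines devoted to news search, however, still lack a visual presentation that makes it easy for users to browse and read search results. Taking archived Google News as its research object, this study first uses a news ontology to support automatic query expansion in a news search engine, in the hope of increasing retrieval performance, and then presents the search results as a topic map so that users can conveniently follow the development of related news.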

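Literature Review

The main purpose of this study is to propose, on the basis of the Google News classification structure, an effective method for automatically generating a news query-term ontology, and to apply it to improving news retrieval performance. In line with this purpose, this section collects and analyzes the literature on the information retrieval techniques relevant to the study, as its theoretical foundation. It first introduces traditional information retrieval techniques and their limitations; then discusses query expansion, including how relevance feedback, synonyms, and ontologies can improve retrieval performance; then explains the basic structure of artificial neural networks, to help the reader understand the Modified Hopfield Neural Network that this study adopts as the basis for constructing the news ontology; and finally describes topic map theory and its advantages for the visual presentation of retrieval results.

Traditional Information Retrieval Techniques

Traditional information retrieval techniques mainly include automatic keyword extraction, keyword indexing, full-text retrieval, automatic document classification, and automatic document summarization (卜小蝶, 1996). Information retrieval models fall into three major classes (Baeza-Yates & Ribeiro-Neto, 1999): the Boolean model, the vector space model, and the probabilistic retrieval model. Each is introduced below.

Boolean Model

The Boolean model is the simplest retrieval method. It relies on set theory and Boolean algebra: a document is retrieved only if its content satisfies the Boolean operations AND (∧), OR (∨), and NOT (¬) among the retrieval terms (or query terms); documents that do not satisfy them are discarded.

Many large systems still use this retrieval model today. Its advantages are that retrieval is fast and the model is simple to implement, so it is very effective when the information need is clearly defined. Its drawbacks are that results are not ranked by degree of match and that users find it hard to express complex query conditions. Boolean retrieval therefore tends to produce polarized outcomes: a high rate of search failure or a high rate of information overloading (Baeza-Yates & Ribeiro-Neto, 1999). Some scholars have accordingly proposed the Extended Boolean Model, which suggests weighting each keyword to improve the performance of Boolean retrieval (Choi, Kim & Raghavan, 2001).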

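Vector Space Model

Salton (1989) proposed the idea of the Vector Space Model (VSM) in 1965. Unlike Boolean retrieval, it is no longer a purely binary match. Salton held that a partial match to the query conditions is admissible, and therefore proposed using the frequency with which keywords occur to compute a weight or a similarity. The basic principle of the mathematical model is to express a document as a vector of the computed weights and the user's query terms as another vector, and to use the cosine of the angle between the two vectors as the similarity between the query and a document: the smaller the angle, the higher the similarity, and the larger the angle, the lower the similarity, as shown in Figure 1. Similarity values lie between 0 and 1, and sorting all documents by their similarity to the query gives the retrieval result. Because of this property, a document that only partly matches the query may still be retrieved, so a suitable threshold can be set to select only documents above a certain degree of relevance. This retrieval model lets users enter arbitrary strings, and queries are not constrained by erroneous entries, wrong characters, or redundant characters in the data.

Figure 1. Vector representation

Under the vector space model, a document D can thus be expressed as $D = (t_1, t_2, t_3, \ldots, t_i, \ldots, t_n)$, where $t_i$ is the i-th feature keyword in the document. The weights of the feature keywords are most often computed with the TF×IDF (Term Frequency × Inverse Document Frequency) term weighting scheme. With these weights, document D can be expressed as $D = (w_1, w_2, w_3, \ldots, w_i, \ldots, w_n)$, where $w_i$ is the weight of feature keyword $t_i$ in document D. Table 1 lists the more commonly used similarity formulas. Many studies report that retrieval with the cosine coefficient is more accurate than with the other similarity measures (Binstock & Rex, 1995; Och Dag, Regnell, Carlshamre, Andersson, & Karlsson, 2001).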

Ӫ໕ޫ໣ዂ࠯ᔯસॏᆘഁ࡚ץйᔯસਞ౦ାܼҁ ݔᔯસȇկુᘉ࢑ءԥՄኍژᜱᗥԆມϟ໣ޠӤဏ ມȂആԚࣻծ࡚ॏᆘޠᇳৰȞलᛞНȂ2005ȟȄ

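Table 1. Commonly used similarity measures Sim(X, Y)

Similarity measure | Evaluation for binary term vectors | Evaluation for weighted term vectors
Inner product | $|X \cap Y|$ | $\sum_{i=1}^{t} X_i Y_i$
Dice coefficient | $\dfrac{2\,|X \cap Y|}{|X| + |Y|}$ | $\dfrac{2 \sum_{i=1}^{t} X_i Y_i}{\sum_{i=1}^{t} X_i^2 + \sum_{i=1}^{t} Y_i^2}$
Cosine coefficient | $\dfrac{|X \cap Y|}{|X|^{1/2} \cdot |Y|^{1/2}}$ | $\dfrac{\sum_{i=1}^{t} X_i Y_i}{\sqrt{\sum_{i=1}^{t} X_i^2 \times \sum_{i=1}^{t} Y_i^2}}$
Jaccard coefficient | $\dfrac{|X \cap Y|}{|X| + |Y| - |X \cap Y|}$ | $\dfrac{\sum_{i=1}^{t} X_i Y_i}{\sum_{i=1}^{t} X_i^2 + \sum_{i=1}^{t} Y_i^2 - \sum_{i=1}^{t} X_i Y_i}$

Source: (Salton, 1989)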
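The weighted-vector column of Table 1 can be sketched in a few lines of Python. This is only an illustration: the function names are invented here, and the vectors are assumed to be equal-length lists of non-negative term weights with at least one non-zero entry each (an all-zero vector would cause a division by zero).

```python
import math

def inner_product(x, y):
    """Sum of products of matching term weights."""
    return sum(a * b for a, b in zip(x, y))

def dice(x, y):
    """Dice coefficient for weighted term vectors."""
    return 2 * inner_product(x, y) / (inner_product(x, x) + inner_product(y, y))

def cosine(x, y):
    """Cosine coefficient: inner product divided by the product of vector lengths."""
    return inner_product(x, y) / math.sqrt(inner_product(x, x) * inner_product(y, y))

def jaccard(x, y):
    """Jaccard coefficient for weighted term vectors."""
    dot = inner_product(x, y)
    return dot / (inner_product(x, x) + inner_product(y, y) - dot)

# Toy document and query vectors (illustrative values only).
doc = [1, 1, 0]
query = [1, 0, 0]
scores = {f.__name__: f(doc, query) for f in (inner_product, dice, cosine, jaccard)}
```

With binary weights, as in the toy vectors, these expressions reduce to the set-based forms in the binary column of the table.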

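Probabilistic Retrieval

Probabilistic model retrieval (Maron & Kuhns, 1960; Miller, 1971) describes query terms and relevant documents in terms of probabilities and operates on them. Unlike the vector model, the probabilistic model retrieves by computing the probability that a document is relevant to the query terms, rather than the degree to which it is related to them. Documents and query conditions are still basically represented as vectors; the difference from the vector retrieval model is that a probability formula is used when computing similarity. The probability is most commonly computed with Bayes' theorem.

Query Expansion

The traditional retrieval methods described above are all based on keyword search. They can only find documents that contain the keywords the user entered, and they overlook many semantic associations in documents. Using query expansion to improve retrieval performance brings those semantic associations into consideration. Past work in this area has generally used relevance feedback, synonyms, and ontologies (陳光華、莊雅蓁, 2001), each of which is explained below.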

Relevance Feedback

ࣻᜱӲ㔴Ȟrelevance feedbackȟ࢑ࡿٻң޲ӶࠊΚ ໧ࢳᔯસ׳ژޠНӈϜȂࢆڦ२्ޠ੬ኊӲ㔴๞ق ಜȂп෉׳ژ؂Ӽࣻᜱၦਠޠٻң޲Ӳ㔴ᐡښȄՅ Ӳ㔴๞قಜޠ੬ኊέѠϸ࣐ڎᆎ࠯ᄙȈΚ࢑пНӈ Ґ࣐ٙлȂᆏ࣐ࣻᜱНӈӲ㔴ȇѫΚ࢑пࣻᜱມ࣐ л Ȃ ࠍ ᆏ ࣐ ࣻ ᜱ ມ Ӳ 㔴 ܗ ᔯ સ ມ ණ ұ Ȟ term suggestionȟȞ෇ϰᡘȂ2002ȟȄ ႇџӶࣻᜱӲ㔴ޠःفϜȂпࣻᜱНӈӲ㔴࣐ഷ ӼȂٻң޲ሰ्Ս՘ցᘟࣻᜱܼᔯસມޠНӈࡤӲ 㔴๞قಜȄดՅӶၦଊᔯસᕘძϜȂٻң޲҇໹् ߇ຳ೩Ӽ᚟ѵޠਣ໣ᇅᆡΩᘳ៕೼ٳНӈȂϘ૗ः ց٦ٳНӈᇅܛίࢦၛມ࢑ࣻᜱޠȂ܂܂ཽആԚٻ ң޲᚟ѵޠ॓ᐋȄӶ೼ᆎ௒ݸίȂѠ௵ңྦࣻᜱӲ 㔴Ȟpseudo relevance feedbackȟȂւң঩ࢦၛᇮѰᔯ સяΚಣНӈȂϛစٻң޲ցᘟ֊୆೪ܛԥНӈࣲ ࣐ࣻᜱȂՅ೼ٳ୆೪ޠࣻᜱНӈ֊စҦࣻᜱӲ㔴ޠ โז२ུ࡛ᄻུޠࢦၛᇮѰȂӕໍ՘ໍΚؐޠᔯસ Ȟചӏ๽ȃೆ໰⩨Ȃ2001ȟȄԫРݳԥΚ݃ᡘޠુᘉȈ स୆೪ϟࣻᜱНӈఽ൑ϜȂᄃርαϛࣻᜱޠНӈխ τഌϸȂ٦ቅђΤ঩ۗࢦၛᇮѰޠᘘ৥ມཋᇅ঩ᔯ સлᚡٯϛࣻᜱȂࠍᘘ৥ࡤࢦၛޠᔯસࠣ፵ཽᡑৰ ȞMitra, Singhal & Buckley, 1998ȟȄࣻၷϟίȂਗ਼І ЩၷЎ໕ϟၦଊࣻᜱມӲ㔴ᡘด࢑ЩၷԂޠᒶᐆȂ ٻң޲ηၷৡܿցᘟȄҦܼࣻᜱມ࢑௄ᔯસ๗ݏᘞ ڦяٿޠȂՅᔯસ๗ݏτഎᇅ঩ᔯસлᚡԆ՜ࣻ ߗȂӱԫᘞڦяޠࣻᜱມητഎၮᔯસлᚡԆ՜ࣻ ߗȄࣻᜱӲ㔴ӶၦଊᔯસϜೞᇰ࣐ᄈᔯસԚਞֆઊ ࣦτȞFrakes & Baeza-Yates, 1992ȟȄःفᡘұȂӶ Κ ٳ ӓН ᔯ સၦ ਠ ৳Ϝ Ȃ Ѡණ ݈ ᔯસ Ԛਞ 20% ȞHarman, 1995ȟȄկηԥഌӌޠःفࡿяȂпಜॏ ޠРԓٿಜॏӔӤя౫ޠࣻᜱມȞco-occurrence of termsȟђΤ঩ۗޠࢦၛᇮѰϜȂᗷѠпᅗ٘೩ӼН ӈ Ϝ ڏ ԥ ࣻ Ӥ ᇮ ཏ ޠ ඣ ख़ Ȟ Smeaton & Van Rijsbergen, 1983ȟȂٻࢦӓ౦ණାȂկηᡲࢦྦ౦७ ޠ؂մȞPeat & Willett, 1991; Woods, 1997ȟȄ

Synonyms

ӤဏມȞsynonymȟ೾ளࡿཏဏࣻծկٻңϛӤН ԆܛߓႁޠມȄΚૢӤဏມέಡϸ࣐ڎᆎȈΚᆎ࣐ ኅဏޠࣻᜱມȂѫΚঐ࣐੮ဏޠӤဏມȞ෇ϰᡘȂ 2002ȟȄՅኅဏޠࣻᜱມ࢑ࡿӶНӈϜစளя౫ޠ ມȂѠᆏ࣐ȶᜱᖓມ৳ȷȃȶӔ౫સЖڑȷȃȶӔգя ౫ޠມ৳ȷȞco-occurrence thesaurusȟȞSalton, 1989ȇ ෇ϰᡘȂ2002ȟȇ੮ဏޠӤဏມ࢑ࡿНݳαܗᇮཏα ׈ӓѠࣻϤڦхޠມཋȄΚૢኅဏޠࣻᜱມ׭೛೾ ளೞٻңӶНӈϸ᜹αȇՅ੮ဏޠӤဏມኅހӵೞ ၽңӶშਫᓣᏱٿණЁཫ൷ޠᆡጃ࡚ȄसңӶུᆹ ཫ൷αȂଭᄈ࢛Κࠍུᆹٲӈпᜱᖓມໍ՘ᘘ৥Ȃ Ѡп׳яڐу᜹ծܗࣻᜱޠུᆹлᚡȇٻңӤဏມ ޠ྆܉Ѡп௄өঢ়ུᆹ൭ᡞϜȂᔯસяൣᏳӤΚӈ ུᆹկ࢑ңມϛΚޠུᆹٲӈȄ ௄ӤဏມޠࣻᜱःفϜȂԟ෉എ࢑п΢ϏޠРԓ ஡Ӥဏ౵ӫȞsynonymȟޠມཋ࡛ᄻԚΚঐӤဏມ ߓȂᡲڐϜΚᆎມཋ૗஋ࢦၛژڐу࠯ᄙޠӤဏມ ཋȄშਫᓣᏱϜȂױӤဏ౵ӫມпಜΚȃዀྦޠם ԓ ख ᓄ ٯ ᆔ ౪ ޠ Р ԓ ᆏ ࣐ ᠍ ࡅ ௢ ښ Ȟ authority controlȟȂՅױଅᓄӤဏ౵ӫມޠᔭ਱ᆏ࣐᠍ࡅᔭ Ȟauthority fileȟȞ෇ϰᡘȂ2002ȟȄՅȶӤဏມݔȷ Ȟ೩ҔݡȂ2004ȟ࣐ҭࠊၦଊᔯસഷளٻңޠϏڏ ਫȂկȶӤဏມݔȷңມၷ୒Ϝ୾τഛР८ޠңᇮȂ ᇅ୾ϲСளҢࣁңݳԥٳяΤȂٻூᄃңܓ७մȄ ݸйུᆹңᇮளᓎ຀Ңࣁਣٲᡑ୞ՅԥܛϛӤȂस п΢Ϗٿ࡛ဋӤဏມ৳ሰ्߇ຳࣻ࿌Ӽޠਣ໣ᇅᆡ ΩȂਞ౦ٯϛାȄ ҭࠊηԥΚٳՍ୞࡛ဋસЖڑޠःفȂկѬ૗୉ ژ჌ӤॲມȃםծມޠՍ୞ϾસЖȄΚૢӤॲມޠ ೏౪РԓȂְпึॲԓЩᄈ࣐лȄϜНޠӤॲມ࢑ ഇႇݨॲಓဵߓȂ஡ࢦၛޠມཋᇅસЖມഎᙾԚݨ ॲȂӶݨॲಓဵα୉ЩᄈȂӱԫ჌ȶջᝌ๜᫂ȷᇅȶջ ೈࡧҚȷȂӱݨॲಓဵࣻӤȂՅѠпϤࣻࢦ൷׳ژȇ ऽНޠӤॲມР८Ȃഇႇ SoundexȞKnuth, 1973ȟᇅ Metaphone ȞBinstock & Rex, 1995ȟᄈНԆձጢጇȂ ηԥ᜹ծޠӤॲມࢦၛਞݏȞTseng, 2001ȟȄѫѵȂ םծມпᎸۿࣻծ࡚ٿॏᆘȂϱ೩ٻң޲ᒰΤӉཏԆ ՜ȂࢦၛਣϛڨၦਠᓀԆȃ϶Ԇޠ४ښȂпߗծԆ՜ ܗዂጚཫ൷ޠРԓЩᄈસЖມ৳Ȟ೩ҔݡȂ2004ȟȂ ϛ༊ѠпᔯસяࣻᜱНӈȂηѠпᔯસяԆ՜ࣻߗޠ ມཋȂ஡םծມӗܼࢦၛ๗ݏϜȂѠпණұٻң޲ࣻ ᜱޠࢦၛңມȂႁژ᠍ࡅ௢ښޠҭޠȞTseng, 2001ȟȄ


24 Journal of Library and Information Science 34ņ2ŇΚ19 – 41ΰOctober, 2008α پԄȈȶϜःଲȷᇅȶϜѶःفଲȷȃȶၦਠঙᓾȷᇅ ȶၦਠঙᓾقಜȷȃȶ؄ଲߞሊলȷᇅȶ؄ሊলଲߞȷ ๊Ȃ೼ٳڏԥ࡟ା२᠓౦ޠםծມഎѠпՍ୞ӵೞᔯ સяٿȄଷԫϟѵȂ჌ȶረටȷᇅȶ՘࢈ଲߞȷȃȶࢻ ݥఁ௳ȷᇅȶݔ࡛໫ȷȃȶ୾Яȷᇅȶ৙Ϝύȷ๊Ԇ՜ ϛ२᠓ޠມཋηೞຝ࣐ӤဏມȂկ࢑Ⴌဟณݳ஡ѻউ ຝ࣐ӤဏȂ೼ٳມཋᗚ࢑ሰӊᒧ΢Ϗٿ࡛ဋȄGauch ᇅ SmithȞ1993ȟޠःفࡿяȂпӤဏມસЖڑঔႻ ࢦၛᇮѰȂѠԥਞණ݈ᔯસਞݏȄկ࢑ӱ࣐ஆܼӤ ဏມޠࢦၛᇮѰᘘ৥ሰ्࡛ᄻӶτ໕΢Ϗܛ࡛ဋޠ Ӥဏມ৳αȂӱԫϛӶҐःفଇ፤ޠጓᛠȄ

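Ontology

The term ontology comes from philosophy, where it is mainly used to examine the question "what is the essence of things?". In recent years the fields of computer science and artificial intelligence have also borrowed the term to describe the representation of real-world knowledge. Neches et al. stated in 1991 that "an ontology describes the basic terms and relations of a topic area, and also defines the rules for combining terms and relations to extend them" (Neches et al., 1991, p40). Bernaras et al. stated in 1996 that "an ontology provides a way of explicitly describing the conceptualization of the knowledge in a knowledge base" (Bernaras, Laresgoiti & Corera, 1996, p298). Swartout et al. held that "an ontology is a hierarchical structure that is used to describe the knowledge of a domain" (Swartout et al., 1996, p138). If an ontology can be built on the relations among the vocabulary of the news domain, this structured knowledge can support the search for related news information and help retrieve more of the developing context of the other news related to the news event being queried. This is precisely the core technique that this study sets out to develop.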

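Artificial Neural Networks

This study uses a Modified Hopfield Neural Network to build a news term ontology and uses the automatically generated ontology to support query expansion, with the aim of improving news retrieval performance and retrieving the developing context of related news. The basic principles of artificial neural networks are summarized below.

An Artificial Neural Network is a parallel computing model that uses a large number of artificial neurons to imitate the computing ability of biological neural networks. To derive learning rules for artificial neurons, one must first understand the learning rules of nerve cells in the human brain. The most influential contribution in this area is that of Hebb, who in 1949 proposed a learning rule for nerve cells that shaped the later development of neural networks. Figure 2 shows a typical model of a biological nerve cell. Each neuron consists mainly of four parts:

1. Dendrites: branch-like input and output units that extend outward from the neuron and receive signals from, or send signals to, other nerve cells.
2. Axon: connected to the cell body and responsible for carrying the information produced by the cell body to other nerve cells.
3. Synapse: the point at which an input dendrite and an output dendrite connect. Synapses are the storage of the neural network; each represents the connection strength between two nerve cells, expressed as a number called the weight.
4. Soma (cell body): the core of the nerve cell. It gathers and transforms the signals delivered by the dendrites, which the axon then carries to other dendrites as the input signal of the next neuron.

Figure 2. Structure of a nerve cell (labeled parts: input dendrites, synapses, output dendrites, axon). Source: (王進德、蕭大全, 2003)

With the biological model in mind, we now describe how an artificial neuron imitates a biological nerve cell. An artificial neuron is a simple simulation of a biological neuron: it obtains information from the environment or from other artificial neurons, performs a simple computation, and outputs the result to the environment or to other artificial neurons (葉怡成, 2003). An artificial neuron is also called a processing element. The output of each processing element fans out and becomes the input of other processing elements, as shown in Figure 3.

Figure 3. Model of an artificial neuron. Source: (葉怡成, 2003)

Each artificial neuron is connected to many input units $x_1, x_2, x_3, \ldots, x_i, \ldots, x_n$. The weight $w_{ij}$ represents the connection strength of the synapse, and the output $Y_j$ is based on the weighted sum of the inputs. The relation between the output and the inputs of a processing element can be expressed as a function of the weighted sum of the inputs:

$$Y_j = f\left(\sum_i w_{ij} x_i - \theta_j\right) \qquad \text{(Formula 1)}$$

The symbols are as follows:
$Y_j$ is the output signal of the simulated biological neuron model.
$f$ is the transfer function of the simulated biological neuron model.
$w_{ij}$ is the synaptic strength of the simulated biological neuron model, also called the connection weight.
$x_i$ is the input signal of the simulated biological neuron model.
$\theta_j$ is the threshold of the simulated biological neuron model.

Three transfer functions are commonly used: 1. the step function, also called the two-value function; 2. the sigmoid function; 3. the hyperbolic tangent function. All three share one property: when the input is small the output is 0 or -1, depending on the function, and when the input is large the output becomes 1. This follows E. D. Adrian's electrochemical theory of nerve cells: when a nerve cell receives a sufficiently strong stimulus it emits a current pulse of fixed magnitude and its state changes to 1. In practice the three transfer functions are used in different kinds of neural networks. Networks applied to binary systems mostly use the step function, while networks applied to continuous systems need the sigmoid or hyperbolic tangent function. The MHNN used in this study adopts the sigmoid function as its transfer function.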
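Formula 1 and the three transfer functions can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are invented here, and the step function is assumed to return 1 when the net input is exactly zero.

```python
import math

def step(net):
    """Two-value (step) function; assumed to fire at net >= 0."""
    return 1.0 if net >= 0 else 0.0

def sigmoid(net):
    """Sigmoid function, output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-net))

def hyperbolic_tangent(net):
    """Hyperbolic tangent function, output in (-1, 1)."""
    return math.tanh(net)

def neuron_output(inputs, weights, theta, transfer=sigmoid):
    """Formula 1: Y_j = f(sum_i w_ij * x_i - theta_j)."""
    net = sum(w * x for w, x in zip(weights, inputs)) - theta
    return transfer(net)

# Illustrative values: the weighted sum equals the threshold, so net = 0.
y_sigmoid = neuron_output([1, 0, 1], [0.5, 0.2, 0.5], theta=1.0)
y_step = neuron_output([1, 0, 1], [0.5, 0.2, 0.5], theta=1.0, transfer=step)
```

Passing the transfer function as an argument keeps the weighted-sum part of Formula 1 separate from the choice between binary and continuous outputs.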

Topic Map Theory

ҦܼᆪርᆪၰޠᑺକȂٻூၦଊ໕ା࡚ޠԚߞȂ ഇႇΚૢཫ൷ЖᔞпᜱᗥԆࢦၛРԓٿཫ൷ࣻᜱၦ ଊѠ૗ூژԚξα࿳์ޠၦਠȇկ೼ٳၦਠӱ኶໕ ᛂτȃґစಣᙒᇅϸ᜹ȂᏳयٻң޲҇໹ᘳ៕ࣻ࿌ ӼޠၦਠϘ૗׳ژమ׳൷ޠၦଊȄ೼ኻޠཫ൷Рԓ ᄈٻң޲Յّ࢑Κᆎؖ२ޠ॓ᐋȄӱԫлᚡӵშ Ȟtopic mapsȟޠᢏ܉ೞණяٿȂп၍؛ၦଊ॓ၸޠ ୱᚡȄлᚡӵშ࢑Κᆎңٿಣᙒᇅᆔ౪τ໕ၦଊၦ ྜޠᐡښȂڐഷಥҭޠӶܼ࡛ҴΚঐഷٺϾޠޤᜌ Ᏻ៕Ϯ८Ȃٯණٽٻң޲Κঐ૗ץഁජඬᇅሇᚭᏱ ಭޤᜌޠᐈձϮ८Ȟ؄ঢ়ᇻȂ2002ȟȄлᚡӵშܼ 1999Ԓ 12 Уᕖூ୾ርዀྦಣᙒޠᇰᜍȂԚ࣐ ISO / IEC 13250ޠ೤ጓȄISO / IEC 13250 ೤ጓޠлᚡӵშ л ् є ֥ ή ঐ ਰ З ϰ ષ ș л ᚡ Ȟ topic ȟȃ ᜱ ᖓ ȞassociationȟڸၦྜࡿЖȞoccurrenceȟȂ֊ T.A.OȄ


26 Journal of Library and Information Science 34ņ2ŇΚ19 – 41ΰOctober, 2008α ૮஡ T.A.O ޠ֥ཏᇴ݃ԄίȞThe TAO of topic

mapsȇݔ߭Ԛȃዊ໩ኌȃዊ໩஥ᄹȂ2003ȟȈ 1. лᚡȞTopic, Tȟ ӶлᚡӵშϜȂޤᜌޠஆҐ൑ϰᆏ࣐ȶлᚡȷȂ л्֖౫࢛ঐ྆܉ޠȶлᚡȷȄлᚡȞtopicȟѠ п࢑Κঐ΢ȃΚঐᄃᡞȃΚঐ྆܉ޠӉեٲޑȂ ϛᆔ࢑֐ԥᄃᡞԇӶȃ࢑֐ԥӉե੬ܓȂлᚡ ཽӱᔗңαޠሰ्ȃၦଊܓ፵пІлᚡӵშϟ ңഋՅԥܛϛӤȄлᚡѠпೞᘫ᜹ԚထȂᆏ࣐ лᚡ᜹࠯Ȟtopic typesȟȂԄშ 4 ܛұȄඳّϟȂ лᚡ᜹࠯൸࢑лᚡܛᘫ឵ޠ᜹րȇΚঐлᚡѠ пӤਣᘫ឵Κঐпαޠлᚡ᜹࠯Ȃлᚡ᜹࠯Ӷ лᚡӵშϜηೞᇰ࣐ۢΚঐлᚡȄЩРᇴȂ΢ȃ ঴ٱ᜹ȃ୞ޑϸրഎ࢑лᚡȂկ΢Ӥਣηϸ឵ ܼ঴ٱ᜹ȃ୞ޑ೼ڎঐлᚡ᜹࠯Ȅ შ 4! лᚡ᜹࠯Ȟtopic typesȟ 2. ၦྜࡿЖȞOccurrence, Oȟ ΚঐлᚡѠӤਣഀ๗ΚঐܗΚঐпαޠၦଊ ၦྜȞinformation resourcesȟȂՅйӶ࢛ᆎโ ࡚αᇅ၏лᚡڏԥᜱᖓȄ೼ٳޠၦଊၦྜᆏϟ ࣐лᚡޠၦྜࡿЖȂԄშ 5 ܛұȄၦྜࡿЖϲ ֥ӶлᚡӵშϲȂηѠпᑀҴӶлᚡӵშϟ ѵȂഇႇ፞Ԅ HyTime AddressingȞIn HyTMȟ ܗڑ࠯ޠ URIsȞIn XTMȟ๊ᐡښٿ֮ۢȄၦ ྜࡿЖѠп࢑ϛӤ᜹࠯ޠӉեԚসȂӶлᚡӵ შዀྦ೤ጓϜȂ஡ၦྜࡿЖ᜹࠯ຝ࣐ΚঐِՔ ȞroleȟȂԄӤлᚡ᜹࠯ȂၦྜࡿЖِՔηೞຝ ࣐лᚡȄ შ 5! лᚡϟၦྜࡿЖȞoccurrenceȟ 3. ᜱᖓȞAssociation, Aȟ лᚡϟ໣ѠւңᜱᖓȞtopic associationȟٿᡘ ұڐᇮཏᜱ߾ȂԄშ 6 ᡘұڐлᚡᇅᜱᖓȂپ ԄȶᛴೲϜȷڸȶή୾ᅌဏȷڎлᚡϟ໣ڏԥ ȶቹձȷᜱ߾ȄϛӤܼၦྜࡿЖഀ๗ژНӈٿ ྜȂᜱᖓߓ౫яΚঐє֥ၦଊҐ፵ȃ֖౫ၦଊ л्ቌঅޠޤᜌஆᙄȂΚঐлᚡᜱᖓٯґ४ښ ࣻᜱлᚡޠ኶໕ȄӶлᚡӵშϜȂᜱᖓӤਣη ೞຝ࣐ΚঐлᚡȂηԥᜱᖓ᜹࠯Ȟassociation typeȟȂԄȶቹձȷ֊Ѡຝ࣐ΚᆎᜱᖓȄᜱᖓ᜹ ࠯ױڏԥࣻӤᜱ߾ޠлᚡ༙໲ԚထȂԥֆܼቩ ђлᚡӵშޠߓႁ૗ΩȄлᚡӵშ൸ڐҐ፵Յ ّ࢑࡟ᙐ൑ޠȂплᚡձ࣐ஆҐષ؆Ȃٯւң ᜱᖓ࡛Ҵлᚡϟ໣ޠᜱ߾ȂлᚡѠпԥԂංঐ ӫᆏڸၦྜࡿЖȂٯւңጓ൝४ښӫᆏȃၦྜ ࡿЖڸᜱᖓޠԥਞጓᛠȂ೼൸࢑ഷஆҐޠлᚡ ӵშȄ შ 6! лᚡϟ໣ޠᜱഀȞtopic associationȟ


лᚡӵშᔗңུܼᆹཫ൷๗ݏޠߓႁαȂѠпఽ ྀޠ֖౫ུᆹٲӈ໣ޠᜱᖓᇅ૖๝Ȃ࿌ٻң޲ᄈ࢛ ܓ ຑ ঐུᆹлᚡདᑺ፹ਣȂѠп ңົ ๗ޣ௦ᘉᒶࣻ ۩ ᦰ ᜱུᆹၦྜໍ Ꭸ ȂӤਣη૗ٟഁ׳ژڸ೼ঐུ ᆹлᚡࣻᜱޠڐулᚡȄӱԫлᚡӵშԥֆܼٻң ޲ץഁජඬᇅሇᚭᎨ᠟ུᆹၦଊȄ

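Research Methods and Architecture

Research Architecture

Figure 7 shows the architecture of the proposed intelligent news search engine with a topic map interface based on automatic query expansion. The system consists of three modules and two databases, described below.

1. News archive module
The study developed a news crawler agent that retrieves from the Internet the news events already classified by Google News. For each news event it automatically extracts the metadata (news title, reporting medium, report time, and news body) and performs Chinese word segmentation, then stores the results in the news event database as the basis for generating the news ontology and for news search.

2. Automatic ontology generation agent
The automatic ontology generation agent takes the title terms that Chinese word segmentation produces for each news event, processes the co-occurring terms, and uses the MHNN to cluster terms by their co-occurrence association. A news term ontology is built from the result and stored in the ontology database according to the Google News categories. When a user enters a news keyword, the ontology-based search engine searches the news database for all news event categories that contain the keyword, and then expands the query over news events according to the term associations stored in the ontology database, in order to strengthen news search performance.

3. Topic maps generation agent
The topic maps generation agent takes the keyword entered by the user, recommends the ten categories of topical news most relevant to the query term, and automatically builds a topic map that presents the related news topics visually for browsing. When the user clicks the query expansion function in the topic map, the system expands the query according to the news category term ontology that has already been built. Query results are sorted by the time stamp recording when each news event was published, so that the user can grasp the complete development of the news event.

Figure 7. System architecture (components: News Archive Module with News Crawler Agent, News Metadata Extraction Process, and Chinese Word Segmentation for News Metadata; News Event Database; Automatic Ontology Generation Agent; Ontology Database; Ontology-based Search Engine; Topic Maps Generation Agent; User Interface for Query; Visualization Presentation; Google News; User; data flows numbered 1-13)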

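Research Methods

This study sets out to build a term ontology for the news domain automatically, so that queries can be expanded automatically through the ontology and news search performance strengthened. Figure 8 is the flowchart of the proposed automatic construction of the news ontology, explained below.

Data Preprocessing

While the news crawler agent retrieves Google News from the Internet and archives it in the news event database, the titles of all news events classified by Google News go through Chinese word segmentation. This produces the set of terms for each category of news events, and the frequency with which each term occurs within the same news category is counted. This study uses the CKIP Chinese word segmentation system developed by Academia Sinica.

Our observations show that the automatic news classification of Google News is not one hundred percent accurate, so some news events have low similarity to the other news events in their category. To remove title terms that have little association with the news reports of a category, and so raise the correctness of the resulting news ontology, the study deletes these title terms before constructing the ontology. Doing so also lowers the input dimension of the MHNN and speeds up the convergence of the MHNN when it generates the news ontology automatically.

Title terms are deleted by filtering with a threshold: a title term is removed if the proportion of news items in the category in which it appears is below one tenth. The filtering condition is given in Formula 2:

$$tf_\theta = \lceil NewsNum_{ClassID} \times 10\% \rceil, \quad tf_\theta \le 1 \qquad \text{(Formula 2)}$$

where $NewsNum_{ClassID}$ is the total number of news items in a given news category and $tf_\theta$ is the threshold on the proportion in which a term appears.

The one-tenth threshold was chosen so that Google News categories with 10 or more news items are retained for ontology construction. Title terms below the threshold have little association with the news topic of the category and are deleted. If the threshold is set too high, many news categories are deleted and no ontology can be built for them. After analyzing the distribution of the number of news items in each Google News category and the quality of the resulting ontologies, the study therefore set the threshold proportion to one tenth of the total number of items. Using the news terms that pass this filter as the basis for ontology construction keeps the input dimension of the MHNN reasonable and yields news ontologies of better quality.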

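Computing News Term Weights

After the preprocessing of the previous stage, the similarity between news terms must be computed, which reveals how strongly the news terms are associated. This study adopts the term weighting formulas proposed by Salton (1989):

$$D_i = tf_i \times \log d_i \qquad \text{(Formula 3)}$$

$$D_{ij} = tf_{ij} \times \log d_{ij} \qquad \text{(Formula 4)}$$

where $d_i$ is the number of news items in the same news category whose titles contain the i-th title term; $d_{ij}$ is the number of news items in the same category whose titles contain both the i-th and the j-th title terms; $tf_i$ is the number of times the i-th title term occurs in news titles; $tf_{ij}$ is the number of times the i-th and j-th title terms occur together in news titles; $D_i$ is the weight of the i-th title term; and $D_{ij}$ is the weight of the co-occurrence of the i-th and j-th title terms. In the news database of this study, however, news titles are very short and terms rarely repeat within a title, so $tf_i$ and $tf_{ij}$ are 1 for most terms. The degree of association between co-occurring title terms is then quantified by the following formulas:

$$rel(t_i, t_j) = \frac{D_{ij}}{D_i} \qquad \text{(Formula 5)}$$

$$rel(t_j, t_i) = \frac{D_{ji}}{D_j} \qquad \text{(Formula 6)}$$

where $rel(t_i, t_j)$ is the degree of co-occurrence association with the j-th title term when the i-th title term is taken as the base, and $rel(t_j, t_i)$ is the degree of co-occurrence association with the i-th title term when the j-th title term is taken as the base. These computations give the degree of association for every pair of co-occurring title terms, which forms an asymmetric matrix as shown in Table 2. When $rel(t_i, t_j) = rel(t_j, t_i)$, the i-th and j-th title terms are in a "symmetric relation". For example, if rel(九二, 共識) = rel(共識, 九二), then whichever of the two title terms 九二 and 共識 is taken as the base, the other co-occurs with it to the same degree. In other words, the two terms always appear together in news titles and neither appears in a title alone.

Table 2. Co-occurring terms in an asymmetric matrix of $rel(t_i, t_j)$ (rows: $Term_i$; columns: $Term_j$)

Term_i \ Term_j | 九二 | 共識 | 扁十 | 裁示 | 天下 | 十點 | ...
九二 | 1 | 1 | 0.712 | 1 | 0.712 | 0.565 | ...
共識 | 1 | 1 | 0.712 | 1 | 0.712 | 0.565 | ...
扁十 | 1 | 1 | 1 | 1 | 1 | 0 | ...
裁示 | 0.702 | 0.702 | 0.5 | 1 | 0.5 | 0.5 | ...
天下 | 1 | 1 | 1 | 1 | 1 | 0 | ...
十點 | 0.792 | 0.792 | 0 | 1 | 0 | 1 | ...
... | ... | ... | ... | ... | ... | ... | 1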
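Formulas 3 to 6 can be sketched as a small function. The counts used in the example are hypothetical: they were chosen only because they reproduce two entries of Table 2 (0.565 and 0.792) when every term frequency is 1, and they are not taken from the paper. The guard for a zero base weight (a term that occurs in a single item, so that log d is 0) is also an assumption of this sketch.

```python
import math

def term_weight(tf, d):
    """Formulas 3 and 4: weight = tf * log(d)."""
    return tf * math.log(d)

def rel(tf_ij, d_ij, tf_i, d_i):
    """Formulas 5 and 6: co-occurrence weight divided by the base term's weight."""
    base = term_weight(tf_i, d_i)
    if base == 0:
        # Assumption: a base term seen in only one item has no measurable association.
        return 0.0
    return term_weight(tf_ij, d_ij) / base

# Hypothetical counts: term A appears in 7 titles, term B in 4, both together in 3.
rel_a_b = rel(1, 3, 1, 7)  # A as base
rel_b_a = rel(1, 3, 1, 4)  # B as base
```

Because the denominator depends on which term is the base, the two directions generally differ, which is why the matrix in Table 2 is asymmetric.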

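Modified Hopfield Neural Networks (MHNN)

This study uses the MHNN to generate the news ontology (Jain, Chen & Ichalkaranje, 2002; Lin & Chen, 1996; Xiaowei & Minghu, 2003). In the MHNN, neurons and synapses are interconnected into a knowledge network. Figure 9 shows the learning architecture of the Hopfield neural network. Each neuron in this architecture represents one news-title term, and the synapse connecting two neurons carries their weight, that is, the value $rel(t_i, t_j)$. A chosen input neuron activates (trains) the other neurons. Instead of selecting a single winning neuron, the network applies the nonlinear transfer function $f_s(net_j)$ repeatedly until the outputs of all terms stay unchanged (convergence), and the final neuron outputs group related news-title terms into one cluster. The MHNN learning algorithm is as follows.

Step 1: Each neuron node represents one news-title term, and $t_{ij}$ denotes the co-occurrence association weight between the i-th and the j-th news-title terms, that is, $rel(t_i, t_j)$. The weights are computed as described in the previous section.

Step 2: The initial set of input terms is $\{t_1, t_2, t_3, \ldots, t_i, \ldots, t_n\}$. When the network is trained, any news term is picked as the training base (starting term) and interacts with every neuron node. The value $x_i$ of the neuron chosen as the starting term is initialized to 1; the neurons not chosen remain candidate starting terms and their values are set to 0, as shown in Formula 7:

$$\mu_i(0) = x_i, \quad 0 \le i \le n-1 \qquad \text{(Formula 7)}$$

where $x_i$ is the input value of the i-th neuron, $\mu_i(0)$ is the output of the i-th neuron at time $t = 0$, and $n$ is the number of neurons.

Figure 8. Flow of the automatic construction of the news ontology (steps: data pre-processing; terms from the word segmentation system CKIP; determining the set of qualified terms; computing term weights; Hopfield network; detecting the stop criterion (NO/YES); detecting the cluster criterion; the generated ontology; ontology database)

Formulas 8 and 9 are computed repeatedly until the network converges:

$$\mu_j(t+1) = f_s\left[\sum_{i=0}^{n-1} t_{ij}\,\mu_i(t)\right], \quad 0 \le j \le n-1 \qquad \text{(Formula 8)}$$

where $t_{ij}$ is the co-occurrence association weight $rel(t_i, t_j)$ between the i-th and j-th news-title terms, that is, the synaptic weight; $f_s$ is the sigmoid transfer function; and $n$ is the number of neurons. The sigmoid transfer function $f_s$ is computed as:

$$f_s(net_j) = \frac{1}{1 + \exp\left[-\dfrac{net_j - \theta_j}{\theta_0}\right]} \qquad \text{(Formula 9)}$$

where $\theta_0$ is a constant that adjusts the shape of the sigmoid transfer function $f_s$, and $\theta_j$ is the output threshold (threshold or bias). The quantity $net_j$ in Formula 9 is computed as:

$$net_j = \sum_{i=0}^{n-1} t_{ij}\,\mu_i(t) \qquad \text{(Formula 10)}$$

Step 3: The Hopfield neural network repeats the steps above until the output news terms no longer change. The study uses the criterion in Formula 11 to decide whether the network has converged:

$$\sum_{j=0}^{n-1} \left[\mu_j(t+1) - \mu_j(t)\right]^2 \le \varepsilon \qquad \text{(Formula 11)}$$

where $\mu_j(t)$ is the output of the j-th node at time $t$ and $\varepsilon$ is the maximum error tolerated at convergence.

The three learning parameters $\theta_j$, $\theta_0$, and $\varepsilon$ must be chosen appropriately. Repeated tests in this experiment showed that $\theta_j = 1$ and $\theta_0 = 1$ give better learning results, and the convergence error $\varepsilon$ was set to 1. Because the study analyzes co-occurrence association only among news-title terms and uses the MHNN only to cluster those terms, the number of neurons to be trained for each category of news events is small. Each clustering run for one news event category therefore converges within a few seconds on a personal computer. After convergence, however, the outputs of the neurons differ noticeably only several places after the decimal point, which makes the groups hard to separate. The study therefore proposes a clustering criterion to deal with this problem, as shown in Formula 12:

$$|Y_i - Y_j| \le \alpha, \quad 0 \le i, j \le n-1, \quad 0 \le \alpha \le 1 \qquad \text{(Formula 12)}$$

where $Y_i$ is the final output of the i-th neuron (the i-th news-title term) after convergence, $Y_j$ is the final output of the j-th neuron (the j-th news-title term) after convergence, $\alpha$ is the clustering threshold, and $n$ is the number of all neurons (news-title terms).

The MHNN applies Formula 12 to the final outputs of the neurons pair by pair. If the absolute difference between two neurons (news-title terms) is less than or equal to the threshold $\alpha$, the two terms are judged likely to co-occur in the title of the same news event and are placed in the same group. Repeated experiments showed that if $\alpha$ is set too high there are too few groups, co-occurring terms of different news event titles cannot be separated, and each group contains too many co-occurring terms; the resulting ontology seriously slows down later query expansion. If $\alpha$ is set too low there are too many groups, each group contains few co-occurring terms or even none, and no meaningful news ontology can be built to support news search. The tests showed that $\alpha = 0.1$ gives good clustering results.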
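The iteration in Formulas 7 to 11 and the grouping rule in Formula 12 can be sketched as below. This is a hedged illustration rather than the authors' implementation: the function names, the iteration cap, and the toy weight matrix are invented here, and merging pairs into groups by connected components is an assumption, since the text only says that qualifying pairs are placed in the same group.

```python
import math

def sigmoid(net, theta_j=1.0, theta_0=1.0):
    """Formula 9: sigmoid with output threshold theta_j and shape constant theta_0."""
    return 1.0 / (1.0 + math.exp(-(net - theta_j) / theta_0))

def activate(t, start, theta_j=1.0, theta_0=1.0, eps=1.0, max_iter=100):
    """Iterate Formula 8 from a one-hot start vector (Formula 7) until Formula 11 holds."""
    n = len(t)
    mu = [1.0 if i == start else 0.0 for i in range(n)]
    for _ in range(max_iter):
        new_mu = [
            sigmoid(sum(t[i][j] * mu[i] for i in range(n)), theta_j, theta_0)
            for j in range(n)
        ]
        change = sum((a - b) ** 2 for a, b in zip(new_mu, mu))
        mu = new_mu
        if change <= eps:
            break
    return mu

def group_terms(outputs, alpha=0.1):
    """Formula 12: link terms whose outputs differ by at most alpha, then merge links."""
    parent = list(range(len(outputs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(outputs)):
        for j in range(i + 1, len(outputs)):
            if abs(outputs[i] - outputs[j]) <= alpha:
                parent[find(j)] = find(i)
    groups = {}
    for i in range(len(outputs)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Toy weights: terms 0 and 1 always co-occur, term 2 never co-occurs with them.
weights = [[1, 1, 0], [1, 1, 0], [0, 0, 1]]
outputs = activate(weights, start=0, eps=1e-9)
groups = group_terms(outputs, alpha=0.1)
```

The sketch uses a much smaller `eps` than the value of 1 reported in the text so that the toy network runs several iterations; with the reported value it would stop after the first update.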

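Constructing the Upper and Lower Levels of the News Ontology

An ontology can describe the hierarchical structure of the knowledge in a domain, with some degree of relation between upper and lower levels. After the MHNN clustering, the title terms of each news event category that passed preprocessing are therefore gathered into groups of associated terms. In the proposed news ontology structure, the relation between title terms in the same group is described by the computed co-occurrence association between the terms. If the two values $rel(t_i, t_j)$ computed for two terms in the same group are equal, the relation is drawn as a double-headed arrow; if the values differ, each is drawn as a single-headed arrow; and if $rel(t_i, t_j)$ is 0, the two terms have no co-occurrence association and nothing is stored.

For the association between groups, the study proposes the measure in Formula 13:

$$Weight_{i,j} = \frac{\sum_{p=0}^{n_i-1} \sum_{q=0}^{n_j-1} rel(T_{i,p}, T_{j,q})}{n_i \times n_j} \qquad \text{(Formula 13)}$$

where $0 \le Weight_{i,j} \le 1$ and $0 \le i, j \le n-1$; $Weight_{i,j}$ is the hierarchical association between the i-th group and the j-th group; $n_i$ is the number of news-title terms in the i-th group; $n_j$ is the number of news-title terms in the j-th group; and $rel(T_{i,p}, T_{j,q})$ is the co-occurrence association between the p-th title term of the i-th group and the q-th title term of the j-th group.

Formula 13 gives the degree of association between any two groups. Take groups 1 and 2 in Figure 10 as an example. With group 1 as the base, $rel(t_i, t_j)$ is summed over every title term in group 1 paired with every title term in group 2 and then averaged; the result means that group 1 implies group 2 with an association of 0.452. With group 2 as the base, group 2 implies group 1 with an association of 0.33. Group 1 holds two terms (九二 and 共識) and group 2 holds three terms, so each value is the sum of six $rel$ terms divided by six:

$$Weight_{1,2} = \frac{\sum_{p \in G_1} \sum_{q \in G_2} rel(p, q)}{2 \times 3} = 0.452, \qquad Weight_{2,1} = \frac{\sum_{q \in G_2} \sum_{p \in G_1} rel(q, p)}{3 \times 2} = 0.33$$

The terms of each group are compared in this way with the terms of every other group. If the two $Weight_{i,j}$ values computed for a pair of groups are equal, the relation is drawn as a double-headed arrow; if they differ, single-headed arrows are used. If both values are 0, the two groups are unrelated and nothing is stored. The steps described above for associations within a group and between groups produce the news query-term ontology of a news event category, as shown in Figure 10.

Figure 10. Query-term news ontology generated for one news event category, a headline concerning 九二共識 (six term groups numbered 1-6, joined by directed edges labeled with inter-group association weights, for example 0.452 and 0.33 between groups 1 and 2)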
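The inter-group measure of Formula 13 can be sketched as an average over all cross-group term pairs. The term names and `rel` values below are hypothetical placeholders, not values from the paper, and storing `rel` as a dictionary keyed by ordered pairs (with missing pairs treated as 0) is an assumption of this sketch.

```python
def group_weight(rel, group_i, group_j):
    """Formula 13: mean of rel(p, q) over every p in group_i and q in group_j."""
    total = sum(rel.get((p, q), 0.0) for p in group_i for q in group_j)
    return total / (len(group_i) * len(group_j))

# Hypothetical directed association values between four placeholder terms.
rel = {
    ("a", "c"): 0.6,
    ("a", "d"): 0.4,
    ("b", "c"): 0.2,
    ("c", "a"): 0.3,
}
weight_1_to_2 = group_weight(rel, ["a", "b"], ["c", "d"])  # group 1 as base
weight_2_to_1 = group_weight(rel, ["c", "d"], ["a", "b"])  # group 2 as base
```

Because `rel` is directional, the two weights differ, which corresponds to the single-headed arrows between groups in the ontology.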

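Presentation Mechanism of the Topic Map

This study uses the automatically generated news-domain ontology as the basis for automatic expansion of search terms, and uses a topic map presentation to strengthen the visual display of news retrieval results. When a user wants to query a news topic, the system asks the user to set the news keyword and the threshold for first-layer query expansion. The back end then searches all news group categories for those associated with the keyword and recommends the ten most relevant news group categories for the user to click and read.

The user can then click a news group category of interest to browse it and perform a first-layer term-expansion query over the news ontology. The system uses the expansion threshold the user chose at the start to select, from the first layer of the news ontology, the news-title terms whose $rel(t_i, t_j)$ values are greater than or equal to the threshold, and performs first-layer query expansion with them. It also makes full use of the Google News classification so that the user can browse the related news titles by news category. If the user wants to know about news related or similar to this category, the user can click again to perform query expansion on the next layer of the news ontology.

Figure 11 shows the screen of news categories recommended by the topic map. The system recommends the top ten news categories, ranked by proportion, for the user to browse and read. The user can move the mouse to browse the titles of the TOP N news categories and select the category of greatest interest for query expansion.

Figure 11. Recommended news group categories in the topic map

Take Figure 10 as an example. When the user queries 陳水扁, the system determines that the term falls in the second group of the MHNN clustering. In the first-layer expansion of the news ontology, the query is expanded with all the terms clustered in the same group, for example "陳水扁, 兩岸" and "陳水扁, 裁示". If the user wants to query further, the system recommends the most relevant group for expansion. In this example five groups are related to the second group, with the following degrees of association: 0.33 with the first group, 0.226 with the third, 0.153 with the fourth, 0.258 with the fifth, and 0.095 with the sixth. The system therefore recommends the first group, which has the highest association, for second-layer query expansion of the news ontology; that is, the query is expanded with the first group's terms "九二, 共識". Other expansions proceed in the same way.

An actual query illustrates how the whole system operates. As shown in Figure 12, the user wants to query news topics related to 九二共識. Based on keyword retrieval, the system recommends the ten most relevant news group categories for the user to click and read, and the user selects the news events of the news group category with code 49664 for the ontology-based expansion query. The system also asks the user to set the threshold for first-layer query expansion (0.5 in this example) and performs the first-layer expansion query with that value.

Figure 12. Querying news topics related to 九二共識

Figure 13 shows the results of the first-layer query expansion. The system also provides the results sorted by time stamp for the user to browse, along with the most recently published news title in each news category for reference. The user can click a news category of interest to view all the details of the related news found by the first-layer expansion in that category, as shown in Figure 14.

Figure 13. Results of the first-layer query expansion, with news sorted by time stamp

Figure 14. All news details found by the first-layer expansion query for news group category 49664

The "Query Extension" hyperlink in the upper right corner of Figure 13 lets the user conveniently go on to expand the query at the next layer. The system automatically recommends the most relevant level for further expansion according to the upper and lower level relations of the news ontology. Figure 15 shows the results of the second-layer expansion query, with news sorted by time stamp.

Figure 15. Results of the second-layer expansion query, with news sorted by time stamp

In summary, query expansion with the proposed news ontology has the following properties. In a first-layer expansion query, most of the news in the news group categories found is directly related to the query term, whereas the news group categories found by a second-layer expansion query are indirectly related to the topic the user wants to query. The first-layer expansion is thus mainly a vertical, depth-oriented search whose results are directly related to the topic news, and the second-layer expansion is mainly a horizontal, breadth-oriented search whose results are indirectly related to the topic news.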
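The first-layer expansion rule described above (keep the terms whose association with the query term meets the user's threshold) can be sketched as a simple filter. The function name and the dictionary layout are assumptions of this sketch; the scores are three of the values that Table 2 lists for the term 九二, used here only as illustrative input.

```python
def expand_query(term, rel_row, threshold):
    """Return the terms whose rel(term, other) is at least the threshold."""
    return [
        other
        for other, score in rel_row.items()
        if other != term and score >= threshold
    ]

# Association scores of 九二 with three other terms, as listed in Table 2.
rel_row = {"共識": 1.0, "天下": 0.712, "十點": 0.565}
strict = expand_query("九二", rel_row, threshold=0.8)
loose = expand_query("九二", rel_row, threshold=0.5)
```

A higher threshold keeps fewer expansion terms, which matches the trade-off reported in the experiments between precision and the number of retrieved items.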

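Experimental Results and Analysis

This study evaluates news retrieval performance on archived Google News. From November 9, 2004 to June 16, 2006, a total of 60,269 news categories and 1,191,065 news items were archived in the database. Table 3 shows the distribution of the news database archived by the back end of the news search engine developed in this study.

Table 3. Distribution of the news database archived in this study

Item | Value
Google News archiving period | 2004-11-9 to 2006-6-16
Total number of news events | 1,191,065 items
Total number of news categories | 60,269 categories
Average number of news events collected per day | 2039.49 items per day
Average number of news items per category | 19.76 items per category
Number of items in the largest archived news group category | 3007
Number of items in the smallest archived news group category | 2

Because the archived news database is very large, the experiments use a random sample of 4348 news group categories as the basis for generating the news ontology automatically and supporting news search, so that retrieval performance can be evaluated effectively. Under the topic map recommendation mechanism developed in this study, the system produces the ten categories of news event topics most relevant to the query term, and queries over news events can be expanded for these news topics. Five popular news keywords were picked at random by hand to verify the retrieval performance of the proposed method, as listed in Table 4. Performance is evaluated with two measures: the precision rate after query expansion and the improvement in recall rate. Because the number of archived news events is so large, the total number of news events relevant to a user's query term cannot be computed exactly, so recall itself cannot be compared; the improvement in recall nevertheless brings out the performance of the proposed method. The search results of the news search engine were judged by twenty users, who took the news topic being queried as the reference point and assessed whether each retrieved news event matched it.

Table 4. The five news keywords used to test search performance

Query 1 | 三一九槍擊案 (the March 19 shooting case)
Query 2 | 勞退新制 (the new labor pension system)
Query 3 | 外籍新娘 (foreign brides)
Query 4 | 高鐵通車 (opening of the high-speed rail)
Query 5 | 南迴搞軌案 (the South-Link Line derailment case)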

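Experiment 1

Experiment 1 observes the overall retrieval precision of first-layer and second-layer query expansion over the news events of the same news group category, after users issue the five query keywords and click the news topic of interest. The query expansion threshold was set to 0.3, 0.5, and 0.8 in turn. From the search results of the news search engine, precision is obtained as the proportion of relevant news events among all retrieved news events. Table 5 shows, for each news keyword and each threshold, the total number of news items retrieved and the number of relevant items. Figure 16 shows the precision of each news keyword at each threshold. Across the three threshold settings, the overall retrieval performance of first-layer and second-layer expansion reaches close to ninety percent precision or more on average. The average precision over the five news keywords at thresholds 0.3, 0.5, and 0.8 is 88.85%, 90.39%, and 92.61% respectively. The higher the threshold, the better the retrieved news matches the news topic the user is looking for, although the recall of news events relevant to the query term can be expected to fall as a result.

The results also show that the second query keyword, 勞退新制, reaches only a relatively low precision of about 76% whatever the expansion threshold. Two main causes were identified. First, the query intent expressed by this keyword is clearly less specific than that of the other four keywords, so the query results are less focused and users' perceptions of the retrieval results differ more easily, which affects precision. Second, the news events that Google News classified into this news category differ considerably from one another: all of them concern 勞退新制, but the issues they focus on are scattered, which also affects precision. The choice of news keyword therefore has a decisive influence on retrieval performance; when the keyword a user enters is not specific enough, the retrieval results are often unsatisfactory. Consider the term 石油上漲 (rising oil prices): the situation is not confined to one country, and oil price rises may involve political, economic, and technological factors in many countries, so the search results involve more domains and are less accurate.

Table 5. Number of relevant news items retrieved for each news keyword at each threshold (relevant items / total items)

Threshold | Query 1 | Query 2 | Query 3 | Query 4 | Query 5
0.3 | 361/432 | 420/551 | 160/164 | 458/478 | 1239/1360
0.5 | 330/401 | 415/542 | 154/159 | 450/455 | 1324/1360
0.8 | 63/65 | 410/533 | 152/156 | 339/340 | 1252/1360

Figure 16. Precision rates of the news keywords at various thresholds

Threshold | Query 1 | Query 2 | Query 3 | Query 4 | Query 5
0.3 | 83.56% | 76.23% | 97.56% | 95.82% | 91.10%
0.5 | 82.29% | 76.57% | 96.86% | 98.90% | 97.35%
0.8 | 96.92% | 76.92% | 97.44% | 99.71% | 92.06%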
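The precision figures reported for Experiment 1 follow directly from the counts in Table 5. The short sketch below recomputes them; the variable and function names are invented here, and the counts are copied from Table 5.

```python
# (relevant, retrieved) pairs for Query 1..5 at each threshold, from Table 5.
TABLE_5 = {
    0.3: [(361, 432), (420, 551), (160, 164), (458, 478), (1239, 1360)],
    0.5: [(330, 401), (415, 542), (154, 159), (450, 455), (1324, 1360)],
    0.8: [(63, 65), (410, 533), (152, 156), (339, 340), (1252, 1360)],
}

def precision(relevant, retrieved):
    """Share of retrieved news events judged relevant."""
    return relevant / retrieved

def mean_precision(pairs):
    """Average precision over the queries at one threshold."""
    return sum(precision(r, n) for r, n in pairs) / len(pairs)

# Average precision per threshold, as a percentage rounded to two decimals.
averages = {th: round(100 * mean_precision(pairs), 2) for th, pairs in TABLE_5.items()}
```

The per-threshold averages match the 88.85%, 90.39%, and 92.61% quoted in the text.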

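Experiment 2

Experiment 2 compares the results of automatic query expansion at the two layers of the news ontology for the same news group category, after users click the news topic of interest. Its main purpose is to verify whether automatic expansion through two layers of the news ontology can find implicitly related news events for the query term. Figures 17(a), 17(b), and 17(c) compare the precision of first-layer and second-layer expansion for the news keywords under each threshold setting. Query 4 has only a one-layer news ontology, so no second-layer comparison is possible for it. The results show that first-layer expansion has higher precision than second-layer expansion on average. The reason is that first-layer expansion generally stays with the news topic the user wants to query, whereas second-layer expansion reaches the peripheral news of the news event. For example, for the query 三一九槍擊案, the first layer directly shows the related news topics, while the second layer shows information about 李昌鈺博士, who investigated the case (such as the criminal cases he has handled and his background), as well as peripheral news such as a travel-expense incident connected with the case. This lets the user fully understand how the case developed from beginning to end and find the related news topics it involves.

Figure 17. Comparison of first-layer and second-layer expansion precision for the news keywords at different query expansion thresholds

(a) Threshold of co-occurrence terms set to 0.3

Layer | Query 1 | Query 2 | Query 3 | Query 5
News ontology of first layer | 97.50% | 84.05% | 100.00% | 100.00%
News ontology of second layer | 75.37% | 51.41% | 93.33% | 90.94%

(b) Threshold of co-occurrence terms set to 0.5

Layer | Query 1 | Query 2 | Query 3 | Query 5
News ontology of first layer | 96.18% | 84% | 100.00% | 100.00%
News ontology of second layer | 75.56% | 58.97% | 91.80% | 97.30%

(c) Threshold of co-occurrence terms set to 0.8

Layer | Query 1 | Query 2 | Query 3 | Query 5
News ontology of first layer | 100.00% | 85.41% | 100.00% | 100.00%
News ontology of second layer | 95.00% | 56.41% | 93.44% | 91.91%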

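Experiment 3

Because the number of archived news events is so large, the total number of news categories relevant to a user's query term cannot be computed exactly, so recall itself cannot be compared. For the news keywords users want to query, this experiment therefore examines whether automatic query expansion with the news ontology improves recall over news group categories compared with traditional keyword-matching search. The evaluation assumes that the news database contains $x$ news group categories relevant to the keyword the user wants to query. If traditional keyword matching finds $a$ relevant categories, its search results cover a proportion $a/x$ of all relevant news categories; if automatic expansion with the news ontology finds $b$ categories, its results cover a proportion $b/x$. The improvement in recall is therefore

$$\frac{b/x - a/x}{a/x} = \frac{b - a}{a}$$

The experiment shows that traditional keyword-matching search treats a result as relevant only if it contains the news keyword the user entered. If a news title has the same meaning but does not contain the keyword, the system cannot regard it as relevant news and does not retrieve it. By contrast, the automatic query expansion with the news ontology proposed in this study finds more related news topics and news categories. Figure 18 shows that, for the five test query terms, the improvement in recall after automatic expansion with the news ontology is 8.7%, 4.55%, 66.67%, 60%, and 59.09% respectively, an average improvement of 39.8%.

Figure 18. Distribution of the recall improvement from automatic news-ontology query expansion for the five query keywords (x-axis: Query 1-5; y-axis: recall rate, 0%-80%)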
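The recall-improvement measure used in Experiment 3 can be sketched as follows. The function name is invented here and the category counts in the example are hypothetical, since the paper reports only the resulting percentages; the last lines recompute the reported average from the five percentages given in the text.

```python
def recall_improvement(a, b):
    """(b - a) / a: relative gain in relevant categories found after expansion.

    a: relevant categories found by keyword matching (must be > 0)
    b: relevant categories found with ontology-based query expansion
    """
    return (b - a) / a

# Hypothetical counts: 3 categories by keyword matching, 5 after expansion.
gain = recall_improvement(3, 5)

# Average of the five improvements reported in the text (in percent).
reported = [8.7, 4.55, 66.67, 60, 59.09]
average_gain = round(sum(reported) / len(reported), 2)
```

The unknown total $x$ cancels out of the ratio, which is why the measure can be computed without knowing how many relevant categories the database holds.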

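Conclusions and Future Research Directions

Conclusions

This study proposes a method for constructing a news ontology automatically and applies it to query expansion in a news search engine. The method helps with the problem that users find it hard to choose good query terms for retrieving news. The experimental results verify that the proposed news search engine with query expansion achieves good precision and can also raise the recall of news effectively. Query expansion through the news ontology combines the advantages of vertical, depth-oriented search and horizontal, breadth-oriented search, and sorting the query results by time stamp presents the development of the news. In addition, the topic map user interface comes closer to what users need when browsing news events.

Future Research Directions

Although this study has developed a topic map news search engine with query expansion that has practical value, several issues deserve deeper study in the future. They are summarized below.

1. Improving the retrieval speed of the search engine
When the news search engine expands a query according to the news ontology, the search takes longer if that layer of the ontology contains many co-occurring terms. Moreover, the system has now archived more than one million news items, which lowers the efficiency of database operations. How to use a better data indexing mechanism or data structure to improve the search speed of the whole engine will be the key to whether the system can become a practical news search engine.

2. Determining the learning parameters of the Modified Hopfield Neural Network
The learning parameters of the Modified Hopfield Neural Network used in this study were obtained empirically by trial and error. Learning convergence is sensitive to these parameters to some degree, as is the case for any machine learning method. To reduce this sensitivity, optimization-based search methods (Reklaitis, Ravindran & Ragsdell, 1993) are a feasible way to help determine the parameters, but finding suitable parameters with optimization methods requires a considerable amount of extra computing time.

3. Adding synonym expansion to find more related news topics
News media give different titles to their reports of the same news event. Besides the news ontology built on co-occurring terms that this study proposes, integrating synonyms into the news ontology would provide better news search quality.

4. Enlarging the Chinese word segmentation lexicon and reducing unknown words to strengthen the construction of the news ontology
This study uses the CKIP Chinese word segmentation system of Academia Sinica to extract candidate news-title terms for building the news query-term ontology. The biggest problem at present is that unknown words appear frequently in news and the CKIP system cannot handle some of them correctly, so many terms that would be valuable for building the news ontology are lost. Reducing unknown words would improve the quality of the news ontology. The study will also try replacing CKIP with the ECScanner Chinese word segmentation system, which mines unknown news words automatically and extends its lexicon automatically. ECScanner includes a mechanism that mines new words from the Google News website to strengthen lexicon-based segmentation (Hong, Chen & Chiu, 2008) and can effectively address the unknown-word problem that CKIP faces.

5. Interface presentation of the topic map
This experiment uses the Jpgraph API to generate the topic map browsing interface. Developing a more interactive topic map presentation with Ajax techniques would help provide a friendlier user interface.

6. Improving the method for computing news-title term weights
The term weighting method used in this study is better suited to long articles. A weighting method suited to short titles could be sought in the future to raise the correctness of the computed associations between news terms.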

References

Baeza-Yates, R., & Ribeiro-Neto, B. (1999). Modern information retrieval. Addison-Wesley.

Bernaras, A., Laresgoiti, I., & Corera, J. (1996). Building and reusing ontologies for electrical network applications. In Proceedings of the 12th European Conference on Artificial Intelligence, 298-302.

Berners-Lee, T., & Fischetti, M. (1999). Weaving the web. Orion Business Books.

Binstock, A., & Rex, J. (1995). Practical algorithms for programmers. Addison-Wesley.

Choi, J., Kim, M., & Raghavan, V. V. (2001). Adaptive feedback methods in an extended boolean model. In Proceedings of ACM SIGIR Workshop on Mathematical/Formal Methods in Information Retrieval. New Orleans, LA.

Chen, C. M., & Liu, C. Y. (2008). Personalized e-news monitoring agent system for tracking user-interested Chinese news events. Applied Intelligence, DOI 10.1007/s10489-007-0106-7.

CKIP Chinese word segmentation system. Retrieved from the World Wide Web: http://ckipsvr.iis.sinica.edu.tw/

ECScanner Chinese word segmentation system. http://dlll.nccu.edu.tw/~rank/ecscanner/

Frakes, W. B., & Baeza-Yates, R. (1992). Information retrieval: Data structures and algorithms. Prentice-Hall.

Gauch, S., & Smith, J. B. (1993). An expert system for automatic query reformation. Journal of the American Society for Information Science, 44(3), 124-136.

Harman, D. K. (1995). Overview of the third text retrieval conference (TREC-3).

Hong, C. M., Chen, C. M., & Chiu, C. Y. (2009). Automatic extraction of new words based on Google news corpora for supporting lexicon-based Chinese word segmentation systems. Expert Systems with Applications, 36(2), 3641-3651.

Jain, L. C., Chen, Z., & Ichalkaranje, N. (2002). Intelligent agents and their applications. Physica-Verlag GmbH.

Kartoo search engine. Retrieved from the World Wide Web: http://www.kartoo.net/e/eng/index.html

Knuth, D. (1973). The art of computer programming, vol. 3: Sorting and searching. Addison-Wesley.

Lin, C., & Chen, H. (1996). An automatic indexing and neural network approach to concept retrieval and classification of multilingual (Chinese-English) documents. IEEE Trans. on Systems, Man, and Cybernetics, Part B: Cybernetics, 26(1), 75-88.

Maron, M. E., & Kuhns, J. L. (1960). On relevance, probabilistic indexing and information retrieval. Journal of the ACM, 7(3), 216-244.

Miller, W. L. (1971). A probabilistic search strategy for MEDLARS. Journal of Documentation, 27(4), 254-266.

Mitra, M., Singhal, A., & Buckley, C. (1998). Improving automatic query expansion. Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 206-214.

Mooter search engine. Retrieved from the World Wide Web: http://www.mooter.com/moot

Neches, R., Fikes, R., Finin, T. W., Gruber, T. R., Patil, R., Senator, T. E., et al. (1991). Enabling technology for knowledge sharing. AI Magazine, 12(3), 36-56.

Och Dag, J. N., Regnell, B., Carlshamre, P., Andersson, M., & Karlsson, J. (2001). Evaluating automated support for requirements similarity analysis in market-driven development. Proceedings of Seventh International Workshop on Requirements Engineering: Foundation for Software Quality (REFSQ'01).

Peat, H. J., & Willett, P. (1991). The limitations of term co-occurrence data for query expansion in document retrieval systems. Journal of the American Society for Information Science, 42(5), 378-383.

Reklaitis, G. V., Ravindran, A., & Ragsdell, K. M. (1993). Engineering optimization methods and applications. Wiley, New York.

Salton, G. (1989). Automatic text processing: The transformation, analysis, and retrieval of information by computer. Addison-Wesley.

Smeaton, A. F., & Van Rijsbergen, C. J. (1983). The retrieval effects of query expansion on a feedback document retrieval system. The Computer Journal, 26(3), 239-246.

Swartout, B., Patil, R., Knight, K., & Russ, T. (1996). Toward distributed use of large-scale ontologies. Proceedings of the 10th Knowledge Acquisition for Knowledge-Based Systems Workshop, 33-40.

The TAO of topic maps. Retrieved from the World Wide Web: http://www.ontopia.net/topicmaps/materials/tao.html

Tseng, Y. H. (2001). Automatic cataloguing and searching for retrospective data by use of OCR text. Journal of the American Society for Information Science and Technology, 52(5), 378-390.

Woods, W. A. (1997). Conceptual indexing: A better way to organize knowledge. Retrieved from the World Wide Web: http://www.sun.com/research/techrep/1997/smli_tr-97-61.ps

Xiaowei, S., & Minghu, J. (2003). An information retrieval system based on automation query expansion and Hopfield network. IEEE Int. Conf. Neural Network & Signal Processing, 1624-1627.


еໍ኉ȃᒄτӓȞ2003ȟȄ᜹ઢစᆪၰᇅዂጚ௢ښ౪ ፤ΤߟȞঔॐޏȟȄѯіҀȈӓ๽Ȅ ؄ঢ়ᇻȞ2002ȟȂп Bio-Ontology ࣐ஆᙄϟлᚡӵშ ೪ॏᇅᄃձȄࡏݎऌ׭τᏱၦଊᆔ౪قᆉς፤ НȂґяޏȂࡏݎᑫȄ ݔ߭Ԛȃዊ໩ኌȃዊ໩஥ᄹȞ2003ȟȄлᚡӵშІڐ ӶસЖڑϟᔗңȄ2003 Ԓၦଊऌ׭ᇅშਫᓣᏱ೛ ःଇཽȂ229-253Ȅ लᛞНȞ2005ȟȄлᚡ྆܉໧ቺዂ࠯Ȉ྆܉ԓཫ൷Ȅ ϜѶτᏱᆪၰᏱಭऌ׭ःفܛᆉς፤НȂґя ޏȂੁ༫ᑫȄ ೩ҔݡȞ2004ȟȄᇮཏᆪαՍ୞Ͼ࡛ᄻҐᡞ፤ϟः فȄЉлఁህϧτᏱၦଊᆔ౪Ᏹقᆉς፤НȂґ яޏȂѯіᑫȄ ചӏ๽ȃೆ໰⩨Ȟ2001ȟȄၦଊᔯસϟϜНມཋᘘ৥Ȅ ၦଊ༉ክᇅშਫᓣᏱȂ8Ȟ1ȟȂ59-75Ȅ ෇ϰᡘȞ2002ȟȄ኶՞НӈϟၦଊಣᙒᇅлᚡϸݚՍ ୞Ͼϟ׭೛ᇅᔗңȄѯіҀҴშਫᓣᓣଊȂ20 Ȟ2ȟȂ23-35Ȅ ဩܒԚȞ2003ȟȄ᜹ઢစᆪၰዂԓᔗңᇅᄃձȄᏑݔȄ
