ࡣᅠ⎊Ւው⥴⦝ןიଭʠʙ㆛߸ߧᇜࠣ
ᅘ⊏၃ଁ೧Ⴞ
An Intelligent News Search Engine with Topic Map
User Interface Based on Automatic Query Expansion
づൠ⺐
ߡἼᄎᗶञણߧሬ⫏⤻⎞ᒆጊણᶇἄԞᄞ࿙
Chih-Ming Chen
Associate Professor, Graduate Institute of Library, Information and Archival Studies, National
Chengchi University
E-mail: chencm@nccu.edu.tw
⇾ⓧ
ߡἼቺⓧञણણ∳Ấ༬ᶇἄᷟंᶇἄᮝ
Mei-Hua Chang
Master Student, Graduate Institute of Learning Technology, National Dong Hwa University
(Meilun Campus)
E-mail: cutesandra15@yahoo.com.tw
ϒ݄
ߡἼᄎᗶञણ⫏⤻Ấણᶇἄᷟंᶇἄᮝ
Wei-Chia Chiu
Master Student, Graduate Institute of Computer Science, National Chengchi University
E-mail: chiu.wei.jia@gmail.com
Keywords: Topic Maps; Information Retrieval; Ontology; Query Expansion; News Search Engine
ȹၪ⣬Ⱥ
Ⲗ ౺ ͗ ൬ Ⳍ ᱹ ଭ ᱿ ᅘ ⊏ ℐ Ἷ ⠘ ᮝ ʴ ͐ ᮢ
≛ଃᅠ⫏⤻ው⥴᱿ᣊトᖣŊ≟ᲿԊ⥓घᅘ
⊏ℐἿņ͛ॖŘGoogle NewsŇר˫⎊Ւ⸒
ଃ ℐ も ℐ ⭰ ᱿ ᅘ ⊏ Ⳗ ⠗ ᅘ ⊏ ʶ ˴ ᱿ ⎊ Ւ Ӡ
ㆩŊ˫ဏͧ⩊≛ᅘ⊏ʶ˴᱿⊌ゝ⫏⤻Ŋ
Journal of Library and Information Science 34(2): 19-41 (October, 2008)
ʬဏͧ˫〦⼫Ⳗ⠗ᅘ⊏ው⥴᱿Լ⋱Ŋ̟ᆯ
ִᤀᘍဏͧᄮΤ〦ᅘ⊏ʶ˴ᱹଭ⋸⃘᱿ው
⥴דͩჇʙ㆛〦ʠᅘ⊏߸ߧ᱿اᑨӼȯ
ᶇἄӴᮢ Google News ר˫⎊Ւ૽ᅘ⊏
ʶ˴Ӡㆩ᱿᧚ඖŊⳖɺᔎӴᮢᄊ⏦ೣパᇒ⓪
ᦲ ㆩ Ṙ ℐ ⭰ ņ Modified Hopfield Neural
Networks, MHNNŇ⎊Ւᔖₗᮟᮝᅘ⊏ው⥴⦝
ןᵧ⨯㋤ņontologyŇŊʏͩჇᮟᮝʠ
ᅘ⊏ው⥴⦝ןᵧ⨯㋤ŊဏӛʴɺΤר˫ͩ
Ⴧ͐ᮢ≛⎟⬶ᅘ⊏ㆩӲ͗ϝᅘ⊏ʶ˴᱿ი
ଭው⥴ņquery expansionŇᑨӼŊ˫ջᅘ
⊏၃ଁ᱿ᄓ⋱ȯᔍकŊ☼᮫ʙ㆛߸ߧņtopic
mapsŇʠᅞೣဏͧሷӲᅠЗ⃥ᅘ⊏اᅞೣ
⃛͐ᮢ≛ᡕ⤍〇⩊Ŋר˫૽ᅘ⊏ው⥴ኞͩ
Ⴧᆹ⿵ᐻ⥆דʙ㆛〦⊓اŊ⩕͐ᮢ≛⋱ሩ
ᛤᎸ߸ဝᄮΤᅘ⊏ʙ㆛᱿ᱹଭ⋸⃘ȯ
[Abstract]
With the rapid development of computer and
Internet technologies, news aggregator sites
containing large numbers of news articles have
appeared on the Internet, creating a demand for
more advanced information retrieval. In particular,
some news sites, such as Google News, provide
automatically classified news and a keyword-based
search mechanism that lets readers retrieve the
news events they are interested in. However, most
news sites currently neither support retrieving the
developing context of a news event nor display
search results as a topic map in a visual user
interface. Therefore, this study presents a novel
news search engine with an automatic query
expansion mechanism and a friendly topic map
user interface, based on a news ontology that is
generated automatically by a Modified Hopfield
Neural Network (MHNN). The experimental results
indicate that the proposed query expansion
mechanism, based on the generated news ontology,
effectively helps users retrieve the news events
they are interested in. Meanwhile, the topic map of
news events, ordered by time stamp, helps users
observe the developing context of a news event.
ℨ⧄
ᓎᆪርᆪၰޠึȂᆪၰུᆹԚ࣐উഷள௦ ដޠၦଊٿྜϟΚȇུᆹηᗵ֥೩Ӽ२्ଊਁȂ ԥٳ२्ޠུᆹ٘пኈژ࢈ݾȃစᔽȃަ๊ཽР ८ޠึȄՅུᆹٲӈޠึҢȂηᡘұяҭࠊަཽ ึޠȂ֊ਣජඬུᆹԥֆܼձҔጃޠցᘟ ᇅ؛๋ȂࢉึًԂޠུᆹᔯસȂΚঐय़Ϲ ޠःفឋᚡȞChen & Liu, 2008ȟȄՅؑСུᆹٲӈቺяϛጐȂԄեւңԥਞޠၦଊ ᔯસӶᛂτޠུᆹၦਠ৳ϜȂ׳яٻңమࢦ ၛޠၦଊߩள२्ޠःفឋᚡȄளңޠၦଊᔯસ ዂτयѠϸ࣐ȈҁݔړዂȃӪ໕ޫዂІ ᐡ౦ዂήᆎȞBaeza-Yates & Ribeiro-Neto, 1999ȟȄ ҭࠊӶཫ൷Жᔞαࠍпҁݔᔯસഷኅ࣐ٻңȂҁݔ ᔯસණٽࢦၛᇮѰȞܗᜱᗥԆȟӶөНӈໍһ ȞANDȟȃᖓȞORȟܗৰȞNOTȟϟၽᆘȂ ᄈܼڏԥ݃ጃሰؒޠࢦၛມᔯસഁ࡚ץйߩளԥਞ ౦Ȃկᄈܼϛ݃ጃሰؒޠࢦၛܗၷᜳߓႁϟᇮཏ ࢦၛ܂܂ആԚࢦྦ౦Ȟprecision rateȟմйє֥೩Ӽ ᚖଊȞnoiseȟޠુᘉȄ ᓎࢦၛᇮཏޠሰؒབٿབڨژτঢ়ޠ२ຝȂՅ ԥΠᇮཏᆪȞsemantic webȟȞBerners-Lee & Fischetti, 1999ȟޠᄻདྷȇוగӓ౩ၦଊᆪαޠၦਠȂᡑ ԚႬဟ౪၍ޠᇮّȞmachine-readableȟȂҭޠᡲ ႬဟΠ၍উܛమߓႁޠҔཏࡧȄՅᇮཏᆪޠ ᄃ౫ሰٸᎭޤᜌҐᡞ፤Ȟontologyȟޠ࡛ဋȄޤᜌҐ ᡞ፤Κঐቺԓ࢝ᄻȂڐ࢝ᄻңٿඣख़࢛ঐስ ޠޤᜌȞSwartout, Patil, Knight, & Russ, 1996ȟȄस ޤᜌҐᡞ࡛ဋӶ੬ੇስαȂඣख़੬ۢስޠޤ ᜌրȃրޠ឵ܓпІրᇅրϟޠᜱ߾Ȃ ໍΚؐႁԚ྆܉ᇮཏޠၦଊᔯસȂᄈܼٻңࢦၛ ཏӪϛ݃ጃޠཫ൷ȂقಜѠпՍණٽΚٳࣻᜱࢦ ၛᇮѰޠᘘȂпණାၦଊޠࢦӓ౦Ȅկᓎၦଊ
ץഁӵᡑȂसпϏٿ࡛ဋޤᜌҐᡞȂϛ༊ຳ ਣέຳΩȂٛश࡛ဋԂޠޤᜌҐᡞηৡܿႇਣȃϛ ڏኇܓȄԥᠧܼԫȂԄեՍϾ࡛ဋࢦၛᇮѰޤᜌ ҐᡞȂໍՅМනԥਞུᆹᔯસҐःفޠл्ःف ឋᚡȄ Κૢޠཫ൷ЖᔞӶ֖౫Ϯ८೪ॏαȂЩᄈ๗ݏ пఽޠРԓٿ֖౫࣐л्ޠዂԓȄҭࠊηึΠ ԥրܼ༉ಜ֖౫РԓϟຝញϾཫ൷ЖᔞȞvisualization engineȟȄ Mooter ཫ൷ЖᔞȞMooter search engineȟ ւңထಣȂпຝញРԓཫ൷๗ݏϸԚӼထ Рԓ֖౫ٻңᘳ៕ȇKartoo ཫ൷ЖᔞȞKartoo search engineȟϸݚܛᇕޠ୧ཿᆪયܛණٽϟ݉ ଡ଼ٿϸထȂٯւң݉ଡ଼ޠᜱᗥມٿᡘұڐᆪ યޠᜱᖓᏳ៕ȄկҭࠊӶпུᆹཫ൷࣐лޠཫ൷ ЖᔞαȂۧુнпຝញϾޠ֖౫РԓȂР߰ٻң ᘳ៕Ꭸཫ൷๗ݏϟϮ८ȄҐःفпڑᙡޠ Google news ࣐ःفᄈຬȂॷӒၽңུᆹޤᜌҐᡞٿМන ུᆹཫ൷ЖᔞޠࢦၛᇮѰՍᘘȂпቩђᔯ સޠਞȇӕࠍཫ൷๗ݏплᚡӵშРԓ֖౫Ȃ пР߰ٻңᎨࣻᜱུᆹึȄ
ᄽ᪇⤽
Ґःفл्ҭޠӶܼٸᐄ Google news ུᆹϸ ࢝ᄻȂණяΚঐՍϾҢུᆹࢦၛມޤᜌҐᡞϟ ԥਞРݳȂٯڐᔗңܼණུ݈ᆹᔯસਞȄ ᐄԫःفҭޠȂҐଭᄈࣻᜱܼҐःفޠၦଊ ᔯસձНᝧޠᇕᇅϸݚȂձ࣐Ґःفޠ౪ ፤ஆᙄȄॷӒϮಞ༉ಜၦଊᔯસІڐӶၦଊ ᔯસαޠ४ښȇ௦ଇࢦၛᇮѰޠᘘȂє֥ ࣻᜱӲ㔴ȃӤဏມڸޤᜌҐᡞ፤Ԅեԥਞණ݈ᔯ સਞȇӕᇴ݃ઢစᆪၰޠஆҐ࢝ᄻȂձ࣐Ж Ᏻᕤ၍Ґःفܛ௵ңϟًԓᓔຆᅮઢ စᆪၰȂໍུᆹޤᜌҐᡞ࡛ᄻޠஆᙄȇഷࡤඣख़ лᚡӵშ౪፤ІڐᄈຝញϾ֖౫ၦଊᔯસ๗ݏޠ ᓻᘉȄЗ⃥᱿⫏⤻ᒑ€༬⠛
༉ಜޠၦଊᔯસл्єࢃᜱᗥԆՍᘞڦȃ ᜱᗥມસЖȃӓНᔯસȃНӈՍϸІНӈՍᄣ ्๊ȞάϊፇȂ1996ȟȇၦଊᔯસዂԓл्ԥήτȞBaeza-Yates & Ribeiro-Neto, 1999ȟȈҁݔړዂ ȃӪ໕ޫዂІᐡ౦ᔯસዂȂпίഃΚ ϮಞȄ
లኚӜᄲᑁࠣ
ҁݔዂԓȞBoolean ModelȟΚᆎഷᙐޠᔯસ РݳȂഇႇӬ౪፤ȞSet Theoryȟᇅҁݔх ȞBoolean AlgebraȟޠၽᆘȂ֊НӈϲৡಓӬᔯસ ມȞܗࢦၛᇮѰȟϟޠ ANDȞ∧
ȟȃORȞ∨
ȟ І NOTȞ¬
ȟϟҁݔၽᆘϘڦяȂϛಓӬ֊ џȄ ӶϭСϬԥ೩Ӽτقಜ௵ңԫޠᔯસዂԓȂ ᓻᘉԫᔯસዂԓޠᔯસഁ࡚ץйᄃ౫РݳᙐȂ ᄈܼሰؒ݃ጃޠᔯસߩளԥਞȇુᘉᔯસޠ๗ݏ ءԥٸྲಓӬโ࡚௷זйٻңၷᜳпԫߓႁፓᚖ ࢦၛӈȄӱԫҁݔᔯસৡܿҢڎ྄Ͼޠ౫ຬȈ ାѷం౦Ȟsearch failureȟܗାྗᔯ౦Ȟinformation overloadingȟȞBaeza-Yates & Ribeiro-Neto, 1999ȟȄ ӱ ԫ Ȃ ԥ ٳ Ᏹ ණ я ᘘ щ ҁ ݔ ዂ Ȟ Extended Boolean ModelȟȂ࡛ឋђ᠍ؑঐᜱᗥԆȂпණ݈ҁ ݔᔯસޠਞȞChoi, Kim & Raghavan, 2001ȟȄ⸇ἇ⿵ᑁࠣ
SaltonȞ 1989ȟ ܼ 1965 Ԓ ණ я Ӫ ໕ ޫ ዂ ȞVector Space Model, VSMȟޠᄻདྷȂϛӤܼҁݔ ᔯસޠѻϛӕѬΡϰϾޠЩᄈȄSalton ᇰ࣐ ഌӌޠӈಓӬѠпԇӶޠȂӱԫණяւңᜱ ᗥ Ԇ ມ я ౫ ޠ ᓝ ౦ ٿ ඳ ᆘ ڐ ᠍ २ অ ܗ ࣻ ծ ࡚ ȞsimilarityȟȄڐᏱዂޠஆҐ౪Нӈп ඳᆘϟ᠍२ߓұԚΚঐӪ໕ԓȂٻңޠᔯસມη ߓ ұ Ԛ ѫ Κ Ӫ ໕ ԓ Ȃ ٻ ң Ρ ঐ Ӫ ໕ Ꮈ ۿ ַ ِ ȞCosineȟٿॏᆘࢦၛມᇅ࢛ΚНӈޠࣻծ࡚ȇ࿌ ڎ຺ַِϊȂڐࣻծ຺࡚ାȂЇϟȂࠍࣻծ຺࡚մȂ Ԅშ 1 ܛұȄՅࣻծ࡚অϮܼ 0 ژ 1 ϟȂܛԥ НӈᇅࢦၛᇮѰޠࣻծ࡚অђп௷זȂ֊Ѡᕖூၦ ଊᔯસ๗ݏȄԫᆎ੬ܓȂ֊ٻ࢛НӈѬԥഌϸᇅ ࢦၛᇮѰࣻӤȂѻϬԥѠೞᔯસяٿȂӱԫѠॐ ۢᎍ࿌ޠᖞࣩঅȞthresholdȟȂѬڦ࢛ᆎࣻᜱโ ࡚пαޠНതȄԫᆎᔯસዂԓȂѠϱ೩ٻңᒰΤ ӉཏԆ՜Ȃࢦၛਣϛ҇ڨၦਠᇳෛȃᓀԆȃ϶Ԇޠ ४ښȄ
Figure 1. Vector representation
ӱԫȂഇႇӪ໕ޫዂޠ྆܉ȂНӈ D Ѡߓұ ࣐D=(t1,t2,t3,...,ti,...tn)ȂڐϜ ti࣐НӈϜಒ i ঐ
੬ኊᜱᗥມཋȄӶӪ໕ޫዂϜڐ੬ኊᜱᗥມཋ ᠍२অޠॏᆘȂഷளٻң TFɰIDFȞTerm Frequency
ɰInverse Document FrequencyȟԆມ᠍२ॏᆘР ԓȄഇႇ੬ኊᜱᗥມཋ᠍२ȂНӈ D Ѡೞߓұ࣐ ) ,... ,..., , , (w1 w2 w3 wi wn D= Ȃwi࣐၏੬ኊᜱᗥມ ti ӶНӈ D Ϝϟᄈᔗ᠍२Ȅߓ 1٨яၷளೞٻңޠࣻ ծ࡚ॏᆘϵԓȂ೩ӼޠНᝧࡿя Cosine coefficient ࣻծ࡚ॏᆘРԓޠᔯસ๗ݏྦጃ౦Щڐуޠࣻծ࡚ ॏᆘРݳٿூାȞBinstock & Rex, 1995; Och Dag, Regnell, Carlshamre, Andersson, & Karlsson, 2001ȟȄ
Ӫ໕ޫዂᔯસॏᆘഁ࡚ץйᔯસਞ౦ାܼҁ ݔᔯસȇկુᘉءԥՄኍژᜱᗥԆມϟޠӤဏ ມȂആԚࣻծ࡚ॏᆘޠᇳৰȞलᛞНȂ2005ȟȄ
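The weighted-vector similarity measures discussed for the vector space model (inner product, Dice, cosine and Jaccard coefficients, after Salton, 1989) can be sketched as follows. This is a minimal illustration rather than the authors' implementation; the function names and the example vectors are assumptions introduced here.

```python
import math

def inner_product(x, y):
    # Sum of pairwise products of the term weights.
    return sum(a * b for a, b in zip(x, y))

def cosine(x, y):
    # Inner product divided by the product of the vector lengths.
    denom = math.sqrt(inner_product(x, x) * inner_product(y, y))
    return inner_product(x, y) / denom if denom else 0.0

def dice(x, y):
    # Twice the inner product over the sum of squared weights.
    denom = inner_product(x, x) + inner_product(y, y)
    return 2 * inner_product(x, y) / denom if denom else 0.0

def jaccard(x, y):
    # Inner product over (sum of squared weights minus inner product).
    xy = inner_product(x, y)
    denom = inner_product(x, x) + inner_product(y, y) - xy
    return xy / denom if denom else 0.0

# Hypothetical term-weight vectors for a query and a document.
query = [1.0, 0.0, 2.0]
doc = [2.0, 0.0, 4.0]
print(cosine(query, doc), dice(query, doc), jaccard(query, doc))
```

Because the document vector is a scaled copy of the query vector, the cosine coefficient treats the two as identical in direction, while the Dice and Jaccard coefficients still penalise the difference in magnitude.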
ߓ 1! ளңޠࣻծ࡚ॏᆘϵԓ
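| Similarity measure Sim(X, Y) | Evaluation for binary term vectors | Evaluation for weighted term vectors |
|---|---|---|
| Inner product | $\lvert X \cap Y \rvert$ | $\sum_{i=1}^{t} x_i y_i$ |
| Dice coefficient | $\dfrac{2\,\lvert X \cap Y \rvert}{\lvert X \rvert + \lvert Y \rvert}$ | $\dfrac{2\sum_{i=1}^{t} x_i y_i}{\sum_{i=1}^{t} x_i^2 + \sum_{i=1}^{t} y_i^2}$ |
| Cosine coefficient | $\dfrac{\lvert X \cap Y \rvert}{\lvert X \rvert^{1/2} \cdot \lvert Y \rvert^{1/2}}$ | $\dfrac{\sum_{i=1}^{t} x_i y_i}{\sqrt{\sum_{i=1}^{t} x_i^2 \times \sum_{i=1}^{t} y_i^2}}$ |
| Jaccard coefficient | $\dfrac{\lvert X \cap Y \rvert}{\lvert X \rvert + \lvert Y \rvert - \lvert X \cap Y \rvert}$ | $\dfrac{\sum_{i=1}^{t} x_i y_i}{\sum_{i=1}^{t} x_i^2 + \sum_{i=1}^{t} y_i^2 - \sum_{i=1}^{t} x_i y_i}$ |

Source: (Salton, 1989)

Probabilistic Retrieval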
ᐡ౦ዂԓᔯસȞProbability ModelȟȞMaron & Kuhns, 1960; Miller, 1971ȟࢦၛມཋᇅࣻᜱНӈ пᐡ౦ඣख़ٯђпၽᆘȇᇅӪ໕ዂԓϛӤޠᐡ౦ ዂ௵ңॏᆘࢦၛມᇅНӈࣻᜱޠᐡ౦Рԓໍᔯ સȂՅϛւңࢦၛມᇅНӈࣻᜱޠโ࡚ȄӶНӈᇅ ࢦၛӈޠߓұРݳஆҐαᗚ௵ңӪ໕םԓȂᇅ Ӫ໕ᔯસዂϛӤޠӶॏᆘࣻծ࡚ޠਣ௵ᐡ ߠ ౦ޠϵԓȄᐡ౦ޠॏᆘഷள ޠ௵ңٕԓۢ౪Ȅ
ው⥴⦝ן᱿იଭ
пαܛख़ϟ༉ಜၦଊᔯસРݳְпᜱᗥԆཫ൷ Ȟkeyword-base searchȟ࣐ஆᙄȂѬ׳ژНӈϜڏ ԥٻңܛᒰΤᜱᗥԆޠНӈȂࠔ܈Π೩ӼНӈ ϜڏԥᇮཏޠᜱᖓȄՅւңࢦၛᇮѰᘘٿቩђၦ ଊᔯસޠਞ֊ᇮཏᜱᖓાΤՄ໕ȂႇџӶ Р८ޠࣻᜱःفαȂΚૢഎٻңࣻᜱӲ㔴ȃӤဏມ ІޤᜌҐᡞ፤ٿໍȞചӏȃೆ⩨Ȃ2001ȟȂп ίϸրᇴ݃ϟȄ〦ߊ㈘
ࣻᜱӲ㔴Ȟrelevance feedbackȟࡿٻңӶࠊΚ ࢳᔯસ׳ژޠНӈϜȂࢆڦ२्ޠ੬ኊӲ㔴ق ಜȂп׳ژӼࣻᜱၦਠޠٻңӲ㔴ᐡښȄՅ Ӳ㔴قಜޠ੬ኊέѠϸ࣐ڎᆎᄙȈΚпНӈ Ґ࣐ٙлȂᆏ࣐ࣻᜱНӈӲ㔴ȇѫΚпࣻᜱມ࣐ л Ȃ ࠍ ᆏ ࣐ ࣻ ᜱ ມ Ӳ 㔴 ܗ ᔯ સ ມ ණ ұ Ȟ term suggestionȟȞϰᡘȂ2002ȟȄ ႇџӶࣻᜱӲ㔴ޠःفϜȂпࣻᜱНӈӲ㔴࣐ഷ ӼȂٻңሰ्ՍցᘟࣻᜱܼᔯસມޠНӈࡤӲ 㔴قಜȄดՅӶၦଊᔯસᕘძϜȂٻң्҇ ߇ຳ೩ӼѵޠਣᇅᆡΩᘳ៕ٳНӈȂϘः ց٦ٳНӈᇅܛίࢦၛມࣻᜱޠȂ܂܂ཽആԚٻ ңѵޠ॓ᐋȄӶᆎݸίȂѠ௵ңྦࣻᜱӲ 㔴Ȟpseudo relevance feedbackȟȂւңࢦၛᇮѰᔯ સяΚಣНӈȂϛစٻңցᘟ֊೪ܛԥНӈࣲ ࣐ࣻᜱȂՅٳ೪ޠࣻᜱНӈ֊စҦࣻᜱӲ㔴ޠ โז२ུ࡛ᄻུޠࢦၛᇮѰȂӕໍໍΚؐޠᔯસ Ȟചӏȃೆ⩨Ȃ2001ȟȄԫРݳԥΚ݃ᡘޠુᘉȈ स೪ϟࣻᜱНӈఽϜȂᄃርαϛࣻᜱޠНӈխ τഌϸȂ٦ቅђΤۗࢦၛᇮѰޠᘘມཋᇅᔯ સлᚡٯϛࣻᜱȂࠍᘘࡤࢦၛޠᔯસࠣ፵ཽᡑৰ ȞMitra, Singhal & Buckley, 1998ȟȄࣻၷϟίȂਗ਼І ЩၷЎ໕ϟၦଊࣻᜱມӲ㔴ᡘดЩၷԂޠᒶᐆȂ ٻңηၷৡܿցᘟȄҦܼࣻᜱມᔯસ๗ݏᘞ ڦяٿޠȂՅᔯસ๗ݏτഎᇅᔯસлᚡԆ՜ࣻ ߗȂӱԫᘞڦяޠࣻᜱມητഎၮᔯસлᚡԆ՜ࣻ ߗȄࣻᜱӲ㔴ӶၦଊᔯસϜೞᇰ࣐ᄈᔯસԚਞֆઊ ࣦτȞFrakes & Baeza-Yates, 1992ȟȄःفᡘұȂӶ Κ ٳ ӓН ᔯ સၦ ਠ ৳Ϝ Ȃ Ѡණ ݈ ᔯસ Ԛਞ 20% ȞHarman, 1995ȟȄկηԥഌӌޠःفࡿяȂпಜॏ ޠРԓٿಜॏӔӤя౫ޠࣻᜱມȞco-occurrence of termsȟђΤۗޠࢦၛᇮѰϜȂᗷѠпᅗ٘೩ӼН ӈ Ϝ ڏ ԥ ࣻ Ӥ ᇮ ཏ ޠ ඣ ख़ Ȟ Smeaton & Van Rijsbergen, 1983ȟȂٻࢦӓ౦ණାȂկηᡲࢦྦ౦७ ޠմȞPeat & Willett, 1991; Woods, 1997ȟȄ∑⥱
ӤဏມȞsynonymȟளࡿཏဏࣻծկٻңϛӤН ԆܛߓႁޠມȄΚૢӤဏມέಡϸ࣐ڎᆎȈΚᆎ࣐ ኅဏޠࣻᜱມȂѫΚঐ࣐੮ဏޠӤဏມȞϰᡘȂ 2002ȟȄՅኅဏޠࣻᜱມࡿӶНӈϜစளя౫ޠ ມȂѠᆏ࣐ȶᜱᖓມ৳ȷȃȶӔ౫સЖڑȷȃȶӔգя ౫ޠມ৳ȷȞco-occurrence thesaurusȟȞSalton, 1989ȇ ϰᡘȂ2002ȟȇ੮ဏޠӤဏມࡿНݳαܗᇮཏα ӓѠࣻϤڦхޠມཋȄΚૢኅဏޠࣻᜱມ ளೞٻңӶНӈϸαȇՅ੮ဏޠӤဏມኅހӵೞ ၽңӶშਫᓣᏱٿණЁཫ൷ޠᆡጃ࡚ȄसңӶུᆹ ཫ൷αȂଭᄈ࢛ΚࠍུᆹٲӈпᜱᖓມໍᘘȂ Ѡп׳яڐуծܗࣻᜱޠུᆹлᚡȇٻңӤဏມ ޠ྆܉Ѡпөঢ়ུᆹ൭ᡞϜȂᔯસяൣᏳӤΚӈ ུᆹկңມϛΚޠུᆹٲӈȄ ӤဏມޠࣻᜱःفϜȂԟഎпϏޠРԓ ӤဏӫȞsynonymȟޠມཋ࡛ᄻԚΚঐӤဏມ ߓȂᡲڐϜΚᆎມཋࢦၛژڐуᄙޠӤဏມ ཋȄშਫᓣᏱϜȂױӤဏӫມпಜΚȃዀྦޠם ԓ ख ᓄ ٯ ᆔ ౪ ޠ Р ԓ ᆏ ࣐ ᠍ ࡅ ښ Ȟ authority controlȟȂՅױଅᓄӤဏӫມޠᔭᆏ࣐᠍ࡅᔭ Ȟauthority fileȟȞϰᡘȂ2002ȟȄՅȶӤဏມݔȷ Ȟ೩ҔݡȂ2004ȟ࣐ҭࠊၦଊᔯસഷளٻңޠϏڏ ਫȂկȶӤဏມݔȷңມၷϜτഛР८ޠңᇮȂ ᇅϲСளҢࣁңݳԥٳяΤȂٻூᄃңܓ७մȄ ݸйུᆹңᇮளᓎҢࣁਣٲᡑՅԥܛϛӤȂस пϏٿ࡛ဋӤဏມ৳ሰ्߇ຳࣻ࿌Ӽޠਣᇅᆡ ΩȂਞ౦ٯϛାȄ ҭࠊηԥΚٳՍ࡛ဋસЖڑޠःفȂկѬ ژӤॲມȃםծມޠՍϾસЖȄΚૢӤॲມޠ ౪РԓȂְпึॲԓЩᄈ࣐лȄϜНޠӤॲມ ഇႇݨॲಓဵߓȂࢦၛޠມཋᇅસЖມഎᙾԚݨ ॲȂӶݨॲಓဵαЩᄈȂӱԫȶջᝌ᫂ȷᇅȶջ ೈࡧҚȷȂӱݨॲಓဵࣻӤȂՅѠпϤࣻࢦ൷׳ژȇ ऽНޠӤॲມР८Ȃഇႇ SoundexȞKnuth, 1973ȟᇅ Metaphone ȞBinstock & Rex, 1995ȟᄈНԆձጢጇȂ ηԥծޠӤॲມࢦၛਞݏȞTseng, 2001ȟȄѫѵȂ םծມпᎸۿࣻծ࡚ٿॏᆘȂϱ೩ٻңᒰΤӉཏԆ ՜ȂࢦၛਣϛڨၦਠᓀԆȃ϶Ԇޠ४ښȂпߗծԆ՜ ܗዂጚཫ൷ޠРԓЩᄈસЖມ৳Ȟ೩ҔݡȂ2004ȟȂ ϛ༊ѠпᔯસяࣻᜱНӈȂηѠпᔯસяԆ՜ࣻߗޠ ມཋȂםծມӗܼࢦၛ๗ݏϜȂѠпණұٻңࣻ ᜱޠࢦၛңມȂႁژ᠍ࡅښޠҭޠȞTseng, 2001ȟȄ24 Journal of Library and Information Science 34ņ2ŇΚ19 – 41ΰOctober, 2008α پԄȈȶϜःଲȷᇅȶϜѶःفଲȷȃȶၦਠঙᓾȷᇅ ȶၦਠঙᓾقಜȷȃȶଲߞሊলȷᇅȶሊলଲߞȷ ๊Ȃٳڏԥା२᠓౦ޠםծມഎѠпՍӵೞᔯ સяٿȄଷԫϟѵȂȶረටȷᇅȶ࢈ଲߞȷȃȶࢻ ݥఁ௳ȷᇅȶݔ࡛ȷȃȶЯȷᇅȶϜύȷ๊Ԇ՜ ϛ२᠓ޠມཋηೞຝ࣐ӤဏມȂկႬဟณݳѻউ ຝ࣐ӤဏȂٳມཋᗚሰӊᒧϏٿ࡛ဋȄGauch ᇅ SmithȞ1993ȟޠःفࡿяȂпӤဏມસЖڑঔႻ ࢦၛᇮѰȂѠԥਞණ݈ᔯસਞݏȄկӱ࣐ஆܼӤ ဏມޠࢦၛᇮѰᘘሰ्࡛ᄻӶτ໕Ϗܛ࡛ဋޠ Ӥဏມ৳αȂӱԫϛӶҐःفଇ፤ޠጓᛠȄ
ᵧ⨯㋤⧄
ޤᜌҐᡞ፤ȞontologyȟΚມٿՍলᏱስȂл् ңٿଇȶٲޑޠҐ፵ϨቅȉȷȇߗԒٿȂႬဟऌᏱ Ȟcomputer scienceȟІϏහኌȞartificial intelligenceȟ ስηআңޤᜌҐᡞ፤ঐᇮມٿඣख़ᄃзࣩޤ ᜌޠߓႁȄNeches ๊ᏱӶ 1991 ԒණяȶޤᜌҐᡞ ፤ඣख़࢛ΚঐлᚡስޠஆҐᇮȞbasic termsȟ ڸᜱ߾Ȃηۢဏяѵ۾ಣӬᇮȞcombining termsȟ Іᜱ߾ޠࠍȷȞNeches et al., 1991, p40ȟȄBernaras ๊ Ᏹࠍܼ 1996 ණяȶޤᜌҐᡞ፤Ѡпᄈޤᜌ৳Ϝޠޤ ᜌȂණٽΚᆎ݃ጃඣख़ڐ྆܉ϾޠРݳȷȞBernaras, Laresgoiti & Corera, 1996, p298ȟȄSwartout ๊Ᏹࠍ ᇰ࣐ȶޤᜌҐᡞ፤Κঐቺԓޠ࢝ᄻȂڐ࢝ᄻң ٿඣ ख़ ࢛ঐ ስ ޠ ޤ ᜌȷȞSwartout et al., 1996, p138ȟȄӱԫसޤᜌҐᡞ࡛ဋӶུᆹስޠມཋ ᜱ߾ሠមαȂѠпᐄԫ๗ᄻܓޤᜌٿህֆໍࣻ ᜱུᆹၦଊޠཫ൷ȂໍՅԥֆܼᔯસяӼࣻᜱܼܛ మࢦၛུᆹٲӈϟڐуࣻᜱུᆹึȂηҔ ҐःفוగึޠਰЗȄㆩṘℐ⭰
Ґःف௵ңًԓᓔຆᅮઢစᆪၰ࡛ဋུᆹ ມཋޤᜌҐᡞȂٯпՍҢϟུᆹມཋޤᜌҐᡞ ህֆࢦၛᇮѰᘘȂໍՅႁژණାུᆹᔯસਞІ ᔯસࣻᜱུᆹึޠҭޠȂ௦ίٿଭᄈઢစ ᆪၰޠஆҐ౪ձᄣ्ܓޠϮಞȄઢစᆪၰȞArtificial Neural NetworkȟΚᆎ҂ ॏᆘዂԓȂѻٻңτ໕ޠϏઢစϰٿዂҾҢޑ ઢစᆪၰޠॏᆘΩȄ௱ᏳϏઢစϰޠᏱಭ ࠍȂ्Ӓᕤ၍ဟઢစಡबޠᏱಭࠍȇԥᜱР ८ޠःفȂഷԥଔᝧޠᔗ၏ HebbianȂуӶ 1949 ԒණяΠઢစಡबޠᏱಭࠍȂኈСࡤઢစᆪ ၰޠึȄშ 2 ڑҢޑઢစಡबዂȄؑঐઢ စϰл्ҦȈઢစᐚȃઢစໆȃઢစȃઢစಡब ਰ๊ѳঐഌϸܛಣԚޠȈ 1. ઢစᐚȞdendritesȟȈઢစϰӪѵ۾ի֖ᐚݓ ޒޠᒰяΤϰȂңٿ௦Ԟܗ༉ଛ߭ဵژڐѻ ઢစಡबȄ 2. ઢစໆȞaxonȟȈഀ௦ӶઢစಡबਰαȂ॓ೱ༉ ଛઢစಡबਰҢޠଊਁژڐѻޠઢစಡबϜȄ 3. ઢစȞsynapseȟȈᒰΤઢစᐚڸᒰяઢစᐚࣻ ഀ௦ޠᘉᆏ࣐ઢစȄઢစઢစᆪၰαޠ ଅᏺᡞȂߓұڎঐઢစಡबޠഀ๗࡚Ȃ пΚঐঅٿߓұȂٯᆏϟ࣐ђ᠍অȄ 4. ઢစಡबਰȞsomaȟȈઢစಡबޠਰЗഌϸȂ ڐѓઢစᐚ༉ଛٿޠ߭ဵђп༙ᐍȃᙾ ඳࡤȂӕҦઢစໆ༉ଛژڐѻઢစᐚȂԚ࣐ί ΚঐઢစϰޠᒰΤଊဵȄ შ 2! ઢစಡब๗ᄻშ ၦਠٿྜȈȞеໍȃᒄτӓȂ2003ȟ ᒰΤઢစᐚȞᐚऐȟȞdendritesȟ ઢစȞऐដȟȞsynapsesȟ ᒰяઢစᐚȞᐚऐȟȞdendritesȟ ઢစໆ ȞໆસȟȞaxonȟ
ᕤ၍ҢޑઢစಡबዂࡤȂпίϮಞԄեпϏ ઢစϰٿዂҾҢޑઢစಡबȄϏઢစϰҢޑઢ စϰޠᙐዂᔤȂѻѵࣩᕘძܗڐѻϏઢစ ϰڦூၦଊȂӶစႇᙐၽᆘࡤȂڐ๗ݏᒰяژ ѵࣩᕘძܗڐѻϏઢစϰȞဩܒԚȂ2003ȟȄ ϏઢစϰȞartificial neuronȟȂέᆏ౪ϰ Ȟprocessing elementȟȂؑΚঐ౪ϰޠᒰяпਊ םޒଛяȂԚ࣐ڐѻ౪ϰޠᒰΤȂԄშήܛұȄ შ 3! Ϗઢစϰዂ ၦਠٿྜȈȞဩܒԚȂ2003ȟ ؑ Κ ঐ Ϗ ઢ စ ϰ ࣲ ഀ ๗ ೩ Ӽ ᒰ Τ ϰ n i x x x x x1, 2, 3,..., ,..., Ȃђ᠍অ wijхߓઢစޠഀ ๗࡚Ȃᒰяঅ YjᒰΤঅޠђ᠍ॹᑗڸȂ౪ ϰϟᒰяᇅᒰΤঅޠᜱ߾ԓȂѠңᒰΤঅђ᠍ॹᑗ ڸޠړٿߓұȂԄпίϵԓȈ
$$Y_j = f\left(\sum_{i} w_{ij}\, x_i - \theta_j\right)$$
ȞϵԓΚȟ ڐಓဵᇴ݃ԄίȈ Yj࣐ዂҾҢޑઢစϰዂޠᒰяଊဵȄ f࣐ዂҾҢޑઢစϰዂޠᙾඳړȄ wij࣐ዂҾҢޑઢစϰዂޠઢစ࡚Ȃέᆏഀ ๗ђ᠍অȄ Xi࣐ዂҾҢޑઢစϰዂޠᒰΤଊဵȄ j θ ࣐ዂҾҢޑઢစϰዂϟሩঅȄ ளңޠᙾ ඳړ ϸ࣐ή ᆎȈ 1. ؐړ Ȟstep functionȟȂέᆏΡঅړȞtwo-value functionȟȇ2. ᚗ᠊ԣړȞsigmoid functionȟȇ3.ᚗԣጤҔϹړ Ȟhyperbolic tangent functionȟȄήᆎᙾඳړְԥ ڐӔܓȂ൸࿌ᒰΤঅၷϊਣȂڐᒰяঅ࣐ 0 ܗ -1ȂຝړՅۢȇ࿌ᒰΤঅၷτਣȂڐᒰяঅ֊ᙾ ࣐ 1Ȃл्ᐄ E. D. Adrian ઢစಡबޠႬϾᏱ ձңᏱᇴȈ࿌Κঐઢစಡबڨژ࡚٘ޠڗᐮࡤ ཽึяΚঐۢঅႬࢻፑȂڐޒᄙη൸ᡑ࣐ 1Ȅ ӶᄃርᔗңϜȂήᆎᙾඳړȂளϸրᔗңӶϛ ӤޠઢစᆪၰϜȂ࿌ઢစᆪၰᔗңӶΡ྄অϾ ȞbinaryȟقಜਣȂτӼ௵ңؐړȇ࿌ᔗңӶഀ ៊ȞcontinuousȟقಜਣȂ߰ሰ௵ңᚗ᠊ԣړІ ᚗԣጤҔϹړȄҐःفܛٻңޠ MHNN ௵ңᚗ᠊ ԣړձ࣐ᙾඳړȄʙ㆛߸ߧ⧄
ҦܼᆪርᆪၰޠᑺକȂٻூၦଊ໕ା࡚ޠԚߞȂ ഇႇΚૢཫ൷ЖᔞпᜱᗥԆࢦၛРԓٿཫ൷ࣻᜱၦ ଊѠூژԚξα์ޠၦਠȇկٳၦਠӱ໕ ᛂτȃґစಣᙒᇅϸȂᏳयٻң҇ᘳ៕ࣻ࿌ ӼޠၦਠϘ׳ژమ׳൷ޠၦଊȄኻޠཫ൷Рԓ ᄈٻңՅّΚᆎؖ२ޠ॓ᐋȄӱԫлᚡӵშ Ȟtopic mapsȟޠᢏ܉ೞණяٿȂп၍؛ၦଊ॓ၸޠ ୱᚡȄлᚡӵშΚᆎңٿಣᙒᇅᆔ౪τ໕ၦଊၦ ྜޠᐡښȂڐഷಥҭޠӶܼ࡛ҴΚঐഷٺϾޠޤᜌ Ᏻ៕Ϯ८ȂٯණٽٻңΚঐץഁජඬᇅሇᚭᏱ ಭޤᜌޠᐈձϮ८Ȟঢ়ᇻȂ2002ȟȄлᚡӵშܼ 1999Ԓ 12 УᕖூርዀྦಣᙒޠᇰᜍȂԚ࣐ ISO / IEC 13250ޠጓȄISO / IEC 13250 ጓޠлᚡӵშ л ् є ֥ ή ঐ ਰ З ϰ ષ ș л ᚡ Ȟ topic ȟȃ ᜱ ᖓ ȞassociationȟڸၦྜࡿЖȞoccurrenceȟȂ֊ T.A.OȄ26 Journal of Library and Information Science 34ņ2ŇΚ19 – 41ΰOctober, 2008α ૮ T.A.O ޠ֥ཏᇴ݃ԄίȞThe TAO of topic
mapsȇݔ߭ԚȃዊኌȃዊᄹȂ2003ȟȈ 1. лᚡȞTopic, Tȟ ӶлᚡӵშϜȂޤᜌޠஆҐϰᆏ࣐ȶлᚡȷȂ л्֖౫࢛ঐ྆܉ޠȶлᚡȷȄлᚡȞtopicȟѠ пΚঐȃΚঐᄃᡞȃΚঐ྆܉ޠӉեٲޑȂ ϛᆔԥᄃᡞԇӶȃԥӉե੬ܓȂлᚡ ཽӱᔗңαޠሰ्ȃၦଊܓ፵пІлᚡӵშϟ ңഋՅԥܛϛӤȄлᚡѠпೞᘫԚထȂᆏ࣐ лᚡȞtopic typesȟȂԄშ 4 ܛұȄඳّϟȂ лᚡ൸лᚡܛᘫ឵ޠրȇΚঐлᚡѠ пӤਣᘫ឵ΚঐпαޠлᚡȂлᚡӶ лᚡӵშϜηೞᇰ࣐ۢΚঐлᚡȄЩРᇴȂȃ ٱȃޑϸրഎлᚡȂկӤਣηϸ឵ ܼٱȃޑڎঐлᚡȄ შ 4! лᚡȞtopic typesȟ 2. ၦྜࡿЖȞOccurrence, Oȟ ΚঐлᚡѠӤਣഀ๗ΚঐܗΚঐпαޠၦଊ ၦྜȞinformation resourcesȟȂՅйӶ࢛ᆎโ ࡚αᇅ၏лᚡڏԥᜱᖓȄٳޠၦଊၦྜᆏϟ ࣐лᚡޠၦྜࡿЖȂԄშ 5 ܛұȄၦྜࡿЖϲ ֥ӶлᚡӵშϲȂηѠпᑀҴӶлᚡӵშϟ ѵȂഇႇ፞Ԅ HyTime AddressingȞIn HyTMȟ ܗڑޠ URIsȞIn XTMȟ๊ᐡښٿ֮ۢȄၦ ྜࡿЖѠпϛӤޠӉեԚসȂӶлᚡӵ შዀྦጓϜȂၦྜࡿЖຝ࣐ΚঐِՔ ȞroleȟȂԄӤлᚡȂၦྜࡿЖِՔηೞຝ ࣐лᚡȄ შ 5! лᚡϟၦྜࡿЖȞoccurrenceȟ 3. ᜱᖓȞAssociation, Aȟ лᚡϟѠւңᜱᖓȞtopic associationȟٿᡘ ұڐᇮཏᜱ߾ȂԄშ 6 ᡘұڐлᚡᇅᜱᖓȂپ ԄȶᛴೲϜȷڸȶήᅌဏȷڎлᚡϟڏԥ ȶቹձȷᜱ߾ȄϛӤܼၦྜࡿЖഀ๗ژНӈٿ ྜȂᜱᖓߓ౫яΚঐє֥ၦଊҐ፵ȃ֖౫ၦଊ л्ቌঅޠޤᜌஆᙄȂΚঐлᚡᜱᖓٯґ४ښ ࣻᜱлᚡޠ໕ȄӶлᚡӵშϜȂᜱᖓӤਣη ೞຝ࣐ΚঐлᚡȂηԥᜱᖓȞassociation typeȟȂԄȶቹձȷ֊Ѡຝ࣐ΚᆎᜱᖓȄᜱᖓ ױڏԥࣻӤᜱ߾ޠлᚡ༙ԚထȂԥֆܼቩ ђлᚡӵშޠߓႁΩȄлᚡӵშ൸ڐҐ፵Յ ّᙐޠȂплᚡձ࣐ஆҐષ؆Ȃٯւң ᜱᖓ࡛Ҵлᚡϟޠᜱ߾ȂлᚡѠпԥԂංঐ ӫᆏڸၦྜࡿЖȂٯւңጓ൝४ښӫᆏȃၦྜ ࡿЖڸᜱᖓޠԥਞጓᛠȂ൸ഷஆҐޠлᚡ ӵშȄ შ 6! лᚡϟޠᜱഀȞtopic associationȟ
лᚡӵშᔗңུܼᆹཫ൷๗ݏޠߓႁαȂѠпఽ ྀޠ֖౫ུᆹٲӈޠᜱᖓᇅȂ࿌ٻңᄈ࢛ ܓ ຑ ঐུᆹлᚡདᑺ፹ਣȂѠп ңົ ๗ޣ௦ᘉᒶࣻ ۩ ᦰ ᜱུᆹၦྜໍ Ꭸ ȂӤਣηٟഁ׳ژڸঐུ ᆹлᚡࣻᜱޠڐулᚡȄӱԫлᚡӵშԥֆܼٻң ץഁජඬᇅሇᚭᎨུᆹၦଊȄ
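The three core elements of the ISO/IEC 13250 topic map model described above (topic, association, occurrence) can be pictured with a small data structure. This is only a sketch under stated assumptions: the class names, fields, the `related` helper, the example topics and the `example.com` URL are all hypothetical and do not come from the standard's interchange syntax or from the authors' system.

```python
from dataclasses import dataclass, field

@dataclass
class Occurrence:
    role: str      # occurrence role, e.g. "article" (hypothetical label)
    locator: str   # URI of the information resource

@dataclass
class Topic:
    name: str
    topic_types: list = field(default_factory=list)   # a topic may have several types
    occurrences: list = field(default_factory=list)   # links to information resources

@dataclass
class Association:
    assoc_type: str   # association type, itself conceptually a topic
    members: tuple    # names of the topics taking part

@dataclass
class TopicMap:
    topics: dict = field(default_factory=dict)
    associations: list = field(default_factory=list)

    def add_topic(self, topic):
        self.topics[topic.name] = topic
        return topic

    def associate(self, assoc_type, *names):
        self.associations.append(Association(assoc_type, tuple(names)))

    def related(self, name):
        # All topics that share at least one association with `name`.
        found = set()
        for assoc in self.associations:
            if name in assoc.members:
                found.update(m for m in assoc.members if m != name)
        return sorted(found)

# Hypothetical news topics linked by one association.
news_map = TopicMap()
news_map.add_topic(Topic("Election", ["news event"],
                         [Occurrence("article", "http://example.com/news/1")]))
news_map.add_topic(Topic("Candidate debate", ["news event"]))
news_map.associate("related-to", "Election", "Candidate debate")
print(news_map.related("Election"))
```

Following an association from one topic to its neighbours is the navigation step that a topic map interface exposes to the reader, while occurrences lead out to the news articles themselves.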
ᶇἄᅞᘍ⎞ከᐉ
ᶇἄከᐉ
ҐःفණяϟȶஆܼՍࢦၛᇮѰᘘϟлᚡӵშ හኌུᆹཫ൷Жᔞȷقಜ࢝ᄻԄშ 7 ܛұȂᐍᡞق ಜӔϸ࣐ήঐዂಣڸڎঐၦਠ৳Ȃϸրᇴ݃ԄίȈ1. ུᆹᘞڦዂಣȞnews archive moduleȟ
Ґ ः ف ึ ུ ᆹ ᘞ ڦ х ౪ Ȟ news crawler agentȟ Google news ϑစϸԂޠུᆹٲӈ ᆪርᆪၰϜᘞڦӲٿȂٯଭᄈؑΚུᆹٲӈ ໍུᆹዀᚡȃൣᏳ൭ᡞȃൣᏳਣȃུᆹҐ Н๊ၜមၦਠȞmetadataȟޠՍᘞڦІϜНᘟ ມ౪Ȃٯ౪๗ݏᓾԇژུᆹٲӈၦਠ৳ Ȟnews event databaseȟϜȂпձ࣐ࡤ៊Ңུ ᆹޤᜌҐᡞІུᆹཫ൷ޠஆᙄȄ
2. Ս ུ ᆹ ޤ ᜌ Ґ ᡞ ࡛ ᄻ х ౪ Ȟ automatic ontology generation agentȟ
ՍུᆹޤᜌҐᡞ࡛ᄻх౪ཽᐄစҦϜН ᘟມܛᘟяٿޠؑΚུᆹٲӈዀᚡȞtitleȟມཋ Ȟ term ȟȂ ໍ Ӕ գ я ౫ Ԇ ມ ޠ ౪ Ȃ ٯ п MHNNଭᄈӔգԆມᜱᖓ࡚ໍᆺȂᐄԫѠ ࡛ ဋ я ུ ᆹ ມ ཋ ޤ ᜌ Ґ ᡞ Ȃ ٯ ٸ ᐄ Google news ޠ ུ ᆹ ր ᓾ ԇ ܼ ޤ ᜌ Ґ ᡞ ၦ ਠ ৳ Ȟontology databaseȟϜȄ࿌ٻңᒰΤུᆹᜱ ᗥԆٿࢦၛུᆹၦଊਣȂޤᜌҐᡞཫ൷Жᔞ Ȟontology based search engineȟཽཫ൷ܛԥུ ᆹၦਠ৳Ϝє֥ԫᜱᗥԆޠུᆹٲӈրȂӕ ٸᐄᓾԇܼޤᜌҐᡞၦਠ৳Ϝޠུᆹມཋᜱᖓ ໍུᆹٲӈޠᘘࢦၛȂпϾུᆹཫ൷ޠ ਞȄ
3. лᚡӵშ֖౫х౪Ȟtopic maps generation agentȟ лᚡӵშ֖౫х౪ཽᐄٻңܛᒰΤޠ ᜱᗥԆȂ௱ᙩࠊΫᇅࢦၛມഷࣻᜱϟлᚡུ ᆹٯՍ࡛ဋлᚡӵშȂпຝញϾРԓ֖౫ࣻ ᜱޠུᆹлᚡРԓٽٻңᘳ៕ུᆹȄ࿌ٻң ഇႇлᚡӵშᘉᔟȞclickȟࢦၛᇮѰᘘѓ ਣȂقಜཽٸᐄϑစ࡛ဋԂϟུᆹրມཋ ޤᜌҐᡞٿࢦၛᇮѰᘘȄࢦၛޠ๗ݏཽٸ ᐄђພུᆹٲӈึոޠਣዀଅໍ௷זȂᡲ ٻңᐍජඬུᆹٲӈϟึȄ
Figure 7. System architecture (components: News Archive Module with News Crawler Agent, News Metadata Extraction Process and Chinese Word Segmentation for News Metadata; News Event Database; Automatic Ontology Generation Agent; Ontology Database; Ontology-based Search Engine; Topic Maps Generation Agent; User Interface for Query; Visualization Presentation; Google News; User)
ᶇἄᅞᘍ
ҐःفוగᙥҦՍ࡛ҴུᆹስϟມཋޤᜌҐ ᡞȂпഇႇུᆹມཋޤᜌҐᡞໍࢦၛᇮѰՍ ᘘȂϾུᆹཫ൷ޠਞȄშ 8 ࣐Ґःفܛණ яϟՍ࡛ဋུᆹޤᜌҐᡞޠࢻโშȂᇴ݃ԄίȈ⫏ᅆԊ⇦⚠
ུᆹᘞڦх౪ Google news ᆪርᆪၰϜᘞ ڦ ٯ ڑ ᙡ ܼ ུ ᆹ ٲ ӈ ၦ ਠ ৳ ޠ ႇ โ Ϝ Ȃ ܛ ԥ ೞ Google news ϸԂޠུᆹٲӈዀᚡཽӶໍϜН ᘟມ౪ࡤȂҢؑΚུᆹٲӈԆມޠӬȂٯ ಜॏؑঐԆມӶӤΚུᆹրϜܛя౫ޠᓝ౦Ȅ Ґःف௵ңϜःଲܛึޠ CKIP ϜНᘟມقಜ ȞCKIP Chinese word segmentation systemȟȄٸᐄҐःفޠᢏᄇຠզึ౫Ȃ४ܼ Google news Ս ུᆹϸޠҔጃ౦ณݳႁژԼϸϟԼᆡྦȂӱԫഌ ӌུᆹٲӈᇅ၏ڐуུᆹٲӈޠࣻծ࡚ϛାȇ࣐Π ᘯଷᇅ၏ུᆹൣᏳၷณᜱᖓޠུᆹዀᚡԆມȂпණ ାܛ࡛ဋུᆹޤᜌҐᡞޠҔጃܓȂҐःفӶໍུᆹ ޤᜌҐᡞ࡛ᄻϟࠊཽӒٳུᆹዀᚡԆມໍᘯ ଷȄԄԫηѠп७մ MHNN ޠᒰΤᆱ࡚Ȃቩђ MHNN ՍҢུᆹޤᜌҐᡞޠᏱಭԞᔨഁ࡚Ȅ ུᆹዀᚡԆມޠᘯଷРݳȂ௵ң࢛ΚུᆹዀᚡԆ ມӶ၏ܛԥུᆹٲӈϜя౫ޠࠍЩپմܼԼϸ ϟΫޠߟᘥঅໍᑣᒶȂڐߟᘥঅᑣᒶӈԄȞϵ ԓΡȟܛұȈ ª ×10%º = NewsNumClassID tfθ ,tfθ≤1 ȞϵԓΡȟ ڐϜ NewNumClassID ࣐࢛Κঐུᆹրޠᖃུᆹ ȇtf࣐͞Ԇມя౫ЩپޠߟᘥঅȄ ԼϸϟΫޠߟᘥঅ೪ۢՄ໕ Google news Ϝུ ᆹࠍӶ 10 ࠍпαޠུᆹրѠпೞߴίٿໍ ུᆹޤᜌҐᡞޠ࡛ҴȇմܼԫΚߟᘥঅϟུᆹዀ ᚡມཋᇅ၏ུᆹлᚡᜱᖓϛାȂӱԫђпᘯଷȄ ԫߟᘥঅस೪ޠЋାȂཽԥӼུᆹրೞᘯଷ ՅณݳໍུᆹޤᜌҐᡞޠ࡛ҴȄӱԫȂစҦϸݚ Google news ؑΚུᆹٲӈޠࠍϸոޒݸІ ҢޠུᆹޤᜌҐᡞࠣ፵ޒݸȇҐःفԫߟᘥЩپ ೪࣐ۢᖃࠍޠԼϸϟΫȂᐄԫҢޠུᆹມཋձ ࣐ࡤ៊ུᆹޤᜌҐᡞ࡛ҴϟஆᙄȂѠпᡲ MHNN ޠ ᒰΤᆱ࡚ႁژၷӬ౪ޠޒݸȂηҢࠣ፵ၷٺޠུ ᆹޤᜌҐᡞȄ
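The title-term filtering step of Equation 2 can be sketched as below. The sketch assumes the threshold is the ceiling of 10% of the number of news items in a class and that a term is kept when the number of items containing it reaches that threshold; the function name, the integer-percent parameter and the toy data are assumptions made here, not details confirmed by the source.

```python
from collections import Counter

def filter_title_terms(titles, percent=10):
    """titles: one list of segmented title terms per news item in a single class."""
    # Ceiling of (number of news items x percent / 100), in integer arithmetic
    # to avoid floating-point surprises such as 30 * 0.1 > 3.
    threshold = (len(titles) * percent + 99) // 100
    doc_freq = Counter()
    for terms in titles:
        doc_freq.update(set(terms))   # count each term once per news item
    # Assumption: terms whose count reaches the threshold are kept.
    return {term for term, count in doc_freq.items() if count >= threshold}

# Toy class of 20 news titles: "a" is common, "b" borderline, "c" rare.
titles = [["a", "b"]] * 2 + [["a", "c"]] + [["a"]] * 17
print(sorted(filter_title_terms(titles)))
```

Dropping the rare terms reduces the number of neurons the MHNN has to handle, which is the motivation the text gives for filtering before ontology construction.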
ᅘ⊏⥱ᓏ⸅ϊ᱿⤺ᾰ
စҦαΚࢳࠊဋ౪ࡤޠུᆹມཋȂ௦ίٿ҇ ໍུᆹԆມᇅԆມϟޠࣻծ࡚ॏᆘȂᙥҦࣻ ծ࡚ॏᆘѠпᕤ၍ུᆹມཋϟޠᜱഀ࡚ȄҐःف ௵ң SaltonȞ1989ȟܛණяޠԆມ᠍२ॏᆘϵԓȈ i i id
tf
D
= log × Ȟϵԓήȟ ij ij ijd
tf
D
= log
×
Ȟϵԓѳȟ ڐϜ di࣐ಒ i ঐུᆹዀᚡԆມя౫ӶӤΚུᆹ Ϝڐуུᆹዀᚡޠུᆹࠍȇdij࣐ಒ i ঐུᆹዀᚡ Ԇມڸಒ j ঐུᆹዀᚡԆມӤਣя౫ӶӤΚུᆹ ޠڐуུᆹዀᚡུᆹࠍȇtfi࣐ಒ i ঐུᆹዀᚡԆມ я౫ӶུᆹዀᚡޠԪȇtfij࣐ࡿಒ i ঐུᆹዀᚡԆມ ڸಒ j ঐུᆹዀᚡԆມӤਣя౫ӶུᆹዀᚡޠԪȇ Di࣐ಒ i ঐུᆹዀᚡԆມޠ᠍२অȇDij࣐ಒ i ঐུᆹ ዀᚡԆມڸಒ j ঐུᆹዀᚡԆມӤਣя౫ޠ᠍२অȄ ดՅȂӶҐःفޠུᆹၦਠ৳ϜȂུᆹዀᚡя౫ޠ Ԇഎߩளޠйᜳԥ२ፓޠԆມя౫Ȃࢉ tfiڸ tfij ޠঅτഌӌԆມְ࣐ 1ȄӕȂӔգя౫ޠུᆹዀᚡ Ԇມޠ᠍२অॏᆘпίӗޠϵԓٿ໕Ͼڐᜱഀโ࡚Ȉ( )
i ij j i D D t t rel , = Ȟϵԓϥȟ( )
j ji i j D D t t rel , = Ȟϵԓϳȟ ڐϜrel(ti,tj)࣐пಒ i ঐུᆹዀᚡԆມ࣐ஆᙄޠ ݸίȂڸಒ j ঐུᆹዀᚡԆມޠӔգя౫ᜱഀโ ࡚ȇrel(tj,ti)࣐пಒ j ঐུᆹዀᚡԆມ࣐ஆᙄޠ ݸίȂڸಒ i ঐུᆹዀᚡԆມޠӔգя౫ᜱഀโ࡚Ȅ စҦпαޠၽᆘȂѠп׳яڎڎӔգޠུᆹዀᚡԆ ມΚକя౫ޠᜱഀโ࡚ȂԄߓ 2 ܛұ֖౫Κঐߩᄈᆏ ޠઑଳȄ࿌rel(ti,tj)=rel(tj,ti)ਣȂߓұಒ i ঐུᆹ ዀᚡԆມڸಒ j ঐུᆹዀᚡԆມ឵ܼȶᄈᆏᜱ߾ȷȂ ᖟپٿᇴȂԄrel(Բ,٥ᢝ)=rel(٥ᢝ,Բ)Ȃ֊ хߓȶΟΡȷІȶӔᜌȷڎঐུᆹዀᚡԆມณ፤п ٦Κঐ࣐ஆྦȂѫΚঐԆມӔգя౫ޠޒݸְΚ ኻȇη൸ڎঐུᆹዀᚡມཋӶུᆹዀᚡϜഎ Κକя౫Ȃءԥᑀя౫ӶዀᚡޠޒݸึҢȄߓ 2! ߩᄈᆏઑଳϟӔգມ
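rel(t_i, t_j): rows are Term_i, columns are Term_j

| Term_i \ Term_j | ΟΡ | Ӕᜌ | ࡶΫ | ນұ | Љί | Ϋᘉ | … |
|---|---|---|---|---|---|---|---|
| ΟΡ | 1 | 1 | 0.712 | 1 | 0.712 | 0.565 | … |
| Ӕᜌ | 1 | 1 | 0.712 | 1 | 0.712 | 0.565 | … |
| ࡶΫ | 1 | 1 | 1 | 1 | 1 | 0 | … |
| ນұ | 0.702 | 0.702 | 0.5 | 1 | 0.5 | 0.5 | … |
| Љί | 1 | 1 | 1 | 1 | 1 | 0 | … |
| Ϋᘉ | 0.792 | 0.792 | 0 | 1 | 0 | 1 | … |
| … | … | … | … | … | … | … | 1 |

Modified Hopfield Neural Network (MHNN)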
ҐःفᙥҦ MHNN ٿҢུᆹޤᜌҐᡞȞJain, Chen & Ichalkaranje, 2002; Lin & Chen, 1996; Xiaowei & Minghu, 2003ȟȄӶ MHNN Ϝઢစϰڸ ઢစࣻϤഀ๗ԚΚঐޤᜌᆪၰȄԄშ 9 ܛұ ࣐ᓔຆᅮઢစᆪၰޠᏱಭ࢝ᄻშȂӶԫΚᏱ ಭ࢝ᄻϜޠઢစϰ֊ߓұ࢛ΚུᆹዀᚡԆມȂՅ ഀ௦ઢစϰڸઢစϰޠ࣐ઢစȂӶԫࡿޠѻ ޠ᠍२অȂη൸ rel(ti,tj)অȄᙥҦ੬ۢᒰΤઢစ ϰٿࣁϾȞଌጜȟڐуઢစϰȂڦхΚᓻാޠ ઢစϰȂϛᘟޠໍߩጤܓᙾඳړ fs(netj)ޠၽ ᆘȂޣژܛԥޠԆມᆱࡼϛᡑȞԞᔨȟ࣐ЦȂՅ ഷࡤઢစϰޠᒰяȂཽࣻᜱޠུᆹዀᚡԆມᆺ ࣐ΚထȄၐಡޠ MHNN ᏱಭᅌᆘݳԄίȈ ᇴ݃ΚȈ ؑΚঐઢစϰᘉ֊хߓΚঐུᆹዀᚡԆ ມȂՅ tijࠍߓұ࣐ಒ i ঐུᆹዀᚡԆມᇅ ಒ j ঐུᆹዀᚡԆມޠӔգя౫ᜱഀโ࡚ ᠍२অȂη൸ rel(ti,tj)অȄ᠍२অޠॏᆘ РԓԄࠊΚܛख़Ȅ ᇴ݃ΡȈ ߒۗޠᒰΤມޠӬ࣐{
t1,t2,t3,...,ti,...tn}
Ȃଌ ጜᆪၰਣࢆᒶӉΚུᆹԆມٿ࿌ձଌጜޠ ஆྦȞstarting termȟᇅؑঐઢစϰᘉٿ һϤၽᆘȂڐϜೞᒶ࣐ଌጜஆྦມޠઢစ ϰڐχiޠߒۗঅ೪࣐ۢ 1Ȃڐуءԥೞᒶ࣐ ଌጜஆྦມޠઢစϰࠍ࣐ᒶଌጜஆྦ ມȂڐঅְ೪࣐࣐ 0ȄԄȞϵԓΝȟܛұȈ( )
0 = i, 0≤i≤n−1 i χ µ ȞϵԓΝȟ Data pre-processingTerms from word segmentation system CKIP
Figure 8. Flow of automatic news ontology construction (steps: data pre-processing of terms from the CKIP word segmentation system; determining the set of qualified terms; computing term weights; Hopfield network iteration, repeated until the stop criterion is met; detecting the cluster criterion; storing the generated ontology in the ontology database)
30 Journal of Library and Information Science 34ņ2ŇΚ19 – 41ΰOctober, 2008α ڐϜ xi࣐ಒ i ঐઢစϰޠᒰΤঅȇȝi(0)࣐ಒ i ঐઢ စϰӶਣ tɶ0 ਣޠᒰяঅȇn ࣐ઢစϰঐȄ २ፓޠໍȞϵԓΥȟȃȞϵԓΟȟޠॏᆘޣژᆪ ၰԞᔨ࣐ЦȄ ( 1) ( ), 0 1 1 0 − ≤ ≤ »¼ º «¬ ª = +
¦
− = n j t t f t n i i ij s i µ µ ȞϵԓΥȟ ڐϜ tijࡿಒ i ঐུᆹዀᚡԆມᇅಒ j ঐུᆹዀᚡԆ ມޠӔգя౫ᜱഀโ࡚᠍२অ rel(ti,tj)Ȃ֊ઢစ᠍ २অȇfsࡿᚗ᠊ԣᙾඳړȞsigmoid functionȟȇn ࣐ઢစϰঐȄ ᚗ᠊ԣᙾඳړ fsޠॏᆘРԓԄίȈ( )
»¼ º «¬ ª − + = 0 ) ( exp 1 1 θ θj j j s net net f ȞϵԓΟȟ ڐϜș0࣐ңٿ።ᐍᚗ᠊ԣᙾඳړ fsםޒޠள অȇșj࣐೪ۢϟᒰяߟᘥঅȞthreshold or biasȟȄ ȞϵԓΟȟϜ netjޠॏᆘРԓԄίȈ( )
¦
− = = 1 0 n i ij i j t t netµ
ȞϵԓΫȟ ᇴ݃ήȈ ᓔຆᅮઢစᆪၰ२ፓஉпαޠؐ ȂޣژᒰяޠུᆹԆມϛᡑ࣐ЦȄҐः فпίӗޠȞϵԓΫΚȟցᘟԓٿ؛ۢᆪ ၰԞᔨȄ( )
( )
[
]
¦
− = ≤ − + 1 0 2 1 n j j j t µ t ε µ ȞϵԓΫΚȟ ڐϜȝi(t)ࡿӶਣ t ਣಒ i ঐᘉޠᒰяঅȇ͛ࡿ ԞᔨਣȂഷτৡ೩ޠᇳৰȄ ӶҐᄃᡜϜȂșjȃș0ȃ͛ήঐᏱಭ҇ᎍ࿌ޠ ؛ۢȂစҦҐᄃᡜЇ᙮ขၑޠ๗ݏึ౫șj =1ᇅș0=1 ѠпூژၷٺޠᏱಭ๗ݏȇԫѵҐःفᏱಭԞᔨ ᇳৰ͛অ೪࣐ۢ 1Ȅӱ࣐Ґःف༊ଭᄈུᆹዀᚡມ ཋໍӔգᜱᖓโ࡚ϸݚȂٯп MHNN ໍມཋᜱ ᖓᆺȂӱԫؑΚུᆹٲӈሰ्ଌጜޠઢစϰ ໕ٯϛӼȂܛпؑԪଭᄈΚঐրུᆹٲӈໍޠ ϸထଌጜȂւңঐႬဟໍၽᆘංоഎѠпӶං ऍយϲႁژԞᔨȄկԞᔨࡤޠ MHNNȂؑΚঐઢ စϰޠᒰяঅഎӶϊᘉࡤං՞ϘҢ݃ᡘޠৰ ᡑϾȂӱԫԥ݃ᡘޠϸထ֩ᜳȄࢉҐःفණяΠΚ ঐᆺӈޠցᘟРݳٿঐୱᚡȄԄȞϵԓ ΫΡȟܛұȈ 1 , 0 , 1 0 , ≤ ≤ ≤ ≤ − ≤ −Y i j n Yi j α α ȞϵԓΫΡȟ ڐϜ YiࡿԞᔨࡤಒ i ঐઢစϰȞ֊ಒ i ঐུᆹዀ ᚡԆມȟޠഷࡤᒰяঅȇYjࡿԞᔨࡤಒ j ঐઢစϰȞ֊ ಒ j ঐུᆹዀᚡԆມȟޠഷࡤᒰяঅȇ࣐͗ցᘟᆺ ޠߟᘥঅȇn ࡿܛԥઢစϰȞུᆹዀᚡԆມȟޠ ঐȄ MHNNഷࡤؑΚઢစϰޠᒰяঅȂпڎڎପ ᄈޠРԓໍϵԓΫΚޠၽᆘȂसڎঐઢစϰȞུ ᆹዀᚡԆມȟࣻϟ๙ᄈঅϊ๊ܼܼȞϵԓΫΡȟ Ϝޠ͗ߟᘥঅȂࠍցᘟڎঐུᆹዀᚡԆມӔգя౫ ӶӤΚུᆹٲӈዀᚡޠᐡ౦ၷାȂࢉཽڎঐུᆹ ዀᚡԆມᆺ࣐ΚထȄՅ͗ߟᘥঅޠ೪ۢȂစ ҦҐᄃᡜ२ፓᄃᡜޠ๗ݏึ౫Ȃस͗অ೪ۢЋାȂ ࠍϸထޠထཽЋЎȂณݳུᆹٲӈዀᚡϟӔգ ུᆹԆມୣႥٿйထϜҢӔգя౫ޠུᆹԆ ມηཽЋӼȂԄԫҢޠུᆹޤᜌҐᡞཽᝓ२ኈ СࡤӶໍࢦၛᇮѰᘘਣޠཫ൷ഁ࡚ȇस͗অ೪ۢ ЋմȂࠍϸထޠထཽЋӼȂထϲޠུᆹӔգԆມཽ ЩၷЎȂࣦՎܼءԥӔգມޠಣӬȂԄԫณݳ࡛ ᄻяԥཏဏޠུᆹޤᜌҐᡞህֆུᆹཫ൷ȄစҦᄃ ᡜขၑ๗ݏึ౫͗೪࣐ 0.1 ਣڏԥϛᓀޠϸထ๗ݏȄᅘ⊏ᵧ⨯㋤ʀʁべଶ᱿ဎⲩᅞᘍ
ޤᜌҐᡞ፤Ѡпңܼඣख़࢛ঐስޤᜌޠቺ࢝ ᄻȂйαίቺԥ࢛ᆎโ࡚ޠᜱ߾ȄӱԫȂུᆹዀ ᚡԆມစҦ MHNN ᆺ౪ࡤȂѠпؑΚུᆹٲ ӈրϜႇࠊဋ౪ޠዀᚡԆມထᆺԚڏԥᜱഀ ܓޠөထȄҐःفܛණяޠུᆹޤᜌҐᡞ࢝ᄻȂӤ ΚထϲޠዀᚡԆມᜱ߾ҦܛॏᆘூژޠԆມᇅԆມ ϟӔգя౫ᜱᖓ࡚অђпඣख़ȂसӤΚထϲޠڎ ԆມܛॏᆘޠӔգя౫ᜱᖓ࡚অ rel(ti,tj)๊ࣻȂࠍп ᚗᓟٿߓұȇस rel(ti,tj)অ࣐ࣻȂࠍϸրп ᓟٿߓұȂसڎঐԆມ rel(ti,tj)অ࣐ 0 ਣȂࠍߓұԫ ڎԆມءԥӔգя౫ޠᜱᖓȂࢉϛϡଅᓄȄ ՅထڸထϟޠᜱഀȂҐःفණяΠΚঐຠ໕ޠ Рݳٿඣख़ထڸထϟޠᜱᖓ࡚ȂԄȞϵԓΫήȟ ܛұȈ33 . 0 3 2 ) , ( ) , ( ) , ( ) , ( ) , ( ) , ( = × + + + + ဪق٥ᢝ ຫֽਇԲ ຫֽਇ٥ᢝЀ ࠟࡾԲ ࠟࡾ٥ᢝ Բ
ဪق rel rel rel rel rel
rel
(
)
j i n p n q jq ip j i n n T T rel Weight i j × =¦¦
− = − = 1 0 1 0 , , ȞϵԓΫήȟڐϜ0≤Weighti,j≤1,0≤i,j≤n−1ȂWeighti,jߓұಒ i
ထᇅಒ j ထϟቺᜱᖓ࡚ȇniࠍߓұಒ i ထϜޠུᆹ ዀᚡԆມȇnjࠍߓұಒ j ထϜޠུᆹዀᚡԆມȇ rel(Ti,p,Tj,q)ߓұಒ i ထϜಒ p ঐུᆹዀᚡԆມᇅಒ j ထϜಒqঐུᆹዀᚡԆມޠӔգя౫ᜱᖓ࡚Ȅ ҐःفпȞϵԓΫήȟٿໍထڸထϟޠᜱᖓ ࡚ၽᆘȂѠпؒяထᇅထϟޠᜱᖓโ࡚ȄپԄӶ შ 10 ϜಒΚȃΡထޠԆມᜱᖓ࡚ၽᆘȂॷӒпಒΚ ထ࣐ஆྦȂಒΚထϜؑΚঐུᆹዀᚡԆມᇅಒΡ ထϜؑΚঐུᆹዀᚡԆມໍ rel(ti,tj)ޠђᖃڦ҂ ְၽᆘȂܛؒூঅߓұಒΚထ imply ಒΡထޠᜱᖓ ࡚࣐ 0.452ȇпಒΡထ࣐ஆྦȂಒΡထ imply ಒΚထ ޠᜱᖓ࡚࣐ 0.33ȄڐၐಡॏᆘРԓӗᖟԄίȈ ؑΚထޠԆມᇅڐуܛԥထԆມໍһϤၽᆘȂ सڎထܛॏᆘޠ Weighti,jঅ๊ࣻਣȂࠍпᚗᓟٿߓ ұȇस Weighti,jঅ࣐ࣻȂࠍпᓟٿߓұȄԫѵȂ ڎထޠ Weighti,jְ࣐ 0 ਣȂࠍߓұԫڎထءԥᜱᖓȂ ࢉϛϡଅᓄȄစҦпαܛख़ӤΚထϲІϛӤထᇅထ ϟᜱᖓ࡚ॏᆘޠ౪ؐȂѠҢԄშ 10 ܛұ ϟ࢛ΚུᆹٲӈޠུᆹࢦၛມޤᜌҐᡞȄ 0.4 52 0.3 3 2 ຫֽਇ ࠟࡾ ဪق 0.226 0.328 0. 1 53 0. 2 28 0.258 0.406 0.095 0.1 67 0.108 0.097 0 .18 0.1 98 0.186 0.216 6 რ ൷࠹ 5 ࣳᕴ 3 ૨ࠃ ᒷᓢ ᜔อ ᇨრ 4 ᓫᇩ 1 ٥ᢝ Բ 0.21 6 0.2 5 0.2 07 შ 10! пུᆹٲӈր࣐ȶܣᇰΚϜϛණΟΡӔᜌࡶΫນұЉίȷҢϟࢦၛມུᆹޤᜌҐᡞ 452 . 0 3 2 ) , ( ) , ( ) , ( ) , ( ) , ( ) , ( = × + + + + Բຫֽਇ Բࠟࡾ ٥ᢝဪقЀ ٥ᢝຫֽਇ ٥ᢝࠟࡾ ဪق
Բ rel rel rel rel rel
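The inter-cluster association of Equation 13 is the average of rel over all term pairs drawn from the two clusters, which is why the worked example divides a sum of six rel values by 2 × 3. A minimal sketch follows; the term names and rel values are hypothetical placeholders, since the individual values behind 0.452 and 0.33 are not recoverable from the text.

```python
def cluster_weight(rel, cluster_i, cluster_j):
    # Equation 13: sum rel(T_ip, T_jq) over every pair, divided by n_i * n_j.
    total = sum(rel.get((p, q), 0.0) for p in cluster_i for q in cluster_j)
    return total / (len(cluster_i) * len(cluster_j))

# Hypothetical clusters: two terms in the first, three in the second.
first = ["a", "b"]
second = ["c", "d", "e"]

# Hypothetical asymmetric co-occurrence weights rel(t_i, t_j).
rel = {
    ("a", "c"): 0.5, ("a", "d"): 0.4, ("a", "e"): 0.3,
    ("b", "c"): 0.6, ("b", "d"): 0.5, ("b", "e"): 0.4,
    ("c", "a"): 0.3, ("c", "b"): 0.3, ("d", "a"): 0.4,
    ("d", "b"): 0.3, ("e", "a"): 0.4, ("e", "b"): 0.3,
}

print(cluster_weight(rel, first, second))   # first cluster implies second
print(cluster_weight(rel, second, first))   # second cluster implies first
```

Because rel is asymmetric, the two directions generally give different weights, matching the paper's use of a single arrow for unequal values and a double arrow for equal ones.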
ʙ㆛߸ߧ᱿اᑨӼ
ҐःفпՍҢޠུᆹስޤᜌҐᡞٿ࿌ձཫ ൷ມՍᘘޠஆᙄȂٯплᚡӵშޠ֖౫РԓȂ ђᄈུܼᆹٲӈᔯસ๗ݏޠຝញϾᡘұѓȄ࿌ ٻңདྷ्ࢦၛ࢛ΚུᆹлᚡਣȂقಜ्ཽؒٻң ೪ུۢᆹᜱᗥມᇅಒΚቺᇮѰᘘޠߟᘥঅȄࡤ ᆓཽཫ൷Щᄈܛԥུᆹထಣրԥ٦ٳրᜱᖓژ ԫུᆹᜱᗥມȂقಜཽ௱ᙩࠊΫঐഷڏࣻᜱޠུᆹ ထಣրᡲٻңٿᘉᒶᎨȄ ௦ίٿٻңѠпٸᑺ፹ٿᘉᒶདᑺ፹ޠུᆹထ ಣրᘳ៕ȂٯໍಒΚቺུᆹޤᜌҐᡞԆມᘘ ࢦၛȄਣقಜཽٸᐄٻңΚۗܛ؛ۢޠࢦၛ ᇮѰᘘޠߟᘥঅȂᑣᒶಒΚቺུᆹޤᜌҐᡞϜܛ ԥུᆹዀᚡԆມ rel(ti,tj)অτ๊ܼܼܛ೪ߟᘥঅ ϟུᆹዀᚡԆມȂٿໍಒΚቺޠࢦၛᇮѰᘘȂ ٯщϸၽң Google news ϸޠ੬ܓȂᡲٻңѠ пٸᐄུᆹրٿᘳ៕ԥᜱޠུᆹዀᚡȄसٻң ᗚདྷޤၿᇅԫུᆹրࣻᜱܗࣻծޠུᆹਣȂѠп ӕᘉᒶໍίΚቺུᆹޤᜌҐᡞޠࢦၛᇮѰᘘȄ შ 11 ࣐Ґःفлᚡӵშ௱ᙩϟུᆹրฬ८Ȃق ಜཽпЩپ௷זޠРԓ௱ᙩࠊΫঐུᆹրᡲٻң Ѡпᘳ៕ᎨȂٻңѠпٸ෭ዀޠಌٿᘳ៕ TOP NུᆹրዀᚡȂٯᒶᐆഷདᑺ፹ޠུᆹր ٿໍࢦၛᇮѰᘘȄ შ 11! лᚡӵშޠ௱ᙩུᆹထಣրฬ८ пშ 10 ࣐پȂ࿌ٻңమࢦၛȶചЬࡶȷਣȂق ಜցۢȶചЬࡶȷΚມဤӶ MHNN ϸထϜޠಒΡ ထȂӶಒΚቺུᆹޤᜌҐᡞޠᘘϜȂໍӤኻ ထᆺӶಒΚထϜܛԥԆມޠࢦၛᇮѰᘘȂԄȶച ЬࡶȂڎۭȷȃȶചЬࡶȂນұȷϟࢦၛᇮѰᘘȄ सٻңདྷໍໍΚؐޠࢦၛਣȂقಜ௱ᙩഷ ڏࣻᜱޠΚထٿໍᘘȄӶҐپϜᇅಒΡထࣻᜱ ޠԥϥထȂڐࣻᜱโ࡚ϸր࣐ಒΡထᇅಒΚထࣻᜱ โ࡚࣐ 0.33ȇಒΡထᇅಒήထࣻᜱโ࡚࣐ 0.226ȇಒ Ρထᇅಒѳထࣻᜱโ࡚࣐ 0.153ȇಒΡထᇅಒϥထࣻ ᜱ โ ࡚ ࣐ 0.258 ȇ ಒ Ρ ထ ᇅ ಒ ϳ ထ ࣻ ᜱ โ ࡚ ࣐ 0.095Ȅӱԫقಜ௱ᙩࣻᜱโ࡚ഷାޠಒΚထȂໍ ಒΡቺུᆹޤᜌҐᡞϟࢦၛᇮѰᘘȂη൸п ϸထӶಒΚထϟࢦၛມȶΟΡ, Ӕᜌȷໍࢦၛᇮ ѰᘘȄڐуᘘРԓпԫ௱Ȅ пίпΚঐᄃርޠࢦၛپٿᇴ݃ᐍঐقಜၽձޠ ႇโȄԄშ 12 ܛұȂٻңདྷ्ࢦၛȶΟΡӔᜌȷ ࣻᜱޠུᆹлᚡȂقಜٸᐄᜱᗥԆᔯસ௱ᙩഷࣻᜱ ޠࠊΫঐུᆹထಣրٽٻңՍᘉᒶᎨȂٻ ңᒶᐆུᆹထಣրхဵ࣐ 49664 ޠུᆹٲӈȂ ձུ࣐ᆹޤᜌҐᡞޠᘘࢦၛȄԫѵȂقಜ्ؒ ٻң೪ۢಒΚቺࢦၛᇮѰᘘޠߟᘥঅȞԫپ೪ ࣐ۢ 0.5ȟȂقಜᐄԫ೪ۢঅໍಒΚቺޠࢦၛᇮ ѰᘘࢦၛȄშ 12! ࢦၛȺΟΡӔᜌȻࣻᜱޠུᆹлᚡ შ 13 ᡘұಒΚቺࢦၛᇮѰᘘޠࢦၛ๗ݏȂقಜ ηණٽٸᐄਣዀଅ௷זޠ๗ݏٽٻңٿᘳ៕Ꭸ ȂйηණٽؑΚུᆹրϜഷུึոޠུᆹዀᚡ ٽٻңՄȄٻңѠпᘉᒶདᑺ፹ޠུᆹր ٿᢏࣽ၏ུᆹրϜಒΚቺᘘܛཫ൷ژܛԥࣻᜱ ޠུᆹಡȂԄშ 14 ܛұȄ შ 13! пਣዀଅ௷זུᆹᡘұಒΚቺࢦၛᇮѰᘘࢦၛ๗ݏ
34 Journal of Library and Information Science 34ņ2ŇΚ19 – 41ΰOctober, 2008α შ 14! ᡘұུᆹထಣր࣐ 49664 ȂಒΚቺᘘࢦၛܛཫ൷ژޠܛԥུᆹಡ Ӷშ 13 ѢαِᡘұȺQuery ExtensionȻޠົഀ ๗ѠпР߰ٻңӕ܂ίΚቺࢦၛᇮѰޠᘘ ࢦၛȂڐقಜཽՍᐄུᆹޤᜌҐᡞޠαί ቺᜱ߾Ȃ௱ᙩഷࣻᜱޠቺٿໍΚؐޠᘘࢦ ၛȄშ 15 ᡘұಒΡቺٸਣዀଅ௷זུᆹᘘࢦၛ ޠ๗ݏȄ შ 15! ಒΡቺٸਣዀଅ௷זུᆹᘘࢦၛޠ๗ݏ ᆤӬпαܛख़ȂпҐःفܛණяޠུᆹҐᡞޤᜌ ໍࢦၛᇮѰޠᘘࢦၛڏԥпί੬ܓȈ࿌ໍಒ ΚቺᘘࢦၛਣȂܛཫ൷ژޠུᆹထಣրϜޠུ ᆹȂτഌϸഎᇅࢦၛມڏԥޣ௦ࣻᜱȇՅಒΡቺᘘ ࢦၛϟ๗ݏȂܛཫ൷ژޠུᆹထಣրᇅٻң ܛమࢦၛϟлᚡԥ௦ࣻᜱȄҦԫѠُȂಒΚቺޠ ᘘࢦၛпޣԓޠ࡚ཫ൷࣐лȂཫ൷ϟ๗ݏ ᇅлᚡུᆹڏԥޣ௦ࣻᜱȇಒΡቺޠᘘࢦၛп Ь҂ԓޠኅ࡚ཫ൷࣐лȂཫ൷ϟ๗ݏᇅлᚡུᆹڏ ԥ௦ࣻᜱȄ
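The two expansion layers described in this section can be sketched as two small functions: the first layer keeps the terms in the query term's own cluster whose rel value reaches the user-set threshold, and the second layer moves to the cluster with the highest inter-cluster weight. The inter-cluster weights below are the ones quoted for the second cluster in the Figure 10 example; the function names, the term labels and the rel values are hypothetical.

```python
def first_layer_terms(query_term, cluster_terms, rel, threshold):
    # Keep co-occurring terms of the same cluster with rel >= threshold.
    return [t for t in cluster_terms
            if t != query_term and rel.get((query_term, t), 0.0) >= threshold]

def most_related_cluster(cluster_id, weights):
    # Pick the cluster with the largest non-zero weight from `cluster_id`.
    candidates = {j: w for (i, j), w in weights.items()
                  if i == cluster_id and w > 0}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)

# Weights from the second cluster to the others, as quoted in the example.
weights = {(2, 1): 0.33, (2, 3): 0.226, (2, 4): 0.153,
           (2, 5): 0.258, (2, 6): 0.095}

# Hypothetical rel values for a query term "q" and two cluster mates.
rel = {("q", "t1"): 0.7, ("q", "t2"): 0.4}

print(first_layer_terms("q", ["q", "t1", "t2"], rel, threshold=0.5))
print(most_related_cluster(2, weights))
```

With a threshold of 0.5 only the strongly co-occurring term survives the first layer, and the second layer selects the first cluster, in line with the example in the text.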
૪㊹ኞ⎞Ӡኔ
Ґःفпڑᙡϟ Google news ུᆹໍུᆹᔯસ ਞޠຠզȂՍ 2004 Ԓ 11 У 9 СՎ 2006 Ԓ 6 У 16СЦȂᖃॏӔڑᙡ 60,269 ུᆹրȂӔ 1,191,065 ࠍུᆹܼၦਠ৳ϜȂߓ 3 ࣐Ґःفึϟུᆹཫ൷ Жᔞࡤᆓܛڑᙡུᆹၦਠ৳ޠϸոޒݸȄߓ 3! Ґःفڑᙡུᆹၦਠ৳ϸոޒݸ ڑᙡ Google News Ս 2004-11-9 Վ 2006-6-16 ࣐Ц ᖃུᆹٲӈࠍ Ӕ 1,191,065 ࠍུᆹ ᖃུᆹր Ӕ 60,269 ؑС҂ְུᆹٲӈᇕ໕ 2039.49ࠍʝЉ ؑΚ҂ְུᆹࠍ 19.76ࠍʝ ڑᙡഷτུᆹထಣրࠍ 3007์ ڑᙡഷϊུᆹထಣրࠍ 2์ ҦܼҐःفڑᙡϟུᆹၦਠ৳ࣻ࿌ᛂτȂ࣐ԥਞ ຠզུᆹᔯસਞȂӱԫᄃᡜпᓎᐡܫኻ 4348 ུᆹ ထಣրձ࣐ՍҢུᆹޤᜌҐᡞໍՅМනུᆹ ཫ൷ޠஆᙄȄᐄҐःفึޠлᚡӵშ௱ᙩᐡ ښȂقಜҢࠊΫᇅࢦၛມഷࣻᜱޠུᆹٲӈ лᚡȂٯѠଭᄈٳུᆹлᚡٿໍུᆹٲӈޠࢦ ၛᇮѰᘘȄҐःفւңϏᓎᐡࢆᒶ 5 ঐዦߟޠ ུᆹᜱᗥມٿᡜᜍҐःفܛණРݳޠᔯસਞȂԄ ߓ 4 ܛӗȄຠզޠРݳпໍࢦၛᇮѰᘘࡤϟ ࢦྦ౦Ȟprecision rateȟІණ݈ޠࢦӓ౦Ȟrecall rateȟ ڎঐข࡚ໍᔯસਞޠຠզȄҦܼҐःفڑᙡޠ ུᆹٲӈ໕ᛂτȂณݳᆡጃॏᆘяᇅٻңܛࢦ ၛϟࢦၛມࣻᜱޠܛԥུᆹٲӈᖃȂӱԫณݳ֖ ౫ࢦӓ౦ޠਞЩၷȂկණ݈ޠࢦӓ౦Ѡпѐᡘ ܛණРݳޠᓻਞȄՅུᆹཫ൷Жᔞϟཫ൷๗ݏ ҦΡΫ՞ٻңٿໍུᆹٲӈޠຠᒶȂຠۢዀྦ пٻңܛࢦၛϟུᆹлᚡ࣐ஆྦᘉȂଭᄈܛཫ ൷ژޠུᆹٲӈಓӬమࢦၛޠུᆹлᚡРԓຠ զȄ ߓ 4! ϥঐขၑཫ൷ਞޠུᆹᜱᗥມӗߓ Query 1 ήΚΟᅂᔟ Query 2 ഽଞུښ Query 3 ѵᝳ্ུ Query 4 ା២ٚ Query 5 ࠓଠཥक़
૪㊹ɺ
ᄃᡜΚޠҭޠӶܼᢏᄇٻңໍϥঐϛӤࢦ ၛᜱᗥມٯᘉᒶདᑺ፹ޠུᆹлᚡࡤȂᄈӤΚུᆹ ထಣրϟུᆹٲӈໍࢦၛᇮѰಒΚቺᇅಒΡ ቺᘘϟᐍᡞᔯસࢦྦ౦ޠຠզȂڐࢦၛᇮѰᘘ ϟߟᘥϸր೪࣐ۢ 0.3, 0.5, 0.8ȂᙥҦུᆹཫ൷ Жᔞޠཫ൷๗ݏȂٿຠۢࣻᜱޠུᆹٲӈխܛԥᘞ ڦུᆹٲӈϟЩپпؒூڐࢦྦ౦Ȅߓ 5 ᡘұϛӤ ޠུᆹᜱᗥມӶϛӤϟߟᘥίܛᔯસژޠུ ᆹᖃᇅࣻᜱޠུᆹ์Ȅშ 16 ᡘұϛӤޠུᆹ ᜱᗥԆӶϛӤߟᘥঅϟࢦྦ౦Ȅпήঐߟᘥঅ ϟ೪ۢՅّȂຠզࢦၛᇮѰಒΚቺᇅಒΡቺᘘϟ ᐍᡞᔯસਞȂ҂ְѠпႁژߗΟԚпαޠࢦྦ ౦Ȅпϥঐུᆹᜱᗥມᄈᔗ 0.3, 0.5, 0.8 ϟߟᘥঅ ؒࢦྦ౦ϟ҂ְȂ๗ݏϸր࣐ 88.85%ȃ90.39%ᇅ 92.61%ȄשউѠпึ౫ߟᘥঅ೪ۢூ຺ାȂܛᔯસ ژޠུᆹ຺ಓӬٻңܛ्׳ޠུᆹлᚡȂ࿌ดѠ пႲُޠࣻᜱܼࢦၛມޠུᆹٲӈࢦӓ౦ཽӱ ԫՅί७ ԫѵȂᄃᡜ๗ݏᡘұಒΡঐࢦၛᜱᗥມȶഽଞུ ښȷȂณ፤௵ңޠࢦၛᇮѰᘘߟᘥঅ࣐ӼЎȂְѬ ႁژմй௦ߗ 76%ޠࢦྦ౦Ȃစᘫાึ౫Ѡٿ Սܼڎঐл्ӱȈڐΚ݃ᡘޠԫΚࢦၛᜱᗥມ ܛߓႁޠུᆹࢦၛཏ܉Ȃࣻၷܼڐуѳঐࢦၛᜱᗥ ມޠᇮཏՅّၷϛ݃ጃȂٻூࢦၛ๗ݏϛᆺฑȂ ηৡܿആԚٻңᄈܼᔯસ๗ݏᇰޤαޠৰȂӱ Յኈࢦྦ౦ȇѫΚঐӱࠍҐٸᐄ Google36 Journal of Library and Information Science 34ņ2ŇΚ19 – 41ΰOctober, 2008α news ϸࠍՅೞϸӶঐུᆹրޠུᆹٲ ӈȂ܅ԫϟޠৰၷτȂᗷดഎᇅഽଞུښԥᜱȂ կᆺฑޠឋᚡၷ࣐ϸයȂӱՅኈᔯસ๗ݏޠࢦྦ ౦ȄӱԫȂࢆᒶུᆹᜱᗥມᄈܼණ݈ᔯસਞηڏԥ ᜱᗥܓޠኈȂ࿌ٻңᒰΤޠུᆹᜱᗥມϛ݃ጃ ਣȂ܂܂ณݳூژᅗཏޠᔯસ๗ݏȄپԄȈȶҰݷα ᅜȷΚມȂҦܼϛѬԥשԥኻޠםึҢȂࣻ ᄈܼڐуޠঢ়ηԥԫםึҢȂйҰݷޠαᅜѠ ਗ਼ژөޠ࢈ݾȃစᔽȃऌ๊ӱષኈȂܛпཫ ൷ޠ๗ݏਗ਼ӼޠስӱષՅၷϛྦጃȄ ߓ 5! ུᆹᜱᗥມӶϛӤߟᘥঅܛཫ൷ژޠࣻᜱུᆹ์Ȟࣻᜱུᆹʝᖃུᆹ์ȟ
| Threshold | Query 1 | Query 2 | Query 3 | Query 4 | Query 5 |
|---|---|---|---|---|---|
| 0.3 | 361/432 | 420/551 | 160/164 | 458/478 | 1239/1360 |
| 0.5 | 330/401 | 415/542 | 154/159 | 450/455 | 1324/1360 |
| 0.8 | 63/65 | 410/533 | 152/156 | 339/340 | 1252/1360 |
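The precision figures follow directly from the counts in Table 5 (relevant news / retrieved news). The short script below recomputes them and the per-threshold averages; the variable names are introduced here for illustration only.

```python
# (relevant, retrieved) counts per query, keyed by expansion threshold (Table 5).
counts = {
    0.3: [(361, 432), (420, 551), (160, 164), (458, 478), (1239, 1360)],
    0.5: [(330, 401), (415, 542), (154, 159), (450, 455), (1324, 1360)],
    0.8: [(63, 65), (410, 533), (152, 156), (339, 340), (1252, 1360)],
}

def precision(relevant, retrieved):
    # Share of retrieved news items judged relevant.
    return relevant / retrieved

averages = {}
for threshold, pairs in counts.items():
    rates = [precision(rel, ret) for rel, ret in pairs]
    averages[threshold] = sum(rates) / len(rates)
    print(threshold, [f"{r:.2%}" for r in rates], f"average {averages[threshold]:.2%}")
```

The averages reproduce the 88.85%, 90.39% and 92.61% reported for thresholds 0.3, 0.5 and 0.8, and Query 2 stays near 76% at every threshold.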
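Precision rates with various thresholds

| Threshold | Query 1 | Query 2 | Query 3 | Query 4 | Query 5 |
|---|---|---|---|---|---|
| 0.3 | 83.56% | 76.23% | 97.56% | 95.82% | 91.10% |
| 0.5 | 82.29% | 76.57% | 96.86% | 98.90% | 97.35% |
| 0.8 | 96.92% | 76.92% | 97.44% | 99.71% | 92.06% |

Figure 16. Precision rates of the news queries under different thresholds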
૪㊹ʷ
ᄃᡜΡҭޠӶܼᢏᄇٻңໍᘉᒶདᑺ፹ޠུ ᆹлᚡࡤȂᢏᄇЩၷӤΚུᆹထಣրϟུᆹޤᜌ ҐᡞڎቺޠՍࢦၛᇮѰᘘࢦၛޠ๗ݏȂл् ٿᡜᜍစҦΡቺུᆹޤᜌҐᡞޠՍᘘࢦ ၛȂѠп׳яࣻᜱܼࢦၛມϟᗵ֥ࣻᜱུᆹٲӈȄ შ 17Ȟaȟȃშ 17Ȟbȟȃშ 17ȞcȟϸրᡘұӶϛӤ ߟᘥঅίȂϛӤޠུᆹᜱᗥԆಒΚቺᇅಒΡቺ ᘘࢦྦ౦ϟЩၷ๗ݏȄڐϜȂҦܼ Query 4 Ѭԥ ΚቺޠུᆹޤᜌҐᡞȂࢉณݳໍಒΡቺϟЩၷȄ Ҧᄃᡜ๗ݏѠпึ౫҂ְಒΚቺᘘࢦྦ౦ЩಒΡ ቺٿޠାȄϸݚڐӱ࣐ӶಒΚቺޠᘘϜȂႈ എᇅٻңܛమࢦၛϟུᆹлᚡࣻᜱȂՅಒΡቺϟ ᘘࠍਗ਼Іژ၏ུᆹٲӈϟڻᜟུᆹȄپԄȈࢦ ၛȶήΚΟᅂᔟȷΚມȂಒΚቺޣ௦ᡘұࣻᜱޠ ུᆹлᚡȂՅಒΡቺࠍᡘұᒳήΚΟᅂᔟϟ ݀၊ീςࣻᜱၦଊȞԄစᒳ٦ٳަཽӈȃԚ ߞनෂ๊ȟȂпІᔯસяήΚΟᅂᔟឍᄇਜ਼ൣяৰ ਢຳٲӈ๊ڻᜟࣻᜱޠུᆹȂໍՅᡲٻңѠпщ ӌᕤ၍ԥᜱήΚΟᅂᔟึޠۗҒпІ׳яਗ਼ ژޠࣻᜱུᆹлᚡȄThe threshold of Co-occurrence terms is set to 0.3 0% 50% 100% ˡ˸̊̆ʳˢ́̇̂˿̂˺̌ʳ̂˹ ˹˼̅̆̇ʳ˿˴̌˸̅ ˌˊˁˈ˃ʸ ˋˇˁ˃ˈʸ ˄˃˃ˁ˃˃ʸ ˄˃˃ˁ˃˃ʸ ˡ˸̊̆ʳˢ́̇̂˿̂˺̌ʳ̂˹ ̆˸˶̂́˷ʳ˿˴̌˸̅ ˊˈˁˆˊʸ ˈ˄ˁˇ˄ʸ ˌˆˁˆˆʸ ˌ˃ˁˌˇʸ ˤ̈˸̅̌˄ ˤ̈˸̅̌˅ ˤ̈˸̅̌ˆ ˤ̈˸̅̌ˈ Ȟaȟ೪ۢߟᘥঅ࣐ 0.3 ϟࢦྦ౦
The threshold of co-occurrence terms is set to 0.5

| News ontology layer | Query 1 | Query 2 | Query 3 | Query 5 |
|---|---|---|---|---|
| First layer | 96.18% | 84% | 100.00% | 100.00% |
| Second layer | 75.56% | 58.97% | 91.80% | 97.30% |

(b) Precision with the threshold set to 0.5

The threshold of co-occurrence terms is set to 0.8

| News ontology layer | Query 1 | Query 2 | Query 3 | Query 5 |
|---|---|---|---|---|
| First layer | 100.00% | 85.41% | 100.00% | 100.00% |
| Second layer | 95.00% | 56.41% | 93.44% | 91.91% |

(c) Precision with the threshold set to 0.8

Figure 17. Precision of first-layer versus second-layer query expansion for the news queries under different query expansion thresholds
૪㊹ɿ
ҦܼҐःفܛڑᙡޠུᆹٲӈ໕ᛂτȂณݳᆡ ጃॏᆘяᇅٻңܛࢦၛϟࢦၛມܛԥࣻᜱޠུᆹ րᖃȂӱԫณݳ֖౫ࢦӓ౦ޠਞЩၷȄࢉଭ ᄈٻңܛమࢦၛϟུᆹᜱᗥມȂҐᄃᡜᢏᄇ༉ಜ ᜱᗥԆЩᄈཫ൷ݳᇅུᆹޤᜌҐᡞϟՍࢦၛᇮѰ ᘘϟཫ൷๗ݏȂӶུᆹထಣրࢦӓ౦αԥ ਞණЁȄڐຠզРݳ೪ܛԥུᆹၦਠ৳ϜȂᇅ ٻңܛమࢦၛϟུᆹᜱᗥມڏԥࣻᜱޠུᆹထಣ րᖃ࣐ x Ȃᐄ༉ಜᜱᗥԆЩᄈཫ൷ݳܛཫ ൷ژޠࣻᜱր࣐ a Ȃӱԫཫ൷๗ݏќܛԥࣻ ᜱޠུᆹրЩپ࣐x a ȇसུᆹޤᜌҐᡞϟՍᘘ ܛཫ൷ژޠր࣐ b Ȃࠍཫ൷๗ݏќܛԥࣻᜱ ޠ ུ ᆹ ր Щ پ ࣐ x b Ȃ ࢉ ڐ ࢦ ӓ ౦ ϟ ණ Ё ࣐ x a x a b− ȂᙐϾࡤ࣐ba−aȄ Ґᄃᡜϟ๗ݏึ౫Ȃ༉ಜᜱᗥԆЩᄈཫ൷ݳϟཫ ൷๗ݏ҇ಓӬٻңܛᒰΤϟུᆹᜱᗥມϘຝ࣐ ࣻᜱȂसུᆹዀᚡڏԥࣻӤᇮཏඣख़Ȃկϛє֥ٻ ңܛᒰΤϟུᆹᜱᗥມȂقಜณݳຝ࣐ࣻᜱϟ ུᆹՅೞᔯસяٿȄࣻၷϟίȂҐःفϜܛණяϟ ུᆹޤᜌҐᡞϟՍࢦၛᇮѰᘘཫ൷๗ݏȂѠпཫ ൷ژӼڏࣻᜱޠུᆹлᚡᇅུᆹրȄშ 18 ᡘұ ϥঐขၑޠࢦၛມȂӶစႇՍུᆹޤᜌҐᡞᘘ ࡤϟࢦӓ౦ޠණЁϸր࣐ 8.7%ȃ4.55%ȃ66.67%ȃ60% ᇅ 59.09%Ȃڐ҂ְࢦӓ౦ޠණЁ࣐ 39.8%Ȅ 0% 20% 40% 60% 80% ˄ ˅ ˆ ˇ ˈ Query Recall Rate შ 18! ϥঐᜱᗥԆࢦၛມϟՍུᆹޤᜌҐᡞࢦၛᇮѰᘘϟࢦӓ౦ණЁϸոშ⧄⎞ቍ͗ᶇἄᅞ
⧄
ҐःفණяΚঐՍ࡛ᄻུᆹޤᜌҐᡞޠРݳȂ ٯڐᔗңུܼᆹཫ൷ЖᔞޠࢦၛᇮѰᘘȂԥֆ ܼٻңϛৡܿᒶᐆԂޠࢦၛມٿᔯસུᆹޠ ୱᚡȂᄃᡜ๗ݏηᡜᜍҐःفܛණяϟڏࢦၛᇮѰ ᘘޠུᆹཫ൷Жᔞڏԥϛᓀޠࢦྦ౦ȂйηѠп ԥਞޠණུ݈ᆹޠࢦӓ౦ȄԫѵȂഇႇུᆹޤᜌҐ ᡞໍࢦၛᇮѰޠᘘࢦၛȂӤਣڏԥޣԓޠ ࡚ཫ൷ІЬ҂ԓኅ࡚ཫ൷ޠᓻᘉȂପӬਣዀଅ௷ זࢦၛ๗ݏȂѠп֖౫яུᆹޠึȄԫѵȂ ණٽུᆹлᚡӵშޠٻңϮ८ȂѠпຯߗٻң ᘳ៕ུᆹٲӈޠሰؒȄቍ͗ᶇἄᅞ
ҐःفᗷϑึяڏᄃңቌঅϟȶڏࢦၛᇮѰᘘ ϟлᚡӵშུᆹཫ൷ЖᔞȷȂկۧԥΚٳឋᚡঅூ ґٿΤःفȂᘫાԄίȈ Κȃཫ൷Жᔞᔯસഁ࡚ϟਞ Ґःفึޠུᆹཫ൷ЖᔞӶٸᐄܛ࡛ဋϟ ུᆹޤᜌҐᡞໍࢦၛᇮѰᘘਣȂस၏ቺ ུᆹޤᜌҐᡞӔգя౫ມၷӼਣȂڐཫ൷๊ ࡠޠਣཽЩၷεȄӕȂӱҭࠊقಜܛڑᙡޠུᆹ໕ϑစົႇԼࠍȂӱԫആԚၦਠ ৳ၽձαਞ౦ޠ७մȄӱԫґٿԄեւңၷ ٺޠၦਠસЖᐡښܗၦਠ๗ᄻٿᐍঐཫ ൷Жᔞޠཫ൷ഁ࡚ȂґٿقಜᡑԚ Κঐᄃңޠུᆹཫ൷ЖᔞޠᜱᗥȄ ΡȃًԓᓔຆᅮઢစᆪၰᏱಭ؛ۢ Ґःف௵ңϟًԓᓔຆᅮઢစᆪၰޠ ᏱಭစҦლၑᓀᇳޠစᡜРݳூژȂ ٳޠ؛ۢᄈܼᏱಭԞᔨڏԥΚۢโ࡚ ޠఄད࡚Ȃޠఄད࡚ᄈܼӉեᐡᏣᏱಭ РݳְԇӶȄ࣐Π७մޠఄད࡚Ȃւң ΚٳഷٺϾޠཫ൷РݳȞReklaitis, Ravindran & Ragsdell, 1993ȟህֆޠ؛ۢΚঐၷ ࣐ѠޠРݳȂկւңഷٺϾРݳӶ؛ۢ ӬᎍޠႇโϜ҇߇ຳࣻ࿌โ࡚ѵޠ ॏᆘਣȄ ήȃቩђӤဏມޠᘘԥֆܼཫ൷ژӼࣻᜱϟ ུᆹлᚡ Ҧܼөঢ়ུᆹ൭ᡞൣᏳӤΚུᆹٲӈܛίޠ ዀᚡϛΚȂଷΠՄ໕ҐःفܛණяпӔգມ ࣐ஆᙄܛ࡛ဋޠུᆹޤᜌҐᡞѵȂसӤਣ ᐍӬӤဏມུܼᆹޤᜌҐᡞȂѠпණٽ Ԃޠུᆹཫ൷ࠣ፵Ȅ ѳȃණାϜНᘟມມཋ৳ϟ໕Ȃ७մґޤມп ђུᆹޤᜌҐᡞϟ࡛Ҵ Ґःف௵ңϜःଲޠ CKIP ϜНᘟມقಜٿ ᘞڦѠޠུᆹዀᚡԆມໍུᆹࢦၛມޤ ᜌҐᡞޠ࡛ҴȂҭࠊ८ᖞޠഷτୱᚡӶུܼ ᆹϜளя౫ґޤມȞunknown wordȟȂй CKIP ᘟມقಜณݳҔጃ౪࢛ٳґޤມȂആԚᒹ ѷ೩Ӽڏԥ࡛ဋུᆹޤᜌҐᡞቌঅޠມཋȄ ґٿसЎґޤມȂԥւܼණЁུᆹޤ ᜌҐᡞϟ࡛ဋࠣ፵ȄԫѵȂҐःفґٿη ლၑւңѫΚঐڏԥུᆹґޤມՍଢ଼ѓ Սᘘщມ৳ѓޠ ECScannerȞECScanner chinese word segmentation systemȟϜНᘟມ قಜڦх CKIPȂ၏ᘟມقಜڏԥΚঐՍՍ Google news ᆪયଢ଼ུມѓቩђມ৳ᘟ
ມΩޠᐡښȞHong, Chen & Chiu, 2008ȟȂ Ѡпԥਞ၍؛ CKIP ܛ८ᖞϟґޤມୱᚡȄ ϥȃлᚡӵᚡޠϮ८֖౫ Ґᄃᡜւң Jpgraph API ٿҢлᚡӵშᘳ ៕Ϯ८Ȃґٿसп Ajax ึϤܓ ٺޠлᚡӵშ֖౫РԓȂԥֆܼึЅ ޠٻңϮ८Ȅ ϳȃًུᆹዀᚡԆມ᠍२ॏᆘРݳ Ґःفܛ௵ңޠུᆹዀᚡԆມ᠍२অॏᆘР ݳၷᎍӬܼ϶ߞНതϟॏᆘȂґٿѠ൷ؒᎍ Ӭܼᙐዀᚡϟ᠍२অॏᆘРݳȂпණାུ ᆹԆມᜱᖓ࡚ॏᆘޠҔጃܓȄ
א≙ᄽ᪇
Baeza-Yates, R., & Ribeiro-Neto, B. (1999). Modern information retrieval. Addison-Wesley.
Bernaras, A., Laresgoiti, I., & Corera, J. (1996). Building and reusing ontologies for electrical network applications. In Proceedings of the 12th European Conference on Artificial Intelligence, 298-302.
Berners-Lee, T., & Fischetti, M. (1999). Weaving the web. Orion Business Books.
Binstock, A., & Rex, J. (1995). Practical algorithms for programmers. Addison-Wesley.
Choi, J., Kim, M., & Raghavan, V. V. (2001). Adaptive feedback methods in an extended boolean model. In Proceedings of ACM SIGIR Workshop on Mathematical/Formal Methods in Information Retrieval. New Orleans, LA.
Chen, C. M., & Liu, C. Y. (2008). Personalized e-news monitoring agent system for tracking user-interested Chinese news events. Applied Intelligence, DOI 10.1007/s10489-007-0106-7.
CKIP Chinese word segmentation system. Retrieved from the World Wide Web: http://ckipsvr.iis.sinica.edu.tw/
ECScanner Chinese word segmentation system. http://dlll.nccu.edu.tw/~rank/ecscanner/
Frakes, W. B., & Baeza-Yates, R. (1992). Information retrieval: Data structures and algorithms. Prentice-Hall.
Gauch, S., & Smith, J. B. (1993). An expert system for automatic query reformation. Journal of the American Society for Information Science, 44(3), 124-136.
Harman, D. K. (1995). Overview of the third text retrieval conference (TREC-3).
Hong, C. M., Chen, C. M., & Chiu, C. Y. (2009). Automatic extraction of new words based on Google news corpora for supporting lexicon-based Chinese word segmentation systems. Expert Systems with Applications, 36(2), 3641-3651.
Jain, L. C., Chen, Z., & Ichalkaranje, N. (2002). Intelligent agents and their applications. Physica-Verlag GmbH.
Kartoo search engine. Retrieved from the World Wide Web: http://www.kartoo.net/e/eng/index.html
Knuth, D. (1973). The art of computer programming, vol. 3: Sorting and searching. Addison-Wesley.
Lin, C., & Chen, H. (1996). An automatic indexing and neural network approach to concept retrieval and classification of multilingual (Chinese-English) documents. IEEE Tran. on Sys. Man and Cybernetics - Part B: Cybernetics, 26(1), 75-88.
Maron, M. E., & Kuhns, J. L. (1960). On relevance, probabilistic indexing and information retrieval. Journal of the ACM, 7(3), 216-244.
Miller, W. L. (1971). A probabilistic search strategy for MEDLARS. Journal of Documentation, 27(4), 254-266.
Mitra, M., Singhal, A., & Buckley, C. (1998). Improving automatic query expansion. Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 206-214.
Mooter search engine. Retrieved from the World Wide Web: http://www.mooter.com/moot
Neches, R., Fikes, R., Finin, T. W., Gruber, T. R., Patil, R., Senator, T. E., et al. (1991). Enabling technology for knowledge sharing. AI Magazine, 12(3), 36-56.
Och Dag, J. N., Regnell, B., Carlshamre, P., Andersson, M., & Karlsson, J. (2001). Evaluating automated support for requirements similarity analysis in market-driven development. Proceedings of Seventh International Workshop on Requirements Engineering: Foundation for Software Quality (REFSQ'01).
Peat, H. J., & Willett, P. (1991). The limitations of term co-occurrence data for query expansion in document retrieval systems. Journal of the American Society for Information Science, 42(5), 378-383.
Reklaitis, G. V., Ravindran, A., & Ragsdell, K. M. (1993). Engineering optimization methods and applications. Wiley, New York.
Salton, G. (1989). Automatic text processing: The transformation, analysis, and retrieval of information by computer. Addison-Wesley.
Smeaton, A. F., & Van Rijsbergen, C. J. (1983). The retrieval effects of query expansion on a feedback document retrieval system. The Computer Journal, 26(3), 239-246.
Swartout, B., Patil, R., Knight, K., & Russ, T. (1996). Toward distributed use of large-scale ontologies. Proceedings of the 10th Knowledge Acquisition for Knowledge-Based Systems Workshop, 33-40.
The TAO of topic maps. Retrieved from the World Wide Web: http://www.ontopia.net/topicmaps/materials/tao.html
Tseng, Y. H. (2001). Automatic cataloguing and searching for retrospective data by use of OCR text. Journal of the American Society for Information Science and Technology, 52(5), 378-390.
Woods, W. A. (1997). Conceptual indexing: A better way to organize knowledge. Retrieved from the World Wide Web: http://www.sun.com/research/techrep/1997/smli_tr-97-61.ps
Xiaowei, S., & Minghu, J. (2003). An information retrieval system based on automation query expansion and Hopfield network. IEEE Int. Conf. Neural Network & Signal Processing, 1624-1627.
еໍȃᒄτӓȞ2003ȟȄઢစᆪၰᇅዂጚښ౪ ፤ΤߟȞঔॐޏȟȄѯіҀȈӓȄ ঢ়ᇻȞ2002ȟȂп Bio-Ontology ࣐ஆᙄϟлᚡӵშ ೪ॏᇅᄃձȄࡏݎऌτᏱၦଊᆔ౪قᆉς፤ НȂґяޏȂࡏݎᑫȄ ݔ߭ԚȃዊኌȃዊᄹȞ2003ȟȄлᚡӵშІڐ ӶસЖڑϟᔗңȄ2003 ԒၦଊऌᇅშਫᓣᏱ ःଇཽȂ229-253Ȅ लᛞНȞ2005ȟȄлᚡ྆܉ቺዂȈ྆܉ԓཫ൷Ȅ ϜѶτᏱᆪၰᏱಭऌःفܛᆉς፤НȂґя ޏȂੁ༫ᑫȄ ೩ҔݡȞ2004ȟȄᇮཏᆪαՍϾ࡛ᄻҐᡞ፤ϟः فȄЉлఁህϧτᏱၦଊᆔ౪Ᏹقᆉς፤НȂґ яޏȂѯіᑫȄ ചӏȃೆ⩨Ȟ2001ȟȄၦଊᔯસϟϜНມཋᘘȄ ၦଊ༉ክᇅშਫᓣᏱȂ8Ȟ1ȟȂ59-75Ȅ ϰᡘȞ2002ȟȄ՞НӈϟၦଊಣᙒᇅлᚡϸݚՍ ϾϟᇅᔗңȄѯіҀҴშਫᓣᓣଊȂ20 Ȟ2ȟȂ23-35Ȅ ဩܒԚȞ2003ȟȄઢစᆪၰዂԓᔗңᇅᄃձȄᏑݔȄ