Semantic Based Clustering of Web Documents


Academic year: 2021


Tsau Young Lin; I-Jen Chiang (蔣以仁)

Abstract

A new methodology is developed that structures the semantics of a collection of documents into the geometry of a simplicial complex: a primitive concept is represented by a top-dimensional simplex, and a concept is represented by a connected component. Based on these structures, documents can be clustered into meaningful classes.

Experiments with three different data sets from web pages and medical literature have shown that the proposed unsupervised clustering approach performs significantly better than traditional clustering algorithms, such as k-means, AutoClass, and hierarchical clustering (HAC). This abstract geometric model seems to have captured the intrinsic semantics of the documents.
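The structure described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say how simplices are built from documents, so the sketch assumes they are given as sets of co-occurring keywords, and the names `maximal_simplices`, `connected_components`, `assign`, and the sample `term_sets` are all hypothetical. It keeps the top-dimensional (maximal) simplices as primitive concepts, merges simplices that share a vertex into connected components, and assigns a document to the component with which it shares the most terms.

```python
def maximal_simplices(simplices):
    """Keep only simplices that are not a proper face of another simplex."""
    sets = {frozenset(s) for s in simplices}
    return [s for s in sets if not any(s < t for t in sets)]


def connected_components(simplices):
    """Merge simplices that share a vertex, using union-find over vertices."""
    parent = {}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for s in simplices:
        vertices = list(s)
        for v in vertices:
            parent.setdefault(v, v)
        for v in vertices[1:]:
            root_a, root_b = find(vertices[0]), find(v)
            if root_a != root_b:
                parent[root_b] = root_a

    components = {}
    for v in parent:
        components.setdefault(find(v), set()).add(v)
    return list(components.values())


def assign(doc_terms, components):
    """Return the index of the component with the largest term overlap, or None."""
    best, best_overlap = None, 0
    terms = set(doc_terms)
    for i, comp in enumerate(components):
        overlap = len(comp & terms)
        if overlap > best_overlap:
            best, best_overlap = i, overlap
    return best


# Hypothetical keyword sets; assumed input, not data from the paper.
term_sets = [
    {"wall", "street"},
    {"wall", "street", "stock"},
    {"neural", "network"},
    {"network", "learning"},
    {"stock"},
]
top = maximal_simplices(term_sets)
concepts = connected_components(top)
```

Here `assign(["stock", "market"], concepts)` selects the component containing "stock", while a document with no matching terms gets `None`. Overlap counting is only one possible assignment rule, chosen for brevity.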


