4 Semantic Social Web Blog Portal

ORDER BY DESC (?Popularity)

By comparison, Technorati³ provides only limited, independent search services over the blog posts, tags, or directory entries that a user enters; users cannot issue semantic (social web) queries that reuse previous search results to retrieve other relevant outputs. For example, a user cannot search for the articles of his most influential blogger friend, nor for highly similar articles from bloggers whose SNA indegree measure exceeds a certain level.

In this research, a semantic social web blog portal was constructed to exploit the incentives of bridging Web 2.0 ↔ Web 3.0, so that users can enjoy semantic social web query services on the portal. The portal has a layered structure, shown in Fig. 1. In the bottom layer, a crawler collects semi-structured HTML blog pages, structured RSS or FOAF context information, and free tags. Both RSS 1.0 and the FOAF ontology schema are based on RDF(S), so their semantics are explicitly specified. We then extract the information collected by the crawler and store it in our local repository. In the ontology and tags annotation layer, we mash up the blog ontology and the topic ontology with the free tags collected from social web annotation by folksonomy. The blog information diffusion patterns are analyzed with the SNA software Pajek to derive important SNA measures, such as indegree, outdegree, closeness, betweenness, and k-cores [20]. Finally, we provide semantic social web query services that let users find the content that best matches their interests.

4.1 Data Collection

WRETCH is the biggest BSP platform in Taiwan, with more than 2 million registered bloggers, so a huge amount of living and recreation information was available for our experiments on bridging Web 2.0 ↔ Web 3.0. After insignificant noise data were filtered out, 108,518 useful blogger samples remained for our analysis. The data were collected over one month, from Sep. 9, 2006 to Oct. 9, 2006.

3 http://technorati.com/

Fig. 1. A layered conceptual schema for constructing a semantic social web blog portal

4.2 Data Analysis

In our mashup model, the free tags collected from users are usually 2-word or 3-word Chinese words (or characters) that annotate activities in their daily lives. Scanning and parsing Chinese tags differs from handling English free tags: there are no spaces between Chinese characters, so we use regular expressions to extract the meaningful, high-frequency 2-word and 3-word tags as the final consensus social web annotations of our folksonomy. Unsurprisingly, the distribution of the top 300 tags follows a power law, as in many other studies [12].
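The extraction step described above can be illustrated with a minimal sketch. It assumes that a "2-word" or "3-word" tag means a window of two or three consecutive Chinese characters and that only the basic CJK Unified Ideographs block matters; the function names, the sample tags, and the `min_count` threshold are hypothetical and are not taken from the portal's implementation.

```python
import re
from collections import Counter

# Runs of Chinese characters (basic CJK block only; an assumption of this sketch).
CJK_RUN = re.compile(r"[\u4e00-\u9fff]+")

def candidate_tags(text, sizes=(2, 3)):
    """Yield every 2- and 3-character window inside each run of Chinese characters."""
    for run in CJK_RUN.findall(text):
        for n in sizes:
            for i in range(len(run) - n + 1):
                yield run[i:i + n]

def frequent_tags(raw_tags, min_count=2):
    """Count candidate n-grams over all raw tags and keep those seen often enough."""
    counts = Counter()
    for raw in raw_tags:
        counts.update(candidate_tags(raw))
    return {tag: c for tag, c in counts.items() if c >= min_count}

# Hypothetical free tags; non-Chinese text such as "music" yields no candidates.
raw_tags = ["美食", "台北美食", "美食 旅遊", "旅遊日記", "music"]
result = frequent_tags(raw_tags)
print(result)
```

A frequency threshold is only one possible way to separate meaningful tags from accidental character windows; the text does not state which criterion was actually used.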

Initially, the tags that bloggers attach in WRETCH only imply that blog articles can be classified into one of 16 broad channel categories, such as living, cuisine, music, drama, and travel. When we examined the tags carefully, we were surprised to find that 54,824 bloggers (approximately 50% of the 108,518 bloggers), together with the 1,046 tags they used, converged on a set of high-frequency tags: 521 2-word tags and 197 3-word tags. These tags were evenly distributed over our 16 broad channel categories. This demonstrates that consensus social opinions can be formed through folksonomy tagging. We expect that a more powerful folksonomy annotation scheme can be realized in the near future, provided a more versatile ontology+tag structure is available.

4.3 Blog Ontology

The blog ontology describes the profile of a blogger together with his blog articles (see Fig. 2). The blogger profile closely resembles FOAF, which defines a blogger's personal ID, friend relationships, mbox, etc. The attributes of each blog article include the article title, date, feedback comments, trackbacks, etc. In addition, the SNA index measure is defined as one of a blogger's profile attributes. SNA analysis capabilities are thereby embedded in the blog ontology to serve our SNA+ontology+tag semantic social web query services.

Fig. 2. The blog ontology describes the profile of a blogger with his blog articles

The blog ontology is declared in the OWL ontology language, in which properties are of two types: object properties and datatype properties. For example, the domain and range of the has author object property are declared as the Archives class and the Persons class respectively, where Archives is the superclass of both the Articles and Categories classes. With this object property, we describe the abstract relationship between a blogger and his blog articles. A datatype property allows us to define concrete XML Schema attributes, such as SNA index, on the Profiles subclass for further arithmetic operations.
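To make the distinction between the two property types concrete, the following plain-Python sketch models the declarations named above as dictionaries and checks statements against them. It is a simplification under stated assumptions: the identifiers `has_author` and `sna_index` are hypothetical spellings, the SNA index is assumed to be a floating-point value, and domain and range are treated as validation checks, whereas an OWL reasoner would use them to infer class membership.

```python
# Class hierarchy from the text: Articles and Categories are subclasses of Archives.
SUBCLASS_OF = {"Articles": "Archives", "Categories": "Archives"}

# Object property: links an individual to another individual (domain, range).
OBJECT_PROPERTIES = {"has_author": ("Archives", "Persons")}

# Datatype property: links an individual to a literal value (domain, Python type).
# The float type stands in for an XML Schema numeric datatype (an assumption).
DATATYPE_PROPERTIES = {"sna_index": ("Profiles", float)}

def is_a(cls, ancestor):
    """True if cls equals ancestor or is a (transitive) subclass of it."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False

def valid_object_triple(subject_cls, prop, object_cls):
    """Check a statement between two individuals against domain and range."""
    domain, range_ = OBJECT_PROPERTIES[prop]
    return is_a(subject_cls, domain) and is_a(object_cls, range_)

def valid_datatype_triple(subject_cls, prop, value):
    """Check a statement whose object is a literal value."""
    domain, py_type = DATATYPE_PROPERTIES[prop]
    return is_a(subject_cls, domain) and isinstance(value, py_type)

# An article is an archive, so it may have an author who is a person.
print(valid_object_triple("Articles", "has_author", "Persons"))
# The SNA index is a number, which is what allows later arithmetic on it.
print(valid_datatype_triple("Profiles", "sna_index", 0.42))
```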

4.4 Topic Ontology

The blog articles in WRETCH are classified into one of 16 broad topic channels based on their attached tags. The design of this broad channel classification proceeds in three steps (see Fig. 3):

First, we subjectively declare the 16 broad topic channels, such as life, cuisine, and music, as instances of the Channel class, where Channel and Tag are subclasses of the Category superclass in the topic ontology. Second, the set of candidate tags for each channel consists of the high-frequency 2-word and 3-word tags used by bloggers. Third, if a new blog article carries at least one tag that matches a high-frequency tag in the set declared for a broad topic channel, the article is automatically classified into that channel.
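The third step amounts to a set-intersection test, which the sketch below illustrates. The channel names follow the examples in the text, but the tag sets are hypothetical samples, and returning every matching channel when an article matches more than one is an assumption, since the text does not say how such cases are resolved.

```python
# Steps 1 and 2: channels declared by hand, each with a set of high-frequency tags.
# The tags listed here are made-up examples, not the portal's actual tag sets.
CHANNEL_TAGS = {
    "cuisine": {"美食", "下午茶"},
    "travel": {"旅遊", "自助旅行"},
    "music": {"音樂", "演唱會"},
}

def classify(article_tags, channel_tags=CHANNEL_TAGS):
    """Step 3: return every channel sharing at least one tag with the article."""
    tags = set(article_tags)
    return sorted(ch for ch, known in channel_tags.items() if tags & known)

print(classify(["美食", "日記"]))   # one matching channel
print(classify(["旅遊", "音樂"]))   # two matching channels
print(classify(["日記"]))           # no high-frequency tag, so no channel
```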

4.5 Social Web Annotation

The goal of Web 1.0 annotation is to create a well-defined, computer-understandable structured knowledge base, e.g., ontologies, whose content mirrors that of the WWW. The biggest challenge in bridging Web 1.0 ↔ Web 3.0 is that terms mined from the Web cannot be fitted automatically and exactly into the ontology that defines the vocabulary of the target knowledge base [7]. Therefore, most semi-automatic annotation systems apply machine learning techniques to recognize new class instances and relation instances mined from the Web. In folksonomy annotation for bridging Web 2.0 ↔ Web 3.0, the granularity of class instances and relation instances is restricted to the resource targets that can be clearly tagged by folksonomy. The social web annotations of the folksonomy are either collected explicitly from tags or initiated implicitly by users through their activity events. These explicit tags and implicit events are precise terms that describe the instances and relations corresponding to our ontology schema.

Fig. 3. Topic ontology: Channel and Tag are subclasses of Category, so we can automatically mash up ontology data and the search model with the folksonomy tagging system services

The objective and granularity of tags for describing the instances and relations that correspond to the target resources can be further refined if a more elaborate social web annotation system becomes available in the future. As in Semantic Wikipedia [23], we might allow users to create semantic tags similar to typed links and attributes, two kinds of property that describe, respectively, abstract relationships and concrete attributes within or between entities. Various levels of reasoning for discovering semantic relationships among taggers, tags, and resources could then be achieved.

Our semantic social web annotation system takes three inputs, either collected by the web crawler or computed by a local software agent. The first is HTML blog pages with hyperlink, comment, and trackback context. The second is RSS context with permalink, publication date, author, and description attributes. The third is the tags, channels, and SNA indices computed by agents. All are stored in a local database and mashed up for the subsequent semantic social web query services (see Fig. 4).
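As an illustration of the second input and of the mashup step, the sketch below reads the four RSS attributes named above from a trimmed RSS 1.0 item and joins them with tags and an SNA index. The sample feed, the example.org URLs, the use of Dublin Core elements for author and date, and the function names are all assumptions made for this sketch; a complete RSS 1.0 document would also contain a channel element, and the portal's actual storage layer is not shown.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for RDF, RSS 1.0, and Dublin Core.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rss": "http://purl.org/rss/1.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical, trimmed feed with a single item.
SAMPLE_FEED = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <item rdf:about="http://example.org/blog/alice/1">
    <title>Night market notes</title>
    <link>http://example.org/blog/alice/1</link>
    <description>Street food in Taipei.</description>
    <dc:creator>alice</dc:creator>
    <dc:date>2006-09-15</dc:date>
  </item>
</rdf:RDF>"""

def parse_items(feed_xml):
    """Extract permalink, publication date, author, and description per item."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.findall("rss:item", NS):
        items.append({
            "permalink": item.findtext("rss:link", default="", namespaces=NS),
            "date": item.findtext("dc:date", default="", namespaces=NS),
            "author": item.findtext("dc:creator", default="", namespaces=NS),
            "description": item.findtext("rss:description", default="", namespaces=NS),
        })
    return items

def mash_up(items, tags_by_permalink, indegree_by_author):
    """Join RSS attributes with tags and an SNA index into one record per article."""
    return [
        {**it,
         "tags": tags_by_permalink.get(it["permalink"], []),
         "indegree": indegree_by_author.get(it["author"])}
        for it in items
    ]

records = mash_up(
    parse_items(SAMPLE_FEED),
    {"http://example.org/blog/alice/1": ["美食"]},
    {"alice": 12},
)
print(records[0])
```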

4.6 Semantic Social Web Blog Portal Testbed

Fig. 4. Semantic social web annotation from three input data sources, mashed up to enable semantic social web services

An online semantic social web blog portal testbed (see Fig. 5) was constructed on the basis of the layered conceptual schema described earlier (see Fig. 1) to experiment with our mashup model. The crawler collects all of the necessary context information from the WRETCH BSP. The context information shown in Fig. 4 is processed to create the relevant class and relation instances defined in the blog ontology and the topic ontology (see Sections 4.3 and 4.4). These semantic social web annotations for the folksonomy are generated automatically, except in the bootstrapping stage, where we have to analyze the blog-site-dependent context to specify our initial lightweight ontology schema. A variety of important SNA measures, such as indegree, closeness, betweenness, and k-core, are computed with the Pajek SNA software⁴ to provide the semantic social web query services shown in Section 3.3.
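The simplest of the SNA measures mentioned above, indegree and outdegree, can be illustrated without Pajek. The sketch below is a plain-Python stand-in, not the Pajek workflow used in this research; it assumes that the network is a list of directed (source, target) links between bloggers, that repeated links count once, and that the blogger names are hypothetical.

```python
from collections import defaultdict

def degree_measures(links):
    """Compute indegree and outdegree per blogger from directed (source, target) links."""
    indegree = defaultdict(int)
    outdegree = defaultdict(int)
    for src, dst in set(links):          # duplicates are counted once (an assumption)
        outdegree[src] += 1
        indegree[dst] += 1
    nodes = {n for edge in links for n in edge}
    return {n: {"indegree": indegree[n], "outdegree": outdegree[n]}
            for n in sorted(nodes)}

# Hypothetical links, e.g. "carol links to bob's blog".
links = [("alice", "bob"), ("carol", "bob"), ("bob", "alice"),
         ("carol", "alice"), ("carol", "bob")]
measures = degree_measures(links)

# Rank bloggers by descending indegree, breaking ties by name.
ranked = sorted(measures, key=lambda n: (-measures[n]["indegree"], n))
print(measures)
print(ranked)
```

Closeness, betweenness, and k-core require shortest-path and core-decomposition computations and are left to dedicated SNA tools such as Pajek.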

Fig. 5. The semantic social web blog portal for experimenting with our Web 2.0 ↔ Web 3.0 bridging model

4 http://valado.fmf.uni-lj.si/pub/networks/pajek/

5 Conclusions

The goal of this research is to exploit the incentives of bridging Web 2.0 ↔ Web 3.0 by building a semantic social web blog portal. On Web 2.0, we usually use a tagging system to label all kinds of Internet resources. Web 2.0 is a folksonomy-based social web, where we effectively search for the information we want through tags. The tagging system enables the wisdom of crowds, and, surprisingly, social consensus can be derived from these voluminous and irregular tags. In contrast, Web 3.0 (the semantic web) aims to use ontologies for effective information search under taxonomic classification. We have shown that folksonomy and taxonomy can be mashed up to achieve semantic social web query services by bridging Web 2.0 ↔ Web 3.0. This lets us leverage the search capabilities of both techniques: bottom-up folksonomy indexing and top-down taxonomy ontology.

Conceptually, tags in the tagging system are equivalent to the terms mined from the WWW in a conventional annotation system. Terms mined from the Web are usually defined as instances related to a particular class or property in the ontology, whereas tags from the folksonomy are usually instances related only to a particular class. Therefore, all relation instances have to be created dynamically following the ontology schema. The relation instances that describe the relationships between bloggers, tags, and blogs are generated from bloggers' daily activity events based on our blog ontology. Although users can search for information effectively with a folksonomy tagging system in Web 2.0, search capability can still be improved through social network analysis (SNA). A real SNA-based semantic social web query service could encourage users to find what they are really interested in, because well-organized, topic-specific ranked content is ready for them to enjoy.

Acknowledgements

This research was partially supported by the Taiwan National Science Council (NSC) under Grant No. NSC 95-2221-E-004-001-MY3.

References

1. Ali-Hasan, N. and Adamic, L. A., Expressing Social Relationships on the Blog through Links and Comments. http://www-personal.umich.edu/~ladamic.

2. Berners-Lee, Tim, et al. (2001). The Semantic Web. Scientific American, May.

3. Bojars, U., Breslin, J. G., and Moller, K. (2006). Using Semantics to Enhance the Blogging Experience. Proceedings of the 3rd European Semantic Web Conference (ESWC 2006), 679-696.

4. Brooks, C. H. and Montanez, V. (2006). Improved Annotation of the Blogosphere via Autotagging and Hierarchical Clustering. WWW 2006, May 23-26, Edinburgh, Scotland.

5. Cayzer, S. (2004). Semantic Blogging: Spreading the Semantic Web Meme. XML Europe, Apr. 18-21, Amsterdam.

6. Chin, A. and Chignell, M. (2006). A Social Hypertext Model for Finding Community in Blogs. HyperText (HT’06), Aug. 22-25, Odense, Denmark.

7. Craven, M., et al. (2000). Learning to construct knowledge bases from the World Wide Web. Artificial Intelligence, 118(1-2), Elsevier, 69-113.

8. Dill, S., et al. (2003). A case for automated large-scale semantic annotation. Journal of Web Semantics, 1(1), 115-132.

9. Ding, L., et al. (2005). How the Semantic Web is Being Used: An Analysis of FOAF Documents. Proceedings of the 38th Hawaii International Conference on System Sciences.

10. Gruber, T. Ontology of Folksonomy: A Mash-Up of Apples and Oranges. http://tomgruber.org.

11. Gruhl, D. et al. (2004). Information Diffusion Through Blogspace. WWW 2004, May 17-22, New York, USA.

12. Hotho, A., et al. (2006). Information Retrieval in Folksonomies: Search and Ranking. Proceedings of the 3rd European Semantic Web Conference (ESWC 2006).

13. Karger, D. R. and Quan, D. (2004). What Would It Mean to Blog on the Semantic Web? The Third International Semantic Web Conference (ISWC 2004), Springer-Verlag.

14. Kiryakov, A., Popov, B., and Terziev, I. (2004). Semantic Annotation, Indexing, and Retrieval. Journal of Web Semantics, 2(1), 49-79.

15. Kumar, R., Novak, J., Raghavan, P., and Tomkins, A. (2004). Structure and evolution of blogspace. Comm. of the ACM, 47(12).

16. Markoff, J. (2006). Entrepreneurs See a Web Guided by Common Sense. New York Times, Nov. 12, http://www.nytimes.com.

17. Marlow, C. et al. (2006). HT06, Tagging Paper, Taxonomy, Flickr, Academic Article, ToRead, HyperText (HT’06), Aug. 22-25, Odense, Denmark.

18. Matsuo, Y., Mori, J., and Hamasaki, M. (2006). POLYPHONET: An Advanced Social Network Extraction System from the Web. WWW 2006, May 23-26, Edinburgh, Scotland.

19. Mika, P. (2005). Flink: Semantic Web Technology for the Extraction and Analysis of Social Networks. Journal of Web Semantics, 3(2-3), 211-223.

20. de Nooy, W., Mrvar, A., and Batagelj, V. (2006). Exploratory Social Network Analysis with Pajek. Cambridge University Press.

21. O’Reilly, Tim. (2005). What is Web 2.0: Design Patterns and Business Models for the Next Generation of Software. http://www.oreillynet.com/lpt/a/6228.

22. Uren, V., et al. (2006). Semantic annotation for knowledge management: Requirements and a survey of the state of the art. Journal of Web Semantics, 4(1), 14-28.

23. Völkel, M., et al. (2006). Semantic Wikipedia. WWW 2006, May 23-26, Edinburgh, Scotland.

24. Wasserman, S. and Faust, K. (1994). Social Network Analysis: Methods and Applications. Cambridge University Press.
