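National Science Council, Executive Yuan: Research Project Final Report
Semantics-enabled Policies for Data Sharing and Protection in the Cloud
Research Final Report (Condensed Version)
Project type: Individual
Project number: NSC 99-2221-E-004-010-
Project period: August 1, 2010 to July 31, 2011
Host institution: Department of Computer Science, National Chengchi University
Principal investigator: Yuh-Jong Hu (胡毓忠)
Project participants: Master's student, part-time assistant: 楊協達
Master's student, part-time assistant: Jiun-Jan Yang (楊竣展)
Doctoral student, part-time assistant: 吳穩男
Report attachment: Report on overseas research
Disposition: This project involves patents or other intellectual property rights; open to public inquiry after 2 years
October 4, 2011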
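National Science Council, Executive Yuan: Final Report of a Funded Research Project
Semantics-enabled Policies for Data Sharing and Protection in the Cloud
Project type: ■ Individual project □ Integrated project
Project number: NSC 99-2221-E-004-010-
Project period: August 1, 2010 to July 31, 2011
Host institution and department: Department of Computer Science, National Chengchi University
Principal investigator: Yuh-Jong Hu (胡毓忠)
Co-principal investigator:
Project participants: 吳穩男, Jiun-Jan Yang (楊竣展), 楊協達, 鄭迪嶸
Report type (as required by the approved budget list): ■ Condensed report □ Full report
In addition to this final report, the following overseas trip reports must be submitted:
■ Report on an overseas business trip or study
□ Report on a business trip or study in Mainland China
■ Report on attendance at an international academic conference
□ Overseas research report for an international cooperative research project
Disposition:
Open to public inquiry immediately, except for controlled projects and the following case:
□ Involves patents or other intellectual property rights; open to public inquiry after ■ one year □ two years
September 28, 2011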
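Self-Evaluation Form for the Final Report of an NSC-Funded Research Project
Please give an overall assessment of: how closely the research matches the original plan; how far the expected goals were achieved; the academic or applied value of the results (briefly describe their significance, value, impact, or potential for further development); whether the results are suitable for publication in academic journals or for patent application; and the main findings or other relevant value.
1. Overall assessment of how closely the research matches the original plan and how far the expected goals were achieved
■ Goals achieved
□ Goals not achieved (please explain, within 100 characters)
□ Experiment failed
□ Experiment interrupted
□ Other reasons
Explanation:
2. Publication of the results in academic journals, patent applications, and so on:
Papers: ■ Published □ Unpublished manuscript □ In preparation □ None
Patents: □ Granted □ Pending □ None
Technology transfer: □ Transferred □ Under negotiation □ None
Other: (within 100 characters)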
1. Hu, Y. J., W. N. Wu, J. J. Yang, “Semantics-enabled Policies for Super-Peer Data
Integration and Protection”, International Journal of Computer Science and Applications
Technomathematics Research Foundation, 2011 (submitted).
2. Hu, Y. J., W. N. Wu, J. J. Yang, "Semantics-enabled Policies for Information Sharing
and Protection in the Cloud", 3rd Int. Conference on Social Informatics (SocInfo2011),
Oct. 6-8, 2011, Singapore, LNCS, Springer.
3. Hu, Y. J. and Jiun-Jan Yang , "A Semantic Privacy-Preserving Model for Data Sharing
and Integration", International Conference on Web Intelligence, Mining and Semantics
(WIMS'11), May 25-27, 2011, Norway, ACM Press.
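4. Hu, Y. J., "Latest Developments in Cloud Computing: Building a Digital Library in the Cloud", Symposium on Digital Resource Management and Cloud Library Automation (數位資源管理與雲端圖書館自動化研討會), Nov. 5, 2010 (invited).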
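3. Please assess the academic or applied value of the research results in terms of academic achievement, technical innovation, and social impact (briefly describe their significance, value, impact, or potential for further development) (within 500 characters)
We have pursued academic research on the Semantic Web ever since Tim Berners-Lee (TBL) began promoting the concept in 2001. Through a series of NSC research projects over the years, we have produced a number of concrete results. We use the two knowledge-base systems of the Semantic Web, ontologies and rules, to represent the legal regulations that people use, with the aim of letting computers interpret them automatically without semantic errors. Initially, the legal regulations we targeted concerned copyright management: expressing its concepts and automating enforcement of the corresponding laws. In recent years we have shifted the focus to expressing and enforcing the legal concepts of personal data and privacy protection. With the rise of cloud computing in the past few years, ensuring security and data protection on cloud platforms has become our research focus. Because existing personal data and privacy protection laws do not fully apply to cloud platforms, representing personal data protection law with ontologies and rule languages and enforcing it automatically on cloud platforms is an interesting research problem that needs to be solved. We start from the problem of data integration. Although this problem has been studied for some time in the database community, the rise of cloud platforms has brought it renewed attention. In particular, integrating large volumes of cloud data and then protecting the integrated data so that it complies with personal data protection regulations is a major research challenge. Our approach is to realize this with a peer data management system (PDMS). In addition, when data on cloud platforms must be integrated across domains, integrating the personal data protection regulations of different jurisdictions is another major challenge. Our two-year NSC project on this topic, "cross-domain data integration and protection in the cloud", has been approved, and the research has begun.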
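Condensed Report
Semantics-enabled Policies for Data Sharing and Protection in the Cloud
Project number: NSC 99-2221-E-004-010
Project period: 2010/08/01-2011/07/31
Abstract (Chinese). Cloud computing platforms provide convenient and flexible computing that makes information integration on the web easier. We explore the integration of information technology and law and use semantics-enabled policies to model legal regulations in the cloud. The semantics-enabled policies for information integration and protection are composed of ontologies and rules. Ontologies represent the abstract concepts of information integration and protection, and rules enforce those concepts once the ontologies have been constructed. The challenges we face are how to legalize semantics-enabled policies on cloud computing platforms, how to narrow the gap in expression between law and computer policies so that the computer representation is free of semantic ambiguity, and how to resolve the legal conflicts that arise when regulations from multiple jurisdictions are integrated.
Keywords (Chinese): semantics-enabled policies, cloud computing, data integration and protection, ontology and rule, Semantic Web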
Abstract. Cloud computing platforms provide utility computing that gives people convenient and flexible information sharing services on the web. We investigate the interdisciplinary area of information technology and law and use semantics-enabled policies to model legal regulations in the cloud. The semantics-enabled policies for information sharing and protection are represented as a combination of ontologies and rules that capture the concepts of security and privacy laws. Ontologies are abstract knowledge representations of information sharing and protection, extracted manually from the data sharing and protection laws. Rules provide further enforcement power once the ontologies have been constructed. The emerging challenges of legalizing semantics-enabled policies for laws in the cloud are: to narrow the gap between semantics-enabled policies and laws, to avoid ambiguity in the policy representation, and to resolve possible conflicts between policies when they must integrate laws from multiple jurisdictions.
Keywords: semantics-enabled policies, cloud computing, data integration and protection, ontology and rule, Semantic Web
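I. Introduction
This project extends the principal investigator's earlier results on Semantic Web policies and carries the technology over to cloud computing platforms to address data integration and protection in the cloud. Cloud computing works by outsourcing computer systems and services. It greatly reduces the cost of purchasing and maintaining computers, and it offers flexible and convenient computing resources, so it has become a major trend in the IT industry. Its biggest challenge at present, however, is protecting data once it has been outsourced. This project therefore aims to use the two knowledge systems of the Semantic Web, ontologies and rules, to represent the regulatory concepts of personal data protection law and to enforce them concretely on a cloud computing platform.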
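II. Research Objectives
We use a knowledge system that integrates Semantic Web ontologies and rules to represent the legal concepts of personal data protection law, and we enforce them on a cloud computing prototype system. The integration of ontologies and rules addresses cross-platform information integration and, at the same time, information protection. It must further support wide-scale information integration and protection.
III. Literature Review
The citations below refer to the references in Section VII. Information integration is a research problem in urgent need of a solution [19]. Security and data protection in the cloud are also very important research issues [1][2]. They extend the results of earlier research on personal data protection [3][5-6]. We build our cloud platform on the OpenTC architecture [8][18]. Data integration was modeled with logic by Calvanese as early as the mid-2000s [21]. What he proposed, however, is a two-tier architecture for data integration alone, whereas we propose a three-tier architecture for data integration and protection [7]. There is a considerable body of research on semantics-enabled policies [4][11][12][14-17]. For the use of Semantic Web technology in counter-terrorism and crime detection and prevention, see [9-10].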
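IV. Research Method
We build our cloud prototype platform on the OpenTC architecture ( http://www.opentec.net/ ) from the EU FP6 programme. We first propose a three-tier framework for data integration and protection. The bottom tier consists of structured relational database systems. The middle tier consists of local ontologies and rules. The top tier is an integrated data-use platform that combines the local ontologies and rules. We therefore have to solve the mapping, alignment, and merging of the local ontologies. Data integration has long been studied for traditional databases, but using ontologies to integrate structured data is still at an early stage. The advantage of an ontology is that abstract concepts can express the properties of personal data protection law that data integration must respect, such as the purpose of data use, the identity of the user, the relationship between rights and obligations, and the structure of the data itself. We adapt the existing methods for mapping database schemas to map ontology schemas, namely global-as-view (GAV), local-as-view (LAV), and global-local-as-view (GLAV). In the three-tier framework, GAV and LAV can thus be applied both to the schema mapping between ontologies and databases and to the schema mapping between local ontologies and the integrated ontology. Rules, in turn, are used to enforce data protection and to carry out queries. These results were presented at the WIMS'11 conference in May 2011 (paper [3] in Section VI).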
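The GAV idea and the role of rules described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the implementation from the WIMS'11 paper: the two source tables, their attribute names, and the purposes `treatment` and `research` are hypothetical, and plain Python functions and dictionaries stand in for ontologies and a rule language.

```python
# Hypothetical bottom-tier sources: two relational tables with different schemas.
HOSPITAL_A = [
    {"pid": "A1", "name": "Alice", "diagnosis": "flu"},
    {"pid": "A2", "name": "Bob", "diagnosis": "asthma"},
]
CLINIC_B = [
    {"patient_no": "B7", "full_name": "Carol", "dx": "flu"},
]


def global_patient_view():
    """GAV mapping: the global 'Patient' concept is defined as a view
    (here, a union of renamed projections) over the local sources."""
    for row in HOSPITAL_A:
        yield {"id": row["pid"], "name": row["name"], "diagnosis": row["diagnosis"]}
    for row in CLINIC_B:
        yield {"id": row["patient_no"], "name": row["full_name"], "diagnosis": row["dx"]}


# Hypothetical protection rule: the attributes each purpose of use may see.
ALLOWED_ATTRIBUTES = {
    "treatment": {"id", "name", "diagnosis"},
    "research": {"diagnosis"},
}


def query_patients(purpose):
    """Answer a query on the global view, keeping only the attributes
    that the protection rule permits for the stated purpose."""
    allowed = ALLOWED_ATTRIBUTES.get(purpose, set())
    if not allowed:
        return []  # unknown purpose: disclose nothing
    return [
        {key: value for key, value in record.items() if key in allowed}
        for record in global_patient_view()
    ]
```

Calling `query_patients("research")` returns only the diagnosis of each patient, while `query_patients("treatment")` returns full records. Under LAV the direction would be reversed: each source would be described as a view over the global schema, which makes adding sources easier but query answering harder.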
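To address large-scale data integration and protection, we also draw on earlier peer data management systems (PDMS), such as Piazza, to integrate and protect data from multiple sources. We propose a super-peer data integration management system architecture and examine how wide-scale data integration and protection can be carried out on it. This super-peer architecture extends the results of our WIMS'11 paper. We have written the idea up as a paper and submitted it to the International Journal of Computer Science and Applications (paper [1] in Section VI).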
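To make the super-peer idea more concrete, the following is a rough sketch of a generic super-peer topology, assuming that each super-peer answers a query from its own peers and forwards it once to neighboring super-peers. The class names, the record layout, and the forwarding strategy are assumptions for illustration only; the sketch covers query routing and leaves out the ontology mappings and protection rules of the submitted paper.

```python
class Peer:
    """A data source holding local records (hypothetical layout)."""

    def __init__(self, name, records):
        self.name = name
        self.records = records

    def query(self, predicate):
        return [record for record in self.records if predicate(record)]


class SuperPeer:
    """Mediates queries for its own peers and forwards them to neighbors."""

    def __init__(self, name):
        self.name = name
        self.peers = []
        self.neighbors = []

    def query(self, predicate, visited=None):
        # 'visited' prevents a query from looping between super-peers.
        visited = set() if visited is None else visited
        if self.name in visited:
            return []
        visited.add(self.name)
        results = []
        for peer in self.peers:
            results.extend(peer.query(predicate))
        for neighbor in self.neighbors:
            results.extend(neighbor.query(predicate, visited))
        return results


# Usage: two super-peers that know each other, each with one peer.
sp_a, sp_b = SuperPeer("A"), SuperPeer("B")
sp_a.peers.append(Peer("a1", [{"id": 1, "dx": "flu"}]))
sp_b.peers.append(Peer("b1", [{"id": 2, "dx": "flu"}, {"id": 3, "dx": "asthma"}]))
sp_a.neighbors.append(sp_b)
sp_b.neighbors.append(sp_a)
flu_cases = sp_a.query(lambda record: record["dx"] == "flu")
```

Grouping peers under super-peers keeps the number of schema mappings manageable: mappings are needed between a super-peer and its peers and between neighboring super-peers, not between every pair of peers.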
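We also completed a paper on applying semantics-enabled policies to information sharing and protection on cloud platforms, which was accepted as a full paper at SocInfo 2011 (paper [2] in Section VI). In it we propose a three-layer platform for cross-domain information integration and protection in the cloud. The bottom layer is the Cloud Machine Domain (CMD) layer, the middle layer is the Cloud Virtual Domain (CVD) layer, and the top layer is the Cloud Legalized Domain (CLD) layer. The ontologies and rules of semantics-enabled policies are applied at the CLD layer. The technique used for information integration and protection at this legalized layer is the one presented in our WIMS'11 paper. As an example, we use personal data protection law and national security law to govern data disclosure on a cloud platform, and show how personal information can be disclosed in stages so that personal privacy is not infringed.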
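Staged disclosure can be pictured with a short sketch. The stages, their conditions, and the record attributes below are all hypothetical and are not taken from the SocInfo 2011 paper; the point is only that each further stage releases more identifying attributes and requires an additional condition to hold.

```python
# Hypothetical personal record held on the cloud platform.
RECORD = {
    "name": "Alice",
    "national_id": "X123",
    "city": "Taipei",
    "travel_history": ["TPE-OSL"],
}

# Hypothetical stages, ordered from least to most revealing. Each stage names
# the condition it requires and the attributes it releases.
STAGES = [
    {"requires": None, "attributes": {"city"}},
    {"requires": "suspicion_confirmed", "attributes": {"city", "travel_history"}},
    {
        "requires": "legal_authorization",
        "attributes": {"city", "travel_history", "name", "national_id"},
    },
]


def disclose(record, satisfied_conditions):
    """Release the attributes of the highest stage reachable in order.
    A stage is reachable only if every earlier stage was reachable too."""
    released = set()
    for stage in STAGES:
        required = stage["requires"]
        if required is not None and required not in satisfied_conditions:
            break
        released = stage["attributes"]
    return {key: value for key, value in record.items() if key in released}
```

With no conditions met, `disclose(RECORD, set())` releases only the city. Identifying attributes appear only when both later conditions hold, so a requester cannot skip the intermediate stage.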
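V. Results and Discussion (including conclusions and recommendations)
Building on our earlier work on semantics-enabled policies, we extended them to data integration and protection, and then applied them to integration and protection on cloud computing platforms. The past year has therefore been productive. We published two papers at international conferences whose proceedings are included in the ACM (WIMS'11) and Springer (SocInfo2011) digital libraries. We also extended the WIMS'11 paper and submitted it to the IJCSA journal. This work forms the basis for deeper research in the two-year NSC project that has been approved and is now under way: Data Integration and Protection for the Inter-Domain Data Cloud (NSC 100-2221-E-004-011-MY2, 2011/08/01-2013/07/31).
VI. Research content and results. See the list of published or submitted papers below. The papers themselves are in the attachments.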
1. Hu, Y. J., W. N. Wu, J. J. Yang, "Semantics-enabled Policies for Super-Peer Data Integration and Protection", International Journal of Computer Science and Applications, Technomathematics Research Foundation, 2011 (submitted).
2. Hu, Y. J., W. N. Wu, J. J. Yang, "Semantics-enabled Policies for Information Sharing and Protection in the Cloud", 3rd Int. Conference on Social Informatics (SocInfo2011), Oct. 6-8, 2011, Singapore, LNCS, Springer.
3. Hu, Y. J. and Jiun-Jan Yang, "A Semantic Privacy-Preserving Model for Data Sharing and Integration", International Conference on Web Intelligence, Mining and Semantics (WIMS'11), May 25-27, 2011, Norway, ACM Press.
4. Hu, Y. J., "Latest Developments in Cloud Computing: Building a Digital Library in the Cloud", Symposium on Digital Resource Management and Cloud Library Automation (數位資源管理與雲端圖書館自動化研討會), Nov. 5, 2010 (invited).
VII. References
1. Bruening, J.P., Treacy, B.C.: Cloud computing: privacy, security challenges. Privacy & Security Law
Report (2009)
2. Takabi, H., et al.: Security and privacy challenges in cloud computing environments. IEEE Security &
Privacy 8 (2010) 24–31
3. Antón, I.A., et al.: A roadmap for comprehensive online privacy policy management. Comm. of
the ACM 50 (2007) 109–116
4. Vimercati, S.D.C.d., et al.: Second research report on next generation policies, project deliverable
D5.2.2. Technical report, PrimeLife (2010)
5. Ardagna, A.C., et al.: A privacy-aware access control system. Journal of Computer Security 16 (2008)
369–397
6. Karjoth, G., et al.: Translating privacy practices into privacy promises - how to promise what you can
keep. In: POLICY’03, IEEE (2003)
7. Hu, Y.J., Yang, J.J.: A semantic privacy-preserving model for data sharing and integration. In:
International Conference on Web Intelligence, Mining and Semantics (WIMS’11), Norway, ACM
(2011)
8. Cabuk, S., et al.: Towards automated security policy enforcement in multi-tenant virtual data centers.
Journal of Computer Security 18 (2010) 89–121, Popp, R., Poindexter, J.: Countering terrorism
through information and privacy protection technologies. IEEE Security & Privacy 4 (2006)
9. In Popp, L.R., Yen, J., eds.: Emergent Information Technologies and Enabling Policies for
Counter-Terrorism. Wiley (2005)
10. Kettler, B., et al.: Facilitating information sharing across intelligence community boundaries using
knowledge management and semantic web technologies. In Popp, L.R., Yen, J., eds.: Emergent
Information Technologies and Enabling Policies for Counter-Terrorism. Wiley (2005) 175–195
12. Bonatti, P., Olmedilla, D.: Policy language specification, enforcement, and integration. project
deliverable D2, working group I2. Technical report, REWERSE (2005)
13. Gruber, T.R.: A translation approach to portable ontology specifications. Knowledge Acquisition 5
(1993)
14. Kagal, L., et al.: Using semantic web technologies for policy management on the web. In: 21st
National Conference on Artificial Intelligence (AAAI), AAAI (2006)
15. Tonti, G., et al.: Semantic web languages for policy representation and reasoning: A comparison of
KAoS, Rei, and Ponder. In: 2nd International Semantic Web Conference (ISWC) 2003. LNCS
2870 (2003) 419–437
16. Hu, Y.J., Boley, H.: SemPIF: A semantic meta-policy interchange format for multiple web policies.
In: 2010 IEEE/WIC/ACM Int. Conference on Web Intelligence and Intelligent Agent Technology,
IEEE (2010) 302–307
17. Hosmer, H.H.: Metapolicies I. ACM SIGSAC Review 10 (1992) 18–43
18. Berger, S., et al.: Security for the cloud infrastructure: Trusted virtual data center implementation.
IBM Journal of Research and Development (2009) 6:1–6:12
19. Bernstein, A.P., Haas, L.M.: Information integration in the enterprise. Comm. of the ACM 51
(2008) 72–79
20. Clifton, C., et al.: Privacy-preserving data integration and sharing. In: Data Mining and Knowledge
Discovery, ACM (2004) 19–26
21. Calvanese, D., Giacomo, G.D.: Data integration: A logic-based perspective. AI Magazine 26 (2005)
59–70
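Promotion Data Sheet for R&D Results Derived from an NSC-Funded Project
Date: September 28, 2011
NSC-funded project
Project title: Semantics-enabled Policies for Data Sharing and Protection in the Cloud
Principal investigator: Yuh-Jong Hu (胡毓忠)
Project number: NSC 99-2221-E-004-010-
Field: Information technology
Name of R&D result
(Chinese) 語意式規範於雲端資料整合與保護
(English) Semantics-enabled Data Integration and Protection in the Cloud
Owning institution: National Chengchi University
Inventor (creator): Yuh-Jong Hu (胡毓忠)
Technical description
(Chinese) Cloud computing platforms provide convenient and flexible computing that makes information integration on the web easier. We explore the integration of information technology and law and use semantics-enabled policies to model legal regulations in the cloud. The semantics-enabled policies for information integration and protection are composed of ontologies and rules. Ontologies represent the abstract concepts of information integration and protection, and rules enforce those concepts once the ontologies have been constructed. The challenges we face are how to legalize semantics-enabled policies on cloud computing platforms, how to narrow the gap in expression between law and computer policies so that the computer representation is free of semantic ambiguity, and how to resolve the legal conflicts that arise when regulations from multiple jurisdictions are integrated.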
Cloud computing platforms provide utility computing that gives people convenient and flexible information sharing services on the web. We investigate the interdisciplinary area of information technology and law and use semantics-enabled policies to model legal regulations in the cloud. The semantics-enabled policies for information sharing and protection are represented as a combination of ontologies and rules that capture the concepts of security and privacy laws. Ontologies are abstract knowledge representations of information sharing and protection, extracted manually from the data sharing and protection laws. Rules provide further enforcement power once the ontologies have been constructed. The emerging challenges of legalizing semantics-enabled policies for laws in the cloud are: to narrow the gap between semantics-enabled policies and laws, to avoid ambiguity in the policy representation, and to resolve possible conflicts between policies when they must integrate laws from multiple jurisdictions.
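Scope of application of the technology/product
Personal data integration and protection; value-added cloud computing services
Feasibility and expected benefits of technology transfer
This research is ongoing. Technology transfer will follow once the prototype of the cloud data integration and protection platform has been built, which is expected within the next 1-2 years. The expected benefit is stronger value-added services for data integration, sharing, and protection in existing cloud computing environments.
Note: If no patent has yet been applied for on this R&D result, do not disclose the main patentable content.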
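Report on Attendance at an International Academic Conference under an NSC-Funded Research Project
Date: September 28, 2011
Project number: NSC 99-2221-E-004-010-
Project title: Semantics-enabled Policies for Data Sharing and Protection in the Cloud
Traveler: Yuh-Jong Hu (胡毓忠)
Affiliation and position: Department of Computer Science, National Chengchi University
Conference dates: May 25, 2011 to May 27, 2011
Conference venue: Sogndal, Norway
Conference name (Chinese): 2011 Web 智慧、探勘與語意國際研討會
Conference name (English): International Conference on Web Intelligence, Mining and Semantics (WIMS'11)
Paper title: A Semantic Privacy-Preserving Model for Data Sharing and Integration (提供資料分享與整合的語意隱私權保護模式)
1. Conference participation
See the overseas trip report.
2. Impressions of the conference
See the overseas trip report.
3. Study visits (omit if none)
See the overseas trip report.
4. Titles and contents of materials brought back
See the overseas trip report.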
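Report on an Overseas (or Mainland China) Business Trip or Study under an NSC-Funded Research Project
Date: September 28, 2011
Project number: NSC 99-2221-E-004-010-
Project title: Semantics-enabled Policies for Data Sharing and Protection in the Cloud
Traveler: Yuh-Jong Hu (胡毓忠)
Affiliation and position: Department of Computer Science, National Chengchi University
Travel dates: May 25, 2011 to May 27, 2011
Destination: Sogndal, Norway
1. Overseas research activities
This project was originally proposed for three years, but after review only one year was approved. The proposal had planned to join the EU FP6 project IMPACT, led by Dr. Thomas F. Gordon, for international research and cooperation over the three years, taking part each year in the project's discussions and in the publication of its results. In later contacts and discussions, however, not all participants in IMPACT agreed to our joining their FP7 IMPACT EU project. We therefore had to look for other channels to take part in EU FP6/FP7 projects.
During the project period, the principal investigator submitted results from this project to the International Conference on Web Intelligence, Mining and Semantics (WIMS) 2011. Our paper, "A Semantic Privacy-Preserving Model for Data Sharing and Integration", was accepted as a full paper. The conference received 170 submissions and accepted only 50 as full papers, an acceptance rate of 29.41%. Six papers were submitted from Taiwan, and ours was the only one accepted. The proceedings are included in the ACM Digital Library, which indicates a certain level of quality. Because the conference was held in Norway and the principal investigator also served on its program committee, he took the opportunity of presenting the paper to pursue the cooperation on EU FP6/FP7 projects that had originally been planned.
WIMS-2011 was held from May 25 to May 27, 2011. The principal investigator and his master's student Jiun-Jan Yang (楊竣展) traveled to the venue in Sogndal ( http://wims.vestforsk.no/ ), a small town in Norway about 40 minutes by air from the capital, Oslo. The presenters came mainly from well-known European universities, such as the University of Oxford in the UK, Leipzig University in Germany, and the University of Vienna in Austria, as well as universities in Spain and Italy. The principal investigator presented our paper in a session on the first day and then chaired the following session. The keynote speakers were well-known researchers in the Semantic Web, information retrieval, and data mining, including James Hendler of RPI in the US, Peter Mika of Yahoo in Spain, and Sören Auer of Leipzig University in Germany. From their talks we learned that Semantic Web technology is currently moving toward Linked Open Data (LOD), which emphasizes lightweight semantics rather than traditional heavyweight semantics. This direction was chosen mainly because lightweight semantics is what current industry practice will accept. Notably, LOD has already evolved into a second generation, LOD2, within EU projects, with good research results and applications.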
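WIMS-2011, organized by the Western Norway Research Institute, was very rewarding. We were able to talk face to face with well-known researchers in the Semantic Web, information retrieval, and data mining, and we also got to know the local culture and scenery of Norway. As for adjusting our research direction, we realized that our future work on data integration and protection in the cloud must consider lightweight semantic integration and protection based on LOD2. This bottom-up, LOD2-based approach differs from traditional centralized, top-down, heavyweight semantic data integration. On the technical side of information integration, besides solving schema mapping and integration, we must also consider integrating the instances contained in the ontology (or database) schemas; only then can the problem of information integration and protection be solved thoroughly.
To handle data integration and protection at the scale of the Internet or the Web, we must introduce a peer-to-peer architecture for data integration and protection. Our information integration and protection in the cloud will therefore adopt a peer-to-peer integration architecture as the basis for deploying our semantics-enabled policies. Finally, through exchanges with the conference organizer, Rajendra Akerkar of the Western Norway Research Institute, Norway, we are now discussing how to join an EU FP7 project related to our research topic. This is another important outcome of attending WIMS'11, beyond presenting the paper and learning from the conference.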
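2. Research results
Attachments:
1. Manuscript of the published paper: A Semantic Privacy-Preserving Model for Data Sharing and Integration
2. WIMS-2011 conference statistics (provided to WIMS'11 program committee members only)
3. WIMS-2011 conference program (information provided to conference participants)
4. Email correspondence with the WIMS-2011 organizer, Rajendra Akerkar
5. NSC letter approving the combination of this conference presentation and academic exchange in place of the originally approved overseas study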
[Photo: group photo of all WIMS'11 participants]
[Photo: the principal investigator presenting the paper at WIMS'11]
International Conference on Web Intelligence, Mining and Semantics
May 25 – 27, 2011 | Sogndal, Norway
wims.vestforsk.no
Program Committee Chair Report
The International Conference on Web Intelligence, Mining and Semantics (WIMS'11) will take place under the auspices of Western Norway Research Institute from May 25‐27, 2011 in Sogndal, Norway.
We have just completed the review process for the submitted papers. This report gives a summary of the overall review process.
In my capacity as Program Committee Chair I personally thank those of you who were able to review submissions for the WIMS’11 conference. Thanks for devoting your valuable time to the WIMS’11 review activity. Though it is a new conference, WIMS’11 is very fortunate to have such enthusiastic authors, PC members, Advisory members, and all others who share our WIMS vision of building a remarkable research conference about intelligent approaches to transform the World Wide Web into a global reasoning and semantics‐driven computing machine.
I do hope you and your group members can make time in your busy schedule to attend the
conference. Thanks!
We are looking forward to meeting you in Sogndal!
‐ Rajendra Akerkar
The Program Committee
Adrian Giurca
Alexander Gelbukh
Alfio Massimiliano Gliozzo
Alejandro Jaimes
Alessandro Provetti
Alex Conconi
Alexander Mädche
Alexandre Passant
Andre C P L F de Carvalho
Andreas Blumauer
Anna Fensel
Arjen P. de Vries
Christoph Lange
Costin Badica
Daniel Lemire
David Camacho
Debajyoti Mukhopadhyay
Diana Santos
Dietrich Rebholz‐Schuhmann
Dragan Gasevic
Eetu Mäkelä
Elena Zamsa
Eriks Sneiders
Esma Aimeur
Fabio Rinaldi
François Bry
Giacomo Fiumara
Grzegorz J. Nalepa
Henry Hexmoor
Ivan Jelinek
Jacques Calmet
Jidi (Judy) Zhao
Jie Zhang
John R Elliott
Jong C. Park
Jung‐jae Kim
Kanagasabai Rajaraman
Keith C.C. Chan
Ken Currie
Kevyn Collins‐Thompson
Marek Obitko
Markus Zanker
Martine De Cock
Michael Hausenblas
Miguel‐Angel Sicilia
Milan Milanovic
Miltos Petridis
Mohammad Essaaidi
Nicola Fanizzi
Paola Di Maio
Paulo Alexandre Ribeiro Cortez
Patrick Albert
Pierre Maret
Prasad Tadepalli
Priti Srinivas Sajja
Rajendra Akerkar (Chair)
Reinhold Behringer
Richi Nayak
Robert West
Roberto García González
Rodrigo Capobianco Guido
Sandra Lovrenčić
Sang Yong Han
Seiji Yamada
Sławomir Zadrożny
Tassilo Pellegrini
Tim Furche
Tommaso Di Noia
Tzung‐Pei Hong
Vasant Honavar
Vijay Raghavan
Viorel Negru
Yasufumi Takama
Yiannis Kompatsiaris
Yiyu Yao
Yorick Wilks
Yuh‐Jong Hu
Zhihua Cui
Zora Konjovic
Special thanks to:
Advisory committee
Amit Sheth
Frank van Harmelen
Grigoris Antoniou
Guus Schreiber
Harold Boley
James Hendler
Takahira Yamaguchi
External reviewers
Alexandru Cicortas
Pasquale De Meo
Timur Fayruzov
Claudio Gentile
Juan Manuel Gimeno
Sheila Kinsella
Le Duy Gnan
Leonardo Lezcano
Xiaoli Li
Daniel Pop
Azzurra Ragone
Christoph Wieser
Kostyantyn Shchekotykhin
Submitting authors
EasyChair administrator
Process: overview
We used the free conference support system EasyChair throughout the process.
May 25, 2010: Call for papers published
November 20, 2010: Papers due
November 29, 2010: Decision‐making started
January 7, 2011: Review reports due
January 8, 2011: Review‐editing began
January 16, 2011: Notification
Process: submissions
November 20, 2010: 170 submissions received.
Used EasyChair to disable new submissions, but allow revisions.
I turned off web submission at midnight, but accepted email (re‐)submissions until 5pm Nov
28, 2010.
November 21–29 , 2010: Submissions assigned to PC members
November 29, 2010: E‐mails about assigned papers to PC members
Process: decisions
January 7, 2011: PC reviews due.
Except 4 PC members, all others submitted their review reports using EasyChair.
Target of 65 acceptances (50 full papers and 15 posters) —max allowed by single‐track
schedule.
In‐depth discussion of 4 controversial papers, some additional reviews.
Some delays, but overall very thorough and professional reviews by PC and external
reviewers.
January 15, 2011: Decisions complete.
Process: reviews and notification
January 8, 2011: Review editing began.
Chair edited reviews of all submissions with a team of selected PC members.
Goals: clarity, correctness, and incorporation of discussion.
January 16, 2011: Notifications and reviews sent from EasyChair.
Statistics
170 submissions, 50 full papers and 15 poster papers accepted. Acceptance rate 38.24% (65 of 170).
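As a check on the figures, both rates follow directly from the counts stated in the line above (170 submissions, 50 full papers, 15 posters). The full-paper rate is the 29.41% quoted in the trip report, and the overall rate counts full papers and posters together:

```latex
\begin{align*}
\text{full-paper rate} &= \frac{50}{170} \approx 0.2941 = 29.41\% \\
\text{overall rate} &= \frac{50 + 15}{170} = \frac{65}{170} \approx 0.3824 = 38.24\%
\end{align*}
```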
[Figure: Submitted/Accepted by Topic. Bar chart of the number of papers (y-axis, 0 to 40) for topics 1 to 12 (x-axis), with series Submitted and Accepted.]
Topics:
1. Semantics‐driven information retrieval
2. Semantic agent systems
3. Semantic data search
4. Interaction paradigms for semantic search
5. Evaluation of semantic search
6. Knowledge Extraction for Building Expressive Document Representation
7. Text & Web mining
8. Visualisation of Semantic Data
9. Semantic deep Web and intelligent e‐Technology
10. Linked data application architectures
11. Opinion mining and sentiment analysis
12. Other
[Figure: Submitted/Accepted. Bar chart (axis 0 to 200) comparing total Submissions and Accepted.]
International Conference on Web Intelligence, Mining and Semantics
Sogndal, Norway | May 25 – 27, 2011
Conference Program
Wednesday 25.05.2011
09:30 - 10:30 Registration (Main lobby)
10:45 - 11:00 Opening (Sogndal 2 - Auditorium)
Agnes Landstad, Managing Director, WNRI
James Hendler, Member, WIMS'11 Advisory Committee
Rajendra Akerkar, Chair, Program Committee
11:00 - 12:00 Keynote Address (Sogndal 2 - Auditorium), Chair: Terje Aaberge
The Semantic Web 10th Year Update
James Hendler
12:00 - 13:30 Lunch Break
Web Information Search & Retrieval- I
(Room: Sogndal 1 )Session Chair: Nicola Fanizzi
A Semantic Privacy-Preserving Model for Data Sharing and Integration
Yuh-Jong Hu; Jiun-Jan Yang
News Personalization using the CF-IDF Semantic Recommender
Frank Goossen; Wouter IJntema; Flavius Frasincar; Frederik Hogenboom; Uzay Kaymak
CATALOGA®: a Software for Semantic and Terminological Information Retrieval
Annibale Elia; Alberto Postiglione; Mario Monteleone; Johanna Monti; Daniela Guglielmo
Semantics and Ontology Engineering - I
(Room: Sogndal 3 )Session Chair: Michael Hausenblas
Comparison of Ontology Reasoning Systems Using Custom Rules
Hui Shi, Kurt Maly; Steven Zeil; Mohammad Zubair
Tracing the Provenance of Linked Data using voiD
Tope Omitola; Landong Zuo; Christopher Gutteridge; Ian C Millard; Hugh Glaser; Nicholas Gibbins; Nigel Shadbolt
Publishing and Interacting with Linked Data
Roberto Garcia; Josep Maria Brunetti; Antonio López-Muzás; Juan Manuel Gimeno; Rosa Gil
Sessions WE-1: 13:30 - 15:00
Tutorial: Description Logic Reasoning for Semantic web ontologies - I (Room: Sogndal 2)
15:00 - 15:30 Coffee Break
Web Information Search & Retrieval - II
(Room: Sogndal 1)Session Chair: Yuh-Jong Hu
ELS: A Word-Level Method for Entity-Level Sentiment Analysis
Nikolaos Engonopoulos; Angeliki Lazaridou; Georgios Paliouras; Costas Chandrinos
Real Understanding of Real Estate Forms
Tim Furche; Georg Gottlob; Giovanni Grasso; Xiaonan Guo; Giorgio Orsi; Christian Schallhart
ForAVis - Explorative User Forum Analysis
Franz Wanner; Thomas Ramm; Daniel A. Keim
The InfoAlbum: Image Centric Information Collection
Randi Karlsen; Børge Jakobsen
Semantics and Ontology Engineering - II
(Room: Sogndal 3)Session Chair: Andreas L Opdahl
Improving the Usability of Integrated Applications by Using Visualizations of Linked Data
Heiko Paulheim
Toward A Semantic Vocabulary For Systems Engineering
Paola Di Maio
Logical Structure Analysis of Scientific Publications in Mathematics
Nikita Zhiltsov; Valery Solovyev
Disambiguating Entity References within an Ontological Model
Joachim Kleb; Andreas Abecker
Sessions WE-2: 15:30 - 17:00
Tutorial: Description Logic Reasoning for Semantic web ontologies - II (Room: Sogndal 2), Anni-Yasmin Turhan
Poster Session (Main lobby), Session Chair: Sören Auer
Sessions WE-3: 17:00 - 18:30
An Iterative Voting Method Based On Word Density For Text Classification
Wang Jiaxun; Li Chunping
A Correlation Analysis of Social Media
Konstantinos N. Vavliakis; Konstantina Gemenetzi; Pericles A. Mitkas
Teacher Education with Share.TEC Web-Based Repository System
Dimo Boyadzhiev
Improving Web Database Search Incorporating Users Query Information
Rakesh Rawat; Richi Nayak; Yuefeng Li
Recommender Systems by means of Information Retrieval
Alberto Costa; Fabio Roda
Semantic Systems Biology: Enabling Integrative Biology via Semantic Web Technologies
Erick Antezana; Ward Blondé; Aravind Venkatesan; Bernard De Baets; Vladimir Mironov; Martin Kuiper
The RDF Foundry: Call for an Initiative to Build Enhanced RDF Resources for the Biological Data Integration
Aravind Venkatesan; Erick Antezana; Ward Blondé; Mats Skillingstad; M Scott Marshall; Bernard De Baets; Vladimir Mironov; Martin Kuiper
Optimal Recommender Systems Blending
Fabio Roda; Alberto Costa; Leo Liberti
Linking the (un)Linked Data through backlinks
Michalis Stefanidakis; Ioannis Papadakis
On the Distinctiveness of Tags in Collaborative Tagging Systems
André Gohr; Alexander Hinneburg
Implementation of a New Method for Stemming In Persian Language
Asieh Estahbanati; Reza Javidan; Mashalla Abbasi Dezfooli
Late-Breaking News
(Room: Sogndal 1)Session Chair:Costin Badica
AdMotional: Towards Personalized Online Ads
Manfred Meyer, Markus Balsam, Arlo O'Keeffe, Christian Schlüter, Jens Fendler
Evaluating Interaction Patterns for Linked Data
Rosa Gil, Antonio López-Muzás, Josep Maria Brunetti, Juan Manuel Gimeno, Roberto García
You Need Only One Clue for Effective Record Segmentation
Cheng Wang, Tim Furche, Georg Gottlob, Giovanni Grasso, Giorgio Orsi, Christian Schallhart
SeSam4: Semi-Semantic Models for Cross-Sector Portals
Robert Engels et al.
User-Centric Mapping for RDFa Web Mining
Tewson Seeoun, Rob Brennan, Declan O’Sullivan
Ontology Extraction from Social Semantic Tags
Csaba Veres, Weiqin Chen, Andreas Opdahl, Kristian Johansen
19:00 - 20:00 Panel Discussion
Semantic Web: What will happen in next 10 years?
Moderator: Christian Schallhart
Panelists: James Hendler, Sören Auer, Peter Mika, Marko Grobelnik
20:00 -
Thursday 26.05.2011
09:00 - 10:00 Keynote Address (Sogndal 2 - Auditorium), Chair: Svein Ølnes
Making Things Findable
Peter Mika
10:00 - 10:30 Coffee Break
10:30 - 11:30 Keynote Address (Sogndal 2 - Auditorium), Chair: Robert Engels
Creating Knowledge Out of Interlinked Data
Sören Auer
Collective and Web Intelligence - I (Room: Sogndal 1) Session Chair: Marko Grobelnik
Distributed Agent-Based Ant Colony Optimization for Solving Traveling Salesman Problem on a Partitioned Map
Sorin Ilie; Amelia Badica; Costin Badica
Page Segmentation by Web Content Clustering
Sadet Alcic; Stefan Conrad
A Study on the Impact of Crowd-Based Voting Schemes in the ’Eurovision’ European Contest
Gema Bello Orgaz; Raul Cajias; David Camacho
Designing an Architecture for Improving Web Query Processing in Heterogeneous Databases Access
Mohd Kamir Yusof, Ahmad Faisal Amri Abidin, Surayati Usop, Sufian Mat Deris
Intelligent Web & NLP
(Room: Sogndal 3)Session Chair: Alessandro Provetti
Semantic Web and Language Resources for e-Government: Linguistically Motivated Data Mining
Annibale Elia; Daniela Vellutino; Federica Marano; Alberto Maria Langella; Antonella Napoli
Towards automatic thematic sheets based on discursive categories in biomedical literature
Julien Desclés; Olfa Makkaoui; Jean-Pierre Desclés
Improving Accessibility to Web Documents for the Aurally Challenged with Sign Language Animation
Jin-Woo Chung; Ho-Joon Lee; Jong C. Park
Tutorial: Utilising Linked Open Data in Applications - I (Room Sogndal 2), Michael Hausenblas
Sessions TU-1: 11:30 - 13:00
Perl Hacker's Bootcamp (Room: Group Room 109)
13:00 - 14:30 Lunch Break
Collective and Web Intelligence - II (Room Sogndal 1) Session Chair: Michel Plantié
Identification of Fine Grained Feature Based Event and Sentiment Phrases from Business News Stories
Brett Drury; Jose Joao Almeida
The Use of Data Mining Techniques in Location-Based Recommender System
Nafiseh Shabib; John Krogstie
A Graph-based Approach to Measuring Semantic Relatedness in Ontologies
Ahmad Hawalah; Maria Fasli
A Register-based Annotation Scheme for CO3H
Ritesh Kumar
Intelligent e-Technologies and the Semantic Web
(Room Sogndal 3)Session Chair: Jong C. Park
Personalisation by Semantic Web Technology in Food Shopping
Lillian Hella; John Krogstie
Virtual Knowledge Communities for Semantic Agents
Julien Subercaze; Pierre Maret
Improving Search in Tele-Lecturing: Using Folksonomies as Trigger to Query Semantic Datasets to Extract Additional Metadata
Franka Moritz; Maria Siebert; Christoph Meinel
Tutorial: Utilising Linked Open Data in Applications - II (Room Sogndal 2), Michael Hausenblas
Sessions TU-2: 14:30 - 16:00
Perl Hacker's Bootcamp (Room: Group Room 109), Organisers: Kjetil Kjernsmo and Gregory Todd Williams
16:00 - 16:15 Coffee Break
16:15 - 19:30 Social Event
16:15 Departure from Quality Hotel Sogndal
17:10 Arrival at the Boya glacier
17:40 Departure from the Boya glacier
18:00 Arrival at the Norwegian Glacier Museum in Fjaerland
19:00 Departure from the Norwegian Glacier Museum in Fjaerland
19:30 Arrival at Sogndal
20:00 -
Friday 27.05.2011
09:00 - 10:00 Keynote Address (Sogndal 2 - Auditorium), Chair: Pierre Maret
Open Social Learning Communities
Ashwin Ram
10:00 - 10:30 Coffee Break
10:30 - 11:30 Keynote Address (Sogndal 2 - Auditorium), Chair: David Camacho
Many Faces of Text Processing
Marko Grobelnik
Knowledge Representation Formalisms - I
(Room: Sogndal 2a )Session Chair: Chunping Li
Semantic Annotations for Modelling Language Interoperability
Andreas L Opdahl
A solution to the exact match on rare item searches
Morten Goodwin
SWRL-F - A Fuzzy Logic Extension of the Semantic Web Rule Language
Tomasz Wiktor Wlodarczyk; Chunming Rong; Martin O'Connor; Mark Musen
R2DF Framework for Ranked Path Queries over Weighted RDF Graphs
Juan Cedeno; K. Selcuk Candan
Large Knowledge Collider - A Service-oriented Platform for Large-scale Semantic Reasoning
Alexey Cheptsov; Matthias Assel; Georgina Gallizo; Irene Celino; Daniele Dell'Aglio; Luka Bradeško; Michael Witbrock; Emanuelle Della Valle
Social Network Analysis - I
(Room: Sogndal 1 )Session Chair: Peter Mika
Graph Visualization Tool for Twittersphere Users based on a High-Scalable Extract, Transform and Load System
Pablo Aragón; Íñigo García; Antonio García
Measuring Profile Distance in Online Social Networks
Niklas Lavesson; Henric Johnson
What have fruits to do with technology? The case of Orange, Blackberry and Apple.
Surender Reddy Yerva; Zoltan Miklos; Karl Aberer
Sessions FR-1a: 11:30 - 13:00
Tutorial: Just Enough Ontology Engineering - I (Room Sogndal 2b), Paola Di Maio
Workshop: Social Data Mining for Human Behaviour Analysis (SoDaMin) I
(Room: Sogndal 3) Chairs: David Camacho and Costin Badica
Session FR-1b: 11:30 - 13:00
Multiple-Membership Communities Detection in Mobile Networks
Nikolai Nefedov
Optimized K-Means Clustering with Intelligent Initial Centroid Selection for Web Search using URL and Tag Contents
S. Poomagal;T. Hamsapriya
Some Experiences Using Social Networks in Spanish Public Administrations
J. Ignacio Criado;Yolanda E-Martín;David Camacho
Predicting the popularity of online articles based on user comments
Alexandru Tatar;Panayotis Antoniadis;Marcelo Dias de Amorim;Jérémie Leguay;Arnaud Limbourg;Serge Fdida
Applying Gini Coefficient to quantify Scientific Collaboration in Researchers Network
Giseli Rabello Lopes;Roberto da Silva;J. Palazzo M. de Oliveira
Intelligent Social Media
Yolanda E-Martin;Angel Moreno;Miguel Doctor;D. Diaz
13:00 - 14:30 Lunch Break
Knowledge Representation Formalisms - II
(Room: Sogndal 2a )Session Chair: Anni-Yasmin Turhan
Semantic Web Based Architecture for Managing Hardware Heterogeneity in Wireless Sensor Network
Siniša Nikolić; Valentin Penca; Milan Segedinac; Zora Konjović
Group Decision Making in Ontology Matching
Mahdieh Kargar Ghavi; Mohammad Reza Khayyambashi
Ontological Logic Programming
Murat Sensoy; Geeth de Mel; Wamberto Vasconcelos; Timothy Norman
Prediction of Class and Property Assertions on OWL Ontologies through Evidence Combination
Giuseppe Rizzo; Nicola Fanizzi; Claudia d’Amato; Floriana Esposito
Social Network Analysis - II
(Room: Sogndal 3)Session Chair:Randi Karlsen
Mining Social Networks and Their Visual Semantics from Social Photos
Michel Crampes; Michel Plantié
Managing Social Overlay Networks in Semantic Open Enterprise Systems
Florian Skopik; Daniel Schall; Schahram Dustdar
Towards a Framework for Weaving Social Networks Principles into Web Services Discovery
Zakaria Maamar; Noura Faci; Youakim Badr; Leandro Krug Wives; Pédro Bispo dos Santos; Djamal Benslimane; José Palazzo Moreira de Oliveira
Crawling Facebook for Social Network Analysis Purposes
Salvatore Catanese; Pasquale De Meo; Emilio Ferrara; Giacomo Fiumara; Alessandro Provetti
Sessions FR-2: 14:30 - 16:00
Tutorial: Just Enough Ontology Engineering - II (Room Sogndal 2b), Paola Di Maio
16:00 -
Yuh-Jong Hu <yjong.hu@gmail.com>
Re: call 8: FP7 ICT-2011.4.4 – Intelligent Information Management
3 messages
Rajendra Akerkar <rak@vestforsk.no>
June 15, 2011, 2:49 PM
To: Yuh-Jong Hu <hu@cs.nccu.edu.tw>
Dear Dr. Hu,
I am very glad for receiving your message and for your interest.
I am attaching herewith a project idea.
This call will be open next month with deadline in January 2012.
If you know any other suitable call, please let me know.
In case you have additional ideas, corrections, suggestions etc. I would be only glad if you let me know.
You may further edit the file and send to me. Let us have a brainstorming before we form a formal
consortium.
Thank you very very much for your time, and hope to hear from you before 22 June.
Regards,
Rajendra
________________ Rajendra Akerkar
Senior Researcher/Professor
Vestlandsforsking/Western Norway Research Institute P.Box 163 6851 Sogndal, Norway Phone: +47 916 85 607
One-page-EUproject.doc
46K
Yuh-Jong Hu <hu@cs.nccu.edu.tw>
June 15, 2011, 3:52 PM
To: Rajendra Akerkar <rak@vestforsk.no>
Dear Rajendra,
This is my first time participating in an FP7 project submission, so I am not quite sure how the process works. Could you tell me more about the current status of the proposal preparation, such as how many people are interested in (or have committed to) this submission? Is FP7 coming to an end?
Currently, I am interested in Intelligent Information Management, especially intelligent data integration and protection using ontologies and rules over very large amounts of structured data, such as RDBs. However, we focus more on the ontology-based semantic part, so we are dealing with ontology mapping/merging/alignment as well as ontology modularization and reuse for data integration and protection.
In addition, we are considering using a cloud computing platform, possibly with data integration and protection across jurisdictional domains. Possible applications are electronic health record (EHR) sharing for medical diagnosis and information sharing for counter-terrorism. Several professors from the law school support this research. If possible, please create a discussion forum on LinkedIn or Facebook. I am happy to participate and contribute.
A Semantic Privacy-Preserving Model
for Data Sharing and Integration
∗Yuh-Jong Hu
ENT Lab., Dept. of CS National Chengchi University
Taipei, Taiwan, 11605
hu@cs.nccu.edu.tw
Jiun-Jan Yang
ENT Lab., Dept. of CS National Chengchi University
Taipei, Taiwan, 11605
98753036@nccu.edu.tw
ABSTRACT
In this paper, we encompass and extend previous ontology-based data integration systems. A semantic privacy-preserving model provides authorized view-based query answering over multiple, widely distributed servers for data sharing and integration. The combined semantics-enabled privacy protection policies are used to empower the data integration and access control services at the virtual platform (VP). An ontology mapping and merging algorithm with a local-as-view (LAV) source description creates a global ontology schema at the VP by integrating multiple local ontology schemas for data sharing. A perfect rule integration of datalog rules enforces the data query and protection services. Semantics-enabled policies are combined at the VP, but the access control criteria specified in each server are still satisfied. Therefore, the soundness and completeness of the data sharing and protection criteria are ensured, which supports the validity of the policy combination. This guarantees the trustworthiness of data sharing and protection services in multiple servers.
Categories and Subject Descriptors
H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval—query formulation; H.3.5 [Information Storage and Retrieval]: Online Information Services— data sharing ; K.4.1 [Computers and Society]: Public Policy Issues—privacy, regulation
General Terms
WWW, Semantic Web, Database
∗Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. WIMS’11, May 25-27, 2011, Sogndal, Norway. Copyright © 2011 ACM 978-1-4503-0148-0/11/05... $10.00
Keywords
data sharing and integration, semantics-enabled policy, privacy protection, query rewriting, ontology and rule
1. INTRODUCTION
Large enterprises spend a great deal of time and money on data (or information) integration [3]. Data integration is the problem of combining the data from autonomous and heterogeneous sources, and providing users with a unified view of these data through a so-called global (or mediated) schema. The global schema, a reconciled view of the information, provides query services to end users. The design of a data integration system is a very complex task that involves several different issues: the heterogeneity of the data sources, the relation between the global schema and the data sources, limitations on the mechanisms for accessing the sources, how to process queries expressed on the global schema, etc. [11].
Three approaches have been proposed to model a set of source descriptions that specify the semantic mapping between the source schema and the global schema. The first one, called global-as-view (GAV), requires that each concept in the global schema be expressed in terms of a query over the data sources. GAV deals with the case in which the stable data sources contain details not present in the global schema, so it is not suited to dynamically adding or deleting data sources.
The second one, called local-as-view (LAV), requires the global schema to be specified independently from the sources. The source descriptions between the stable global schema, such as an ontology, and the dynamic data sources are established by defining each concept in the data sources as a view over the global schema [10] [26]. LAV descriptions handle the case in which the global schema contains details that are not present in every data source.
The third one, called global-local-as-view (GLAV), is a source description that combines the expressive power of both GAV and LAV, allowing flexible schema definitions independent of the particular details of the data sources [14] [30]. The data integration system uses these different source descriptions to reformulate a user query into a query over the source schemas. However, data sharing and integration are hampered by legitimate and widespread privacy concerns, so it is critical to develop techniques that enable the integration and sharing of data without losing a user’s privacy [12].
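As a rough illustration of the difference between GAV and LAV source descriptions, the following Python sketch uses toy relations. All relation and function names are hypothetical, and the rewriting is reduced to a lookup on single concepts; real LAV rewriting must handle views defined by arbitrary queries over the global schema.

```python
# Toy data sources; relation names and tuples are hypothetical.
sources = {
    "s1_patient": {("p1", "Alice"), ("p2", "Bob")},
    "s2_client": {("p3", "Carol")},
}

# GAV: each global concept is defined as a query (here, a plain union)
# over the source relations.
gav = {"Person": ["s1_patient", "s2_client"]}


def answer_gav(concept):
    """Unfold the global concept into its defining source relations."""
    result = set()
    for rel in gav[concept]:
        result |= sources[rel]
    return result


# LAV: each source relation is described as a view over the global schema.
lav = {"s1_patient": "Person", "s2_client": "Person"}


def answer_lav(concept):
    """Use every source whose view is defined over the requested concept."""
    result = set()
    for rel, viewed in lav.items():
        if viewed == concept:
            result |= sources[rel]
    return result


def add_source_lav(name, concept, rows):
    """Under LAV, adding a source only adds that source's own description."""
    sources[name] = set(rows)
    lav[name] = concept
```

In this sketch, a source added through `add_source_lav` is picked up by `answer_lav` immediately, whereas `answer_gav` ignores it until the global definition in `gav` is edited, which mirrors why GAV is less convenient when sources come and go.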
Privacy protection policies represent a long-term promise made by an enterprise to its users and are determined by business practice and legal concerns. It is undesirable to change an enterprise’s promises to customers every time an internal access control rule changes. If possible, we should enable the integration of Platform for Privacy Preferences (P3P) and Enterprise Privacy Authorization Language (EPAL) policies to provide accountable and transparent information processing for data owners to revise their data usage permissions [2].
Although many organizations post online privacy policies, they must realize that simply posting a privacy policy on their websites does not guarantee true compliance with existing legislation. Following the OECD’s Fair Information Principles (FIPs)¹, an organization should provide norms for processing personal information that cover its collection, retention, use, disclosure, and destruction. An organization must also be accountable for the information it possesses and should declare the purposes of information usage before collection. Moreover, an organization should collect personal information with an individual’s consent and disclose personal information only for previously identified purposes.
In this paper, we address the following research issues. More detailed modelling and implementation are shown in the later sections.
• Data sharing and protection services are considered across a large number of servers. The incentive for using the virtual platform (VP) is to avoid solving the complex pair-wise problem of ontology matching and rule integration between these servers. A unified global data sharing and protection service can therefore be achieved at the VP.
• Privacy protection policies are expressed as a combination of ontology and rule, i.e., O + R, where ontology O includes a TBox schema and ABox instances, and rules R include a deductive rule set (RS) and facts (F). Data sharing and protection in multiple servers are achieved through a combination of semantics-enabled formal protection policies (FPP).
• The challenge of designing a semantic privacy protection model is to ensure the soundness and completeness of data sharing and protection in multiple servers. For the soundness criterion, we do not allow unintended data to be released to data users through the global policy schema (GPS) at the VP; otherwise, the privacy protection policies are violated. For the completeness criterion, we do not miss any eligible shared data when a user issues a data request at the VP. Therefore, the shareable data obtained at the VP should equal the data obtained directly from each server.
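The two criteria in the last item can be read as two set inclusions. The sketch below is an illustration only, assuming that the answers returned at the VP and the shareable answers obtained directly from each server are available as Python sets; the function and variable names are hypothetical and do not come from the paper.

```python
def shareable_union(local_answers):
    """Union of the shareable answers obtained directly from each server."""
    return set().union(*local_answers.values())


def is_sound(vp_answers, local_answers):
    """Sound: the VP releases nothing that no server would release itself."""
    return vp_answers <= shareable_union(local_answers)


def is_complete(vp_answers, local_answers):
    """Complete: the VP misses nothing that some server would release."""
    return shareable_union(local_answers) <= vp_answers


# Hypothetical per-server answers for one data request.
local_answers = {"server1": {"a", "b"}, "server2": {"c"}}
```

When both checks hold, the data obtained at the VP equals the data obtained directly from the servers, which is the condition stated above.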
Each enterprise server declares its P3P privacy protection policies, which take into account the FIPs criteria (see Figure 1). Then EPAL policies are established in each site, corresponding to the P3P policies [24]. For each data request, the data handling and usage controls are based on the EPAL policies. However, P3P and EPAL lack formal and unambiguous semantics for specifying privacy protection policies, so they offer limited support for policy enforcement and auditing by software agents. One of the research challenges for the online privacy protection problem is to develop a privacy management framework and a formal semantics language to empower agents to enforce privacy protection policies. Agents must avoid any policy violation for each data request. We attempt to establish a semantic privacy protection model to address this issue. Each server shares its collected data with other servers, but without breaking the original data usage commitment to its clients [25].
¹ See http://www.privacyrights.org/ar/fairinfo.htm
The contributions of this paper are twofold. We first offer a three-layer semantic privacy-preserving model, which encompasses and extends the existing work on data sharing and integration by using a combination of ontology and rule for the representation of privacy protection policies. In particular, we define a formal policy using ontology for privacy protection concept descriptions and rule for data query and access control services. Then we focus on solving the soundness and completeness of the query rewriting problem using a perfect ontology merging and a perfect rule integration from the local formal protection policies. For each possible data query at the VP, we briefly demonstrate how the soundness and completeness criteria for privacy-protecting data integration can be achieved using this semantics-enabled privacy-preserving model.
The paper is organized as follows. In section 2, we present a semantic privacy-preserving model as a framework for data sharing and integration services. In section 3, we define a formal policy combination as an integration of formal policies from autonomous data sources. Each formal policy is composed of ontologies and rules for each independent data source. A privacy protection policy is a type of formal policy used for specifying a data usage constraint from a data owner. In section 4, we formally define a formal policy combination in terms of ontology mapping, merging, and alignment. Then we demonstrate how a perfect rule integration is used for query rewriting at the VP corresponding to each local schema. In section 6, we briefly prove the soundness and completeness of privacy-preserving data sharing and integration based on this semantic privacy-preserving model. We conclude with related work and discussion in the last two sections.
2. A PRIVACY-PRESERVING MODEL
A semantic privacy protection model is proposed with three layers. The bottom layer provides data sources from the relational databases, and the middle layer provides a semantics-enabled local schema for each independent service domain. The top layer is served at the VP, which provides a unified global view of privacy-preserving data sharing and integration services (see Figure 2).
We have a merged global ontology schema created by mapping and aligning local ontology schemas with a LAV source description from multiple local schemas in the middle layer. The idea of using description logic (DL) to model the local and global schemas is to exploit the ontology’s abstract concept representation and reasoning capabilities. A query is defined as an SQWRL datalog rule in the SWRL-based policy to access a global ontology [31]. Each SQWRL data service query for a global ontology at the VP is mapped to multiple queries, expressed as SQWRL datalog rules, for each local schema. This is a LAV query rewriting service, which has been investigated in databases but is largely unexplored in the context of DL-based ontologies [14].
Figure 1: A semantic privacy protection model extended from the integration of P3P and EPAL for data sharing and protection in multiple servers
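To make the idea of mapping one global query to several local queries more concrete, here is a minimal Python sketch. It is not the paper's rewriting algorithm: it assumes a one-to-one term mapping from the global ontology to each server's local ontology, represents a conjunctive query as a list of atoms, and uses hypothetical server, class, and property names throughout.

```python
# Hypothetical mapping from global ontology terms to each server's local terms.
mappings = {
    "serverA": {"Patient": "Client", "hasRecord": "hasFile"},
    "serverB": {"Patient": "Insured"},
}


def rewrite(global_query, mappings):
    """Rewrite a conjunctive query, given as (predicate, args) atoms, per server.

    A server is skipped when it has no local term for some predicate in the
    query, because it could not answer that atom.
    """
    local_queries = {}
    for server, term_map in mappings.items():
        if all(pred in term_map for pred, _ in global_query):
            local_queries[server] = [
                (term_map[pred], args) for pred, args in global_query
            ]
    return local_queries


# Global query: Patient(?p) ^ hasRecord(?p, ?r)
query = [("Patient", ("?p",)), ("hasRecord", ("?p", "?r"))]
local_queries = rewrite(query, mappings)
```

Here only `serverA` receives a rewritten query, since `serverB` has no local counterpart of `hasRecord`; a query over `Patient` alone would be rewritten for both servers.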
2.1 Formal Privacy Protection Policy
Whether a policy is explicitly represented as ontologies or as rules depends on the underlying logic foundation of the policy language. If policies are created from a DL-based policy language, such as Rein or KAoS, then ordinary policies are shown as a TBox schema and ABox instances. Otherwise, if policies are created from an LP-based policy language, such as EPAL or Protune, then ordinary policies are a set of rules, whose predicates take one, two, or three variables, together with facts [5].
In the SemPIF framework [21], we define a Policy Interchange Format (PIF) that follows W3C O + R standards [6] and strives to provide a mechanism for agents to preserve different policy syntax and semantics throughout policy integration and interchange. In addition, agents can use meta-PIF, which provides further management and reconciliation services for multiple PIF-enabled policies across various domains. In this paper, we apply the SemPIF framework to privacy-preserving data integration through a combination of formal policies.
A formal policy (FP) is a declarative expression corresponding to a human legal norm that can be executed in a computer system without causing any semantic ambiguity. An FP is created from a policy language (PL), and this PL is shown as a combination of an ontology language and a rule language. Therefore, an FP is composed of ontologies O and rules R, where ontologies are created from an ontology language and rules are created from a rule language.
A formal protection policy (FPP) is an FP that aims at representing and enforcing resource protection principles, where the structure of resources is modelled as ontologies O and the protection of resources is shown as rules R. A privacy protection policy shown as an FPP is a combination of ontologies and rules, i.e., O + R, where DL-based ontologies, such as OWL-DL ontologies, provide a well-defined structured data model for data sharing, while Logic Program (LP)-based rules, such as datalog rules, provide further expressive power for data query and protection.
There are numerous O + R combinations available for designing privacy protection policies, such as SWRL [20] and OWL2 RL [17]. Each O + R combination determines what expressive power we can extract from ontologies for the rules and vice versa. SWRL is one of the O + R semantic web languages suitable for policy representation in the privacy protection model, but it is not the only choice. Other O + R combinations, such as CARIN and OWL2 RL, are also possible for modeling formal privacy protection policies whenever their underlying theoretical foundations and development tools are available. We fully utilize the SWRLTab development tools and the SQWRL OWL-DL query language [31] in Protégé to model and enforce semantic privacy protection policies.
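The split between ontologies O (structure of resources) and rules R (protection of resources) can be sketched in a few lines of Python. This is a toy stand-in, not SWRL or SQWRL: the class names, individuals, property names, and the single access rule are all hypothetical, and the reasoning is limited to a subclass chain lookup plus one hand-coded datalog-style rule.

```python
# O, TBox: subclass axioms (child -> parent). Hypothetical class names.
tbox = {"Physician": "MedicalStaff", "Nurse": "MedicalStaff"}

# O, ABox: class assertions and property assertions about individuals.
abox_classes = {
    ("alice", "Physician"),
    ("bob", "Nurse"),
    ("rec1", "HealthRecord"),
}
abox_props = {
    ("alice", "treats", "carol"),
    ("rec1", "ownedBy", "carol"),
}


def instances_of(cls):
    """Individuals asserted in cls or in any subclass of cls."""
    members = set()
    for individual, asserted in abox_classes:
        current = asserted
        while current is not None:
            if current == cls:
                members.add(individual)
                break
            current = tbox.get(current)
    return members


def can_read():
    """R: canRead(u, d) :- MedicalStaff(u), HealthRecord(d),
                           treats(u, o), ownedBy(d, o)."""
    permitted = set()
    for user in instances_of("MedicalStaff"):
        for record in instances_of("HealthRecord"):
            for subj, prop, obj in abox_props:
                if (subj == user and prop == "treats"
                        and (record, "ownedBy", obj) in abox_props):
                    permitted.add((user, record))
    return permitted
```

In this example the rule relies on the ontology to conclude that a `Physician` is `MedicalStaff`, while the access decision itself is made by the rule, which is the division of labour the O + R combination is meant to capture.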
We face a research challenge of combining SWRL-based privacy protection policies from multiple servers to ensure the soundness and completeness of data sharing and protection criteria. Another challenge is to resolve the incompatibility of policy syntax and semantics when we allow policy combination in multiple servers. SWRL is based on classical first-order logic (FOL) semantics, which mitigates possible semantic and syntactic inconsistency when policies come from different servers.
But we still face a background policy inconsistency problem when default policy assumptions vary between different