Design and Implementation of a Deep Semantic Indexing Method for Cultural Heritage Images


DOI: 10.6245/JLIS.2017.431/716


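Xiaoguang Wang (王曉光), Professor, School of Information Management, Wuhan University. E-mail: [email protected]
Xuemei Liu (劉雪梅), Master's student, School of Information Management, Wuhan University. E-mail: [email protected]
Shengping Xia (夏生平), Associate Research Fellow, Dunhuang Academy. E-mail: [email protected]

Keywords: Cultural Heritage; Deep Semantic Indexing; Dunhuang Mural; Information Modeling; Narrative Image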

【Abstract】

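With the rapid growth of cultural heritage image resources and the development of digital humanities research, deep semantic indexing of the semantic content of cultural heritage images has gradually drawn attention. Deep semantic indexing of cultural heritage images not only improves the efficiency of image retrieval and access and deepens users' understanding of images, but also supports automatic integration of image resources and knowledge discovery, so it has both theoretical and practical significance. This study analyzes the semantic features and themes of cultural heritage images and reviews existing cultural heritage metadata and ontologies. Having clarified the concept and basic requirements of deep semantic indexing, it designs a basic indexing workflow and constructs a semantic indexing model for cultural heritage images, consisting of a macro-level concept model, a layered model of the information contained in images, and a structured organization model for indexing text. An indexing experiment is then carried out on the Dunhuang mural Jataka of the Nine-colored Deer King (《九色鹿王本生圖》). The model reveals the semantic relationships among concepts, images, and text, mines the knowledge associations between the information layers of image indexing, and achieves fine-grained organization of image information units. The experiment verifies the feasibility of the model and of the structured organization of indexing text. The design and implementation of the method is an innovation in deep semantic indexing theory and image information organization theory, and is significant for the development of image information organization and digital humanities research. The granularity and depth of indexing should be chosen according to the conditions under which annotation is carried out. The linked integration, publication, and presentation of deep semantic indexing information will be key topics for future research.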


Introduction

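Cultural heritage is a precious asset created by humankind through social and historical practice, with very high artistic, cultural, and social value. As society becomes increasingly digital, using computer technology to protect, pass on, and disseminate cultural heritage in digital form has received growing attention. In 1992 UNESCO launched the Memory of the World Programme to promote the preservation, protection, and wide use of the world's documentary heritage. In 2008 the European digital library portal Europeana (2017) opened to the world, providing more than ten million books, maps, paintings, photographs, and other cultural heritage materials from cultural institutions in the 27 member states of the European Union. China has a long history of cultural heritage protection, but its digital protection, transmission, and dissemination work started relatively late. In 2012 the National Library of China began the "China Memory" project, which to date has collected many kinds of digital cultural heritage resources, including New Year paintings, photographs, documents, and oracle bone inscriptions.

Images, like text, are an important carrier and form of expression of cultural heritage. With the progress of digitization, cultural heritage images in digital form are growing rapidly. These images are rich in themes and profound in meaning, but existing image metadata lack sufficiently detailed rules for subject description, the complex semantic content within images is rarely revealed, and omitted or wrong subject indexing is common. As a result, retrieval and discovery of most cultural heritage image resources are unsatisfactory and cannot support the needs of emerging Digital Humanities research. To support deep semantic organization and integration of cultural heritage resources, we propose a deep semantic indexing method for image resources that uses multi-level semantic description and structured organization to reveal and express the rich meaning of images. An indexing experiment on a Dunhuang mural image implements the method and verifies its feasibility.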

Deep Semantic Indexing of Cultural Heritage Images

The Concept of Deep Semantic Indexing

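The concept of Deep Semantic Indexing (DSI) of images derives from the practice of deep indexing of documents. Deep indexing is a newer approach to document organization intended to meet users' needs for precise and efficient retrieval. Its key step is to identify the figures and tables in a document and extract their data and information to build indexes, thereby achieving fine-grained organization and management of digital resources and enabling knowledge mining and sharing (Hyer, 2008). David Clarke proposed the concept of deep semantic indexing of images in 2015. It goes into the image itself and focuses on mining and expressing the content and semantic structure of the image, in order to improve the organization, management, and retrieval of image information (Clarke, 2015).

Basic Requirements of Deep Semantic Indexing

With the rise of digital humanities research and advances in semantic technology, researchers hope to use automatic machine annotation to reveal the fine-grained knowledge units of images in depth and to integrate images with other types of digital resources, so as to build a network of digital cultural heritage resources and an infrastructure for digital humanities research. The goal of DSI is to perform fine-grained semantic indexing using controlled thesauri, ontologies, free text, and web resources, improving the retrievability, discoverability, and understandability of whole images and their fragments, and improving the management of image information resources. DSI differs from traditional subject indexing of images in four ways. (1) DSI uses precise domain terms and natural language to reveal the information in an image and mines its complex information as comprehensively and at as many levels as possible, unlike traditional indexing with keywords or subjects. (2) DSI adopts structuralist thinking: to reveal the semantics of each fine-grained object, it indexes content against precisely matched regions, so it describes the semantics of fine-grained image fragments as well as of the whole image. (3) DSI organizes indexing information in a hierarchy, which indirectly reveals the semantic relationships among content objects, including temporal, spatial, and logical relationships, and keeps the description as closely aligned with the image content as possible. (4) The content of DSI is neither closed nor fixed. Its core is to reveal the first two levels of iconology, while interpretation at the third level is optional, which defines both the difference and the connection between image semantic indexing and iconological research.

In theory and method, deep semantic indexing crosses the boundary between image information organization and iconological research. On the one hand, it improves the efficiency of researchers' resource searches and provides a basic framework for image annotation, letting researchers concentrate on higher-level knowledge of images. On the other hand, to carry out deep semantic indexing, indexers must consider the research perspectives of iconology experts and grasp the cognitive patterns of iconological research. Only then can the comprehensiveness and accuracy of semantic annotation be assured.

Basic Workflow of Deep Semantic Indexing

Deep semantic indexing uses controlled thesauri, ontologies, free text, and web resources to perform fine-grained semantic indexing and improve the understandability of images, so resources such as an image thesaurus and associated texts are indispensable. Fine-grained information units within an image are matched with the associated texts and normalized with the help of controlled thesauri and domain ontologies, which improves the reliability and readability of deep semantic annotation. This paper uses a structured text hierarchy to reveal the hierarchical relationships hidden in images, which helps readers understand the images and also provides a standardized method for computers to achieve deep semantic annotation through machine learning. The workflow is shown in Figure 1.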


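Figure 1. Workflow of deep semantic indexing of images

The workflow has five stages: acquiring the image and its associated texts, analyzing the semantic units of the image, normalizing the semantic indexing, organizing the indexing information, and publishing the indexing information. Because image content is complex and its presentation abstract, people usually need related literature to help interpret an image. The first task in deep semantic indexing is therefore to obtain the digital image and the texts associated with it. Next, the image is analyzed semantically on the basis of the texts: the objects to be indexed are identified and confirmed, the indexing information is determined, and a domain thesaurus and ontology are selected to normalize part of that information. The indexing information is then organized, that is, arranged according to a given structure. This organization must take into account the relationships among the objects in the image and the explanatory content of the associated texts. Narrative images, for example, usually involve time and plot, so organizing the indexing information by plot allows the image content to be represented completely and accurately. Finally, the indexed image is published together with its indexing information so that users can access it openly and so that the content can be integrated, shared, and made interoperable.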
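The normalization stage of the workflow can be illustrated with a minimal sketch. It assumes a thesaurus that maps variant wordings to one preferred term; the dictionary entries, the function name `normalize_term`, and the choice to keep unmatched terms as free text are illustrative assumptions, not the vocabulary or software used in the study.

```python
from typing import Dict, List, Tuple

# Hypothetical thesaurus: variant wording -> preferred term (entries are illustrative only).
THESAURUS: Dict[str, str] = {
    "nine-colored deer": "Nine-colored Deer",
    "deer king": "Nine-colored Deer",
    "drowning man": "Drowning man Devadatta",
}


def normalize_term(raw: str, thesaurus: Dict[str, str]) -> Tuple[str, bool]:
    """Return (term, controlled).

    If the raw wording is found in the thesaurus, the preferred term is returned
    with controlled=True. Otherwise the raw wording is kept as free text.
    """
    key = " ".join(raw.lower().split())  # case-fold and collapse whitespace
    if key in thesaurus:
        return thesaurus[key], True
    return raw.strip(), False


def normalize_all(raw_terms: List[str], thesaurus: Dict[str, str]) -> List[Tuple[str, bool]]:
    """Normalize every candidate term produced by the semantic-unit analysis."""
    return [normalize_term(term, thesaurus) for term in raw_terms]


if __name__ == "__main__":
    for term, controlled in normalize_all(["Deer  King", "carriage"], THESAURUS):
        print(term, "(controlled)" if controlled else "(free text)")
```

Keeping unmatched terms as free text mirrors the statement above that only part of the indexing information is normalized against the thesaurus and ontology.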

Deep Semantic Indexing Model for Cultural Heritage Images

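Cultural heritage images often carry narrative or symbolic meaning. Strongly narrative images in particular have clear temporal and spatial attributes. The focus of deep semantic indexing of cultural heritage images is to understand the semantic associations among concepts, images, and text, to define a layered model of deep semantic indexing information, and to build a tree-shaped information hierarchy that mines the deep semantic information contained in the images.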

[Figure 1 flowchart elements: digital images; associated text; image thesaurus; image domain ontologies; analyzing semantic units; normalizing indexing terms; organizing indexed information; publishing indexed information]


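Macro-level Concept Model for Semantic Description of Cultural Heritage Images

Deep semantic indexing of cultural heritage images rests on identifying the resources associated with a digital image and the semantic links among them. Building an ontology model of cultural heritage images makes it possible to organize digital image resources effectively, to construct multiple mappings among image descriptions, and to express and reveal implicit deep semantic information explicitly. It also provides a standard framework for deep semantic indexing that integrates various kinds of information, supports effective retrieval and access for different users and different document resources, and strengthens the semantic interconnection of image resources, on which higher-level knowledge mining, organization, and management can be built. Taking the Dunhuang mural Nine-colored Deer King Jataka (《九色鹿王本生圖》) as an example, this paper constructs a concept model of cultural heritage images that reveals the semantic relationships among concepts, murals, digital images, and text, as shown in Figure 2.

Figure 2. Concept map for semantic description of Dunhuang murals

Take the "Nine-colored Deer" as an example. As a mythical animal, the Nine-colored Deer can be represented as a high-level concept. The cognitive "image" formed for the specific object "Nine-colored Deer" is a realization and transformation of that concept, and the "mural" is the concrete expression, in the form of a traditional wall painting, of that cognitive image. As digitization advances, the "digital image" serves as a new form for describing and storing the traditional mural, but because Dunhuang murals are abstract and complex, such a description is far from sufficient for understanding and revealing the deep semantic information of the image.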


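Images and text are both, in essence, symbolic forms that express concepts, and the two can express the same semantics. It is therefore feasible to use text to reveal the deeper meaning of the objects in an image. To a certain extent, the semantics of "text" and "image" are equivalent. Deep semantic indexing of images exploits this semantic equivalence to achieve more efficient content retrieval and knowledge discovery.

Layered Model of the Information Contained in Cultural Heritage Images

To better reveal the semantic information contained in images, mine the knowledge associations within them, and reorganize image knowledge units, we built a concept model for deep semantic indexing of narrative images. The model has four layers, from bottom to top: the image layer, the element layer, the indexing layer, and the organization layer. Together the four layers reveal the semantics of the image and organize its information.

Figure 3. Layered model of semantic information in cultural heritage images

(1) Image layer: describes the original mural image, using Categories for the Description of Works of Art (CDWA) as the basic framework, and records external information such as place of preservation, time of creation, and creator.
(2) Element layer: identifies the entity objects in the image and separates out the individual elements. The indexing boxes that precisely match each element form the main structure of this layer. The layer is described on the basis of the International Image Interoperability Framework (IIIF) and records, for each indexing box, its X and Y coordinates, shape, color, and other information.
(3) Indexing layer: assigns to each separated element a unique, normalized concept, such as "Nine-colored Deer", "Drowning man Devadatta" (溺人調達), or "kneeling in worship". The layer is grounded in the pre-iconographical and iconographical levels of Panofsky's three-level theory of iconology.
(4) Organization layer: represents the concepts and information extracted in the indexing layer in a formal way rather than in narrative sentences, which makes the logical relationships of the image content explicit, helps readers understand the image, and provides a data foundation for machine learning so that computers can understand images.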
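As a rough illustration of how the four layers might fit together as data, the sketch below models each layer with a small Python dataclass. All class and field names, the box coordinates, and the "unknown" placeholder values are assumptions made for this example. They do not reproduce the CDWA or IIIF schemas, only the kind of information each layer is said to carry.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ImageLayer:
    """Image layer: external description of the whole image (simplified, CDWA-like)."""
    title: str
    repository: str
    creation_date: str
    creator: str


@dataclass
class Element:
    """Element layer: one indexing box located on the image."""
    box_id: str
    shape: str
    points: List[Tuple[int, int]]  # X/Y coordinates of the box outline
    color: str


@dataclass
class IndexEntry:
    """Indexing layer: links one box to exactly one normalized concept."""
    box_id: str
    concept: str


@dataclass
class IndexedImage:
    image: ImageLayer
    elements: Dict[str, Element] = field(default_factory=dict)
    entries: List[IndexEntry] = field(default_factory=list)
    # Organization layer: child concept -> parent concept (formal, non-narrative relations).
    broader: Dict[str, str] = field(default_factory=dict)

    def add(self, element: Element, concept: str, parent: str = "") -> None:
        """Register a box, its concept, and optionally the concept's parent."""
        self.elements[element.box_id] = element
        self.entries.append(IndexEntry(element.box_id, concept))
        if parent:
            self.broader[concept] = parent

    def regions_for(self, concept: str) -> List[Element]:
        """Return every indexing box that depicts the given concept."""
        return [self.elements[e.box_id] for e in self.entries if e.concept == concept]


# Placeholder metadata: the real values would come from the image-layer description.
mural = IndexedImage(ImageLayer("Nine-colored Deer King Jataka", "unknown", "unknown", "unknown"))
mural.add(
    Element("box-1", "polygon", [(10, 10), (60, 10), (60, 80)], "#FFFF00"),
    "Nine-colored Deer",
    parent="Gratitude of the Drowning Man",
)
print(mural.regions_for("Nine-colored Deer")[0].shape)  # polygon
```

Keeping the organization layer as explicit concept-to-parent links, rather than prose, is what lets a program move from a concept to its image regions and back.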


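Structured Organization Model for Indexing Text of Cultural Heritage Images

Cultural heritage images are of many types, and organizing the information extracted in the indexing layer so that it reproduces the content of the image as fully as possible is a difficult problem. Different image types clearly call for different ways of organizing information. Narrative images are a common type in the cultural heritage domain. An image is a "decontextualized existence" isolated from the flow of images of an event (Long, 2007). According to how a single image narrates, narrative images fall into three types: single-scene narrative, synoptic narrative, and cyclic narrative (Long, 2008). Wikipedia likewise notes that narrative art can be divided into panoramic, progressive, sequential, continuous, and other types. Taken together, the basic elements of a narrative image are space and time, and the image narrates through the progression of the plot, that is, it expresses the passage of time through changes of scene and of entity objects within the image. Therefore, in deep semantic indexing of narrative images, besides directly visible semantic units such as the basic objects and actions in the image, the "plot", as the key structure linking the image to the story text, should also be a principal semantic unit.

Once the types of semantic unit are determined, the next question is how to organize the independent units effectively. Traditional organization methods are mainly classification and subject methods (Ma & Song, 2011), while newer methods include topic maps, concept maps, and the semantic web (Su, 2014). For narrative images, restoring an image in words requires considering the relationship of the "plot" to objects and actions. We designed a structured organization method modeled on the subject tree, which organizes information resources by category according to a predetermined concept system. In a narrative image, organizing information around the "plot" both makes every semantic unit clear at a glance and conveys the narrative meaning of the image. Users can traverse the hierarchy level by level and browse the indexing content of interest together with the corresponding image regions.

The Dunhuang mural Nine-colored Deer King Jataka (《九色鹿王本生圖》) is a narrative image with the basic characteristics of such images. It contains multiple objects and multiple scene changes, and its semantic description is relatively complex. Following the approach above, we built a framework for organizing its semantic information, as shown in Figure 4. The hierarchy of deep semantic indexing takes the "plot" as the basic semantic unit. The first level consists of the plots and the sequential relationships among them. The second level consists of the objects under a specific plot, including concepts and instances of persons, animals, buildings, vehicles, and so on. The third level consists of the sub-objects of a specific object and attributes such as actions and emotions. Where needed, sub-objects can be subdivided further.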


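Figure 4. Hierarchical structure of deep semantic indexing [node labels: deep semantic indexing information; plot 1, plot 2, ..., plot N; object 1, object 2; object 1.1; behavior; emotion]

Deep semantic indexing information is a kind of structured text. It differs from both images and conventional text: it has a structured hierarchy, lets words reproduce the image precisely, and reveals the image's deeper meaning. It also serves as an intermediary document that exposes the implicit links between image and text, building a bridge between image resources and text resources and achieving cross-modal semantic linking.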
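The plot, object, and sub-object hierarchy can be sketched as a small tree with a depth-first traversal. The `Node` class and `walk` function are hypothetical names chosen for this illustration. The sample labels are taken from the Nine-colored Deer indexing experiment reported in this paper, and the tree is a simplification rather than the data model of any particular indexing tool.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Node:
    """One semantic unit: a plot, an object, a sub-object, or an attribute."""
    label: str
    kind: str
    children: List["Node"] = field(default_factory=list)


def walk(node: Node, depth: int = 0) -> Iterator[Tuple[int, str, str]]:
    """Yield (depth, kind, label) for the node and all descendants, parents first."""
    yield depth, node.kind, node.label
    for child in node.children:
        yield from walk(child, depth + 1)


# One plot with two objects. The second object carries one sub-object.
plot = Node("Gratitude of the Drowning Man", "plot", [
    Node("Nine-colored Deer", "object"),
    Node("Drowning man Devadatta", "object", [Node("Scarf", "sub-object")]),
])

# Print the hierarchy as an indented outline, one level per indent step.
for depth, kind, label in walk(plot):
    print("  " * depth + f"[{kind}] {label}")
```

A parents-first traversal matches the level-by-level browsing described above: a reader sees the plot before its objects, and each object before its sub-objects.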

Deep Semantic Indexing Experiment on Dunhuang Mural Images

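The Mogao Caves at Dunhuang were first built under the Former Qin in the Sixteen Kingdoms period and, through construction over many dynasties, grew into a large complex of cave art combining architecture, painted sculpture, and murals. Today 735 caves, 45,000 square meters of murals, and 2,415 painted sculptures survive, making it the largest and richest surviving site of Buddhist art in the world (Dunhuang Tourism Network, 2016). In 1987 the Mogao Caves were inscribed as a World Cultural Heritage site. Murals are an important part of Dunhuang art and are well suited to expressing rich content and complex scenes. Built on the traditional art of the Han and Jin periods and absorbing Buddhist art from abroad, they present a distinctive artistic style. Dunhuang murals include types with relatively simple content, such as paintings of venerated figures, donor portraits, and decorative patterns, as well as types with complex scenes, such as Jataka tales, karmic stories, and stories of the Buddha's life. With their long history and profound thought, the murals are a treasure of humanity's cultural heritage and a marvel in the history of human culture, with very high value for Buddhist scholarship, artistic appreciation, and public dissemination.

With the development of cultural heritage digitization, the Dunhuang Academy has digitized the murals and built up a massive collection of digital mural images. To better reveal, publish, mine, and share the semantic information contained in the murals, the digital images need deep semantic organization. Guided by the idea of deep semantic indexing, we used the Synaptica image indexing software to carry out an indexing experiment on the digital image of the Nine-colored Deer King Jataka (《九色鹿王本生圖》).

Narrative Hierarchy of the Nine-colored Deer King Jataka

Figure 5. Indexing result for the Nine-colored Deer King Jataka

The Nine-colored Deer King Jataka is a Jataka story painting. Jataka stories depict the good deeds of Sakyamuni in his previous lives as a bodhisattva (Chen & He, 2007). The experiment proceeded in three steps. First, authoritative texts of the story were searched for and collected. After these were consolidated, objects in the image were identified, the entities were given a basic classification, and the indexing granularity was decided. Second, the plot of the story in the texts was divided into episodes and converted into a tree structure, and the image was pre-indexed. Finally, the image was formally indexed with the Synaptica software.

Figure 6. Plot sequence of the Nine-colored Deer King Jataka

The indexing information is organized by story plot. The first level is the continuous sequence of plots, as shown in Figure 6. The second level covers the objects, actions, scenes, and so on in each plot. The third level covers the various sub-objects. During indexing, the three levels were distinguished by red, yellow, and green respectively.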


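Figure 7. The "Gratitude of the Drowning Man" (溺人拜恩) plot and its description

In the indexing of the "Gratitude of the Drowning Man" plot, "Gratitude of the Drowning Man" is the specific plot of the indexed region. Below it are two objects: the "Nine-colored Deer" (mythical animal) and the "Drowning man Devadatta" (person). The "Drowning man Devadatta" level also contains the sub-object "scarf" (clothing), whose color changes from malachite green to black as the plot advances and the character's nature changes. The three-level tree of indexing information in the navigation view reveals the narrative process contained in the image and aligns the indexing information with its narrative meaning.

Encoding of Indexing Information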

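In the Synaptica system, every indexed image follows the Open Annotation (OA) framework, and so does each indexed object within an image. The system uses skos:broader within oa:semanticTag to build the hierarchy of indexing information. This paper takes the "Warrior" (勇士) under the plot "The crowd hunts the deer" (眾人捕鹿) as an example. Part of the XML encoding is shown in Table 1.

Table 1. XML encoding of "Warrior"

Indexing information hierarchy:
    06. 眾人捕鹿 (The crowd hunts the deer)
        06.01 勇士 (Warrior)
        06.02 馬車 (Horse-drawn carriage)
        06.03 溺人調達 (Drowning man Devadatta)
            06.03.01 手指九色鹿 (Pointing at the Nine-colored Deer)
            06.03.02 爛瘡 (Festering sores)
    ……

XML representation:
    <https://dunhuang.linkedcanvas.com/poi/itkwzrzoz78#bodyLabel> a <http://www.w3.org/2011/content#ContentAsText> ;
        <http://www.w3.org/2011/content#chars> "勇士" .
    <https://dunhuang.linkedcanvas.com/poi/itkwzrzoz78#bodyTag> a <http://www.w3.org/ns/oa#semanticTag> ;
        <http://www.w3.org/2004/02/skos/core#broader> <https://dunhuang.linkedcanvas.com/poi/itkxnrfoktb> ;    (broader concept "眾人捕鹿")
        <http://xmlns.com/foaf/0.1/depicts> <https://dunhuang.linkedcanvas.com/concept/itkawkmy59239> .
    ……
    <https://dunhuang.linkedcanvas.com/poi/itkwzrzoz78#selector>
        <http://schema.synaptica.com/oasis#captionPosition> "8" ;
        <http://schema.synaptica.com/oasis#lineColor> "#FFFF00" ;    (indexing box color)
        <http://schema.synaptica.com/oasis#media> "polygon" ;    (indexing box shape)
        <http://schema.synaptica.com/oasis#selectorID> "inzgi30zmx8" ;    (indexing box ID)
        <http://schema.synaptica.com/oasis#textColor> "#FFFF00" ;    (indexing text color)
        ……
        a <http://www.w3.org/ns/oa#Selector>

Normalization of Indexing Terms

Beyond defining the content and structure of deep semantic indexing for narrative images, the indexing vocabulary itself needs to be normalized. Drawing on the Dunhuang Literature Indexing Thesaurus (《敦煌學文獻檢索主題詞表》), we propose a preliminary indexing vocabulary for Dunhuang murals and encode it in the Simple Knowledge Organization System (SKOS), which improves the vocabulary's interoperability. A fragment of the encoding is shown in Table 2.

Table 2. SKOS fragment of the Dunhuang mural indexing vocabulary

    釋迦牟尼佛 (Sakyamuni Buddha)
        <skos:prefLabel>釋迦牟尼佛</skos:prefLabel>
        <skos:notation>01010101</skos:notation>
    蓮花藻井圖案 (Lotus caisson pattern)
        <skos:prefLabel>蓮花藻井圖案</skos:prefLabel>
        <skos:notation>010801</skos:notation>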
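The numbered hierarchy in Table 1 (06, 06.01, 06.03.01, and so on) suggests one simple way to derive skos:broader links: drop the last dotted segment of a code to find its parent. The sketch below does this with the standard library only. It is an assumption about how such links could be generated, not a description of how the indexing software builds them, and it uses English glosses as stand-ins where a real system would use concept URIs.

```python
from typing import Dict, List, Optional, Tuple

SKOS_BROADER = "http://www.w3.org/2004/02/skos/core#broader"


def parent_code(code: str) -> Optional[str]:
    """Return the code one level up, or None for a top-level code."""
    head, sep, _ = code.rpartition(".")
    return head if sep else None


def broader_triples(entries: Dict[str, str]) -> List[Tuple[str, str, str]]:
    """Build (child, skos:broader, parent) triples from code -> label entries."""
    triples = []
    for code, label in entries.items():
        parent = parent_code(code)
        if parent is not None and parent in entries:
            triples.append((label, SKOS_BROADER, entries[parent]))
    return triples


# Codes follow Table 1. Labels are English glosses used only for this example.
ENTRIES = {
    "06": "The crowd hunts the deer",
    "06.01": "Warrior",
    "06.02": "Horse-drawn carriage",
    "06.03": "Drowning man Devadatta",
    "06.03.01": "Pointing at the Nine-colored Deer",
    "06.03.02": "Festering sores",
}

for child, predicate, parent in broader_triples(ENTRIES):
    print(f"{child!r} --broader--> {parent!r}")
```

Deriving the parent from the code keeps the hierarchy and the numbering in step, at the price of requiring that codes be assigned consistently.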

Discussion

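Granularity and Depth of Semantic Indexing

According to the depth of an image's semantic content, it can be divided into several semantic levels, such as the object level, the object-space level, the scene level, the behavior/activity level, and the emotion level (Wang et al., 2014), and deep semantic indexing guided by this idea is feasible. In the experiment on Dunhuang mural images, we found the granularity of indexing, that is, the level to which indexed objects are refined, hard to determine. For the indexed object "Drowning man Devadatta", for example, it is hard to decide whether his scarf needs to be indexed. We hold that granularity should depend on the goals and constraints of the actual project: where conditions permit, finer is better, although finer granularity costs more. As for depth, whether to index the emotional information of a specific object requires careful consideration. Many earlier studies suggest indexing only the first two levels of iconology. Semantic content at the third level need not be indexed unless generally accepted research results are available, so as not to cause misunderstanding or interfere with later iconological research.

Linked Integration of Semantic Indexing Information

Deep semantic indexing of cultural heritage images is the basis for semantic linking and integrated presentation of data resources. Cultural heritage images often carry strong symbolic meaning. Across different times, places, and artifacts, images are rendered differently, yet the symbolic meaning they convey may be similar or identical, and may even share a common origin or be derived one from another. Different indexers indexing such different but closely related images will produce different indexing datasets. To better support research and knowledge discovery, these resources need to be shared, linked, integrated, and even made interoperable, in order to reveal cognitive differences across regions, languages, and cultures. The resources to be linked and integrated may even be of different modalities, for example images with documents and classical texts, or images with video and audio. Deep semantic indexing is the basis of cross-modal semantic linking and the key to semantic organization of cultural heritage information resources.

Publication and Presentation of Semantic Indexing Information

To share humanity's cultural heritage information resources, more and more institutions are publishing cultural heritage data as linked data. In 2014 the J. Paul Getty Trust (Getty) in the United States published the Art & Architecture Thesaurus (AAT) as linked data. As deep semantic indexing of cultural heritage images continues to deepen, large numbers of semantic indexing datasets can be expected. They are a new kind of semantic publication that accompanies images, and also a kind of intermediary document that supports the discovery of fine-grained image information. To better meet people's needs for image retrieval, browsing, and learning, this indexing information must be published and presented together with the images themselves. How to innovate in the forms of publication and presentation is a key topic for future research on cultural heritage information resources.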

Conclusion

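This paper extracted and summarized the semantic features of cultural heritage images with respect to their semantic units and complex relationships, proposed an ontology model and a layered information model for cultural heritage images, and then, taking the Dunhuang mural Nine-colored Deer King Jataka (《九色鹿王本生圖》) as an example, built a deep semantic indexing structure for narrative images and carried out an indexing experiment, yielding a new method of deep semantic indexing of images. Deep semantic indexing is an innovation not only in indexing method but also in indexing theory. It brings Panofsky's iconological framework into the indexing process and, on the basis of a defined semantic structure of images, renews both the method and the process of indexing. With deep semantic indexing, the goal of image indexing no longer stops at the subject of the image (aboutness) but extends to the content of the image (ofness). This change places deep semantic indexing at the intersection of iconology and information organization theory, which has far-reaching significance for image information organization.

Deep semantic indexing of images fits the trend of digital humanities research. Indexing datasets produced by the public or by experts give good support to knowledge discovery in images and are an important part of the infrastructure for digital humanities research. Our preliminary experiment shows that deep semantic indexing of images is feasible, but the challenge is the high cost of manual indexing. To speed up deep semantic indexing, deep learning techniques should be tried in the indexing process, and this will be a focus of our future work.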


Acknowledgments

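This work was supported by the "New Century Excellent Talents" fund of the Ministry of Education of China and by the major project fund of the Key Research Institutes of Humanities and Social Sciences. We sincerely thank Mr. David Clarke for providing OASIS, the image indexing software under development, and Professor Lei Zeng (曾蕾) for her inspiring scholarly comments.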

References

Armitage, L. H., & Enser, P. G. B. (1997). Analysis of user need in image archives. Journal of Information Science, 23(4), 287-299. doi: 10.1177/016555159702300403

Clarke, D. (2015, July). Deep image annotation. In Philip Carlisle (Chair), Making a Difference in Knowledge Organization. Symposium conducted at the meeting of International Society for Knowledge Organization, London, UK. Retrieved from http://www.iskouk.org/sites/default/files/ClarkePaperISKO-UK2015.pdf

Collins, K. (1998). Providing subject access to images: A study of user queries. The American Archivist, 61(1), 36-55. doi: 10.17723/aarc.61.1.b531vt5q0q620642

Europeana Foundation. (2009). Europeana Collection. Retrieved from http://www.europeana.eu/portal/en

Hernández, F., Rodrigo, L., Contreras, J., Carbone, F., & Botín, F. M. (2008, September). Building a cultural heritage ontology for Cantabria. In Nicholas Crofts (Chair), The Digital Curation of Cultural Heritage. Symposium conducted at the meeting of the International Documentation Committee of the International Council of Museums, Athens, Greece. Retrieved from http://network.icom.museum/fileadmin/user_upload/minisites/cidoc/ConferencePapers/2008/64_papers.pdf

Huang, X., Soergel, D., & Klavans, J. L. (2015). Modeling and analyzing the topicality of art images. Journal of the Association for Information Science & Technology, 66(8), 1616-1644. doi: 10.1002/asi.23281

Hyer, M. (2008). Deep Indexing: Harnessing the Power of Data Discovery. Retrieved from https://www.osti.gov/hoMe/system/files/07-08-08_CENDI_ProQuest_Hyer.pdf

Jorgensen, C. (1995). Image attributes: An investigation (Unpublished doctoral dissertation). Syracuse University, New York.

Kakali, C., Lourdi, I., Stasinopoulou, T., Bountouri, L., Papatheodorou, C., Doerr, M., & Gergatsoulis, M. (2007, August). Integrating Dublin Core metadata for cultural heritage collections using ontologies. DC-2007--Singapore Proceedings, 128-139.

Lagoze, C., & Hunter, J. (2001, October). The ABC Ontology and Model. DC-2001--Tokyo Proceedings, 160-176.

Layne, S. S. (1994). Some issues in the indexing of images. Journal of the American Society for Information Science, 45(8), 583–588. doi: 10.1002/(SICI)1097-4571(199409)45:8<583::AID-ASI13>3.0.CO;2-N

Neugebauer, T. (2005). Image Indexing. Photography Media Journal. Retrieved from http://www.phtographymedia.com/article.php?page=All&article=AImageIndexing

Panofsky, E. (1939). Studies in Iconology: Humanistic Themes in the Art of the Renaissance. New York: Oxford University Press.

Rafferty, P., & Albinfalah, F. (2014). A tale of two images: The quest to create a story-based image indexing system. Journal of Documentation, 70(4), 605-621. doi: 10.1108/JD-10-2012-0130


Rose, G. (2012). Visual Methodologies: An Introduction to Researching with Visual Materials. United Kingdom: Sage.

Tousch, A. M., Herbin, S., & Audibert, J. Y. (2012). Semantic hierarchies for image annotation: A survey. Pattern Recognition, 45(1), 333-345. doi: 10.1016/j.patcog.2011.05.017

Zeng, M. L., Gracy, K. F., & Žumer, M. (2014, May). Using a Semantic Analysis Tool to Generate Subject Access Points: A Study Using Panofsky's Theory and Two Research Samples. Knowledge Organization, 41(6), 440-451. In Professor M. Hudon (Chair), Knowledge Organization in the 21st Century: Between Historical Patterns and Future Prospects. Symposium conducted at the meeting of International Society for Knowledge Organization, Krakow, Poland.

中國圖書館學會（2015）。2015 中國圖書館學會年會第 6 分會場：中國記憶專案資源共建共用，檢自： http://www.lsc.org.cn/c/cn/news/2015-12/29/news_8640.html

【Library Society of China (2015). 2015 Chinese Library the Sixth Session of Annual Conference. The Construction and Sharing of Chinese Memory Project Resources. Retrieved from http://www.lsc.org.cn/c/cn/news/2015-12/29/news_8640.html】

王曉光、徐雷、李綱（2014）。敦煌壁畫數位圖像語意描述方法研究。中國圖書館學報，40(1)，50-59。【Wang, Xiao-Guang, Xu, Lei, & Li, Gang (2014). Semantic Description Framework for Digital Dunhuang Mural Images. Journal of Library Science in China, 40(1), 50-59.】

夏立新、白陽、孫晶瓊（2016）。基於關聯標籤的非遺圖片資源主題發現研究。圖書情報工作，60(2)，22-29。【Xia, Li-Xin, Bai, Yang, & Sun, Jing-Qiong (2016). Topic extraction of intangible cultural heritage image resources based on associated labels. Library and Information Service, 60(2), 22-29.】

袁莉、張曉林（2001）。數位圖像的元數據格式。大學圖書館學報，19(2)，27-30。【Yuan, Li, & Zhang, Xiao-Lin (2001). Metadata Formats of Digital Images. Journal of Academic Libraries, 19(2), 27-30.】

馬費成、宋恩梅（2011）。資訊管理學基礎。中國湖北省：武漢大學出版社。

【Ma, Fei-Cheng, & Song, En-Mei (2011). Foundation of Information Management. Hubei, China: Press of Wuhan University.】

陳淑君、陳雪華（2015）。中國藝術領域的中英控制詞表語意對應。圖書資訊學刊，13(2)，161-208。【Shu-Jiun Chen, & Hsueh-Hua Chen (2015). Lexical-semantic Mapping between Chinese and English Controlled Vocabularies in the Domain of Chinese Art. Journal of Library and Information Studies, 13(2), 161-208.】

陳鈺、何家蓉（編著）（2007）。敦煌壁畫故事大觀。中國甘肅省：甘肅人民美術出版社。

【Chen, Jue, & He, Jia-Rong (eds.) (2007). Dunhuang bihua gushi daguan. Gansu, China: Gansu People's Fine Arts Publishing House.】

黃永、陸偉、程齊凱、鄧勝利（2016）。非物質文化遺產知識本體構建系統的設計與實現--以西藏“鍋莊”、“堆諧”為例。西藏民族學院學報：哲學社會科學版，37(1)，20-26。

【Huang, Yong, Lu, Wei, Cheng, Qi-Kai, & Deng, Sheng-Li (2016). Feiwuzhi wenhua yichan zhishi benti goujian xitong de sheji yu shixian – yi Xizang “guozhuang”, “duixie” weili. Journal of Xizang Minzu University (Philosophy and Social Sciences Edition), 37(1), 20-26.】

黃永林、談國新（2012）。中國非物質文化遺產數位化保護與開發研究。華中師範大學學報（人文社會科學版），51(2)，49-55。


【Huang, Yong-Lin, & Tan, Guo-Xin (2012). Zhongguo feiwuzhi wenhua yichan shuzihua baohu yu kaifa yanjiu. Journal of Huazhong Normal University (Humanities and Social Sciences), 51(2), 49-55.】

黃崑、王珊珊、耿騫（2015）。國內圖像元數據應用研究現狀與分析。國家圖書館學刊，24(4)，60-66。【Huang, Kun, Wang, Shan-Shan, & Geng, Qian (2015). Analysis on the Status of Domestic Image Metadata Application Study. Journal of the National Library of China, 24(4), 60-66.】

馮項雲、肖瓏、廖三三、莊紀林（2001）。國外常用元數據標準比較研究。大學圖書館學報，19(4)， 15-21。

【Feng, Xiang-Yun, Xiao, Long, Liao, San-San, & Zhuang, Ji-Lin（2001）. A comparative study of commonly used metadata formats in abroad. Journal of Academic Libraries, 19(4), 15-21.】

程齊凱、周耀林、戴暘（2011）。論基於本體的非物質文化遺產分類組織方法。信息資源管理學報，01(3)，78-83。

【Cheng, Qi-Kai, Zhou, Yao-Lin, & Dai, Yang (2011). Classification and Organization of Intangible Cultural Heritage: A Method Based on Ontology Tool. Journal of Information Resources Management, 01(3), 78-83.】

董坤（2014）。非物質文化遺產本體構建與語意化組織研究。數字圖書館論壇，(10)，40-45。【Dong, Kun (2014). Fei wuzhi wenhua yichan benti goujian yu yuyihua zuzhi yanjiu. Digital Library Forum, (10), 40-45.】

敦煌市旅遊局（2014）。敦煌旅遊官網，檢自：http://www.dhcn.gov.cn/

【Dunhuang Tourism Administration (2014). Tunhuang luyu kuanwang. Retrieved from http://www.dhcn.gov.cn/.】

翟姍姍（2015）。基於關聯數據的非物質文化遺產資源聚合研究。中國北京市：科學出版社。

【Zhai, Shan-Shan（2015）. Jiyu guanlian shuju de feiwuzhi wenhua yichan ziyuan juhe yanjiu. Beijing, China: Science Press.】

蘇新寧（2014）。面向知識服務的知識組織理論與方法。中國北京市：科學出版社。

【Su, Xin-Ning (2014). Knowledge Organization Theories and Methods for Knowledge Service. Beijing, China: Science Press.】

龍迪勇（2007）。圖像敘事：空間的時間化。江西社會科學，9，39-53。
【Long, Di-Yong (2007). Tuxiang xushi: Kongjian de shijianhua. Jiangxi Social Sciences, 9, 39-53.】

龍迪勇（2008）。圖像敘事與文字敘事─故事畫中的圖像與文本。江西社會科學，3，28-43。
【Long, Di-Yong (2008). Tuxiang xushi yu wenzi xushi: gushihuazhong de tuxiang yu wenben. Jiangxi Social Sciences, 3, 28-43.】

Design and Implementation of Deep Semantic Indexing on Digital Cultural Heritage Images

Xiaoguang Wang
Professor, School of Information Management, Wuhan University, China
E-mail: [email protected]

Xuemei Liu
Master's Student, School of Information Management, Wuhan University, China
E-mail: [email protected]

Shengping Xia
Deputy Director, Information Center of Dunhuang Academy, China
E-mail: [email protected]

Keywords: Cultural Heritage; Deep Semantic Indexing; Dunhuang Mural; Information Modeling; Narrative Image

【Abstract】

With the rapid growth of cultural heritage image resources and the development of Digital Humanities research, Deep Semantic Indexing (DSI), which aims at semantic indexing of cultural heritage images, has gradually attracted attention. DSI can improve not only the efficiency of image retrieval and acquisition but also users' understanding of the images. It can support the integration of image resources and automatic knowledge discovery, and so has theoretical and practical significance. The study conducted a thorough analysis of the semantic features and themes of cultural heritage images and reviewed existing cultural heritage metadata models and ontologies. Based on an understanding of the concept of DSI and its basic requirements, we designed the workflow and technological process of DSI and constructed a semantic indexing model for cultural heritage images, comprising an inclusive concept model, a multi-layered information model, and a structural model of the indexing texts. We also conducted an indexing experiment on the Dunhuang mural "Nine-colored Deer". The DSI modeling of images reveals the semantic relationships among concepts, images, and text, mines the knowledge correlations between the information layers related to image indexing, and realizes the fine-grained organization of image information units. At the same time, the indexing experiment verified the feasibility and soundness of the DSI structure for cultural heritage images. The design and implementation of DSI for cultural heritage image information is an advancement of deep semantic indexing theory and image information organization theory. Decisions on the granularity and depth of image indexing should be based on the indexing conditions. The integration and publishing of DSI information will be studied further in the future.

【Long Abstract】

Background

The deep semantic indexing of cultural heritage images can improve the efficiency of image retrieval and access, enhance users' understanding of images, and support automatic integration of image resources and knowledge discovery. Although existing cultural heritage metadata standards, ontologies, and cataloging rules provide detailed descriptions of the external characteristics of images, their descriptions of image connotations are insufficient. Because rules for such descriptions are lacking, it is difficult to reveal the profound semantic connotations of cultural heritage images, which hinders fine-grained resource aggregation and knowledge discovery for cultural heritage.

Methodology

After summarizing the existing metadata standards and ontologies related to cultural heritage, this study proposes a basic model of deep semantic indexing (DSI) and designs a macroscopic concept model and a structured expression model for the indexing information. A DSI experiment on the image of the Dunhuang mural "Nine-colored Deer" is also conducted.

The purpose of DSI is to represent the abundant meanings contained in images, using controlled schemas, thesaurus, ontologies, free text, and network resources. The DSI has several features: (1) the use of precise terms and natural language to reveal the information contained in images; (2) the application of structuralism; (3) the use of hierarchical structure to organize information; (4) its core is to reveal the content of the first two hierarchies of iconography while leaving the explanation of the third hierarchy as optional operation.


Basic Process of Deep Semantic Indexing

The basic process of DSI involves five steps, including the acquisition of images and image-associated texts, analyzing the image semantic units, normalizing the semantic indexing strings, organizing and publishing the indexed information.

Figure 1 Process of Deep Semantic Indexing

Results

Reference model of cultural heritage images and related concepts

This study develops a reference model for deep semantic indexing, using the Dunhuang mural “Nine-colored Deer” as an example, to reveal the concepts and their relations related to cultural heritage images. As a special entity, Nine-colored Deer is an instance of a mythical animal, as shown in Figure 2.

[Figure 1 flowchart elements: digital images; associated text; thesaurus; ontologies; analyzing semantic units; normalizing indexing terms; organizing indexed information; publishing indexed information]


Figure 2 A concept map of the Dunhuang mural images of cultural heritage symbols

A hierarchical model of semantic information contained in cultural heritage images

The model comprises four levels. (1) Level of image: the original mural image is described, using Categories for the Description of Works of Art (CDWA) as the basic framework for the external information of the image. (2) Level of elements: the various objects are separated and extracted from the image following an annotation schema, and the International Image Interoperability Framework (IIIF) is used on this level. (3) Level of index information: with Panofsky's theory as the theoretical basis, each object or activity is given a concept. (4) Level of semantic organization: the concepts and information extracted at the indexing level are presented in a structured model instead of a narrative, to explicitly reveal the story reflected in the image content. Together the four levels enable the semantic revelation and information organization of the image.


Figure 3 Hierarchical Model of DSI

The mural of “Nine-colored Deer” is a narrative image that involves several scenes. To organize the DSI information, a schema is proposed, as shown in Figure 4.

Figure 4 The Schema of DSI Information

In this schema, plot is the key unit. The DSI information is organized according to the plots, the contained objects and their properties.


Figure 7 Plot of “Gratitude Expression of Drowning Man” and Corresponding Description

The plot of “gratitude expression of drowning man”, for instance, is the specific plot of the indexed area, and it involves two objects: the “Nine-colored Deer” and the “drowning man Devadatta”. The indexing information is organized into a tree, which reveals the semantic structure of the image.

Normative Control of Terms Used in Indexing

On the basis of the “Dunhuang Literature Indexing Thesaurus,” we propose a preliminary Dunhuang mural image thesaurus and encode it with the Simple Knowledge Organization System (SKOS).

Conclusion

Granularity and Depth of Semantic Indexing

Many existing studies suggest that indexing be performed only at the first two levels of iconography and that semantic content involving the third level be left out, unless a publicly acknowledged study is available for reference, to avoid causing misunderstanding and interfering with later iconographic studies.

Associative Integration of Indexed Semantic Information

Different indexed data sets may be produced when individual indexers index a set of different but closely associated images. In order to better support research and knowledge discovery, it is necessary to achieve the sharing, associative integration, and even interoperability of the datasets so as to reveal the cognitive differences in different areas, languages, and cultures.


Release and Exhibition of Indexing Semantic Information

In order to meet the need for image retrieval, browsing, and learning, it is necessary to release and exhibit the indexing information along with the image itself. In this regard, the development of innovative forms of release and exhibition is the focus of future studies on cultural heritage information resources.