(1)

10/02/2014 Sphinx Lunch

DERIVING LOCAL RELATIONAL SURFACE FORMS FROM DEPENDENCY-BASED ENTITY EMBEDDINGS FOR UNSUPERVISED SPOKEN LANGUAGE UNDERSTANDING

YUN-NUNG (VIVIAN) CHEN

DILEK HAKKANI-TÜR & GOKHAN TUR

(2)

Outline

Introduction

◦ Main Idea

◦ Semantic Knowledge Graph

◦ Semantic Interpretation via Relations

Proposed Approach

◦ Relation Inference from Gazetteers

◦ Relational Surface Form Derivation

◦ Probabilistic Enrichment

◦ Bootstrapping

Experiments

Conclusions

(3)

Main Idea

Relation Detection for Unsupervised SLU

Spoken Language Understanding (SLU): converts automatic speech recognition (ASR) outputs into a pre-defined semantic output format

Relation: semantic interpretation of input utterances

◦ movie.release_date, movie.name, movie.directed_by, director.name

Unsupervised SLU: utilize external knowledge to help relation detection without labelled data

“when was james cameron’s avatar released”

Intent: FIND_RELEASE_DATE

Slot-Val: MOVIE_NAME=“avatar”, DIRECTOR_NAME=“james cameron”
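The interpretation above can be held in a small record type. This is only an illustrative sketch: the class name `SemanticFrame` and its field names are assumptions, and attaching all four relations listed on this slide to this one utterance is also an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class SemanticFrame:
    """Hypothetical container for the SLU output of one utterance."""
    utterance: str
    intent: str
    slots: dict = field(default_factory=dict)
    relations: list = field(default_factory=list)


frame = SemanticFrame(
    utterance="when was james cameron's avatar released",
    intent="FIND_RELEASE_DATE",
    slots={"MOVIE_NAME": "avatar", "DIRECTOR_NAME": "james cameron"},
    # Assumed to be the relations invoked by this utterance.
    relations=[
        "movie.release_date",
        "movie.name",
        "movie.directed_by",
        "director.name",
    ],
)
```

Relation detection, the task of this talk, amounts to predicting the `relations` field without labelled training data.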

(4)

Semantic Knowledge Graph

Priors for SLU

What are knowledge graphs?

◦ Graphs with

◦ strongly typed and uniquely identified entities (nodes)

◦ facts/literals connected by relations (edge)

Examples:

◦ Satori, Google KG, Facebook Open Graph, Freebase

How large?

◦ > 500M entities, > 1.5B relations, > 5B facts

How broad?

◦ Wikipedia-breadth: from “American Football” to “Zoos”

Slides of Larry Heck, Dilek Hakkani-Tur, and Gokhan Tur, Leveraging Knowledge Graphs for Web-Scale Unsupervised Semantic Parsing, in Proceedings of Interspeech, 2013.

(5)

Semantic Interpretation via Relations

Two Examples

Differentiate the two examples by including the originating node types in the relation.

User Utterance: find movies produced by james cameron

SPARQL Query (simplified): SELECT ?movie {?movie. ?movie.produced_by ?producer. ?producer.name "James Cameron".}

Logical Form: λx. ∃y. movie.produced_by(x, y) ∧ person.name(y, z) ∧ z = “James Cameron”

Relation: movie.produced_by, producer.name

User Utterance: who produced avatar

SPARQL Query (simplified): SELECT ?producer {?movie.name "Avatar". ?movie.produced_by ?producer.}

Logical Form: λy. ∃x. movie.produced_by(x, y) ∧ movie.name(x, z) ∧ z = “Avatar”

Relation: movie.name, movie.produced_by

[Figure for each example: a MOVIE node and a PERSON node, with edges labeled produced_by and name.]
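To make the difference between the two relation sets concrete, the two queries can be run over a toy triple store. This is a minimal sketch: the node ids (`m1`, `p1`, ...), the helper names and the triples themselves are illustrative assumptions, not data from the talk.

```python
# Toy knowledge graph as (subject, relation, object) triples. Illustrative only.
TRIPLES = [
    ("m1", "movie.name", "Avatar"),
    ("m1", "movie.produced_by", "p1"),
    ("m2", "movie.name", "Titanic"),
    ("m2", "movie.produced_by", "p1"),
    ("m3", "movie.name", "Some Other Film"),
    ("m3", "movie.produced_by", "p2"),
    ("p1", "producer.name", "James Cameron"),
    ("p2", "producer.name", "Someone Else"),
]


def subjects(relation, obj):
    """All subjects s such that (s, relation, obj) is in the graph."""
    return [s for s, r, o in TRIPLES if r == relation and o == obj]


def objects(subject, relation):
    """All objects o such that (subject, relation, o) is in the graph."""
    return [o for s, r, o in TRIPLES if s == subject and r == relation]


def movies_produced_by(name):
    """'find movies produced by X': uses producer.name and movie.produced_by."""
    producers = subjects("producer.name", name)
    return sorted(m for p in producers for m in subjects("movie.produced_by", p))


def producers_of(title):
    """'who produced X': uses movie.name and movie.produced_by."""
    movies = subjects("movie.name", title)
    return sorted(p for m in movies for p in objects(m, "movie.produced_by"))
```

Both functions traverse the same produced_by edge; they differ only in which name relation anchors the search, which is why the originating node type is kept in the relation label.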

(6)

Proposed Framework

[Figure: framework overview. Input utterance: “find me some films directed by james cameron”.
Background Knowledge: Knowledge Graph, Entity Dict., Relation Inference from Gazetteers.
Local Relational Surface Form: Bing Query Snippets, Knowledge Graph, Entity Embeddings, Relational Surface Form Derivation (Entity Surface Forms, Entity Syntactic Contexts).
Word-level relation probabilities: P_F(r | w), P_C(r | w), P_E(r | w).
Probabilistic Enrichment yields R_u(r); Bootstrapping (Relabel) yields the Final Results.]

(8)

Relation Inference from Gazetteers

Gazetteers (entity lists)

[Figure: the entity “james cameron” is looked up in the gazetteers and matches the types director and producer. The figure carries the label “#movies James Cameron directed” and shows the inferred relations movie.directed_by and director.name.]

Dilek Hakkani-Tur, Asli Celikyilmaz, Larry Heck, and Gokhan Tur, Probabilistic enrichment of knowledge graph entities for relation detection in conversational understanding, in Proceedings of Interspeech, 2014.
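The gazetteer lookup can be sketched as follows. Everything concrete here is an assumption made for illustration: the counts, the type-to-relation mapping, and the choice to normalize counts into weights. The slide only indicates that the entity matches several types and that a movie count is involved.

```python
# Hypothetical counts of KG entries listing the entity under each type.
GAZETTEER_COUNTS = {
    "james cameron": {"director": 8, "producer": 12},
}

# Hypothetical mapping from an entity type to the relations it suggests.
TYPE_TO_RELATIONS = {
    "director": ["movie.directed_by", "director.name"],
    "producer": ["movie.produced_by", "producer.name"],
}


def relation_weights(entity):
    """Turn type counts into relation weights (assumed: simple normalization)."""
    counts = GAZETTEER_COUNTS.get(entity, {})
    total = sum(counts.values())
    weights = {}
    for entity_type, count in counts.items():
        for relation in TYPE_TO_RELATIONS[entity_type]:
            weights[relation] = count / total
    return weights


WEIGHTS = relation_weights("james cameron")
```

With these toy counts the producer reading outweighs the director reading, which shows why gazetteers alone can mislead on an utterance such as “films directed by james cameron” and why local cues are added in the following slides.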

(10)

Relational Surface Form Derivation

Web Resource Mining

Bing query snippets including entity pairs connected with specific relations in KG

Dependency Parsing

Avatar is a 2009 American epic science fiction film directed by James Cameron.

[Figure: dependency parse of the sentence above, with the KG relation directed_by linking the entity pair, “Avatar” replaced by $movie and “James Cameron” replaced by $director. Arc labels shown: nsub, num, cop, det, nn, vmod, prep_by; before collapsing the preposition: prep, pobj.]

(11)

Relational Surface Form Derivation

Dependency-Based Entity Embeddings

1) Word & Context Extraction

Word                                Contexts
$movie                              film/nsub-1
is                                  film/cop-1
a                                   film/det-1
2009                                film/num-1
american, epic, science, fiction    film/nn-1
film                                $movie/nsub, is/cop, a/det, 2009/num, american/nn, epic/nn, science/nn, fiction/nn, directed/vmod
directed                            $director/prep_by
$director                           directed/prep_by-1

[Figure: dependency parse of the example sentence “Avatar is a 2009 American epic science fiction film directed by James Cameron”, with the entities replaced by $movie and $director.]
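The extraction step can be sketched in a few lines. The arc list below is transcribed by hand from the example sentence, and the function name is an assumption. The sketch follows the usual convention for dependency-based embeddings: every arc gives the head a `modifier/label` context and the modifier a `head/label-1` context, so `directed` also receives `film/vmod-1`, which the slide's table does not list.

```python
from collections import defaultdict

# (head, label, modifier) arcs for the example sentence, entities replaced by tags.
ARCS = [
    ("film", "nsub", "$movie"),
    ("film", "cop", "is"),
    ("film", "det", "a"),
    ("film", "num", "2009"),
    ("film", "nn", "american"),
    ("film", "nn", "epic"),
    ("film", "nn", "science"),
    ("film", "nn", "fiction"),
    ("film", "vmod", "directed"),
    ("directed", "prep_by", "$director"),
]


def extract_contexts(arcs):
    """Map each word to its list of dependency contexts."""
    contexts = defaultdict(list)
    for head, label, modifier in arcs:
        contexts[head].append(f"{modifier}/{label}")       # head sees its modifier
        contexts[modifier].append(f"{head}/{label}-1")     # modifier sees its head (inverse)
    return dict(contexts)


CONTEXTS = extract_contexts(ARCS)
```

The resulting (word, context) pairs form the training data D used in the next step.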

(12)

Relational Surface Form Derivation

Dependency-Based Entity Embeddings

2) Training Process

Each word w is associated with a vector v_w, and each context c is represented as a vector v_c.

◦ Learn vector representations for both words and contexts such that the dot product v_w · v_c associated with good word-context pairs belonging to the training data D is maximized.

◦ Objective function: [equation image not preserved]
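The equation image on this slide was not preserved. As a hedged reconstruction, assuming the training follows the dependency-based skip-gram formulation of Levy and Goldberg (2014), which matches the description above (maximize the dot product of v_w and v_c over the good pairs in D), the objective would read:

```latex
\arg\max_{v_w,\, v_c} \; \sum_{(w,c) \in D} \log \frac{1}{1 + \exp(-\, v_c \cdot v_w)}
```

Here D, v_w and v_c are as defined on the slide; the logistic form is the assumption. In that formulation the sum is paired in practice with sampled negative word-context pairs, since otherwise the objective is trivially maximized by making all vectors identical and large.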

(13)

Relational Surface Form Derivation

Surface Form Derivation

Entity Surface Forms

◦ learn the surface forms corresponding to entities ($char, $director, etc.)

◦ based on word vector v_w → words with similar contexts

◦ $char: “character”, “role”, “who”

◦ $director: “director”, “filmmaker”

◦ $genre: “action”, “fiction”

Entity Syntactic Contexts

◦ learn the important contexts of entities

◦ based on context vector v_c → contexts frequently occurring together with the entity

◦ $char: “played”

◦ $director: “directed”
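The two derivations can be sketched with toy vectors. The 2-D vectors, the function names and the scoring choices (cosine similarity between word vectors for surface forms, dot product between the tag's word vector and each context vector for contexts) are assumptions chosen to mirror the description above, not the authors' implementation.

```python
import math

# Toy word vectors (v_w) and context vectors (v_c). Values are made up.
WORD_VECS = {
    "$director": (1.0, 0.0),
    "director": (0.9, 0.1),
    "filmmaker": (0.8, 0.3),
    "comedy": (0.0, 1.0),
    "drama": (0.1, 0.9),
}
CONTEXT_VECS = {
    "directed/prep_by-1": (2.0, 0.0),
    "played/nsub-1": (0.5, 0.5),
    "film/nn-1": (0.0, 2.0),
}


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return dot(a, b) / norm if norm else 0.0


def surface_forms(tag, k):
    """Words whose vectors are closest to the entity tag (similar contexts)."""
    scored = [(cosine(WORD_VECS[tag], v), w) for w, v in WORD_VECS.items() if w != tag]
    return [w for _, w in sorted(scored, reverse=True)[:k]]


def important_contexts(tag, k):
    """Contexts with the highest dot product against the tag's word vector."""
    scored = [(dot(WORD_VECS[tag], v), c) for c, v in CONTEXT_VECS.items()]
    return [c for _, c in sorted(scored, reverse=True)[:k]]
```

In this toy space `surface_forms("$director", 2)` picks up “director” and “filmmaker”, and `important_contexts("$director", 1)` picks up the inverse prep_by context of “directed”, matching the examples on the slide.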

(15)

Probabilistic Enrichment

Integrate relations from

◦ Prior knowledge

◦ Entity surface forms

◦ Entity syntactic contexts

Integrated Relations for Words by

Unweighted: combine all relations with binary values

Weighted: combine all relations and keep the highest weight for each relation

Highest Weighted: keep only the most probable relation of each word

Integrated Relations for Utterances by R_u(r) [equation image not preserved]

Dilek Hakkani-Tur, Asli Celikyilmaz, Larry Heck, and Gokhan Tur, Probabilistic enrichment of knowledge graph entities for relation detection in conversational understanding, in Proceedings of Interspeech, 2014.
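The three word-level integration schemes can be sketched as below. This is one reading of the slide's short descriptions: the function names, the dictionary representation and the example probabilities are assumptions. The utterance-level combination is left out because its formula is not preserved.

```python
def unweighted(sources):
    """Every relation proposed by any source gets the binary value 1.0."""
    return {r: 1.0 for src in sources for r, p in src.items() if p > 0}


def weighted(sources):
    """Every relation keeps the highest weight any source assigns to it."""
    merged = {}
    for src in sources:
        for relation, p in src.items():
            merged[relation] = max(p, merged.get(relation, 0.0))
    return merged


def highest_weighted(sources):
    """Only the single most probable relation of the word is kept."""
    merged = weighted(sources)
    if not merged:
        return {}
    best = max(merged, key=merged.get)
    return {best: merged[best]}


# Hypothetical relation distributions for one word from the three sources.
SOURCES = [
    {"movie.directed_by": 0.6, "movie.produced_by": 0.4},  # prior knowledge
    {"movie.directed_by": 0.8},                            # entity surface forms
    {},                                                    # entity syntactic contexts
]
```

The schemes trade recall for precision: `unweighted` keeps every candidate, while `highest_weighted` commits to one relation per word.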

(17)

Bootstrapping

Unsupervised Self-Training

Training a multi-label multi-class classifier estimating relations given an utterance

◦ Utterances with relation weights: R_u1(r), R_u2(r), R_u3(r), ...

◦ Pseudo labels for training, created by a threshold: u1: L_u1(r), u2: L_u2(r), u3: L_u3(r), ...

◦ Classifier: AdaBoost, an ensemble of M weak classifiers

◦ Output: probability distribution of relations
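The relabeling loop can be sketched independently of the classifier. In this sketch the threshold value, the iteration count and all function names are assumptions, and `memorizing_fit` is a trivial stand-in used only to make the loop runnable; it is not AdaBoost, which is what the slide names.

```python
def pseudo_labels(relation_weights, threshold):
    """Keep the relations whose weight reaches the threshold."""
    return {r for r, w in relation_weights.items() if w >= threshold}


def memorizing_fit(labeled):
    """Stand-in trainer (NOT AdaBoost): the returned predictor gives
    probability 1.0 to each pseudo label it saw for an utterance."""
    def predict(utterance):
        return {r: 1.0 for r in labeled.get(utterance, set())}
    return predict


def bootstrap(utterance_weights, fit, threshold=0.5, iterations=2):
    """Self-training: threshold weights into labels, train, relabel, repeat."""
    current = dict(utterance_weights)
    for _ in range(iterations):
        labeled = {u: pseudo_labels(w, threshold) for u, w in current.items()}
        predict = fit(labeled)
        current = {u: predict(u) for u in current}
    return current


# Hypothetical enriched relation weights R_u(r) for one utterance.
WEIGHTS = {
    "who produced avatar": {
        "movie.name": 0.9,
        "movie.produced_by": 0.7,
        "movie.directed_by": 0.2,
    },
}
RESULT = bootstrap(WEIGHTS, memorizing_fit)
```

Passing the trainer in as `fit` keeps the loop agnostic to the classifier, so a real multi-label learner that generalizes across utterances can replace the stand-in.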

(18)

Experiments of Relation Detection

Dataset

Knowledge Base: Freebase

◦ 670K entities

◦ 78 entity types (movie names, actors, etc)

Relation Detection Data

◦ Crowd-sourced utterances

◦ Manually annotated with SPARQL queries → relations

Query Statistics                           Dev      Test
% entity only                              8.9%     10.7%
% rel only w/ specified movie names        27.1%    27.5%
% rel only w/ specified other names        39.8%    39.6%
% more complicated relations               15.4%    14.7%
% not covered                              8.8%     7.6%
#utterances                                3338     1084

User Utterance: who produced avatar

Relation: movie.name, movie.produced_by

[Figure: a MOVIE node and a PERSON node, with edges labeled produced_by and name.]

(19)

Experiments of Relation Detection

Overall performance

Evaluation Metric: micro F-measure (%)

Approach                                     Unweighted         Weighted           Highest Weighted
                                             Ori    Bootstrap   Ori    Bootstrap   Ori    Bootstrap
Baseline
Gazetteer                                    35.21  36.91       37.93  40.10       36.08  38.89
Gazetteer + Weakly Supervised                25.07  37.39       39.04  39.07       39.40  39.98
Gazetteer + Entity Surface Form (Reg)        34.23  34.91       36.57  38.13       34.69  37.16
Proposed
Gazetteer + Entity Surface Form (Dep)        37.44  38.37       41.01  41.10       39.19  42.74
Gazetteer + Entity Context                   35.31  37.23       38.04  38.88       37.25  38.04
Gazetteer + Entity Surface Form + Context    37.66  38.64       40.29  41.98       40.07  43.34

◦ Words derived by dependency embeddings can successfully capture the surface forms of entity tags, while words derived by regular embeddings cannot.

◦ Words derived from entity contexts slightly improve performance.

◦ Combining all approaches performs best, while the major improvement comes from the derived entity surface forms.

◦ With the same information, learning surface forms from dependency-based embeddings performs better, because there is a mismatch between written and spoken language.

◦ Weighted methods perform better with fewer features, and highest weighted methods perform better with more features.
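For reference, the micro F-measure pools true positives, false positives and false negatives over all utterances before computing precision and recall. The sketch below shows the standard multi-label definition; the function name and the example label sets are illustrative, and the talk does not specify its scoring script.

```python
def micro_f1(gold, predicted):
    """Micro-averaged F-measure over parallel lists of label sets."""
    tp = fp = fn = 0
    for g, p in zip(gold, predicted):
        tp += len(g & p)   # relations correctly predicted
        fp += len(p - g)   # predicted but not in the annotation
        fn += len(g - p)   # annotated but missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical annotations and predictions for two utterances.
GOLD = [{"movie.name", "movie.produced_by"}, {"movie.directed_by"}]
PRED = [{"movie.name"}, {"movie.directed_by", "director.name"}]
SCORE = micro_f1(GOLD, PRED)
```

Here two of three predicted relations are correct and two of three annotated relations are found, so precision, recall and the F-measure are all 2/3.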

(25)

Experiments of Relation Detection

Entity Surface Forms Derived from Dependency Embeddings

The functional similarity carried by dependency-based entity embeddings effectively benefits the relation detection task.

Entity Tag    Derived Words
$character    character, role, who, girl, she, he, officier
$director     director, dir, filmmaker
$genre        comedy, drama, fantasy, cartoon, horror, sci
$language     language, spanish, english, german
$producer     producer, filmmaker, screenwriter

(26)

Experiments of Relation Detection

Effectiveness of Boosting

◦ The best result comes from the combination of all approaches, because probabilities from different resources can complement each other.

◦ Adding only entity surface forms performs similarly, showing that the major improvement comes from relational entity surface forms.

◦ Boosting significantly improves the performance of most approaches.

[Figure: F-measure (axis from 0.33 to 0.44) against boosting iteration (1 to 10) for six systems: Gaz.; Gaz. + Weakly Supervised; Gaz. + Entity Surface Form (BOW); Gaz. + Entity Surface Form (Dep); Gaz. + Entity Context; Gaz. + Entity Surface Form + Context.]

(27)

Conclusions

We propose an unsupervised approach to capture the relational surface forms, including entity surface forms and entity contexts, based on dependency-based entity embeddings.

The detected relations, viewed as local observations, can be integrated with background knowledge by probabilistic enrichment methods.

Experiments show that involving derived relational surface forms as local cues together with prior knowledge can significantly improve the relation detection task and help open-domain SLU.

(28)

Q & A 

THANKS FOR YOUR ATTENTION!
