(1)

Unsupervised Spoken Language Understanding in Dialogue Systems

YUN-NUNG (VIVIAN) CHEN 陳縕儂, CARNEGIE MELLON UNIVERSITY

2015/01/16 @NCTU  http://vivianchen.idv.tw

1 UNSUPERVISED SPOKEN LANGUAGE UNDERSTANDING IN DIALOGUE SYSTEMS

(2)

Outline

Introduction

Unsupervised Slot Induction [Chen et al., ASRU’13 & Chen et al., SLT’14]

Unsupervised Domain Exploration [Chen and Rudnicky, SLT’14]

Unsupervised Relation Detection [Chen et al., SLT’14]

Conclusions & Future Work


(3)

Outline

Introduction

Unsupervised Slot Induction [Chen et al., ASRU’13 & Chen et al., SLT’14]

Unsupervised Domain Exploration [Chen and Rudnicky, SLT’14]

Unsupervised Relation Detection [Chen et al., SLT’14]

Conclusions & Future Work


(4)

Spoken Language Understanding (SLU)

SLU in dialogue systems

SLU maps natural language inputs to semantic forms:

“I would like to go to NCTU on Friday.” → location: NCTU, date: Friday

◦ Semantic frames, slots, and values

◦ are often manually defined by domain experts or developers.

What are the problems?


(5)

Problems with Predefined Information

Generalization: may not generalize to real-world users.

Bias propagation: can bias subsequent data collection and annotation.

Maintenance: when new data comes in, developers need to start a new round of annotation to analyze the data and update the grammar.

Efficiency: time-consuming and costly.


Can we automatically induce semantic information w/o annotations?


(6)

Outline

Introduction

Unsupervised Slot Induction [Chen et al., ASRU’13 & Chen et al., SLT’14]

Unsupervised Domain Exploration [Chen and Rudnicky, SLT’14]

Unsupervised Relation Detection [Chen et al., SLT’14]

Conclusions & Future Work


(7)

Unsupervised Slot Induction


Motivation

◦ Spoken dialogue systems (SDS) require predefined semantic slots to parse users’ input into semantic representations

Frame semantics theory provides generic semantics

Distributional semantics capture contextual latent semantics


(8)

Probabilistic Frame-Semantic Parsing


FrameNet [Baker et al., 1998]

◦ a linguistically-principled semantic resource, based on the frame-semantics theory.

◦ “low fat milk” → “milk” evokes the “food” frame; “low fat” fills the “descriptor” frame element

Frame (food): contains words referring to items of food.

Frame Element (descriptor): indicates a characteristic of the food.

SEMAFOR [Das et al., 2010; 2013]

◦ a state-of-the-art frame-semantics parser, trained on manually annotated FrameNet sentences


(9)

Step 1: Frame-Semantic Parsing for ASR outputs

can i have a cheap restaurant

Frame: capability (FT LU: can; FE LU: i) → Bad! (generic concept)

Frame: expensiveness (FT LU: cheap) → Good!

Frame: locale_by_use (FT/FE LU: restaurant) → Good!

Task: adapting generic frames to task-specific settings for SDSs


(10)

Step 2: Slot Ranking Model

Main Idea

◦ Rank domain-specific concepts higher than generic semantic concepts

can i have a cheap restaurant

Frame: capability (FT LU: can; FE LU: i)
Frame: expensiveness (FT LU: cheap)
Frame: locale_by_use (FT/FE LU: restaurant)

(Each induced frame is a slot candidate; its lexical-unit words are the slot fillers.)


(11)

Step 2: Slot Ranking Model


Rank the slot candidates by integrating two scores

the frequency of the slot candidate in the SEMAFOR-parsed corpus

the coherence of slot fillers

slots with higher frequency may be more important

domain-specific concepts should focus on fewer topics and be similar to each other

Example: the slot "quantity" (fillers: a, one, all, three) has lower coherence in topic space; the slot "expensiveness" (fillers: cheap, expensive, inexpensive) has higher coherence in topic space.


(12)

Step 2: Slot Ranking Model

Measure coherence by the pairwise similarity of slot fillers

◦ For each slot candidate s_i, compute a coherence score h(s_i) over its slot fillers.

Example: slot candidate "expensiveness"; corresponding slot fillers: "cheap", "not expensive".

The slot with higher h(s_i) usually focuses on fewer, more specific topics, which is preferable for slots of an SDS.

(13)

Step 2: Slot Ranking Model

How to define the vector for each slot filler?

◦ Run clustering and then build vectors based on clustering results

◦ K-means, spectral clustering, etc.

◦ Use distributional semantics to transform words into vectors

◦ LSA, PLSA, neural word embeddings (word2vec)

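The two-score ranking described above can be sketched in a few lines. This is a toy sketch, not the papers' exact model: the hand-made 2-d vectors stand in for real distributional representations, and multiplying the frequency and coherence scores is an assumed combination.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(a * a for a in v))
    return dot / (nu * nv)

def coherence(filler_vecs):
    """h(s): average pairwise similarity of a slot's filler vectors."""
    pairs = list(combinations(filler_vecs, 2))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)

def rank_slots(slots, freq, vecs):
    """Score each slot candidate by relative frequency times filler coherence."""
    total = sum(freq.values())
    scores = {s: (freq[s] / total) * coherence([vecs[f] for f in fillers])
              for s, fillers in slots.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Toy 2-d "embeddings": the expensiveness fillers point the same way,
# while the quantity fillers are spread out (lower coherence).
vecs = {"cheap": [1.0, 0.1], "inexpensive": [0.9, 0.2],
        "one": [1.0, 0.0], "three": [0.0, 1.0]}
slots = {"expensiveness": ["cheap", "inexpensive"],
         "quantity": ["one", "three"]}
freq = {"expensiveness": 10, "quantity": 10}
print(rank_slots(slots, freq, vecs))  # "expensiveness" ranks first
```

With equal frequencies, the more coherent slot wins, which is exactly the intended behavior: domain-specific slots have fillers that cluster in topic space.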

(14)

Experiments for Slot Induction

Dataset

◦ Cambridge University SLU corpus [Henderson, 2012]

◦ Restaurant recommendation in an in-car setting in Cambridge

◦ WER = 37%

◦ vocabulary size = 1868

◦ 2,166 dialogues

◦ 15,453 utterances

◦ reference slots: addr, area, food, name, phone, postcode, price range, task, type

The mapping table between induced and reference slots


(15)

Experiments for Slot Induction

Slot Induction Evaluation: MAP of the slot ranking model to measure the quality of induced slots via the mapping table

Slot Filling Evaluation: MAP-F-H/S: weight the MAP score with F-measure of two slot filler lists
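The MAP metric used above can be sketched as follows; the ranked slot list and reference set are made-up toy data, not the corpus values.

```python
def average_precision(ranked, relevant):
    """Average precision of one ranked list against a set of relevant items."""
    hits, score = 0, 0.0
    for i, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            score += hits / i  # precision at each relevant rank
    return score / len(relevant) if relevant else 0.0

def mean_average_precision(rankings):
    """MAP over (ranked_list, relevant_set) pairs."""
    return sum(average_precision(r, rel) for r, rel in rankings) / len(rankings)

# Toy check: induced slots ranked against reference slots.
ranked = ["expensiveness", "capability", "locale_by_use"]
relevant = {"expensiveness", "locale_by_use"}
print(average_precision(ranked, relevant))  # (1/1 + 2/3) / 2 = 0.8333...
```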

Approach (ASR) | MAP | MAP-F-H | MAP-F-S

Frame Sem
(a) Frequency | 67.61 | 26.96 | 27.29
(b) K-Means | 67.38 | 27.38 | 27.99
(c) Spectral Clustering | 68.06 | 30.52 | 28.40

Frame Sem + Dist Sem
(d) Google News, RepSim | 72.71 | 31.14 | 31.44
(e) Google News, NeiSim | 73.35 | 31.44 | 31.81
(f) Freebase, RepSim | 71.48 | 29.81 | 30.37
(g) Freebase, NeiSim | 73.02 | 30.89 | 30.72
(h) (d) + (e) + (f) + (g) | 76.22 | 30.17 | 30.53


(16)

Experiments for Slot Induction

[Results table repeated from the previous slide.]


Adding distributional information outperforms our baselines


(17)

Experiments for Slot Induction

[Results table repeated from the previous slide.]


Combining the two datasets, integrating the coverage of Google and the precision of Freebase, ranks correct slots higher and achieves the best MAP scores.


(18)

Outline

Introduction

Unsupervised Slot Induction [Chen et al., ASRU’13 & Chen et al., SLT’14]

Unsupervised Domain Exploration [Chen and Rudnicky, SLT’14]

Unsupervised Relation Detection [Chen et al., SLT’14]

Conclusions & Future Work


Question?

(19)

Unsupervised Domain Exploration

Target: given a conversational interaction with an SDS, predict which application the user wants to launch.

Approach:

◦ Step 1: enrich the semantics using word embeddings

◦ Step 2: use the descriptions of applications as a retrieval cue to find relevant applications


(20)

Proposed Framework


[Framework diagram] Query utterance: “play lady gaga’s bad romance”

1. Semantic Seed Generation: frame-semantic parsing plus entity linking to Wikipedia and Freebase structured knowledge produce the semantic seeds (slot types).
2. Semantics Enrichment: word embeddings enrich the seeds.
3. Retrieval Process: a ranking model matches the enriched query against application data and returns ranked applications (e.g., Pandora: singer, songwriter, song, music).

(21)

Proposed Framework


[Framework diagram repeated; the step discussed next is highlighted.]

(22)

Semantic Seed Generation


Semantic parsing performs well on generic domains but cannot recognize domain-specific named entities.

• Main idea: slot types help capture the semantic meaning of the utterance for expanding domain knowledge.

• Frame Type from Semantic Parsing

Q: compose an email to alex

Frame: text_creation (FT LU: compose; FE LU: an email)
Frame: contacting (FT LU: email)

→ S_frm(Q): frame-based semantic seeds

(23)

Semantic Seed Generation


• Main idea: slot types help capture the semantic meaning of the utterance for expanding domain knowledge.

• Entity Type from Linked Structured Knowledge

。 Wikipedia Page Linking

Q: play lady gaga’s bad romance
“… is an American singer, songwriter, and actress.” / “… is a song by American singer …”
→ S_wk(Q): Wikipedia-based semantic seeds

。 Freebase List Linking

Q: play lady gaga’s bad romance
Linked lists: celebrity, composition, …; composition, canonical version, musical recording, …
→ S_fb(Q): Freebase-based semantic seeds

(24)

Proposed Framework


[Framework diagram repeated; the step discussed next is highlighted.]

(25)

Semantic Enrichment


• Main idea: utilize distributed word embeddings to obtain the semantically related knowledge of each word.

1) Model word embeddings on the application vendor descriptions.

2) Extract the most related words for each word using the trained embeddings (e.g., “text” → “message”, “msg”).

Words with higher similarity tend to occur in common contexts in the embedding training data.
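The enrichment step can be sketched as a nearest-neighbour lookup in embedding space. The tiny hand-made vectors below are stand-ins for embeddings actually trained on app descriptions; only the lookup logic is illustrated.

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(a * a for a in v)))

def enrich(word, embeddings, k=2):
    """Return the k words closest to `word` in embedding space."""
    sims = {w: cosine(embeddings[word], v)
            for w, v in embeddings.items() if w != word}
    return sorted(sims, key=sims.get, reverse=True)[:k]

# Toy vectors standing in for embeddings trained on vendor descriptions.
emb = {"text":    [0.9, 0.1, 0.0],
       "message": [0.8, 0.2, 0.1],
       "msg":     [0.85, 0.15, 0.05],
       "pizza":   [0.0, 0.1, 0.9]}
print(enrich("text", emb))  # the "text"-like words, not "pizza"
```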

(26)

Proposed Framework


[Framework diagram repeated; the step discussed next is highlighted.]

(27)

Retrieval Process


• Main idea: retrieve the applications that are most likely to support the user's request.

P(Q | A): the probability that the user speaks Q to request launching application A.

• Query Reformulation (Q')

。 Embedding-Enriched Query: adds similar words for all words in Q
。 Type-Embedding-Enriched Query: additionally adds words similar to the semantic seeds S(Q)

• Ranking Model: built from P(x | A), the probability that word x occurs in the application's data; the application with higher P(Q | A) is more likely to support the desired function.
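One way to realize this ranking model is a smoothed unigram language model per application; this is a hedged sketch with invented toy app descriptions, and the add-alpha smoothing is an assumption (the slide does not specify the exact estimator).

```python
import math
from collections import Counter

def log_prob(query_words, app_words, vocab_size, alpha=0.1):
    """log P(Q | A) under a unigram language model of the app's
    description words, with add-alpha smoothing for unseen words."""
    counts = Counter(app_words)
    total = len(app_words)
    return sum(math.log((counts[w] + alpha) / (total + alpha * vocab_size))
               for w in query_words)

# Toy application data.
apps = {"Pandora": "music song singer radio stream".split(),
        "Maps": "navigate route traffic drive map".split()}
vocab_size = len({w for ws in apps.values() for w in ws})

query = "play lady gaga bad romance".split()
enriched = query + ["song", "singer", "music"]  # enrichment adds seed-related words
ranked = sorted(apps, key=lambda a: log_prob(enriched, apps[a], vocab_size),
                reverse=True)
print(ranked)  # ['Pandora', 'Maps']
```

The raw query words are unseen in both descriptions, so without enrichment the two apps tie; the enriched words are what let the model prefer Pandora.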

(28)

Results


[Results figure] MAP (0.22 to 0.38) and P@5 (0.25 to 0.49) plotted against #words/query (0 to 200), comparing: Baseline; Embedding-Enriched (T); Type-Embedding-Enriched with Frame (T), Wikipedia (T), Freebase (T), and Hand-crafted (T) seeds.

(29)

Thresholds are tuned on the development set.

Overall Results


Approach (ASR) | MAP | P@5
Original Query | 25.50 | 34.97
Embedding-Enriched | 30.42 | 40.72
Type-Embed.-Enriched: Frame | 30.11 | 39.59
Type-Embed.-Enriched: Wikipedia | 30.74 | 40.82
Type-Embed.-Enriched: Freebase | 32.02 | 41.23
Type-Embed.-Enriched: Hand-Craft | 34.91 | 45.03

◦ Enriching semantics improves performance by involving domain-specific knowledge.

◦ Freebase results are better than the embedding-enriched method, showing that we can effectively and efficiently expand domain-specific knowledge by types of slots from Freebase.

◦ Hand-crafted mapping shows that the correct slot types offer better understanding and indicates the remaining room for improvement.

(30)

Outline

Introduction

Unsupervised Slot Induction [Chen et al., ASRU’13 & Chen et al., SLT’14]

Unsupervised Domain Exploration [Chen and Rudnicky, SLT’14]

Unsupervised Relation Detection [Chen et al., SLT’14]

Conclusions & Future Work


Question?

(31)

Unsupervised Relation Detection

Spoken Language Understanding (SLU): convert ASR outputs into a predefined semantic output format

Relation: semantic interpretation of input utterances

◦ movie.release_date, movie.name, movie.directed_by, director.name

Unsupervised SLU: utilize external knowledge to help relation detection without labelled data


“when was james cameron’s avatar released”

Intent: FIND_RELEASE_DATE

Slot-Val: MOVIE_NAME=“avatar”, DIRECTOR_NAME=“james cameron”


(32)


Semantic Knowledge Graph

Priors for SLU

What are knowledge graphs?

◦ Graphs with

◦ strongly typed and uniquely identified entities (nodes)

◦ facts/literals connected by relations (edges)

Examples:

◦ Satori, Google KG, Facebook Open Graph, Freebase

How large?

◦ > 500M entities, >1.5B relations, > 5B facts

How broad?

◦ Wikipedia-breadth: from “American Football” to “Zoos”

Slides of Larry Heck, Dilek Hakkani-Tur, and Gokhan Tur, Leveraging Knowledge Graphs for Web-Scale Unsupervised Semantic Parsing, in Proceedings of Interspeech, 2013.


(33)

Semantic Interpretation via Relations

Two examples; we differentiate them by including the originating node types in the relation.

User Utterance: find movies produced by james cameron
SPARQL Query (simplified):
SELECT ?movie { ?movie movie.produced_by ?producer . ?producer producer.name "James Cameron" . }
Logical Form:
λx. ∃y. movie.produced_by(x, y) ∧ person.name(y, z) ∧ z = “James Cameron”
Relation: movie.produced_by, producer.name

User Utterance: who produced avatar
SPARQL Query (simplified):
SELECT ?producer { ?movie movie.name "Avatar" . ?movie movie.produced_by ?producer . }
Logical Form:
λy. ∃x. movie.produced_by(x, y) ∧ movie.name(x, z) ∧ z = “Avatar”
Relation: movie.name, movie.produced_by

[Graph snippet for both: MOVIE --produced_by--> PERSON, with a name edge on each node.]


(34)

Proposed Framework

“find me some films directed by james cameron”

[Framework diagram] Input utterance plus background knowledge (Bing query snippets and the knowledge graph).

◦ Relation Inference from Gazetteers: an entity dictionary yields a per-word relation distribution P_F(r | w).
◦ Relational Surface Form Derivation: entity embeddings yield entity surface forms and entity syntactic contexts, giving P_E(r | w) and P_C(r | w).
◦ Probabilistic Enrichment: integrates these distributions into utterance-level relation weights R_u(r).
◦ Relabeling / Bootstrapping: produces the final results.


(35)

Proposed Framework

[Framework diagram repeated; the component discussed next is highlighted.]


(36)

Relation Inference from Gazetteers

Gazetteers (entity lists): the entity “james cameron” appears in multiple lists (e.g., director, producer).

[Example] Prior statistics such as the number of movies James Cameron directed weight the inferred relations, e.g., movie.directed_by and director.name.

Dilek Hakkani-Tur, Asli Celikyilmaz, Larry Heck, and Gokhan Tur, Probabilistic enrichment of knowledge graph entities for relation detection in conversational understanding, in Proceedings of Interspeech, 2014.


(37)

Proposed Framework

[Framework diagram repeated; the component discussed next is highlighted.]


(38)

Relational Surface Form Derivation

Web Resource Mining
◦ Bing query snippets including entity pairs connected with specific relations in the KG.

Dependency Parsing
◦ “Avatar is a 2009 American epic science fiction film directed by James Cameron.” (relation: directed_by)

[Dependency parse of the sentence, with entities replaced by $movie and $director; edges include nsubj, cop, det, num, nn, vmod, prep_by, pobj.]


(39)

Relational Surface Form Derivation

Dependency-Based Entity Embeddings

1) Word & Context Extraction

Word | Contexts
$movie | film/nsubj-1
is | film/cop-1
a | film/det-1
2009 | film/num-1
american, epic, science, fiction | film/nn-1
film | $movie/nsubj, is/cop, a/det, 2009/num, american/nn, epic/nn, science/nn, fiction/nn, directed/vmod
directed | $director/prep_by
$director | directed/prep_by-1

(From the dependency parse of “Avatar is a 2009 American epic science fiction film directed by James Cameron”, with entities replaced by $movie and $director.)

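The word/context extraction can be sketched directly from dependency triples: each edge produces one pair for the head and one inverse-labeled pair for the dependent, following the convention in the table above.

```python
def word_context_pairs(edges):
    """From each dependency edge (head, label, dependent), derive two
    training pairs: the head sees dependent/label as a context, and the
    dependent sees head/label-1 (the inverse-labeled context)."""
    pairs = []
    for head, label, dep in edges:
        pairs.append((head, f"{dep}/{label}"))
        pairs.append((dep, f"{head}/{label}-1"))
    return pairs

# Edges from the parsed snippet, with entities replaced by their tags.
edges = [("film", "nsubj", "$movie"),
         ("film", "cop", "is"),
         ("film", "vmod", "directed"),
         ("directed", "prep_by", "$director")]
pairs = word_context_pairs(edges)
print(pairs)
```

Note how the entity tags keep their syntactic role: $director's only context is directed/prep_by-1, which is what later lets the model recover "directed" as its typical context.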

(40)

Relational Surface Form Derivation

Dependency-Based Entity Embeddings

2) Training Process

◦ Each word w is associated with a vector v_w and each context c with a vector v_c.

◦ Learn vector representations for both words and contexts such that the dot product v_w · v_c for good word-context pairs (w, c) in the training data D is maximized.

◦ Objective function (skip-gram with negative sampling over the word-context pairs, following Levy and Goldberg's dependency-based embeddings): maximize Σ_{(w,c)∈D} log σ(v_c · v_w).

[Word/context table repeated from the previous slide.]


(41)

Relational Surface Form Derivation

Surface Form Derivation

Entity Surface Forms ($char, $director, etc.)
◦ learn the surface forms corresponding to entities, based on word vectors v_w (words with similar contexts):
$char: “character”, “role”, “who”
$director: “director”, “filmmaker”
$genre: “action”, “fiction”

Entity Syntactic Contexts
◦ learn the important contexts of entities, based on context vectors v_c (words frequently occurring together):
$char: “played”
$director: “directed”

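A sketch of both derivations with toy 2-d vectors: surface forms come from similarity in word-vector space, while typical contexts come from the word-vector/context-vector dot product. The vectors and vocabulary are invented for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(a * a for a in v)))

def surface_forms(tag, word_vecs, k=2):
    """Words interchangeable with the entity tag: nearest neighbours
    of the tag's word vector (words appearing in similar contexts)."""
    sims = {w: cosine(word_vecs[tag], v)
            for w, v in word_vecs.items() if w != tag}
    return sorted(sims, key=sims.get, reverse=True)[:k]

def typical_contexts(tag, word_vecs, ctx_vecs, k=1):
    """Words frequently co-occurring with the tag: high dot product
    between the tag's word vector and the candidate's context vector."""
    dots = {w: sum(a * b for a, b in zip(word_vecs[tag], c))
            for w, c in ctx_vecs.items()}
    return sorted(dots, key=dots.get, reverse=True)[:k]

# Toy vectors.
word_vecs = {"$director": [0.9, 0.1], "director": [0.85, 0.2],
             "filmmaker": [0.8, 0.15], "pizza": [0.0, 1.0]}
ctx_vecs = {"directed": [1.0, 0.0], "ate": [0.0, 1.0]}
print(surface_forms("$director", word_vecs))
print(typical_contexts("$director", word_vecs, ctx_vecs))
```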

(42)

Proposed Framework

[Framework diagram repeated; the component discussed next is highlighted.]


(43)

Probabilistic Enrichment

Integrate relations from

◦ Prior knowledge

◦ Entity surface forms

◦ Entity syntactic contexts

Integrated Relations for Words by

◦ Unweighted: combine all relations with binary values

◦ Weighted: combine all relations, keeping each relation's highest weight

◦ Highest Weighted: keep only the most probable relation of each word

Integrated Relations for Utterances by

Dilek Hakkani-Tur, Asli Celikyilmaz, Larry Heck, and Gokhan Tur, Probabilistic enrichment of knowledge graph entities for relation detection in conversational understanding, in Proceedings of Interspeech, 2014.

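The three word-level integration strategies can be sketched as follows. The relation distributions are toy values, and taking the max across sources in the weighted mode is an assumption consistent with "keep the highest weights" above (the cited paper's exact formulas may differ).

```python
def combine(sources, mode="weighted"):
    """Merge per-source relation distributions P(r | w) for one word.
    sources: list of dicts mapping relation -> probability, one per
    evidence source (gazetteer prior, surface forms, contexts)."""
    if mode == "unweighted":
        # Binary indicator: a relation counts if any source proposes it.
        return {r: 1.0 for dist in sources for r in dist}
    if mode == "highest":
        # Keep only each source's single most probable relation.
        merged = {}
        for dist in sources:
            r = max(dist, key=dist.get)
            merged[r] = max(merged.get(r, 0.0), dist[r])
        return merged
    # "weighted": keep every relation with its highest score across sources.
    merged = {}
    for dist in sources:
        for r, p in dist.items():
            merged[r] = max(merged.get(r, 0.0), p)
    return merged

# Toy distributions for one word from two evidence sources.
gaz = {"director.name": 0.7, "producer.name": 0.3}
surface = {"movie.directed_by": 0.9, "director.name": 0.5}
print(combine([gaz, surface], "weighted"))
print(combine([gaz, surface], "highest"))
```

Note how "highest" drops the low-probability producer.name, which is exactly the pruning effect the slides contrast against the other two modes.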

(44)

Bootstrapping

Unsupervised Self-Training

◦ Train a multi-class, multi-label classifier that estimates relations given an utterance.

◦ Utterances come with relation weights R_u1(r), R_u2(r), R_u3(r), ...

◦ Create pseudo labels L_u1(r), L_u2(r), L_u3(r), ... for training by thresholding the weights.

◦ Classifier (AdaBoost: an ensemble of M weak classifiers) outputs a probability distribution over relations.

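The pseudo-labeling step of the self-training loop can be sketched as simple thresholding; the utterances, weights, and threshold below are toy values, and the classifier itself (AdaBoost in the slides) is only indicated in comments.

```python
def pseudo_labels(relation_weights, threshold=0.5):
    """Turn soft utterance-level relation weights R_u(r) into
    multi-label pseudo labels L_u(r) by thresholding."""
    return {u: {r for r, w in weights.items() if w >= threshold}
            for u, weights in relation_weights.items()}

# Toy weights from probabilistic enrichment; u3 is too uncertain to label.
weights = {"u1": {"movie.directed_by": 0.8, "movie.name": 0.3},
           "u2": {"movie.release_date": 0.9},
           "u3": {"movie.name": 0.4}}
labels = pseudo_labels(weights)
print(labels)

# A multi-label classifier (the slides use AdaBoost over M weak learners)
# would now be trained on these pseudo labels, its predicted relation
# distributions used to relabel the utterances, and the loop repeated.
```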

(45)

Experiments of Relation Detection

Dataset

Knowledge Base: Freebase

◦ 670K entities

◦ 78 entity types (movie names, actors, etc)

Relation Detection Data

◦ Crowd-sourced utterances

◦ Manually annotated with SPARQL queries → relations

Query Statistics Dev Test

% entity only 8.9% 10.7%

% rel only w/ specified movie names 27.1% 27.5%

% rel only w/ specified other names 39.8% 39.6%

% more complicated relations 15.4% 14.7%

% not covered 8.8% 7.6%

#utterances 3338 1084

Example: User Utterance: who produced avatar → Relation: movie.name, movie.produced_by


(46)

Experiments of Relation Detection

All performance

Evaluation Metric: micro F-measure (%)

Approach (Baseline) | Unweighted Ori / Bootstrap | Weighted Ori / Bootstrap | Highest Weighted Ori / Bootstrap
Gazetteer | 35.21 / 36.91 | 37.93 / 40.10 | 36.08 / 38.89
Gazetteer + Weakly Supervised | 25.07 / 37.39 | 39.04 / 39.07 | 39.40 / 39.98
Gazetteer + Entity Surface Form (Reg) | 34.23 / 34.91 | 36.57 / 38.13 | 34.69 / 37.16


(47)

[Results table as above, adding: Gazetteer + Entity Surface Form (Dep) | 37.44 / 38.37 | 41.01 / 41.10 | 39.19 / 42.74]

Experiments of Relation Detection

All performance

Evaluation Metric: micro F-measure (%)

Words derived by dependency embeddings can successfully capture the surface forms of entity tags, while words derived by regular embeddings cannot.



(48)

[Results table as above, adding: Gazetteer + Entity Context | 35.31 / 37.23 | 38.04 / 38.88 | 37.25 / 38.04]

Experiments of Relation Detection

All performance

Evaluation Metric: micro F-measure (%)

Words derived from entity contexts slightly improve performance.



(49)

[Results table as above, adding: Gazetteer + Entity Surface Form + Context | 37.66 / 38.64 | 40.29 / 41.98 | 40.07 / 43.34]

Experiments of Relation Detection

All performance

Evaluation Metric: micro F-measure (%)

Combining all approaches performs best, while the major improvement is from derived entity surface forms.



(50)

[Same results table as above.]

Experiments of Relation Detection

All performance

Evaluation Metric: micro F-measure (%)

With the same information, learning surface forms from dependency-based embeddings performs better, because there is a mismatch between written and spoken language.



(51)

Experiments of Relation Detection

All performance

Evaluation Metric: micro F-measure (%)

Weighted methods perform better with fewer features, and highest-weighted methods perform better with more features.

[Same results table as above.]


(52)

Experiments of Relation Detection

Entity Surface Forms Derived from Dependency Embeddings: the functional similarity carried by dependency-based entity embeddings effectively benefits the relation detection task.

Entity Tag | Derived Words
$character | character, role, who, girl, she, he, officier
$director | director, dir, filmmaker
$genre | comedy, drama, fantasy, cartoon, horror, sci
$language | language, spanish, english, german
$producer | producer, filmmaker, screenwriter


(53)

Experiments of Relation Detection

Effectiveness of Bootstrapping

◦ The best result comes from the combination of all approaches, because probabilities from different resources complement each other.

◦ Adding only entity surface forms performs similarly, showing that the major improvement comes from relational entity surface forms.

◦ Bootstrapping significantly improves performance in most settings.

[Figure: F-measure (0.33 to 0.44) vs. bootstrapping iteration (1 to 10) for Gaz.; Gaz. + Weakly Supervised; Gaz. + Entity Surface Form (BOW); Gaz. + Entity Surface Form (Dep); Gaz. + Entity Context; Gaz. + Entity Surface Form + Context.]


(54)

Outline

Introduction

Unsupervised Slot Induction [Chen et al., ASRU’13 & Chen et al., SLT’14]

Unsupervised Domain Exploration [Chen and Rudnicky, SLT’14]

Unsupervised Relation Detection [Chen et al., SLT’14]

Conclusions & Future Work


Question?

(55)

Conclusions & Future Work

Conclusions

◦ Unsupervised SLU is becoming more and more popular.

◦ Using external knowledge helps SLU in different ways.

◦ Word embeddings are very useful.

Future Work

◦ Fusion of various knowledge resources: different resources help SLU in different ways.

◦ Relations between slots: understanding inter-slot relations can help develop better SDS.

◦ Active learning: for practicality and efficiency, manually labeling a small set of samples can boost performance.


(56)

Q & A 

THANKS FOR YOUR ATTENTION!!

