
Communication


Subjects' app invocations are logged on a daily basis.

Subjects annotate their app activities with:

• Task Structure: link applications that serve a common goal
• Task Description: briefly describe the goal or intention of the task

Subjects then use a wizard system to perform the annotated tasks by speech.

Example annotated task (TASK59; 2015/02/03, Tuesday, 10:48):

Task Description: play music via bluetooth speaker
Apps: com.android.settings (SETTINGS), com.lge.music (MUSIC)

Dialogue (W = wizard, U = user):
W1: Ready.
U1: Connect my phone to bluetooth speaker.
W2: Connected to bluetooth speaker.
U2: And play music.
W3: What music would you like to play?
U3: Shuffle playlist.
W4: I will play the music for you.

Y.-N. Chen, M. Sun, A. I. Rudnicky, and A. Gershman, "Leveraging Behavioral Patterns of Mobile Applications for Personalized Spoken Language Understanding," in Proc. of ICMI, pages 83-86, 2015. ACM.

[Figure: utterance-by-feature matrix for matrix factorization. Rows are utterances; columns are lexical features (photo, tell, check, send, email), behavioral history (NULL, CAMERA, CHROME), and intended apps (CAMERA, IM, CHROME, EMAIL). Observed facts are 1s; predicted probabilities (e.g., .85, .95, .80, .70, .55) fill the unobserved cells.]

Train utterances: "take this photo" → CAMERA; "tell vivian this is me in the lab" → IM; "check my grades on website" → CHROME; "send an email to professor" → EMAIL; "take a photo of this, send it to alice" → CAMERA, IM.
Test utterance: "take a photo of this, send it to alex" → intended app plus hidden semantics.

Issue: unobserved hidden semantics may benefit understanding.

• The decomposed matrices represent low-rank latent semantics for utterances and for words/histories/apps, respectively.
• The product of the two matrices fills in the probabilities of the hidden semantics.
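The completion idea can be sketched on a toy matrix (feature names and values below are illustrative, not the paper's data): a truncated SVD gives the two low-rank factors, and their product assigns scores to unobserved cells, surfacing the implicit IM intent of the test utterance.

```python
import numpy as np

# Toy utterance-by-feature matrix: columns are lexical features, then
# intended apps; 1 = observed fact, 0 = unobserved. (Illustrative only.)
features = ["photo", "tell", "send", "CAMERA", "IM", "EMAIL"]
M = np.array([
    [1, 0, 0, 1, 0, 0],  # "take this photo" -> CAMERA
    [0, 1, 0, 0, 1, 0],  # "tell vivian this is me in the lab" -> IM
    [0, 0, 1, 0, 0, 1],  # "send an email to professor" -> EMAIL
    [1, 0, 1, 1, 1, 0],  # "take a photo of this, send it to alice" -> CAMERA, IM
    [1, 0, 1, 1, 0, 0],  # test: "take a photo of this, send it to alex"
], dtype=float)

# Rank-d truncated SVD: the two factors carry latent semantics for
# utterances and for words/apps; their product fills in unobserved cells.
d = 3
U, s, Vt = np.linalg.svd(M, full_matrices=False)
M_hat = U[:, :d] @ np.diag(s[:d]) @ Vt[:d, :]

im, tell = features.index("IM"), features.index("tell")
# The test utterance's unobserved IM cell receives a higher score than an
# unrelated feature such as "tell".
print(round(M_hat[4, im], 2), round(M_hat[4, tell], 2))
```

The key point is that the test row shares latent structure with the "alice" training row, so the reconstruction lifts its IM score even though IM was never observed for it.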

[Figure: the same utterance-by-feature matrix, decomposed into two low-rank factors.]

The |𝑼| × (|𝑾| + |𝑯| + |𝑨|) matrix (utterances by words, behavioral histories, and apps) is factorized into a |𝑼| × 𝒅 matrix of latent utterance semantics and a 𝒅 × (|𝑾| + |𝑯| + |𝑨|) matrix of latent word/history/app semantics.


Model implicit feedback by completing the matrix:

• Do not treat unobserved facts as negative samples (true or false).
• Give observed facts higher scores than unobserved facts.

Objective: for each utterance 𝑢, each observed fact 𝑓⁺, and each unobserved fact 𝑓⁻, maximize

Σ ln σ( θ_𝑢(𝑓⁺) − θ_𝑢(𝑓⁻) )

where θ_𝑢(𝑥) is the model's score for fact 𝑥 of utterance 𝑢 and σ is the sigmoid function. The model can be trained by SGD updates over such fact pairs.
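This pairwise objective can be sketched with SGD on a toy matrix (all sizes, rates, and data below are illustrative, not the paper's settings):

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy matrix: rows = utterances, columns = features/apps; 1 = observed fact.
M = np.array([
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 0, 1],
], dtype=float)

d, lr, reg = 2, 0.05, 0.01
U = 0.1 * rng.standard_normal((M.shape[0], d))   # latent utterance factors
V = 0.1 * rng.standard_normal((M.shape[1], d))   # latent feature factors

for _ in range(5000):
    u = rng.integers(M.shape[0])                  # sample an utterance
    pos = rng.choice(np.flatnonzero(M[u] == 1))   # an observed fact f+
    neg = rng.choice(np.flatnonzero(M[u] == 0))   # an unobserved fact f-
    # Gradient ascent on ln sigma(theta(f+) - theta(f-)):
    # push the observed score above the unobserved one.
    g = sigmoid(-(U[u] @ (V[pos] - V[neg])))
    u_vec = U[u].copy()
    U[u] += lr * (g * (V[pos] - V[neg]) - reg * U[u])
    V[pos] += lr * (g * u_vec - reg * V[pos])
    V[neg] += lr * (-g * u_vec - reg * V[neg])

scores = U @ V.T
# Each row now tends to score its observed facts above its unobserved ones.
```

Because the loss only compares pairs, unobserved cells are never forced to zero; they simply end up ranked below the observed facts.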


Reasoning with Matrix Factorization for Implicit Intents

[Figure: the same matrix as above. At test time, the utterance "take a photo of this, send it to alex" is scored against all features; MF infers the implicit IM intent alongside the observed CAMERA intent.]


Dataset: 533 dialogues (1,607 utterances); 455 are multi-turn dialogues.

Transcripts: Google speech recognition output (word error rate = 25%).

Evaluation metrics: accuracy of user-intent prediction (ACC) and mean average precision of ranked intents (MAP).

Baselines: Maximum Likelihood Estimation (MLE) and Multinomial Logistic Regression (MLR).

Lexical features are useful for predicting intended apps in both user-independent and user-dependent models.


Results (ACC / MAP, %):

Approach                     Lexical        Behavioral     All
(a) MLE, User-Indep          -              13.5 / 19.6    -
(b) MLE, User-Dep            -              20.2 / 27.9    -
(c) MLR, User-Indep          42.8 / 46.4    14.9 / 18.7    46.2+ / 50.1+
(d) MLR, User-Dep            48.2 / 52.1    19.3 / 25.2    50.1+ / 53.9+
(e) (c) + Personalized MF    47.6 / 51.1    16.4 / 20.3    50.3+* / 54.2+*
(f) (d) + Personalized MF    48.3 / 52.7    20.6 / 26.7    51.9+* / 55.7+*

Personalized MF significantly improves the MLR results by modeling hidden semantics.
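The two metrics can be sketched as follows on toy predictions (the exact MAP formulation in the paper may differ in detail):

```python
import numpy as np

def acc_and_map(ranked_intents, gold):
    """Toy versions of the two metrics:
    ACC - fraction of utterances whose top-ranked intent is a gold intent;
    MAP - mean over utterances of the average precision of the ranked list."""
    hits, aps = [], []
    for ranking, g in zip(ranked_intents, gold):
        hits.append(ranking[0] in g)
        precisions, correct = [], 0
        for i, intent in enumerate(ranking, start=1):
            if intent in g:
                correct += 1
                precisions.append(correct / i)  # precision at each hit
        aps.append(sum(precisions) / len(g))
    return np.mean(hits), np.mean(aps)

# Hypothetical ranked predictions for two utterances
ranked = [["CAMERA", "IM", "EMAIL"], ["EMAIL", "CHROME", "IM"]]
gold = [{"CAMERA", "IM"}, {"CHROME"}]
acc, map_ = acc_and_map(ranked, gold)
print(acc, round(map_, 3))  # → 0.5 0.75
```

ACC only rewards the single top prediction, while MAP also credits gold intents ranked lower, which is why the combined models gain more on MAP.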

• App functionality modeling
• Learning app embeddings

Investigation of Language Understanding Impact for Reinforcement Learning Based Dialogue Systems

X. Li, Y.-N. Chen, L. Li, and J. Gao, “End-to-End Task-Completion Neural Dialogue Systems,” arXiv preprint arXiv:1703.01008, 2017.

X. Li, Y.-N. Chen, L. Li, J. Gao, and A. Celikyilmaz, “Investigation of Language Understanding Impact for Reinforcement Learning Based Dialogue Systems,” arXiv preprint arXiv:1703.07055, 2017.

Dialogue management is framed as a reinforcement learning task: the agent learns to select actions that maximize the expected cumulative reward.

[Figure: agent-environment loop. The agent takes actions; the environment returns an observation and a reward.]

Reward design (ticket booking):
• Booking the right ticket: reward = +30
• Failing the task: reward = -30
• Otherwise: reward = -1 per turn
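A minimal sketch of this reward scheme (the function name is illustrative; the values are from the slides):

```python
def turn_reward(done: bool, success: bool) -> int:
    """Per-turn reward: +30 for booking the right ticket,
    -30 for failing, -1 for every intermediate turn."""
    if not done:
        return -1          # small per-turn penalty encourages short dialogues
    return 30 if success else -30

# The episode return is the sum of per-turn rewards: a dialogue that
# succeeds after 5 intermediate turns earns 5 * (-1) + 30 = 25.
episode = [turn_reward(False, False)] * 5 + [turn_reward(True, True)]
print(sum(episode))  # → 25
```

The -1 per turn is what makes the optimal policy prefer the shortest successful dialogue rather than merely a successful one.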


[Figure: end-to-end neural dialogue system. A user simulator (user agenda modeling) interacts in natural language with the system pipeline: language understanding → dialogue management → natural language generation.]

Text input: "Are there any action movies to see this weekend?"
Dialogue policy: request_location


NLU and NLG are trained in a supervised manner; DM is trained in a reinforcement learning framework (NLU and NLG can be fine-tuned).

[Figure: end-to-end neural dialogue system in detail. Language understanding is a sequence tagger: input words w_i, w_i+1, w_i+2 are labeled with slot tags (e.g., B-type, O) plus an utterance-level <intent>, ending with EOS. Dialogue management tracks state across turns t-2, t-1, t, taking user dialogue acts (e.g., inform(location=San Francisco)) and outputting a dialogue policy action (e.g., request_location). Natural language generation emits the response words w_0, w_1, w_2, ..., EOS. A user simulator with user agenda modeling drives the interaction from a sampled user goal.]

Example:
Text input: "Are there any action movies to see this weekend?"
Semantic frame: request_movie(genre=action, date=this weekend)
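A minimal sketch (tag and function names are assumed, not from the paper) of turning the LU tagger's BIO slot tags plus a predicted intent into the frame-level output the DM consumes:

```python
def to_semantic_frame(words, tags, intent):
    """Collect B-/I- tagged spans into slot values and attach the intent."""
    slots, current = {}, None
    for word, tag in zip(words, tags):
        if tag.startswith("B-"):
            current = tag[2:]
            slots[current] = word
        elif tag.startswith("I-") and current == tag[2:]:
            slots[current] += " " + word   # continue a multi-word slot value
        else:
            current = None                 # "O" tag ends any open span
    return {"intent": intent, "slots": slots}

words = "are there any action movies to see this weekend".split()
tags = ["O", "O", "O", "B-genre", "O", "O", "O", "B-date", "I-date"]
frame = to_semantic_frame(words, tags, "request_movie")
print(frame)
# → {'intent': 'request_movie', 'slots': {'genre': 'action', 'date': 'this weekend'}}
```

This is the sense in which the DM "receives frame-level information": it sees the intent and slot values, not the raw word sequence.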

DM receives frame-level information.

• No error model: assumes a perfect recognizer and LU
• Error model: simulates possible errors
  • Recognition error
  • LU error

Dialogue management components:
• Dialogue State Tracking (DST), which takes system dialogue acts as input
• Dialogue Policy Optimization
