Slides credit from Shawn
Review
Task-Oriented Dialogue System
(Young, 2000)
Speech Signal → Speech Recognition → Hypothesis: are there any action movies to see this weekend
Text Input: Are there any action movies to see this weekend?
Language Understanding (LU)
• Domain Identification
• User Intent Detection
• Slot Filling
Semantic Frame: request_movie (genre=action, date=this weekend)
Dialogue Management (DM)
• Dialogue State Tracking (DST)
• Dialogue Policy
Backend Database / Knowledge Providers
System Action/Policy: request_location
Natural Language Generation (NLG)
Text response: Where are you located?
http://rsta.royalsocietypublishing.org/content/358/1769/1389.short
Natural Language Generation
Traditional Approaches
Natural Language Generation (NLG)
Mapping dialogue acts into natural language, e.g.
inform(name=Seven_Days, foodtype=Chinese) → "Seven Days is a nice Chinese restaurant"
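A dialogue act such as the one above is an act type plus slot-value pairs. The sketch below shows one way to hold it as data; the function name `parse_dialogue_act` and the `act(slot=value, ...)` grammar are assumptions for illustration, not part of the slides.

```python
import re

def parse_dialogue_act(text):
    """Split 'act(slot=value, ...)' into (act_type, {slot: value})."""
    match = re.fullmatch(r"\s*(\w+)\((.*)\)\s*", text)
    if match is None:
        raise ValueError(f"not a dialogue act: {text!r}")
    act_type, body = match.group(1), match.group(2)
    slots = {}
    # Skip empty pieces so that 'confirm()' yields an empty slot dict.
    for pair in filter(None, (p.strip() for p in body.split(","))):
        slot, _, value = pair.partition("=")
        slots[slot.strip()] = value.strip()
    return act_type, slots

act_type, slots = parse_dialogue_act("inform(name=Seven_Days, foodtype=Chinese)")
print(act_type, slots)
```

The NLG component then has to turn `act_type` and `slots` into a sentence.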
Template-Based NLG
Define a set of rules to map semantic frames to natural language (NL)
Semantic Frame → Natural Language
confirm() → “Please tell me more about the product you are looking for.”
confirm(area=$V) → “Do you want somewhere in the $V?”
confirm(food=$V) → “Do you want a $V restaurant?”
confirm(food=$V,area=$W) → “Do you want a $V restaurant in the $W?”
Pros: simple, error-free, easy to control
Cons: time-consuming, rigid, poor scalability
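The rule table above can be sketched as a lookup from (act type, slot set) to a template. This is a minimal illustration: the named placeholders `$food` and `$area` stand in for the slide's `$V` and `$W`, and `realize` is a hypothetical helper name.

```python
from string import Template

# One hand-written template per (act type, set of slot names).
TEMPLATES = {
    ("confirm", frozenset()): Template(
        "Please tell me more about the product you are looking for."),
    ("confirm", frozenset({"area"})): Template(
        "Do you want somewhere in the $area?"),
    ("confirm", frozenset({"food"})): Template(
        "Do you want a $food restaurant?"),
    ("confirm", frozenset({"food", "area"})): Template(
        "Do you want a $food restaurant in the $area?"),
}

def realize(act_type, slots):
    """Look up the template for this frame and fill in the slot values."""
    key = (act_type, frozenset(slots))
    if key not in TEMPLATES:
        # Rigid by design: any frame without a rule cannot be realized.
        raise KeyError(f"no template for {act_type} with slots {sorted(slots)}")
    return TEMPLATES[key].substitute(slots)

print(realize("confirm", {"food": "Chinese", "area": "centre"}))
```

The `KeyError` branch shows the scalability problem: every new combination of act and slots needs its own rule.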
Class-Based LM NLG
(Oh and Rudnicky, 2000)
Class-based language modeling
NLG by decoding
Classes: inform_area, inform_address, …, request_area, request_postcode
Pros: easy to implement/understand, simple rules
Cons: computationally inefficient
http://dl.acm.org/citation.cfm?id=1117568
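A toy sketch of class-based LM generation: one bigram model per utterance class, decoded by sampling, with slot values filled in afterwards. The corpus, the `{area}` placeholder syntax and the function names are invented for illustration and are not taken from the cited paper.

```python
import random
from collections import defaultdict

# Toy delexicalised corpus per utterance class (invented for illustration).
CORPUS = {
    "inform_area": [
        "<s> it is in the {area} area </s>",
        "<s> it is located in the {area} </s>",
    ],
    "request_area": [
        "<s> which area do you prefer </s>",
        "<s> which part of town do you prefer </s>",
    ],
}

def train_bigrams(sentences):
    """Count how often each word follows each other word."""
    counts = defaultdict(lambda: defaultdict(int))
    for sentence in sentences:
        tokens = sentence.split()
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts

MODELS = {cls: train_bigrams(sents) for cls, sents in CORPUS.items()}

def generate(cls, slots, rng, max_len=20):
    """Sample a word sequence from the class's bigram LM, then fill slots."""
    counts, word, words = MODELS[cls], "<s>", []
    for _ in range(max_len):
        successors = counts[word]
        word = rng.choices(list(successors), weights=list(successors.values()))[0]
        if word == "</s>":
            break
        words.append(word)
    return " ".join(words).format(**slots)

print(generate("inform_area", {"area": "centre"}, random.Random(0)))
```

In practice several candidates are sampled and filtered or ranked, which is one reason the approach is computationally inefficient.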
Phrase-Based NLG
(Mairesse et al., 2010)
[Figure: Semantic DBN and Phrase DBN; labels: semantic stack, realization phrase]
Inform(name=Charlie Chan, food=Chinese, type=restaurant, near=Cineworld, area=centre)
Charlie Chan is a Chinese Restaurant near Cineworld in the centre
Pros: efficient, good performance
Cons: requires semantic alignments
http://dl.acm.org/citation.cfm?id=1858838
Natural Language Generation
Deep Learning Approaches
RNN-Based LM NLG
(Wen et al., 2015)
Dialogue act: Inform(name=Din Tai Fung, food=Taiwanese)
Dialogue act 1-hot representation: 0, 0, 1, 0, 0, …, 1, 0, 0, …, 1, 0, 0, 0, 0, 0…
Delexicalisation: <BOS> Din Tai Fung serves Taiwanese . → <BOS> SLOT_NAME serves SLOT_FOOD .
Input: <BOS> SLOT_NAME serves SLOT_FOOD .
Output: SLOT_NAME serves SLOT_FOOD . <EOS>
The RNN LM is conditioned on the dialogue act
Slot weight tying
http://www.anthology.aclweb.org/W/W15/W15-46.pdf#page=295
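The preprocessing above can be sketched as follows. The vector layout (act-type part followed by slot-presence part), the inventories `ACT_TYPES` and `SLOT_NAMES`, and all function names are assumptions for illustration.

```python
def delexicalise(sentence, slots):
    """Replace each slot value with a SLOT_<NAME> placeholder."""
    for slot, value in slots.items():
        sentence = sentence.replace(value, f"SLOT_{slot.upper()}")
    return sentence

def relexicalise(sentence, slots):
    """Put the slot values back into a generated delexicalised sentence."""
    for slot, value in slots.items():
        sentence = sentence.replace(f"SLOT_{slot.upper()}", value)
    return sentence

def encode_dialogue_act(act_type, slots, act_types, slot_names):
    """Concatenate a 1-hot act-type vector with a binary slot-presence vector."""
    act_part = [1 if a == act_type else 0 for a in act_types]
    slot_part = [1 if s in slots else 0 for s in slot_names]
    return act_part + slot_part

ACT_TYPES = ["inform", "request", "confirm"]   # hypothetical inventory
SLOT_NAMES = ["name", "food", "area"]          # hypothetical inventory

slots = {"name": "Din Tai Fung", "food": "Taiwanese"}
delex = delexicalise("Din Tai Fung serves Taiwanese .", slots)
da_vector = encode_dialogue_act("inform", slots, ACT_TYPES, SLOT_NAMES)

# Training pair for the language model: predict the next token at each step.
inputs = ["<BOS>"] + delex.split()
targets = delex.split() + ["<EOS>"]
print(delex, da_vector)
```

Delexicalisation lets the model share statistics across all restaurant names and food types; `relexicalise` restores the values after decoding.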
Handling Semantic Repetition
Issue: semantic repetition
Din Tai Fung is a great Taiwanese restaurant that serves Taiwanese.
Din Tai Fung is a child friendly restaurant, and also allows kids.
Deficiency in either model or decoding (or both)
Mitigation
Post-processing rules (Oh & Rudnicky, 2000)
Gating mechanism (Wen et al., 2015)
Attention (Mei et al., 2016; Wen et al., 2015)
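Repetition of a delexicalised slot, as in the first example, can be detected by counting placeholders. The sketch below counts missing and redundant slots and combines them as (missing + redundant) / required, which is assumed here to match the slot error rate used by Wen et al. (2015); the function names are hypothetical.

```python
import re

def slot_errors(delexicalised, required_slots):
    """Count missing and redundant SLOT_* placeholders in a sentence."""
    found = re.findall(r"SLOT_[A-Z_]+", delexicalised)
    missing = sum(1 for slot in required_slots if slot not in found)
    # A required slot realised more than once is redundant for each extra copy.
    redundant = sum(found.count(slot) - 1
                    for slot in set(found) if slot in required_slots)
    # A slot that was never requested is redundant every time it appears.
    redundant += sum(found.count(slot)
                     for slot in set(found) if slot not in required_slots)
    return missing, redundant

def slot_error_rate(delexicalised, required_slots):
    missing, redundant = slot_errors(delexicalised, required_slots)
    return (missing + redundant) / len(required_slots)

sentence = "SLOT_NAME is a great SLOT_FOOD restaurant that serves SLOT_FOOD ."
print(slot_errors(sentence, ["SLOT_NAME", "SLOT_FOOD"]))
```

A check like this cannot catch the second example ("child friendly" and "allows kids"), where the same meaning is repeated in different words.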
Visualization
Semantic Conditioned LSTM
(Wen et al., 2015)
Idea: using a gate mechanism to control the generated semantics (dialogue act/slots)
[Figure: original LSTM cell with inputs x_t and h_{t-1}, gates i_t, f_t, o_t, cell state C_t and output h_t; dialogue act (DA) cell with gate r_t updating d_{t-1} to d_t; the DA cell modifies C_t]
Inform(name=Seven_Days, food=Chinese)
Dialogue act 1-hot representation d_0: 0, 0, 1, 0, 0, …, 1, 0, 0, …, 1, 0, 0, …
http://www.aclweb.org/anthology/D/D15/D15-1199.pdf
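One step of the cell can be sketched in plain Python. The equations in the comments are assumed to follow the cited paper (no bias terms, a constant `alpha` scaling the recurrent input to the DA gate); weight names, sizes and the random initialisation are illustrative only.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(matrix, vector):
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]

def sc_lstm_step(x, h_prev, c_prev, d_prev, params, alpha=0.5):
    """One SC-LSTM step; returns (h_t, c_t, d_t)."""
    def gate(name, squash, scale=1.0):
        from_x = matvec(params["W_x" + name], x)
        from_h = matvec(params["W_h" + name], h_prev)
        return [squash(a + scale * b) for a, b in zip(from_x, from_h)]

    i_t = gate("i", sigmoid)                    # input gate
    f_t = gate("f", sigmoid)                    # forget gate
    o_t = gate("o", sigmoid)                    # output gate
    c_hat = gate("c", math.tanh)                # candidate cell value
    r_t = gate("r", sigmoid, scale=alpha)       # DA cell gate
    d_t = [r * d for r, d in zip(r_t, d_prev)]  # d_t = r_t * d_{t-1}
    da_term = [math.tanh(v) for v in matvec(params["W_dc"], d_t)]
    # C_t = f_t * C_{t-1} + i_t * c_hat + tanh(W_dc d_t)
    c_t = [f * c + i * ch + da
           for f, c, i, ch, da in zip(f_t, c_prev, i_t, c_hat, da_term)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t, d_t

def random_params(n_in, n_h, n_d, rng):
    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    params = {"W_dc": mat(n_h, n_d), "W_xr": mat(n_d, n_in), "W_hr": mat(n_d, n_h)}
    for name in "ifoc":
        params["W_x" + name] = mat(n_h, n_in)
        params["W_h" + name] = mat(n_h, n_h)
    return params

params = random_params(n_in=4, n_h=3, n_d=5, rng=random.Random(0))
d = [0, 0, 1, 0, 1]                  # toy dialogue act vector d_0
h, c = [0.0] * 3, [0.0] * 3
for x in ([1, 0, 0, 0], [0, 1, 0, 0]):   # two toy 1-hot word inputs
    h, c, d = sc_lstm_step(x, h, c, d, params)
print([round(v, 3) for v in d])
```

Because r_t lies in (0, 1), each active entry of d can only shrink over time, which is how the cell keeps track of which slots remain to be generated.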
Attentive Encoder-Decoder for NLG
Slot & value embedding
Attentive meaning representation
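A minimal sketch of attention over slot-value embeddings. Dot-product scoring is used here as a simplification; the cited models may use a learned scoring function, and the two-dimensional embeddings are toy values.

```python
import math

def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(decoder_state, slot_value_embeddings):
    """Return attention weights and the weighted sum of the embeddings."""
    scores = [sum(q * k for q, k in zip(decoder_state, emb))
              for emb in slot_value_embeddings]
    weights = softmax(scores)
    dim = len(decoder_state)
    context = [sum(w * emb[j] for w, emb in zip(weights, slot_value_embeddings))
               for j in range(dim)]
    return weights, context

# Toy embeddings for name=Din Tai Fung and food=Taiwanese.
embeddings = [[1.0, 0.0], [0.0, 1.0]]
weights, context = attend([2.0, 0.0], embeddings)
print([round(w, 3) for w in weights])
```

At each decoding step the context vector summarises the slots the decoder is currently focusing on; plotting the weights over time gives an attention heat map.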
Attention Heat Map
Model Comparison
[Figure: BLEU and ERR versus % of data (1, 10, 100) for hlstm, sclstm and encdec]
Structural NLG
(Dušek and Jurčíček, 2016)
Goal: NLG based on the syntax tree
Encode trees as sequences
Seq2Seq model for generation
https://www.aclweb.org/anthology/P/P16/P16-2.pdf#page=79
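"Encode trees as sequences" can be illustrated with a bracketed linearisation that a Seq2Seq decoder could emit token by token. The bracket notation and the toy tree are assumptions; the cited paper's actual tree encoding may differ.

```python
def linearise(tree):
    """Encode a (label, children) tree as a bracketed token sequence."""
    label, children = tree
    tokens = ["(", label]
    for child in children:
        tokens.extend(linearise(child))
    tokens.append(")")
    return tokens

def delinearise(tokens):
    """Rebuild the tree from a well-formed bracketed token sequence."""
    def parse(pos):
        assert tokens[pos] == "("
        label, pos, children = tokens[pos + 1], pos + 2, []
        while tokens[pos] != ")":
            child, pos = parse(pos)
            children.append(child)
        return (label, children), pos + 1
    tree, _ = parse(0)
    return tree

tree = ("serve", [("SLOT_NAME", []), ("SLOT_FOOD", [])])
tokens = linearise(tree)
print(" ".join(tokens))
```

The round trip through `delinearise` shows that no structure is lost by flattening the tree.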
Contextual NLG
(Dušek and Jurčíček, 2016)
Goal: adapting to users’ way of speaking, providing context-aware responses
Context encoder
Seq2Seq model
https://www.aclweb.org/anthology/W/W16/W16-36.pdf#page=203
Decoder Sampling Strategy
Decoding procedure
Greedy search
Beam search
Random search
Dialogue act: Inform(name=Din Tai Fung, food=Taiwanese) 0, 0, 1, 0, 0, …, 1, 0, 0, …, 1, 0, 0, 0, 0, 0…
Output: SLOT_NAME serves SLOT_FOOD . <EOS>
Greedy Search
Select the next word with the highest probability
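A minimal sketch of greedy decoding. The table `NEXT_WORD` is a toy stand-in for the decoder: a real RNN conditions on the whole history and the dialogue act, and the probabilities here are invented.

```python
# Toy next-word distributions P(next | previous word); numbers are invented.
NEXT_WORD = {
    "<BOS>": {"SLOT_NAME": 1.0},
    "SLOT_NAME": {"serves": 0.6, "offers": 0.3, "has": 0.1},
    "serves": {"SLOT_FOOD": 1.0},
    "offers": {"SLOT_FOOD": 1.0},
    "has": {"SLOT_FOOD": 1.0},
    "SLOT_FOOD": {".": 0.7, "food": 0.3},
    "food": {".": 1.0},
    ".": {"<EOS>": 1.0},
}

def greedy_decode(model, max_len=20):
    """At every step, emit the single most probable next word."""
    word, output = "<BOS>", []
    for _ in range(max_len):
        distribution = model[word]
        word = max(distribution, key=distribution.get)
        if word == "<EOS>":
            break
        output.append(word)
    return output

print(" ".join(greedy_decode(NEXT_WORD)))
```

Greedy decoding is fast and deterministic, but a locally best word can lead to a sentence that is not the most probable overall.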
Beam Search
Select the next k-best words and keep a beam with width=k for following decoding
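A minimal sketch of beam search over a toy next-word table. The table and its probabilities are invented purely to show the search behaviour; a real decoder would score words with the RNN.

```python
import math

# Toy next-word distributions P(next | previous word); numbers are invented.
NEXT_WORD = {
    "<BOS>": {"SLOT_NAME": 0.6, "enjoy": 0.4},
    "SLOT_NAME": {"serves": 0.5, "offers": 0.5},
    "serves": {"SLOT_FOOD": 1.0},
    "offers": {"SLOT_FOOD": 1.0},
    "enjoy": {"SLOT_FOOD": 1.0},
    "SLOT_FOOD": {".": 1.0},
    ".": {"<EOS>": 1.0},
}

def beam_search(model, k=2, max_len=20):
    """Keep the k best partial sentences at every step."""
    beams = [(0.0, ["<BOS>"])]  # (log-probability, words so far)
    finished = []
    for _ in range(max_len):
        candidates = []
        for score, words in beams:
            for word, prob in model[words[-1]].items():
                candidates.append((score + math.log(prob), words + [word]))
        candidates.sort(key=lambda item: item[0], reverse=True)
        beams = []
        for score, words in candidates[:k]:
            (finished if words[-1] == "<EOS>" else beams).append((score, words))
        if not beams:
            break
    finished.sort(key=lambda item: item[0], reverse=True)
    # Strip <BOS>/<EOS> and convert log-probabilities back to probabilities.
    return [(math.exp(score), " ".join(words[1:-1])) for score, words in finished]

for probability, sentence in beam_search(NEXT_WORD, k=2):
    print(round(probability, 2), sentence)
```

With k=1 this reduces to greedy search and commits to SLOT_NAME, whose continuations each carry only half of its probability; with k=2 the search keeps the second-best first word and finds a sentence with higher total probability.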
Random Search
Randomly select the next word
Higher diversity
Can follow a probability distribution
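A minimal sketch of random decoding that samples each word from the model's distribution. The table `NEXT_WORD` is a toy stand-in for the decoder with invented probabilities.

```python
import random

# Toy next-word distributions P(next | previous word); numbers are invented.
NEXT_WORD = {
    "<BOS>": {"SLOT_NAME": 1.0},
    "SLOT_NAME": {"serves": 0.6, "offers": 0.3, "has": 0.1},
    "serves": {"SLOT_FOOD": 1.0},
    "offers": {"SLOT_FOOD": 1.0},
    "has": {"SLOT_FOOD": 1.0},
    "SLOT_FOOD": {".": 0.7, "food": 0.3},
    "food": {".": 1.0},
    ".": {"<EOS>": 1.0},
}

def sample_decode(model, rng, max_len=20):
    """At every step, draw the next word according to its probability."""
    word, output = "<BOS>", []
    for _ in range(max_len):
        distribution = model[word]
        word = rng.choices(list(distribution),
                           weights=list(distribution.values()))[0]
        if word == "<EOS>":
            break
        output.append(word)
    return output

rng = random.Random(0)
samples = {" ".join(sample_decode(NEXT_WORD, rng)) for _ in range(50)}
for sentence in sorted(samples):
    print(sentence)
```

Repeated calls return different sentences, which is where the higher diversity comes from; less probable wordings still appear, just less often.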
Chit-Chat Generation
Chit-Chat Bot
Neural conversational model
Non task-oriented
Many-to-Many
Both input and output are sequences → sequence-to-sequence learning
E.g. machine translation (machine learning → 機器學習)
[Figure: input words "machine", "learning" mapped to output characters 機, 器, 學, 習]
[Ilya Sutskever, NIPS’14] [Dzmitry Bahdanau, arXiv’15]
A Neural Conversational Model
Seq2Seq
[Vinyals and Le, 2015]
Chit-Chat Bot
TV series (~40,000 sentences), US presidential election debates
Sci-Fi Short Film - SUNSPRING
https://www.youtube.com/watch?v=LY7x2Ihqj29
Concluding Remarks
The three pillars of deep learning for NLG
Distributed representation – generalization
Recurrent connection – long-term dependency
Conditional RNN – flexibility/creativity
Useful techniques in deep learning for NLG
Learnable gates
Attention mechanism
Generating longer/complex sentences
Phrase dialogue as a conditional generation problem
Conditioning on the raw input sentence → chit-chat bot
Conditioning on both structured and unstructured sources → task-completing dialogue system