Milestone 3 – Learning-Based Agent
[Figure: dialogue system pipeline]
Speech Signal → Speech Recognition → Hypothesis: "are there any action movies to see this weekend"
Text Input: "Are there any action movies to see this weekend?"
Language Understanding (LU)
• Domain Identification
• User Intent Detection
• Slot Filling
Semantic Frame: request_movie (genre=action, date=this weekend)
Dialogue Management (DM)
• Dialogue State Tracking (DST)
• Dialogue Policy
System Action/Policy: request_location
Natural Language Generation (NLG) → Text response: "Where are you located?"
Backend Database / Knowledge Providers
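The data flow in the figure, from semantic frame to system action to text response, can be sketched in a few lines. This is only an illustration using the example from the slide: the `SemanticFrame` class, the `REQUIRED_SLOTS` table, the template strings other than "Where are you located?", and the rule "ask for the first missing slot" are all assumptions made for this sketch. In the milestone, the rule-based policy is replaced by an RL agent and the templates by an NN-based NLG.

```python
from dataclasses import dataclass, field


@dataclass
class SemanticFrame:
    """Output of language understanding: one intent plus slot-value pairs."""
    intent: str
    slots: dict = field(default_factory=dict)


# Hypothetical slot requirements and response templates (illustration only).
REQUIRED_SLOTS = {"request_movie": ["genre", "date", "location"]}
TEMPLATES = {
    "request_genre": "What genre are you interested in?",
    "request_date": "When would you like to go?",
    "request_location": "Where are you located?",
    "inform_movie": "Here are some movies that match your request.",
}


def choose_action(frame):
    """Rule-based stand-in for the dialogue policy: request the first missing slot."""
    for slot in REQUIRED_SLOTS.get(frame.intent, []):
        if slot not in frame.slots:
            return f"request_{slot}"
    return "inform_movie"


def generate(action):
    """Template-based stand-in for NLG: map a system action to a sentence."""
    return TEMPLATES[action]


# Semantic frame from the slide: request_movie, genre=action, date=this weekend.
frame = SemanticFrame("request_movie", {"genre": "action", "date": "this weekend"})
action = choose_action(frame)
response = generate(action)
print(action, "->", response)
```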
Milestone 3 – Speech / Multimodal API
Google Cloud Platform / Chrome Extension (demo)
Microsoft Cognitive Service (demo)
Milestone 3 – RL-Based DM
1) Dialogue policy optimization
Reinforcement learning agent
Check whether the agent can handle misrecognized text or misunderstandings
2) Evaluation
Learning curve
Success/Fail
#turns
Reward
Check the strategies the agent applies to make sure your RL agent shows an increasing performance trend
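As a rough illustration of policy optimization and of the three evaluation quantities (success, #turns, reward), the sketch below trains a tabular Q-learning agent on a toy slot-filling task. Everything in it is an assumption made for illustration: the state is just the number of filled slots, the two actions `REQUEST` and `INFORM`, the reward values, and the `ERROR_RATE` that stands in for misrecognized user input. A real agent for this milestone would use a richer state and a user simulator.

```python
import random

N_SLOTS, MAX_TURNS = 3, 10
REQUEST, INFORM = 0, 1
ERROR_RATE = 0.2            # assumed chance that a user answer is misrecognized
TURN_PENALTY, SUCCESS_REWARD, FAIL_REWARD = -1.0, 20.0, -10.0


def step(filled, action, rng):
    """Toy environment. Returns (next_state, reward, done, success)."""
    if action == INFORM:
        success = filled == N_SLOTS
        return filled, (SUCCESS_REWARD if success else FAIL_REWARD), True, success
    # REQUEST: the slot is filled only if the answer is recognized correctly.
    if filled < N_SLOTS and rng.random() > ERROR_RATE:
        filled += 1
    return filled, TURN_PENALTY, False, False


def train(episodes=2000, alpha=0.1, gamma=0.95, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_SLOTS + 1)]   # q[state][action]
    log = []                                       # (success, turns, total reward)
    for ep in range(episodes):
        eps = max(0.0, 1.0 - ep / (episodes / 2))  # linearly decaying exploration
        state, total, turns, success, done = 0, 0.0, 0, False, False
        while not done:
            if rng.random() < eps:
                action = rng.randrange(2)
            else:
                action = max((REQUEST, INFORM), key=lambda a: q[state][a])
            nxt, reward, done, success = step(state, action, rng)
            turns += 1
            if not done and turns >= MAX_TURNS:    # running out of turns is a failure
                reward, done = FAIL_REWARD, True
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state, total = nxt, total + reward
        log.append((success, turns, total))
    return q, log


def summarize(log):
    """Average success rate, #turns, and reward over a window of episodes."""
    n = len(log)
    return (sum(s for s, _, _ in log) / n,
            sum(t for _, t, _ in log) / n,
            sum(r for _, _, r in log) / n)


q_table, history = train()
for name, window in (("first 200", history[:200]), ("last 200", history[-200:])):
    rate, turns, reward = summarize(window)
    print(f"{name}: success={rate:.2f} turns={turns:.1f} reward={reward:.1f}")
```

Plotting `summarize` over consecutive windows of episodes gives the learning curve; an increasing trend in success rate and reward is the behavior to look for.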
Milestone 3 – NN-Based NLG
3) Model
RNN-based NLG for generating sentences given the system actions associated with the slots
4) Evaluation
BLEU score for train and test
Training data (#sentences)
Testing data (#sentences): should be human-written
5) Creativity
Diverse/interesting responses for bonus
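To make the BLEU evaluation concrete, here is a minimal sentence-level sketch: modified (clipped) n-gram precision combined with a brevity penalty. It is unsmoothed and uses whitespace tokenization, both simplifying assumptions, so its numbers will not necessarily match an established toolkit. The function names and example sentences are hypothetical; for reported scores, a standard implementation is the safer choice.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, references, max_n=4):
    """Unsmoothed sentence-level BLEU with uniform n-gram weights."""
    cand = candidate.lower().split()
    refs = [r.lower().split() for r in references]
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        # For each n-gram, the maximum count seen in any single reference.
        max_ref = Counter()
        for ref in refs:
            for gram, count in ngrams(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if clipped == 0 or total == 0:
            return 0.0  # without smoothing, one zero precision zeroes the score
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty against the reference length closest to the candidate length.
    ref_len = min((len(r) for r in refs), key=lambda length: (abs(length - len(cand)), length))
    bp = 1.0 if len(cand) > ref_len else math.exp(1 - ref_len / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)


reference = ["where are you located ?"]
print(bleu("where are you located ?", reference))
print(bleu("which city are you in ?", reference, max_n=2))
```

Averaging this score over the generated sentences of the training set and of the human-written test set gives the two numbers requested above. Corpus-level BLEU, which pools n-gram counts before taking the ratio, is a common alternative.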
Milestone 3 Requirements
Report (10%)
Speech/multimodal API
• Describe how you implement speech recognition or richer input analysis
Reinforcement learning based dialogue policy
• Describe how you implement the RL agent (observation, state, etc.)
• Report the learning curves for reward and success rate
NN-based NLG
• Describe how you implement the NLG
• Training/testing data split (testing data should come from human-written full sentences)
• Show some testing results
• Report the BLEU score
Performance for simulated dialogues
• Show some dialogues between the simulated user and the RL agent
• Report the performance in terms of success rate and reward
Demonstration (5%)
• Send the public link
• TAs will randomly pick 10 interactive dialogues and record the success rate
• Failed interactions will be forwarded to the team; you can fix them to earn credit