Milestone 3 – Learning-Based Agent

Dialogue system pipeline (figure):

• Input: Speech Signal, or Text Input ("Are there any action movies to see this weekend?")

• Speech Recognition
  Output: Hypothesis "are there any action movies to see this weekend"

• Language Understanding (LU): Domain Identification, User Intent Detection, Slot Filling
  Output: Semantic Frame request_movie (genre=action, date=this weekend)

• Dialogue Management (DM): Dialogue State Tracking (DST), Dialogue Policy
  Connected to: Backend Database / Knowledge Providers
  Output: System Action/Policy request_location

• Natural Language Generation (NLG)
  Output: Text response "Where are you located?"
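The data passed between the pipeline stages can be sketched in a few lines of Python. This is only an illustration of the example in the figure (request_movie with genre=action and date=this weekend leading to request_location). The class `SemanticFrame`, the slot order in `REQUIRED_SLOTS`, the hand-written rule in `next_system_action`, and every template except "Where are you located?" are assumptions made for the sketch, not part of the milestone specification.

```python
from dataclasses import dataclass, field


@dataclass
class SemanticFrame:
    """LU output: one intent plus the slots filled so far."""
    intent: str
    slots: dict = field(default_factory=dict)


# Assumption: the movie domain needs these three slots, asked in this order.
REQUIRED_SLOTS = ("genre", "date", "location")

# Assumption: hand-written templates standing in for a learned NLG.
TEMPLATES = {
    "request_genre": "What kind of movie would you like?",
    "request_date": "When would you like to go?",
    "request_location": "Where are you located?",
    "inform_movie": "Here are some movies you might like.",
}


def next_system_action(frame: SemanticFrame) -> str:
    """Rule-based stand-in for the dialogue policy: ask for the first missing slot."""
    for slot in REQUIRED_SLOTS:
        if slot not in frame.slots:
            return f"request_{slot}"
    return "inform_movie"


def render(action: str) -> str:
    """Template-based stand-in for NLG."""
    return TEMPLATES.get(action, action)


frame = SemanticFrame("request_movie", {"genre": "action", "date": "this weekend"})
action = next_system_action(frame)
print(action, "->", render(action))
```

In Milestone 3 the rule in `next_system_action` is what the RL agent replaces, and `render` is what the NN-based NLG replaces.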

Milestone 3 – Speech / Multimodal API

• Google Cloud Platform / Chrome Extension (demo)

• Microsoft Cognitive Services (demo)

Milestone 3 – RL-Based DM

1) Dialogue policy optimization

• Reinforcement learning agent

• Check whether the agent can handle misrecognized text or misunderstandings

2) Evaluation

• Learning curve

• Success/Fail

• Number of turns (#turns)

• Reward

Check the strategies the agent applies to make sure your RL agent shows an increasing performance trend.
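As a rough illustration of policy optimization and the evaluation metrics listed above, here is a minimal tabular Q-learning sketch against a toy simulated user. Everything in it is an assumption chosen for brevity: the three slots, the reward values (-1 per turn, +20 for success, -10 for a premature inform), the turn limit, the epsilon schedule, and the `error_rate` parameter that crudely models misrecognized answers. A real agent for this milestone would more likely use a neural policy and a richer user simulator.

```python
import random
from collections import defaultdict

SLOTS = ("genre", "date", "location")  # hypothetical slot set
ACTIONS = tuple(f"request_{s}" for s in SLOTS) + ("inform_movie",)
MAX_TURNS = 10


def step(state, action, rng, error_rate=0.0):
    """Toy simulated user. Returns (next_state, reward, done, success)."""
    if action == "inform_movie":
        success = all(state)
        return state, (20.0 if success else -10.0), True, success
    idx = ACTIONS.index(action)
    heard = rng.random() >= error_rate  # False models a misrecognized answer
    next_state = tuple(f or (heard and i == idx) for i, f in enumerate(state))
    return next_state, -1.0, False, False


def run_episode(q, epsilon, rng, alpha=0.5, gamma=0.9, error_rate=0.0, learn=True):
    """Run one dialogue; returns (success, number of turns, total reward)."""
    state, total, success = (False,) * len(SLOTS), 0.0, False
    for turn in range(1, MAX_TURNS + 1):
        if rng.random() < epsilon:  # explore
            action = rng.choice(ACTIONS)
        else:  # exploit the current Q estimates
            action = max(ACTIONS, key=lambda a: q[(state, a)])
        next_state, reward, done, success = step(state, action, rng, error_rate)
        total += reward
        if learn:  # Q-learning update
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
        state = next_state
        if done:
            break
    return success, turn, total


def train(episodes=1000, seed=0, error_rate=0.0):
    rng = random.Random(seed)
    q, history = defaultdict(float), []
    for ep in range(episodes):
        epsilon = max(0.05, 1.0 - 2.0 * ep / episodes)  # linear decay
        history.append(run_episode(q, epsilon, rng, error_rate=error_rate))
    return q, history


def learning_curve(history, window=100):
    """Per-window (episodes seen, success rate, average #turns, average reward)."""
    rows = []
    for start in range(0, len(history), window):
        chunk = history[start:start + window]
        n = len(chunk)
        rows.append((start + n,
                     sum(s for s, _, _ in chunk) / n,
                     sum(t for _, t, _ in chunk) / n,
                     sum(r for _, _, r in chunk) / n))
    return rows


q_table, history = train()
for episodes_seen, success_rate, avg_turns, avg_reward in learning_curve(history):
    print(f"{episodes_seen:5d}  success={success_rate:.2f}  turns={avg_turns:.1f}  reward={avg_reward:.1f}")
```

Plotting the columns of `learning_curve` against the episode count gives the success-rate, #turns, and reward curves. Calling `train(error_rate=0.2)` is one simple way to check how the learned policy copes with misrecognized input.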

Milestone 3 – NN-Based NLG

3) Model

• RNN-based NLG that generates sentences from the system actions and their associated slots

4) Evaluation

• BLEU score on the training and test sets

• Training data (#sentences)

• Testing data (#sentences)
  • Should be human-written

5) Creativity

• Diverse/interesting responses earn bonus credit
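For reference, the BLEU score used in the evaluation can be sketched as follows. This is a simplified sentence-level version (clipped n-gram precisions up to 4-grams, geometric mean, brevity penalty, no smoothing, whitespace tokenization) written only to show what the metric measures. The function names `ngrams` and `bleu` and the example sentences are assumptions for the sketch. For reported numbers an established implementation such as NLTK's `corpus_bleu` is the safer choice.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, references, max_n=4):
    """Unsmoothed sentence-level BLEU of one candidate against one or more references."""
    cand = candidate.lower().split()
    refs = [r.lower().split() for r in references]
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        # Highest count of each n-gram in any single reference (for clipping).
        max_ref = Counter()
        for ref in refs:
            for gram, count in ngrams(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        clipped = sum(min(c, max_ref[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if clipped == 0 or total == 0:
            return 0.0  # without smoothing, one zero precision zeroes the score
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty against the reference closest in length to the candidate.
    ref_len = min((abs(len(r) - len(cand)), len(r)) for r in refs)[1]
    bp = 1.0 if len(cand) > ref_len else math.exp(1.0 - ref_len / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)


reference = ["where are you located ?"]  # human-written test sentence (made up here)
print(bleu("where are you located ?", reference))
print(bleu("where are you located now", reference))
```

Averaging sentence-level scores is not the same as corpus-level BLEU, which pools n-gram counts over the whole test set before taking precisions, so the report should state which variant was used.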

Milestone 3 Requirements



Report (10%)

• Speech/multimodal API
  • Describe how you implemented speech recognition or richer input analysis

• Reinforcement learning based dialogue policy
  • Describe how you implemented the RL agent (observation, state, etc.)
  • Report the learning curves for reward and success rate

• NN-based NLG
  • Describe how you implemented the NLG
  • Training/testing data split (testing data should come from human-written full sentences)
  • Show some testing results
  • Report the BLEU score

• Performance for simulated dialogues
  • Show some dialogues between the simulated user and the RL agent
  • Report the performance in terms of success rate and reward

Demonstration (5%)

• Send the public link

• TAs will randomly pick 10 interactive dialogues and record the success rate
  • Failed interactions will be forwarded to the team; you can fix them to earn credit