Knowledge Providers Language Understanding (LU) Framework

(1)

(2)

Framework

Speech Signal → Speech Recognition → Hypothesis

  are there any action movies to see this weekend

Text Input

  Are there any action movies to see this weekend?

Language Understanding (LU) → Semantic Frame

• Domain Identification

• User Intent Detection

• Slot Filling

  request_movie

  genre=action, date=this weekend

Dialogue Management (DM) → System Action/Policy

• Dialogue State Tracking (DST)

• Dialogue Policy

  request_location

Natural Language Generation (NLG) → Text response

  Where are you located?

Backend Database / Knowledge Providers
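
The movie example above can be traced end to end with a minimal sketch. Everything here is an assumption for illustration: the keyword-matching `understand`, the `REQUIRED_SLOTS` tuple, the rule-based `decide`, and every template except the `request_location` response are hypothetical stand-ins, not the course's reference implementation.

```python
from dataclasses import dataclass, field


@dataclass
class SemanticFrame:
    """LU output: one intent plus the slots filled from the utterance."""
    intent: str
    slots: dict = field(default_factory=dict)


def understand(utterance: str) -> SemanticFrame:
    """Toy keyword-matching LU; a stand-in for a trained model."""
    text = utterance.lower()
    slots = {}
    if "action" in text:
        slots["genre"] = "action"
    if "this weekend" in text:
        slots["date"] = "this weekend"
    intent = "request_movie" if "movie" in text else "unknown"
    return SemanticFrame(intent, slots)


# Assumed for illustration: slots the policy needs before it can answer.
REQUIRED_SLOTS = ("genre", "date", "location")


def decide(frame: SemanticFrame) -> str:
    """Toy rule-based policy: request the first required slot that is missing."""
    if frame.intent != "request_movie":
        return "request_rephrase"
    for slot in REQUIRED_SLOTS:
        if slot not in frame.slots:
            return f"request_{slot}"
    return "inform_movie"


# Template-based NLG. Only request_location comes from the slide example;
# the other templates are hypothetical.
TEMPLATES = {
    "request_location": "Where are you located?",
    "request_genre": "What genre are you interested in?",
    "request_date": "When would you like to go?",
    "request_rephrase": "Sorry, could you rephrase that?",
    "inform_movie": "Here are the movies I found.",
}


def generate(action: str) -> str:
    """Map a system action to its canned text response."""
    return TEMPLATES[action]


frame = understand("Are there any action movies to see this weekend?")
action = decide(frame)
print(frame.intent, frame.slots, action, generate(action))
```

Because the utterance fills genre and date but not location, this toy policy asks for the location, matching the system action on the slide.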

(3)

Framework

If your LU is weak, the rule-based policy will easily perform badly

Check whether all values in the backend tables can be searched as the target

Check whether the output responses are diverse enough

If possible, try richer multimodal input signals for better interaction

(4)

System Improvement



Ontology: check whether all columns in the table can be searched as the target
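
One way to run the ontology check is to compare each table's columns against the slots a user is allowed to request. The sketch below assumes a hypothetical schema (`TABLE_COLUMNS`) and a hypothetical set of requestable slots (`REQUESTABLE_SLOTS`); substitute your own tables and ontology.

```python
def unsearchable_columns(table_columns, requestable_slots):
    """Return, per table, the columns that no requestable slot covers."""
    missing = {}
    for table, columns in table_columns.items():
        gaps = [col for col in columns if col not in requestable_slots]
        if gaps:
            missing[table] = gaps
    return missing


# Hypothetical backend schema: table name -> column names.
TABLE_COLUMNS = {
    "movies": ["title", "genre", "date", "theater"],
    "theaters": ["theater", "location", "phone"],
}

# Hypothetical ontology: slots the user can ask the system to return.
REQUESTABLE_SLOTS = {"title", "genre", "date", "theater", "location"}

gaps = unsearchable_columns(TABLE_COLUMNS, REQUESTABLE_SLOTS)
print(gaps)  # columns that cannot yet be searched as the target
```

In this made-up schema the `phone` column of `theaters` is reported, since no requestable slot maps to it.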



LU: evaluate the LU to see the coverage of the understanding module

• Testing data should come from real humans

• Share a link to your system to collect more dialogues, then annotate them for evaluation
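
Once real dialogues are annotated, LU coverage can be summarized with frame accuracy. The sketch below uses one common reading of that metric, where a frame counts as correct only if the intent and every slot match the annotation exactly; the `(intent, slots)` tuple layout and the sample frames are assumptions for illustration.

```python
def frame_accuracy(predictions, references):
    """Fraction of utterances whose intent and all slots match exactly."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    correct = sum(pred == ref for pred, ref in zip(predictions, references))
    return correct / len(references)


# Hypothetical annotated frames: (intent, {slot: value}).
references = [
    ("request_movie", {"genre": "action", "date": "this weekend"}),
    ("request_theater", {"location": "taipei"}),
    ("request_movie", {"genre": "comedy"}),
]

# Hypothetical LU output; the last frame has the wrong slot value.
predictions = [
    ("request_movie", {"genre": "action", "date": "this weekend"}),
    ("request_theater", {"location": "taipei"}),
    ("request_movie", {"genre": "action"}),
]

print(frame_accuracy(predictions, references))
```

A single wrong slot value makes the whole frame incorrect, so this metric is stricter than intent accuracy or slot F1 taken separately.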



DM: add multi-turn interactions into the simulator for training the RL agent

• The RL agent should handle misunderstanding better than the rule-based agent

• Check whether the agent can handle misrecognized text or misunderstandings

• If the RL agent performs worse than the rule-based agent, increase your system complexity

• More functionality/backend databases, more complex simulated interactions

• Check the strategies the agent applied to make sure your RL agent shows an increasing performance trend
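
A rough way to check for an increasing performance trend is to smooth the per-epoch success rate and compare the start and end of the curve. This is a heuristic sketch, not a prescribed test: the helper names, the window size, and the numbers in `rl_curve` and `rule_based_rate` are all made up for illustration.

```python
def moving_average(values, window):
    """Smooth a curve with a trailing window of the given size."""
    if window < 1 or len(values) < window:
        raise ValueError("need window >= 1 and at least `window` values")
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


def has_increasing_trend(success_rates, window=3):
    """Heuristic: the smoothed curve ends higher than it starts."""
    smoothed = moving_average(success_rates, window)
    return smoothed[-1] > smoothed[0]


def beats_baseline(success_rates, baseline_rate, window=3):
    """Heuristic: the smoothed final success rate exceeds the baseline."""
    return moving_average(success_rates, window)[-1] > baseline_rate


# Illustrative numbers only, not measured results.
rl_curve = [0.05, 0.10, 0.08, 0.20, 0.35, 0.40, 0.55, 0.60]
rule_based_rate = 0.45

print(has_increasing_trend(rl_curve), beats_baseline(rl_curve, rule_based_rate))
```

If the first check fails, the reward design or the simulator is worth inspecting; if only the second fails, the task may be too simple for RL to outperform the rules.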



NLG: make responses more diverse and interesting
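
Response diversity can be approximated by the ratio of unique n-grams to total n-grams across generated responses, often called distinct-n. The slides do not prescribe this metric; it is offered here as one possible proxy, with whitespace tokenization and the sample responses as simplifying assumptions.

```python
def distinct_n(responses, n=1):
    """Ratio of unique n-grams to total n-grams over all responses."""
    ngrams = []
    for response in responses:
        tokens = response.lower().split()
        ngrams.extend(
            tuple(tokens[i : i + n]) for i in range(len(tokens) - n + 1)
        )
    if not ngrams:
        return 0.0
    return len(set(ngrams)) / len(ngrams)


# Hypothetical system outputs for the same dialogue act.
repetitive = ["where are you located", "where are you located"]
varied = ["where are you located", "which city are you in"]

print(distinct_n(repetitive), distinct_n(varied))
```

A higher value means fewer repeated phrases; it says nothing about whether the responses are correct, so it complements rather than replaces human judgments of naturalness.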



Multimodality: try richer multimodality for interesting interactions

• Emotion recognition, speaker recognition, etc. for better greetings

(5)

Final Score



System functionality



#tables, #slots, #intents



System success performance



Human testing performance evaluated by TAs

• ~30 dialogues

• If the failed dialogues are fixed, the refined performance is used



Evaluation



Correctness and reasonableness

• Testing data should come from real humans instead of generated patterns



Creativity



Multimodality usage (e.g. emotion)



Diverse/interesting responses



The poster template can be revised freely [link]

• Due: 6/17 23:59:59

Top 3 Best System Awards

Creativity Awards

(6)

Milestone 3 / Peer Demo Log



Improve your system based on the feedback



Milestone 3 [link]



Peer demo feedback [link]



Team peer review form



Due 6/15 23:59:59

(7)

Poster Content (1)



Demo link / QR code for app



Input



Interaction example



Supported APIs (speech, vision, emotion, etc.)



Functionality your system supports



Ontology



DB tables (size of the DB, #column, #slot, #intent)



How did you get the DB data



LU



Model architecture



Training data size



Testing data size (should come from real humans)



Performance on testing data (frame accuracy, etc.)

3 numbers should be close

(8)

Poster Content (2)



DM



Model architecture



User simulation summary



Trend of the learning curves for rule-based and RL agents (success rate, reward, etc.)



Show an example that highlights the difference between the two agents



NLG



Model architecture



Training data size



Testing data size (should come from real humans)



Performance on testing data (BLEU score, naturalness)

(9)

Poster Presentation



2-minute presentation



Supported functions



Special features



Whole system performance



3-minute demonstration



Allow the user to test the system

(10)

Agenda



9 am – 10 am: Preparation (poster, system, etc.)

10 am – 11:50 am: Presentation

12 pm – 12:20 pm: Lunch break & judge discussion

12:20 pm – 1 pm: Company sharing, award announcement

(11)

Final Report / Code



Due: 6/25 (Sun) 23:59:59



Code



README, Requirements



Report



GitHub page [link]



Put the poster contents / figures into the page as the report (can be more detailed)
