
Learning OOV through Semantic Relatedness in Spoken Dialog Systems


Academic year: 2022


Ming Sun, Yun-Nung (Vivian) Chen, and Alexander I. Rudnicky

• Speech recognition and language understanding performance can be improved through an OOV expect-and-learn procedure.

• A limited domain vocabulary can be used to acquire OOVs effectively by applying word relatedness through web knowledge bases.

• With data-driven semantic relatedness, both the global and local learning procedures harvest more than 50% of OOVs, leading to better recognition and understanding performance.

• This work demonstrates that

  o OOV learning can benefit dialog systems

  o the proposed expect-and-learn strategy outperforms the traditional detect-and-learn strategy: it is more effective and requires no human involvement.

1. Linguistic semantic relatedness

  o Defined by linguistic resources, e.g., WordNet (WN), Paraphrase Database (PPDB) (Ganitkevitch et al., 2013)

2. Data-driven semantic relatedness

  o Distributional semantics, e.g., continuous bag-of-words embeddings (CBOW) (Mikolov et al., 2013)
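As a rough illustration of data-driven relatedness, the sketch below ranks words by cosine similarity between embedding vectors. The poster does not state which similarity measure is used, so cosine similarity is an assumption here (it is a common choice for word embeddings). The two-dimensional vectors and the names `cosine`, `most_related`, and `toy_vectors` are hypothetical; real CBOW vectors would come from a trained embedding model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_related(word, vectors, k=2):
    """Return the k words whose vectors are closest to `word` by cosine similarity."""
    others = [w for w in vectors if w != word]
    others.sort(key=lambda w: cosine(vectors[word], vectors[w]), reverse=True)
    return others[:k]


# Hand-made toy vectors (assumption): stand-ins for trained CBOW embeddings.
toy_vectors = {
    "photo": [1.0, 0.0],
    "selfie": [0.8, 0.6],
    "stock": [0.0, 1.0],
}

print(most_related("photo", toy_vectors, k=1))
```

A linguistic resource such as WordNet or PPDB could be plugged in the same way by replacing `cosine` with a lookup of that resource's relatedness score.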

• Detect-and-Learn (Qin et al., 2011; 2012):

  o Discover OOV words during the conversation

  o Example:

    S: “I heard something like SELF, can you repeat it?”

    U: “It’s SELFIE.”

  o Drawbacks

    • Limited number of new words

    • Requires human effort to correct spellings and pronunciations

• Expect-and-Learn (proposed):

  o Use semantic relatedness to automatically enrich the vocabulary and language model beforehand

  o Advantages

    • A large number of potentially useful new words can be learned

    • No human involvement

• Vocabulary Expansion

  o Idea: learn new words related to the current domain, which is represented by in-vocabulary words (IVs)

  1. Starting from the IV with the highest frequency, v*, extract one unseen word w* from the resource according to:

    » Local relatedness (Algo1): w* is the word most related to v*

    » Global relatedness (Algo2): w* is the word most related to the complete IV set

  2. Repeat until the vocabulary size reaches a threshold

• Language Model Expansion

  o Use Kneser-Ney smoothing to estimate the unigram probability of the newly learned OOVs.
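The vocabulary expansion steps above can be sketched as a greedy loop. This is a minimal sketch under several assumptions that the poster does not spell out: IVs are visited in descending frequency and cycled through repeatedly, candidates for each step are the unseen neighbors of the current IV, and the global score of a candidate is the sum of its relatedness to every IV. The function name `expand_vocabulary` and the toy relatedness table are hypothetical.

```python
from itertools import cycle


def expand_vocabulary(iv_counts, relatedness, target_size, strategy="local"):
    """Greedily learn unseen words until the vocabulary reaches target_size.

    iv_counts:   dict mapping each in-vocabulary word (IV) to its frequency
    relatedness: dict mapping a word to {related word: score} (toy resource)
    strategy:    "local" (Algo1) or "global" (Algo2)
    """
    vocab = set(iv_counts)
    ivs_by_freq = sorted(iv_counts, key=iv_counts.get, reverse=True)
    learned = []
    stalled = 0  # consecutive IVs that offered no unseen candidate

    for v in cycle(ivs_by_freq):
        if len(vocab) >= target_size or stalled >= len(ivs_by_freq):
            break
        candidates = {w: s for w, s in relatedness.get(v, {}).items()
                      if w not in vocab}
        if not candidates:
            stalled += 1
            continue
        if strategy == "local":
            # Algo1: pick the candidate most related to the current IV v.
            best = max(candidates, key=candidates.get)
        else:
            # Algo2 (assumed scoring): pick the candidate with the highest
            # summed relatedness to the complete original IV set.
            best = max(
                candidates,
                key=lambda w: sum(relatedness.get(iv, {}).get(w, 0.0)
                                  for iv in iv_counts),
            )
        vocab.add(best)
        learned.append(best)
        stalled = 0
    return learned


# Toy data (assumption): two IVs and a hand-made relatedness resource.
toy_counts = {"photo": 5, "camera": 3}
toy_relatedness = {
    "photo": {"selfie": 0.9, "picture": 0.8, "camera": 0.7},
    "camera": {"picture": 0.9, "lens": 0.6, "selfie": 0.2},
}

print(expand_vocabulary(toy_counts, toy_relatedness, 4, "local"))
print(expand_vocabulary(toy_counts, toy_relatedness, 4, "global"))
```

On this toy data the two strategies learn the same words in a different order, because the global score favors words that are related to several IVs at once.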

Recognition and Understanding Performance

Results are reported for two settings: only domain-specific models, and domain + generic models.

Only Domain Specific Models

Learning Strategy   Vocab Size   OOV Rate (%)   Recog. WER (%)   SLU F1 (%)
Baseline            2854         22.6           49.9             57.0
Algo1               5394         11.7           41.6             65.4
Algo2               5394         11.6           42.0             65.1
Oracle              4254         0.0            23.5             80.9

Summary Experimental Results

• Dataset: Wall Street Journal

  o Acoustic model: WSJ GMM-HMM, semi-continuous

  o Pronunciation: CMU Dictionary + Logios Lexicon Tool

• OOV Coverage Evaluation

  o How many OOV tokens in the test set can be covered by using different relatedness resources.
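The coverage measure described above can be computed as the fraction of OOV tokens in the test set that appear among the learned words. The sketch below is one plausible reading of that definition, not the authors' evaluation script; the function names and the tiny token list are hypothetical.

```python
def oov_rate(test_tokens, vocab):
    """Fraction of test tokens that are not in the vocabulary."""
    if not test_tokens:
        return 0.0
    return sum(1 for t in test_tokens if t not in vocab) / len(test_tokens)


def oov_coverage(test_tokens, base_vocab, learned_words):
    """Fraction of OOV test tokens (relative to base_vocab) found in learned_words."""
    oov_tokens = [t for t in test_tokens if t not in base_vocab]
    if not oov_tokens:
        return 0.0
    return sum(1 for t in oov_tokens if t in learned_words) / len(oov_tokens)


# Toy example (assumption): seven test tokens, three of which are OOV.
tokens = "i want to selfie and selfie again".split()
base_vocab = {"i", "want", "to", "and"}
learned = {"selfie"}

print(oov_coverage(tokens, base_vocab, learned))  # share of OOV tokens covered
print(oov_rate(tokens, base_vocab | learned))     # OOV rate after expansion
```

Counting tokens rather than word types means that a frequent OOV word contributes more to coverage than a rare one.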

OOV Learning Method

[Figure: OOV coverage (y-axis, 0.00 to 0.60) versus number of learned words (x-axis, 0 to 3000). Curves: CBOW-Algo1, CBOW-Algo2, PPDB-Algo1, PPDB-Algo2, WN-Algo1, WN-Algo2, Detect & Learn, Random.]

Expect-and-Learn Procedure

Relatedness Resources

Conclusion

[Diagram labels: Test Utterance; Recognition Result; “i want to selfie”; Learned OOV; OOV Learning; Domain-Specific Collection; Domain Vocab; Domain LM; ASR.]

Recognition and Understanding Performance: Domain + Generic Models

Learning Strategy   Vocab Size   OOV Rate (%)   Recog. WER (%)   SLU F1 (%)
Baseline            20175        3.6            21.7             82.2
Algo1               22599        3.0            20.3             83.2
Algo2               22599        3.0            20.4             83.2
Oracle              20431        0.0            15.1             87.1

• Motivation

  o Domain language may drift over time, so ensuring language coverage in dialog systems can be a challenge (Furnas et al., 1987).

  o The mismatch between training data and current input increases recognition errors and misunderstanding.

  o The detect-and-learn strategy requires human effort and takes more time to adapt the vocabulary and LM.

• Approach: Expect-and-Learn

  o Automatically acquire potential out-of-vocabulary (OOV) words by leveraging different types of word relatedness.

• Result

  o Both recognition and semantic parsing accuracy improve after acquiring potential OOVs.
