Integrating Intra-Speaker Topic Modeling and Temporal-Based Inter-Speaker Topic Modeling in Random Walk for Improved Multi-Party Meeting Summarization

• Compute a set of new scores S(U_i) based on the graph structure, satisfying Eq. (1) (see paper)

• Updated importance: eigenvector of P′


Yun-Nung (Vivian) Chen and Florian Metze

1. Summary

Idea:

o Important utterances are topically similar to each other

o Utterances from the same speaker usually focus on similar topics

o Temporally adjacent utterances have similar topic distributions

Approach for extractive summary

o Construct a graph to represent the utterances in the document

(node: utterance, edge: weighted by topical similarity)

o Topic similarity models intra- and inter-speaker information

o Use the graph to compute the importance of each utterance

• Basic Idea: high importance means

o Utterances with higher Latent Topic Entropy (original score)

o Utterances topically similar to the indicative utterances
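The basic idea above can be read as a fixed-point computation: each utterance keeps part of its original score and receives the rest from its neighbors. The following is a minimal sketch under stated assumptions, not the formulation from the paper: the interpolation weight `alpha`, its default value, the function name, and the row normalization of edge weights are all hypothetical choices made for illustration.

```python
def random_walk_scores(original, sim, alpha=0.9, iters=200, tol=1e-10):
    """Iterate S(i) = (1 - alpha) * original(i) + alpha * sum_j p(j, i) * S(j).

    original: baseline importance scores (e.g. LTE), assumed non-negative.
    sim:      sim[j][i] is the edge weight from utterance j to utterance i.
    alpha:    hypothetical interpolation weight, not taken from the poster.
    """
    n = len(original)
    total = sum(original)
    base = [x / total for x in original]  # normalize original scores to sum to 1
    # Row-normalize outgoing edge weights so each row is a probability
    # distribution; a node with no outgoing weight spreads uniformly.
    p = []
    for row in sim:
        s = sum(row)
        p.append([w / s if s > 0 else 1.0 / n for w in row])
    scores = base[:]
    for _ in range(iters):
        new = [
            (1 - alpha) * base[i]
            + alpha * sum(p[j][i] * scores[j] for j in range(n))
            for i in range(n)
        ]
        converged = max(abs(a - b) for a, b in zip(new, scores)) < tol
        scores = new
        if converged:
            break
    return scores
```

With normalized inputs the scores keep summing to 1 at every step, so an utterance that receives no incoming weight ends up with only its `(1 - alpha)` share of the original score.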

5. Experiments

• Graph-based approach can improve summarization performance using topical similarity

• Intra-speaker topic modeling alone improves the results, because utterances from a speaker who produces more important utterances tend to be important themselves

• Inter-speaker topic modeling alone does not improve the results

• Integrating intra- and inter-speaker topic modeling performs best for ASR and manual transcripts

6. Conclusions

• The paper is supported by the IES, U.S. Department of Education, through Grant R305A080628, and NSF workshop “Virtual Speech Kitchen”.

Terms of Eq. (1): the Latent Topic Entropy (original importance score) and the scores propagated from an utterance's neighbors, weighted by topical similarity

• Dataset: 10 meetings from CMU Speech Group

4. Random Walk

Best Student Paper Nomination

Intra-Speaker Topic Modeling

Increase the edge similarity if two utterances are from the same speaker S_k

3. Intra/Inter-Speaker Topic Modeling

• The utterances from the same speaker can partially share the importance
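One way to picture the intra-speaker adjustment is as a boost applied to edges whose endpoints share a speaker. The sketch below is only an illustration: the poster states that the edge similarity is increased but not by how much or in what functional form, so the multiplicative form, the name `apply_intra_speaker`, and the factor `w_intra` are assumptions.

```python
def apply_intra_speaker(sim, speakers, w_intra=0.5):
    """Return a copy of sim with same-speaker edges boosted.

    sim:      square matrix of edge similarities between utterances.
    speakers: speakers[i] is the speaker label of utterance i.
    w_intra:  hypothetical boost factor (not given on the poster).
    """
    n = len(sim)
    return [
        [
            # Boost only edges between two different utterances of one speaker.
            sim[i][j] * (1 + w_intra)
            if i != j and speakers[i] == speakers[j]
            else sim[i][j]
            for j in range(n)
        ]
        for i in range(n)
    ]
```

The function returns a new matrix rather than modifying `sim`, so the unadjusted graph stays available for comparison.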

[Figure: inter-speaker weight ω_inter (y-axis, 0 to 1) versus position offset in temporal order around utterance l_i (x-axis, -8 to 8; l_i-2, l_i-1, l_i, l_i+1, l_i+2), with curves for σ = 0.5, σ = 1, and σ = 2]

• Graph Construction

o Node: utterances in a document

o Edge weight: topical similarity between utterances

2. Graph Construction

• The utterances topically similar to more important utterances should be more important

[Figure: a meeting document d with utterances U₁ to U₆ as the nodes of a graph]

Sim(U_i, U_j): latent topic generative significance of utterance U_i to U_j based on PLSA (see paper)
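To make the graph construction concrete, the sketch below builds an edge-weight matrix from per-utterance topic distributions. It deliberately swaps in cosine similarity as a stand-in: the PLSA-based latent topic generative significance used for Sim(U_i, U_j) is defined in the paper and is directional, whereas cosine similarity is symmetric. The names `cosine` and `build_graph` and the input layout are assumptions made for illustration.

```python
import math


def cosine(p, q):
    """Cosine similarity between two topic distributions (stand-in measure)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm > 0 else 0.0


def build_graph(topic_dists):
    """Return an n x n edge-weight matrix; nodes are utterances.

    topic_dists[i] is the topic distribution of utterance i, e.g. taken
    from a trained topic model. Self-loops are set to 0.
    """
    n = len(topic_dists)
    return [
        [cosine(topic_dists[i], topic_dists[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
```

Any other similarity function with the same two-argument shape could replace `cosine` here, including an asymmetric one.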


Inter-Speaker Topic Modeling

Increase the edge similarity if two utterances are closer in position in the dialogue, based on a normal distribution
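The temporal adjustment can be sketched as a weight that decays with the distance between utterance positions. This is a hedged illustration only: it assumes the weight is the density of a zero-mean normal distribution with standard deviation `sigma`, that it is applied multiplicatively to every pair of distinct utterances, and that positions are simply list indices. None of these details, nor the function names, are confirmed by the poster.

```python
import math


def temporal_weight(distance, sigma=1.0):
    """Density of N(0, sigma^2) evaluated at the position offset."""
    return math.exp(-(distance ** 2) / (2 * sigma ** 2)) / (
        sigma * math.sqrt(2 * math.pi)
    )


def apply_inter_speaker(sim, sigma=1.0):
    """Return a copy of sim with edges boosted by temporal closeness.

    Utterance i is assumed to sit at position i in the dialogue, so the
    offset between utterances i and j is i - j.
    """
    n = len(sim)
    return [
        [
            sim[i][j] * (1 + temporal_weight(i - j, sigma)) if i != j else sim[i][j]
            for j in range(n)
        ]
        for i in range(n)
    ]
```

A smaller `sigma` concentrates the boost on immediately adjacent utterances, while a larger `sigma` spreads a weaker boost over a wider window.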

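[Figure: Flowchart of proposed approach. A multi-party meeting corpus (ASR and manual transcripts) feeds a topic model (e.g. PLSA, LDA). The baseline part scores utterances by Latent Topic Entropy and produces a summary; the re-rank part applies random walk re-ranking and produces a summary. Example ranked score lists shown: U₅ 93, U₂ 88, U₃ 75, ... and U₅ 98, U₄ 85, U₃ 73, ...]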
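The last step of the flowchart, turning ranked scores into an extractive summary, could look like the sketch below. It is an assumption-laden illustration: the budget is taken to be a fraction of the total word count (the poster reports a 30% summary but does not say how the ratio is measured), words are split on whitespace, and the name `select_summary` is hypothetical.

```python
def select_summary(utterances, scores, ratio=0.3):
    """Pick top-scoring utterances until the word budget is used up.

    utterances: list of utterance strings in dialogue order.
    scores:     importance score per utterance (baseline or re-ranked).
    ratio:      assumed fraction of the total word count to keep.
    Returns the indices of the selected utterances in dialogue order.
    """
    lengths = [len(u.split()) for u in utterances]
    budget = ratio * sum(lengths)
    ranked = sorted(range(len(utterances)), key=lambda i: scores[i], reverse=True)
    chosen, used = [], 0
    for i in ranked:
        if used + lengths[i] > budget:
            continue  # skip utterances that would exceed the budget
        chosen.append(i)
        used += lengths[i]
    return sorted(chosen)
```

Skipping an over-budget utterance and continuing down the ranking is one design choice; stopping at the first utterance that does not fit would be an equally plausible alternative.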

• Temporally adjacent utterances can partially share the importance

[Figure: 30% Summary for ASR Transcripts. ROUGE-1 and ROUGE-L scores (axis range 0.44 to 0.50) for Baseline: LTE, Random Walk (RW), RW + Intra-Speaker, RW + Inter-Speaker, and RW + Inter- & Intra-Speaker]