• Compute a set of new scores based on graph structure, S(Ui) satisfying
• Updated importance: eigenvector of P’
Integrating Intra-Speaker Topic Modeling and Temporal-Based Inter-Speaker Topic Modeling in Random Walk for Improved Multi-Party Meeting Summarization
Yun-Nung (Vivian) Chen and Florian Metze
1. Summary
Idea:
o Important utterances are topically similar to each other
o Utterances from the same speaker usually focus on similar topics
o Temporally adjacent utterances have similar topic distributions
Approach for extractive summary
o Construct a graph to represent the utterances in the document
(node: utterance; edge: weighted by topical similarity)
o Topical similarity models intra- and inter-speaker information
o Use the graph to compute the importance of each utterance
• Basic Idea: high importance means
o Utterances with higher Latent Topic Entropy (original score)
o Utterances topically similar to the indicative utterances
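Once every utterance has an importance score, an extractive summary can be assembled by keeping the top-ranked utterances. The following is a minimal sketch, not the authors' implementation: the function name `select_summary`, the greedy selection, and the choice to measure the 30% budget in whitespace-separated words are all assumptions made for illustration.

```python
def select_summary(utterances, scores, ratio=0.3):
    """Greedily pick the highest-scored utterances within a word budget.

    Assumption: the summary ratio is measured in whitespace-separated words.
    Returns the indices of the chosen utterances in document order.
    """
    budget = ratio * sum(len(u.split()) for u in utterances)
    # Rank utterance indices from most to least important.
    ranked = sorted(range(len(utterances)), key=lambda i: scores[i], reverse=True)
    chosen, used = [], 0
    for i in ranked:
        n_words = len(utterances[i].split())
        if used + n_words > budget:
            continue  # too long for the remaining budget; try shorter ones
        chosen.append(i)
        used += n_words
    return sorted(chosen)
```

Returning the indices in document order keeps the extracted utterances readable as a transcript excerpt.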
5. Experiments
• Graph-based approach can improve summarization performance using topical similarity
• Intra-speaker topic modeling alone improves the results, because utterances from a speaker who produces more important utterances are themselves likely to be important
• Inter-speaker topic modeling alone does not improve the results
• Integrating intra- and inter-speaker topic modeling performs best for both ASR and manual transcripts
6. Conclusions
• The paper is supported by the IES, U.S. Department of Education, through Grant R305A080628, and NSF workshop “Virtual Speech Kitchen”.
Latent Topic Entropy (original importance score)
Scores propagated from neighbors, weighted by topical similarity
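The two terms above (the original score and the score propagated from neighbors) can be combined by an interpolated random walk. The sketch below is one common way to do this and is not taken from the poster: the interpolation weight `alpha`, its default of 0.9, and the power-iteration stopping rule are assumptions. `transition[j, i]` is assumed to be the row-normalised weight of the edge from utterance j to utterance i.

```python
import numpy as np

def random_walk_scores(initial, transition, alpha=0.9, tol=1e-10, max_iter=1000):
    """Iterate v = (1 - alpha) * r + alpha * P^T v until it stops changing.

    initial:    original importance scores (e.g. Latent Topic Entropy), length n
    transition: (n, n) row-stochastic matrix P of edge weights
    alpha:      hypothetical interpolation weight between the two terms
    """
    r = np.asarray(initial, dtype=float)
    r = r / r.sum()                      # normalise the original scores
    P = np.asarray(transition, dtype=float)
    v = r.copy()
    for _ in range(max_iter):
        # Original score plus scores propagated from neighbors.
        v_next = (1 - alpha) * r + alpha * (P.T @ v)
        if np.abs(v_next - v).sum() < tol:
            return v_next
        v = v_next
    return v
```

With `alpha = 0` the result is just the normalised original scores; larger values give more weight to the graph structure.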
• Dataset: 10 meetings from CMU Speech Group
4. Random Walk
Best Student Paper Nomination
Intra-Speaker Topic Modeling
Increase the edge similarity if two utterances are from the same speaker Sk
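As an illustration of the intra-speaker adjustment, the sketch below boosts the similarity of utterance pairs that share a speaker. The multiplicative form and the parameter `w_intra` are hypothetical choices for this example; the exact weighting used by the authors is given in the paper.

```python
import numpy as np

def boost_same_speaker(sim, speakers, w_intra=0.5):
    """Scale sim[i, j] by (1 + w_intra) when utterances i and j share a speaker.

    sim:      (n, n) matrix of topical similarities
    speakers: length-n sequence of speaker labels, one per utterance
    w_intra:  hypothetical boost strength (assumption, not from the source)
    """
    sim = np.asarray(sim, dtype=float)
    spk = np.asarray(speakers)
    same = spk[:, None] == spk[None, :]   # True where both speakers match
    # Note: diagonal entries are also "same speaker"; zero them beforehand
    # if the graph should have no self-loops.
    return np.where(same, sim * (1.0 + w_intra), sim)
```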
3. Intra/Inter-Speaker Topic Modeling
The utterances from the same speaker can partially share the importance
[Figure: inter-speaker weight ωinter as a function of the distance between utterance positions in temporal order (…, li-2, li-1, li, li+1, li+2, …); x-axis from -8 to 8, y-axis from 0 to 1; curves for σ = 0.5, σ = 1, and σ = 2]
• Graph Construction
o Node: utterances in a document
o Edge weight: topical similarity between utterances
2. Graph Construction
The utterances topically similar to more important utterances should be more important
[Figure: a meeting document d whose utterances U1, U2, U3, U4, U5, U6 form the nodes of the graph]
Sim(Ui, Uj): latent topic generative significance of utterance Ui to Uj based on PLSA (see paper)
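To make the graph construction concrete, the sketch below builds a row-normalised edge-weight matrix from per-utterance topic distributions. It is only an illustration: cosine similarity is used here as a symmetric stand-in for the PLSA-based Sim(Ui, Uj), which is directed and defined in the paper, and the function name is hypothetical.

```python
import numpy as np

def build_transition_matrix(topic_dists):
    """Build a row-stochastic edge-weight matrix over utterances.

    topic_dists: (n_utterances, n_topics) array, each row P(topic | utterance).
    Assumption: cosine similarity replaces the paper's PLSA-based similarity.
    """
    X = np.asarray(topic_dists, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    unit = X / np.where(norms == 0, 1.0, norms)   # avoid division by zero
    sim = unit @ unit.T                           # pairwise cosine similarity
    np.fill_diagonal(sim, 0.0)                    # no self-loops
    row_sums = sim.sum(axis=1, keepdims=True)
    # Normalise each row so outgoing edge weights sum to 1.
    return sim / np.where(row_sums == 0, 1.0, row_sums)
```

Row normalisation turns the similarities into transition probabilities, which is the form a random walk over the graph expects.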
Inter-Speaker Topic Modeling
Increase the edge similarity if two utterances have a closer position in the dialogue based on normal distribution
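The temporal weighting can be sketched with a Gaussian-shaped function of the distance between utterance positions. This is an illustrative sketch under stated assumptions: the curve is scaled to peak at 1 (rather than using the normalised density), the boost is applied multiplicatively, and speaker identity is ignored; the function names and the default `sigma` are hypothetical.

```python
import numpy as np

def temporal_weights(n_utterances, sigma=1.0):
    """Gaussian-shaped weight for every pair of utterance positions.

    Weight is 1 for identical positions and decays with distance;
    sigma controls how quickly it decays.
    """
    pos = np.arange(n_utterances)
    dist = pos[:, None] - pos[None, :]        # signed position difference
    return np.exp(-0.5 * (dist / sigma) ** 2)

def boost_adjacent(sim, sigma=1.0):
    """Increase sim[i, j] more strongly the closer utterances i and j are."""
    sim = np.asarray(sim, dtype=float)
    return sim * (1.0 + temporal_weights(sim.shape[0], sigma))
```

A small `sigma` restricts the boost to immediate neighbors, while a larger one spreads it over a wider window of the dialogue.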
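[Figure: Flowchart of proposed approach. A multi-party meeting corpus (ASR and manual transcripts) is passed through a topic model (e.g. PLSA, LDA). Baseline part: utterances are ranked by Latent Topic Entropy to produce a summary. Re-rank part: a random walk re-ranks the utterances to produce a summary. Example ranked lists of utterance scores are shown for each part.]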
Temporally adjacent utterances can partially share the importance
[Figure: two bar charts, "30% Summary for ASR Transcripts" and "30% Summary for Manual Transcripts", showing ROUGE-1 and ROUGE-L F-measure (axis range 0.44 to 0.50) for Baseline: LTE, Random Walk (RW), RW + Intra-Speaker, RW + Inter-Speaker, and RW + Inter- & Intra-Speaker]