Spoken Lecture Summarization by Random Walk over a Graph Constructed with Automatically Extracted Key Terms
Authors: Yun-Nung Chen, Yu Huang, Ching-Feng Yeh, and Lin-Shan Lee
Speaker: Hung-Yi Lee
National Taiwan University
Outline
• Introduction
• Graph-based summarization approach
▫ A spoken document is transformed into a graph structure
  Nodes: sentences in a spoken document
  Edge weights: topical similarities of sentences
▫ Random walk is used to select indicative sentences
  All sentences in a document can be jointly considered
• Experiments
• Conclusion
Introduction – Extractive Summarization (1/2)
• Extractive speech summarization
▫ Select the indicative sentences in a spoken document
▫ Concatenate the selected sentences to form a summary
▫ The number of sentences selected for the summary is decided by a predefined ratio
Introduction – Extractive Summarization (2/2)
• Each sentence S in a spoken document d is given an importance score I(S,d)
• Select the indicative sentences based on I(S,d)
• A sentence is a sequence of terms:
$$S = t_1 t_2 \cdots t_i \cdots t_n$$
• Importance score, averaged over the term statistical measure $s(t_i,d)$ of each term:
$$I(S,d) = \frac{1}{n}\sum_{i=1}^{n} s(t_i,d)$$
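As a rough illustration of the importance score above, the following Python sketch averages a term-level score over the terms of one sentence. The function name `importance_score`, the `term_scores` table, and its values are all made up for illustration; the slides do not say how s(t, d) is computed at this point.

```python
def importance_score(terms, term_score):
    """Average the term statistical measure s(t, d) over the terms of a sentence."""
    if not terms:
        return 0.0  # an empty sentence carries no importance
    return sum(term_score(t) for t in terms) / len(terms)


# Hypothetical values of s(t, d) for one document d (not from the slides).
term_scores = {"graph": 0.9, "random": 0.6, "walk": 0.6, "the": 0.1}
sentence = ["the", "random", "walk"]

# Terms without an entry are scored 0.0 in this sketch.
score = importance_score(sentence, lambda t: term_scores.get(t, 0.0))
print(f"I(S, d) = {score:.4f}")
```

Passing the term measure as a function keeps the sketch independent of whichever statistical measure is plugged in.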
Introduction – PLSA
[Figure: PLSA model linking sentences S1, S2, ..., Sj, ..., SJ to terms t1, t2, t3, ..., ti, ..., tM through latent topics T1, T2, ..., Tk, ..., TK]
• P(Tk|Sj): weight of latent topic Tk for sentence Sj
Proposed Approach (1/2)
• Basic idea
▫ Sentences with a high importance score based on a statistical measure are not the only ones that should be considered indicative
▫ Sentences topically similar to the indicative sentences should also be considered indicative
Proposed Approach (2/2)
• Graph-based approach
▫ Sentences in a spoken document are nodes on a graph, and topical similarities of sentences are the weights of the edges.
▫ Use random walk to obtain new scores for summary selection
▫ → All sentences in the document can be considered jointly rather than individually.
Graph Construction (1/2)
[Figure: spoken document d with sentences S1 to S6, drawn as the nodes of a directed graph]
• Each sentence Si in the spoken document d is a node on the graph.
• W(i,j) (Si → Sj): topical similarity from sentence Si to Sj (based on PLSA latent topics of sentences)
• Example: W(3,4) is the topical similarity from sentence S3 to S4
Graph Construction (2/2) - Topical Similarities
• Topical similarity from sentence Si to Sj
• Edge weight W(i,j) (sentence Si → sentence Sj)
[Figure: the terms of sentences Si and Sj are linked through the latent topics T1, T2, ..., Tk, ..., TK with weights P(Tk|Si) and P(Tk|Sj)]
• W(i,j): evaluated from the latent topic similarity from sentence Si to Sj, based on the PLSA model
Mathematical Formulation
• Find a set of new scores based on the graph structure, {G(i) for each sentence Si in document d}, which satisfies
$$G(i) = (1-\alpha)\, I(S_i,d) + \alpha \sum_{S_j \in \mathrm{in}(S_i)} \hat{W}(j,i)\, G(j)$$
▫ First term (weighted by 1-α): the original importance score of node Si
▫ Second term (weighted by α): scores propagated from other nodes to node Si, where in(Si) is the set of nodes with an edge into Si
• G(i) for sentence Si is a new importance score for summary selection
Mathematical Formulation
• The original importance score I(Si,d) in the first term is the average term statistical measure over the terms of the sentence:
$$I(S,d) = \frac{1}{n}\sum_{i=1}^{n} s(t_i,d)$$
Mathematical Formulation
[Figure: node Sj with outgoing edges to Si and Sa]
• The edge weights are normalized over the outgoing edges of each node:
$$\hat{W}(j,i) = \frac{W(j,i)}{\sum_{S_k \in \mathrm{out}(S_j)} W(j,k)}$$
• The fractions of its score that a node propagates to all other nodes sum to unity.
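The out-normalization of the edge weights can be sketched in a few lines of Python. The nested-dictionary layout (`W[j][i]` holding the weight of edge Sj → Si), the function name, and the weight values are assumptions made for illustration; only the edge directions of S3 and S6 are taken from the six-sentence example in these slides.

```python
def normalize_out_weights(W):
    """Return W_hat with W_hat[j][i] = W[j][i] / (sum of weights leaving node j)."""
    W_hat = {}
    for j, out_edges in W.items():
        total = sum(out_edges.values())
        # A node whose outgoing weights sum to zero propagates nothing.
        W_hat[j] = {i: w / total for i, w in out_edges.items()} if total > 0 else {}
    return W_hat


# Hypothetical raw similarities: S3 -> {S2, S4} and S6 -> {S4}.
W = {3: {2: 0.2, 4: 0.6}, 6: {4: 0.5}}
W_hat = normalize_out_weights(W)

for j, out_edges in W_hat.items():
    print(j, out_edges, "row sum =", sum(out_edges.values()))
```

After normalization, the weights leaving each node sum to one, so a node with a single outgoing edge passes its whole propagated score along that edge.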
Mathematical Formulation
• G(i) obtains a higher score when
▫ 1) I(Si,d) is high
▫ 2) More sentences are topically similar to Si
• All sentences in the document are considered jointly rather than individually
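Mathematical Formulation – an Example
[Figure: directed graph over sentences S1 to S6, with scores G(3), G(4), and G(6) marked]
• Find G(1), G(2), G(3), G(4), G(5), G(6) such that, for example,
$$G(4) = (1-\alpha)\, I(S_4,d) + \alpha \left[ G(6)\,\hat{W}(6,4) + G(3)\,\hat{W}(3,4) \right]$$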
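Mathematical Formulation – Equations to be solved
[Figure: directed graph over sentences S1 to S6]
• Find G(1), G(2), G(3), G(4), G(5), G(6) such that
$$\begin{aligned}
G(1) &= (1-\alpha)\, I(S_1,d) + \alpha\, G(2)\,\hat{W}(2,1) \\
G(2) &= (1-\alpha)\, I(S_2,d) + \alpha\, G(3)\,\hat{W}(3,2) \\
G(3) &= (1-\alpha)\, I(S_3,d) + \alpha \left[ G(1)\,\hat{W}(1,3) + G(4)\,\hat{W}(4,3) \right] \\
G(4) &= (1-\alpha)\, I(S_4,d) + \alpha \left[ G(6)\,\hat{W}(6,4) + G(3)\,\hat{W}(3,4) \right] \\
G(5) &= (1-\alpha)\, I(S_5,d) + \alpha\, G(4)\,\hat{W}(4,5) \\
G(6) &= (1-\alpha)\, I(S_6,d) + \alpha \left[ G(4)\,\hat{W}(4,6) + G(5)\,\hat{W}(5,6) \right]
\end{aligned}$$
• How to solve these equations to obtain G(1), G(2), ..., G(6)?
▫ Solve the problem iteratively (random walk)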
Random Walk Solution
[Figure: graph of S1 to S6 with the initial scores G_0(1), ..., G_0(6) on the nodes]
• Each sentence is assigned an initial value G_0(i)
▫ $G_0(i) = I(S_i,d)$
Random Walk Solution
[Figure: graph of S1 to S6 with the scores on the nodes updated at each step]
• Update the score for each sentence, for example for S4:
$$\begin{aligned}
G_1(4) &= (1-\alpha)\, I(S_4,d) + \alpha \left[ G_0(6)\,\hat{W}(6,4) + G_0(3)\,\hat{W}(3,4) \right] \\
G_2(4) &= (1-\alpha)\, I(S_4,d) + \alpha \left[ G_1(6)\,\hat{W}(6,4) + G_1(3)\,\hat{W}(3,4) \right] \\
G_3(4) &= (1-\alpha)\, I(S_4,d) + \alpha \left[ G_2(6)\,\hat{W}(6,4) + G_2(3)\,\hat{W}(3,4) \right]
\end{aligned}$$
• The process is repeated ……
Random Walk Solution
[Figure: graph of S1 to S6 with the converged scores G_∞(1), ..., G_∞(6) on the nodes]
• The process is repeated ……
• The score of each node eventually converges.
• According to the theory of random walk, the converged score G_∞(i) is the G(i) satisfying
$$G(i) = (1-\alpha)\, I(S_i,d) + \alpha \sum_{S_j \in \mathrm{in}(S_i)} \hat{W}(j,i)\, G(j)$$
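The iterative procedure on these slides can be sketched in Python as follows. The edge directions follow the six-sentence example, but the raw weights, the original scores `I`, the value of `ALPHA`, and the stopping rule (`tol`, `max_iter`) are all assumptions made for illustration; the slides do not state them.

```python
def normalize(W):
    """W_hat[j][i] = W[j][i] / (sum of the weights leaving node j)."""
    return {j: {i: w / sum(out.values()) for i, w in out.items()}
            for j, out in W.items()}


def random_walk(I, W_hat, alpha, tol=1e-10, max_iter=1000):
    """Iterate G_{t+1}(i) = (1 - alpha) * I(i) + alpha * sum_j W_hat(j, i) * G_t(j)."""
    G = dict(I)  # initial value G_0(i) = I(S_i, d)
    for _ in range(max_iter):
        G_next = {
            i: (1 - alpha) * I[i]
            + alpha * sum(W_hat[j].get(i, 0.0) * G[j] for j in W_hat)
            for i in I
        }
        change = max(abs(G_next[i] - G[i]) for i in I)
        G = G_next
        if change < tol:  # scores have stopped moving
            break
    return G


# Edges of the six-sentence example (W[j][i] is the edge Sj -> Si); weights are made up.
W = {
    1: {3: 0.4},
    2: {1: 0.3},
    3: {2: 0.2, 4: 0.6},
    4: {3: 0.5, 5: 0.1, 6: 0.4},
    5: {6: 0.7},
    6: {4: 0.5},
}
# Made-up original importance scores I(S_i, d).
I = {1: 0.2, 2: 0.5, 3: 0.4, 4: 0.3, 5: 0.1, 6: 0.6}
ALPHA = 0.9  # hypothetical value; the slides do not state alpha

G = random_walk(I, normalize(W), ALPHA)
for i in sorted(G):
    print(f"G({i}) = {G[i]:.4f}")
```

Because every node in this example has at least one outgoing edge and alpha is below one, the updates settle to a fixed point of the equation, which is what the loop checks through `change`.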
Graph-based Summarization Approach
• Original importance score, based on the terms in the sentence:
$$I(S,d) = \frac{1}{n}\sum_{i=1}^{n} s(t_i,d)$$
• New scores that consider the graph structure: find {G(i) for each sentence Si in document d} which satisfies
$$G(i) = (1-\alpha)\, I(S_i,d) + \alpha \sum_{S_j \in \mathrm{in}(S_i)} \hat{W}(j,i)\, G(j)$$
• Score used for summary selection:
$$I'(S_i,d) = I(S_i,d)^{1-\delta}\, G(i)^{\delta}$$
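A minimal Python sketch of the final selection step, assuming the combined score I'(Si,d) = I(Si,d)^(1-δ) · G(i)^δ and a summarization ratio applied to the number of sentences. The function name, the values of `delta` and `ratio`, the rounding rule, and the example scores are illustrative assumptions rather than the settings used in the experiments.

```python
def select_summary(I, G, delta=0.5, ratio=0.3):
    """Rank sentences by I(i)^(1 - delta) * G(i)^delta and keep the top fraction."""
    combined = {i: (I[i] ** (1 - delta)) * (G[i] ** delta) for i in I}
    # Assumed rule: keep at least one sentence, rounding ratio * (number of sentences).
    n_select = max(1, round(ratio * len(I)))
    ranked = sorted(combined, key=combined.get, reverse=True)
    return sorted(ranked[:n_select])  # restore document order


# Made-up scores for six sentences.
I = {1: 0.2, 2: 0.5, 3: 0.4, 4: 0.3, 5: 0.1, 6: 0.6}
G = {1: 0.1, 2: 0.2, 3: 0.5, 4: 0.7, 5: 0.2, 6: 0.4}

print(select_summary(I, G))             # graph and original scores weighted equally
print(select_summary(I, G, delta=0.0))  # original scores only
```

Setting `delta` to 0 falls back to the original importance score, and setting it to 1 ranks by the graph score alone.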
Experimental Setup (1/2)
• Corpus: a course offered at National Taiwan University
▫ Mandarin Chinese with embedded English words
▫ Single speaker
▫ 45.2 hours
• ASR system
▫ Bilingual AM with model adaptation [1]
▫ LM with adaptation using random forests [2]

Language   Mandarin   English   Overall
Acc (%)    78.15      53.44     76.26

[1] Ching-Feng Yeh, et al., “Bilingual Acoustic Model Adaptation by Unit Merging on Different Levels and Cross-level Integration,” Interspeech, 2011.
[2] Ching-Feng Yeh, et al., “An Integrated Framework for Transcribing Mandarin-English Code-mixed Lectures with Improved Acoustic and Language Modeling,” ISCSLP, 2010.
Experimental Setup (2/2)
• Spoken documents
▫ We segmented the whole lecture into 155 documents by topic segmentation
▫ 34 documents out of the 155 were tested
▫ The average length of each document was about 17.5 minutes
▫ Humans produced reference summaries for each document
• Evaluation
▫ ROUGE-1, ROUGE-2, ROUGE-3
▫ ROUGE-L: Longest Common Subsequence (LCS)
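To make the evaluation measures concrete, here is a simplified Python sketch of n-gram recall (the idea behind ROUGE-N) and of the longest common subsequence used by ROUGE-L. It is not the official ROUGE toolkit: it handles a single reference, and the slides do not say whether recall, precision, or F-measure was reported. The function names and the example sentences are made up.

```python
from collections import Counter


def ngrams(tokens, n):
    """Multiset of the n-grams in a token list."""
    return Counter(tuple(tokens[k:k + n]) for k in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n):
    """Fraction of reference n-grams that also occur in the candidate (clipped counts)."""
    ref, cand = ngrams(reference, n), ngrams(candidate, n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for k, y in enumerate(b, start=1):
            curr.append(prev[k - 1] + 1 if x == y else max(prev[k], curr[k - 1]))
        prev = curr
    return prev[-1]


reference = "the random walk selects indicative sentences".split()
candidate = "random walk selects the sentences".split()

print("ROUGE-1 recall:", rouge_n_recall(candidate, reference, 1))
print("ROUGE-2 recall:", rouge_n_recall(candidate, reference, 2))
print("LCS-based recall:", lcs_length(candidate, reference) / len(reference))
```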
Experimental Results
[Figure: four bar charts (ROUGE-1, ROUGE-2, ROUGE-3, ROUGE-L) plotted against summarization ratio (10%, 20%, 30%), comparing Baseline1 with Baseline1 + Proposed (Graph)]
• Baseline1: I(Si,d), the importance score using the latent topic entropy term statistical measure
• Baseline1 + Proposed: $I(S_i,d)^{1-\delta}\, G(i)^{\delta}$
• The proposed approach outperformed the first baseline in most cases (compare the blue and red bars).
Experimental Results
[Figure: four bar charts (ROUGE-1, ROUGE-2, ROUGE-3, ROUGE-L) plotted against summarization ratio (10%, 20%, 30%), comparing Baseline2 with Baseline2 + Proposed (Graph), shown alongside the Baseline1 and Baseline1 + Proposed (Graph) charts]
• Baseline2: I(Si,d), the importance score using the key-term based statistical measure
• Baseline2 + Proposed: $I(S_i,d)^{1-\delta}\, G(i)^{\delta}$
• The proposed approach always outperformed the second baseline (compare the green and orange bars).
Conclusions
• The performance of summarization can be improved by
▫ A graph-based approach considering topical similarity
▫ This offers a way to globally consider all sentences in a document for summarization, rather than considering each sentence individually
Master Defense, National Taiwan University
Random Walk Solution
[Figure: graph of S1 to S6; each node Si holds αG_0(i)]
• Each node Sj propagates its score along its outgoing edges according to the normalized weights
$$\hat{W}(j,i) = \frac{W(j,i)}{\sum_{S_k \in \mathrm{out}(S_j)} W(j,k)}$$
▫ S4 sends $\hat{W}(4,3)\,G_0(4)$ to S3, $\hat{W}(4,5)\,G_0(4)$ to S5, and $\hat{W}(4,6)\,G_0(4)$ to S6
▫ S4 receives $\hat{W}(3,4)\,G_0(3)$ from S3 and $\hat{W}(6,4)\,G_0(6)$ from S6
• The updated score of S4 is
$$G_1(4) = (1-\alpha)\,\hat{I}(4) + \alpha \left[ G_0(3)\,\hat{W}(3,4) + G_0(6)\,\hat{W}(6,4) \right]$$
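Mathematical Formulation
[Figure: node Sj with an outgoing edge to Si]
• Find a set of new scores based on the graph structure, {G(i) for each sentence Si in document d}, which satisfies
$$G(i) = (1-\alpha)\, I(S_i,d) + \alpha \sum_{S_j \in \mathrm{in}(S_i)} \hat{W}(j,i)\, G(j)$$
• The normalized weights leaving each node Sj sum to one:
$$\hat{W}(j,i) = \frac{W(j,i)}{\sum_{S_k \in \mathrm{out}(S_j)} W(j,k)}, \qquad \sum_{S_k \in \mathrm{out}(S_j)} \hat{W}(j,k) = 1$$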
Mathematical Formulation
[Figure: node Sj with outgoing edges to Si and Sa]
• The amount of score Sj propagates to Si is
$$\hat{W}(j,i)\, G(j) = \frac{W(j,i)}{W(j,i) + W(j,a)}\, G(j)$$
• The amount of score Sj propagates to Sa is
$$\hat{W}(j,a)\, G(j) = \frac{W(j,a)}{W(j,i) + W(j,a)}\, G(j)$$
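Mathematical Formulation – an Example
[Figure: directed graph over sentences S1 to S6, with scores G(3), G(4), and G(6) marked]
• Find G(1), G(2), G(3), G(4), G(5), G(6) such that, for example,
$$G(4) = (1-\alpha)\, I(S_4,d) + \alpha \left[ G(6)\,\hat{W}(6,4) + G(3)\,\hat{W}(3,4) \right]$$
▫ The first term depends on S4 itself
▫ The second term depends on the topically similar sentences (S3 and S6)
• Normalized weights used in this equation:
$$\hat{W}(3,4) = \frac{W(3,4)}{W(3,2) + W(3,4)}, \qquad \hat{W}(6,4) = \frac{W(6,4)}{W(6,4)}$$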