Spoken Lecture Summarization by Random Walk

(1)

Authors: Yun-Nung Chen, Yu Huang, Ching-Feng Yeh, and Lin-Shan Lee

Speaker: Hung-Yi Lee

Spoken Lecture Summarization by Random Walk over a Graph Constructed with Automatically Extracted Key Terms

National Taiwan University

(2)

Outline

• Introduction
• Graph-based summarization approach
  ▫ A spoken document is transformed into a graph structure
    - Nodes: sentences in the spoken document
    - Edge weights: topical similarities between sentences
  ▫ Random walk is used to select indicative sentences
    - All sentences in a document can be jointly considered
• Experiments
• Conclusion

(3)

Introduction – Extractive Summarization (1/2)

• Extractive speech summarization
  ▫ Select the indicative sentences in a spoken document
  ▫ Concatenate the selected sentences to form a summary
  ▫ The number of sentences selected for the summary is decided by a predefined ratio

(4)

Introduction – Extractive Summarization (2/2)

• Each sentence S in a spoken document d is given an importance score I(S, d)
• Select the indicative sentences based on I(S, d)

A sentence S is a sequence of n terms:

$$S = t_1\, t_2 \cdots t_i \cdots t_n$$

Importance score:

$$I(S, d) = \frac{1}{n} \sum_{i=1}^{n} \left[\, s(t_i, d) \,\right]$$

where s(t_i, d) is the statistical measure of term t_i in document d.
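The averaging in the importance score can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `importance_score`, the `toy_scores` table, and its values are hypothetical, and the term measure s(t_i, d) is simply looked up rather than computed.

```python
from typing import Callable, Sequence


def importance_score(terms: Sequence[str], term_score: Callable[[str], float]) -> float:
    """Average the per-term measure s(t_i, d) over the n terms of one sentence."""
    if not terms:
        return 0.0  # assumption: an empty sentence gets score 0
    return sum(term_score(t) for t in terms) / len(terms)


# Hypothetical term measures for one document d (not taken from the slides).
toy_scores = {"random": 0.8, "walk": 0.6, "graph": 0.9, "the": 0.1}

score = importance_score(["the", "random", "walk"], lambda t: toy_scores.get(t, 0.0))
print(round(score, 3))  # (0.1 + 0.8 + 0.6) / 3 = 0.5
```

Dividing by n keeps long sentences from scoring higher merely because they contain more terms.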

(5)

Introduction – PLSA

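[Figure: PLSA model. Sentences S₁, S₂, …, S_j, …, S_J are connected through latent topics T₁, T₂, …, T_k, …, T_K to terms t₁, t₂, t₃, …, t_i, …, t_M. The link between sentence S_j and latent topic T_k is labelled P(T_k|S_j).]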

P(T_k|S_j): weight of latent topic T_k for sentence S_j

(6)

Proposed Approach (1/2)

• Basic idea
  ▫ Sentences with a high importance score based on the statistical measure are not the only ones that should be considered indicative

(7)

Proposed Approach (1/2)

• Basic idea
  ▫ Sentences with a high importance score based on the statistical measure are not the only ones that should be considered indicative
  ▫ Sentences that are topically similar to the indicative sentences should also be considered indicative

(8)

Proposed Approach (2/2)

• Graph-based approach
  ▫ Sentences in a spoken document are nodes on a graph, and the topical similarities between sentences are the edge weights.
  ▫ Random walk is used to obtain new scores for summary selection.
  ▫ → All sentences in the document can be considered jointly rather than individually.

(9)

Graph Construction (1/2)

[Figure: spoken document d, consisting of sentences S₁ to S₆, drawn as a graph with one node per sentence.]

Each sentence S_i in the spoken document d is a node on the graph.

(10)

Graph Construction (1/2)

[Figure: the graph of sentences S₁ to S₆ in spoken document d, now with weighted edges. For example, W(3, 4) is the topical similarity from sentence S₃ to S₄.]

W(i, j) (S_i → S_j): topical similarity from sentence S_i to S_j (based on the PLSA latent topics of the sentences)

(11)

Graph Construction (2/2) - Topical Similarities

• Topical similarity from sentence S_i to S_j
  ▫ Edge weight W(i, j) (sentence S_i → sentence S_j)

[Figure: sentences S_i and S_j, each with its own terms, are both connected to the latent topics T₁, T₂, …, T_k, …, T_K, with weights P(T_k|S_i) and P(T_k|S_j).]

• W(i, j) is evaluated from the latent topic similarity of sentence S_i to S_j based on the PLSA model
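The slides do not spell out the similarity measure, so the sketch below uses cosine similarity between the topic distributions P(T_k|S_i) and P(T_k|S_j) as a stand-in. That choice is an assumption made for illustration, not the measure used in the talk. Cosine similarity is symmetric, whereas W(i, j) on the slides is directed. The names `cosine`, `build_edge_weights`, and the toy distributions are hypothetical.

```python
import math
from typing import List


def cosine(p: List[float], q: List[float]) -> float:
    """Cosine similarity between two topic-weight vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm > 0 else 0.0


def build_edge_weights(topic_dists: List[List[float]]) -> List[List[float]]:
    """W[i][j] = similarity from sentence i to sentence j; no self-loops."""
    n = len(topic_dists)
    return [
        [0.0 if i == j else cosine(topic_dists[i], topic_dists[j]) for j in range(n)]
        for i in range(n)
    ]


# Toy P(T_k | S_i) for three sentences over two latent topics (made-up values).
topic_dists = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]
W = build_edge_weights(topic_dists)
print(W[0][1] > W[0][2])  # sentences 0 and 1 share a dominant topic
```

Any other measure defined on the PLSA topic weights could be dropped in for `cosine` without changing the rest of the graph construction.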

(12)

Mathematical Formulation

Find a set of new scores based on the graph structure, {G(i) for each sentence S_i in document d}, which satisfies

$$G(i) = (1-\alpha)\, I(S_i, d) + \alpha \sum_{S_j \in \mathrm{in}(S_i)} G(j)\, \hat{W}(j, i)$$

where in(S_i) is the set of sentences with an edge into S_i.

• G(i) for sentence S_i is a new importance score for summary selection

(13)

Mathematical Formulation

$$G(i) = (1-\alpha)\, I(S_i, d) + \alpha \sum_{S_j \in \mathrm{in}(S_i)} G(j)\, \hat{W}(j, i)$$

• First term: the original importance score of node S_i (weighted by 1-α)
• Second term: the scores propagated from other nodes to node S_i (weighted by α)

(14)

Mathematical Formulation

$$G(i) = (1-\alpha)\, I(S_i, d) + \alpha \sum_{S_j \in \mathrm{in}(S_i)} G(j)\, \hat{W}(j, i)$$

Importance score:

$$I(S, d) = \frac{1}{n} \sum_{i=1}^{n} \left[\, s(t_i, d) \,\right]$$

(15)

Mathematical Formulation

$$G(i) = (1-\alpha)\, I(S_i, d) + \alpha \sum_{S_j \in \mathrm{in}(S_i)} G(j)\, \hat{W}(j, i)$$

[Figure: node S_i and a node S_j.]

(17)

Mathematical Formulation

$$G(i) = (1-\alpha)\, I(S_i, d) + \alpha \sum_{S_j \in \mathrm{in}(S_i)} G(j)\, \hat{W}(j, i)$$

Normalized edge weight:

$$\hat{W}(j, i) = \frac{W(j, i)}{\sum_{S_k \in \mathrm{out}(S_j)} W(j, k)}$$

[Figure: nodes S_i and S_j.]

(18)

Mathematical Formulation

$$G(i) = (1-\alpha)\, I(S_i, d) + \alpha \sum_{S_j \in \mathrm{in}(S_i)} G(j)\, \hat{W}(j, i)$$

$$\hat{W}(j, i) = \frac{W(j, i)}{\sum_{S_k \in \mathrm{out}(S_j)} W(j, k)}$$

The scores that propagate from a node to all other nodes sum to unity.

[Figure: nodes S_j, S_i and S_a.]
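One consequence of this normalization can be worked out directly. The derivation below is a sketch that assumes every sentence has at least one outgoing edge and that α < 1. Neither assumption is stated on the slides.

```latex
\begin{aligned}
\sum_i G(i)
  &= (1-\alpha) \sum_i I(S_i, d)
     + \alpha \sum_i \sum_{S_j \in \mathrm{in}(S_i)} G(j)\, \hat{W}(j, i) \\
  &= (1-\alpha) \sum_i I(S_i, d)
     + \alpha \sum_j G(j) \sum_{S_i \in \mathrm{out}(S_j)} \hat{W}(j, i) \\
  &= (1-\alpha) \sum_i I(S_i, d) + \alpha \sum_j G(j)
  \quad\Longrightarrow\quad
  \sum_i G(i) = \sum_i I(S_i, d)
\end{aligned}
```

Under these assumptions the random walk redistributes score among the sentences without changing the total.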

(19)

Mathematical Formulation

$$G(i) = (1-\alpha)\, I(S_i, d) + \alpha \sum_{S_j \in \mathrm{in}(S_i)} G(j)\, \hat{W}(j, i)$$

• G(i) is higher when
  1) I(S_i, d) is high;
  2) more sentences are topically similar to S_i.

(21)

Mathematical Formulation

$$G(i) = (1-\alpha)\, I(S_i, d) + \alpha \sum_{S_j \in \mathrm{in}(S_i)} G(j)\, \hat{W}(j, i)$$

• All sentences in the document are considered jointly rather than individually

(22)

Mathematical Formulation – an Example

[Figure: example graph of six sentences S₁ to S₆, with G(3), G(4) and G(6) marked on their nodes.]

Find G(1), G(2), G(3), G(4), G(5), G(6) such that, for S₄,

$$G(4) = (1-\alpha)\, I(S_4, d) + \alpha \left[\, G(6)\, \hat{W}(6, 4) + G(3)\, \hat{W}(3, 4) \,\right]$$

(23)

Mathematical Formulation – Equations to be solved

[Figure: example graph of six sentences S₁ to S₆, with G(3), G(4) and G(6) marked on their nodes.]

$$\begin{aligned}
G(1) &= (1-\alpha)\, I(S_1, d) + \alpha\, G(2)\, \hat{W}(2, 1) \\
G(2) &= (1-\alpha)\, I(S_2, d) + \alpha\, G(3)\, \hat{W}(3, 2) \\
G(3) &= (1-\alpha)\, I(S_3, d) + \alpha \left[\, G(1)\, \hat{W}(1, 3) + G(4)\, \hat{W}(4, 3) \,\right] \\
G(4) &= (1-\alpha)\, I(S_4, d) + \alpha \left[\, G(6)\, \hat{W}(6, 4) + G(3)\, \hat{W}(3, 4) \,\right] \\
G(5) &= (1-\alpha)\, I(S_5, d) + \alpha\, G(4)\, \hat{W}(4, 5) \\
G(6) &= (1-\alpha)\, I(S_6, d) + \alpha \left[\, G(4)\, \hat{W}(4, 6) + G(5)\, \hat{W}(5, 6) \,\right]
\end{aligned}$$

• How can these equations be solved to obtain G(1), G(2), …, G(6)?
• Solve the problem iteratively (random walk)

(24)

Random Walk Solution

[Figure: the six-sentence graph with the initial scores G₀(1), …, G₀(6) marked on nodes S₁ to S₆.]

• Each sentence is assigned an initial value G₀(i):

$$G_0(i) = I(S_i, d)$$

(25)

Random Walk Solution

[Figure: the six-sentence graph with the scores G₀(1), …, G₀(6) marked on nodes S₁ to S₆.]

• Update the score for each sentence, for example

$$G_1(4) = (1-\alpha)\, I(S_4, d) + \alpha \left[\, G_0(6)\, \hat{W}(6, 4) + G_0(3)\, \hat{W}(3, 4) \,\right]$$

(26)

Random Walk Solution

[Figure: the six-sentence graph with the scores G₁(1), …, G₁(6) marked on nodes S₁ to S₆.]

• Update the score for each sentence, for example

$$G_2(4) = (1-\alpha)\, I(S_4, d) + \alpha \left[\, G_1(6)\, \hat{W}(6, 4) + G_1(3)\, \hat{W}(3, 4) \,\right]$$

(27)

Random Walk Solution

[Figure: the six-sentence graph with the scores G₂(1), …, G₂(6) marked on nodes S₁ to S₆.]

• The process is repeated, for example

$$G_3(4) = (1-\alpha)\, I(S_4, d) + \alpha \left[\, G_2(6)\, \hat{W}(6, 4) + G_2(3)\, \hat{W}(3, 4) \,\right]$$

(28)

Random Walk Solution

[Figure: the six-sentence graph with the converged scores G_∞(1), …, G_∞(6) marked on nodes S₁ to S₆.]

• The process is repeated until the score of each node converges.
• According to the theory of random walk, the converged score G_∞(i) is the G(i) satisfying

$$G(i) = (1-\alpha)\, I(S_i, d) + \alpha \sum_{S_j \in \mathrm{in}(S_i)} G(j)\, \hat{W}(j, i)$$
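The iteration on the six-sentence example can be sketched as follows. The edge list follows the example equations in this deck. The raw weights, the importance scores, α = 0.9, the tolerance, and all function names are hypothetical choices made only so that the sketch runs.

```python
from typing import Dict

Graph = Dict[int, Dict[int, float]]

# Directed edges j -> i of the six-sentence example. The raw weights W(j, i)
# are hypothetical (all 1.0); the slides give no numeric values.
W: Graph = {
    1: {3: 1.0},
    2: {1: 1.0},
    3: {2: 1.0, 4: 1.0},
    4: {3: 1.0, 5: 1.0, 6: 1.0},
    5: {6: 1.0},
    6: {4: 1.0},
}

# Hypothetical importance scores I(S_i, d), chosen to sum to 1.
I = {1: 0.10, 2: 0.20, 3: 0.30, 4: 0.15, 5: 0.05, 6: 0.20}


def normalize(W: Graph) -> Graph:
    """Scale each node's outgoing weights to sum to 1 (weights assumed positive)."""
    return {j: {i: w / sum(out.values()) for i, w in out.items()} for j, out in W.items()}


def update(G: Dict[int, float], I: Dict[int, float], W_hat: Graph, alpha: float) -> Dict[int, float]:
    """One step: G_next(i) = (1 - alpha) * I(i) + alpha * sum_j G(j) * W_hat(j, i)."""
    return {
        i: (1 - alpha) * I[i]
        + alpha * sum(G[j] * out[i] for j, out in W_hat.items() if i in out)
        for i in I
    }


def random_walk(I, W_hat, alpha=0.9, tol=1e-12, max_iter=10_000):
    G = dict(I)  # G_0(i) = I(S_i, d)
    for _ in range(max_iter):
        G_next = update(G, I, W_hat, alpha)
        if max(abs(G_next[i] - G[i]) for i in I) < tol:
            return G_next
        G = G_next
    return G


W_hat = normalize(W)
G = random_walk(I, W_hat)
for i in sorted(G):
    print(f"G({i}) = {G[i]:.4f}")
```

Because α < 1, each update shrinks the difference between successive score vectors, so the loop ends at scores that satisfy the fixed-point equation up to the tolerance.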

(29)

Graph-based Summarization Approach

$$I'(S_i, d) = I(S_i, d)^{1-\delta}\; G(i)^{\delta}$$

• I'(S_i, d): the score used for summary selection
• I(S_i, d): the original importance score, based on the terms in the sentence
• G(i): the new score, which considers the graph structure

New score:

$$G(i) = (1-\alpha)\, I(S_i, d) + \alpha \sum_{S_j \in \mathrm{in}(S_i)} G(j)\, \hat{W}(j, i)$$

Importance score:

$$I(S, d) = \frac{1}{n} \sum_{i=1}^{n} \left[\, s(t_i, d) \,\right]$$
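Combining the two scores and picking the summary could look like the sketch below. The function names, the toy scores, and δ = 0.5 are hypothetical. Applying the predefined ratio to the sentence count is also an assumption about how the ratio is measured.

```python
from typing import Dict, List


def combine_scores(I: Dict[int, float], G: Dict[int, float], delta: float) -> Dict[int, float]:
    """I'(S_i, d) = I(S_i, d)^(1 - delta) * G(i)^delta."""
    return {i: (I[i] ** (1 - delta)) * (G[i] ** delta) for i in I}


def select_summary(scores: Dict[int, float], ratio: float) -> List[int]:
    """Keep the top-scoring sentences; keys are assumed to be positions in the document."""
    k = max(1, round(ratio * len(scores)))
    top = sorted(scores, key=lambda i: scores[i], reverse=True)[:k]
    return sorted(top)  # restore document order before concatenating


# Made-up scores for a four-sentence document.
I = {1: 0.4, 2: 0.1, 3: 0.3, 4: 0.2}
G = {1: 0.2, 2: 0.5, 3: 0.2, 4: 0.1}

combined = combine_scores(I, G, delta=0.5)
summary = select_summary(combined, ratio=0.5)
print(summary)  # [1, 3]
```

Setting δ = 0 falls back to the original importance score, and δ = 1 uses the graph score alone.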

(30)

Experimental Setup (1/2)

• Corpus: a course offered at National Taiwan University
  ▫ Mandarin Chinese with embedded English words
  ▫ Single speaker
  ▫ 45.2 hours
• ASR system
  ▫ Bilingual acoustic model (AM) with model adaptation [1]
  ▫ Language model (LM) with adaptation using random forests [2]

ASR accuracy:

Language    Mandarin    English    Overall
Acc (%)     78.15       53.44      76.26

[1] Ching-Feng Yeh, et al., “Bilingual Acoustic Model Adaptation by Unit Merging on Different Levels and Cross-level Integration,” Interspeech, 2011.

[2] Ching-Feng Yeh, et al., “An Integrated Framework for Transcribing Mandarin-English Code-mixed Lectures with Improved Acoustic and Language Modeling,” ISCSLP, 2010.

(31)

Experimental Setup (2/2)

• Spoken documents
  ▫ The whole lecture was segmented into 155 documents by topic segmentation
  ▫ 34 of the 155 documents were used for testing
  ▫ The average length of each document was about 17.5 minutes
  ▫ Humans produced reference summaries for each document
• Evaluation
  ▫ ROUGE-1, ROUGE-2, ROUGE-3
  ▫ ROUGE-L: Longest Common Subsequence (LCS)
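To make the metrics concrete, here is a simplified, recall-only sketch of ROUGE-N and of the LCS length behind ROUGE-L. It is not the official ROUGE toolkit and omits its options (stemming, multiple references, F-measure). The function names and the example sentences are made up.

```python
from collections import Counter
from typing import List


def ngrams(tokens: List[str], n: int) -> Counter:
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[k:k + n]) for k in range(len(tokens) - n + 1))


def rouge_n_recall(candidate: List[str], reference: List[str], n: int) -> float:
    """Fraction of reference n-grams that also appear in the candidate."""
    ref, cand = ngrams(reference, n), ngrams(candidate, n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    return sum(min(count, cand[g]) for g, count in ref.items()) / total


def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for k, y in enumerate(b, start=1):
            curr.append(prev[k - 1] + 1 if x == y else max(prev[k], curr[k - 1]))
        prev = curr
    return prev[-1]


reference = "the cat sat on the mat".split()
candidate = "the cat lay on the mat".split()

print(rouge_n_recall(candidate, reference, 1))  # 5 of 6 unigrams match
print(rouge_n_recall(candidate, reference, 2))  # 3 of 5 bigrams match
print(lcs_length(candidate, reference))         # "the cat on the mat" -> 5
```

ROUGE-L rewards words that appear in the same order even when they are not adjacent, which the fixed-length n-gram scores cannot capture.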

(32)

Experimental Results

[Figure: four bar charts (ROUGE-1, ROUGE-2, ROUGE-3, ROUGE-L), each plotted against summarization ratios of 10%, 20% and 30%, comparing Baseline1 with Baseline1 + Proposed (Graph).]

Baseline1: I(S_i, d), the importance score using the latent topic entropy term statistical measure

Baseline1 + Proposed: I(S_i,d)G(i)

(33)

Experimental Results

[Figure: four bar charts (ROUGE-1, ROUGE-2, ROUGE-3, ROUGE-L), each plotted against summarization ratios of 10%, 20% and 30%.]

• The proposed approach outperformed the first baseline in most cases (compare the blue and red bars).

(34)

Experimental Results

[Figure: two sets of four bar charts (ROUGE-1, ROUGE-2, ROUGE-3, ROUGE-L), each plotted against summarization ratios of 10%, 20% and 30%.]

Baseline2: I(S_i, d), the importance score using the key-term based statistical measure

Baseline2 + Proposed: I(S_i,d)G(i)

(35)

Experimental Results

[Figure: two sets of four bar charts (ROUGE-1, ROUGE-2, ROUGE-3, ROUGE-L), each plotted against summarization ratios of 10%, 20% and 30%.]

• The proposed approach always outperformed the second baseline (compare the green and orange bars).

(36)

Conclusions

• Summarization performance can be improved by
  ▫ A graph-based approach that considers topical similarity
  ▫ This offers a way to consider all sentences in a document globally for summarization, rather than considering each sentence individually

(37)

Master Defense, National Taiwan University

(38)

Random Walk Solution

[Figure: the six-sentence graph with αG₀(1), …, αG₀(6) marked on nodes S₁ to S₆.]

(39)

Random Walk Solution

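[Figure: the six-sentence graph with αG₀(1), …, αG₀(6) marked on nodes S₁ to S₆. The outgoing edges of S₄ are labelled with the amounts listed below.]

$$\hat{W}(j, i) = \frac{W(j, i)}{\sum_{S_k \in \mathrm{out}(S_j)} W(j, k)}$$

Amounts on the edges leaving S₄:

$$\alpha\, G_0(4)\, \hat{W}(4, 3), \qquad \alpha\, G_0(4)\, \hat{W}(4, 6), \qquad \alpha\, G_0(4)\, \hat{W}(4, 5)$$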

(40)

Random Walk Solution

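[Figure: the six-sentence graph with αG₀(1), …, αG₀(6) marked on nodes S₁ to S₆. The incoming edges of S₄ are labelled with the amounts listed below.]

Amounts on the edges entering S₄:

$$\alpha\, G_0(6)\, \hat{W}(6, 4), \qquad \alpha\, G_0(3)\, \hat{W}(3, 4)$$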

(41)

Random Walk Solution

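[Figure: the six-sentence graph. Node S₄ is labelled with (1-α)Î(4) and its incoming edges with αG₀(6)Ŵ(6, 4) and αG₀(3)Ŵ(3, 4).]

$$G_1(4) = (1-\alpha)\, \hat{I}(4) + \alpha\, G_0(3)\, \hat{W}(3, 4) + \alpha\, G_0(6)\, \hat{W}(6, 4)$$

Here Î(4) stands in the place of the importance score I(S₄, d) in the update equation.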

(42)

Mathematical Formulation

$$G(i) = (1-\alpha)\, I(S_i, d) + \alpha \sum_{S_j \in \mathrm{in}(S_i)} G(j)\, \hat{W}(j, i)$$

$$\sum_{S_k \in \mathrm{out}(S_j)} \hat{W}(j, k) = 1$$

$$\hat{W}(j, i) = \frac{W(j, i)}{\sum_{S_k \in \mathrm{out}(S_j)} W(j, k)}$$

[Figure: nodes S_j and S_i.]

(43)

Mathematical Formulation

$$G(i) = (1-\alpha)\, I(S_i, d) + \alpha \sum_{S_j \in \mathrm{in}(S_i)} G(j)\, \hat{W}(j, i)$$

[Figure: nodes S_j, S_i and S_a.]

The amount of score that S_j propagates to S_i is

$$\alpha\, G(j)\, \frac{W(j, i)}{W(j, i) + W(j, a)}$$

(44)

Mathematical Formulation

$$G(i) = (1-\alpha)\, I(S_i, d) + \alpha \sum_{S_j \in \mathrm{in}(S_i)} G(j)\, \hat{W}(j, i)$$

[Figure: nodes S_j, S_i and S_a.]

The amount of score that S_j propagates to S_i is

$$\alpha\, G(j)\, \frac{W(j, i)}{W(j, i) + W(j, a)}$$

The amount of score that S_j propagates to S_a is

$$\alpha\, G(j)\, \frac{W(j, a)}{W(j, i) + W(j, a)}$$

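(45)

Mathematical Formulation – an Example

[Figure: example graph of six sentences S₁ to S₆, with G(3), G(4) and G(6) marked on their nodes.]

$$G(4) = (1-\alpha)\, I(S_4, d) + \alpha \left[\, G(6)\, \hat{W}(6, 4) + G(3)\, \hat{W}(3, 4) \,\right]$$

$$\hat{W}(3, 4) = \frac{W(3, 4)}{W(3, 2) + W(3, 4)}, \qquad \hat{W}(6, 4) = \frac{W(6, 4)}{W(6, 4)} = 1$$

• The first term depends on S₄ itself
• The second term depends on the topically similar sentences (S₃ and S₆)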