
# Non-projective Dependency Parsing using Spanning Tree Algorithm


(1)

### Non-projective Dependency Parsing using Spanning Tree Algorithm

R98922004 Yun-Nung Chen (陳縕儂), first-year CSIE master's student

1 /39

(2)

### Reference

Non-projective Dependency Parsing using Spanning Tree Algorithms (HLT/EMNLP 2005)

Ryan McDonald, Fernando Pereira, Kiril Ribarov, Jan Hajič


(3)

## Introduction


(4)

### Example of Dependency Tree

Each word depends on exactly one parent.

Projective: with the words in linear order, no edges cross, so a word and its descendants form a contiguous substring of the sentence.
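The contiguous-substring property gives a simple projectivity test. A minimal sketch (the `heads` encoding and helper names are my own, not from the paper): `heads[d]` is the parent of word `d`, words are 1-indexed, and 0 denotes the root.

```python
def descendants(heads, i):
    """All words in the subtree rooted at word i (heads[d] = parent of d, 0 = root)."""
    out = {i}
    changed = True
    while changed:
        changed = False
        for d in range(1, len(heads)):
            if heads[d] in out and d not in out:
                out.add(d)
                changed = True
    return out

def is_projective(heads):
    """A tree is projective iff every word's descendants occupy contiguous positions."""
    for i in range(1, len(heads)):
        span = descendants(heads, i)
        if max(span) - min(span) + 1 != len(span):  # gap => crossing edge
            return False
    return True

is_projective([0, 2, 0, 4, 2])   # "John hit the ball": projective
is_projective([0, 0, 4, 1, 1])   # edges 1->3 and 4->2 cross: non-projective
```

The second example encodes a four-word sentence whose edges 1→3 and 4→2 cross, so the descendants of word 4 are {2, 4}, which is not contiguous.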


(5)

### Non-projective Examples

English: mostly projective, some non-projective constructions.

Languages with more flexible word order (German, Dutch, Czech): mostly non-projective.


(6)

### Related Work

Applications of dependency parsing: relation extraction, machine translation.


(7)

### Main Idea of the Paper

Dependency parsing can be formalized as the search for a maximum spanning tree in a directed graph.


(8)


(9)

### Formalization

sentence: x = x1 … xn

the directed graph Gx = (Vx, Ex) given by
Vx = {x0 = root, x1, …, xn}
Ex = {(i, j) : i ≠ j, (i, j) ∈ [0 : n] × [1 : n]} (no edges into the root)

dependency tree for x: y, represented by the tree Gy = (Vy, Ey) with
Vy = Vx
Ey = {(i, j) : there is a dependency from xi to xj}

(10)

### Scoring

score of an edge: s(i, j) = w · f(i, j)

score of a dependency tree y for sentence x: s(x, y) = Σ(i,j)∈y s(i, j) = Σ(i,j)∈y w · f(i, j)
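As a toy illustration of this edge-factored scoring (the weight vector and feature values below are invented for illustration, not from the paper):

```python
import numpy as np

w = np.array([1.0, 2.0, 0.5])                  # hypothetical weight vector
f = {("hit", "John"): np.array([1, 0, 1]),     # hypothetical edge features f(i, j)
     ("hit", "ball"): np.array([0, 1, 0]),
     ("ball", "the"): np.array([1, 1, 0])}

def s(i, j):
    """Edge score s(i, j) = w . f(i, j)."""
    return float(w @ f[(i, j)])

def tree_score(y):
    """s(x, y): sum of edge scores over the edges (i, j) of tree y."""
    return sum(s(i, j) for i, j in y)

y = [("hit", "John"), ("hit", "ball"), ("ball", "the")]
print(tree_score(y))  # 6.5
```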

(11)

### Example

x = John hit the ball with the bat

[Figure: three candidate dependency trees y1, y2, y3 for x, each rooted at root and headed by "hit", differing in how the remaining words attach.]

(12)

### Two Focus Points

1) How to decide weight vector w

2) How to find the tree with the maximum score


(13)

### Maximum Spanning Trees

dependency trees for x = spanning trees for Gx

the dependency tree with maximum score for x = the maximum spanning tree for Gx


(14)


(15)

### Chu-Liu-Edmonds Algorithm (1/12)

Input: graph G = (V, E)

Output: a maximum spanning tree in G

Greedily select the incoming edge with highest weight for each node.

Tree: terminate and output.

Cycle in G: contract the cycle into a single vertex and recalculate the weights of edges going into and out of the cycle.


(16)

### Chu-Liu-Edmonds Algorithm (2/12)

x = John saw Mary


[Figure: the graph Gx over {root, John, saw, Mary} with edge weights root→John 9, root→saw 10, root→Mary 9; John→saw 20, John→Mary 3; saw→John 30, saw→Mary 30; Mary→John 11, Mary→saw 0.]

(17)

### Chu-Liu-Edmonds Algorithm (3/12)

For each word, find the highest-scoring incoming edge.


[Figure: Gx with each word's best incoming edge highlighted: saw→John 30, John→saw 20, saw→Mary 30.]

(18)

### Chu-Liu-Edmonds Algorithm (4/12)

If the result includes

Tree – terminate and output

Cycle – contract and recalculate


[Figure: the selected edges contain a cycle between John and saw.]

(19)

### Chu-Liu-Edmonds Algorithm (5/12)

Contract and recalculate:

Contract the cycle into a single node.

Recalculate the weights of edges going into and out of the cycle.


[Figure: Gx with the John–saw cycle marked for contraction.]

(20)

### Chu-Liu-Edmonds Algorithm (6/12)

Outgoing edges of the cycle


[Figure: for each word outside the cycle, keep only the highest-weight edge leaving the cycle: saw→Mary 30 rather than John→Mary 3.]

(21)

### Chu-Liu-Edmonds Algorithm (7/12)

Incoming edges of the cycle: for a node x outside the cycle and each cycle node j, the recalculated weight is s(x, j) − s(a(j), j) + s(C), where a(j) is j's parent inside the cycle and s(C) is the total score of the cycle.


[Figure: Gx with the candidate edges entering the cycle from root and from Mary.]

(22)

### Chu-Liu-Edmonds Algorithm (8/12)

For x = root:

s(root, John) − s(a(John), John) + s(C) = 9 − 30 + 50 = 29

s(root, saw) − s(a(saw), saw) + s(C) = 10 − 20 + 50 = 40


[Figure: the edge root→C gets weight max(29, 40) = 40, entering the cycle at saw.]

(23)

### Chu-Liu-Edmonds Algorithm (9/12)

For x = Mary:

s(Mary, John) − s(a(John), John) + s(C) = 11 − 30 + 50 = 31

s(Mary, saw) − s(a(saw), saw) + s(C) = 0 − 20 + 50 = 30


[Figure: the edge Mary→C gets weight max(31, 30) = 31, entering the cycle at John.]

(24)

### Chu-Liu-Edmonds Algorithm (10/12)


Keep the highest-scoring subtree inside the cycle.

Run the algorithm recursively on the contracted graph.

[Figure: the contracted graph with node C replacing the John–saw cycle: root→C 40, Mary→C 31, C→Mary 30, root→Mary 9.]

(25)

### Chu-Liu-Edmonds Algorithm (11/12)


Find the incoming edge with highest score for each node.

Tree: terminate and output.

[Figure: best incoming edges in the contracted graph: root→C 40 and C→Mary 30; no cycle, so this is a tree.]

(26)

### Chu-Liu-Edmonds Algorithm (12/12)


Maximum Spanning Tree of Gx

[Figure: after expanding C, the maximum spanning tree of Gx: root→saw 10, saw→John 30, saw→Mary 30.]
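The contract-and-recalculate walkthrough above can be reproduced with a compact recursive sketch (a minimal implementation for illustration; the dictionary layout, node labels, and helper names are my own, and the edge weights are the ones from this example):

```python
def find_cycle(parent):
    """Return a list of nodes forming a cycle in the parent map, or None."""
    for start in parent:
        seen = []
        v = start
        while v in parent and v not in seen:
            seen.append(v)
            v = parent[v]
        if v in seen:
            return seen[seen.index(v):]
    return None

def chu_liu_edmonds(scores, root):
    """scores[h][d] = weight of edge h -> d; returns {child: parent} of the MST."""
    # 1) greedily pick the highest-weight incoming edge for every non-root node
    parent = {d: max((h for h in scores if d in scores[h]),
                     key=lambda h: scores[h][d])
              for d in scores if d != root}
    cycle = find_cycle(parent)
    if cycle is None:
        return parent
    # 2) contract the cycle C into one node and recalculate edge weights
    cyc = set(cycle)
    s_c = sum(scores[parent[v]][v] for v in cycle)   # s(C), score of the cycle
    C = "*C*"
    new = {v: {} for v in scores if v not in cyc}
    new[C] = {}
    enter, leave = {}, {}                            # bookkeeping for expansion
    for h in scores:
        for d, w in scores[h].items():
            if h in cyc and d in cyc:
                continue
            if h not in cyc and d in cyc:            # edge into the cycle:
                w2 = w - scores[parent[d]][d] + s_c  # s(h,d) - s(a(d),d) + s(C)
                if C not in new[h] or w2 > new[h][C]:
                    new[h][C] = w2
                    enter[h] = d
            elif h in cyc:                           # edge out of the cycle
                if d not in new[C] or w > new[C][d]:
                    new[C][d] = w
                    leave[d] = h
            else:
                new[h][d] = w
    # 3) recurse on the contracted graph, then expand C back into the cycle
    tree = chu_liu_edmonds(new, root)
    result = {}
    for d, h in tree.items():
        if d == C:                                   # h -> C breaks the cycle at enter[h]
            for v in cyc:
                result[v] = h if v == enter[h] else parent[v]
        elif h == C:
            result[d] = leave[d]
        else:
            result[d] = h
    return result

Gx = {"root": {"John": 9, "saw": 10, "Mary": 9},
      "John": {"saw": 20, "Mary": 3},
      "saw":  {"John": 30, "Mary": 30},
      "Mary": {"John": 11, "saw": 0}}
mst = chu_liu_edmonds(Gx, "root")
# mst == {"saw": "root", "John": "saw", "Mary": "saw"}
```

The greedy step forms the John–saw cycle with s(C) = 50, the contraction reproduces the recalculated weights 40 and 31 from the slides, and expansion yields the tree rooted at saw.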

(27)

### Complexity of Chu-Liu-Edmonds Algorithm

Each recursive call takes O(n²) to find the highest incoming edge for each word.

At most O(n) recursive calls (contracting at most n times).

Total: O(n³).

Tarjan gives an efficient implementation of the algorithm with O(n²) for dense graphs.


(28)

### Algorithm for Projective Trees

Eisner algorithm: O(n³)

Uses bottom-up dynamic programming.

Maintains the nested structural constraint (non-crossing constraint).
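For comparison, here is a score-only sketch of the Eisner dynamic program over complete and incomplete spans (backpointers are omitted for brevity; the array layout and names are my own). On the "John saw Mary" weights from the earlier example the best tree happens to be projective, so the projective score matches the MST score:

```python
import numpy as np

NEG = -1e9  # stands in for "no such edge"

def eisner_score(S):
    """S[h][d] = score of edge h -> d, with node 0 = root.
    Returns the score of the best projective dependency tree."""
    n = S.shape[0]
    C = np.zeros((n, n, 2))  # complete spans;   direction 0 = left, 1 = right
    I = np.zeros((n, n, 2))  # incomplete spans
    for k in range(1, n):
        for s in range(n - k):
            t = s + k
            # join two complete half-spans, then add the attaching edge
            best = max(C[s][r][1] + C[r + 1][t][0] for r in range(s, t))
            I[s][t][0] = best + S[t][s]   # attach s under head t
            I[s][t][1] = best + S[s][t]   # attach t under head s
            # extend an incomplete span into a complete one
            C[s][t][0] = max(C[s][r][0] + I[r][t][0] for r in range(s, t))
            C[s][t][1] = max(I[s][r][1] + C[r][t][1] for r in range(s + 1, t + 1))
    return C[0][n - 1][1]

# 0 = root, 1 = John, 2 = saw, 3 = Mary
S = np.full((4, 4), NEG)
S[0, 1], S[0, 2], S[0, 3] = 9, 10, 9
S[1, 2], S[1, 3] = 20, 3
S[2, 1], S[2, 3] = 30, 30
S[3, 1], S[3, 2] = 11, 0
print(eisner_score(S))  # 70.0
```

Each span pair is combined in O(n) ways over O(n²) spans, giving the O(n³) bound stated above.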


(29)

## Online Large Margin Learning


(30)

### Online Large Margin Learning

Supervised learning

Target: train the weight vector w on features of word pairs (words and PoS tags).

Training data: pairs (xt, yt) of sentences and their correct dependency trees.

Testing data: a sentence x.


(31)

### MIRA Learning Algorithm

Margin Infused Relaxed Algorithm (MIRA)

dt(x): the set of possible dependency trees for x

Keep the new weight vector as close as possible to the old one while satisfying the margin constraints:
min ‖w(i+1) − w(i)‖ s.t. s(xt, yt) − s(xt, y′) ≥ L(yt, y′) for all y′ ∈ dt(xt)

The final weight vector is the average of the weight vectors after each iteration.

(32)

### Single-best MIRA

Uses only the single margin constraint for the highest-scoring tree y′ = argmax s(xt, y′).
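With a single constraint, the MIRA quadratic program has a well-known closed-form solution. A sketch of one update (the function and variable names are my own; the feature vectors below are invented for illustration):

```python
import numpy as np

def single_best_mira_update(w, f_gold, f_pred, loss):
    """Smallest change to w so the gold tree outscores the predicted tree
    by at least loss = L(y_t, y'): min ||w' - w|| s.t. w'.(f_gold - f_pred) >= loss."""
    diff = f_gold - f_pred           # feature difference f(x, y_t) - f(x, y')
    slack = loss - w @ diff          # by how much the constraint is violated
    if slack <= 0 or not diff.any():
        return w                     # constraint already satisfied: keep w
    tau = slack / (diff @ diff)      # closed-form step size for one constraint
    return w + tau * diff

w = np.zeros(2)
w = single_best_mira_update(w, np.array([1.0, 0.0]), np.array([0.0, 1.0]), loss=1.0)
# gold tree now outscores the prediction by exactly the required margin of 1
```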


(33)

### Factored MIRA

Local constraints: the correct incoming edge for word j must outscore every other incoming edge for j by a margin of 1.

Summing over all words, the correct spanning tree then outscores each incorrect spanning tree by the number of incorrect edges.

More restrictive than the original constraints.

(34)

## Experiments


(35)

### Experimental Setting

Language: Czech

More flexible word order than English, so non-projective dependencies occur.

Features: Czech PoS tags (standard PoS, case, gender, tense).

Ratio of non-projective to projective:

Less than 2% of total edges are non-projective.

Czech-A: the entire PDT (Prague Dependency Treebank).

Czech-B: only the 23% of sentences containing a non-projective dependency.


(36)

### Compared Systems

COLL1999: the projective lexicalized phrase-structure parser

N&N2005: the pseudo-projective parser

McD2005: the projective parser using the Eisner algorithm and 5-best MIRA

Single-best MIRA and Factored MIRA: the non-projective parsers using Chu-Liu-Edmonds


(37)

### Results of Czech

| System | Complexity | Czech-A Accuracy | Czech-A Complete | Czech-B Accuracy | Czech-B Complete |
|---|---|---|---|---|---|
| COLL1999 | O(n⁵) | 82.8 | – | – | – |
| N&N2005 | – | 80.0 | 31.8 | – | – |
| McD2005 | O(n³) | 83.3 | 31.3 | 74.8 | 0.0 |
| Single-best MIRA | O(n²) | 84.1 | 32.2 | 81.0 | 14.9 |
| Factored MIRA | O(n²) | 84.4 | 32.3 | 81.5 | 14.3 |

Czech-A: the entire PDT (23% of its sentences are non-projective); Czech-B: the non-projective sentences only.

(38)

### Results of English

| System | Complexity | Accuracy | Complete |
|---|---|---|---|
| McD2005 | O(n³) | 90.9 | 37.5 |
| Single-best MIRA | O(n²) | 90.2 | 33.2 |
| Factored MIRA | O(n²) | 90.2 | 32.3 |

English dependency trees are projective; the Eisner algorithm uses the a priori knowledge that all trees are projective.

