# Slide credit from Richard Socher

(2)

### Sequence Modeling

Idea: aggregate the meaning from all words into a vector

Compositionality method:

• Basic combination: average, sum

• Neural combination: recursive neural network (RvNN), recurrent neural network (RNN), convolutional neural network (CNN)

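How to compute an N-dim vector for the whole sentence from its words?

[Figure: the four words of an example sentence, glossed (this) (specification) (have) (sincerity), combined into a single N-dim vector]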
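The basic combinations (average, sum) can be sketched as follows. This is a minimal illustration with made-up 4-dimensional embeddings; the values and the name `word_vectors` are assumptions, not taken from the slides.

```python
import numpy as np

# Hypothetical 4-dimensional embeddings for a 4-word sentence (made-up values).
word_vectors = np.array([
    [0.2, 0.1, 0.0, 0.5],
    [0.4, 0.3, 0.9, 0.1],
    [0.0, 0.7, 0.2, 0.2],
    [0.6, 0.1, 0.3, 0.8],
])

# Basic combination: one N-dim vector for the whole sentence.
sentence_sum = word_vectors.sum(axis=0)
sentence_avg = word_vectors.mean(axis=0)

# Both combinations ignore word order: reordering the rows changes nothing.
shuffled = word_vectors[[3, 1, 0, 2]]
order_is_ignored = np.allclose(shuffled.mean(axis=0), sentence_avg)
print(sentence_sum, sentence_avg, order_is_ignored)
```

Because neither combination sees word order or structure, they serve as the baseline that the neural combinations try to improve on.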

(3)

## Recursive Neural Network

From Words to Phrases

(4)

### Recursive Neural Network

Idea: leverage the linguistic knowledge (syntax) for combining multiple words into phrases

Assumption: language is described recursively

(5)

### Related Work for RvNN

Pollack (1990): Recursive auto-associative memories

Previous recursive neural network work by Goller & Küchler (1996) and Costa et al. (2003) assumed a fixed tree structure and used one-hot vectors.

Hinton (1990) and Bottou (2011): Related ideas about recursive models and recursive operators as smooth versions of logic operations

(6)

### Outline

Property

Syntactic Compositionality

Recursion Assumption

Network Architecture and Definition

Standard Recursive Neural Network

Weight-Tied

Weight-Untied

Matrix-Vector Recursive Neural Network

Recursive Neural Tensor Network

Applications

Parsing

Paraphrase Detection

Sentiment Analysis

(8)

### Phrase Mapping

Principle of “Compositionality”

The meaning (vector) of a sentence is determined by 1) the meanings of its words and 2) the rules that combine them

Idea: jointly learn parse trees and compositional vector representations

[Figure: word vectors and phrase vectors in one vector space, with a parse tree over “the country of my birth”; the phrases “the country of my birth” and “the place where I was born” are both mapped into that space]

(9)

### Sentence Syntactic Parsing

Parsing is a process of analyzing a string of symbols

Parsing tree conveys

1) Part-of-speech for each word

2) Phrases

3) Relationships

(NN = noun, VB = verb, DT = determiner, IN = preposition)

[Figure: parse tree of “The cat sat on the mat.”: (S (NP (DT The) (NN cat)) (VP (VB sat) (PP (IN on) (NP (DT the) (NN mat)))))]

(11)

### Sentence Syntactic Parsing

Parsing tree conveys

2) Phrases

• Noun phrase (NP): “the cat”, “the mat”

• Prepositional phrase (PP): “on the mat”

• Verb phrase (VP): “sat on the mat”

• Sentence (S): “the cat sat on the mat”

(12)

### Sentence Syntactic Parsing

Parsing tree conveys

3) Relationships

[Figure: parse tree of “The cat sat on the mat.” with “the cat” marked as subject, “sat” as verb, and “on the mat” as modifier_of_place]

• “the cat” is the subject of “sat”

• “on the mat” is the place modifier of “sat”
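A parse tree like this one can be written down as a nested data structure. The sketch below is one possible encoding, not the representation used in the lecture: each phrase is a tuple `(label, child, ...)` and each word is `(tag, word)`; the helper names `words` and `phrases` are made up for illustration.

```python
# Phrases are (label, child, child, ...); leaves are (POS tag, word).
tree = ("S",
        ("NP", ("DT", "The"), ("NN", "cat")),
        ("VP", ("VB", "sat"),
               ("PP", ("IN", "on"),
                      ("NP", ("DT", "the"), ("NN", "mat")))))

def is_leaf(node):
    return len(node) == 2 and isinstance(node[1], str)

def words(node):
    """Return the words covered by a node, left to right."""
    if is_leaf(node):
        return [node[1]]
    return [w for child in node[1:] for w in words(child)]

def phrases(node):
    """Return (label, text) for every phrase node, parents before children."""
    if is_leaf(node):
        return []
    found = [(node[0], " ".join(words(node)))]
    for child in node[1:]:
        found.extend(phrases(child))
    return found

for label, text in phrases(tree):
    print(label, "->", text)
```

The recursion in `phrases` mirrors the recursion in the tree itself: every phrase is described by its label plus the phrases it contains.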

(13)

### Learning Structure & Representation

Vector representations incorporate the meaning of words and their compositional structures

[Figure: parse tree of “The cat sat on the mat.” with a vector attached to each phrase node (NP, PP, VP, NP, S)]

(15)

### Recursion Assumption

Are languages recursive? (debatable)

Recursion helps describe natural language

Ex. “the church which has nice windows”, a noun phrase containing a relative clause that contains a noun phrase

NP → NP PP

(16)

### Recursion Assumption

Characteristics of recursion

John and Jane went to a big festival. They enjoyed the trip and the music there.

“they”: John and Jane; “the trip”: went to a big festival; “there”: big festival

3. Works better for some tasks to use grammatical tree structure

Language recursion is still up for debate

(18)

### Recursive Neural Network Architecture

A network predicts the vectors along with the structure

Input: two candidate children’s vector representations

Output:

1) vector representations for the merged node

2) score of how plausible the new node would be

[Figure: a neural network takes the vectors of two children from “on the mat.” (a word and an NP) and outputs the vector of the merged PP node together with its score]

(19)

### Recursive Neural Network Definition

1) vector representations for the merged node

2) score of how plausible the new node would be

Same W parameters at all nodes of the tree → weight-tied
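The equations on this slide did not survive extraction. As a hedged sketch, the code below assumes the commonly used formulation: the parent vector is `tanh(W [c1; c2] + b)` and the score is a dot product `u · p`. The names `W`, `b`, `u`, and `compose`, the dimension, and the random initial values are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4  # embedding size (illustrative)

# Weight-tied: one W, b, u shared by every node of the tree.
W = rng.normal(scale=0.1, size=(d, 2 * d))  # composition matrix
b = np.zeros(d)                             # bias
u = rng.normal(scale=0.1, size=d)           # scoring vector

def compose(c1, c2):
    """Return (merged-node vector, plausibility score) for two child vectors."""
    p = np.tanh(W @ np.concatenate([c1, c2]) + b)
    return p, float(u @ p)

# Random stand-ins for the word vectors of "on", "the", "mat".
on, the, mat = (rng.normal(size=d) for _ in range(3))
np_vec, np_score = compose(the, mat)    # "the mat"    -> NP
pp_vec, pp_score = compose(on, np_vec)  # "on" + NP    -> PP
print(np_score, pp_score)
```

The parent vector has the same size as each child, which is what lets the same function be applied again one level up the tree.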

(20)

### Sentence Parsing via RvNN

[Figure sequence: the neural network is applied to every pair of adjacent nodes and the tree is built one merge per slide. Candidate scores shown at each step: 3.1, 0.3, 0.1, 0.4, 2.3; then 1.1, 0.1, 0.4, 2.3; then 1.1, 0.1, 3.6; then 1.1, 3.8.]

Outputs once a single node remains:

• Sentence parsing score

• Sentence vector embeddings
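The step-by-step merging shown in this figure sequence can be sketched as a greedy loop. This is an illustration under assumptions, not the lecture's implementation: it reuses an assumed composition function `tanh(W [c1; c2] + b)` with score `u · p`, uses random untrained parameters, and takes the sentence parsing score to be the sum of the scores of the chosen merges.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 4
W = rng.normal(scale=0.1, size=(d, 2 * d))
b = np.zeros(d)
u = rng.normal(scale=0.1, size=d)

def compose(c1, c2):
    p = np.tanh(W @ np.concatenate([c1, c2]) + b)
    return p, float(u @ p)

def greedy_parse(words, vectors):
    """Merge the best-scoring adjacent pair until a single node is left."""
    nodes = list(zip(words, vectors))  # (subtree, vector) pairs
    total_score = 0.0
    while len(nodes) > 1:
        # Score every pair of adjacent nodes with the same network.
        candidates = [compose(nodes[i][1], nodes[i + 1][1])
                      for i in range(len(nodes) - 1)]
        best = max(range(len(candidates)), key=lambda i: candidates[i][1])
        parent_vec, score = candidates[best]
        subtree = (nodes[best][0], nodes[best + 1][0])
        nodes[best:best + 2] = [(subtree, parent_vec)]  # replace the pair
        total_score += score
    tree, sentence_vec = nodes[0]
    return tree, sentence_vec, total_score

def leaves(tree):
    return [tree] if isinstance(tree, str) else leaves(tree[0]) + leaves(tree[1])

words = ["the", "cat", "sat", "on", "the", "mat"]
vectors = [rng.normal(size=d) for _ in words]
tree, sentence_vec, parse_score = greedy_parse(words, vectors)
print(tree, parse_score)
```

With untrained parameters the resulting tree is arbitrary; the point is only the control flow, in which each merge reduces the number of nodes by one until the root vector is left.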

(25)

### Backpropagation through Structure

Principally the same as general backpropagation (Goller & Küchler, 1996)



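$$
\frac{\partial C}{\partial w_{ij}^{l}} =
\underbrace{\delta_i^{l}}_{\text{Backward Pass}} \cdot
\underbrace{\begin{cases} a_j^{l-1}, & l > 1 \\ x_j, & l = 1 \end{cases}}_{\text{Forward Pass}}
$$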

Three differences

1) Sum derivatives of W from all nodes

2) Split derivatives at each node

3) Add error messages from parent + node itself

(26)

(27)

### 2) Split derivatives at each node

During forward propagation, the parent node is computed based on two children

During backward propagation, the errors should be computed wrt each of them

(28)

### 3) Add error messages from parent + node itself

For each node, the error message is composed of

• Error propagated from parent

• Error from the current node
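The three differences can be seen together in a small sketch. The setup is an assumption chosen to keep the example short: each internal node computes `p = tanh(W [c1; c2])`, and the objective is the sum of `u · p` over all internal nodes. The names `forward`, `backward`, and `total_score` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 3
W = rng.normal(scale=0.5, size=(d, 2 * d))  # shared composition matrix
u = rng.normal(size=d)                      # scoring vector

def forward(tree, W):
    """Leaves are vectors; internal nodes are (left, right) tuples."""
    if isinstance(tree, np.ndarray):
        return {"vec": tree, "kids": None}
    left, right = forward(tree[0], W), forward(tree[1], W)
    x = np.concatenate([left["vec"], right["vec"]])
    return {"vec": np.tanh(W @ x), "x": x, "kids": (left, right)}

def total_score(node):
    """Objective for this sketch: sum of u . p over all internal nodes."""
    if node["kids"] is None:
        return 0.0
    return float(u @ node["vec"]) + sum(total_score(k) for k in node["kids"])

def backward(node, err_from_parent, W, grad_W):
    if node["kids"] is None:
        return
    # 3) Error at this node = message from the parent + the node's own score term.
    err = err_from_parent + u
    delta = err * (1.0 - node["vec"] ** 2)  # back through tanh
    # 1) W is shared, so its gradient is summed over all nodes.
    grad_W += np.outer(delta, node["x"])
    # 2) Split the error on the concatenated input between the two children.
    err_x = W.T @ delta
    backward(node["kids"][0], err_x[:d], W, grad_W)
    backward(node["kids"][1], err_x[d:], W, grad_W)

a, b, c = (rng.normal(size=d) for _ in range(3))
tree = ((a, b), c)  # two internal nodes: (a, b) and the root

grad_W = np.zeros_like(W)
backward(forward(tree, W), np.zeros(d), W, grad_W)
print(grad_W)
```

A finite-difference check on `total_score` is a convenient way to confirm that the accumulated `grad_W` is correct for any tree shape.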

(30)

### Composition Matrix W

Issue: using the same network W for different compositions

(31)

### Syntactically Untied RvNN

Idea: the composition function is conditioned on the syntactic categories

Benefit

• Composition functions are syntax-dependent

• Allows different composition functions for word pairs, e.g. Adv + AdjP, VP + NP

Issue: speed due to many candidates
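One way to picture the untied variant is a lookup table of composition matrices keyed by the children's syntactic categories. The sketch below is an assumption-laden illustration: the dictionary `W_by_category`, the category pairs in it, and the random weights are all made up, and the composition is assumed to be `tanh(W [c1; c2])`.

```python
import numpy as np

rng = np.random.default_rng(3)
d = 4

# One composition matrix per (left category, right category) pair.
# The pairs listed here are illustrative only.
W_by_category = {
    ("DT", "NN"): rng.normal(scale=0.1, size=(d, 2 * d)),
    ("IN", "NP"): rng.normal(scale=0.1, size=(d, 2 * d)),
    ("VB", "PP"): rng.normal(scale=0.1, size=(d, 2 * d)),
}

def compose_untied(c1, cat1, c2, cat2):
    """Pick the composition matrix from the children's categories."""
    W = W_by_category[(cat1, cat2)]
    return np.tanh(W @ np.concatenate([c1, c2]))

the, mat, on = (rng.normal(size=d) for _ in range(3))
np_vec = compose_untied(the, "DT", mat, "NN")     # DT + NN
pp_vec = compose_untied(on, "IN", np_vec, "NP")   # IN + NP
print(np_vec, pp_vec)
```

The number of matrices grows with the number of category pairs, and the categories of the children must be known before composing, which is where the speed issue comes from.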

(32)

### Compositional Vector Grammar

Compute the score only for a subset of trees coming from a simpler, faster model (Socher et al., 2013)

Prunes very unlikely candidates for speed

Provides coarse syntactic categories of the children for each beam candidate

Probabilistic context-free grammar (PCFG) helps decrease the search space

(33)

### Labels for RvNN

The score can be passed through a softmax function to compute the probability of each category

[Figure: inputs x1 x2 x3 x4, a neural network followed by a softmax, outputs y1 y2 y3, and a node labeled NP]

Softmax loss → cross-entropy error for optimization
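The softmax and cross-entropy step can be sketched as below. The label set and the score values are made up for illustration; only the softmax and cross-entropy formulas themselves are standard.

```python
import numpy as np

def softmax(scores):
    shifted = scores - np.max(scores)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

def cross_entropy(probs, target_index):
    """Negative log-probability of the correct category."""
    return -np.log(probs[target_index])

labels = ["NP", "VP", "PP"]          # illustrative label set
scores = np.array([2.0, 0.5, -1.0])  # made-up scores for one node
probs = softmax(scores)
loss = cross_entropy(probs, labels.index("NP"))
print(dict(zip(labels, probs)), loss)
```

The loss is small when the correct category already receives most of the probability mass and grows as that probability shrinks.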

(35)

### Recursive Neural Network

Issue: some words act mostly as an operator, e.g. “very” in “very good”

(36)

### Matrix-Vector Recursive Neural Network

Idea: each word can additionally serve as an operator
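A hedged sketch of this idea: every word carries a vector and a matrix, and each child's matrix is applied to the other child's vector before composition. The slide's equations are not in the extracted text, so the form used here, `p = tanh(W [B a; A b])` for the vector and `P = W_M [A; B]` for the parent matrix, is stated as an assumption, as are all names and values.

```python
import numpy as np

rng = np.random.default_rng(4)
d = 3
W = rng.normal(scale=0.1, size=(d, 2 * d))    # combines the two modified vectors
W_M = rng.normal(scale=0.1, size=(d, 2 * d))  # combines the two operator matrices

def mv_compose(a, A, b, B):
    """Each child is (vector, matrix); the matrix acts on the sibling's vector."""
    p = np.tanh(W @ np.concatenate([B @ a, A @ b]))
    P = W_M @ np.vstack([A, B])  # (d, 2d) @ (2d, d) -> (d, d)
    return p, P

# Made-up parameters: "very" acts as an amplifier, "good" leaves its sibling unchanged.
very_vec, very_mat = rng.normal(size=d), 2.0 * np.eye(d)
good_vec, good_mat = rng.normal(size=d), np.eye(d)

p, P = mv_compose(very_vec, very_mat, good_vec, good_mat)
print(p, P.shape)
```

The parent again has a vector and a matrix of the same sizes as its children, so it can act as an operator on its own sibling further up the tree.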

(38)

### Recursive Neural Tensor Network

Idea: allow more interactions of vectors
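One way to allow more interactions is a bilinear (tensor) term on top of the standard composition. The sketch below assumes the form `p = tanh(x^T V x + W x)` with `x = [c1; c2]` and one slice of `V` per output dimension; the slide's own equation is not in the extracted text, so the form, names, and random values are assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
d = 3
W = rng.normal(scale=0.1, size=(d, 2 * d))         # standard composition matrix
V = rng.normal(scale=0.1, size=(d, 2 * d, 2 * d))  # one bilinear slice per output unit

def rntn_compose(c1, c2):
    x = np.concatenate([c1, c2])
    # bilinear[k] = x^T V[k] x: multiplicative interactions between the children
    bilinear = np.einsum("i,kij,j->k", x, V, x)
    return np.tanh(bilinear + W @ x)

c1, c2 = rng.normal(size=d), rng.normal(size=d)
p = rntn_compose(c1, c2)
print(p)
```

Setting `V` to zero recovers the standard composition, so the tensor term can be read as an additive extension of it.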

(40)

(41)

### Image Compositionality

Idea: an image can be composed from visual segments (analogous to natural language parsing)

(43)

### Paraphrase for Learning Sentence Vectors

Pairwise comparison of the nodes in two sentences' parse trees for learning sentence embeddings
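The pairwise comparison can be pictured as a matrix with one entry per pair of nodes. The sketch below assumes Euclidean distance as the comparison and uses random stand-ins for the node vectors; how the matrix is used afterwards is not described in the extracted slide.

```python
import numpy as np

rng = np.random.default_rng(6)
d = 4

# Hypothetical node vectors (words and phrases) from two parsed sentences.
nodes_a = rng.normal(size=(5, d))  # 5 nodes in the tree of sentence A
nodes_b = rng.normal(size=(7, d))  # 7 nodes in the tree of sentence B

# distances[i, j] = Euclidean distance between node i of A and node j of B.
diff = nodes_a[:, None, :] - nodes_b[None, :, :]
distances = np.linalg.norm(diff, axis=-1)
print(distances.shape)
```

The matrix size depends on the two sentence lengths, so a fixed-size summary of it is needed before it can feed a classifier.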

(45)

### Sentiment Analysis

Sentiment analysis for sentences with negation words can benefit from RvNN

(46)

### Sentiment Analysis

Sentiment Treebank with richer annotations

Phrase-level sentiment labels indeed improve the performance

(47)

### Sentiment Tree Illustration

Stanford live demo: http://nlp.stanford.edu/sentiment/

Phrase-level annotations help the model learn the specific compositional functions for sentiment

(48)

### Concluding Remarks

Recursive Neural Network

Idea: syntactic compositionality & language recursion

Network Variants

Standard Recursive Neural Network

Weight-Tied

Weight-Untied

Matrix-Vector Recursive Neural Network

Recursive Neural Tensor Network
