
(1)

NTU MiuLab

Semantically-Aligned Equation Generation

for Solving and Reasoning Math Word Problems

Ting-Rui Chiang and Yun-Nung (Vivian) Chen

https://github.com/MiuLab/E2EMathSolver

(2)

Each notebook takes $0.5 and each pen takes $1. Tom has $10. How many notebooks can he buy after buying 5 pens?

Math Word Problem

x = (10 − 1 × 5) ÷ 0.5

Reasoning & Solving
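The solving step can be checked with one line of arithmetic (a minimal sketch; the variable names are ours, not the paper's):

```python
# Worked check of the slide's example: Tom has $10, a pen costs $1,
# he buys 5 pens, and a notebook costs $0.5 (names are illustrative).
money, pen_price, pens_bought, notebook_price = 10, 1, 5, 0.5
x = (money - pen_price * pens_bought) / notebook_price
print(x)  # 10.0
```

So Tom can buy 10 notebooks, matching the equation on the slide.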

(3)


Prior Work

Non-neural approaches

• Template-based

(Kushman et al., Upadhyay and Chang)

Rely on hand-crafted features!

Deep learning

• Seq2Seq

(Wang et al., Ling et al.)

Does not exploit the structure of math expressions.

Template-based: fill a template's slots
x = (? + ?) × ? - ?  →  x = (1 + 2) × 3 - 4

Seq2Seq: generate the equation directly
Problem  →  x = (1 + 2) × 3 - 4

Our model is end-to-end and structural!

(4)

Overview of the Proposed Model

[Figure: an encoder reads the problem text; a decoder then emits one stack action per step to form the equation.]

Each notebook takes $0.5 and each pen takes $1. Tom has $10. How many notebooks can he buy after buying 5 pens?

x = (10 − 1 × 5) ÷ 0.5

(5)

Look Again at the Problem

Each notebook takes $0.5 and each pen takes $1. Tom has $10. How many notebooks can he buy after buying 5 pens?

[Figure: the quantities $0.5, $1, $10, and the unknown ? highlighted in the problem text.]

(6)

Semantic Meaning of the Operands

Each notebook takes $0.5 and each pen takes $1. Tom has $10. How many notebooks can he buy after buying 5 pens?

x = ( 10 − 1 × 5 ) ÷ 0.5
  10 — the amount of money Tom has
  1 — price of a pen
  0.5 — price of a notebook
  5 — number of pens bought

(7)


Idea: Bridging Symbolic and Semantic Worlds

Symbolic World Semantic World

(8)

Preprocess

Each notebook takes $0.5 and each pen takes $1. Tom has $10. How many notebooks can he buy after buying 5 pens?

Symbolic part (quantities extracted from the text): 0.5, 1, 10, 5
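The preprocessing step can be sketched as a number-extraction pass; the `preprocess` name and the `<nK>` placeholder format are our assumptions for illustration, not the paper's exact scheme:

```python
import re

def preprocess(problem):
    """Replace each quantity in the text with an indexed placeholder,
    keeping the extracted numbers as the symbolic part (illustrative)."""
    numbers = []
    def replace(match):
        numbers.append(float(match.group()))
        return f"<n{len(numbers) - 1}>"
    text = re.sub(r"\d+(?:\.\d+)?", replace, problem)
    return text, numbers

text, numbers = preprocess(
    "Each notebook takes $0.5 and each pen takes $1. Tom has $10. "
    "How many notebooks can he buy after buying 5 pens?")
print(numbers)  # [0.5, 1.0, 10.0, 5.0]
```

The decoder can then refer to quantities by position instead of by literal value.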

(9)

Symbol Encoding

Each notebook takes $0.5 and each pen takes $1. Tom has $10. How many notebooks can he buy after buying 5 pens?

Symbolic part: 0.5, 1, 10, 5 — each encoded into its semantic part

(10)

Inside Encoder

[Figure: the encoder reading the tokens "Each notebook takes $ 0.5 and ...".]

(11)

Semantic Generation for Unknown x

[Figure: a semantic representation for the unknown x is generated from the encoder states over "Each notebook takes $ 0.5 and ...".]

(12)

Operands & Their Semantics

Each notebook takes $0.5 and each pen takes $1. Tom has $10. How many notebooks can he buy after buying 5 pens?

Symbolic part: 0.5, 1, 10, 5, x — each operand paired with its semantic part

(13)

Intuition of Using Semantics

Each notebook takes $0.5 and each pen takes $1. Tom has $10. How many notebooks can he buy after buying 5 pens?

x = ( 10 − 1 ? 5 )
  1 — price of a pen
  5 — number of pens bought
Knowing the operands' semantics (a price and a count) suggests the missing operator: ×.

(14)

Equation Generation in Postfix

Each notebook takes $0.5 and each pen takes $1. Tom has $10. How many notebooks can he buy after buying 5 pens?

x 10 1 5 × − 0.5 ÷ =
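A postfix sequence can be evaluated with a plain stack, which is what makes this output format convenient. This evaluator is a sketch with ASCII operator tokens, not the paper's code:

```python
def eval_postfix(tokens):
    """Evaluate the right-hand side of a postfix equation with a stack."""
    ops = {"+": lambda a, b: a + b, "-": lambda a, b: a - b,
           "*": lambda a, b: a * b, "/": lambda a, b: a / b}
    stack = []
    for tok in tokens:
        if tok in ops:
            b, a = stack.pop(), stack.pop()  # right operand is on top
            stack.append(ops[tok](a, b))
        else:
            stack.append(float(tok))
    return stack.pop()

# The slide's right-hand side, 10 1 5 × − 0.5 ÷, in ASCII tokens:
print(eval_postfix(["10", "1", "5", "*", "-", "0.5", "/"]))  # 10.0
```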

(15)

Equation Generation by Stack Actions

• A stack is used.
• The decoder generates stack actions.
• The equation is built by applying those actions to the stack.

x = (10 − 1 × 5) ÷ 0.5

[Figure: the decoder emits one stack action per step.]
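The action semantics can be mirrored by a small interpreter; the `push v` notation and the function name are ours, for illustration only:

```python
def run_actions(actions):
    """Apply stack actions: 'push v' pushes an operand, a binary operator
    pops two operands and pushes the combined sub-expression, and '='
    pops both sides of the finished equation (illustrative sketch)."""
    stack = []
    for act in actions:
        if act.startswith("push "):
            stack.append(act[len("push "):])
        elif act == "=":
            rhs, lhs = stack.pop(), stack.pop()
            return f"{lhs} = {rhs}"
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(f"({a} {act} {b})")

equation = run_actions(["push x", "push 10", "push 1", "push 5",
                        "×", "−", "push 0.5", "÷", "="])
print(equation)  # x = ((10 − (1 × 5)) ÷ 0.5)
```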

(16)

Action Selection in Each Step

[Figure: at each decoding step, a classifier over the decoder state selects one stack action from {+, −, ×, ÷, =, Push}.]
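At each step the classifier scores the candidate actions; a minimal softmax selection over hypothetical scores might look like:

```python
import math

ACTIONS = ["+", "-", "×", "÷", "=", "Push"]

def select_action(logits):
    """Softmax over the fixed action set; the decoder takes the most
    probable stack action (the scores below are made up)."""
    total = sum(math.exp(v) for v in logits)
    probs = [math.exp(v) / total for v in logits]
    return ACTIONS[probs.index(max(probs))]

print(select_action([0.1, 0.2, 0.3, 0.1, 0.0, 2.0]))  # Push
```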

(17)

Equation Generation by Stack Actions

Target equation: x = (10 − 1 × 5) ÷ 0.5
Operands available: 0.5, 1, 10, 5, x

Action: Push → stack: x

(18)

Target equation: x = (10 − 1 × 5) ÷ 0.5

Generated actions so far: Push, Push, Push, Push
Stack after each push: x → x 10 → x 10 1 → x 10 1 5

(19)

Target equation: x = (10 − 1 × 5) ÷ 0.5

Stack: x 10 1 5
Action: × → pops 1 and 5, pushes 1 × 5
Stack: x 10 (1 × 5)

(20)

Target equation: x = (10 − 1 × 5) ÷ 0.5

After many steps…
Generated actions: x 10 1 5 × − 0.5 ÷ =
Result: x = (10 − 1 × 5) ÷ 0.5

(21)

Training Process

• The target equation is given.
• Trained as a Seq2Seq model on the gold action sequence (teacher forcing).

Each notebook takes $0.5 and each pen takes $1. Tom has $10. How many notebooks can he buy after buying 5 pens?

Decoder input:  <bos> x 10 1 5 …
Decoder target: x 10 1 5 …
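The `<bos>`-shifted pairing is standard teacher forcing; a sketch of how inputs and labels line up (tokens mirror the slide):

```python
# Gold action sequence for the example equation, as on the slide.
target = ["x", "10", "1", "5", "×", "−", "0.5", "÷", "="]

# Teacher forcing: the decoder sees the previous gold action and must
# predict the next one.
decoder_input = ["<bos>"] + target[:-1]   # shifted right by one step
decoder_label = target
pairs = list(zip(decoder_input, decoder_label))
print(pairs[0])  # ('<bos>', 'x')
```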

(22)

Experiments

• Dataset: Math23K
  • In Chinese
  • 23,000 math word problems
  • Operators: +, −, ×, ÷

(23)

Results

[Bar chart: accuracy (45–70) for Retrieval, BLSTM, Self-Attention, Seq2Seq w/ SNI, Proposed, and Hybrid, grouped as retrieval / template / generation / ensemble; annotated gaps of ≈ 8% and > 1%.]

(24)

Ablation Test

[Bar chart: accuracy (59–66) for Char-Based, Word-Based, Word-Based −Semantic, Word-Based −Gate, Word-Based −Gate −Attention, and Word-Based −Gate −Attention −Stack; annotated gaps of ≈ 3% and ≈ 2.5%.]

(25)

Self-Attention for Qualitative Analysis

[Figure, shown across two slides: encoder self-attention weights over the tokens "Each notebook takes $ 0.5 and ...".]

(27)

Attention for Operand Semantics

The attention focuses on:
• Informative verbs, e.g. "gain", "get", "fill"
• Quantifier-related words, e.g. "every", "how many"

(28)

Conclusion

Three main contributions:
• Approach: equation generation with a stack
• Originality: automatic extraction of operand semantics
• Performance: a state-of-the-art end-to-end neural model on Math23K

(29)


Code Available @

https://github.com/MiuLab/E2EMathSolver
