
Slides credited from Hsueh-I Lu, Hsu-Chun Hsiao, & Michael Tsai


Academic year: 2022

Mini-HW 4 released

Due on 10/19 (Thu) 17:20

Homework 1 due a week later

Homework 2 released

Due on 11/09 (Thu) 17:20 (4 weeks)

TA Recitation

10/26 (Thu) at R103

Homework 1 QA

Mid-term date changed

Original: 11/09 (Thu)

New: 11/16 (Thu)

Check the course website frequently for updated information!

(3)
(4)

Dynamic Programming

DP #1: Rod Cutting

DP #2: Stamp Problem

DP #3: Matrix-Chain Multiplication

DP #4: Sequence Alignment Problem

Longest Common Subsequence (LCS) / Edit Distance

Viterbi Algorithm

Space Efficient Algorithm

DP #5: Weighted Interval Scheduling

DP #6: Knapsack Problem

0/1 Knapsack

Unbounded Knapsack

Multidimensional Knapsack

Fractional Knapsack

(5)

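There are 100 prisoners on death row who are to be executed the next day. The warden offers them a chance to survive.

At the execution the next day, each prisoner wears a hat (black or white) and all of them stand in a single line.

Before the execution, starting from the last prisoner in the line, each prisoner guesses the color of their own hat (only "black" or "white" may be said). A correct guess spares the prisoner; a wrong guess means execution.

If the prisoners may gather the night before to agree on a plan, is there a good strategy that maximizes the expected number of survivors?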

(6)

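The prisoners stand in a line. Each prisoner can see the hats of everyone in front, but not their own hat or the hats of those behind.

Guessing starts from the last prisoner and proceeds forward in order.

Every prisoner hears all guesses made before their own.

……

Example: each odd-numbered prisoner announces the hat color of the prisoner directly in front → the expected number of survivors is 75

Is there a better strategy that lets more prisoners survive?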

(7)

Do not focus on “specific algorithms”

But “some strategies” to “design” algorithms

First Skill: Divide-and-Conquer

Second Skill: Dynamic Programming

(8)

Textbook Chapter 15 – Dynamic Programming

Textbook Chapter 15.3 – Elements of dynamic programming


(9)

▪ Dynamic programming, like the divide-and-conquer method, solves problems by combining the solutions to subproblems

▪ Trade space for time

▪ Leave a trace of what has already been computed

▪ “Dynamic”: time-varying

“Programming”: a tabular method

Dynamic Programming: planning over time

(10)

Divide-and-Conquer

partition the problem into independent or disjoint subproblems

if the subproblems overlap, it repeatedly solves the common subsubproblems → more work than necessary

Dynamic Programming

partition the problem into dependent or overlapping subproblems

avoid recomputation

Top-down with memoization

Bottom-up method

(11)

▪ Apply four steps

1. Characterize the structure of an optimal solution

2. Recursively define the value of an optimal solution

3. Compute the value of an optimal solution, typically in a bottom-up fashion

4. Construct an optimal solution from computed information

(12)

Fibonacci sequence

Base case: F(0) = F(1) = 1

Recursive case: F(n) = F(n-1) + F(n-2)

Fibonacci(n)
  if n < 2 // base case
    return 1
  // recursive case
  return Fibonacci(n-1) + Fibonacci(n-2)

Recursion tree for F(5):

F(5)

F(4) F(3)

F(3) F(2) F(2) F(1)

F(2) F(1) F(1) F(0) F(1) F(0)

F(1) F(0)

Calling overlapping subproblems results in poor efficiency

F(3) was computed twice

F(2) was computed 3 times

(13)

Solve the overlapping subproblems recursively with memoization

Check the memo before making the calls


F(5)

F(4) F(3)

F(3) F(2)

F(2) F(1)

F(1) F(0)

Memo

n    0 1 2 3 4 5
F(n) 1 1 2 3 5 8

(14)

Memoized-Fibonacci(n)
  // initialize memo (array a[])
  a[0] = 1
  a[1] = 1
  for i = 2 to n
    a[i] = 0
  return Memoized-Fibonacci-Aux(n, a)

Memoized-Fibonacci-Aux(n, a)
  if a[n] > 0
    return a[n]
  // save the result to avoid recomputation
  a[n] = Memoized-Fibonacci-Aux(n-1, a) + Memoized-Fibonacci-Aux(n-2, a)
  return a[n]
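The top-down idea can be sketched in Python as follows. This is a minimal illustration, not the course's reference code: the function name `memoized_fibonacci` and the use of a dictionary as the memo (instead of a zero-initialized array) are assumptions made for the example.

```python
def memoized_fibonacci(n, memo=None):
    """Top-down Fibonacci with F(0) = F(1) = 1, caching results in a dict."""
    if memo is None:
        memo = {0: 1, 1: 1}  # base cases stored up front
    if n in memo:  # check the memo before making the calls
        return memo[n]
    # save the result to avoid recomputation
    memo[n] = memoized_fibonacci(n - 1, memo) + memoized_fibonacci(n - 2, memo)
    return memo[n]


print([memoized_fibonacci(i) for i in range(6)])
```

Each value F(i) is computed once, so the number of additions grows linearly in n rather than exponentially.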

(15)

Building up solutions to larger and larger subproblems

Bottom-Up-Fibonacci(n)
  if n < 2
    return 1
  a[0] = 1
  a[1] = 1
  for i = 2 … n
    a[i] = a[i-1] + a[i-2]
  return a[n]

F(5) F(4) F(3) F(2) F(1) F(0)
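A bottom-up version might look like the sketch below in Python. The name `bottom_up_fibonacci` is hypothetical; the table `a` mirrors the array filled from F(0) upward.

```python
def bottom_up_fibonacci(n):
    """Fill the table from small to large; F(0) = F(1) = 1."""
    if n < 2:
        return 1
    a = [0] * (n + 1)
    a[0] = a[1] = 1
    for i in range(2, n + 1):
        a[i] = a[i - 1] + a[i - 2]  # both entries are already computed
    return a[n]


print(bottom_up_fibonacci(5))
```

Because every entry depends only on the two before it, the loop order guarantees that no entry is read before it is written.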

(16)

Principle of Optimality

Any subpolicy of an optimum policy must itself be an optimum policy with regard to the initial and terminal states of the subpolicy

Two key properties of DP for optimization

Overlapping subproblems

Optimal substructure – an optimal solution can be constructed from optimal solutions to subproblems

Reduce search space (ignore non-optimal solutions)

If the optimal substructure (principle of optimality) does not hold, then it is incorrect to use DP

(17)

Shortest Path Problem

Input: a graph where the edges have positive costs

Output: a path from S to T with the smallest cost

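[Figure: a path from S through M to Taipei (T); the S-to-M part costs C_SM and the M-to-T part costs C_MT. Is there a path from S to M with cost C'_SM < C_SM?]

If the path costing C_SM + C_MT is the shortest path from S to T

→ the path with the cost C_SM must be a shortest path from S to M

Proof by “Cut-and-Paste” argument (proof by contradiction):

Suppose there exists a path from S to M with a smaller cost C'_SM. Then we can “cut” C_SM and “paste” C'_SM to get a path from S to T that is cheaper than the original, contradicting that the original path is shortest.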

(18)

Textbook Chapter 15.1 – Rod Cutting


(19)

Input: a rod of length n and a table of prices p_i for i = 1, …, n

Output: the maximum revenue r_n obtainable by cutting up the rod and selling the pieces

length i (m)  1 2 3 4 5
price p_i     1 5 8 9 10

Example: a 4m rod cut into 2m + 2m

(20)

A rod with the length = 4

length 𝑖 (m) 1 2 3 4 5

price 𝑝𝑖 1 5 8 9 10

4m → 9

3m 1m → 8 + 1 = 9

2m 2m → 5 + 5 = 10

1m 3m → 1 + 8 = 9

2m 1m 1m → 5 + 1 + 1 = 7

1m 2m 1m → 1 + 5 + 1 = 7

1m 1m 2m → 1 + 1 + 5 = 7

1m 1m 1m 1m → 1 + 1 + 1 + 1 = 4

(21)

A rod with the length = 𝑛

For each integer position, we can choose “cut” or “not cut”

There are 𝑛 – 1 positions for consideration

The total number of cutting results is 2^(n−1) = Θ(2^(n−1))

length 𝑖 (m) 1 2 3 4 5

price 𝑝𝑖 1 5 8 9 10

n

(22)

We use a recursive function to solve the subproblems

If we know the answer to the subproblem, can we get the answer to the original problem?

Optimal substructure – an optimal solution can be constructed from optimal solutions to subproblems

𝑟𝑛−𝑖 𝑟𝑖

no cut

cut at the i-th position (from left to right)

𝑟𝑛: the maximum revenue obtainable for a rod of length 𝑛

(23)

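Version 1

r_n = max(p_n, r_1 + r_{n−1}, r_2 + r_{n−2}, …, r_{n−1} + r_1)

p_n: no cut; r_i + r_{n−i}: cut at the i-th position (from left to right)

Version 2

try to reduce the number of subproblems → focus on the left-most cut

r_n = max over 1 ≤ i ≤ n of (p_i + r_{n−i})

p_i: left-most value; r_{n−i}: maximum value obtainable from the remaining part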

(24)

Focus on the left-most cut

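assume that we always cut from left to right → consider the first cut

[Figure: the candidates p_1 + r_{n−1}, p_2 + r_{n−2}, …, p_i + r_{n−i}; each combines a first piece with the optimal solution to the remaining subproblem, and the best candidate is the optimal solution]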

Rod cutting problem has optimal substructure

(25)

𝑇 𝑛 = time for running

Cut-Rod(p, n)

Cut-Rod(p, n)
  // base case
  if n == 0
    return 0
  // recursive case
  q = -∞
  for i = 1 to n
    q = max(q, p[i] + Cut-Rod(p, n - i))
  return q
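To see why the plain recursion is expensive, the sketch below ports it to Python and counts the calls. The `counter` argument is an addition for illustration only, and `prices` uses index 0 as an unused placeholder so that `prices[i]` is the price of a piece of length i.

```python
def cut_rod(p, n, counter):
    """Naive recursion; p[i] is the price of a piece of length i (p[0] unused)."""
    counter[0] += 1  # count every invocation, including base cases
    if n == 0:
        return 0
    q = float("-inf")
    for i in range(1, n + 1):
        q = max(q, p[i] + cut_rod(p, n - i, counter))
    return q


prices = [0, 1, 5, 8, 9, 10]
calls = [0]
best = cut_rod(prices, 4, calls)
print(best, calls[0])  # 10 16
```

The call count satisfies T(n) = 1 + T(0) + … + T(n−1) with T(0) = 1, which gives T(n) = 2^n; for n = 4 that is 16 calls.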

(26)

Rod cutting problem

Cut-Rod(p, n)
  // base case
  if n == 0
    return 0
  // recursive case
  q = -∞
  for i = 1 to n
    q = max(q, p[i] + Cut-Rod(p, n - i))
  return q

[Recursion tree for CR(4): CR(4) calls CR(3), CR(2), CR(1), and CR(0); each CR(j) in turn calls CR(j−1), …, CR(0)]

Calling overlapping subproblems results in poor efficiency

(27)

Idea: use space for better time efficiency

Rod cutting problem has overlapping subproblems and optimal substructure

→ can be solved by DP

When the number of subproblems is polynomial, the time complexity is polynomial using DP

DP algorithm

Top-down: solve overlapping subproblems recursively with memoization

Bottom-up: build up solutions to larger and larger subproblems

(28)

Top-Down with Memoization

Solve recursively and memoize the subsolutions (fill the table out of order)

Suitable when not all subproblems need to be solved

Bottom-Up with Tabulation

Fill the table from small to large

Suitable when every small subproblem needs to be solved

[Figure: the table f(0), f(1), f(2), …, f(n), shown once for each method]

(29)

Memoized-Cut-Rod(p, n)
  // initialize memo (an array r[] to keep max revenue)
  r[0] = 0
  for i = 1 to n
    r[i] = -∞ // r[i] = max revenue for rod with length = i
  return Memoized-Cut-Rod-Aux(p, n, r)

Memoized-Cut-Rod-Aux(p, n, r)
  if r[n] >= 0
    return r[n] // return the saved solution
  q = -∞
  for i = 1 to n
    q = max(q, p[i] + Memoized-Cut-Rod-Aux(p, n-i, r))
  r[n] = q // update memo
  return q

𝑇 𝑛 = time for running

Memoized-Cut-Rod(p, n)
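A possible Python rendering of the memoized version is sketched below. It is an illustration under two assumptions: `None` marks an unsolved entry (in place of −∞), and the helper is written as a nested function `aux`.

```python
def memoized_cut_rod(p, n):
    """Top-down rod cutting; p[i] is the price of length i (p[0] unused)."""
    r = [0] + [None] * n  # r[0] = 0, everything else not yet solved

    def aux(m):
        if r[m] is not None:  # return the saved solution
            return r[m]
        r[m] = max(p[i] + aux(m - i) for i in range(1, m + 1))  # update memo
        return r[m]

    return aux(n)


print(memoized_cut_rod([0, 1, 5, 8, 9, 10], 5))
```

Each subproblem is solved once and does O(n) work in its loop, so the running time drops from exponential to O(n^2).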

(30)

Bottom-Up-Cut-Rod(p, n)
  r[0] = 0
  for j = 1 to n // compute r[1], r[2], ... in order
    q = -∞
    for i = 1 to j
      q = max(q, p[i] + r[j - i])
    r[j] = q
  return r[n]

𝑇 𝑛 = time for running

Bottom-Up-Cut-Rod(p, n)
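The bottom-up table fill can be sketched in Python as below. Returning the whole table `r` rather than only `r[n]` is a choice made for this example, so the intermediate values are visible.

```python
def bottom_up_cut_rod(p, n):
    """Return the table r where r[j] is the max revenue for a rod of length j."""
    r = [0] * (n + 1)
    for j in range(1, n + 1):  # compute r[1], r[2], ... in order
        r[j] = max(p[i] + r[j - i] for i in range(1, j + 1))
    return r


print(bottom_up_cut_rod([0, 1, 5, 8, 9, 10], 5))  # [0, 1, 5, 8, 10, 13]
```

The two nested loops perform 1 + 2 + … + n = n(n+1)/2 iterations, which is Θ(n^2).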

(31)

Input: a rod of length n and a table of prices p_i for i = 1, …, n

Output: the maximum revenue r_n obtainable and the list of cut pieces

length i (m)  1 2 3 4 5
price p_i     1 5 8 9 10

Example: a 4m rod cut into 2m + 2m

(32)

Add an array cut[] to keep the cutting positions

Extended-Bottom-Up-Cut-Rod(p, n)
  r[0] = 0
  for j = 1 to n // compute r[1], r[2], ... in order
    q = -∞
    for i = 1 to j
      if q < p[i] + r[j - i]
        q = p[i] + r[j - i]
        cut[j] = i // the best first cut for len j rod
    r[j] = q
  return r[n], cut

Print-Cut-Rod-Solution(p, n)
  (r, cut) = Extended-Bottom-Up-Cut-Rod(p, n)
  while n > 0
    print cut[n]
    n = n - cut[n] // remove the first piece
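The reconstruction step can be sketched in Python as follows. The function names are hypothetical, and the second function returns the list of pieces instead of printing them so the result can be inspected.

```python
def extended_bottom_up_cut_rod(p, n):
    """Return (r, cut): max revenues and the best first piece for each length."""
    r = [0] * (n + 1)
    cut = [0] * (n + 1)
    for j in range(1, n + 1):
        q = float("-inf")
        for i in range(1, j + 1):
            if q < p[i] + r[j - i]:
                q = p[i] + r[j - i]
                cut[j] = i  # the best first cut for a rod of length j
        r[j] = q
    return r, cut


def cut_rod_pieces(p, n):
    """Return (max revenue, list of piece lengths) for a rod of length n."""
    r, cut = extended_bottom_up_cut_rod(p, n)
    revenue = r[n]
    pieces = []
    while n > 0:
        pieces.append(cut[n])
        n -= cut[n]  # remove the first piece
    return revenue, pieces


print(cut_rod_pieces([0, 1, 5, 8, 9, 10], 4))  # (10, [2, 2])
```

Storing only the first cut per length is enough: the remaining pieces are recovered by following `cut` on the shorter rod that is left.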

(33)

f(0) f(1) f(2) f(n)

Top-Down with Memoization

Better when some subproblems need not be solved at all

Solve only the required parts of subproblems

Bottom-Up with Tabulation

Better when all subproblems must be solved at least once

Typically outperforms the top-down method by a constant factor

No overhead for recursive calls

Less overhead for maintaining the table

f(0) f(1) f(2) f(n)

F(5) F(4) F(3) F(2) F(1) F(0)

(34)

▪ Approach 1: approximate via (#subproblems) * (#choices for each subproblem)

For rod cutting

#subproblems = n

#choices for each subproblem = O(n)

→ T(n) is about O(n^2)

▪ Approach 2: approximate via subproblem graphs

(35)

The size of the subproblem graph allows us to estimate the time complexity of the DP algorithm

A graph G = (V, E) (V: vertices, E: edges) illustrates the set of subproblems involved and how subproblems depend on one another

|V|: #subproblems

A subproblem is run only once

|E|: the sum, over all subproblems, of the #subsubproblems each one needs

Time complexity: linear in |E| + |V|, i.e., O(|E| + |V|)

F(5) F(4)

F(3)

F(2)

F(1)

F(0)

Bottom-up: Reverse Topological Sort

Top-down: Depth First Search

Graph Algorithm (taught later)

(36)

1. Characterize the structure of an optimal solution

Overlapping subproblems: revisit same subproblems

Optimal substructure: an optimal solution to the problem contains within it optimal solutions to subproblems

2. Recursively define the value of an optimal solution

Express the solution of the original problem in terms of optimal solutions for subproblems

3. Compute the value of an optimal solution

typically in a bottom-up fashion

4. Construct an optimal solution from computed information

Steps 3 and 4 may be combined

(37)

1. Characterize the structure of an optimal solution

2. Recursively define the value of an optimal solution

3. Compute the value of an optimal solution

4. Construct an optimal solution from computed information

(38)

Step 1-Q1: What can be the subproblems?

Step 1-Q2: Does it exhibit optimal substructure? (an optimal solution can be represented by the optimal solutions to subproblems)

Yes. → continue

No. → go to Step 1-Q1 or there is no DP solution for this problem

Rod Cutting Problem

Input: a rod of length n and a table of prices p_i for i = 1, …, n

Output: the maximum revenue r_n obtainable

(39)

Step 1-Q1: What can be the subproblems?

Subproblems: Cut-Rod(0), Cut-Rod(1), …, Cut-Rod(n-1)

Cut-Rod(i): rod cutting problem with length-i rod

Goal: Cut-Rod(n)

Suppose we know the optimal solution to Cut-Rod(i), there are i cases:

Case 1: the first segment in the solution has length 1

Removing a segment of length 1 from the solution leaves an optimal solution to Cut-Rod(i-1)

Case 2: the first segment in the solution has length 2

Removing a segment of length 2 from the solution leaves an optimal solution to Cut-Rod(i-2)

:

Case i: the first segment in the solution has length i

Rod Cutting Problem

Input: a rod of length n and a table of prices p_i for i = 1, …, n

Output: the maximum revenue r_n obtainable

(40)

Step 1-Q2: Does it exhibit optimal structure? (an optimal solution can be represented by the optimal solutions to subproblems)

Yes. Prove by contradiction.

Rod Cutting Problem

Input: a rod of length 𝑛 and a table of prices 𝑝𝑖 for 𝑖 = 1, … , 𝑛 Output: the maximum revenue 𝑟𝑛 obtainable

(41)

Suppose we know the optimal solution to Cut-Rod(i), there are i cases:

Case 1: the first segment in the solution has length 1

Removing a segment of length 1 from the solution leaves an optimal solution to Cut-Rod(i-1)

Case 2: the first segment in the solution has length 2

Removing a segment of length 2 from the solution leaves an optimal solution to Cut-Rod(i-2)

:

Case i: the first segment in the solution has length i

Removing a segment of length i from the solution leaves an optimal solution to Cut-Rod(0)

Recursively define the value

r_0 = 0; r_i = max over 1 ≤ j ≤ i of (p_j + r_{i−j}) for i ≥ 1

Rod Cutting Problem

Input: a rod of length n and a table of prices p_i for i = 1, …, n

Output: the maximum revenue r_n obtainable

(42)

Bottom-up method: solve smaller subproblems first

Rod Cutting Problem

Input: a rod of length n and a table of prices p_i for i = 1, …, n

Output: the maximum revenue r_n obtainable

i     0 1 2 3 4 5 … n
r[i]

Bottom-Up-Cut-Rod(p, n)
  r[0] = 0
  for j = 1 to n // compute r[1], r[2], ... in order
    q = -∞
    for i = 1 to j
      q = max(q, p[i] + r[j - i])
    r[j] = q
  return r[n]

(43)

Bottom-up method: solve smaller subproblems first

Rod Cutting Problem

Input: a rod of length n and a table of prices p_i for i = 1, …, n

Output: the maximum revenue r_n obtainable

length i   1 2 3 4 5
price p_i  1 5 8 9 10

i       0  1  2  3  4   5  …  n
r[i]    0  1  5  8  10
cut[i]  0  1  2  3  2

(44)

Cut-Rod(p, n)
  r[0] = 0
  for j = 1 to n // compute r[1], r[2], ... in order
    q = -∞
    for i = 1 to j
      if q < p[i] + r[j - i]
        q = p[i] + r[j - i]
        cut[j] = i // the best first cut for len j rod
    r[j] = q
  return r[n], cut

Print-Cut-Rod-Solution(p, n)
  (r, cut) = Cut-Rod(p, n)
  while n > 0
    print cut[n]
    n = n - cut[n] // remove the first piece

(45)


(46)

▪ Input: the postage n and the stamps with values v_1, v_2, …, v_k

▪ Output: the minimum number of stamps to cover the postage

(47)

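The optimal solution S_n can be recursively defined as

S_0 = 0; S_n = 1 + min over all i with v_i ≤ n of S_{n − v_i} for n ≥ 1

Stamp(v, n)
  if n == 0 // base case
    return 0
  r_min = ∞
  for i = 1 to k // recursive case
    if v[i] <= n // skip stamps worth more than the remaining postage
      r[i] = Stamp(v, n - v[i])
      if r[i] < r_min
        r_min = r[i]
  return r_min + 1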

(48)

Subproblems

S(i): the min #stamps with postage i

Goal: S(n)

Optimal substructure: suppose we know the optimal solution to S(i), there are k cases:

Case 1: there is a stamp with v_1 in OPT

Removing one stamp of value v_1 from the solution leaves an optimal solution to S(i-v[1])

Case 2: there is a stamp with v_2 in OPT

Removing one stamp of value v_2 from the solution leaves an optimal solution to S(i-v[2])

:

Case k: there is a stamp with v_k in OPT

Removing one stamp of value v_k from the solution leaves an optimal solution to S(i-v[k])

Stamp Problem

Input: the postage n and the stamps with values v_1, v_2, …, v_k

Output: the minimum number of stamps to cover the postage

(49)

Suppose we know the optimal solution to S(i), there are k cases:

Case 1: there is a stamp with v_1 in OPT

Removing one stamp of value v_1 from the solution leaves an optimal solution to S(i-v[1])

Case 2: there is a stamp with v_2 in OPT

Removing one stamp of value v_2 from the solution leaves an optimal solution to S(i-v[2])

:

Case k: there is a stamp with v_k in OPT

Removing one stamp of value v_k from the solution leaves an optimal solution to S(i-v[k])

Recursively define the value

Stamp Problem

Input: the postage n and the stamps with values v_1, v_2, …, v_k

Output: the minimum number of stamps to cover the postage

(50)

Bottom-up method: solve smaller subproblems first

Stamp Problem

Input: the postage n and the stamps with values v_1, v_2, …, v_k

Output: the minimum number of stamps to cover the postage

i     0 1 2 3 4 5 … n
S[i]

Stamp(v, n)
  S[0] = 0
  for i = 1 to n // compute S[1], S[2], ... in order
    r_min = ∞
    for j = 1 to k
      if v[j] <= i and 1 + S[i - v[j]] < r_min
        r_min = 1 + S[i - v[j]]
    S[i] = r_min
  return S[n]

(51)

Stamp(v, n)
  S[0] = 0
  for i = 1 to n
    r_min = ∞
    for j = 1 to k
      if v[j] <= i and 1 + S[i - v[j]] < r_min
        r_min = 1 + S[i - v[j]]
        B[i] = j // backtracking for stamp with v[j]
    S[i] = r_min
  return S[n], B

Print-Stamp-Selection(v, n)
  (S, B) = Stamp(v, n)
  while n > 0
    print B[n]
    n = n - v[B[n]]
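The stamp DP with backtracking can be sketched in Python as below. The stamp values 1, 5, 7 are a made-up example, the function names are hypothetical, and returning `None` for an unreachable postage is a design choice for this sketch.

```python
def stamp(values, n):
    """Return (S, B): S[i] = min #stamps for postage i, B[i] = index of one stamp used."""
    INF = float("inf")
    S = [0] + [INF] * n
    B = [None] * (n + 1)
    for i in range(1, n + 1):
        for j, v in enumerate(values):
            if v <= i and S[i - v] + 1 < S[i]:
                S[i] = S[i - v] + 1
                B[i] = j  # backtracking pointer: stamp j is used for postage i
    return S, B


def stamp_selection(values, n):
    """Return the list of stamp values in one optimal solution, or None if impossible."""
    S, B = stamp(values, n)
    if S[n] == float("inf"):
        return None
    chosen = []
    while n > 0:
        chosen.append(values[B[n]])
        n -= values[B[n]]
    return chosen


print(stamp_selection([1, 5, 7], 10))  # [5, 5]
```

With these values, always taking the largest stamp first would give 7 + 1 + 1 + 1 (four stamps), while the table finds 5 + 5 (two stamps).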

(52)

Textbook Chapter 15.2 – Matrix-chain multiplication


(53)

Input: a sequence of n matrices A_1, …, A_n

Output: the product of A_1 A_2 … A_n

A_1 A_2 A_3 A_4 …… A_n

A_1.cols = A_2.rows

A_1 and A_2 are compatible.

(54)

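Let A be a p × q matrix and B a q × r matrix, so that C = AB is a p × r matrix

Each entry of C takes q multiplications

There are pr entries in total, so computing C takes pqr multiplications

Matrix multiplication is associative: A(BC) = (AB)C. The time required to obtain A × B × C can depend on which two matrices are multiplied first.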
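As a small worked example of how the order matters, the Python sketch below compares the two parenthesizations of a three-matrix product. The dimensions (10 × 100, 100 × 5, 5 × 50) are chosen only for illustration, and `multiply_cost` is a hypothetical helper that counts pqr scalar multiplications for a (p × q) times (q × r) product.

```python
def multiply_cost(p, q, r):
    """Scalar multiplications needed for a (p x q) times (q x r) product."""
    return p * q * r


# A is 10 x 100, B is 100 x 5, C is 5 x 50 (illustrative dimensions)
ab_then_c = multiply_cost(10, 100, 5) + multiply_cost(10, 5, 50)    # (AB)C
a_then_bc = multiply_cost(100, 5, 50) + multiply_cost(10, 100, 50)  # A(BC)
print(ab_then_c, a_then_bc)  # 7500 75000
```

Both orders produce the same 10 × 50 result, yet one needs ten times as many scalar multiplications as the other.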

(55)

Overall time is

= =

(56)

Overall time is

= =

(57)

Input: a sequence of integers l_0, l_1, …, l_n

l_{i−1} is the number of rows of matrix A_i

l_i is the number of columns of matrix A_i

Output: an order of performing the n − 1 matrix multiplications that obtains the product A_1 A_2 … A_n with the minimum number of operations

A_1 A_2 A_3 A_4 …… A_n

A_1.cols = A_2.rows

A_1 and A_2 are compatible.

We do not need to compute the product itself, only find the fastest way to compute it!

(58)

P_n: how many ways for n matrices to be multiplied

The solution of P_n is the Catalan numbers, Ω(4^n / n^(3/2)), or is also Ω(2^n) (Exercise 15.2-3)
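To get a feel for how quickly the number of multiplication orders grows, the sketch below counts them in Python. It assumes the standard counting argument (split the chain after the k-th matrix and parenthesize each side independently); the function name is hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def num_parenthesizations(n):
    """Number of ways to fully parenthesize a product of n matrices."""
    if n == 1:
        return 1
    # choose where the last multiplication splits the chain
    return sum(
        num_parenthesizations(k) * num_parenthesizations(n - k)
        for k in range(1, n)
    )


print([num_parenthesizations(n) for n in range(1, 7)])  # [1, 1, 2, 5, 14, 42]
```

These are the Catalan numbers shifted by one position, so trying every order quickly becomes infeasible as n grows.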

(59)

Subproblems

M(i, j): the min #operations for obtaining the product of 𝐴𝑖 … 𝐴𝑗

Goal: M(1, n)

Optimal substructure: suppose we know the OPT to M(i, j), there are j − i cases, one for each k with i ≤ k < j:

Case k: there is a cut right after A_k in OPT

(A_i A_{i+1} … A_k)(A_{k+1} A_{k+2} … A_j), i ≤ k < j

The operation counts of the left and right parts are the optimal solutions to M(i, k) and M(k+1, j)

Matrix-Chain Multiplication Problem

Input: a sequence of integers l_0, l_1, …, l_n indicating the dimensions of the A_i

Output: an order of matrix multiplications with the minimum number of operations

(60)

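Suppose we know the optimal solution to M(i, j), there are j − i cases, one for each k with i ≤ k < j:

Case k: there is a cut right after A_k in OPT

The operation counts of the left and right parts are the optimal solutions to M(i, k) and M(k+1, j)

A_{i..k} = A_i A_{i+1} … A_k has A_i.rows = l_{i−1} rows and A_k.cols = l_k columns

A_{k+1..j} = A_{k+1} A_{k+2} … A_j has A_{k+1}.rows = l_k rows and A_j.cols = l_j columns

Recursively define the value

M(i, j) = 0 if i = j; otherwise M(i, j) = min over i ≤ k < j of (M(i, k) + M(k+1, j) + l_{i−1} l_k l_j)

Matrix-Chain Multiplication Problem

Input: a sequence of integers l_0, l_1, …, l_n indicating the dimensions of the A_i

Output: an order of matrix multiplications with the minimum number of operations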

(61)

Bottom-up method: solve smaller subproblems first

How many subproblems to solve

#combination of the values 𝑖 and 𝑗 s.t. 1 ≤ 𝑖 ≤ 𝑗 ≤ 𝑛

Matrix-Chain Multiplication Problem

Input: a sequence of integers 𝑙0, 𝑙1, … , 𝑙𝑛indicating the dimensionality of 𝐴𝑖

Output: an order of matrix multiplications with the minimum number of operations

(62)

Matrix-Chain(n, l)
  initialize two tables M[1..n][1..n] and B[1..n-1][2..n]
  for i = 1 to n
    M[i][i] = 0 // boundary case
  for p = 2 to n // p is the chain length
    for i = 1 to n - p + 1 // all i, j combinations
      j = i + p - 1
      M[i][j] = ∞
      for k = i to j - 1 // find the best k
        q = M[i][k] + M[k + 1][j] + l[i - 1] * l[k] * l[j]
        if q < M[i][j]
          M[i][j] = q
  return M

(63)

How to decide the order of the matrix

multiplication?

1 2 3 4 5 6 n

1 0

2 0

3 0

4 0

5 0

6 0

0

n 0

(64)

Matrix-Chain(n, l)
  initialize two tables M[1..n][1..n] and B[1..n-1][2..n]
  for i = 1 to n
    M[i][i] = 0 // boundary case
  for p = 2 to n // p is the chain length
    for i = 1 to n - p + 1 // all i, j combinations
      j = i + p - 1
      M[i][j] = ∞
      for k = i to j - 1 // find the best k
        q = M[i][k] + M[k + 1][j] + l[i - 1] * l[k] * l[j]
        if q < M[i][j]
          M[i][j] = q
          B[i][j] = k // backtracking
  return M and B

Print-Optimal-Parens(B, i, j)
  if i == j
    print A_i
  else
    print “(”
    Print-Optimal-Parens(B, i, B[i][j])
    Print-Optimal-Parens(B, B[i][j] + 1, j)
    print “)”
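The table fill and the parenthesization step can be sketched together in Python as below. Function names are hypothetical; the tables use 1-based indices (row and column 0 are unused padding) so the code lines up with the pseudocode, and the parenthesization is returned as a string rather than printed.

```python
def matrix_chain(l):
    """l[i-1] x l[i] is the shape of A_i. Return (M, B) with 1-based indices."""
    n = len(l) - 1
    M = [[0] * (n + 1) for _ in range(n + 1)]  # M[i][i] = 0 is the boundary case
    B = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):  # chain length
        for i in range(1, n - length + 2):
            j = i + length - 1
            M[i][j] = float("inf")
            for k in range(i, j):  # find the best split point
                q = M[i][k] + M[k + 1][j] + l[i - 1] * l[k] * l[j]
                if q < M[i][j]:
                    M[i][j] = q
                    B[i][j] = k  # remember the split for backtracking
    return M, B


def optimal_parens(B, i, j):
    """Return the optimal parenthesization of A_i ... A_j as a string."""
    if i == j:
        return f"A{i}"
    k = B[i][j]
    return "(" + optimal_parens(B, i, k) + optimal_parens(B, k + 1, j) + ")"


dims = [30, 35, 15, 5, 10, 20, 25]  # six matrices: 30x35, 35x15, ..., 20x25
M, B = matrix_chain(dims)
print(M[1][6], optimal_parens(B, 1, 6))  # 15125 ((A1(A2A3))((A4A5)A6))
```

There are Θ(n^2) table entries and each tries O(n) split points, so the running time is O(n^3).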

(65)

Matrix     A1       A2       A3      A4      A5       A6
Dimension  30 x 35  35 x 15  15 x 5  5 x 10  10 x 20  20 x 25

M[i][j]:

i\j  1  2       3      4      5       6
1    0  15,750  7,875  9,375  11,875  15,125
2       0       2,625  4,375  7,125   10,500
3               0      750    2,500   5,375
4                      0      1,000   3,500
5                             0       5,000
6                                     0

B[i][j]:

i\j  2  3  4  5  6
1    1  1  3  3  3
2       2  3  3  3
3          3  3  3
4             4  5
5                5

(66)

Textbook Chapter 15.4 – Longest common subsequence

Textbook Problem 15.5 – Edit distance

(67)

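Each monkey speaks, and its speech is passed through a speech recognition system. The monkey whose utterance is closest to the English word "banana" wins.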

How to evaluate the similarity between two sequences?

aeniqadikjaz

svkbrlvpnzanczyqza

banana

(68)

Input: two sequences

Output: longest common subsequence of two sequences

The maximum-length sequence of characters that appear left-to-right (but not necessarily a continuous string) in both sequences

X = banana
Y = svkbrlvpnzanczyqza
X → ---ba---n-an---a
Y → svkbrlvpnzanczyqza
LCS length: 5

X = banana
Y = aeniqadikjaz
X → ba-n--an---a-
Y → -aeniqadikjaz
LCS length: 4

The infinite monkey theorem: a monkey hitting keys at random for an infinite amount of time will almost surely type a given text
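The LCS length can be computed with a two-dimensional table over prefixes. The Python sketch below uses the standard recurrence (extend the diagonal on a character match, otherwise take the better of dropping one character from either string); the function name is hypothetical.

```python
def lcs_length(x, y):
    """Length of the longest common subsequence of strings x and y."""
    m, n = len(x), len(y)
    # L[i][j] = LCS length of the prefixes x[:i] and y[:j]
    L = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                L[i][j] = L[i - 1][j - 1] + 1
            else:
                L[i][j] = max(L[i - 1][j], L[i][j - 1])
    return L[m][n]


print(lcs_length("banana", "svkbrlvpnzanczyqza"))  # 5
print(lcs_length("banana", "aeniqadikjaz"))        # 4
```

By this measure the first utterance is closer to "banana", since it shares a longer common subsequence with it.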

(69)

Input: two sequences

Output: the minimum cost of transformation from X to Y

Quantifier of the dissimilarity of two strings

X = banana
Y = svkbrlvpnzanczyqza
X → ---ba---n-an---a
Y → svkbrlvpnzanczyqza
12 insertions, 1 substitution → edit distance 13

X = banana
Y = aeniqadikjaz
X → ba-n--an---a-
Y → -aeniqadikjaz
1 deletion, 7 insertions, 1 substitution → edit distance 9

(70)

Input: two sequences

Output: the minimal cost M_{m,n} for aligning two sequences

Cost = #insertions × C_INS + #deletions × C_DEL + #substitutions × C_{p,q}

(71)

Subproblems

SA(i, j): sequence alignment between prefix strings 𝑥1, … , 𝑥𝑖 and 𝑦1, … , 𝑦𝑗

Goal: SA(m, n)

Optimal substructure: suppose OPT is an optimal solution to SA(i, j), there are 3 cases:

Case 1: x_i and y_j are aligned in OPT (match or substitution)

OPT \ {(x_i, y_j)} is an optimal solution of SA(i-1, j-1)

Case 2: x_i is aligned with a gap in OPT (deletion)

OPT without x_i is an optimal solution of SA(i-1, j)

Case 3: y_j is aligned with a gap in OPT (insertion)

OPT without y_j is an optimal solution of SA(i, j-1)

Sequence Alignment Problem

Input: two sequences

Output: the minimal cost M_{m,n} for aligning two sequences

(72)

Suppose OPT is an optimal solution to SA(i, j), there are 3 cases:

Case 1: x_i and y_j are aligned in OPT (match or substitution)

OPT \ {(x_i, y_j)} is an optimal solution of SA(i-1, j-1)

Case 2: x_i is aligned with a gap in OPT (deletion)

OPT without x_i is an optimal solution of SA(i-1, j)

Case 3: y_j is aligned with a gap in OPT (insertion)

OPT without y_j is an optimal solution of SA(i, j-1)

Recursively define the value

M_{0,j} = j × C_INS; M_{i,0} = i × C_DEL

M_{i,j} = min(M_{i−1,j−1} + C_{x_i,y_j}, M_{i−1,j} + C_DEL, M_{i,j−1} + C_INS) for i, j ≥ 1

Sequence Alignment Problem

Input: two sequences

Output: the minimal cost M_{m,n} for aligning two sequences

(73)

Bottom-up method: solve smaller subproblems first

X\Y 0 1 2 3 4 5 n

0 1 : m

Sequence Alignment Problem Input: two sequences

Output: the minimal cost 𝑀𝑚,𝑛 for aligning two sequences

(74)

Bottom-up method: solve smaller subproblems first

X\Y   0   1   2   3   4   5   6   7   8   9  10  11  12
0     0   4   8  12  16  20  24  28  32  36  40  44  48
1     4   7  11  15  19  23  27  31  35  39  43  47  51
2     8   4   8  12  16  20  23  27  31  35  39  43  47
3    12   8  12   8  12  16  20  24  28  32  36  40  44
4    16  12  15  12  15  19  16  20  24  28  32  36  40
5    20  16  19  15  19  22  20  23  27  31  35  39  43
6    24  20  23  19  22  26  22  26  30  34  38  35  39

Y (columns 1 to 12): a e n i q a d i k j a z

X (rows 1 to 6): b a n a n a

Sequence Alignment Problem

Input: two sequences

Output: the minimal cost M_{m,n} for aligning two sequences

(75)

Seq-Align(X, Y, CDEL, CINS, Cp,q)
  for j = 0 to n
    M[0][j] = j * CINS // |X|=0, cost=|Y|*penalty
  for i = 1 to m
    M[i][0] = i * CDEL // |Y|=0, cost=|X|*penalty
  for i = 1 to m
    for j = 1 to n
      M[i][j] = min(M[i-1][j-1]+Cxi,yj, M[i-1][j]+CDEL, M[i][j-1]+CINS)

Bottom-up method: solve smaller subproblems first

Sequence Alignment Problem

Input: two sequences

Output: the minimal cost M_{m,n} for aligning two sequences
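The table fill for sequence alignment can be sketched in Python as follows. The unit costs (insertion 1, deletion 1, substitution 1, match 0) are an assumption made for this example and give the classic edit distance; they are not the costs behind the table shown on the slides. The names `seq_align` and `unit_sub` are hypothetical.

```python
def seq_align(x, y, c_del, c_ins, sub_cost):
    """Return the cost table M; M[i][j] aligns the prefixes x[:i] and y[:j]."""
    m, n = len(x), len(y)
    M = [[0] * (n + 1) for _ in range(m + 1)]
    for j in range(n + 1):
        M[0][j] = j * c_ins  # |X| = 0: insert every character of Y
    for i in range(1, m + 1):
        M[i][0] = i * c_del  # |Y| = 0: delete every character of X
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            M[i][j] = min(
                M[i - 1][j - 1] + sub_cost(x[i - 1], y[j - 1]),  # match/substitution
                M[i - 1][j] + c_del,                             # deletion
                M[i][j - 1] + c_ins,                             # insertion
            )
    return M


def unit_sub(p, q):
    """Assumed substitution cost: 0 for a match, 1 otherwise."""
    return 0 if p == q else 1


print(seq_align("kitten", "sitting", 1, 1, unit_sub)[-1][-1])  # 3
```

Passing the substitution cost as a function keeps the sketch general: a character-pair-specific cost C_{p,q} can be supplied without changing the table fill.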

(76)

Bottom-up method: solve smaller subproblems first

Sequence Alignment Problem

Input: two sequences

Output: the minimal cost M_{m,n} for aligning two sequences

[Table: the filled cost table for X = banana and Y = aeniqadikjaz, repeated from the previous slide]

(77)

Bottom-up method: solve smaller subproblems first

Sequence Alignment Problem

Input: two sequences

Output: the minimal cost M_{m,n} for aligning two sequences

Find-Solution(m, n)
  if m = 0 or n = 0
    return {}
  v = min(M[m-1][n-1] + Cxm,yn, M[m-1][n] + CDEL, M[m][n-1] + CINS)
  if v = M[m-1][n] + CDEL // ↑: deletion
    return Find-Solution(m-1, n)
  if v = M[m][n-1] + CINS // ←: insertion
    return Find-Solution(m, n-1)

(78)

Find-Solution(m, n)
  if m = 0 or n = 0
    return {}
  v = min(M[m-1][n-1] + Cxm,yn, M[m-1][n] + CDEL, M[m][n-1] + CINS)
  if v = M[m-1][n] + CDEL // ↑: deletion
    return Find-Solution(m-1, n)
  if v = M[m][n-1] + CINS // ←: insertion
    return Find-Solution(m, n-1)
  return {(m, n)} ∪ Find-Solution(m-1, n-1) // ↖: match/substitution

Seq-Align(X, Y, CDEL, CINS, Cp,q)
  for j = 0 to n
    M[0][j] = j * CINS // |X|=0, cost=|Y|*penalty
  for i = 1 to m
    M[i][0] = i * CDEL // |Y|=0, cost=|X|*penalty
  for i = 1 to m
    for j = 1 to n
      M[i][j] = min(M[i-1][j-1]+Cxi,yj, M[i-1][j]+CDEL, M[i][j-1]+CINS)
  return M[m][n]
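The traceback can be sketched in Python as below, written as a loop instead of recursion. The function name `align`, the default unit costs, and the return value (total cost plus the list of aligned index pairs, 1-based) are assumptions for this example; the order of checks follows the pseudocode (deletion, then insertion, then match/substitution).

```python
def align(x, y, c_del=1, c_ins=1, c_sub=1):
    """Return (min cost, aligned pairs (i, j)) with 1-based positions in x and y."""
    m, n = len(x), len(y)

    def sub(p, q):
        return 0 if p == q else c_sub  # assumed cost: free match, flat substitution

    M = [[0] * (n + 1) for _ in range(m + 1)]
    for j in range(n + 1):
        M[0][j] = j * c_ins
    for i in range(1, m + 1):
        M[i][0] = i * c_del
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            M[i][j] = min(M[i - 1][j - 1] + sub(x[i - 1], y[j - 1]),
                          M[i - 1][j] + c_del,
                          M[i][j - 1] + c_ins)

    # walk back from (m, n) following the choice that produced each entry
    pairs = []
    i, j = m, n
    while i > 0 and j > 0:
        if M[i][j] == M[i - 1][j] + c_del:    # up: deletion
            i -= 1
        elif M[i][j] == M[i][j - 1] + c_ins:  # left: insertion
            j -= 1
        else:                                 # diagonal: match/substitution
            pairs.append((i, j))
            i, j = i - 1, j - 1
    pairs.reverse()
    return M[m][n], pairs


print(align("abc", "abc"))  # (0, [(1, 1), (2, 2), (3, 3)])
```

The traceback visits at most m + n table entries, so recovering the alignment adds only O(m + n) time on top of the O(mn) table fill.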

(79)


(80)

Course Website: http://ada17.csie.org

Email: ada-ta@csie.ntu.edu.tw

Important announcements will be sent to your @ntu.edu.tw mailbox and posted to the course website
