Tree partition(1/2)

(1)

R99945020 林澤豪 F98942047 許芷榕 R00922113 謝宗潛 R98922144 駱家淮

(2)

Outline

• 1. What is the tree partition problem?

• 2. Tree partition history

• 3. Tools used here

– FTEST0

– M(P) and MSEARCH

• 4. Path partitioning

• 5. Tree partitioning

(3)

Tree partition(1/2)

+ = ??

(4)

Tree partition(2/2)

Min-max problem Max-min problem

(5)

History

Year | Complexity | Note
1981 | Time: O(n (log n)²), space: O(n log n) | [MTZC], N. Megiddo; will be improved here
1981 | Time: O(k² rd(T) + kn) | [PS], Y. Perl
1982 | Time: O(k³ rd(T) + kn) | [BPS], R. I. Becker and Y. Perl
1983 | Time and space: O(n log n) | [FJ1]; will be improved here
1983 | Time: O(n (log n)²) if the tree is a path, O(n (log n)³) for a general tree | [M]
1985 | Time: O(k rd(T)(k + log Δ) + n) | [PV], Y. Perl; improved [BPS]

Δ is the maximum degree in the tree; rd(T) is the radius of the tree.

(6)

FTEST0

[Figure: example tree with vertex weights]

(7)

FTEST0

• λ=5

• k=4

[Figure: FTEST0 example on a weighted tree]

(8)

FTEST0

• λ=5

• k=4

[Figure: FTEST0 example on a weighted tree]

(9)

FTEST0

• λ=5

• k=4

[Figure: FTEST0 example on a weighted tree]

k=4 is reached, so FTEST0 returns “yes”
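A greedy feasibility test of this kind for the max-min problem can be sketched in Python as follows. This is a minimal sketch, not the slides' FTEST0 itself: it assumes the tree is given as a parent array in which every parent index is smaller than its children's indices, so a reverse scan visits children before parents. The name `ftest0_tree` and the example tree are illustrative and do not come from the slides.

```python
def ftest0_tree(weights, parent, k, lam):
    """Return True if the tree can be split into at least k components,
    each of weight >= lam (greedy, leaves first).

    Assumption: parent[0] == -1 (root) and parent[v] < v for every v > 0.
    """
    acc = list(weights)          # weight still attached at or below each vertex
    count = 0
    for v in range(len(weights) - 1, -1, -1):
        if acc[v] >= lam:
            count += 1           # cut the edge above v: one component is done
        elif parent[v] >= 0:
            acc[parent[v]] += acc[v]   # too light: merge into the parent
    return count >= k


# Hypothetical 6-vertex tree: vertex 0 is the root.
weights = [2, 3, 4, 1, 5, 2]
parent = [-1, 0, 0, 1, 1, 2]
print(ftest0_tree(weights, parent, k=3, lam=5))  # True
print(ftest0_tree(weights, parent, k=4, lam=5))  # False
```

Any leftover weight at the root that stays below λ can be merged into a neighbouring component, which is why only the count of completed components is compared with k.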

(10)

FTEST0(reverse)

• λ=9

• k=4

[Figure: example tree with vertex weights]

(11)

FTEST0(reverse)

• λ=9

• k=4

[Figure: FTEST0 (reverse) example on a weighted tree]

(12)

FTEST0(reverse)

• λ=9

• k=4

[Figure: FTEST0 (reverse) example on a weighted tree]

k=4 is reached, so the test returns “yes”

(13)

M(P)

[Figure: path P with vertex weights 9, 6, 11, 2, 1, 15, 7, 8]
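One way to represent a sorted matrix of subpath weights without storing it is through prefix sums. The Python sketch below is illustrative only: the helper name `make_entry` and the row/column layout (rows fix the first vertex, columns fix the last vertex counted from the far end) are assumptions. The layout was chosen because, for the path 9, 6, 11, 2, 1, 15, 7, 8, the quadrant corner values it produces (59, 3, 31, 28) agree with the value set on the first MSEARCH example slide.

```python
from itertools import accumulate


def make_entry(weights):
    """Return entry(i, j) for an implicit sorted matrix of a path.

    entry(i, j) is the weight of the subpath from vertex i to vertex
    n-1-j (0-based), or 0 if that subpath is empty. With nonnegative
    weights, values are nonincreasing along every row and column.
    """
    n = len(weights)
    prefix = [0] + list(accumulate(weights))

    def entry(i, j):
        last = n - 1 - j
        return prefix[last + 1] - prefix[i] if last >= i else 0

    return entry


weights = [9, 6, 11, 2, 1, 15, 7, 8]
entry = make_entry(weights)
print(entry(0, 0), entry(3, 3))  # 59 3  (largest and smallest of the top-left quadrant)
print(entry(4, 0), entry(0, 4))  # 31 28 (largest of the two off-diagonal quadrants)
```

Each entry costs O(1) after O(n) preprocessing, so the n × n matrix never has to be written out.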

(14)

MSEARCH

(15)

MSEARCH

(16)

MSEARCH

(17)

MSEARCH

• {59, 3, 31, 0, 28, 0, 0, 0}

• k=3

• Median: (3+0)/2 = 1.5

• FTEST0: ok

• Set λ₁=1.5

• Discard matrices

[Figure: path P with vertex weights 9, 6, 11, 2, 1, 15, 7, 8]

(18)

MSEARCH

• {59, 3, 31, 0, 28, 0}

• k=3

• Median: (28+3)/2 = 15.5

• FTEST0: no

• Set λ₂=15.5

• No matrices are discarded

[Figure: path P with vertex weights 9, 6, 11, 2, 1, 15, 7, 8]

(19)

MSEARCH

• {59, 45, 42, 25, 31, 22, 15, 0, 44, 23, 27, 3, 16, 0, 0, 0, 28, 20, 11, 0, 17, 0, 0, 0}

• k=3

• Median: 16.5

• Discard matrices

[Figure: path P with vertex weights 9, 6, 11, 2, 1, 15, 7, 8]

(20)

MSEARCH

• {15, 0, 16, 0, 27, 3, 11, 0, 17, 0}

• k=3

• Median: (11+3)/2 = 7

• Set λ₁=7

• No matrices are discarded

[Figure: path P with vertex weights 9, 6, 11, 2, 1, 15, 7, 8]

(21)

MSEARCH

• {15, 8, 7, 0, 16, 15, 1, 0, 27, 18, 12, 3, 11, 2, 9, 0, 17, 11, 6, 0}

• k=3

• Median: (8+9)/2 = 8.5

• Set λ₁=8.5

• Discard half of the matrices

• Finally, the median obtained is 12

[Figure: path P with vertex weights 9, 6, 11, 2, 1, 15, 7, 8]
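The search that MSEARCH performs can be cross-checked on small inputs with a brute-force stand-in. The Python sketch below is not MSEARCH: it simply enumerates every subpath weight, sorts the candidates, and binary-searches them with a greedy path feasibility test, which takes O(n² log n) time. The function names are illustrative, and it assumes positive weights, k ≤ n, and that k counts components.

```python
def feasible(weights, k, lam):
    """Greedy path test: can the path be cut into at least k pieces,
    each of weight >= lam?"""
    count, run = 0, 0
    for w in weights:
        run += w
        if run >= lam:
            count, run = count + 1, 0
    return count >= k


def max_min_bruteforce(weights, k):
    """Largest subpath weight lam for which feasible(weights, k, lam) holds."""
    n = len(weights)
    candidates = sorted({sum(weights[i:j])
                         for i in range(n) for j in range(i + 1, n + 1)})
    lo, hi = 0, len(candidates) - 1   # the smallest candidate is feasible when k <= n
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if feasible(weights, k, candidates[mid]):
            lo = mid
        else:
            hi = mid - 1
    return candidates[lo]


path = [9, 6, 11, 2, 1, 15, 7, 8]
print(feasible(path, 3, 1.5), feasible(path, 3, 15.5))  # True False
print(max_min_bruteforce(path, k=3))                    # 15
```

The two feasibility results agree with the “ok” at 1.5 and the “no” at 15.5 in the example slides; MSEARCH reaches the same kind of answer while testing far fewer values.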

(22)

MSEARCH

• b_ij: the value for the j-th matrix in the i-th iteration

• B_i = Σ_{j=1}^{N} b_ij

• ncells(i): the number of cells remaining after cutting in iteration i

• ncells’(i): the number of cells remaining after deleting in iteration i

• k_i: the number of matrices added in iteration i

(23)

MSEARCH

• Proof:

• Claim: ncells’(i) ≤ 2B_i + 2

• ncells(i) = 4(ncells’(i−1) + k_i)

• By induction: ncells’(i−1) ≤ 2B_{i−1} + 2

• ncells(i) ≤ 4(2B_{i−1} + 2 + k_i) ≤ 4(2B_{i−1} + 2k_i + 2) ≤ 4(2B_i + 2) = 8B_i + 8

(24)

MSEARCH

• In the second part of MSEARCH, each iteration discards d1 + d2 + d3 cells:

– First time: d1 ≥ ncells(i) – B_i/2

– Second time: d2 ≥ ncells(i) – d1 – B_i/2

– Third time: d3 ≥ ncells(i) – d1 – d2 – B_i/2

• ncells’(i) = ncells(i) – d1 – d2 – d3 ≤ (15/8)(B_i + 1) < 2B_i + 2

(25)

MSEARCH

• In cross-splitting:

b_ij = 2(n_j m_j/c)^0.5 − 1 = O(2(n_j m_j/c)^0.5), where c starts from n_j m_j/4 and goes down to 1.

4 · (2^(log m_j) − 1)/(2 − 1) = O(m_j)

• In quartering-splitting: b_ij = O(log(n_j/m_j))

• The total splitting time complexity = O(m_j log(n_j/m_j))

(26)

PARTITIONING A PATH

(27)

Partitioning a path

• Given k, partition a path of vertices for max-min.

• Overview of the three proposed algorithms:

PATH1

• Phase 1: Gather information.

• Phase 2: Use a faster feasibility test to complete the parametric search.

• O(n log log n)

PATH2

• Gather information in more than one phase.

• O(n log* n)

PATH3

• Use a variety of refinements.

• O(n)

(28)

Algorithm PATH1

• Phase 1 – collect information

– Set .

– Form an f(n)-partition of the path.

– Form a set of sorted matrices.

– MSEARCH using feasibility test FTEST0.

– Compute functions using DIGEST1.

• Phase 2 – complete parametric search

•

(29)

Form an f(n)-partition of path

• r-partition of a path

– A partition of the vertices of the path into subpaths, where each subpath contains r vertices, except the last subpath, which contains at most r vertices.

• f(n)

– The largest power of 2 that is no larger than .

• Total subpaths.
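An r-partition, with r rounded down to a power of 2, can be sketched in a few lines of Python. The bound passed to `largest_power_of_two_at_most` below is a placeholder chosen for the example, since the bound used on the slide did not survive; both function names are illustrative.

```python
def r_partition(vertices, r):
    """Split a path (a list of vertices) into consecutive subpaths of r
    vertices; the last subpath holds the remainder (at most r vertices)."""
    return [vertices[i:i + r] for i in range(0, len(vertices), r)]


def largest_power_of_two_at_most(x):
    """Largest power of 2 that is <= x, for x >= 1."""
    return 1 << (int(x).bit_length() - 1)


path = [9, 6, 11, 2, 1, 15, 7, 8]
r = largest_power_of_two_at_most(3)   # hypothetical bound of 3, so r = 2
print(r_partition(path, r))           # [[9, 6], [11, 2], [1, 15], [7, 8]]
```

Choosing r as a power of 2 keeps the matrix dimensions convenient for repeated halving.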

•

(30)

Algorithm PATH1

– Set .

•

(31)

Form a set of sorted matrices

• Each subpath forms a sorted matrix .

– Each with dimension .

– Last one might have smaller dimension size.

• All form a set of sorted matrices.

•

(32)

Algorithm PATH1

– Set .

•

(33)

MSEARCH using feasibility test FTEST0

• Call MSEARCH with input:

– The set of matrices

– Stopping count

• Using feasibility test

• Output search boundary

• At most active values remains.

– Active value is any candidate value from a subpath and .

•

(34)

Algorithm PATH1

– Set .

•

(35)

Compute functions using DIGEST1

• For subpaths with no active values in their matrices, compute

• Maximum number of cuts from to , such that each connected component except the last has weight .

• Maximum possible weight of the last component, given the number of cuts in the subpath from to is .

(36)

DIGEST1

• A dynamic program computes ncut and rmdr for all vertices in the subpath.

– Range from back to .

– If ,

– Otherwise, ,

• is the smallest index such that .
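A back-to-front dynamic program of this shape can be sketched in Python. This is a hedged illustration, not the slides' DIGEST1: the threshold `lam`, the function name `digest1_sketch`, and the convention that a cut at the very end of the subpath is counted are all assumptions. It uses a sliding window so that the whole pass is linear in the subpath length.

```python
def digest1_sketch(weights, lam):
    """For each start l in a subpath, compute
      ncut[l]: number of greedy pieces of weight >= lam cut off from l onward,
      rmdr[l]: weight left over after the last of those cuts.
    Index n stands for the empty suffix. Assumes nonnegative weights.
    """
    n = len(weights)
    ncut = [0] * (n + 1)
    rmdr = [0] * (n + 1)
    end = n        # the window is weights[l:end]
    window = 0
    for l in range(n - 1, -1, -1):
        window += weights[l]
        # Shrink from the right while the window still reaches lam.
        while end > l + 1 and window - weights[end - 1] >= lam:
            window -= weights[end - 1]
            end -= 1
        if window >= lam:          # the first piece is weights[l:end]
            ncut[l] = 1 + ncut[end]
            rmdr[l] = rmdr[end]
        else:                      # lam is never reached: no cut at all
            rmdr[l] = window
    return ncut, rmdr


path = [9, 6, 11, 2, 1, 15, 7, 8]
print(digest1_sketch(path, 15)[0])  # [3, 3, 2, 2, 2, 2, 1, 0, 0]
ncut, rmdr = digest1_sketch(path, 16)
print(ncut[0], rmdr[0])             # 2 15
```

Because `end` only moves left, each vertex enters and leaves the window once, which matches the linear-time behaviour claimed for subpaths without active values.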

[Figure: DIGEST1 example, λ₂=12]

(37)

Algorithm PATH1

– Set .

•

(38)

(39)

Algorithm PATH1

– Set .

•

(40)

Example (1/6)

• Stopping count

Median (O) ->

Median (X) ->

13

Median 0 (O) -> 0

13

The number of remaining values reaches the stopping count 2 -> stops

(41)

Example (2/6)

13

Subpaths with no active value:

(42)

Example (3/6)

• Phase 2 of algorithm PATH1:

– Only test a value which

– Test feasibility of using

13 •

17

(43)

Example (4/6)

– Only consider a value which

– Test feasibility of using

13 •

0 7

(44)

Example (5/6)

13 •

11 -> (O) 13

(45)

Example (6/6)

13 13 •

12 -> (O) 13

All values are discarded, terminates.

Return

(46)

Complexity of Algorithm PATH1

– Set .

•

(47)

MSEARCH using feasibility test FTEST0

• By Theorem 2.1, the number of test values is .

– -> number of test values

• The test values are produced in .

• Each feasibility test using FTEST0 takes

• The total time is

•

(48)

Complexity of Algorithm PATH1

– Set .


(49)

Compute functions using DIGEST1

• Lemma 3.1. Let be a subpath with no active values. Then DIGEST1 computes ncut and rmdr for all vertices in in linear time.

For each of values of , a constant amount of work is done. The remaining work can be apportioned as constant for each of values of .

• For each , takes .

• There are total paths and thus takes time to complete all.

•

(50)

Complexity of Algorithm PATH1

– Set .


(51)

MSEARCH using feasibility test FTEST1

• Produce test values

– By Theorem 2.1, the number of test values is and produced in .

• Test feasibility

– What is the complexity of FTEST1?

•

(52)

MSEARCH using feasibility test FTEST1

• Lemma 3.2. Let be a path of n vertices partitioned into subpaths, each of size at most . Let all but at most subpaths have computed all and . FTEST1 determines the feasibility of a test value in time.

There are subpaths to be examined. Consider subpath . The search for uses time.

If there are no active values in , the remaining operations take constant time. There are such paths, which take in total .

If there are active values in , examining the path takes . Only subpaths have active values, which take in total.

•

(53)

MSEARCH using feasibility test FTEST1

• Produce test values

– By Theorem 2.1, the number of test values is and produced in .

• Test feasibility

– Each feasibility test takes .

– All tests take .

•

(54)

Complexity of Algorithm PATH1

– Set .


(55)

PATH2

• PATH2 is an improved version of PATH1.

– With complexity

– By applying the PATH1 strategy with more than two phases.

– The size of the subpaths grows larger in each phase.

• Each earlier phase provides data to speed up the current phase and decreases the difference between and .

•

(56)

PATH2 are called only when

(57)

Complexity of PATH2

• Theorem 3.4. Algorithm PATH2 solves the max-min k-partitioning problem on a path of n vertices in time.

There are phases, where is . Each call to MSEARCH with matrices of size uses time and generates test values.

On phase , performing all tests takes . On any other phase i, it takes

Since is , total time is .

The time to compute and is .

•

(58)

PATH3: LINEAR TIME PATH PARTITIONING

(59)

Before we start…

• The O(n log* n) algorithm (PATH2):

– In each rotation, find a tighter bound on the answer by dividing the path into larger subpaths each time

– O(log* n) rotations in total are needed to get the answer

– Each rotation requires O(n) time to run MSEARCH and to compute rmdr/ncut for the inactive cells
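To get a feel for how small the number of rotations is, the iterated logarithm can be computed directly. The Python sketch below uses the common base-2 definition (the number of times log₂ must be applied before the value drops to at most 1); the function name `log_star` is illustrative.

```python
import math


def log_star(n):
    """Iterated logarithm: how many times log2 must be applied to n
    before the result is at most 1."""
    count = 0
    x = float(n)
    while x > 1.0:
        x = math.log2(x)
        count += 1
    return count


for n in (2, 16, 65536, 2 ** 64):
    print(n, log_star(n))
# 2 1
# 16 3
# 65536 4
# 18446744073709551616 5
```

Even for n = 2⁶⁴ the value is only 5, so an O(n log* n) bound is close to linear in practice, although it is still not O(n).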

(60)

The main idea to speed up PATH2

• Instead of recomputing all values in every rotation, reuse the ncut and rmdr values computed in the last rotation

• Trim the matrices between rotations, so that MSEARCH does not need to process all the submatrices

(61)

The first part: a new way to digest

• No need to recalculate the values for all inactive cells’ vertices every time; recalculate them only when needed

– When?

– How?

– At what cost?

(62)

DIGEST2

(63)

When to do DIGEST2

• Maintain a queue of the vertices (in the subpath) whose rmdr and ncut values need to be updated

– When does a vertex need to be updated?

• The first time the vertex becomes inactive

• When the end of the subpath has been extended: as the subpath grows larger, new cuts can be found in the new subpath

– In other words:

the total weight from next(l) to the end of the subpath > current upper bound

(64)

How to do DIGEST2

• For each vertex:

– Update its ncut value by:

1. Generating the value for a new vertex, or

2. Using its previous result + the number of new cuts that appeared after the previous “next vertex”

– Keep updating the “next” value instead of the rmdr value for each vertex

– Only the rmdr values of the vertices at the tail of the subpath need to be updated

(65)

How to do DIGEST2: Step by step

1. Check, in order, whether the vertex needs to be updated

2. Find the next “related” vertex

a. If the “next” vertex exists, relate to it

b. If not, this is the first time it is updated. Check vertex by vertex

3. Check whether the related vertex found is in the end part of the subpath

a. If yes, the update ends here.

b. If no, put the related vertex into the queue.

4. Update next/ncut of the vertices in reverse order.

(66)

Time complexity of DIGEST2

• Each vertex needs a “vertex by vertex update” at most once over all rotations. This costs O(n) time in total

• How many times in total is a vertex updated beyond the first?

– Each round is bounded by (n/r_i)·r_{i+1}

• Rearranging the queues takes O(n) time because each queue is always a rotated sorted list

(67)

Trim the matrix

• Processing all submatrices is not efficient enough!

• Trim the submatrices before sending them to MSEARCH:

– Trim the rows or columns that are too small before each rotation

– Between rotations, keep a “possible area” that contains only matrix entries whose values are not too large. Send only the intersection of the trimmed matrix and the possible area to MSEARCH

• In order to trim the matrices, the “thin matrices” should first be processed by MSEARCH to get the information needed for trimming

(68)

Trim the matrix: process thin matrices

• At the very beginning of each rotation, take the first column and first row of every submatrix as a “thin matrix” and apply MSEARCH to them first

• For each value in the thin matrices, find out whether it is too large, too small, or in the active area

(69)

Trim the matrix:

process thin matrices(cont.)

(70)

Trim the matrix: trim the small part

• Discard all rows and columns that start with a too-small value and form a new set of matrices

(71)

Trim the matrix:

build possible area and find intersection

• Maintain an area that could contain active values by removing the rows or columns that end with a too-large value
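Both trimming rules rely on the matrix being sorted: the first entry of a row or column is its largest and the last entry is its smallest. The Python sketch below illustrates the two rules on an explicit small matrix. It is a simplification under stated assumptions: the slides work with implicit matrices and thin matrices rather than a stored table, and the name `trim` and the bounds `lam1` (too small) and `lam2` (too large) are illustrative.

```python
def trim(matrix, lam1, lam2):
    """Return (rows, cols) that may still hold a value in (lam1, lam2).

    Assumes every row and column of `matrix` is nonincreasing, so the
    first entry of a line is its largest and the last is its smallest.
    """
    rows = [i for i, row in enumerate(matrix)
            if row[0] > lam1 and row[-1] < lam2]
    cols = [j for j in range(len(matrix[0]))
            if matrix[0][j] > lam1 and matrix[-1][j] < lam2]
    return rows, cols


# Subpath-weight matrix of the path 9, 6, 11, 2 (rows: first vertex,
# columns: last vertex counted from the far end; an assumed layout).
m = [[28, 26, 15, 9],
     [19, 17,  6, 0],
     [13, 11,  0, 0],
     [ 2,  0,  0, 0]]
print(trim(m, lam1=10, lam2=20))  # ([0, 1, 2], [0, 1, 2])
```

Row 3 starts with 2 and column 3 starts with 9, so neither can contain a value above 10 and both are dropped; no line ends with a value of 20 or more, so nothing is removed for being too large.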

(72)

Trim the matrix:

build possible area and find intersection (cont.)

• Since the memory can only keep matrix areas, some scattered active values outside the matrix area might be lost

• In that case, use another set to keep those active values from going missing

• In the following rotations, send both the intersection of the possible areas and the trimmed matrices, and the possible active values kept in the set, to MSEARCH for processing

(73)

PATH3

(74)

Before complexity analysis:

Reviewing theorem 2.1

(75)

Time complexity of trimming process

• Analysis: rotation i for instance

– It takes O(n/r_{i+1}) time to form the intersection submatrices. These submatrices include O(n/r_{i+1}) thin matrices, and O((n/r_{i+1}) log r_i) time is spent processing O(log r_i) search values

– O(n/r_{i+1}) time is used to trim the matrices

– The total dimension of the remaining submatrices is bounded by O(2n/r_i + large(i)), so there are O((n/r_i + large(i))/r_i) matrices of size r_i × r_i in the worst case

(76)

Time complexity of trimming process (cont.)

• By Theorem 2.1, the time for MSEARCH to process the matrices and the set of possible values is as follows:

– It takes O((n/r_i + large(i)) + n/r_{i+1}) time to produce O(log r_i) search values

– Each search value costs O((n/r_{i+1}) log r_{i+1}) time to test.

• While the search values are generated, the new “possible area” for the next rotation is produced at the same time. Thus, the total cost of producing the search values and generating the possible area is O((n/r_{i+1}) log r_i + large(i))

(77)

TREE PARTITIONING

(78)

Methods

• 1.

• 2.

• 3.

(79)

Edge partition

(80)

Vertex partition

(81)

r-partition

(82)

Method 1

• Two phases

• The first phase reduces the problem of partitioning a tree to the problem of partitioning a tree with fewer than leaves.

• When there are at least leaves, at least half of the leaf-paths are of length at most

(83)

Method 1

• Initialize

(84)

Method 1

• First phase

(85)

Method 1

• Second phase

(86)

Method 1

(87)

Method 1

• In the while loop of phase 1:

• MSEARCH takes O(n) time to produce O(log log n) search values, and each search value costs O(n) time to test.

• The loop runs O(log r) = O(log log n) times

• So the total time is O(n (log log n)²)

(88)

Method 1

• By Lemma 4.1, a feasibility test uses O((n/r) log r) = O(n log log n/(log n)²) time

• The number of calls to MSEARCH is O(log n), and each call produces O(log n) test values.

• So the total time of phase 2 is O(n log log n)

(89)

Method 2

(90)

Method 2

• There are lev+1 phases, where lev is O(log* n).

• Phase i produces O((log r_i)²) test values.

• The total time of phase i is O((n/r_{i+1}) log r_{i+1} (log r_i)²).

• r_{i+1} is Θ((log r_i)³), so the time of each phase is O(n)

• The total time is O(n log* n)

(91)

TREE3: LINEAR TREE PARTITIONING

(92)

General ideas and difficulties encountered

• Applying to trees the same technique that turns PATH2 into PATH3 causes some difficulties:

– Pruning a leaf path may add maintenance time for DIGEST2 because vertex weights change unpredictably

– To trim the submatrices on trees as in PATH3, sufficiently stable submatrices might be needed so that each rotation can preserve useful information for the following rotations

(93)

The first problem of algorithm improvement

• Copying the tree in each rotation in a straightforward way costs too much time (O(n log* n) in total)

– Thus, a new way to maintain and store the pruned tree and the original tree is necessary

(94)

New structure

(95)

How does the structure work?

• When pruning a leaf path, simply remove the value from the table

• After pruning, if a leaf path becomes the only child of some vertex, move the leaf path to the left of its parent to form a new leaf path

• Use pointers to remember where the subpaths’ parents are

(96)

Additional value

• For each vertex, keep the total weight of all its descendants (including itself)

• These values can be maintained at the same cost as moving when the rightmost child is always the one pruned, which is not always the case. Thus, another mechanism is needed to solve this problem

(97)

Light path

• A light path is a path whose total weight is under the current upper bound; otherwise it is a heavy path

• For a light path, maintain its total weight only

• At the i-th rotation, after the tree has been pruned, there will be at most O(n/r_i) light paths with active values. Reduce this to O(n/r_i²) by applying MSEARCH to the total weight of each light path

(98)

Heavy path

• Use the overlapping-subpath idea to maintain the heavy paths

• Each vertex is included in at most 2 overlapping subpaths

• For any 2 adjacent overlapping subpaths, the overlapping range should be larger than the current upper bound to ensure that all possible active values can be generated by those subpaths

(99)

To maintain heavy path structure

• In the beginning, there is no heavy path

• A heavy path appears when:

– The upper bound is reduced

• Set 2 overlapping subpaths, the first and the last, which both cover the whole path

– Paths are concatenated

• There are 3 cases

(100)

To maintain heavy path structure(cont.)

• When the result of a concatenation is a heavy path:

– Case 1: light path + light path

• Set 2 overlapping subpaths, the first and the last, which both cover the whole path

– Case 2: light path + heavy path

• Stretch one side of an overlapping subpath to cover the light path

– Case 3: heavy path + heavy path

• Combine the last overlapping subpath of the ancestor with the first overlapping subpath of the descendant

(101)

Light/Heavy path mechanism

• The new mechanism solves the maintenance cost problem of the additional-value mechanism

– Maintaining the light paths costs O(n) time in total

– Maintaining a heavy path: instead of recalculating all ancestors' values, only the corresponding overlapping subpath is recalculated. It costs O(1) time per vertex, and O(n) time in total

• Each vertex is contained in at most 2 overlapping subpaths, and each is updated at most twice (when it first appears and when it is combined with an adjacent overlapping subpath)

(102)

Building the RB-tree

• The new mechanism makes it possible to apply the previously developed algorithm in an O(n) algorithm for the tree partition problem

• The feasibility test needs a logarithmic-time search function for subpaths, and the light/heavy path mechanism does not guarantee that the whole subpath is included

• Build and maintain an RB-tree for each subpath to replace binary search

(103)

TREE3

(104)

The time complexity for TREE3

• When phase i starts, there are O(n/r_{i+1}) matrices, O(n/r_{i+1}²) light paths, and O(n/r_{i+1}²) elements in the set

• The first step, pruning and relocating/recalculating subpaths, costs O(n) time in total

• Maintaining the RB-tree during concatenations costs O(log log r_{i+1}) each time, and this happens at most O(n/r_{i+1}) times

(105)

The time complexity for TREE3(cont.)

• Inactivating some light paths costs O(n/r_i) time to produce O(log r_i) test values, which can be tested in O((n/r_{i+1}) log r_i log r_{i+1}) = O((n/r_i) log r_i) time

• The last part of TREE3 and the rebuilding of the overlapping subpaths are proved to cost O(n) time over all rotations

• TREE3 solves the tree partitioning problem in O(n) time
