
(1)

O(log²k/ log log k)-Approximation Algorithm for Directed Steiner Tree: A Tight Quasi-Polynomial-Time Algorithm

STOC 2019: Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing

Fabrizio Grandoni Bundit Laekhanukit Shi Li

Final Presentation, Special Topics on Graph Algorithms, Spring 2021 B07902024 塗大為 B07902133 彭道耘 B07902134 黃于軒 B07902141 林庭風

June 1st, 2021

(2)

Overview

1 Introduction

2 Previous & Related Works

3 Decomposition Tree

4 Label-Consistent Subtree

5 Approximation Algorithm for Label-Consistent Subtree

6 LP Relaxation

7 Hardness Results

(4)

Introduction

(5)

Outline

For a directed graph with k terminals, the paper [GLL18] presents:

An O(log²k/ log log k)-approximation algorithm for Directed Steiner Tree (DST) in quasi-polynomial time. (Sections 3, 4, 5, 6)

Under certain conjectures, O(log²k/ log log k) is the optimal approximation ratio for quasi-polynomial-time algorithms. (Section 7)

(6)

Directed Steiner Tree

Definition 1

Given a weighted directed graph G, a root r ∈ V(G) and a set of terminals K ⊆ V(G) \ {r}, a directed Steiner tree T is an arborescence (directed tree) rooted at r that contains all the terminals.

The goal of Directed Steiner Tree is to find such a T with minimum cost. Throughout the presentation, we additionally assume, without loss of generality, that the edge weights of the input graph satisfy the triangle inequality.
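The triangle-inequality assumption is usually justified by passing to the metric closure of the input graph, where each edge weight is replaced by a shortest-path distance. A minimal sketch of that step follows; it assumes vertices numbered 0..n-1 and edges given as (u, v, cost) triples, and the function name `metric_closure` is illustrative rather than taken from the paper.

```python
from math import inf


def metric_closure(n, edges):
    """All-pairs shortest-path costs of a directed graph on vertices 0..n-1.

    `edges` is an iterable of (u, v, cost) triples (an assumed input format).
    """
    dist = [[0 if u == v else inf for v in range(n)] for u in range(n)]
    for u, v, w in edges:
        dist[u][v] = min(dist[u][v], w)
    # Floyd-Warshall: allow vertex m as an intermediate stop.
    for m in range(n):
        for u in range(n):
            for v in range(n):
                if dist[u][m] + dist[m][v] < dist[u][v]:
                    dist[u][v] = dist[u][m] + dist[m][v]
    return dist
```

In the resulting cost matrix every direct edge is at most as expensive as any detour, which is the triangle inequality used in the rest of the slides.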

(7)

Directed Steiner Tree


(8)

Basic Complexity Theory

Definition 2

Quasi-polynomial-time algorithms run in $O(2^{\mathrm{polylog}(n)})$ time. More precisely, $\mathrm{QP} = \bigcup_{c \in \mathbb{N}} \mathrm{DTIME}\left(2^{O(\log^c n)}\right)$.

Definition 3

ZPTIME(f(n)) refers to randomized algorithms that always return the correct answer and have randomized running times with expectation O(f(n)).

Remark

It will be shown that, assuming the Projection Game Conjecture (which will be stated later) and $\mathrm{NP} \not\subseteq \bigcap_{0 < \epsilon < 1} \mathrm{ZPTIME}(2^{n^{\epsilon}})$, the optimal approximation ratio for quasi-polynomial-time algorithms is $O(\log^2 k / \log\log k)$.

(9)

Sketch of Approach

1 Directed Steiner Tree (DST) ⇐⇒ Decomposition Tree

2 Decomposition Tree ⇐⇒ Label-Consistent Subtree (LCST)

3 Integer linear programming formulation of LCST

4 Sherali-Adams lifting of the corresponding relaxed LP formulation

5 O(log²k/ log log k)-approximation of the ILP instance in n^{O(log⁵k)} time

(10)

Sherali-Adams Lifting

A lift and project method to solve ILP, originally introduced in [SA90].

1 Formulate (relaxed) LP problem, resulting in a relaxed polytope containing the integer polytope.

2 Apply Sherali-Adams to "tighten" the polytope.

3 It can be shown that, after enough rounds of the previous step, the integer polytope is obtained.

4 The idea is to run "the right number of" rounds such that the resulting polytope is "somewhat tight" yet not too difficult to solve.

For the R-th round, Sherali-Adams adds the variables $x_S = \prod_{i \in S} x_i$ for every subset $S \subseteq [n]$ of size not exceeding R and replaces the original constraints accordingly.

(11)

Sherali-Adams Lifting

For a polytope P, SA(P, R) denotes the polytope tightened by the R-th round of the Sherali-Adams lift. Additionally, the hierarchy makes conditioning on an event possible.

Definition 4

Let $x \in \mathrm{SA}(P, R)$ for $R \geq 1$. For $x_i > 0$, define $x' \in \mathrm{SA}(P, R-1)$, x conditioned on $x_i$, as

$x'_S = \dfrac{x_{S \cup \{i\}}}{x_i}$ for $S \in \binom{[n]}{\leq R-1}$.
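To make the conditioning operation concrete, here is a small sketch that stores a lifted vector as a dictionary from index sets S to values x_S. The representation and the helper names (`lift_integral`, `mix`, `condition`) are assumptions made for illustration; they are not part of the paper or of any library.

```python
from itertools import combinations


def lift_integral(point, R):
    """Lifted vector of a 0/1 point: x_S = prod_{i in S} point[i] for |S| <= R."""
    n = len(point)
    x = {}
    for size in range(R + 1):
        for S in combinations(range(n), size):
            x[frozenset(S)] = 1.0 if all(point[i] for i in S) else 0.0
    return x


def mix(xs, weights):
    """Convex combination of lifted vectors defined over the same index sets."""
    return {S: sum(w * x[S] for x, w in zip(xs, weights)) for S in xs[0]}


def condition(x, i, R):
    """Return x' with x'_S = x_{S ∪ {i}} / x_i for all |S| <= R - 1."""
    xi = x[frozenset([i])]
    if xi <= 0:
        raise ValueError("can only condition on an event with x_i > 0")
    return {S: x[S | {i}] / xi for S in x if len(S) <= R - 1}
```

For example, mixing the lifted vectors of the points (1, 0) and (0, 1) with weight 1/2 each and then conditioning on coordinate 0 recovers the point (1, 0): after conditioning, the chosen event has value 1 and the other coordinate has value 0.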

(12)

Basic Definitions

Definition 5

Given a rooted tree T and a vertex v ∈ V(T), we use root(T) to denote the root of T and T[v] to denote the subtree containing v and all of its descendants.

Definition 6

The rooted trees under discussion are out-arborescences, i.e., edges are directed toward the leaves. For a directed edge e = (u, v), we define head(e) = u and tail(e) = v.

(14)

Previous & Related Works

(15)

Comparison to Previous Work

The best polynomial-time approximation algorithm for DST achieves an O((1/ϵ)³k^ϵ)-approximation ratio in O(n^{1/ϵ}) time, due to Charikar et al. [Cha+99].

The previously best approximation algorithm for DST in quasi-polynomial time achieves ratio O(log³k), also due to Charikar et al. [Cha+99].

Recursive greedy algorithm of Chekuri and Pal for GST [CP05]:

The first one that yields an approximation ratio of O(log²k) for GST in quasi-polynomial time.

Their algorithm exploits that any optimal solution can be shortcut into a path of length k while paying only a factor of 2 (such a path exists in the metric closure of the input graph).

Hierarchy-based LP-rounding techniques by Rothvoß [Rot12]:

Handle the dependency rules.

(16)

Related work

An O(log²k log n)-approximation for GST in polynomial time, where k is the number of groups, by Garg et al. [GKR98]:

Map the input instance into a tree instance by invoking the Probabilistic Metric-Tree Embeddings.

Apply an elegant LP-based randomized rounding algorithm.

(17)

Related work

ℓ-DST and ℓ-GST, survivable network variants of DST and GST.

ℓ edge-disjoint directed (resp., undirected) paths from the root to each terminal (resp., group).

There is no $2^{\log^{1-\epsilon} n}$-approximation, for any ϵ > 0, unless NP ⊆ DTIME(2^{polylog(n)}), by Cheriyan et al. [Che+14].

There is no $\ell^{1/2-\epsilon}$-approximation, for any constant ϵ > 0, unless NP = ZPP, by Laekhanukit [Lae14].

Gupta et al. [GKR10] presented an Õ(log³n log k)-approximation algorithm for 2-GST.

Chalermsook et al. presented an LP-rounding bicriteria approximation algorithm for ℓ-GST that returns a subgraph with cost O(log²n log k) times the optimum while guaranteeing a connectivity of at least Ω(ℓ/ log n).

(19)

Decomposition Tree

(20)

Decomposition Tree

Definition 7

A decomposition tree τ of G is a rooted tree where

each α ∈ V(τ) is associated with a vertex µ_α ∈ V(G);

each leaf β ∈ V(τ) is associated with an edge e_β ∈ E(G), and µ_β = head(e_β);

for every child α₂ of a vertex α, there is a child α₁ of α such that µ_α = µ_α₁ and µ_α₂ is involved in τ[α₁].

The cost of τ is the sum of the costs of the edges corresponding to its leaves.

Definition 8

A vertex v ∈ V(G) is involved in τ[α] if either

µ_α = v, or

there is a leaf β ∈ τ[α] with v = tail(e_β).

(21)

Decomposition Tree

Intuitively, for a subgraph G^′⊆ G, a decomposition tree of G^′ represents the process of recursively dividing E(G^′) until the number of edges becomes one. Additionally,

The decomposition is associated with a root r. r is the head of at least one edge in the current edge set.

At the decomposition step, each sub-instance created either has the same root r^′ as the current instance, or r^′ is the tail of an edge in the edge set of another sub-instance having the same root as the current instance. This ensures that the sub-instance is reachable from the current root.

(22)

Reduction

The following will be shown:

Given a directed Steiner tree T, there exists a decomposition tree τ of T rooted at r with cost(τ) ≤ cost(T). All the terminals are involved in τ. Additionally, τ is a full binary tree of height O(log k).

Given a decomposition tree τ rooted at r of some unknown subgraph in G that involves all terminals, there exists a directed Steiner tree T with cost(T) ≤ cost(τ).

Therefore, finding the minimum DST can be reduced to finding the minimum decomposition tree τ^∗ rooted at r in G involving all k terminals.

Moreover, such a decomposition tree will be a full binary tree of height O(log k).

(23)

DST to Decomposition Tree

Let T be a directed Steiner tree. Since the triangle inequality is satisfied, we may assume that the internal vertices of T have out-degree at least two, and thus |V(T)| ≤ 2k.

The following lemma was proved in class.

Lemma 9

For a tree T of n edges, there exists a vertex v ∈ V(T) such that T can be decomposed into two subtrees T₁ = T[v] and T₂ = T \ T[v], and $\frac{n}{3} \leq |E(T_i)| \leq \frac{2n}{3}$ holds for i ∈ {1, 2}.

(24)

DST to Decomposition Tree

τ can be constructed by recursively decomposing T into balanced subtrees.

In particular, by setting τ ← Decompose(T, r):

procedure Decompose(T, r)
    α ← a new vertex with µ_α = r
    if T = {(r, v)} then
        e_α ← (r, v)
        return the decomposition tree with a single vertex α
    end if
    Let v be the vertex with |T|/3 ≤ |T[v]| ≤ 2|T|/3 by Lemma 9
    τ₁ ← Decompose(T \ T[v], r)
    τ₂ ← Decompose(T[v], v)
    Set root(τ₁) and root(τ₂) to be children of α
    return the decomposition tree rooted at α
end procedure
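The recursion can be sketched in Python as follows. This is an illustrative variant, not the paper's exact procedure: since the subtree below a leaf has no edges, the sketch also allows splitting off a group of child branches at a vertex, and it simply picks the most balanced split among these candidates instead of relying on the bound of Lemma 9. The tuple encoding of τ and the names `decompose` and `leaf_edges` are assumptions of the sketch.

```python
def decompose(edges, root):
    """Decomposition tree of an arborescence given as a list of (u, v) edges.

    A leaf is (mu, edge); an internal vertex is (mu, left, right).
    """
    if len(edges) == 1:
        return (root, edges[0])
    children = {}
    for u, v in edges:
        children.setdefault(u, []).append(v)

    def branch(u, c):
        """Edge (u, c) together with all edges below c."""
        out, stack = [(u, c)], [c]
        while stack:
            x = stack.pop()
            for y in children.get(x, []):
                out.append((x, y))
                stack.append(y)
        return out

    best = None  # (size of larger part, split vertex, edges split off)
    for v in children:
        parts = [branch(v, c) for c in children[v]]
        options = []
        if v != root:
            options.append([e for p in parts for e in p])  # everything below v
        acc = []
        for p in parts[:-1]:  # proper prefixes of v's child branches
            acc = acc + p
            options.append(acc)
        for part in options:
            score = max(len(part), len(edges) - len(part))
            if 0 < len(part) < len(edges) and (best is None or score < best[0]):
                best = (score, v, part)
    _, v, part = best
    part_set = set(part)
    rest = [e for e in edges if e not in part_set]
    # The remaining edges keep the root; the split-off part is rooted at v.
    return (root, decompose(rest, root), decompose(part, v))


def leaf_edges(tau):
    """Edges stored at the leaves of a decomposition tree."""
    if len(tau) == 2:
        return [tau[1]]
    return leaf_edges(tau[1]) + leaf_edges(tau[2])
```

Every edge of the input ends up at exactly one leaf, so the cost of the decomposition tree equals the cost of the input tree. In each split the part that is peeled off is rooted either at the current root or at a vertex that still appears in the remaining part, which matches the involvement requirement of Definition 7.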

(25)

DST to Decomposition Tree

Figure: The Steiner tree T^∗ and the decomposition tree τ^∗ that it maps to. Each time, a balanced subtree is peeled off and the two parts are recursively divided.

(26)

DST to Decomposition Tree

Let’s verify that the outcome indeed has the desired properties:

The cost of τ is the same as the cost of T, because every decomposition step partitions the edge set and hence each edge of T appears at exactly one leaf of τ.

Since |V(T)| ≤ 2k and the size of the edge set shrinks by a constant factor at each recursion, the height of τ is O(log k).

Every internal vertex of τ has exactly two children, hence τ is a full binary tree.

(27)

Decomposition Tree to DST

Let τ be a decomposition tree in G that involves all terminals with µ_{root(τ)} = r. We’d like to show

Lemma 10

Let α ∈ V(τ) and let G_α be the subgraph induced by the edges at the leaves of τ[α]. Then, G_α contains a directed path from µ_α to every vertex that is involved in τ[α].

Let E^′ be the edges at the leaves of τ and G^′ be the subgraph induced by E^′. By Lemma 10, all terminals are reachable from µ_{root(τ)} = r in G^′. Thus, by removing redundant edges in G^′, a directed Steiner tree with cost not exceeding cost(τ) is constructed.

(28)

Decomposition Tree to DST

Let’s prove the lemma by induction on the height of α:

α is a leaf: Only head(e_α) and tail(e_α) are involved in τ[α], and they are both reachable from µ_α = head(e_α) using e_α.

α is an internal vertex: Let v be a vertex that is involved in τ[α]. There are two possibilities:

v = µ_α: Then it is trivially reachable from itself.

v = tail(e_β) for a leaf β ∈ τ[α]: Let α₂ be the child of α such that β ∈ τ[α₂]; hence v is involved in τ[α₂] as well. If µ_α₂ = µ_α, then by the inductive hypothesis v is reachable from µ_α. Otherwise, µ_α₂ is involved in τ[α₁] for some child α₁ with µ_α₁ = µ_α. By the inductive hypothesis, there exists a path P from µ_α₁ to µ_α₂ and a path Q from µ_α₂ to v. Concatenating P and Q yields the desired path.

(30)

Label-Consistent Subtree

(31)

Label-Consistent Subtree

Definition 11

Let T⁰ be a rooted tree with cost c_v on vertex v. Let L be the set of labels. Each vertex v∈ V(T⁰) is associated with two sets, the service ser(v)⊆ L and the demand dem(v) ⊆ L. A subtree T of T⁰ with root(T) = root(T⁰) is label-consistent if

For all vertices v∈ V(T) and l ∈ dem(v), there exists a descendant u of v in T such that l∈ ser(u). That is, each demand of v is supplied by some descendant of v.

Definition 12

Let K⊆ L be the set of global labels. Symmetrically, L \ K is the set of local labels. The goal of Label-Consistent Subtree is to find a label-consistent subtree T^∗ with minimum cost such that all global labels are supplied.
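Definitions 11 and 12 can be turned into a small feasibility check. The sketch below assumes the tree is given as a `children` dictionary, that `ser` and `dem` map nodes to label sets, and that "descendant" means a proper descendant; the function name `is_label_consistent` is illustrative only.

```python
def is_label_consistent(children, ser, dem, root, chosen, global_labels):
    """Check whether the node set `chosen` is a feasible LCST solution."""
    if root not in chosen:
        return False
    # Every chosen non-root node must hang off a chosen parent.
    parent = {c: p for p, cs in children.items() for c in cs}
    if any(v != root and parent.get(v) not in chosen for v in chosen):
        return False

    def supplied_below(v):
        """Labels served by proper descendants of v inside the chosen subtree."""
        labels = set()
        for c in children.get(v, []):
            if c in chosen:
                labels |= ser.get(c, set()) | supplied_below(c)
        return labels

    # Each demand of a chosen node must be served below it.
    if any(not dem.get(v, set()) <= supplied_below(v) for v in chosen):
        return False
    # All global labels must be served somewhere in the subtree.
    all_served = set().union(*(ser.get(v, set()) for v in chosen))
    return set(global_labels) <= all_served
```

The check is quadratic in the worst case because `supplied_below` is recomputed per node, which is acceptable for illustration but not for the quasi-polynomially large instances built later.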

(32)

Sketch of Reduction

For an instance of DST, we’d like to find a desired decomposition tree by constructing an instance of LCST T⁰ such that

All possible full binary decomposition trees of height O(log k) are embedded in T⁰.

To achieve a better approximation ratio, each decomposition tree is divided into twigs of height O(log log k).

The structure and property of decomposition trees are guaranteed by the label-consistencies of subtrees.

(33)

Sketch of Reduction

In particular, T⁰ will have the following properties:

Height h = O(log k/ log log k).

$s := \max_{v \in V(T^0)} |\mathrm{dem}(v)|$ is O(log k).

The number of vertices is $N = n^{O(\log^2 k / \log\log k)}$.

The number of global labels |K| is the same as the number of terminals k.

A solution to LCST can be converted to a decomposition tree τ with the same cost that involves all terminals, and vice versa.

(34)

Sketch of Reduction

With

Theorem 13

There is an $(shN)^{O(sh^2)}$-time O(h log k)-approximation algorithm for the Label-Consistent Subtree problem.

we get an O(h log k) = O(log²k/ log log k)-approximation algorithm for DST that runs in $(shN)^{O(sh^2)} = n^{O(\log^5 k)} = 2^{O(\log^6 n)}$ time.
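The running-time bound can be checked by substituting the parameters of T⁰. The following derivation is a sketch that only uses s = O(log k), h = O(log k/ log log k), N = n^{O(log²k/ log log k)} and k ≤ n:

```latex
\begin{align*}
s h^2 &= O(\log k) \cdot O\!\left(\frac{\log^2 k}{(\log\log k)^2}\right)
       = O\!\left(\frac{\log^3 k}{(\log\log k)^2}\right), \\
\log(shN) &= O(\log\log k) + O\!\left(\frac{\log^2 k}{\log\log k}\,\log n\right)
       = O\!\left(\frac{\log^2 k \,\log n}{\log\log k}\right), \\
(shN)^{O(sh^2)} &= 2^{O(sh^2 \log(shN))}
       = 2^{O\left(\frac{\log^5 k}{(\log\log k)^3}\,\log n\right)}
       \subseteq n^{O(\log^5 k)} \subseteq 2^{O(\log^6 n)}.
\end{align*}
```

The last inclusion uses k ≤ n, so that log⁵k · log n ≤ log⁶n.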

(35)

Twigs

A decomposition tree is divided into twigs and embedded in the LCST instance. Specifically:

Definition 14

Let g := ⌈log₂ log₂ k⌉. A twig η is a full binary tree of height at most g, where

Each α ∈ V(η) is associated with µ_α ∈ V(G)

Each leaf β ∈ V(η) may or may not have an associated e_β ∈ E(G), and if it does, µ_β = head(e_β)

We can think of leaves β without e_β as internal vertices of the decomposition tree τ to which η corresponds, while leaves with e_β map to leaves of τ.

(36)

Construction of T⁰

Let h̄ = O(log k) be the upper bound on height(τ^∗). The number of twig levels that τ^∗ is divided into is then at most ⌈h̄/g⌉.

T⁰ is constructed by calling ConstructLCST(r, 0). The set of global labels is identical to the set of terminals.

1: procedure ConstructLCST(u, j)
2:     Let p be a new node with c_p = 0 and dem(p) = {ℓ}, where ℓ is a new local label
3:     if j < ⌈h̄/g⌉ then
4:         for each possible non-singular twig η with µ_root(η) = u do
5:             AddChild(p, ℓ, η)
6:         end for
7:     end if
8: end procedure

(37)

Construction of T⁰

1: procedure AddChild(p, ℓ, η)
2:     Let q be a new child of p with c_q = ∑_{β∈η} c_β, ser(q) = {ℓ}, dem(q) = ∅
3:     for leaf β of η do
4:         if e_β is defined then
5:             if tail(e_β) ∈ K then Add global label tail(e_β) to ser(q)
6:             end if
7:         else
8:             T^q_β ← ConstructLCST(µ_β, j + 1)
9:             Set root(T^q_β) to be q’s child
10:             Create a new local label ℓ^′ and add it to dem(q) and ser(root(T^q_β))
11:         end if
12:     end for

(38)

Construction of T⁰

1: procedure EnsureStructure(p, η, q)
2:     for internal node α of η do
3:         Let α₁ be a child of α with µ_α₁ = µ_α and α₂ be the other child
4:         if µ_α₂ ≠ µ_α and ∄ leaf β ∈ η[α₁] with tail(e_β) = µ_α₂ then
5:             Create a new local label ℓ^′ and add it to dem(q)
6:             for leaf β of η[α₁] with e_β undefined, q^′ ∈ T^q_β do
7:                 if η_q′ has a leaf β^′ with tail(e_β′) = µ_α₂ then
8:                     Add ℓ^′ to ser(q^′)
9:                 end if
10:             end for
11:         end if
12:     end for

(39)

Construction of T⁰ - Intuition

The height of T⁰ (counted in twigs) is bounded by the guard j < ⌈h̄/g⌉.

Each call to ConstructLCST is a two-level expansion:

The first level (L2 in ConstructLCST) corresponds to a vertex u ∈ V(G) (referred to as p-nodes)

The second level (L2 in AddChild) corresponds to a twig rooted at u (referred to as q-nodes)

We must choose at least one such twig if u is chosen (enforced by the local label ℓ)

(40)

Construction of T⁰ - Intuition

Figure: A single step of the recursion made in AddChild. Each q-node is associated with a twig η_q, where µ_{root(η_q)} is the vertex u_p that its parent p corresponds to.

(41)

Construction of T⁰ - Intuition

A label-consistent subtree is a valid decomposition tree:

If a leaf node β of η does not have a well-defined e_β, we must further expand it, as enforced by the local label ℓ^′ at L10 in AddChild.

µ_α₂ is involved in η[α₁], because the local label ℓ^′ at L5 in EnsureStructure is eventually supplied by one of the descendants that points to µ_α₂.

A label-consistent subtree that supplies all global labels maps to a decomposition tree in which all terminals are involved, since there is a one-to-one correspondence between the global labels and the terminals.

(42)

Properties of T⁰

The height is by construction O(h̄/g) = O(log k/ log log k).

For a node v in T⁰:

If v is a q-node, then |dem(v)| is bounded by the number of vertices in a twig, which is O(2^{log log k}) = O(log k)

If v is a p-node, then |dem(v)| = 1

The size of T⁰ is dominated by the number of branches of p-nodes.

For u ∈ V(G), the number of twigs rooted at u is bounded by

$(2^g)^2 = 2^{2g}$: the number of shapes of twigs, times

$(n^2)^{2^g} = n^{2 \cdot 2^g}$: the number of ways to assign µ_∗ and e_∗ to vertices in a twig,

which is roughly $\log k \times n^{\log k} = n^{O(\log k)}$. Thus, $N \leq \left(n^{O(\log k)}\right)^{\log k / \log\log k} = n^{O(\log^2 k / \log\log k)}$.

The set of global labels is the same as the set of terminals.

(43)

Decomposition Tree to LCST

Given the optimal decomposition tree τ^∗, we can locate its corresponding T^∗ in T⁰ as follows:

Decompose τ^∗ into twigs at vertices of depth ig for integers i. That is, if vertex v is of depth ig, then the twig rooted at v contains the descendants of v of depth ig, ig + 1, . . . , (i + 1)g.

Locate T^∗ recursively, starting from r. At a vertex v of depth ig, add to T^∗ the node p corresponding to v and the node q corresponding to the twig rooted at v. For each descendant u of v at depth (i + 1)g, there is a child p^′ of q associated with u, and we continue the locating process at u.

(44)

Decomposition Tree to LCST

In particular, the following can be shown by comparing the required properties of a decomposition tree with the construction of T⁰.

Lemma 15

T^∗ is a label-consistent subtree of T⁰ with cost(T^∗) = cost(τ^∗). Moreover, all global labels are supplied by T^∗.

(45)

LCST to Decomposition Tree

Conversely, we need to show the following.

Lemma 16

Given any feasible solution T to the LCST instance T⁰, in time poly(|V(T)|) we can construct a decomposition tree τ with cost(τ) = cost(T). Moreover, if a global label v ∈ K is supplied by T, then τ involves v.

Let C be the set of twigs contained in T. For a q-node q in T and the twig η_q it corresponds to, consider a child p of q (a p-node) and p’s child q^′. Then µ_{root(η_q′)} is µ_β, where β is the leaf of η_q related to p. τ is then constructed by connecting η_q and η_q′ at β for all such q and q^′. The cost and the structure of τ follow from the way we construct T⁰ and the placement of local labels.

(47)

Approximation Algorithm for Label-Consistent Subtree

(48)

Main Theorem

In the following few sections, we will derive the following theorem.

Theorem 17

There is an $(shN)^{O(sh^2)}$-time O(h log k)-approximation algorithm for the Label-Consistent Subtree problem, where $s := \max_{v \in V(T^0)} |\mathrm{dem}(v)|$.

(49)

Redefining

To make the statement clearer and easier, we will, without loss of generality, transform the input LCST instance into one satisfying the following properties:

dem(u) and dem(v) are disjoint for all distinct u, v ∈ V(T⁰).

Make copies of duplicate labels.

Demand labels are only located at the internal nodes.

Demand labels at leaves are either irrelevant or never supplied.

Service labels are only located at the leaves, and each leaf contains exactly one service label.

Attach |ser(v)| leaves with cost 0 to v and distribute the service labels to them.

(50)

Redefining

With the above simplifications, we can redefine the LCST problem. Let V^leaf and V^int be the sets of leaves and internal nodes. Let Λ_v be the set of children of v. Let Λ^leaf_v = V(T⁰[v]) ∩ V^leaf.

We need to find a minimum-cost subtree T of T⁰ such that root(T) = root(T⁰) and for all ℓ ∈ K there exists v ∈ V(T) ∩ V^leaf with a_v = ℓ, where a_v is the unique label in ser(v).

(51)

Redefining

Consider the change in the size and height of T⁰ after the transformation.

Let h^′ and N^′ be the height and size of the old T⁰. Then h ≤ h^′ + 1.

The number of leaves in the new T⁰ is at most s(h^′ + 1)N^′, so N = O(sh^′N^′).

For an optimum tree T^∗ with cost opt, we can assume that

For every ℓ ∈ K, there exists exactly one node v ∈ V(T^∗) ∩ V^leaf with a_v = ℓ.

For every ℓ ∈ L \ K, there is at most one node v ∈ V(T^∗) ∩ V^leaf with a_v = ℓ.

(52)

Randomized Algorithm

The algorithm is based on the following theorem.

Theorem 18

There is an $(sN)^{O(sh^2)}$-time algorithm that outputs a random label-consistent tree $\tilde{T}$ such that $\mathbb{E}[c(\tilde{T})] \leq \mathrm{opt}$, and for every ℓ ∈ K, we have

$\Pr\left[\exists v \in V^{\mathrm{leaf}} \cap V(\tilde{T}) : a_v = \ell\right] \geq \frac{1}{h+1}$.

We run the algorithm above O(h log k) times and let T^′ be the union of all the trees $\tilde{T}$ obtained. The expected cost of T^′ is at most O(h log k) · opt, and by the union bound, we have

$\Pr\left[\forall \ell \in K,\ \exists v \in V^{\mathrm{leaf}} \cap V(T') : a_v = \ell\right] \geq \frac{1}{2}$.

In expectation, we just need to run the procedure twice.
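The number of repetitions hidden in O(h log k) can be made explicit. A minimal sketch, assuming only the per-run success probability 1/(h + 1) from Theorem 18 and independence between runs; the function name `repetitions` is hypothetical.

```python
import math


def repetitions(h, k):
    """A sufficient number M of independent runs, M = ceil((h + 1) * ln(2k)).

    Each run misses a fixed global label with probability at most
    1 - 1/(h + 1) <= exp(-1/(h + 1)), so M runs miss it with probability
    at most exp(-M/(h + 1)) <= 1/(2k); a union bound over k labels gives 1/2.
    """
    return math.ceil((h + 1) * math.log(2 * k))


def failure_bound(h, k):
    """Union-bound estimate of the probability that some label is missed."""
    m = repetitions(h, k)
    return k * (1 - 1 / (h + 1)) ** m
```

For any h ≥ 1 and k ≥ 1 the value of `failure_bound(h, k)` is at most 1/2, which is the bound used on the slide.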

(54)

LP Relaxation

(55)

Linear Programming

To construct the algorithm, we first express the problem as an integer linear program and then formulate an LP relaxation to find T^∗.

D := V(T⁰) ∪ (V(T⁰) × L) is the index set of the variables, with x_e ∈ {0, 1} for all e ∈ D.

For u ∈ V(T⁰): x_u = 1 iff u ∈ V(T^∗).

For (u, ℓ) ∈ V(T⁰) × L: x_(u,ℓ) = 1 iff u ∈ V(T^∗) and Λ^leaf_u ∩ V(T^∗) has a node with label ℓ.

(56)

Linear Programming

The following constraints must hold.

1 x_v ≤ x_u, ∀u ∈ V^int, v ∈ Λ_u.

A child cannot be chosen if its parent is not.

2 x_(u,ℓ) ≤ x_u, ∀u ∈ V(T⁰), ℓ ∈ L.

3 x_(u,ℓ) = x_u, ∀u ∈ V^int, ℓ ∈ dem(u).

If u is present, the labels it demands must be present as well.

4 x_(v,a_v) = x_v, ∀v ∈ V^leaf.

5 $x_{(u,\ell)} = \sum_{v \in \Lambda_u} x_{(v,\ell)}$, ∀u ∈ V^int, ℓ ∈ L.

6 x_(v,ℓ) = 0, ∀v ∈ V^leaf, ℓ ≠ a_v.

7 x_(root(T⁰),ℓ) = 1, ∀ℓ ∈ K.

Global labels must be supplied.
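As a sanity check on the formulation, the sketch below encodes a chosen subtree as a 0/1 vector over D and tests constraints 1 to 7 directly. It assumes node names are strings (so that tuple keys denote pairs (u, ℓ)), that `children` lists only internal nodes, and that `leaf_label` maps each leaf v to a_v; all function names are illustrative.

```python
def encode(children, leaf_label, chosen, labels):
    """0/1 vector over D = V ∪ (V × L) describing the node set `chosen`."""
    nodes = list(children) + list(leaf_label)

    def served(u):
        """Labels of chosen leaves below u, or nothing if u is not chosen."""
        if u not in chosen:
            return set()
        if u in leaf_label:
            return {leaf_label[u]}
        return set().union(*(served(v) for v in children[u]))

    x = {u: int(u in chosen) for u in nodes}
    for u in nodes:
        s = served(u)
        for l in labels:
            x[(u, l)] = int(l in s)
    return x


def satisfies_constraints(children, leaf_label, dem, root, x, labels, global_labels):
    """Evaluate constraints 1-7 of the slide on a vector x."""
    nodes = list(children) + list(leaf_label)
    checks = [
        all(x[v] <= x[u] for u in children for v in children[u]),            # 1
        all(x[(u, l)] <= x[u] for u in nodes for l in labels),                # 2
        all(x[(u, l)] == x[u] for u in children for l in dem.get(u, ())),     # 3
        all(x[(v, leaf_label[v])] == x[v] for v in leaf_label),               # 4
        all(x[(u, l)] == sum(x[(v, l)] for v in children[u])
            for u in children for l in labels),                               # 5
        all(x[(v, l)] == 0 for v in leaf_label
            for l in labels if l != leaf_label[v]),                           # 6
        all(x[(root, l)] == 1 for l in global_labels),                        # 7
    ]
    return all(checks)
```

Constraint 5 holds with equality only when each label is served by at most one chosen leaf below u, which is why the slides assume an optimum tree with that property.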

(57)

Linear Programming

Let P be the polytope containing all vectors x ∈ [0, 1]^D satisfying the constraints above.

With the Sherali-Adams hierarchy, we can find a solution x^∗ ∈ SA(P, O(sh²)) satisfying the lifted constraints with $\sum_{v \in V(T^0)} c_v x^*_v \leq \mathrm{opt}$ in time $|D|^{O(sh^2)} = (sN)^{O(sh^2)}$, using any polynomial-time linear programming algorithm.

(58)

Sketch of Randomized Algorithm

In the next slide, we will present the pseudocode of the randomized algorithm. Here we give some intuition for it.

The solution x^∗ of the linear program can be regarded as the probabilities with which the events happen. Therefore, the algorithm randomly chooses nodes recursively according to x^∗.

The LP polytope is lifted to the O(sh²)-th level so that we can repeatedly condition on events throughout the algorithm.

(59)

Rounding a Lifted Fractional Solution

1: procedure solve(u, L^′, x)
2:     Ṽ ← Ṽ ∪ {u}
3:     if u ∈ V^leaf then return
4:     end if
5:     let S_v ← ∅ for every v ∈ Λ_u
6:     for every ℓ ∈ L^′ do
7:         randomly choose a child v of u with probability x_(v,ℓ)
8:         S_v ← S_v ∪ {ℓ}
9:         x ← x conditioned on the event (v, ℓ)
10:     end for
11:     for every v ∈ Λ_u, with probability x_v do
12:         solve(v, S_v ∪ dem(v), x conditioned on event v)
13:     end for

(60)

Rounding a Lifted Fractional Solution

The tree T̃ is the subtree induced by Ṽ, which is computed by the procedure solve. Initially, Ṽ ← ∅.

Then we call solve(root(T⁰), dem(root(T⁰)), x^∗).

Note that although we have the constraint x_(root(T⁰),ℓ) = 1 for ℓ ∈ K, the solution is fractional and thus there might not be a leaf that "fully" supplies the label. Nevertheless, as claimed before, each global label is supplied with probability at least 1/(h + 1) (Theorem 18).

(61)

Analyzing the Algorithm

We shall show some properties first to make sure the algorithm is well-defined.

Lemma 19

For every recursion of solve that the algorithm invokes,

(a) at the beginning of the recursion, we have x_u = 1 and x_(u,ℓ) = 1 for all ℓ ∈ L^′, and

(b) the random sampling process is well-defined, that is, $\sum_{v \in \Lambda_u} x_{(v,\ell)} = 1$ before each step.

(62)

Analyzing the Algorithm

We prove the lemma by induction along the recursion.

For u = root(T⁰), (a) follows from the definition of x^∗.

If (a) holds for some u ∉ V^leaf, then (b) also holds, since $x_{(u,\ell)} = \sum_{v \in \Lambda_u} x_{(v,\ell)}$.

If (b) holds for some u ∉ V^leaf, then after finishing Loop 6 we have x_(v,ℓ) = 1 for all v ∈ Λ_u and ℓ ∈ S_v. Let x^′ be the vector passed to the sub-recursion at Line 12. We have x^′_v = x^′_(v,ℓ) = 1 for all ℓ ∈ S_v. Also, x^′_(v,ℓ) = x^′_v = 1 for all ℓ ∈ dem(v) by constraint 3. Therefore, (a) holds for every child v ∈ Λ_u.

(63)

Analyzing the Algorithm

Next, we shall show that T̃ is indeed label-consistent.

dem(u) ⊆ L^′ in every call solve(u, L^′, ·), by Line 12 and the first call solve(root(T⁰), dem(root(T⁰)), x^∗).

By Loop 6, each label ℓ ∈ dem(u) is passed down to some leaf v ∈ Λ^leaf_u. By Lemma 19(a), we have x_(v,ℓ) = 1 at the beginning of that recursion. Hence ℓ = a_v by constraint 6.

(64)

Marginal Probabilities

We define some notations for the convenience of the following discussion.

Let x^(u,i) be the value of x after the i-th iteration of Loop 6 in solve(u,·, ·), or the value at the end of loop if it terminates in less than i iterations.

(65)

Marginal Probabilities

The following lemmas state that the marginal probabilities of events are maintained in the random process.

Lemma 20

Let u ∈ V(T⁰), i ∈ [sh], $x^{\mathrm{old}} = x^{(u,i-1)}$, $x^{\mathrm{new}} = x^{(u,i)}$. Let $\mathcal{E}$ be any event determined by the random numbers generated before $x^{\mathrm{old}}$. Then, for every e ∈ D, we have

$\mathbb{E}\left[x^{\mathrm{new}}_e \mid x^{\mathrm{old}}_e, \mathcal{E}\right] = x^{\mathrm{old}}_e$.

Lemma 21

Let u ∈ V^int, v ∈ Λ_u, $x^{\mathrm{old}} = x^{(u,sh)}$, $x^{\mathrm{new}} = x^{(v,0)}$. Let $\mathcal{E}$ be any event determined by the random numbers generated before $x^{\mathrm{old}}$. Then, for every e = v ∈ V(T⁰[u]) or e = (v, ℓ) for some v ∈ V(T⁰[u]), we have

[ ]

(66)

Marginal Probabilities

The aforementioned lemmas can be proved by the definition of expectation and conditioning in the Sherali-Adams Hierarchy. With these lemmas, we could derive the following corollaries.

Corollary 1

For every v ∈ V(T⁰), we have $\Pr[v \in \tilde{V}] = x^*_v$.

Corollary 2

$\mathbb{E}[\mathrm{cost}(\tilde{T})] \leq \mathrm{opt}$.

(67)

Bounding Probabilities

To finish the proof, it suffices to show that each label is provided by T̃ with sufficiently high probability.

Let t_ℓ be the number of nodes v ∈ Ṽ ∩ V^leaf with a_v = ℓ. Then

Lemma 22

$\mathbb{E}[t_\ell] = 1$.

This is easy to prove. By Corollary 1 we have

$\mathbb{E}[t_\ell] = \sum_{v \in V^{\mathrm{leaf}} : a_v = \ell} \Pr[v \in \tilde{V}] = \sum_{v \in V^{\mathrm{leaf}} : a_v = \ell} x^*_v = x^*_{(\mathrm{root}(T^0),\ell)} = 1$.

(68)

Bounding Probabilities

Lemma 23

For every w ∈ V^leaf with a_w = ℓ, we have $\mathbb{E}\left[t_\ell \mid w \in \tilde{V}\right] \leq h + 1$.

We give an approach to prove it. Let h^′ be the depth of w in T⁰. For i = 0, 1, . . . , h^′ − 1, define

$U_i := \{w' \in V^{\mathrm{leaf}} \setminus \{w\} \mid a_{w'} = \ell, \text{ the depth of the LCA of } w \text{ and } w' \text{ is } i\}$.

If we can prove $\mathbb{E}\left[|U_i \cap \tilde{V}| \mid w \in \tilde{V}\right] \leq 1$, then the result follows by summing the inequality over all i = 0, 1, . . . , h^′ − 1 and adding 1 for w itself.

(69)

Bounding Probabilities

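Now we shall prove $\mathbb{E}\left[|U_i \cap \tilde{V}| \mid w \in \tilde{V}\right] \leq 1$ for all 0 ≤ i < h^′. Let u = u_i be the ancestor of w at depth i, and let $(S_v)_{v \in \Lambda_u}$ be the vector before Loop 11 in solve(u, ·, ·). For every w^′ ∈ U_i, the following holds.

$\Pr\left[w' \in \tilde{V} \mid (S_v)_{v \in \Lambda_u}, x^{(u,sh)}, w \in \tilde{V}\right] = \Pr\left[w' \in \tilde{V} \mid (S_v)_{v \in \Lambda_u}, x^{(u,sh)}\right] = \mathbb{E}\left[x^{(w',0)}_{w'} \mid (S_v)_{v \in \Lambda_u}, x^{(u,sh)}\right] = x^{(u,sh)}_{w'}$

The last equality follows from Lemma 20 and Lemma 21.

Summing over all w^′ ∈ U_i, we have

$\mathbb{E}\left[|U_i \cap \tilde{V}| \mid (S_v)_{v \in \Lambda_u}, x^{(u,sh)}, w \in \tilde{V}\right] = \sum_{w' \in U_i} x^{(u,sh)}_{w'} = \sum_{w' \in U_i} x^{(u,sh)}_{(w',\ell)} \leq x^{(u,sh)}_{(u,\ell)} \leq 1$.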

(70)

Bounding Probabilities

Lastly, we will state the following lemma.

Lemma 24

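For every ℓ ∈ K, we have $\mathbb{E}[t_\ell \mid t_\ell \geq 1] \leq h + 1$.

With the lemma above, we can derive that

$1 = \mathbb{E}[t_\ell] = \mathbb{E}[t_\ell \mid t_\ell \geq 1] \cdot \Pr[t_\ell \geq 1] \;\Rightarrow\; \Pr[t_\ell \geq 1] \geq \frac{1}{h+1}$.

This finishes the proof of Theorem 18.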

(71)

Bounding Probabilities

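Finally, we give a derivation of Lemma 24. In the following, w and w^′ in summations range over all nodes in V^leaf with label ℓ.

\begin{align*}
\mathbb{E}[t_\ell \mid t_\ell \geq 1]^2
&\leq \mathbb{E}\left[t_\ell^2 \mid t_\ell \geq 1\right] \\
&= \sum_{w, w'} \Pr\left[w, w' \in \tilde{V} \mid t_\ell \geq 1\right] \\
&= \sum_{w} \left( \Pr\left[w \in \tilde{V} \mid t_\ell \geq 1\right] \sum_{w'} \Pr\left[w' \in \tilde{V} \mid w \in \tilde{V}, t_\ell \geq 1\right] \right) \\
&= \sum_{w} \Pr\left[w \in \tilde{V} \mid t_\ell \geq 1\right] \mathbb{E}\left[t_\ell \mid w \in \tilde{V}\right] \\
&\leq (h + 1) \sum_{w} \Pr\left[w \in \tilde{V} \mid t_\ell \geq 1\right] \\
&= (h + 1)\, \mathbb{E}[t_\ell \mid t_\ell \geq 1]
\end{align*}

$\Rightarrow\; \mathbb{E}[t_\ell \mid t_\ell \geq 1] \leq h + 1$.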

(73)

Hardness Results

(74)

Overview

Definition 25

Quasi-polynomial-time algorithms run in $O(2^{\mathrm{polylog}(n)})$ time. More precisely, $\mathrm{QP} = \bigcup_{c \in \mathbb{N}} \mathrm{DTIME}\left(2^{O(\log^c n)}\right)$.

Definition 26

ZPTIME(f(n)) refers to randomized algorithms that always return the correct answer and have randomized running times with expectation O(f(n)).

Remark

It will be shown that, assuming the Projection Game Conjecture (which will be stated later) and $\mathrm{NP} \not\subseteq \bigcap_{0 < \epsilon < 1} \mathrm{ZPTIME}(2^{n^{\epsilon}})$, the optimal approximation ratio for quasi-polynomial-time algorithms is $O(\log^2 k / \log\log k)$.

(75)

Prior Bounds

Remark

[HK03] shows that, assuming NP ⊄ ZPTIME(n^{polylog(n)}), an approximation ratio of $O(\log^{2-\epsilon} k)$ is not achievable by (quasi-)polynomial-time algorithms.

(76)

Projection Game

Definition 27

(Projection Game) Let G = (U, W; E) be a bipartite graph and Σ a set of labels. We associate each edge (u, w) ∈ E with a projection π_uw : Σ → Σ.

We are to find a labeling f : (U ∪ W) → Σ that assigns a label to each vertex. A labeling f is said to cover an edge (u, w) if π_uw(f(u)) = f(w).

Our goal is then to find a labeling that covers the most edges.
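The objective of the game is easy to evaluate for a fixed labeling. A minimal sketch, assuming projections are stored as dictionaries keyed by edge; the function name `covered_fraction` is illustrative.

```python
def covered_fraction(edges, projections, labeling):
    """Fraction of edges (u, w) with projections[(u, w)][f(u)] == f(w)."""
    covered = sum(
        1 for (u, w) in edges
        if projections[(u, w)][labeling[u]] == labeling[w]
    )
    return covered / len(edges)
```

The Projection Game Conjecture concerns the gap between instances where this fraction can reach 1 and instances where it stays polynomially small for every labeling.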

(77)

Group Steiner Tree

Definition 28

(Group Steiner Tree, GST) Given a (weighted) undirected graph G with n vertices, a root vertex r ∈ V(G), and vertex subsets S₁, . . . , S_k ⊆ V(G) referred to as groups, the goal of GST is to find a minimum-cost subgraph that contains a path from the root to at least one vertex in each group.

Remark

GST can be reduced to DST by first making the edges bi-directed, then adding, for each group S_i, a terminal t_i together with zero-cost edges from each vertex in S_i to t_i.
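The reduction in the remark is short enough to write out. The sketch below assumes undirected edges are given as (u, v, cost) triples and represents the new terminal of group i as the tuple ("terminal", i); both choices, and the name `gst_to_dst`, are illustrative.

```python
def gst_to_dst(edges, groups):
    """Turn a GST instance into a DST instance (arcs, terminals)."""
    arcs = []
    for u, v, cost in edges:
        # Each undirected edge becomes two opposite arcs of the same cost.
        arcs.append((u, v, cost))
        arcs.append((v, u, cost))
    terminals = []
    for i, group in enumerate(groups):
        t = ("terminal", i)
        terminals.append(t)
        # Zero-cost arcs from every group member to the group's terminal.
        for v in group:
            arcs.append((v, t, 0))
    return arcs, terminals
```

The root stays the same, and the number of terminals of the DST instance equals the number of groups k, so hardness results stated in terms of k carry over.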

(78)

(Randomized) Exponential Time Hypothesis

Definition 29

((Randomized) Exponential Time Hypothesis) There exists some δ > 0 such that 3-SAT cannot be solved in (randomized) $O(2^{\delta n})$ time.

Remark

The paper uses the slightly weaker formulation $\mathrm{NP} \not\subseteq \bigcap_{0 < \epsilon < 1} \mathrm{ZPTIME}(2^{n^{\epsilon}})$, though we still assume ETH for simplicity. (The general idea is the same.)

(79)

Projection Game Conjecture

Definition 30

(Projection Game Conjecture) There exists a c > 0 such that, for every 0 < ϵ ≤ c, any SAT instance ϕ of size n can be efficiently reduced to a Projection Game on an ($n^{\mathrm{poly}(\epsilon)}$-regular bipartite) graph with $n^{1+o(1)}$ vertices and $|\Sigma| = O(n^{\mathrm{poly}(\epsilon)})$. Furthermore, the game satisfies the following.

If ϕ is satisfiable, then there is a labeling covering every edge.

Otherwise, there exists no labeling covering more than an $n^{-\epsilon}$ fraction of the edges.

(80)

ETH for Projection Games

By comparing the coefficients, we can derive

Corollary 31

Assuming the previous hypotheses, for any 0 ≤ ϵ < c, there is no $O(2^{n^{\epsilon}})$-time algorithm that outputs an $n^{\epsilon}$-approximation of the Projection Game.

(81)

Polylogarithmic Inapproximability

Theorem 32

Consider an instance ψ of the Projection Game on a ∆-regular n-vertex bipartite graph G with the label set Σ of size σ. For any parameter 1 ≤ h ≤ O(log²n), there is a randomized reduction from ψ to a GST instance on a tree T with k groups such that $|V(T)| = (\sigma n)^h$ and $k = \Delta n^h$, satisfying the following with high probability.

If there is a labeling covering all edges of G, then there is a feasible solution to the GST instance with cost h².

If there is no labeling covering a γ fraction of the edges, then there is no feasible solution to the GST instance with cost less than $\min(\gamma^{-1/2} h, \Omega(h \log k))$.

The proof can be found in [HK03].

(82)

Lower Bound

By choosing the parameters carefully, we obtain

Theorem 33

Assuming Corollary 31, there exists no quasi-polynomial-time algorithm that approximates GST (and therefore DST) to a ratio of o(log²k/ log log k) or o(log²N/ log log N).

(83)

References I

[Cha+99] Moses Charikar et al. “Approximation Algorithms for Directed Steiner Problems”. In: Journal of Algorithms 33.1 (1999), pp. 73–91. issn: 0196-6774. doi: https://doi.org/10.1006/jagm.1999.1042. url: https://www.sciencedirect.com/science/article/pii/S0196677499910428.

[Che+14] Joseph Cheriyan et al. “Approximating Rooted Steiner Networks”. In: ACM Trans. Algorithms 11.2 (Oct. 2014). issn: 1549-6325. doi: 10.1145/2650183. url: https://doi.org/10.1145/2650183.

[CP05] Chandra Chekuri and M. Pal. “A recursive greedy algorithm for walks in directed graphs”. In: 46th Annual IEEE Symposium on Foundations of Computer Science (FOCS’05). 2005, pp. 245–253. doi: 10.1109/SFCS.2005.9.

(84)

References II

[GKR10] Anupam Gupta, Ravishankar Krishnaswamy, and R. Ravi. “Tree Embeddings for Two-Edge-Connected Network Design”. In: Proceedings of the Twenty-First Annual ACM-SIAM Symposium on Discrete Algorithms. SODA ’10. Austin, Texas: Society for Industrial and Applied Mathematics, 2010, pp. 1521–1538. isbn: 9780898716986.

[GKR98] Naveen Garg, Goran Konjevod, and R. Ravi. “A Polylogarithmic Approximation Algorithm for the Group Steiner Tree Problem”. In: Proceedings of the Ninth Annual ACM-SIAM Symposium on Discrete Algorithms. SODA ’98. San Francisco, California, USA: Society for Industrial and Applied Mathematics, 1998, pp. 253–259. isbn: 0898714109.

(85)

References III

[GLL18] Fabrizio Grandoni, Bundit Laekhanukit, and Shi Li. O(log²k/ log log k)-Approximation Algorithm for Directed Steiner Tree: A Tight Quasi-Polynomial-Time Algorithm. 2018. arXiv: 1811.03020 [cs.DS].

[HK03] Eran Halperin and Robert Krauthgamer. “Polylogarithmic Inapproximability”. In: Proceedings of the Thirty-Fifth Annual ACM Symposium on Theory of Computing. STOC ’03. San Diego, CA, USA: Association for Computing Machinery, 2003, pp. 585–594. isbn: 1581136749. doi: 10.1145/780542.780628. url: https://doi.org/10.1145/780542.780628.

Shunhua Jiang et al. Faster Dynamic Matrix Inverse for Faster LPs. 2020. arXiv: 2004.07470 [cs.DS].

(86)

References IV

[Lae14] Bundit Laekhanukit. “Parameters of Two-Prover-One-Round Game and the Hardness of Connectivity Problems”. In: Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms. SODA ’14. Portland, Oregon: Society for Industrial and Applied Mathematics, 2014, pp. 1626–1643. isbn: 9781611973389.

[Rot12] Thomas Rothvoß. Directed Steiner Tree and the Lasserre Hierarchy. 2012. arXiv: 1111.5473 [cs.DS].

[SA90] Hanif Sherali and Warren Adams. “A Hierarchy of Relaxations Between the Continuous and Convex Hull Representations for Zero-One Programming Problems”. In: SIAM J. Discrete Math. 3 (May 1990), pp. 411–430. doi: 10.1137/0403036.