
Hierarchical Scheduling for Diverse Datacenter Workloads


(1)

Hierarchical Scheduling for Diverse Datacenter Workloads

Arka A. Bhattacharya, David Culler, Ali Ghodsi, Scott Shenker, and Ion Stoica

University of California, Berkeley

Eric Friedman

International Computer Science Institute, Berkeley

ACM SoCC’13

(2)

Hierarchical Scheduling

A feature of cloud schedulers.

Enables scheduling resources to reflect organizational priorities.

(3)

Hierarchical Share Guarantee

Assign to each node in the weighted tree some guaranteed share of the resources.

A node nL is guaranteed to get at least a share x of the resources from its parent, where x equals

$$x = \frac{W_L}{\sum_{n_i \in A(C(P(n_L)))} W_i}$$

Wi: weight of node ni

P(): parent of a node

C(): the set of children of a node

A(): the subset of demanding nodes
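
The guarantee can be illustrated with a minimal Python sketch. It assumes one reading of the definitions above: a node's guaranteed share is its weight divided by the total weight of its parent's demanding children, itself included. The function name, the dictionary input format, the sibling weights, and the choice to return zero for a non-demanding node are all hypothetical.

```python
def guaranteed_share(weights, demanding, node):
    """Fraction of the parent's resources guaranteed to `node`.

    weights:   weight of every child of one parent (hypothetical input format)
    demanding: names of the children that currently demand resources
    """
    if node not in demanding:
        return 0.0  # assumption: a non-demanding node needs no guarantee
    total = sum(weights[name] for name in demanding)
    return weights[node] / total

siblings = {"a": 1, "b": 1, "c": 2}  # hypothetical weights
share_all = guaranteed_share(siblings, {"a", "b", "c"}, "a")  # 1 / (1 + 1 + 2)
share_idle_c = guaranteed_share(siblings, {"a", "b"}, "a")    # 1 / (1 + 1)
```

When sibling c stops demanding, the guarantee of a rises from 0.25 to 0.5, because only demanding siblings appear in the denominator.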

(4)

Example

Given 480 servers

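[Figure: weighted hierarchy tree dividing the 480 servers among its nodes; the node labels and tree layout were lost in extraction.]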

(5)

Multi-resource Scheduling

Workloads in data centers tend to be diverse.

CPU-intensive, memory-intensive, or I/O-intensive.

Ignoring the actual resource needs of jobs leads to poor performance isolation and low throughput for jobs.

(6)

Dominant Resource Fairness (DRF)

A generalization of max-min fairness to multiple resource types.

Maximize the minimum dominant shares of users in the system.

Dominant share si is the maximum share among all of a user's resource shares.

Dominant resource is the resource corresponding to the dominant share.

$$s_i = \max_{1 \le j \le m} \left\{ \frac{u_{i,j}}{r_j} \right\}$$

(7)

Example

Dominant resource

Job 1: memory

Job 2: CPU

Dominant share

60%

(8)

How DRF Works

Given a set of users, each with a resource demand vector.

The resources required to execute one job.

Starts with every user being allocated with zero resources.

Repeatedly picks the user with the lowest dominant share.

Launches one of the user's jobs if there are enough resources available in the system.

(9)

Example

System with 9 CPUs and 18 GB RAM.

User A: <1 CPU, 4 GB>

User B: <3 CPUs, 1 GB>
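
The loop from the previous slide can be traced on this example with a short Python sketch. It is an illustration, not the authors' implementation. The function name and data layout are hypothetical, ties go to the user listed first, and a user whose next task does not fit is skipped (one possible reading of "if there are enough resources").

```python
def drf_allocate(capacity, demands):
    """Progressive-filling DRF: launch one task at a time for the user with
    the lowest dominant share, until no user's next task fits."""
    used = [0] * len(capacity)
    alloc = {user: [0] * len(capacity) for user in demands}
    tasks = {user: 0 for user in demands}

    def dominant_share(user):
        return max(a / c for a, c in zip(alloc[user], capacity))

    def fits(user):
        return all(u + d <= c for u, d, c in zip(used, demands[user], capacity))

    while True:
        runnable = [user for user in demands if fits(user)]
        if not runnable:
            return tasks, alloc
        user = min(runnable, key=dominant_share)  # ties: first listed user
        for j, d in enumerate(demands[user]):
            alloc[user][j] += d
            used[j] += d
        tasks[user] += 1

# 9 CPUs and 18 GB RAM; demand vectors are <CPUs, GB> per task.
tasks, alloc = drf_allocate([9, 18], {"A": [1, 4], "B": [3, 1]})
# tasks == {"A": 3, "B": 2}; A holds <3 CPUs, 12 GB>, B holds <6 CPUs, 2 GB>
```

Both users end with a dominant share of 2/3: memory for A (12 of 18 GB) and CPU for B (6 of 9 CPUs).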

(10)

Hierarchical DRF (H-DRF)

Static H-DRF

Collapsed hierarchies

Naive H-DRF

Dynamic H-DRF

(11)

Static H-DRF

A static version of DRF to handle hierarchies.

Algorithm

Given the hierarchy structure and the amount of resources in the system.

Starts with every leaf node allocated zero resources.

Repeatedly allocates resources to a leaf node until no more resources can be assigned to any node.

(12)

Resource Allocation in Static H-DRF

Start at the root of the tree and traverse down to a leaf.

At each step, pick the demanding child that has the smallest dominant share.

Internal nodes are assigned the sum of all the resources assigned to their immediate children.

Allocate the leaf node an ε amount of its resource demands.

Increases the node’s dominant share by ε.
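
A minimal Python sketch of this traversal follows. It rests on several assumptions that are not in the slides: each resource's capacity is normalized to 1, leaf demand vectors are scaled so their largest entry is 1, a child counts as demanding while some leaf below it can still receive ε of its demand, and ties go to the first child. The `Node` class and the two-level tree are illustrative and not necessarily the slide's figure.

```python
from fractions import Fraction

class Node:
    def __init__(self, name, weight=1, demand=None, children=()):
        self.name = name
        self.weight = Fraction(weight)
        self.children = list(children)
        self.demand = demand  # leaves only: demand scaled so its largest entry is 1
        self.alloc = [Fraction(0)] * len(demand) if demand else None

def allocation(node):
    """A leaf's own allocation; an internal node's is the sum over its children."""
    if not node.children:
        return node.alloc
    return [sum(col) for col in zip(*(allocation(c) for c in node.children))]

def dominant_share(node):
    return max(allocation(node)) / node.weight

def can_grow(node, used, eps):
    """True if some leaf under `node` can still receive eps of its demand."""
    if not node.children:
        return all(u + eps * d <= 1 for u, d in zip(used, node.demand))
    return any(can_grow(c, used, eps) for c in node.children)

def static_hdrf(root, num_resources, eps=Fraction(1, 100)):
    used = [Fraction(0)] * num_resources  # each capacity is normalized to 1
    while can_grow(root, used, eps):
        node = root
        while node.children:  # walk down, always taking the least-served child
            candidates = [c for c in node.children if can_grow(c, used, eps)]
            node = min(candidates, key=dominant_share)
        for j, d in enumerate(node.demand):
            node.alloc[j] += eps * d
            used[j] += eps * d

n11 = Node("n1,1", demand=[1, 1])
n21 = Node("n2,1", demand=[1, 0])
n22 = Node("n2,2", demand=[0, 1])
root = Node("nr", children=[Node("n1", children=[n11]),
                            Node("n2", children=[n21, n22])])
static_hdrf(root, num_resources=2)
```

With equal weights on n1 and n2, the run ends with n1,1 holding <1/2, 1/2>, n2,1 holding <1/2, 0>, and n2,2 holding <0, 1/2>, so both internal nodes reach a dominant share of 1/2. Exact fractions are used so that the ε steps add up without rounding error.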

(13)

Example

Given 10 CPUs and 10 GPUs.

(14)

Weakness of Static H-DRF

Re-calculating the static H-DRF allocations for each of the leaves and arrivals from scratch is computationally infeasible.

(15)

Collapsed Hierarchies

Converts a hierarchical scheduler into a flat one and applies the weighted DRF algorithm.

Works when only one resource is involved.

Violates the hierarchical share guarantee for internal nodes in the hierarchy.

(16)

Example

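Given

[Figure: the original hierarchy; the tree layout was lost in extraction.]

Flatten

Leaves under the root nr, with demand vectors and flattened weights:

n1,1 <1,1>: 50%

n2,1 <1,0>: 25%

n2,2 <0,1>: 25%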

(17)

Weighted DRF

Each user i is associated with a weight vector Wi = {wi,1, …, wi,m}.

wi,j represents the weight of user i for resource j.

Dominant share:

$$s_i = \max_{1 \le j \le m} \left\{ \frac{u_{i,j}}{w_{i,j}\, r_j} \right\}$$

(18)

Weighted DRF in Collapsed Hierarchies

Each node ni has a weight wi.

Let wi,j = wi for 1≦j≦m

The ratio between the dominant resources allocated to user a and user b equals wa/wb.

$$s_i = \max_{1 \le j \le m} \left\{ \frac{u_{i,j}}{w_i\, r_j} \right\}$$

(19)

Example

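Given

[Figure: the original hierarchy; the tree layout was lost in extraction.]

Collapsed Hierarchies

Leaves under the root nr, with demand vectors and weights:

n1,1 <1,1>: 50%

n2,1 <1,0>: 25%

n2,2 <0,1>: 25%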
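
The collapsed example can be checked with a few lines of Python. This is a sketch under stated assumptions: each resource's capacity is normalized to 1, every leaf's dominant share grows in proportion to its weight until the first resource saturates, and n2,1 and n2,2 are the only children of an internal node n2 that holds half of the root's weight. The function name and input format are hypothetical.

```python
from fractions import Fraction

def weighted_drf_first_saturation(weights, demands):
    """Grow each user's dominant share in proportion to its weight and stop
    when the first resource saturates. Returns allocation vectors as shares."""
    num_resources = len(next(iter(demands.values())))
    # load[j]: share of resource j consumed per unit of the common scale factor
    load = [sum(weights[u] * demands[u][j] for u in demands)
            for j in range(num_resources)]
    scale = min(1 / l for l in load if l > 0)
    return {u: [scale * weights[u] * d for d in demands[u]] for u in demands}

weights = {"n1,1": Fraction(1, 2), "n2,1": Fraction(1, 4), "n2,2": Fraction(1, 4)}
demands = {"n1,1": [1, 1], "n2,1": [1, 0], "n2,2": [0, 1]}
flat = weighted_drf_first_saturation(weights, demands)

# Consumption of the (assumed) internal node n2: the sum over its two children.
n2_total = [a + b for a, b in zip(flat["n2,1"], flat["n2,2"])]
```

Here both resources saturate at the same point, so this is also the final allocation: n1,1 receives <2/3, 2/3>, while n2 as a whole receives only <1/3, 1/3>. That is less than the half of the cluster n2 is entitled to, which is the violation of the hierarchical share guarantee for internal nodes.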

(20)

Naive H-DRF

A natural adaptation of the original DRF to the hierarchical setting.

The hierarchical share guarantee is violated for leaf nodes.

Starvation

(21)

Example

Static H-DRF

Naive H-DRF

Dominant share = 1.0

(22)

Dynamic H-DRF

Does not suffer from starvation.

Satisfies the hierarchical share guarantee.

Two key features:

Rescaling to minimum nodes

Ignoring blocked nodes

(23)

Rescaling to Minimum Nodes

Compute the resource consumption of an internal node as follows:

Find the demanding child with minimum dominant share M.

Rescale every child's resource consumption vector so that its dominant share becomes M.

Add all the children's rescaled vectors to get the internal node's resource consumption vector.
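
The three steps above can be sketched in Python as follows. The function name is hypothetical, the input is assumed to hold only the demanding children's consumption vectors expressed as shares of the cluster, and weights are left out for brevity.

```python
from fractions import Fraction

def rescaled_consumption(children):
    """Consumption vector of an internal node under rescaling to the minimum.

    children: consumption vectors (shares of the cluster) of the demanding children.
    """
    dominant = [max(vec) for vec in children]
    m_min = min(dominant)  # M: the minimum dominant share among the children
    total = [Fraction(0)] * len(children[0])
    for vec, dom in zip(children, dominant):
        # Shrink the child's vector so that its dominant share becomes M.
        scale = m_min / dom if dom > 0 else Fraction(0)
        total = [t + scale * x for t, x in zip(total, vec)]
    return total

# Two children with consumption <0.4, 0> and <0, 1>.
parent = rescaled_consumption([[Fraction(2, 5), 0], [0, Fraction(1)]])
# parent == [Fraction(2, 5), Fraction(2, 5)], i.e. <0.4, 0.4>
```

Without rescaling, the parent would be charged <0.4, 1> and a dominant share of 1.0. With rescaling, the second child is counted as <0, 0.4>, so the parent's dominant share is 0.4.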

(24)

Example

Given 10 CPUs and 10 GPUs.

After n2,1 finishes a job and releases 1 CPU:

[Figure: hierarchy annotated with consumption vectors, including <0.4, 0>, <0, 1>, <0, 0.4>, and <0.4, 0.4>, and with dominant shares, one of them 0.5; the tree layout was lost in extraction.]

(25)

Ignoring Blocked Nodes

Dynamic H-DRF considers only non-blocked nodes for rescaling.

A leaf node is blocked if either

Any of the resources it requires are saturated.

The node is non-demanding.

An internal node is blocked if all of its children are blocked.
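
These rules translate directly into a recursive check. The sketch below uses a hypothetical dictionary layout for nodes and identifies saturated resources by index. It illustrates the definition only and is not the scheduler's actual code.

```python
def is_blocked(node, saturated):
    """node: dict with 'children' for internal nodes, or 'demand' and
    'demanding' for leaves (hypothetical layout).
    saturated: set of indices of resources with no free capacity."""
    children = node.get("children", [])
    if not children:
        needs_saturated = any(d > 0 and j in saturated
                              for j, d in enumerate(node["demand"]))
        return needs_saturated or not node["demanding"]
    # An internal node is blocked only if every child is blocked.
    return all(is_blocked(child, saturated) for child in children)

# Resource 0 is CPU, resource 1 is GPU.
n21 = {"demand": [1, 0], "demanding": True}
n22 = {"demand": [0, 1], "demanding": True}
n2 = {"children": [n21, n22]}

cpu_full = {0}
blocked_now = [is_blocked(n21, cpu_full), is_blocked(n22, cpu_full),
               is_blocked(n2, cpu_full)]
# blocked_now == [True, False, False]
```

When only the CPU is saturated, n2,1 is blocked but n2 is not, because n2,2 can still use GPUs. Under Dynamic H-DRF, n2,1 would then be left out of the rescaling step for n2.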

(26)

Example

Static H-DRF

Without blocked

Dominant share = 1/3

(27)

Allocation Properties

Hierarchical Share Guarantees

Group Strategy-proofness

No group of users can misrepresent their resource requirements in such a way that all of them are weakly better off, and at least one of them is strictly better off.

Recursive Scheduling

Not Population Monotonic

Population monotonicity (PM): any node exiting the system should not decrease the resource allocation to any other node in the hierarchy tree.

(28)

Example

(29)

Evaluation - Hierarchical Sharing

49 Amazon EC2 servers

Dominant resource:

n1,1, n2,1, n2,2: CPU

n1,2: GPU

(30)

Result

Pareto efficiency: no node in the hierarchy can be allocated an extra task on the cluster without reducing the share of some other node.

(31)

Conclusion

Proposed H-DRF, a hierarchical multi-resource scheduler.

Avoids job starvation and maintains the hierarchical share guarantee.

Future work

DRF under placement constraints.

Efficient allocation vector update.
