
Large-scale cluster management at Google with Borg


Academic year: 2022


(1)

Large-scale cluster management at Google with Borg

Google Inc.

(2)

Agenda

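Goals of Borg (1, 2.1)

How users describe the compute resource requirements of jobs and tasks (2.3, 2.5)

Units of compute resource allocation (2.2, 2.4)

Scheduling algorithm and resource allocation (3.2, 6.2)

Evaluation methodology (5.1)

Experimental results (5.2, 5.4, 5.5)

Lessons Learned (8)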

(3)

Agenda

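Goals of Borg (1, 2.1)

How users describe the compute resource requirements of jobs and tasks (2.3, 2.5)

Units of compute resource allocation (2.2, 2.4)

Scheduling algorithm and resource allocation (3.2, 6.2)

Evaluation methodology (5.1)

Experimental results (5.2, 5.4, 5.5)

Lessons Learned (8)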

(4)

Borg

A cluster manager.

◦Runs hundreds of thousands of jobs from many thousands of different applications.

◦Across a number of clusters, each with up to tens of thousands of machines.

With very high reliability and availability.

(5)

Workloads

Heterogeneous workload with two main parts.

◦Long-running services

Handle short-lived, latency-sensitive requests.

High priority (prod).

◦Batch jobs

Take a few seconds to a few days to complete.

Low priority (non-prod).

(6)

Agenda

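Goals of Borg (1, 2.1)

How users describe the compute resource requirements of jobs and tasks (2.3, 2.5)

Units of compute resource allocation (2.2, 2.4)

Scheduling algorithm and resource allocation (3.2, 6.2)

Evaluation methodology (5.1)

Experimental results (5.2, 5.4, 5.5)

Lessons Learned (8)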

(7)

Jobs and Tasks

Job

◦Runs in one Borg cell.

◦Consists of many tasks.

◦Has properties and constraints.

name, owner, number of tasks, priority.

Task

◦Maps to a set of Linux processes running in a container on a machine.

◦Has properties and constraints.

Resource requirements (CPU cores, RAM, disk space, disk access rate, TCP ports, etc.).
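The job and task properties listed on this slide can be pictured as a small data structure. The following is only a sketch: the class names, field names, and units are hypothetical and are not Borg's actual configuration format, and it assumes for simplicity that every task in a job has the same requirements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskSpec:
    """Per-task resource requirements (hypothetical field names and units)."""
    cpu_cores: float
    ram_bytes: int
    disk_bytes: int
    tcp_ports: int = 0


@dataclass(frozen=True)
class JobSpec:
    """Job-level properties; assumes all tasks share one TaskSpec."""
    name: str
    owner: str
    priority: int
    task_count: int
    task: TaskSpec

    def total_request(self) -> dict:
        """Aggregate resources requested by all tasks of the job."""
        return {
            "cpu_cores": self.task.cpu_cores * self.task_count,
            "ram_bytes": self.task.ram_bytes * self.task_count,
            "disk_bytes": self.task.disk_bytes * self.task_count,
        }


# Illustrative job: 10 identical tasks, each asking for half a core and 2 GiB RAM.
job = JobSpec(
    name="web-frontend",
    owner="alice",
    priority=200,
    task_count=10,
    task=TaskSpec(cpu_cores=0.5, ram_bytes=2 * 1024**3, disk_bytes=10 * 1024**3, tcp_ports=1),
)
print(job.total_request())
```

Keeping the aggregate request as a plain vector of quantities makes it easy to compare against a quota or a machine's free capacity.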

(8)

Jobs and Tasks (Cont.)

(9)

Jobs and Tasks (Cont.)

Non-overlapping priority bands

◦Monitoring, production, batch, and best effort.

◦Tasks from higher-priority jobs can preempt lower-priority ones.

◦Tasks in the production priority band are not allowed to preempt one another.
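The preemption rules above can be sketched as a single predicate. The numeric ranges assigned to each band below are invented for illustration; the slides only state that the bands do not overlap and name them in decreasing order of priority.

```python
# Hypothetical numeric ranges for the four non-overlapping bands.
BANDS = [
    ("best_effort", 0, 99),
    ("batch", 100, 199),
    ("production", 200, 299),
    ("monitoring", 300, 399),
]


def band_of(priority: int) -> str:
    """Return the name of the band that contains the given priority."""
    for name, low, high in BANDS:
        if low <= priority <= high:
            return name
    raise ValueError(f"priority {priority} is outside every band")


def can_preempt(candidate: int, victim: int) -> bool:
    """True if a task with priority `candidate` may preempt one with `victim`."""
    if candidate <= victim:
        return False
    # Production tasks never preempt one another, even at different priorities.
    if band_of(candidate) == "production" and band_of(victim) == "production":
        return False
    return True


print(can_preempt(250, 150))  # production over batch
print(can_preempt(250, 210))  # production over production
```

The second call returns `False` even though 250 is greater than 210, which is exactly the exception the slide describes for the production band.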

(10)

Jobs and Tasks (Cont.)

Jobs with insufficient quota are immediately rejected upon submission.

◦Quota: a vector of resource quantities.

(CPU, RAM, disk space, etc.)

◦Higher-priority quota costs more.
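Admission control against a quota vector amounts to a component-wise comparison. The sketch below assumes quota and requests are plain dictionaries keyed by resource name; the function name and the units are hypothetical.

```python
def admit(request: dict, remaining_quota: dict) -> bool:
    """Accept a job only if every requested quantity fits in the remaining quota.

    A resource missing from the quota is treated as a quota of zero.
    """
    return all(
        amount <= remaining_quota.get(resource, 0)
        for resource, amount in request.items()
    )


# Illustrative quota for one user at one priority level.
quota = {"cpu_cores": 100, "ram_gib": 400, "disk_gib": 2000}

small_job = {"cpu_cores": 20, "ram_gib": 80}
large_job = {"cpu_cores": 20, "ram_gib": 500}

print(admit(small_job, quota))  # fits in every dimension
print(admit(large_job, quota))  # RAM request exceeds the quota
```

Because the check is per dimension, a job is rejected as soon as any single resource exceeds its quota, regardless of how much headroom the others have.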

(11)

Agenda

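Goals of Borg (1, 2.1)

How users describe the compute resource requirements of jobs and tasks (2.3, 2.5)

Units of compute resource allocation (2.2, 2.4)

Scheduling algorithm and resource allocation (3.2, 6.2)

Evaluation methodology (5.1)

Experimental results (5.2, 5.4, 5.5)

Lessons Learned (8)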

(12)

Architecture (Cont.)

Cell

◦A set of heterogeneous machines that run jobs in a cluster.

◦Median cell size: about 10k machines.

Alloc

◦A reserved set of resources on a machine.

(13)

Agenda

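Goals of Borg (1, 2.1)

How users describe the compute resource requirements of jobs and tasks (2.3, 2.5)

Units of compute resource allocation (2.2, 2.4)

Scheduling algorithm and resource allocation (3.2, 6.2)

Evaluation methodology (5.1)

Experimental results (5.2, 5.4, 5.5)

Lessons Learned (8)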

(14)

Scheduler

The scheduling algorithm consists of two parts.

◦Feasibility checking: find machines on which the task could run.

◦Scoring: pick one of the feasible machines.

Spreading load vs. best-fit

Use a hybrid method to reduce the amount of stranded resources, i.e. ones that cannot be used because another resource on the machine is fully allocated.
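The two phases can be sketched as a filter followed by a ranking. The slides do not give the scoring formula, so the imbalance metric below is an invented stand-in for "amount of stranded resources": it prefers machines whose leftover CPU and RAM fractions stay close to each other after placement. All names and the machine data are hypothetical.

```python
from typing import Optional


def feasible(machine: dict, task: dict) -> bool:
    """Feasibility check: the machine has enough of every requested resource."""
    return all(machine["free"][r] >= amount for r, amount in task.items())


def stranding_score(machine: dict, task: dict) -> float:
    """Illustrative score: spread between leftover fractions after placement.

    A large spread means one resource is nearly exhausted while another is
    mostly idle, so the idle one is likely to be stranded. Lower is better.
    """
    leftover = [
        (machine["free"][r] - amount) / machine["capacity"][r]
        for r, amount in task.items()
    ]
    return max(leftover) - min(leftover)


def pick_machine(machines: list, task: dict) -> Optional[str]:
    """Return the name of the best feasible machine, or None if there is none."""
    candidates = [m for m in machines if feasible(m, task)]
    if not candidates:
        return None
    return min(candidates, key=lambda m: stranding_score(m, task))["name"]


capacity = {"cpu": 16, "ram": 64}
machines = [
    {"name": "a", "capacity": capacity, "free": {"cpu": 8, "ram": 8}},
    {"name": "b", "capacity": capacity, "free": {"cpu": 6, "ram": 20}},
    {"name": "c", "capacity": capacity, "free": {"cpu": 2, "ram": 60}},
]
print(pick_machine(machines, {"cpu": 4, "ram": 4}))
```

In this example machine `c` fails the feasibility check (too little CPU), and `b` is chosen over `a` because its leftover CPU and RAM fractions are more balanced.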

(15)

Performance Isolation

To help with overload and over-commitment.

Latency-sensitive (LS) tasks vs. the rest (batch).

LS tasks are capable of temporarily starving batch tasks for several seconds.

Compressible vs. non-compressible resources.

When a machine runs out of non-compressible resources, low-priority tasks are terminated.

When a machine runs out of compressible resources, usage is throttled (favoring LS tasks).
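The two reactions to resource pressure can be sketched as one dispatch function. This is a simplified illustration, not Borg's implementation: the resource classification follows the usual split (CPU and disk I/O bandwidth can be rate-limited, memory and disk space cannot), while the function name, the task records, and the "throttle every batch task" policy are assumptions.

```python
COMPRESSIBLE = {"cpu", "disk_io"}
NON_COMPRESSIBLE = {"memory", "disk_space"}


def relieve_pressure(resource: str, deficit: float, tasks: list) -> list:
    """Return the (action, task_name) pairs needed to cover `deficit`."""
    if resource in NON_COMPRESSIBLE:
        # Nothing can be rate-limited here: terminate tasks from lowest to
        # highest priority until enough of the resource has been freed.
        actions = []
        for task in sorted(tasks, key=lambda t: t["priority"]):
            if deficit <= 0:
                break
            actions.append(("terminate", task["name"]))
            deficit -= task["usage"][resource]
        return actions
    if resource in COMPRESSIBLE:
        # Favor latency-sensitive tasks: only batch tasks are throttled.
        return [("throttle", t["name"]) for t in tasks if not t["latency_sensitive"]]
    raise ValueError(f"unknown resource: {resource}")


tasks = [
    {"name": "search", "priority": 250, "latency_sensitive": True,
     "usage": {"memory": 4, "cpu": 2}},
    {"name": "logs", "priority": 100, "latency_sensitive": False,
     "usage": {"memory": 2, "cpu": 1}},
    {"name": "index", "priority": 150, "latency_sensitive": False,
     "usage": {"memory": 3, "cpu": 1}},
]
print(relieve_pressure("memory", 4, tasks))
print(relieve_pressure("cpu", 1, tasks))
```

For the memory shortfall the two lowest-priority tasks are terminated and `search` survives; for the CPU shortfall nothing is killed and only the batch tasks are slowed down.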

(16)

Agenda

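Goals of Borg (1, 2.1)

How users describe the compute resource requirements of jobs and tasks (2.3, 2.5)

Units of compute resource allocation (2.2, 2.4)

Scheduling algorithm and resource allocation (3.2, 6.2)

Evaluation methodology (5.1)

Experimental results (5.2, 5.4, 5.5)

Lessons Learned (8)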

(17)

Combined vs Segregated

(18)

Agenda

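Goals of Borg (1, 2.1)

How users describe the compute resource requirements of jobs and tasks (2.3, 2.5)

Units of compute resource allocation (2.2, 2.4)

Scheduling algorithm and resource allocation (3.2, 6.2)

Evaluation methodology (5.1)

Experimental results (5.2, 5.4, 5.5)

Lessons Learned (8)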

(19)

Lessons Learned

The bad:

◦Jobs are restrictive as the only grouping mechanism for tasks.

◦One IP address per machine complicates things.

◦Optimizing for power users at the expense of casual ones.

(20)

Lessons Learned (Cont.)

The good:

◦Allocs are useful.

◦Cluster management is more than task management.

◦Introspection is vital.

◦The master is the kernel of a distributed system.

(21)

Conclusion

Virtually all of Google’s cluster workloads have switched to use Borg over the past decade.

They continue to evolve it, and have applied the lessons they learned from it to Kubernetes.
