Chapter 6: Warehouse-Scale Computers to Exploit Request-Level and Data-Level Parallelism

Academic year: 2021

Computer Architecture

A Quantitative Approach, Sixth Edition

Copyright © 2019, Elsevier Inc. All rights reserved.

Warehouse-scale computer (WSC)

Provides Internet services

Search, social networking, online maps, video sharing, online shopping, email, cloud computing, etc.

Differences with HPC “clusters”:

Clusters have higher-performance processors and networks

Clusters emphasize thread-level parallelism, WSCs emphasize request-level parallelism

Differences with datacenters:

Datacenters consolidate different machines and software into one location

Datacenters emphasize virtual machines and hardware heterogeneity in order to serve varied customers

Introduction

Important design factors for WSC:

Cost-performance

Small savings add up

Energy efficiency

Affects power distribution and cooling

Work per joule

Dependability via redundancy

Network I/O

Interactive and batch processing workloads

Finding ample computational parallelism is not a concern

Most jobs are totally independent

“Request-level parallelism”

Operational costs count

Power consumption is a primary, not secondary, constraint when designing the system

Scale and its opportunities and problems

Can afford to build customized systems since WSCs require volume purchases

Location counts

Real estate, power cost; Internet, end-user, and workforce availability

Computing efficiently at low utilization

Scale and the opportunities/problems associated with scale

Unique challenges: custom hardware, failures

Unique opportunities: bulk discounts

Efficiency and Cost of WSC

Location of WSC

Proximity to Internet backbones, electricity cost, property tax rates, low risk from earthquakes, floods, and hurricanes

Power distribution

Batch processing framework: MapReduce

Map: applies a programmer-supplied function to each logical input record

Runs on thousands of computers

Provides new set of key-value pairs as intermediate values

Reduce: collapses values using another programmer-supplied function

Programming Models and Workloads for WSCs

Example:

map(String key, String value):
    // key: document name
    // value: document contents
    for each word w in value:
        EmitIntermediate(w, "1");  // emit one (word, "1") pair per occurrence

reduce(String key, Iterator values):
    // key: a word
    // values: a list of counts
    int result = 0;
    for each v in values:
        result += ParseInt(v);  // convert each count string to an integer and accumulate
    Emit(AsString(result));
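The word-count pseudocode above can be approximated on a single machine. The following is a minimal Python sketch, not the actual MapReduce runtime: the names `map_fn`, `reduce_fn`, and `run_mapreduce` are hypothetical, and the in-memory grouping step stands in for the shuffle that a real framework performs across thousands of servers.

```python
from collections import defaultdict


def map_fn(key, value):
    """key: document name, value: document contents.
    Yields one (word, "1") pair per word occurrence."""
    for word in value.split():
        yield word, "1"


def reduce_fn(key, values):
    """key: a word, values: an iterable of count strings.
    Returns the total count as a string."""
    return str(sum(int(v) for v in values))


def run_mapreduce(documents):
    """Single-process stand-in for the runtime: map, group by key, reduce."""
    intermediate = defaultdict(list)
    for name, contents in documents.items():
        for word, count in map_fn(name, contents):
            intermediate[word].append(count)
    return {word: reduce_fn(word, counts) for word, counts in intermediate.items()}


if __name__ == "__main__":
    docs = {"a.txt": "the cat sat", "b.txt": "the cat ran"}
    print(run_mapreduce(docs))
```

Because each map call depends only on its own input record, the calls could in principle run on different machines without coordination, which is the property the framework exploits.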

Availability:

Use replicas of data across different servers

Use relaxed consistency:

No need for all replicas to always agree

File systems: GFS and Colossus

Databases: Dynamo and BigTable

MapReduce runtime environment schedules map and reduce tasks to WSC nodes

Workload demands often vary considerably

Scheduler assigns tasks based on completion of prior tasks

Tail latency/execution time variability: single slow task can hold up large MapReduce job

Runtime libraries replicate tasks near end of job
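To see why one slow task matters, and why replicating tasks near the end of a job helps, here is a small Python sketch under simplifying assumptions: a job finishes when its slowest task finishes, a backup copy is launched at a fixed time for every task still running, and each backup takes a fixed duration. The function names and all numbers are hypothetical.

```python
def job_time(task_durations):
    """A job completes only when its slowest task completes."""
    return max(task_durations)


def job_time_with_backups(task_durations, backup_start, backup_duration):
    """Launch a backup copy at `backup_start` for every task still running.
    Each such task finishes when either the original or the backup finishes."""
    finish_times = []
    for duration in task_durations:
        if duration > backup_start:
            finish_times.append(min(duration, backup_start + backup_duration))
        else:
            finish_times.append(duration)
    return max(finish_times)


if __name__ == "__main__":
    durations = [10, 11, 9, 60]  # seconds; the last task is a straggler
    print(job_time(durations))
    print(job_time_with_backups(durations, backup_start=12, backup_duration=10))
```

In this toy example the straggler stretches the job to 60 seconds, while a backup launched at 12 seconds lets it finish at 22 seconds, at the cost of some duplicated work.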

Computer Architecture of WSC

WSCs often use a hierarchy of networks for interconnection

Each 19-inch rack holds 48 1U servers connected to a rack switch

Rack switches are uplinked to switches higher in the hierarchy

Uplink has 6 to 24 times lower bandwidth

Goal is to maximize locality of communication relative to the rack
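The 6 to 24 times figure can be read as an oversubscription ratio. A rough Python sketch, assuming a rack switch with 48 server-facing ports and between 2 and 8 uplink ports, all running at the same speed (the port counts for the uplinks are an assumption chosen to reproduce the stated range):

```python
def oversubscription_ratio(server_ports: int, uplink_ports: int) -> float:
    """Ratio of bandwidth inside the rack to bandwidth leaving the rack,
    assuming every port runs at the same speed."""
    if uplink_ports <= 0:
        raise ValueError("a rack switch needs at least one uplink port")
    return server_ports / uplink_ports


if __name__ == "__main__":
    # Hypothetical configurations: 48 server ports with 8 or 2 uplink ports.
    print(oversubscription_ratio(48, 8))
    print(oversubscription_ratio(48, 2))
```

The higher the ratio, the more it pays to place communicating tasks and their data in the same rack.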

Storage options:

Use disks inside the servers, or

Network-attached storage (NAS) through InfiniBand

WSCs generally rely on local disks

Google File System (GFS) uses local disks and maintains at least three replicas

Array Switch

Switch that connects an array of racks

Array switch should have 10× the bisection bandwidth of a rack switch

Cost of an n-port switch grows as n²

Often utilize content-addressable memory chips and FPGAs
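The quadratic cost growth means that port count scales cost much faster than linearly. A minimal Python sketch, assuming cost is exactly proportional to the square of the port count (a simplification; the function name and port counts are hypothetical):

```python
def relative_switch_cost(ports: int, baseline_ports: int) -> float:
    """Cost of a switch relative to a baseline switch,
    assuming cost grows as the square of the number of ports."""
    return (ports / baseline_ports) ** 2


if __name__ == "__main__":
    # Doubling the port count quadruples the cost under this model.
    print(relative_switch_cost(96, 48))
    # Ten times the ports costs a hundred times as much.
    print(relative_switch_cost(480, 48))
```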

Servers can access DRAM and disks on other servers using a NUMA-style interface

WSC Memory Hierarchy

Infrastructure and Costs of WSC

Cooling

Air conditioning used to cool server room

64°F to 71°F

Keep temperature higher (closer to 71°F)

Cooling towers can also be used

Minimum achievable temperature is the “wet-bulb temperature”

Cooling system also uses water (evaporation and spills)

E.g. 70,000 to 200,000 gallons per day for an 8 MW facility

Power cost breakdown:

Chillers: 30-50% of the power used by the IT equipment

Air conditioning: 10-20% of the IT power, mostly due to fans

How many servers can a WSC support?

Each server:

“Nameplate power rating” gives maximum power consumption

To get the actual figure, measure power under real workloads

Oversubscribe cumulative server power by 40%, but monitor power closely
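The capacity reasoning above can be sketched in Python. This assumes that oversubscribing by 40% means deploying 40% more servers than the nameplate ratings alone would allow. The 8 MW budget, 500 W nameplate rating, and 350 W measured draw are hypothetical values chosen for illustration.

```python
def servers_by_nameplate(power_budget_w: int, nameplate_w: int) -> int:
    """Servers that fit if every server drew its nameplate maximum at once."""
    return power_budget_w // nameplate_w


def servers_oversubscribed(power_budget_w: int, nameplate_w: int,
                           oversubscribe_pct: int = 40) -> int:
    """Servers deployed when the nameplate count is oversubscribed by a percentage."""
    base = servers_by_nameplate(power_budget_w, nameplate_w)
    return base * (100 + oversubscribe_pct) // 100


def within_budget(server_count: int, measured_w: int, power_budget_w: int) -> bool:
    """Monitoring check: does the measured draw still fit the budget?"""
    return server_count * measured_w <= power_budget_w


if __name__ == "__main__":
    budget_w = 8_000_000   # hypothetical power budget
    nameplate_w = 500      # hypothetical nameplate rating per server
    count = servers_oversubscribed(budget_w, nameplate_w)
    print(servers_by_nameplate(budget_w, nameplate_w), count)
    print(within_budget(count, measured_w=350, power_budget_w=budget_w))
```

The check in `within_budget` is why close monitoring matters: if measured draw per server rose toward the nameplate value, the oversubscribed fleet would exceed the budget.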

Determining the maximum server capacity

Nameplate power rating: maximum power that a server can draw

Better approach: measure under various workloads

Oversubscribe by 40%

Typical power usage by component:

Processors: 42%

DRAM: 12%

Disks: 14%

Networking: 5%

Cooling: 15%

Power overhead: 8%

Miscellaneous: 4%
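As a quick sanity check, the percentages listed above sum to 100%. The Python sketch below applies them to a total power figure. The 8,000 kW total is a hypothetical input, and treating the shares as fractions of a single total is an assumption.

```python
# Typical power usage by component, in percent (from the list above).
POWER_SHARE_PCT = {
    "Processors": 42,
    "DRAM": 12,
    "Disks": 14,
    "Networking": 5,
    "Cooling": 15,
    "Power overhead": 8,
    "Miscellaneous": 4,
}


def split_power(total_kw: int) -> dict:
    """Split a total power figure across components using the typical shares."""
    return {name: total_kw * pct // 100 for name, pct in POWER_SHARE_PCT.items()}


if __name__ == "__main__":
    assert sum(POWER_SHARE_PCT.values()) == 100
    for name, kw in split_power(8000).items():  # hypothetical 8,000 kW total
        print(f"{name}: {kw} kW")
```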

Power Utilization Effectiveness (PUE)

PUE = Total facility power / IT equipment power

Median PUE in a 2006 study was 1.69
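A minimal Python sketch of the PUE ratio follows. The power figures are hypothetical and were chosen so that the result matches the 1.69 median quoted above.

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Total facility power divided by IT equipment power.
    A value of 1.0 would mean no overhead for cooling or power distribution."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_kw / it_equipment_kw


if __name__ == "__main__":
    # Hypothetical facility: 8,000 kW of IT load, 13,520 kW drawn in total.
    print(pue(13_520, 8_000))
```

Under these numbers, every watt delivered to IT equipment costs an extra 0.69 W of overhead.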

Performance

Latency is an important metric because it is seen by users

Bing study: users will use search less as response time increases

Service Level Objectives (SLOs)/Service Level Agreements (SLAs)

E.g., 99% of requests must be below 100 ms
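An SLO of this form can be checked with a percentile over measured latencies. The Python sketch below uses the nearest-rank percentile definition (one of several in common use) and hypothetical latency samples. It also shows how an average can look healthy while the 99th percentile violates the objective.

```python
def percentile(samples, pct: int):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    ordered = sorted(samples)
    rank = (pct * len(ordered) + 99) // 100  # integer ceiling of pct*n/100
    return ordered[max(rank, 1) - 1]


def meets_slo(samples, pct: int = 99, limit_ms: float = 100) -> bool:
    """True if the pct-th percentile latency is below the limit."""
    return percentile(samples, pct) < limit_ms


if __name__ == "__main__":
    # Hypothetical measurements: 98 fast requests and 2 very slow ones.
    latencies_ms = [20] * 98 + [500] * 2
    print(sum(latencies_ms) / len(latencies_ms))  # mean latency
    print(percentile(latencies_ms, 99))           # 99th percentile latency
    print(meets_slo(latencies_ms))
```

Here the mean is well under 100 ms, yet only 98% of requests meet the limit, so the objective is missed.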

Measuring Efficiency of a WSC

Capital expenditures (CAPEX)

Cost to build a WSC

$9 to $13 per watt

Operational expenditures (OPEX)

Cost to operate a WSC
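The per-watt construction cost translates directly into a CAPEX range for a facility of a given size. A small Python sketch, assuming the cost applies per watt of facility capacity, with a hypothetical 8 MW facility as input:

```python
def facility_capex_range(capacity_w: int, low_usd_per_w: int = 9,
                         high_usd_per_w: int = 13):
    """Return (low, high) construction cost in dollars for a given capacity."""
    return capacity_w * low_usd_per_w, capacity_w * high_usd_per_w


if __name__ == "__main__":
    low, high = facility_capex_range(8_000_000)  # hypothetical 8 MW facility
    print(f"${low:,} to ${high:,}")
```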

Cloud Computing

Amazon Web Services

Virtual Machines: Linux/Xen

Low cost

Open source software

Initially no guarantee of service

No contract

Cloud Computing Growth

Fallacies and Pitfalls

Cloud computing providers are losing money

AWS has a margin of 25%, Amazon retail 3%

Focusing on average performance instead of 99th percentile performance

Using too wimpy a processor when trying to improve WSC cost-performance

Inconsistent measurement of PUE by different companies

Capital costs of the WSC facility are higher than for the servers that it houses

Trying to save power with inactive low-power modes versus active low-power modes

Given improvements in DRAM dependability and the fault tolerance of WSC systems software, there is no need to spend extra for ECC memory in a WSC

Coping effectively with microsecond (e.g., Flash and 100 GbE) delays as opposed to nanosecond or millisecond delays

Turning off hardware during periods of low activity improves the cost-performance of a WSC
