
Chapter 2 Background and Related Work

2.6 Other Factors and Comparison Table

Some papers consider other factors such as dynamic voltage and frequency scaling (DVFS). However, papers such as [7] have pointed out that this technique saves little energy compared with resizing the number of active servers. Another reason we do not use it is that, as the number of cores within a CPU increases, such scaling may incur more VM migrations, because VMs working at the same frequency must be consolidated onto the same destination machine. A qualitative comparison of the above-mentioned related work and the proposed TD-D is shown in Table 1.

Table 1. Comparison of different resource allocation algorithms for cloud data centers

Approach Resizing

Chapter 3 Problem Formulation

In this chapter we describe the energy consumption model, the cost function and the problem formulation of our minimum energy consumption resource allocation algorithm.

3.1 Energy Consumption Model and Cost Function

To solve the minimum energy consumption problem, we transform it into an optimization problem and build a cost function that captures the cost of a cloud data center. To simplify auto-scaling at both the server and VM levels, we use a homogeneous model in which all servers have the same capacity and energy consumption, and likewise all VMs. This assumption reduces our problem to a simple discrete packing problem in which every server has the same number of slots to host VMs and every VM has the same capacity to handle requests. We now need an energy consumption model for a running server. We refer to the energy consumption model introduced in [11], which is

$$P(u) = P_{idle} + P_{busy} \cdot u \tag{1}$$

Here, the energy consumption of a running server is a linear function of its utilization, where P_idle is the base energy consumption when the server is idle, P_busy is the additional energy consumption when the server is at 100% utilization, and u is the CPU utilization, which lies in the interval [0, 1]. We modify the linear model of (1) into a discrete step function:

$$P(n) = P_{server} + n \cdot P_{VM} \tag{2}$$

where n is the number of VMs currently running on that server, P_server is the base energy consumption of an idle server (like P_idle), and P_VM is the additional energy consumption of each VM. The transformation from (1) to (2) implies that every VM is fully loaded; this is reasonable because an auto-scaling algorithm always resizes its resource provisioning to closely match the actual demand.
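As a minimal sketch, the linear model and its step-function counterpart can be written as two small Python functions. The function names, the slot count `slots`, and the numeric values are illustrative assumptions. The relation P_busy = slots × P_VM used in the check is only one reading of the statement that every VM is fully loaded (n fully loaded VMs on a server with `slots` slots correspond to u = n / slots).

```python
def linear_power(u, p_idle, p_busy):
    """Linear model: idle power plus a part proportional to utilization u."""
    if not 0.0 <= u <= 1.0:
        raise ValueError("utilization must lie in [0, 1]")
    return p_idle + p_busy * u


def step_power(n, p_server, p_vm):
    """Step model: idle server power plus a fixed amount per hosted VM."""
    if n < 0:
        raise ValueError("number of VMs must be non-negative")
    return p_server + n * p_vm


# Assumed illustration values: 6 slots per server, P_server = 9, P_VM = 1.
slots, p_server, p_vm = 6, 9, 1
for n in range(slots + 1):
    # With every VM fully loaded, n VMs correspond to utilization n / slots,
    # so both models agree when P_busy = slots * P_VM.
    assert abs(step_power(n, p_server, p_vm)
               - linear_power(n / slots, p_server, p_vm * slots)) < 1e-9
```

Under these assumptions the step model simply samples the linear model at the utilizations 0, 1/slots, ..., 1.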

Now we can build up our cost function. Table 2 lists the symbols used in our cost function. There are two kinds of cost, the operating cost and the switching cost.

Operating cost is the energy consumption of running servers and VMs, while switching cost includes the cost to switch on/off servers and startup/shutdown VMs.

For simplicity, we let the costs of switching a particular resource on and off be the same. We can combine the operating cost and the switching cost in one cost function by introducing the break-even time parameter. The break-even time is the time period over which the operating cost of a particular resource equals twice its switching cost. For example, for a server, it is

$$P_{server} \cdot \Delta_{server} = 2\,\delta_{server} \tag{3}$$

where ∆_server is the break-even time of a server and δ_server is the switching cost of a server. For a VM, we can use the same idea of break-even time and obtain a similar relation:

$$P_{VM} \cdot \Delta_{VM} = 2\,\delta_{VM} \tag{4}$$

Assume that there are N apps and that the prediction window size is W time slots. Then we can build our cost function as follows:

Minimize

Subject to

(6)

(7)

The cost function (5) is simply the sum of all operating costs and switching costs over W+1 time slots. Constraints (6) and (7) require that the VMs allocated to each application be sufficient for its minimum resource demand and that the active servers be able to host all VMs at any moment. We require that a time slot be long enough to complete any desired resource rearrangement (server switch on/off, VM startup/shutdown). We also assume that a workload predictor is deployed in our system, so that we can obtain the predicted workload of each application for several time slots ahead.
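For concreteness, one way to write a cost function that matches this description is sketched below. The symbols x_{i,t} (VMs allocated to app i in time slot t), s_t (active servers in time slot t) and d_{i,t} (predicted demand of app i in time slot t) are labels assumed here for the quantities listed in Table 2, C is read as the number of VM slots per server, and the summation limits are an assumption. It should be read as an illustration of the structure, not as the original (5) to (7).

```latex
\begin{aligned}
\text{Minimize}\quad
  & \sum_{t=0}^{W}\Bigl(P_{server}\,s_t + P_{VM}\sum_{i=1}^{N} x_{i,t}\Bigr)
    + \sum_{t=1}^{W}\Bigl(\delta_{server}\,\lvert s_t - s_{t-1}\rvert
    + \delta_{VM}\sum_{i=1}^{N}\lvert x_{i,t} - x_{i,t-1}\rvert\Bigr) \\
\text{subject to}\quad
  & x_{i,t} \ge d_{i,t}, \qquad 1 \le i \le N,\; 0 \le t \le W, \\
  & C\,s_t \ge \sum_{i=1}^{N} x_{i,t}, \qquad 0 \le t \le W.
\end{aligned}
```

The first sum collects the operating cost of servers and VMs in every time slot; the second charges the switching cost for every resource that is turned on or off between consecutive time slots.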

Table 2. Symbols used in the cost function

Variable Definition

The number of VMs allocated to the ith app in time slot t

The number of active servers in time slot t

The predicted resource demand (in terms of number of required VMs) of ith app in time slot t

Constant Definition

3.2 Difficulties of Resizing at Both VM and Server Levels

In most cases, such a discrete resizing problem is relatively easy if we have predicted workload data. Intuitively, to minimize the total of operating cost and switching cost, we can allocate resources to exactly match the predicted demand whenever the demand rises or stays unchanged relative to the previous time slot. If the demand falls, we check whether it will return to the previous level within a break-even time, and accordingly keep the resource or switch it off. But the problem becomes complicated if we want to resize at both the VM and server levels, because the number of allocated VMs affects the number of servers that must be active. Keeping an unnecessary VM that will be needed again within a break-even time does not always yield the minimum energy consumption. We give two examples here. In the first example, consider a VM break-even time event, that is, the demand for a VM drops and then returns within a break-even time. In most cases we keep that VM. However, if shutting down that VM allows its host server to be switched off, and if the total resource demand of the data center is lower in the time slot in which the VM is needed again, shutting it down may be the better choice, since we can start that VM on another server and thus save the operating cost of the original host server. We can use (8) to express this relation:

(8)

In (8), the left-hand side is the cost of releasing a VM, and the right-hand side is the cost of keeping that VM. Note that the right-hand side does not include the switching cost of the host server since, whether or not we decide to keep that VM, the host server is

temporarily unnecessary VM and meanwhile, some new VMs from other applications are about to be activated, we may need to switch on a new server to support the increased capacity demand. In fact, things are usually more complicated than the above examples if the break-even time period is long and several overlapping but not simultaneous VM break-even time events occur. In this case, they form a chain of break-even time events among several applications, and these events may affect one another. In such cases, there is no simple rule for deciding the optimal amount of allocated resources.
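The single-resource rule described at the start of this section (match rising demand; on a drop, keep a resource only if the demand returns within a break-even time) can be sketched as follows. This is a minimal illustration for one resource type in isolation, with a hypothetical function name. It deliberately ignores the coupling between VMs and servers, which is exactly what makes the two-level problem hard.

```python
def single_level_provision(demand, break_even):
    """Greedy provision for one resource type, given predicted demand per slot.

    break_even is the break-even time expressed in time slots.
    """
    provision = []
    for t, need in enumerate(demand):
        prev = provision[-1] if provision else need
        if need >= prev:
            # Demand rises or stays level: match it exactly.
            provision.append(need)
            continue
        # Demand falls: keep only the units that are needed again
        # within the next break_even slots.
        window = demand[t + 1 : t + 1 + break_even]
        peak = max(window, default=need)
        provision.append(max(need, min(prev, peak)))
    return provision


print(single_level_provision([3, 1, 2, 3], break_even=1))  # [3, 2, 2, 3]
```

In the printed example one VM is kept idle in the second slot because it is needed again one slot later, while the other is released because its demand returns only after the break-even time.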

Chapter 4 Proposed Time-Directed Dijkstra Algorithm

In this chapter, we introduce the proposed resizing algorithm called Time-Directed Dijkstra (TD-D), which reaches the minimum energy consumption by considering both operating cost and switching cost. That is, it finds a minimum cost solution of (5) in Chapter 3.

4.1 Preliminaries of the Algorithm

Since a resizing problem does not consider the placement of resources, we reduce (5) to (N+1) kinds of resources over all W time slots: the number of allocated VMs for each of the N apps, plus the number of active servers.

First we define a new term called number-of-combinations. It gives the best known tight upper bound on the optimal resource provision, that is, on the optimal number of allocated VMs for a particular application, or the optimal number of active servers, in a particular time slot. It is defined as follows:

minimum resource demand ≤ optimal resource provision ≤ minimum resource demand + number-of-combinations - 1    (9)

That is, if the minimum resource demand of a particular resource in a particular time slot is m, then the optimal resource provision must be an element of the resource sequence {m, m + 1, m + 2, … , m + number-of-combinations - 1}. From the description in Chapter 3, we know that the number-of-combinations differs from 1 only when there is a break-even time event for that kind of resource. To find the number-of-combinations for each kind of resource in each time slot, we examine whether a break-even time event occurs. If it does, we check every unit by which the predicted resource demand has dropped to see whether the demand will climb back to that level within the following break-even time. For example, assume that the resource demand of a particular application in four consecutive time slots is 3, 1, 2, 3, and that ∆_VM is 1. From our definition, the number-of-combinations of these four consecutive time slots are 1, 2, 1, 1. Note that the number-of-combinations in the second time slot is not 3, because the resource demand climbs back to 3 only after the break-even time period.

Since the resource demand of servers is affected by the number of allocated VMs, we need a two-pass scan to determine the number-of-combinations of each kind of resource. In the first pass, we determine the number-of-combinations of the N apps in all W time slots. Since the maximum number of active servers must be enough to accommodate the maximum number of allocated VMs, in the second pass we determine the number-of-combinations of servers in all W time slots using the maximum possible number of VMs from the first pass. Once we complete the two-pass scan, we have (N+1) number-of-combinations in each time slot, and we can find the resource combination set in all W time slots. In each time slot, the resource combination set is an (N+1)-ary Cartesian product over the (N+1) resource sequences of that time slot.


Figure 1. Illustration of the Time-Directed Dijkstra algorithm

Here we introduce the terminology used in our TD-D algorithm. As illustrated in Fig. 1, the problem can be understood as finding the minimum cost path, or shortest path, from the source node (the root node in the figure) to the destination node (the leaf node in the figure). The graph is composed of W+1 levels, and edges connect any two adjacent levels. A “level” is the group of nodes that reside in the same time slot. Each node stands for a resource allocation state in that time slot, that is, an element of the resource combination set of that time slot. An edge represents a switching process from one resource allocation state to another. Since a path can travel from any node in the upper level to any node in the lower level, any two adjacent levels form a complete bipartite graph. Clearly, the number of nodes in a particular level equals the cardinality of the resource combination set of that level.

We call it “Time-Directed Dijkstra” because it uses the same idea as Dijkstra’s algorithm to find the minimum cost path (shortest path), with a few differences. First, the graph is directed, and each path can only be traversed along the time axis. Second, unlike conventional weighted graphs, both nodes and edges are weighted, and three types of weights (costs) are used, as illustrated in Fig. 2. Oper_x is the operating cost of a node x; Switch_xy is the switching cost from node x to node y; and Low_x is the lower bound cost (operating cost plus switching cost) from node x to the leaf node. The operating cost of each node and the switching cost of each edge remain unchanged, while the lower bound cost is updated throughout the algorithm.

Figure 2. An example of deriving a minimum cost path

4.2 Time-Directed Dijkstra Algorithm

Now we introduce the whole process of our Time-Directed Dijkstra Algorithm (TD-D). First, we use the two-pass scan to obtain all number-of-combinations and the resource combination set in every time slot, thus producing all nodes in every level.

Then we update the lower bound cost of every node in reverse order, that is, from the leaf level to the root level. Using Fig. 2 again as an example, Low_a will be min(Low_b + Switch_ab, Low_c + Switch_ac) plus Oper_a. In this way, node_a iteratively checks all nodes in the next level and updates its lower bound cost, and finally Low_a becomes the minimum cost from node_a to the leaf node. The whole process is shown in Algorithm 1.

Algorithm 1 Time-Directed Dijkstra

Two-pass scan to generate all nodes in each level
Set Low_leaf = Oper_leaf and Low = ∞ for other nodes
for t = W - 1 to 0 do
    for i = 1 to cardinality of resource combination set at level t do
        for j = 1 to cardinality of resource combination set at level t + 1 do
            Low_i = min(Oper_i + Low_j + Switch_ij, Low_i)
Output Low of root node and the corresponding minimum cost path
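A minimal Python sketch of the backward update in Algorithm 1 is given below. It assumes that the nodes of every level have already been generated, and it encodes a node as a tuple (VMs of app 1, ..., VMs of app N, active servers). The cost functions `oper` and `switch`, their default parameter values, and the toy input are illustrative assumptions, not the settings of the original implementation.

```python
def oper(node, p_server=9, p_vm=1):
    """Operating cost of a state (vms_app_1, ..., vms_app_N, servers)."""
    *vms, servers = node
    return p_server * servers + p_vm * sum(vms)


def switch(a, b, d_server=13.5, d_vm=1):
    """Switching cost between two states of adjacent levels."""
    *vms_a, servers_a = a
    *vms_b, servers_b = b
    vm_changes = sum(abs(x - y) for x, y in zip(vms_a, vms_b))
    return d_server * abs(servers_a - servers_b) + d_vm * vm_changes


def td_dijkstra(levels, oper_cost=oper, switch_cost=switch):
    """Backward pass over the levels; returns (minimum cost, path).

    levels[0] holds the root node and levels[-1] the leaf node(s).
    """
    low = [{} for _ in levels]    # Low value of every node, per level
    succ = [{} for _ in levels]   # best successor of every node, per level
    for leaf in levels[-1]:
        low[-1][leaf] = oper_cost(leaf)
    for t in range(len(levels) - 2, -1, -1):
        for i in levels[t]:
            # Equivalent to relaxing Low_i against every node j of level t + 1.
            best = min(levels[t + 1],
                       key=lambda j: low[t + 1][j] + switch_cost(i, j))
            low[t][i] = oper_cost(i) + low[t + 1][best] + switch_cost(i, best)
            succ[t][i] = best
    node = min(levels[0], key=low[0].get)
    cost, path = low[0][node], [node]
    for t in range(len(levels) - 1):
        node = succ[t][node]
        path.append(node)
    return cost, path


# Toy input: one app on one server, demand 3, 1, 1, 1, 3.
levels = [[(3, 1)], [(1, 1), (3, 1)], [(1, 1), (3, 1)], [(1, 1), (3, 1)], [(3, 1)]]
print(td_dijkstra(levels))
```

With the assumed VM switching cost of 1 (a break-even time of two slots), the three idle slots exceed the break-even time, so the cheapest path releases the two spare VMs and restarts them later. Doubling the VM switching cost makes keeping them the cheaper path.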

4.3 Correctness and Complexity Issues

For the correctness of the algorithm, we use Fig. 2 again as an example. In Fig. 2, it is easy to see that if node_a chooses node_b as its best descendant toward the leaf node, then for any ancestor of node_a, say node_d, the best path through node_a must be d-a-b, not d-a-c. The best path obtained at the lower level remains valid when we move on to the upper level. That is to say, when we iteratively update the lower bound cost of all nodes level by level, in reverse order, we always produce the minimum cost path to the leaf node. In fact, we can also run this algorithm in the opposite direction, that is, updating from the root node to the leaf node, and it will output the same result. The structure shown in Fig. 2 is reversible.
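The reversibility claim can be checked on a small example. The sketch below computes the minimum path cost once from the leaf level upward and once from the root level downward. The toy graph and the two cost functions are hypothetical and only serve the comparison.

```python
def backward_cost(levels, oper, switch):
    """Minimum path cost computed from the leaf level up to the root level."""
    low = {node: oper(node) for node in levels[-1]}
    for t in range(len(levels) - 2, -1, -1):
        low = {i: oper(i) + min(low[j] + switch(i, j) for j in levels[t + 1])
               for i in levels[t]}
    return min(low.values())


def forward_cost(levels, oper, switch):
    """Minimum path cost computed from the root level down to the leaf level."""
    high = {node: oper(node) for node in levels[0]}
    for t in range(1, len(levels)):
        high = {j: oper(j) + min(high[i] + switch(i, j) for i in levels[t - 1])
                for j in levels[t]}
    return min(high.values())


# Hypothetical toy graph: a node is (number of VMs, number of servers).
def oper(node):
    return 9 * node[1] + node[0]


def switch(a, b):
    return 14 * abs(a[1] - b[1]) + abs(a[0] - b[0])


levels = [[(3, 1)], [(1, 1), (2, 1), (3, 1)], [(2, 1), (3, 1)], [(3, 1)]]
assert backward_cost(levels, oper, switch) == forward_cost(levels, oper, switch)
print(backward_cost(levels, oper, switch))  # 48
```

Both directions agree because every path cost is the same sum of node and edge costs regardless of the order in which it is accumulated.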

The other issue is the complexity of this algorithm, which we analyze in two parts. The first part is the outer for loop, whose size is proportional to the prediction window size W; this part has polynomial time complexity. The other part is the two inner for loops. These two loops have the same order of complexity, since each iterates over the resource combination set of one level, whose size we represent by the average cardinality over all levels. The average cardinality of the resource combination set is strongly influenced by the severity of workload fluctuation, the workload trend, and the number of applications in the data center. If many applications have smooth, monotonically increasing, or monotonically decreasing workloads, the cardinality of the resource combination set will be small, since there are few break-even time events to consider. But in the worst case, which is almost impossible in the real world, all applications experience severe break-even time events simultaneously, and the resource combination set becomes very large.

The order of the average cardinality of the resource combination set is O((average number-of-combinations in a break-even time event) ^ (average number of break-even time events in a level)). This is exponential, but our algorithm only considers promising resource combinations, namely those arising from break-even time events, which keeps it practical. In our simulation, in almost all cases the Time-Directed Dijkstra algorithm finishes within three minutes under the common scenario used in public cloud data centers.
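To make the growth concrete, the following sketch counts the nodes of every level and the number of (i, j) pairs visited by the two inner loops, given the number-of-combinations of every resource in every level. The input values and function names are hypothetical.

```python
from math import prod


def level_sizes(counts_per_level):
    """Cardinality of the resource combination set of every level."""
    return [prod(counts) for counts in counts_per_level]


def relaxations(counts_per_level):
    """Number of (i, j) pairs visited by the two inner loops over all levels."""
    sizes = level_sizes(counts_per_level)
    return sum(a * b for a, b in zip(sizes, sizes[1:]))


# Hypothetical number-of-combinations for 3 resources over 4 levels.
counts = [[1, 1, 1], [2, 3, 1], [1, 3, 2], [1, 1, 1]]
print(level_sizes(counts))   # [1, 6, 6, 1]
print(relaxations(counts))   # 48
```

A level without break-even time events contributes a single node, so only levels in which several resources have simultaneous events enlarge the search space.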

4.4 System Architecture

The proposed TD-D algorithm is not a standalone component; it needs to work with other components to perform its function. Fig. 3 illustrates the system architecture for our TD-D algorithm and its workflow when time slot = t. First, as illustrated in (a), the workload monitor monitors and records, in real time, the workload of each application running on the computing cloud. Since the raw monitored data are voluminous and unordered, the workload monitor processes them into ordered, usable statistics, usually the peak workload of each application in that time slot, and then sends these statistics to the reactive controller and the workload predictor, as illustrated in (b). In (c), once the reactive controller receives the monitored workload data from the workload monitor, it performs its only but essential function: dynamically switching on VMs/servers if any resource under-allocation is detected. This function is critical for a proactive, long-term controller like our TD-D algorithm, since prediction errors always exist and resource under-allocation is inevitable. We include this reactive, short-term controller in our system architecture as an auxiliary to our long-term, proactive controller and as a countermeasure to prediction error. Another component that uses the workload statistics is the workload predictor. As mentioned in the previous chapter, we use an existing workload prediction technique to deploy a workload predictor in our system, which provides the workload prediction of each application for the following W time slots. Finally, as illustrated in (d), the proactive controller receives the updated prediction data from the workload predictor, runs our TD-D algorithm, and then sends control messages to the computing cloud to perform any desired resource reallocation, as illustrated in (e).

Figure 3. Resource management system architecture and its workflow, assuming time slot = t. (a) Real-time workload data gathered from the Computing Cloud. (b) Real-time workload statistics, usually using the peak workload within a time slot. (c) Dynamically switch on new VMs/servers if under-allocation is detected. (d) Provide updated workload prediction from t + 1 to t + W. (e) Perform dynamic resource allocation at the beginning of each time slot, according to the direction of the proposed Time-Directed Dijkstra algorithm.
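The role of the reactive controller in step (c) can be illustrated with a small sketch. The function name, its arguments, and the assumption that workload is already expressed as a number of required VMs per application are hypothetical; the original system may detect under-allocation differently.

```python
from math import ceil


def reactive_adjust(observed, allocated_vms, active_servers, slots_per_server):
    """Return (vms, servers) after topping up any under-allocated resource.

    observed and allocated_vms are per-application VM counts. Resources are
    only added here, never released; releasing is left to the proactive
    controller.
    """
    vms = [max(have, need) for have, need in zip(allocated_vms, observed)]
    servers = max(active_servers, ceil(sum(vms) / slots_per_server))
    return vms, servers


print(reactive_adjust([4, 2, 5], [3, 2, 6], 2, slots_per_server=6))
# ([4, 2, 6], 2)
```

In the printed example only the first application is under-allocated, and the two active servers still have enough slots for the twelve resulting VMs.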

Chapter 5 Evaluation

In this chapter, we introduce our experiment settings, the comparison approaches we used, the experimental results and discussion.

5.1 Experiment Settings

We build a simulation environment for our evaluation. First, we define the parameters used in our evaluation, which are listed in Table 3. Note that the energy consumption unit we use here is a relative unit, so there is no energy unit such as joule or kWh. That is, we set P_server or P_VM = 1, and then obtain δ_server and δ_VM from the values of ∆_server and ∆_VM by applying (3) and (4). The values of the break-even times ∆_server and ∆_VM can be determined by measuring the energy consumption on real servers and VMs, or by operator policy. We also use the energy consumption measurement data from [12] for our operating cost parameters. The peak energy consumption of a server with two Intel Xeon X5550 quad-core CPUs is 248 W, and the parameter values used in the evaluation are listed in Table 3.
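As a hedged illustration of how a break-even time is turned into a switching cost, the sketch below assumes that (3) and (4) both have the form "operating cost × break-even time = 2 × switching cost", as described in Section 3.1. The numeric inputs are assumptions based on one reading of Table 3 (operating costs 9 and 1, break-even times of 3 and 2 time slots), and the function name is hypothetical.

```python
def switching_cost(operating_cost, break_even_time):
    """Switching cost implied by the relation P * Delta = 2 * delta."""
    return operating_cost * break_even_time / 2


# Assumed inputs: P_server = 9, P_VM = 1, break-even times of
# 3 time slots (server) and 2 time slots (VM).
delta_server = switching_cost(9, 3)
delta_vm = switching_cost(1, 2)
print(delta_server, delta_vm)  # 13.5 1.0
```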

Table 3. Parameter settings used in the evaluation

| Parameter | Definition | Value |
| --- | --- | --- |
| N | The number of applications | 30 |
| MAX_VM_APP | The maximum number of VMs which can be allocated to each application | |
| NUM_SERVER | The number of available servers in the data center | ceiling(NUM_APP × MAX_VM_APP / C) |
| W | Prediction window size | 7 time slots |
| P_server | Same as defined in Table 2 | 9 |
| P_VM | Same as defined in Table 2 | 1 |
| ∆_server | Same as defined in (3) | 3 |
| ∆_VM | Same as defined in (4) | 2 |
| C | Same as defined in Table 2 | 6 |

Then we implement a synthetic workload generator, which provides the predicted workload of every app for the following W time slots. Intuitively, we might think of a workload generator whose workload follows a Gaussian distribution. But such a workload generator mostly generates workload that fluctuates around the mean
