Chapter 3 Proposed SLA-aware Load Balancing Scheme for Cloud Datacenters
3.2 Neural network-based dynamic weighted round-robin (nn-dwrr) scheduling
In this paper, we focus on dynamically adjusting the weight of each VM. We propose a novel neural network-based load balancing algorithm, called nn-dwrr (neural network-based dynamic weighted round-robin), to dispatch requests to appropriate VMs based on their weights. A weight should be able to reflect the current capacity of a VM. We give each active VM a weight according to the capacity index (CIi) from Load Monitor and the neural index (NIi) from Load Prediction. The Request Scheduler module distributes the requests to active VMs by their weights assigned by the Weight Adjustment module.
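The text does not spell out the order in which the Request Scheduler picks VMs for a given set of weights. As a minimal sketch only, the following uses a "smooth" weighted round-robin, which is an assumption and not necessarily the dispatch rule used by nn-dwrr. The function name `smooth_wrr` is hypothetical, and the example weights are the static wrr weights listed in Table 4.

```python
def smooth_wrr(weights, count):
    """Return the VM chosen for each of `count` requests.

    Assumed dispatch rule (smooth weighted round-robin): every VM's running
    score grows by its weight, the highest score wins, and the winner's
    score is reduced by the sum of all weights.
    """
    current = {vm: 0 for vm in weights}
    total = sum(weights.values())
    order = []
    for _ in range(count):
        for vm, w in weights.items():
            current[vm] += w
        chosen = max(current, key=current.get)
        current[chosen] -= total
        order.append(chosen)
    return order

# Static wrr weights from Table 4; nn-dwrr would refresh these dynamically.
static_weights = {"VM1": 1, "VM2": 2, "VM3": 4}
order = smooth_wrr(static_weights, 7)
print(order)
```

Over one full cycle of seven requests, each VM is chosen as many times as its weight, and the choices are interleaved rather than sent to one VM in a burst.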
The first part of the information required by the Weight Adjustment module is remaining capacity information. Load balancing ought to be achieved using an inferred system state based on locally gathered data [11]. The Load Monitor module collects four load metrics: the utilizations of CPU, memory, network bandwidth, and disk I/O. Weight Adjustment uses the following formula to calculate the capacity index (CIi) for VMi:
$$CI_i = 1 - \max(CPU_i,\ Mem_i,\ Bandwidth_i,\ Disk\ I/O_i)$$
A greater capacity index means that more resources remain in the VM. We do not know in advance what kinds of services will be provided in datacenters, and different services depend on different critical resources. For example, the critical resource of a Web server is CPU, and the critical resource of an FTP server is network bandwidth. The critical resource may become the bottleneck of a VM. Therefore, we simply take the maximum utilization to find the current bottleneck of a VM [13].
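The capacity index can be illustrated with a short sketch. It assumes each utilization is reported as a fraction between 0 and 1; the function name `capacity_index` and the sample readings are hypothetical and are not measurements from the testbed.

```python
def capacity_index(cpu, mem, bandwidth, disk_io):
    """Return CI = 1 - max(utilizations).

    Each argument is assumed to be a utilization expressed as a fraction
    in [0, 1]; the largest one is treated as the current bottleneck.
    """
    utilizations = (cpu, mem, bandwidth, disk_io)
    for u in utilizations:
        if not 0.0 <= u <= 1.0:
            raise ValueError("utilization must be in [0, 1]")
    return 1.0 - max(utilizations)

# Hypothetical readings: a CPU-bound Web server and a bandwidth-bound FTP server.
web_ci = capacity_index(cpu=0.75, mem=0.5, bandwidth=0.25, disk_io=0.125)
ftp_ci = capacity_index(cpu=0.25, mem=0.25, bandwidth=0.5, disk_io=0.125)
print(web_ci, ftp_ci)
```

In this example the Web server is limited by CPU and the FTP server by bandwidth, so the same formula picks a different bottleneck for each VM without knowing the service type.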
The second part is the load prediction information from a neural network-based load predictor. We used the delta learning rule in our ANN design (see Figure 8 and Figure 9) because the neural network has the capability of optimization and prediction.
Figure 8. Schematic representation of an artificial neural network model for VMi.
Figure 9. The process of the delta learning rule for VMi.
Because there is no definitive mathematical approach for obtaining the optimum number of hidden layers and their neurons [14], we used a single hidden layer in our design to reduce computation time.
In Figure 8, the input 𝑥 is a vector that contains the ten most recent weights. To avoid SLA violations, we take the required response time (di) specified in the SLA into account when training the neural network. The neural network calculates a weight for each VMi, which we call the neural index (NIi). The Request Scheduler allocates requests according to NIi and then measures the average response time (oi). When the current average response time approaches a certain proportion (the pre-reaction rate p, e.g., 80%) of the response time specified in the SLA, the neural network automatically adjusts the hidden layer’s weights before an SLA violation occurs. If the learning rate (𝛼) is set to a large value, the neural network can learn faster. However, if the input varies greatly, the neural network may not learn well. We use the following formulas to train the neural network:
$$NI_i = f\Big(\sum_j f(net_j)\Big)$$
$$r_j = (p \times d_i - o_i) \times f'(net_j)$$
$$\Delta w_j = \alpha \times r_j \times x$$
$$w_j(t+1) = w_j(t) + \Delta w_j$$
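One training step of the formulas above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: net_j is assumed to be the dot product of hidden neuron j's weight vector with the input x, the number of hidden neurons (three) and the initial values are hypothetical, and response times are expressed in seconds so that the update stays small.

```python
import math

def logsig(z):
    """Log-sigmoid transfer function f (Table 3)."""
    return 1.0 / (1.0 + math.exp(-z))

def logsig_prime(z):
    """Derivative f'(z) = f(z) * (1 - f(z))."""
    s = logsig(z)
    return s * (1.0 - s)

def forward(hidden_weights, x):
    """Return (NI, nets), where NI = f(sum_j f(net_j)).

    Assumption: net_j is the dot product of neuron j's weights with x.
    """
    nets = [sum(w * xi for w, xi in zip(w_j, x)) for w_j in hidden_weights]
    ni = logsig(sum(logsig(net) for net in nets))
    return ni, nets

def delta_step(hidden_weights, x, p, d, o, alpha):
    """Apply w_j <- w_j + alpha * r_j * x with r_j = (p*d - o) * f'(net_j)."""
    _, nets = forward(hidden_weights, x)
    updated = []
    for w_j, net_j in zip(hidden_weights, nets):
        r_j = (p * d - o) * logsig_prime(net_j)
        updated.append([w + alpha * r_j * xi for w, xi in zip(w_j, x)])
    return updated

# Hypothetical values: ten recent weights, three hidden neurons,
# SLA response time d = 2.0 s, observed average o = 1.8 s, p = 80%.
x = [0.3] * 10
hidden = [[0.1] * 10 for _ in range(3)]
ni_before, _ = forward(hidden, x)
hidden = delta_step(hidden, x, p=0.8, d=2.0, o=1.8, alpha=0.5)
ni_after, _ = forward(hidden, x)
print(ni_before, ni_after)
```

Here the observed response time (1.8 s) is above p × d (1.6 s), so the error term is negative and the neural index of this VM decreases, which would steer fewer requests to it.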
If there are n VMs under a local load balancer, the Weight Adjustment module combines the real-time remaining capacity information CI with the neural network output NI to calculate the weight Wi for VMi using the following formula:
$$W_i = \frac{CI_i \times NI_i}{\sum_{j=1}^{n} (CI_j \times NI_j)} \times 100\%$$
Wi reflects the proportion of the remaining resources of VMi among all n VMs. The Weight Adjustment module sends these weights to the Request Scheduler. Figure 10 shows the flowchart of our algorithm.
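As a worked example of the weight formula, the sketch below normalizes the products CI × NI into percentages. The function name `normalized_weights` and the CI and NI values are hypothetical and were chosen only to make the arithmetic easy to follow.

```python
def normalized_weights(ci, ni):
    """Return W_i = CI_i * NI_i / sum_j(CI_j * NI_j) * 100 for each VM."""
    products = [c * n for c, n in zip(ci, ni)]
    total = sum(products)
    if total == 0:
        raise ValueError("no remaining capacity on any VM")
    return [100.0 * p / total for p in products]

# Hypothetical indices for three active VMs.
ci = [0.25, 0.5, 0.5]      # capacity indices from the Load Monitor
ni = [0.75, 0.75, 0.875]   # neural indices from Load Prediction
weights = normalized_weights(ci, ni)
print(weights)  # percentages that sum to 100
```

With these inputs the products are 0.1875, 0.375, and 0.4375, so the three VMs receive 18.75%, 37.5%, and 43.75% of the weight.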
Figure 10. Flowchart of the nn-dwrr algorithm.
Chapter 4 Evaluation and Discussion
4.1 Experimental environment
We built a testbed that includes a local load balancer and a set of VMs, as shown in Figure 11. This testbed hosted a web page service. There were three active VMs (VM1, VM2, and VM3) with different capabilities and two spare VMs (VMs1 and VMs2), each running an Apache web server. We used the load balancer to link these VMs together to form a virtual zone. The load balancer distributed user requests to the three active VMs according to the proposed scheduling algorithm, nn-dwrr. The experimental environment setup and related parameters are shown in Table 3, and the configuration of the five VMs is shown in Table 4.
Figure 11. Experimental setup.
We used this testbed to host web services and evaluated the average response time of different load balancing algorithms, using ApacheBench (ab) to issue requests based on a real web service. We then compared four scheduling algorithms: wrr, capacity-based, ANN-based, and the proposed nn-dwrr.
4.2 Comparison of different load balancing algorithms
How to exploit the advantages of cloud computing so that each task obtains the resources it requires in the shortest time is an important topic [9]. Therefore, we use the average response time as the metric for comparing different scheduling algorithms.
Table 3. Load balancing experimental parameters.
| Parameter | Value |
|---|---|
| OS | CentOS 5.5 |
| Virtual machine hypervisor | Xen |
| Number of VMs | 3 |
| Number of spare VMs | 2 |
| Application | Web service |
| Duration (time limit) | 60 sec |
| Response time specified in the SLA | 2000, 1000, 432 ms |
| Pre-reaction rate (p) | 80% |
| Transfer function (f) (for hidden and output layers) | Log-sigmoid |
| Learning rate (𝛼) | 0.5 |
Table 4. Configuration of each VM.
| | VM1 | VM2 | VM3 | VMs1 | VMs2 |
|---|---|---|---|---|---|
| CPU (cores) | 1 | 2 | 3 | 2 | 2 |
| Memory (MB) | 512 | 1024 | 2048 | 1024 | 1024 |
| Virtual disk (GB) | 10 | 10 | 10 | 10 | 10 |
| Static weight (wrr) | 1 | 2 | 4 | - | - |
Figure 12. Comparison of four scheduling algorithms (maximum response time specified in the SLA: 2000 ms).
Figure 13. Average response time (maximum response time specified in the SLA: 2000 ms).
Figure 12 shows the comparison of the four scheduling algorithms when the response time requirement specified in the SLA is 2000 ms. In Figure 12, the static scheduling algorithm (wrr) has the longest response time. The capacity-based and wrr scheduling algorithms have nearly the same performance until the number of requests exceeds 510; after that, the difference in response time between them becomes more obvious. The ANN-based algorithm performs well when the number of requests is large. However, its average response time is the worst and fluctuates greatly before the average response time exceeds 80% (the pre-reaction rate) of the response time specified in the SLA. This is because the ANN-based algorithm continues to distribute requests to a VM as long as the response time does not exceed 80% of the response time specified in the SLA.
Figure 14. Comparison of four scheduling algorithms (maximum response time specified in the SLA: 1000 ms).
Figure 15. Comparison of four scheduling algorithms (maximum response time specified in the SLA: 432 ms).
Regardless of the number of requests, the proposed nn-dwrr always performs best. Figure 13 shows that, in terms of average response time, the proposed nn-dwrr is 1.86 times faster than wrr, 1.49 times faster than the capacity-based algorithm, and 1.21 times faster than the ANN-based algorithm. Figure 14 and Figure 15 show the cases with smaller response times (1000 ms and 432 ms) specified in the SLA. They show that the performance gap between the ANN-based algorithm and nn-dwrr narrows as the specified response time becomes smaller.
4.3 Comparison of SLA violation rates with and without a spare VM pool
Figure 16 shows the comparison of the SLA violation rate with and without a spare VM pool in the proposed tldlb architecture, both running the proposed nn-dwrr algorithm. In this experiment, the threshold of the SLA violation rate was set to 5%.
The SLA violation rate is defined as follows:
$$\text{SLA violation rate} = \frac{\text{Number of requests violated}}{\text{Number of total requests}}$$
The SLA Engine, as shown in Figure 7, keeps monitoring the response time of each request and calculating the SLA violation rate. It activates a spare VM when the SLA violation rate exceeds its threshold (5%, in this case). We found that the proposed tldlb can keep the SLA violation rate from exceeding 5% by activating VMs from the spare VM pool in time.
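The check performed by the SLA Engine can be sketched as below. It assumes that a request violates the SLA when its response time exceeds the response time specified in the SLA; the function names and the sample response times are hypothetical.

```python
def sla_violation_rate(response_times_ms, sla_ms):
    """Return (number of requests violated) / (number of total requests).

    Assumption: a request is violated when its response time is above sla_ms.
    """
    if not response_times_ms:
        return 0.0
    violated = sum(1 for t in response_times_ms if t > sla_ms)
    return violated / len(response_times_ms)

def should_activate_spare(rate, threshold=0.05):
    """Return True when the violation rate exceeds the threshold (5% here)."""
    return rate > threshold

# Hypothetical window of 20 requests against a 2000 ms SLA; two are too slow.
samples = [900] * 18 + [2100, 2500]
rate = sla_violation_rate(samples, sla_ms=2000)
print(rate, should_activate_spare(rate))
```

In this window 2 of 20 requests are late, giving a rate of 10%, which is above the 5% threshold and would trigger activation of a spare VM.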
Figure 16. Comparison of SLA violation rates with and without a spare VM pool.
Chapter 5 Conclusion
5.1 Concluding remarks
We have presented an SLA-aware decentralized load balancer architecture, tldlb, which can reduce the SLA violation rate. If active VMs are overloaded, the proposed tldlb avoids SLA violations by activating spare VMs in a spare VM pool. In addition, we proposed a novel neural network-based load balancing algorithm, nn-dwrr, to distribute incoming requests to appropriate VMs. Experimental results have shown that, in terms of average response time, the proposed nn-dwrr is 1.86 times faster than the wrr, 1.49 times faster than the capacity-based, and 1.21 times faster than the ANN-based scheduling algorithms. The results also demonstrate that a shorter response time allows more requests to be handled per second. Since our scheduling algorithm is simple and efficient, it is well-suited for cloud computing environments that must service a large number of requests. [...] balancers to a cloud datacenter testbed for further evaluation.
Proceedings of International Conference on Industrial Mechatronics and Automation (ICIMA), pp. 240-243, 2010.
[4] W. Y. Lin, G. Y. Lin, and H. Y. Wei, “Dynamic Auction Mechanism for Cloud Resource Allocation,” in Proceedings of Cluster, Cloud and Grid Computing (CCGrid), pp. 591-592, 2010.
[5] “Amazon Elastic Load Balancing,” [Online]. Available: http://aws.amazon.com/elasticloadbalancing.
[6] “rackspace - Cloud Load Balancers On-Demand,” [Online]. Available: http://www.rackspace.com/cloud/cloud_hosting_products/loadbalancers.
[7] “Service-level agreement – Wiki,” [Online]. Available: http://en.wikipedia.org/wiki/Service-level_agreement.
[8] R. Rajavel, “De-Centralized Load Balancing for the Computational Grid environment,” in Proceedings of International Conference on Communication and Computational Intelligence (INCOCCI), pp. 419-424, Dec. 2010.
[9] S. C. Wang, K. Q. Yan, W. P. Liao, and S. S. Wang, “Towards a Load Balancing in a Three-level Cloud Computing Network,” in Proceedings of IEEE International Conference on Computer Science and Information Technology (ICCSIT), vol. 1, pp. 108-113, Jul. 2010.
[10] “Linux Virtual Server,” [Online]. Available: http://www.linuxvirtualserver.org.
[11] M. Randles, D. Lamb, and A. Taleb-Bendiab, “A Comparative Study into Distributed Load Balancing Algorithms for Cloud Computing,” in Proceedings of Advanced Information Networking and Applications Workshops, pp. 551-556, Apr. 2010.
[12] C. C. Li and K. Wang, “SLA-aware Load Balancing for Cloud Data Centers,” Report, 2012.
[13] V. Nae, A. Iosup, and R. Prodan, “Dynamic Resource Provisioning in Massively Multiplayer Online Games,” IEEE Transactions on Parallel and Distributed Systems, vol. 22, no. 3, pp. 380-395, Mar. 2011.
[14] Y. Zhang, J. Pang, R. Zhao, and Z. Guo, “Artificial Neural Network for Decision of Software Maliciousness,” in Proceedings of Intelligent Computing and Intelligent Systems (ICIS), pp. 622-625, 2010.
[15] J. Hu, J. Gu, G. Sun, and T. Zhao, “A Scheduling Strategy on Load Balancing of Virtual Machine Resources in Cloud Computing Environment,” in Proceedings of International Symposium on Parallel Architectures, Algorithms and Programming (PAAP), pp. 89-96, Dec. 2010.