With the emergence of cloud computing, web services are now widely used in every popular application domain, for example, e-commerce, social networking, and web games [7][8][16][23]. John McCarthy presented a model of computation as a public utility in 1960 [19], which is similar to today’s cloud computing. However, the term "cloud" first appeared in a paper published by MIT in 1996 [20]. Cloud computing has become popular since Google, IBM, and a large number of universities launched a large-scale cloud computing project in 2007 [22]. Today, cloud computing is usually categorized into SaaS (Software as a Service) [17], PaaS (Platform as a Service) [33], and IaaS (Infrastructure as a Service), which users access through web browsers and the Internet. Common examples of these services include Salesforce.com [24] and Gmail [25] for SaaS, Google App Engine [26] and Microsoft Azure [27] for PaaS, and Amazon EC2 for IaaS [28]. Salesforce.com began to deliver business applications through a web interface in 2000, which is in effect a kind of SaaS application. The first universal IaaS service was Elastic Compute Cloud (EC2), provided by Amazon since 2006 [21]. It is a commercial web service that allows small businesses and individuals to rent computing resources on demand.
Nowadays, some popular web services, such as Facebook, have to serve a huge number of users from all over the world simultaneously. Since short response time is crucial for all kinds of service-oriented applications, service providers have to deploy large-scale server clusters to share the incoming workload. Such large-scale server clusters make traditional load sharing mechanisms inappropriate and require further research. For traditional small-scale web applications, a single dispatcher is usually deployed to dispatch incoming requests evenly onto the servers. This naturally leads to a centralized load balancing structure. However, a single dispatcher would be overwhelmed by the huge number of incoming requests if deployed for large-scale cloud services. Therefore, a distributed dispatching structure has to be employed, in which a set of independently working dispatchers processes the incoming requests efficiently.
In small-scale web applications with a single dispatcher, the dispatcher can easily track the workload on each server, since all incoming requests pass through it and all responses are sent back through it. This structure blends well with centralized load sharing algorithms, such as Join-the-Shortest-Queue (JSQ) [2], and requires no extra communication between the dispatcher and the servers for load detection. However, the situation has changed now that growing large-scale cloud services have to adopt a distributed dispatching structure to share the huge number of incoming requests. In such a structure, incoming requests are routed to a dispatcher randomly via a specific mechanism in the router. Load balancing of incoming requests across dispatchers is not a problem, since the numbers of packets in service requests are usually similar. On the other hand, the service time of each request can vary widely, because some requests might require processing a large amount of data or performing complicated computation [1]. Therefore, the dispatchers have to balance the workload well among the large number of servers. In a distributed dispatching structure, however, each dispatcher independently tries to balance only the workload generated by the incoming requests passing through it. Unlike in the traditional centralized structure, only a fraction of the incoming requests pass through a particular dispatcher, so the dispatcher cannot infer the workload on each server from its own traffic. This makes traditional load balancing algorithms, such as JSQ [2], inefficient to implement under such a structure: each dispatcher has to query every server about its current workload before making each dispatching decision, resulting in a large communication overhead between the dispatchers and the servers. This overhead is exacerbated as the number of dispatchers and servers grows to a large scale, e.g., thousands of servers, which is becoming common for popular cloud services. Therefore, efficient distributed load balancing becomes a crucial research issue for emerging large-scale cloud services.
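The cost of running JSQ behind independent dispatchers can be made concrete with a minimal sketch. The message model used here, one query and one reply per server for every dispatching decision, is an illustrative assumption rather than a measurement from this thesis, and the function names are hypothetical.

```python
from typing import List, Tuple


def jsq_dispatch(queue_lengths: List[int]) -> Tuple[int, int]:
    """Choose the server with the shortest queue.

    Returns (server_index, messages). The message count assumes one
    query and one reply per server, which is the price a dispatcher
    pays when it cannot observe server load from its own traffic.
    """
    if not queue_lengths:
        raise ValueError("at least one server is required")
    messages = 2 * len(queue_lengths)  # assumed: query + reply per server
    target = min(range(len(queue_lengths)), key=queue_lengths.__getitem__)
    return target, messages


def total_probe_messages(num_servers: int, num_requests: int) -> int:
    """Total probe traffic if every request triggers a full poll (assumed model)."""
    return 2 * num_servers * num_requests
```

Under this assumed model, the probe traffic grows with the product of the number of servers and the number of requests, which is why polling every server becomes impractical at the scale of thousands of servers.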
In recent work [1], Lu et al. proposed the Join-Idle-Queue (JIQ) algorithm for efficient distributed load balancing, in which servers automatically join a dispatcher to receive incoming workload, as opposed to traditional approaches in which the dispatchers have to query the workload of each server before making dispatching decisions. The JIQ mechanism avoids the overwhelming communication costs incurred when implementing the traditional JSQ algorithm for large-scale cloud services. However, in the basic JIQ design, a server joins a dispatcher to receive incoming requests only when it becomes idle. This mechanism becomes ineffective when the system load is high, since few or no servers are idle and the dispatchers have to dispatch requests randomly, with no load balancing effect. In this thesis, we propose a Join-Queue-Anytime (JQA) mechanism in which each server is registered with a particular dispatcher at all times, avoiding the random dispatching that occurs in JIQ. The JQA mechanism is expected to achieve better performance than JIQ under moderate or high system load. Four distributed load balancing methods were developed based on the JQA mechanism. The proposed methods have been evaluated through a series of simulation experiments implemented with the CloudSim toolkit [12][13]. The experimental results indicate that the proposed JQA mechanism achieves a significant performance improvement over the JIQ algorithm, with up to a 31% reduction in average response time.
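The difference between the two mechanisms can be sketched as follows. The JIQ class follows the behavior described above: idle servers report themselves, and the dispatcher falls back to a random choice when none are known. The JQA class is only a hypothetical illustration of "always registered": its selection rule (least reported queue length) and its local bookkeeping are assumptions made for this sketch and are not one of the four methods developed in this thesis. All class and method names are invented for the example.

```python
import random
from collections import deque
from typing import Deque, Dict, List, Optional


class JIQDispatcher:
    """Dispatcher that only knows the servers that reported themselves idle."""

    def __init__(self, all_servers: List[int],
                 rng: Optional[random.Random] = None) -> None:
        self.all_servers = list(all_servers)
        self.idle_queue: Deque[int] = deque()
        self.rng = rng or random.Random()

    def server_became_idle(self, server_id: int) -> None:
        # Called by a server when it finishes its last pending request.
        self.idle_queue.append(server_id)

    def dispatch(self) -> int:
        if self.idle_queue:
            return self.idle_queue.popleft()
        # No idle server is known: fall back to a random choice,
        # which provides no load balancing effect.
        return self.rng.choice(self.all_servers)


class JQADispatcher:
    """Hypothetical sketch: servers stay registered regardless of load."""

    def __init__(self) -> None:
        # server_id -> last reported queue length (assumed report format)
        self.registered: Dict[int, int] = {}

    def register(self, server_id: int, queue_length: int) -> None:
        self.registered[server_id] = queue_length

    def dispatch(self) -> int:
        if not self.registered:
            raise RuntimeError("no server is registered with this dispatcher")
        # Assumed rule: pick the registered server with the least reported load.
        target = min(self.registered, key=self.registered.get)
        self.registered[target] += 1  # local estimate until the next report
        return target
```

In this sketch the JIQ dispatcher loses all load information once its idle queue is empty, whereas the hypothetical JQA dispatcher always has some registered servers to choose among, which is the situation the proposed mechanism is designed to guarantee.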
The remainder of this thesis is organized as follows. Chapter 2 discusses related work on dynamic load balancing. Chapter 3 presents our JQA mechanism. Chapter 4 describes our simulation environment based on the CloudSim toolkit. Chapter 5 evaluates the proposed JQA-based methods and compares them with the JIQ-based approaches. Chapter 6 concludes this thesis.