In recent years, the number of people using Internet services has been increasing with the rapid development of the Internet. A single server providing an Internet service can no longer cope with the growing demand. Continually upgrading a single server not only raises the cost but also fails to solve the problem of rapidly growing demand. Instead, multiple service servers must provide services simultaneously.
However, even when multiple servers provide services at the same time, if users cannot be assigned among the servers effectively, some servers become heavily loaded while others remain lightly loaded, resulting in poor and unstable quality of service. Much research has been presented to solve this uneven server load problem. In [1], load balancing architectures are classified into four classes: client-based, dispatcher-based, DNS-based, and server-based, summarized as follows:
Firstly, in the client-based load balancing architecture, client hosts must modify their software or hardware, or users must manually choose a better service server according to the service quality of each server. Its shortcoming is that it is inconvenient for users.
Secondly, in the dispatcher-based load balancing architecture, all service servers are usually placed in a geographically centralized area, and a dispatcher receives all user requests. Depending on the current states of the service servers, the dispatcher determines which service server is best suited to provide the service.
Because the service servers are placed in a geographically centralized area, their state information can be obtained promptly to help the dispatcher balance the load. However, the first disadvantage is that a fault in the single dispatcher suspends all services, resulting in poor reliability. The second disadvantage is that only the users who are close to the system experience low propagation delay.
Thirdly, in the DNS-based load balancing architecture, servers can be placed in geographically distributed areas. A user first sends a domain name resolution request to the DNS server to obtain the IP address of a service server, and then sends the service request to the server at that IP address. DNS servers usually balance load by assigning users to different servers randomly, in round-robin order, or according to server states, such as load condition and network situation, which are periodically received from the servers. The master/slave DNS architecture is widely used: once the master DNS server fails, a slave DNS server takes over its work, which gives high reliability. The difficulty is that, because the servers are placed in geographically decentralized areas, their states cannot be reported immediately, since frequent state updates would congest or waste network bandwidth.
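The three assignment policies mentioned above (random, round robin, and state-based) can be illustrated with a minimal Python sketch. The function names, the example addresses, and the use of a single scalar load per server are assumptions made for illustration; they are not taken from [1] or from any particular DNS server implementation.

```python
import itertools
import random


def make_round_robin(servers):
    """Return a function that hands out server addresses in cyclic order."""
    cycle = itertools.cycle(servers)
    return lambda: next(cycle)


def pick_random(servers, rng=random):
    """Assign a user to a uniformly random server."""
    return rng.choice(servers)


def pick_least_loaded(reported_load):
    """State-based policy: choose the server with the lowest reported load.

    reported_load maps a server address to its most recent periodic report,
    which may be stale (hypothetical scalar load in [0, 1]).
    """
    return min(reported_load, key=reported_load.get)


# Example addresses from the documentation range 192.0.2.0/24.
servers = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]
next_server = make_round_robin(servers)
first_four = [next_server() for _ in range(4)]
least = pick_least_loaded({"192.0.2.1": 0.7, "192.0.2.2": 0.2, "192.0.2.3": 0.5})
```

Because the state-based policy depends on periodic reports, its decisions are only as fresh as the reporting interval, which is the trade-off against network bandwidth described above.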
Finally, the server-based load balancing architecture first relies on a simple dispatcher or DNS server to distribute users among servers randomly or in round-robin order. As time goes by, load imbalance between servers may appear because jobs differ in execution time and resource requirements. The servers then exchange state information with each other to determine whether they need to exchange jobs in order to raise the degree of load balancing. The advantage of this architecture is that, because the servers are placed in a geographically centralized area, their states can be exchanged promptly, achieving a better degree of load balancing. In addition, since load balancing is done through the coordination of all service servers, the failure of a small number of servers does not dramatically affect the quality of service, so this architecture has high reliability.
As for the load balancing policy, conventional methods in the geographically distributed load balancing architecture set a load buffer range to decrease the state change frequency of a service server, and mostly assume that servers are homogeneous and consider the consumption of only a single resource, such as CPU load; however, the load buffer range can result in load oscillation among servers. Moreover, servers do not always have the same capacity, and jobs almost always require several kinds of resources, such as memory space and network bandwidth. Considering only a single resource allows a shortage of a few other resources to become the system bottleneck and leads to low system utilization.
In this thesis, we integrate the DNS-based and server-based load balancing architectures to implement our load balancing method. As shown in Fig. 1, all service servers are first partitioned into multiple server clusters placed in geographically distributed areas. The server with the minimum ID in each server cluster determines the state of its cluster by the Random Early Detection (RED) method. The idea of the RED method is that the probability that a server cluster is marked as overloaded is directly proportional to the load of the cluster at that time. After determining the state of the cluster, the server with the minimum ID periodically sends that state to the DNS server, and the DNS server then assigns client requests among the server clusters according to the state of each cluster.
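The RED idea described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the cluster load is taken to be a number in [0, 1], and the thresholds `min_th` and `max_th` are hypothetical parameters borrowed from the classic RED formulation for queues; with their default values the probability is simply equal to the load.

```python
import random


def overload_probability(load, min_th=0.0, max_th=1.0):
    """Probability of marking a cluster overloaded, growing linearly with load.

    min_th and max_th are assumed thresholds: below min_th the cluster is
    never marked, at or above max_th it is always marked.
    """
    if load <= min_th:
        return 0.0
    if load >= max_th:
        return 1.0
    return (load - min_th) / (max_th - min_th)


def cluster_state(load, rng=random.random, min_th=0.0, max_th=1.0):
    """Draw the cluster state that the minimum-ID server would report."""
    p = overload_probability(load, min_th, max_th)
    return "overloaded" if rng() < p else "normal"


# A cluster at 50% load is reported as overloaded in about half of the draws.
p_half = overload_probability(0.5)
```

The randomized decision means a cluster does not switch states at a single fixed load value, in contrast to the fixed load buffer range of conventional methods.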
Once client requests arrive at a server cluster, our distributed market mechanism load balancing method performs the second phase of load balancing inside the cluster. The concept of the market mechanism is that the cost of executing a job on a service server is related to the proportions of its different resource requirements, and the cost of each resource requirement is directly proportional to the load of that resource on the server. Hence, we compare the cost of executing a job on each server to determine the best-fit server.
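A minimal Python sketch of this cost comparison is given below. It is an illustration rather than the exact pricing rule of the method: it assumes that the price of a resource equals its current load on the server (a proportionality constant of 1), that loads lie in [0, 1], and that a job's cost is the sum of those prices weighted by the job's share of each resource. The names `job_cost` and `best_fit` are hypothetical.

```python
def job_cost(requirements, server_load):
    """Cost of running a job on one server.

    requirements: resource name -> amount the job needs (any common unit).
    server_load:  resource name -> current load of that resource in [0, 1].
    Each resource is priced at its load (assumed constant of 1) and weighted
    by the proportion of the job's total requirement it represents.
    """
    total = sum(requirements.values())
    if total == 0:
        return 0.0
    return sum((need / total) * server_load[res]
               for res, need in requirements.items())


def best_fit(requirements, servers):
    """Return the name of the server on which the job is cheapest."""
    return min(servers, key=lambda name: job_cost(requirements, servers[name]))


job = {"cpu": 3, "mem": 1}  # a CPU-heavy job
servers = {
    "s1": {"cpu": 0.9, "mem": 0.1},  # CPU almost saturated
    "s2": {"cpu": 0.2, "mem": 0.8},  # memory almost saturated
}
chosen = best_fit(job, servers)  # "s2": its scarce resource matters less to this job
```

In this example the CPU-heavy job is steered to the server whose CPU is lightly loaded, even though that server's memory is busy, which shows how multiple resources enter the decision.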
In addition, each server must consider its multiple heterogeneous resource capacities and the multiple heterogeneous resource requirements of jobs, and exchange state information with the other servers to determine whether jobs need to be exchanged, in order to raise the overall degree of load balancing and to provide stable, reliable, and scalable high-quality services. In our simulation, we use four metrics to analyze and compare our method with conventional methods: average standard deviation of service server loads, average standard deviation of resource loads, average server utilization, and average turnaround time.
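As an illustration of the first metric, the following Python sketch computes an average standard deviation of server loads. The exact definition is an assumption: it takes the population standard deviation across servers at each sampling instant and averages it over all samples; the function name and the input layout are hypothetical.

```python
from statistics import mean, pstdev


def average_load_stddev(samples):
    """Average, over sampling instants, of the standard deviation of server loads.

    samples: list of snapshots, each a list with one load value per server.
    A lower value means the load is spread more evenly across servers.
    """
    return mean(pstdev(loads) for loads in samples)


snapshots = [
    [0.5, 0.5, 0.5],  # perfectly balanced snapshot, deviation 0
    [0.2, 0.8],       # unbalanced snapshot, deviation 0.3
]
balance = average_load_stddev(snapshots)
```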
Fig. 1. DNS-based load balancing architecture