1.1. Preface
A Massively Multiplayer Online Game (MMOG) can be defined as a computer game that supports a multitude of players who interact with each other within the same virtual world, across the Internet, regardless of their geographical locations.
However, developing a stable, scalable and high-performance MMOG is much harder than one might imagine. According to the report [1], the online game market will reach $30 billion by 2009. Recently, much research has focused on MMOG middleware, which provides toolkits for MMOG developers and shortens the time to market for game companies. Competition in this industry raises the demand for MMOG middleware every year and drives research on MMOG development platforms.
1.2. Design Issues of MMOG Middleware
Several issues arise when designing MMOG middleware.
Scalability
Scalability can be simply defined as the number of avatars and game logic instances that an MMOG can support. A popular MMOG usually has thousands of avatars online concurrently, and may have hundreds of game regions and logic instances running at the same time.
Availability
Modern MMOG middleware usually adopts distributed technologies. To handle large numbers of avatars and game content simultaneously, an MMOG is typically hosted on many pieces of hardware, such as servers, gateways and network devices. However, these sophisticated devices are not as reliable as we might suppose. Each crash can seriously harm players’ interests and cause financial losses for MMOG companies.
Another aspect of availability is load distribution. Because avatars move unpredictably in the virtual world, the load on each game server varies over time. When a server is overloaded, system performance degrades (e.g. network latency, CPU throughput), and the game server may even crash.
Interactivity
Avatars update their own status by sending events or messages to the game server. The server calculates the results and replies to the players. The interactivity of MMOG middleware is characterized by the response time of these message exchanges. Different kinds of MMOG have their own interactivity constraints.
Consistency
Consistency is a classic problem in distributed systems. In an MMOG, it usually means synchronizing the “view” of avatars. Consistency is easy to maintain on a single server, but complicated in a server cluster.
1.3. Motivation and Objectives
As mentioned in the previous section, availability plays an important role in the development of MMOG middleware. However, even the most popular MMOGs today do not confront this problem directly. The most common solution is to duplicate the game world and manage the copies separately on different servers.
Players who create their characters on different servers cannot interact with each other. In this way, the load is balanced naturally across servers, and a crash of one server does not affect the others. Obviously this is not a mature solution, because each server still suffers from hardware failure and overload. Also, separating the avatars makes the game less fun.
In this paper, we design a high-availability service named HAMS (High Availability MMOG Service) that is fault resistant and able to share load dynamically in MMOG middleware. We also hope that this service can be applied to most MMOG middleware.
To reach this goal, HAMS should provide the following features.
Distributed technology
Because a fault-resistant system must avoid any single point of failure, HAMS needs a distributed control mechanism. There must be no central controller in the MMOG middleware, because a controller may fail and would itself become a single point of failure. Data should also be replicated to guarantee availability: after one replica is destroyed by a hardware crash, there should always be a backup.
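As an illustration only, the following Python sketch shows the replication idea described above: each item is kept on two nodes, and a backup is restored after a crash. The class ReplicatedStore, its methods, the ring-style placement and the key "region-7" are hypothetical names and choices made for this example; they are not the HAMS design.

```python
import zlib


class ReplicatedStore:
    """Toy in-memory store keeping each item on `replicas` distinct nodes.

    Hypothetical illustration; not the HAMS implementation.
    """

    def __init__(self, nodes, replicas=2):
        if replicas > len(nodes):
            raise ValueError("need at least as many nodes as replicas")
        self.replicas = replicas
        # node name -> {key: value}; a node removed from this dict has crashed.
        self.storage = {name: {} for name in nodes}

    def _live_nodes(self):
        return sorted(self.storage)

    def holders(self, key):
        # Live nodes that currently hold a copy of `key`.
        return [n for n in self._live_nodes() if key in self.storage[n]]

    def put(self, key, value):
        # Drop old copies first so an update never leaves stale replicas behind.
        for node in self.holders(key):
            del self.storage[node][key]
        # Assumed placement rule: hash to a start position, then walk the ring.
        live = self._live_nodes()
        start = zlib.crc32(key.encode()) % len(live)
        for i in range(min(self.replicas, len(live))):
            self.storage[live[(start + i) % len(live)]][key] = value

    def get(self, key):
        # Any surviving replica can serve the read.
        for node in self.holders(key):
            return self.storage[node][key]
        raise KeyError(key)

    def fail(self, node):
        # Simulate a hardware crash, then re-replicate so a backup exists again.
        lost = self.storage.pop(node)
        for key in lost:
            survivors = self.holders(key)
            if not survivors:
                continue  # every copy was lost; nothing to restore from
            value = self.storage[survivors[0]][key]
            spare = [n for n in self._live_nodes() if n not in survivors]
            needed = max(self.replicas - len(survivors), 0)
            for target in spare[:needed]:
                self.storage[target][key] = value
```

For example, after store = ReplicatedStore(["s1", "s2", "s3"]) and store.put("region-7", {"hp": 10}), calling store.fail(...) on either holder still leaves the item readable through store.get("region-7"). Note that no central controller is modeled here; a real system would also have to agree on placement and membership in a distributed way.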
High performance
The performance of the fail-recovery mechanism is measured by the repair time after a hardware failure. We divide the mechanism into two phases. First, when a failure occurs, it takes time to detect which node is down; the faster we find the failed node, the earlier we can start to recover data. Second, after the failed node is detected, it takes time to recover the game data belonging to that node. The objective is to shorten both phases. Ideally, players should hardly notice the fail-recovery at all, or at worst experience a tolerable effect (e.g. a short rollback).
The performance of the load distribution mechanism is measured by the time from the moment an overload is detected to the moment the load has been shared.
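To make the two phases concrete, the sketch below assumes that failures are detected with heartbeat timeouts. This is a common technique chosen here for illustration; the text does not state which detection method HAMS uses. The names HeartbeatMonitor, failed_nodes and estimated_downtime are hypothetical, and timestamps are passed in explicitly so that the example is deterministic.

```python
from dataclasses import dataclass, field


@dataclass
class HeartbeatMonitor:
    """Hypothetical failure detector: a node is suspected after a long silence."""

    timeout: float  # seconds of silence tolerated before a node is suspected
    last_seen: dict = field(default_factory=dict)

    def heartbeat(self, node: str, now: float) -> None:
        # Record the time of the most recent heartbeat from this node.
        self.last_seen[node] = now

    def failed_nodes(self, now: float) -> list:
        # Phase 1 (detection): nodes whose silence exceeds the timeout.
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.timeout
        )


def estimated_downtime(detection_time: float, recovery_time: float) -> float:
    # Total repair time is the sum of the two phases described in the text.
    return detection_time + recovery_time
```

Under this assumption, a shorter timeout reduces the detection phase but increases the risk of suspecting a node that is merely slow, so the timeout is itself a tuning parameter.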
Flexibility
Different kinds of MMOG middleware should be able to implement their own recovery policy or load distribution mechanism.
Load distribution and recovery policies should be adjustable, because different game designs may have different considerations. For example, sometimes we would like to hand our clients over to a server that is responsible for clients “near” us, in order to reduce inter-server communication, since avatars may interact with avatars belonging to other servers. At other times we may prefer to hand the clients over to a “far away” server, simply because the remote server’s CPU utilization is much lower than the neighboring server’s and communication between the local server and its neighbor is infrequent.
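The two choices in this example can be expressed as interchangeable policy functions. The following Python sketch is one possible way to structure such a pluggable policy; ServerInfo, its fields, and the functions nearest_server, least_loaded_server and pick_target are hypothetical names introduced for illustration and are not taken from HAMS.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class ServerInfo:
    name: str
    distance: int           # assumed measure of how "near" the server's region is
    cpu_utilization: float  # fraction of CPU in use, between 0 and 1


# A policy picks one target server from the candidates.
Policy = Callable[[List[ServerInfo]], ServerInfo]


def nearest_server(candidates: List[ServerInfo]) -> ServerInfo:
    # Prefer the "near" server to reduce inter-server communication.
    return min(candidates, key=lambda s: s.distance)


def least_loaded_server(candidates: List[ServerInfo]) -> ServerInfo:
    # Prefer the server with the lowest CPU utilization, even if it is "far away".
    return min(candidates, key=lambda s: s.cpu_utilization)


def pick_target(candidates: List[ServerInfo], policy: Policy) -> ServerInfo:
    # The middleware calls whichever policy the game designer plugged in.
    if not candidates:
        raise ValueError("no candidate servers available")
    return policy(candidates)
```

With a heavily loaded neighbor and a lightly loaded remote server, pick_target(servers, nearest_server) selects the neighbor, while pick_target(servers, least_loaded_server) selects the remote server. A game could also supply its own function that weighs both factors.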
1.4. Summary
As the MMOG industry grows, a good development environment for MMOGs becomes necessary. MMOG middleware helps developers design their MMOG products and shortens the time to market.
There are many design issues in MMOG middleware. In this paper, we focus on availability. We hope to provide a general solution for most MMOG middleware that needs a high-availability environment.
The availability issue can be separated into two major problems: fail-recovery and load sharing. To provide a good solution, our system should be designed with a distributed architecture and must be both flexible and high performance.