
Algorithm 4.2: Spanning Tree – Random

    order():
        collect children with !roundCompleted into childList
        shuffle childList randomly
        return childList
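To make the strategy concrete, here is a minimal sketch in Python of the randomized order() above. The roundCompleted flag and the children collection come from Algorithm 4.1; the attribute names are our own illustration:

    import random

    def order(children):
        # Collect the children that have not yet completed the current round.
        child_list = [c for c in children if not c.round_completed]
        # Visit them in a uniformly random order.
        random.shuffle(child_list)
        return child_list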

Excessive-Resource Child First. The idea is to start from the child that "seems" to have the most excess resources, counting the available resources in all of its sub-trees, hoping to guess the right search path earlier. This approach also has the potential to prevent contention for resources among "busy" managers. For example, suppose a given child (including its sub-trees) generates more requests than the resources it provides in a given round. If that child is searched first (e.g., by the randomized strategy described previously) and an available resource is indeed found there, the requests generated later from within that sub-tree may be unable to find a resource close by, hence incurring additional search steps. The expected "excessiveness" of a child can be accumulated and averaged round by round easily through the spanning tree (when nextRound() is called).

Algorithm 4.3: Spanning Tree – Excessive Resources First

Manager m

    requestCount : int
    resourceCount : int
    requestCountAvg : float
    resourceCountAvg : float
    totalRequestCountAvg : float
    totalResourceCountAvg : float

    registerResource(Resource res):
        resourceCount++
        same as registerResource(res) in 4.1

    requestFromLocal(Request r):
        requestCount++
        same as requestFromLocal(r) in 4.1

    averageCounts():
        compute requestCountAvg based on requestCount and old requestCountAvg
        compute totalRequestCountAvg based on requestCount, requestCountAvg,
            and children's totalRequestCountAvg
        compute resourceCountAvg and totalResourceCountAvg similarly

    nextRound():
        (root only) perform averageCounts() bottom-up
        same as nextRound() in 4.1

    order():
        order children according to their (totalResourceCountAvg – totalRequestCountAvg), largest first

Note that Algorithm 4.3 does not spell out how the averages are obtained. With a spanning tree, the averages can be computed easily in a bottom-up manner. Also, we do not give a specific formula in averageCounts() for computing the averages at each manager because there are alternatives. An extreme case is to compute the averages based solely on the resourceCount and requestCount recorded in the current round, disregarding past records. A more typical approach is to account for the averages from previous rounds using some weighting factor.
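As a concrete illustration, the following minimal Python sketch realizes averageCounts() with an exponentially weighted average and the excessive-resources-first order(); Algorithm 4.3 deliberately leaves the formula open, so the weighting factor ALPHA and all attribute names here are our own assumptions, showing just one of the alternatives mentioned above:

    ALPHA = 0.5  # weight given to the current round (assumed value)

    class Manager:
        def __init__(self):
            self.children = []
            self.round_completed = False
            self.request_count = 0      # requests seen this round
            self.resource_count = 0     # resources seen this round
            self.request_count_avg = 0.0
            self.resource_count_avg = 0.0
            self.total_request_count_avg = 0.0
            self.total_resource_count_avg = 0.0

        def average_counts(self):
            # Invoked bottom-up from the root when nextRound() is called.
            for c in self.children:
                c.average_counts()
            # Blend the current round's counts with the historical averages.
            self.request_count_avg = (ALPHA * self.request_count
                                      + (1 - ALPHA) * self.request_count_avg)
            self.resource_count_avg = (ALPHA * self.resource_count
                                       + (1 - ALPHA) * self.resource_count_avg)
            # Totals cover this manager plus all of its sub-trees.
            self.total_request_count_avg = self.request_count_avg + sum(
                c.total_request_count_avg for c in self.children)
            self.total_resource_count_avg = self.resource_count_avg + sum(
                c.total_resource_count_avg for c in self.children)
            # Reset the per-round counters for the coming round.
            self.request_count = 0
            self.resource_count = 0

        def order(self):
            # Excessive-resources-first: children with the largest expected
            # excess (totalResourceCountAvg - totalRequestCountAvg) come first.
            active = [c for c in self.children if not c.round_completed]
            return sorted(active,
                          key=lambda c: (c.total_resource_count_avg
                                         - c.total_request_count_avg),
                          reverse=True)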

Member Reassignment. The idea behind this strategy is to distribute requests and resources evenly among managers by gradually reassigning producers and requesters to new managers in a decentralized manner. Intuitively, we would like to reassign the membership between a parent and a child when they differ greatly in net resource supply (i.e., average resource count minus average request count). Like the accumulation of request/resource counts in Algorithm 4.3, the net resource supply can be derived by considering both current and past resource/request statistics. With this reassignment step, it is hoped that the whole P2P network can evolve according to the patterns of resource supply and requests over time. The algorithm is depicted below:

Algorithm 4.4: Member Adjustment

Manager m

    exchangePeers(parent, child):
        parentRequests = 0
        parentResources = 0
        halfRequests = (parent.requestAvg + child.requestAvg) / 2
        halfResources = (parent.resourceAvg + child.resourceAvg) / 2
        collect all peers belonging to parent and child in l
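The listing above appears truncated in the source. As a hedged sketch of one plausible completion, the following Python code greedily hands peers back to the parent until it holds roughly half of the combined request and resource load, and gives the remainder to the child; the peer attributes request_avg and resource_avg are our own illustrative assumptions, not part of the original algorithm:

    def exchange_peers(parent, child):
        parent_requests = 0.0
        parent_resources = 0.0
        half_requests = (parent.request_avg + child.request_avg) / 2
        half_resources = (parent.resource_avg + child.resource_avg) / 2
        # Pool every peer currently managed by the parent or the child.
        pool = list(parent.peers) + list(child.peers)
        parent.peers, child.peers = [], []
        for peer in pool:
            # Fill the parent until it reaches half of the combined load.
            if (parent_requests < half_requests
                    or parent_resources < half_resources):
                parent.peers.append(peer)
                parent_requests += peer.request_avg
                parent_resources += peer.resource_avg
            else:
                child.peers.append(peer)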

Note that Algorithm 4.4 does not state when and how a given parent-child pair is chosen to perform the member reassignment. Although this can be determined using a threshold, in our experiments in the next chapter we pick the pair that exhibits the maximum difference in net resource supply, with the goal of examining the effectiveness of such a member reassignment approach.

Chapter 5 Experiments

In this chapter we evaluate several resource scheduling approaches for P2P networks via simulation. In addition to FFI and response time, there are other important performance indices we would like to investigate for the different approaches:

• Turn-around time: the time interval between a request's issue time and its completion time.

• Overhead: the number of (control) messages for request routing, bookkeeping, and so on.

In order to compare our spanning tree-based scheduling approach with P2P networks of different topologies, we also modify existing P2P search algorithms to suit our resource model. They are outlined below:

Decentralized Search. This method assumes Gnutella-like networks. The search operates over a similar unstructured overlay where peers (including requestors and providers) maintain a limited number of neighbors. However, when a requestor issues a request, the request is routed selectively among its neighbors rather than flooded as in Gnutella. As in Freenet, each peer accumulates information that is used to decide the order of search among its neighbors. Specifically, each peer maintains a round counter similar to our spanning tree-based approach, although the round number is not synchronized throughout the network. Like the spanning tree-based method, resources newly registered with a manager are always placed in the queue for the next round. In addition, new requests arriving at a manager will not be served locally if the manager knows that some of its neighbors have lower round numbers; in this case it forwards the request to the neighbor with the lowest round number (but only up to a constant number of hops, called TTL, or time to live). Requests are served locally when all the neighbors have equal or higher round numbers. Once the local resources run out, the manager advances to the next round and notifies its neighbors of its new round number.

Figure 5.1 shows an example of a decentralized search over such a network, in which each manager maintains a round number. Suppose manager B has round number 6 and receives a request locally; it will forward the request to manager A, which will process the request locally because it still has resources available (otherwise it would have advanced to round 6 earlier). On the other hand, when manager C receives a request, it will process the request locally, and if the resource consumed is the last one, manager C will increment its round number to 7 and notify its neighbors.
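The forwarding rule can be summarized by the following minimal Python sketch. The TTL value of 3 matches the setting used later in this chapter, while method names such as serve_locally and on_round_advanced are assumptions made for illustration:

    TTL = 3  # search-range limit (the value used in our experiments)

    def handle_request(manager, request, hops=0):
        # Prefer a neighbor whose round number lags behind, up to TTL hops.
        lagging = [n for n in manager.neighbors
                   if n.round_number < manager.round_number]
        if lagging and hops < TTL:
            target = min(lagging, key=lambda n: n.round_number)
            handle_request(target, request, hops + 1)
            return
        # All neighbors are at an equal or higher round: serve locally.
        manager.serve_locally(request)
        if not manager.resources:        # the last local resource is gone
            manager.round_number += 1    # advance to the next round
            for n in manager.neighbors:  # advertise the new round number
                n.on_round_advanced(manager)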

Centralized Manager. This method, somewhat similar to Napster, uses the same setup as our tree-based method, except that the root is assigned the responsibility of request registry, resource registry, and matchmaking. In other words, the root implements the centralized producer-consumer queue, and all requests and resources that arrive at different managers are


routed through the spanning tree to the central manager. Figure 5.2 depicts the centralized architecture.
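As a rough illustration, the following Python sketch shows how the centralized variant can be realized, assuming every non-root manager simply forwards registrations and requests toward the root, which matches them in FIFO order; all names here are ours, not a prescribed API:

    from collections import deque

    class CentralizedManager:
        def __init__(self, parent=None):
            self.parent = parent
            self.resource_queue = deque()  # waiting providers (root only)
            self.request_queue = deque()   # pending requests (root only)

        def register_resource(self, res):
            if self.parent is not None:    # route bottom-up to the root
                self.parent.register_resource(res)
            elif self.request_queue:       # match the oldest waiting request
                self.request_queue.popleft().assign(res)
            else:
                self.resource_queue.append(res)

        def request_from_local(self, req):
            if self.parent is not None:    # route bottom-up to the root
                self.parent.request_from_local(req)
            elif self.resource_queue:      # match the oldest ready resource
                req.assign(self.resource_queue.popleft())
            else:
                self.request_queue.append(req)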

Table 5.1 shows the list of parameters varied in our experiments. Each data point is obtained by performing 20 simulations, each with a distinct, randomly generated network topology under the same network parameters. In the table, the communication speeds among participating peers and managers are relative.

Table 5.1: The environment parameters of the simulation

    Network size                                      1000~2000
    Percentage of managers                            5%, 15%, 50%, 75%, 100%
    Percentage of requesters                          1%, 5%, 10%, 20%
    Communication speed between managers              0.05, 0.25, 0.5
    Communication speed between managers
        and managed peers                             0.5
    Communication speed between peers                 0.5~1.5

[Figure 5.2: The centralized architecture. The central manager A holds the request queue and the service queue.]

Figure 5.3 shows the simulated FFI for a typical network setting, namely when 5% of the peers are managers (hence 20 peers on average are assigned to each manager) and the number of nodes ranges from 1000 to 2000. Furthermore, 20% of the peers are requestors.

The result in Figure 5.3 shows that the tree-based method outperforms the decentralized method where FFI is concerned. Where response time is concerned, the tree-based method lies between the centralized and decentralized methods, as shown in Figure 5.4 below.

[Figure 5.4: Average response time. 1000~2000 peers, 5% managers, 20% requestors; x-axis: peers.]

[Figure 5.3: FFI statistics. 1000~2000 peers, 5% managers, 20% requestors; x-axis: peers.]

Similar results are obtained when the request-only peers are set to 1%, as shown in Figure 5.5, where both the response time and the turn-around time of the tree-based method are comparable to those of the decentralized method; both are better than the centralized method.

Figure 5.6 shows the average hop counts for the different methods. Note that, similar to P2P networks such as Gnutella or Freenet, the decentralized method also imposes a fixed time-to-live (TTL) constant that limits the search range; here it is set to 3. Accordingly, the figure indicates that our tree-based method also averages 3 hops, while the centralized method doubles the average hop number, due to the required bottom-up request routing. In general, based on our performance study, the tree-based method has good response time compared to the decentralized method (which typically has a higher FFI) while maintaining a low FFI compared to the centralized method (which typically has a higher response time).

[Figure 5.5: Average response time and turn-around time; x-axis: peers.]

In the next set of experiments we vary the percentage of request-only peers and observe the effects. In particular, we consider the cases of 1%, 5%, 10%, and 20% request-only peers, assuming 5% managers overall (Figure 5.7).

[Figure 5.6: Average hop number. 1000~2000 peers, 5% managers, 20% requestors; x-axis: peers.]

As shown in Figure 5.7, when the percentage of requesters increases, the FFI of the tree-based method stays close to that of the centralized method in general, and their growth rates are less significant than that of the decentralized method. In the case of 20% requesters, the centralized method has quite a small FFI, as it should, while the tree-based method is still about four times better than the decentralized method.

[Figure 5.7: FFI results with 1%, 5%, 10%, and 20% requestors. Four panels, one per requestor percentage; 1000~2000 peers, 5% managers; x-axis: peers; y-axis: FFI.]

Figure 5.8 shows the response time results. As indicated there, the decentralized method has the best response time overall, which is natural since it acts only on local information and avoids much of the control overhead. When requests are relatively low in quantity, the tree-based method has a response time comparable to the decentralized method. On the other hand, when requests are abundant, the response time of both the tree-based and centralized methods grows larger than that of the decentralized method, although it stays within the 200%-250% range. Similar results are indicated in Figure 5.9 when measuring the average hop numbers.

[Figure 5.8: Average response time for 1%, 5%, 10%, and 20% requestors. Four panels, one per requestor percentage; 1000~2000 peers, 5% managers; x-axis: peers.]

Figure 5.10 shows the change of FFI over time for the case of 2000 nodes with 5% managers. We measure the FFI over each of 10 intervals. As indicated there, both the tree-based and centralized methods have their FFI stabilize quickly, which implies that requests and resources are fulfilled steadily and in a timely manner. In contrast, the growth of FFI over time for the decentralized method indicates that some peers in the network may suffer from unfair scheduling and wait longer than the others.

[Figure 5.9: Average hop number for 1%, 5%, 10%, and 20% requestors. Four panels, one per requestor percentage; 1000~2000 peers, 5% managers; x-axis: peers.]

Figure 5.11 shows an interesting observation about the distribution of requests among resource providers. Specifically, the centralized method exhibits the most desirable behavior because the requests are distributed evenly among resource providers. This is not surprising: all resource providers that become ready must (re)enter the central queue and are matched in FIFO order. The decentralized method, on the other hand, has the worst request distribution, due to the uneven request pattern generated among the requesters and the fact that requests are served by the resource providers closest to them.

[Figure 5.11: Average job number among resource providers. 2000 peers, 5% managers, 20% requestors; x-axis: peers.]

[Figure 5.10: Change of FFI over time. 2000 peers, 5% managers, 20% requestors; y-axis: FFI.]

The next set of experiments is concerned with the impact of the manager percentage, which represents the degree of decentralization: the larger the percentage of managers, the more decentralized the resulting network. We investigate the cases of 5%, 15%, 50%, 75%, and 100% managers when the requesters are 20%.

Figure 5.12 shows the FFI for the different methods and manager percentages. As before, the centralized method has the lowest FFI and is relatively insensitive to changes in the manager percentage. On the contrary, the decentralized method is quite sensitive to the manager percentage, and its FFI is larger in general. In all three methods, the FFI improves as the number of managers increases.

[Figure 5.12: FFI for the cases of 5%, 15%, 50%, 75%, and 100% managers. Three panels: Spanning Tree, Centralized, Decentralized; x-axis: peers; y-axis: FFI.]

Figure 5.13 shows the simulation results for response time. The centralized method has roughly the same response time under different manager percentages, owing to the fact that the average path length from managers to the root is roughly the same for different node sizes. On the other hand, the response time of the tree-based and decentralized methods decreases as the manager percentage decreases, reflecting the fact that with more managers in the network, it takes longer to search for available resources in these two methods.

[Figure 5.13: Average response time for 5%, 15%, 50%, 75%, and 100% managers. Three panels: Spanning Tree, Centralized, Decentralized; x-axis: peers.]

In the following experiments we are interested in the effectiveness of some self-adaptation strategies on scheduling. We investigate the two independent approaches described in the previous chapter for the tree-based method: changing the search order among children, and reassigning members between a parent and its child. To better expose their effectiveness, we also change the request generation pattern so that 10% of the requesters have a higher request generation rate than normal. Here the managers are set to 15% and the requesters to 20%.

Figure 5.14 shows both the FFI and the response time for the three variations of the tree-based method. The results show that both self-adaptation schemes improve the FFI and response time, and in the case of member reassignment the response time drops significantly compared to the base tree-based method.

To further appreciate the effect of the two self-adaptation strategies, Figure 5.15 shows the communication overhead over time (2000 nodes, 15% managers), where the overhead represents the number of messages for both request/resource fulfillment and network maintenance.

[Figure 5.14: FFI and average response time; x-axis: peers.]

The results show that both self-adaptation approaches can reduce unsuccessful searches, with the member reassignment approach improving the most.

[Figure 5.15: Change of overhead over time; y-axis: overhead.]

Chapter 6 Discussion and Future Work

It is interesting to compare DHTs and virtual queues at an abstract level. A DHT implements a virtual hash table using a set of multiple collaborating peers, and there are many approaches to implementing DHTs, the main differences being the mechanisms for request routing and object-peer assignment. Similarly, a virtual queue implements a (double-ended) FCFS queue using multiple, distributed peers, with the goal of meeting the key requirements mentioned previously.

FCFS queues and the associated scheduling policies are not a new concept per se; they have been an important research topic in operating systems, parallel computing, networking, and other fields. However, their use as a fairness measure for resource scheduling in P2P networks is uncommon. This is to be expected for several reasons. First, to be effective, FCFS policies and closely related policies such as least-used-first (when deciding which item to evict from a cache) or earliest-starting-time-first heuristics (when scheduling jobs over multiple processors) more or less need global and timely status information about the resources to be scheduled. Implementing FCFS policies in P2P networks will therefore most likely incur considerable overhead.

Secondly, and probably more importantly, network-wide FCFS fairness is largely irrelevant in application areas such as file sharing, where the resources being shared, offered by providers (who earn some credits as a reward), are expected to last for some time. Even for P2P networks sharing generic, uniform resources such as machine cycles, the usual goal, as mentioned, is to improve the job processing rate, in which case imposing FCFS fairness tends to reduce the processing rate, especially when the network grows larger and the request pattern is highly skewed.

When resource scheduling is concerned, it is interesting to compare SETI@Home with traditional systems such as operating systems, clusters, or grids, which often need to painstakingly predict the performance characteristics of the participating resource providers in order to derive a suitable execution plan, only to find that the predicted performance model disagrees with actual machine statistics due to machine dynamism. In the SETI@Home architecture, instead, the scheduling is done automatically by the resource providers, since their act of registration indicates that they are available for the moment, fully respecting machine dynamism. Our resource sharing model bears the same idea as SETI@Home's, but generalizes it in some aspects. First, unlike SETI@Home, where the central server is the one that keeps the work to be done, ours leaves what is to be done to the resource requestors. In addition, the central server is replaced with a set of collaborating managers that implements the virtual FCFS queue, hoping to improve locality, load balance, fault tolerance, and ultimately scalability, despite the fact that our virtual queue may incur unavoidable communication delay due to the FCFS requirements.

Interestingly, however, FCFS fairness can play an important role in designing sound incentive mechanisms. For example, suppose the resource consumers and resource providers are the same set of peers, and a consumer can receive the resources it needs only after it has earned corresponding credits by providing matching resources. This scenario is not uncommon, and similar work has been done on file-sharing P2P systems ([1] and [11]), where a peer gets paid for providing a specific file to a remote peer, and the earned credits are subsequently used by that peer to request a file from another peer. Clearly, without proper FCFS fairness, such incentive mechanisms cannot prevent participants with equal capability and willingness to contribute (and consume) from receiving unfair treatment.
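To illustrate the idea (not the concrete mechanisms of [1] or [11]), such a credit scheme can be sketched in a few lines of Python, assuming a simple per-peer ledger in which providing a resource earns one credit and a request is admitted only when a credit is available to spend:

    credits = {}  # peer id -> earned credits (illustrative ledger)

    def on_provide(provider_id):
        # Reward the provider with one credit per fulfilled resource.
        credits[provider_id] = credits.get(provider_id, 0) + 1

    def try_request(requester_id):
        # Admit the request only if the requester has a credit to spend.
        if credits.get(requester_id, 0) > 0:
            credits[requester_id] -= 1
            return True
        return False  # must contribute before consuming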

Our investigation nevertheless focuses on a narrow scope, which can be outlined as follows. First, in our simplified model, jobs are uniform; that is, they are of the same type and, statistically, of the same processing complexity. Secondly, providers are of the same processing power, so that the execution time for a given job is the same when run by different providers. These assumptions are made to avoid some pathological cases. Although it is possible to drop these assumptions, doing so may raise new issues of fairness; these are nevertheless interesting questions that can be pursued further.

As an example, what is considered a proper price for processing a job? It is natural to associate prices with the number of instructions and/or the space used rather than with mere job counts. By distinguishing job counts from job pricing, however, the notion of fairness needs to be re-evaluated. Even if each awaiting provider receives fair treatment in terms of job counts, as demanded by our fairness model addressed above, can some providers eventually earn much more than the others under certain request patterns? The problem becomes more challenging when providers can have quite diverse processing power, although it is natural to demand a "capitalism-oriented" policy in which capable providers receive requests proportional to their processing capabilities. Again, assuming all providers participate eagerly in a P2P network and all other aspects are equal, can some providers eventually earn much more than the others?

