
Chapter 4 Simulation Results

4.2 Simulation results

We used the Gini coefficient (G) as the load balancing index to evaluate how evenly the indexes are distributed over the zones. G ranges from 0 to 1.

The closer G is to 0, the more balanced the load. G is computed as follows [22]:

$$ G = \frac{\sum_{i=1}^{N} \sum_{j=1}^{N} \left| l_i - l_j \right|}{2 N^2 \mu} $$

When G is calculated for the number of published indexes in each zone, N is the number of zones (N = 256), l_i and l_j are the numbers of indexes handled by the i-th and j-th zones, respectively, and µ is the average number of indexes handled by each zone. For calculating G
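The computation of G can be sketched in a few lines. This is a minimal illustration, assuming the standard mean-difference form of the Gini coefficient, G = ΣΣ|l_i − l_j| / (2N²µ), which matches the symbols defined above; the function name `gini` and the sample loads are hypothetical and not taken from the simulator.

```python
from typing import Sequence


def gini(loads: Sequence[float]) -> float:
    """Gini coefficient of per-zone loads, assuming the mean-difference form.

    loads[i] plays the role of l_i, len(loads) is N, and mu is the average load.
    """
    n = len(loads)
    if n == 0:
        raise ValueError("loads must not be empty")
    mu = sum(loads) / n
    if mu == 0:
        return 0.0  # no load anywhere: treated here as fully balanced (assumption)
    # Direct double sum over all ordered pairs (i, j); N = 256 keeps this cheap.
    pair_sum = sum(abs(a - b) for a in loads for b in loads)
    return pair_sum / (2 * n * n * mu)


# Hypothetical examples: a perfectly even load and a fully concentrated load.
even = gini([100] * 256)
concentrated = gini([0] * 255 + [1000])
```

With N zones, a load that sits entirely in one zone gives G = (N − 1)/N, so the upper end of the range is approached but not reached for finite N.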

Table III. Simulation parameter settings.

Parameter                                       Value
Number of KAD peers                             256 × 8,000
Number of KAD zones                             256
Peers per zone                                  8,000
Number of different keywords                    1,000,000
Keyword popularity distribution                 Zipf’s law [18]
Search distribution                             Zipf’s law [18]
Ratio of publish messages to search messages    10 : 1
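To make the parameter settings concrete, the following sketch generates a Zipf-distributed publish workload and counts the publishes that land in each of the 256 zones. It is an illustration under stated assumptions, not the simulator used here: the keyword count is scaled down for a quick run, the Zipf exponent of 1.0 and the number of publishes are assumed, and `zone_of` is a hypothetical stand-in for the real keyword-to-zone mapping.

```python
import random
from collections import Counter
from typing import List

NUM_ZONES = 256          # from Table III
NUM_KEYWORDS = 10_000    # scaled down from 1,000,000 for a quick run (assumption)
NUM_PUBLISHES = 100_000  # hypothetical number of publish messages
ZIPF_EXPONENT = 1.0      # assumed; the text only states "Zipf's law"


def zone_of(keyword_id: int) -> int:
    """Hypothetical keyword-to-zone mapping: one fixed pseudo-random zone per keyword."""
    return random.Random(keyword_id).randrange(NUM_ZONES)


def simulate_publish_load(seed: int = 0) -> List[int]:
    """Return the number of publishes received by each zone, without any balancing."""
    rng = random.Random(seed)
    # Zipf weights: the keyword of rank r is chosen with probability proportional to 1 / r^s.
    weights = [1.0 / (rank ** ZIPF_EXPONENT) for rank in range(1, NUM_KEYWORDS + 1)]
    zones = [zone_of(k) for k in range(NUM_KEYWORDS)]  # computed once per keyword
    published = rng.choices(range(NUM_KEYWORDS), weights=weights, k=NUM_PUBLISHES)
    per_zone = Counter(zones[k] for k in published)
    return [per_zone.get(z, 0) for z in range(NUM_ZONES)]


loads = simulate_publish_load()
```

Because a few keywords dominate under Zipf's law, the zones that hold them receive far more publishes than the average zone, which is the imbalance the Gini coefficient is meant to capture.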

Because RFT affects the performance of KAD, KAD-7, and the proposed KAD-mod, we conducted experiments to determine the best RFT. Figure 11 shows G for the number of indexes published in each zone under different RFT values. We found that the lowest value of G occurs when RFT is between 5000 and 6000.

The proposed KAD-mod raises two issues. First, the average hop count of finding a target to publish an index increases after applying the KAD-mod method. We used the results of [17] to evaluate this average hop count. Figure 12 shows the average hop count of finding a target to publish an index under different RFT values.

In our method, peers that receive KAD_REQs for some popular keywords may need to redirect them to other peers, because the total number of indexes of a popular keyword at the receiving peer exceeds RFT. Each redirection of a KAD_REQ needs an additional hop to find the next target.
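The effect of redirection on the hop count can be illustrated with a small sketch. It models each zone as a single store and redirects a publish once the store already holds RFT indexes for the keyword. The next-target rule used below, (zone + 1) mod N, is only a placeholder assumption and is not the rule defined for KAD-mod; the class name `Network` and its methods are likewise hypothetical.

```python
from collections import defaultdict

NUM_ZONES = 256  # from Table III
RFT = 6000       # the value reported as best in the text


class Network:
    """Toy model: one index store per zone, with an RFT limit per keyword."""

    def __init__(self, num_zones: int = NUM_ZONES, rft: int = RFT) -> None:
        self.num_zones = num_zones
        self.rft = rft
        # stored[zone][keyword] -> number of indexes held for that keyword
        self.stored = [defaultdict(int) for _ in range(num_zones)]

    def publish(self, keyword: str, home_zone: int) -> int:
        """Store one index and return the number of extra hops (redirections)."""
        zone = home_zone
        for extra_hops in range(self.num_zones):
            if self.stored[zone][keyword] < self.rft:
                self.stored[zone][keyword] += 1
                return extra_hops
            # Placeholder redirect rule (assumption): try the next zone, modulo N.
            zone = (zone + 1) % self.num_zones
        raise RuntimeError("every zone is at the RFT limit for this keyword")


# Tiny hypothetical run: 4 zones, RFT = 2, five publishes of one popular keyword.
net = Network(num_zones=4, rft=2)
hops = [net.publish("popular", home_zone=3) for _ in range(5)]
```

In this toy run the first two publishes stay in the home zone and later ones pay one or more extra hops, which mirrors the trade-off between a lower RFT (better balance) and a higher average hop count.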

Figure 11. The Gini coefficient regarding the number of indexes published in each zone under different RFT values. (x-axis: RFT, 1000 to 20000; y-axis: Gini coefficient (G))

Second, the number of search messages also increases after applying the proposed KAD-mod method, whereas the number of publish messages is not affected. Note that the total network messages comprise search messages and publish messages. The percentage of extra traffic (T_e) for KAD-mod is defined as:

T_e increases with a higher RFT, as shown in Figure 11. We found that 6000 is the optimal RFT for the proposed KAD-mod. Recall that the percentage of extra traffic for KAD-mod is small (8% for RFT = 6000) because the number of search messages is much smaller than the number of publish messages.

Figure 12. The average hop count of finding a target peer to publish an index under different RFT values.

4.3 Comparison with existing load balancing schemes

KAD-mod publishes popular indexes in a more balanced way than KAD-7 and KAD. The proposed KAD-mod can publish indexes to all 256 zones when the number of indexes at the original publish target peer exceeds RFT, while KAD-7 and KAD can only publish indexes to seven zones and one zone, respectively. Figure 14 compares the Gini coefficient of the number of indexes in each zone among KAD-mod, KAD-7, and KAD. We found that KAD-mod is more balanced than KAD-7 and KAD. Figure 15(a) shows the percentage of extra traffic compared to KAD. KAD-mod incurs only 0.68% more extra network traffic than KAD-7. In Figure 15(b), we observed that KAD-mod’s average publish hop count is only 0.5 hops more than that of KAD and KAD-7.

Figure 13. The percentage of extra traffic under different RFT values. (x-axis: RFT, 1000 to 20000; y-axis: percentage of extra traffic (%); labeled value: 8.2827)

Figure 14. Comparison of publish load balancing among the three schemes in terms of the Gini coefficient.

Indexes stored in failed peers are called missing indexes. Objects referenced by missing indexes are not searchable. The search hit rate is calculated as 1 − (number of missing indexes / total number of indexes). According to [25], about 27% of peers fail in a day. Figure 16 shows that the search hit rates of KAD-mod and KAD-7 are 98.24% and 97.71%, respectively, when 27% of peers have failed.
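The hit-rate definition above amounts to a one-line calculation. The sketch below is only a worked example of that formula; the function name `search_hit_rate` and the index counts are hypothetical and chosen for illustration, not taken from the simulation.

```python
def search_hit_rate(missing_indexes: int, total_indexes: int) -> float:
    """Search hit rate = 1 - (missing indexes / total indexes)."""
    if total_indexes <= 0:
        raise ValueError("total_indexes must be positive")
    if not 0 <= missing_indexes <= total_indexes:
        raise ValueError("missing_indexes must lie between 0 and total_indexes")
    return 1.0 - missing_indexes / total_indexes


# Hypothetical counts: 1,760 of 100,000 indexes are held only by failed peers.
rate = search_hit_rate(missing_indexes=1_760, total_indexes=100_000)
percent = round(rate * 100, 2)
```

Note that the hit rate depends on how many indexes become unreachable, not directly on how many peers fail, so a scheme that spreads indexes over more peers can keep the hit rate high even at a substantial peer failure rate.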

Figure 16. The search hit rate with respect to the peer failure rate for different approaches.

The publish load affects the request load, so we also evaluate the load balancing of requests. Since MHF [24] did not describe how to publish indexes, we include MHF [24] only in the comparison of request load balancing. MHF [24] sets the threshold of the request rate to 800 requests per second. Figure 17 shows G of the request load under the best, average, and worst cases of MHF. In the best case, all requests are from the same peer and the request rate is higher than the threshold all the time. The best case is almost impossible in practice because it does not match the characteristics of real P2P networks. In the worst case, the rates of all requests are lower than the threshold, which also does not match the characteristics of real P2P networks. The average case reflects the characteristics of real P2P networks.

For the proposed KAD-mod, because indexes are evenly published, the request load becomes even as well. Figure 18 compares G of the request load among the four approaches. KAD-mod performs best in terms of G of the request load because, in KAD-mod, more popular indexes are handled by more peers.

Figure 17. G’s of the request load for the best, average, and worst cases of MHF.

0.1898

Figure 18. Comparison of G’s regarding the request load for four representative approaches.

Chapter 5 Conclusion

5.1 Concluding remarks

In this paper, we have presented an efficient modulo-based method (KAD-mod) to balance the publish load and request load of KAD P2P networks. Our approach also improves the hit rate of keyword searching. The proposed KAD-mod is a simple and effective method that requires no complex calculations. By redirecting overloaded indexes, indexes are distributed more evenly, and both the publish load and the request load of each peer become more balanced. Although the average hop count of finding a target to publish an index and the total network traffic both increase slightly, these overheads are very small. Based on the simulation results, the G (0 ≤ G ≤ 1, 0: fully balanced) of the publish load is 0.23 for KAD-mod, 0.80 for KAD-7, and 0.93 for KAD. The G of the request load is 0.33 for KAD-mod, 0.67 for KAD-7, and 0.83 for KAD. KAD-mod improves the search hit rate to 98% while causing only 8% extra traffic, and its average publish hop count is only 0.5 hops more than that of KAD and KAD-7. Our method not only improves search resilience but also balances the publish and request load among peers in KAD P2P networks.

5.2 Future work

The proposed KAD-mod is a simple and effective method for achieving publish load balancing, request load balancing, and search resilience. In the future, we will adapt our method so that it is applicable to other DHT-based P2P networks. In addition, how to flexibly adjust RFT to balance the load in the KAD P2P network when the number of indexes becomes very large deserves further study.

Bibliography

[1] D. Kundur, Z. Liu, M. Merabti, and H. Yu, “Advances in peer-to-peer content search,” in Proceedings of the IEEE International Conference on Multimedia and Expo, pp. 404-407, July 2007.

[2] “The Gnutella 0.4 protocol specification, 2000.” [Online]. Available: http://dss.clip2.com/GnutellaProtocol04.pdf

[3] A. Oram et al., Peer-to-Peer: Harnessing the Power of Disruptive Technologies, 2001, O'Reilly.

[4] I. Clarke, T. W. Hong, O. Sandberg, and B. Wiley, “Protecting free expression online with Freenet,” IEEE Trans. Internet Computing, vol. 6, no. 1, pp. 40-49, 2002.

[5] P. Maymounkov and D. Mazieres, “Kademlia: A peer-to-peer information system based on the XOR metric”, in Proceedings of the 1st International Workshop on Peer-to-Peer Systems (IPTPS), pp. 53-65, March 2002.

[6] I. Stoica, R. Morris, D. R. Karger, M. F. Kaashoek, and H. Balakrishnan, “Chord: A scalable peer-to-peer lookup service for Internet applications,” in Proceedings of the Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications, pp. 149-160, August 2001.

[7] S. Ratnasamy, P. Francis, M. Handley, R. Karp, and S. Shenker, “A scalable content-addressable network,” in Proc. ACM Applications, Technologies, Architectures, and Protocols for Computer Communications, pp. 161-172, August 2001.

[8] A. Rowstron and P. Druschel, “Pastry: Scalable, decentralized object location and routing for large-scale peer-to-peer systems,” in Proceedings of the Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications, pp. 161-172, August 2001.

[9] B. Y. Zhao, J. Kubiatowicz, and A. D. Joseph, “Tapestry: An infrastructure for fault-tolerant wide-area location and routing,” University of California, Berkeley, Tech. Rep. UCB/CSD-01-1141, April 2001.

[10] “eMule Project,” [Online]. Available: http://www.emule.com/

[11] “BitTorrent,” [Online]. Available: http://www.bittorrent.com/

[12] “aMule Project,” [Online]. Available: http://www.amule.org/

[13] B. T. Loo, J. M. Hellerstein, R. Huebsch, S. Shenker, and I. Stoica. “Enhancing P2P File-Sharing with an Internet-Scale Query Processor,” Proceedings of the Thirtieth international Conference on Very large data bases, vol. 30, pp. 432-443, 2004.

[14] B. T. Loo, R. Huebsch, I. Stoica, and J. Hellerstein, “The case for a hybrid P2P search infrastructure,” in Proceedings of the 4th International Workshop on Peer-to-Peer Systems (IPTPS), pp. 141-150, February 2004.

[15] Y.J. Joung, L.W. Yang, and C.T. Fang, "Keyword search in DHT-based peer-to-peer networks," IEEE Journal on Selected Areas in Communications, vol. 25, pp. 46-61, January 2007.

[16] R. Brunner, “A performance evaluation of the KAD-protocol,” Master’s Thesis, University of Mannheim and Institut Eurecom, November 2006.

[17] M. Steiner, D. Carra, and E. W. Biersack, “Faster content access in KAD,” in Proceedings of the Eighth International Conference on Peer-to-Peer Computing, pp. 195-204, September 2008.

[18] M. Steiner, W. Effelsberg, T. En-Najjary, and E. W. Biersack, “Load reduction in the KAD peer-to-peer system,” in Proceedings of the 5th International Workshop on

dynamic structured p2p systems,” in Proceedings of the IEEE INFOCOM, pp. 2253-2262, March 2004.

[21] M. Steiner, T. En-Najjary, and E. W. Biersack, “Long term study of peer behavior in the KAD DHT,” IEEE/ACM Transactions on Networking, 2009.

[22] T. Pitoura and P. Triantafillou, “Load distribution fairness in P2P data management systems,” in Proceedings of the IEEE 23rd International Conference on Data Engineering, pp. 396-405, April 2007.

[23] T. T. Wu and K. C. Wang, “An efficient load balancing scheme for resilient search in KAD peer to peer networks,” in Proceedings of the Ninth IEEE Malaysia International Conference on Communications, pp. 759-764, March 2010.

[24] Y. Mu, C. Yu, T. Ma, C. Zhang, W. Zheng, and X. Zhang, “Dynamic load balancing with multiple hash functions in structured P2P system,” in Proceedings of the 5th International Conference on Wireless Communications, Networking and Mobile Computing (WiCom '09), October 2009.

[25] M. Steiner, T. En-Najjary, and E. W. Biersack, “A global view of KAD,” in Proceedings of the 7th ACM SIGCOMM Conference on Internet Measurement, pp. 117-122, October 2007.

[26] E. K. Lua, J. Crowcroft, M. Pias, R. Sharma, and S. Lim, “A survey and comparison of peer-to-peer overlay network schemes,” IEEE Communications Surveys & Tutorials, vol. 7, no. 2, pp. 72-93, 2005.

[27] “Internet Study 2007,” [Online]. Available: http://www.ipoque.com/

[28] Y. Zhu and Y. Hu, “Efficient, proximity-aware load balancing for DHT-based P2P systems,” IEEE Trans. Parallel and Distributed Systems, vol. 16, no. 4, pp. 349-361, 2005.

[29] M. Castro, M. Costa, and A. Rowstron, “Peer-to-peer overlays: structured, unstructured, or both?,” Microsoft Research, Cambridge, CB3 0FB, UK, 2004.
