Group locality and network locality - 針對雲端計算環境建置以分散式雜湊表為基礎之同儕式檔案系統

Next, we design an experiment on PlanetLab to evaluate the effects of group locality and network locality on performance. PlanetLab is a global research test bed comprising approximately 1025 machines distributed at 488 sites around the world in July 2009. When the experiment was running, only about 640 nodes were active.

0.0

k = 8, erasure 0.0116854546 0.3017631322 0.7882322328 0.9751780577 0.9989487992 k = 16, erasure 0.0004467193 0.1967625597 0.8437627667 0.9959893851 0.9999878177 full replication 0.3439000000 0.5904000000 0.7599000000 0.8704000000 0.9375000000

Average node availability

0.6 0.7 0.8 0.9 1

k = 8, erasure 0.9999871684 0.9999999735 0.9999999999 0.9999999999 1.0000000000 k = 16, erasure 0.9999999970 0.9999999999 0.9999999999 0.9999999999 1.0000000000 full replication 0.9744000000 0.9919000000 0.9984000000 0.9999000000 1.0000000000

Figure 4-2 Availability comparison when storage cost is four times.

The experiment setup is described below. First, a node was randomly selected as the file source. Latency between the file source node and all other nodes was measured, and group members consisted of the nodes with lower latency. Since Peeraid distributes a file to 32 nodes, we chose 32 file nodes according to three policies of file distribution, RANDOM, GROUP, and MIX, respectively. Nodes which received a request would download 1 KB shares from arbitrary eight of 32 file nodes of each

0.0 0.5 1.0 1.5 2.0 2.5

Number of group nodes

RANDOM MIX GROUP

RANDOM+ MIX+ GROUP+

32 48 128 320 480

Download time (s)

Figure 4-3 The internal nodes of group.

policy sequentially.

Then, the experiment was repeated with considering network locality. The only difference was that nodes which received a request would download 1 KB shares from eight nodes with the lowest latency of 32 file nodes of each policy. Peeraid can achieve this because latency information is measured by the DHT module.

Figure 4-3 shows how performance changes as the number of group nodes increases for internal nodes of the group. The symbol “+” means network locality was employed when downloading shares. It can be seen that, for internal nodes of the group, distributing file shares with GROUP policy can achieve the best performance; the next is MIX, and RANDOM is the worst. When the number of group nodes is 32, which is exactly sufficient to store 32 file shares, performance of GROUP is about 2.5 times better than MIX, and almost 5 times better than RANDOM without network locality.

0.0 0.5 1.0 1.5 2.0 2.5

Number of group nodes

RANDOM MIX GROUP

RANDOM+ MIX+ GROUP+

32 48 128 320 480

Download time (s)

Figure 4-4 The external nodes of group.

As the number of group nodes increases, performance of GROUP and MIX descends because the divergence of latency between group members also increases.

With network locality, performance of GROUP and RANDOM increases more than 2 times, and MIX increases almost 4 times when the number of group nodes is 32.

Because in MIX half of file nodes are selected from group members, downloading file shares from eight nodes with the lowest latency has the same effect as GROUP.

Although performance of GROUP and MIX descends as the number of group nodes increases, network locality still raises 2-3 times.

Figure 4-4 shows download results of external nodes of the group. For external nodes, GROUP and MIX are worse than RANDOM. When the number of group nodes is 32, performance of RANDOM is 1.6 times better than GROUP, and 1.2 times better

0.0 0.5 1.0 1.5 2.0 2.5

Number of group nodes

RANDOM MIX GROUP

RANDOM+ MIX+ GROUP+

32 48 128 320 480

Download time (s)

Figure 4-5 The file source node.

than MIX. Because external nodes of the group spread around the world, high degree of concentration of file shares are unfavorable for them.

Figure 4-5 shows download results of the file source node. It is similar to the internal nodes of group, but performance of GROUP is better. This property is good for the file system because the owner of a certain file is usually the user who accesses that file most often. Network locality also raises performance 2-3 times for external nodes and the file source node.

Next, we again repeat the experiment but increase the number of shares downloaded from one file node. Figure 4-6 and Figure 4-7 show the performance status with different download size from one file node when the number of group nodes is 32 and 48. It can be seen that the rank of three policies does not change when download size increases. For internal nodes of group, GROUP is still the best; the next

is MIX, and RANDOM is the worst. For external nodes of group, RANDOM is better than GROUP and MIX. However, the difference between three policies becomes greater when download size increases. Figure 4-8 shows download results when the number of group nodes is 320. The benefit from group nodes disappears and the curves of three policies are almost overlapped whether internal nodes or external nodes of group. With network locality, performance increases about 1.5-4 times no matter what kind of conditions.

These experiment results can make useful suggestions for file distribution. For those files which only group members have permission to access, users should distribute them with GROUP policy to achieve the best performance. However, for those files which are public to everyone, users should distribute them with MIX policy, because it has an average performance for internal nodes and external nodes of the group.

0 5 10 15 20 25

0 1K 10K 100K 1M

Share size (Byte)

RANDOM MIX GROUP

RANDOM+ MIX+ GROUP+

Download time (s)

1K 10K 100K 1M

RANDOM 2.180 2.368 5.518 22.854

MIX 1.069 1.579 3.443 12.435

GROUP 0.401 0.699 1.878 5.490

RANDOM+ 0.954 0.992 2.560 16.156

MIX+ 0.276 0.753 1.722 8.894

GROUP+ 0.144 0.372 0.575 2.408

(a) Internal nodes of group.

0 5 10 15 20 25 30 35 40

0 1K 10K 100K 1M

Share size (Byte)

RANDOM MIX GROUP

RANDOM+ MIX+ GROUP+

Download time (s)

1K 10K 100K 1M

RANDOM 1.003 1.382 2.558 14.201

MIX 1.292 1.942 3.038 21.903

GROUP 1.624 2.105 3.204 34.762

RANDOM+ 0.411 1.051 1.642 8.580

MIX+ 0.330 1.651 2.620 13.256

GROUP+ 1.117 1.865 2.691 21.333

(b) External nodes of group.

0 5 10 15 20 25

0 1K 10K 100K 1M

Share size (Byte)

RANDOM MIX GROUP

RANDOM+ MIX+ GROUP+

Download time (s)

1K 10K 100K 1M

RANDOM 1.899 2.104 5.305 23.593

MIX 0.916 1.237 2.476 10.626

GROUP 0.172 0.287 1.108 3.182

RANDOM+ 0.584 0.830 2.505 15.855

MIX+ 0.233 0.409 1.262 6.351

GROUP+ 0.166 0.197 0.410 2.141

(c) The file source node.

Figure 4-6 Performance comparison when the number of group nodes is 32.

0 5 10 15 20 25

0 1K 10K 100K 1M

Share size (Byte)

RANDOM MIX GROUP

RANDOM+ MIX+ GROUP+

Download time (s)

1K 10K 100K 1M

RANDOM 1.810 2.315 5.806 20.803

MIX 1.164 2.197 3.798 15.026

GROUP 0.464 1.020 2.765 6.937

RANDOM+ 0.944 1.236 2.728 14.604

MIX+ 0.362 0.803 1.948 8.051

GROUP+ 0.302 0.491 0.635 2.507

(a) Internal nodes of group.

0 5 10 15 20 25 30 35 40

0 1K 10K 100K 1M

Share size (Byte)

RANDOM MIX GROUP

RANDOM+ MIX+ GROUP+

Download time (s)

1K 10K 100K 1M

RANDOM 1.019 1.241 2.945 16.661

MIX 1.211 1.828 3.155 20.328

GROUP 1.359 1.918 3.182 33.386

RANDOM+ 0.418 1.110 2.227 9.081

MIX+ 0.378 1.359 2.770 13.216

GROUP+ 0.874 1.461 2.677 24.343

(b) External nodes of group.

0 5 10 15 20 25

0 1K 10K 100K 1M

Share size (Byte)

RANDOM MIX GROUP

RANDOM+ MIX+ GROUP+

Download time (s)

1K 10K 100K 1M

RANDOM 1.714 2.369 5.800 21.432

MIX 1.204 1.531 2.683 13.712

GROUP 0.200 0.842 2.263 4.831

RANDOM+ 0.509 0.914 2.683 12.584

MIX+ 0.219 0.467 1.569 7.110

GROUP+ 0.158 0.208 0.641 2.301

(c) The file source node.

Figure 4-7 Performance comparison when the number of group nodes is 48.

0 5 10 15 20 25 30

0 1K 10K 100K 1M

Share size (Byte)

RANDOM MIX GROUP

RANDOM+ MIX+ GROUP+

Download time (s)

1K 10K 100K 1M

RANDOM 2.193 2.338 4.936 23.687

MIX 1.539 2.301 5.337 25.903

GROUP 1.372 2.241 5.423 22.574

RANDOM+ 0.782 1.557 2.430 15.390

MIX+ 0.705 1.337 2.297 14.959

GROUP+ 0.752 1.369 2.451 16.014

(a) Internal nodes of group.

0 5 10 15 20 25

0 1K 10K 100K 1M

Share size (Byte)

RANDOM MIX GROUP

RANDOM+ MIX+ GROUP+

Download time (s)

1K 10K 100K 1M

RANDOM 0.950 1.564 3.015 15.270

MIX 1.160 1.641 2.876 16.117

GROUP 0.939 1.428 2.814 14.780

RANDOM+ 0.441 0.939 2.708 10.675

MIX+ 0.421 1.116 2.442 11.258

GROUP+ 0.411 1.098 2.593 13.572

(b) External nodes of group.

0 5 10 15 20 25 30

0 1K 10K 100K 1M

Share size (Byte)

RANDOM MIX GROUP

RANDOM+ MIX+ GROUP+

Download time (s)

1K 10K 100K 1M

RANDOM 1.714 2.369 5.800 21.432

MIX 1.204 1.531 2.683 13.712

GROUP 0.200 0.842 2.263 4.831

RANDOM+ 0.509 0.914 2.683 12.584

MIX+ 0.219 0.467 1.569 7.110

GROUP+ 0.158 0.208 0.641 2.301

(c) The file source node.

Figure 4-8 Performance comparison when the number of group nodes is 320.

5 Conclusion and Future Works

Peeraid is a high scalable, robust, and fault-tolerant peer-to-peer file system based on DHT for open clouds structured as a collection of peer nodes which are supplied by different participants around the world. It provides three basic mechanisms to ensure file security.

Erasure coding is applied for file backup to achieve higher data availability with less storage cost. Assuming the average node availability is 0.8, erasure coding achieves higher availability than replication when storage cost is more than double.

Peeraid supports the concept of group with two mechanisms. First, the information about a group is recorded in a SysInfo-object. Only the group manager has write permission for its SysInfo-object. Second, every group has their own routing table for key lookup. Peeraid combines file distribution with group locality to improve access performance. From our experiment results, distributing file shares with GROUP policy can achieve better access performance than MIX and RANDOM for the internal nodes of the group.

At present, Peeraid does not guarantee file consistence when multiple users write a sharing file. If there are conflicts between write logs, users have to select the ones they trust by themselves. A mechanism for synchronization should be provided in the future.

Besides, although erasure coding achieves higher data availability, we still need an efficient way to maintain the number of file shares in the system for durability.

Eventually, accounting is another important issue in our future research.

Bibliography

[1] A. Muthitacharoen, R. Morris, T. M. Gil, and B. Chen, “Ivy: a read/write peer-to-peer file system,” in OSDI, 2002.

[2] A. Rowstron and P. Druschel, “Pastry: scalable, distributed object location and routing for large-scale peer-to-peer systems,” in IFIP/ACM Middleware, 2001.

[3] A. Rowstron and P. Druschel, “Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility,” in ACM SOSP, 2001.

[4] B. Cohen, “Incentives build robustness in BitTorrent,” May, 2003

[5] B. G. Chun, F. Dabek, A. Haeberlen, E. Sit, H. Weatherspoon, M. F. Kaashoek, J. Kubiatowicz, and R. Morris, “Efficient replica maintenance for distributed storage systems,” in Proceeding of the 3rd Conferenc on 3rd Symposium on Networked Systems Design & Implementation, vol. 3, 2006.

[6] B. Y. Zhao, J. D. Kubiatowicz, and A. D. Joseph, “Tapestry: an infrastructure for fault-resilient wide-area location and routing,“ U.C. Berkeley, Tech. Rep.

No. USB//CSD-01-1141, April 2001.

[7] Clip2, The Gnutella Protocol Specification v0.4, 2000.

[8] D. A. Patterson, G. Gibson, and R. H. Katz, “A case for redundant arrays of inexpensive disks,” ACM SIGMOD Record, vol. 17, pp. 109-116, June 1988.

[9] D. Mazi è res, M. Kaminsky, M. F. Kaashoek, and Emmett Witchel, ”Separating key management from file system security,” ACM SIGOPS Operating Systems Review, vol. 33, pp. 124-139, December 1999.

[10] F. Dabek et al, “Wide-area cooperative storage with CFS,” in Usenix SOSP, 2001.

[11] F. Dabek, J. Li, E. Sit, J. Robertson, M. F. Kaashoek, and R. Morris, “Design a DHT for low latency and high throughput,” in USENIX-NSDI, 2004.

[12] FIPS 180-1, Secure hash standard, U.S. Department of Commerce/NIST, National Technical Information Service, Springfield, VA, April 1995.

[13] H. Weatherspoon and J. D. Kubiatowicz, “Erasure coding vs. replication: a quantitative comparison,” in Electronic Proceedings for the 1st International Workshop on Peer-to-Peer Systems, March 2002.

[14] I. Stoica, R. Morris, D. Karger, M. F. Kaashoek, and H. Balakrishnan, “Chord:

a scalable peer-to-peer lookup service for Internet applications,” in ACM SIGCOMM, 2001.

[15] J. Kubiatowicz et al, “OceanStore: an architecture for global-scale persistent storage,” in ACM ASPLOS, 2000.

[16] Kazaa, http://www.kazaa.com.

[17] L. Vaquero, L. Rodero-Merino, J. Caceres, and M. Lindner, “A Break in the Clouds: Towards a Cloud Definition,” ACM SIGCOMM Computer Communication Review, vol. 39, pp. 50-55, January 2009.

[18] M. A. Armbrust et al, ”Above the clouds: a Berkeley view of cloud computing,” UC Berkeley Reliable Adaptive Distributed Systems Laboratory, Tech. Rep. No. UCB/EECS-2009-28, 2009.

[19] M. D. Stefano, Distributed data management for grid computing, Hoboken, N.J., Wiley-Interscience, 2005.

[20] Morpheus, http://www.morpheus.com

[21] M. Satyanarayanan et al, ”Coda: a highly available file system for a distributed workstation environment,” IEEE Transactions on Computers, vol.

39, pp. 447-459, April 1990.

[22] Napster Inc., http://www.napster.com.

[23] P. Maymounkov and D. Mazières, “Kademlia: a peer-to-peer information system based on the XOR metric,” in Electronic Proceedings for the 1st International Workshop on Peer-to-Peer Systems, March 2002.

[24] R. Pike, D. Presotto, S. Dorward, B. Flandrena, K. Thompson, H. Trickey, and P. Winterbottom, “Plan 9 from Bell Labs,” Computing Systems, 1995.

[25] S. Ratnasamy, P. Francis, M. Handley, R. Karp, and S. Shenker, “A scalable content-addressable network,” in ACM SIGCOMM, August 2001.

[26] S. Shepler et al, Network File System (NFS) Version 4 Protocol, RFC 3530, April 2003.

在文檔中針對雲端計算環境建置以分散式雜湊表為基礎之同儕式檔案系統 (頁 40-57)