The Grid system is a powerful distributed computing system that provides data sharing and large-scale computing resources. In this study, we designed an efficient decentralized Grid middleware, named SORS, built on Peer-to-Peer technology. The decentralized structure of SORS includes two kinds of nodes: super peers and general peers.
The super peers are responsible for collecting and integrating site resources. The general peers are responsible for reporting their own status information to a super peer. We keep the system lightweight by adopting simple and efficient methods.
For resource collection, site information is gathered through a decentralized structure, in contrast to the earlier client/server architecture. For job management, we adopt distributed resource brokers to allocate jobs between sites and to migrate idle/waiting jobs to a site that can accommodate them. An adaptive load sharing policy can address wasted resources and unevenly distributed resources. We therefore propose two load sharing algorithms to improve overall system performance. One, named the load barrier policy, considers the degree of CPU usage; the other, named the heterogeneity policy, considers the heterogeneity factors in the Grid system.
Experimental results show that the average CPU utilization of computing resources increases from 28.52989% (100/10) to 49.2584% (10/100); the system load is shared and overall system utilization increases. In addition, SORS completes cross-Internet job migration. Problems of earlier Grid systems, such as dynamic resource allocation and cross-site/VO communication, are addressed in this study by accomplishing cross-VO job migration.
In future work, we will incorporate more factors into our prototype, such as communication delay, heterogeneous jobs, and more advanced load sharing strategies.
Appendix A SORS User Guide
Software requirements:
Java JDK 1.6.0 or later
Condor 6.7.20
Condor Java API: http://staff.aist.go.jp/hide-nakada/condor_java_api/
Figure A - 1 The SORS component diagram
[Figure: SORS comprises five components: Configure, Information Service, File Transfer, Execution Management, and Load Sharing. The diagram lists the following source files, grouped as in the figure:]
start.java, config.java, readfile.java, GetTime.java
FileReceiver.java, FileSender.java, XferDaemon.java, xfer.java
infoSubmit.java, JaverTest.java, PeerL.java, PeerR.java, LinuxSystemTool.java
CheckJobDes.java, queue.java, jobSubmit.java, jobrm.java, jobExecute.java
CheckLoad.java, allocation.java, Compare.java, Compare_LB.java, Compare_HP.java
Peer discovery
Peer discovery is based on /etc/hosts. After /etc/hosts is edited, SORS builds a new hostname list, as shown in Figure A - 3. Figure A - 4 shows the peer discovery screen.
tcp://210.240.196.2:9701 http://210.240.196.2:9700
tcp://210.240.196.3:9701 http://210.240.196.3:9700
tcp://210.240.196.77:9701 http://210.240.196.77:9700
tcp://210.240.196.78:9701 http://210.240.196.78:9700
Figure A - 3 The peer list format of SORS
Figure A - 4 The peer discovery of SORS
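The peer list format in Figure A - 3 can be read as whitespace-separated pairs, each holding a tcp:// endpoint followed by an http:// endpoint for the same host. The following is a minimal sketch of a parser for that format, assuming this pairing holds for every entry. The class names PeerEntry and PeerListParser are hypothetical and are not part of the SORS source files; only java.net.URI from the standard library is used.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

/** One peer entry: a TCP endpoint and an HTTP endpoint (hypothetical type, not in SORS). */
final class PeerEntry {
    final URI tcp;
    final URI http;

    PeerEntry(URI tcp, URI http) {
        this.tcp = tcp;
        this.http = http;
    }
}

/** Hypothetical helper that parses peer list text in the Figure A - 3 format. */
final class PeerListParser {

    /** Splits the text on whitespace and reads the tokens as (tcp, http) pairs. */
    static List<PeerEntry> parse(String text) {
        List<PeerEntry> peers = new ArrayList<PeerEntry>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return peers;
        }
        String[] tokens = trimmed.split("\\s+");
        if (tokens.length % 2 != 0) {
            throw new IllegalArgumentException("expected tcp/http pairs, got an odd token count");
        }
        for (int i = 0; i < tokens.length; i += 2) {
            URI tcp = URI.create(tokens[i]);
            URI http = URI.create(tokens[i + 1]);
            // Assumption: the first URI of each pair is tcp:// and the second is http://.
            if (!"tcp".equals(tcp.getScheme()) || !"http".equals(http.getScheme())) {
                throw new IllegalArgumentException("unexpected scheme in pair starting at token " + i);
            }
            peers.add(new PeerEntry(tcp, http));
        }
        return peers;
    }
}
```

Calling PeerListParser.parse on the four entries listed above would yield four PeerEntry objects, whose host and port are available through getHost() and getPort() on each URI.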
The load sharing policy support
The load sharing policy is defined in Compare.java. You can use one of the built-in policies or write a new one. To select a built-in policy, copy its file over Compare.java.
The load barrier policy: Compare_LB.java
    cp Compare_LB.java Compare.java
The heterogeneity policy: Compare_HP.java
    cp Compare_HP.java Compare.java
Figure A - 5 The load sharing service of SORS
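As a starting point for a user-written policy, the following is a minimal sketch of a threshold-style decision based on CPU usage. It is an illustration only: this guide does not show the contents of Compare_LB.java, so the class name BarrierPolicySketch, the method chooseTarget, its parameters, and the rule itself (migrate only when local CPU usage exceeds a barrier, and only to the least-loaded remote site below that barrier) are all assumptions.

```java
import java.util.Map;

/** Hypothetical threshold-style policy; this is NOT the code of Compare_LB.java. */
final class BarrierPolicySketch {

    /**
     * Picks a remote site for an idle job, or returns null to keep the job local.
     *
     * @param localCpu  local CPU usage in [0, 1]
     * @param barrier   assumed CPU-usage barrier in [0, 1]
     * @param remoteCpu CPU usage of each candidate remote site, keyed by site name
     */
    static String chooseTarget(double localCpu, double barrier, Map<String, Double> remoteCpu) {
        if (localCpu <= barrier) {
            return null; // the local site is not above the barrier, so nothing is migrated
        }
        String best = null;
        double bestLoad = barrier; // only sites strictly below the barrier qualify
        for (Map.Entry<String, Double> site : remoteCpu.entrySet()) {
            double load = site.getValue();
            if (load < bestLoad) {
                best = site.getKey();
                bestLoad = load;
            }
        }
        return best; // null when every remote site is at or above the barrier
    }
}
```

With a barrier of 0.50, a local usage of 0.90, and remote sites at 0.70 and 0.20, this sketch selects the site at 0.20. If every remote site is at or above the barrier, it returns null and the job stays in the local queue.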
Execution Management
Before starting SORS, the user must make sure that the Condor environment is set up. If there is an idle job in the Condor job queue, SORS removes it, as shown in Figure A - 6. SORS then submits the job description to the remote site, as shown in Figure A - 7.
Figure A - 6 SORS removes the idle job from the Condor job queue
Figure A - 7 SORS submits the idle job to the remote site
Information Transfer
The information transfer service in SORS is divided into computing resource information transfer (infoSubmit.java) and job transfer (jobSubmit.java). The computing resource information serves the information service and includes the memory state, CPU state, job status, and so on, as shown in Figure A - 8. The job transfer serves the job execution service and includes the job description (*.submit), job data (*.log), and the job file.
[Figure: sample computing resource record. Visible content: the value 506434.0 and the labels Average Job Response Time, Idle job, and Running job.]
Figure A - 8 The computing resource information format
Appendix B Load Barrier Measure
Load:
100       90        80        70        60        50        40        30        20        10        L\R(%)
0.239369  0.246971  0.254796  0.314469  0.337617  0.342378  0.341129  0.38      0.396324  0.394419  10
0.250658  0.240467  0.294391  0.32897   0.341868  0.328988  0.328551  0.368035  0.403476  0.38962   20
0.249952  0.236538  0.301174  0.32897   0.328551  0.336776  0.341868  0.386363  0.40978   0.403857  30
0.257318  0.247723  0.308303  0.357149  0.353336  0.365041  0.367233  0.371194  0.439334  0.405773  40
0.240467  0.269247  0.311467  0.3276    0.384437  0.352925  0.357188  0.407149  0.444647  0.424138  50
0.261107  0.260767  0.315884  0.350781  0.402655  0.379064  0.376379  0.439271  0.442434  0.443732  60
0.272251  0.247735  0.305201  0.349098  0.384506  0.368772  0.392329  0.464792  0.442658  0.432487  70
0.253077  0.25608   0.319455  0.361414  0.381352  0.380671  0.421352  0.437693  0.461592  0.443732  80
0.268295  0.265016  0.284014  0.390417  0.397571  0.400629  0.409846  0.424304  0.441231  0.466537  90
0.26681   0.276732  0.326732  0.384988  0.409846  0.423652  0.435635  0.452373  0.462191  0.476165  100

TRT:
100       90        80        70        60        50        40        30        20        10        L\R(%)
5165      4332      4015      4040.667  3568      3376.667  3147      3102      3365.8    3096.6    10
4878.5    4550      3821      3965.333  3360      3459      3081      3007      3037.667  3117      20
4943      4575      3703      3718      3473      3436      3286      3351      3179.5    3190      30
4758      4277      3798.25   3473.333  2937      3329      3133      3302.5    2894      3046      40
4992      4385      3708      3690.667  3246.667  3219.333  3041.5    3067.333  3119.333  3214      50
5001      4250      3968.5    3955      3086.333  3005      3384      3344      3020      3248      60
5087      4413.5    3902.333  3899      2953      2937      2902      3319      3046      2987      70
5039      4320      3930.5    3529      3156      3354      2962      3257      3215      3046      80
4878      4311      3863      3438      3050      3078      3245      3292      2972      3093      90
4716.333  4247      3784      3617.333  3142.333  2979.667  2960      3082      3029      2855      100