Conclusions and Future Work

5.1 Conclusions

Using a GPU for computation-dense problems normally brings a remarkable performance improvement, provided that the problem can be parallelized. Advances in GPGPU have already sped up many applications that used to be computed on the host CPU. In this paper, we surveyed the background of existing main-memory databases and the CUDA programming model, and we proposed an entirely new architecture for an experimental database. The major compute-bound parts of our database operations are data sorting, the prefix-sum process, and the calculation of aggregate functions. The proportion of these operations relative to memory-I/O-bound operations such as selection queries is small.
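As a reminder of why the prefix-sum process matters for query processing, the following is a minimal sequential sketch of selection by flag, scan, and scatter. It assumes that prefix sums are used to compact the selected records, which is a common pattern but is not confirmed here as the exact method of our GPU DB; the names `exclusive_scan` and `select_compact` are hypothetical, and each loop stands in for a data-parallel step.

```python
from itertools import accumulate


def exclusive_scan(flags):
    """Exclusive prefix sum: out[i] is the sum of flags[0..i-1]."""
    inclusive = list(accumulate(flags))
    return ([0] + inclusive[:-1]) if flags else []


def select_compact(records, predicate):
    """Select records by flagging, scanning, and scattering.

    On a GPU each of the three steps is data-parallel; here they are
    written as plain Python loops for illustration only.
    """
    # Step 1: one flag per record (1 = selected, 0 = rejected).
    flags = [1 if predicate(r) else 0 for r in records]
    # Step 2: the prefix sum gives each selected record its output slot.
    offsets = exclusive_scan(flags)
    total = (offsets[-1] + flags[-1]) if records else 0
    # Step 3: scatter selected records to their slots.
    out = [None] * total
    for i, record in enumerate(records):
        if flags[i]:
            out[offsets[i]] = record
    return out
```

For example, `select_compact([5, 12, 7, 30, 1], lambda x: x > 6)` keeps the three values greater than 6 in their original order. The scan is what lets every selected record know its output position without coordinating with the others.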

The experimental results show that the operations of our GPU DB take almost the same execution time for a given total number of records; the execution time is not sensitive to the number of records returned by a query. SQLite behaves differently: the more records there are in the result table, the longer SQLite takes. In contrast, our GPU DB is insensitive to the number of records in the result table but sensitive to the total number of records in the data table. Based on this, we evaluated the turning point between our GPU DB and the SQLite in-memory database. The turning points followed an approximately linear trend, from which we derived the approximate ratio of queried records to total records at which the two systems break even. This ratio is small, about 0.161% to 2.061% depending on the function, and we believe that it is easily exceeded in common cases.
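One way to see why a linear trend in the turning points implies a roughly constant break-even ratio is a simple linear cost model. The sketch below is an assumption for illustration, not the fitted model of our experiments: it supposes that the GPU DB time is g*N for N total records and that the SQLite time is s*N + r*k for k result records. The coefficients `g`, `s`, and `r` and both function names are hypothetical.

```python
def break_even_ratio(g, s, r):
    """Ratio k/N at which the two assumed cost models are equal.

    Assumed model (not measured):
        t_gpu    = g * N           depends only on total records N
        t_sqlite = s * N + r * k   grows with result records k
    Setting them equal gives k / N = (g - s) / r.
    """
    if r <= 0:
        raise ValueError("per-result cost r must be positive")
    # If the GPU is already cheaper per scanned record, it wins for any k.
    return max(0.0, (g - s) / r)


def faster_system(total_records, result_records, g, s, r):
    """Return which system the assumed model predicts to be faster."""
    t_gpu = g * total_records
    t_sqlite = s * total_records + r * result_records
    return "gpu" if t_gpu < t_sqlite else "sqlite"
```

Under this model the break-even number of result records is k = N * (g - s) / r, which is linear in N, so the ratio k/N stays constant as the table grows. Any real coefficients would have to be fitted from measurements for each function.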

5.2 Future Work

Although we have demonstrated the capability of our GPU DB, many issues remain to be addressed in order to improve our database.

(1) Parallel string data query

Because there is no appropriate solution for parallel string queries, our GPU DB currently supports only numeric data. Each 2-D array stores data of a single type, so string data has to be stored either as a numeric type or in a 2-D array separate from the one holding the numeric data. Words are ordered alphabetically, so a comparison between two words has to proceed from prefix to suffix. This sequential nature of string comparison seems hard to overcome with parallel computation. Nevertheless, string comparison is necessary for a complete database implementation.
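One possible direction, sketched here under stated assumptions rather than as a finished design, is to store each string as a fixed-width row of character codes in a numeric 2-D array and to express the prefix-to-suffix comparison as an element-wise difference followed by a minimum-index reduction. The fixed width, the zero padding, and the names `encode` and `compare_rows` are hypothetical, and the ordering is by code point rather than by any locale-specific collation.

```python
def encode(word, width):
    """Encode a string as a fixed-width row of code points, zero-padded.

    Strings longer than `width` are truncated (an assumption of this sketch).
    """
    codes = [ord(c) for c in word[:width]]
    return codes + [0] * (width - len(codes))


def compare_rows(a, b):
    """Compare two encoded rows; return -1, 0, or 1.

    The per-position differences are independent of each other, so they
    could be computed by one thread per character. Finding the first
    nonzero difference is a minimum-index reduction, which is also a
    standard parallel primitive.
    """
    diffs = [x - y for x, y in zip(a, b)]
    first = min((i for i, d in enumerate(diffs) if d != 0), default=None)
    if first is None:
        return 0
    return -1 if diffs[first] < 0 else 1
```

With a width of 8, `compare_rows(encode("apple", 8), encode("apply", 8))` orders "apple" before "apply" because the rows first differ at the last letter. The cost of this layout is wasted space for short strings and a hard limit on string length.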

(2) Join Query

An SQL join query combines fields from two tables by using values common to both. Our GPU DB does not currently support join queries. In the future, we will design a relation table to manage the relationships between tables.
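Since sorting is already one of the core operations of our GPU DB, a sort-merge join is one natural candidate for a future implementation. The following sequential sketch illustrates the idea for an equi-join on numeric keys; it is not the planned relation-table design, and the function name and the tuple-based table layout are hypothetical.

```python
def sort_merge_join(left, right, left_key, right_key):
    """Equi-join two lists of tuples on the given column indices."""
    # Sort both inputs by their join column (the parallelizable part).
    left_sorted = sorted(left, key=lambda row: row[left_key])
    right_sorted = sorted(right, key=lambda row: row[right_key])

    result = []
    i = j = 0
    while i < len(left_sorted) and j < len(right_sorted):
        lk = left_sorted[i][left_key]
        rk = right_sorted[j][right_key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of right rows that share this key.
            j_end = j
            while j_end < len(right_sorted) and right_sorted[j_end][right_key] == lk:
                j_end += 1
            # Pair every left row with this key against the whole run.
            while i < len(left_sorted) and left_sorted[i][left_key] == lk:
                for right_row in right_sorted[j:j_end]:
                    result.append(left_sorted[i] + right_row)
                i += 1
            j = j_end
    return result
```

For instance, joining `[(1, 100), (2, 200), (3, 100)]` on column 1 with `[(100, 7), (200, 9), (300, 4)]` on column 0 pairs each row of the first table with the row of the second table that has the same key, and drops the unmatched key 300.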

(3) Concurrent database queries

Because our GPU DB can process only one global function call at a time, we plan to design a scheduler. This scheduler would combine multiple requests into one GPU function call and maintain data consistency for concurrent database queries in a multiuser database environment.
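The batching idea can be sketched as follows. This is a minimal single-threaded illustration under stated assumptions, not the planned scheduler: the class `BatchScheduler`, its methods, and the use of Python predicates as requests are all hypothetical, and the inner loop merely stands in for a single GPU function call that evaluates every pending request in one pass over the table.

```python
class BatchScheduler:
    """Collect selection requests and serve them with one pass per batch."""

    def __init__(self, table):
        self.table = table
        self.pending = []   # list of (request_id, predicate)
        self.launches = 0   # how many combined "GPU calls" were issued

    def submit(self, request_id, predicate):
        """Queue a request; nothing is executed yet."""
        self.pending.append((request_id, predicate))

    def flush(self):
        """Run all pending requests in one combined pass."""
        batch, self.pending = self.pending, []
        if not batch:
            return {}
        self.launches += 1
        # Every request in the batch reads the same snapshot of the data,
        # which is one simple way to keep the results mutually consistent.
        snapshot = tuple(self.table)
        results = {request_id: [] for request_id, _ in batch}
        for record in snapshot:              # stands in for one GPU call
            for request_id, predicate in batch:
                if predicate(record):
                    results[request_id].append(record)
        return results
```

A caller would `submit` several requests from different users and then call `flush` once; requests that share a `request_id` would have their results merged in this sketch. Handling updates, ordering between batches, and real concurrency control are left open.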
