A Software Framework for Iteration-Level Parallel Evolutionary Algorithm on Graphics Processing Units


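Master's thesis, Graduate Institute of Computer Science and Information Engineering, National Taiwan Normal University

A Software Framework for Iteration-Level Parallel Evolutionary Algorithm on Graphics Processing Units

Advisor: Dr. 蔣宗哲
Graduate student: 何泳陖
August 2014 (year 103 of the Republic of China)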

(2) 中文摘要中. 化考使用演化演算法來. 演化演. 算法 NVIDIA. 於平行環境 CUDA. 致. 行 CUDA. C++. 使用平行計算. 使用. 行 CUDA 實驗論文. 演化演算法. Algorithms base on CUDA) 算法中要. 算. 使用. 框架演化演算法. PEAC(Parallel Evolutionary 基. 使用. CUDA. 演化演算法 CUDA 平行. i. PEAC. 設. 演化演.

(3) 致謝謝. 謝建. 與. 中. 與. 討論. 中. 謝. 論文. 建. 論文. 謝實驗. 基. 文. 與. 平安. 與. 謝. 與. 與建. 謝討論與. 謝與. ii. 使. 中.


(10) 目錄. 1. 緒論. 1.1 中. 化考使用演化演算法來. 演化演算 (bug). 法演算法. 要. 行結. 行. 中參. 實驗結. 要. 參. 論. 行. 中. 算. 論文. 演化演算法. tionary Algorithms base on CUDA) 演化演算法中. 參. 算. PEAC(Parallel Evolu-. 框架. 演化演算法. 使用. 基. 設. 使用平行計算. 使用實驗. 平行框架中. 平行. 論文使用. 1. 使用. 平行平行.

(11) 行. 框架. 使用. 參. 行演算法. 參. 實驗. 用. 演算演算法. 文框架. 框架平行. 使使用. 使用平行. 中. 框架. 算. 框架計. 圖. 圖. 結. 1.2. 演化演算法. 演化演算法化演算法. (population-based). 演化用. 演化使 •. 化. 要演算法. 行演化. 使用 (infeasible solution). 法 •. 使用. 法來. (evaluation). 行環境. 使用. 目. 中. 結 •. (mating selection). 結法 (tournament selection). 法 2. 法.

(12) (roulette wheel selection) •. 用. 法 (rank selection) 來. (crossover) 結 (one-point crossover). (two-point crossover). (mutation) • 環境. generational model. (n + n) mechanism 演算法實. 考. 演化圖1. 1.3. CUDA. CUDA[1]. Compute Unified Device Architecture. 使用 GPU 計算. 於 CPU. single data). GPU 使用. multiple data). Multiprocessor. NVIDIA. SISD(single instruction, SIMD(single instruction,. GPU. 行. 行. 圖2 圖3. CPU 與 GPU 架. 圖 CPU 中 3. (DRAM). 圖 (control).

(13) 圖 1: 演化演算法. 4. 圖.

(14) 圖 2: cpu 與 gpu. (cache). (ALU). 算. GPU. GPU 要. 1.3.1. [2]. CPU. 算 CPU. 表. 與設 (CPU). CUDA. (host). 設. (host function) (kernel function) __host__. 算. __global__. •. __device__. •. __host__. 設. CUDA. (GPU). 圖 (device). 行. 行 __device__. 使用表表 5. 行. 設.

(15) 圖 3: cpu 與 gpu 結. 圖 [3]. (C/C++) •. __device__. __host__. 表. 設. 設 •. __global__. 表表. 1.3.2. 法行緒 (thread) 行. CUDA 設計緒行緒. gpu. 行緒結. 實使用. cpu. 基. 行緒 warp. 結. SIMD. 行. 平行. 行緒. warp 中. 行緒 (if) (false). 表 CUDA. 行. 行行. CUDA. 行緒. 6. (true) 結. 行.

(16) 圖 4: thread. warp size. 圖結. 16. (grid). block grid. 圖 [3]. 32 (block). 行緒 (thread). 圖4. 行緒行緒. (shared memory). 使用. 7.

(17) 表 1: CUDA. register. thread. thread. shared memory. blocks. blocks. local memory. thread. thread. const memory. global. program. global memory. global. program. 1.3.3. 結. CUDA 中. 考. 設計. GTX660. 使用 64. 化. (int) global memory. GB. 圖置. 表1. 使用. register. 中. 用. 結. const memory. 設. 與 global memory 與 CUDA 於 register. 使. 設 local memory. 用. 設. 用 local memory. shared memory. 使用. 使用 global memory. 圖5. 8.

(18) 圖 5:. 1.4 論文目化演算法. 基於 CUDA. 使用. 演. 框架 (framework) 結. • 環境建置. 960. 行 C++ 與 CUDA 之 •. •. 計算. 設計. •. 安裝與設平行. 要. 平行參. 實驗. 9. 行. 與. CPU. 行. 使用. 結.

(19) 論文框架. 演算法論文. 論文使用. 實驗結與. 之. 框架框架. 結. 文框架使用 CPU 與 GPU 與. 10. 平行. 框架. 行行. 結.

(20) 2. 文獻探討. 2.1. 平行. 演算法. Van Luong. [4]. (algorithmic-level). 演算法 level). 演算法 (local search). 平行. (iteration-. (solution-level). •. 平行. 平行使用. •. 演算法. 行. 中. 平行 • 演算法. 演算法. 演算法參. 平行. 與. 置. 行. 演算法. 計算. GPU. 與文. 表 representation). 演算法. 表實. 表. (binary encoding). (discrete vector. 表. (vector of real values). representation). 表. (permutation. CUDA 中表2. 11. 實驗結. 80. 240.

(21) 圖 6: 平行. 演算法. [4]. 表 2:. Registers Local memory Shared memory. 結. Texture memory. 表. Global memory. 結. 12.

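Czapiński and Barnes [5] implemented tabu search on CUDA for the Permutation Flowshop Scheduling Problem (PFSP), illustrated in Figure 7. Let $\pi$ denote a permutation of the jobs, $\pi_j$ the job at position $j$ of $\pi$, $p_{i,\pi_j}$ the processing time of job $\pi_j$ on machine $i$, and $C_{i,j}(\pi)$ the completion time of the $j$-th job of $\pi$ on machine $i$. The makespan is obtained from the following recurrence:

$$C_{1,1}(\pi) = p_{1,\pi_1} \tag{1}$$
$$C_{i,1}(\pi) = p_{i,\pi_1} + C_{i-1,1}(\pi) \tag{2}$$
$$C_{1,j}(\pi) = p_{1,\pi_j} + C_{1,j-1}(\pi) \tag{3}$$
$$C_{i,j}(\pi) = p_{i,\pi_j} + \max\{C_{i,j-1}(\pi),\, C_{i-1,j}(\pi)\} \tag{4}$$

In equation (4), $i$ and $j$ are both greater than 1. Each entry depends only on its left neighbour $C_{i,j-1}(\pi)$ and its upper neighbour $C_{i-1,j}(\pi)$, so entries that do not depend on each other can be computed in parallel. Figure 8 shows the order of computation, starting from $C_{1,1}(\pi)$ and ending at $C_{3,4}(\pi)$. Their experiments report a speedup of 89.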
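The recurrence can be evaluated on the host with one running completion time per machine. The following is a minimal sequential sketch, not the GPU implementation of [5]; the function name `Makespan` and the `p[machine][job]` layout are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Makespan C_{m,n}(pi) of a permutation flow shop schedule.
// p[i][k] : processing time of job k on machine i (0-based, hypothetical layout).
// pi[j]   : job placed at position j of the permutation.
int Makespan(const std::vector<std::vector<int>>& p, const std::vector<int>& pi)
{
    assert(!p.empty());
    const std::size_t machines = p.size();
    // c[i] holds C_{i,j} for the most recently scheduled position j.
    std::vector<int> c(machines, 0);
    for (int job : pi) {
        // First machine: C_{1,j} = p_{1,pi_j} + C_{1,j-1}.
        c[0] += p[0][job];
        for (std::size_t i = 1; i < machines; ++i) {
            // C_{i,j} = p_{i,pi_j} + max(C_{i,j-1}, C_{i-1,j}).
            // Before the update c[i] is C_{i,j-1}; c[i-1] is already C_{i-1,j}.
            c[i] = std::max(c[i], c[i - 1]) + p[i][job];
        }
    }
    return c[machines - 1];
}
```

Because all completion times start at zero, the first job and the first machine need no special case: the general update reduces to the boundary equations of the recurrence.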

(23) 圖 7:. 圖 8:. [5]. 算. 14. [5].

(24) 圖 9: 平行. 2.2. [6]. 平行演化演算法. Pedemonte. [6]. 行. 行. 平行 CPU. 建行. 圖9. 環境. 算. Wang. 演算法. 平行. 圖 10. 化 CPU. 行 32bit. 行 [7]. 使用 CUDA 平行. GOjDE. 實驗. Wong [8] 參考 NSGA-II[9]. 結 68.48 實. 平行. 15. GPU 100. 演化演算法平行. 與. 要. 演化演算法. 參.

(25) 圖 10: 平行基. 演算法. 16. GPU. [6].

(26) 圖 11:. GPU. [10]. 使. 實. GPU. 10.75. 結. 實驗. 與. 99% Vidal 與 Alba [10] 實. CPU. GPU 與. 算法 (Cellular Genetic Algorithm) 圖 11. 基. 演. GPU. 要. 512 × 512. GPU. GPU. 實驗結 771. GPU. 與. GPU. 計 Arenas. [11]. 要. 平行裝置. 平行演化演算法於. nvidia 與 AMD 文獻. 用 (Master-Slave Approaches). 17. 演算法. 架.

(27) 用 (Coarse-Grained Approaches). 於. 73.6. 用 (Fine-Grained Approaches). 於. 用 (Hybrid Approaches) Mussi. [12]. 7000 25. 平行. 化演算法. (stochastic star topology ). 結. 於. 使用 CUDA 實. 設置. 與參. 實驗. 138. 結. 2.3. 平行. 化演算法. 框架 PyGMO(the Python Parallel Global. 框架 Multiobjective Optimizer) [13]. libCudaOptimize[14]. EASEA(EAsy Spec-. ification of Evolutionary Algorithm)[15] 與 ParadisEO-MO-GPU[16] 用. PyGMO. 來. Python. 使用 Python. 與 C++. + 與平行. EASEA. CUDA C/C++ PyGMO 使用. 來使用 GPU. 實. 使. 使用 C++ 使用 C/C+. 平行. CUDA C/C++. 平行. CPU(multi-core CPU) 平. EASEA. 來. ParadisEO-MO-GPU. unix 平演算法來. EASEA. ParadisEO-MO-GPU 演算法. PSO PyGMO. 使用演化演算法. libCudaOptimize. libCudaOptimize 實. 要演算法. 18. 使用. 演算法實.

(28) (continuous, integer, or mixed integer). (constrained or unconstrained). 目. (single-objective or multi-objective). 目. PyGMO 使用 CPU. 演算法使用 CUDA 來行. 平行. 使用演算法. 演算法. 來考. 使用 Python. 實. 使用 C++. libCudaOptimize 環境建置. 要. cmake. 建置與使用 cmake. 法. 行於未使用. 演算 cmake. 化演算法 (particle swarm optimizer). search) Solis and Wets local search 與致. 算. 於要. (與. ). (使用. 結. CUDA. 法 (scatter. 使用演化演算法要 (float). 使用. 結. 使用. 演化演算法 (differential evolution. 演算法 ). 平. 行. 來實. (與 ). ) 目. 與. 來法. 化演算法. 使用論. 演算法. 行. 框架 ParadisEO-MO-GPU ParadisEO. 2.0.1. ParadisEO-1.3. 使用. ParadisEO-MO-GPU 19. 於用.

(29) (deprecated). bug. 使用. 與 libCudaOptimize CUDA. 行. 要 CUDA. 要. 要 template. 使用 EASEA. EASEA. CUDA C/C++ EASEA. EASEA. 使用 CUDA. 行. 與 C/C++. C/C++. EASEA 目. 結. char bool. float. double. 於演化演算法. 結於. 法使用. 中. 參. EASEA 行. 與. 使用 int. long long int. 參. (population size). (float). 要. 參. 使用 CUDA. 實驗 (fitness) 結. 設計目. 法使用. 要來 #define SIZE 1024. 置. 置. 基 PFSP 使. 置要. 中. 法. 中. EASEA. 使用於. 使用. 行. 使用建. 框架. 使用環境. 20.

(30) 使用建. 法行. 基. 致演化演算法. 使用 gpu. 算) gpu. 使用 cpu. 算. 要使用. 法. 框架. 行. 要. CUDA. 演化演算法. 法 (2-tournament). 演算法與. CUDA. 法 (. 算. 使用 cpu 要. EASEA. 算. 使用法. 要要 windows 平. 設使用. 論文 CUDA. 演化演算法. 設計. 框架. 使用. 21. 框架.

(31) 22. 圖. 建. 環境. 平行. 框架. 結. Python. Python, C++. 演算法. 圖. 法. :. R 法. CUDA(GPU). multi-core CPU. unix. C/C++. Python. 演算法. EASEA[15]. PYGMO[13]. 與 CUDA. 演算法. CUDA(GPU). C/C++. :. libCudaOptimize[14]. 表 3: 框架. 用 (deprecated). CUDA. 演算法. unix. CUDA(GPU). C/C++. ParadisEO-MO-GPU[16].

(32) 3. 基於 CUDA 之平行演化演算法框架. 3.1. 平行平行. [4]. 參考. 演算法. 演算法. 與框架目. 演算法. 行. 法. 使用中. 平行. 算 Pedemonte [6]. 算. 使用. 平行設計. 演算. 平行. 框架. 考. 演化演算法中行 SIMD 架. 考. 文. 平行. 設計框架. 3.1.1. 平行平行. 演化演算法. 演化演算法中環境. 行P. P. 平行. 行使用. 平行. 1. P 文. 結. 23. 參考. 3.3.

(33) 圖 12:. 3.2. 圖. 平行 cuda. 使用. 使用. 設. cpu. 與. 設. 參. 用. 計算. 行 (. 使用. 3.2.1. )中. 使用. (. 框架. (global function). 參. 行設. 要設. 設. 圖 12. gpu. gpu. 要設. ). cpu. gpu. (device function). 設. 設於框架. 框架. 使用. C/C++ 中. 使用. cpu. 中. (function pointer). 實. 24.

(34) 圖 13:. 圖實. C/C++ 中. 使用. cpu. gpu. 於. cpu. 法. CUDA. 行. CUDA. cudaGetSymbolAddress. 使用. 圖 13. C/C++ 12. 15 行. 4 與. 與 25. 7行 21 行.

(35) 圖 14:. 圖. 26.

(36) 圖 15:. 圖圖 14. 3 行與. 9 行設. 19 行設. 20 行結. 致. 11 行. gpu. cpu. 建. 論. cpu. 框架建. 行與 gpu. 建. 使用. (template function) 要. 圖 16. 法. virtual function. delete 行. 使用. 使用. 實. 來. 於 C++ 實. CUDA 4.0. 實 (. 行. 要. C++ 用. 圖 15. int 與 double. 參. cpu. 使. virtual function 與 new. virtual function. __host__ 與 __device__ (new). 與設 ) 之. 27. 設.

(37) 圖 16:. 圖. 設. 設. 法. 設法. 使用. 圖 17. 圖 18. 49 行 CUDA 實. 設計. 框架. 設. 要. 實. 圖 19 39. 行. 45 行. 設設. 望. 設. 設法. 來. 用 CRTP(curiously recurring template pattern) (static function) (template) 參. 20 行建. 使用. 21 行. (class). 建法. 使用使用. 圖 20 26 行. 28. 1 28. 32 行.

(38) 圖 17: virtual function. 圖1. 圖 18: virtual function. 圖2. 圖 19: virtual function. 圖3. 29.

(39) 圖 20: CRTP(curiously recurring template pattern). 30. 圖.
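As a rough illustration of the curiously recurring template pattern (not a copy of the listing in Figure 20), the sketch below resolves the call to the derived class at compile time, so no virtual table is involved. The names `FunctionBase`, `AddOne`, `Twice`, and `Apply` are hypothetical.

```cpp
// Base class template: the derived type is passed as a template parameter,
// so Run() can reach the derived implementation without a virtual function.
template <typename Derived>
class FunctionBase {
public:
    int Run(int x) const
    {
        // The cast is valid because Derived inherits from FunctionBase<Derived>.
        return static_cast<const Derived*>(this)->RunImpl(x);
    }
};

class AddOne : public FunctionBase<AddOne> {
public:
    int RunImpl(int x) const { return x + 1; }
};

class Twice : public FunctionBase<Twice> {
public:
    int RunImpl(int x) const { return 2 * x; }
};

// Generic caller: accepts any class that follows the pattern.
template <typename Derived>
int Apply(const FunctionBase<Derived>& f, int x)
{
    return f.Run(x);
}
```

For example, `Apply(AddOne(), 3)` calls `AddOne::RunImpl` directly; the choice of implementation is fixed when the template is instantiated rather than looked up at run time.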

(40) 36 行. 使用. 40 行. 行 fun. 3.2.2. 行. thrust. thrust[17]. 基於 CUDA. 與 C++. (STL 與. CPU. C++. Standard Template Library) 與 STL. 用演算法. GPU. thrust. (template library). STL. CPU. thrust. 使用. thrust. CUDA. CUDA thrust mations. 基. 結. vectors. reductions prefix-sums reordering. host_vector 與 device_vector host. device. transfor-. 演算法 sorting vectors. 與 C++STL 中. vector host. 使用. device_vector transformations X. 計算 Zi = Xi + Yi. Y. 設. 使用 transform replace. fill. 中A. B. 演化演算法使用 reductions. 中 31. 結. copy.

(41) 平. 計算 prefix-sums. 使用. 計. 計算 Zi =. reordering. ∑i k=1. Xk. sorting. 演化演算法中. 用 thrust 中. 與. transformations. 於. (. ). 於環境. transformations 環境. 法. 中. 法. 用. 於. 致. 於 transformations 考. 使用. transformations. 參. 用參. 3.2.3. 與 transformations. 使用. 平行於 thrust. 實演化演算法表4. 表. n 環境. 目. 文. 行設計. 平行. 考 n. 表法. 要. 4. 要 32. 表.
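The transformation, reduction, and prefix-sum primitives can be tried on the host with their C++ standard library counterparts, whose interface Thrust mirrors (`thrust::transform`, `thrust::reduce`, `thrust::inclusive_scan`). The sketch below uses only `std::` algorithms so that it runs without a GPU; the helper names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <numeric>
#include <vector>

// Transformation: Z_i = X_i + Y_i.
std::vector<int> AddVectors(const std::vector<int>& x, const std::vector<int>& y)
{
    assert(x.size() == y.size());
    std::vector<int> z(x.size());
    std::transform(x.begin(), x.end(), y.begin(), z.begin(), std::plus<int>());
    return z;
}

// Reduction: sum of all elements.
int Sum(const std::vector<int>& x)
{
    return std::accumulate(x.begin(), x.end(), 0);
}

// Prefix sum: Z_i = X_1 + ... + X_i.
std::vector<int> PrefixSum(const std::vector<int>& x)
{
    std::vector<int> z(x.size());
    std::partial_sum(x.begin(), x.end(), z.begin());
    return z;
}
```

With Thrust the same calls would operate on `device_vector` ranges instead of `std::vector`, which is what lets an evolutionary operator be written once as an element-wise function.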

(42) 表 4: 平行. 參參化環境 1.. 1. 2.. 0. 3. 4.. 實. 5.. 使用. 使用. 參. 6.. 設設計 1. 2. 置要. i∗s. 參. s. 使用. i. 使用 i×s, i×s+1, i×s+2, ..., (i+1)×s−1 33.

(43) 圖 21: 平行. 圖-. 圖 22: 平行. 圖-. 4 要 3. 0. s 5. C++11. 於. 參圖 21. 13 行中 ). 行緒. 要. variadic template. 圖 22 (. 1. 表. s. d1. GPU 置. 參. mem_type size. s1 34. 23 行. 25 行.

(44) 圖 23: 7行 size. 用行緒. 16 行. 行. 5行. 設. 6行. sec:pThreadSize 目. 計算. CUDA. overloading). 參圖 23. 來實. 使用. Evaluation GPU CPU. 設. problem, fitness,. population size. 行. 行 0. 行緒 1. 結參考 [5] 設. population size= 5,warp size= 4 結. 目 (function. 使用. fitness, population. 平行. grid. 設 C++11. 於 problem. 3.3. thread block. 行緒. 環境. GPU population. 18 行. 設. 19 行 (與 20 行). thrust. 使用. 行緒. 9行. (a). 圖 24. 設計表. 35. 法.

(45) 圖 24:. 圖. 表 (b) [5]. 法. (c),(d). warp. 文 (b),(c) 結. 於 (c). 論文. population size. warp size. (b). (d). 1. warp. warp 要 (x,y) 表 4( 中. 5 5. x. y. 表 ). (b) 中. (1,1). 表 6. (c) 中. (a) 中 9. Sp =population size= 5, Sw =warp size, Sc =. 36. (d).

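chromosome length $= 3$. Let $id_i$ be the index of an individual, $id_c$ the index of a gene within the chromosome, and $id_g$ the resulting index in global memory. The four layouts (a) to (d) are:

$$id_g^{(a)} = id_i \times S_c + id_c$$
$$id_g^{(b)} = id_c \times S_p + id_i$$
$$id_g^{(c)} = id_c \times \left\lceil \frac{S_p}{S_w} \right\rceil \times S_w + id_i$$
$$id_g^{(d)} = \left\lfloor \frac{id_i}{S_w} \right\rfloor \times S_w \times S_c + id_c \times S_w + id_i \bmod S_w$$

The layouts are compared experimentally in Section 4.4.

3.4 Parallel evolutionary algorithm framework

Figures 25 and 26 show the algorithm and its settings. The framework is written in C++; the functions of an algorithm use the parallel mechanism of Section 3.2.3 and derive from a base class.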
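The four index mappings can be written as small functions. This is a sketch with hypothetical function names; indices are 0-based, $S_p$ is the population size, $S_c$ the chromosome length, and $S_w$ the warp size.

```cpp
#include <cassert>

// Integer ceiling of a / b for positive a, b.
static int CeilDiv(int a, int b) { return a / b + (a % b != 0 ? 1 : 0); }

// (a) individual-major: the genes of one individual are contiguous.
int IndexA(int id_i, int id_c, int s_c) { return id_i * s_c + id_c; }

// (b) gene-major: the same gene of all individuals is contiguous.
int IndexB(int id_i, int id_c, int s_p) { return id_c * s_p + id_i; }

// (c) gene-major, with each row padded to a multiple of the warp size.
int IndexC(int id_i, int id_c, int s_p, int s_w)
{
    assert(s_p > 0 && s_w > 0);
    return id_c * CeilDiv(s_p, s_w) * s_w + id_i;
}

// (d) individuals grouped in blocks of S_w; gene-major inside each block.
int IndexD(int id_i, int id_c, int s_c, int s_w)
{
    assert(s_w > 0);
    return (id_i / s_w) * s_w * s_c + id_c * s_w + id_i % s_w;
}
```

With $S_p = 5$, $S_w = 4$, and $S_c = 3$, gene 1 of individual 1 is stored at index 4 in layout (a), 6 in (b), 9 in (c), and 5 in (d).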

(47) 圖 25: 演算法. 圖 26: 設. 38. 圖.

(48) (dynamic_cast). 要. (. ). 使用要. 使用基. 基. 於. 使用 (virtual function). 實 (clone). 參. 基行 (run). 化參. 演算法. 行演算法. 設. 行圖. 行. 化實. 圖 27. 演化演算法用. 化. 要. 要. 3.5. 設. 參. CUDA. 參. (threads per block) 與. (blocks per grid) x×y×z. x, y, z 使用. 參. B. 參. x, y, z 使用表5. 要行緒. 行緒. M. 表. 39. T 表. 要表參.
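The combination of a base class, a virtual run function, and a clone function can be sketched as follows. This is only an illustration of the pattern under assumed names (`StepBase`, `AddStep`, `RunAll`); it is not the PEAC class hierarchy.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Base class of all steps of an algorithm.
class StepBase {
public:
    virtual ~StepBase() {}
    // Executes the step on some state.
    virtual void Run(int& state) const = 0;
    // Returns an independent copy with the same dynamic type.
    std::unique_ptr<StepBase> Clone() const
    {
        return std::unique_ptr<StepBase>(DoClone());
    }
private:
    virtual StepBase* DoClone() const = 0;
};

class AddStep : public StepBase {
public:
    explicit AddStep(int amount) : amount_(amount) {}
    void Run(int& state) const override { state += amount_; }
private:
    StepBase* DoClone() const override { return new AddStep(*this); }
    int amount_;
};

// Runs all steps in order, as an algorithm would run its functions.
int RunAll(const std::vector<std::unique_ptr<StepBase>>& steps, int state)
{
    for (const auto& step : steps) {
        assert(step != nullptr);
        step->Run(state);
    }
    return state;
}
```

Cloning lets the algorithm keep its own copy of every function object, so the caller may reuse or destroy the original after registering it.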

(49) 圖 27: 演算法框架. 40. 圖.

Table 5: Symbols and their names

Symbol              Name
T                   threads per block
B                   blocks per grid
MT                  Maximum number of threads per block
MT_x, MT_y, MT_z    Maximum sizes of each dimension of a block (x,y,z)
MB_x, MB_y, MB_z    Maximum sizes of each dimension of a grid (x,y,z)
S_w                 Warp size
MT_m                Maximum number of threads per multiprocessor

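The thread and block settings each have an x, y, and z component. To keep the threads 100% utilized, the number of threads per block is set to a multiple of the warp size, and every setting must stay within the limits of the device:

$$T = n_1 \times S_w,\quad n_1 \in \mathbb{N} \tag{5}$$
$$T \le MT_b \tag{6}$$
$$T_x \le MT_x,\quad T_y \le MT_y,\quad T_z \le MT_z \tag{7}$$
$$B_x \le MB_x,\quad B_y \le MB_y,\quad B_z \le MB_z \tag{8}$$

Here $MT_b$ is the maximum number of threads per block, listed as $MT$ in Table 5. The best number of threads per block, $MT_{best}$, is the greatest common divisor (gcd) of the maximum number of threads per block and the maximum number of threads per multiprocessor, so that the threads of a multiprocessor can be 100% used:

$$MT_{best} = \gcd(MT, MT_m) \tag{9}$$

Algorithm 1 computes the CUDA parameters $T_x, T_y, T_z, B_x, B_y, B_z$ from the population size $S_p$, the warp size $S_w$, and $MT_{best}$.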

Algorithm 1: Setting the CUDA parameters
Input: S_p, MT_best, S_w
 1: if S_p < MT_best then
 2:     T_x ← ⌈S_p / S_w⌉ × S_w
 3:     B_x ← 1
 4: else
 5:     T_x ← MT_best
 6:     B_x ← ⌈S_p / MT_best⌉
 7: end if
 8: T_y ← 1, T_z ← 1
 9: B_y ← ⌈B_x / MB_x⌉, B_z ← ⌈B_y / MB_y⌉
10: if B_x > MB_x then
11:     B_x ← MB_x
12: end if
13: if B_y > MB_y then
14:     B_y ← MB_y
15: end if
16: if B_z > MB_z then
17:     B_z ← MB_z
18: end if
19: return T_x, T_y, T_z, B_x, B_y, B_z
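Algorithm 1 translates almost line by line into C++. The sketch below is an assumption-laden illustration: the struct `LaunchConfig` and the function `ComputeLaunchConfig` are hypothetical names, the grid limits are passed in explicitly, and the small-population branch rounds $T_x$ up to a multiple of the warp size, which is how equation (5) is read here.

```cpp
#include <cassert>

struct LaunchConfig {
    int tx, ty, tz;  // threads per block in x, y, z
    int bx, by, bz;  // blocks per grid in x, y, z
};

// Integer ceiling of a / b for positive a, b, without overflow in a + b.
static int CeilDiv(int a, int b) { return a / b + (a % b != 0 ? 1 : 0); }

LaunchConfig ComputeLaunchConfig(int sp, int mt_best, int sw,
                                 int mbx, int mby, int mbz)
{
    assert(sp > 0 && mt_best > 0 && sw > 0);
    assert(mbx > 0 && mby > 0 && mbz > 0);
    LaunchConfig c;
    if (sp < mt_best) {
        // ASSUMPTION: one block whose size is the population size
        // rounded up to a multiple of the warp size (equation (5)).
        c.tx = CeilDiv(sp, sw) * sw;
        c.bx = 1;
    } else {
        c.tx = mt_best;
        c.bx = CeilDiv(sp, mt_best);
    }
    c.ty = 1;
    c.tz = 1;
    // Spill blocks into y and z when x exceeds the grid limit.
    c.by = CeilDiv(c.bx, mbx);
    c.bz = CeilDiv(c.by, mby);
    if (c.bx > mbx) c.bx = mbx;
    if (c.by > mby) c.by = mby;
    if (c.bz > mbz) c.bz = mbz;
    return c;
}
```

With the GTX660 limits and a population of 10000, this yields 1024 threads per block and 10 blocks; a population of 100 yields a single block of 128 threads.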

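Because $MT_{best} \le MT_b$, it is enough to set $T_x$ and to fix $T_y$ and $T_z$ to 1. Every thread needs a unique index (id). Let $(TID_x, TID_y, TID_z)$ be the index of a thread inside its block along x, y, and z, $(BID_x, BID_y, BID_z)$ the index of its block, and $T_{total}$ the total number of threads. The id and the total are given by equations (10) and (11):

$$id = TID_x + T_x \times (BID_x + B_x \times (BID_y + B_y \times BID_z)) \tag{10}$$
$$T_{total} = T_x \times B_x \times B_y \times B_z \tag{11}$$

3.6 Usage restrictions

Because the framework is built on CUDA, the following CUDA restrictions apply to its users:

• A function called from a kernel function must be declared `__device__`; declaring it `__host__ __device__` lets it run on both the host and the device.
• STL containers such as vector and list cannot be used.
• new and delete can be used, but require CUDA 4.0 and a GPU of compute capability 2.0.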

(54) •. 建 (new) 建. •. virtual function. 實. virtual function. 實置. 致法 • 與框架. 置. 結使用. 建. 1 結 (linking). 45.

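4. Experimental design

4.1 Test problems

Sphere problem:

$$f(x) = \sum_{i=1}^{n} x_i^2 \tag{12}$$

with $n = 30$ and $x_i \in [-100, 100]$.

Rosenbrock problem:

$$f(x) = \sum_{i=1}^{n-1} \left[ 100\,(x_{i+1} - x_i^2)^2 + (x_i - 1)^2 \right] \tag{13}$$

with $n = 30$ and $x_i \in [-30, 30]$.

Permutation Flowshop Scheduling Problem (PFSP): let $\pi$ denote a permutation of the jobs, $\pi_j$ the job at position $j$ of $\pi$, $p_{i,k}$ the processing time of job $k$ on machine $i$, and $C_{i,j}(\pi)$ the completion time of the $j$-th job of $\pi$ on machine $i$. The makespan is computed by:

$$C_{1,1}(\pi) = p_{1,\pi_1} \tag{14}$$
$$C_{i,1}(\pi) = p_{i,\pi_1} + C_{i-1,1}(\pi) \tag{15}$$
$$C_{1,j}(\pi) = p_{1,\pi_j} + C_{1,j-1}(\pi) \tag{16}$$
$$C_{i,j}(\pi) = p_{i,\pi_j} + \max\{C_{i,j-1}(\pi),\, C_{i-1,j}(\pi)\} \tag{17}$$

With $m$ machines (machine size) and $n$ jobs (job size), the objective value is $C_{m,n}$. The test instances come from the benchmark of E. Taillard [18].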
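The two continuous test functions are short enough to state in code. This is a plain host-side sketch with hypothetical function names, not the evaluation class used in the framework.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sphere: f(x) = sum of x_i^2. Minimum 0 at the origin.
double Sphere(const std::vector<double>& x)
{
    double sum = 0.0;
    for (double v : x) sum += v * v;
    return sum;
}

// Rosenbrock: f(x) = sum over i of 100 (x_{i+1} - x_i^2)^2 + (x_i - 1)^2.
// Minimum 0 at x = (1, ..., 1).
double Rosenbrock(const std::vector<double>& x)
{
    assert(x.size() >= 2);
    double sum = 0.0;
    for (std::size_t i = 0; i + 1 < x.size(); ++i) {
        const double a = x[i + 1] - x[i] * x[i];
        const double b = x[i] - 1.0;
        sum += 100.0 * a * a + b * b;
    }
    return sum;
}
```

Both functions are evaluated independently per individual, which is why they map directly onto one thread per individual.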

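Table 6: GTX660 device limits

Maximum number of threads per block                   1024
Maximum sizes of each dimension of a block (x,y,z)    1024 x 1024 x 64
Maximum sizes of each dimension of a grid (x,y,z)     2147483647 x 65535 x 65535
Warp size                                             32
Maximum number of threads per multiprocessor          2048

4.2 Experimental environment

GPU: NVIDIA GeForce GTX 660. Memory: 8G. Operating system: ubuntu. CPU: Intel(R) Core(TM) i5-3470 CPU @3.20GHz 3.60GHz.

4.3 Speedup experiment

The sphere and rosenbrock problems are solved with differential evolution (DE) on the CPU and on the GPU, using the following parameters: mating selection: 2-tournament, crossover: best/1/bin, environmental selection: generation, population_size: 10000, F: 0.5, CR: 0.6, generation_size: 500. Each experiment is run 10 times. The speedup is defined by equation (18):

$$speedup = \frac{T_{old}}{T_{new}} \tag{18}$$

Figure 28(a) shows the result for sphere, with a speedup of 5.78, and Figure 28(b) the result for rosenbrock, with a speedup of 5.69.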
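For readers unfamiliar with the best/1/bin setting, the sketch below builds one trial vector in the standard DE/best/1/bin manner: the mutant is the best individual plus F times the difference of two other individuals, followed by binomial crossover with rate CR. It is not taken from the PEAC sources; the function name is hypothetical and the random numbers are passed in so that the example is deterministic.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// target : current individual
// best   : best individual of the population
// r1, r2 : two other randomly chosen individuals
// u      : one uniform random number in [0, 1) per dimension
// j_rand : dimension that always takes the mutant value
std::vector<double> BestOneBinTrial(const std::vector<double>& target,
                                    const std::vector<double>& best,
                                    const std::vector<double>& r1,
                                    const std::vector<double>& r2,
                                    const std::vector<double>& u,
                                    std::size_t j_rand, double f, double cr)
{
    const std::size_t n = target.size();
    assert(best.size() == n && r1.size() == n && r2.size() == n);
    assert(u.size() == n && j_rand < n);
    std::vector<double> trial(n);
    for (std::size_t j = 0; j < n; ++j) {
        // Mutation: best + F * (r1 - r2).
        const double mutant = best[j] + f * (r1[j] - r2[j]);
        // Binomial crossover: take the mutant with probability CR,
        // and always in dimension j_rand.
        trial[j] = (u[j] < cr || j == j_rand) ? mutant : target[j];
    }
    return trial;
}
```

With F = 0.5 and CR = 0.6 as in the experiment, roughly 60% of the dimensions of each trial vector come from the mutant.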

(57) (a) sphere. (b) rosenbrock. 圖 28:. 4.4. 結 3.3 行實驗. (圖 24). C結. job size=100, machine size=5. 設. 行 100. 結. 圖 29(a). 未. 結. 37.83. C結. 3.17 (a). CPU 與 A 結. 之 warp size. 法 (c). 圖 29(b) (d). CPU CPU. 於. A. 行 1000 圖. 行計算. 結. 使用 FPSP. 於. 結 (c),(d),(b). 文. [5]. 論文於 (c). 行. (d). 48. warp.

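Figure 29: Results of the data layout experiment, panels (a) and (b).

Table 7 lists the t-test results at the 95% confidence level. A + means that the layout of the row is significantly better than the layout of the column. The order from best to worst is (c), (d), (b), (a).

Table 7: t-test results

       (a)   (b)   (c)   (d)
(a)     -     -     -     -
(b)     +     -     -     -
(c)     +     +     -     +
(d)     +     +     -     -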

(59) 圖 30: 框架. 4.5. 圖. 框架 2. algorithm). 行實驗使用 PFSP. 使用. 演算法. 文. 演算法 (genetic. EASEA. 框架. 用 EASEA 使用 CUDA. 基. 框架中建基. 演算法. 與. EASEA. EASEA. 論文之框架環境中. population_size:. 10000,Generation size:100,Run:10,Mutation rate: 0.05,Crossover rate: 1,Job: 500,Machine: 20 2.85. 行 10. 結. 圖 30. 與 CPU. 9.69. 50. 與 EASEA.

(60) 5. 結論與未來展望設計 CUDA/C++. 文. PEAC 實. 法. 使用. 平. 參. 使用. 實驗. 與 CPU 文. 結. 論. 行. 實驗. 9.69. 於. 與. 平行演化演算法框架. 使用 PEAC 2.85. 演化演算法框架. 與. 框架. 使用. 結. 演算法. 與參. 行未. 結目. 實. 基. 5. 演算法. 未來. 演算法 (solution-level). 文於. 平行. 平行. 用. 探討. 未來. 平行框架計算. 參. 設. 望未來. 框架. 使用. 51. 結. 討.

References

[1] Cuda toolkit documentation, http://docs.nvidia.com/cuda/.
[2] R. Couturier, “Introduction to cuda,” Designing Scientific Applications on GPUs, p. 13, 2013.
[3] Cuda c programming guide, http://docs.nvidia.com/cuda/cuda-c-programming-guide/.
[4] T. Van Luong, N. Melab, and E.-G. Talbi, “Gpu computing for parallel local search metaheuristic algorithms,” IEEE Transactions on Computers, vol. 62, no. 1, pp. 173–185, 2013.
[5] M. Czapiński and S. Barnes, “Tabu search with two approaches to parallel flowshop evaluation on CUDA platform,” Journal of Parallel and Distributed Computing, vol. 71, no. 6, pp. 802–811, 2011.
[6] M. Pedemonte, E. Alba, and F. Luna, “Bitwise operations for gpu implementation of genetic algorithms,” in Proceedings of the 13th annual conference companion on Genetic and evolutionary computation, ACM, 2011, pp. 439–446.
[7] H. Wang, S. Rahnamayan, and Z. Wu, “Parallel differential evolution with self-adapting control parameters and generalized opposition-based learning for solving high-dimensional optimization problems,” Journal of Parallel and Distributed Computing, vol. 73, no. 1, pp. 62–73, 2013.
[8] M. L. Wong, “Parallel multi-objective evolutionary algorithms on graphics processing units,” in Proceedings of the 11th Annual Conference Companion on Genetic and Evolutionary Computation Conference: Late Breaking Papers, ACM, 2009, pp. 2515–2522.
[9] K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan, “A fast and elitist multiobjective genetic algorithm: nsga-ii,” IEEE Transactions on Evolutionary Computation, vol. 6, no. 2, pp. 182–197, 2002.
[10] P. Vidal and E. Alba, “A multi-gpu implementation of a cellular genetic algorithm,” in IEEE Congress on Evolutionary Computation, 2010, pp. 1–7.
[11] M. Arenas, G. Romero, A. Mora, P. Castillo, and J. Merelo, “Gpu parallel computation in bioinspired algorithms: a review,” in Advances in Intelligent Modelling and Simulation, Springer, 2012, pp. 113–134.
[12] L. Mussi, F. Daolio, and S. Cagnoni, “Evaluation of parallel particle swarm optimization algorithms within the CUDA™ architecture,” Information Sciences, vol. 181, no. 20, pp. 4642–4657, 2011.
[13] D. Izzo, “PyGMO and PyKEP: open source tools for massively parallel optimization in astrodynamics (the case of interplanetary trajectory optimization),” Proceedings of the Fifth International Conference on Astrodynamics Tools and Techniques, ICATT, 2012.
[14] Y. S. Nashed, R. Ugolotti, P. Mesejo, and S. Cagnoni, “libcudaoptimize: an open source library of gpu-based metaheuristics,” in Proceedings of the Fourteenth International Conference on Genetic and Evolutionary Computation Conference, ACM, 2012, pp. 117–124.
[15] O. Maitre, F. Krüger, S. Querry, N. Lachiche, and P. Collet, “Easea: specification and execution of evolutionary algorithms on gpgpu,” Soft Computing, vol. 16, no. 2, pp. 261–279, 2012.
[16] N. Melab, T. Luong, K. Boufaras, and E.-G. Talbi, “Towards paradiseo-mo-gpu: a framework for gpu-based local search metaheuristics,” in Advances in Computational Intelligence, Springer, 2011, pp. 401–408.
[17] Thrust::cuda toolkit documentation, http://docs.nvidia.com/cuda/thrust/index.html.
[18] E. Taillard, “Benchmarks for basic scheduling problems,” European Journal of Operational Research, vol. 64, no. 2, pp. 278–285, 1993, Project Management and Scheduling, issn: 0377-2217. doi: http://dx.doi.org/10.1016/0377-2217(93)90182-M. [Online]. Available: http://www.sciencedirect.com/science/article/pii/037722179390182M.

A. PEAC installation and environment setup

This appendix describes the setup on the Windows platform with Microsoft Visual C++ (VC++): installing CUDA, configuring the CUDA environment, and configuring PEAC.

A.1 Installing CUDA

Download CUDA from https://developer.nvidia.com/cuda-downloads and install it.

A.2 CUDA environment setup

In VC++, open Solution Explorer → Build Customizations and select the CUDA version installed in A.1 (e.g. CUDA 6.0(.targets, .props)). Then open Solution Explorer → Properties → Property Pages → Configuration Properties → Linker → Input → Additional Dependencies and add "cudart.lib;". (Footnote 1: both the debug and the release configuration need to be set.)

A.3 Running a CUDA program

helloworld.cu

    #include <stdio.h>
    #include <stdlib.h>   // EXIT_SUCCESS

    const int N = 16;
    const int blocksize = 16;

    // Each thread adds one offset from b to one character of a.
    __global__
    void hello(char *a, int *b)
    {
        a[threadIdx.x] += b[threadIdx.x];
    }

    int main()
    {
        char a[N] = "Hello ";
        int b[N] = {15, 10, 6, 0, -11, 1};

        char *ad;
        int *bd;
        const int csize = N * sizeof(char);
        const int isize = N * sizeof(int);

        printf("%s", a);

        // Allocate device memory and copy both arrays to the device.
        cudaMalloc((void**)&ad, csize);
        cudaMalloc((void**)&bd, isize);
        cudaMemcpy(ad, a, csize, cudaMemcpyHostToDevice);
        cudaMemcpy(bd, b, isize, cudaMemcpyHostToDevice);

        // One block of blocksize threads turns "Hello " into "World!".
        hello<<<1, blocksize>>>(ad, bd);
        cudaMemcpy(a, ad, csize, cudaMemcpyDeviceToHost);
        cudaFree(ad);
        cudaFree(bd);

        printf("%s\n", a);
        return EXIT_SUCCESS;
    }

Press F5 to run the program. If "Hello World!" is printed, the setup is complete.

A.4 PEAC environment setup

Place the PEAC directory, which contains peac.hpp, in a directory such as C:\PEAC\. Then open Solution Explorer → Properties → Property Pages → Configuration Properties → VC++ Directories → Include Directories and add "C:\PEAC\;" (the directory where PEAC was placed). (Footnote 2: both the debug and the release configuration need to be set.)

Under Configuration Properties → CUDA C/C++ → Device → Code Generation, set compute capability 2.0 (compute_20,sm_20). (Footnote 3: if code generation is not set, CUDA programs that use PEAC cannot run.)

A.5 Running a PEAC program

helloworld.cu

    #include <cstdio>
    #include <cstdlib>   // EXIT_SUCCESS
    #include <peac.hpp>

    const int N = 16;

    // Function class: PEAC calls the static member function Run.
    class Hello
    {
    public:
        __device__ __host__
        static void Run(char *a, int *b)
        {
            *a += *b;
        }
    };

    int main()
    {
        using namespace peac;
        char a[N] = "Hello ";
        int b[N] = {15, 10, 6, 0, -11, 1};

        // Arrays on the CPU side, filled from a and b.
        ArrayType<char>::Type ac(CPU, N);
        ArrayType<int>::Type bc(CPU, N);
        ac.SetData(a, N);
        bc.SetData(b, N);

        // Arrays on the GPU side, copied from the CPU arrays.
        ArrayType<char>::Type ag(GPU, N);
        ArrayType<int>::Type bg(GPU, N);
        Copy(ag, ac, N);
        Copy(bg, bc, N);

        printf("%s", a);

        // Run Hello::Run on the GPU for N elements.
        AssignFunction<Hello, 1, 1>(GPU, N,
            ag.AssignDataByShift(), 1,
            bg.AssignDataByShift(), 1);

        // Copy the result back and read it into a.
        Copy(ac, ag, N);
        ac.GetData(a, N);

        printf("%s\n", a);
        return EXIT_SUCCESS;
    }

Press F5 to run the program. If "Hello World!" is printed, the setup is complete.

B. Using PEAC

This appendix shows how to use the data structures and algorithms in PEAC, in the following steps:

• Creating the data structures
• Creating a PEAC function for host and device
• Calling the function on host and device
• Using an algorithm and its settings
• Creating a function for an algorithm

PFSP is used as the example throughout.

B.1 Creating the data structures

The data structures needed in PEAC are the problem, the population, and the fitness. For PFSP the problem holds the processing times, the population holds the solutions, and the fitness is the makespan computed for each solution.

    #include <iostream>
    #include <peac.hpp>

    using namespace peac;

    const int JOB_SIZE = 3;
    const int MACHINE_SIZE = 2;

    typedef int ProblemDataType[JOB_SIZE][MACHINE_SIZE];
    typedef SimpleDataArray<int, JOB_SIZE> SolutionType;
    typedef SimpleSmallerData<int> FitnessType;

    // SetProblem, SetPopulation, CallEvaluation, and CheckResult
    // are shown or described in the rest of this appendix.

    int main()
    {
        // population size
        int pop_size = 4;

        ArrayType<ProblemDataType>::NonChangeType
            problem_cpu(CPU, 1);
        ArrayType<ProblemDataType>::NonChangeType
            problem_gpu(GPU, 1);
        ArrayType<SolutionType>::Type
            population_cpu(CPU, pop_size);
        ArrayType<SolutionType>::Type
            population_gpu(GPU, pop_size);
        ArrayType<FitnessType>::Type
            fitness_cpu(CPU, pop_size);
        ArrayType<FitnessType>::Type
            fitness_gpu(GPU, pop_size);

        SetProblem(problem_cpu);
        SetPopulation(population_cpu, pop_size);

        CallEvaluation(pop_size, problem_gpu, problem_cpu,
            population_gpu, population_cpu,
            fitness_gpu, fitness_cpu);
        CheckResult(pop_size, problem_cpu,
            population_cpu, fitness_cpu);
    }

The data types are defined with templates from the peac namespace. Each array is created with an enumeration parameter, CPU or GPU, that selects where its memory is located, followed by the number of elements. SetProblem and SetPopulation are implemented as follows:

    void SetProblem(
        ArrayType<ProblemDataType>::NonChangeType& input)
    {
        ProblemDataType& problem = *input.GetPointer();
        // random(a, b) is a helper that is not shown in this listing.
        for (int i = 0; i < JOB_SIZE; ++i) {
            for (int j = 0; j < MACHINE_SIZE; ++j) {
                problem[i][j] = random(1, 99);
            }
        }
    }

    void SetPopulation(
        ArrayType<SolutionType>::Type& population,
        int pop_size)
    {
        for (int i = 0; i < pop_size; ++i) {
            SolutionType sol;
            // Start from the identity permutation ...
            for (int j = 0; j < JOB_SIZE; ++j) {
                sol[j] = j;
            }
            // ... and shuffle it by random swaps.
            for (int j = 0; j < JOB_SIZE; ++j) {
                std::swap(sol[j], sol[random(0, JOB_SIZE - 1)]);
            }
            population.SetData(&sol, 1, i);
        }
    }

The problem and the solutions are filled on the CPU side. SetData writes data into a PEAC array, and GetData reads data back from it.

B.2 Creating a PEAC function for host and device

The evaluation function is created as a class:

    class Evaluation
    {
    public:
        __device__ __host__
        static int Max(int a, int b)
        {
            return a > b ? a : b;
        }

        __device__ __host__
        static void Run(
            const ProblemDataType* problem_ptr,
            FitnessType* fitness_ptr,
            const SolutionType* solution_ptr)
        {
            const ProblemDataType& problem = *problem_ptr;
            FitnessType& fitness = *fitness_ptr;
            const SolutionType& solution = *solution_ptr;

            // finish_time[j]: completion time of the last scheduled job on machine j.
            int finish_time[MACHINE_SIZE] = {0};

            for (int i = 0; i < JOB_SIZE; ++i) {
                int index = solution[i];
                finish_time[0] += problem[index][0];
                for (int j = 1; j < MACHINE_SIZE; ++j) {
                    finish_time[j] =
                        Max(finish_time[j-1], finish_time[j])
                        + problem[index][j];
                }
            }

            // The makespan is the completion time on the last machine.
            fitness = finish_time[MACHINE_SIZE-1];
        }
    };

PEAC calls the static member function Run. Run is declared `__host__ __device__` so that it can be executed on both the host and the device. Its parameters are pointers to the problem, the fitness, and the solution. Max, which is called inside Run, is declared `__host__ __device__` as well; the body of both functions is ordinary C++.

B.3 Calling the function on host and device

    void CallEvaluation(int pop_size,
        ArrayType<ProblemDataType>::NonChangeType&
            problem_gpu,
        ArrayType<ProblemDataType>::NonChangeType&
            problem_cpu,
        ArrayType<SolutionType>::Type& population_gpu,
        ArrayType<SolutionType>::Type& population_cpu,
        ArrayType<FitnessType>::Type& fitness_gpu,
        ArrayType<FitnessType>::Type& fitness_cpu)
    {
        // Copy the problem and the population to the GPU.
        Copy(problem_gpu, problem_cpu, 1);
        Copy(population_gpu, population_cpu, pop_size);

        AssignFunction<Evaluation, 1, 1, 1>(GPU, pop_size,
            problem_gpu.GetPointer(), 0,
            fitness_gpu.AssignDataByShift(), 1,
            population_gpu.AssignDataByShift(), 1);

        // Copy the fitness values back to the CPU.
        Copy(fitness_cpu, fitness_gpu, pop_size);
    }

Copy first transfers the problem and the population to the GPU, and at the end transfers the fitness values back to the CPU. AssignFunction runs Evaluation::Run pop_size times, here 4. The number after each data argument is the shift applied per execution: the problem is passed with shift 0, so every execution reads the same problem, while the fitness and the population are passed with shift 1, so execution i works on element i.

B.4 Using an algorithm and its settings

This section uses the genetic algorithm (GA) provided by PEAC to solve PFSP.

    #include <iostream>
    #include <peac.hpp>

    using namespace peac;

    const int JOB_SIZE = 500;
    const int MACHINE_SIZE = 20;

    typedef int ProblemDataType[JOB_SIZE][MACHINE_SIZE];
    typedef peac::SimpleDataArray<int, JOB_SIZE>
        SolutionType;
    typedef peac::SimpleSmallerData<int> FitnessType;

    // Bundles the three data types used by the algorithm.
    class TypePFSP
    {
    public:
        typedef ::ProblemDataType ProblemDataType;
        typedef ::SolutionType SolutionType;
        typedef ::FitnessType FitnessType;
    };

    int main()
    {
        SetupRand();
        GeneticAlgorithm<TypePFSP> ga;
        SetGaConfig(ga);
        ArrayType<ProblemDataType>::NonChangeType
            problem_cpu(CPU, 1);
        SetProblem(problem_cpu);
        ga.MemoryAllocate();
        SetGaFunctions(ga);

        Copy(ga.GetDataProblem(), problem_cpu, 1);

        ga.RunAlgorithm();
        FitnessType best = ga.GetBestFitness();
        std::cout << "best: " << best.data << std::endl;
    }

TypePFSP tells the GA which ProblemDataType, SolutionType, and FitnessType to use. GeneticAlgorithm<TypePFSP> creates the algorithm, and SetGaConfig sets its parameters. The problem is created on the CPU as in B.1. MemoryAllocate allocates the memory of ga, SetGaFunctions sets the functions used by ga, and Copy places the problem into ga. RunAlgorithm runs the evolution, after which GetBestFitness returns the best fitness found.

    // Requires #include <sstream>.
    void SetGaConfig(GeneticAlgorithm<TypePFSP>& ga)
    {
        Config& config = ga.GetConfig();
        config.Read("config.txt");

        std::stringstream ss;
        ss << (JOB_SIZE);
        config.Set(KEY_JOB_SIZE, ss.str());
        ss.str("");
        ss << (MACHINE_SIZE);
        config.Set(KEY_MACHINE_SIZE, ss.str());
        ss.str("");
        ss << (JOB_SIZE);
        config.Set(KEY_GENE_SIZE, ss.str());
    }

GetConfig returns the configuration of ga, and Read loads the parameters from config.txt, which is placed in the directory of the program:

    population_size : 10000
    crossover_rate : 1
    mutation_rate : 0.05
    generation_size : 100
    run_size : 1

Parameters that are defined in the program (job size, machine size, and gene size) are converted with a stringstream and stored in the configuration with Set.
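To make the file format concrete, here is a minimal sketch of reading such `key : value` lines into a map. It is not the implementation of PEAC's `Config::Read`; `ParseConfig` and `Trim` are hypothetical helpers, and the separator handling is an assumption based on the sample file.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Removes leading and trailing whitespace.
static std::string Trim(const std::string& s)
{
    const char* ws = " \t\r\n";
    const std::string::size_type b = s.find_first_not_of(ws);
    if (b == std::string::npos) return "";
    const std::string::size_type e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Reads "key : value" lines; lines without a colon or key are skipped.
std::map<std::string, std::string> ParseConfig(std::istream& in)
{
    std::map<std::string, std::string> result;
    std::string line;
    while (std::getline(in, line)) {
        const std::string::size_type colon = line.find(':');
        if (colon == std::string::npos) continue;
        const std::string key = Trim(line.substr(0, colon));
        if (key.empty()) continue;
        result[key] = Trim(line.substr(colon + 1));
    }
    return result;
}
```

Keeping every value as a string matches the way SetGaConfig stores its values through a stringstream; conversion to numbers is left to the code that reads each parameter.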

    void SetGaFunctions(GeneticAlgorithm<TypePFSP>& ga)
    {
        ga.ChangeFunctionClassPointer(
            KEY_INITIALIZATION_FUNCTION,
            new PermutationInitializeStaticDataFunction<TypePFSP>());
        ga.ChangeFunctionClassPointer(
            KEY_MATING_SELECTION_FUNCTION,
            new Tournament2ToIndexStaticDataFunction<TypePFSP>());
        ga.ChangeFunctionClassPointer(
            KEY_CORSSOVER_FUNCTION,
            new LinearOrderCrossoverStaticDataFunction<TypePFSP>());
        ga.ChangeFunctionClassPointer(
            KEY_MUATATION_FUNCTION,
            new SwapMutationStaticDataFunction<TypePFSP>());
        ga.ChangeFunctionClassPointer(
            KEY_ENVIRONMENTAL_SELETCTION_FUNCTION,
            new GenerationalModelSelectionFunction<TypePFSP>());
        ga.ChangeFunctionClassPointer(
            KEY_UPDATE_BEST_FUNCTION,
            new UpdateBestFunction<TypePFSP>());

        // One evaluation function, configured once for the parents
        // and once for the children, and cloned for each use.
        PfspEvaluationStaticDataFunction<TypePFSP>
            evaluation_function;
        evaluation_function.SetDefaultParentsDataKey();
        ga.ChangeFunctionClassPointer(
            KEY_EVALUATION_PARENTS_FUNCTION,
            evaluation_function.Clone());
        evaluation_function.SetDefaultChildrenDataKey();
        ga.ChangeFunctionClassPointer(
            KEY_EVALUATION_CHILDREN_FUNCTION,
            evaluation_function.Clone());
    }

The first six calls create the function objects for TypePFSP and register them for the steps of the algorithm. The evaluation function is created once; it is set to the parents' data and cloned, and then set to the children's data and cloned again, so that the algorithm evaluates both parents and children.

B.5 Creating a function for an algorithm

This section creates an evaluation function for the algorithm of B.4 from the Evaluation class created in B.2.

    class CallEvaluation : public
        FunctionInitializeStaticData<
            KeyDataTuple2<FitnessType, SolutionType>,
            KeyConfigTuple0>
    {
    public:
        typedef CallEvaluation MyType;
        typedef KeyDataTuple2<FitnessType, SolutionType>
            DataTuple;
        typedef KeyConfigTuple0 ConfigTuple;
        typedef
            FunctionInitializeStaticData<DataTuple, ConfigTuple>
            ParentType;

        virtual void Initialize(
            Config& config, AlgorithmDatabase& database)
        {
            // Look up the problem data stored by the algorithm.
            m_porblem_data = dynamic_cast<
                ArrayType<ProblemDataType>::NonChangeType*>(
                database.GetData(KEY_PROBLEM_DATA));

            ParentType::Initialize(config, database);
        }

        virtual void Run(
            Config& config, AlgorithmDatabase&)
        {
            DataTuple& data_tuple = this->GetDataTuple();

            AssignFunction<Evaluation, 1, 1, 1>(
                this->GetMemoryType(), this->GetSize(),
                m_porblem_data->GetPointer(), 0,
                data_tuple.GetData0().AssignDataByShift(), 1,
                data_tuple.GetData1().AssignDataByShift(), 1);
        }

    private:
        ArrayType<ProblemDataType>::NonChangeType*
            m_porblem_data;
        virtual AlgorithmFunctionBase* DoClone() const
        {
            return new MyType(*this);
        }
    };

The class derives from FunctionInitializeStaticData. KeyDataTuple2 states that the function uses two data items, the fitness and the solution, and KeyConfigTuple0 states that it uses no configuration parameters. Initialize obtains the problem data from the database of the GA through the key KEY_PROBLEM_DATA. Run calls AssignFunction in the same way as B.3; GetData0() and GetData1() return the first and the second item of the tuple. DoClone returns a copy of the object.

The function is then registered in the GA of B.4 as follows:

    CallEvaluation function;
    const char* parents_key[] = {
        KEY_PARENTS_FITNESS, KEY_PARENTS_SOLUTION};
    function.SetDataTupleKeys(parents_key);
    ga.ChangeFunctionClassPointer(
        KEY_EVALUATION_PARENTS_FUNCTION,
        function.Clone());

    const char* children_key[] = {
        KEY_CHILDREN_FITNESS, KEY_CHILDREN_SOLUTION};
    function.SetDataTupleKeys(children_key);
    ga.ChangeFunctionClassPointer(
        KEY_EVALUATION_CHILDREN_FUNCTION,
        function.Clone());

SetDataTupleKeys sets the keys under which the tuple data is stored in the GA. The order of the keys must match the order of the types in the data tuple.
