Chung Hua University
Master's Thesis
Title: A Genetic Algorithm for the Double Digest Problem (以基因演算法求解雙酶切問題)
Department: Institute of Computer Science and Information Engineering
Student ID and name: E08902002, Sheng-Hung Yi (易聲宏)
Advisor: Dr. Y.S. Hou (侯玉松)
January 2003

ABSTRACT
In computational biology, the Double Digest Problem (DDP) is an important problem that must be solved when reconstructing the restriction map of a DNA sequence by restriction mapping. DDP has been proven to be NP-hard, and no polynomial-time algorithm for it has been developed.
This thesis presents a genetic algorithm for DDP. We implement the algorithm as a computer program, evaluate its performance, and offer suggestions for speeding up the computation.

Acknowledgments
This thesis could not have been completed without the patient teaching and careful guidance of my advisor, Professor Y.S. Hou. From the first idea through setting the direction, carrying out the research, and writing the final draft, I benefited greatly from his kindness and patience. Whenever I ran into setbacks during the research, he corrected and encouraged me tirelessly until the way forward became clear. In writing these acknowledgments I am deeply aware of how much I owe him, and I offer my sincere gratitude and respect.
I also thank the faculty of the Institute of Computer Science and Information Engineering, who were willing to give up their weekends to teach. I am especially grateful to Professor 吳哲賢, who never hesitated to spend time on his students, whether on academic or personal matters. Finally, I thank my classmate 王裕國; without his urging and encouragement, this thesis would not have been finished on schedule.
This thesis is dedicated to everyone who has cared about me.
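Table of Contents
Abstract
Acknowledgments
Table of Contents
List of Tables and Figures
Chapter 1 Introduction
  1-1 Overview
  1-2 Methods for Constructing Physical Maps
  1-3 The Double Digest Problem
Chapter 2 Literature Review
Chapter 3 Design of a Genetic Algorithm for the Double Digest Problem
  3-1 Overview of Genetic Algorithms
  3-2 A Genetic Algorithm for the Double Digest Problem
    3-2-1 Vector Representation
    3-2-2 Fitness Function
    3-2-3 Chromosome Crossover
    3-2-4 Gene Mutation
Chapter 4 Program Design
  4-1 Improving the GA Procedure
  4-2 Selecting Parents
  4-3 Accelerating the Improvement of Population Fitness
  4-4 Evolutionary Stagnation
    4-4-1 Same-Value, Different-Form Chromosome Vectors
    4-4-2 Handling Evolutionary Stagnation
  4-5 Optimal, Perfect, and Exact Solutions
  4-6 Program Parameters
  4-7 Development Environment
  4-8 Program Flow
Chapter 5 Results and Discussion
  5-1 Program Execution Results
    5-1-1 Experimental Data
    5-1-2 Experimental Method
    5-1-3 Results
  5-2 Conclusions
    5-2-1 Feasibility of Solving DDP with a GA
    5-2-2 Performance
  5-3 Discussion
    5-3-1 What Drives Faster Evolution
    5-3-2 The Effect of Breeding
Chapter 6 Suggestions
  6-1 A Distributed System for Computing Large Populations
  6-2 A Breeding-Based Distributed System
References
Appendix 1
Appendix 2
Appendix 3

List of Tables and Figures
Table 1: User-defined program parameters
Table 2: Program execution results
Table 3: Comparison of results with and without breeding
Figure 1: The double digest problem: DNA fragment lengths in the three test tubes
Figure 2: One arrangement satisfying the fragment lengths produced by restriction enzymes A, B, and A+B
Figure 3: Schematic of a genetic algorithm
Figure 4: General procedure of a genetic algorithm
Figure 5: A fragment arrangement similar to Figure 2
Figure 6: The improved genetic algorithm procedure
Figure 7: Roulette wheel selection procedure
Figure 8: Roulette wheel selection
Figure 9: Program flowchart
Figure 10: Double digest experimental data
Figure 11: Experimental procedure
Figure 12: Pseudo code of the master process in the distributed system for computing large populations
Figure 13: Pseudo code of the slave process in the distributed system for computing large populations
Figure 14: Schematic of the breeding-based distributed system
Figure 15: Pseudo code of the master process in the breeding-based distributed system
Figure 16: Pseudo code of the slave process in the breeding-based distributed system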
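Chapter 1 Introduction
1-1 Overview
The gene is the basic unit that determines hereditary traits, and it is made of deoxyribonucleic acid (DNA). Throughout the living world, from organisms as simple as viruses to ones as complex as humans, genetic information is stored in long-chain DNA molecules built from four nucleotides, denoted by the four letters A, T, C, and G. These molecules form the blueprint of life. That blueprint is vast and complex, and hard to understand. The human genome alone has 3 billion nucleotide pairs and contains roughly 30,000 to 100,000 genes [17].
Before molecular biology was developed, geneticists studied the relationships among genes through breeding experiments and drew linkage maps of genes. The principle is that if two genes on the same chromosome lie close together, then when cells undergo meiosis to form sperm and eggs, the probability that the two genes are separated during chromosome crossover is low. Conversely, if two genes are far apart, they are likely to be separated. By running many breeding experiments, observing hereditary traits across the pedigrees of dozens of generations, and applying statistical methods, one can infer the distances between genes and draw a gene linkage map. T.H. Morgan and his colleagues used the rules of linkage to draw the gene linkage map of the fruit fly. This method, however, requires breeding experiments and is not applicable to humans. In addition, most human families are too small to provide enough statistical samples for inference. Understanding the whole blueprint of life through gene linkage maps is therefore very difficult.
As technology advanced, scientists developed several techniques for manipulating DNA directly. Restriction enzymes can cut long-chain DNA at specific positions, producing DNA fragments. Gel electrophoresis can measure the length of DNA fragments. The polymerase chain reaction (PCR) uses DNA synthesis enzymes to perform specific chain replication and can amplify a segment of DNA 10 billion to 100 billion times. With these tools, molecular biologists began trying to understand the blueprint of life at the molecular level by drawing physical maps, which describe the locations of genes precisely.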
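1-2 Methods for Constructing Physical Maps
In bioinformatics research, the physical mapping problem is to find, for a given DNA fragment, where it is located in the genome. There are currently two main approaches: restriction mapping and hybridization mapping [6].
Restriction mapping relies on the property that a restriction enzyme cuts long-chain DNA wherever a specific arrangement of bases occurs. By locating these specific cut sites, scientists can draw the physical map of a whole segment of DNA. The first physical map ever produced, that of the virus SV40 in 1973 [9], was drawn by restriction mapping, using the property that the restriction enzyme HindII cuts DNA at the nucleotide sequences GTGCAC and GTTAAC. In practice, building a restriction map is an indispensable step in analyzing DNA sequences, drawing functional maps of genomes, studying DNA cloning, and constructing gene libraries and gene fingerprints.
Hybridization mapping uses a sequenced DNA fragment of 8 to 30 nucleotides as a probe and mixes it with the test DNA, which has first been separated from double strands into single strands. Because nucleotides pair automatically as A-T and C-G, when the DNA fragment contains a nucleotide sequence complementary to the probe, the probe binds to it; this is called hybridization. For example, the probe ACCGTGGA hybridizes with the DNA fragment CCCTGGCACCTA because the fragment contains the complementary sequence TGGCACCT. Analyzing how probes hybridize with a DNA fragment makes it possible to draw the fragment's physical map. When many probes are arranged on a substrate so that several specific nucleotide sequences can be tested at once, the device is called a DNA chip. Southern hybridization [4], commonly used by biochemists, is an application of hybridization mapping.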
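1-3 The Double Digest Problem
In restriction mapping, a DNA fragment that has been cloned in large quantity is divided among three test tubes. Two different restriction enzymes are dropped into the first and second tubes respectively, cutting the DNA in each tube into small fragments. The two enzymes are then mixed and dropped into the third tube so that they act at the same time. Because the DNA in the third tube is digested by both enzymes, it is cut into finer pieces. After the tubes have stood long enough for digestion to complete, gel electrophoresis is used to record the lengths of the fragments in each of the three tubes. Using these lengths to reconstruct the order of the fragments produced by the restriction enzymes is the double digest problem (DDP). Example 1 illustrates.
Example 1: Suppose a DNA fragment is cut by restriction enzyme A into 4 fragments of lengths 3, 5, 8, and 9; cut by restriction enzyme B into 4 fragments of lengths 3, 4, 7, and 11; and cut by A and B together into 7 fragments of lengths 2, 3, 3, 4, 4, 4, and 5, as shown in Figure 1.
Figure 1: The double digest problem: DNA fragment lengths in the three test tubes
Using the fragment lengths from each tube, we arrange the 4 fragments cut by enzyme A and the 4 fragments cut by enzyme B so that together they agree with the lengths of the 7 fragments cut by A and B simultaneously. The solution is not unique; Figure 2 shows one arrangement that satisfies the conditions.
Figure 2: One arrangement satisfying the fragment lengths produced by restriction enzymes A, B, and A+B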
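To make the consistency check in Example 1 concrete, the following Python sketch computes the A+B fragment lengths implied by a chosen ordering of the A fragments and the B fragments. It is only an illustration and not the thesis's MATLAB code: the function name `double_digest` is hypothetical, and the ordering used below was derived by hand as one valid arrangement, so it may differ from the one drawn in Figure 2.

```python
from itertools import accumulate

def double_digest(a_order, b_order):
    """Return the sorted A+B fragment lengths implied by two fragment orderings."""
    if sum(a_order) != sum(b_order):
        raise ValueError("A and B fragments must cover the same total length")
    # Cut sites are the running sums of fragment lengths (the molecule end included).
    cuts = sorted(set(accumulate(a_order)) | set(accumulate(b_order)))
    # Differences between consecutive merged cut sites are the A+B fragments.
    return sorted(hi - lo for lo, hi in zip([0] + cuts, cuts))

# Example 1 lengths; this particular ordering is an assumption made for illustration.
a_order = [5, 9, 3, 8]
b_order = [3, 7, 11, 4]
print(double_digest(a_order, b_order))
```

An ordering is a feasible DDP solution exactly when the list returned here equals the sorted list of measured A+B lengths.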
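In the literature on DDP, Goldstein and Waterman proved in [10] that DDP is NP-hard, so no polynomial-time algorithm for it has been developed so far. This thesis uses a genetic algorithm (GA) to solve DDP, taking advantage of the ease with which a GA can be distributed. We implement the program on a personal computer and evaluate its actual performance.
Chapter 2 reviews existing approaches to DDP. Chapter 3 introduces the design of the GA for DDP. Chapter 4 describes the key points of the program design. Chapter 5 reviews its performance. Chapter 6 offers suggestions.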
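Chapter 2 Literature Review
Since Hamilton Smith discovered restriction enzymes in 1970 [5], drawing restriction maps by double digestion has become one of the techniques commonly used in biochemistry laboratories. The inherent complexity of the double digest problem, however, has long troubled biochemists. From an algorithmic point of view, the time complexity of DDP is $O((n!)^2)$, so the amount of computation grows explosively as the number of fragments increases. Consider the example from [14]:
Example 2: Suppose a DNA fragment is cut by restriction enzyme A into 16 fragments of lengths {1, 1, 3, 3, 3, 3, 4, 4, 5, 6, 7, 19, 21, 23, 23, 24}; cut by restriction enzyme B into 16 fragments of lengths {1, 1, 1, 2, 2, 4, 4, 5, 9, 9, 9, 15, 15, 19, 19, 35}; and cut by A and B together into 31 fragments of lengths {1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4, 6, 7, 8, 9, 9, 9, 10, 10, 15, 23}. Solving by exhaustive search would require examining $(16! \times 16!) / (2!^7 \times 3!^2 \times 4!) = 3.96 \times 10^{21}$ arrangements.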
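The count in Example 2 can be reproduced by counting the distinct orderings of each multiset of fragment lengths and multiplying the two counts. The Python sketch below does this with exact integer arithmetic; the helper name `distinct_orderings` is introduced here for illustration only.

```python
from collections import Counter
from math import factorial, prod

def distinct_orderings(fragments):
    """Number of distinct orderings of a multiset of fragment lengths."""
    # Divide n! by k! for every length that occurs k times.
    repeats = prod(factorial(count) for count in Counter(fragments).values())
    return factorial(len(fragments)) // repeats

A = [1, 1, 3, 3, 3, 3, 4, 4, 5, 6, 7, 19, 21, 23, 23, 24]
B = [1, 1, 1, 2, 2, 4, 4, 5, 9, 9, 9, 15, 15, 19, 19, 35]
search_space = distinct_orderings(A) * distinct_orderings(B)
print(f"{search_space:.2e}")
```

The repeated lengths in A contribute 2!, 4!, 2!, and 2!, and those in B contribute 3!, 2!, 2!, 3!, 2!, and 2!, which together give the denominator quoted in the example.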
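As early as 1987, DDP was proven to be NP-hard [10]. Over the years many researchers have proposed algorithms for DDP, including the earliest exhaustive search [18], simulated annealing [10,1], fragment matching [20,12,16], and artificial intelligence [13]. W. Schmitt and Waterman proposed cassette exchange and cassette reflection [19] to derive the multiple solutions of DDP. In 1994 the Yale University School of Medicine developed the software package "Double Digester" [11], which integrates experimental data entry, algorithmic computation, and drawing functions so that biochemists can draw restriction maps conveniently. In the last three years, Ming-Yang Kao and two co-authors proposed the Enhanced Double Digest Problem [15], under which, given specific conditions, DDP reduces to a problem with linear time complexity. Yin Zhang and a co-author proposed mixed-integer linear programming [22], which handles double digest problems with large numbers of fragments effectively.
This thesis proposes another way to solve DDP, the genetic algorithm (GA), described in detail in Chapter 3.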
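Chapter 3 Design of a Genetic Algorithm for the Double Digest Problem
3-1 Overview of Genetic Algorithms
The genetic algorithm was first proposed by John Holland in 1975 [3,7]. A GA is an optimization search mechanism: a global, randomized search algorithm that imitates biological evolution and improves candidate solutions by "survival of the fittest," gradually approaching the best solution. A GA can search a very large solution space quickly. Many studies have demonstrated its effectiveness in engineering applications and optimization. Compared with the single-point, independent search of traditional mathematical methods, a GA searches from many points and exchanges information among them. This gives it potential for parallel computation and makes it well suited to solving DDP quickly.
The working principle of a GA is based on Darwin's theory of evolution, which it simulates mechanically. First, several solutions are chosen at random from the solution space as the initial population. A suitable fitness function is then used to compute the fitness of every individual. The fitness values are summed, and each individual's share of the total is used as its probability of being selected as a parent. Parents are selected from all individuals according to this probability distribution, and genetic operators such as crossover and mutation are applied to them to produce a new generation. Individuals other than the parents are replaced at random by the new individuals, giving the second-generation population. Repeating this cycle yields the third generation, the fourth, and so on. The fitness of the population gradually rises until the best variety, that is, the best solution, is produced. Figure 3 illustrates this process.
Figure 3: Schematic of a genetic algorithm
Figure 4 [2] expresses the working principle described above as an algorithm, which makes the idea clearer.
1. Choose a population size.
2. Choose the number of generations NG.
3. Initialize the population and record the fitness of each individual.
4. Repeat the following for NG generations:
4.1 Select a given number of pairs of individuals from the population probabilistically, after assigning each structure a probability proportional to its observed performance.
4.2 Copy the selected individuals, then apply genetic operators (crossover and mutation) to them to produce new individuals.
4.3 Select other individuals at random and replace them with the new individuals.
4.4 Observe and record the fitness of the new individuals.
5. Output the fittest individual as the answer.
Figure 4: General procedure of a genetic algorithm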
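3-2 A Genetic Algorithm for the Double Digest Problem
The main factors in designing a GA are the vector representation, the fitness function, the crossover method, and the mutation method. These four factors are described in turn below.
3-2-1 Vector Representation
As Section 1-3 shows, the double digest problem seeks an ordering of the fragments cut by restriction enzyme A and an ordering of the fragments cut by restriction enzyme B that together agree with the lengths of the fragments cut by A and B jointly. Any ordering of the A fragments combined with any ordering of the B fragments is therefore a candidate solution of DDP.
How can a permutation be represented as a vector? Researchers who applied GAs to the traveling salesman problem (TSP) faced the same question. Reference [23] groups the answers into three classes: path representation, adjacency representation, and ordinal representation.
Path representation numbers every node of the TSP graph. Each element of the vector stands for one node, and listing the elements in order gives the tour; this is the most intuitive representation. In adjacency representation, each combination of element position and value stands for one edge of the TSP graph: if the j-th element has value k, then edge (j,k) is part of the tour. In ordinal representation, to arrange 1, 2, …, n we first fix a base permutation, usually (1 2 … n). Any permutation of 1, 2, …, n can then be written as an n-dimensional vector whose i-th element lies between 1 and n – i + 1 and gives the rank of the i-th number of the permutation among the numbers still remaining in the base permutation. Because ordinal representation makes crossover and mutation easy to implement, we adopt it. Example 3 explains it in detail.
Example 3: Let n = 5, so the base permutation is S = (1 2 3 4 5). The permutation (1 2 5 3 4) is represented by the vector (1 1 3 1 1). First, the permutation's first number, 1, is in first place in S, so the first element of the vector is 1. Its second number, 2, is in first place among the numbers remaining after 1 is removed from S, so the second element is 1. Likewise, 5 is in third place among the remaining numbers, so the third element is 3, and so on.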
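The encoding walked through in Example 3 can be sketched in a few lines of Python. The function names `encode_ordinal` and `decode_ordinal` are hypothetical and serve only to illustrate the representation; they are not taken from the thesis's MATLAB program.

```python
def encode_ordinal(perm, base):
    """Ordinal vector of `perm` relative to the base permutation `base` (1-based ranks)."""
    remaining = list(base)
    vector = []
    for value in perm:
        rank = remaining.index(value)  # 0-based position among the unused numbers
        vector.append(rank + 1)
        remaining.pop(rank)
    return vector

def decode_ordinal(vector, base):
    """Inverse of encode_ordinal: rebuild the permutation from its ordinal vector."""
    remaining = list(base)
    return [remaining.pop(rank - 1) for rank in vector]

base = [1, 2, 3, 4, 5]
print(encode_ordinal([1, 2, 5, 3, 4], base))
print(decode_ordinal([1, 1, 3, 1, 1], base))
```

Because the i-th element only has to lie between 1 and n – i + 1, any vector that respects those ranges decodes to a valid permutation, which is what makes crossover and mutation simple in this representation.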
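The advantage of ordinal representation is that crossover is easy: it suffices to exchange the tail segments of two genes directly. Its drawback is that the head segments of the two genes never change, which means that the leading numbers of the permutation and their order never change. If those leading numbers and their order do not match a feasible solution, crossover cannot produce that solution, and computation time grows. Reference [23] reports that a GA for the traveling salesman problem designed with ordinal representation performs poorly.
To remedy this drawback and make the head segment of a gene more random, we increase the number of base permutations, extending the single base permutation to a set of base permutations S′:
$$S' = \{\, S_i \mid i = 1, \dots, n \,\}, \quad \text{where } S_i = (i,\ i+1,\ \dots,\ n,\ 1,\ 2,\ \dots,\ i-1)$$
After this extension, the chromosome vector needs one additional base parameter that indicates which base permutation $S_i$ the vector refers to. Example 4 illustrates.
Example 4: Let S′ = {(1 2 3 4 5), (2 3 4 5 1), (3 4 5 1 2), (4 5 1 2 3), (5 1 2 3 4)}. The vector (1 1 3 1 1) represents the permutation (1 2 5 3 4) when the base parameter is 1, and the permutation (3 4 2 5 1) when the base parameter is 3.
The chromosome vector therefore has two parts, the ordinal vector and the base parameter. We call this representation the "extended ordinal representation." To make the head segment more random, crossover exchanges the tail segments of the ordinal vectors and, in addition, considers whether to exchange the base parameters of the two chromosome vectors. The exchange probability is p, which is one of the adjustable parameters in our tests.
The remaining details of crossover are described in Section 3-2-3.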
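As a hedged illustration of the extended ordinal representation, the sketch below builds the base permutation $S_i$ as the i-th cyclic rotation of (1 2 … n) and decodes an ordinal vector against it. The names `base_permutation` and `decode_extended` are assumptions made for this Python example and do not come from the thesis's code.

```python
def base_permutation(i, n):
    """S_i = (i, i+1, ..., n, 1, ..., i-1), the i-th cyclic rotation of (1 .. n)."""
    return list(range(i, n + 1)) + list(range(1, i))

def decode_extended(vector, base_index):
    """Decode an ordinal vector against the base permutation chosen by base_index."""
    remaining = base_permutation(base_index, len(vector))
    # Each rank picks (and removes) the rank-th number still unused.
    return [remaining.pop(rank - 1) for rank in vector]

print(decode_extended([1, 1, 3, 1, 1], 1))
print(decode_extended([1, 1, 3, 1, 1], 3))
```

The two calls correspond to the two cases of Example 4: the same ordinal vector yields different permutations under different base parameters.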
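3-2-2 Fitness Function
In the simulated annealing algorithm for DDP in [14], the energy function is designed according to a chi-square-like criterion. The lengths of the A+B fragments produced by the candidate arrangement are sorted in increasing order as $c_1', c_2', \dots, c_t'$, and the A+B fragment lengths given in the problem are likewise sorted as $c_1, c_2, \dots, c_t$. The relative squared distance $(c_j - c_j')^2 / c_j$ is computed for $j = 1, \dots, t$, and the sum of these t values is the energy. When the correct arrangement of the A and B fragments is found the energy is 0; otherwise it is greater than 0.
This energy function implies that the closer the fragment lengths are, the closer the arrangement is to the correct one. We believe this does not necessarily match reality, as Example 5 shows.
Example 5: Take the DDP of Example 1. If the fragments cut by enzymes A and B are arranged as in Figure 2, the correct A+B answer is obtained, with sorted lengths 2, 3, 3, 4, 4, 4, 5. If only the last two fragments cut by enzyme A are swapped, the A+B fragments are arranged as in Figure 5, with sorted lengths 1, 2, 3, 3, 4, 5, 7.
Figure 5: A fragment arrangement similar to Figure 2
The arrangement of the A fragments in Figure 5 is similar to that in Figure 2, but it makes the A+B digestion produce fragments of lengths 1, 3, and 7 in place of fragments of lengths 3, 4, and 4. After sorting, the length 1 moves to the front, the length 3 is gained once and lost once and has no effect, and the length 7 moves to the back. Almost every length is thereby shifted one position to the left or right in the sorted order, so the sum of relative squared distances against the correct answer is far from zero.
Yet a comparison of Figures 2 and 5 shows that, because only the last two A fragments changed position, the first four A+B fragments are arranged identically in both, and only the last three are affected. A fitness function should therefore be based mainly on the number of A+B fragment lengths that agree, so that similar arrangements receive similar fitness values.
In similar arrangements only a few fragments differ in position while most others stay where they are, so the positions and lengths of the A+B fragments largely agree and the fitness values should be close. This thesis therefore defines the fitness function as "the square of the number of A+B fragments with matching lengths." The larger the number of matching fragments, the faster the probability of achieving it shrinks. A lottery offers an analogy: matching six digits rather than five, and matching five rather than four, each differ by one digit, yet matching five digits is 216 times as likely as matching six, whereas matching four is 43.75 times as likely as matching five. The difference in prize between six and five matches is therefore much larger than that between five and four. The fitness value plays the role of the prize, so we square the count to widen the differences.
Example 6: The A+B fragment lengths in Example 1 are 2, 3, 3, 4, 4, 4, 5, and those in Example 5 are 1, 2, 3, 3, 4, 5, 7. The five numbers 2, 3, 3, 4, 5 appear in both, so the number of fragments with matching lengths is 5 and the fitness is $5^2 = 25$.
The algorithm for computing the fitness is simple. Sort both lists of lengths in increasing order, then scan them once from small to large with a merge step like that of merge sort; this finds every pair of equal numbers. Count the pairs and square the count.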
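The merge-style scan described above can be sketched as follows. This is a minimal Python illustration, not the thesis's `fitness.m`; the function name and signature are assumptions.

```python
def fitness(candidate, target):
    """Square of the number of fragment lengths shared by the two multisets."""
    a, b = sorted(candidate), sorted(target)
    i = j = matches = 0
    # Merge-style scan: advance whichever list currently holds the smaller value.
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            matches += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return matches ** 2

target = [2, 3, 3, 4, 4, 4, 5]
print(fitness([1, 2, 3, 3, 4, 5, 7], target))  # the comparison made in Example 6
print(fitness(target, target))                 # a perfect match of all 7 fragments
```

Since each element is visited at most once after sorting, the cost is dominated by the two sorts.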
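3-2-3 Chromosome Crossover
Because we use ordinal representation, crossover has two parts: the exchange of ordinal vectors and the exchange of base parameters. Ordinal vectors are exchanged as in a traditional GA: a random number determines the crossover point, and the tail segments of the ordinal vectors to the right of that point are swapped. A separate random number decides whether the base parameters are exchanged. Exchanging base parameters changes the represented permutation completely, which is a very large change, so the probability of this exchange should not be set too high.
Example 7: Continuing Example 4, let permutation X have ordinal vector (1 1 3 1 1) with base parameter 1, representing (1 2 5 3 4), and let permutation Y have ordinal vector (4 4 1 2 1) with base parameter 2, representing (5 1 2 4 3). If the crossover point lies between positions 2 and 3, the ordinal vector of X becomes (1 1 1 2 1) and that of Y becomes (4 4 3 1 1); that is, the tail segments (1 2 1) and (3 1 1) are exchanged. X becomes (1 2 3 5 4) and Y becomes (5 1 4 2 3); only the order of the last three numbers changes.
If the base parameters of X and Y are then also exchanged, so that the base parameter of X becomes 2 and that of Y becomes 1, X becomes (2 3 4 1 5) and Y becomes (4 5 3 1 2). Compared with the permutations before crossover, the number at every position is different.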
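A minimal Python sketch of this two-part crossover follows. A chromosome is modeled here as a pair (ordinal vector, base parameter); that layout, the function names, and the default probability `p_base` are illustrative assumptions and are not taken from the thesis's `crossover.m`.

```python
import random

def crossover(parent_x, parent_y, point, swap_base):
    """Swap ordinal-vector tails after `point`; optionally swap base parameters."""
    (vec_x, base_x), (vec_y, base_y) = parent_x, parent_y
    child_x = vec_x[:point] + vec_y[point:]
    child_y = vec_y[:point] + vec_x[point:]
    if swap_base:
        base_x, base_y = base_y, base_x
    return (child_x, base_x), (child_y, base_y)

def random_crossover(parent_x, parent_y, p_base=0.01, rng=random):
    """Pick a random crossover point and swap base parameters with probability p_base."""
    point = rng.randint(1, len(parent_x[0]) - 1)
    return crossover(parent_x, parent_y, point, rng.random() < p_base)

x = ([1, 1, 3, 1, 1], 1)
y = ([4, 4, 1, 2, 1], 2)
print(crossover(x, y, point=2, swap_base=False))
print(crossover(x, y, point=2, swap_base=True))
```

The children are always valid chromosomes because the allowed range of an ordinal element depends only on its position, and tails are exchanged position for position.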
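3-2-4 Gene Mutation
Mutation is similar to mutation in a traditional GA: one number is picked at random from the ordinal vector and the base parameter, and it is replaced with a random number. The new value must stay within the valid range: the i-th element of the ordinal vector lies between 1 and n – i + 1, and the base parameter lies between 1 and k, where k is the total number of base permutations (see Section 3-2-1).
Example 8: Continuing Example 7, if the second element of the ordinal vector of permutation X becomes 2, the ordinal vector becomes (1 2 3 1 1); with the base parameter still 1, X becomes (1 3 5 2 4).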
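The range constraints on mutation can be sketched as below. This Python example mutates exactly one randomly chosen slot per call, which is an assumption made for illustration; the thesis's `mutation.m` is not shown in this section, and the names used here are hypothetical.

```python
import random

def mutate(vector, base_index, n_bases, rng=random):
    """Replace one randomly chosen gene with a random value in its valid range."""
    n = len(vector)
    vector = list(vector)                # work on a copy
    pos = rng.randrange(n + 1)           # n ordinal slots plus the base parameter
    if pos == n:
        base_index = rng.randint(1, n_bases)
    else:
        # Slot i (1-based) may hold 1 .. n-i+1; with 0-based pos that is 1 .. n-pos.
        vector[pos] = rng.randint(1, n - pos)
    return vector, base_index

print(mutate([1, 1, 3, 1, 1], 1, n_bases=5, rng=random.Random(0)))
```

Keeping every replaced value inside its positional range guarantees that the mutated chromosome still decodes to a valid permutation.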
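Chapter 4 Program Design
4-1 Improving the GA Procedure
Step 4.3 of the GA procedure in Figure 4 reads:
4.3 Select other individuals at random and replace them with the new individuals.
That is, individuals that are not parents are eliminated at random and replaced by newborn individuals. This step implies that the population size stays constant throughout evolution. From the viewpoint of biological evolution, as the fitness of the whole population gradually rises, the population should grow; conversely, if fitness is poor, the population should shrink. In nature, population size changes rather than staying fixed. Moreover, when there are more newborn individuals than non-parents, step 4.3 of Figure 4 cannot be carried out, so it must be revised.
In this thesis, individuals that are not selected as parents cannot pass their genes to the next generation, so the program eliminates all of them. The reduced population is replenished by the newborn individuals and also by the outstanding individuals in the breeding pool (see Section 4-3). The resulting improvement of the standard GA procedure of Figure 4 is shown in Figure 6.
Chapter 5 discusses the benefit of this improvement.
1. Choose the initial population size.
2. Choose the number of generations NG.
3. Initialize the population and record the fitness of each individual.
4. Repeat the following for NG generations:
4.1 Copy the outstanding individuals, whose fitness is above a given level, into the breeding pool.
4.2 Select a given number of pairs of individuals from the population probabilistically, after assigning each structure a probability proportional to its observed performance.
4.3 Copy the selected individuals, then apply genetic operators (crossover and mutation) to them to produce new individuals.
4.4 Eliminate all other individuals. Add the new individuals and the individuals in the breeding pool to the population.
4.5 Observe and record the fitness of the new individuals.
5. Output the fittest individual as the answer.
Figure 6: The improved genetic algorithm procedure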
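4-2 Selecting Parents
In nature, outstanding individuals find mates and reproduce more easily; put differently, they attract the opposite sex more easily and produce the next generation. From the viewpoint of evolution, fitter individuals have a higher chance of producing offspring. Genetic algorithms offer many mechanisms for selecting parents; this thesis uses roulette wheel selection [8]. Figure 7 expresses its working principle as an algorithm.
1. Compute the partial sums $S_k = \sum_{i=1}^{k} f_i$ for k = 1, …, n, where $f_i$ is the fitness of the i-th individual and n is the total number of individuals.
2. Generate an integer random number r in [0, $S_n$].
3. Return 1 when r ≤ $S_1$, and return k when $S_{k-1}$ < r ≤ $S_k$ for k ≥ 2.
Figure 7: Roulette wheel selection procedure
This method is simple to use and easy to understand. As Figure 8 shows, the whole wheel has area $S_n$. Suppose individual 1 has the lowest fitness, so it occupies the smallest area, and individual 4 has the highest fitness, so it occupies the largest. Selecting a parent is like throwing a dart at a spinning wheel: the larger an individual's area, the more likely it is to be chosen.
Figure 8: Roulette wheel selection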
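The three steps of Figure 7 map directly onto a cumulative sum and a binary search. The Python sketch below is one way to write it, assuming integer fitness values (as produced by the squared match count); the function name `roulette_select` is hypothetical and the code is not the thesis's `select_parent.m`.

```python
import random
from bisect import bisect_left
from itertools import accumulate

def roulette_select(fitness_values, rng=random):
    """Return the 1-based index k chosen with probability roughly proportional to fitness."""
    partial = list(accumulate(fitness_values))  # S_1 .. S_n
    r = rng.randint(0, partial[-1])             # integer r in [0, S_n]
    # bisect_left finds the first S_k with r <= S_k, i.e. S_{k-1} < r <= S_k.
    return bisect_left(partial, r) + 1

rng = random.Random(0)
picks = [roulette_select([1, 4, 9, 16], rng) for _ in range(2000)]
print({k: picks.count(k) for k in (1, 2, 3, 4)})
```

Because r = 0 is also assigned to individual 1, that individual is very slightly favored relative to exact proportionality; this follows Figure 7 as written.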
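4-3 Accelerating the Improvement of Population Fitness
Breeding is an important technique in artificial propagation: mating specially selected outstanding individuals quickly produces the desired variety. This thesis borrows the idea and designs a breeding pool dedicated to storing especially outstanding individuals. During simulated evolution, the three individuals with the highest fitness in each generation are copied into the breeding pool. Each time a new generation is produced, its best individuals are added to the pool, which accumulates the outstanding individuals of all generations, and the pool takes part in breeding the new generation. The outstanding individuals drawn from the breeding pool are limited to at most 30% of that generation's population, so that each generation keeps its own genetic character. See Figure 9.
Experiments confirm that breeding does speed up optimization; Chapter 5 discusses this.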
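4-4 Evolutionary Stagnation
If, during GA computation, the average fitness of the population stops improving from generation to generation and the fitness values of individuals converge, evolution has stagnated. Once this happens, only mutation can change the fitness of the population. But mutations occur with low probability, and most of them reduce fitness, so mutants are quickly eliminated. Even when a mutant is fitter than average, mutants are too few compared with the whole population to improve its overall quality. Once evolution stagnates, there is therefore little chance of computing a result.
Stagnation occurs because most individuals in a generation carry the same genetic content. This can happen in two ways: the genes are completely identical, or they are same-value, different-form genes.
4-4-1 Same-Value, Different-Form Chromosome Vectors
This thesis describes chromosome vectors with the extended ordinal representation of Section 3-2-1. The representation makes crossover convenient, but it has the property that one permutation can be expressed by several different chromosome vectors. We call these "same-value, different-form" chromosome vectors, as in Example 9.
Example 9: Let S′ = {(1 2 3 4 5), (2 3 4 5 1), (3 4 5 1 2), (4 5 1 2 3), (5 1 2 3 4)}. The permutation (1 3 5 4 2) can be represented by
base parameter = 1, vector (1 2 3 2 1), or by
base parameter = 2, vector (5 2 3 2 1), or by
base parameter = 3, vector (4 1 2 1 1), or by
base parameter = 4, vector (3 4 2 1 1), or by
base parameter = 5, vector (2 3 1 2 1).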
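One way to detect same-value, different-form chromosomes is to decode each one and compare the resulting permutations. The Python sketch below keeps the first chromosome seen for each distinct permutation. It illustrates the idea only; the function names are hypothetical and the code is not the thesis's `clean_polymophism.m` or `Noah_ark.m`.

```python
def decode_extended(vector, base_index):
    """Decode an ordinal vector against the base permutation S_i = (i .. n, 1 .. i-1)."""
    n = len(vector)
    remaining = list(range(base_index, n + 1)) + list(range(1, base_index))
    return tuple(remaining.pop(rank - 1) for rank in vector)

def drop_same_value_forms(chromosomes):
    """Keep the first chromosome seen for each distinct decoded permutation."""
    seen, kept = set(), []
    for vector, base_index in chromosomes:
        perm = decode_extended(vector, base_index)
        if perm not in seen:
            seen.add(perm)
            kept.append((vector, base_index))
    return kept

example_9 = [([1, 2, 3, 2, 1], 1), ([5, 2, 3, 2, 1], 2), ([4, 1, 2, 1, 1], 3),
             ([3, 4, 2, 1, 1], 4), ([2, 3, 1, 2, 1], 5)]
print(drop_same_value_forms(example_9))
```

All five chromosomes of Example 9 decode to the same permutation, so only the first survives.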
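4-4-2 Handling Evolutionary Stagnation
Duplicate genes and same-value, different-form genes are very common in the breeding pool. Outstanding individuals are easily selected as parents, and as they reproduce generation after generation they tend toward inbreeding, which reduces genetic diversity and causes stagnation. To address this, we designed two routines. The first removes surplus outstanding individuals with identical genes from the breeding pool, keeping only one. The second is triggered when the number of outstanding individuals exceeds the breeding pool limit; it removes surplus individuals with same-value, different-form genes, keeping only one. See Figure 9.
4-5 Optimal, Perfect, and Exact Solutions
A GA is an optimization search technique, and the answer it finds is called an optimal solution. In general, this is a local optimum of the objective function over the solution space, with no guarantee of being the global optimum. When the objective value is the global optimum, we call the solution a perfect solution. In solving DDP with a GA, the objective function is the fitness function of Section 3-2-2. Because that function is defined as "the square of the number of A+B fragments with matching lengths," its upper bound $f_{max}$ is known in advance. In Example 1, the A+B fragment lengths are known to be 2, 3, 3, 4, 4, 4, 5, so any perfect solution must also produce A+B fragments 2, 3, 3, 4, 4, 4, 5, with fitness $7^2 = 49$; that is, $f_{max} = 49$. While solving, if the fitness of the best solution found equals $f_{max}$, that solution is guaranteed to be a perfect solution. The program computes $f_{max}$ first and checks in every generation whether any individual's fitness equals $f_{max}$; if so, a perfect solution has been found.
Although the optimal solution found by this method is then a perfect solution, more than one solution may satisfy $f_i = f_{max}$. Even a perfect solution is not guaranteed to be the unique exact solution of the DDP; this is the multiple-solution problem of DDP [7].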
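4-6 Program Parameters
The program has seven user-defined parameters, described in Table 1:
Parameter name                        Value   Description
ratio_to_be_parent                    80%     Fraction of the breeding population selected as parents
S_crossover_probability               1%      Probability of exchanging the base parameter
mutation_probability                  3%      Probability of gene mutation
initial_population_no                 200     Initial population size
uplimit_of_population                 200     Upper limit on population size
ratio_of_terminator_to_population     30%     Maximum share of a generation's population drawn from the breeding pool
Table 1: User-defined program parameters
4-7 Development Environment
The program was developed in MATLAB and run on a personal computer. The development environment was:
Hardware: Acer Travelmate 351TE, Intel PIII 800MHz CPU/376MB RAM/20GB H.D.
Operating system: Windows XP
Computing system: MATLAB 6.1 Release 12
4-8 Program Flow
Figure 9 is the program flowchart. The source code is listed in Appendix 3.
Figure 9: Program flowchart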
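Chapter 5 Results and Discussion
5-1 Program Execution Results
5-1-1 Experimental Data
We prepared three sets of double digest data with different amounts of computation, shown in Figure 10. To make the amount of computation easy to compare, A and B are given the same number of fragments n; a larger n means more computation.
Group 1. n=16, search-space size $(16! \times 16!) / (2!^7 \times 3!^2 \times 4!) = 3.96 \times 10^{21}$. The A, B, and A+B fragment lengths are those of Example 2.
Group 2. n=30, search-space size $(30! \times 30!) / (2!^7 \times 3!^6 \times 4!^2 \times 6!) = 2.84 \times 10^{52}$
A: {1,1,2,2,2,2,2,2,3,3,5,5,5,7,7,7,9,9,9,10,12,12,14,15,21,21,21,26,32,33}
B: {1,1,1,1,3,3,3,3,4,4,4,5,5,5,8,8,9,10,10,12,12,13,14,18,19,21,24,26,26,27}
A+B: {1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,4,4,4,4,5,5,5,5,5,5,5,5,7,7,7,7,7,8,8,8,8,8,9,9,9,9,10,11,12,13,18,21}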
Group 3. n=100, search-space size $(100! \times 100!) / (2!^{19} \times 3!^{18} \times 4!^8 \times 5!^4 \times 6! \times 9!) = 2.74 \times 10^{268}$
A: {1,1,1,1,1,2,2,2,2,3,3,3,4,5,5,5,5,6,6,6,7,7,7,7,7,7,8,8,9,9,9,10,10,10,10,10,11,11,11,11,12,13,
13,13,13,14,14,14,15,15,15,16,16,17,17,17,18,18,18,19,20,20,20,20,20,21,21,21,22,22,23,24,
24,24,25,25,27,29,30,30,31,31,31,34,34,35,35 ,37,38,42,43,44,46,48,53,57,60,64,76,143}
BȈ {1,1,1,1,1,2,2,3,3,3,3,3,3,3,3,3,4,4,5,5,6,6,6,6,7,7,8,8,8,8,9,10,10,11,11,11,11,11,12,12,12,12,12, 13,13,14,14,14,15,16,16,16,17,17,17,18,18,18,19,19,19,21,21,22,23,24,26,26,26,26,27,28,30,31, 31,31,32,32,32,33,34,34,34,35,35,37,37,38,39,40,40,44,50,50,54,56,63,64,64,69}
A+BȈ{1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,4,4,
4,4,4,4,4,4,4,4,4,4,4,4,4,4,5,5,5,5,5,5,5,5,5,5,6,6,6,6,6,6,6,6,6,6,6,6,6,6,7,7,7,7,7,7,7,7,7,7,7,8,8,8,8,8,8,8,8,9, 9,9,9,10,10,10,10,10,11,11,11,11,11,11,11,11,11,11,12,12,12,12,12,12,13,13,13,13,14,
14,14,14,14,14,15,15,15,15,16,16,16,17,17,17,17,17,18,18,18,19,19,19,19,20,20,20,20,20,21,22, 22,23,23,24,24,24,24,25,26,27,28,29,30,30,30,31,33,33,34,38,42,44,48}
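Figure 10: Double digest experimental data
5-1-2 Experimental Method
Each group was run dozens of times to check whether an optimal solution was found. We recorded the number of generations needed to evolve the first optimal solution and the program's computation time. The procedure is shown in Figure 11.
Figure 11: Experimental procedure
5-1-3 Results
For each group the program was run dozens of times to check whether an optimal solution could be found, and the number of generations and the time needed to evolve the first optimal solution were recorded. Appendix 1 lists the raw results, which are summarized in Table 2:
Item                                                        Group 1, n = 16    Group 2, n = 30    Group 3, n = 100
Generation limit                                            100                300                300
Number of runs                                              57                 20                 20
Runs that found an optimal solution / percentage            55 / 96.5%         7 / 35%            0
Generations to the first optimal solution (average)         38.3               229.7              -
Time to the first optimal solution (average)                49.3 s             2,836.9 s          -
Table 2: Program execution results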
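5-2 Conclusions
5-2-1 Feasibility of Solving DDP with a GA
Table 2 shows that solving DDP with a GA is feasible. For n=16 the probability of finding an optimal solution is 96.5%, and that is with the generation limit set to 100. If the program parameters are adjusted to raise the generation limit so that the GA keeps evolving, we expect an optimal solution to be found in essentially every run. The method still has limits, however: as n grows, the running time rises sharply. For n=30, the probability of finding an optimal solution within 300 generations is only 35%, and the average solving time is about 45 minutes. For n=100 we cannot tell whether an optimal solution can be found in reasonable time; the method does not escape the constraints of NP-hardness. In practice, however, when biochemists draw restriction maps by double digestion in the laboratory, they usually use several different enzyme pairs, reconstruct each double digest experiment separately, and then merge the reconstructions. Reference [21], for example, uses 6 enzymes and 7 double digests, each producing 2 to 7 fragments. This suggests that n=16 is still a reasonable, practical size that meets laboratory needs.
5-2-2 Performance
In [14] Waterman describes solving DDP by simulated annealing on data whose search-space size equals that of Group 1 in Section 5-1-1, namely $(16! \times 16!) / (2!^7 \times 3!^2 \times 4!)$. That method found an optimal solution in 1635 iterations, but the reference does not report the running time. Our program finds an optimal solution in 38.3 generations on average, with a running time of about one minute, which is within a reasonable range.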
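5-3 Discussion
5-3-1 What Drives Faster Evolution
Evolution is driven by natural selection: the weak are eliminated and the strong are kept to produce the next generation. To speed up evolution, selection must be strengthened. We see two directions: stricter elimination, and higher genetic diversity. Strict elimination quickly removes inferior genes, while higher diversity ensures that reproduction yields offspring with many kinds of genes. The two directions pull against each other, because strictly eliminating the weak inevitably shrinks the population and lowers its genetic diversity. To balance them, this thesis proposes two measures:
1. Eliminate every individual that is not selected as a parent, as described in Section 4-1.
2. Preserve the genes of outstanding individuals in the breeding pool and replenish the population from the pool, as described in Section 4-3.
5-3-2 The Effect of Breeding
We set the Table 1 parameter that limits the share of a generation's population drawn from the breeding pool to 0, which disables breeding, and left the other parameters unchanged. Testing with the Group 1 data of Section 5-1-3, 38 of 57 runs found an optimal solution, a probability of 66.7%. The average number of generations to the first optimal solution was 54.5, and the average time was about 60.9 seconds. Appendix 2 lists the raw results. The comparison with evolution under the breeding mechanism is as follows:
Item                                                             With breeding    Without breeding
Probability of finding an optimal solution (generation limit 100)    96.5%            66.7%
Generations to the first optimal solution (average)                  38.3             54.5
Time to the first optimal solution (average)                          49.3 s           60.9 s
Table 3: Comparison of results with and without breeding
The table shows that with the breeding mechanism the number of generations needed to find the first optimal solution falls from 54.5 to 38.3, and the computation time falls from 60.9 seconds to 49.3 seconds. Breeding does speed up evolution effectively.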
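Chapter 6 Suggestions
Bioinformatics frequently involves NP-hard problems and problems that require processing large amounts of data, so it often demands long computation times and huge storage. Distributed processing, with its low cost, high computing capacity, and distributed storage of large data, appears to be the best choice for advancing bioinformatics research and applications.
Developing a GA program for DDP is our first step into bioinformatics. Through this attempt we hope to uncover where the software, hardware environment, or algorithms for distributed processing in bioinformatics fall short, as a foundation and experience for future work. We offer the following suggestions for further development.
6-1 A Distributed System for Computing Large Populations
As noted in Section 5-3-1, genetic diversity in the population is one source of faster evolution, so a larger population can speed evolution up. But the amount of computation grows with population size, so the solution is not necessarily reached sooner. A distributed system that spreads a large population over subsystems should solve this problem. The most common software environments for distributed computing are MPI and PVM. PVM provides a master-slave architecture for distributed processing, which suits a distributed GA well. The master process can generate the initial population, select parents from the whole population, and replace some of the non-parent individuals with the new generation. The slave processes can compute the fitness of part of the population and perform crossover and mutation. Data can be exchanged between master and slaves with PVM's send and receive instructions. Pseudo code for the master and slave processes is given in Figures 12 and 13.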
Procedure master
Generate initial population.
For I = 1 to NG do /* NG is the number of generations */
Send 1/n of individuals to each slave. /* n is the number of slaves */
Receive the fitness values of the individuals from slaves.
Select a given number of pairs of individuals from the population.
Send 1/n of selected pairs of individuals to each slave.
Receive individuals from slaves. /* genetic operators are applied */
Select other individuals at random and replace them.
End For
Output the fittest individual as the answer.
End master
Figure 12: Pseudo code of the master process in the distributed system for computing large populations
Procedure slave
Receive individuals from master.
Compute the fitness value for each individual.
Send the fitness values to master.
Receive the pairs of individuals from master.
Apply the crossover operator to the pairs of individuals.
Apply the mutation operator to individuals randomly.
Send these individuals to master.
End slave
Figure 13: Pseudo code of the slave process in the distributed system for computing large populations
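The fitness-evaluation half of Figures 12 and 13 can be imitated on a single machine. The Python sketch below uses a thread pool from the standard library as a stand-in for PVM slave processes; this substitution is an assumption made purely for illustration, since the thesis proposes PVM send and receive calls and does not implement the distributed system. Individuals are represented here simply by their A+B fragment lists, and all names are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def fitness(fragments, target):
    """Square of the number of fragment lengths shared with the target multiset."""
    shared = Counter(fragments) & Counter(target)
    return sum(shared.values()) ** 2

def evaluate_population(population, target, n_workers=4):
    """Master side: hand 1/n of the individuals to each worker and gather the scores."""
    chunks = [population[w::n_workers] for w in range(n_workers)]

    def worker(chunk):
        # Slave side: compute the fitness of every individual received.
        return [fitness(individual, target) for individual in chunk]

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(worker, chunks))
    scores = [0] * len(population)
    for w, chunk_scores in enumerate(results):
        scores[w::n_workers] = chunk_scores  # put scores back in the original order
    return scores

target = [2, 3, 3, 4, 4, 4, 5]
population = [[2, 3, 3, 4, 4, 4, 5], [1, 2, 3, 3, 4, 5, 7], [25]]
print(evaluate_population(population, target, n_workers=2))
```

A real deployment would replace the thread pool with separate processes or machines, at which point the cost of sending individuals back and forth becomes part of the trade-off discussed above.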
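6-2 A Breeding-Based Distributed System
Because breeding speeds up evolution effectively, we propose another distributed architecture that builds the breeding concept into the design. In this design the slave processes evolve completely independently: each generates its own initial population, computes the fitness of its individuals, selects parents, performs crossover and mutation, and replaces individuals with the new generation. A breeding pool is added to the master process, and the outstanding individuals from the slaves are collected there. The master's initial population comes from the breeding pool; in each generation it eliminates all non-parent individuals and replenishes the population from the pool. Figure 14 is a schematic of the breeding-based distributed system. Figures 15 and 16 give the pseudo code of the master and slave processes.
Figure 14: Schematic of the breeding-based distributed system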
Procedure master
Send a start command to each slave.
Receive individuals from every slave, store them into the breeding pool.
When every slave's generation number exceeds SNG, begin the following steps:
/* SNG is a specific value which represents the generation number of the slaves */
Generate the initial population from the breeding pool.
For I = 1 to NG do /* NG is the number of generations */
Compute the fitness value for each individual.
Select a given number of pairs of individuals from the population as parents.
Apply the crossover operator to the pairs of individuals.
Apply the mutation operator to individuals randomly.
Eliminate the non-parent individuals and resupply from the breeding pool.
End For
Output the fittest individual as the answer.
Send a stop command to each slave.
End master
Figure 15: Pseudo code of the master process in the breeding-based distributed system
Procedure slave
Receive start command from master.
Generate initial population.
Repeat the following steps until the stop command is received from master:
Compute the fitness value for each individual.
Select a given number of pairs of individuals from the population.
Apply the crossover operator to the pairs of individuals.
Apply the mutation operator to individuals randomly.
Select other individuals at random and replace them.
Send the best Ns individuals to master's breeding pool.
/* Ns is a specific value which represents the number of super individuals */
Send the generation number to master.
End slave
Figure 16: Pseudo code of the slave process in the breeding-based distributed system
References
1. A.V. Grigorjev, A.A. Mironov, Mapping DNA by Stochastic Relaxation: A New Approach to Fragment Sizes, Applic. Biosci., pp. 107-111.
2. D. H. Ballard. An Introduction to Natural Computation. The MIT Press, 1997.
3. Davis L., Handbook of genetic algorithms. Van Nostrand Reinhold. 1991.
4. E. Southern. United Kingdom patent application GB8810400. 1988.
5. H.O. Smith and K.W. Wilcox, A restriction enzyme from Hemophilus influenzae. I. Purification and general properties. Journal of Molecular Biology, 51:379-391, 1970.
6. J. Setubal and J. Meidanis. Introduction to Computational Molecular Biology. PWS Publishing Company, 1997.
7. J. Holland. Adaptation in Natural and Artificial Systems. The University of Michigan Press, 1975.
8. K.F. Man, K.S. Tang and S. Kwong, Genetic Algorithms Concepts and Designs, Springer-Verlag London Limited, 1999.
9. K.J. Danna, G.H. Sack, and D. Nathans. Studies of simian virus 40 DNA. VII. A cleavage map of the SV40 genome. Journal of Molecular Biology, 78:263-276, 1973.
10. L. Goldstein and M. S. Waterman. Mapping DNA by stochastic relaxation. Advances in Applied Mathematics, 8:194-207, 1987.
11. L.W. Wright, J.B. Lichter, J. Reinitz, M.A. Shifman, K.K. Kidd, and P.L. Miller, Computer-Assisted Restriction Mapping: An Integrated Approach to Handling Experimental Uncertainty, CABIOS, Vol. 10, No. 4, pp. 443-450, 1994.
12. M. Krawczak, Algorithms for the Restriction-Site Mapping of DNA Molecules, Proc. Natl. Acad. Sci. USA, 85, pp. 7298-7301, 1988.
13. M. Stefik, Inferring DNA Structures from Segmentation Data, Artif. Intell., 11, pp. 85-114, 1978.
14. M.S. Waterman, Introduction to Computational Biology: Sequences, Maps, and Genomes, Chapman Hall, page 78, 1995.
15. Ming-Yang Kao, Jared Samet, and Wing-Kin Sung. The Enhanced Double Digest Problem for DNA Physical Mapping. Proceedings of the 9th Scandinavian Workshop on Algorithm Theory (SWAT 2000), pp. 383-392, 2000.
16. P. Tuffery, P. Dessen, C. Mugnier, and S. Hazout, Restriction Map Construction Using Complete Sentences Compatibility Algorithm, Comput. Applic. Biosci., 4, pp. 103-110, 1988.
17. Venter JC, Adams MD, Myers EW, et al. The sequence of the human genome, Science, 291, pp. 1304-1351, 2001.
18. W. R. Pearson. Automatic Construction of Restriction Site Maps, Nucleic Acids Res., 10, pp. 217-227, 1984.
19. W. Schmitt and M.S. Waterman, Multiple Solutions of DNA Restriction Mapping Problems, Adv. Appl. Math., 12, pp. 412-427, 1991.
20. W.M. Fitch, T.F. Smith, and W.W. Ralph, Mapping the Order of DNA Restriction Fragments, Gene, 22, pp. 19-29, 1983.
21. Yanhua Peng, Fuhua Yang, Yipeng Qi, Yongxin Huang, Location, Sequence and Promoter Structure of Apoptosis-Inhibiting Gene of Leucania separata Nuclear Polyhedrosis Virus, Virologica Sinica, Vol. 14, No. 1, 1999.
22. Yin Zhang and Zhijun Wu, Solving the Double Digestion Problem as a Mixed-Integer Linear Program, Technical Report TR01-12, Department of Computational and Applied Mathematics, Rice University, 2001.
23. Z. Michalewicz. Genetic Algorithms + Data Structures = Evolution Programs. Third Edition. Springer, 1996.
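Appendix 1
Group 1: n = 16; Group 2: n = 30; Group 3: n = 100
Columns for each group, left to right: optimal solution found, number of generations, run time.
Each of the first 20 records lists Group 1, Group 2, and Group 3 in that order; the remaining records belong to Group 1 only.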
Yes 54 281.05 No 301 5590.8 No 301 29865.4 Yes 27 137.65 Yes 89 1662.6 No 301 23089.4
Yes 18 92.393 No 301 5534.9 No 301 23032.3 Yes 31 160.44 No 301 5585.3 No 301 25189.5 No 101 514.02 No 301 5569.1 No 301 24827.1 Yes 55 283.06 No 301 5884.4 No 301 24360.9 Yes 22 111.87 No 301 5387.7 No 301 23036.4 Yes 33 166.68 No 301 5417.2 No 301 23560.1 Yes 66 337.38 No 301 5323.9 No 301 24966.2 Yes 38 196.22 No 301 5365.2 No 301 25823.8 Yes 28 142.38 No 301 5417.3 No 301 23068.3 Yes 19 95.808 No 301 5400.5 No 301 25139.7 Yes 32 163.78 Yes 170 3078.1 No 301 25628.3 Yes 21 106.4 No 301 5436.6 No 301 25633.9 Yes 72 368.81 No 301 5337.2 No 301 24438.9 Yes 59 303.17 Yes 251 2896.5 No 301 24808.4 Yes 92 473.69 Yes 251 2925.9 No 301 25486.9 Yes 70 353.62 Yes 251 2995.7 No 301 25950.7 Yes 17 85.383 Yes 251 2895.7 No 301 23953.2 Yes 19 97.18 Yes 251 2877.4 No 301 24241.1 Yes 40 199.69
Yes 30 151.47 Yes 9 42.862 Yes 33 168.51 Yes 20 102.13 Yes 93 480.58 Yes 38 195.79 Yes 53 270.44 Yes 28 142.86 Yes 68 350.92 Yes 62 316.07 Yes 39 201.66 Yes 35 177.83 Yes 44 250.29 Yes 54 289.03 Yes 34 175.5 Yes 46 239.09 Yes 32 164.61 Yes 22 113.35 Yes 26 134.48 Yes 44 227.53 Yes 7 32.737 Yes 38 198.01 Yes 4 17.576
No 101 505.84 Yes 43 221.02 Yes 42 216.35 Yes 16 79.995 Yes 21 106.93 Yes 61 313.2 Yes 28 142.43 Yes 40 204.52 Yes 55 281.45 Yes 56 286.85 Yes 47 244.72 Yes 4 17.626
Appendix 2
Without the breeding mechanism: n = 16
Columns, left to right: optimal solution found, number of generations, run time.
Yes 43 47.328 Yes 97 108.27 No 101 112.24 Yes 61 68.389 Yes 26 29.392 Yes 73 80.976 Yes 64 72.094 Yes 70 78.003 Yes 56 62.63 Yes 77 84.822 No 101 111.76 Yes 48 53.227 Yes 34 37.995 No 101 111.17 No 101 109.81 No 101 124.16 No 101 114.05 No 101 107.8 Yes 15 17.285
No 101 113.32 Yes 58 65.725 Yes 97 107.86 No 101 112.19 No 101 111.1 Yes 9 10.205
No 101 113.83 Yes 55 62.83 Yes 25 27.82 Yes 53 59.285 No 101 110.7 Yes 85 94.256
No 101 111.09 No 101 111.16 Yes 98 108.72
No 101 110.99 Yes 22 24.666 Yes 79 87.366 Yes 87 95.137 Yes 51 57.552 Yes 6 6.78 Yes 26 29.332 Yes 53 60.657 No 101 111.16 Yes 54 60.266 Yes 20 22.893 Yes 79 88.187 Yes 25 28.331 No 101 112.81 Yes 87 98.341 Yes 27 30.894 Yes 70 78.833 Yes 84 93.935 Yes 60 66.125 No 101 110.54 Yes 81 89.81 Yes 17 19.218
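Appendix 3
1. run.m: main program.
2. GA_for_DDP: main GA program, written as a script and called by run.m.
3. add_terminator_to_population: adds the best genes and the outstanding individuals to the next generation's population.
4. breeding.m: breeding routine (crossover and mutation).
5. check_repeat.m: script that checks whether a vector is a duplicate.
6. clean_matrix.m: deletes duplicate vectors from the population matrix.
7. clean_polymophism.m: deletes same-value, different-form vectors.
8. clean_zero.m: removes vectors whose first element is zero.
9. clean_matrix.m: deletes duplicate rows from a matrix.
10. crossover.m: chromosome crossover.
11. eliminate_population.m: eliminates individuals that are not parents.
12. fitness.m: computes the vector of fitness values.
13. func_value_of_fittness.m: computes the fitness value of each individual.
14. GA_AB_FLA.m: computes the double digest fragment length vector corresponding to the A_SEV and B_SEV vectors.
15. gene2road.m: arranges gene fragments according to a vector.
16. initial_population.m: generates the initial population.
17. limit_population_no.m: generation turnover that maintains the population size (if the population exceeds the upper limit, individuals with low fitness are eliminated until it no longer does).
18. mutation.m: mutation.
19. Noah_ark.m: removes same-value, different-form individuals, keeping one representative of each value.
20. restriction_map.m: lays out the restriction map.
21. select_best_gene.m: selects the best genes and deletes duplicates.
22. select_parent.m: selects parents.
23. select_terminator.m: selects the outstanding individuals and deletes duplicates.
24. SEV2FO.m: converts an extended gene vector (SEV), which includes the base parameter, into a sequence of fragment lengths.
25. SEV2ROAD.m: converts an extended gene vector into a fragment arrangement.
26. size2FLA.m: generates a set of fragment lengths from a given number.
27. size2SEV.m: generates an extended gene vector from a given number.
28. update_terminator.m: filters the outstanding individuals.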
% 1. DDP-GA main program
clear all;
format short;
load GA_DATA;
% Declare global variables
% =========================================================================
global total_fregment_length;              % total DNA length
global A_FLA;                              % set of A fragment lengths
global B_FLA;                              % set of B fragment lengths
global AB_FLA;                             % set of A+B fragment lengths
global length_of_A_FLA;                    % number of A fragments
global length_of_B_FLA;                    % number of B fragments
global length_of_AB_FLA;                   % number of A+B fragments
global S_crossover_probability;            % basic parameter 2: probability of exchanging the base index
global mutation_probability;               % basic parameter 3: probability of gene mutation
global max_fittness;                       % maximum fitness value
global no_of_terminator;                   % number of individuals in the breeding pool
global mark_of_to_be_parent;               % marker for individuals selected as parents
global ratio_of_terminator_to_population;  % basic parameter 6: upper limit on the share of the population drawn from the breeding pool
%==============================================================================
ratio_to_be_parent=0.8;                    % basic parameter 1: fraction of the breeding population selected as parents
S_crossover_probability=0.01;              % basic parameter 2: probability of exchanging the base index
mutation_probability=0.03;                 % basic parameter 3: probability of gene mutation
initial_population_no=200;                 % basic parameter 4: initial population size
uplimit_of_population=200;                 % basic parameter 5: upper limit on population size
ratio_of_terminator_to_population=0.3;     % basic parameter 6: upper limit on the share of the population drawn from the breeding pool
total_generation=100;                      % basic parameter 7: upper limit on the number of generations
% Declare variables ======================================================
value_of_fittness=0;
value_of_fittness_of_terminator=0;
terminators=zeros(1,length_of_A_FLA+length_of_B_FLA+2);
best_gene=zeros(1,1);
average_of_population=0;
average_of_terminator=0;
lowest_of_terminator=0;
no_of_population=0;
no_of_generation=1;
no_of_terminator=0;
no_of_best_gene=0;
% (1) Generate the initial population ====================================
populations=initial_population(initial_population_no);
%======================================================================
save run;
analysis_of_GA=zeros(1,length_of_A_FLA+length_of_B_FLA+2);
analysis_of_generation=zeros(1,3);
total_no_of_best_gene=0;
for run_no=1:57
    tic
    GA_for_DDP
    run_time=toc
    [no_of_best_gene temp]=size(best_gene);
    for i=1:no_of_best_gene
        analysis_of_GA(total_no_of_best_gene+i,:)=best_gene(i,:);
    end
    total_no_of_best_gene=total_no_of_best_gene+no_of_best_gene;
    analysis_of_generation(run_no,1)=no_of_generation;
    analysis_of_generation(run_no,2)=run_time;
    analysis_of_generation(run_no,3)=no_of_best_gene;
    save analysis analysis_of_GA analysis_of_generation run_no total_no_of_best_gene
    clear all
    load run
    load analysis
    populations=LALA
    run_no
end
save ANALYSIS_OF_DDP
% 2. GA_for_DDP: main GA program, written as a script and called by run.m
while (no_of_generation<=total_generation),
    % (3) Compute the fitness of each individual
    value_of_fittness=func_value_of_fittness(populations);
    value_of_fittness_of_terminator=func_value_of_fittness(terminators);
    %===================================================================
    average_of_population=mean(value_of_fittness);
    average_of_terminator=mean(value_of_fittness_of_terminator);
    % (4) Select the outstanding individuals and delete duplicates. When the
    %     pool reaches its limit, delete same-value, different-form genes.
    terminators=select_terminator(value_of_fittness_of_terminator,terminators,value_of_fittness,populations);
    % (4.5) Select the best genes and delete duplicates
    best_gene=select_best_gene(no_of_generation,value_of_fittness,populations);
    % Leave the loop as soon as an answer is found ==============
    if best_gene(1,1)~=0
        break;
    end
    %===================================
    [no_of_population temp]=size(populations);
    % (6) Select parents
    parents=select_parent(value_of_fittness,populations,ratio_to_be_parent);
    % (7)(8) Crossover and mutation
    new_generation=breeding(populations,parents);
    % (9) Eliminate the individuals that were not parents
    populations=eliminate_population(new_generation,populations,parents);
    % (10) Add the outstanding individuals to the next generation; the best genes are added first
    populations=add_terminator_to_population(best_gene,terminators,populations);
    % (11) Generation turnover: maintain the population size (if it exceeds the upper
    %      limit, eliminate low-fitness individuals until it no longer does)
    populations=limit_population_no(populations,uplimit_of_population);
    %=========================================================================
    % Display the data for each generation
    [no_of_population temp]=size(populations);
    [no_of_best_gene temp]=size(best_gene);
    if best_gene(1,1)==0
        no_of_best_gene=0;
    end
    [int32(no_of_generation) int32(no_of_population) int32(no_of_terminator) int32(no_of_best_gene)]
    a=average_of_population/max_fittness;
    b=average_of_terminator/max_fittness;
    [a b]
    aaa(no_of_generation,1)=average_of_population/max_fittness;
    aaa(no_of_generation,2)=average_of_terminator/max_fittness;
    aaa(no_of_generation,3)=max(value_of_fittness)/max_fittness;
    %===============================================================================
    no_of_generation=no_of_generation+1;
end
% 3. Add the best genes and the outstanding individuals to the next generation ====
function populations=add_terminator_to_population(best_gene,terminators,old_populations);
global no_of_terminator;
[no_of_population temp]=size(old_populations);
[no_of_terminator temp]=size(terminators);
[no_of_best_gene temp]=size(best_gene);
if (terminators(1,1)~=0) & ~(isempty(terminators))  % append only if outstanding individuals exist
    for i=1:no_of_terminator
        old_populations(no_of_population+i,:)=terminators(i,:);
    end
end
[no_of_population temp]=size(old_populations);
if (best_gene(1,1)~=0) & ~(isempty(best_gene))  % append only if a best gene exists
    for i=1:no_of_best_gene
        old_populations(no_of_population+i,:)=best_gene(i,:);
    end
end
[row_size temp]=size(old_populations);
old_populations=sortrows(old_populations);
% Remove the rows that are all zero
zero_row=1;
while old_populations(zero_row,1)==0,
    zero_row=zero_row+1;
end
populations=old_populations(zero_row:row_size,:);
if isempty(terminators)
    terminators=0;
end
% 4. Breeding routine (crossover and mutation)
function new_generation=breeding(populations,parents)
global length_of_A_FLA;
global length_of_B_FLA;
[parent_couple temp]=size(parents);
for i=1:parent_couple
    % Crossover ====================================
    A_SEV1=populations(parents(i,1),1:length_of_A_FLA+1);
    A_SEV2=populations(parents(i,2),1:length_of_A_FLA+1);
    [crossover_A_SEV1,crossover_A_SEV2]=crossover(A_SEV1,A_SEV2);
    B_SEV1=populations(parents(i,1),length_of_A_FLA+2:length_of_A_FLA+length_of_B_FLA+2);
    B_SEV2=populations(parents(i,2),length_of_A_FLA+2:length_of_A_FLA+length_of_B_FLA+2);
    [crossover_B_SEV1,crossover_B_SEV2]=crossover(B_SEV1,B_SEV2);
    % Mutation ==================================
    mutation_A_SEV1=mutation(crossover_A_SEV1);
    mutation_A_SEV2=mutation(crossover_A_SEV2);
    mutation_B_SEV1=mutation(crossover_B_SEV1);
    mutation_B_SEV2=mutation(crossover_B_SEV2);
    % Store the two children of couple i in rows i and i+parent_couple ++++++++++
    new_generation(i,1:length_of_A_FLA+1)=mutation_A_SEV1;
    new_generation(i,length_of_A_FLA+2:length_of_A_FLA+length_of_B_FLA+2)=mutation_B_SEV1;
    new_generation(i+parent_couple,1:length_of_A_FLA+1)=mutation_A_SEV2;
    new_generation(i+parent_couple,length_of_A_FLA+2:length_of_A_FLA+length_of_B_FLA+2)=mutation_B_SEV2;
end
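The column ranges used in the breeding routine imply that each chromosome row stores the A vector in its first length_of_A_FLA+1 columns and the B vector in the following length_of_B_FLA+1 columns. The short sketch below only illustrates that slicing; the sizes and the row values are hypothetical and are not taken from the thesis data.

```matlab
% Illustrative sketch of the chromosome layout (assumed sizes and values).
length_of_A_FLA=3;                       % assumed value, for illustration only
length_of_B_FLA=2;                       % assumed value, for illustration only
chromosome=[10 11 12 13 20 21 22];       % hypothetical row: 4 A values, then 3 B values
% Same index ranges as in the breeding routine
A_SEV=chromosome(1:length_of_A_FLA+1);
B_SEV=chromosome(length_of_A_FLA+2:length_of_A_FLA+length_of_B_FLA+2);
disp(A_SEV);
disp(B_SEV);
```

With these assumed sizes, A_SEV takes columns 1 to 4 and B_SEV takes columns 5 to 7.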
% 5. Macro: check whether the vector contains repeated elements
check_repeat = 0;
for ii=2:7
    for jj=1:ii-1
        if balls(ii)==balls(jj)
            check_repeat =1;
        end
    end
end
──────────────────────────────────
% 6. Remove duplicate vectors from the population matrix
function matrix=clean_matrix(old_matrix)
[row_size column_size]=size(old_matrix);
if row_size<=1
    matrix=old_matrix;
    return;
end
old_matrix=sortrows(old_matrix);
% Remove the all-zero rows (rows whose first element is 0)
zero_row=1;
while zero_row<=row_size && old_matrix(zero_row,1)==0
    zero_row=zero_row+1;
end
old_matrix=old_matrix(zero_row:row_size,:);
[row_size column_size]=size(old_matrix);
if row_size==0
    % Nothing is left once the zero rows are removed
    matrix=old_matrix;
    return;
end
new_matrix=zeros(1,column_size);
new_matrix(1,:)=old_matrix(1,:);
is_the_same=zeros(row_size,1);
j=2;i=1;
% The rows are sorted, so duplicates are adjacent: copy a row only when it
% differs from the last row kept
while i<row_size
    while j<=row_size
        if mean(new_matrix(i,:)==old_matrix(j,:))==1
            j=j+1;
            if j>row_size
                break;
            end
        else
            new_matrix(i+1,:)=old_matrix(j,:);
            i=i+1;
        end
    end
    if j>row_size
        break;
    end
end
matrix=new_matrix;
──────────────────────────────────
% 7. Remove polymorphic vectors (same value, different form)
function cleaned_gene=clean_polymophism(gene)
global length_of_A_FLA;
global length_of_B_FLA;
global A_FLA;
global B_FLA;
[no_of_gene column_no]=size(gene);
key_length=length_of_A_FLA+length_of_B_FLA;
% Columns 1..key_length hold the FO form of each gene; the last column holds its original row index
gene_FO_index=zeros(no_of_gene,key_length+1);
for i=1:no_of_gene
    A_SEV=gene(i,1:length_of_A_FLA+1);
    B_SEV=gene(i,length_of_A_FLA+2:length_of_A_FLA+length_of_B_FLA+2);
    A_FO=SEV2FO(A_SEV,A_FLA);
    B_FO=SEV2FO(B_SEV,B_FLA);
    gene_FO_index(i,1:length_of_A_FLA)=A_FO;
    gene_FO_index(i,length_of_A_FLA+1:key_length)=B_FO;
    % Fixed: the index column is length_of_A_FLA+length_of_B_FLA+1
    % (the listing had length_of_B_FLA+length_of_B_FLA+1)
    gene_FO_index(i,key_length+1)=i;
end
gene_FO_index=sortrows(gene_FO_index);
flag1=1;
flag2=2;
flag3=1;
index_of_reserve(1)=1;
% After sorting, equal FO forms are adjacent: keep the first row of each run
while flag2<=no_of_gene
    if mean(gene_FO_index(flag1,1:key_length)==gene_FO_index(flag2,1:key_length))==1
        flag2=flag2+1;
    else
        flag1=flag2;
        flag2=flag2+1;
        flag3=flag3+1;
        index_of_reserve(flag3)=flag1;
    end
end
no_of_cleaned_gene=length(index_of_reserve);
for i=1:no_of_cleaned_gene
    % Fixed: index_of_reserve refers to the sorted matrix, so map each entry back
    % to the original row of gene through the stored index column
    cleaned_gene(i,:)=gene(gene_FO_index(index_of_reserve(i),key_length+1),:);
end
% 8. Remove the vectors whose first value is zero
function new_matrix=clean_zero(old_matrix)
[row_size column_size]=size(old_matrix);
row_of_new_matrix=0;
new_matrix=zeros(1,column_size);
for i=1:row_size
    if old_matrix(i,1)~=0
        row_of_new_matrix=row_of_new_matrix+1;
        new_matrix(row_of_new_matrix,:)=old_matrix(i,:);
    end
end
% 9. Remove duplicate rows from a matrix
function new_matrix=clear_matrix(old_matrix)
[row_size column_size]=size(old_matrix);
old_matrix=sortrows(old_matrix);
new_matrix(1,:)=old_matrix(1,:);
is_the_same=zeros(row_size,1);
j=2;i=1;
% The rows are sorted, so duplicates are adjacent: copy a row only when it
% differs from the last row kept
while i<row_size
    while j<=row_size
        if mean(new_matrix(i,:)==old_matrix(j,:))==1
            j=j+1;
            if j>row_size
                break;
            end
        else
            new_matrix(i+1,:)=old_matrix(j,:);
            i=i+1;
        end
    end
    if j>row_size
        break;
    end
end
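The duplicate-row removal above sorts the rows and then keeps one row from each run of equal rows. As a cross-check, and not as the routine used in the thesis, the built-in unique function with the 'rows' option should return the same rows for a non-empty numeric matrix. The matrix M below is a hypothetical example.

```matlab
% Sketch: duplicate-row removal with the built-in unique (a swapped-in
% alternative, not the thesis routine).
M=[3 1; 1 2; 3 1; 1 2; 2 2];    % hypothetical matrix with repeated rows
U=unique(M,'rows');             % rows sorted in ascending order, each kept once
disp(U);
```

For this M the result has three rows: [1 2], [2 2] and [3 1].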
% 10. Chromosome crossover
function [crossover_SEV1,crossover_SEV2]=crossover(SEV1,SEV2)
global S_crossover_probability
% (0) Draw a random crossover point in 1..n-1
crossover_point=0;
n=length(SEV1);
while crossover_point==0
    crossover_point=ceil(n*rand)-1;
end
% (1) Exchange the elements at positions crossover_point+2..n
crossover_SEV1=SEV1;
crossover_SEV2=SEV2;
for i=crossover_point+2:n
    crossover_SEV1(i)=SEV2(i);
    crossover_SEV2(i)=SEV1(i);
end
% (2) Exchange the first element with probability S_crossover_probability
if S_crossover_probability~=0 && abs(rand-0.5)/S_crossover_probability<=0.5
    crossover_SEV1(1)=SEV2(1);
    crossover_SEV2(1)=SEV1(1);
end
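Step (1) of the crossover routine can be followed on a small example. The sketch below fixes the crossover point instead of drawing it at random and leaves out step (2); the two vectors and the value of crossover_point are assumptions chosen only for illustration.

```matlab
% Illustrative sketch of step (1) with a fixed crossover point.
SEV1=[1 2 3 4 5];               % hypothetical parent vectors
SEV2=[6 7 8 9 10];
crossover_point=2;              % assumed value; the routine draws it from 1..n-1
n=length(SEV1);
child1=SEV1;
child2=SEV2;
tail=crossover_point+2:n;       % positions exchanged in step (1)
child1(tail)=SEV2(tail);
child2(tail)=SEV1(tail);
disp(child1);
disp(child2);
```

With crossover_point=2 the exchanged positions are 4 and 5, so child1 becomes [1 2 3 9 10] and child2 becomes [6 7 8 4 5].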
% 11. Eliminate the individuals not selected for breeding ==========================
function populations=eliminate_population(new_generation,old_populations,parents)
global length_of_A_FLA;
global length_of_B_FLA;
global mark_of_to_be_parent;
[no_of_population temp]=size(old_populations);
[parent_couple temp]=size(parents);
j=1;
next_generation=zeros(1,length_of_A_FLA+length_of_B_FLA+2);
% Keep only the individuals marked as parents
for i=1:no_of_population
    if mark_of_to_be_parent(i)>0
        next_generation(j,:)=old_populations(i,:);
        j=j+1;
    end
end
% Append the 2*parent_couple children produced by breeding
n=2*parent_couple;
[nextone temp]=size(next_generation);
for i=1:n
    next_generation(nextone+i,:)=new_generation(i,:);
end
clear populations;
populations=next_generation;
% 12. Compute the fitness value of a vector
function fittness_value=fittness(A_SEV,B_SEV)
global A_FLA;
global B_FLA;
global AB_FLA;
global length_of_AB_FLA;
% (1) An A_SEV whose mean is 0 gets fitness 0
if mean(A_SEV)==0
    fittness_value=0;
    return;
end
comp_AB_FLA=GA_AB_FLA(A_SEV,B_SEV);
% (2) Count the entries common to AB_FLA and comp_AB_FLA by scanning both with one pointer each
flag1=1;flag2=1;fittness_value=0;
length_of_comp_AB_FLA=length(comp_AB_FLA);
while (flag1<=length_of_AB_FLA)&&(flag2<=length_of_comp_AB_FLA)
    if AB_FLA(flag1)>comp_AB_FLA(flag2)
        flag2=flag2+1;
    elseif AB_FLA(flag1)<comp_AB_FLA(flag2)
        flag1=flag1+1;
    else
        fittness_value=fittness_value+1;
        flag1=flag1+1;
        flag2=flag2+1;
    end
end
% (3) The fitness is the square of the number of matches
fittness_value=fittness_value*fittness_value;
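The matching step (2) of the fitness routine advances one pointer through each list, which counts the common entries correctly only if both lists are sorted in ascending order. That ordering is an assumption here, since GA_AB_FLA is not shown in this part of the listing. The sketch below repeats the scan on two hypothetical sorted lists; the names observed and computed stand in for AB_FLA and comp_AB_FLA.

```matlab
% Illustrative sketch of the two-pointer matching (assumes ascending order).
observed=[1 2 2 3 5];           % hypothetical stand-in for AB_FLA
computed=[1 2 3 4 5];           % hypothetical stand-in for comp_AB_FLA
p=1; q=1; matches=0;
while (p<=length(observed))&&(q<=length(computed))
    if observed(p)>computed(q)
        q=q+1;                  % computed entry has no partner
    elseif observed(p)<computed(q)
        p=p+1;                  % observed entry has no partner
    else
        matches=matches+1;      % equal entries: count one match, advance both
        p=p+1;
        q=q+1;
    end
end
fitness_value=matches*matches;  % squared, as in step (3)
disp(fitness_value);
```

Here the entries 1, 2, 3 and 5 are matched, so matches is 4 and the fitness is 16.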
% 13. Compute the fitness value of every individual in the population
function value_of_fittness=func_value_of_fittness(populations)
global length_of_A_FLA;
global length_of_B_FLA;