National Taiwan University
College of Electrical Engineering and Computer Science
Department of Computer Science and Information Engineering
Master Thesis

Using GPU to Accelerate the Pricing of Parisian Options

Chang Chih-Hsuan
Advisor: Lyuu Yuh-Dauh, Ph.D.
July 2010
National Taiwan University Master Thesis
Oral Examination Committee Certification

Using GPU to Accelerate the Pricing of Parisian Options

This master's thesis was completed by Chang Chih-Hsuan (R96922087) in the Department of Computer Science and Information Engineering, National Taiwan University. It was reviewed by the examination committee and passed the oral examination on July 5, 2010.
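Acknowledgments
I thank Professor Lyuu Yuh-Dauh for his patient guidance and advice during this period, which allowed me to complete my graduate studies and this thesis.
I thank my family for their support and encouragement, especially my parents, who worried about my affairs even more than I did. To them I want to say: thank you, I have graduated.
I also thank my classmates for generously offering opinions and encouragement, and for sharing their experience with coursework, the thesis, military service, and future work.
Finally, I thank the friends who played ball with me, and the close friends who shared their thoughts with me during this time.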
Abstract
As computing power increases, derivatives can be priced faster and more accurately. Because pricing financial products is often time-critical, improving speed and accuracy through computer hardware and algorithms is an important issue.
In this thesis we use Parisian options and GPUs as an example. Parisian options are path-dependent options with barrier-like features. Bing-Qing Li and Hai-Jian Zhao proposed pricing Parisian options with generating functions. We implement their method on both a CPU and a GPU. The results show that, thanks to its parallel-processing capability, the GPU needs much less execution time than the CPU, especially as the number of periods grows. As a result, Parisian options can be priced faster without sacrificing accuracy.
Keywords: Parisian options, barrier options, option pricing, GPU, CUDA, parallel processing, generating function.
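Contents
Oral Examination Committee Certification
Acknowledgments
Abstract
Contents
List of Figures
List of Tables
Chapter 1 Introduction to Options
1.1 Options
1.2 Barrier Options
1.3 Parisian Options
1.3.1 Basic Concepts
1.3.2 Types
1.3.3 Characteristics
Chapter 2 Basic Concepts and Tools
2.1 Binomial Option Pricing Model
2.2 Generating Functions
2.3 GPU and CUDA
2.3.1 Introduction to and History of the GPU
2.3.2 GPGPU and CUDA
Chapter 3 Pricing Parisian Options with Generating Functions
3.1 Combinatorial Framework
3.2 Pricing Parisian Options
3.2.1 Consecutive Parisian Options
3.2.2 Cumulative Parisian Options
Chapter 4 Implementation and Discussion of Numerical Results
Chapter 5 Conclusion
References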
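List of Figures
Figure 2-1. The one-period binomial option pricing model.
Figure 2-2. The n-period binomial option pricing model; S0 is the initial price.
Figure 2-3. Floating-point performance of CPUs and GPUs [8].
Figure 2-4. CUDA architecture [8].
Figure 2-5. Relationship among threads, blocks, and grids [8].
Figure 2-6. Memory hierarchy [8].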
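List of Tables
Table 1-1. Classification of barrier options.
Table 4-1. Comparison of the CPU and the GPU.
Table 4-2. Consecutive Parisian option prices: Li and Zhao versus our CPU and GPU implementations.
Table 4-3. Cumulative Parisian option prices: Li and Zhao versus our CPU and GPU implementations.
Table 4-4. CPU and GPU running times for pricing consecutive Parisian options with 1 thread.
Table 4-5. CPU and GPU running times for pricing cumulative Parisian options with 1 thread.
Table 4-6. CPU and GPU running times for pricing consecutive Parisian options with 100 threads.
Table 4-7. CPU and GPU running times for pricing cumulative Parisian options with 100 threads.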
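Chapter 1 Introduction to Options
1.1 Options
An option is a contract that confers a right. After paying a premium, the buyer has the right, on an agreed future date (the expiration date), to buy or sell a fixed quantity of the agreed underlying asset at the agreed strike price. Options are divided into calls and puts, and each can be either bought or sold. Take buying a call as an example: on the expiration date the holder has the right to buy the contracted quantity of the underlying at the strike price, or may choose not to exercise. When the buyer of the call exercises, the seller of the call is obliged to honor the contract. A call buyer is bullish about the market and hopes to buy the underlying asset in the future below the market price to make a profit.
Depending on when the buyer may exercise, options are further divided into "American" and "European" options. The buyer of an American option may exercise on any day up to expiration, whereas the buyer of a European option may exercise only on the expiration date [4].
Options have the following characteristics:
1. Leverage
The buyer of an option pays only a small premium yet has the possibility of unlimited profit, so a small amount of capital can earn a comparatively high return.
2. Hedging
An investor who holds the underlying and is unsure about the future direction of the market can buy options to hedge. If the value of the holding falls, the gain on the option offsets the loss; if the value rises, only a small premium is lost.
3. Deferring investment decisions
Because there is a period of time between the effective date and the exercise date of an option, the buyer has more time to observe and judge the direction of the market. An American option can be exercised at any time up to expiration and so gives the buyer even more flexibility in managing funds.
In Chinese, options are also known by the alternative name 期權.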
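1.2 Barrier Options
The main difference between a barrier option and an ordinary option is that, besides the usual strike price, a barrier option has a specially designed barrier price (an upper or a lower price limit). As soon as the price of the underlying touches the barrier before expiration, the option contract immediately takes effect or terminates. The classification is shown in the following table:

             Knock-in type         Knock-out type
Up type      Up-and-in call        Up-and-out call
             Up-and-in put         Up-and-out put
Down type    Down-and-in call      Down-and-out call
             Down-and-in put       Down-and-out put
Table 1-1. Classification of barrier options.

Take the up-and-out option as an example. It has a barrier price above the initial price, and as soon as the underlying reaches the barrier the option becomes void. An up-and-in option, in contrast, only takes effect once the underlying touches the barrier [3].
1.3 Parisian Options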
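When the knock-in (or knock-out) condition of a barrier option changes from "touching" the barrier to "touching it and staying there for a period of time", the barrier option is called a Parisian option [14].
1.3.1 Basic Concepts
A Parisian option can be viewed as a variant of the barrier option. It has the features of a barrier option: its price depends on whether the price path of the underlying has reached the barrier. Its terms are stricter, however, which gives investors more choices. In short, a Parisian option is a product whose value depends on whether the underlying asset touches or exceeds a given barrier within a given period, and on whether the time spent beyond the barrier reaches a given length.
1.3.2 Types
(1) Cumulative Parisian options: the value depends on the total length of time during which the underlying touches or exceeds the barrier. The option is knocked in (or out) only when this "cumulative" time reaches the length fixed in advance. As with barrier options, there are eight types.
(2) Consecutive Parisian options: the option is knocked in (or out) only when the "consecutive" time during which the underlying touches or exceeds the barrier reaches the length fixed in advance. As with barrier options, there are eight types.
(3) Mixed Parisian options: a hybrid of the two types above; they are not discussed in this thesis.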
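1.3.3 Characteristics
The advantage of a Parisian option is that it offers investors a derivative with a lower premium and more protection. Because it is knocked in or out only after the underlying has touched or exceeded the barrier for a period of time, its premium is lower than that of an ordinary option. Unlike a barrier option, it is not voided the moment the underlying touches or exceeds the barrier, so it offers more protection.
Comparing cumulative and consecutive Parisian options, the condition of the consecutive type is harder to meet, so the product is less likely to be voided and its price is higher.
Besides the basic parameters it shares with ordinary options, such as the strike price, the time to maturity, and the risk-free interest rate, the price of a Parisian option depends on the following parameters:
1. Time to expiration
2. Length of the preset knock-in (knock-out) window
3. Barrier price [14]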
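Chapter 2 Basic Concepts and Tools
2.1 Binomial Option Pricing Model
The binomial option pricing model was first proposed by Cox, Ross, and Rubinstein in 1979 to estimate the fair value of an option before its expiration date. It is a discrete-time model. It assumes that the stock price can only move up or down, and that the probability and size of each up or down move never change. Stock prices change in continuous time, but the model cuts time into intervals of equal length. Each interval has one observation point, called a node. The model simulates all possible price paths and then, for every node on every path, computes the payoff of the underlying and the price of the option at that node.
Let the current stock price be S, let the sizes of an up and a down move be u and d, and let their probabilities be p and q. Given an observation time t', a strike price X, and an option price C (a call is assumed here), the one-period binomial option pricing model can be drawn as follows [13]:
Figure 2-1. The one-period binomial option pricing model.
Here Cu and Cd are the option prices at their respective nodes.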
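For convenience we write q as 1 - p. After n periods the model becomes Figure 2-2. Suppose a terminal node is reached after j down moves and n - j up moves. The stock price at that node is $S u^{n-j} d^{j}$, and the probability of reaching it is
\[ \binom{n}{j} p^{n-j} (1-p)^{j}. \]
Taking discounting into account, the initial value of the option is
\[ c = e^{-rT} \sum_{j=0}^{n} \binom{n}{j} p^{n-j} (1-p)^{j} \max\left(S u^{n-j} d^{j} - X,\ 0\right), \]
where r is the risk-free interest rate and T is the time to maturity of the option.
If u is set to $e^{\sigma\sqrt{T/n}}$ and d is set to $1/u$, this binomial option pricing model is the original CRR model.
Figure 2-2. The n-period binomial option pricing model; S0 is the initial price.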
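The n-period pricing sum can be evaluated directly for moderate n. The following Python sketch is illustrative only and is not the thesis implementation. The function name `crr_call_price` is hypothetical, and the risk-neutral probability p = (e^{rT/n} - d)/(u - d) is the standard CRR choice, assumed here because the text does not state it.

```python
import math

def crr_call_price(S, X, r, sigma, T, n):
    """European call under the n-period CRR binomial model (illustrative sketch)."""
    dt = T / n
    u = math.exp(sigma * math.sqrt(dt))    # up factor, u = e^{sigma * sqrt(T/n)}
    d = 1.0 / u                            # down factor, d = 1/u
    # Assumption: standard CRR risk-neutral up probability (not given in the text).
    p = (math.exp(r * dt) - d) / (u - d)
    total = 0.0
    for j in range(n + 1):                 # j = number of down moves
        prob = math.comb(n, j) * p ** (n - j) * (1.0 - p) ** j
        payoff = max(S * u ** (n - j) * d ** j - X, 0.0)
        total += prob * payoff
    return math.exp(-r * T) * total        # discount the expected payoff
```

For example, `crr_call_price(100.0, 95.0, 0.08, 0.2, 1.0, 200)` prices a call with the parameter values used in Chapter 4. For n in the thousands, `math.comb(n, j)` is too large to convert to a float, which is one reason to work with logarithms instead.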
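2.2 Generating Functions
Given a sequence $A_n = \{a_0, a_1, \ldots\}$, the function
\[ f(x) = \sum_{r=0}^{\infty} a_r x^r = a_0 + a_1 x + \cdots + a_r x^r + \cdots \]
is called the generating function of the sequence $A_n$.
Generating functions are commonly used to solve counting problems in combinatorics. For example, suppose there are two different boxes, each containing two balls. All possible ways of taking balls can be written as follows (with the balls of the two boxes regarded as different):
\[ (1 + x + x^2)(1 + y + y^2) = 1 + x + x^2 + y + xy + x^2 y + y^2 + x y^2 + x^2 y^2, \]
where 1 means taking no ball, the first power means taking one ball, and the second power means taking two balls. If the balls are regarded as identical, the expression becomes $1 + 2x + 3x^2 + 2x^3 + x^4$. The exponent of x is the number of balls taken and the coefficient is the number of ways to take them. For instance, $3x^2$ says that there are three ways to take 2 balls: take all the balls from one of the two boxes, or take one ball from each box [2].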
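The two-box example can be checked mechanically by multiplying coefficient lists. This is a minimal sketch with hypothetical names (`poly_mul`, `box`, `ways`) for the case where the balls are regarded as identical.

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (index = power of x)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

box = [1, 1, 1]             # 1 + x + x^2: take 0, 1, or 2 balls from one box
ways = poly_mul(box, box)   # ways[k] = number of ways to take k balls in total
```

Here `ways[2]` is the coefficient of x^2, the number of ways to take two balls from the two boxes.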
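2.3 GPU and CUDA
2.3.1 Introduction to and History of the GPU
A GPU (Graphics Processing Unit) is a microprocessor dedicated to image computation. Its main purpose is to make the graphics card less dependent on the CPU and to take over part of the CPU's work; the gain is most visible when processing 3D graphics. A CPU has one to four cores designed for serial processing, whereas a graphics card with a GPU contains many processors that play the role of CPU cores. The difference is that a CPU core handles at most one or two threads at a time, while a GPU supports hundreds of threads or more in total. For parallel programs a GPU can therefore be many times faster than a CPU.
Applications of parallel computing that we meet in daily life include (1) videos and photos, (2) video websites, and (3) games. Take a photo as an example. A photo consists of many pixels. A CPU can carry out complex operations quickly, but because of the limits of the processor it can only compute one pixel after another. A GPU can compute many pixels simultaneously and so process the colors of a whole region at once, which gives it a clear advantage in photo processing.
We apply this technique to option pricing. In the binomial model the total number of paths becomes very large when n (the number of periods) is large, and an ordinary CPU needs a great deal of time. We therefore turn to the GPU and hope to reduce the time through its parallel computing capability. Likewise, most application developers who compute on a GPU leave the sequential part of the program to the CPU and hand the highly repetitive work to the GPU. In addition, the floating-point performance of GPUs has grown far faster than that of CPUs, as shown in Figure 2-3. On the other hand, some chips have no separate integer unit, so their integer performance is slightly worse [12].
Figure 2-3. Floating-point performance of CPUs and GPUs [8].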
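NVIDIA is the best-known maker of GPU cores. It released the NV4 in 1998 for the Riva TNT graphics card, and the NV5 in April of the following year for the Riva TNT2. The NV10 appeared in September 1999; it is also referred to as TNT3, and the concept of the GPU dates from this time. In April 2000 came the NV15, used in NVIDIA's new GeForce 2 GTS, where GTS stands for "GigaTexel Shader", meaning that the number of pixels filled per second is on the order of one billion. Later came the NV11, used in the GeForce 2 MX, and the NV20, the core of the GeForce 3 series, which for the first time split the shading unit into pixel and vertex parts and supported DirectX 8. In February 2002 NVIDIA released the NV17 and the NV25 together: the former was used in the GeForce 4 MX440 (the minimum graphics card required by most games in recent years), the latter in the higher-end GeForce 4 Ti4600. In 2004 the NV40 became the GeForce 6 series. Its rival ATI released the Radeon 9700, the first graphics card to support DirectX 9.0, in 2002. In 2005 NVIDIA renamed the NV line to the G series and developed the G70 and G72, followed in 2006 by the G80, used in the GeForce 8800 series with DirectX 10 support, and most recently, in 2008, by the G200 [10].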
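2.3.2 GPGPU and CUDA
Any discussion of GPU computing has to mention CUDA. CUDA stands for Compute Unified Device Architecture, a GPU computing technology from NVIDIA. With it, users can compute on NVIDIA GPUs of the GeForce 8 series and later. CUDA is also NVIDIA's official name for its GPGPU platform. GPGPU stands for general-purpose computing on graphics processing units: using the GPU for general-purpose computations originally handled by the CPU, that is, using the GPU for something other than traditional 3D graphics display [9]. The traditional way to develop GPGPU programs goes through OpenGL or Direct3D: one writes shaders in a shading language and finds a way to make them carry out the computation. CUDA, in contrast, is programmed through C functions and needs no graphics functions, which makes programming much more convenient. CUDA consists roughly of three parts: library, runtime, and driver. Its architecture is shown in Figure 2-4.
Figure 2-4. CUDA architecture [8].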
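When writing a CUDA program, we divide program execution into two parts: the host side, executed by the CPU, and the device side, executed by the GPU. In the CUDA model the main program is still executed by the CPU. Only when parallel processing is needed is the relevant code compiled into a program the device can run and handed to the GPU. Such a program is called a kernel in CUDA. On the device, the smallest unit that executes a kernel is the thread. A device runs many threads, and all of them execute the kernel at the same time. What we exploit is that each thread has a different index and therefore operates on different data. Several threads form a block. Threads in the same block can access the same shared memory and can therefore synchronize quickly, whereas threads in different blocks cannot directly access the same shared memory. Several blocks form a grid. This model lets us use every thread more effectively without being limited by the number of threads. The relationship is shown in Figure 2-5.
Figure 2-5. Relationship among threads, blocks, and grids [8].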
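The role of the thread index can be illustrated without a GPU. The sketch below emulates a one-dimensional launch sequentially in plain Python. It is not CUDA code, and `run_kernel` and `square_kernel` are hypothetical names. It only shows how a global index built from the block index and the thread index selects the data each thread works on.

```python
def run_kernel(kernel, grid_dim, block_dim, *args):
    """Sequentially emulate a 1-D launch: call kernel once per (block, thread) pair."""
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            kernel(block_idx, thread_idx, block_dim, grid_dim, *args)

def square_kernel(block_idx, thread_idx, block_dim, grid_dim, data, out):
    """Each emulated thread handles the elements selected by its global index."""
    tid = block_idx * block_dim + thread_idx   # unique index of this thread
    stride = grid_dim * block_dim              # total number of threads
    for k in range(tid, len(data), stride):    # stride loop over the data
        out[k] = data[k] * data[k]

data = list(range(10))
out = [None] * len(data)
run_kernel(square_kernel, 2, 4, data, out)     # 2 blocks of 4 threads each
```

On a real device the calls made by `run_kernel` would run concurrently. The stride loop lets a fixed number of threads cover any amount of data.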
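In a CUDA program the data used by the threads must first be copied from the host into device memory, and several memory types exist for different uses. As Figure 2-6 shows, each thread has its own local memory, and the threads of one block can access the same shared memory. All threads, even those in different blocks or different grids, can use the data in the same readable and writable memory, the global memory. There are also two read-only memory spaces, constant memory and texture memory, which all threads can access as well. Of these memories, global, constant, and texture memory persist as long as the kernel programs exist. What we must always watch is the synchronization of threads that access the same memory.
Figure 2-6. Memory hierarchy [8].
A CUDA program executes as follows:
1. Host memory (usually the computer's main memory) sends data and code to device memory (usually on the graphics card).
2. The host does other work or idles while the device executes the kernel.
3. Device memory returns the results to host memory, and the program continues.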
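Chapter 3 Pricing Parisian Options with Generating Functions
In this chapter we implement the method proposed by Li and Zhao for pricing Parisian options with generating functions. We begin with definitions and an introduction based on the CRR model, and then extract coefficients of generating functions to obtain the price of an n-period Parisian option [1].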
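3.1 Combinatorial Framework
The CRR model is built on lattice paths, under the assumption that the stock price can only move up or down. We call a one-period up or down move a unit path. On the coordinate plane, an up unit path goes from $(x, y)$ to $(x+1, y+1)$ and is denoted by U; a down unit path goes from $(x, y)$ to $(x+1, y-1)$ and is denoted by D. A lattice path in this model can therefore be written as $a_1 a_2 \cdots a_n$, where $a_i \in A = \{U, D\}$, and its length is n. We also define the empty path to be 1. The product of two paths is their concatenation: if $\delta = a_1 a_2 \cdots a_n$ and $\varepsilon = b_1 b_2 \cdots b_m$, then their product is $\delta\varepsilon = a_1 a_2 \cdots a_n b_1 b_2 \cdots b_m$, which is again a path.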
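To count the lattice paths of length n we introduce generating functions. Let L be the set of all lattice paths in the CRR model, and define $L_n$ as the set of paths in L of length n: $L_n = \{\delta \in L : \ell(\delta) = n\}$. Let $f_n = |L_n|$ be the number of elements of $L_n$. The generating function of the sequence $\{f_n\}$ is
\[ L(t) = \sum_{n \ge 0} |L_n|\, t^n. \]
Because $L = \{\delta \in L_n : n \ge 0\}$, the generating function of $\{f_n\}$ can also be regarded as the generating function of the set L. Next we define $\mathcal{U} = \{U\}$, the set whose only element is U. Clearly every $\mathcal{U}_n$ is empty except $\mathcal{U}_1 = \mathcal{U}$, so the generating function of $\mathcal{U}$ is $U(t) = t$. Similarly, if we define $\mathcal{D} = \{D\}$, then $D(t) = t$. We can thus view t as one step of a lattice path, either up or down.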
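Take a down-and-out barrier call as an example. Place the CRR model on the coordinate plane, and call the line parallel to the x-axis at distance k from it level k; level 0 is the x-axis. Think of the vertical axis as the price of the underlying and the horizontal axis as time. Suppose the option starts at the origin (0,0) and the barrier is level -1, so the option is knocked out as soon as a price path touches level -1. To value this option we must consider all paths that never touch level -1. Such a path ends at $(n, i)$, where $-n \le i \le n$. To simplify the computation, here we consider only the paths that never touch level -1 and end at $(2n, 0)$.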
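Let C be the set of all lattice paths that never touch level -1 and end on the x-axis, and let $C_n$ be the number of paths in C that end at $(2n, 0)$. The generating function of C is
\[ C(t) = \sum_{n \ge 0} C_n t^{2n}. \]
Before deriving a formula for $C(t)$ we introduce prime sets. In a set L of lattice paths, if every path in L can be decomposed uniquely into several smaller paths, the set of all such smaller paths is the prime set P of L. We can then use the lemma
\[ L(t) = \frac{1}{1 - P(t)}. \]
The prime set P of C consists of the paths in C that touch the x-axis only at their start and end points. Any path in P can therefore be decomposed as $U\beta D$ with $\beta \in C$, so the generating function of P is $P(t) = U(t)\,C(t)\,D(t)$. Applying the lemma gives
\[ C(t) = \frac{1}{1 - t^2 C(t)}. \]
Simplifying and applying the quadratic formula, we derive
\[ C(t) = \frac{1 - \sqrt{1 - 4t^2}}{2t^2}. \]
A Taylor expansion then shows that the coefficient of $t^{2n}$ is
\[ C_n = \frac{1}{n+1}\binom{2n}{n}, \]
the well-known Catalan number. Since $C_n$ is the number of paths that never go below the x-axis and end at $(2n, 0)$, its value will be used when we price Parisian options.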
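The Catalan count can be checked numerically for small n. The following sketch uses hypothetical function names. It counts the paths from (0,0) to (2n,0) that never touch level -1 by dynamic programming over levels and compares the result with the closed form.

```python
import math

def count_nonnegative_paths(n):
    """Count U/D paths from (0,0) to (2n,0) that never touch level -1."""
    ways = {0: 1}                       # ways[level] = number of partial paths
    for _ in range(2 * n):
        nxt = {}
        for level, w in ways.items():
            nxt[level + 1] = nxt.get(level + 1, 0) + w       # up step
            if level > 0:                                    # stay at level >= 0
                nxt[level - 1] = nxt.get(level - 1, 0) + w   # down step
        ways = nxt
    return ways.get(0, 0)

def catalan(n):
    """Closed form C_n = binom(2n, n) / (n + 1)."""
    return math.comb(2 * n, n) // (n + 1)
```

Comparing `count_nonnegative_paths(n)` with `catalan(n)` for the first few n is a quick sanity check on the coefficient extraction above.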
In addition, the Lagrange inversion formula gives the coefficient of $t^n$ in $(C(t))^r$:
\[ \frac{r}{n+r}\binom{n+r}{n/2}, \qquad \frac{n}{2} \in \mathbb{N},\ r \ge 1. \]
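The power-coefficient formula can be checked by raising the truncated series of C(t) to the r-th power. This sketch uses hypothetical names and takes the closed form to be r/(n+r) times binom(n+r, n/2) for even n, which is the form the checks below are written against.

```python
import math
from fractions import Fraction

def catalan_series(max_deg):
    """Coefficients of C(t) up to t^max_deg (only even powers are non-zero)."""
    c = [0] * (max_deg + 1)
    for k in range(max_deg // 2 + 1):
        c[2 * k] = math.comb(2 * k, k) // (k + 1)
    return c

def series_power(c, r):
    """Coefficients of (sum c[k] t^k)^r, truncated to len(c) terms."""
    out = [1] + [0] * (len(c) - 1)
    for _ in range(r):
        nxt = [0] * len(c)
        for i, a in enumerate(out):
            if a:
                for j in range(len(c) - i):
                    nxt[i + j] += a * c[j]
        out = nxt
    return out

def power_coeff(n, r):
    """Closed form r/(n+r) * binom(n+r, n/2) for even n, and 0 for odd n."""
    if n % 2:
        return 0
    return int(Fraction(r, n + r) * math.comb(n + r, n // 2))
```

For instance, `series_power(catalan_series(12), 3)` lists the coefficients of $(C(t))^3$ up to $t^{12}$, which can be compared entry by entry with `power_coeff(n, 3)`.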
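Before pricing consecutive Parisian options we need one more important theorem.
[Theorem]
Let m be an integer. For a path π, define $mc_m(\pi)$ as the largest number of consecutive periods that π spends on level m or above. Given a positive integer l, let T be the set of all paths π with $mc_m(\pi) < l$ that touch or cross level $m-1$ and end at level i, where $i < m$. The generating function of T is: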
2 2
1 0
2 2
2
) 2
(
, 2 2
, ) 1 ( ) ( ) (
) ( ) ) (
(
l r r
t C t
C C t Q
i m r
t C t t
Q t Q t
t Q t t C
T
l
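The following lemma is used when we price cumulative Parisian options.
Chung-Feller Theorem: Among the paths from (0,0) to (2n,0) that consist of unit paths U and D, the number of paths with exactly 2k steps above level 0 does not depend on k and equals the n-th Catalan number.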
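The Chung-Feller theorem can be confirmed by brute force for small n. The sketch below assumes the classical reading of the statement, in which a step counts as "above level 0" when it lies above the x-axis. The function name is hypothetical.

```python
from itertools import product
from collections import Counter

def steps_above_axis_histogram(n):
    """Histogram of the number of steps above the x-axis over all paths (0,0)->(2n,0)."""
    hist = Counter()
    for steps in product((1, -1), repeat=2 * n):
        if sum(steps) != 0:             # keep only paths that end on the x-axis
            continue
        level, above = 0, 0
        for s in steps:
            nxt = level + s
            if max(level, nxt) > 0:     # this step lies above the axis
                above += 1
            level = nxt
        hist[above] += 1
    return hist
```

For n = 4 the histogram has one entry for each of 0, 2, 4, 6, and 8 steps above the axis, and the theorem says that all five counts coincide.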
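3.2 Pricing Parisian Options
In this section we price an n-period up-and-out Parisian call. We first define some quantities. Let the barrier price H be $S_0 u^m$, so m is the minimum number of up moves the stock price needs to touch the barrier. Let w be the window length we set: if the stock price stays at or above the barrier for a continuous time w, the option is knocked out. Let l be the number of time points at or above the barrier that corresponds to w, $l = w / \Delta t$ with $\Delta t = T/n$. We usually assume that l is even.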
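3.2.1 Consecutive Parisian Options
A consecutive Parisian option considers only the time the price spends continuously at or above the barrier. We therefore use $mc_m(\pi)$, the largest number of consecutive periods that π spends on level m or above, and define $f(n, i)$ as the number of paths π that start at the origin (0,0), end at $(n, i)$, and satisfy $mc_m(\pi) < l$. Clearly $f(n, i) = 0$ when $n - i$ is odd, so we take $n - i$ to be even. We distinguish three cases according to the value of i.
Case 1, $i \ge l + m$: here $f(n, i) = 0$, because after touching level m the path still needs at least $i - m \ge l$ up moves to reach level i.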
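Case 2, $i < m$: we split the discussion in two according to whether the path touches level $m - 1$. Let $g(n, i)$ be the number of paths π with $mc_m(\pi) < l$ that touch or cross level $m - 1$ and end at $(n, i)$. The theorem above then gives the following generating function: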
2 2
1 0 1
2 2
2 2
1 1
) (
, 2 2
, ) 1 ( ) ( ) (
) ( ) ) (
, (
¦
l r r n
t C t
C C t Q
i m r
t C t t
Q t Q t
t Q t t C
i n g
l
Applying the coefficient formula for $t^n$ in $(C(t))^r$ stated earlier and simplifying, we obtain the coefficient of $t^n$:
¦d
d
¸¸¹·
¨¨©§
¸¸¹·
¨¨©§
2 1 1 1
2 1
1 1 2
1
1 2 1
2 2 2
t j
t n l rj
r l
n C
j r j r j l r
n r
l n
r
ځύ , 2
2
2 1
2 1 1
r l t n
r
t n
Ƕ
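Next let $h(n, i)$ be the number of paths that never touch level $m - 1$. By the reflection principle we directly obtain
\[ h(n, i) = \binom{n}{\frac{n+i}{2}} - \binom{n}{\frac{n+i}{2} - (m-1)}. \]
Finally, $f(n, i)$ is the sum of $g(n, i)$ and $h(n, i)$.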
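The reflection-principle count can be compared with exhaustive enumeration for small n. The sketch below uses hypothetical names and assumes that m is at least 2, so that the origin itself is not on level m-1. It evaluates binom(n, (n+i)/2) - binom(n, (n+i)/2 - (m-1)) and checks it against a direct count of the paths that never touch level m-1.

```python
import math
from itertools import product

def h_formula(n, i, m):
    """Reflection-principle count of n-step paths ending at level i that avoid level m-1."""
    if (n + i) % 2:
        return 0
    k = (n + i) // 2                      # number of up steps
    if not 0 <= k <= n:
        return 0
    reflected = k - (m - 1)               # up steps of the reflected path
    touching = math.comb(n, reflected) if 0 <= reflected <= n else 0
    return math.comb(n, k) - touching

def h_bruteforce(n, i, m):
    """Enumerate all 2^n paths and count those ending at i that never touch level m-1."""
    count = 0
    for steps in product((1, -1), repeat=n):
        level, touched = 0, False
        for s in steps:
            level += s
            if level == m - 1:
                touched = True
        if level == i and not touched:
            count += 1
    return count
```

The comparison only makes sense for end levels i not above m-1, since a path ending above level m-1 must pass through it.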
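Case 3, $m \le i < l + m$: every path can be split into two parts. The first part runs from the origin to $(n-j, m-1)$, where $i - (m-1) \le j \le l$; the number of such paths is $g(n-j, m-1)$. The second part has length j, $j \le l$. It can be viewed as a path that starts at the origin, ends at level $i - (m-1)$, and never returns to the origin in between. Its generating function is $(tC(t))^{i-(m-1)}$. By the coefficient formula for $t^n$ in $(C(t))^r$ given earlier, the number of paths in this part is the coefficient
\[ \frac{r_2}{j}\binom{j}{\frac{j - r_2}{2}}, \qquad r_2 = i - (m-1),\ j \le l. \]
Multiplying the two parts gives the total number of paths π with $m \le i < l + m$ and $mc_m(\pi) < l$:
\[ f(n, i) = \sum_{j = i-m+1}^{l} g(n-j,\ m-1)\ \frac{r_2}{j}\binom{j}{\frac{j - r_2}{2}}, \qquad r_2 = i - (m-1). \]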
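Before the final summation, to save computation we first compute the largest number of down moves a for which the option still finishes in the money:
\[ a = \left\lfloor \frac{\log(S_0 u^n / K)}{\log(u / d)} \right\rfloor. \]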
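Finally, we can derive the price of the up-and-out consecutive Parisian call:
\[ c = e^{-rT} \sum_{j=0}^{a} f(n,\ n-2j)\ p^{n-j} (1-p)^{j} \left(S_0 u^{n-j} d^{j} - K\right), \]
where $f(n, n-2j)$ is the number of paths π with $n - j$ up moves, j down moves, and $mc_m(\pi) < l$.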
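For very small n the path counts in the pricing sum can be obtained by enumerating all 2^n paths, which gives a reference against which a closed-form implementation could be compared. The sketch below is not the generating-function method of the thesis, and all names are hypothetical. It assumes that a path survives as long as it never has more than l consecutive time points at or above level m, which is one reading of the condition on $mc_m(\pi)$. It also assumes the standard CRR probability p.

```python
import math
from itertools import product

def survives(levels, m, l):
    """True if the path never has more than l consecutive time points at or above level m."""
    run = 0
    for lev in levels:
        run = run + 1 if lev >= m else 0
        if run > l:
            return False
    return True

def count_paths(n, m, l):
    """f[i] = number of surviving n-step paths that end at level i (brute force)."""
    f = {}
    for steps in product((1, -1), repeat=n):
        levels, lev = [0], 0
        for s in steps:
            lev += s
            levels.append(lev)
        if survives(levels, m, l):
            f[lev] = f.get(lev, 0) + 1
    return f

def parisian_call(S0, K, r, sigma, T, n, m, l):
    """Up-and-out consecutive Parisian call via the summation over down moves j."""
    dt = T / n
    u = math.exp(sigma * math.sqrt(dt))
    d = 1.0 / u
    p = (math.exp(r * dt) - d) / (u - d)     # assumed standard CRR probability
    # Largest number of down moves that still finishes in the money.
    a = min(n, math.floor(math.log(S0 * u ** n / K) / math.log(u / d)))
    f = count_paths(n, m, l)
    total = 0.0
    for j in range(a + 1):
        payoff = max(S0 * u ** (n - j) * d ** j - K, 0.0)   # guard against rounding
        total += f.get(n - 2 * j, 0) * p ** (n - j) * (1 - p) ** j * payoff
    return math.exp(-r * T) * total
```

Enumeration costs 2^n path evaluations, so it is only usable for n up to roughly 20. Its purpose is to cross-check formulas, not to price.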
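3.2.2 Cumulative Parisian Options
A cumulative Parisian option considers the total time the price path spends at or above the barrier. By analogy with $mc_m(\pi)$, we define $cs_m(\pi)$ as the cumulative number of periods that π spends on level m or above, and $f_l(n, i)$ as the number of paths π that start at the origin (0,0), end at $(n, i)$, and satisfy $cs_m(\pi) < l$. Again $n - i$ is taken to be even, and we distinguish three cases according to the value of i.
Case 1, $i \ge l + m$: as for consecutive Parisian options, $f_l(n, i) = 0$.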
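Case 2, $i \le m$: we split the discussion in two according to whether the path crosses level m. Let R be the set of all paths π with $cs_m(\pi) < l$ that cross level m and end at level i. Every π in R can be decomposed into three parts αβγ. The first part α is a path that starts at the origin and touches level m exactly once, namely at its end point; the set of such paths is $R_1$. The second part β is a path with $cs_0(\beta) < l$ that takes at least two steps above level 0 and ends on it; the set of such paths is $R_2$. The third part γ starts from level 0, ends at level $i - m$, and never returns to level 0 in between; the set of such paths is $R_3$. The generating function of R is $R(t) = R_1(t) R_2(t) R_3(t)$, where $R_1(t) = (tC(t))^{m}$ and $R_3(t) = (tC(t))^{m-i}$. To compute $R_2(t)$ we use the Chung-Feller theorem introduced above. Let $A_{n,k}$ be the number of paths that end at $(n, 0)$ and have exactly k steps above level 0; then $A_{n,k} = C_{n/2}$, and we can write: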
¦
¦
¦
!
!
4
0
2 2
2 2 2
1
2
2 2
2 2 2 2
2 2
2 2 2
2
2
2 ) 2
2 ( 2
2 2 2
2
) (
) (
) (
l
j
j l n
n l
l
n l
n
l n n
l l l l
t jC t l
l C
t l C
t l C
t C
t A A
t A A
t A t R
j
n
Let $g_l(n, i)$ be the number of paths in R that end at $(n, i)$. From the above we obtain the generating function of R:
j j l
j i m i
m i m n
l l jC t
t tC t
C l t
t i n g
2 4
0 2 1
2 2
2 )) 2
( ( )
2 ( ) 2 ,
( ¦
¦
Next let $r_3 = 2m - i$ and substitute into the coefficient formula for $t^n$ in $(C(t))^r$ to derive:
¦
¸¸¹·
¨¨©§
¸¸¹·
¨¨©§
4
0 3 2 2
3 2
3 3
3 3
) 1 2 ( 2
) 1 )(
2 ) (
, (
l
j
j r j n r
n
l n j C
r j n
j l n r
r n
r i l
n g
hl( in, )ࣁவ҂࿘ډಃ m ቫޑၡ৩ǴӕኬӦΨό࿘ډಃm1ቫǴӢԜё арǺ
¸¸¹·
¨¨©§
¸¸¹·
¨¨©§
( 1)
) , (
2
2 m
n i n
n
hl n i n i
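Finally, adding the two parts gives $f_l(n, i) = g_l(n, i) + h_l(n, i)$.
Case 3, $m < i < l + m$: here $f_l(n, i)$ is similar to the third case for consecutive Parisian options and can be written as: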
m i r
j r j m r j n f i
n f
l j m i
r j j
l l
¸¸¹·
¨¨©§
¦d
4
1 4 2
4 1 ,
) 2 , ( )
,
( 4
ځύ flj(n j,m)ࣁၡ৩ Ș ځύcsm(K)l jЪಖᗺӧ(n j,m)ޑၡ৩ኧ
ໆǶԶഭΠவ(n j,m)ډ( in, )Ъவ҂࿘ډಃ m ቫޑၡ৩ኧࣁ
¸¸¹·
¨¨©§
4 2 4
4
2 1
r j
j r j
r Ƕ
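Finally, we can derive the price of the up-and-out cumulative Parisian call:
\[ c = e^{-rT} \sum_{j=0}^{a} f_l(n,\ n-2j)\ p^{n-j} (1-p)^{j} \left(S_0 u^{n-j} d^{j} - K\right). \]
Here a is defined exactly as in the consecutive case. It is the largest number of down moves for which the option still finishes in the money:
\[ a = \left\lfloor \frac{\log(S_0 u^n / K)}{\log(u / d)} \right\rfloor. \]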
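Chapter 4 Implementation and Discussion of Numerical Results
We implemented the method of Chapter 3 under Windows with Microsoft Visual C++. We first compare the CPU and GPU hardware used, then outline how the main program is written in C and the GPU part in CUDA, and finally verify under different settings that the GPU does deliver better performance.
The CPU and GPU we used:

                     Intel Core2 Duo T7100    Nvidia GeForce 8400M GS
Number of cores      2                        16
Number of threads    2                        user-defined
Core clock           1.8 GHz                  400 MHz
Table 4-1. Comparison of the CPU and the GPU.

Note that our CPU program does not take the dual core into account, so the CPU runs the program on one core with a single thread.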
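Before the experiments we already knew that the total number of price paths grows enormously as the number of periods increases, even beyond the largest value a C double can hold. We therefore use the method from the paper by Lyuu and Wu [6] and take logarithms of the numbers before computing with them.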
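The idea of computing with logarithms can be sketched as follows. This is a generic log-domain technique and not necessarily the exact procedure of [6]. All names are hypothetical, and the binomial coefficient stands in for the path counts of Chapter 3. Each term of a sum is kept as a logarithm, and the terms are combined with a log-sum-exp step so that no huge intermediate value is ever formed.

```python
import math

def log_binom(n, k):
    """log of binom(n, k) via lgamma, so the huge integer is never materialized."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)

def log_sum_exp(log_terms):
    """log(sum(exp(x))) evaluated stably by factoring out the largest term."""
    peak = max(log_terms)
    return peak + math.log(sum(math.exp(x - peak) for x in log_terms))

def crr_call_log_domain(S, K, r, sigma, T, n):
    """CRR European call where every term of the sum is accumulated as a logarithm."""
    dt = T / n
    u = math.exp(sigma * math.sqrt(dt))
    d = 1.0 / u
    p = (math.exp(r * dt) - d) / (u - d)      # assumed standard CRR probability
    log_terms = []
    for j in range(n + 1):                    # j = number of down moves
        payoff = S * u ** (n - j) * d ** j - K
        if payoff > 0.0:
            log_terms.append(log_binom(n, j) + (n - j) * math.log(p)
                             + j * math.log(1.0 - p) + math.log(payoff))
    if not log_terms:
        return 0.0
    return math.exp(-r * T + log_sum_exp(log_terms))
```

With n = 5000 the coefficient binom(5000, 2500) is far beyond the range of a double, yet `crr_call_log_domain(100.0, 95.0, 0.08, 0.2, 1.0, 5000)` still evaluates. The price is a small rounding error is introduced when converting back from logarithms.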
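Our parameter values are
\[ S = 100,\quad K = 95,\quad H = 110,\quad r = 0.08,\quad \sigma = 0.2,\quad T = 1,\quad w = 15 \text{ (days)}, \]
where S is the initial price of the underlying, K is the strike price, H is the barrier price, r is the annual risk-free interest rate, σ is the volatility, T is the time to maturity of the option (in years), and w is the largest number of days the price can stay at or above the barrier without knocking the option out.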
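We first run the program on the CPU with the number of periods n set to 100, 200, ..., 1000, and record the time and the result. We then switch to the GPU. Besides the number of periods, we also set the number of GPU threads, both to demonstrate the performance of the GPU and to see whether a best configuration can be found.
We first compute the prices on both processors and compare them with the results computed by Li and Zhao.

Periods    Li and Zhao    CPU         GPU
100        1.1684         1.167366    1.167224
Table 4-2. Consecutive Parisian option prices: Li and Zhao versus our CPU and GPU implementations.

Periods    Li and Zhao    CPU         GPU
100        1.0157         1.016322    1.015997
Table 4-3. Cumulative Parisian option prices: Li and Zhao versus our CPU and GPU implementations.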
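The results on the two processors are close. One possible source of error is that taking logarithms, computing, and converting back each introduces a small error. Another is that, although CUDA now supports double-precision floating-point arithmetic, support still depends on the compute capability of the hardware. The Nvidia GeForce 8400M GS used in this thesis only supports compute capability 1.1, so it cannot compute in double precision and must use single precision, which causes some error.
Next we set different numbers of threads on the GPU and different numbers of periods and measure the running time. Note that in the results of Li and Zhao the number of periods is always a multiple of 100, so we set the number of threads to 100. Because of hardware limits, the largest number of threads we can set on this graphics card is 128.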
With 1 thread:

Periods    CPU time (ms)    GPU time (ms)
10         0.030171         0.046863
20         0.108813         0.400006
50         0.850527         12.498731
100        6.545664         73.193116
200        40.502705        396.019381
Table 4-4. CPU and GPU running times for pricing consecutive Parisian options with 1 thread.

Periods    CPU time (ms)    GPU time (ms)
10         0.019905         0.024478
20         0.025283         0.352692
50         0.090444         1.103192
100        0.433924         6.312448
200        2.361334         40.017239
Table 4-5. CPU and GPU running times for pricing cumulative Parisian options with 1 thread.
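With a single thread the GPU handles this algorithm less efficiently than the CPU, and the gap widens as the number of periods grows. Take the consecutive Parisian option as an example (Table 4-4). With 10 periods the GPU takes 1.55 times as long as the CPU; with 20 periods, 3.67 times. The ratio keeps rising as the number of periods increases. With 200 periods the GPU takes about 9.77 times as long as the CPU, which corresponds to an efficiency of 0.102. The reason is that we are not using the GPU's greatest strength, parallelization.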
With 100 threads:

Periods    CPU time (ms)    GPU time (ms)
10         0.044140         0.035817
100        5.995385         2.331474
200        47.229465        8.078661
500        648.958618       44.038397
1000       7926.140625      113.346356
Table 4-6. CPU and GPU running times for pricing consecutive Parisian options with 100 threads.

Periods    CPU time (ms)    GPU time (ms)
10         0.013130         0.011041
100        0.423657         0.174266
200        2.388921         0.493527
500        33.134659        2.848529
1000       424.776459       7.332075
Table 4-7. CPU and GPU running times for pricing cumulative Parisian options with 100 threads.
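We then set the number of threads to 100. Tables 4-6 and 4-7 show a clear improvement once parallelization is used. For the consecutive Parisian option with 10 periods, the GPU is faster but not by much. When the number of periods is small, little work has to be repeated, and the high clock speed of the CPU core makes up for the extra iterations; the GPU must also handle data transfer, so the gain is small. Once the number of periods reaches 100, the improvement becomes clearer, with a speedup of 2.571. With 500 periods the GPU is 14.74 times as fast as the CPU, and with 1000 periods the running times differ by a factor of more than 69. For the cumulative option with 1000 periods, the CPU takes more than 57 times as long as the GPU.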
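The speedup figures quoted for 100 threads follow from dividing the CPU time by the GPU time for each row of Tables 4-6 and 4-7. A small sketch of that arithmetic, with the timings copied from the two tables and hypothetical variable names:

```python
# (periods, CPU ms, GPU ms) copied from Tables 4-6 and 4-7 (100 GPU threads).
consecutive = [(10, 0.044140, 0.035817), (100, 5.995385, 2.331474),
               (200, 47.229465, 8.078661), (500, 648.958618, 44.038397),
               (1000, 7926.140625, 113.346356)]
cumulative = [(10, 0.013130, 0.011041), (100, 0.423657, 0.174266),
              (200, 2.388921, 0.493527), (500, 33.134659, 2.848529),
              (1000, 424.776459, 7.332075)]

def speedups(rows):
    """Map each period count to CPU time divided by GPU time."""
    return {n: cpu / gpu for n, cpu, gpu in rows}

consecutive_speedup = speedups(consecutive)
cumulative_speedup = speedups(cumulative)
```

The speedup grows with the number of periods in both tables, which is the trend the discussion relies on.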
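Chapter 5 Conclusion
The pricing of Parisian options has been studied by many economists and mathematicians, and the generating-function method proposed by Li and Zhao provides another solution. In this algorithm, however, the time needed to compute the option price grows sharply with the number of periods. We therefore use the parallel capability of the GPU to improve performance.
The experiments show that with a single GPU thread the GPU runs under the same conditions as the CPU, yet it is slower. Although the Parisian option computation is not complicated and the GPU has better floating-point performance than the CPU, the GPU still takes longer than the CPU as the number of periods increases.
With 100 GPU threads, parallel processing clearly makes the GPU outperform the single-threaded CPU. Handing work to the GPU also involves data transfer and memory handling, which take some time, so the advantage of parallelization is small when the number of periods is small. When the number of periods grows substantially, the GPU running time increases only a little while the CPU running time increases more and more steeply. The reason is that with 100 threads each thread only has to run a few more iterations as the number of periods grows, whereas a single thread has to carry out the whole computation.
Hence, although CUDA still has drawbacks today, such as being constrained by the hardware (the graphics card) and being unable to handle recursion, the GPU clearly has an advantage on problems of this kind. Beyond the algorithm in this thesis, the GPU is also quite efficient for Monte Carlo simulation and for problems whose large computations can be split into many largely independent parts. As hardware technology advances, the efficiency gains in processing large amounts of data will become even more pronounced.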
References
[1] Bing-Qing Li and Hai-Jian Zhao, Pricing Parisian Options by Generating Functions, The Journal of Derivatives, 2009, 72–81.
[2] H. S. Wilf, Generatingfunctionology, 2nd edition, Academic Press, 1994.
[3] M. Costabile, A Combinatorial Approach for Pricing Parisian Options, Decisions in Economics and Finance, 25(2), 2002, 111–125.
[4] J. C. Hull, Options, Futures, and Other Derivatives, 6th edition, Upper Saddle River, NJ: Prentice-Hall, 2006.
[5] Yuh-Dauh Lyuu and Yi-Chun Wu, Performance of GPU for a Tree Model for Convertible Bonds Pricing with Stock Price, Interest Rate, and Default Risks, 2008.
[6] Yuh-Dauh Lyuu and Cheng-Wei Wu, Pricing Parisian Options: Combinatorics, Simulation, and Parallel Processing, 2008.
[7] Yuh-Dauh Lyuu and Cheng-Wei Wu, An Improved Combinatorial Approach for Pricing Parisian Options, Decisions in Economics and Finance, 33, 2010, 49–61.
[8] NVIDIA Corporation, NVIDIA_CUDA_Programming_Guide_2.2.1, 2009.
[9] Heresy's Space, http://heresy.spaces.live.com/blog/.
[10] A Review of the History of NVIDIA GPU Cores (in Chinese), http://www.fevernet.com/thread-6501-1-1.html.
[11] CUDA ZONE, http://www.nvidia.com.tw/object/cuda_home_new_tw.html.
[12] CUDA Wiki, http://zh.wikipedia.org/zh-tw/CUDA.
[13] ഋЎሱ, ᒧΒԄुሽ.
[14] ูႠ, ЃᎿᒧϟಏ, ᝊٰߎᑼബཥۑтಃΒΜϖය, 2003.