Improving GPU Memory Performance via Address Precomputation


Department of Computer Science and Information Engineering
College of Electrical Engineering and Computer Science
National Taiwan University
Master Thesis

Improving GPU Memory Performance via Address Precomputation

Li-Jhan Chen

Advisor: Chia-Lin Yang, Ph.D.

August 2016

doi:10.6342/NTU201602569

Acknowledgements

ᅺγፕЎૈ୼໩ճԋфֹԋǴЬाགᖴޑࢂךޑࡰᏤ௲௤ླྀ٫࣓Դ

ৣǴӧ΋໒ۈ଺ࣴزǴᚸᔉคޕޑਔংǴԴৣᕴࢂӧࣴز΢๏Αךࡐ ӭ҅ዴޑཀـᆶБӛǴᡣךૈ୼ӧࣴزޑၰၡ΢ᝩុ߻຾ǴନΑࣴز

΢ޑಒЈࡰᏤϐѦǴԴৣᗋ࿶த௲Ꮴךॺ଺Γೀ٣ޑၰ౛ᆶ଺٣௃ޑ БݤǴ೭ٿԃΠٰǴᡣךڙ੻ؼӭǴӧԜჹԴৣ᝘΢ന၈ኑޑགᖴǶ ԜѦाགᖴα၂ہ঩Ǵഋ٩ᆺഋ௲௤ǵֈϘᅺֈ௲௤аϷഋڷדഋ௲

௤Ǵܭα၂ၟᅺፕБय़ගрΑࡐӭёаׯ຾ޑ೽ϩǴᡣԜጇፕЎ׳у

ֹ๓Ƕ

ӧࣴز܌ғఱύǴന੝ձགᖴޑࢂЦ࢙ᑣᏢߏǴᏢߏᕴࢂ྽ךၶډ

ࣴز΢ޑ౟ᓍਔǴഉך΋ଆ૸ፕǴ٠๏рࡐӭᝊ຦ޑཀـǴЇሦךډ

҅ዴޑБӛǶԜѦགᖴჴᡍ࠻Ꮿ϶ԴᒘǵҥҥၟԴᖙǴεৎᕴࢂ΋ଆ

૸ፕǵ΋ଆ଺٣ǵ΋ଆѺ቏ઍǵ΋ଆ΢൵܌Ǵᡣךӧࣴز܌ғఱύк ᅈ៿኷ޑӣᏫǶᗋԖགᖴჴᡍ࠻Ꮲ׌ॺǴЀځࢂ੻ܱǴഉך΋ଆ૸ፕ ᜢܭᐒڋޑ೽ϩǶനࡕǴགᖴৎΓॺޑЍ࡭ᆶഉՔǴӧךବဉㄲㄲޑ ਔংǴฆ໭๏ךӞǴӧךఁӣৎޑਔংǴ฻ךӣৎǴӧך௃ᆣեပޑ ਔংǴ๏ΑѺ਻уݨǴᡣךԖᝩុᏟରޑ߿਻ǴᜫԜਔԜڅޑ഻৹ૈ

ᆶգॺ΋ӕӅ٦

ഋҥ৖ᙣ᛽

ΪѠεၗૻπำ܌

ύ๮҇୯΋ԭ႟ϖԃΖД

Abstract (Chinese)

ଯਏૈޑ೯Ҕკ׎ೀ౛ᏔЬाྍԾܭεໆ୺ՉᆣޑѳՉϯаϷၗྍ

ޑкϩ٬ҔǶӧ೯Ҕკ׎ೀ౛Ꮤޑ୺ՉኳࠠΠǴኧаίीޑ୺ՉᆣҔ

ٰၲډଯำࡋ٠ՉǶฅԶǴ؂ঁਡЈ΢ࠅѝԖ΋ঁࡐλޑಃ΋ભזڗ

ٰ෧ϿӸڗ૶Ꮻᡏޑۯᒨਔ໔Ƕӧεໆޑ୺Չᆣ۶Ԝᝡݾ΋ঁࡐλޑ

ಃ΋ભזڗၗྍޑ௃ݩΠᏤठࡐৡޑזڗਏૈ٠ӢԜज़ڋΑس಍ޑਏ

ૈǶ

ҁᅺፕύǴךॺ௖઩౜Ԗკ׎ೀ౛Ꮤ΢ޑ୔༧௨ำᏔаϷZDUS ௨ำᏔ

ޑ೛ीǴךॺว౜౜Ԗޑ୔༧௨ำᏔ໼ӛܭஒ٬Ҕډ࣬ӕזڗӈϩණ ԿόӕޑਡЈ΢ǴԶ෧ϿΑזڗӈख़ፄ٬Ҕޑᐒ཮ǶӢԜǴךॺ೛ी

୔ୱགޕ୔༧ϷZDUS ௨ำᏔٰගϲזڗਏૈǶ೸ၸ೬ᡏکฯᡏޑӝբ

БԄǴᡣ୔༧௨ำᏔૈ୼٣Ӄளޕ؂ঁ୔༧཮Ҕډব٤זڗӈ٠ஒ٬

Ҕډ࣬ӕזڗӈޑ୔༧ܫӧӕ΋ঁਡЈ΢аቚуזڗӈޑख़ፄ٬Ҕޑ

ᐒ཮ǶԜѦǴ٠ᙖҗ୔ୱགޕZDUS ௨ำᏔಒಈࡋޑ௓ڋ ZDUS ޑ୺Չ

໩ׇٰਂਆזڗӈख़ፄ٬Ҕޑᐒ཮Ƕჴᡍ่݀ᡉҢǴךॺගрޑ୔ୱ གޕ௨ำᏔёаԖਏӦගϲזڗਏૈǴ٠КനӃ຾ޑ௨ำ೛ीӭගϲ

ऊ ޑ᏾ᡏਏૈǶ

ᜢᗖӷ² ೯Ҕკ׎ೀ౛Ꮤǵ୔༧௨ำᏔǵ:DUS ௨ำᏔǵಃ΋ભז

ڗǵଯਏૈၮᆉ

Abstract

High-performance computing on GPGPUs relies on maximizing thread-level parallelism and fully utilizing hardware resources. In the GPU execution model, thousands of threads are employed to achieve a high level of parallelism. However, each streaming multiprocessor (SM) provides only a small L1 cache to reduce memory access latency. A massive number of threads competing for this small L1 cache leads to poor cache performance and limits system performance.

In this thesis, we explore the current design of the thread block scheduler and the warp scheduler. We find that the current thread block scheduler tends to allocate thread blocks that use the same cache lines to different SMs, which reduces cache reuse opportunities. Therefore, we design a locality-aware scheduler to improve GPU cache performance. With our proposed software and hardware cooperative method, the cache lines touched by a block are known a priori, so the thread block scheduler can place thread blocks that share cache lines on the same SM to increase cache line reuse. In addition, the locality-aware warp scheduler controls the execution order of warps at a fine granularity to capture the cache locality. The results show that our locality-aware scheduler effectively improves cache performance and, on average, outperforms state-of-the-art scheduling policies.

Keywords: GPGPU, Block (CTA) Scheduler, Warp (Wavefront) Scheduler, L1 Data Cache, High Performance Computing

Contents

Acknowledgements
Abstract (Chinese)
Abstract
1 Introduction
2 Background and Motivation
  2.1 Background
    2.1.1 GPGPU Architecture
    2.1.2 Thread Block Scheduling
    2.1.3 Warp Scheduling
  2.2 Motivation
3 Locality-Aware Scheduler
  3.1 Overview of locality-aware scheduler
  3.2 Compiler Support
  3.3 Locality-Aware Thread Block Scheduler
    3.3.1 Thread-Block-Level Access Range Calculation
    3.3.2 Thread-Block-Dispatching Decision
  3.4 Locality-Aware Warp Scheduler
    3.4.1 Warp-Level Access Range Calculation
    3.4.2 Two-level Warp Scheduler
4 Experimental Methodology
5 Evaluation
  5.1 Effect of thread block scheduler
  5.2 Effect of warp scheduler
  5.3 Pipeline stall reduction
  5.4 Hardware Overhead
6 Related Works
  6.1 Block Scheduling for improving cache locality
  6.2 Warp Scheduling for improving cache locality
  6.3 Improving resource utilization on GPUs
  6.4 Improving thread-level parallelism on GPUs
7 Conclusion
Bibliography

List of Figures

GPGPU Architecture
Data locality in workloads with different access behavior
Comparison of different thread block scheduling policies for row-major and column-major applications
Overview of our locality-aware scheduler
Address calculation code extraction
Simple address calculation code (array-based data structures)
Example of memory access region in a thread block
Extended block queue entries
Byte address to line address transformation
Mapping address range of the block to the coordinate of cache line address
Flow of the thread-block dispatching decision algorithm
Locality estimation between two blocks
Hierarchical warp encoding
Locality-aware warp scheduler
Flowchart of the two-level warp dispatching decision
Example of the locality degree table
Example of inter-warp locality computation
Comparison of performance for different policies, normalized to the baseline architecture using the GTO warp scheduler
L1D cache miss comparisons
Breakdown of pipeline stall cycles, normalized to baseline execution cycles, of Type I applications
Breakdown of pipeline stall cycles, normalized to baseline execution cycles, in Type II and Type III applications

List of Tables

GPU and CPU per-thread cache resource
GPGPU-Sim Simulation Configuration
Workloads
Workload categories
Hardware overhead

Chapter 1 Introduction

General-Purpose Graphics Processing Units (GPGPUs) are becoming increasingly popular and are widely used in areas such as image processing, physics-based simulation, and cloud computing because of their significant computing capability. Modern GPGPUs allow thousands of threads to execute in parallel. To manage this massive number of threads on a GPGPU core and to simplify the hardware design, threads are typically grouped into thread blocks, and each thread block is divided into small groups of threads called warps. Threads within a warp execute in lockstep: they share the fetch and decode units in the pipeline and all execute the same instruction at the same time. With multiple warps residing in a GPU core, memory access latency can be hidden by fast context switching between warps. Therefore, GPGPUs can offer better performance and power efficiency than CPUs.

In the typical GPGPU execution model, a kernel, which is the minimum task unit on GPUs, is composed of multiple thread blocks. The thread block scheduler distributes those thread blocks to the Streaming Multiprocessors (SMs). In each SM, the thread blocks are further divided into smaller scheduling units, the warps, and the warp scheduler issues warps to the SIMD (single instruction, multiple data) lanes for execution. In addition, each SM contains a large amount of memory resources, such as register files, an L1 data cache, and shared memory, to support the large number of executing threads.

Despite the high degree of achievable thread-level parallelism, GPU cores usually suffer from serious cache contention. In each core of a modern GPGPU architecture, thousands of threads share a small L1 cache, which means that each thread has only an extremely small L1 cache quota. For example, on NVIDIA's Kepler, the per-thread share of the L1 cache is on the order of bytes, far less than what a CPU thread has. With such limited cache capacity, the cache lines brought in by one thread can easily be evicted by other threads, leading to serious performance degradation.

Prior studies have proposed different warp schedulers to mitigate cache contention. Rogers et al. propose cache-conscious warp scheduling, which uses additional hardware to monitor cache thrashing behavior and then applies warp throttling to reduce inter-warp interference and preserve intra-warp cache locality. DAWS preserves more intra-warp cache locality by using a predictor that combines profile-based and online detection information. Most of these approaches throttle the number of warps that can access the cache in order to preserve intra-warp locality. However, very few studies try to mitigate cache contention by placing warps that share cache locality on the same core. In addition to warp scheduling, recent work shows that thread block scheduling is another important factor in improving cache performance. It schedules two consecutive thread blocks to the same core, based on the observation that two consecutive thread blocks usually access data in shared cache lines. However, that work focuses on only one specific data access behavior among the various GPGPU workloads. For GPGPU applications with different data access behavior, a comprehensive approach is needed to exploit not only the cache locality between two consecutive thread blocks but also the cache locality that exists in other thread block combinations.

In this thesis, we propose a software and hardware cooperative method to improve cache performance. The main idea is to estimate thread locality through address precomputation. With this locality information, the thread block scheduler and the warp scheduler within an SM can make smart block and warp dispatch decisions, with the goal of optimizing L1 cache performance.

To achieve this goal, the compiler extracts address information from the GPU program during compilation. Using this address information, the thread block scheduler can estimate the cache locality between different thread blocks and assign a suitable thread block to each SM. Moreover, the warp scheduler can control the execution order of warps at a fine granularity, based on the estimated warp locality, to exploit the cache locality set up by the block scheduler. The key challenge is how to accurately estimate the locality between thread blocks and warps at runtime. First, we need a metric to quantify the cache access locality between thread blocks and warps. Second, we must determine what information the metric requires and how to obtain it. Third, we must design block scheduler and warp scheduler policies that capture the performance improvement opportunities offered by cache access locality.

The thesis offers the following contributions:

• We propose the first software-hardware cooperative mechanism to estimate the cache locality among different thread blocks and warps at runtime. Our mechanism can capture the cache access locality in a broad range of regular GPGPU applications with diverse memory access behavior.

• We develop a locality-aware thread block scheduler and warp scheduler. The thread block scheduler increases the number of shared cache lines and the amount of data reuse in each SM, and the warp scheduler maximizes cache reuse opportunities in an SM based on the estimated cache locality.

• The evaluation shows that our mechanism can dramatically reduce the L1 cache miss rate and improve performance over both the conventional round-robin scheduler and the state-of-the-art approach.

The rest of the thesis is organized as follows. Chapter 2 introduces the current GPGPU architecture and describes the motivation for this thesis. Chapter 3 presents our proposed software and hardware cooperative method for estimating cache access locality, the proposed locality-aware scheduling policies, and the architectural details. Chapter 4 describes our methodology and workloads. Chapter 5 evaluates our proposed scheduler against state-of-the-art scheduling policies and presents our experimental results. Chapter 6 summarizes related work. Finally, Chapter 7 concludes the thesis.

Chapter 2

Background and Motivation

In this chapter, we first introduce the GPGPU architecture and then provide background information on the thread block and warp schedulers. Next, we briefly describe our motivation.

2.1 Background

With the emergence of high-performance computing, many applications use GPUs to accelerate their execution. For GPGPU programming, CUDA (Compute Unified Device Architecture) and OpenCL (Open Computing Language) are the most popular languages. In the CUDA and OpenCL programming models, a kernel is defined as the minimum unit launched on GPUs. A kernel consists of thousands of threads, organized in a two-level thread hierarchy: thread blocks (also called CTAs) and warps (also called wavefronts). A collection of threads is grouped into a warp, and one or more warps form a thread block.

2.1.1 GPGPU Architecture

The GPGPU architecture figure shows a typical GPGPU architecture. It consists of a thread block scheduler, multiple streaming multiprocessors (SMs), an L2 cache shared among all SMs, and off-chip DRAM. The thread block scheduler is responsible for distributing thread blocks to SMs. Each SM has dual warp schedulers, register files, an L1 cache, a texture cache, and scratchpad memory. During kernel execution, the thread block scheduler dispatches all the threads of a thread block to an SM, and a fixed number of threads in the thread block (usually 32) are automatically grouped into a warp by the warp scheduler. Then, in each cycle, the warp scheduler issues a warp to the SIMD (single instruction, multiple data) pipeline. The private L1 cache is used to cache data in the SM. In contrast, scratchpad memory, usually called shared memory in CUDA programming, serves as a user-managed cache. Data stored in shared memory is visible only to warps within the same thread block.

Figure: GPGPU Architecture (block scheduler; SMs, each with two warp schedulers, SIMD lanes, register files, L1 cache, shared memory, and texture cache; L2 cache; DRAM)

2.1.2 Thread Block Scheduling

The thread block scheduler, called the GigaThread engine in NVIDIA GPUs, is responsible for assigning thread blocks to SMs when the SMs have enough hardware resources. Four hardware resources limit the number of thread blocks running on an SM: register file usage per thread block, scratchpad memory usage per thread block, the number of threads, and the number of blocks. Once an SM has enough hardware resources to support a new thread block, the thread block scheduler assigns a new thread block to it. However, very little information is available about the scheduling policy used by the thread block scheduler. In this work, we use a Round-Robin (RR) thread block scheduler as our baseline.

2.1.3 Warp Scheduling

In NVIDIA GPUs, 32 threads within a thread block are grouped into a single warp. Multiple warps in an SM can be used to hide memory access latency through fine-grained multithreading. For instance, when one warp stalls, the warp scheduler keeps issuing other available warps to hide the stall latency and sustain high performance. Various warp scheduling policies have been proposed to achieve high performance. Here, we introduce three basic warp schedulers.

Round-Robin warp scheduler

In the Round-Robin warp scheduler, every warp has equal priority, and the warp scheduler issues warps according to that priority. This naive policy tends to make all warps progress at the same rate, which can cause all warps to stall on the same long-latency operation (i.e., an off-chip memory access) at the same time and degrade system performance.

GTO warp scheduler

The GTO (greedy-then-oldest) warp scheduler has been proposed to address the drawbacks of the RR warp scheduler. The main idea is to let different warps progress at different rates so that they are unlikely to stall at the same time. The policy works as follows: the GTO scheduler greedily selects the same warp for execution until it stalls, and then chooses the oldest warp for execution. Each warp's age is determined by the time at which it enters the SM. Note that although all warps within a thread block enter the SM at the same time, they still have different ages: a warp with smaller thread IDs is considered older than a warp with larger thread IDs. Since the oldest warp usually has the highest priority, it usually progresses fastest, while the youngest warp progresses slowest. Because the warps progress at different rates, they are unlikely to stall at the same time. That is, once a warp stalls, other available warps can be selected for execution to keep the core busy and improve system performance.
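The greedy-then-oldest selection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the simulator's implementation: the `Warp` record, its `age` and `stalled` fields, and the `gto_select` function are hypothetical names, and a smaller `age` value is assumed to mean an older warp.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Warp:
    wid: int               # hypothetical warp identifier
    age: int               # assumed: smaller value = entered the SM earlier
    stalled: bool = False  # True while the warp cannot issue


def gto_select(warps: List[Warp], last: Optional[Warp]) -> Optional[Warp]:
    """Pick the next warp to issue under a greedy-then-oldest policy."""
    # Greedy part: keep issuing the same warp while it is not stalled.
    if last is not None and not last.stalled:
        return last
    # Oldest part: otherwise fall back to the oldest warp that is ready.
    ready = [w for w in warps if not w.stalled]
    return min(ready, key=lambda w: w.age) if ready else None
```

Calling `gto_select(warps, last)` once per cycle, with `last` set to the previously issued warp, reproduces the uneven progress the text describes: the oldest ready warp is favored whenever the current one stalls.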

Table: GPU and CPU per-thread cache resource

Device     Maximum threads per core/SM     L1 cache size
Fermi                                      KB (configurable)
Kepler                                     KB (configurable)
Haswell                                    KB
Skylake                                    KB

Two-level warp scheduler

The two-level warp scheduler has also been proposed to address the shortcomings of the RR scheduler. It maintains two warp groups: an active group and a pending group. Only warps in the active group can be issued by the warp scheduler. When a warp in the active group stalls on a long-latency operation, it is demoted to the pending group. At the same time, a warp is promoted from the pending group to the active group. As a result, warps in the active group and warps in the pending group progress at different rates, which prevents all warps from stalling at the same time.
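The demote and promote behavior of the two groups can be illustrated with a small toy model. The class and method names below are hypothetical, and the first-in, first-out promotion order is an assumption; the text does not state which pending warp is promoted.

```python
from collections import deque


class TwoLevelScheduler:
    """Toy model of the active/pending warp groups (names are hypothetical)."""

    def __init__(self, warp_ids, active_size):
        ids = list(warp_ids)
        self.active = deque(ids[:active_size])   # warps eligible for issue
        self.pending = deque(ids[active_size:])  # warps waiting for promotion

    def on_long_latency_stall(self, wid):
        """Demote a stalled active warp and promote one pending warp."""
        self.active.remove(wid)
        self.pending.append(wid)
        # Assumed promotion order: first-in, first-out from the pending group.
        # A real scheduler would also skip warps that are still stalled.
        self.active.append(self.pending.popleft())
```

For example, with six warps and an active group of four, a stall of warp 1 moves it behind the waiting warps and brings warp 4 into the active group.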

2.2 Motivation

The previous section introduced the GPGPU architecture. Thousands of threads are employed in an SM to achieve high throughput, but only a small L1 cache is provided to reduce memory access latency. As the table of per-thread cache resources shows, a Kepler GPU shares its L1 cache among all the threads resident in an SM, whereas a Skylake CPU shares its L1 cache among only the few threads running concurrently in a core. On average, the cache available to a GPU thread is measured in bytes, far less than that of a CPU thread, which is measured in kilobytes. With such limited per-thread cache capacity, a cache block brought in by one thread is easily evicted by other threads, leading to serious performance loss.

In addition, the current thread block scheduler is unaware of cache locality among thread blocks and dispatches thread blocks in a Round-Robin fashion. Recent work observes that consecutive thread blocks often access data in shared cache lines and proposes a thread block scheduling policy, called BCS, that improves performance by allocating consecutive thread blocks to the same SM. However, not all workloads benefit from BCS, because the thread block combinations that share cache lines vary across workloads with different memory access behavior. The data locality figure shows two common data access behaviors in current GPGPU workloads. In a row-major access application, consecutive thread blocks usually share the same cache lines. In a column-major access application, however, cache lines are shared between thread blocks that are a stride apart, such as block 0 and block N. The comparison figure illustrates the impact of different thread block scheduling policies on these two types of applications. As it shows, the current RR thread block scheduler tends to allocate thread blocks that use the same cache lines to different cores. The prior work, BCS, captures performance improvement opportunities only in row-major access applications. Our goal is to design a locality-aware scheduler that can exploit the data reuse between thread blocks in different types of workloads, including both row-major and column-major applications, as presented in the ideal case of the comparison figure.

Figure: Data locality in workloads with different access behavior (panels: row-major access and column-major access over blocks B0 to Bn+3, with the shared cache line marked)

Figure: Comparison of different thread block scheduling policies for row-major and column-major applications

Access pattern   Policy   SM 0       SM 1
Row-major        RR       B0, B2     B1, B3
Row-major        BCS      B0, B1     B2, B3
Row-major        Ideal    B0, B1     B2, B3
Column-major     RR       B0, B2     B1, B3
Column-major     BCS      B0, B1     B2, B3
Column-major     Ideal    B0, Bn     B1, Bn+1
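The difference between the policies can be made concrete by counting how many cache-line-sharing block pairs end up on the same SM. The sketch below is illustrative only: the sharing models (`row_major`, `col_major`), the stride `N = 8`, and the function name are assumptions, while the block-to-SM assignments are the first-wave placements drawn in the comparison figure.

```python
from itertools import combinations


def colocated_sharing_pairs(assignment, shares):
    """Count block pairs on the same SM that share cache lines."""
    return sum(
        1
        for blocks in assignment.values()
        for a, b in combinations(sorted(blocks), 2)
        if shares(a, b)
    )


N = 8  # assumed stride (in blocks) between column-major blocks that share lines


def row_major(a, b):
    # Assumed model: blocks 2k and 2k+1 touch the same cache lines.
    return a // 2 == b // 2


def col_major(a, b):
    # Assumed model: blocks i and i+N touch the same cache lines.
    return abs(a - b) == N


# Assignments as drawn in the comparison figure (two blocks per SM).
rr = {0: [0, 2], 1: [1, 3]}
bcs = {0: [0, 1], 1: [2, 3]}
ideal_col = {0: [0, N], 1: [1, N + 1]}
```

Under these assumptions, BCS co-locates every sharing pair for the row-major model but none for the column-major model, which is the gap the locality-aware scheduler targets.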

Chapter 3

Locality-Aware Scheduler

In this chapter, we first describe the main idea of our locality-aware scheduler and then introduce our scheduling policies for the block scheduler and the warp scheduler, respectively.

3.1 Overview of locality-aware scheduler

To improve cache performance, we design a locality-aware scheduler that enhances cache reuse opportunities. The goal of our design is to assign thread blocks that use the same cache lines to the same core and to issue warps that use the same cache lines at nearly the same time, so that cache lines are reused as many times as possible before they are evicted, which raises the L1 hit rate. To achieve this goal, we propose a software and hardware cooperative mechanism that estimates the locality among different thread blocks and warps at runtime.

The overview figure shows the proposed locality-aware scheduling method. The green boxes belong to the software side, and the others belong to the hardware side. The locality-aware scheduling method includes address calculation code extraction and a locality-aware block dispatcher. The address calculation code extraction is done by the compiler, which extracts the code from a kernel program during compilation and generates a separate address calculation binary. The GPU driver passes the binary to the block scheduler at the start of the GPU application. The locality-aware block dispatcher is a scheduling algorithm running on a small in-order processor in the block scheduler; it incorporates the address calculation binary and dispatches thread blocks to SMs so as to maximize inter-block locality. After a kernel is launched on the GPU, its thread blocks are enqueued in the block queue of the block scheduler. The memory access range of each thread block in the block queue is calculated from the address calculation binary and stored in the same block queue entry. When any SM has available resources (i.e., a thread block has finished), the thread-block dispatching decision is triggered. The thread-block dispatching decision algorithm estimates the locality among thread blocks and selects the thread block that has the highest locality with all the running blocks in the SM. The access range of each warp in the block is then calculated and attached to the corresponding warp during dispatching. Finally, the locality-aware warp scheduler schedules warps based on the access range information generated by the block scheduler, to preserve the inter-block locality at the warp level.

Figure: Overview of our locality-aware scheduler (software side: the compiler extracts the address calculation code from the GPU program and produces the address calculation binary alongside the executable; hardware side: the block scheduler, a small in-order processor running the locality-aware scheduling algorithm, holds the block queue and performs thread-block-level access range calculation, the thread-block dispatching decision, and warp-level access range calculation; each SM holds a warp queue and a two-level warp scheduler)

3.2 Compiler Support

A GPGPU application is composed of one or more kernels. Each kernel is an array of thread blocks with unique IDs, and each thread within a thread block is given a unique thread ID. The programmer usually uses these unique block IDs and thread IDs to calculate the data position of each thread. In regular GPGPU applications, threads often operate on structured data, such as one- or two-dimensional data arrays, so the mapping from thread to data can be computed with simple arithmetic. For example, the figure on address calculation code extraction shows a simplified kernel. The code in the top box (line 1) shows typical kernel input parameters, including a data array (i.e., J_cuda) and a constant value (i.e., BLOCK_SIZE). The code in the middle box (lines 4 to 7), called the address calculation code, computes the index into the input data array. Finally, the "index" variable is used to access the data array J_cuda, as shown in the bottom box (line 9). The compiler can easily extract these address calculation code segments from a kernel function and generate the address calculation binary from those code segments and the base addresses of the data array pointers.

 1: __global__ void kernel(float *J_cuda, int BLOCK_SIZE, ...)
 2: {
 3:
 4:     int blockid = blockIdx.x;
 5:     int threadid = threadIdx.x;
 6:
 7:     int index = blockid * BLOCK_SIZE + threadid;
 8:
 9:     c_cuda_temp[ty][tx] = J_cuda[index];
10:     // ...
11:     // computation
12:     // ...
13: }

Figure: Address calculation code extraction (top box, line 1: kernel function with constant value and data array pointer; middle box, lines 4 to 7: address calculation code; bottom box, line 9: index into the data array)

The block scheduler uses the binary, the thread ID, and the block ID to compute the memory address accessed by an arbitrary thread in the kernel, as shown in the figure on simple address calculation code.

Code segment:

int xidx = blockID * BLOCK_SIZE + threadID;
int data = *(gpuSrc_Base_Pointer + xidx);

Figure: Simple address calculation code (array-based data structures; the data array starting at gpuSrc_Base_Pointer is partitioned among blocks B0, B1, ..., Bn, and xidx selects an element within it)
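The way a scheduler could evaluate the extracted address calculation for an arbitrary thread can be sketched as follows. This is only an illustration of the index formula in the code segment above: the function names and the 4-byte element size (matching a `float` or `int` array) are assumptions, not part of the thesis.

```python
def thread_address(base, block_id, thread_id, block_size, elem_size=4):
    """Byte address touched by one thread, for index = blockID * BLOCK_SIZE + threadID."""
    index = block_id * block_size + thread_id
    return base + index * elem_size


def block_address_span(base, block_id, block_size, elem_size=4):
    """First and last byte addresses (inclusive) touched by a whole block."""
    first = thread_address(base, block_id, 0, block_size, elem_size)
    last = thread_address(base, block_id, block_size - 1, block_size, elem_size)
    return first, last + elem_size - 1
```

Because the formula depends only on the block ID, the thread ID, and kernel constants, it can be evaluated before the block runs, which is what makes address precomputation possible for regular kernels.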

Figure: Example of memory access region in a thread block (a data array covered by the access regions of blocks B0 to B5)

3.3 Locality-Aware Thread Block Scheduler

In this section, we first introduce how to calculate the access range of each thread block and then describe our thread block dispatching process, including how inter-block locality is computed from the access range information and the block dispatching criteria.

3.3.1 Thread-Block-Level Access Range Calculation

In regular GPGPU applications, threads often access contiguous memory regions. The region is usually a linear or 2D block of data when threads access a one- or two-dimensional data array, so the memory access range of each thread block can be viewed as a rectangle, as shown in the figure on the memory access region of a thread block. We can therefore represent the rectangle by its start point (i.e., its upper-left point), width, and height. The start point is computed from the first thread in the block, and the width and height are computed from the difference between the memory addresses of the first thread and the last thread in the block.

We extend the original block queue entries to store the access range rectangles of the block on the data arrays, as shown in the figure on extended block queue entries.
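One possible way to derive the rectangle from the two precomputed addresses is sketched below. It assumes a row-major 2D array whose rows are `pitch` bytes apart, that the first thread touches the upper-left element of the block's tile and the last thread the lower-right element, and 4-byte elements; the names `AccessRect` and `access_rect` are hypothetical.

```python
from typing import NamedTuple


class AccessRect(NamedTuple):
    x: int       # start point: byte offset of the upper-left corner within its row
    y: int       # start point: row of the upper-left corner
    width: int   # extent in bytes along a row
    height: int  # extent in rows


def access_rect(base, pitch, first_addr, last_addr, elem_size=4):
    """Rectangle spanned by the first and last thread of a block."""
    x0, y0 = (first_addr - base) % pitch, (first_addr - base) // pitch
    x1, y1 = (last_addr - base) % pitch, (last_addr - base) // pitch
    # The last thread's address is the start of its element, so add one element.
    return AccessRect(x0, y0, x1 - x0 + elem_size, y1 - y0 + 1)
```

Storing only these four values per data array keeps each extended block queue entry small, regardless of how many threads the block contains.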

To calculate how many cache lines are shared between thread blocks, the memory access range of a thread block is further expressed in cache line coordinates. As shown in the figure on byte address to line address transformation, the data array is transformed from byte addresses to cache line addresses. For example, all the byte addresses that fall within one cache-line-sized segment of a row are mapped to a single cache line address, and the byte addresses in the following segment are mapped to the next cache line address, because of the fixed cache line size of the L1 cache in modern GPUs.

Figure: Extended block queue entries (each block queue entry B0, B1, B2, B3, ..., Bn is extended with a start point, width, and height)

The memory access range of a block can thus be transformed into a cache line access range. Therefore, the cache lines accessed by the block can be represented as a rectangle with a start point, width, and height, as shown in the figure on mapping the address range of the block to cache line coordinates.

Figure: Byte address to line address transformation (a 2D data array of M x N bytes is mapped to 2D line addresses (x', y'), with one line address per cache-line-sized segment)

Figure: Mapping address range of the block to the coordinate of cache line address (the block's access range, bounded by its first thread and last thread, is expressed as a width and a height in 2D line address coordinates)
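The byte-to-line transformation can be sketched as an integer division along the row direction. The 128-byte line size used below is an assumption made for illustration (the value in the original text did not survive extraction), the coordinate order `(x, y)` is a convention chosen here, and rows are assumed to start on a cache line boundary; the function names are hypothetical.

```python
LINE_SIZE = 128  # assumed cache line size in bytes; not taken from the thesis


def to_line_coords(x, y, line_size=LINE_SIZE):
    """Map a 2D byte address (x = byte offset in row, y = row) to a line address."""
    return x // line_size, y


def rect_to_line_rect(x, y, width, height, line_size=LINE_SIZE):
    """Convert a byte-granularity rectangle into cache line coordinates."""
    first_line = x // line_size
    last_line = (x + width - 1) // line_size
    return first_line, y, last_line - first_line + 1, height
```

After this step, the width of a block's rectangle counts cache lines rather than bytes, so overlap areas between rectangles directly count shared cache lines.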

3.3.2 Thread-Block-Dispatching Decision

Figure: Flow of the thread-block dispatching decision algorithm

The figure above plots the flowchart of the thread-block dispatching decision. Once a thread block has finished on a particular SM x, the block scheduler allocates a new thread block to that SM based on the inter-block locality between the candidate block and all the blocks running on SM x. The inter-block locality of two blocks is defined as the aggregated overlap of their cache line access ranges, which is the sum of the overlapped areas over all data arrays. For each data array, the overlapped area is computed as follows:

1. As shown in the figure on locality estimation between two blocks, $\mathit{distance}_x$ and $\mathit{distance}_y$ are the differences between the upper-left positions of the two blocks.

2. If $\mathit{distance}_x > \mathit{width}_x$ or $\mathit{distance}_y > \mathit{width}_y$, there is no overlapped area between the two blocks, which means that there is no locality between them on this data array.

3. Otherwise, the overlapped area is $(\mathit{width}_x - \mathit{distance}_x) \times (\mathit{width}_y - \mathit{distance}_y)$, which is the number of cache lines shared by the two blocks on this data array.

Figure: Locality estimation between two blocks (two overlapping rectangles with corner points (X1, Y1), (X2, Y2), and (X3, Y3); the distance and the width along each axis are differences of the corresponding corner coordinates, and the shaded region is the overlapped area)

Finally, the block scheduler selects the thread block that has the highest inter-block locality with the blocks running on SM x. However, if none of the candidate thread blocks has any inter-block locality (i.e., all overlapped areas are zero) with SM x, the scheduler selects the thread block that has the minimal inter-block locality with the blocks running on the other SMs. The reason is that issuing a thread block that has cache locality with blocks running on other SMs would reduce the cache reuse opportunities in those SMs.
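The three steps and the selection rule above can be sketched in a few lines. This is a simplified illustration, not the hardware algorithm: rectangles are `(x, y, width_x, width_y)` tuples in cache line coordinates, all blocks are assumed to have equally sized rectangles (as the formula implies), each block is a list with one rectangle per data array, and the function names are hypothetical.

```python
def overlap_area(a, b):
    """Overlapped cache lines of two equally sized rectangles (x, y, w_x, w_y)."""
    dist_x, dist_y = abs(a[0] - b[0]), abs(a[1] - b[1])
    width_x, width_y = a[2], a[3]
    if dist_x > width_x or dist_y > width_y:
        return 0  # step 2: no overlap, hence no locality on this data array
    return (width_x - dist_x) * (width_y - dist_y)  # step 3


def locality(block, running):
    """Inter-block locality of one block with a collection of running blocks."""
    return sum(
        overlap_area(ra, rb)
        for other in running
        for ra, rb in zip(block, other)  # pair rectangles per data array
    )


def pick_block(candidates, running_here, running_elsewhere):
    """Choose the candidate to dispatch to the SM that just freed resources."""
    best = max(candidates, key=lambda c: locality(candidates[c], running_here))
    if locality(candidates[best], running_here) > 0:
        return best
    # No candidate shares lines with this SM: take the one that disturbs
    # the other SMs least.
    return min(candidates, key=lambda c: locality(candidates[c], running_elsewhere))
```

Here `candidates` maps a block name to its rectangles, and `running_here` and `running_elsewhere` hold the rectangles of blocks resident on SM x and on the other SMs, respectively.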

 /RFDOLW\$ZDUH :DUS 6FKHGXOHU

,Q WKLV VHFWLRQ ZH ILUVW LQWURGXFH KRZ WR FDOFXODWH DFFHVV UDQJH RI HDFK ZDUS DQG WKHQ GHVFULEH RXU SURSRVHG ORFDOLW\DZDUH ZDUS VFKHGXOHU WR FDSWXUH WKH EHQHILWV SURYLGHG IURP EORFN VFKHGXOHU

 :DUS/HYHO $FFHVV 5DQJH &DOFXODWLRQ

8QOLNH WKH DFFHVV UDQJH RI D EORFN WKH DFFHVV UDQJH RI D ZDUS XVXDOO\ GRHV QRW KDYH D IL[HG VKDSHG VR LW FDQ QRW EH UHSUHVHQWHG DV WKH VWDUW SRLQW ZLGWK DQG KHLJKW ,QVWHDG

WKH DFFHVV UDQJH RI D ZDUS FDQ EH UHSUHVHQWHG DV D ELWYHFWRU ,Q WKH ELWYHFWRU HDFK ELW LV XVHG WR UHSUHVHQW WKH DFFHVV VWDWXV RI D XQLTXH FDFKH OLQH %LW  PHDQV WKDW WKH FDFKH OLQH LV QRW DFFHVVHG E\ WKH ZDUS DQG ELW  PHDQV WKDW WKH FDFKH OLQH LV DFFHVVHG E\ WKH ZDUS

+RZHYHU WKH RQHELWSHUOLQH UHSUHVHQWDWLRQ LV LPSUDFWLFDO GXH WR WKH KXJH ZRUNLQJ VHW LQ WKH NHUQHO +HQFH ZH SURSRVH WKH KLHUDUFKLFDO HQFRGLQJ PHWKRG WR FXW GRZQ ELW XVDJH DV VKRZQ LQ ILJXUH  7KH KLHUDUFKLFDO HQFRGLQJ PHWKRG FRQWDLQV WKH IROORZLQJ WZR VWHSV

6WHS   7KH GDWD DUUD\ LV SDUWLWLRQHG LQWR 2M VPDOO UHJLRQV ZKHUH HDFK UHJLRQ LV UHSUH

 doi:10.6342/NTU201602569

(27)

Cols Width

2D cache line coordinate

Row Width

Each region is further divided into N sub-region, and each sub-region use 1 bit to indicate whether it is accessed by a warp or not

E.g. N = 4 in this example

Region

step 1 step 2

)LJXUH  KLHUDUFKLFDO ZDUS HQFRGLQJ

VHQWHG E\ D UHJLRQ ELWYHFWRU ZLWK 0 ELWV 7KHQ HDFK WKUHDG EORFN FRXOG JHW D 0ELW UHJLRQ YHFWRU E\ PDSSLQJ LWV PHPRU\ DFFHVV UDQJH WR WKH GDWD DUUD\ $V VKRZQ LQ ILJXUH  WKH GDWD DUUD\ LV SDUWLWLRQHG LQWR 24UHJLRQ ,I WKH DFFHVV UDQJH RI D EORFN LV IDOOHQ LQWR WKH JUHHQ ER[ UHJLRQ   WKH ELW UHJLRQ YHFWRU EHFRPHV 

Step 2: Each region is further partitioned into N subregions, and the subregions of a region are represented by a subregion bit-vector with N bits. The warp then obtains an N-bit subregion vector by mapping its memory access range onto the subregions. As shown in figure […], each region is partitioned into 4 subregions, and each subregion uses 1 bit to indicate whether it is accessed by the warp or not. If a warp accesses the subregions in the blue box (the upper-left and lower-right subregions), the […]-bit subregion vector becomes […].

By combining the region vector of the block and the subregion vector of the warp, the access range of a warp can be represented as a bit-vector with M (length of the region bit-vector) + N (length of the subregion bit-vector) bits.
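The two encoding steps can be illustrated with a minimal Python sketch. It is not the implementation used in this work: the function name `encode_access_range`, the grid-shaped layout of regions and subregions, the use of the block's start point to select a single region, and the bit ordering (bit 0 is the upper-left subregion) are all assumptions made for illustration.

```python
def encode_access_range(block_start, warp_lines, array_shape, region_grid, sub_grid):
    """Return (region_id, subregion_bits) for one warp.

    All coordinates are 2D cache-line coordinates (row, col). The array is
    assumed to divide evenly into regions, and regions into subregions.
    """
    rows, cols = array_shape
    reg_rows, reg_cols = region_grid      # reg_rows * reg_cols == 2**M regions
    sub_rows, sub_cols = sub_grid         # sub_rows * sub_cols == N subregions
    reg_h, reg_w = rows // reg_rows, cols // reg_cols
    sub_h, sub_w = reg_h // sub_rows, reg_w // sub_cols

    def region_of(r, c):
        return (r // reg_h) * reg_cols + (c // reg_w)

    # Step 1: the index of the block's region serves as its M-bit region vector.
    region_id = region_of(*block_start)

    # Step 2: set one bit per subregion touched by the warp.
    sub_bits = 0
    for r, c in warp_lines:
        if region_of(r, c) != region_id:
            continue  # lines outside the block's region are ignored in this sketch
        sr = (r % reg_h) // sub_h
        sc = (c % reg_w) // sub_w
        sub_bits |= 1 << (sr * sub_cols + sc)
    return region_id, sub_bits


region_id, sub_bits = encode_access_range(
    block_start=(4, 8), warp_lines=[(4, 8), (7, 11)],
    array_shape=(16, 16), region_grid=(4, 4), sub_grid=(2, 2))
print(f"{region_id:04b}-{sub_bits:04b}")  # prints 0110-1001
```

In this hypothetical setup (a 16 x 16 array of cache lines, M = 4, N = 4), the block starting at (4, 8) maps to region 6, and the warp touching lines (4, 8) and (7, 11) sets the upper-left and lower-right subregion bits.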

Two-level Warp Scheduler

In order to capture the inter-block locality at the warp level, we should group the warps that have inter-warp locality and execute them at roughly the same time, so that shared cache lines can be used as many times as possible before they are evicted.

Based on the above idea, we build our locality-aware warp scheduler on the widely used two-level warp scheduler. Figure […] shows our proposed locality-aware warp scheduler. In the SM, an additional warp queue is introduced to store the access range of the warps running on the SM as well as the inter-warp locality between warps.

Figure […]: Locality-aware warp scheduler. In the SM, the two-level warp scheduler keeps a warp queue that is split into an active group and a pending group, and a selector issues warps to the SIMD lanes. A warp that encounters a long-latency operation (off-chip memory access) moves to the pending group, and the pending warp having the most inter-warp locality with the other warps in the active group is promoted. Within the active group, the scheduler greedily executes a warp until a short stall and then selects the next issued warp by locality.

The access range information of warps is updated by the block scheduler, and the inter-warp locality between warps is computed by the warp scheduler during execution. The idea of the proposed two-level warp scheduler is to divide all the running warps in the SM into two warp groups: the active group and the pending group. In each cycle, a warp from the active group is issued to the SIMD lanes for execution in greedy-then-locality priority order. Once a warp suffers a long-latency stall (i.e., an off-chip memory access), it is demoted to the pending group. At the same time, the ready warp that has the highest sharing degree with the warps in the active group is promoted from the pending group to the active group.

Since we always promote warps with locality from the pending group to the active group, starvation may occur when some warp naturally has no locality with other warps. Once a warp starves, the other warps within the same thread block cannot leave the SM until the starved warp finishes, which degrades performance. To tackle the starvation issue, a simple timeout solution is adopted. Each thread block is given an age when it is assigned to the SM. We detect starvation when $Age_{\text{new-dispatched-threadblock}} - Age_{\text{current-threadblock}} > N$, where N is the maximum number of thread blocks in the SM. Once starvation is detected, the starved warps are served with the highest priority.

Figure […]: Flowchart of the two-level warp dispatching decision
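The age-based timeout rule can be sketched as follows. The sketch assumes that a block's age is the value of a per-SM dispatch counter at the time the block was assigned to the SM, which the text does not specify; `find_starved_blocks` and its parameter names are hypothetical.

```python
def find_starved_blocks(block_ages, newest_age, max_blocks_per_sm):
    """Return the blocks whose age gap to the newest dispatched block exceeds N.

    block_ages maps a block identifier to the age it received at dispatch
    (assumed here to be a monotonically increasing per-SM counter).
    """
    return [block for block, age in block_ages.items()
            if newest_age - age > max_blocks_per_sm]


# Block "A" was dispatched 9 dispatches before the newest block, with N = 8.
print(find_starved_blocks({"A": 1, "B": 8, "C": 10}, newest_age=10,
                          max_blocks_per_sm=8))  # prints ['A']
```

Warps that belong to the returned blocks would then be issued ahead of the locality-based choice until those blocks finish.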

The flowchart of the two-level warp dispatching decision is shown in figure […]. First, the warp scheduler greedily selects the same warp in the active group for execution until it suffers a stall. If the stall is a short stall, such as a pipeline stall, the warp scheduler selects from the active group the warp that has the highest inter-warp locality with the last issued warp. Otherwise, the warp is stalled by a long-latency operation and is demoted to the pending group. At the same time, the warp that has the highest inter-warp locality with all warps in the active group is promoted from the pending group to the active group.
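The dispatching decision described above can be sketched in Python. This is an illustration under stated assumptions rather than the hardware design: the function name `next_warp`, the encoding of stall kinds as `None`, `"short"`, and `"long"`, the use of a sum to measure locality "with all warps in the active group", and the choice of the next issued warp after a long-latency stall are all assumptions.

```python
def next_warp(last, stall, active, pending, ready, locality):
    """Pick the warp to issue next; updates `active` and `pending` in place.

    last     -- the warp issued in the previous cycle
    stall    -- None (no stall), "short" (e.g. pipeline stall) or "long"
    ready    -- set of pending warps whose long-latency operation has completed
    locality -- function (warp_a, warp_b) -> inter-warp locality degree
    """
    if stall is None:
        return last  # greedy: keep issuing the same warp

    if stall == "long":
        # Demote the stalled warp and promote the ready pending warp that
        # shares the most with the warps remaining in the active group.
        active.remove(last)
        pending.append(last)
        candidates = [w for w in pending if w in ready and w != last]
        if candidates:
            best = max(candidates,
                       key=lambda w: sum(locality(w, a) for a in active))
            pending.remove(best)
            active.append(best)

    # Locality step: choose the active warp closest to the last issued warp.
    others = [w for w in active if w != last]
    if not others:
        return None
    return max(others, key=lambda w: locality(w, last))
```

The locality function would typically be a lookup in the locality degree table, so each decision is a linear scan over one warp group.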

The inter-warp locality described above is kept in a locality degree table, as shown in figure […]. Each entry in the locality degree table represents the inter-warp locality of the corresponding two warps. For example, the inter-warp locality between warp 0 and warp 1 is stored in the entry (0, 1). The inter-warp locality between two warps can be computed by comparing their warp access range information as follows:

1. Check whether the region vectors of the two warps are the same. If they have different region vectors, there is no inter-warp locality between them. As shown in figure […], warp 2 has a region vector different from that of warp 0 and warp 1, so it has no inter-warp locality with them.

2. Otherwise, the inter-warp locality is the number of positions at which both subregion vectors have bit 1. As shown in figure […], warp 0 and warp 1 have the same region vector, so the inter-warp locality becomes 2 because there are 2 positions with the same bit 1 in their subregion vectors.

Figure […]: Example of the locality degree table. Warp access range information (region bit-vector - subregion bit-vector): warp 0 = 001-1011, warp 1 = 001-0011, warp 2 = 010-1111. Table entries: (warp 0, warp 1) = 2, (warp 0, warp 2) = 0, (warp 1, warp 2) = 0.

Figure […]: Example of inter-warp locality computation. Access ranges (region vector - subregion vector): warp 0 = 1001-001001, warp 1 = 1001-001001, warp 2 = 1011-001001. Same region vector: inter-warp locality = 2. Different region vector: inter-warp locality = 0.
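The two comparison rules can be written as a short Python sketch. The representation of a warp's access range as a `(region_vector, subregion_vector)` pair of integers and the names `inter_warp_locality` and `build_locality_table` are assumptions for illustration; the input values are those of the locality degree table example.

```python
def inter_warp_locality(warp_a, warp_b):
    """Locality degree of two warps given as (region_vector, subregion_vector)."""
    region_a, sub_a = warp_a
    region_b, sub_b = warp_b
    if region_a != region_b:
        return 0  # rule 1: different region vectors, no locality
    # Rule 2: count the subregions that both warps access (bit 1 in both).
    return bin(sub_a & sub_b).count("1")


def build_locality_table(warps):
    """Upper-triangular table: one entry per unordered pair of warps."""
    ids = sorted(warps)
    return {(a, b): inter_warp_locality(warps[a], warps[b])
            for i, a in enumerate(ids) for b in ids[i + 1:]}


warps = {0: (0b001, 0b1011), 1: (0b001, 0b0011), 2: (0b010, 0b1111)}
print(build_locality_table(warps))  # prints {(0, 1): 2, (0, 2): 0, (1, 2): 0}
```

Storing only one entry per unordered pair matches the triangular shape of the example table; whether the hardware table is stored this way is not stated in the text.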


Chapter […]

Experimental Methodology

Table […]: GPGPU-Sim Simulation Configuration

Parameters                           Value
Cores                                […]
Warp Size                            […]
Max number of threads / Core         […]
Max number of thread blocks / Core   […]
Number of registers / Core           […]
Shared Memory                        […] KB
L1 Data Cache                        […] KB, […] B line, […]-way
L2 Cache                             […] B line, […]-way associative, total […] KB
DRAM Model                           FR-FCFS, […] MCs, […]-entry request/MC
GDDR[…] Timing                       tRRD = 6, tRCD = 12, tRAS = 28, tRP = 12, tRC = 40, tCL = 12

We model the locality-aware scheduling mechanism described in Chapter […] in GPGPU-Sim [4]. The baseline configuration is a Fermi-like architecture. Each SM has a private […] KB L1 data cache and […] KB of shared memory. In total, there are […] SMs sharing a […] KB L2 cache. The global memory is partitioned into […] DRAM channels, and each DRAM channel has a First-Ready First-Come-First-Served (FR-FCFS) memory controller. The other detailed configuration of the simulator is summarized in table […].

To evaluate our mechanism, we use […] benchmarks from different benchmark suites, as listed in table […]. In order to evaluate diverse image processing workloads, we also add two benchmarks to our benchmark list, SIFT and Gabor filter, which are well-known workloads in the area of computer vision. All benchmarks are fully simulated on GPGPU-Sim.

Table […]: Workloads

Name                                             Abbr.      No. of kernels   No. of CTAs
Hotspot […]                                      HS         […]              […]
Back Propagation […]                             BP         […]              […]
LU Decomposition […]                             LUD        […]              […]
Nearest Neighbor […]                             NN         […]              […]
Needleman-Wunsch […]                             NW         […]              […]
Speckle Reducing Anisotropic Diffusion […]       SRAD       […]              […]
Gaussian Elimination […]                         GAUSSIAN   […]              […]
Separable Convolution filter […]                 CONV       […]              […]
Transpose […]                                    TRAN       […]              […]
Histogram […]                                    HIS        […]              […]
Gabor filters                                    Gabor      […]              […]
Scale-invariant feature transform                SIFT       […]              […]


Chapter […]

Evaluation

In this chapter, we use the simulation methodology described in chapter […] and compare our locality-aware scheduler described in chapter […] with the state-of-the-art locality-aware thread block scheduler, BCS [11]. The prior work observes workload behavior and finds that consecutive thread blocks usually have better cache locality. Hence, BCS dispatches two consecutive thread blocks to the SM at the same time. In order to understand the results of BCS and our proposed scheduler, we classify workloads into three categories according to their data access behavior, as shown in table […].

Workloads are classified into Type-I when their data access behavior is dominated by row-major access, while applications with a mixture of column-major accesses, random accesses, etc. are classified into Type-II. Other workloads that do not have inter-block locality are classified into Type-III.

The performance of our locality-aware scheduler, normalized to the baseline configuration using the GTO warp scheduler, is shown in figure […], and the L1 miss rate is shown in figure […]. For type-I workloads, our locality-aware scheduler improves performance by approximately […] compared to BCS, and for type-II workloads, a further speedup of approximately […] is achieved. For type-III workloads, our proposed scheduler has no effect.

Table […]: Workload categories

Type       Benchmark
Type-I     HS, Srad, Conv, His and BP
Type-II    Lud, Tran, Gaussian and SIFT
Type-III   NN and NW

Figure […]: Comparison of performance for different policies, normalized to the baseline architecture using the GTO warp scheduler

Effect of thread block scheduler

For type-I applications, two consecutive thread blocks usually have better cache locality, which is consistent with the observations in [11]. Therefore, in this case, our thread block scheduler achieves roughly the same performance as BCS because our thread block dispatching decision is the same as that of BCS. However, BCS has two limitations. (1) Due to the lack of locality detection at runtime, it cannot dispatch a thread block until the SM has enough hardware resources to meet the requirements of two thread blocks. This kind of delayed scheduling causes two effects: a block-throttle effect, which may relieve resource contention and provide some performance improvement, and a reduction in latency-hiding ability. (2) The other limitation is that BCS can only exploit the cache locality between two thread blocks.

For the SRAD application, although our dispatching decision is the same as that of BCS, BCS can perform better than our scheduler due to the effect of block throttling. For HS and His, cache locality exists not only between two consecutive thread blocks but among N consecutive thread blocks (N greater than two), and our scheduler can correctly detect locality among thread blocks and dispatch as many of those N consecutive thread blocks as possible to the same SM to exploit more cache locality.

Figure […]: L1D cache miss comparisons

For type-II applications, since BCS usually cannot correctly detect the locality between thread blocks of this type, our proposed thread block scheduler achieves a further […] speedup and reduces the miss rate by […]. For type-III applications such as NN and NW, our scheduler has no effect, but BCS degrades performance by […] in the NN application due to the effect of delayed scheduling. In the NN application, there are only […] warps in the SM, so it is hard to hide memory access latency, especially when the delayed scheduling policy is applied in the block scheduler: only […] warps can be used to hide the memory access latency.

Effect of warp scheduler

In order to exploit more cache locality, we design our locality-aware warp scheduler. As mentioned in chapter […], we use a bit-vector produced by the hierarchical encoding method to represent the access range of a warp. Here, we choose a configuration that uses at most […] bits for the region vector and […] bits for the subregion vector on a data array, because it offers a more feasible balance between performance and hardware resources.

For type-I and type-II applications, adding our proposed warp scheduler further improves performance by […] on average. For HS, Srad, His and SIFT, our warp scheduler captures more of the cache locality provided by the thread block scheduler and improves performance from […]. However, for the other applications, our proposed warp scheduler does not bring additional benefit because their cache footprints already fit into the L1 cache after thread blocks with inter-block locality are allocated to the same SM.

Pipeline stall reduction

We further break down pipeline stall cycles to analyze which parts of the stall cycles are reduced by our locality-aware scheduler. A pipeline stall is caused by three types of hazards: structural hazards, idle or control hazards, and RAW hazards. Structural hazards occur when there is no available hardware resource for warp execution (i.e., a lack of miss status holding register (MSHR) resources). An idle or control hazard occurs when warps do not have a valid instruction for execution. It often happens when a barrier or control branch instruction is executed. RAW hazards have two causes: memory access operations (read after write to a register) or data dependences.

As we can see in figures […] and […], our locality-aware thread block scheduler effectively reduces the stall cycles that come from RAW and structural hazards. Since our locality-aware scheduler puts threads with cache locality together, there are two benefits. First, the number of memory requests for the same cache line increases, and these requests can be coalesced in the MSHR entries, which reduces the structural hazards. Second, better cache performance means that the memory access latency is shortened, so the RAW hazards are also reduced.

However, after our proposed warp scheduler is adopted, the portion of idle or control stall cycles increases in some applications. In these applications, we find that programmers tend to use shared memory to optimize performance. When shared memory is used, barrier instructions are needed to synchronize threads within a thread block to guarantee correctness. Because it captures inter-warp locality, our locality-aware scheduler may cause larger variation in warp progress within a thread block than the GTO warp scheduler. Hence, our warp scheduler prolongs the average waiting time of warps at barriers and increases the idle or control hazards.

Figure […]: Breakdown of pipeline stall cycles, normalized to baseline execution cycles, of Type-I applications

Figure […]: Breakdown of pipeline stall cycles, normalized to baseline execution cycles, in type-II and type-III applications

Hardware Overhead

The hardware cost includes the storage cost for block queue information and warp queue information. Each block queue entry holds additional information about the access range rectangles of the block, including their start points, widths, and heights, where each element requires […] byte. In our experiments, a total of […] bytes are required for each block queue entry. For warp queue information, there are two additional pieces of information: the warp access range bit-vector and the locality degree table. Each warp access range bit-vector requires […] bytes, including […] bytes for the region bit-vector and […] bytes for the subregion bit-vector. The locality degree table requires […] byte per entry for recording inter-warp locality. Therefore, the total storage cost for warp queue information is around […] KB, given that current-generation GPUs support up to […] warps. The details of the hardware overhead are listed in table […].

The performance overhead of our design is negligible. For the block scheduler, the block dispatching process is fast because the required information is stored in the block entries. Besides, the block dispatching process is not on the critical path because there are other running blocks in the SM. For the warp scheduler, the inter-warp sharing degree is stored in the locality degree table. Hence, finding the warp with the maximum sharing degree requires only a linear search. This search time can be hidden by the warp scheduler, which greedily issues the same warp until it suffers a stall.

Table […]: Hardware overhead

Location          Components                                  Storage overhead
Block scheduler   Block queue                                 […] bytes per entry
SM                Warp queue: locality degree table           […] KB per SM
SM                Warp queue: warp access range information   […] KB per SM

Chapter […]

Related Works

In this chapter, we summarize the related works. First, we summarize the works that improve cache locality through the block scheduler and the warp scheduler. Second, we introduce other works that improve GPU resource utilization and thread-level parallelism.

Block Scheduling for improving cache locality

Existing research on thread block schedulers is quite limited. M. Lee et al. [11] observe that two consecutive thread blocks usually have spatial cache locality, which means that two consecutive thread blocks often access data in shared cache lines. Therefore, they propose BCS scheduling, which assigns two consecutive thread blocks to the same SM to exploit more cache locality. As discussed in the previous section, their work is suitable for row-major applications, while our work is suitable for various applications with different data access behaviors because it estimates cache locality at runtime. Moreover, our block scheduler collaborates with the warp scheduler to capture as much cache locality as possible.

Warp Scheduling for improving cache locality

T. G. Rogers et al. [19] propose cache-conscious wavefront scheduling (CCWS), which uses additional hardware to monitor cache thrashing behavior and then uses a warp throttling technique to reduce inter-warp interference and preserve intra-warp cache locality. DAWS [20] preserves more intra-warp cache locality by using a predictor that combines profile-based and online detection information. Although throttling mitigates cache thrashing, it reduces TLP and leaves other shared resources underutilized. Therefore, D. Li et al. [12] propose priority-based cache allocation, which limits the number of warps that can allocate cache line entries while other warps bypass the cache. X. Xie et al. [24] propose a coordinated static and dynamic cache bypassing mechanism to determine which warps should be allocated cache resources to improve overall cache locality.

Our work targets preserving inter-warp locality by putting threads with better cache locality together, which is orthogonal to works that preserve intra-warp locality, such as CCWS and DAWS. Additionally, unlike other related works that approach the design only from the warp scheduler, our locality-aware scheduler design goes one step further, from top to bottom: from the block scheduler to the warp scheduler.

Improving resource utilization on GPUs

There have been many works on improving GPU resource utilization. Gebhart et al. [7] propose unified on-chip resource management, which manages on-chip resources including the register files, scratchpad memory, and data cache for different applications. O. Kayıran et al. [10] reduce cache, network, and memory contention by limiting the number of blocks running in an SM. A. Jog et al. [9] propose a scheduling mechanism that enables GPU data prefetching by using a simple predictor and improves bank-level parallelism. R. Ausavarungnirun et al. [3] propose a memory controller design that batches memory requests accessing the same row to improve row buffer locality and therefore DRAM performance. J. T. Adriaens et al. [2] propose a scheme that enables spatial multitasking on GPUs, which partitions the SMs among different applications to improve SM resource utilization and fairness. Z. Wang et al. [23] enable spatial multitasking within an SM, which supports multiple kernels in an SM by solving fragmentation issues and improves on-chip resource utilization by exploiting the heterogeneity of different kernels.

Improving thread-level parallelism on GPUs

Narasiman et al. [13] propose a two-level warp scheduler, which divides the warps into fetch groups to overlap computation with memory access. It skews the progress of different fetch groups so that some warps within one fetch group perform computation while warps within other fetch groups execute memory operations. Jog et al. [8] propose a scheduler called OWL, which increases latency-hiding ability by reducing cache contention and exploits DRAM bank-level parallelism to improve performance. OWL aims to mitigate long memory access latency by prioritizing a selected subset of blocks. A. Sethia et al. [21] propose a cache access re-execution system and memory-aware scheduling to improve system performance. Their scheme detects memory saturation and prioritizes the memory requests of one warp to create opportunities to overlap computation and memory accesses. Besides, it also allows warps to execute even under memory saturation. Once a warp fails to execute, an additional hardware queue is used to buffer the warp for future re-execution.

Chapter […]

Conclusion

In this work, we develop a locality-aware scheduler to improve the poor cache performance of modern GPGPU applications. We point out that cache access locality exists in different thread block combinations due to the different memory access behaviors of various GPGPU workloads, which is not addressed in current studies. Hence, a comprehensive approach is needed to improve performance by exploiting cache access locality in different GPGPU workloads. Therefore, we propose a software and hardware cooperative mechanism to estimate cache locality at the thread-block level and the warp level at runtime. Based on the estimated locality information, we design a locality-aware block scheduler, which maximizes cache reuse opportunities, and a locality-aware warp scheduler, which captures the performance improvement opportunities provided by the block scheduler. The experimental results show that our locality-aware scheduler can effectively reduce the L1 cache miss rate by at most […] and achieve a […] performance improvement on average compared to the state-of-the-art scheduling policies.

Bibliography

[1] K. M. Abdalla et al. Scheduling and execution of compute tasks. US Patent US […].

[2] J. T. Adriaens, K. Compton, N. S. Kim, and M. J. Schulte. The case for GPGPU spatial multitasking. In IEEE International Symposium on High-Performance Comp Architecture, pages […], Feb. […].

[3] R. Ausavarungnirun, K. K. W. Chang, L. Subramanian, G. H. Loh, and O. Mutlu. Staged memory scheduling: Achieving high performance and scalability in heterogeneous systems. In Computer Architecture (ISCA), […]th Annual International Symposium on, pages […], June […].

[4] A. Bakhoda, G. L. Yuan, W. W. L. Fung, H. Wong, and T. M. Aamodt. Analyzing CUDA workloads using a detailed GPU simulator. In Performance Analysis of Systems and Software, […]. ISPASS […]. IEEE International Symposium on, pages […], April […].

[5] S. Che, M. Boyer, J. Meng, D. Tarjan, J. W. Sheaffer, S. H. Lee, and K. Skadron. Rodinia: A benchmark suite for heterogeneous computing. In Workload Characterization, […]. IISWC […]. IEEE International Symposium on, pages […], Oct. […].

[6] M. Gebhart, D. R. Johnson, D. Tarjan, S. W. Keckler, W. J. Dally, E. Lindholm, and K. Skadron. Energy-efficient mechanisms for managing thread context in throughput processors. In Proceedings of the […]th Annual International Symposium on Computer Architecture, ISCA '[…], pages […], New York, NY, USA, […]. ACM.

[7] M. Gebhart, S. W. Keckler, B. Khailany, R. Krashinsky, and W. J. Dally. Unifying primary cache, scratch, and register file memories in a throughput processor. In […]th Annual IEEE/ACM International Symposium on Microarchitecture, pages […], Dec. […].

[8] A. Jog, O. Kayiran, N. Chidambaram Nachiappan, A. K. Mishra, M. T. Kandemir, O. Mutlu, R. Iyer, and C. R. Das. OWL: Cooperative thread array aware scheduling techniques for improving GPGPU performance. SIGPLAN Not., […], Mar. […].

[9] A. Jog, O. Kayiran, A. K. Mishra, M. T. Kandemir, O. Mutlu, R. Iyer, and C. R. Das. Orchestrated scheduling and prefetching for GPGPUs. In Proceedings of the […]th Annual International Symposium on Computer Architecture, ISCA '[…], pages […], New York, NY, USA, […]. ACM.

[10] O. Kayıran, A. Jog, M. T. Kandemir, and C. R. Das. Neither more nor less: Optimizing thread-level parallelism for GPGPUs. In Proceedings of the […]nd International Conference on Parallel Architectures and Compilation Techniques, pages […], Sept. […].

[11] M. Lee, S. Song, J. Moon, J. Kim, W. Seo, Y. Cho, and S. Ryu. Improving GPGPU resource utilization through alternative thread block scheduling. In […] IEEE […]th International Symposium on High Performance Computer Architecture (HPCA), pages […], Feb. […].

[12] D. Li, M. Rhu, D. R. Johnson, M. O'Connor, M. Erez, D. Burger, D. S. Fussell, and S. W. Redder. Priority-based cache allocation in throughput processors. In […] IEEE […]st International Symposium on High Performance Computer Architecture (HPCA), pages […], Feb. […].

[13] V. Narasiman, M. Shebanow, C. J. Lee, R. Miftakhutdinov, O. Mutlu, and Y. N. Patt. Improving GPU performance via large warps and two-level warp scheduling. In Proceedings of the […]th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO-[…], pages […], New York, NY, USA, […]. ACM.

[14] J. Nickolls and W. J. Dally. The GPU computing era. IEEE Micro, […], March […].

[15] NVIDIA. CUDA C/C++ SDK code samples, […].

[16] NVIDIA. Kepler GK110 whitepaper. http://www.nvidia.com/content/PDF/kepler/NVIDIA-Kepler-GK110-Architecture-Whitepaper.pdf, […].

[17] NVIDIA Corporation. NVIDIA CUDA Compute Unified Device Architecture Programming Guide. NVIDIA Corporation, […].

[18] M. S. Orr, B. M. Beckmann, S. K. Reinhardt, and D. A. Wood. Fine-grain task aggregation and coordination on GPUs. In Proceeding of the […]st Annual International Symposium on Computer Architecture, ISCA '[…], pages […], Piscataway, NJ, USA, […]. IEEE Press.

[19] T. G. Rogers, M. O'Connor, and T. M. Aamodt. Cache-conscious wavefront scheduling. In Proceedings of the […]th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO-[…], pages […], Washington, DC, USA, […]. IEEE Computer Society.

[20] T. G. Rogers, M. O'Connor, and T. M. Aamodt. Divergence-aware warp scheduling. In Proceedings of the […]th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO-[…], pages […], New York, NY, USA, […]. ACM.

[21] A. Sethia, D. A. Jamshidi, and S. Mahlke. Mascar: Speeding up GPU warps by reducing memory pitstops. In […] IEEE […]st International Symposium on High Performance Computer Architecture (HPCA), pages […], Feb. […].

[22] J. E. Stone, D. Gohara, and G. Shi. OpenCL: A parallel programming standard for heterogeneous computing systems. IEEE Des. Test, […], May […].

[23] Z. Wang, J. Yang, R. Melhem, B. Childers, Y. Zhang, and M. Guo. Simultaneous multikernel GPU: Multi-tasking throughput processors via fine-grained sharing. In […] IEEE International Symposium on High Performance Computer Architecture (HPCA), pages […], March […].

[24] X. Xie, Y. Liang, Y. Wang, G. Sun, and T. Wang. Coordinated static and dynamic cache bypassing for GPUs. In […] IEEE […]st International Symposium on High Performance Computer Architecture (HPCA), pages […], Feb. […].
