
4. Implementing OR1K with ASIP Meister

4.9 Generated VHDL Source Code

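Table 4 lists the generated CPU source files. The most important is ua_ctrl, the module that generates all control signals. Modules whose names begin with uf_ are the system units supplied by ASIPmeister. The others, such as ua_mux and ua_preg, are generated by ASIPmeister from the instruction descriptions to route data between the system units. Figure 16 shows the overall block diagram, in which the data transfers between system units can be traced. When an input of a system unit can be driven by more than one other unit, ASIPmeister automatically inserts a MUX. When a transfer crosses a pipeline stage boundary, ASIPmeister automatically inserts the required pipeline register.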
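The two insertion rules described above can be sketched in a few lines of Python. This is only an illustration of the rules as stated in the text, not ASIPmeister's actual algorithm; the function name `plan_interconnect`, the unit names, and the stage numbers are all hypothetical.

```python
from collections import defaultdict


def plan_interconnect(transfers, stage_of):
    """Decide where MUXes and pipeline registers are needed.

    transfers: list of (source_unit, dest_unit, dest_port) tuples.
    stage_of:  dict mapping each unit name to its pipeline stage index.
    """
    sources = defaultdict(set)
    for src, dst, port in transfers:
        sources[(dst, port)].add(src)

    # Rule 1: an input port with more than one possible source needs a MUX.
    muxes = sorted(key for key, srcs in sources.items() if len(srcs) > 1)

    # Rule 2: a transfer between units in different stages needs a pipeline register.
    pregs = sorted({(src, dst) for src, dst, _ in transfers
                    if stage_of[src] != stage_of[dst]})
    return muxes, pregs


# Hypothetical data path: ALU port "b" is fed by either the register file or an immediate.
transfers = [
    ("GPR", "ALU", "a"),
    ("GPR", "ALU", "b"),
    ("IMM", "ALU", "b"),
    ("ALU", "FLAG", "in"),
]
stage_of = {"GPR": 2, "IMM": 2, "ALU": 3, "FLAG": 3}

muxes, pregs = plan_interconnect(transfers, stage_of)
print("MUX needed at:", muxes)
print("Pipeline registers between:", pregs)
```

In this made-up example only port "b" of the ALU has two sources, so it is the only place a MUX appears, while the ALU-to-FLAG transfer stays inside one stage and needs no pipeline register.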

Table 4. CPU top-level VHDL modules

Figure 16. CPU top-level block diagram

4.10 Verification Method

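I ran the test program in Appendix 1 on the OR1K software simulator to obtain the execution log in Appendix 2. The machine code of the test program was then loaded into the VHDL test bench and simulated (Figure 17), and the changes in the PC and the registers were recorded (Table 5) to check whether the VHDL generated by ASIPmeister matches the results of the OR1K software simulator. Whenever the two disagreed, I went back to ASIPmeister and revised the design. In most cases the micro-operation descriptions were what had to be revised. Only after the VHDL simulation results agreed with the software simulator was the VHDL source used for chip design.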
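The comparison step can be automated along the following lines. This is a minimal sketch, assuming both the software simulator log and the RTL simulation have already been reduced to lists of (pc, register, value) tuples; that trace format and the function name `first_mismatch` are assumptions made for illustration, not part of the original flow.

```python
def first_mismatch(reference, rtl):
    """Return (index, reference_entry, rtl_entry) for the first difference, or None.

    Each trace is a list of (pc, register, value) tuples in execution order.
    """
    for index, (ref, got) in enumerate(zip(reference, rtl)):
        if ref != got:
            return index, ref, got
    if len(reference) != len(rtl):
        # One trace ended early; report the first entry that has no partner.
        shorter = min(len(reference), len(rtl))
        ref = reference[shorter] if shorter < len(reference) else None
        got = rtl[shorter] if shorter < len(rtl) else None
        return shorter, ref, got
    return None


# The first two register writes, taken from the simulator log in Appendix 2.
reference_trace = [
    (0x150, "r30", 0xFFFFFFFC),
    (0x154, "r29", 0xFFFFFFF5),
]
# A made-up RTL trace with a wrong result in the second instruction.
rtl_trace = [
    (0x150, "r30", 0xFFFFFFFC),
    (0x154, "r29", 0x00000000),
]

print(first_mismatch(reference_trace, rtl_trace))
```

Reporting only the first mismatch is a deliberate choice: once one micro-operation is wrong, later register values usually diverge as a consequence, so the earliest difference is the one worth inspecting.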

Table 5. Simulator results for the test program

5. Chip Implementation

5.1 Design Flow

ASIP design =>

VHDL code generated =>

Write test program and run OR1K simulator to get log file =>

RTL simulation using ModelSim =>

Synthesis using Synopsys Design Compiler =>

Add scan chain using DFT compiler =>

RTL with scan cell simulation using ModelSim =>

Run TetraMax to get coverage log =>

Placement using SOC Encounter =>

Pre-layout Gate level Simulation using ModelSim =>

APR using SOC Encounter =>

Post-layout Gate level simulation using ModelSim =>

Replace GDS with Virtuoso =>

DRC/LVS with Calibre

5.2 Simulation Results

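In addition to the simulation run before logic synthesis, we ran simulations after synthesis with the scan chain inserted, at pre-layout, and at post-layout (Figures 18, 19, and 20). Apart from some unknown values when accessing the external data memory, no timing violations such as setup or hold time errors occurred.

Figure 19. Gate-level simulation

Figure 20. Post-layout simulation

5.3 Chip Specifications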

Core Voltage: 1.8 V
I/O Voltage: 3.3 V
System Clock: 35 MHz (ASIPmeister estimate: 32.7 ns; synthesis before scan chain insertion: 14.39 ns; final: 28.45 ns; using 10 ns SRAM as external instruction and data memory)
Chip Area: 1.4 x 1.4 mm
Core Area: 1 x 1 mm
Power Consumption (simulated, toggle probability 0.5): 31 mW
Core Utilization: 50.7%
Core Gate Count: 32,653 gates (ASIPmeister estimate: 59.5K (typ); synthesis before scan chain insertion: 29.4K; final: 32.5K)
Area: 0.325 mm^2

To save cost, and because the design has only about 32K gates, we did not bring every bus signal out to a pad. Only 11 instruction address lines are brought out, the data bus uses only the low byte, and only two address lines are brought out. This is still enough to verify every instruction except the word and half-word load/store instructions. The die size was therefore fixed at 1.4 x 1.4 mm with 64 I/O pads. Although there is only one pair of power pads each for the I/O and the core, the power analysis in SOC Encounter with the toggle probability set to 0.5 reports only about 30 mW, and there are three sets of power stripes, so current density is not a concern.
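The delay figures quoted for the system clock can be converted to frequencies with a short calculation. The sketch below assumes each nanosecond figure is the minimum clock period at that design step; the dictionary labels are paraphrased from the specification and the helper name `period_ns_to_mhz` is hypothetical.

```python
def period_ns_to_mhz(period_ns):
    """Convert a clock period in nanoseconds to a frequency in MHz."""
    return 1000.0 / period_ns


# Delay figures from the chip specification, assumed to be minimum clock periods.
delays_ns = {
    "ASIPmeister estimate": 32.7,
    "synthesis before scan chain": 14.39,
    "final": 28.45,
}

for step, delay in delays_ns.items():
    print(f"{step}: {delay} ns -> {period_ns_to_mhz(delay):.1f} MHz")
```

Under that assumption the final 28.45 ns period corresponds to roughly 35 MHz, which matches the quoted system clock.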

5.4 Explanation of Layout Errors

5.5 Test Plan

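Verification requires a small chip-on-board (COB) board with the die bonded onto it, plus a small test board mounted above it for running the verification programs. The data memory is made dual-port through buffer circuits so that it can be used to download programs. Switches (SW) and LEDs serve as simple inputs and outputs. The glue logic converts the synchronous clk, request, and acknowledge signals into external memory read and memory write signals.

Because this verification system cannot be exercised until the chip fabricated through CIC is available, it is currently only a plan and has not yet been completed.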

Figure 22. Verification system block diagram (blocks: Glue Logic, uP Port, BUF, Instruction Memory, BUF, SW & LED, Chip COB with headers for LA)

6. Conclusion

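The main benefit of an ASIP design tool is that it shortens the design and modification cycle, so that the design that best meets the requirements can be chosen from among many candidates. According to [10], a student who had learned the PEAS-III tool (later renamed ASIPmeister) needed about 58 hours to design a 32-bit R3000 (Table 7) and only 13 hours to modify it into a 24-bit RISC processor. For this work, the learning time was not recorded. The first version of the design took about 12 working days, and the subsequent comparison against the OR1K software simulator and revision of the micro-operations took about another 12 days. The actual design effort was therefore about 24 x 8 = 192 hours, more than three times the 58 hours reported in [10]. In [10], modifying errors took only 2 hours, whereas I spent as long on it as on writing the first version, probably because I was unfamiliar with OR1K and had to handle data hazards.

Table 7. R3000 design time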
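The effort comparison above is simple arithmetic, reproduced here as a quick check. The 8-hour working day is the assumption implied by the "24 x 8" in the text; the variable names are illustrative only.

```python
HOURS_PER_DAY = 8          # assumption implied by the "24 x 8" figure in the text
first_version_days = 12    # writing the first version
debug_days = 12            # comparing against the simulator and fixing micro-operations
reference_hours = 58       # R3000 design time reported in [10]

total_hours = (first_version_days + debug_days) * HOURS_PER_DAY
ratio = total_hours / reference_hours

print(f"total effort: {total_hours} hours")
print(f"ratio to [10]: {ratio:.2f}")
```

The product works out to 192 hours, a little over three times the 58 hours in [10].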

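As for the synthesized gate count, Table 8 shows that the R3000 design in [10] synthesized to 30.8K gates. OR1K, like the R3000, is a 32-bit RISC, and my synthesis result is 32.6K gates. Because the two processors have similar functionality, their gate counts are also close. The implementation therefore agrees with the gate count reported in [10]; only the design time differs, being shorter in [10]. This supports the claim that an ASIP tool accelerates the design cycle: this design took 24 working days in total to produce complete RTL code, a substantial saving compared with writing it entirely by hand.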

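As for C compiler generation, we had no COSY compiler generator available for testing. Although ASIPmeister does generate the related files, we could neither verify the correctness of the C compiler nor tune the architecture for a specific application. Anyone who wants to go further and design an ASIP with this tool must therefore obtain an academic version of COSY from Target Compiler Technology or purchase it. Without COSY, the resulting ASIP has no usable C compiler, and the processor architecture cannot be tuned. I evaluated modifying the machine description file of the GNU C compiler myself, but that would take considerable time. In the end I decided to build only a general-purpose RISC so as not to have to develop a C compiler myself. ASIPmeister is meant for designing ASIPs, but because no C compiler could be generated, my implementation is limited to a general-purpose RISC. This is an area for future improvement.

Table 8. R3000 synthesis results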

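Once the complete OR1K BAS32 I instruction set has been implemented in ASIPmeister, removing unnecessary instructions or adding special instructions can be done very quickly. In the example in [10], after the 32-bit R3000 design was finished, the modification entry in Table 7 shows that changing it to 24-bit instructions took only 13 hours. Likewise, this design is easy to modify once complete and can be adapted to different requirements. This work confirms that designing a RISC processor with an ASIP approach offers a clear advantage in shortening design time, and that the generated RTL code gives acceptable results in gate count and speed.

References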

[1]. http://www.coware.com/products/processordesigner.php CoWare Company.
[2]. http://www.retarget.com/brfchschk.html Target Compiler Tech.
[3]. http://www.asip-solutions.com/ Asip-solutions Company.
[4]. http://www.ics.uci.edu/~express/ Expression home page.
[5]. Souvik Basu, Rajat Moona, "High Level Synthesis from Sim-nML Processor Models," 16th International Conference on VLSI Design (VLSID), p. 255, 2003.
[6]. http://www.opencores.org/projects.cgi/web/or1k/overview Or1k home page.
[7]. ASIP Meister User's Manual.
[8]. ASIP Meister Tutorial.
[9]. OpenRISC 1000 Architecture Manual, 26 June 2004.
[10]. A. Kitajima, M. Itoh, J. Sato, A. Shiomi, Y. Takeuchi, M. Imai, "Effectiveness of the ASIP design system PEAS-III in design of pipelined processors," Proceedings of the ASP-DAC 2001, Asia and South Pacific Design Automation Conference, pp. 649-654, 30 Jan.-2 Feb. 2001.

Appendix 1: Test Program

/* Basic instruction set test */
#include "../support/spr_defs.h"

/* target=uclinux-or32 start at 0x100 */
_main:
        l.addi  r9,r8,0x100

_mem:
        l.movhi r3,0x1234
        l.ori   r3,r3,0x5678
        l.sb    5(r31),r4
        l.andi  r8,r8,1
        l.add   r8,r8,r4
        l.add   r8,r8,r4
        l.lwz   r9,0(r31)
        l.add   r8,r9,r8
        l.sw    0(r31),r8
        l.lwz   r9,0(r31)
        l.movhi r3,0x4c69
        l.ori   r3,r3,0xe5f7
        l.add   r8,r8,r3
        l.or    r3,r0,r8
        l.nop   NOP_REPORT /* Should be */
        l.addi  r3,r0,0
        l.nop   NOP_EXIT

/* stack */
        .org 0x800
        .section .stack
        .space 0x100
_tmp_stack:

/* error */
        .org 0x900
_error:

Appendix 2: Software Simulation Log

r30=fffffffc r31=ffffffff r2 =00000003 00000150 l.sub r30,r31,r2
r29=fffffff5 r30=fffffffc r3 =00000007 00000154 l.sub r29,r30,r3
r28=ffffffe6 r29=fffffff5 r4 =0000000f 00000158 l.sub r28,r29,r4
r27=ffffffc7 r28=ffffffe6 r5 =0000001f 0000015c l.sub r27,r28,r5
r26=ffffff88 r27=ffffffc7 r6 =0000003f 00000160 l.sub r26,r27,r6
r25=ffffff09 r26=ffffff88 r7 =0000007f 00000164 l.sub r25,r26,r7
r24=fffffe0a r25=ffffff09 r8 =000000ff 00000168 l.sub r24,r25,r8
r23=fffffc0b r24=fffffe0a r9 =000001ff 0000016c l.sub r23,r24,r9
r22=fffff80c r23=fffffc0b r10=000003ff 00000170 l.sub r22,r23,r10
r21=fffff00d r22=fffff80c r11=000007ff 00000174 l.sub r21,r22,r11
r20=ffffe00e r21=fffff00d r12=00000fff 00000178 l.sub r20,r21,r12
r19=ffffc00f r20=ffffe00e r13=00001fff 0000017c l.sub r19,r20,r13
r18=ffff8010 r19=ffffc00f r14=00003fff 00000180 l.sub r18,r19,r14
r17=ffff0011 r18=ffff8010 r15=00007fff 00000184 l.sub r17,r18,r15
r16=ffff0012 r17=ffff0011 r16=ffff0012 00000188 l.sub r16,r17,r16
r3 =ffff0012 r16=ffff0012 0000018c l.or r3,r0,r16
r8 =00000111 r8 =00000111 r4 =00000012 000001b0 l.add r8,r8,r4
EA =0000000b r4 =00000012 000001b4 l.sb 0xb(r31),r4
r4 =00000034 EA =00000005 000001b8 l.lbz r4,0x5(r31)
r8 =00000145 r8 =00000145 r4 =00000034 000001bc l.add r8,r8,r4
EA =0000000a r4 =00000034 000001c0 l.sb 0xa(r31),r4
r4 =00000056 EA =00000006 000001c4 l.lbz r4,0x6(r31)
r8 =0000019b r8 =0000019b r4 =00000056 000001c8 l.add r8,r8,r4
EA =00000009 r4 =00000056 000001cc l.sb 0x9(r31),r4

r4 =00000078 EA =00000007 000001d0 l.lbz r4,0x7(r31)
r8 =00000213 r8 =00000213 r4 =00000078 000001d4 l.add r8,r8,r4
EA =00000008 r4 =00000078 000001d8 l.sb 0x8(r31),r4
r4 =00000078 EA =00000008 000001dc l.lbs r4,0x8(r31)
r8 =0000028b r8 =0000028b r4 =00000078 000001e0 l.add r8,r8,r4
EA =00000007 r4 =00000078 000001e4 l.sb 0x7(r31),r4
r4 =00000056 EA =00000009 000001e8 l.lbs r4,0x9(r31)
r8 =000002e1 r8 =000002e1 r4 =00000056 000001ec l.add r8,r8,r4
EA =00000006 r4 =00000056 000001f0 l.sb 0x6(r31),r4
r4 =00000034 EA =0000000a 000001f4 l.lbs r4,0xa(r31)
r8 =00000315 r8 =00000315 r4 =00000034 000001f8 l.add r8,r8,r4
EA =00000005 r4 =00000034 000001fc l.sb 0x5(r31),r4
r4 =00000012 EA =0000000b 00000200 l.lbs r4,0xb(r31)
r8 =00000327 r8 =00000327 r4 =00000012 00000204 l.add r8,r8,r4
EA =00000004 r4 =00000012 00000208 l.sb 0x4(r31),r4
r4 =00001234 EA =00000004 0000020c l.lhz r4,0x4(r31)
r8 =0000155b r8 =0000155b r4 =00001234 00000210 l.add r8,r8,r4
EA =0000000a r4 =00001234 00000214 l.sh 0xa(r31),r4
r4 =00005678 EA =00000006 00000218 l.lhz r4,0x6(r31)
r8 =00006bd3 r8 =00006bd3 r4 =00005678 0000021c l.add r8,r8,r4
EA =00000008 r4 =00005678 00000220 l.sh 0x8(r31),r4
r4 =00005678 EA =00000008 00000224 l.lhs r4,0x8(r31)
r8 =0000c24b r8 =0000c24b r4 =00005678 00000228 l.add r8,r8,r4
EA =00000006 r4 =00005678 0000022c l.sh 0x6(r31),r4
r4 =00001234 EA =0000000a 00000230 l.lhs r4,0xa(r31)
r8 =0000d47f r8 =0000d47f r4 =00001234 00000234 l.add r8,r8,r4
EA =00000004 r4 =00001234 00000238 l.sh 0x4(r31),r4
r4 =12345678 EA =00000004 0000023c l.lwz r4,0x4(r31)
r8 =12352af7 r8 =12352af7 r4 =12345678 00000240 l.add r8,r8,r4
r3 =12352af7 r8 =12352af7 00000244 l.or r3,r0,r8
00000248 l.nop 0x2

r9 =ffff0012 EA =00000000 0000024c l.lwz r9,0x0(r31)
r8 =12342b09 r9 =ffff0012 r8 =12342b09 00000250 l.add r8,r9,r8
EA =00000000 r8 =12342b09 00000254 l.sw 0x0(r31),r8
r7 =fffffffe r5 =ffffffff r3 =00000001 0000026c l.sub r7,r5,r3
r8 =00000002 r3 =00000001 r5 =ffffffff 00000270 l.sub r8,r3,r5
r8 =00000000 r8 =00000000 r7 =fffffffe 00000274 l.add r8,r8,r7
r9 =00000003 r3 =00000001 r4 =00000002 00000278 l.add r9,r3,r4
r7 =fffffffa r9 =00000003 r7 =fffffffa 0000027c l.mul r7,r9,r7
r8 =fffffffa r8 =fffffffa r7 =fffffffa 00000280 l.add r8,r8,r7
r3 =fffffffa r8 =fffffffa 00000284 l.or r3,r0,r8
00000288 l.nop 0x2

--- 100 instruction ---
r9 =12342b09 EA =00000000 0000028c l.lwz r9,0x0(r31)
r8 =12342b03 r9 =12342b09 r8 =12342b03 00000290 l.add r8,r9,r8
EA =00000000 r8 =12342b03 00000294 l.sw 0x0(r31),r8
r3 =00000001 00000298 l.addi r3,r0,0x1
r4 =00000002 0000029c l.addi r4,r0,0x2
r5 =ffffffff 000002a0 l.addi r5,r0,-1

r6 =ffffffff 000002a4 l.addi r6,r0,-1
r8 =00000000 000002a8 l.addi r8,r0,0x0
r8 =00000000 r8 =00000000 000002ac l.andi r8,r8,0x1
r8 =00000000 r8 =00000000 r3 =00000001 000002b0 l.and r8,r8,r3
r8 =00005a5a r5 =ffffffff 000002b4 l.xori r8,r5,-23131
r8 =ffffa5a5 r8 =ffffa5a5 r5 =ffffffff 000002b8 l.xor r8,r8,r5
r8 =ffffa5a7 r8 =ffffa5a7 000002bc l.ori r8,r8,0x2
r8 =ffffa5a7 r8 =ffffa5a7 r4 =00000002 000002c0 l.or r8,r8,r4
r3 =ffffa5a7 r8 =ffffa5a7 000002c4 l.or r3,r0,r8
000002c8 l.nop 0x2

r9 =12342b03 EA =00000000 000002cc l.lwz r9,0x0(r31)
r8 =1233d0aa r9 =12342b03 r8 =1233d0aa 000002d0 l.add r8,r9,r8
EA =00000000 r8 =1233d0aa 000002d4 l.sw 0x0(r31),r8
r8 =ffffff00 r8 =ffffff00 r4 =00000002 000002f0 l.sll r8,r8,r4
r8 =03fffffc r8 =03fffffc 000002f4 l.srli r8,r8,0x6
r8 =00ffffff r8 =00ffffff r4 =00000002 000002f8 l.srl r8,r8,r4
r8 =003fffff r8 =003fffff 000002fc l.srai r8,r8,0x2
r8 =000fffff r8 =000fffff r4 =00000002 00000300 l.sra r8,r8,r4
r3 =000fffff r8 =000fffff 00000304 l.or r3,r0,r8
00000308 l.nop 0x2

r9 =1233d0aa EA =00000000 0000030c l.lwz r9,0x0(r31)
r8 =1243d0a9 r9 =1233d0aa r8 =1243d0a9 00000310 l.add r8,r9,r8
EA =00000000 r8 =1243d0a9 00000314 l.sw 0x0(r31),r8
r8 =00000200 r8 =00000200 r4 =00000200 0000032c l.add r8,r8,r4
r3 =00000001 r4 =00000200 00000330 l.sfeq r3,r4
r4 =00000200 r5 =ffffffff 00000334 l.andi r4,r5,0x200
r8 =00000400 r8 =00000400 r4 =00000200 00000338 l.add r8,r8,r4
r3 =00000001 r3 =00000001 0000033c l.sfne r3,r3
r4 =00000200 r5 =ffffffff 00000340 l.andi r4,r5,0x200
r8 =00000600 r8 =00000600 r4 =00000200 00000344 l.add r8,r8,r4
r3 =00000001 r4 =00000200 00000348 l.sfne r3,r4
r4 =00000200 r5 =ffffffff 0000034c l.andi r4,r5,0x200
r8 =00000800 r8 =00000800 r4 =00000200 00000350 l.add r8,r8,r4
r3 =00000001 r3 =00000001 00000354 l.sfgtu r3,r3
r4 =00000200 r5 =ffffffff 00000358 l.andi r4,r5,0x200
r8 =00000a00 r8 =00000a00 r4 =00000200 0000035c l.add r8,r8,r4
r3 =00000001 r4 =00000200 00000360 l.sfgtu r3,r4
r4 =00000200 r5 =ffffffff 00000364 l.andi r4,r5,0x200
r8 =00000c00 r8 =00000c00 r4 =00000200 00000368 l.add r8,r8,r4
r3 =00000001 r3 =00000001 0000036c l.sfgeu r3,r3
r4 =00000200 r5 =ffffffff 00000370 l.andi r4,r5,0x200
r8 =00000e00 r8 =00000e00 r4 =00000200 00000374 l.add r8,r8,r4
r3 =00000001 r4 =00000200 00000378 l.sfgeu r3,r4

r4 =00000200 r5 =ffffffff 0000037c l.andi r4,r5,0x200
r8 =00001000 r8 =00001000 r4 =00000200 00000380 l.add r8,r8,r4
r3 =00000001 r3 =00000001 00000384 l.sfltu r3,r3
r4 =00000200 r5 =ffffffff 00000388 l.andi r4,r5,0x200
r8 =00001200 r8 =00001200 r4 =00000200 0000038c l.add r8,r8,r4
r3 =00000001 r4 =00000200 00000390 l.sfltu r3,r4
r4 =00000200 r5 =ffffffff 00000394 l.andi r4,r5,0x200
r8 =00001400 r8 =00001400 r4 =00000200 00000398 l.add r8,r8,r4
r3 =00000001 r3 =00000001 0000039c l.sfleu r3,r3
r4 =00000200 r5 =ffffffff 000003a0 l.andi r4,r5,0x200
r8 =00001600 r8 =00001600 r4 =00000200 000003a4 l.add r8,r8,r4
r3 =00000001 r4 =00000200 000003a8 l.sfleu r3,r4
r4 =00000200 r5 =ffffffff 000003ac l.andi r4,r5,0x200
r8 =00001800 r8 =00001800 r4 =00000200 000003b0 l.add r8,r8,r4
r3 =00000001 r3 =00000001 000003b4 l.sfgts r3,r3
r4 =00000200 r5 =ffffffff 000003b8 l.andi r4,r5,0x200
r8 =00001a00 r8 =00001a00 r4 =00000200 000003bc l.add r8,r8,r4
r3 =00000001 r4 =00000200 000003c0 l.sfgts r3,r4
r4 =00000200 r5 =ffffffff 000003c4 l.andi r4,r5,0x200
r8 =00001c00 r8 =00001c00 r4 =00000200 000003c8 l.add r8,r8,r4
r3 =00000001 r3 =00000001 000003cc l.sfges r3,r3
r4 =00000200 r5 =ffffffff 000003d0 l.andi r4,r5,0x200
r8 =00001e00 r8 =00001e00 r4 =00000200 000003d4 l.add r8,r8,r4
r3 =00000001 r4 =00000200 000003d8 l.sfges r3,r4
r4 =00000200 r5 =ffffffff 000003dc l.andi r4,r5,0x200
r8 =00002000 r8 =00002000 r4 =00000200 000003e0 l.add r8,r8,r4
r3 =00000001 r3 =00000001 000003e4 l.sflts r3,r3
r4 =00000200 r5 =ffffffff 000003e8 l.andi r4,r5,0x200
r8 =00002200 r8 =00002200 r4 =00000200 000003ec l.add r8,r8,r4
r3 =00000001 r4 =00000200 000003f0 l.sflts r3,r4
r4 =00000200 r5 =ffffffff 000003f4 l.andi r4,r5,0x200
r8 =00002400 r8 =00002400 r4 =00000200 000003f8 l.add r8,r8,r4
r3 =00000001 r3 =00000001 000003fc l.sfles r3,r3
r4 =00000200 r5 =ffffffff 00000400 l.andi r4,r5,0x200
r8 =00002600 r8 =00002600 r4 =00000200 00000404 l.add r8,r8,r4
r3 =00000001 r4 =00000200 00000408 l.sfles r3,r4
r4 =00000200 r5 =ffffffff 0000040c l.andi r4,r5,0x200
r8 =00002800 r8 =00002800 r4 =00000200 00000410 l.add r8,r8,r4
r3 =00002800 r8 =00002800 00000414 l.or r3,r0,r8
00000418 l.nop 0x2

--- 200 instruction ---
r9 =1243d0a9 EA =00000000 0000041c l.lwz r9,0x0(r31)
r8 =1243f8a9 r9 =1243d0a9 r8 =1243f8a9 00000420 l.add r8,r9,r8
EA =00000000 r8 =1243f8a9 00000424 l.sw 0x0(r31),r8

00000450 l.bf 0x2
r8 =1243f8b3 r9 =1243f8a9 r8 =1243f8b3 000004a0 l.add r8,r9,r8
EA =00000000 r8 =1243f8b3 000004a4 l.sw 0x0(r31),r8
r9 =1243f8b3 EA =00000000 000004a8 l.lwz r9,0x0(r31)
r3 =4c690000 000004ac l.movhi r3,0x4c69
r3 =4c69e5f7 r3 =4c69e5f7 000004b0 l.ori r3,r3,0xe5f7
r8 =5eaddeaa r8 =5eaddeaa r3 =4c69e5f7 000004b4 l.add r8,r8,r3
r3 =5eaddeaa r8 =5eaddeaa 000004b8 l.or r3,r0,r8
000004bc l.nop 0x2

r3 =00000000 000004c0 l.addi r3,r0,0x0
000004c4 l.nop 0x1
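
Each record in the log above appears to consist of up to three name=value fields (a register or EA, the effective address), followed by the program counter and the disassembled instruction. The following Python sketch parses one such record; the regular expressions and the function name `parse_record` are assumptions based on the visible layout, not a format specification from the OR1K simulator.

```python
import re

# A field looks like "r30=fffffffc", "r2 =00000003", or "EA =0000000b".
FIELD = re.compile(r"(r\d+|EA)\s*=([0-9a-f]{8})")
# A record ends with an 8-digit hex PC followed by an "l." mnemonic and optional operands.
RECORD = re.compile(
    r"^(?P<fields>.*?)(?P<pc>[0-9a-f]{8})\s+(?P<insn>l\.\S+(?:\s+\S+)?)\s*$"
)


def parse_record(line):
    """Parse one log record into a dict, or return None for non-record lines."""
    match = RECORD.match(line.strip())
    if match is None:
        return None
    fields = [(name, int(value, 16))
              for name, value in FIELD.findall(match.group("fields"))]
    return {
        "pc": int(match.group("pc"), 16),
        "insn": match.group("insn"),
        "fields": fields,
    }


record = parse_record(
    "r30=fffffffc r31=ffffffff r2 =00000003 00000150 l.sub r30,r31,r2"
)
print(hex(record["pc"]), record["insn"], record["fields"])
```

Marker lines such as "--- 100 instruction ---" do not match the record pattern and yield None, so they can be skipped when building a trace for comparison with the RTL simulation.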
