Chapter 4 Implementation and Simulation Result
4.1. System Integration
4.1.2. N903-S Model
The CPU selected for this work is the AndesCore N903-S. To protect the HDL IP, RTL simulation uses a behavioral model of the core, called the AMP model. The N903-S AMP model is a binary model of the released firm core, produced by Cadence IP Model Packager. An AMP model communicates with the simulator either through the IEEE Std 1499 Open Model Interface (OMI) protocol or through PLI via the OMI Adaptor. In this work, the CPU model uses the OMI protocol.
4.1.3. AHB Bus Protocol
For CPU compatibility and ease of connecting IP cores, the AMBA AHB bus is chosen as the system bus. A typical AMBA AHB system contains the following components: AHB masters, AHB slaves, an AHB arbiter, and an AHB decoder. These components are connected by a central multiplexor.
An AHB master sends address and control signals to a slave to perform read and write operations. Only one master is allowed to use the bus at any one time, and an AHB bus supports at most 16 masters. An AHB slave receives the address and control signals from the master and responds to the read or write operation. The AHB arbiter decides which master can use the bus and ensures that only one bus master at a time is allowed to initiate data transfers.
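The arbiter's role can be illustrated with a minimal C sketch. It assumes a fixed-priority policy in which the lowest-numbered requesting master wins; the text does not state the arbitration policy, so this choice, and the names ahb_arbitrate and ahb_grant_vector, are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

#define AHB_MAX_MASTERS 16   /* AHB supports at most 16 bus masters */
#define AHB_NO_GRANT    (-1)

/*
 * Return the index of the master to grant, given a bit mask of pending
 * HBUSREQ lines (bit i set = master i requests the bus).
 * Assumption: fixed priority, lowest index wins. This is only one
 * possible policy, not necessarily the one used in this platform.
 */
int ahb_arbitrate(uint16_t hbusreq)
{
    for (int m = 0; m < AHB_MAX_MASTERS; m++) {
        if (hbusreq & (1u << m)) {
            return m;            /* exactly one master is granted */
        }
    }
    return AHB_NO_GRANT;         /* no request pending */
}

/* One-hot grant vector derived from ahb_arbitrate(): at most one bit set. */
uint16_t ahb_grant_vector(uint16_t hbusreq)
{
    int m = ahb_arbitrate(hbusreq);
    assert(m < AHB_MAX_MASTERS);
    return (m == AHB_NO_GRANT) ? 0u : (uint16_t)(1u << m);
}
```

For example, if masters 1 and 2 request at the same time (mask 0x0006), this sketch grants master 1 only, which reflects the rule that a single master owns the bus at any one time.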
The AHB decoder is used to decode the address of each transfer and provide a select signal for the slave that is involved in the transfer.
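The address decoding step can be sketched in C as a lookup over a memory map. The map below is hypothetical: region bases, sizes, and names are assumptions, chosen only so that 0x95000014 (the DCO control register address quoted in Section 4.3) falls inside the second region.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One entry of a hypothetical memory map: [base, base + size). */
typedef struct {
    uint32_t    base;
    uint32_t    size;
    const char *name;
} ahb_region_t;

/* Illustrative map only; the real platform map is not given in the text. */
static const ahb_region_t ahb_map[] = {
    { 0x00000000u, 0x00100000u, "memory" },
    { 0x95000000u, 0x00001000u, "SDPLL bus interface" },
};

#define AHB_NUM_SLAVES (sizeof ahb_map / sizeof ahb_map[0])

/* Return the index of the slave selected by haddr, or -1 if unmapped. */
int ahb_decode(uint32_t haddr)
{
    for (size_t i = 0; i < AHB_NUM_SLAVES; i++) {
        /* Unsigned subtraction wraps when haddr < base, so a single
         * comparison checks both ends of the region. */
        if (haddr - ahb_map[i].base < ahb_map[i].size) {
            return (int)i;
        }
    }
    return -1;
}

/* One-hot select vector: bit i set means slave i is selected. */
uint32_t ahb_hsel(uint32_t haddr)
{
    int s = ahb_decode(haddr);
    assert(s < (int)AHB_NUM_SLAVES);
    return (s < 0) ? 0u : (1u << s);
}
```

The same slave index can also drive the read data and response multiplexor, so the master sees only the signals of the slave involved in the transfer.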
All bus masters drive the address and control signals for the transfer they wish to perform, and the arbiter determines which master has its address and control signals routed to all of the slaves. The decoder is also required to control the read data and response signal multiplexor, which selects the appropriate signals from the slave involved in the transfer. The central multiplexor interconnection scheme is shown in Fig. 4.2.
Fig. 4.2 AMBA central multiplexor interconnection
Before performing a bus transfer, an AHB master must assert the HBUSREQ signal to request the bus from the AHB arbiter. The arbiter then determines which requesting master has the highest priority and grants it the bus. The granted master drives the address and control signals to access the slaves. Every slave on the bus receives the address and control signals, but only the selected slave is accessed and responds to the master. A simple AHB transfer is shown in Fig. 4.3. The transfer consists of two parts: the address phase and the data phase. The master sends the address and control signals to the slave in the address phase, and the slave responds to the master in the data phase. Note that the data phase may be extended when the slave is not ready; the slave extends it by driving the HREADY signal low.
Fig. 4.3 AHB simple transfer
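The two-phase transfer described above can be sketched as a small cycle-counting model in C. The slave model and its wait_states field are hypothetical; the sketch only assumes that the address phase takes one HCLK cycle and that the data phase lasts until the slave drives HREADY high.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical slave model: HREADY stays low for `wait_states` cycles
 * of the data phase, then goes high to complete the transfer. */
typedef struct {
    unsigned wait_states;
} ahb_slave_model_t;

/*
 * Count the HCLK cycles occupied by one simple transfer:
 * one address-phase cycle, plus a data phase that is extended
 * for as long as the slave holds HREADY low.
 */
unsigned ahb_simple_transfer_cycles(const ahb_slave_model_t *slave)
{
    assert(slave != NULL);

    unsigned cycles = 1;     /* address phase */
    unsigned waited = 0;
    bool hready = false;

    while (!hready) {        /* data phase, one iteration per HCLK cycle */
        hready = (waited >= slave->wait_states);
        if (!hready) {
            waited++;        /* slave inserted a wait state */
        }
        cycles++;
    }
    return cycles;
}
```

With zero wait states the model reports two cycles (one per phase); each wait state adds one cycle to the data phase.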
4.1.4. AHB Interface
To integrate the DCO and the error detector into the AMBA-based system, a bus interface is needed to translate signals between the DCO / error detector and the AHB bus.
Fig. 4.4 shows the block diagram of the bus interface. According to the proposed SDPLL algorithm, the CPU sets the control signal of the DCO and receives the error value from the error detector. The DCO write control part passes the DCO_CTW signal to the DCO to generate the proper frequency and determines the mode of the DCO (frequency search stage / phase tracking stage). The error detector read control part passes the divide value to the frequency divider of the error detector and sets the mode of the error detector (frequency search stage / phase tracking stage). The error value estimated by the TDC and the phase lead / lag signal are also passed to the bus through the error detector read control part. All of these read and write operations complete in one cycle. Table 1 compares the bus interface of this work with the OpenRISC-based bus interface. Because the AMBA protocol is more complex, the area increases by about 13% (2194 um^2 versus 1935 um^2), while the number of cycles per operation is the same as for the OpenRISC-based bus interface. With the AHB interface implemented, the DCO and the error detector can be integrated into the AMBA-based system.
Fig. 4.4 Block diagram of AHB interface
                | Set control word to DCO | Read from error detector | Bus protocol | Area*     | Process
This work       | 1 cycle                 | 1 cycle                  | AMBA2.0      | 2194 um^2 | 90nm CMOS
OpenRISC-based  | 1 cycle                 | 1 cycle                  | WISHBONE     | 1935 um^2 | 90nm CMOS
Table 1 Comparison of the bus interface
4.1.5. AHB Interconnection
In this work, the bus interconnection is based on the Example AMBA System (EASY) architecture, which provides a typical AMBA platform for SoC design.
Since the EASY platform is integrated with ARM7 and ARM9 series CPUs and several default IPs, it must be modified to meet the system requirements of the SDPLL platform. Fig. 4.5 shows the bus interconnection of the ARM-based SDPLL platform.
Fig. 4.5 Bus interconnection of ARM-based SDPLL platform
4.2. Software Programming
4.2.1. Software development flow and hardware simulation environment
For software development, the programmer needs a toolchain to transform source code into an executable program. The Andes toolchain is built from the GNU toolchain, so the options of gcc, as, and ld are inherited [9] [10]. The cross-compiler compiles the C program, and the assembler and linker convert the assembly program into an a.out file. By default, the N903-S starts fetching instructions from memory address 0x0, so the -Ttext=0 linker switch asks the linker to place the start of the final executable image at pc=0x0. Hardware / software co-simulation also requires a development flow. The C program is compiled by the nds32le-elf-gcc cross-compiler. For hardware simulation, however, the compiled program must be converted into a memory image that can be loaded into the memory model. This conversion is performed by nds32-elf-aout2mem. Fig. 4.6 shows the conversion flow.
@000006ec 0dee7fff 51fe0000 05ce0002 51ff801c
@000006f0 4a007820 51fffe2c 3bfffead 51ffffec
@000006f4 51cf8018 46095000 140e0002 46095000
@000006f8 58000014 140e0004 46095000 58000024
@000006fc 140e0003 46095000 58000b84 140e0006
1bc4: 51 ff fe 2c addi $sp,$sp,#-468
[ -f rom_c.exe ] && nds32-elf-aout2mem rom_c.exe NDSROM.dat
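The relation between the memory image and the disassembly in the excerpt above can be checked with a short C sketch. It assumes that each "@" value is a 32-bit word index and that each line holds four words; this is inferred from the excerpt (the word 51fffe2c at index 0x6f1 corresponds to the disassembly line at byte address 0x1bc4), not from the tool documentation. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define WORDS_PER_LINE 4u
#define BYTES_PER_WORD 4u

/*
 * Parse one "@index w0 w1 w2 w3" line of the memory image.
 * Returns the number of data words read (0..4), or -1 if the line
 * does not start with an "@index" field.
 */
int parse_mem_line(const char *line, uint32_t *word_index,
                   uint32_t words[WORDS_PER_LINE])
{
    unsigned a = 0;
    unsigned w[WORDS_PER_LINE] = { 0, 0, 0, 0 };

    int n = sscanf(line, "@%x %x %x %x %x", &a, &w[0], &w[1], &w[2], &w[3]);
    if (n < 1) {
        return -1;
    }
    *word_index = (uint32_t)a;
    for (int i = 0; i < n - 1; i++) {
        words[i] = (uint32_t)w[i];
    }
    return n - 1;
}

/* Byte address of the word at position `offset` within a line.
 * Assumption: the "@" value counts 32-bit words, not bytes. */
uint32_t byte_address(uint32_t word_index, unsigned offset)
{
    assert(offset < WORDS_PER_LINE);
    return (word_index + offset) * BYTES_PER_WORD;
}
```

Applied to the line starting with @000006f0, the second word is 0x51fffe2c and its byte address evaluates to 0x1bc4, which agrees with the addi instruction shown in the disassembly line.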
The resulting memory image, NDSROM.dat, is loaded into the memory model by the Verilog $readmemh() system task. When simulation starts, the CPU fetches instructions from memory address 0x0, jumps to the <c_star> routine, and executes the SDPLL algorithm. Fig. 4.7 shows the hardware simulation flow.
sed -e "s,\$$NDS_HOME,$$NDS_HOME," < ../flist.amp | grep -v "#" > flist
@[ -f NDSROM.dat ] || (echo ERROR: NDSROM.dat does not exist; exit 1)
LD_LIBRARY_PATH=$(AMP_LIB_PATH):$(LD_LIBRARY_PATH)
Because the SDPLL tracking algorithm accesses the error detector and the DCO by writing values to control registers, memory-mapped I/O is used in this work. Memory-mapped I/O control works as follows. First, define the device base address. Second, declare a pointer variable with the volatile keyword and assign the base address to it. Note that the volatile qualifier must be used when reading a memory location whose value can change without the knowledge of the current program. Third, use this pointer in software to read or write the registers of the IP cores.
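The three steps above can be sketched in C as follows. The address 0x95000014 is the DCO control register address quoted in Section 4.3; the macro, type, and function names are hypothetical, and the address of the error detector register is not given in the text, so it is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Step 1: define the device address.
 * 0x95000014 is the DCO control register address from Section 4.3;
 * the macro name is an assumption. */
#define SDPLL_DCO_CTW_ADDR 0x95000014u

/* Step 2: a volatile-qualified pointer type, so the compiler performs
 * every access and never caches the register value. */
typedef volatile uint32_t *reg32_t;

/* Step 3: read and write the register through the pointer. */
static inline void reg_write(reg32_t reg, uint32_t value)
{
    assert(reg != NULL);
    *reg = value;
}

static inline uint32_t reg_read(reg32_t reg)
{
    assert(reg != NULL);
    return *reg;
}

/* Target-only helper: writes a control word to the DCO register.
 * It must not be called on a host PC, where this address is unmapped. */
void sdpll_set_ctw_on_target(uint32_t ctw)
{
    reg32_t dco = (reg32_t)(uintptr_t)SDPLL_DCO_CTW_ADDR;
    reg_write(dco, ctw);
}
```

On a host machine the accessors can be exercised with an ordinary volatile variable standing in for the register, for example by writing 0x8ff80000 with reg_write and reading it back with reg_read.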
4.3. Simulation Result
This section presents the simulation results of the SACA, the error detector, and the DCO. These IPs are implemented with the UMC 90nm standard cell library. Fig. 4.x is the simulation waveform of the SACA module. The user-defined divide value is 8, which means that eight clock cycles, synchronized to the rising edge of the reference clock, are generated in one reference clock period. The SACA can also generate the clock when the reference clock has an unbalanced duty cycle. Fig. 4.8 shows the SACA module working with a reference clock that has a 40% / 60% duty cycle.
Fig. 4.8 Simulation of SACA with 10MHz reference clock and 83.87MHz output clock
The simulation result of the error detector is shown in Fig. 4.9. After the divided clock is fed back to the error detector, the error detector generates a digital value for CTW mapping and determines whether the divided clock leads the reference clock. Part (a) of Fig. 4.9 shows the TDC receiving the phase error between the reference clock and the divided clock and outputting the error value.
Fig. 4.10 shows the simulation result of DCO control. The DCO output clock period is changed by the CTW sent from the CPU. In part (b) of Fig. 4.10, HCLK, HADDR, HWDATA, and HWRITE come from the AMBA AHB bus. The CPU writes the CTW 0x8ff80000 to address 0x95000014, which is the address of the DCO control register. After the CPU sends these signals, the DCO control bits C1 are set to 0x000001ff and the DCO output clock period changes.
Fig. 4.10 DCO control simulation result
Chapter 5 Conclusion and Future Work
In this thesis, we have introduced the basic SDPLL concept and the proposed ARM-based SDPLL platform. By integrating a CPU with silicon IPs, the proposed SDPLL architecture is software-controllable and programmable, which makes both the hardware architecture and the software operation flexible.
The work can be extended in the following directions. The CPU and the memory are currently behavioral models. They must be implemented as gate-level logic so that system verification can be done with precise timing information. After that, a silicon implementation can verify this SoC system with a real chip design.
Bibliography
[1] Terng-Yin Hsu, Bai-Jue Shieh, Chen-Yi Lee, “An all-digital phase-locked loop (ADPLL)-based clock recovery circuit,” IEEE Journal of Solid-State Circuits, Volume 34, Issue 8, Aug. 1999, Page(s): 1063-1073
[2] Chang-Ying Chuang, Terng-Yin Hsu, “The study of Software-defined Phase-locked loop,” Thesis, CS, NCTU, 2008
[3] Ze-Bin Huang, Terng-Yin Hsu, “The study of MIMO Software-defined Phase-locked Loop,” Thesis, CS, NCTU, 2009
[4] “AndesCore N903-S Integration Guide,” IG0005-10, Oct. 2008
[5] “AndesCore N903-S Verification Guide,” VG0005-10, Aug. 2008
[6] “AndesCore N903-S Data Sheet,” DS0005-10, Nov. 2008
[7] “AMBA Specification,” ARM IHI0011A, May 1999
[8] Jung-Chin Lai, Terng-Yin Hsu, “The study of Wideband, Cell-based Digital Controlled Oscillator and its Implementation,” Thesis, CS, NCTU, 2007
[9] “Andes Programming Guide,” PR002, Jun. 2009
[10] “AndeSight User Manual,” UM017, Jun. 2009