
Design and Implementation of an ALU Cluster

3.2 Design and Emulation for the AHB Slave Wrapper of Intellectual Property

3.2.1 Architecture of AHB Slave Wrapper

The proposed wrapper consists of two components, a finite state machine (FSM) and an address generation unit (AGU), as shown in Figure 3.2.1. The FSM receives the signals from the AMBA bus and manages the IP so that it obeys the AHB slave protocol. It tells the AGU to produce the necessary addresses, whether the ALU cluster is accessed in incrementing or wrapping burst mode. One signal, alu_work, runs from the ALU cluster to the FSM and indicates whether the ALU cluster has finished executing the media application. With this information from the AMBA bus and the ALU cluster, the FSM can control the ALU cluster: writing data, reading data, and executing instructions.

Figure 3.2.1: Architecture of AHB Slave Wrapper
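
The address sequence that an AGU has to produce for incrementing and wrapping bursts can be illustrated with a short sketch. It follows the general AHB burst addressing rule (a wrapping burst wraps at a boundary of beats times transfer size) and is not the wrapper's actual logic; the function names and the parameters size_bytes and beats are assumptions introduced only for this illustration.

```python
def next_burst_address(addr: int, size_bytes: int, beats: int, wrapping: bool) -> int:
    """Return the address of the next beat in an AHB-style burst.

    size_bytes: bytes per transfer (a power of two), hypothetical parameter
    beats:      burst length, e.g. 4, 8 or 16, hypothetical parameter
    """
    step = addr + size_bytes
    if not wrapping:
        return step                      # incrementing burst: plain increment
    wrap_bytes = size_bytes * beats      # size of the wrap window
    base = addr & ~(wrap_bytes - 1)      # aligned start of the wrap window
    return base | (step & (wrap_bytes - 1))


def burst_addresses(start: int, size_bytes: int, beats: int, wrapping: bool) -> list:
    """List every beat address of one burst, starting at `start`."""
    addrs = [start]
    for _ in range(beats - 1):
        addrs.append(next_burst_address(addrs[-1], size_bytes, beats, wrapping))
    return addrs


if __name__ == "__main__":
    # Four-beat word bursts starting at 0x38.
    print([hex(a) for a in burst_addresses(0x38, 4, 4, wrapping=True)])
    print([hex(a) for a in burst_addresses(0x38, 4, 4, wrapping=False)])
```

For a four-beat word burst starting at 0x38, the wrapping sequence stays inside the 16-byte window 0x30 to 0x3F, while the incrementing sequence simply continues past 0x3F.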

3.2.1.1 Finite State Machine of AHB Slave Wrapper

The FSM of the wrapper controls the state of the IP and responds to requests from the AMBA bus. It handles reads and writes over the on-chip bus and activates the ALU cluster. It is responsible for ensuring that the IP can communicate with other modules on the AMBA bus, so it must return the prescribed responses to the bus. It also decodes the input signals from the AMBA bus and tells the ALU cluster whether to execute an application or remain in its current state. The FSM is therefore designed with six states: Idle, Accessible, ALU_Work, Un-readable Wait, Un-writable Wait, and Error, as shown in Figure 3.2.2.

• Idle

When the IP is not being accessed and the ALU cluster has finished executing, the FSM is in the Idle state. It leaves this state when the bus is granted and the IP is accessed, or when the ALU cluster is activated. It returns to the Idle state after the IP has completed its work or after an error. In this state the wrapper is ready to receive signals from the AMBA bus and prepare for the next operation. The FSM can leave Idle only when HTRANS equals NONSEQ; otherwise it stays in Idle. When NONSEQ is seen, HWRITE identifies which operation is requested of the IP, and the FSM moves to the corresponding state.

• Accessible

The FSM moves directly from Idle to the Accessible state if HTRANS equals NONSEQ and HWRITE is high. In the Accessible state, the FSM continuously checks whether the IP is accessed again. Two kinds of access keep the FSM in the Accessible state. In the first, HTRANS is still NONSEQ, which means the IP is accessed at a different address. In the second, HTRANS equals SEQ, which means the access continues the previous one as part of a wrapping or incrementing burst.

Three cases move the FSM out of this state: HTRANS changes to IDLE, HTRANS changes to BUSY, or the read data is not ready. These cases mean, respectively, that the access has finished, that the master is busy during a write, and that a read is requested but the data is not ready. They move the FSM to Idle, Un-writable Wait, and Un-readable Wait, respectively.

Figure 3.2.2: Finite State Machine

• Un-readable Wait

The FSM can enter the Un-readable Wait state along two paths.

The first path is taken when the FSM is in the Idle state, HTRANS is NONSEQ, and HWRITE is low, which means the IP is being read. The first read operation needs two cycles to prepare the data, so the FSM must wait in the Un-readable Wait state. Once the data is ready, the FSM moves to the Accessible state to serve the following read requests. The second path goes from the Accessible state to the Un-readable Wait state, for the same reason as the first: the required latency.

• Un-writable Wait

Only one condition forces the FSM into the Un-writable Wait state: the IP is being written in a wrapping or incrementing burst and HTRANS changes to BUSY. Once HTRANS changes back to NONSEQ or SEQ, the FSM returns to the Accessible state.

• Error

The FSM goes to the Error state when the IP is accessed incorrectly, through either an invalid address or an invalid transaction. An invalid address arises because the depth of the data and instruction memories is limited: if an access exceeds the actual memory depth, the FSM detects it and goes to the Error state. The invalid-transaction check guards against combinations of HTRANS that violate the AHB protocol, although these should not occur in practice. The Error state is also entered if the IP is being accessed and is unexpectedly deselected, as controlled by HSEL. This provides robustness against accidents. When an error occurs, the Error state must obey the AHB protocol and therefore issues a two-cycle response, driving the proper HRESP and HREADY onto the AMBA bus as defined in the specification.
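
As an illustration of the two-cycle response, the sketch below lists the (HRESP, HREADY) pairs a slave drives over the two cycles, following the general AHB rule that HREADY is low in the first cycle and high in the second while HRESP holds the response code. The function name two_cycle_response is hypothetical, and the sketch says nothing about how the wrapper sequences these cycles internally.

```python
# HRESP encodings as defined in the AMBA AHB specification.
HRESP_OKAY, HRESP_ERROR, HRESP_RETRY, HRESP_SPLIT = 0b00, 0b01, 0b10, 0b11


def two_cycle_response(hresp: int) -> list:
    """Return the (HRESP, HREADY) pairs for a two-cycle response.

    Cycle 1: the response code is driven with HREADY low.
    Cycle 2: the same response code is held with HREADY high.
    """
    if hresp == HRESP_OKAY:
        raise ValueError("OKAY is a single-cycle response")
    return [(hresp, 0), (hresp, 1)]


if __name__ == "__main__":
    for cycle, (hresp, hready) in enumerate(two_cycle_response(HRESP_ERROR), 1):
        print(f"cycle {cycle}: HRESP={hresp:02b} HREADY={hready}")
```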

• ALU_Work

The ALU_Work state indicates that the IP is in service: the ALU cluster executes the application according to its instructions. The only path into this state is from the Idle state. The ALU cluster is invoked by a strict method: writing the end value of the program counter to a predefined location in the IP. The details are given in Appendix B. In this state, if the wrapper is accessed, whether for a read or a write, the FSM replies to the AMBA bus with the two-cycle RETRY response. Meanwhile, the ALU cluster keeps working, unaffected by the unexpected access, until the application ends. An internal check in the ALU cluster determines whether execution has finished, and the wrapper obtains this status through the alu_work signal.
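
The transitions described for the six states can be summarized as a next-state function. This is a minimal behavioral sketch, not the wrapper's RTL. The flags addr_valid, read_ready and start_alu are hypothetical stand-ins for the address check, the read-latency condition and the write to the predefined start location. The sketch also assumes that alu_work is high while the ALU cluster is still executing, and it collapses the two-cycle Error response into a single step back to Idle.

```python
from enum import Enum

# HTRANS encodings as defined in the AMBA AHB specification.
HTRANS_IDLE, HTRANS_BUSY, HTRANS_NONSEQ, HTRANS_SEQ = 0b00, 0b01, 0b10, 0b11


class State(Enum):
    IDLE = "Idle"
    ACCESSIBLE = "Accessible"
    ALU_WORK = "ALU_Work"
    UNREADABLE_WAIT = "Un-readable Wait"
    UNWRITABLE_WAIT = "Un-writable Wait"
    ERROR = "Error"


def next_state(state, htrans, hwrite, *, addr_valid=True, read_ready=True,
               start_alu=False, alu_work=False):
    """Return the next FSM state (all keyword flags are hypothetical)."""
    if state is State.IDLE:
        if htrans != HTRANS_NONSEQ:
            return State.IDLE                  # only NONSEQ leaves Idle
        if not addr_valid:
            return State.ERROR                 # access beyond the memory depth
        if not hwrite:
            return State.UNREADABLE_WAIT       # first read needs extra cycles
        return State.ALU_WORK if start_alu else State.ACCESSIBLE
    if state is State.ACCESSIBLE:
        if htrans == HTRANS_IDLE:
            return State.IDLE                  # access finished
        if htrans == HTRANS_BUSY:
            return State.UNWRITABLE_WAIT       # described for burst writes
        if not addr_valid:
            return State.ERROR
        if not hwrite and not read_ready:
            return State.UNREADABLE_WAIT       # read data not ready yet
        return State.ACCESSIBLE                # NONSEQ or SEQ keeps the state
    if state is State.UNREADABLE_WAIT:
        return State.ACCESSIBLE if read_ready else State.UNREADABLE_WAIT
    if state is State.UNWRITABLE_WAIT:
        resumed = htrans in (HTRANS_NONSEQ, HTRANS_SEQ)
        return State.ACCESSIBLE if resumed else State.UNWRITABLE_WAIT
    if state is State.ALU_WORK:
        return State.ALU_WORK if alu_work else State.IDLE
    return State.IDLE                          # Error returns to Idle


if __name__ == "__main__":
    s = State.IDLE
    for htrans in (HTRANS_NONSEQ, HTRANS_SEQ, HTRANS_BUSY, HTRANS_SEQ, HTRANS_IDLE):
        s = next_state(s, htrans, hwrite=True)
        print(s.value)
```

The small driver at the end walks through a burst write that is interrupted by one BUSY cycle and then completes.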

One point deserves emphasis. Because an instruction must pass through several stages to complete, writing back the result takes additional cycles. The number of extra cycles depends on the functional unit.

For example, the ALU is a two-stage pipelined functional unit, so it takes six cycles (two plus four) to finish an operation. The four additional cycles are required for every operation and cover steps such as instruction decoding, data source selection, and result writing. Likewise, the four-stage functional unit, MUL, takes eight cycles, and the sixteen-cycle functional unit, DIV, takes twenty cycles to write back its result.
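
The cycle counts above all follow one rule: the functional unit's own depth plus the four cycles common to every operation. The sketch below simply restates that arithmetic; the names UNIT_CYCLES and writeback_cycles are introduced here for illustration only.

```python
COMMON_CYCLES = 4  # decoding, source selection, result writing (from the text)

# Depth of each functional unit as stated in the text.
UNIT_CYCLES = {"ALU": 2, "MUL": 4, "DIV": 16}


def writeback_cycles(unit: str) -> int:
    """Total cycles until the result of `unit` is written back."""
    return UNIT_CYCLES[unit] + COMMON_CYCLES


if __name__ == "__main__":
    for unit in UNIT_CYCLES:
        print(unit, writeback_cycles(unit))
```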
