SYSTEM DEVELOPMENT ENVIRONMENT - A Study of Task Grouping in Network-on-Chip Systems

CHAPTER 2 PRELIMINARY

2.4 SYSTEM DEVELOPMENT ENVIRONMENT

The system development environment of an NoC system is important because it strongly affects overall system performance. Building it is a complex problem with many details to consider. We developed a design flow to address this problem and to find a feasible solution.

Figure 5 shows the NoC system design flow. The applications are the target applications to be implemented on the NoC system; they can be developed in a system modeling language. The architecture platform model describes our hardware architecture. For example, the processors can be modeled by an instruction set simulator (ISS) and the switches by a system modeling language. These models are analyzed, and the flow then proceeds to the mapping process, which partitions the applications into tasks and maps the tasks onto the physical processing elements (PEs). The last step analyzes the performance and reports the result, which lets us confirm that the applications work well on the NoC architecture.

Figure 5. A NoC system design flow (stages: Applications and Architecture Platform Modeling, Analysis, Mapping Process, Performance Analysis)

Figure 6. The flow chart of the mapping process (recoverable box label: Input Task Graph)

Figure 6 shows the flow chart of the mapping process in our design flow. The first step converts the algorithm into a task graph, which can be done with the decomposition techniques used in parallel processing. The next step analyzes the fitness of the task graph: we check the iteration bound and perform memory optimization, and then check whether the design meets the iteration bound and the memory requirement. If the check fails, the flow goes back to the algorithm stage so that the algorithm can be modified. If the check passes, the flow continues with the task mapping step, in which we cluster the tasks into groups to reduce the number of nodes in the task graph. The scheduling step follows, in which all tasks on each processing element are scheduled. The physical place and route step then decides the physical PE of every task group. Finally, we run a system simulation to check the feasibility of the result. If this check passes, the flow stops.

In the algorithm development stage, which precedes the mapping process, the algorithm should be written to expose as much parallelism as possible, so that the system is easier to parallelize in the following stages.

The algorithm is then converted into a task graph. The task graph is not scheduled at this stage, but the target throughput must be fixed. From the throughput we can derive the computing power requirement and the buffer requirement. The generated task graph must state the resource requirement of every task and of every communication between tasks.
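As an illustration of such an annotated task graph, the following minimal sketch stores per-task and per-channel figures and scales them by the target throughput. The class names, the field names, and the units (cycles per iteration, bytes per iteration, iterations per second) are assumptions made for this example; they are not taken from the thesis.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    cycles_per_iteration: int  # assumed profiling figure, hypothetical unit
    memory_bytes: int          # local memory the task needs


@dataclass
class Channel:
    src: str
    dst: str
    bytes_per_iteration: int   # data sent from src to dst per iteration


@dataclass
class TaskGraph:
    throughput: float          # target iterations per second
    tasks: dict = field(default_factory=dict)
    channels: list = field(default_factory=list)

    def add_task(self, name, cycles_per_iteration, memory_bytes):
        self.tasks[name] = Task(name, cycles_per_iteration, memory_bytes)

    def add_channel(self, src, dst, bytes_per_iteration):
        if src not in self.tasks or dst not in self.tasks:
            raise KeyError("both endpoints must already be tasks in the graph")
        self.channels.append(Channel(src, dst, bytes_per_iteration))

    def compute_requirement(self, name):
        # Cycles per second needed to sustain the target throughput.
        return self.tasks[name].cycles_per_iteration * self.throughput

    def bandwidth_requirement(self, channel):
        # Bytes per second the channel must carry at the target throughput.
        return channel.bytes_per_iteration * self.throughput


# Hypothetical two-task pipeline running at 30 iterations per second.
graph = TaskGraph(throughput=30.0)
graph.add_task("read", cycles_per_iteration=1_000_000, memory_bytes=4096)
graph.add_task("filter", cycles_per_iteration=2_000_000, memory_bytes=8192)
graph.add_channel("read", "filter", bytes_per_iteration=1024)
```

Keeping the throughput on the graph rather than on each task means that a change of the target rate updates every derived requirement consistently.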

In the analysis stage, we convert the algorithm into a timed Petri net model, so that we can analyze the algorithm and find its iteration loops. An iteration loop cannot be split across different tasks because of its data dependencies.

If an algorithm contains a large iteration loop whose resource requirement exceeds a PE’s capacity, the algorithm is not feasible in our system. Finding the iteration loops and checking whether they meet the iteration bound is therefore necessary. After this stage, we can be sure that every task fits within a PE’s computing power capacity.
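The loop capacity check described above can be sketched on a plain directed dependency graph. This is a simplification: it does not build the timed Petri net model used in the thesis and it does not compute the iteration bound itself. It treats every strongly connected component that contains a cycle as an iteration loop, found here with Tarjan's algorithm, and compares the summed computing requirement of the loop with the PE capacity. The function names, the graph encoding, and the numbers are assumptions for illustration.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm. graph maps a node to a list of its successors."""
    index, lowlink = {}, {}
    stack, on_stack = [], set()
    components = []
    counter = [0]

    def visit(v):
        index[v] = lowlink[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, []):
            if w not in index:
                visit(w)
                lowlink[v] = min(lowlink[v], lowlink[w])
            elif w in on_stack:
                lowlink[v] = min(lowlink[v], index[w])
        if lowlink[v] == index[v]:
            # v is the root of a component: pop the component off the stack.
            component = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.append(w)
                if w == v:
                    break
            components.append(component)

    for v in graph:
        if v not in index:
            visit(v)
    return components


def infeasible_loops(graph, compute, pe_capacity):
    """Return the loops whose total computing requirement exceeds one PE."""
    bad = []
    for component in strongly_connected_components(graph):
        has_cycle = len(component) > 1 or component[0] in graph.get(component[0], [])
        if has_cycle and sum(compute[t] for t in component) > pe_capacity:
            bad.append(sorted(component))
    return bad


# Hypothetical graph: b and c form an iteration loop.
deps = {"a": ["b"], "b": ["c"], "c": ["b", "d"], "d": []}
need = {"a": 10, "b": 40, "c": 50, "d": 20}
fits = infeasible_loops(deps, need, pe_capacity=100)
too_big = infeasible_loops(deps, need, pe_capacity=80)
```

With these numbers the loop formed by b and c needs 90 units, so it fits a PE of capacity 100 but not one of capacity 80. The recursive traversal is adequate for small task graphs; a very deep graph would need an iterative version.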

We also run memory optimization in the analysis stage. On-chip memory is expensive in our hardware architecture, so the memory of every task must be handled carefully. The memory optimization manages the buffer memory between tasks. In this stage we must also make sure that the memory requirement of each task is less than a PE’s capacity.

After the analysis stage, we check whether every task meets the resource constraints. That is, at least one solution must exist in which every task fits into a single PE. If this check fails, the flow goes back to the algorithm development stage, where the designers modify the algorithm to meet this requirement.
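A minimal sketch of this per-task check follows. It assumes each task is described by a (computing requirement, memory requirement) pair and that a PE has one capacity figure for each; the function name, the task names, and the numbers are hypothetical.

```python
def tasks_violating_constraints(tasks, pe_compute, pe_memory):
    """Return the names of tasks that cannot fit into a single PE on their own."""
    return sorted(
        name
        for name, (compute, memory) in tasks.items()
        if compute > pe_compute or memory > pe_memory
    )


# Hypothetical tasks: name -> (computing requirement, memory in bytes).
tasks = {"fft": (80, 2048), "huffman": (120, 1024), "dct": (60, 8192)}
violations = tasks_violating_constraints(tasks, pe_compute=100, pe_memory=4096)
```

A non-empty result corresponds to the failing branch of the check, which sends the flow back to algorithm development.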

Next is the task clustering stage. The input of this stage is a task graph, and the goal is to cluster the tasks into groups. Every group must meet the constraints of a PE, so that each group can be placed into a single PE. This problem is the focus of the rest of this thesis. The output of this stage is a grouping of the tasks in which every group fits into a PE.
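To make the input and output of this stage concrete, the sketch below uses a simple greedy heuristic: it visits communication edges from heaviest to lightest and merges the two groups at the ends of an edge whenever the merged group still fits within a PE. This is only an illustrative baseline and not the clustering method developed in this thesis. It also assumes that the requirements of the tasks in a group simply add up. All names and numbers are hypothetical.

```python
def greedy_cluster(compute, memory, traffic, pe_compute, pe_memory):
    """Merge groups along the heaviest edges while each group fits one PE.

    compute, memory: task name -> requirement
    traffic: (src, dst) -> communication volume between the two tasks
    """
    group_of = {t: t for t in compute}    # task -> id of its group
    members = {t: [t] for t in compute}   # group id -> tasks in the group

    # Heaviest communication first, so the most traffic stays inside a PE.
    for (u, v), _ in sorted(traffic.items(), key=lambda kv: -kv[1]):
        gu, gv = group_of[u], group_of[v]
        if gu == gv:
            continue
        merged = members[gu] + members[gv]
        fits = (sum(compute[t] for t in merged) <= pe_compute
                and sum(memory[t] for t in merged) <= pe_memory)
        if fits:
            members[gu] = merged
            del members[gv]
            for t in merged:
                group_of[t] = gu

    return sorted(sorted(group) for group in members.values())


# Hypothetical four-task chain a -> b -> c -> d.
compute = {"a": 30, "b": 30, "c": 50, "d": 40}
memory = {"a": 1, "b": 1, "c": 1, "d": 1}
traffic = {("a", "b"): 100, ("b", "c"): 80, ("c", "d"): 10}
groups = greedy_cluster(compute, memory, traffic, pe_compute=100, pe_memory=4)
```

Here a and b are merged first. Adding c would need 110 units of computing power, so c is grouped with d instead, and the edge from b to c becomes traffic between PEs.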

After the grouping stage, we run the scheduling procedure, which decides the execution order of the tasks in the same group. If the schedule is not arranged properly, extra buffer memory or computing power is required to guarantee that the throughput of the applications is met.
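One simple way to obtain a valid execution order inside a group is a topological sort of the dependencies that stay within the group, shown below with Kahn's algorithm. This sketch assumes that the dependencies considered inside a group are acyclic, for instance because feedback edges of iteration loops are left out, and it does not try to minimize buffer memory. The function name and the example graph are hypothetical.

```python
import heapq


def schedule_group(group, edges):
    """Return an execution order for the tasks of one group.

    group: iterable of task names placed on the same PE
    edges: iterable of (src, dst) dependencies of the whole task graph
    """
    group = set(group)
    indegree = {t: 0 for t in group}
    successors = {t: [] for t in group}
    for src, dst in edges:
        # Dependencies on tasks in other groups do not constrain the local order.
        if src in group and dst in group:
            successors[src].append(dst)
            indegree[dst] += 1

    # A heap keeps the result deterministic: ties are broken by task name.
    ready = [t for t in group if indegree[t] == 0]
    heapq.heapify(ready)
    order = []
    while ready:
        task = heapq.heappop(ready)
        order.append(task)
        for nxt in successors[task]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                heapq.heappush(ready, nxt)

    if len(order) != len(group):
        raise ValueError("cyclic dependency inside the group")
    return order


# Hypothetical dependencies; task x belongs to another group.
edges = [("a", "c"), ("b", "c"), ("c", "d"), ("x", "a")]
order = schedule_group(["d", "c", "b", "a"], edges)
```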

The next stage is the place and route (PnR) stage. In this stage, we place the task groups onto physical PEs and decide the communication routing between PEs. Because the physical mapping is now fixed, we can calculate the actual communication utilization. If no feasible solution can be found in this stage, we go back to task binding and tighten the communication constraint.
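The utilization calculation can be sketched once a topology and a routing policy are fixed. The example below assumes a 2D mesh with dimension-order (X then Y) routing; neither is stated in this section, and both are chosen only for illustration. It accumulates the bandwidth carried by each directed link and lists the links whose load exceeds a given capacity. All names and numbers are hypothetical.

```python
from collections import defaultdict


def xy_route(src, dst):
    """Directed links used by X-then-Y routing between two mesh coordinates."""
    (x, y), (tx, ty) = src, dst
    links = []
    while x != tx:
        nx = x + (1 if tx > x else -1)
        links.append(((x, y), (nx, y)))
        x = nx
    while y != ty:
        ny = y + (1 if ty > y else -1)
        links.append(((x, y), (x, ny)))
        y = ny
    return links


def link_loads(placement, group_traffic):
    """Sum the bandwidth that every link carries for a given placement.

    placement: group name -> (x, y) coordinate of its PE
    group_traffic: (src group, dst group) -> required bandwidth
    """
    loads = defaultdict(float)
    for (g_src, g_dst), bandwidth in group_traffic.items():
        for link in xy_route(placement[g_src], placement[g_dst]):
            loads[link] += bandwidth
    return dict(loads)


def overloaded_links(loads, link_capacity):
    """Links whose accumulated load exceeds the capacity of one link."""
    return sorted(link for link, load in loads.items() if load > link_capacity)


# Hypothetical placement of three task groups on a 2x2 mesh.
placement = {"G0": (0, 0), "G1": (1, 0), "G2": (1, 1)}
group_traffic = {("G0", "G1"): 40, ("G0", "G2"): 70, ("G1", "G2"): 20}
loads = link_loads(placement, group_traffic)
hot = overloaded_links(loads, link_capacity=100)
```

In this example the link from (0, 0) to (1, 0) carries both flows that leave G0 and ends up with a load of 110, above the assumed capacity of 100. A result like this corresponds to the case where the placement is infeasible and the flow returns to an earlier stage.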

Finally, we run a simulation to verify the run-time behavior of the whole system. The simulation checks the scheduling of the tasks and the communication conditions. Once this test passes, our system design flow is finished.

Chapter 3 A Task Clustering
