Chapter 2 Related Works
2.4 Commercial RTOSes
There are many commercial real-time embedded kernels on the market, such as Windows CE [13], Nucleus [1], VxWorks [23], QNX [16], and Lynx [11]. They support real-time applications and are suitable for embedded systems. However, all of them are proprietary, and some do not even make their source code available. Seed is an open source project, so it is free of royalties and buyout fees.
Chapter 3
Design and Implementation
In this chapter, we describe the design goals and the implementation of the Seed kernel. In Section 3.1, we first give an overview of the kernel. Then, we describe each Seed component in Sections 3.2 through 3.8. Finally, we describe the implementation status in Section 3.9.
3.1 Kernel Overview
The Seed OS kernel is designed for embedded and real-time systems. Because of the limited memory and CPU resources of embedded systems and the timing requirements of real-time systems, Seed has the following features:
• Flexibility
Since embedded systems are application-specific, it is important to keep the kernel as flexible as possible. The Seed kernel divides its code into several components for flexibility. Each component can be replaced, removed, or modified without rewriting the whole kernel. The interfaces and files of each kernel component are explicitly defined. In addition to making the kernel component-based, we keep each Seed component as flexible and simple as we can. For example, when we create a task, we can specify its time-slice value, whether it is preemptive or non-preemptive, and so on. Furthermore, the exported interfaces of Seed allow these values to be changed at run-time.
• Deterministic Timing (Real-Time support)
Real-time systems care not only about the correctness of a computation, but also about when the computation completes. Therefore, a key requirement of a real-time kernel is deterministic timing. This means that the kernel services should consume only expected amounts of time. In a non-real-time kernel, services may inject random delays into the application and thus cause unexpected response times. In contrast, real-time kernels (including Seed) have deterministic timing behavior. Furthermore, real-time kernels should offer constant (load-independent) timing. In other words, a service takes the same time to complete its job irrespective of the workload. Constant timing was a consideration throughout the development of the Seed kernel. With constant or deterministic timing, it is possible to analyze the worst-case performance of real-time software.
• Portability
Seed explicitly divides the kernel source code into a hardware-dependent part and a hardware-independent part. The former is called the Hardware Abstraction Layer (HAL). The HAL abstracts the underlying hardware and hence makes Seed portable. To port Seed to another hardware platform, only the HAL has to be modified. No other component needs to be changed.
• High performance
Since the applications in an embedded system are expected to cooperate with the kernel, there is little need to implement multiple protection modes. Thus Seed uses a single protection mode (i.e., kernel mode) for good performance. A traditional OS, such as Linux, adopts a dual-mode scheme (i.e., user mode and kernel mode) for kernel protection. Under this scheme, additional code is needed for changing protection domains. According to previous research [4], a single protection mode reduces the time spent on system calls. In addition, for the sake of better performance, the Seed kernel is implemented in C rather than in object-oriented languages such as C++ and Java.
Figure 3.1 shows the architecture of the Seed system. As shown in the figure, the applications run on top of the OS, and the hardware is under the control of the OS. Typical components of an OS are the TCP/IP stack, file systems, window systems, and so on. However, the kernel (e.g., the Seed kernel) is the real nucleus of the whole operating system. The kernel is the system resource manager that allocates resources (such as CPU time, memory, and I/O devices) to the tasks. As shown in the right part of Figure 3.1, Seed has the following kernel components to manage the system:
• Task management
• Interrupt management
• Memory management
• Timer management
• Message queue management
• Semaphore management
• Hardware Abstraction Layer (HAL)
The features of these components are described from Section 3.2 to Section 3.8.
Figure 3.1 Seed Kernel Architecture
[Figure: layered architecture diagram. Labels: Application; Seed OS; LWIP (TCP/IP); Seed Kernel; Hardware (Samsung SNDS100). Kernel components: Task, Interrupt, Memory, Timer, Semaphore, Msg. Queue, Hardware Abstraction Layer (HAL).]
3.2 Task Management
3.2.1 Design
A task (also called a process or a thread) is an instance of a program in execution. An application may divide its work into tasks, each of which is responsible for a portion of the whole job. Each task has a Task Control Block (TCB), which contains the CPU registers, the stack, and so on. The Seed kernel provides the following task management features.
• Multi-tasking
Multi-tasking is the ability to support multiple concurrent tasks running on the same CPU. It creates pseudo-parallelism and maximizes the use of the CPU. In addition, multi-tasking provides a modular construction mechanism for applications, which makes application programs easier to design and maintain.
• Multiple priorities
The application designer can assign a priority to each task when it is created. The priority ranges from 0 to 511, where 0 is the highest priority and 511 is the lowest priority. Seed always schedules the task with the highest priority to run.
• Preemptive
Preemptive multi-tasking means that the running task can be interrupted at any time by another task with a higher priority. In contrast, with non-preemptive multi-tasking, scheduling happens only when a task completes or explicitly releases the CPU. The Seed kernel supports both kinds of multi-tasking. If we do not want a task to be preempted, we can specify the task as non-preemptive. In a real-time system, preemptive multi-tasking is preferred because it gives faster system response.
• Constant time scheduling
Seed always selects the highest priority task to run. In non-real-time kernels, the time spent by the scheduler on choosing the next task to run is usually non-deterministic. Some real-time kernels, including Seed, allow the task scheduler to find the task that should run next in a short, constant time (i.e., O(1) time). We explain the details of the task scheduling mechanism in Section 3.2.2.
• Time-Slicing (Round-Robin scheduling)
Seed allows two or more tasks to have the same priority. Each such task runs for a fixed amount of time (called a quantum), and then the scheduler selects another task with the same priority to run. The time quantum can be assigned when a task is created, or changed at run-time.
Note that time-slicing is disabled if the task is non-preemptive.
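The time-slicing rules above can be sketched in C as follows. This is only an illustration under stated assumptions, not Seed's code: the type slice_state_t and the function time_slice_tick are hypothetical names, and the quantum is assumed to be counted in timer ticks. A time-slice value of zero is treated as "time-slicing disabled", which is the behavior this chapter later gives for Seed_Change_Time_Slice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-task time-slicing state (not Seed's actual TCB layout). */
typedef struct {
    unsigned time_slice;   /* quantum in timer ticks; 0 disables time-slicing */
    unsigned remaining;    /* ticks left in the current quantum */
    bool     preemptive;   /* time-slicing applies only to preemptive tasks */
} slice_state_t;

/* Called once per timer tick for the running task.
 * Returns true when the quantum has expired and the scheduler should
 * move the task to the end of its same-priority list. */
bool time_slice_tick(slice_state_t *t)
{
    assert(t != NULL);
    if (!t->preemptive || t->time_slice == 0)
        return false;                 /* time-slicing disabled for this task */
    if (t->remaining > 0)
        t->remaining--;
    if (t->remaining == 0) {
        t->remaining = t->time_slice; /* reload for the task's next turn */
        return true;
    }
    return false;
}
```

With a quantum of two ticks, the first call returns false and the second returns true; a non-preemptive task never triggers a rotation.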
At any given time, a Seed task is in one of the following states: create, running, ready, suspend, or terminate. As shown in Figure 3.2, a task enters the create state when it is created. When the task is inserted into the ready queue¹ and is waiting for execution, it is in the ready state. Once the scheduler selects the task to execute, the task goes to the running state. When the task is suspended and is waiting for certain system resources, it goes into the suspend state. The task is resumed and enters the ready state when the resource becomes available. Finally, the task goes to the terminate state when it has been killed or its job is completed.
¹ The tasks that are ready for execution are kept on a list called the ready queue.
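Figure 3.2 Task States
[Figure: state diagram with the states Create, Ready, Running, Suspend, and Terminate. Transition labels: Create_Task ( ); Insert into ready queue; Scheduler selects task to run; Resume_Task ( ); Terminate_Task ( ); Task is terminated, or the job is completed.]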
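The task life cycle described above can be written down as a small transition check. This is a sketch, not Seed's implementation: the names task_state_t and task_transition_allowed are hypothetical. Two groups of transitions are assumptions rather than statements from the text: running to ready (on preemption or time-slice expiry), and terminate from the ready or suspend state (a task being killed while not running).

```c
#include <assert.h>
#include <stdbool.h>

/* The five task states named in the text. */
typedef enum {
    TASK_CREATE,
    TASK_READY,
    TASK_RUNNING,
    TASK_SUSPEND,
    TASK_TERMINATE,
    TASK_STATE_COUNT
} task_state_t;

/* Returns true if moving from `from` to `to` is a legal transition. */
bool task_transition_allowed(task_state_t from, task_state_t to)
{
    assert((int)from >= 0 && (int)from < TASK_STATE_COUNT);
    assert((int)to >= 0 && (int)to < TASK_STATE_COUNT);
    switch (from) {
    case TASK_CREATE:
        return to == TASK_READY;            /* inserted into the ready queue */
    case TASK_READY:
        return to == TASK_RUNNING           /* selected by the scheduler */
            || to == TASK_TERMINATE;        /* assumption: killed while ready */
    case TASK_RUNNING:
        return to == TASK_READY             /* assumption: preempted */
            || to == TASK_SUSPEND           /* waits for a resource */
            || to == TASK_TERMINATE;        /* killed or job completed */
    case TASK_SUSPEND:
        return to == TASK_READY             /* resumed */
            || to == TASK_TERMINATE;        /* assumption: killed while suspended */
    default:
        return false;                       /* terminate is a final state */
    }
}
```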
3.2.2 Implementation
In this section, we describe the implementation of the Seed scheduler and the task ready queue.
We implemented the Seed scheduler in a fashion similar to the μC/OS-II scheduler [10]. However, we extended it to support more priorities (i.e., 512 priorities) while keeping the scheduling time constant. As shown in Figure 3.3, we represent the 512 task priorities in an 8 × 8 × 8 cube data structure, Priority_Ready_Table. The Priority_Ready_Table is an 8 × 8 array of 64 elements, where each element is an 8-bit bitmap. Each bit indicates the existence of ready tasks with the corresponding priority. For example, in Priority_Ready_Table [0][0], the binary value 00001000 means that there is at least one ready task with priority 3. To determine which task to run, the scheduler selects the lowest priority number that has its bit set in the Priority_Ready_Table. For the sake of efficiency, we use two data structures as the indexes of this array, Priority_Ready_Row_Groups and Priority_Ready_Col_Groups. Each of them is an 8-bit bitmap, and each bit corresponds to a priority group. Priority_Ready_Row_Groups is the row index of this array, and Priority_Ready_Col_Groups is the column index. For example, if bit 0 of Priority_Ready_Row_Groups and bit 0 of Priority_Ready_Col_Groups are set, there is at least one task with a priority between 0 and 7 ready for execution. This is because the two indexes point to element 0 of the array (i.e., Priority_Ready_Table [0][0]), whose bitmap stands for priorities 0 through 7.
Figure 3.3 Data Structures for Task Scheduling
[Figure: the 8-bit bitmaps Priority_Ready_Row_Groups and Priority_Ready_Col_Groups index into Priority_Ready_Table[8][8]. The selected bitmap is annotated: "There is at least one task with priority 3."]
To find the highest priority task with the data structures described above, we use a table-lookup approach. Figure 3.4 shows a mapping table with 256 (2^8) values that is used for this purpose. In effect, it is a priority resolution table. Given an index, the corresponding value in the table is the position of the lowest set bit of that index. This is used to determine the highest task priority represented by the previously mentioned bitmaps. For example, if the element Priority_Ready_Table [0][0] is 8 (i.e., 1000b), we look up Mapping_Table[8] and obtain the value 3. This means that the lowest set bit of 8 is bit 3, and hence the highest task priority is 3. By using the Mapping_Table, we can find the highest task priority with three table lookups, as shown in Figure 3.5. First, we look up the lowest set bit of Priority_Ready_Row_Groups and of Priority_Ready_Col_Groups. With these two bit positions, we locate the corresponding bitmap in the Priority_Ready_Table. Finally, we look up the lowest set bit of this bitmap.
UNSIGNED_CHAR Mapping_Table [256] = { 0, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 5, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 6, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 5, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 7, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 5, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 6, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 5, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0
};
Figure 3.4 Mapping Table for Finding the Highest Priority Task
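The entries of the mapping table need not be typed by hand; they follow directly from the definition "position of the lowest set bit of the index". The sketch below generates an equivalent table at start-up. The function names lowest_set_bit and build_mapping_table are hypothetical, and index 0 (which has no set bit) is mapped to 0 to agree with the first entry of the table in Figure 3.4.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Position (0..7) of the lowest set bit of v.
 * For v == 0 there is no set bit; 0 is returned to match Figure 3.4. */
uint8_t lowest_set_bit(uint8_t v)
{
    uint8_t bit = 0;
    if (v == 0)
        return 0;
    while ((v & 1u) == 0) {   /* shift until the lowest set bit reaches bit 0 */
        v >>= 1;
        bit++;
    }
    return bit;
}

/* Fill a 256-entry table so that table[i] == lowest_set_bit(i). */
void build_mapping_table(uint8_t table[256])
{
    int i;
    assert(table != NULL);
    for (i = 0; i < 256; i++)
        table[i] = lowest_set_bit((uint8_t)i);
}
```

Generating the table once keeps the run-time lookup a single array access, while the generator itself documents what each entry means.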
Row = Mapping_Table [Priority_Ready_Row_Groups];
Col = Mapping_Table [Priority_Ready_Col_Groups];
highest_ready_priority = (UNSIGNED) ( (Row << 6) + (Col << 3) + Mapping_Table [Priority_Ready_Table[Row][Col]] );
Figure 3.5 Pseudo Code for Finding the Highest Priority Task
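To make the three lookups concrete, the following sketch shows how the bitmaps might be maintained when a priority becomes ready or stops being ready, together with the lookup itself. It is an illustration under assumptions, not Seed's source. All names (ready_map_t, ready_map_set, ready_map_clear, ready_map_highest) are hypothetical. One deliberate difference from the text: the text describes Priority_Ready_Col_Groups as a single 8-bit bitmap, whereas this sketch keeps one column bitmap per row, so that the selected row and column always point to a non-empty element. Whether Seed does the same is not stated in the source. A short bounded loop stands in for Mapping_Table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t row_groups;     /* bit r set: some priority in row r is ready   */
    uint8_t col_groups[8];  /* ASSUMPTION: one column bitmap per row        */
    uint8_t table[8][8];    /* bit b of table[r][c]: priority r*64+c*8+b    */
} ready_map_t;

/* Stand-in for Mapping_Table: lowest set bit of a non-zero byte. */
static unsigned lowest_bit(uint8_t v)
{
    unsigned bit = 0;
    while (v != 0 && (v & 1u) == 0) {
        v >>= 1;
        bit++;
    }
    return bit;
}

/* Mark priority `prio` (0..511) as having at least one ready task. */
void ready_map_set(ready_map_t *m, unsigned prio)
{
    unsigned r = prio >> 6, c = (prio >> 3) & 7u, b = prio & 7u;
    assert(m != NULL && prio < 512);
    m->table[r][c]   |= (uint8_t)(1u << b);
    m->col_groups[r] |= (uint8_t)(1u << c);
    m->row_groups    |= (uint8_t)(1u << r);
}

/* Mark priority `prio` as having no ready task left. */
void ready_map_clear(ready_map_t *m, unsigned prio)
{
    unsigned r = prio >> 6, c = (prio >> 3) & 7u, b = prio & 7u;
    assert(m != NULL && prio < 512);
    m->table[r][c] &= (uint8_t)~(1u << b);
    if (m->table[r][c] == 0) {
        m->col_groups[r] &= (uint8_t)~(1u << c);
        if (m->col_groups[r] == 0)
            m->row_groups &= (uint8_t)~(1u << r);
    }
}

/* Highest ready priority (lowest number), or -1 if nothing is ready. */
int ready_map_highest(const ready_map_t *m)
{
    unsigned r, c;
    assert(m != NULL);
    if (m->row_groups == 0)
        return -1;
    r = lowest_bit(m->row_groups);
    c = lowest_bit(m->col_groups[r]);
    return (int)((r << 6) + (c << 3) + lowest_bit(m->table[r][c]));
}
```

The cost of each operation is independent of the number of ready tasks, which is the property the text requires of the scheduler.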
No matter how many tasks are in the system, the cost of task scheduling in Seed is fixed. However, when the number of tasks is quite small, scheduling in Seed may be slower than in some non-real-time kernels. This is because a non-real-time kernel usually adopts non-deterministic scheduling, which may find the highest priority task rapidly when there are very few tasks. But the term real-time does not mean as fast as possible. Instead, it requires consistent, repeatable, and known timing performance. Therefore, the small and fixed computation overhead of Seed scheduling is a worthwhile price for deterministic timing.
After finding out the highest priority task, Seed will de-queue a task control block from the ready queue. As shown in Figure 3.6, the Priority_Ready_Task_List is an array of SEED_TASK (task control block) pointers. Each pointer stands for a single priority, and points to a list of ready tasks (specifically, TCBs) with that priority. The TCB list is a doubly-linked list so that we can insert and remove a TCB in a constant time.
Figure 3.6 Task Priority Ready Queues with Priorities
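The constant-time insertion and removal mentioned above can be sketched with one circular doubly-linked list per priority. This is an assumption-laden illustration, not Seed's code: tcb_t is a hypothetical stand-in for SEED_TASK, ready_list stands in for Priority_Ready_Task_List, and the circular layout is a choice made here because it lets the round-robin rotation be a single pointer update.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_PRIORITIES 512

/* Hypothetical, reduced task control block. */
typedef struct tcb {
    struct tcb *prev;
    struct tcb *next;
    unsigned    priority;   /* 0 (highest) .. 511 (lowest) */
} tcb_t;

/* One list head per priority; NULL means no ready task at that priority. */
static tcb_t *ready_list[NUM_PRIORITIES];

/* Append a TCB at the tail of its priority list in O(1). */
void ready_enqueue(tcb_t *t)
{
    tcb_t *head;
    assert(t != NULL && t->priority < NUM_PRIORITIES);
    head = ready_list[t->priority];
    if (head == NULL) {
        t->next = t->prev = t;          /* single-element circular list */
        ready_list[t->priority] = t;
    } else {
        t->prev = head->prev;           /* old tail */
        t->next = head;
        head->prev->next = t;
        head->prev = t;
    }
}

/* Remove a TCB from its priority list in O(1). */
void ready_remove(tcb_t *t)
{
    assert(t != NULL && t->priority < NUM_PRIORITIES);
    if (t->next == t) {
        ready_list[t->priority] = NULL; /* it was the only element */
    } else {
        t->prev->next = t->next;
        t->next->prev = t->prev;
        if (ready_list[t->priority] == t)
            ready_list[t->priority] = t->next;
    }
    t->next = t->prev = NULL;
}

/* First ready task of a priority, or NULL. */
tcb_t *ready_head(unsigned priority)
{
    assert(priority < NUM_PRIORITIES);
    return ready_list[priority];
}

/* Round-robin: the current head becomes the tail, in O(1). */
void ready_rotate(unsigned priority)
{
    assert(priority < NUM_PRIORITIES);
    if (ready_list[priority] != NULL)
        ready_list[priority] = ready_list[priority]->next;
}
```

Because every node knows both neighbors, a task can be unlinked without searching the list, which is what keeps suspension and termination independent of the number of ready tasks.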
3.2.3 Interface
The interface exported by the task management component is as follows:
1. Seed_Create_Task: This function creates a new task. The user can specify the time-slice value, preemptive or non-preemptive, priority, and so on.
2. Seed_Resume_Task: This function resumes a previously suspended or created task. It calls the scheduler to check if a reschedule is needed.
3. Seed_Suspend_Task: This function suspends the specified task. If it is the currently running task, the function invokes the scheduler to select the next ready task to run.
4. Seed_Terminate_Task: This function terminates the specified task.
5. Seed_Relinquish_Task: This function yields control of the CPU to the next task with the same priority, and puts the calling task at the end of the corresponding ready TCB list.
6. Seed_Task_Sleep: This function suspends the calling task for the specified number of timer ticks (1 timer tick = 10 ms).
7. Seed_Change_Task_Priority: This function changes the priority of the specified task to the new priority value. It calls the scheduler to check if Seed needs to preempt the executing task in favor of the task with the new priority.
8. Seed_Change_Task_Preemption: This function changes the preemption state of the currently executing task. If the preemption value is changed from non-preemptive to preemptive, it calls the scheduler to check if a preemption is needed.
9. Seed_Change_Time_Slice: This function changes the time slice of the specified task to the specified value. If the new time slice value is zero, time-slicing is disabled for the task.
The interface routines for internal use are as follows:
1. Task_Initialize: This function is called by Seed_Initialize (i.e., the system initialization function). It is responsible for setting the initial values of the internal variables and global data structures of the task component.
2. Task_Start: This function is invoked when the task is going to execute for the first time. It calls the task entry function with the parameters of the task.
3. Task_Scheduler: This function implements the task scheduling algorithm. It is responsible for finding the highest priority task, and it checks the preemption state of the executing task to see if a task context switch is needed.
4. Task_Context_Switch: This function is invoked to perform a task context switch. The context (i.e., the CPU registers) of the original task is saved into memory, and the context of the resumed task is loaded into the CPU.
5. Spinlock_Lock: This function is called to lock a spinlock that protects critical system resources (e.g., kernel data structures) from simultaneous access. If another task already holds this spinlock, the calling task performs a context switch and gives control of the CPU to the task that holds the spinlock.
6. Spinlock_Unlock: This function is called to unlock the spinlock. The code between the Spinlock_Lock function and the Spinlock_Unlock function becomes a mutually exclusive critical section.
7. Task_Time_Slice: This function is called when the time slice of a task has run out. It is responsible for moving the task to the end of the corresponding TCB list.
8. Task_Timeout: This function is called to process the task suspension timeout condition. It resumes the task from the suspend state.
9. Lock_Scheduler: This function is used to prevent task scheduling. The scheduler is temporarily stopped after this function is called.
10. Unlock_Scheduler: This function is the counterpart of the Lock_Scheduler function. It is used to continue task scheduling.
3.3 Interrupt Management
3.3.1 Design
An interrupt is a mechanism for providing an immediate response to an external hardware event. When an interrupt occurs, the CPU suspends the current path of execution and transfers control to the appropriate ISR (Interrupt Service Routine).
Seed allows a component such as a device driver to dynamically register or un-register an ISR for an IRQ number (interrupt request number). The HAL interrupt component recognizes the IRQ, saves the CPU context, executes the ISR, and restores the CPU context. The details are described in Section 3.8.
In order to protect internal data structures from simultaneous access, interrupts are usually disabled while an interrupt is being served. However, it is not desirable to disable interrupts for a long time in a real-time system. Therefore, Seed adopts a two-stage interrupt handling scheme, which is also adopted by other real-time kernels, e.g., the eCos RTOS [18]. Interrupt handling is separated into two stages, the ISR stage and the DISR (Deferred Interrupt Service Routine) stage.
In the ISR stage, a normal ISR is executed with interrupts disabled. During its execution, the ISR may activate a DISR to complete the service later. When the ISR finishes, the DISR starts. A DISR is allowed to run with interrupts enabled. Each DISR has its own stack and control block, and hence it can be blocked temporarily for synchronization or mutual exclusion purposes. In other words, a DISR is just like a task except that it is activated by an ISR. Under this two-stage interrupt handling mechanism, interrupts are never disabled for a long time.
The eCos kernel also supports DISRs. However, its DISRs do not have priorities, and hence they are executed in FIFO order. This might cause problems when a DISR activated by a higher priority ISR is blocked by another one that was activated by a lower priority ISR. By contrast, there are eight priority levels available for Seed DISRs. If a higher priority DISR (i.e., one activated by a higher priority ISR) becomes ready, the lower priority DISR is preempted. DISRs with the same priority are executed in the order in which they are activated. As with task scheduling, the time needed to schedule a DISR is a small constant.
3.3.2 Implementation
The implementation of the interrupt system is divided into two parts, namely the ISR and the DISR components.
• ISR component
We define an array called IRQ_Handlers. Each element is a function pointer to an ISR, and the IRQ number is used to index the array. Therefore, an ISR can be registered and un-registered with this array dynamically. When an interrupt occurs, the interrupt part of the HAL component gets the IRQ number from the hardware register and invokes the corresponding ISR.
• DISR component
The data structures for implementing the DISR component are similar to those of the Seed tasks. There is an 8-bit bitmap, named Active_DISR_Priority, for the priority status. Since there are only eight priorities for DISRs, Active_DISR_Priority is enough to represent the priority status. In addition, there is a ready queue called Active_DISR_First with eight elements for queuing DISRs. Each element contains a doubly-linked list of DISR control blocks (i.e., SEED_DISR). Figure 3.7 shows an example. In this figure, there are DISRs with priority 0 and 7. Therefore, the Active_DISR_Priority bitmap is 129 (i.e., 10000001b). Similar to the approach used in the task management system, we also take advantage of the mapping table to find the highest priority DISR in constant time. It is worth noting that the scheduler always runs the DISRs before running the tasks in order to complete the interrupt service as fast as possible.
Figure 3.7 DISR Data Structures
[Figure: the Active_DISR_First array with its lists of DISR control blocks.]
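The ISR table and the DISR queues can be sketched together to show the two-stage flow. This is a simplified illustration under assumptions, not Seed's implementation. All identifiers are hypothetical stand-ins: irq_handlers for IRQ_Handlers, active_disr_priority for Active_DISR_Priority, and disr_head and disr_tail for Active_DISR_First. The number of IRQ lines (32) is assumed, the queues are singly linked for brevity (the text describes doubly-linked lists), and preemption between DISRs, stacks, and interrupt masking are not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_IRQS        32   /* assumption; depends on the interrupt controller */
#define DISR_PRIORITIES 8    /* eight DISR priority levels, 0 is the highest    */

typedef void (*isr_fn)(unsigned irq);

typedef struct disr {
    struct disr *next;
    unsigned     priority;   /* 0..7 */
    int          queued;     /* non-zero while waiting to run */
} disr_t;

static isr_fn  irq_handlers[NUM_IRQS];
static uint8_t active_disr_priority;
static disr_t *disr_head[DISR_PRIORITIES];
static disr_t *disr_tail[DISR_PRIORITIES];

/* Register an ISR; returns 0 on success, -1 on a bad or occupied slot. */
int irq_register(unsigned irq, isr_fn isr)
{
    if (irq >= NUM_IRQS || isr == NULL || irq_handlers[irq] != NULL)
        return -1;
    irq_handlers[irq] = isr;
    return 0;
}

/* Un-register an ISR; returns 0 on success, -1 if none was registered. */
int irq_unregister(unsigned irq)
{
    if (irq >= NUM_IRQS || irq_handlers[irq] == NULL)
        return -1;
    irq_handlers[irq] = NULL;
    return 0;
}

/* Stage 1: invoke the ISR for the IRQ number read from hardware.
 * Returns 1 if an ISR was called, 0 otherwise. */
int irq_dispatch(unsigned irq)
{
    if (irq >= NUM_IRQS || irq_handlers[irq] == NULL)
        return 0;
    irq_handlers[irq](irq);
    return 1;
}

/* Called by an ISR to defer work: append the DISR to its priority queue. */
void disr_activate(disr_t *d)
{
    assert(d != NULL && d->priority < DISR_PRIORITIES);
    if (d->queued)
        return;                              /* already pending */
    d->queued = 1;
    d->next = NULL;
    if (disr_tail[d->priority] != NULL)
        disr_tail[d->priority]->next = d;
    else
        disr_head[d->priority] = d;
    disr_tail[d->priority] = d;
    active_disr_priority |= (uint8_t)(1u << d->priority);
}

/* Stage 2: remove and return the highest priority pending DISR, or NULL. */
disr_t *disr_take_next(void)
{
    unsigned p = 0;
    disr_t *d;
    if (active_disr_priority == 0)
        return NULL;
    while ((active_disr_priority & (1u << p)) == 0)
        p++;                                 /* at most 8 steps; a table lookup also works */
    d = disr_head[p];
    disr_head[p] = d->next;
    if (disr_head[p] == NULL) {
        disr_tail[p] = NULL;
        active_disr_priority &= (uint8_t)~(1u << p);
    }
    d->queued = 0;
    d->next = NULL;
    return d;
}

/* Current value of the priority bitmap. */
uint8_t disr_active_bitmap(void) { return active_disr_priority; }

/* Example device: its ISR only activates a priority-0 DISR and returns. */
disr_t demo_disr = { NULL, 0, 0 };
void demo_isr(unsigned irq) { (void)irq; disr_activate(&demo_disr); }
```

With one DISR pending at priority 0 and one at priority 7, disr_active_bitmap() yields 129, the value used in the example of Figure 3.7, and disr_take_next() hands out the priority-0 DISR first.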
The interface exported by the interrupt system is shown in the following.
: This function registers the ISR for the specified IRQ