Linux is aUNIX-like system that has gained popularity in recent years. In this chapter, we look at the history and development of Linux, and cover the user and programmer interfaces that Linux presents interfaces that owe a great deal to theUNIX tradition. We also discuss the internal methods by which Linux implements these interfaces. However, since Linux has been designed to run as many standardUNIXapplications as possible, it has much in common with existingUNIXimplementations. We do not duplicate the basic description of UNIXgiven in the previous chapter.
Linux is a rapidly evolving operating system. This chapter describes specif-ically the Linux 2.0 kernel, released in June 1996.
Exercises
21.1 What are the advantages and disadvantages of writing an operating system in a high-level language, such as C?
Answer: There are many advantages to writing an operating sys-tem in a high-level language such as C. First, by programming at a higher abstraction, the number of programming errors is reduced as the code becomes more compact. Second, many high-level languages provide advanced features such as bounds-checking that further mini-mize programming errors and security loopholes. Also, high-level pro-gramming languages have powerful propro-gramming environments that include tools such as debuggers and performance profilers that could be handy for developing code. The disadvantage with using a high-level language is that the programmer is distanced from the underly-ing machine which could cause a few problems. First, there could be a performance overhead introduced by the compiler and runtime sys-tem used for the high-level language. Second, certain operations and instructions that are available at the machine-level might not be acces-137
sible from the language level, thereby limiting some of the functionality available to the programmer.
21.2 In what circumstances is the system-call sequencefork() exec()most appropriate? When isvfork()preferable?
Answer: vfork() is a special case of clone and is used to create new processes without copying the page tables of the parent process. vfork() differs from fork in that the parent is suspended until the child makes a call to exec() or exit(). The child shares all memory with its parent, including the stack, until the child makes the call. This implies con-straints on the program that it should be able to make progress without requiring the parent process to execute and is not suitable for certain programs where the parent and child processes interact before the child performs an exec. For such programs, the system-call sequencefork() exec()more appropriate.
21.3 What socket type should be used to implement an intercomputer file-transfer program? What type should be used for a program that peri-odically tests to see whether another computer is up on the network?
Explain your answer.
Answer: Sockets of type SOCK STREAM use the TCP protocol for communicating data. The TCP protocol is appropriate for implement-ing an intercomputer file-transfer program since it provides a reliable, flow-controlled, and congestion-friendly, communication channel. If data packets corresponding to a file transfer are lost, then they are transmitted. Furthermore, the file transfer does not overrun buffer re-sources at the receiver and adapts to the available bandwidth along the channel. Sockets of type SOCK DGRAM use the UDP protocol for com-municating data. The UDP protocol is more appropriate for checking whether another computer is up on the network. Since a connection-oriented communication channel is not required and since there might not be any active entities on the other side to establish a communication channel with, the UDP protocol is more appropriate.
21.4 Linux runs on a variety of hardware platforms. What steps must the Linux developers take to ensure that the system is portable to different processors and memory-management architectures, and to minimize the amount of architecture-specific kernel code?
Answer: The organization of dependent and architecture-independent code in the Linux kernel is designed to satisfy two design
goals: to keep as much code as
possible common between architectures and to provide a clean way of defining
architecture-specific properties and code. The solution must of course be consistent with the overriding aims of code maintainability and performance.
There are different levels of architecture dependence in the kernel, and different techniques are appropriate in each case to comply with the design requirements. These levels include:
CPU word size and endianness These are issues that affect the porta-bility of all software written in C, but especially so for an
oper-Exercises 139 ating system, where the size and alignment of data must be carefully arranged.
CPU process architecture Linux relies on many forms of hardware support for its process and memory management. Different cessors have their own mechanisms for changing between pro-tection domains (e.g., entering kernel mode from user mode), rescheduling processes, managing virtual memory, and handling incoming interrupts.
The Linux kernel source code is organized so as to allow as much of the kernel as possible to be independent of the details of these architecture-specific features. To this end, the kernel keeps not one but two separate subdirectory hierarchies for each hardware architecture. One contains the code that is appropriate only for that architecture, including such functionality as the system call interface and low-level interrupt man-agement code.
The second architecture-specific directory tree contains C header files that are descriptive of the architecture. These header files contain type definitions and macros designed to hide the differences between archi-tectures. They provide standard types for obtaining words of a given length, macro constants defining such things as the architecture word size or page size, and function macros to perform common tasks such as converting a word to a given byte-order or doing standard manipu-lations to a page table entry.
Given these two architecture-specific subdirectory trees, a large por-tion of the Linux kernel can be made portable between architectures.
An attention to detail is required: when a 32 bit integer is required, the programmer must use the explicit int32 type rather than assume than an int is a given size, for example. However, as long as the architecture-specific header files are used, then most process and page-table manipu-lation can be performed using common code between the architectures.
Code that definitely cannot be shared is kept safely detached from the main common kernel code.
21.5 What are the advantages and disadvantages of making only some of the symbols defined inside a kernel accessible to a loadable kernel module?
Answer: The advantage of making only some of the symbols defined inside a kernel accessible to a loadable kernel module is that there are a fixed set of entry points made available to the kernel module. This ensures that loadable modules cannot invoke arbitrary code within the kernel and thereby interfere with the kernel’s execution. By restricting the set of entry points, the kernel is guaranteed that the interactions with the module take place at controlled points where certain invariants hold. The disadvantage with making only a small set of the symbols defined accessible to the kernel module is the loss in flexibility and might sometimes lead to a performance issue as some of the details of the kernel are hidden from the module.
21.6 What are the primary goals of the conflict resolution mechanism used by the Linux kernel for loading kernel modules?
Answer: Conflict resolution prevents different modules from having conflicting access to hardware resources. In particular, when multiple drivers are trying to access the same hardware, it resolves the resulting conflict.
21.7 Discuss how the clone() operation supported by Linux is used to sup-port both processes and threads.
Answer: In Linux, threads are implemented within the kernel by a clone mechanism that creates a new process within the same virtual address space as the parent process. Unlike some kernel-based thread packages, the Linux kernel does not make any distinction between threads and processes: a thread is simply a process that did not create a new virtual address space when it was initialized.
The main advantage of implementing threads in the kernel rather than in a user-mode library are that:
• kernel threaded systems can take advantage of multiple processors if they are available; and
• if one thread blocks in a kernel service routine (for example, a system call or page fault), other threads are still able to run.
21.8 Would one classify Linux threads as user-level threads or as kernel-level threads? Support your answer with the appropriate arguments.
Answer: Linux threads are kernel-level threads. The threads are visible to the kernel and are independently scheduleable. User-level threads, on the other hand, are not visible to the kernel and are instead manip-ulated by user-level schedulers. In addition, the threads used in the Linux kernel are used to support both the thread abstraction and the process abstraction. A new process is created by simply associated a newly created kernel thread with a distinct address space, whereas a new thread is created by simply creating a new kernel thread with the same address space. This further indicates that the thread abstaction is intimately tied into the kernel.
21.9 What are the extra costs incurred by the creation and scheduling of a process, as compared to the cost of a cloned thread?
Answer: In Linux, creation of a thread involves only the creation of some very simple data structures to describe the new thread. Space must be reserved for the new thread’s execution context its saved regis-ters, its kernel stack page and dynamic information such as its security profile and signal state but no new virtual address space is created.
Creating this new virtual address space is the most expensive part of the creation of a new process. The entire page table of the parent process must be copied, with each page being examined so that copy-on-write semantics can be achieved and so that reference counts to physical pages can be updated. The parent process’s virtual memory is also affected by the process creation: any private read/write pages owned by the parent must be marked read-only so that copy-on-write can happen (copy-on-write relies on a page fault being generated when a write to the page occurs).
Exercises 141 Scheduling of threads and processes also differs in this respect. The decision algorithm performed when deciding what process to run next is the same regardless of whether the process is a fully independent process or just a thread, but the action of context-switching to a separate process is much more costly than switching to a thread. A process requires that the CPU’s virtual memory control registers be updated to point to the new virtual address space’s page tables.
In both cases—creation of a process or context switching between pro-cesses the extra virtual memory operations have a significant cost. On many CPUs, changing page tables or swapping between page tables is not cheap: all or part of the virtual address translation look-aside buffers in the CPU must be purged when the page tables are changed. These costs are not incurred when creating or scheduling between threads.
21.10 The Linux scheduler implements soft real-time scheduling. What fea-tures are missing that are necessary for some real-time programming tasks? How might they be added to the kernel?
Answer: Linux’s “soft” real-time scheduling provides ordering guar-antees concerning the priorities of runnable processes: real-time pro-cesses will always be given a higher priority by the scheduler than normal time-sharing processes, and a real-time process will never be interrupted by another process with a lower real-time priority.
However, the Linux kernel does not support “hard” real-time func-tionality. That is, when a process is executing a kernel service routine, that routine will always execute to completion unless it yields control back to the scheduler either explicitly or implicitly (by waiting for some asynchronous event). There is no support for preemptive scheduling of kernel-mode processes. As a result, any kernel system call that runs for a significant amount of time without rescheduling will block execution of any real-time processes.
Many real-time applications require such hard real-time scheduling. In particular, they often require guaranteed worst-case response times to external events. To achieve these guarantees, and to give user-mode real time processes a true higher priority than kernel-mode lower-priority processes, it is necessary to find a way to avoid having to wait for low-priority kernel calls to complete before scheduling a real-time process.
For example, if a device driver generates an interrupt that wakes up a high-priority real-time process, then the kernel needs to be able to schedule that process as soon as possible, even if some other process is already executing in kernel mode.
Such preemptive rescheduling of kernel-mode routines comes at a cost.
If the kernel cannot rely on non-preemption to ensure atomic updates of shared data structures, then reads of or updates to those structures must be protected by some other, finer-granularity locking mechanism.
This fine-grained locking of kernel resources is the main requirement for provision of tight scheduling guarantees.
Many other kernel features could be added to support real-time pro-gramming. Deadline-based scheduling could be achieved by making modifications to the scheduler. Prioritization of IO operations could be implemented in the block-device IO request layer.
21.11 Under what circumstances would an user process request an operation that results in the allocation of a demand-zero memory region?
Answer: Uninitialized data can be backed by demand-zero memory regions in a process’s virtual address space. In addition, newly malloced space can also be backed by a demand-zero memory region.
21.12 What scenarios would cause a page of memory to be mapped into an user program’s address space with the copy-on-write attribute enabled?
Answer: When a process performs a fork operation, a new process is created based on the original binary but with a new address space that is a clone of the original address space. One possibility is to not to create a new address space but instead share the address space between the old process and the newly created process. The pages of the address space are mapped with the copy-on-write attribute enabled. Then, when one of the processes performs an update on the shared address space, a new copy is made and the processes no longer share the same page of the address space.
21.13 In Linux, shared libraries perform many operations central to the op-erating system. What is the advantage of keeping this functionality out of the kernel? Are there any drawbacks? Explain your answer.
Answer: There are a number of reasons for keeping functionality in shared libraries rather than in the kernel itself. These include:
Reliability. Kernel-mode programming is inherently higher risk than user-mode programming. If the kernel is coded correctly so that protection between processes is enforced, then an occurrence of a bug in a user-mode library is likely to affect only the currently executing process, whereas a similar bug in the kernel could conceivably bring down the entire operating system.
Performance. Keeping as much functionality as possible in user-mode shared libraries helps performance in two ways. First of all, it reduces physical memory consumption: kernel memory is non-pageable, so every kernel function is permanently resident in physical memory, but a library function can be paged in from disk on demand and does not need to be physically present all of the time. Although the library function may be resident in many processes at once, page sharing by the virtual memory system means that at most once it is only loaded into physical memory.
Second, calling a function in a loaded library is a very fast oper-ation, but calling a kernel function through a kernel system ser-vice call is much more expensive. Entering the kernel involves changing the CPU protection domain, and once in the kernel, all of the arguments supplied by the process must be very care-fully checked for correctness: the kernel cannot afford to make any assumptions about the validity of the arguments passed in, whereas a library function might reasonably do so. Both of these factors make calling a kernel function much slower than calling the same function in a library.
Exercises 143 Manageability. Many different shared libraries can be loaded by an application. If new functionality is required in a running system, shared libraries to provide that functionality can be installed without interrupting any already-running processes. Similarly, existing shared libraries can generally be upgraded without re-quiring any system down time. Unprivileged users can create shared libraries to be run by their own programs. All of these attributes make shared libraries generally easier to manage than kernel code.
There are, however, a few disadvantages to having code in a shared library. There are obvious examples of code which is completely un-suitable for implementation in a library, including low-level function-ality such as device drivers or file-systems. In general, services shared around the entire system are better implemented in the kernel if they are performance-critical, since the alternative—running the shared service in a separate process and communicating with it through interprocess communication—requires two context switches for every service re-quested by a process. In some cases, it may be appropriate to prototype a service in user-mode but implement the final version as a kernel routine.
Security is also an issue. A shared library runs with the privileges of the process calling the library. It cannot directly access any resources inaccessible to the calling process, and the calling process has full ac-cess to all of the data structures maintained by the shared library. If the service being provided requires any privileges outside of a normal pro-cess’s, or if the data managed by the library needs to be protected from normal user processes, then libraries are inappropriate and a separate server process (if performance permits) or a kernel implementation is required.
21.14 The directory structure of a Linux operating system could comprise of files corresponding to different file systems, including the Linux /proc file system. What are the implications that arise from having to support different filesystem types on the structure of the Linux kernel?
Answer: There are many implications to having to support different filesystem types within the Linux kernel. For one thing, the filesystem interface should be independent of the data layouts and data structures used within the filesystem to store file data. For another thing, it might have to provide interfaces to filesystems where the filedata is not static
Answer: There are many implications to having to support different filesystem types within the Linux kernel. For one thing, the filesystem interface should be independent of the data layouts and data structures used within the filesystem to store file data. For another thing, it might have to provide interfaces to filesystems where the filedata is not static