Now you know how to declare, initialize, and manipulate a linked list in the kernel. This is all very well and good, but it is meaningless if you have no way to access your data! The linked lists are just containers that hold your important data; you need a way to use lists to move around and access the actual structures that contain the data. The kernel (thank goodness) provides a nice set of interfaces for traversing linked lists and referencing the data structures that include them.
Note that, unlike the list manipulation routines, iterating over a linked list in its entirety is clearly an O(n) operation, for n entries in the list.
The Basic Approach
The most basic way to iterate over a list is with the list_for_each() macro. The macro takes two parameters, both pointers to list_head structures. The first is a pointer used to point to the current entry; it is a temporary variable that you must provide. The second is a pointer to the list_head acting as the head node of the list you want to traverse (see the earlier section, “List Heads”). On each iteration of the loop, the first parameter points to the next entry in the list, until each entry has been visited. Usage is as follows:
struct list_head *p;
list_for_each(p, &fox_list) {
/* p points to an entry in the list */
}
Well, that is still worthless! A pointer to the list structure is usually no good; what we need is a pointer to the structure that contains the list_head. For example, with the previous fox structure example, we want a pointer to each fox, not a pointer to the list member in the structure. We can use the macro list_entry(), which we discussed earlier, to retrieve the structure that contains a given list_head. For example:
struct list_head *p;
struct fox *f;
list_for_each(p, &fox_list) {
/* f points to the structure in which the list is embedded */
f = list_entry(p, struct fox, list);
}
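To make the mechanics concrete, here is a minimal userspace sketch of the same pattern. The macros and list_add_tail() below are simplified stand-ins written for this example, not the definitions from <linux/list.h>; the weight field and the total_weight() helper are likewise assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof */

/* Simplified userspace stand-ins for the kernel's list primitives. */
struct list_head {
    struct list_head *next, *prev;
};

#define LIST_HEAD_INIT(name) { &(name), &(name) }

/* Step back from the embedded member to the start of the enclosing struct. */
#define list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Follow next pointers from the head until we wrap around to it again. */
#define list_for_each(pos, head) \
    for ((pos) = (head)->next; (pos) != (head); (pos) = (pos)->next)

/* Link 'new' in just before the head, that is, at the tail of the list. */
static void list_add_tail(struct list_head *new, struct list_head *head)
{
    new->prev = head->prev;
    new->next = head;
    head->prev->next = new;
    head->prev = new;
}

struct fox {
    unsigned long weight;     /* assumed field, for illustration only */
    struct list_head list;    /* embedded list node */
};

/* Hypothetical helper: sum the weight of every fox on the list. */
static unsigned long total_weight(struct list_head *head)
{
    struct list_head *p;
    unsigned long total = 0;

    assert(head != NULL);
    list_for_each(p, head) {
        /* p points at the embedded node; f points at the fox around it */
        struct fox *f = list_entry(p, struct fox, list);
        total += f->weight;
    }
    return total;
}
```

Note that the loop never dereferences the head node as a fox: the head is a bare list_head, and iteration stops as soon as the cursor returns to it.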
The Usable Approach
The previous approach does not make for particularly intuitive or elegant code, although it does illustrate how list_head nodes function. Consequently, most kernel code uses the list_for_each_entry() macro to iterate over a linked list. This macro handles the work performed by list_entry(), making list iteration simple:
list_for_each_entry(pos, head, member)
Here, pos is a pointer to the object containing the list_head nodes. Think of it as the return value from list_entry(). head is a pointer to the list_head head node from which you want to start iterating (in our previous example, &fox_list). member is the variable name of the list_head structure in pos (list in our example). This sounds confusing, but it is easy to use. Here is how we would rewrite the previous list_for_each() to iterate over every fox node:
struct fox *f;
list_for_each_entry(f, &fox_list, list) {
/* on each iteration, ‘f’ points to the next fox structure ... */
}
Now let’s look at a real example, from inotify, the kernel’s filesystem notification system:
static struct inotify_watch *inode_find_handle(struct inode *inode,
                                               struct inotify_handle *ih)
{
        struct inotify_watch *watch;

        list_for_each_entry(watch, &inode->inotify_watches, i_list) {
                if (watch->ih == ih)
                        return watch;
        }

        return NULL;
}
This function iterates over all the entries in the inode->inotify_watches list. Each entry is of type struct inotify_watch and the list_head in that structure is named i_list. With each iteration of the loop, watch points at a new node in the list. The purpose of this simple function is to search the inotify_watches list in the provided inode structure to find an inotify_watch entry whose inotify_handle matches the provided handle.
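The same search pattern can be tried outside the kernel. The following is a rough userspace sketch: the list_for_each_entry() shown here is a simplified stand-in for the kernel macro and assumes a compiler that supports __typeof__ (GCC and Clang do). The weight field and find_fox_by_weight() are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, NULL */

/* Simplified userspace stand-ins for the kernel's list primitives. */
struct list_head {
    struct list_head *next, *prev;
};

#define LIST_HEAD_INIT(name) { &(name), &(name) }

#define list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * Combine list_for_each() and list_entry(): 'pos' is always a pointer to
 * the containing structure, and the loop ends when pos's embedded node
 * is the head itself.
 */
#define list_for_each_entry(pos, head, member)                           \
    for ((pos) = list_entry((head)->next, __typeof__(*(pos)), member);   \
         &(pos)->member != (head);                                       \
         (pos) = list_entry((pos)->member.next, __typeof__(*(pos)), member))

static void list_add_tail(struct list_head *new, struct list_head *head)
{
    new->prev = head->prev;
    new->next = head;
    head->prev->next = new;
    head->prev = new;
}

struct fox {
    unsigned long weight;     /* assumed field, for illustration only */
    struct list_head list;
};

/* Hypothetical helper: return the first fox with the given weight. */
static struct fox *find_fox_by_weight(struct list_head *head,
                                      unsigned long weight)
{
    struct fox *f;

    assert(head != NULL);
    list_for_each_entry(f, head, list) {
        if (f->weight == weight)
            return f;
    }
    return NULL;   /* walked the whole list without a match */
}
```

As in inode_find_handle(), returning from inside the loop is fine because nothing is removed from the list during the walk.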
Iterating Through a List Backward
The macro list_for_each_entry_reverse() works just like list_for_each_entry(), except that it moves through the list in reverse. That is, instead of following the next pointers forward through the list, it follows the prev pointers backward. Usage is the same as with list_for_each_entry():
list_for_each_entry_reverse(pos, head, member)
There are only a handful of reasons to favor moving through a list in reverse. One is performance: If you know the item you are searching for is likely behind the node you are starting your search from, you can move backward in hopes of finding it sooner. A second reason is if ordering is important. For example, if you use a linked list as a stack, you can walk the list from the tail backward to achieve last-in/first-out (LIFO) ordering. If you do not have an explicit reason to move through the list in reverse, don’t; just use list_for_each_entry().
Iterating While Removing
The standard list iteration methods are not appropriate if you are removing entries from the list as you iterate. The standard methods rely on the fact that the list entries are not changing out from under them, and thus if the current entry is removed in the body of the loop, the subsequent iteration cannot advance to the next (or previous) pointer. This is a common pattern in loops, and programmers solve it by storing the next (or previous) pointer in a temporary variable prior to a potential removal operation. The Linux kernel provides a routine to handle this situation for you:
list_for_each_entry_safe(pos, next, head, member)
You use this version in the same manner as list_for_each_entry(), except that you provide the next pointer, which is of the same type as pos. The next pointer is used by the list_for_each_entry_safe() macro to store the next entry in the list, making it safe to remove the current entry. Let’s consider an example, again in inotify:
void inotify_inode_is_dead(struct inode *inode)
{
        struct inotify_watch *watch, *next;

        mutex_lock(&inode->inotify_mutex);
        list_for_each_entry_safe(watch, next, &inode->inotify_watches, i_list) {
                struct inotify_handle *ih = watch->ih;

                mutex_lock(&ih->mutex);
                inotify_remove_watch_locked(ih, watch); /* deletes watch */
                mutex_unlock(&ih->mutex);
        }
        mutex_unlock(&inode->inotify_mutex);
}
This function iterates over and removes all the entries in the inotify_watches list. If the standard list_for_each_entry() were used, this code would introduce a use-after-free bug, as moving to the next item in the list would require accessing watch, which was destroyed.
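The saved-next-pointer idea can be sketched in userspace as follows. This is an illustrative approximation, not the kernel's code: the macros and list helpers are simplified stand-ins, __typeof__ is assumed to be available (GCC and Clang), and add_fox() and remove_heavy_foxes() are hypothetical helpers that use malloc() and free() in place of kernel allocation.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, NULL */
#include <stdlib.h>   /* malloc, free */

/* Simplified userspace stand-ins for the kernel's list primitives. */
struct list_head {
    struct list_head *next, *prev;
};

#define LIST_HEAD_INIT(name) { &(name), &(name) }

#define list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * 'n' is loaded with the entry after 'pos' before the loop body runs,
 * so the body may unlink and free 'pos' without breaking the walk.
 */
#define list_for_each_entry_safe(pos, n, head, member)                     \
    for ((pos) = list_entry((head)->next, __typeof__(*(pos)), member),     \
         (n) = list_entry((pos)->member.next, __typeof__(*(pos)), member); \
         &(pos)->member != (head);                                         \
         (pos) = (n),                                                      \
         (n) = list_entry((n)->member.next, __typeof__(*(n)), member))

static void list_add_tail(struct list_head *new, struct list_head *head)
{
    new->prev = head->prev;
    new->next = head;
    head->prev->next = new;
    head->prev = new;
}

/* Unlink an entry; its own pointers are cleared in this sketch. */
static void list_del(struct list_head *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    entry->next = entry->prev = NULL;
}

struct fox {
    unsigned long weight;     /* assumed field, for illustration only */
    struct list_head list;
};

/* Hypothetical helper: allocate a fox and append it to the list. */
static struct fox *add_fox(struct list_head *head, unsigned long weight)
{
    struct fox *f = malloc(sizeof(*f));

    if (!f)
        return NULL;
    f->weight = weight;
    list_add_tail(&f->list, head);
    return f;
}

/* Hypothetical helper: free every fox heavier than 'limit'. */
static unsigned int remove_heavy_foxes(struct list_head *head,
                                       unsigned long limit)
{
    struct fox *f, *next;
    unsigned int removed = 0;

    assert(head != NULL);
    list_for_each_entry_safe(f, next, head, list) {
        if (f->weight > limit) {
            list_del(&f->list);
            free(f);          /* f is gone; the loop advances via next */
            removed++;
        }
    }
    return removed;
}
```

With a plain list_for_each_entry() here, the loop's advance step would read f->list.next after free(f), which is exactly the use-after-free described above.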
If you need to iterate over a linked list in reverse and potentially remove elements, the kernel provides list_for_each_entry_safe_reverse():
list_for_each_entry_safe_reverse(pos, n, head, member)
Usage is the same as with list_for_each_entry_safe().
You May Still Need Locking!
The “safe” variants of list_for_each_entry() protect you only from removals from the list within the body of the loop. If there is a chance of concurrent removals from other code—or any other form of concurrent list manipulation—you need to properly lock access to the list.
See Chapter 9, “An Introduction to Kernel Synchronization,” and Chapter 10, “Kernel Synchronization Methods,” for a discussion on synchronization and locking.
Other Linked List Methods
Linux provides myriad other list methods, enabling seemingly every conceivable way to access and manipulate a linked list. All these methods are defined in the header file <linux/list.h>.
Queues
A common programming pattern in any operating system kernel is producer and consumer. In this pattern, a producer creates data (say, error messages to be read or networking packets to be processed) while a consumer, in turn, reads, processes, or otherwise consumes the data. Often the easiest way to implement this pattern is with a queue. The producer pushes data onto the queue and the consumer pulls data off the queue. The consumer retrieves the data in the order it was enqueued. That is, the first data on the queue is the first data off the queue. For this reason, queues are also called FIFOs, short for first-in, first-out. See Figure 6.5 for an example of a standard queue.
Figure 6.5 A queue (FIFO), showing the enqueue and dequeue operations.
The Linux kernel’s generic queue implementation is called kfifo and is implemented in kernel/kfifo.c and declared in <linux/kfifo.h>. This section discusses the API after an update in 2.6.33. Usage is slightly different in kernel versions prior to 2.6.33, so double-check <linux/kfifo.h> before writing code.
kfifo
Linux’s kfifo works like most other queue abstractions, providing two primary operations: enqueue (unfortunately named in) and dequeue (out). The kfifo object maintains two offsets into the queue: an in offset and an out offset. The in offset is the location in the queue to which the next enqueue will occur. The out offset is the location in the queue from which the next dequeue will occur. The out offset is always less than or equal to the in offset. It wouldn’t make sense for it to be greater; otherwise, you could dequeue data that had not yet been enqueued.
The enqueue (in) operation copies data into the queue, starting at the in offset. When complete, the in offset is incremented by the amount of data enqueued. The dequeue (out) operation copies data out of the queue, starting from the out offset. When complete, the out offset is incremented by the amount of data dequeued. When the out offset is equal to the in offset, the queue is empty: No more data can be dequeued until more data is enqueued. When the in offset is equal to the length of the queue, no more data can be enqueued until the queue is reset.
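The offset bookkeeping described above can be modeled in a few lines of userspace C. This is a toy model that follows the description in this section only; it is not the kernel's kfifo implementation, and every name in it (struct toy_fifo and the toy_fifo_*() functions) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>   /* size_t */
#include <string.h>   /* memcpy */

/* Toy byte queue tracking the two offsets described in the text. */
struct toy_fifo {
    unsigned char *buffer;  /* caller-provided storage */
    size_t size;            /* length of the queue in bytes */
    size_t in;              /* where the next enqueue will write */
    size_t out;             /* where the next dequeue will read */
};

static void toy_fifo_init(struct toy_fifo *f, unsigned char *buf, size_t size)
{
    f->buffer = buf;
    f->size = size;
    f->in = f->out = 0;
}

/* Empty means the out offset has caught up with the in offset. */
static int toy_fifo_is_empty(const struct toy_fifo *f)
{
    return f->in == f->out;
}

/* Enqueue up to 'len' bytes; returns how many were actually copied. */
static size_t toy_fifo_in(struct toy_fifo *f, const void *from, size_t len)
{
    size_t room = f->size - f->in;

    if (len > room)
        len = room;            /* nothing fits once in == size */
    memcpy(f->buffer + f->in, from, len);
    f->in += len;              /* advance by the amount enqueued */
    return len;
}

/* Dequeue up to 'len' bytes; returns how many were actually copied. */
static size_t toy_fifo_out(struct toy_fifo *f, void *to, size_t len)
{
    size_t avail = f->in - f->out;

    if (len > avail)
        len = avail;           /* cannot read data not yet enqueued */
    memcpy(to, f->buffer + f->out, len);
    f->out += len;             /* advance by the amount dequeued */
    assert(f->out <= f->in);   /* the invariant stated in the text */
    return len;
}

/* Discard all contents and start again from the beginning. */
static void toy_fifo_reset(struct toy_fifo *f)
{
    f->in = f->out = 0;
}
```

For example, with an 8-byte queue, enqueuing 5 bytes and dequeuing 3 leaves in at 5 and out at 3. A further 4-byte enqueue then copies only 3 bytes, because in reaches the length of the queue, and nothing more can be enqueued until toy_fifo_reset() is called.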