In our system design, three components make up the record and replay capability. First, we need an initial checkpoint, which contains all of the machine state.
Second, the recorder must log the non-deterministic events. Finally, the replayer uses the log to reproduce the system execution. Each component is described in detail below.
6.1. Checkpoint
A checkpoint saves the virtual machine state, including device state, memory, CPU registers, and the contents of all writable disks. Multiple checkpoints divide the program execution into intervals, and the system can be replayed over the interval between any two checkpoints. Typically, the earlier checkpoint serves as the starting point of the replay, and the later one is used to check the correctness of the replay.
We combine QEMU's snapshot with the following information in our checkpoint header:
Checkpoint Identifier (CID): The checkpoint identifier is used to select the replay interval. A virtual machine may have many checkpoints. For flexible replay, any two of them can be chosen to delimit the replay interval.
Interval Instruction Number: The interval instruction number tells the replay system when to stop replaying. If the checkpoint is not the last one, the interval instruction number is the number of instructions executed before the next checkpoint.
Virtual Machine Snapshot Identifier: Saving the state of the virtual machine is already implemented by QEMU's snapshot mechanism. We integrate the snapshot into our checkpoint information. The snapshot ID is associated with the recorded VM state.
Log File Offset: During replay, the file offset tells the replay system where the recorded data begin. Because all recorded data are stored in the same file, we can seek to the relevant records by using the log file offset.
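The four header fields listed above can be illustrated with a minimal sketch. The field widths, the byte layout, the numeric snapshot ID, and the helper `replay_length` are assumptions made for illustration only; they are not taken from our implementation or from QEMU. The sketch also assumes that CIDs increase in the order the checkpoints were taken.

```python
import struct
from dataclasses import dataclass

# Hypothetical on-disk layout: four little-endian unsigned 64-bit fields.
HEADER_FORMAT = "<QQQQ"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


@dataclass(frozen=True)
class CheckpointHeader:
    cid: int                    # Checkpoint Identifier
    interval_instructions: int  # instructions executed before the next checkpoint
    snapshot_id: int            # ID of the associated VM snapshot (assumed numeric)
    log_offset: int             # byte offset of this checkpoint's records in the log

    def pack(self) -> bytes:
        return struct.pack(HEADER_FORMAT, self.cid, self.interval_instructions,
                           self.snapshot_id, self.log_offset)

    @classmethod
    def unpack(cls, raw: bytes) -> "CheckpointHeader":
        return cls(*struct.unpack(HEADER_FORMAT, raw))


def replay_length(headers, start_cid, end_cid):
    """Instructions to execute when replaying from start_cid up to end_cid."""
    return sum(h.interval_instructions
               for h in headers
               if start_cid <= h.cid < end_cid)
```

With headers of this shape, choosing two checkpoints as the replay interval reduces to summing the interval instruction numbers of the checkpoints in between, and the log file offset of the first one gives the position to seek to.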
6.2. Recorder
The recorder logs every non-deterministic event to files. It consists of a number of modifications to QEMU that record emulated I/O and interrupts in the order of instruction execution. Whenever an I/O operation or an interrupt takes place, we check whether the VM is in the recording state. If it is, our record function logs the current data from the I/O port or the interrupt number. We divide our implementation into several parts and describe them in detail below.
6.2.1. Instruction Counter
To replay instruction dependencies correctly, the instruction counter plays an important role in ordering instruction-related events. It counts how many instructions the virtual CPU has executed. Each recorded event is associated with one instruction counter value. The instruction counter reveals not only the order of the recorded events but also the exact point at which each must be replayed. Additionally, it helps us debug the replay system when the sequence of re-executed events does not match the recorded data.
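The debugging use of the instruction counter can be sketched as follows. The `Event` type, its `kind` labels, and the helper `first_divergence` are hypothetical names introduced for this illustration and do not correspond to identifiers in QEMU.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    icount: int  # instruction counter value when the event occurred
    kind: str    # hypothetical label, e.g. "io" or "irq"
    value: int   # recorded data or interrupt number


def first_divergence(recorded, replayed):
    """Return the index of the first mismatching event, or None if none."""
    for index, (expected, actual) in enumerate(zip(recorded, replayed)):
        if expected != actual:
            return index
    if len(recorded) != len(replayed):
        # One sequence ended early; the divergence is at the shorter length.
        return min(len(recorded), len(replayed))
    return None
```

Because every event carries its instruction counter, a mismatch in either the data or the counter value pinpoints where re-execution left the recorded path.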
6.2.2. External Inputs
Inputs can change the execution result and influence the execution state of a process. For an operating system, the transition of the execution state can be affected by many kinds of external inputs, e.g., network packets, keyboard typing, and mouse clicks. We consider the user behavior and peripheral activity mentioned above to be non-deterministic events because they are unpredictable. Consequently, recording non-deterministic events into files makes them predictable during the re-execution run. In addition, we record the data type together with each input in order to decrease the log size. For example, IA-32 allows data to be transferred to and from peripherals as a byte, a word, or a doubleword. We compress this additional information into simple integers to reduce the space overhead.
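One way to store the data type as a small integer is sketched below. The size codes, the record layout, and the function names are assumptions for illustration; the actual log format may differ. Each record stores the value in exactly the width indicated by its size code, so a byte-sized input does not occupy a full doubleword.

```python
import struct

# Hypothetical size codes: 0 = byte, 1 = word (2 bytes), 2 = doubleword (4 bytes).
SIZE_FORMATS = {0: "B", 1: "H", 2: "I"}
PREFIX = "<QHB"  # instruction counter, I/O port, size code (no padding)


def encode_input(icount, port, size_code, value):
    """Pack one input record; the value occupies only as many bytes as needed."""
    return struct.pack(PREFIX + SIZE_FORMATS[size_code],
                       icount, port, size_code, value)


def decode_input(buf, offset=0):
    """Unpack one record at offset; return (record, offset of next record)."""
    icount, port, size_code = struct.unpack_from(PREFIX, buf, offset)
    offset += struct.calcsize(PREFIX)
    value_format = "<" + SIZE_FORMATS[size_code]
    (value,) = struct.unpack_from(value_format, buf, offset)
    return (icount, port, size_code, value), offset + struct.calcsize(value_format)
```

In this layout a byte-wide input takes 12 bytes and a doubleword input takes 15 bytes, and the decoder uses the size code to find where the next record starts.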
6.2.3. Interrupts
Interrupts are used for communication by both hardware and software. System execution is always accompanied by interrupts, for example when an I/O request is issued or when a context switch occurs in a multiprocessing system. QEMU raises interrupts while executing the instructions of the guest system and handles them periodically. We record the interrupts in QEMU's handler function together with the corresponding instruction counter.
6.2.4. Clock
All real-world devices need a clock to synchronize their operation, and QEMU, as a machine emulator, is no exception. QEMU emulates clock interrupts by using host signals (SIGALRM, SIGIO), and the virtual interrupts are triggered when the specified signal arrives. However, the VM cannot predict host system activity, so it always emulates interrupt handling after emulating some amount of code execution. To faithfully replay thread scheduling, the timing and the value of the clock must be recorded with the corresponding instruction counter.
The clock value also serves as the related timestamp, which is essential for time-dependent instructions such as rdtsc and rdpmc. We combine all of the above information into a very simple record format to reduce the log space.
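The idea of recording clock values against the instruction counter can be sketched as follows. The class `ClockLog` and its methods are hypothetical, and `time.monotonic_ns` merely stands in for the host clock source; the sketch does not reproduce QEMU's timer code.

```python
import time
from collections import deque


class ClockLog:
    """Records clock reads with their instruction counter and replays them."""

    def __init__(self):
        self.entries = deque()  # (icount, clock value) in recording order

    def record(self, icount, read_clock=time.monotonic_ns):
        """Read the host clock, log the value, and return it to the guest."""
        value = read_clock()
        self.entries.append((icount, value))
        return value

    def replay(self, icount):
        """Return the logged value instead of reading the host clock."""
        recorded_icount, value = self.entries[0]
        if recorded_icount != icount:
            raise RuntimeError(
                f"clock read at icount {icount}, expected {recorded_icount}")
        self.entries.popleft()
        return value
```

During replay, a time-dependent instruction such as rdtsc would receive the logged value rather than a fresh host reading, and a read at an unexpected instruction count signals that the replay has diverged.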
6.3. Replayer
We implemented transparent and deterministic replay in our system. All recorded data are tied to the instruction counter to preserve time dependencies. The transitions of the system state are the same as those previously recorded, including thread scheduling, interactions between processes, hardware interrupts from external devices, and even the data of network packets.
However, we do not take fine-grained replay of external device state into account. One reason is that synchronizing the state of external devices is more complicated than replaying only their output. For example, we can regenerate a packet that was received from a website, but we cannot change the state of the website so that it sends the packet again as it did before. In other words, the states of external devices are outside our scope, so their states are not transparent.
Non-deterministic events are the determinants for achieving deterministic replay. A non-deterministic event causes a state transition that does not follow from the previous states. By checking the instruction counter, system states can be replayed predictably because every non-deterministic factor has been removed. We make all events deterministic except for those that we cannot handle, such as DMA. During replay, all handled system events are therefore deterministic.
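The replay principle described above can be sketched with a toy machine. `ToyVM`, its update rule, and the function `replay` are illustrative assumptions and not part of QEMU or of our implementation; the point is only that injecting each logged event at its recorded instruction count makes the final state reproducible.

```python
class ToyVM:
    """Stand-in for the VM: state depends only on instructions and inputs."""

    def __init__(self):
        self.icount = 0
        self.state = 0

    def step(self):
        # One deterministic "instruction".
        self.state = (self.state * 31 + 7) & 0xFFFFFFFF
        self.icount += 1

    def inject(self, value):
        # Apply a recorded non-deterministic input.
        self.state ^= value


def replay(vm, log, interval_instructions):
    """Replay log, a list of (icount, value) pairs sorted by icount."""
    position = 0
    while vm.icount < interval_instructions:
        # Inject every event recorded at the current instruction count.
        while position < len(log) and log[position][0] == vm.icount:
            vm.inject(log[position][1])
            position += 1
        vm.step()
    if position != len(log):
        raise RuntimeError("log contains events outside the replay interval")
    return vm.state
```

Running `replay` twice on fresh `ToyVM` instances with the same log and the same interval instruction number yields the same final state, which is the property the later checkpoint is used to verify.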
Figure 6.1 Implementation of recording and replaying