
The VMMD system allows third-party security applications in domain 0 to introspect and interpose on the memory and disk states of a guest virtual machine (guest VM) running on Xen Hypervisor. For memory and disk introspection, these applications read the guest's memory pages and disk blocks through Xen Hypervisor. For memory and disk interposition, they can write data to arbitrary memory pages or disk blocks of a guest VM and can control the guest's memory and disk accesses.

Figure 1. VMMD System Architecture

The architecture of the VMMD system is shown in Figure 1. It provides an interface called LibVMMD in domain 0, a memory module in Xen Hypervisor, and a disk module in QEMU [8]. Monitor applications introspect or interpose on the memory subsystems of guest VMs through LibVMMD. To manipulate guest memory pages, LibVMMD uses the Xen Control Library (libxc). When a monitor application calls LibVMMD, libxc looks up the mappings in the Extended Page Table (EPT [9]) and locates the target pages in physical memory. In addition, the VMMD system adds a memory module to Xen Hypervisor, through which monitor applications control guest access to memory. LibVMMD forwards the monitor application's requests to the memory module, which changes the access control bits in the EPT. The guest physical pages whose access bits have been changed are restricted by the monitor application, and the guest operating system cannot perform the denied accesses on them.

LibVMMD also allows monitor applications to introspect, interpose on, and control virtual machine disk subsystems. In the Xen architecture, QEMU handles the I/O requests of virtual machines for devices such as network cards and disks. To introspect and interpose on the disk subsystem, the VMMD system therefore works with QEMU-dm, to which it adds a disk module. After receiving a notification from LibVMMD, the disk module parses the disk structures to locate the target disk blocks for introspection and interposition. Disk introspection, however, faces a disk cache coherence problem: to improve performance, the guest operating system may keep the latest file contents in its data cache rather than on disk. To make disk introspection work properly, we design a “Write Buffer” that captures the cached data, so that LibVMMD can return the latest data to monitor applications. Because QEMU-dm handles disk I/O, disk access control is built into it as well. Monitor applications can quarantine specified files or disk blocks by calling LibVMMD, and the disk module then blocks the corresponding virtual disk requests from the guest operating system.

Xen supports two types of virtual machines: para-virtualization based virtual machines (PV) [10] and hardware based virtual machines (HVM) [11]. The proposed system primarily supports HVM. PV reduces guest execution overhead, but it requires the guest operating system to be ported to the para-virtualization API. HVM, on the other hand, supports unmodified guest operating systems and provides better performance than full virtualization. HVM is more appealing because many popular operating systems, including Windows and Mac OS, are still closed source. In order to provide a general solution for virtual machine memory and disk introspection and interposition, this system is designed with the following goals:

1. Allow security applications in domain 0 to introspect a guest VM’s memory and disk.

2. Allow security applications in domain 0 to interpose on a guest VM’s memory and disk.

3. Require no modification of the guest OS kernel.

4. Do not depend on pre-installed drivers in the guest VM.

5. Prevent guest VMs from circumventing or interfering with the system.

3.1. Memory Introspection and Interposition

Memory holds programs (sequences of instructions) and data (e.g. program state information) for an operating system. While the operating system runs, the CPU fetches instructions and data from specific memory pages; after executing the fetched instructions, it fetches the next ones, possibly from another page. In other words, memory is like a sheet of paper that records the operating system's next step.


Figure 2. Flow Chart of Memory Introspection and Interposition

Memory introspection allows monitor applications to read a guest's memory contents from outside the virtual machine, and memory interposition allows them to overwrite those contents from outside the virtual machine. Because virtual machines are isolated from one another, a guest VM cannot tell whether it is being “monitored”, and rootkits inside the guest cannot interfere with the control flow of memory introspection and interposition.

Figure 2 shows the control flow of memory introspection and interposition. The monitor application calls LibVMMD to perform memory introspection or interposition. At step 1, it passes LibVMMD a process ID, a virtual address together with cr3, or a guest frame number (GFN). LibVMMD then obtains the memory contents identified by that argument in steps 2 to 6, and returns the contents for introspection or overwrites them for interposition in steps 7 and 8. Steps 2 to 6 go through the Xen Control Library: when it is called at step 2, libxc maps the target guest physical memory page into the address space of LibVMMD (steps 3 and 4) and returns the mapped address to LibVMMD (steps 5 and 6). At this point, LibVMMD can introspect or interpose on the guest physical memory page on behalf of the monitor application.
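When the monitor application supplies a virtual address with cr3, that address has to be resolved to a GFN before the page can be mapped. The following is a minimal sketch of such a resolution as a 4-level x86-64 page walk over a toy guest memory. The class `GuestMemory`, its `map_page` method, and `virt_to_gfn` are hypothetical names for illustration and are not part of LibVMMD or libxc; large pages are ignored for brevity.

```python
import struct

PAGE_SHIFT = 12
PAGE_SIZE = 1 << PAGE_SHIFT
PRESENT = 1 << 0
ADDR_MASK = 0x000FFFFFFFFFF000  # bits 12..51 of an entry hold the next frame address


class GuestMemory:
    """Toy guest physical memory: frames are created lazily as 4 KiB bytearrays."""

    def __init__(self):
        self.frames = {}

    def map_page(self, gfn):
        # Hypothetical stand-in for mapping a guest frame into the caller's address space.
        return self.frames.setdefault(gfn, bytearray(PAGE_SIZE))

    def read_u64(self, gpa):
        page = self.map_page(gpa >> PAGE_SHIFT)
        return struct.unpack_from("<Q", page, gpa & (PAGE_SIZE - 1))[0]

    def write_u64(self, gpa, value):
        page = self.map_page(gpa >> PAGE_SHIFT)
        struct.pack_into("<Q", page, gpa & (PAGE_SIZE - 1), value)


def virt_to_gfn(mem, cr3, vaddr):
    """Walk PML4 -> PDPT -> PD -> PT; return the GFN backing vaddr, or None if unmapped."""
    table = cr3 & ADDR_MASK
    for shift in (39, 30, 21, 12):
        index = (vaddr >> shift) & 0x1FF          # 9 index bits per level
        entry = mem.read_u64(table + index * 8)   # each entry is 8 bytes
        if not entry & PRESENT:
            return None
        table = entry & ADDR_MASK
    return table >> PAGE_SHIFT


# Build one mapping by hand: vaddr -> GFN 0x55 through four tables.
mem = GuestMemory()
cr3 = 0x1000
vaddr = 0x00007F1234567ABC
tables = [0x1000, 0x2000, 0x3000, 0x4000]  # PML4, PDPT, PD, PT
target_gfn = 0x55
for level, shift in enumerate((39, 30, 21, 12)):
    index = (vaddr >> shift) & 0x1FF
    nxt = tables[level + 1] if level < 3 else target_gfn << PAGE_SHIFT
    mem.write_u64(tables[level] + index * 8, nxt | PRESENT)

gfn = virt_to_gfn(mem, cr3, vaddr)
page = mem.map_page(gfn)                 # introspection: the page can now be read
page[vaddr & (PAGE_SIZE - 1)] = 0xCC     # interposition: overwrite one byte
```

Once the GFN is known, introspection and interposition reduce to reading and writing the mapped page, which is why the same flow serves both operations.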

3.2. Memory Access Control

Figure 3. x64 hardware page table entry

Memory access control protects memory pages from unauthorized access. In the guest page tables, each memory page has three types of access control bits: Read, Write and Execute. Figure 3 presents the fields of an x64 page table entry. With these bits set, the guest OS catches illegal memory manipulation and thereby protects its integrity. However, a malicious kernel module runs at the same privilege level as the kernel and can therefore change these bits, which makes access control based on the guest page tables unreliable.


Figure 4. Flow Chart of Memory Access Control

Access control in the guest page tables is unreliable because a malicious kernel module has the same privilege as the kernel. The remedy is to enforce memory access control at a higher privilege level than the kernel, so VMMD builds it into the EPT. Xen Hypervisor uses EPT for page table virtualization; EPT reduces overhead by avoiding the VM exits associated with page table virtualization. Figure 4 explains the control flow of memory access control.

When a monitor application needs to control access to virtual machine memory pages, it uses the control functions of LibVMMD (step 1). LibVMMD asks the memory module to find the corresponding EPT entries and change their control types (steps 2 to 4). EPT provides access control bits just as the guest page table does. When the guest operating system accesses a protected page in a way that is not permitted (step 5), it triggers an EPT violation. The memory module catches the violation (step 6) and returns control to the next instruction, skipping the unauthorized instruction (step 7). Because the EPT resides in the hypervisor, a guest VM cannot alter EPT access bits unless Xen itself has been compromised. Once the control types of the guest's memory pages have been set, any unauthorized memory manipulation is rejected by EPT.
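The flow in Figure 4 can be illustrated with a small simulation. This is only a sketch under stated assumptions: `MemoryModule`, `set_access`, and `guest_access` are hypothetical names, the EPT is modelled as a dictionary from GFN to permission flags, and "skipping the instruction" is modelled by returning `False`.

```python
from enum import IntFlag


class Perm(IntFlag):
    NONE = 0
    R = 1
    W = 2
    X = 4


FULL_ACCESS = Perm.R | Perm.W | Perm.X


class MemoryModule:
    """Toy stand-in for the hypervisor-side memory module (hypothetical API)."""

    def __init__(self):
        self.ept = {}          # gfn -> Perm; frames not listed are fully accessible
        self.violations = []   # log of (gfn, attempted access)

    def set_access(self, gfn, perm):
        # Steps 2-4: change the control type of the EPT entry for this frame.
        self.ept[gfn] = perm

    def guest_access(self, gfn, access):
        """Return True if the guest access proceeds, False if it is skipped."""
        allowed = self.ept.get(gfn, FULL_ACCESS)
        if (access & allowed) == access:
            return True
        # Steps 5-7: EPT violation, caught here; the offending instruction is skipped.
        self.violations.append((gfn, access))
        return False


module = MemoryModule()
module.set_access(0x1234, Perm.R)                 # monitor makes the frame read-only
read_ok = module.guest_access(0x1234, Perm.R)     # permitted
write_ok = module.guest_access(0x1234, Perm.W)    # rejected and logged
```

Keeping the permission table outside the guest is the point of the design: nothing the guest kernel does to its own page tables changes what `guest_access` decides.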

3.3. Evading Kernel Patching Protection

In order to improve OS kernel security, Microsoft has introduced a kernel patching protection mechanism [12], informally known as PatchGuard, in recent x64 Windows editions. PatchGuard periodically verifies critical kernel structures to prevent unauthorized kernel patching. When a program (e.g. malware) attempts to patch the kernel, PatchGuard triggers a blue screen error and forces the system to reboot.

Because of kernel patching protection, memory interposition is mostly useless in the kernel address space. Monitor applications cannot patch the kernel to set breakpoints or hook system APIs through kernel patching techniques such as modifying the IDT/SSDT tables or the system call dispatching function. This greatly limits security monitor applications, because many operations in the guest kernel cannot be hooked and monitored. We therefore need to evade the kernel patching protection mechanism.

PatchGuard protects kernel structures such as the System Service Descriptor Table (SSDT), the Global Descriptor Table (GDT), the Interrupt Descriptor Table (IDT), system images (e.g. ntoskrnl.exe, ndis.sys, hal.dll), and processor MSRs (system call). At a high level, PatchGuard caches the original copies and/or checksums of these kernel structures and compares them with the current structures at regular intervals. If PatchGuard finds any difference between the current structures and the original copies, it raises a blue screen error and reboots the entire system.


Figure 5. Flow Chart of Evading Kernel Patch Protection

LibVMMD bypasses PatchGuard by preventing it from verifying the contents of patched memory pages. To do so, it emulates the PatchGuard instructions in Xen Hypervisor. Figure 5 shows the process of evading kernel patching protection. First, the monitor application patches the guest kernel memory page through the LibVMMD interface. To circumvent the PatchGuard verification routine, the read permission of the patched page has to be revoked. However, an EPT entry that grants write permission without read permission causes an EPT misconfiguration fault, so the memory module sets the access control type to execute-only (steps 1 and 2). Later, PatchGuard is invoked to verify the integrity of the guest operating system (step 3). When it reads the patched page, it causes an EPT violation, which transfers control to the memory module (step 4). The memory module emulates the offending instruction in Xen Hypervisor and returns control to the next PatchGuard instruction, repeating this until the verification is finished (step 5).
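The choice of execute-only follows from which EPT permission combinations are legal. The sketch below enumerates them, assuming the usual Intel rule that write without read is a misconfiguration and that execute-only is legal only when the processor supports it. The function names and the `execute_only_supported` flag are illustrative assumptions, not part of VMMD.

```python
from itertools import product


def ept_entry_is_valid(read, write, execute, execute_only_supported=True):
    """Return False for permission combinations that raise an EPT misconfiguration."""
    if write and not read:
        return False  # write-only and write/execute are never legal
    if execute and not read and not execute_only_supported:
        return False  # execute-only needs hardware support
    return True


def read_blocking_choices(execute_only_supported=True):
    """All legal (R, W, X) combinations that deny reads but still allow some access."""
    return [
        (r, w, x)
        for r, w, x in product((False, True), repeat=3)
        if not r and (w or x) and ept_entry_is_valid(r, w, x, execute_only_supported)
    ]


choices = read_blocking_choices()
```

Under these assumptions `choices` contains exactly one entry, the execute-only combination, which matches the access type the memory module selects in steps 1 and 2.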


3.4. Disk Introspection and Interposition

Disk introspection and interposition allow monitor applications to read or overwrite disk contents from outside the virtual machine. As with memory introspection and interposition, rootkits are unable to interfere with the flow. Disk introspection and interposition are implemented by modifying the virtual disk emulation code of QEMU in Xen Hypervisor. LibVMMD serves as the interface to the disk module in QEMU. The disk module waits for requests from monitor applications and carries them out inside QEMU.

Figure 6. Flow Chart of Disk Introspection and Interposition

First, the monitor application passes a request to the disk module through LibVMMD. Based on the request, the disk module finds the target blocks in the virtual disk of the specified virtual machine. It then returns the block contents to the monitor application for disk introspection, or overwrites the block contents for disk interposition. Disk introspection and interposition can be performed at block level or at filesystem level. The difference is explained below.


3.4.1. Block-level Introspection and Interposition

QEMU emulates the virtual disk as an IDE device. To handle I/O operations from guest VMs, the IDE device driver uses its block-level I/O functions. At block level, data are stored in sectors as the basic unit. The disk module uses the driver's I/O functions to manipulate the virtual disk at block level. For introspection, it calls the driver's read function as if the driver itself were reading the target sector. For interposition, it calls the driver's write function as if the driver itself were overwriting the target sector.
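Sector-granular access of this kind can be sketched as follows. The example assumes a raw disk image and 512-byte sectors, and uses an in-memory `io.BytesIO` object in place of the virtual disk; `read_sectors` and `write_sectors` are hypothetical helpers, not QEMU driver functions.

```python
import io

SECTOR_SIZE = 512  # assumed sector size


def read_sectors(disk, sector, count=1):
    """Block-level introspection: return `count` sectors starting at `sector`."""
    disk.seek(sector * SECTOR_SIZE)
    return disk.read(count * SECTOR_SIZE)


def write_sectors(disk, sector, data):
    """Block-level interposition: overwrite whole sectors starting at `sector`."""
    if len(data) % SECTOR_SIZE:
        raise ValueError("data must be a whole number of sectors")
    disk.seek(sector * SECTOR_SIZE)
    disk.write(data)


disk = io.BytesIO(bytes(8 * SECTOR_SIZE))     # toy 8-sector raw image, zero-filled
write_sectors(disk, 3, b"\xAA" * SECTOR_SIZE)  # interpose on sector 3
sector3 = read_sectors(disk, 3)                # introspect sector 3
sector2 = read_sectors(disk, 2)                # neighbouring sector is untouched
```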

3.4.2. Filesystem-level Introspection and Interposition

Introspecting data at filesystem level is more complicated. The flow is shown in Figure 6. When the monitor application requests disk introspection or interposition through LibVMMD, the request is sent to the disk module (step a). To get the file contents, the disk module has to find the sector numbers that hold the file. In the NTFS filesystem, files are stored as attributes in the Master File Table (MFT) [13], so the disk module has to parse the MFT to find these sector numbers. Once the sector numbers are known, the disk module performs introspection and interposition at block level as described in 3.4.1. It can thus return the file contents for introspection and modify them for interposition (steps b and c).
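One part of parsing the MFT is decoding the data runs of a file's non-resident data attribute, which list the clusters the file occupies. The sketch below decodes a run list into (first sector, sector count) pairs. It is an illustration rather than the disk module's code: `decode_data_runs` is a hypothetical name, 8 sectors per cluster is an assumed default, and the resulting sector numbers are relative to the start of the NTFS volume.

```python
def decode_data_runs(runlist, sectors_per_cluster=8):
    """Decode an NTFS run list into a list of (first_sector, sector_count) extents."""
    extents = []
    pos = 0
    lcn = 0  # logical cluster number; each run's offset is relative to the previous run
    while pos < len(runlist) and runlist[pos] != 0:   # a zero header ends the list
        header = runlist[pos]
        length_size = header & 0x0F   # low nibble: bytes in the length field
        offset_size = header >> 4     # high nibble: bytes in the offset field
        pos += 1
        length = int.from_bytes(runlist[pos:pos + length_size], "little")
        pos += length_size
        if offset_size == 0:
            continue  # sparse run: no clusters allocated on disk
        delta = int.from_bytes(runlist[pos:pos + offset_size], "little", signed=True)
        pos += offset_size
        lcn += delta
        extents.append((lcn * sectors_per_cluster, length * sectors_per_cluster))
    return extents


# One run: 0x18 clusters starting at cluster 0x5634.
example = bytes.fromhex("2118345600")
runs = decode_data_runs(example)
```

Each extent can then be handed to the block-level read or write path. Small files whose data is resident in the MFT record itself have no run list and would need separate handling.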

3.5. Write Buffer for Disk Cache Coherence

Although file contents can be dumped by disk introspection, there remains the problem of disk cache coherence. To speed up data access, operating systems keep frequently used or recently updated data in a data cache. Such data stay in the cache for a while before they are written to disk storage. Filesystem introspection, however, reads file contents from the virtual disk and cannot see contents that do not yet exist on it. This situation often arises right after the guest system updates a file: while the file is being updated, the new contents may exist only in the data cache, and filesystem introspection cannot obtain them until they are written to disk storage.

Figure 7. Flow Chart of Disk Introspection with Write Buffer

To solve this problem, we developed a technique called Write Buffer. Write Buffer stores the latest contents when the guest VM modifies its disk files. The contents are captured and stored in Write Buffer by a control center in domain 0, which intercepts system calls [14]. Figure 7 shows the revised disk introspection steps and the Write Buffer mechanism. The intercepted system calls are “NtCreateFile”, “NtOpenFile”, “NtWriteFile”, and “NtClose”. When a guest user process invokes one of these system calls (step A), the interception module captures it (step B) and sends it to the control center (step C). The control center parses the system call arguments to maintain Write Buffer (step D) and then returns control to the original system call flow of the guest operating system. Write Buffer adds one step to disk introspection. After the disk module gets the contents from the virtual disk as usual (steps 1 to 3), it checks whether Write Buffer holds newer contents (step 4). If so, the disk module merges the disk contents with the Write Buffer contents (step 5) and returns the result to the monitor application (step 6). With the help of Write Buffer, monitor applications can obtain the latest file contents through disk introspection even if those contents have not yet been written to the guest virtual disk.

Figure 8. Write Buffer Maintenance

Write Buffer is maintained by the control center and the disk module. The control center handles Write Buffer creation and synchronization. Figure 8 explains the steps of Write Buffer maintenance. When a guest user process invokes a system call, the control center intercepts it and checks its type. If the system call is NtCreateFile or NtOpenFile, the control center creates a file entry that records the mapping between the file path and the file handle, and checks whether a Write Buffer for the file exists. If it does not, the control center creates one and copies the file contents into it through disk introspection. If the system call is NtWriteFile, the control center uses the file handle argument to look up the file path in the file entries and checks whether the Write Buffer exists. If it exists, the control center copies the contents from the system call arguments into the Write Buffer. Otherwise, the control center creates a Write Buffer for that file path and saves the contents. The file entries are stored in the memory space of the control center. To avoid running out of memory, the control center frees the file entry identified by the argument of the NtClose system call.
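The maintenance logic of Figure 8 can be sketched as a small event handler. Everything here is an assumption for illustration: the `ControlCenter` class, the keyword arguments (`handle`, `path`, `offset`, `data`) standing in for parsed system call arguments, and the in-memory `bytearray` used as a Write Buffer. Disk introspection is represented by a callable passed in by the caller.

```python
class ControlCenter:
    """Toy control center that maintains file entries and Write Buffers."""

    def __init__(self, read_file_from_disk):
        # Callable path -> bytes; stands in for filesystem-level disk introspection.
        self.read_file_from_disk = read_file_from_disk
        self.file_entries = {}    # file handle -> file path
        self.write_buffers = {}   # file path -> bytearray with the latest contents

    def on_syscall(self, name, **args):
        if name in ("NtCreateFile", "NtOpenFile"):
            path = args["path"]
            self.file_entries[args["handle"]] = path
            if path not in self.write_buffers:
                # Seed the Write Buffer with what is currently on the virtual disk.
                self.write_buffers[path] = bytearray(self.read_file_from_disk(path))
        elif name == "NtWriteFile":
            path = self.file_entries.get(args["handle"])
            if path is None:
                return  # handle was never seen; nothing to maintain
            buf = self.write_buffers.setdefault(path, bytearray())
            offset, data = args["offset"], args["data"]
            end = offset + len(data)
            if len(buf) < end:
                buf.extend(bytes(end - len(buf)))  # grow the buffer for appended data
            buf[offset:end] = data
        elif name == "NtClose":
            self.file_entries.pop(args["handle"], None)  # free the file entry


path = r"C:\a.txt"
disk = {path: b"hello world"}   # on-disk contents, not yet updated by the guest
cc = ControlCenter(lambda p: disk.get(p, b""))
cc.on_syscall("NtOpenFile", handle=4, path=path)
cc.on_syscall("NtWriteFile", handle=4, offset=6, data=b"there")
cc.on_syscall("NtClose", handle=4)
latest = bytes(cc.write_buffers[path])
```

After these three calls the Write Buffer holds the updated contents while the simulated disk still holds the old ones, which is the situation the Write Buffer is meant to cover.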


Figure 9. Write Buffer Usage

Figure 9 shows the usage of Write Buffer. When the monitor application calls LibVMMD to perform disk introspection with Write Buffer, the VMMD system checks whether the file contents are stored in a Write Buffer. If so, VMMD returns the Write Buffer contents to the monitor application. If not, VMMD saves the file contents into a Write Buffer through disk introspection and returns them to the monitor application. Subsequent accesses to the same file therefore do not need disk introspection again and can be served from the Write Buffer.

Figure 10. Write Buffer Garbage Collection

Keeping Write Buffers speeds up repeated file access, but it also consumes disk space in domain 0. To balance speed against disk space, VMMD provides a garbage collection mechanism that cleans up outdated Write Buffers. The cleaning procedure is shown in Figure 10.

The cleaning procedure is started by the control center. For each existing Write Buffer, the control center obtains the file contents through disk introspection and compares them with the Write Buffer. If the contents are the same, the guest operating system has written the cached data to its virtual disk, and the control center can free the space of that Write Buffer. The frequency of garbage collection can be adjusted, and the overhead at different frequencies is discussed in Section 5.6.
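A minimal sketch of this cleaning pass is shown below. It assumes Write Buffers are kept in a dictionary keyed by file path and that disk introspection is available as a callable; `collect_garbage` is a hypothetical name.

```python
def collect_garbage(write_buffers, read_file_from_disk):
    """Free every Write Buffer whose contents already match the virtual disk."""
    freed = []
    for path in list(write_buffers):   # copy the keys: entries are deleted in the loop
        if bytes(write_buffers[path]) == read_file_from_disk(path):
            del write_buffers[path]    # the guest has flushed its cache for this file
            freed.append(path)
    return freed


disk = {"a.txt": b"new", "b.txt": b"old"}
buffers = {
    "a.txt": bytearray(b"new"),     # already flushed to disk
    "b.txt": bytearray(b"newer"),   # still only in the guest's cache
}
freed = collect_garbage(buffers, disk.__getitem__)
```

Running the pass more often frees space sooner but repeats the disk introspection and comparison for every live buffer, which is the trade-off behind the adjustable frequency.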

3.6. Disk Access Controls

Figure 11. Flow Chart of Disk Access Control

Disk access control prevents unauthorized disk manipulation by guest VMs. The file control type can be read-only or read-write. The operating system manages file access control through a control attribute, but that attribute can be changed by anyone with sufficient system privilege. Antivirus software has its own file access control mechanism: when it finds malicious files in the operating system, it quarantines them so that other processes cannot access them. However, the quarantine process itself may be attacked by malware, which releases the quarantined files.

In the VMMD system, we implement a disk access control mechanism within the virtual disk subsystem (i.e. QEMU-dm). The mechanism is located outside the guest virtual machine, so it is much more difficult for malware within the guest VM to attack or bypass it. A security monitor can use the mechanism to implement a more effective quarantine for malicious files on the disk. Figure 11 presents the architecture of disk access control.

3.6.1. Block-level Access Control

To block disk access from guest VMs at block level, the disk module contains a blacklist. The blacklist is maintained by the monitor application and the monitor application can add new blacklist entries through LibVMMD. Each blacklist entry records a virtual disk sector number of a guest VM. When the guest VM tries to access the sectors in blacklist, the disk

