Chapter 1 Introduction

1.2 Synopsis

The rest of this paper is organized as follows. Chapter 2 discusses previous work related to information flow tracking and information protection. Chapter 3 provides a high-level overview of DroidTracking. Chapter 4 describes the system design of DroidTracking and its instruction analysis for the ARM architecture. Chapter 5 presents the analysis results of our experiments. Chapter 6 summarizes and concludes this paper.

Chapter 2

Related Work

Mobile devices have become ubiquitous and are now a part of daily life. A mobile operating system, such as iOS, Android, webOS, Windows Mobile, or Symbian OS, controls a mobile device or information appliance. Mobile security has therefore become an important concern. There are three approaches to preventing sensitive information from being stolen by malware. The first is information flow tracking [8, 9, 10, 11, 12, 13], which tracks sensitive information as it flows through the Android operating system. Sensitive information is marked with a specific taint label. By tracking the taint label, we can reveal information leakage caused by malware. It helps users know which process accesses the sensitive information and where that information flows. Importantly, it reveals when sensitive information leaves the system at a taint sink. The second is manifest-based access control [14, 15, 16]. To obtain access to information, programmers must declare their permission requests in the manifest (installation list). When users install an application, they are asked to grant the permissions listed in its manifest. This helps users understand what the application does. A manifest provides the right application with the right access to information. The third is the user awareness mechanism. It protects users from unnecessary use of the microphone, camera, Bluetooth, and other sensors. Customized hardware, such as a camera light, reminds users that the camera is in use. This technique is widely used on small hardware devices.

2.1 Information Flow Tracking (IFT)

Information flow tracking tracks information flowing between application-level objects (e.g., function parameters, libraries, and messages passed between applications) or system-level objects (e.g., registers, memory, and I/O events). Furthermore, dynamic information flow tracking follows a running process, whereas static information flow tracking analyzes an application through static source code analysis.

2.1.1 Fine-grained DIFT

Fine-grained dynamic information flow tracking (fine-grained DIFT) tracks sensitive information by analyzing system-wide information [8, 9, 10, 11, 12]. With the support of hardware extensions and emulation environments, fine-grained DIFT analyzes the whole system (e.g., memory, registers, instruction set, and emulated hardware devices) and the operating system (e.g., applications, user libraries, and kernel modules). The approach identifies sensitive information at a taint source.

Fine-grained DIFT tracks the taint source at the instruction translation level. By analyzing the instruction set, the fine-grained DIFT system accurately propagates and records tainted data. Finally, it checks for tainted data when information is transmitted to a remote server through the network interface card. These related works are designed for the x86 architecture, and not all x86-based design concepts can be applied to ARM-based DIFT. According to our survey, one related work [13] was the first to propose some ARM-based DIFT concepts, which have not been implemented yet. These concepts are not sufficient, because an ARM-based smartphone has a particular architecture (e.g., the ARM instruction set, the Thumb instruction set, coprocessors, register banking, and addressing modes), sensors (e.g., camera, GPS, and microphone), and a variety of sensitive information (e.g., IMEI, IMSI, and ICC-ID) that must be handled. We therefore propose a complete solution to address these issues on the Android operating system.

2.1.2 Coarse-grained DIFT

Coarse-grained dynamic information flow tracking (coarse-grained DIFT) tracks sensitive information inside the application [13]. Coarse-grained DIFT analyzes the whole operating system, including applications, user libraries, and kernel modules, without the support of hardware extensions or emulation environments. This approach also identifies sensitive information at a taint source. However, coarse-grained DIFT uses multi-level (message-level, variable-level, method-level, and file-level) tracking to follow the taint source on a physical smartphone. To our knowledge, TaintDroid [13] proposes the most complete solution to taint propagation, covering Android-specific inter-process communication (IPC), the Dalvik VM interpreter, and native methods. Most importantly, TaintDroid minimizes runtime overhead to make real-time analysis practical.

Unfortunately, the approach relies on the integrity of the native libraries, the taint interface libraries, and the firmware. In our view, TaintDroid cannot protect its own scheme from attacks by malware such as DroidDream, because DroidDream can compromise the integrity of the Android OS.

2.2 Manifest-based Access Control

Manifest-based access control provides an application-level permission mechanism. The permission mechanism provides the right application with the right access to information. To access confidential information and device services (e.g., phone calls, SMS messages, the camera, and GPS), an Android application lists the permissions it needs in its manifest (installation list) to obtain the user's approval. While installing an Android application, users can review the manifest and decide whether to grant all of the requested permissions or none. However, users cannot grant only some of the permissions or refuse unneeded permissions that could be abused.

Related works [14, 15, 16] are devoted to enhancing the manifest-based access control mechanism. Kirin [15] and Saint [14] propose enhanced permission mechanisms to prevent sensitive information from being accessed. These two systems propose a concept of selective Android permissions. The goal is to enable users to refuse unwanted permissions that can be used maliciously. Worst of all, the user is the weakest link in computer security. In addition, the shared user ID is another problem. If an Android application A declares a shared user ID in order to obtain the permissions of another application B, it is difficult to predict the actual behavior of application A. For example, suppose application A has the “INTERNET” and “shared B” permissions, and application B has the “GPS” permission. Application A is then able to transmit GPS data to the Internet.
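The shared user ID example can be illustrated with a short sketch. The Python code below is only a hypothetical model, not Android's actual permission implementation: the function `effective_permissions` and the dictionaries `declared` and `shares_with` are names assumed for illustration. It treats the permissions reachable through shared user IDs as a union.

```python
# Hypothetical model (assumption): an application that shares a user ID with
# another application also gains that application's declared permissions.
def effective_permissions(app, declared, shares_with):
    """Return the union of permissions reachable through shared user IDs."""
    seen, stack, perms = set(), [app], set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # avoid looping on mutual sharing
        seen.add(current)
        perms |= declared.get(current, set())
        stack.extend(shares_with.get(current, ()))
    return perms


# The example from the text: A has INTERNET and shares with B; B has GPS.
declared = {"A": {"INTERNET"}, "B": {"GPS"}}
shares_with = {"A": ["B"]}

perms_a = effective_permissions("A", declared, shares_with)
can_leak_location = {"INTERNET", "GPS"} <= perms_a
```

Under this model, neither manifest alone shows that application A can send location data to the network; the combination only becomes visible once the shared permissions are merged.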

In our view, the manifest cannot prevent permissions from being abused, and it cannot show how the application actually behaves at runtime. In addition, by using garbled documentation or an application that downloads malware, DroidDream can gain root privileges and read sensitive information. Manifest-based access control has no mediation method to solve these problems.

2.3 User awareness

To increase the interaction between the user and the mobile phone, mobile applications can access sensors such as GPS, the camera, and the microphone. Once access permission is confirmed during installation, mobile applications can collect sensitive data via these sensors. The related work [19] focuses on the problem that sensors may be used maliciously without the user's awareness.

The related work [19] therefore proposes protecting users from unwanted use of sensors. By adding hardware warning indicators to a mobile phone, it reminds users that mobile resources are in use. For example, a camera LED indicator lights up while the camera is capturing video data. However, the Android OS has many hardware devices and many kinds of information (e.g., IMEI, IMSI, SMS, and the address book). Android malware is particularly interested in this information, but the related work is not designed to cover such a wide range of information. Because DroidDream has root privileges and can destroy the integrity of the Android OS, the user awareness mechanism cannot protect itself from malicious attacks.

Chapter 3

Approach Overview

By adding information flow tracking to the emulated ARM CPU, we can learn how Android applications steal sensitive information. In addition, it is important to know what information is stolen by a malicious application and which device is used to transmit the sensitive data.

3.1 Challenges

To address these problems, we must overcome the following challenges.

a) Track the sensitive resources that are read and written to the memory space. To do so, we tag the specific memory space as a dirty space (tainted space). After each executed instruction, we track the whole memory and keep a record of the memory changes affected by the dirty space.

b) Track the memory space that is read and transmitted to a remote server by a hardware device. We must also check whether that memory space is tagged as a dirty space. If it is, sensitive resources are being transmitted to the network.

c) Process identification, register banking, and instruction analysis must be addressed. Solving these problems helps us know which process exhibits the malicious behavior. In addition, it is necessary to track the status of registers and memory after each executed instruction.

The rest of this chapter describes in detail how the proposed approaches solve these challenges.

3.2 Two-phase Scheme

To analyze the whole Android OS, we propose the following two phases.

In the first phase, we modify emulated hardware devices on the Android Emulator to track triggered events, including Taint Source events and Taint Sink events. For example, a character device is a virtual device used for communication with shell commands (Android Emulator users can connect to the Android OS over the telnet protocol to issue shell commands). Sensitive information such as GPS and SMS data is sent to the Android OS through emulated character devices. By modifying the emulated character devices, we can start tracking these monitored resources and handle Taint Source events. By modifying the network interface card (NIC), DroidTracking can monitor each packet sending event. Therefore, challenge a) and challenge b) are overcome with modified emulated hardware devices. The information flow tracking system then tells us what information flows from the Taint Source module to the Taint Sink module while the Android OS is running. Chapter 4 describes the information flow tracking in detail.

In the second phase, we analyze ARM and Thumb instructions to record the behavior of the whole Android OS, including Android applications, Android libraries, and the Android-based Linux kernel. To analyze system-wide behavior of the Android OS, Taint Metadata modules, including memory metadata and register metadata, keep track of memory status and register status. To obtain the physical memory addresses accessed by instructions, we modify the memory management unit (MMU) of the Android Emulator to record memory accesses. Because the set of visible registers in the ARM architecture depends on the CPU state, registers must be re-banked according to the CPU state for instruction analysis. We re-bank registers at instruction translation time. Additionally, process identification is applied to identify the process that exhibits malicious behavior. We trace context switches to identify the process by its Linux process ID (PID).

Figure 3.1: System Architecture
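As an illustration of register re-banking, the following Python sketch maps a visible register number and a CPU mode to a distinct key that register metadata could use. The banking layout follows the classic ARM register file (FIQ banks R8 to R14; IRQ, Supervisor, Abort, and Undefined bank R13 and R14; User and System share one set). The function name `physical_register` and the key format are assumptions for illustration and do not come from DroidTracking.

```python
# Registers that have a private (banked) copy in each privileged mode.
BANKED = {
    "fiq": range(8, 15),   # R8-R14
    "irq": range(13, 15),  # R13 (SP), R14 (LR)
    "svc": range(13, 15),
    "abt": range(13, 15),
    "und": range(13, 15),
}


def physical_register(mode, index):
    """Map a visible register (R0-R15) in a CPU mode to a unique key."""
    if not 0 <= index <= 15:
        raise ValueError("ARM exposes R0 to R15 only")
    if index in BANKED.get(mode, ()):
        return f"r{index}_{mode}"
    # User and System modes, and all unbanked registers, share one copy.
    return f"r{index}"


keys = [
    physical_register("usr", 13),
    physical_register("svc", 13),
    physical_register("fiq", 8),
    physical_register("irq", 8),
]
```

With such a mapping, a taint tag written to R13 in Supervisor mode would not be confused with the User-mode stack pointer, even though both appear as R13 in the instruction encoding.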

3.3 System Architecture

In this section, we present the proposed system architecture and describe its components in detail.

Figure 3.1 presents our system, which is implemented on the Android Emulator. The Android Emulator emulates many physical Android hardware devices. By modifying these emulated hardware devices, DroidTracking tracks the information flows of sensitive data on the Android OS. The CPU, MMU, and NIC modules of the Android Emulator are enhanced. By modifying the CPU module, DroidTracking analyzes instruction execution so that memory accesses and register accesses can be tracked. By modifying the memory management unit (MMU) module, DroidTracking obtains the physical memory addresses accessed by the analyzed instruction. By modifying the network interface card (NIC) module, DroidTracking keeps track of each packet sending event and of the memory spaces sent by the NIC. Most importantly, the NIC module checks these memory spaces to reveal attempts to steal sensitive data.

Our core Taint Engine contains three modules: Taint Metadata, Taint Source, and Taint Sink. To achieve fine-grained, system-wide information flow tracking, the Taint Metadata module records objects in the system at byte granularity. To our knowledge, byte-granularity object analysis is the most effective and fine-grained approach among state-of-the-art related work. The Taint Source module is the set of monitored resources. At present, it contains the ICC-ID (Integrated Circuit Card Identifier), GPS (Global Positioning System) data, the IMEI (International Mobile Equipment Identity), and the IMSI (International Mobile Subscriber Identity). In the future, we will add more monitored resources to the Taint Source module. The Taint Sink module records specific triggered events. At present, we track only packet sending events to reveal attempts to steal sensitive data.

At the end of this section, we describe the control flow of the proposed system. First, DroidTracking uses the Taint Source module to learn where sensitive data is written to the memory space, and marks this memory space as dirty. Next, DroidTracking tracks the information flow during program execution and propagates the taint tags in the Taint Metadata module. The Taint Metadata is therefore updated frequently by the Taint Engine, because many instructions are executed. Finally, the Taint Sink module reveals attempts to steal sensitive data by checking the status of the memory space that an emulated hardware device reads. DroidTracking also identifies the malicious process that steals the sensitive data.
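The control flow described above can be sketched as a small event-driven model. This is a simplified illustration, not the DroidTracking implementation: the class `TaintEngine`, its method names, the addresses, and the PID are all hypothetical, and instruction analysis is reduced to a plain byte copy.

```python
class TaintEngine:
    """Toy model of the Taint Source -> Taint Metadata -> Taint Sink flow."""

    def __init__(self):
        self.memory_tags = {}    # physical address -> set of labels
        self.current_pid = None  # updated on each observed context switch
        self.alerts = []         # (pid, labels) pairs reported by the sink

    def on_context_switch(self, pid):
        self.current_pid = pid

    def on_source_write(self, label, addr, length):
        # Taint Source: sensitive data was written to [addr, addr + length).
        for a in range(addr, addr + length):
            self.memory_tags[a] = {label}

    def on_copy(self, dst, src, length):
        # Propagation (simplified): tags follow the copied bytes.
        tags = [set(self.memory_tags.get(src + i, ())) for i in range(length)]
        for i, tag in enumerate(tags):
            if tag:
                self.memory_tags[dst + i] = tag
            else:
                self.memory_tags.pop(dst + i, None)  # clean data clears tags

    def on_packet_send(self, addr, length):
        # Taint Sink: inspect the buffer the emulated NIC is about to read.
        leaked = set()
        for a in range(addr, addr + length):
            leaked |= self.memory_tags.get(a, set())
        if leaked:
            self.alerts.append((self.current_pid, sorted(leaked)))
        return leaked


engine = TaintEngine()
engine.on_source_write("IMEI", 0x1000, 15)  # source event marks dirty bytes
engine.on_context_switch(321)               # hypothetical PID
engine.on_copy(0x8000, 0x1000, 15)          # data moves into a send buffer
engine.on_packet_send(0x8000, 64)           # dirty buffer: alert recorded
engine.on_packet_send(0x9000, 64)           # clean buffer: no alert
```

The sink reports both the labels found in the outgoing buffer and the process that was running when the packet was sent, which corresponds to the process identification step.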

Chapter 4

DroidTracking

We use a two-phase scheme to monitor the whole Android OS. The first phase is triggered event analysis. The second phase is system-wide, fine-grained information flow tracking. Figure 4.1 shows the flow chart of the proposed system.

4.1 System Flow Chart

Figure 4.1: System Flow Chart

The left side of Figure 4.1 is the monitored guest environment. Our scheme does not limit its scope to Java applications; it monitors the whole Android OS, including Java applications, Android libraries, the Android-based Linux kernel, and downloaded unknown malware modules. This comprehensive monitoring protects our scheme from unknown malicious attacks. The right side of Figure 4.1 is the DroidTracking system, which is based on the Android Emulator. To achieve fine-grained object analysis, we track system-wide objects, namely memory and registers, instead of Java parameters, library parameters, and cross-application messages. We propose the Taint Metadata shown in Figure 4.2 to keep track of byte-granularity memory status and byte-granularity register status. The Memory Metadata records the whole Android memory space emulated by the Android Emulator. The Register Metadata records all general-purpose registers of the ARM architecture. The remaining components of DroidTracking share the same Taint Metadata to keep track of system-wide information. The Taint Source component identifies which sensitive information is read into memory. By updating the Taint Metadata, the Taint Source component keeps track of the memory addresses contaminated by sensitive data. To detect attempts to steal sensitive data, the Taint Sink module determines the status of a memory address by checking the Taint Metadata. In addition, the Instruction Analysis module updates the Taint Metadata frequently, because the status changes after each executed instruction.

4.2 Instruction Analysis

We use the Register Banking, Memory Accessing, and Effects Analysis components to analyze each ARM or Thumb instruction. We do not analyze Thumb-2 instructions, because the version of the Android Emulator we use does not support them. This analysis helps us understand how instructions affect the Android OS. For example, an instruction can move a register, memory, or immediate object to a register or memory object. Even when the CPU processes a data processing instruction such as ADD, MUL, or MOV, the result of the instruction execution is still a data movement from source objects to destination objects. Therefore, DroidTracking analyzes the relationship between source objects and destination objects to model the behavior of an instruction.

By updating a common Taint Metadata after each executed instruction, we record the execution of the Android OS, including Android applications, Android libraries, the Android-based Linux kernel, and downloaded unknown malicious modules. In summary, the Taint Source module reveals which monitored resource enters the Android system. The Instruction Analysis module tracks the execution of the whole Android OS. The Taint Sink module reveals which monitored resource is stolen and transmitted to the Internet.

4.3 Information Flow Expression

We begin this section with a rigorous expression of information flow and of object granularity. If a destination object computed by the CPU is affected by a source object, there is an information flow from the source object to the destination object. With the proposed information flow tracking system, it is possible to understand the behavior of the whole Android OS through instruction analysis.

Before introducing instruction analysis, we describe the object granularity design that achieves our goal of fine-grained analysis. To achieve byte-granularity object analysis, we studied the ARM instruction set architecture. Because an instruction operand can be a register, a memory location, or an immediate value, these operands are considered as analyzed objects. However, the destination operand of an ARM instruction cannot be an immediate value, so an immediate value cannot be affected by source objects. The Taint Metadata therefore tracks only the status of each byte of memory and each byte of every register. Figure 4.2 shows that the third and fourth bytes of Register 1 (R1) are contaminated by IMSI information, but the first and second bytes of R1 are not contaminated.

Figure 4.2: Taint Metadata
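The byte-granularity register state described for R1 can be written down as a small data structure. The sketch below is an assumption about one possible layout, not the authors' implementation: the names `Label` and `RegisterMetadata` are hypothetical, and counting the "first" byte as index 0 is an arbitrary choice made here.

```python
from enum import IntFlag


class Label(IntFlag):
    """One bit per monitored resource, so a byte can carry several labels."""
    CLEAN = 0
    ICC_ID = 1
    GPS = 2
    IMEI = 4
    IMSI = 8


class RegisterMetadata:
    """Per-byte taint labels for 16 registers of 4 bytes each."""

    def __init__(self, count=16, width=4):
        self.tags = [[Label.CLEAN] * width for _ in range(count)]

    def mark(self, reg, byte_index, label):
        self.tags[reg][byte_index] |= label

    def is_tainted(self, reg, byte_index):
        return self.tags[reg][byte_index] != Label.CLEAN


# Reproduce the R1 state described in the text: bytes 3 and 4 carry IMSI.
meta = RegisterMetadata()
meta.mark(1, 2, Label.IMSI)
meta.mark(1, 3, Label.IMSI)
r1_state = [meta.is_tainted(1, i) for i in range(4)]
```

Using bit flags lets a single byte record more than one source, for example when IMSI and IMEI data are combined by an arithmetic instruction.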

By analyzing the ARM instruction set architecture, we define five standard effects that describe how source objects affect destination objects (operands). The first is the one-to-one effect, in which each byte of the source object affects the corresponding byte of the destination object. The second is the mixed effect, in which each byte of the source object affects all bytes of the destination object. The third is the assigning effect, in which the destination object is contaminated if the source object is contaminated. The fourth is the appending effect, in which the destination object is contaminated if the source object or the destination object is contaminated. The fifth is the clear effect, in which the destination object is not considered to be contaminated with sensitive information.
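One way to read the five effects is as two byte mappings (one-to-one, mixed), two update modes (assigning, appending), and a reset (clear). The following Python sketch encodes that reading with per-byte boolean taint flags. The function names and the list-of-booleans representation are assumptions for illustration; the original definitions are those of Table 4.1.

```python
def one_to_one(src):
    """Each source byte affects only the corresponding destination byte."""
    return list(src)


def mixed(src):
    """Any contaminated source byte affects every destination byte."""
    return [any(src)] * len(src)


def assign(dst, contribution):
    """Destination taint is replaced by the source contribution."""
    return list(contribution)


def append(dst, contribution):
    """Destination stays contaminated and also gains the source taint."""
    return [d or c for d, c in zip(dst, contribution)]


def clear(dst):
    """Destination is no longer considered contaminated."""
    return [False] * len(dst)


src = [False, False, True, True]   # bytes 3 and 4 contaminated
dst = [True, False, False, False]  # byte 1 already contaminated

assigned = assign(dst, one_to_one(src))
appended = append(dst, one_to_one(src))
spread = assign(dst, mixed(src))
cleared = clear(dst)
```

In this reading, a mapping and an update mode are combined per source operand, which is how the later instruction examples speak of a "one-to-one assigning" or a "mixed appending" effect.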

Table 4.1: Information Flow Expression

Table 4.1 shows the expression of information flows. We give a few examples in this paragraph. The “MOV” instruction has a one-to-one assigning effect on its destination operand: if a byte of the source object is contaminated, the corresponding byte of the destination object is contaminated. The “ADD” instruction moves the sum of two source operands to the destination operand. The first source operand has a one-to-one assigning effect on the destination operand. The second source operand has a one-to-one appending effect on the destination operand, because it cannot change the status of bytes contaminated by the first source operand. The “MUL” instruction moves the product of two source operands to the destination operand. Because the result of the “MUL” instruction mixes bytes in its computation, we evaluate the result with a rough estimate. As a result, the product of two source operands has a mixed appending effect on the destination operand.
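The three instruction examples can be traced with a short sketch. It follows the rules exactly as stated in the text, including the conservative "mixed appending" treatment of MUL, and makes no attempt to model carries between bytes. The function `propagate` and the sample register states are hypothetical.

```python
def propagate(op, dst, src1, src2=None):
    """Return the new per-byte taint of the destination operand."""
    if op == "MOV":
        # One-to-one assigning: destination mirrors the source bytes.
        return list(src1)
    if op == "ADD":
        # First source assigns one-to-one, second source appends one-to-one.
        assigned = list(src1)
        return [a or b for a, b in zip(assigned, src2)]
    if op == "MUL":
        # Mixed appending (rough estimate): any contaminated source byte
        # contaminates every destination byte; prior taint is kept.
        dirty = any(src1) or any(src2)
        return [d or dirty for d in dst]
    raise ValueError(f"unsupported instruction: {op}")


clean = [False] * 4
r1 = [False, False, True, True]   # e.g. IMSI in the third and fourth bytes
r3 = [True, False, False, False]  # some other contaminated byte

after_mov = propagate("MOV", clean, r1)             # MOV R0, R1
after_add = propagate("ADD", clean, after_mov, r3)  # ADD R2, R0, R3
after_mul = propagate("MUL", clean, r1, r3)         # MUL R4, R1, R3
untouched = propagate("MUL", clean, clean, clean)   # clean operands
```

The MUL rule over-approximates on purpose: it may mark bytes that do not actually depend on sensitive data, but under this model it never loses a taint that a source operand carried.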

4.4 Information Flow Analysis

The instruction analysis system helps us observe every change in the Android system. By checking and updating the Taint Metadata, the instruction
