Literature review - Function Behavior Analysis of Mobile Applications

2.1. Malicious behaviors of mobile apps

To meet users' needs, a mobile app executes a series of functions that deliver the requested functionality; these sets of function calls constitute the behavior of the app. For the vast majority of app developers, the incentive is to please users. However, an application sometimes misuses the information it handles, or is itself built for malicious purposes by hackers or malicious developers.

Adrienne Porter Felt et al. [17] classify the threats posed by mobile applications into three categories: malware, personal spyware, and grayware. They also evaluate the security of different mobile app markets and identify the incentives behind malicious apps: novelty and amusement, selling user information, stealing user credentials, premium-rate calls and SMS, SMS spam, search engine optimization, and ransom.

William Enck et al. [16] classify malicious app behavior differently, dividing it into two categories: information misuse and phone misuse. Information misuse means that sensitive information on the device (including the IMEI, the device identifier; the IMSI, the subscriber identifier; the ICCID, the SIM card serial number; location information; and so on) is leaked by being transferred off the device. Phone misuse means that the smartphone interface is used improperly; it covers telephony services (premium-rate calls and SMS) and socket API use. They also investigated the libraries included by many mobile apps and found that the use of phone identifiers and location is configurable in these libraries, that analytics reporting is often configurable, and that these libraries probe for permissions using techniques such as try/catch blocks.

2.2. Detecting malicious behaviors within apps

As security concerns about mobile applications have grown, researchers have proposed various solutions for detecting malicious behavior within mobile apps, and many of them adapt approaches used to solve similar problems in web applications. There are two main approaches to analyzing application behavior: the dynamic approach and the static approach.

The dynamic approach analyzes an application by running and manipulating it and observing its behavior and reactions.

William Enck et al. developed TaintDroid [32], which automatically labels privacy-sensitive data and applies the labels as the data propagates through files and variables [13, 15]. When the data is about to be transmitted over the Internet, TaintDroid records the label, the responsible application, and the destination. The approach has limitations [14]; in particular, TaintDroid logs the behavior only at the point where the private information is delivered.
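The label propagation idea can be illustrated with a minimal Python sketch. It is not TaintDroid's implementation, which works inside the Android runtime; the `Tainted` class, the `network_send` sink, the app name, the destination, and the sample IMEI value are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tainted:
    """A string value carrying a set of taint labels (hypothetical helper)."""
    value: str
    labels: frozenset = frozenset()

    def __add__(self, other):
        # Propagation: the result carries the union of both operands' labels.
        if isinstance(other, Tainted):
            return Tainted(self.value + other.value, self.labels | other.labels)
        return Tainted(self.value + other, self.labels)


audit_log = []


def network_send(app, destination, payload):
    """Sink: record labels, responsible app, and destination for tainted data."""
    if isinstance(payload, Tainted) and payload.labels:
        audit_log.append((sorted(payload.labels), app, destination))


# A privacy-sensitive source is labeled once, at the point where it is read.
imei = Tainted("356938035643809", frozenset({"IMEI"}))

# The label follows the data through an ordinary string operation.
message = Tainted("id=") + imei

# The record is written only when the data reaches the network sink.
network_send("com.example.app", "ads.example.com", message)
```

As in the description above, nothing is recorded until the labeled data reaches the sink, which mirrors the limitation that the behavior is logged only at delivery time.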


Peter Gilbert et al. [18] detect the behavior of Android apps dynamically by building a virtual Android operating system and running the apps on it. They built an input generator that simulates a user continuously providing input to the app, and a second module that collects the app's behavior and data flow. However, the highest code coverage reported in this work is about 40 percent, which means that more than half of the app's code was not inspected.

The main limitation of dynamic analysis is that it relies on observing the execution of apps, which raises two problems: building an environment in which the applications can run, and providing a mechanism for the observation. In addition, iOS applications available on the App Store are distributed as compiled binaries [33]; under this circumstance, iOS apps generally cannot be run in virtual environments.

The static approach, on the other hand, analyzes applications without actually executing them, and performs the analysis on their executable binaries or source code.

Yajin Zhou et al. [39] proposed a scheme called permission-based behavioral footprinting. It uses files such as the app's manifest file to find the permissions the app requests, and defines malicious behaviors by collecting the essential Android permissions requested by known malware. The footprint of an app is then compared with the known malicious ones to determine whether the app is suspicious. This work provides a quick filter for detecting malware, because it examines only some of the files within the app rather than the app binary itself, which would take much more effort.
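A minimal sketch of such a permission filter follows, assuming that a footprint is simply a set of permissions and that an app matches when it requests every permission in the footprint. The family names and footprint contents in `KNOWN_FOOTPRINTS` are hypothetical and are not taken from [39]; only the permission strings are real Android permission names.

```python
# Hypothetical footprints: permission sets associated with malware families.
KNOWN_FOOTPRINTS = {
    "sms_trojan": {
        "android.permission.SEND_SMS",
        "android.permission.RECEIVE_SMS",
    },
    "spy_tracker": {
        "android.permission.ACCESS_FINE_LOCATION",
        "android.permission.READ_CONTACTS",
        "android.permission.INTERNET",
    },
}


def matching_footprints(requested_permissions):
    """Return the names of footprints fully covered by the requested permissions."""
    requested = set(requested_permissions)
    return sorted(
        name
        for name, footprint in KNOWN_FOOTPRINTS.items()
        if footprint <= requested  # subset test: every permission is requested
    )


# Permissions as they might be read from an app's manifest file.
app_permissions = [
    "android.permission.INTERNET",
    "android.permission.SEND_SMS",
    "android.permission.RECEIVE_SMS",
]
suspects = matching_footprints(app_permissions)
```

Because only a short list of strings is compared, the check is cheap, which is why a filter of this kind can run before any costly inspection of the binary.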


Barbic et al. [9] build system call dependency graphs: they trace program executions, log system calls, track how parameters propagate between calls, and finally compute the graphs.
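The following Python sketch shows one simplified way to derive such a graph from a logged trace. It is an illustration under strong assumptions rather than the method of [9]: each trace entry is a hypothetical tuple of call name, argument values, and return value, and a dependency is recorded whenever a call receives a value that an earlier call returned.

```python
from collections import defaultdict


def build_dependency_graph(trace):
    """Map each call index to the indices of later calls that use its result.

    trace: list of (syscall_name, argument_values, return_value) tuples.
    """
    producer = {}             # value -> index of the latest call returning it
    edges = defaultdict(set)  # producer index -> consumer indices
    for index, (_name, args, ret) in enumerate(trace):
        for arg in args:
            if arg in producer:
                edges[producer[arg]].add(index)
        if ret is not None:
            producer[ret] = index
    return {src: sorted(dst) for src, dst in edges.items()}


# Hypothetical trace: a file is read and its contents are sent over a socket.
# Symbolic names such as "fd3" stand in for the concrete values in a real log.
trace = [
    ("open", ("/data/contacts.db",), "fd3"),
    ("read", ("fd3",), "buf1"),
    ("socket", (), "fd4"),
    ("send", ("fd4", "buf1"), None),
]
graph = build_dependency_graph(trace)
```

In this example `graph` links `open` to `read`, and both `read` and `socket` to `send`, so the path from the contacts file to the network is visible in the graph structure.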

Egele et al. [12] present PiOS, the first static binary analysis tool for detecting privacy leaks in iOS applications. They decrypted the binaries of iOS applications, built control flow graphs of the system calls in each binary, conducted data flow analysis, and detected suspicious flows that indicate privacy leaks. They evaluated the approach against more than 1,400 iPhone applications. This work demonstrates the feasibility of binary analysis for iOS applications.

Mann et al. adopted static analysis to detect privacy leaks in Android applications [23]. They identified sources of private information, including location, contacts, calendar, and network communication, among others. Their framework labels parameters with security levels, and variables representing personal data such as the above are given higher security levels. The framework also restricts each method so that it can be called only with parameters at or below a specified level.
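The level-based restriction can be sketched as a simple check. This is a hypothetical illustration, not the framework of [23]: the two-level ordering, the variable names, and the method names are all assumptions made for the example.

```python
# Hypothetical two-level ordering: a higher number means more sensitive.
PUBLIC, PRIVATE = 0, 1

# Security level assigned to each variable (hypothetical names).
VARIABLE_LEVELS = {
    "app_version": PUBLIC,
    "location": PRIVATE,
    "contacts": PRIVATE,
}

# Highest level each method is allowed to receive (hypothetical names).
METHOD_MAX_LEVEL = {
    "send_over_network": PUBLIC,
    "store_locally": PRIVATE,
}


def violations(method, argument_names):
    """Return the arguments whose level exceeds what the method accepts."""
    limit = METHOD_MAX_LEVEL[method]
    return [name for name in argument_names if VARIABLE_LEVELS[name] > limit]


leaks = violations("send_over_network", ["app_version", "location"])
allowed = violations("store_locally", ["contacts"])
```

Here the network call is flagged because `location` carries a higher level than the method accepts, while storing `contacts` locally passes the check.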

Analyzing an application through the static approach takes considerable effort: tasks such as finding all dependencies within the code require a great deal of computing power and time.

2.3. Distributed computing

Analyzing the behavior of apps through the static approach requires substantial computing power. For real applications, the binaries and the corresponding assembly can be huge. To improve the performance of our analysis system, we adopt the distributed computing model.

MapReduce was proposed by Dean and Ghemawat of Google Inc. [11]. It is a programming model for processing large data sets and is composed of two parts, the Mapper and the Reducer. The Mapper processes the input data into a set of intermediate key/value pairs, and the Reducer merges all the intermediate pairs that share the same key.
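The two parts can be illustrated with a single-process Python sketch that counts how often each function name occurs across several input records. It does not use the Hadoop API and runs on one machine; `mapper`, `reducer`, `run_mapreduce`, and the sample records are hypothetical names and data chosen for illustration.

```python
from collections import defaultdict


def mapper(record):
    """Emit an intermediate (key, value) pair for every name in the record."""
    for name in record.split():
        yield name, 1


def reducer(key, values):
    """Merge all intermediate values that share the same key."""
    return key, sum(values)


def run_mapreduce(records):
    """Run the map, group-by-key, and reduce phases sequentially."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)  # grouping step between map and reduce
    return dict(reducer(key, values) for key, values in groups.items())


# Hypothetical input: one line of function names per analyzed binary.
records = ["open read send", "open send"]
counts = run_mapreduce(records)
```

In a distributed runtime the map calls and the reduce calls would run on different machines, with the grouping step handled by the framework; the sketch keeps only the data flow between the two functions.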

Hadoop [3], an Apache project, is one of the open source tools built on the idea proposed by Google; it provides a solution for building a distributed computing environment on commodity hardware. Users only need to specify the computation as a Map and a Reduce function, and the underlying runtime system automatically parallelizes the computation across large-scale clusters of machines, handles machine failures, and schedules inter-machine communication to make efficient use of the network and disks.

National Chengchi University (國立政治大學)
