Quantitative Assessments of Cyber Security from the Perspective of Attacks


National Chiao Tung University

Institute of Electrical Control Engineering

Doctoral Dissertation

Quantitative Assessments of Cyber Security from the Perspective of Attacks

Student: Hsin-Yi Tsai

Advisor: Dr. Yu-Lun Huang


Quantitative Assessments of Cyber Security from the

Perspective of Attacks

A Dissertation by Hsin-Yi Tsai

Submitted to the Institute of Electrical Control Engineering, National Chiao Tung University, in partial fulfillment of the requirements for the Degree of Doctor of Philosophy in Electrical Control Engineering

Advisor: Dr. Yu-Lun Huang

December 2011

© 2011 Hsin-Yi Tsai

Quantitative Assessments of Cyber Security from the Perspective of Attacks

Student: Hsin-Yi Tsai Advisor: Dr. Yu-Lun Huang

Institute of Electrical Control Engineering, National Chiao Tung University

Abstract

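An information security assessment mechanism provides security assessment results for an information system, helps system administrators understand the security of the system effectively, and serves as a reference for managing it. Because the security of a system involves many factors, such as system configurations, security mechanisms and existing attack methods, an assessment of information security cannot consider a single aspect only; it must account for the effects of multiple factors at the same time. This dissertation examines the design of security assessment methods, and the assessment results they can provide, from the perspectives of both external and internal attacks. For external attacks, we propose a wireless network risk assessment method. The method first considers the security requirements, attack techniques and system configurations of the network system to build a risk model, and then introduces an assessment measure to quantify the risk value. For internal attacks, we propose a method for quantitatively analyzing software control-flow obfuscation in order to evaluate its effect on software robustness. Based on the concept of control flow graphs, the method converts control-flow obfuscation into a formal representation. Building on this formal representation, we further propose new metrics to calculate the protection that control-flow obfuscation provides. Finally, we use several examples to illustrate and validate the feasibility of the proposed methods. We believe that these methods give system administrators more comprehensive security assessment results and further help them manage their systems.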


Quantitative Assessments of Cyber Security

from the Perspective of Attacks

Student: Hsin-Yi Tsai Advisor: Dr. Yu-Lun Huang Institute of Electrical Control Engineering

National Chiao Tung University

Abstract

Assessment of cyber security is a long-standing and great challenge, since multifarious factors and their reciprocal effects have to be considered simultaneously. Due to this complexity, cyber security should be assessed from multiple aspects. This dissertation presents quantitative assessments from the perspectives of both external and internal attacks. Regarding external attacks, we propose a wireless risk assessment method which consists of a risk model and an assessment measure. The risk model models wireless network risk, and the assessment measure is an algorithm for determining the risk value according to the risk model. As for internal attacks, we introduce a novel framework to evaluate software robustness in terms of control-flow obfuscating transformations. On the basis of this framework, we propose new metrics for quantifying the protection effect yielded by a control-flow obfuscating transformation. Moreover, we conduct case studies to validate the proposed assessment methods. We believe that our methods help a system administrator evaluate and manage cyber security more effectively.


Acknowledgement

It has been a long journey, but I am fortunate and thankful for where I am. I am pleased to thank everyone who supported, encouraged and accompanied me in the years of my Ph.D. study. First and foremost, I would like to express my sincere gratitude to my advisor, Prof. Yu-Lun Huang, for her patience, motivation, enthusiasm, endless encouragement, immense knowledge and continuous support of my research. Her guidance helped me throughout the research and writing of this thesis. Her encouragement widened my horizons and let me see an amazingly different world. I could not have imagined having a better advisor and mentor for my Ph.D. study.

Besides my advisor, I would like to thank the rest of my thesis committee: Prof. Hahn-Ming Lee, Prof. Chin-Laung Lei, Prof. Yuan-Pei Lin, Prof. Shiuhpyng Shieh, Prof. Hung-Min Sun and Prof. Wen-Guey Tzeng, for their insightful comments and valuable time. Their valued feedback helped me improve the quality of my dissertation.

I am also grateful to Prof. Doug Tygar and Prof. David Wagner, who supervised me during my visit to UC Berkeley in my first year of Ph.D. study. I learned a lot from their passion for research, unique perspectives on academia and careful criticism. My appreciation also goes to Dr. André Miede, who hosted and supervised me when I was a visiting scholar at TU Darmstadt, Germany in my fourth year of Ph.D. study. André deeply impressed me with his sense for research ideas and insightful views.

I owe much gratitude to my dearest RTES lab-mates for their support, their sweet consideration and all the wonderful time we had together. How lucky I am to be a member of RTES! Special thanks go to Yung-Wen and Borting. They always helped me a lot, not only in project execution and research discussion but also in the administrative services of our laboratory.

Finally, I would like to give my deepest gratitude to my parents, who made me who I am. I could not have come this far without their unconditional love. Many thanks to my brother as well for his support and caring during my Ph.D. study. I dedicate this dissertation to them, my beloved family.


Dedicated to my family


List of Figures

3.1 The proposed hierarchy per device: general case

4.1 Example of the formalization of a parsed program

4.2 Atomic operator of inserting a dummy simple block

4.3 Atomic operators of inserting opaque predicates

4.4 Atomic operator of inserting a fork

4.5 Atomic operator of inserting a dummy loop

4.6 Atomic operator of splitting a simple block

4.7 Atomic operator of splitting a branch

4.8 Atomic operator of reordering code blocks

5.1 Common subgraphs of two graphs, G1 and G2

5.2 Control flow graph of a conditional jump

5.3 Control flow graph of a loop

5.4 Extended control flow graph of a loop

6.1 Example of four-layer risk analytic hierarchies (4-RAH)

6.2 Example I: networks with different security mechanisms

6.3 Example II: snapshots of a wireless network at different times

6.4 CFG of Program I

6.5 ψ1: obfuscated result of ψ after applying T1


List of Tables

3.1 Types of Attacks

3.2 Numerical Impact Severity vs. Linguistic Meanings

4.1 Feasibility of Decomposition

5.1 Space Penalty of Each Atomic Operator

6.1 Attack Analysis

6.2 Effective Attacks and Risk Levels

6.3 Vulnerabilities of Running Services


Chapter 1 Introduction

Assessment of cyber security is important since we cannot improve what we cannot measure [1]. The assessment results help system administrators and users understand system security easily. The administrators can then designate countermeasures, apply protection mechanisms, or modify system configurations to increase security according to the assessment results. Nevertheless, assessment of cyber security is challenging, since various factors (such as security countermeasures, system configurations, vulnerabilities and realistic attacks) each affect cyber security and yield different levels of security risk. It is thus difficult to assess cyber security from a holistic perspective, because the multifarious factors and their reciprocal effects have to be considered simultaneously.

Cyber security can be compromised in many ways. Security mechanisms and configurations are designed and applied to fortify a system against different attacks. Hence, to assess security plausibly, the assessment should be performed from the aspect of attacks, so that attack targets, the prerequisite configurations of an attack and attack impacts are involved. Cyber attacks take various forms and are coarsely classified into two types: external and internal attacks. An external attack is launched by an adversary outside a victim system. An internal attack is started by an attacker who is a legal user of the victim system. When attacking a network, an external attacker gathers information about the network system from the outside and then launches attacks accordingly, while an internal attacker, who has access to the victim, can control the system and assault the victim's data and programs. The security of a network system should hence be evaluated from both the external and internal aspects to better reflect realistic situations.

External attacks are implemented in many ways, such as penetration attacks, Denial-of-Service (DoS) attacks and eavesdropping attacks. In response to this variety, various methods of assessing cyber security in terms of external attacks have been proposed. They include attack graph-based methods [2, 3, 4, 5, 6, 7, 8] and analytic hierarchy process (AHP)-based methods [9, 10, 11, 12, 13]. An attack graph-based method assesses the security of a network system by analyzing the system's attack graph, which is drawn mainly from the aspect of penetration attacks. Attack graph-based methods are widely used in assessing the security of wired networks, but they are less appropriate for a dynamic network environment: the whole attack graph needs to be re-generated once the topology or configurations of a network system change, and because a dynamic environment changes frequently, such re-generation could impose a heavy load on the assessment. The AHP is a structured technique for decision-making problems [14, 15]. It has been applied in several fields, such as planning, system design and risk assessment. Zhao et al. applied the concept of AHP to modeling and assessing network security risk [9, 10, 11]. However, they developed a 3-layer hierarchical structure, which is not sufficient to capture the security impacts resulting from incorrect configurations. [12] and [13] concentrate on the design of the methodology for risk assessment based on the AHP, but their focus does not lie in designing a risk model that better represents the real security situation. Therefore, there is a need to establish a feasible risk model and design a practical risk assessment method that reflects the ground truth.

Unlike an external attacker, an internal adversary obtains privileges before launching an attack, so the adversary is authorized to manipulate the stored data and programs. Factors critical for assessing cyber security against external attacks may not be as crucial from the internal attack perspective. Instead, evaluating the robustness of data and programs against internal attacks is the core of the security assessment. Much research has been proposed to evaluate the capabilities of data protection mechanisms, such as data encryption and digital watermarking. As for the protection of programs, comparatively little attention has been paid to evaluating program protection mechanisms such as software obfuscation and software tamperproofing [16].

In contrast to the existing security assessment methods, this dissertation offers solutions for assessing cyber security in different scenarios and test cases. We present several quantitative assessments of cyber security in terms of both external and internal attacks. We develop a wireless risk assessment method, which is composed of a risk model and an assessment measure. The risk model models wireless network risk from the aspects of the security requirements, the wireless attacks and the configurations, where the wireless attacks fall into the category of external attacks. The assessment measure is an algorithm for determining the risk value based on the risk model. To remedy the deficiencies of the existing methods (attack graph-based and AHP-based methods), we extend an existing 3-layer AHP hierarchy into four layers by taking device configurations into account. The additional layer is constructed to consider the impacts of incorrect configurations and to deal with the frequently changing configuration of a wireless network.

Our 4-layer hierarchy consists of the risk layer (1st layer), the requirement layer (2nd layer), the attack layer (3rd layer) and the configuration layer (4th layer), so that the vulnerabilities, the wireless attacks and the attack targets within a wireless network are considered by our method. The separate layers make it easy to incorporate dynamic configurations, since only the 4th layer is re-built when a configuration change is detected. Further, since our hierarchy is developed per device, we can easily establish or remove the corresponding hierarchy when a device joins or leaves the network. Only the related hierarchy needs to be developed or removed, instead of all the hierarchies within the network. Therefore, the computing load resulting from the dynamics of the network can be reduced. On the basis of the per-device hierarchy, we propose an assessment measure to calculate the value of wireless network risk.
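The bookkeeping described above can be illustrated with a minimal Python sketch. It is not the risk model of Chapter 3: the class names `DeviceHierarchy` and `WirelessNetwork`, their fields, and the sample devices and settings are all hypothetical, and no risk value is computed. The sketch only shows that a configuration change touches the 4th layer of one device and that joining or leaving affects a single hierarchy.

```python
from dataclasses import dataclass, field


@dataclass
class DeviceHierarchy:
    """Per-device hierarchy; layer 1 (risk) is the root and holds no data here."""
    requirements: list          # layer 2: security requirements
    attacks: list               # layer 3: wireless attacks
    configuration: dict         # layer 4: current device configuration
    config_rebuilds: int = 0    # how many times layer 4 has been rebuilt

    def update_configuration(self, new_configuration):
        # Rebuild only the configuration layer; layers 2 and 3 stay as they are.
        self.configuration = dict(new_configuration)
        self.config_rebuilds += 1


@dataclass
class WirelessNetwork:
    hierarchies: dict = field(default_factory=dict)

    def join(self, device_id, requirements, attacks, configuration):
        # A joining device gets its own hierarchy; the others are not rebuilt.
        self.hierarchies[device_id] = DeviceHierarchy(
            list(requirements), list(attacks), dict(configuration))

    def leave(self, device_id):
        # A leaving device only removes its own hierarchy.
        self.hierarchies.pop(device_id, None)


# Hypothetical devices and settings, for illustration only.
network = WirelessNetwork()
network.join("laptop", ["confidentiality"], ["eavesdropping"], {"encryption": "none"})
network.join("phone", ["availability"], ["DoS"], {"encryption": "WPA2"})
network.hierarchies["laptop"].update_configuration({"encryption": "WPA2"})
network.leave("phone")
```

After these calls only the laptop's hierarchy remains, and only its configuration layer has been rebuilt once.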

Regarding program protection against internal attacks, a program is expected to be more robust against being understood or modified by attackers after a protection mechanism is applied. Software obfuscation is a technique to shield a program from reverse engineering [17, 18, 19, 20, 21]. Collberg et al. [17, 22] classified software obfuscation and proposed several approaches. One approach is control-flow obfuscation, which tries to disguise the real control flow of an original program by re-ordering and obscuring its execution paths, producing an obfuscated program with higher robustness than the original one. Software tamperproofing is another well-known program protection mechanism. It aims not only at making tampering difficult but also at detecting and responding to modification [23]. Obfuscation is beneficial to tamperproofing, since an obfuscated program, being harder to understand, makes it more difficult for an adversary to discover the exact software instructions that he would like to tamper with. Tamperproofing is usually combined with obfuscation in practice. Therefore, this dissertation focuses on evaluating software obfuscation to analyze its effects upon software robustness. The evaluation result can then lead to further measurement of the software robustness enhanced by a tamperproofing mechanism.

To evaluate various control-flow obfuscating transformations, we present an abstract framework for formalizing and modeling them. In this framework, we describe a control-flow obfuscating transformation as a transformation on program control flow graphs (CFGs). A control-flow obfuscating transformation can be viewed as a function that accepts the original program's CFG as input and yields a modified CFG. By analyzing existing transformations, we observed that many of them can be decomposed into a sequence of basic building blocks. Thus, we identify a set of atomic operators for graph transformations that are guaranteed to preserve the functional behavior of the program and hence can be used as building blocks of a control-flow obfuscating transformation. By composing instances of these atomic operators in sequence, we can build many kinds of control-flow obfuscating transformations. This helps to understand and classify many prior control-flow obfuscation proposals and may help in devising new candidate obfuscating transformations.
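As a rough illustration of treating a transformation as a function on CFGs built from atomic operators, the following Python sketch represents a CFG as a dictionary from block names to successor lists. The two operators are simplified stand-ins loosely named after operators listed for Chapter 4; their definitions here, and the helper `compose`, are assumptions made for this example rather than the formal definitions of the framework.

```python
from functools import reduce


def insert_dummy_block(cfg, src, dst, new):
    """Illustrative atomic operator: place a dummy block `new` on the edge src -> dst."""
    out = {node: list(succ) for node, succ in cfg.items()}
    out[src] = [new if s == dst else s for s in out[src]]
    out[new] = [dst]
    return out


def split_block(cfg, node, tail):
    """Illustrative atomic operator: split `node` into `node` followed by `tail`."""
    out = {n: list(succ) for n, succ in cfg.items()}
    out[tail] = out[node]   # the tail inherits the original successors
    out[node] = [tail]      # the head now falls through to the tail
    return out


def compose(*operators):
    """Build a transformation by applying atomic operators in sequence."""
    return lambda cfg: reduce(lambda graph, op: op(graph), operators, cfg)


original = {"entry": ["A"], "A": ["exit"], "exit": []}
transform = compose(
    lambda g: insert_dummy_block(g, "entry", "A", "D1"),
    lambda g: split_block(g, "A", "A2"),
)
obfuscated = transform(original)
```

Each operator returns a new graph and leaves its input untouched, so the original and obfuscated CFGs can be compared afterwards.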

On the basis of the formal representation of a transformation, we propose metrics that we conjecture may be related to the robustness of an obfuscated program, in comparison with the original program, against reverse engineering. Our framework with such metrics helps to statically analyze and evaluate software robustness in terms of control-flow obfuscating transformations, although it does not support dynamic analysis of reverse engineering. In addition, we explain how to evaluate the code-size overhead introduced by a control-flow obfuscating transformation on the basis of our framework. Our approach characterizes the space penalty of each individual atomic operator. We can then easily estimate the overhead an obfuscating transformation yields from the formalization of the transformation.
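The idea of estimating overhead from per-operator penalties can be sketched as follows. This is only an illustration under two assumptions: that the penalties of the operators in a sequence simply add up, and that the penalty values are the hypothetical placeholders shown, which are not taken from Table 5.1.

```python
# Hypothetical per-operator space penalties, counted in added code blocks.
# These placeholder numbers are NOT the values of the dissertation's Table 5.1.
SPACE_PENALTY = {
    "insert_dummy_block": 1,
    "insert_opaque_predicate": 2,
    "split_block": 0,
}


def estimate_overhead(operator_sequence, penalty=SPACE_PENALTY):
    """Sum the penalties of the atomic operators in a formalized transformation.

    Additivity of the penalties is an assumption of this sketch.
    """
    return sum(penalty[name] for name in operator_sequence)


# A hypothetical transformation formalized as a sequence of three operators.
overhead = estimate_overhead(
    ["insert_dummy_block", "split_block", "insert_opaque_predicate"])
```

With these placeholder penalties the estimate depends only on which operators appear in the formalization, so it can be obtained at design time without building the obfuscated program.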

The novel contributions of this dissertation are:

• We propose assessment methods of cyber security. The assessment methods concern the scenarios of both external and internal attacks.

• We present an extended AHP-based method for wireless risk assessment. The method models wireless risk according to the widely adopted definition of risk, the realistic attacks and the current system configurations. In addition, the method addresses the computing loads caused by the dynamics of a wireless network.

• We show a framework to evaluate software robustness enhanced by control-flow obfuscation. The framework can not only formalize existing control-flow obfuscating transformations but is also flexible enough to express new ones. In addition, our framework is helpful in evaluating not only software robustness but also the space penalty caused by obfuscation at the design stage.

• We propose metrics that we conjecture may be helpful in measuring wireless risk and the capability of control-flow obfuscation. We argue that the small risk values and the large capability values derived by our metrics are necessary but not sufficient for security. The metrics can thus be a useful index for administrators to adjust network configurations or select proper protection mechanisms.

Synopsis. Chapter 2 introduces the related work on cyber security assessment, including network risk assessment and evaluation of software obfuscation. Chapter 3 explains our risk assessment method, which is designed based on the analytic hierarchy process. We also present metrics and a measure algorithm for assessing wireless network risk. In Chapter 4, we first review the background of CFGs and describe the proposed atomic operators for formalizing control-flow obfuscation. The formalization of control-flow obfuscating transformations is specified in this chapter. Chapter 5 describes the metrics for evaluating control-flow obfuscation. The metrics are devised based on the proposed formalization. Chapter 6 gives examples to illustrate and validate our assessment methods. Finally, the last chapter states the conclusions of this dissertation.


Chapter 2 Related Work

We review the existing methods of assessing cyber security in this chapter. We also discuss the advantages and insufficiencies of these methods to clarify the motivation of this dissertation again.

2.1 Security Assessment in terms of External Attacks

In most situations, an adversary has no access to a victim system. The attacker needs to start attacking without a given privilege. He may try to gather useful information by external exploration and to exploit vulnerabilities to gain a privilege illicitly. Attack graph-based methods assess cyber security by analyzing the potential attack paths existing in a network. AHP-based methods focus on modeling the security risk yielded by multifarious factors, including various kinds of attacks, system configurations, and so on.


2.1.1 Attack Graph-Based Assessment Methods

Traditionally, tree-based analyses such as event-tree analysis and fault-tree analysis are used in quantitative risk assessment [24, 25]. Event-tree analysis produces a sequence of outcomes which may arise after the occurrence of a selected initiating event. In fault-tree analysis, an undesired event is assigned as the root of a fault tree. Administrators deduce, from the top down, the bottom events that may trigger the undesired event, and build a fault tree composed of these events. By traversing the event tree or the fault tree, we can ascertain the probability of occurrence of an undesired event. Both event-tree and fault-tree analyses, while useful, are less than satisfactory since they are not appropriate for assessing risk resulting from multiple criteria. That is because an administrator can select only one initiating event when building an event tree, or only one undesired event when building a fault tree. Therefore, the risk value deduced from the tree concerns only a single criterion.
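The fault-tree traversal mentioned above can be made concrete with a short Python sketch. It assumes the textbook setting of AND/OR gates over statistically independent bottom events; the dictionary layout, the gate names and the sample probabilities are hypothetical and are not taken from [24, 25].

```python
def top_event_probability(node):
    """Probability of a fault-tree node, assuming independent bottom events."""
    if "p" in node:                      # bottom (basic) event
        return node["p"]
    child_probs = [top_event_probability(child) for child in node["children"]]
    if node["gate"] == "AND":            # all children must occur
        result = 1.0
        for p in child_probs:
            result *= p
        return result
    if node["gate"] == "OR":             # at least one child occurs
        none_occur = 1.0
        for p in child_probs:
            none_occur *= (1.0 - p)
        return 1.0 - none_occur
    raise ValueError("unknown gate: %r" % node["gate"])


# Hypothetical tree: the undesired event occurs if (A and B) or C occurs.
fault_tree = {
    "gate": "OR",
    "children": [
        {"gate": "AND", "children": [{"p": 0.5}, {"p": 0.2}]},
        {"p": 0.1},
    ],
}
probability = top_event_probability(fault_tree)   # 1 - (1 - 0.1) * (1 - 0.1) = 0.19
```

The tree has a single root, which reflects the limitation discussed above: the resulting value concerns one undesired event only.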

To remedy the deficiencies of the tree-based methods, Phillips et al. proposed in 1999 an approach to modeling network risk based on an attack graph [2], which draws the paths that may lead from various initial states to an unexpected state of a network. A node in the graph indicates a system state, and an edge is an action causing a transition from one state to another. An attack graph is generally developed from attack templates, system configurations and attack capabilities [2, 26, 27]. Attack templates mainly describe the pre- and post-conditions of attacks. The conditions may contain information about user level, vulnerabilities, capabilities, etc. System configurations describe the details of the network system. A configuration file should contain the following information: machine type, operating system, open ports, services, network type, and so on. In an attack graph, attack capabilities can be represented as the initial states. The attack capabilities are one of the factors determining the probability of success of an attack.
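A minimal sketch of how states, actions and templates fit together is given here. It is a simplification for illustration, not the generation procedure of [2]: a state is modeled as a set of facts, a template as a hypothetical `(name, preconditions, postconditions)` triple, and the graph is expanded breadth-first from the initial state.

```python
from collections import deque


def build_attack_graph(initial_state, templates):
    """Expand system states breadth-first; each edge is one attack action."""
    start = frozenset(initial_state)
    states = {start}
    edges = []
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for name, pre, post in templates:
            # A template applies when its preconditions hold and it adds something new.
            if pre <= state and not post <= state:
                successor = state | post
                edges.append((state, name, successor))
                if successor not in states:
                    states.add(successor)
                    queue.append(successor)
    return states, edges


# Hypothetical templates: (name, preconditions, postconditions).
templates = [
    ("exploit_service", frozenset({"network_access"}), frozenset({"user_on_host"})),
    ("escalate_privilege", frozenset({"user_on_host"}), frozenset({"root_on_host"})),
]
states, edges = build_attack_graph({"network_access"}, templates)
goal_reached = any("root_on_host" in state for state in states)
```

Because every state depends on the facts derived from the configuration, a configuration change invalidates the expansion and the graph has to be regenerated, which is the cost discussed for dynamic networks.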

Since attack graphs can thoroughly enumerate the possible attack paths within a network, many researchers and professionals have proposed attack graph-based network security measures. Wang et al. [7, 6] presented a generic framework which considers disjunctive and conjunctive dependency relationships between exploits in an attack graph. Based on this framework, an attack resistance metric has been proposed to calculate and compare the security of different network configurations. In [4], Mehta et al. presented two algorithms for ranking attack graphs to determine the probability of an attacker reaching the goal states. The first algorithm is similar to Google's PageRank algorithm, which determines the importance of webpages on the World Wide Web [28]. The authors modified Google's algorithm to find the probability of reaching a certain system state from the initial state. The second algorithm ranks individual states of an attack graph in a random simulation in which the transition probability from state $s_i$ to $s_j$ equals the reciprocal of the number of successors of state $s_i$.
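The uniform transition rule can be sketched in a few lines of Python. This is not the ranking algorithm of [4]: it only propagates probability mass from the initial state for a fixed number of steps, giving each successor of a state the share 1/(number of successors). Keeping the mass in states without successors is an assumption of this sketch, and the example graph is hypothetical.

```python
def state_probabilities(successors, initial, steps):
    """Distribution over states after `steps` moves of a uniform random walk."""
    dist = {state: 0.0 for state in successors}
    dist[initial] = 1.0
    for _ in range(steps):
        nxt = {state: 0.0 for state in successors}
        for state, mass in dist.items():
            succ = successors[state]
            if not succ:                  # no successors: the mass stays (assumption)
                nxt[state] += mass
                continue
            share = mass / len(succ)      # transition probability = 1 / #successors
            for target in succ:
                nxt[target] += share
        dist = nxt
    return dist


# Hypothetical attack graph given as state -> list of successor states.
graph = {
    "s0": ["s1", "s2"],
    "s1": ["goal"],
    "s2": ["s1", "dead_end"],
    "goal": [],
    "dead_end": [],
}
ranking = state_probabilities(graph, "s0", steps=3)
```

In this example three steps are enough for all probability mass to settle in the two states without successors.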

[3] and [26] presented an analysis method for determining a minimal set of attacks that must be prevented to keep the goal state from being reached. They also explained how to interpret an attack graph as a Markov Decision Process to perform quantitative reliability analysis. A number of researchers have proposed risk assessment and security analysis methods based on Bayesian network-based attack graphs [8, 29, 30]. Bayesian networks enable system administrators to determine the probability of a particular attack being executed from a given initial system state according to the conditional dependencies among the states passed. Dantu et al. [31, 32] also used a Bayesian network-based attack graph for security risk management. The authors integrated behavior-based profiles with the Bayesian network-based attack graph to estimate the risk level based on an attacker's behavior.

The attack graph-based methods are widely used in network security analysis and assessment since an attack graph provides elaborate information about attacks which exploit vulnerabilities existing in a network. However, generation of an attack graph requires high time complexity. In [33], Ou et al. pointed out the complexity of the attack graph generation algorithm of Ammann et al. [34] is $O(N^6)$ in terms of network size. Ou's algorithm has $O(N^2)$ complexity under the assumption of constant table look-up time. In 2005, Ammann et al. [35] proposed an algorithm to track only “good” attack paths, instead of all possible attack paths. The algorithmic complexity is polynomial in the size of the network.

According to the discussion in the literature, the complexity of generating an attack graph is a critical issue for the attack graph-based assessment methods. To assess the security of a dynamic network environment, the whole attack graph must be redrawn, because the paths of an attack graph depend tightly on the exploited vulnerabilities and on the nodes. Periodically redrawing the attack graph of a dynamic network, such as a wireless network, could lead to a heavy load because topologies and configurations usually change frequently.


2.1.2 AHP-Based Assessment Methods

The AHP is a structured approach for solving decision-making problems. It is appropriate for complex decisions which involve various decision elements that are difficult to quantify. The AHP consists of developing a hierarchy of decision elements and constructing the relationships between the elements. A weight is set for each element to represent these relationships. The AHP has been applied in many fields, including network risk assessment [9, 10, 11, 12, 13].
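In standard AHP practice the weights are derived from a pairwise comparison matrix, in which entry (i, j) states how much more important element i is than element j. The Python sketch below uses the row geometric-mean approximation of the priority vector; whether the cited methods use this or the principal-eigenvector computation is not stated here, so the choice is an assumption, and the comparison values are hypothetical.

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric-mean method."""
    n = len(pairwise)
    geometric_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geometric_means)
    return [g / total for g in geometric_means]


# Hypothetical, perfectly consistent comparisons of three elements:
# element 0 is twice as important as element 1 and four times as important as element 2.
comparisons = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = ahp_weights(comparisons)   # approximately [4/7, 2/7, 1/7]
```

For a perfectly consistent matrix such as this one, the geometric-mean weights coincide with the eigenvector weights; for inconsistent expert judgments the two can differ slightly.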

Wang and Zeng [12] presented a method of assessing information security risk based on the AHP. They quantified the security risk by integrating the AHP with fuzzy mathematics and an artificial neural network. Zhang et al. [13] proposed an AHP-based risk assessment method for information security. They adopted a group decision-making method to combine the assessment results from individual experts. [12] and [13] concentrate on the design of the methodology and say little about developing the risk model upon the AHP.

In [9, 10, 11], Zhao et al. constructed 3-layer hierarchical structures based on the AHP to model wireless network risk. The top layer of their structure is the goal of the risk assessment. The middle layer introduces the rules for weighting the risk factors in terms of probability, impact severity and uncontrollability. The combination of these factors leads to a potential risk value of the network. Illegal actions and system faults which may influence the above elements are listed in the bottom layer. In [9], entropy theory was introduced to determine the coherence of expert experiences. In 2007, Zhao et al. extended their previous risk assessment method, which was proposed in 2005 [9], by including mobile IP security and wireless interference in the bottom layer to assess the security risk of a wireless network [11].

Compared to the attack graph-based methods, the AHP-based methods require lower time complexity to generate a network risk model. Thus, the AHP could be a convincing basis for modeling and assessing the security of a wireless network with changing configurations and dynamic topologies. Moreover, these hierarchical structures, composed of critical elements for wireless network risk assessment, are useful for measuring network security systematically. However, the existing work ([9], [10] and [11]) simply discusses how the risk factors affect network security without considering the impacts resulting from practical configurations and network topologies. Because incorrect configuration is the main reason for system vulnerability in both wired and wireless networks, the existing 3-layer structures are deficient in modeling network risk.

2.2 Security Assessment in terms of Internal Attacks

Since an internal adversary has access to a victim system, most security mechanisms cannot prevent the adversary from reaching or stealing contents such as data or software within the system. Nevertheless, some mechanisms try to obstruct an adversary from understanding, interpreting or modifying the contents even if the adversary obtains them. Therefore, security assessment in terms of internal attacks should be concerned with evaluating the security of the corresponding protection mechanisms. Software protection, and its evaluation, has received comparatively little attention compared to data protection and the evaluation of data protection. Since this dissertation aims at providing holistic security assessments, we concentrate on devising the evaluation of software protection.

Various software protection mechanisms have been proposed to accomplish distinct objectives. The mechanisms and their objectives are described as follows.

• Software watermarking aims to discourage intellectual property theft, or to prove the ownership of the software when theft occurs, by embedding a watermark into the software.

• Software tamperproofing tries to increase the difficulty of tampering with software and is able to detect changes if the software is tampered with.

• Software obfuscation aims at obscuring software to protect it from being understood or reverse engineered.

Software watermarking and software tamperproofing are generally combined with software obfuscation, since an obfuscated program, being harder to understand, makes it more difficult for an adversary to figure out the embedded watermarks or to discover the exact software instructions that he would like to tamper with. Therefore, the result of evaluating obfuscation not only indicates the capability of software obfuscation but can also feed into further security assessment of software watermarking and tamperproofing. It is therefore crucial to begin the evaluation of software protection with the evaluation of obfuscation.

2.2.1 Evaluation of Software Obfuscation

Software obfuscation increases the difficulty of reverse engineering by transforming an original program into an obfuscated one which thwarts reverse engineering but preserves the original functionality [17]. Despite the theoretical proof of the impossibility of omnipotent obfuscation [36], obfuscation can still achieve positive results in specific situations [37], and implementations of obfuscation have been widely discussed [17, 18, 19, 20, 21]. According to [17], obfuscation is classified into three types: control-flow obfuscation, data obfuscation and layout obfuscation. Control-flow obfuscation disguises the real execution under the scrambled control flow of a program to make reverse engineering difficult. Various implementations have been introduced to accomplish control-flow obfuscation [18, 38, 39, 40]. Data obfuscation transforms data and data structures in a program without modifying the original functionality. [21] and [41] presented obfuscating transformations by extending the concept of data obfuscation. Layout obfuscation removes the information that an attacker can glean from the program. Most commercial obfuscators, such as Dotfuscator [42], DashO [43], Zelix [44] and ProGuard [45], adopt the basic idea of layout obfuscation.

Each type of obfuscation provides effective though limited resistance against malicious reverse engineering. In recent years, many researchers have proposed evaluation methods to assess the effectiveness of an obfuscating transformation. The methods are mostly based on empirical analysis, which evaluates an obfuscating transformation by running practical experiments to observe how well the obfuscated program resists deobfuscators or how much time a human subject takes to interpret it [46, 47, 48, 49, 50, 51].

Udupa et al. [52] examined control flow flattening, a control-flow obfuscating transformation, by measuring the time required for automatic deobfuscation. Anckaert et al. [47] introduced a framework to evaluate an obfuscating transformation based on software complexity metrics, which calculate the complexity with respect to instructions, control flow, data flow and data. The authors implemented three obfuscating transformations (control flow flattening, static disassembly thwarting and binary opaque predicates) and applied them to eleven C programs of the SPECint 2000 benchmark suite to produce obfuscated programs. The results of the complexity analysis show that all three transformations provide non-negative effects, but that binary opaque predicates are less potent than the other two. Majumdar et al. [48, 53] considered a specific reverse-engineering technique, slicing, and developed metrics to evaluate the capability of obfuscation against that technique. [48] and [53] presented three obfuscating transformations (bogus predicate, adding to a while loop, and variable encoding) and applied them to five example programs to derive values for the defined metrics. The metric values imply that these transformations make reverse engineering significantly more difficult.
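To give a feel for a control-flow complexity metric, the Python sketch below computes McCabe's cyclomatic number, E - N + 2P, on a CFG before and after a branch is added. McCabe's number is used here only as a familiar example of such a metric; it is not claimed to be the exact metric of [47], and the two small graphs are hypothetical.

```python
def cyclomatic_complexity(cfg, components=1):
    """McCabe's cyclomatic number E - N + 2P for a CFG given as node -> successors."""
    nodes = len(cfg)
    edges = sum(len(successors) for successors in cfg.values())
    return edges - nodes + 2 * components


# Hypothetical straight-line program.
original = {"entry": ["A"], "A": ["exit"], "exit": []}

# The same program after a predicate block "P" adds a branch to a dummy block.
obfuscated = {
    "entry": ["P"],
    "P": ["A", "dummy"],
    "dummy": ["A"],
    "A": ["exit"],
    "exit": [],
}

before = cyclomatic_complexity(original)     # 2 - 3 + 2 = 1
after = cyclomatic_complexity(obfuscated)    # 5 - 5 + 2 = 2
```

A metric of this kind is computed on one concrete program, which is why results obtained this way describe a specific program and transformation pair.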

Ceccato et al. [49, 50] assessed the difficulty an attacker would encounter in examining identifier renaming, one of the obfuscation techniques, by questionnaires. The authors asked human subjects to interpret the original and obfuscated programs and to fulfill a comprehension task. The subjects were also asked to fill in a post-task survey questionnaire to describe their behavior during the task and the confidence about it. Certain types of statistical tests, such as the Mann-Whitney test and the Wilcoxon test, were adopted to analyze the task results and the questionnaires. The analysis results point out that identifier renaming effectively reduces the capability of the subjects to understand the source code.

The existing work [47, 48, 49, 50, 52] evaluated the effectiveness of obfuscation by empirical studies. Practical experiments were performed to measure individual obfuscating transformations according to the defined metrics or the perception of human subjects. These experiment results indicate the relation specifically between a designated original program and a single obfuscating transformation. When the same obfuscating transformation is applied to another program, the experiment results may not be applicable for determining the capability of that transformation in the new case. In addition, the existing experiment results of evaluating individual transformations cannot help determine the effectiveness of a compound obfuscating transformation, which comprises several separate transformations. A formal method is thus required for evaluating obfuscating transformations at a high level of abstraction.

Preda and Giacobazzi [54] proposed a formal method for analyzing the effect of a control-flow obfuscating transformation based on program semantics. They considered a specific control-flow obfuscating transformation, which obscures the control flow by inserting opaque predicates. They evaluated the transformation by analyzing the effects of the opaque predicates inserted. They also modeled attackers for comparing obfuscating transformations. Their method is the closest to ours, which evaluates control-flow obfuscating transformations based on formal analysis as well. However, our method can formalize and evaluate more types of control-flow obfuscating transformations, not limited to the type of inserting opaque predicates.

2.3 Summary

We reviewed and analyzed the attack graph-based and the AHP-based methods for network risk assessment in terms of external attacks. The analysis showed that there is a need to propose a new risk assessment method, which can represent the risk in the real world and is capable of addressing the dynamics of a network. We also discussed the existing methods of evaluating obfuscation. Most of them are empirical and examine the effects of obfuscating transformations by experiments; however, a formal method is necessary to help system administrators systematically and effectively assess the capability of obfuscation.


Chapter 3 Risk Assessment of Wireless Networks

Risk assessment is critical for risk mitigation and security enhancement. It can be applied to several different realms to address risk management, such as information technology, chemical industry and financial industry. Despite variation in the application realms, risk assessment takes into account calculations of two components of risk, the magnitude of the potential loss and the probability that the loss will occur. Then the assessment result is used as a reference for identifying proper controls in treating or eliminating risk during the following process, such as risk treatment and risk mitigation, in a security risk management standard [55, 56, 57].

The dynamics of wireless networks make network security evaluation and management a critical challenge. To help a network administrator effectively manage wireless network security, it is essential to design a risk assessment method which derives a risk value for the administrator to easily understand the security of the managed network. A feasible risk assessment method has to reasonably model wireless network risk and measure the risk value according to the characteristics of the network. Network risk is defined as “a function of the likelihood of a given threat-source’s exercising a particular potential vulnerability, and the resulting impact of that adverse event on the organization” [55]. According to the definition, network risk can be interpreted as the resulting impact which results from the likelihood, the threat sources and the vulnerabilities.

To fulfill the definition, we propose a risk model (4-RAH), shown in Figure 3.1, to describe the wireless network risk. The top layer of our model represents the impact severity which threatens the security requirements (2nd layer) of a wireless network. The impact severity is determined in terms of the factors named in the definition: likelihood, threat sources and vulnerabilities. Our model introduces the attack layer (3rd layer) and the configuration layer (4th layer) to indicate the threat sources and the vulnerabilities, respectively, where the edges between the layers represent the likelihood mentioned in the definition. We construct a hierarchy for each device, and then based on the hierarchy, we propose an assessment measure which contains a newly defined historical vulnerability metric and an algorithm to determine the network risk value. Our risk assessment takes the real-world situation into account and the evaluated result helps an administrator understand a network’s weak points and their impacts. Therefore, our risk assessment can be a useful reference in designating security policy and improving network security.


[Figure 3.1: The 4-RAH risk model. Layer 1: Impact Severity. Layer 2: Confidentiality, Integrity, Availability. Layer 3: Attack 1, Attack 2, ..., Attack n. Layer 4: Configuration 1, Configuration 2, ..., Configuration m.]


3.1 Preliminaries

This section defines the symbols used in Chapter 3.

$\alpha_i$    Severity of the $i$th vulnerability

$\beta$    Decaying speed of the exponential function

$\lambda_i$    Age of the $i$th vulnerability

$A_i^{ap}$    The $i$th attack targeting an access point (AP)

$A_i^{sta}$    The $i$th attack targeting a wireless station (STA)

$ahvm(dev_i)$    Value of the $i$th device ($dev_i$), determined by the aggregated historical vulnerability measure (AHVM)

$D$    Degree matrix of a given device. The matrix dimension is $n_a$-by-$n_r$. The entry $d_{ij}$ represents the impact that the $i$th attack $A_i$ imposes on the $j$th security requirement.

$hvm(ser_i)$    Value of the $i$th service, determined by the historical vulnerability measure (HVM)

$\overline{hvm}(ser_i)$    Normalized $hvm(ser_i)$

$I$    Impact severity upon a device

$ihvm(dev_i)$    Value of the $i$th device, determined by the integrated historical vulnerability metric (IHVM)

$\overline{ihvm}(dev_i)$    Normalized $ihvm(dev_i)$

$n_a$    Number of attacks

$n_a^{ap}$    Number of attacks targeting APs

$n_a^{sta}$    Number of attacks targeting STAs

$n_d$    Number of wireless devices in a network

$n_d^{sta}$    Number of STAs in a network

$n_g$    Number of configurations

$n_r$    Number of security requirements

$n_s$    Number of services running on a device

$n_v$    Number of vulnerabilities of a service

$\hat{p}$    Probability vector. The $i$th entry $p_i$ denotes the probability of acquiring the $i$th configuration.

$\hat{r}$    Risk level vector. The $i$th entry $r_i$ reflects the help that a captured configuration may offer to an attacker.

$T$    Total impact severity upon a wireless network

$\hat{w}_g$    Weight vector of configurations, an $n_a$-dimension column vector. The $i$th entry $w_{g_i}$ reveals the impact leading to the $i$th attack $A_i$, where the impact varies with the configurations of a wireless system.

$\hat{w}_r$    Weight vector of requirements, an $n_r$-dimension column vector. The $i$th entry $w_{r_i}$ represents the weight of the $i$th security requirement when deriving the total impact severity.

3.2 Risk Model: Four-Layer Risk Analytic Hierarchy

To accomplish the definition of network risk [55], 4-RAH is proposed to model the wireless network risk with four layers: risk, requirements, attacks and configurations, respectively.


3.2.1 Risk Layer

The first layer (risk layer) contains only a root node, representing the impact severity upon a wireless network when the security requirements of the network are not achieved.

3.2.2 Requirement Layer

We introduce the credible network security requirements, confidentiality, integrity and availability, into the 2nd layer of 4-RAH.

• Confidentiality is imperiled when information is available or disclosed to unauthorized users. Different attacks aim for different targets. For instance, an eavesdropping attack launches impacts on network traffic confidentiality, while a penetration attack causes damage to memory data confidentiality. In this paper, loss of confidentiality can occur in multifarious targets according to the types of attacks.

• Integrity is damaged if data or messages are executed, modified, suspended, copied, replayed or deleted by an illicit user. Because attackers may be interested in attacking different targets such as network traffic or memory data, the integrity mentioned in this dissertation varies with the types of attacks.

• Availability mainly focuses on whether a service operation is affected by an attack, or whether an authorized user can access a network service they should. The availability mentioned in this dissertation is endangered if the service or server is spoofed, penetrated, or suspended, and cannot operate as expected.

3.2.3 Attack Layer

In 4-RAH, the third layer (attack layer) represents attacks which may damage the security requirements listed in the second layer. An attack may pose different impacts on different security requirements, which have specific concerns on various targets, such as bandwidth, network traffic, programs, or computers. The targets may suffer different risks even though they are under the same attack. Taking a beacon flood attack as an example, the attack succeeds when targeting the bandwidth, but fails if it intends to attack a program. In our model, the attack layer analyzes the attacks not only in terms of their behavior, but also in terms of their impacts with respect to the attack targets and the security requirements. Because the impacts of attacks depend on the sequence in which they are carried out, we define two types of impacts to express the relationship in the attacking sequence: direct and indirect.

• Direct impact: the impact lays on the security requirements initially targeted by an attack.

• Indirect impact: the impact is a side effect accompanied by the direct impact from the previous attack.

For example, an eavesdropping attack imperils traffic confidentiality by maliciously sniffing wireless network packets. It poses a direct impact upon traffic confidentiality, and no direct impact on other targets, such as a file or a program. The packets sniffed by an eavesdropper can become a prerequisite for a subsequent attack, such as a replay attack, and thus further endanger traffic integrity. Hence, an eavesdropping attack results in an indirect impact on traffic integrity. When evaluating the impacts caused by an attack, the union of direct and indirect impacts should be considered.

After analyzing the existing wireless attacks, we categorize wireless attacks into five types, including scan or monitor, masquerade, Denial of Service (DoS), key cracking, and penetration attacks, with respect to their behavior and intentions.

• Type I: Scan or Monitor attacks

Scan attacks intend to search for accessible wireless networks. Monitor attacks aim at gaining useful, critical information about a victim network by intercepting packets over the air and analyzing network traffic. Such attacks include war driving, eavesdropping, active scan attacks, etc. Because Type I attacks try to obtain critical information, most of them directly impact network traffic confidentiality.

• Type II: Masquerade attacks

An attacker masquerades as a legitimate user to access a wireless network, or as a legitimate device to pirate network traffic or disable a functioning access point (AP). Once the attacker has snatched the identity of a victim successfully, the victim can no longer access the network, or the attacker can then provide network service to other illicit users. Thus this type of attack directly impacts availability. With the counterfeit identity, the masqueraded user can easily capture or reach private information so that confidentiality and integrity are usually threatened as well.

• Type III: DoS attacks

Denial of Service (DoS) attacks aim at making computers or network resources unavailable to legitimate users. Attackers take advantage of the paralysis period to launch other attacks. Then, they can devastate the network security severely. Because service requests are denied under this type of attack, the direct impact is against availability.

• Type IV: Key cracking

Key cracking attacks try to recover WEP [58, 59] or WPA [60] keys by analyzing numerous packets. After cracking the protection keys, all requirements (confidentiality, integrity, and availability) are harmed.

• Type V: Penetration attack

This kind of attack attempts to penetrate a victim system through system vulnerabilities. After the success of the attack, the attacker can control the files, the programs, even the computer such that data confidentiality, data integrity, or service availability may be destroyed. All three security requirements are threatened under this type of attack.

3.2.4 Configuration Layer

To launch some attacks toward a wireless network, an attacker needs to obtain certain network information or device configurations, such as IP addresses of wireless stations (STAs) or APs, Media Access Control (MAC) addresses of STAs or APs, Service Set Identifiers (SSIDs), wireless channels, OS versions, running services, etc. In 4-RAH, the 4th layer (configuration layer) exhibits configurations of wireless devices and wireless networks. The following paragraphs discuss some configurations required to launch certain attacks. More configurations can be added to this layer when needed.

• IP address is one of the prerequisite configurations for an attacker to identify a victim in an IP network. Attacks of Type II, III, and V require such a configuration.

• MAC address is one of the configurations required to identify the physical address of a victim. Attacks of Type II, III, and IV require this configuration.

• SSID is one of the prerequisite configurations when an attacker attempts to connect or scan a specific wireless local area network. Attacks of Type II, III, and IV need this configuration.

• Wireless channel is one of the configurations required to launch key cracking attacks. Attacks of Type IV require such a configuration.

• OS version is one of the configurations required to obtain the possible vulnerabilities of a victim. Type V attacks require this configuration.

• Running services and open ports are useful configurations to penetrate a victim. Type V attacks need this configuration.

Table 3.1 lists the five attack types and their relations with the security requirements and prerequisite configurations. Note that an attacker can start Type I attacks without prerequisite configurations, though the performance of the attacks can be enhanced if the attacker obtains more network configurations.

Table 3.1: Types of Attacks

Type  Direct impacts  Indirect impacts  Prerequisite configurations              Attacks
I     C               I, A              None                                     War driving, eavesdropping, etc.
II    C, I, A         -                 STA IP, AP IP, STA MAC, SSID, etc.       Evil twin, IP spoofing, TCP hijacking, etc.
III   A               -                 STA MAC, AP MAC, SSID, etc.              Beacon flood, association flood, etc.
IV    C, I, A         -                 AP MAC, SSID, channel, etc.              WEP/WPA key cracking
V     C, I, A         -                 STA IP, ports, running services, etc.    Penetration attack, etc.

C: confidentiality  I: integrity  A: availability
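The relations in Table 3.1 can be held in a small lookup structure. The following is a minimal sketch, not part of the original method: the dictionary layout and the function names `total_impact` and `launchable_types` are hypothetical, the prerequisite sets contain only the configurations explicitly listed in the table (the "etc." entries are omitted), and the sketch assumes an attack type becomes launchable only when all of its listed prerequisites are known.

```python
# Illustrative encoding of Table 3.1 (layout and names are assumptions).
ATTACK_TYPES = {
    "I":   {"direct": {"C"}, "indirect": {"I", "A"}, "prerequisites": set()},
    "II":  {"direct": {"C", "I", "A"}, "indirect": set(),
            "prerequisites": {"STA IP", "AP IP", "STA MAC", "SSID"}},
    "III": {"direct": {"A"}, "indirect": set(),
            "prerequisites": {"STA MAC", "AP MAC", "SSID"}},
    "IV":  {"direct": {"C", "I", "A"}, "indirect": set(),
            "prerequisites": {"AP MAC", "SSID", "channel"}},
    "V":   {"direct": {"C", "I", "A"}, "indirect": set(),
            "prerequisites": {"STA IP", "ports", "running services"}},
}


def total_impact(attack_type):
    """Union of the direct and indirect impacts of an attack type."""
    entry = ATTACK_TYPES[attack_type]
    return entry["direct"] | entry["indirect"]


def launchable_types(known_configurations):
    """Attack types whose listed prerequisites are all known to the attacker."""
    known = set(known_configurations)
    return sorted(t for t, entry in ATTACK_TYPES.items()
                  if entry["prerequisites"] <= known)


if __name__ == "__main__":
    print(total_impact("I"))
    print(launchable_types(["AP MAC", "SSID", "channel"]))
```

Type I always appears in the result of `launchable_types` because it has no prerequisite configuration, which matches the note accompanying the table.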

3.3 Integrated Historical Vulnerability Metric

In our risk assessment method, we define an integrated historical vulnerability metric (abbreviated to IHVM), evolving from the historical vulnerability measure (HVM) and the aggregated historical vulnerability measure (AHVM) proposed in [61], to determine the risk value of a device based on existing vulnerabilities.


3.3.1 HVM and AHVM

HVM measures the risk level that the vulnerabilities of a service impose on the service, and weights the vulnerabilities in terms of their ages [61]. The authors of [61] assumed that a vulnerability discovered a long time ago should carry a small weight because it is likely to have been understood and patched as time passes. Therefore, the age of a vulnerability is introduced in the decaying function of Eq. 3.1. The authors of [61] showed that $hvm(ser)$ can imply the probability that service $ser$ will become vulnerability-prone in the future.

\[ hvm(ser) = \ln\left(1 + \sum_{i=1}^{n_v} \alpha_i \times \exp(-\beta \times \lambda_i)\right). \tag{3.1} \]

$\alpha_i$ and $\lambda_i$ indicate the severity and the age of the $i$th vulnerability, and $\beta$ denotes the decaying speed of the exponential function.

Not all of the vulnerabilities of service $ser$ need to be counted because the effect of a vulnerability usually declines with age, approaching zero. If only the latest $n$ vulnerabilities of service $ser$ are considered, then we can derive the normalized value $\overline{hvm}(ser)$ from $hvm(ser)$, as represented in Eq. 3.2.

\[ \overline{hvm}(ser) = \frac{hvm(ser)}{\ln(1 + 10 \times n)}, \quad \text{where } 0 \le \overline{hvm}(ser) \le 1. \tag{3.2} \]
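Eqs. 3.1 and 3.2 can be evaluated with a few lines of code. The sketch below is illustrative only: the function names, the representation of a vulnerability as a (severity, age) pair, and the sample values of $\beta$ and the ages are assumptions. It also assumes that severities lie in [0, 10], which the factor 10 in the normalization of Eq. 3.2 suggests, and that $n \ge 1$.

```python
import math


def hvm(vulnerabilities, beta):
    """Eq. 3.1: vulnerabilities is a list of (severity, age) pairs."""
    decayed = sum(alpha * math.exp(-beta * age) for alpha, age in vulnerabilities)
    return math.log(1 + decayed)


def normalized_hvm(vulnerabilities, beta, n):
    """Eq. 3.2: keep the latest n vulnerabilities (smallest ages) and normalize."""
    latest = sorted(vulnerabilities, key=lambda v: v[1])[:n]
    return hvm(latest, beta) / math.log(1 + 10 * n)


if __name__ == "__main__":
    # Hypothetical service with three vulnerabilities: (severity, age).
    service = [(7.5, 30.0), (10.0, 400.0), (4.3, 5.0)]
    print(hvm(service, beta=0.01))
    print(normalized_hvm(service, beta=0.01, n=2))
```

A service with a single brand-new vulnerability of severity 10 reaches the upper bound of 1 when $n = 1$, while a service without vulnerabilities yields 0.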

AHVM combines $hvm(ser_i)$ for all services running on a device $dev$ to measure the threats that $dev$ faces.

\[ ahvm(dev) = \ln\left(\sum_{i=1}^{n_s} \exp\left(hvm(ser_i)\right)\right), \quad \text{for all services } ser_i \text{ running on } dev. \tag{3.3} \]

However, if there is no vulnerability detected in dev, AHVM outputs an undefined value, ln 0. To address such an error, a new metric (IHVM) is proposed with our four-layer risk assessment model.

3.3.2 IHVM

IHVM is proposed to ensure the existence of the boundary values. In this metric, the notation $ihvm(dev)$ represents the value for a device $dev$ calculated by IHVM, while $\overline{ihvm}(dev)$ stands for the normalized $ihvm(dev)$.

\[ ihvm(dev) = \ln\left(1 + \sum_{i=1}^{n_s} \exp\left(\overline{hvm}(ser_i)\right)\right). \tag{3.4} \]

All services $ser_i$ running on the device $dev$ contribute to $ihvm(dev)$. The number of services is denoted by $n_s$. A higher $ihvm(dev)$ implies that the running services may contribute more severity to the device $dev$. If no service is running on $dev$, then $ihvm(dev)$ is set to 0.

After sorting $\overline{hvm}(ser_i)$ for every service $ser_i$ running on $dev$, if we only consider the top $m$ highest $\overline{hvm}(ser_i)$, then the maximum $ihvm(dev)$ becomes $\ln(1 + m \times \exp(1))$. So, we can obtain the risk level of a single device, $\overline{ihvm}(dev)$, according to the service vulnerabilities by Eq. 3.5.

\[ \overline{ihvm}(dev) = \frac{ihvm(dev)}{\ln(1 + m \times \exp(1))}. \tag{3.5} \]

As a result, $\overline{ihvm}(dev)$ falls into the range [0, 1].
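The difference between AHVM (Eq. 3.3) and IHVM (Eqs. 3.4 and 3.5) shows up when the sum over services is empty. The following sketch uses hypothetical function names and assumes that the inputs to `ihvm` are already normalized HVM values in [0, 1] and that $m \ge 1$; it is an illustration of the formulas rather than the implementation used in this dissertation.

```python
import math


def ahvm(hvm_values):
    """Eq. 3.3: raises ValueError for an empty list, since ln 0 is undefined."""
    return math.log(sum(math.exp(h) for h in hvm_values))


def ihvm(normalized_hvm_values):
    """Eq. 3.4: the added 1 keeps the logarithm defined, giving 0 for no service."""
    return math.log(1 + sum(math.exp(h) for h in normalized_hvm_values))


def normalized_ihvm(normalized_hvm_values, m):
    """Eq. 3.5: keep the top m normalized HVM values and scale into [0, 1]."""
    top = sorted(normalized_hvm_values, reverse=True)[:m]
    return ihvm(top) / math.log(1 + m * math.e)


if __name__ == "__main__":
    services = [0.9, 0.2, 0.55]  # hypothetical normalized HVM values
    print(ihvm(services))
    print(normalized_ihvm(services, m=2))
    print(ihvm([]))  # 0.0, whereas ahvm([]) would raise ValueError
```

The upper bound of 1 is reached only when each of the $m$ retained services has a normalized HVM value of 1.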

3.4 Risk Assessment Algorithm

This section explains the algorithm of our assessment measure and presents a step-by-step procedure for deriving the wireless network risk.

Step 1. Establish risk model.

Initially, an administrator needs to build up a 4-RAH, and generate degree matrices (D) of devices within a wireless network by investigating possible attacks.

Step 2. Develop experience mapping tables.

Because mobile wireless devices have certain sociological orbits, the security requirements and risks may differ by the position in a sociological orbit. This step intends to introduce expert experiences to adjust factors, and to achieve scenario-adaptive assessment.

To provide a fair or even close to fair assessment, multiple experts could be consulted, and several databases can be imported. In 2005, Zhao et al. proposed a method to evaluate the consistency of expert opinions by the entropy theory [9]. In our method, once an administrator develops the experience mapping tables, experts could be consulted to approve the experiences shown in the tables. Because the degrees of approval may be categorized into several levels, the consistency of the degrees should be further evaluated. If all the experts show the same degree level of approval, the consistency reaches the maximum. On the contrary, the consistency reaches the minimum if the degree levels distribute equally. In the end, an administrator can obtain the weighted importance from the consistency.
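One way to obtain a consistency score with the two boundary behaviors described above is to use normalized Shannon entropy. The sketch below is an assumption made for illustration and is not the formula of Zhao et al. [9]: the function name `approval_consistency`, the score $1 - H / \ln k$ for $k$ approval levels, and the sample approval levels are all hypothetical. It requires at least two levels and at least one expert answer.

```python
import math
from collections import Counter


def approval_consistency(approvals, levels):
    """Return 1 - H/H_max, where H is the entropy of the approval levels chosen."""
    counts = Counter(approvals)
    total = len(approvals)
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return 1 - entropy / math.log(len(levels))


if __name__ == "__main__":
    levels = ["low", "medium", "high"]  # hypothetical degrees of approval
    print(approval_consistency(["high", "high", "high"], levels))  # unanimous
    print(approval_consistency(["low", "medium", "high"], levels))  # evenly split
    print(approval_consistency(["high", "high", "medium"], levels))
```

Unanimous experts yield a score of 1 and an even spread over all levels yields 0, matching the maximum and minimum consistency described in Step 2.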

Step 3. Assess network risk.

This step can be further decomposed into several sub-steps.

1. Specify probability vector $\hat{p}$ and risk level vector $\hat{r}$.

According to network configurations, expert experiences, and vulnerability databases, we obtain $\hat{p}$ and $\hat{r}$, where $\hat{p}$ relies on the encryption method used in a wireless network, and $\hat{r}$ is determined by three aspects: 1) adoption of a default value of the configuration, 2) the number of attacks that view the configuration as a prerequisite, and 3) the $ihvm$ value for the configuration of “running services.”

2. Determine weight vector of configurations $\hat{w}_g$.

We can obtain the $i$th entry of $\hat{w}_g$ by Eq. 3.6. Each entry $w_{g_i}$ reveals the impact leading to the $i$th attack $A_i$, where the impact varies with the configurations of a wireless system.

\[ w_{g_i} = \frac{\sum_{j=1}^{n} r_j \times p_j}{n}, \tag{3.6} \]

where $n$ indicates the number of configurations. If no prerequisite configuration is required, $w_{g_i}$ is set to 1, which is the maximum weight.
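Eq. 3.6 amounts to averaging the products of risk level and acquisition probability. The sketch below is a minimal illustration with a hypothetical function name; it assumes that the two lists hold only the entries of $\hat{r}$ and $\hat{p}$ for the prerequisite configurations of the attack under consideration, and that all values lie in [0, 1].

```python
def configuration_weight(risk_levels, probabilities):
    """Eq. 3.6 for one attack; returns the maximum weight 1 without prerequisites."""
    if len(risk_levels) != len(probabilities):
        raise ValueError("risk_levels and probabilities must have the same length")
    if not risk_levels:
        return 1.0
    n = len(risk_levels)
    return sum(r * p for r, p in zip(risk_levels, probabilities)) / n


if __name__ == "__main__":
    # Hypothetical attack with two prerequisite configurations.
    print(configuration_weight([1.0, 0.5], [0.5, 1.0]))
    # An attack without prerequisites receives the maximum weight.
    print(configuration_weight([], []))
```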

3. Determine weight vector of requirements $\hat{w}_r$.

We determine the value of each entry of $\hat{w}_r$ in terms of the functionalities of a device. For example, the “availability” of an access point should have a heavier weight than “confidentiality” and “integrity” because the AP is in charge of providing Internet access for wireless devices: $\hat{w}_r = \left[\tfrac{1}{4} \ \tfrac{1}{4} \ \tfrac{1}{2}\right]^T$.

4. Determine impact severity upon a device $I$.

Because the security of a device may suffer more as the number of attacks that take an interest in the device rises, the range of the impact severity upon a device ($I$) is designed based on the number of attacks ($n_a$) targeting a device $dev$. We then obtain the impact severity of the device as

\[ I = \hat{w}_g^T \times D \times \hat{w}_r. \tag{3.7} \]

Because the summation of all entries of $\hat{w}_r$ equals 1, $I$ falls within $[0, n_a]$.
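Eq. 3.7 is a vector-matrix-vector product and can be checked by hand on a small case. In the sketch below, the function name and the degree matrix are hypothetical; only the weight vector [1/4, 1/4, 1/2] for an AP comes from the text. The sketch assumes that the entries of $\hat{w}_g$ and $D$ lie in [0, 1], so that the upper bound $n_a$ is reached when every entry equals 1.

```python
def impact_severity(w_g, D, w_r):
    """Eq. 3.7: I = w_g^T x D x w_r, with D given as n_a rows of n_r entries."""
    return sum(
        w_g[i] * sum(D[i][j] * w_r[j] for j in range(len(w_r)))
        for i in range(len(w_g))
    )


if __name__ == "__main__":
    w_r = [0.25, 0.25, 0.5]   # AP weights: confidentiality, integrity, availability
    w_g = [1.0, 0.5]          # hypothetical configuration weights of two attacks
    D = [[1.0, 0.0, 0.0],     # hypothetical degree matrix (2 attacks x 3 requirements)
         [0.0, 0.0, 1.0]]
    print(impact_severity(w_g, D, w_r))
```

With two attacks, all-ones inputs give I = 2, which is the upper end of the range $[0, n_a]$.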

5. Calculate total impact severity upon a wireless network $T$.

Because any device in a network may jeopardize the network security, we accumulate the contribution of each device towards the total impact severity ($T$) by Eq. 3.8.

\[ T = \log_{10}\left(\sum_{i=1}^{n_d} 10^{I_{dev_i}}\right) \tag{3.8} \]

Because a compromised device or a device with weak configurations is usually viewed as a stepping stone by an attacker to propagate attacks, the maximum $I$ dominates the result of Eq. 3.8 while the other smaller values are also taken into account. We conjecture that the value of $T$ increases as the network becomes riskier.

$T$, which depends on the number of devices and their configurations, varies with different network topologies. If there are more devices within a network, the possible maximum value of $T$ becomes larger. If there are $n_d^{ap}$ APs and $n_d^{sta}$ STAs in a wireless network, $T$ then falls within

\[ \left[\log_{10}\left(n_d^{ap} + n_d^{sta}\right),\ \log_{10}\left(n_d^{ap} \times 10^{n_a^{ap}} + n_d^{sta} \times 10^{n_a^{sta}}\right)\right]. \]
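The aggregation of Eq. 3.8 and the range of $T$ can be reproduced numerically. The following sketch uses hypothetical function names and a made-up network of one AP and two STAs; it is meant only to show how the largest device impact dominates the sum and how the bounds follow from setting every device impact to 0 or to its maximum.

```python
import math


def total_impact_severity(device_impacts):
    """Eq. 3.8: T = log10 of the sum of 10**I over all devices."""
    return math.log10(sum(10 ** impact for impact in device_impacts))


def total_impact_range(n_ap, n_sta, n_a_ap, n_a_sta):
    """Lower and upper bounds of T for n_ap APs and n_sta STAs."""
    lower = math.log10(n_ap + n_sta)
    upper = math.log10(n_ap * 10 ** n_a_ap + n_sta * 10 ** n_a_sta)
    return lower, upper


if __name__ == "__main__":
    # Hypothetical network: 1 AP facing 3 attacks, 2 STAs facing 2 attacks each.
    print(total_impact_range(1, 2, 3, 2))
    # One badly configured device (I = 3) dominates two harmless ones (I = 0).
    print(total_impact_severity([3, 0, 0]))
```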

It might be difficult for an administrator to interpret a linguistic meaning from a numerical value of $T$ since $T$ varies with $n_d^{ap}$, $n_d^{sta}$, $n_a^{ap}$, and $n_a^{sta}$. Despite this variation, the range of $T$ offers a scale that helps convey the linguistic meaning. An administrator may be able to comprehend the level of risk (relatively) easily if there is additional information about the scale. Therefore, $T$ reported together with its range,

\[ T \,/\, \left[\log_{10}\left(n_d^{ap} + n_d^{sta}\right),\ \log_{10}\left(n_d^{ap} \times 10^{n_a^{ap}} + n_d^{sta} \times 10^{n_a^{sta}}\right)\right], \]

could be a more solid index of risk.

Besides, we also devise a referable mapping table between a linguistic meaning and a numerical value of T . We first calculate the maximum impact severity of devices in a network, and then define the thresholds for low, medium, and high threats. If all the devices have their impact severity with the maximum value, then we conjecture in such a situation that the network is undoubtedly unreliable, and absolutely insecure. However, not all the networks require such a strict condition. If a very strict condition is set, an administrator may over-ignore unexpected events, and may not deal with the wrong configurations in real-time. Hence, both the ratio of the maximum value of the total impact severity and the ratio of the number of all the devices have to be contemplated for a plausible mapping. The mapping between the numerical risk values and the semantic risk levels is suggested as shown in Table 3.2. The numerical thresholds can be adjusted according to an administrator’s expertise, experience, or sociological orbits if needed.

6. Refresh the topology snapshot.

If new devices or new configurations are detected, the topology snapshot should be refreshed. In our method, it is not necessary to re-calculate the corresponding values of all devices. An administrator simply executes the sub-steps 3.1 through 3.5 to determine the impact severity upon changing devices, such as the device newly entering the network, and the device whose configurations have been changed. Then, sub-step 3.6 is performed to re-calculate the total risk of the wireless network.

Table 3.2: Numerical Impact Severity vs. Linguistic Meanings

Linguistic meaning (threat): High (insecure)
Numerical impact severity ($T$): from $\log_{10}\left(\frac{2 n_d^{ap}}{3} \cdot 10^{n_a^{ap}/2} + \frac{2 n_d^{sta}}{3} \cdot 10^{n_a^{sta}/2}\right)$ to $\log_{10}\left(n_d^{ap} \cdot 10^{n_a^{ap}} + n_d^{sta} \cdot 10^{n_a^{sta}}\right)$

Linguistic meaning (threat): Medium
Numerical impact severity ($T$): from $\log_{10}\left(\frac{n_d^{ap}}{3} \cdot 10^{n_a^{ap}/2} + \frac{n_d^{sta}}{3} \cdot 10^{n_a^{sta}/2}\right)$ to $\log_{10}\left(\frac{2 n_d^{ap}}{3} \cdot 10^{n_a^{ap}/2} + \frac{2 n_d^{sta}}{3} \cdot 10^{n_a^{sta}/2}\right)$

Linguistic meaning (threat): Low (secure)
Numerical impact severity ($T$): from $\log_{10}\left(n_d^{ap} + n_d^{sta}\right)$ to $\log_{10}\left(\frac{n_d^{ap}}{3} \cdot 10^{n_a^{ap}/2} + \frac{n_d^{sta}}{3} \cdot 10^{n_a^{sta}/2}\right)$
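The thresholds of Table 3.2 can be turned into a small classification routine. The sketch below is illustrative: the function name is hypothetical, and assigning a value that falls exactly on a threshold to the higher level is an assumption, since the table does not state which interval owns its boundary.

```python
import math


def threat_level(T, n_ap, n_sta, n_a_ap, n_a_sta):
    """Map a total impact severity T to the linguistic levels of Table 3.2."""
    # Shared term of both thresholds: n_ap * 10^(n_a_ap/2) + n_sta * 10^(n_a_sta/2).
    half_scale = n_ap * 10 ** (n_a_ap / 2) + n_sta * 10 ** (n_a_sta / 2)
    medium_threshold = math.log10(half_scale / 3)
    high_threshold = math.log10(2 * half_scale / 3)
    if T >= high_threshold:
        return "High"
    if T >= medium_threshold:
        return "Medium"
    return "Low"


if __name__ == "__main__":
    # Hypothetical network: 1 AP facing 4 attacks, 2 STAs facing 2 attacks each.
    for T in (1.0, 1.7, 3.0):
        print(T, threat_level(T, n_ap=1, n_sta=2, n_a_ap=4, n_a_sta=2))
```

For this hypothetical network the thresholds are $\log_{10} 40$ and $\log_{10} 80$, so the three sample values fall into the Low, Medium and High levels respectively.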

3.5 Summary

In Chapter 3, we presented a risk assessment method for wireless networks. We described the design of a risk model and explained a newly proposed metric (IHVM). We also introduced a risk assessment algorithm to measure the risk value of a network. The risk model and the algorithm are designed to address the dynamics of a wireless network. When changes occur in the network, not all layers of the risk model need to be re-generated and not all steps of the algorithm need to be re-calculated. The idea underlying our risk assessment method helps produce a real-time reference for a system administrator to manage network security.


Chapter 4 Formalization of Control-Flow Obfuscation

This chapter presents an abstract framework for formalizing and modeling many kinds of control-flow obfuscating transformations. In this framework, we first parse a program into a defined control flow graph. Then we identify a set of atomic operators for graph transformations that are guaranteed to preserve the functional behavior of the program. These operators can thus be used as building blocks of a control-flow obfuscating transformation. By composing instances of these atomic operators in sequence, we can formalize many kinds of existing control-flow obfuscating transformations and devise new candidate obfuscating transformations.

4.1 Preliminaries


Notations

$\xi(C_i)$    Equivalent block of the $i$th code block $C_i$

$\tau$    Control-flow obfuscating transformation

$\phi$    Termination block

$\psi$    Parsed program

$B_i$    The $i$th branch

$B_{ij}$    The $j$th piece split from the branch $B_i$

$C_i$    The $i$th code block. A code block can be a branch, a fork, a join or a simple block.

$C^E$    The entry point of a parsed program

$C_{ij}$    The $j$th piece split from the code block $C_i$

$C^A$    Any code block

$C^D$    Dummy code block

$C^{False}$    False target of a branch

$C^T$    Target code block of an atomic operator

$C^{True}$    True target of a branch

$E$    Edge set of a directed graph

$F_i$    The $i$th fork

$G$    Directed graph

$J_i$    The $i$th join

$O$    Atomic operator for control-flow obfuscation

$O_D^L$    Atomic operator of inserting dummy loops

$O_D^S$    Atomic operator of inserting dummy simple blocks

$O_E$    Atomic operator of replacing a code block with its equivalent block

$O_F^n$    Atomic operator of inserting forks, where $n$ represents the number of code blocks expected to be run in parallel

$O_G$    Atomic operator of inserting fork edges

$O_{Op}^F$    Atomic operator of inserting type I opaque predicates

$O_{Op}^T$    Atomic operator of inserting type II opaque predicates

$O_{Op}^?$    Atomic operator of inserting type III opaque predicates

$O_R$    Atomic operator of reordering code blocks

$O_S^{S,n}$    Atomic operator of splitting a simple block into $n$ pieces

$O_S^{B,n}$    Atomic operator of splitting a branch into $n$ pieces

$P^F$    Type I opaque predicate. It is an obscure branch, which always evaluates false.

$P^T$    Type II opaque predicate. It is an obscure branch, which always evaluates true.

$P^?$    Type III opaque predicate. It is an obscure branch, which sometimes evaluates false and sometimes true.

$S_i$    The $i$th simple block

$S_{ij}$    The $j$th piece split from the simple block $S_i$

$V$    Vertex set of a directed graph

4.2 Control Flow Graphs

Control flow graphs (CFGs) were developed by Cota et al. [62, 63] as a representation of the control flow structure of a program, and thus can help an analyzer understand the program easily [64, 65]. In this dissertation we use CFGs to facilitate the formalization of control-flow obfuscating transformations. As a high-level abstraction, a program can be parsed into a directed graph whose vertices are code blocks of the program. There is an edge between two code blocks if the second code block can be executed immediately after the first.

This dissertation considers both sequential and parallel programs, so a code block in a CFG can be defined by our program parser as one of the following:

• Branch: A branch refers to an instruction that can cause execution to transfer, either conditionally or unconditionally, to some statement other than the immediately following statement. In high-level programming languages, branch instructions may be found in for, while, do-while, if-else, and goto statements.

• Fork: A fork is the code block that creates parallel execution. The immediate successors of a fork can run concurrently until the paths converge.

• Join: A join is the code block at which parallel execution paths converge.

• Simple block: A simple block is defined as an ordered sequence of statements with no outgoing or incoming branch, fork or join instructions inside this code block.


• Equivalent block ($\xi(C_i)$): a code block that is functionally equivalent to the code block $C_i$.

• Termination block (φ): the exit point of a source program.

The edges in a CFG represent possible execution paths that the program may take. Our program parser also specifies the following types of edges.

• Sequential edge: A sequential edge, denoted by $(C_i, C_j)$, exists between two code blocks $C_i$ and $C_j$. Here, $C_i$ can be only a simple block or a join.

• Branch edge: Since a branch $B_i$ may jump to either its true or false target, there are two code blocks that could be executed immediately after the branch. The two branch edges leaving $B_i$ are denoted by $(B_i, C^{True})_T$ and $(B_i, C^{False})_F$. $C^{True}$ is executed when $B_i$ evaluates to true, so $C^{True}$ represents the true target of $B_i$. Similarly, $C^{False}$ is the false target of $B_i$.

• Fork edge: Since several code blocks can be executed concurrently right after a fork $F_i$, there may be several code blocks as the immediate successors of $F_i$. A fork edge is represented as $(F_i, C_j)$, and $C_j$ can be a simple block, a branch or a fork.
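The block kinds and edge rules above map naturally onto a small graph data structure. The sketch below is a hypothetical illustration rather than the parser used in this dissertation: the class name `ControlFlowGraph`, its methods, and the sample if-else program are assumptions. It enforces only the rules stated above, namely that a sequential edge leaves a simple block or a join, a branch edge carries a T or F label, and a fork edge targets a simple block, a branch or a fork.

```python
from dataclasses import dataclass, field

BLOCK_KINDS = {"simple", "branch", "fork", "join", "termination"}


@dataclass
class ControlFlowGraph:
    """An entry block plus a directed graph of code blocks and typed edges."""
    entry: str
    kinds: dict = field(default_factory=dict)  # vertex set: block name -> kind
    edges: set = field(default_factory=set)    # edge set: (source, target, label)

    def add_block(self, name, kind):
        if kind not in BLOCK_KINDS:
            raise ValueError(f"unknown block kind: {kind}")
        self.kinds[name] = kind

    def add_edge(self, source, target, label=None):
        if source not in self.kinds or target not in self.kinds:
            raise ValueError("both blocks must be added before the edge")
        kind = self.kinds[source]
        if kind in ("simple", "join"):       # sequential edge
            valid = label is None
        elif kind == "branch":               # branch edge, labeled T or F
            valid = label in ("T", "F")
        elif kind == "fork":                 # fork edge
            valid = label is None and self.kinds[target] in ("simple", "branch", "fork")
        else:                                # a termination block has no successor
            valid = False
        if not valid:
            raise ValueError(f"invalid edge ({source}, {target}, {label})")
        self.edges.add((source, target, label))

    def successors(self, name):
        return {target for source, target, _ in self.edges if source == name}


# Hypothetical if-else program: S1; if B1 then S2 else S3; end.
cfg = ControlFlowGraph(entry="S1")
for name, kind in [("S1", "simple"), ("B1", "branch"), ("S2", "simple"),
                   ("S3", "simple"), ("phi", "termination")]:
    cfg.add_block(name, kind)
cfg.add_edge("S1", "B1")
cfg.add_edge("B1", "S2", "T")
cfg.add_edge("B1", "S3", "F")
cfg.add_edge("S2", "phi")
cfg.add_edge("S3", "phi")
```

Keeping the edge label inside the edge tuple means that the true and false targets of a branch stay distinguishable even when both edges point to the same block.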

With the definitions of the code blocks and the edges, we represent a directed graph by the pair $(V, E)$ where $V$ is the vertex set and $E$ is the edge set. $V$ contains all the code blocks of a program, including simple blocks, branches, forks, joins and a termination block. $E$ is composed of sequential edges, branch edges and fork edges. Then, a parsed program $\psi$ is a pair of an entry block of a directed graph and the graph.

[Figure 4.1: Example of the formalization of a parsed program. Nodes: S1, F1, B1, S4, S2, S3, J1, φ; the branch edges leaving B1 are labeled T and F.]

Figure 4.1 shows an example of a CFG of $\psi$. The CFG contains four simple blocks, one branch, one fork and one join. A rectangle indicates a simple block; a diamond denotes a branch; a base-down triangle and a base-up triangle represent a fork and a join, respectively. In Figure 4.1, $S_1$ is the entry block, so we obtain a parsed program $\psi = (S_1, G)$, where $G = (V, E)$, $V = \{S_1, S_2, S_3, S_4, B_1, F_1, J_1, \phi\}$, and $E = \{(S_1, F_1), (F_1, B_1), (F_1, S_4), (B_1, S_2)_T, (B_1, S_3)_F, (S_2, S_3), (S_3, J_1), (S_4, J_1), (J_1, \phi)\}$. $\phi$ is an indication of the end of execution path without code existing in this
