A VULNERABILITY PROPAGATION MODEL OF DISTRIBUTED VIRTUALIZED SYSTEMS BASED ON EPIDEMIC MODEL
Cun-Bin Li¹, Bao-Jun Sun¹,²,*, Ding Liu¹
¹School of Economics and Management, North China Electronic Power University, Beijing, China 100854, P.R.C.
²Department of Computer Information and Management, Inner Mongolia University of Finance and Economics, Hohhot, China 010070, P.R.C.
Key Words: vulnerability propagation, distributed virtualized system, cloud computing, epidemic model.
ABSTRACT
Because distributed virtualized environments are dynamic, mobile and virtualized, system vulnerabilities are more likely to occur and to propagate rapidly. This study divides vulnerabilities into two categories according to the reliability level of nodes in distributed virtualized systems, and builds a vulnerability propagation model of distributed virtualized systems based on the epidemic model. General expressions of vulnerability propagation are derived through model analysis. Finally, the stability of the model is verified with System Dynamics tools.
I. INTRODUCTION
A vulnerability is usually the internal cause of the security and dependability problems of a system [1], and is a research focus in the fields of software development and system testing. With the rapid development of cloud computing, distributed systems are applied ever more widely. Distributed systems are large in scale, and their dynamics, mobility and virtualization are pronounced. When other systems interconnect with distributed systems, these characteristics lead to a substantial increase in vulnerabilities and to their rapid propagation through the whole system, which is an urgent challenge [2].
Many scholars have begun to study the laws of vulnerability propagation. The work in [3] proposed a vulnerability propagation algorithm for object-oriented design that calculates the Attribute Vulnerability Ratio (AVR) through a vulnerable tree, mainly addressing vulnerability propagation between existing code and new code under code reuse. On this basis, [4] studied the vulnerability propagation caused by class coupling at the design phase from the object-oriented design perspective, and used a vulnerable tree to analyze an Automated Teller Machine (ATM) system as an example. However, in distributed virtualized systems with millions of object classes, the huge scale leads to state space explosion when a vulnerable tree is used for state space search. Other vulnerability analysis models include those based on State Transition Diagrams and Colored Petri Nets, the Attack Graph Model, the Requires/Provides Model, the Model Checking Method and so on [5]. They also face state space explosion when the system scale is very large. With regard to risk management of information systems, Feng and others [6] built a security risk analysis model based on a Bayesian Network to simulate risk factors and the causal relationships between them, and estimated the most probable vulnerability propagation path and its risk value through Ant Colony Optimization and the Delphi Method.
*Corresponding author: Bao-Jun Sun, e-mail: sunbaojun@126.com
De and others [7] studied vulnerability propagation of multi-hop broadcast protocols in wireless sensor networks, and discussed the vulnerability propagation process at different action change rates, connectivity and restorability with the help of spyware. Ozment at the University of Cambridge [8] built on existing work on vulnerability discovery models and summarized the evolution of the source code in the OpenBSD operating system using a data set of eight years of OpenBSD vulnerabilities. Neuhaus and others [9] used the Vulture tool to automatically mine the existing vulnerability database of the Mozilla project and mapped past vulnerabilities to components; on this basis, they forecast the evolution of vulnerabilities in Mozilla. Hiller and others [10] analyzed the propagation of data errors in modular software and proposed a method for positioning checkpoints and recovery points. The above studies mainly apply statistics and analysis to existing system logs, lack a theoretical inference model and depend heavily on the specific system, and are thus inadequate to guide the prevention and treatment of vulnerabilities at the design phase of software or systems. Moreover, because of virtualization technology, the connections between compute nodes in distributed virtualized systems are probabilistic events, and existing system logs can hardly cover all possible connections. Thus, such studies have certain limitations.
Studies on system vulnerabilities span many fields, and the definition of vulnerability differs with application scope and emphasis. One of the most representative definitions is the “computer vulnerability” concept proposed by Bishop and Bailey [11]. According to this concept, a vulnerable state is an authorized state from which an unauthorized state can be reached using authorized state transitions, and a vulnerability is a characterization of a vulnerable state that distinguishes it from all non-vulnerable states. Components with vulnerabilities are usually called vulnerable nodes. The famous compartmental model in epidemiology (also known as the SIR model), proposed by Kermack and McKendrick [12] in their studies of the propagation of the Black Death in London and the 1906 Mumbai plague, laid the foundation of infectious disease dynamics. Judging from the currently available literature, studies on vulnerability propagation in distributed virtualized systems based on the epidemic model are still rare. Therefore, this paper builds a vulnerability propagation model of distributed virtualized systems using the ideas and methods of the epidemic model, and simulates it with System Dynamics tools, in order to derive the vulnerability propagation mechanism in such systems.
II. MODEL BUILDING
A distributed virtualized system is a general-purpose computing environment with distributed virtualization technologies at its core [13]. Typical forms include Overlay Networks, Cloud Computing Systems, Future Networks and so on. An important feature of distributed virtualized systems is the separation of virtual components from physical resources: the components that constitute services are deployed on a remote “fuzzy” cloud. Based on a comparative analysis of vulnerability propagation in distributed virtualized systems and virus propagation in biological communities, this section identifies the characteristic elements of distributed virtualized systems, states the corresponding hypotheses, and then builds the vulnerability propagation model of distributed virtualized systems based on the epidemic model.
1. Model Hypothesis
The research and application of epidemic models has become an important field of applied mathematics. Combined with the basic characteristics of vulnerability propagation in distributed virtualized systems, the following hypotheses are made:
i. According to their vulnerability states at a given time, the nodes in distributed virtualized systems are divided into four categories: (a) W-nodes (W for short), vulnerable nodes $w \in W$ that contain known vulnerable points (they are the vulnerability propagation sources); (b) V-nodes (V for short), nodes $v \in V$ that contain potential vulnerabilities. These nodes show no vulnerabilities at present but may convert into vulnerable nodes, for they may be infiltrated when connected to W-nodes; (c) F-nodes (F for short), faulted nodes $f \in F$ that have failed and are not repaired in time (they no longer affect the existing vulnerability propagation); (d) R-nodes (R for short), repaired nodes $r \in R$; the repair method can be online upgrade, patching or reconfiguration. Further, V-nodes are divided into two categories: $V_1$-nodes ($V_1$ for short), the set of nodes with high reliability levels, and $V_2$-nodes ($V_2$ for short), the set of nodes with low reliability levels. The total number of nodes is written $M$, with $M = V_1 + V_2 + W + F + R$.
ii. A distributed virtualized system is open-ended and extensible; some nodes continuously enter the system, while others exit after completing their calculation tasks. To reflect this feature, the model assumes that all types of nodes in distributed virtualized systems may exit the current system due to scheduling. Assume that the nodes entering the system and the nodes exiting the system per unit time are equal in number, with coefficient $b$. Then $pbM$ and $(1-p)bM$ denote the numbers of $V_1$ and $V_2$ that enter the system in unit time, where $p$ is the proportion of nodes with high reliability, and $bV_1$ and $bV_2$ denote the numbers of $V_1$ and $V_2$ that exit the system in unit time. The numbers of W, F and R that exit the system in unit time are denoted as $bW$, $bF$ and $bR$ respectively. The total number of nodes in the system remains the constant $M$.
iii. In a distributed virtualized system, the two types of nodes that contain potential vulnerabilities, $V_1$ and $V_2$, have unequal abilities to propagate vulnerabilities. $V_2$ has a low reliability level, which means it has more serious security issues and a stronger ability to propagate vulnerabilities. Assume that their success rates of propagation are $\beta_1$ and $\beta_2$, with $\beta_1 < \beta_2$, and that the probability of contact between nodes is $\alpha$. Then for $V_1$, the probability of being infected by W is $\alpha\beta_1 W/M$, and for $V_2$, the probability of being infected by W is $\alpha\beta_2 W/M$.
iv. At time t, the number of W-nodes that fail and convert into F-nodes per unit time is proportional to the number of W-nodes, with scale factor $\gamma$.
v. The repair rates of $V_1$, $V_2$ and W are set to $\lambda_1$, $\lambda_2$ and $\lambda_3$ respectively, and all three kinds of nodes are transformed into R after repair.
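As a small illustration of hypothesis iii, the per-node infection probability can be computed directly. This is only a sketch: the function name `infection_probability` and all numeric values are assumptions made for the example, not values from the paper.

```python
def infection_probability(alpha, beta, W, M):
    """Per-unit-time probability that one V-node is infected (hypothesis iii)."""
    return alpha * beta * W / M


# Hypothetical values chosen only for illustration.
alpha = 0.5                 # contact probability between nodes
beta1, beta2 = 0.1, 0.2     # propagation success rates, with beta1 < beta2
W, M = 100, 1000            # current number of W-nodes and total nodes

p_v1 = infection_probability(alpha, beta1, W, M)   # high-reliability node
p_v2 = infection_probability(alpha, beta2, W, M)   # low-reliability node
print(p_v1, p_v2)
```

With the same number of W-nodes, a low-reliability node faces a higher infection probability than a high-reliability one, which is the asymmetry the hypothesis encodes.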
2. Model Building
Based on the above hypotheses, the propagation process of vulnerable nodes in distributed virtualized systems is described in Fig. 1.
According to Fig. 1 and referring to [14], the corresponding DS-I-A model is built as model (1):
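$$
\begin{aligned}
\frac{dW}{dt} &= \alpha\beta_1 \frac{W}{M} V_1 + \alpha\beta_2 \frac{W}{M} V_2 - \gamma W - \lambda_3 W - bW \\
\frac{dV_1}{dt} &= bpM - \alpha\beta_1 \frac{W}{M} V_1 - \lambda_1 V_1 - bV_1 \\
\frac{dV_2}{dt} &= b(1-p)M - \alpha\beta_2 \frac{W}{M} V_2 - \lambda_2 V_2 - bV_2 \\
\frac{dR}{dt} &= \lambda_3 W + \lambda_1 V_1 + \lambda_2 V_2 - bR \\
\frac{dF}{dt} &= \gamma W - bF
\end{aligned}
\tag{1}
$$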
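[Flow diagram: inflows pbM into V1 and (1-p)bM into V2; infection flows αβ1(W/M)V1 and αβ2(W/M)V2 into W; failure flow γW into F; repair flows λ1V1, λ2V2 and λ3W into R; outflows bV1, bV2, bW, bF and bR]
Fig. 1 The framework chart of vulnerability propagation in distributed virtualized systems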
Summing both sides of the five equations in model (1) gives
$$
\frac{d(W + V_1 + V_2 + R + F)}{dt} = \frac{dM}{dt} = 0,
$$
that is: $M(t) = W(t) + V_1(t) + V_2(t) + R(t) + F(t) = M$.
The changes in the numbers of nodes in the various states are characterized by the five differential equations in model (1).
The left side of each equation is the rate of change of the number of nodes over time, and the terms on the right side are the factors that drive this change. For instance, the first equation gives the change of W: as shown in Fig. 1, W has two direct inflows and three outflows. The rate of each term follows from the hypotheses in Section II.1.
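The cancellation behind this conservation statement can be written out explicitly. The grouping below is a sketch that assumes the flows stated in hypotheses ii to v, with $N$ introduced here as shorthand for the sum of the five state variables.

```latex
\begin{aligned}
N &:= W + V_1 + V_2 + R + F \\
\frac{dN}{dt}
  &= \underbrace{bpM + b(1-p)M}_{\text{inflow}}
     \;-\; \underbrace{b\,(W + V_1 + V_2 + R + F)}_{\text{outflow}} \\
  &= b\,(M - N)
\end{aligned}
```

The infection terms leave $V_1$ and $V_2$ and enter $W$, the failure term $\gamma W$ leaves $W$ and enters $F$, and the repair terms leave $V_1$, $V_2$ and $W$ and enter $R$, so each of them cancels in the sum. If $N(0) = M$, then $dN/dt = 0$ for all $t$ and the total stays at $M$.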
III. MODEL ANALYSIS
1. Vulnerability Propagation Regeneration Number
To get the equilibrium point of model (1), set the left sides of the first four equations to 0, which gives model (2):
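$$
\begin{aligned}
\alpha\beta_1 \frac{W}{M} V_1 + \alpha\beta_2 \frac{W}{M} V_2 &= \gamma W + \lambda_3 W + bW \\
bpM &= \alpha\beta_1 \frac{W}{M} V_1 + \lambda_1 V_1 + bV_1 \\
b(1-p)M &= \alpha\beta_2 \frac{W}{M} V_2 + \lambda_2 V_2 + bV_2 \\
\lambda_3 W + \lambda_1 V_1 + \lambda_2 V_2 &= bR
\end{aligned}
\tag{2}
$$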
In mathematical biology, the basic reproduction number, usually written $R_0$, is defined as the number of secondary infections produced over the average period of illness when one infected individual is introduced into a host population [15]. In this paper, the basic reproduction number is defined as the number of V-nodes that can be successfully infected by one W-node within a cycle. According to [16], the value of $R_0$ directly affects the equilibrium state of the system: if $R_0 < 1$, the equilibrium point $E_0$ gradually stabilizes, and if $R_0 > 1$, $E_0$ is unstable. From model (2), the equilibrium point $E_0$ with no vulnerability propagation is ($V_1 = pM$, $V_2 = (1-p)M$, $F = 0$).
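The Jacobian of model (2) at the point $E_0$ is:
$$
J = \begin{pmatrix}
-b & 0 & -\alpha\beta_1 p \\
0 & -b & -\alpha\beta_2 (1-p) \\
0 & 0 & -(\gamma + b) + \alpha\beta_1 p + \alpha\beta_2 (1-p)
\end{pmatrix}
$$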
It is obvious that as long as $(\gamma + b) > \alpha\beta_1 p + \alpha\beta_2 (1-p)$, all eigenvalues of $J$ are negative. On these grounds, the basic reproduction number $R_0$ of distributed virtualized systems is defined as formula (3):
$$
R_0 = \frac{\alpha\left[p\beta_1 + (1-p)\beta_2\right]}{\gamma + b}
\tag{3}
$$
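The link between the eigenvalue condition and $R_0$ can be checked numerically. The sketch below assumes that formula (3) reads $R_0 = \alpha[p\beta_1 + (1-p)\beta_2]/(\gamma + b)$ and that the Jacobian at $E_0$ is upper triangular with diagonal entries $-b$, $-b$ and $\alpha\beta_1 p + \alpha\beta_2(1-p) - (\gamma + b)$, so that its eigenvalues are the diagonal entries. The function names and parameter values are hypothetical.

```python
def basic_reproduction_number(alpha, beta1, beta2, p, gamma, b):
    """R0 in the form assumed for formula (3)."""
    return alpha * (p * beta1 + (1 - p) * beta2) / (gamma + b)


def jacobian_diagonal(alpha, beta1, beta2, p, gamma, b):
    """Diagonal of the (assumed) upper-triangular Jacobian at E0.

    For a triangular matrix the eigenvalues are the diagonal entries.
    """
    growth = alpha * beta1 * p + alpha * beta2 * (1 - p) - (gamma + b)
    return (-b, -b, growth)


# Hypothetical parameter values, not taken from the text.
prm = dict(alpha=0.5, beta1=0.1, beta2=0.2, p=0.5, gamma=0.08, b=0.02)
r0 = basic_reproduction_number(**prm)
eigs = jacobian_diagonal(**prm)
stable = all(e < 0 for e in eigs)
print(r0, eigs, stable)
```

Under these assumptions the third eigenvalue equals $(\gamma + b)(R_0 - 1)$, so it is negative exactly when $R_0 < 1$.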
2. The Equilibrium Point and Stability of Vulnerability Propagation
When $R_0 > 1$, the equilibrium points and stability of model (1) are discussed below.
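Solving the second and third equations of model (1) at equilibrium for $V_1$ and $V_2$ gives:
$$
V_1 = \frac{bpM}{\alpha\beta_1 \frac{W}{M} + \lambda_1 + b}
\tag{4}
$$
$$
V_2 = \frac{b(1-p)M}{\alpha\beta_2 \frac{W}{M} + \lambda_2 + b}
\tag{5}
$$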
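Substituting formulas (4) and (5) into the first equation of model (1) at equilibrium and dividing by $W$ gives:
$$
\frac{\alpha\beta_1 bp}{\alpha\beta_1 \frac{W}{M} + \lambda_1 + b} + \frac{\alpha\beta_2 b(1-p)}{\alpha\beta_2 \frac{W}{M} + \lambda_2 + b} - (\gamma + \lambda_3 + b) = 0
\tag{6}
$$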
Therefore, model (1) has an equilibrium point with vulnerability propagation only if formula (6) has a positive solution $W$.
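Define the left side of formula (6) as a function $F$:
$$
F(W) = \frac{\alpha\beta_1 bp}{\alpha\beta_1 \frac{W}{M} + \lambda_1 + b} + \frac{\alpha\beta_2 b(1-p)}{\alpha\beta_2 \frac{W}{M} + \lambda_2 + b} - (\gamma + \lambda_3 + b)
\tag{7}
$$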
Because $F'(W) < 0$ and $\lim_{W \to \infty} F(W) = -(\gamma + \lambda_3 + b) < 0$, the function $F(W)$ is monotonically decreasing, so formula (6) has a positive solution $W$ only when $F(0) > 0$. Because $R_0 > 1$, $F(0) = (\gamma + \lambda_3 + b)(R_0 - 1) > 0$ can be obtained.
In summary, when $R_0 > 1$, model (1) has an equilibrium point with vulnerability propagation.
3. The Equilibrium Point and Stability of Non-Vulnerability Propagation
The stability of the equilibrium point $E_0$ is discussed under the condition $R_0 < 1$. From the second and third equations of model (1), it follows that:
$$
\frac{dV_1}{dt} \le bpM - (b + \lambda_1)V_1, \qquad
\frac{dV_2}{dt} \le b(1-p)M - (b + \lambda_2)V_2
\tag{8}
$$
According to formula (8), it can be inferred that:
$$
V_1(t) \le pM + V_1(0)e^{-(b+\lambda_1)t}, \qquad
V_2(t) \le (1-p)M + V_2(0)e^{-(b+\lambda_2)t}
\tag{9}
$$
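The step from (8) to (9) can be filled in with an integrating-factor argument. The derivation below is a sketch for $V_1$ that assumes (8) is the differential inequality $dV_1/dt \le bpM - (b+\lambda_1)V_1$, obtained by dropping the nonnegative infection term; the case of $V_2$ is identical with $p$ replaced by $1-p$ and $\lambda_1$ by $\lambda_2$.

```latex
\begin{aligned}
\frac{d}{dt}\left[V_1(t)\,e^{(b+\lambda_1)t}\right]
  &\le bpM\,e^{(b+\lambda_1)t} \\
V_1(t)
  &\le V_1(0)\,e^{-(b+\lambda_1)t}
     + \frac{bpM}{b+\lambda_1}\left(1 - e^{-(b+\lambda_1)t}\right) \\
  &\le pM + V_1(0)\,e^{-(b+\lambda_1)t}
\end{aligned}
```

The last line uses $b/(b+\lambda_1) \le 1$ and $1 - e^{-(b+\lambda_1)t} \le 1$.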
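Further, the following results can be obtained from the first equation of model (1):
$$
\begin{aligned}
W(t) &= W(0)\exp\left(\frac{\alpha\beta_1}{M}\int_0^t V_1(x)\,dx + \frac{\alpha\beta_2}{M}\int_0^t V_2(u)\,du - \gamma t - \lambda_3 t - bt\right) \\
&\le W(0)\exp\left(\left[\alpha\beta_1 p + \alpha\beta_2 (1-p) - (\gamma + \lambda_3 + b)\right]t\right)\exp\left(\frac{\alpha\beta_1 V_1(0)}{M(b+\lambda_1)} + \frac{\alpha\beta_2 V_2(0)}{M(b+\lambda_2)}\right) \\
&\le W(0)\exp\left((\gamma + \lambda_3 + b)(R_0 - 1)t\right)\exp\left(\frac{\alpha\beta_1 V_1(0)}{M(b+\lambda_1)} + \frac{\alpha\beta_2 V_2(0)}{M(b+\lambda_2)}\right) \to 0
\end{aligned}
\tag{10}
$$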
It can be seen that as $t \to \infty$ with $R_0 < 1$, $W(t)$ tends to 0.
Solving the second and third equations of model (1) with $W = 0$ then gives:
$$
\begin{aligned}
V_1(t) &= pM - (pM - V_1(0))e^{-(b+\lambda_1)t} \\
V_2(t) &= (1-p)M - ((1-p)M - V_2(0))e^{-(b+\lambda_2)t}
\end{aligned}
\tag{11}
$$
From formula (11) it can be seen that, as $t \to \infty$, $V_1(t)$ and $V_2(t)$ tend to $pM$ and $(1-p)M$ respectively.

Table 1 Simulation variables description
Type | Variable | Description | Unit
State variable | Vulnerable 1 | The number of nodes with high reliability level ($V_1$) | individual
State variable | Vulnerable 2 | The number of nodes with low reliability level ($V_2$) | individual
State variable | Infectious | The number of nodes with vulnerabilities ($W$) | individual
State variable | Faulted | The number of faulted nodes ($F$) | individual
Rate variable | InfectionRate 1 | The propagation rate of $V_1$ ($\alpha\beta_1 W/M$) | individual/t
Rate variable | InfectionRate 2 | The propagation rate of $V_2$ ($\alpha\beta_2 W/M$) | individual/t
Rate variable | FaultRate | The failure rate ($\gamma W$) | individual/t
Rate variable | Inward_V1 | The rate of $V_1$ entering the system ($pbM$) | individual/t
Rate variable | Inward_V2 | The rate of $V_2$ entering the system ($(1-p)bM$) | individual/t
Rate variable | OutwardNode_V1 | The rate of $V_1$ withdrawal from the system ($bV_1$) | individual/t
Rate variable | OutwardNode_V2 | The rate of $V_2$ withdrawal from the system ($bV_2$) | individual/t
Rate variable | OutwardNode_W | The rate of $W$ withdrawal from the system ($bW$) | individual/t
Rate variable | OutwardNode_F | The rate of $F$ withdrawal from the system ($bF$) | individual/t
Auxiliary variable | InfectionRateAux 1 | The auxiliary variable of InfectionRate 1 | individual/t
Auxiliary variable | InfectionRateAux 2 | The auxiliary variable of InfectionRate 2 | individual/t
Auxiliary variable | FaultRateAux | The auxiliary variable of FaultRate | individual/t
Constant | ContactRate | Contact rate ($\alpha$) | individual/t
Constant | Infectivety 1 | The propagation absorptivity of $V_1$ ($\beta_1$) | %
Constant | Infectivety 2 | The propagation absorptivity of $V_2$ ($\beta_2$) | %
Constant | InCoeffcient | The coefficient of nodes entering the system ($b$) | %
Constant | OutCoeffcient | The coefficient of nodes withdrawal from the system ($b$) | %
Constant | FaultCoeffcient | Coefficient of node failure ($\gamma$) | %
Constant | TotalNode | Total number of nodes in the system ($M$) | individual
Constant | Parameter | The proportion of nodes ($p$) | %
IV. NUMERICAL SIMULATION
Redefine the parameters in model (1) as system dynamics variables, then use the modeling tool Vensim [17] to build a system dynamics model of vulnerability propagation in the distributed virtualized system, as shown in Fig. 2. The main variables are described in Table 1, where t represents a unit of time. Simulation experiments are performed on hypothetical data in the system dynamics model.
1. The Simulation Experiment I
Set the number of nodes in the system to 3000 and the unit time to one day. In the initial state of the distributed virtualized system, there are 750 nodes that contain vulnerable points; they are the vulnerability propagation sources. The proportion of nodes with high reliability, namely $V_1$, is 0.25. Further, assume $\beta_1 = 0.13$, $\beta_2 = 0.19$, $\alpha = 0.5$, $b = 0.02$, $\gamma = 0.08$. Then
$$
R_0 = \frac{\alpha\left[p\beta_1 + (1-p)\beta_2\right]}{\gamma + b} = 0.86 < 1.
$$
Fig. 2 System dynamics model of node vulnerability propagation in distributed virtual system
[Plot: curves Vulnerable 1, Vulnerable 2, W and Faulted; axis ticks 0 to 400 and 0 to 2000]
Fig. 3 The simulation result of the propagation of the two types of nodes with
As can be seen from Fig. 3, when the basic reproduction number $R_0 < 1$, although there are 750 vulnerability propagation sources in the initial state of the system, after about 280 days the numbers of W and F both tend to 0, whereas the number of $V_1$ tends to 500 and the number of $V_2$ tends to 1500, which illustrates that the distributed virtualized system gradually stabilizes at the equilibrium point $E_0$. In unit time, the number of W-nodes that exit the current distributed virtualized system or convert into F is greater than the number of V-nodes that are infected and converted into W, so the number of W decreases until it reaches zero.
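The behaviour described for experiment I can be approximated outside Vensim with a small numerical integration of model (1). The sketch below is not the authors' Vensim model: it uses a hand-written classical Runge-Kutta integrator, and several inputs are assumptions. The repair rates are not listed for this experiment, so `lam1 = lam2 = 0.01` and `lam3 = 0` are assumed here; they were picked because the no-propagation levels $bpM/(b+\lambda_1) = 500$ and $b(1-p)M/(b+\lambda_2) = 1500$ then agree with the levels quoted for Fig. 3. The initial split of the 2250 non-vulnerable nodes into 25% $V_1$ and 75% $V_2$ is also an assumption.

```python
def rhs(state, prm):
    """Right-hand side of model (1); state = (W, V1, V2, R, F)."""
    W, V1, V2, R, F = state
    M, b, p = prm["M"], prm["b"], prm["p"]
    inf1 = prm["alpha"] * prm["beta1"] * W / M * V1   # V1 -> W
    inf2 = prm["alpha"] * prm["beta2"] * W / M * V2   # V2 -> W
    return (
        inf1 + inf2 - (prm["gamma"] + prm["lam3"] + b) * W,
        b * p * M - inf1 - (prm["lam1"] + b) * V1,
        b * (1 - p) * M - inf2 - (prm["lam2"] + b) * V2,
        prm["lam3"] * W + prm["lam1"] * V1 + prm["lam2"] * V2 - b * R,
        prm["gamma"] * W - b * F,
    )


def rk4_step(state, prm, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    def shifted(k, c):
        return tuple(s + c * ki for s, ki in zip(state, k))
    k1 = rhs(state, prm)
    k2 = rhs(shifted(k1, 0.5 * h), prm)
    k3 = rhs(shifted(k2, 0.5 * h), prm)
    k4 = rhs(shifted(k3, h), prm)
    return tuple(s + h / 6.0 * (a + 2 * q + 2 * r + d)
                 for s, a, q, r, d in zip(state, k1, k2, k3, k4))


def simulate(state, prm, days, h=0.1):
    """Integrate model (1) for the given number of days."""
    for _ in range(round(days / h)):
        state = rk4_step(state, prm, h)
    return state


prm = dict(M=3000, p=0.25, b=0.02, alpha=0.5, beta1=0.13, beta2=0.19,
           gamma=0.08,
           lam1=0.01, lam2=0.01, lam3=0.0)   # ASSUMED repair rates
# ASSUMED initial state: 750 W-nodes, remaining 2250 nodes split 25% / 75%.
start = (750.0, 562.5, 1687.5, 0.0, 0.0)
W, V1, V2, R, F = simulate(start, prm, days=400)
print(round(W, 3), round(V1, 1), round(V2, 1), round(R, 1), round(F, 1))
```

Under these assumptions W decays towards 0 while $V_1$ and $V_2$ settle near 500 and 1500, in line with the qualitative description of Fig. 3. Other choices of the repair rates or initial split would shift the levels.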
2. The Simulation Experiment II
Set the number of nodes in the system to 3000 and the unit time to one day. In the initial state of the distributed virtualized system, there are 800 nodes that contain vulnerable points, and the proportion of nodes with high reliability, namely $V_1$-nodes, is 0.25. Further, assume $\beta_1 = 0.22$, $\beta_2 = 0.39$, $\alpha = 0.5$, $b = 0.02$, $\gamma = 0.08$. Then
$$
R_0 = \frac{\left[p\beta_1 + (1-p)\beta_2\right]\alpha}{\gamma + b} = \frac{0.25 \times 0.22 \times 0.5 + 0.75 \times 0.39 \times 0.5}{0.08 + 0.02} = 1.695 > 1.
$$
The simulation results are shown in Fig. 4.
[Plot: curves Vulnerable 1, Vulnerable 2, W and Faulted; axis ticks 0 to 400 and 0 to 2000]
Fig. 4 The simulation result of the propagation of the two types of nodes with
As can be seen from Fig. 4, when $R_0 > 1$, after about 300 days the vulnerability propagation in the distributed virtualized system gradually stabilizes, with the number of $V_1$-nodes remaining around 344, $V_2$ around 830, W around 165 and F around 660. The result illustrates that nodes with low reliability are more likely to be affected by vulnerability propagation, and that the equilibrium point with vulnerability propagation tends to global stability.
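One of the reported steady-state values can be cross-checked against model (1). Setting $dF/dt = \gamma W - bF = 0$ in the fifth equation gives $F^* = \gamma W^*/b$. The sketch below applies this relation to the reported $W \approx 165$; the function name `faulted_equilibrium` is an assumption made for the example.

```python
def faulted_equilibrium(W_star, gamma, b):
    """Steady-state number of F-nodes from dF/dt = gamma*W - b*F = 0."""
    return gamma * W_star / b


# Parameters of experiment II and the reported steady-state W.
F_star = faulted_equilibrium(W_star=165, gamma=0.08, b=0.02)
print(F_star)
```

The result is about 660, which agrees with the steady-state number of F-nodes read from Fig. 4.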
3. The Simulation Experiment III
This experiment examines the equilibrium point of vulnerability propagation as the proportion of nodes with high reliability level in a distributed virtualized system varies. On the basis of the assumptions of simulation experiment I, the proportion of nodes with high reliability ($p$) increases from 0.25 to 0.75, and the other parameters remain unchanged. Then
$$
R_0 = \frac{\alpha\left[p\beta_1 + (1-p)\beta_2\right]}{\gamma + b} = 0.592 < 1.
$$
The simulation result is shown in Fig. 5.
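The effect of $p$ on $R_0$ can be tabulated directly. The sketch below assumes that formula (3) reads $R_0 = \alpha[p\beta_1 + (1-p)\beta_2]/(\gamma + b)$ and uses the parameter values listed for experiment I; the function name `r0` and the intermediate value $p = 0.5$ are additions for illustration.

```python
def r0(p, alpha, beta1, beta2, gamma, b):
    """R0 in the form assumed for formula (3)."""
    return alpha * (p * beta1 + (1 - p) * beta2) / (gamma + b)


# Parameter values listed for experiment I.
prm = dict(alpha=0.5, beta1=0.13, beta2=0.19, gamma=0.08, b=0.02)
sweep = [(p, r0(p, **prm)) for p in (0.25, 0.5, 0.75)]
for p, value in sweep:
    print(f"p = {p:.2f}  R0 = {value:.3f}")
```

Because $\beta_1 < \beta_2$, this expression is linear and decreasing in $p$, so a larger share of high-reliability nodes lowers $R_0$. Evaluated this way, the listed parameters give roughly 0.875 at $p = 0.25$ and 0.725 at $p = 0.75$, somewhat different from the 0.86 and 0.592 quoted in the text, so the quoted figures may rest on settings that are not listed here.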
As can be seen from Fig. 5, after about 240 day, the vul-
[Plot: curves Vulnerable 1, Vulnerable 2, W and Faulted; axis ticks 0 to 400 and 0 to 2000]
Fig. 5 The simulation result when $R_0 < 1$ and p increases from 0.25 to 0.75