Balancing Traffic Load for Devolved Controllers in Data Center Networks
Wanchao Liang†, Xiaofeng Gao†, Fan Wu†, Guihai Chen†, Wei Wei‡
†Department of Computer Science and Engineering, Shanghai Jiao Tong University
‡Department of Computer Science, Stanford University
[email protected],{gao-xf, fwu, gchen}@cs.sjtu.edu.cn, [email protected]
Abstract: Using a centralized controller for resource management and coordination is a common practice in cloud services. For scalability, recent literature proposed a novel approach, namely devolved controllers. This approach splits the network into regions, and each controller monitors only a portion of the traffic. The technique alleviates the scalability issue but brings other critical problems, such as unbalanced workload among controllers and reconfiguration complexity. In this paper, we investigate the usage of devolved controllers for large-scale data centers and design a new scheme to overcome these shortcomings and improve system performance. We first define the Load Balancing problem for Devolved Controllers (LBDC) and prove its NP-completeness. For LBDC, we design an f-approximation, where f is the largest number of potential controllers for a switch in the network. We also propose centralized and distributed approaches to solve LBDC time-efficiently. The numerical results validate our designs, which offer a solution to manage and coordinate large-scale data centers.
I. INTRODUCTION
In recent years, the data center has emerged as an infrastructure that holds thousands of servers and supports many cloud services such as computing, group collaboration, storage, and financial applications. The fast proliferation of cloud computing has promoted a rapid growth of large-scale commercial data centers. Companies such as Amazon, Cisco, Google and Microsoft have made huge investments in Data Center Networks (DCNs) for improvement.
Typically, a DCN has a centralized controller to monitor and manage network resources and to update routing information [1] [2] [3]. For instance, Hedera [4] and SPAIN [5] use a controller to collect traffic statistics and reroute flows. A controller also provides address look-up services for VM migrations [6].
However, for a large-scale DCN with thousands of racks, a centralized controller suffers from many problems, such as scalability and availability. Driven by unprecedented scale and control objectives, researchers have tried to deploy multiple controllers in DCNs [7]–[11]. The concept of devolved controllers was introduced in [7], where the authors used dynamic flow [8] to illustrate the detailed configuration. Devolved controllers are a set of controllers that together function as an omniscient controller. However, none of them has the entire information. Instead, every controller maintains only partial information beforehand, which reduces the workload significantly.
This work was supported by the State Key Development Program for Basic Research of China (973 project 2014CB340303 and 2012CB316201), China NSF grant (61202024, 61272443 and 61133006), Shanghai Science and Technology fund (13PJ1403900, 12PJ1404900, 12ZR1445000, and 12ZR1414900), and Shanghai Educational Development Foundation (Chenguang Grant No.12CG09). X. Gao is the corresponding author.
The multi-controller technique alleviates the scalability problem, but several issues remain to be explored. Firstly, pre-computed multipaths must be recalculated and the network must be reconfigured if we expand the current network, which brings updating difficulties and a heavy computation load. Secondly, flow queries are sent to every controller, which keeps controllers relatively busy and ignores the distance between senders and receivers. Thirdly, the pre-computation is actually centralized when configuring the network, which does not fit distributed and MapReduce-related applications. With the advent of Software Defined Network (SDN) [13] as proposed by OpenFlow [14], data center networking is being revolutionized in the industry, and several papers on designing distributed controllers [9]–[12] have appeared. In [9], the authors overcome the limitation of statically configured mapping between switches and controllers and propose a migration protocol. Thus it is desirable to design an efficient strategy for devolved controllers to better manage the network traffic and reduce routing cost.
Motivated by these challenges, in this paper we propose a novel scheme to manage the traffic within the OpenFlow framework. In our scheme, each controller monitors the traffic of switches locally. When a traffic load imbalance occurs, some controllers migrate part of their work to other controllers to keep the workload dynamically balanced. We define this problem as the Load Balancing problem for Devolved Controllers (LBDC). We prove that LBDC is NP-complete, and then design one approximation based on linear programming with deterministic rounding, one centralized algorithm, and one distributed algorithm to dynamically balance the traffic load among controllers.
Such methods can avoid the emergence of traffic hot spots, which would degrade network performance. These schemes can also significantly improve the availability and throughput of a DCN. To the best of our knowledge, we are the first to discuss the workload balancing problem among multiple controllers in DCNs, which has both theoretical and practical significance.
The rest of the paper is organized as follows. Section II presents the scenario and problem statement. Sections III and IV present the LBDC solutions. Section V presents our performance evaluation. Finally, Section VI concludes the paper.
II. PROBLEM STATEMENT
Traffic in a DCN can be considered as Virtual Machine (VM) communication. VMs in different servers collaborate to complete designated tasks. To communicate between VMs, a communication flow goes through several switches.
Based on the OpenFlow [14] concept, each switch has a flow table, and one responsibility of a controller is to modify these flow tables when communication occurs. Every controller is in charge of several hierarchical switches. Moreover, every rack has a server called the designated server [15], which is responsible for aggregating, processing, and sending the traffic statistics to the controller. On receiving these data, the controller assigns a routing component to compute flow rerouting. Then the controller installs the new route on all associated switches by modifying their flow tables.
Now we define our problem formally. In a typical DCN, denote by $s_i$ the $i$th switch, with corresponding traffic weight $w(s_i)$, defined as the number of outgoing flows. Given $n$ switches $S = \{s_1, \dots, s_n\}$ and $m$ controllers $C = \{c_1, \dots, c_m\}$, we make a weighted $m$-partition of the switches such that each controller monitors a subset of switches. The weight of a controller, $w(c_i)$, is the sum of the weights of its monitored switches. Due to physical limitations, assume every $s_i$ can only be monitored by a controller in its potential controller set $PC(s_i)$, and every $c_i$ can only control switches in its potential switch set $PS(c_i)$. After the partition, the real controller of $s_i$ and the real switch subset of $c_i$ are denoted by $rc(s_i)$ and $RS(c_i)$ respectively.
The symbols used in this paper are listed in Table I:
TABLE I. DEFINITION OF TERMS
Term | Definition
$S$, $s_i$ | switch set consisting of $n$ switches: $S = \{s_1, \dots, s_n\}$
$w(s_i)$ | weight of $s_i$, defined as the number of outgoing flows
$PC(s_i)$ | Potential Controllers set of the $i$th switch
$rc(s_i)$ | the real controller of the $i$th switch
$C$, $c_i$ | controller set consisting of $m$ controllers: $C = \{c_1, \dots, c_m\}$
$w(c_i)$ | the weight of the $i$th controller, the sum of the weights in $RS(c_i)$
$PS(c_i)$ | Potential Switches set of the $i$th controller
$RS(c_i)$ | Real Switches set of the $i$th controller
$AN(c_i)$ | Adjacent Nodes (1-hop neighborhood) of the $i$th controller
To maintain the quality of network management, each controller should have nearly the same workload. Otherwise, if all switches always communicate with the same controller, that controller becomes a congestion bottleneck and performance degrades. To precisely quantify the balancing performance among controllers, we use the standard deviation of the partitions' weights as the metric, denoted by
\[ \sigma = \sqrt{\frac{1}{m}\sum_{i=1}^{m}\big(w(c_i) - \overline{w(c)}\big)^{2}}, \]
where $\overline{w(c)}$ is the average weight of the controllers. If the traffic flows vary as the system runs and the weight of $c_i$ grows explosively, then we must regionally migrate some switches in $RS(c_i)$ to other available controllers to reduce its workload and keep the traffic balanced.
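As a minimal sketch of the metric above, the following Python snippet computes the controller weights and their standard deviation for a given assignment. The function names and the toy data are illustrative assumptions, not part of the original scheme.

```python
import math


def controller_weights(switch_weight, assignment, controllers):
    """Sum the weights of the switches assigned to each controller."""
    weights = {c: 0.0 for c in controllers}
    for switch, controller in assignment.items():
        weights[controller] += switch_weight[switch]
    return weights


def balance_metric(weights):
    """Population standard deviation of the controller weights (sigma)."""
    mean = sum(weights.values()) / len(weights)
    variance = sum((w - mean) ** 2 for w in weights.values()) / len(weights)
    return math.sqrt(variance)


# Hypothetical toy instance: four switches, two controllers.
switch_weight = {"s1": 4, "s2": 2, "s3": 6, "s4": 4}
assignment = {"s1": "c1", "s2": "c1", "s3": "c2", "s4": "c2"}
weights = controller_weights(switch_weight, assignment, ["c1", "c2"])
sigma = balance_metric(weights)
print(weights, sigma)
```

Here the two controllers carry weights 6 and 10 around an average of 8, so the metric equals 2; a perfectly balanced assignment would give 0.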
Then our problem becomes balancing the weight among m partitions. We define this problem as Load Balancing problem for Devolved Controllers (LBDC). In our scheme, each controller can dynamically migrate or receive switches to keep load balanced. Figure 1 illustrates the migration pattern.
Fig. 1. An example of regional balancing migration. Controller $c_j$ dominates 17 switches and Controller $c_i$ dominates 13 switches. The traffic between $c_i$ and $c_j$ is unbalanced, and $c_j$ is migrating one of its switches to $c_i$.
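Define $x_{ij} = 1$ if $c_i$ monitors $s_j$, and $x_{ij} = 0$ otherwise. Then LBDC can be further formulated as the following program:
\begin{align}
\min\quad & \sqrt{\frac{1}{m}\sum_{i=1}^{m}\Big(\sum_{j=1}^{n} w(s_j)\, x_{ij} - \overline{w(c)}\Big)^{2}} \tag{1}\\
\text{s.t.}\quad & \overline{w(c)} = \frac{1}{m}\sum_{i=1}^{m}\sum_{j=1}^{n} w(s_j)\, x_{ij} \tag{2}\\
& \sum_{i=1}^{m} x_{ij} = 1, \quad \forall\, 1 \le j \le n \tag{3}\\
& x_{ij} = 0 \ \text{ if } s_j \notin PS(c_i) \text{ or } c_i \notin PC(s_j), \quad \forall i, j \tag{4}\\
& x_{ij} \in \{0, 1\}, \quad \forall i, j \tag{5}
\end{align}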
Eqn.(1) is the standard deviation, Eqn.(2) calculates the av- erage weight among controllers, Eqn.(3) means each switch should be monitored by exactly one controller, Eqn.(4) is the regional constraints, while Eqn.(5) is the integer constraints.
Theorem 1. LBDC is NP-complete.
Proof. We prove the NP-completeness of LBDC by considering a decision version of the problem and showing a reduction from the PARTITION problem [16]. An instance of PARTITION is: given a finite set $A$ and a size $size(a) \in \mathbb{Z}^{+}$ for each $a \in A$, is there a subset $A' \subseteq A$ such that $\sum_{a \in A'} size(a) = \sum_{a \in A \setminus A'} size(a)$? Now we construct an instance of LBDC. In this instance there are 2 controllers $c_1, c_2$ and $|A|$ switches. Each switch $s_a$ represents an element $a \in A$, with weight $w(s_a) = size(a)$. Both controllers can control every switch ($PS(c_1) = PS(c_2) = \{s_a \mid a \in A\}$).
Then, given a YES solution $A'$ for PARTITION, we have a solution $RS(c_1) = \{s_a \mid a \in A'\}$, $RS(c_2) = \{s_a \mid a \in A \setminus A'\}$ with $\sigma = 0$. The reverse direction is straightforward: an assignment with $\sigma = 0$ gives both controllers the same weight, so $A' = \{a \mid s_a \in RS(c_1)\}$ is a YES solution for PARTITION. The reduction can be done in polynomial time, which completes the proof. □
Next we present our solutions for LBDC. We implement the scheme within OpenFlow, which turns devolved controllers from a mathematical model into an implementable prototype. Also, our scheme is topology free, so it scales to any DCN topology such as Fat-Tree, BCube, PortLand, etc.
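The reduction used in the proof of Theorem 1 can be illustrated on tiny instances. The sketch below builds the two-controller LBDC instance from a list of PARTITION sizes and searches all assignments exhaustively; it takes exponential time and is meant only to show that the minimum $\sigma$ is 0 exactly when an equal split exists. The function names are hypothetical.

```python
import math
from itertools import product


def lbdc_from_partition(sizes):
    """Build the two-controller LBDC instance used in the reduction."""
    switch_weight = {f"s{k}": size for k, size in enumerate(sizes)}
    controllers = ["c1", "c2"]  # both controllers may monitor every switch
    return switch_weight, controllers


def min_sigma_brute_force(switch_weight, controllers):
    """Try every assignment of switches to controllers (tiny inputs only)."""
    switches = list(switch_weight)
    best = math.inf
    for choice in product(controllers, repeat=len(switches)):
        loads = {c: 0 for c in controllers}
        for s, c in zip(switches, choice):
            loads[c] += switch_weight[s]
        mean = sum(loads.values()) / len(loads)
        sigma = math.sqrt(sum((w - mean) ** 2 for w in loads.values()) / len(loads))
        best = min(best, sigma)
    return best


yes_instance = min_sigma_brute_force(*lbdc_from_partition([3, 1, 1, 2, 2, 1]))
no_instance = min_sigma_brute_force(*lbdc_from_partition([2, 3, 6]))
print(yes_instance, no_instance)
```

The first list sums to 10 and can be split 5/5, so the minimum metric is 0. The second sums to 11, so no equal split exists and the best achievable split is 5 versus 6.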
III. LINEAR PROGRAMMING AND ROUNDING
Given the traffic status of the current DCN with devolved controllers, we can solve LBDC using program (1)-(5).
To simplify this program, we transform it into a similar integer program. Firstly, we convert the standard deviation (1) to the sum of absolute values:
\begin{align}
\min\quad \frac{1}{m}\sum_{i=1}^{m}\Big|\sum_{j=1}^{n} w(s_j)\, x_{ij} - \overline{w(c)}\Big| \tag{6}
\end{align}
Then we rewrite Eqn.(6) and obtain the integer program:
\begin{align}
\min\quad & \frac{1}{m}\sum_{i=1}^{m} y_i \tag{7}\\
\text{s.t.}\quad & y_i \ge \sum_{j=1}^{n} w(s_j)\, x_{ij} - \overline{w(c)} \tag{8}\\
& y_i \ge \overline{w(c)} - \sum_{j=1}^{n} w(s_j)\, x_{ij} \tag{9}\\
& \overline{w(c)} = \frac{1}{m}\sum_{i=1}^{m}\sum_{j=1}^{n} w(s_j)\, x_{ij} \tag{10}\\
& \sum_{i=1}^{m} x_{ij} = 1, \quad \forall\, 1 \le j \le n \tag{11}\\
& x_{ij} = 0 \ \text{ if } s_j \notin PS(c_i) \text{ or } c_i \notin PC(s_j), \quad \forall i, j \tag{12}\\
& x_{ij} \in \{0, 1\}, \quad \forall i, j \tag{13}
\end{align}
In general, integer programs may not be easily solved in polynomial time, so we adopt relaxation to transform our integer program into a linear program (LP). Then we can get a fractional solution and round it to a feasible solution of the original integer program. To obtain the linear program, we replace Eqn.(13) with $x_{ij} \ge 0$ ($\forall i, j$).
After solving this LP, we recover a feasible solution to LBDC by the deterministic rounding [17] stated as follows:
Algorithm 1: Deterministic Rounding (LBDC-R)
for each switch $s_j$ do
    Search the solution space of the LP:
    Let $l = \arg\max_i \{x_{ij} \mid 1 \le i \le m\}$;
    if there are several maximal $x_{ij}$ then
        Let $l = \arg\min_i \{\sum_{j=1}^{n} w(s_j) \mid \text{each max } x_{ij}\}$;
    Round $x_{lj} = 1$;
    for $c_i \ne c_l$ do
        Round $x_{ij} = 0$;
For instance, if switch $s_j$ has $x_{1j} = 0.2$, $x_{2j} = 0.7$, $x_{3j} = 0.1$ in the LP solution, then according to Alg. 1 we round $x_{2j} = x_{lj} = 1$ and $x_{1j} = x_{3j} = 0$. We claim that the resulting solution is feasible for LBDC.
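The rounding step can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' implementation: the fractional LP solution is taken as given (no LP solver is invoked), `x_frac[i][j]` holds the share of switch $s_j$ on controller $c_i$, and ties between maximal entries are broken by the controller load accumulated so far during rounding, which is one possible reading of the tie-break rule in Alg. 1.

```python
def deterministic_rounding(x_frac, switch_weight, tol=1e-9):
    """Round a fractional assignment; return the chosen controller index per switch.

    x_frac[i][j]     : fractional share of switch j on controller i (assumed given)
    switch_weight[j] : weight of switch j
    """
    m = len(x_frac)       # number of controllers
    n = len(x_frac[0])    # number of switches
    load = [0.0] * m      # load accumulated by already rounded switches (assumption)
    chosen = []
    for j in range(n):
        column = [x_frac[i][j] for i in range(m)]
        best = max(column)
        # All controllers whose share is maximal up to the tolerance.
        tied = [i for i in range(m) if best - column[i] <= tol]
        # Assumed tie-break: prefer the currently lightest controller.
        l = min(tied, key=lambda i: load[i])
        chosen.append(l)
        load[l] += switch_weight[j]
    return chosen


# Switch 0 reproduces the example above (0.2, 0.7, 0.1); switch 1 has a tie.
x_frac = [
    [0.2, 0.5],
    [0.7, 0.5],
    [0.1, 0.0],
]
chosen = deterministic_rounding(x_frac, switch_weight=[3.0, 2.0])
print(chosen)
```

Switch 0 goes to the controller with share 0.7 (index 1). For switch 1 the shares of controllers 0 and 1 tie at 0.5, and the sketch picks controller 0 because it carries no load yet.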
Theorem 2. LBDC-R results in a feasible solution for LBDC.
Proof. According to LBDC-R, for each $s_j$ we round only the maximum $x_{ij}$ ($1 \le i \le m$) to 1 and all the other $x_{ij}$ to 0. Hence each switch is dominated by exactly one controller and no switch is idle. Thus we get a feasible solution for LBDC. □
Next we analyze the performance of LBDC-R. We define $Z^{*}$, $Z_{LP}$ and $Z_{R}$ as the objective values of the integer programming solution, the linear programming solution, and the solution after the rounding process respectively. $f$ is defined as the maximum number of controllers in which any switch potentially appears. More formally, $f = \max_{i=1,\dots,n} |PC(s_i)|$. We claim that LBDC-R is an $f$-approximation. To prove it, we first prove Lemmas 1 and 2.
Lemma 1. $\overline{w(c)}^{LP} = \overline{w(c)}^{*} = \overline{w(c)}^{R}$.
Proof. From the definition of $\overline{w(c)}$, the ideal weight of each controller is the sum of the weights of all switches divided by the number of controllers. This definition holds over the entire solution space, thus we conclude that $\overline{w(c)}^{LP} = \overline{w(c)}^{*} = \overline{w(c)}^{R} = \frac{1}{m}\sum_{i=1}^{n} w(s_i)$. □
Lemma 2. $x^{R}_{ij} \le f \cdot x^{LP}_{ij}$.
Proof. We have the constraint $\sum_{i=1}^{m} x^{LP}_{ij} = 1$ ($\forall\, 1 \le j \le n$). For a given $s_j$, at most $f$ of the $x^{LP}_{ij}$ can be nonzero, and $x^{LP}_{lj}$ is the largest of all $x^{LP}_{ij}$ ($1 \le i \le m$), so by the pigeonhole principle we must have $f \cdot x^{LP}_{lj} \ge 1$. For each $s_j$, $x^{R}_{lj}$ equals 1 and the others equal 0, so each rounded value is less than or equal to the corresponding LP value times the factor $f$. Hence for any $c_i$ we have $x^{R}_{ij} \le f \cdot x^{LP}_{ij}$. □
Theorem 3. LBDC-R is an $f$-approximation algorithm.
Proof. Since the LP is a relaxation, we have $Z_{LP} \le Z^{*}$. We also have $Z^{*} \le Z_{R}$, because the solution of LBDC-R is feasible by Theorem 2 while $Z^{*}$ denotes the optimal solution.
Because $\overline{w(c)}$ is the ideal weight of each controller, it is the same in all solutions by Lemma 1, so we let $w = \overline{w(c)}$. From $Z_{LP} \le Z^{*}$ we derive:
\[ \frac{1}{m}\sum_{i=1}^{m}\Big|\sum_{j=1}^{n} w(s_j)\, x^{LP}_{ij} - w\Big| \le \frac{1}{m}\sum_{i=1}^{m}\Big|\sum_{j=1}^{n} w(s_j)\, x^{*}_{ij} - w\Big| \]
Since $|x| - |y| \le |x - y| \le |x| + |y|$, we get:
\[ \frac{1}{m}\sum_{i=1}^{m}\sum_{j=1}^{n} w(s_j)\, x^{LP}_{ij} \le \frac{1}{m}\sum_{i=1}^{m}\sum_{j=1}^{n} w(s_j)\, x^{*}_{ij} + 2w \]
The approximation ratio can be obtained by the following:
\begin{align*}
\frac{1}{m}\sum_{i=1}^{m}\Big|\sum_{j=1}^{n} w(s_j)\, x^{R}_{ij} - w\Big| &\le \frac{1}{m}\sum_{i=1}^{m}\sum_{j=1}^{n} w(s_j)\, x^{R}_{ij} + w \\
&\le \frac{1}{m}\sum_{i=1}^{m}\sum_{j=1}^{n} w(s_j)\, x^{LP}_{ij} \cdot f + w \\
&\le f \cdot \frac{1}{m}\sum_{i=1}^{m}\sum_{j=1}^{n} w(s_j)\, x^{*}_{ij} + (1 + 2f)\, w \\
&= f \cdot OPT + (1 + 2f)\, w
\end{align*}
Therefore LBDC-R is an $f$-approximation. □
IV. ALGORITHM DESIGN
Linear programming and rounding can solve LBDC theoretically. But solving an LP is time consuming and impractical for real-world applications. Therefore it is essential to design efficient and applicable algorithms. In this section, we propose centralized and distributed greedy algorithms for LBDC. The centralized scheme is suitable for relatively small-scale DCNs, while the distributed one is natural for huge-scale DCNs.
A. Centralized Migration
Centralized Migration consists of two phases. First we configure and initialize the DCN. Then, as the traffic changes dynamically, we enter the dynamic migration phase.
Centralized Initialization: First we need to initialize the DCN and assign switches to controllers while satisfying the load balance requirement. For this we design the centralized initialization algorithm (LBDC-CI). To resolve ties when several switches or controllers are equally eligible, we first present the Break Tie Law.
Break Tie Law: 1) When choosing $s_i$ from $S$, we select the one with the largest weight. If there are several such switches, the one with the smallest $|PC(s_i)|$ is preferred. If there are still several candidates, we pick randomly. 2) When choosing $c_i$ from $C$, we select the one with the minimum weight. If there are several such controllers, the one with the smallest $|RS(c_i)|$ is preferred. If there are still several candidates, we choose by physical distance. Finally, if we still cannot decide, we pick randomly.
Then we design LBDC-CI as shown in Alg. 2.
Algorithm 2: Centralized Initialization (LBDC-CI)
Input: $S$ with $w(s_i)$; $C$ with $w(c_i)$;
Output: An $m$-partition of $S$ to $C$
RemList = $\{s_1, s_2, \dots, s_n\}$;
while RemList $\ne \emptyset$ do
    Pick $s_i$ from RemList;
    if $|PC(s_i)| = 1$ then
        Assign $s_i$ to its unique controller in $PC(s_i)$;
    else
        Assign $s_i$ to the $c_j$ with min $w(c_j)$ in $PC(s_i)$;
    Remove $s_i$ from RemList;
LBDC-CI needs $O(n)$ to assign the switches in RemList. In the while loop, it takes $O(f)$ to select a $c_j$. Hence the worst-case running time is $O(n^2)$. If we store RemList in a priority heap, we can reduce the overall running time to $O(n)$.
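A compact Python sketch of the greedy initialization follows. It is an illustration under assumptions: switches are visited in the order given by part 1 of the Break Tie Law (heaviest first, then fewest potential controllers), controllers are compared by current weight and then by $|RS(c_i)|$, and the distance-based and random tie-breaks are omitted so that the result is deterministic. The function name and the toy data are hypothetical.

```python
def centralized_initialization(switch_weight, potential_controllers):
    """Greedily assign each switch to the lightest controller in its PC set."""
    controllers = sorted({c for pcs in potential_controllers.values() for c in pcs})
    load = {c: 0 for c in controllers}
    real_switches = {c: [] for c in controllers}

    # Break Tie Law, part 1: largest weight first, then smallest |PC(s)|.
    order = sorted(
        switch_weight,
        key=lambda s: (-switch_weight[s], len(potential_controllers[s])),
    )
    for s in order:
        # Break Tie Law, part 2 (simplified): minimum weight, then smallest |RS(c)|.
        target = min(
            potential_controllers[s],
            key=lambda c: (load[c], len(real_switches[c])),
        )
        real_switches[target].append(s)
        load[target] += switch_weight[s]
    return real_switches, load


# Hypothetical instance: s1 can only be monitored by c1.
switch_weight = {"s1": 5, "s2": 4, "s3": 3, "s4": 2}
potential_controllers = {
    "s1": ["c1"],
    "s2": ["c1", "c2"],
    "s3": ["c1", "c2"],
    "s4": ["c1", "c2"],
}
real_switches, load = centralized_initialization(switch_weight, potential_controllers)
print(real_switches, load)
```

On this instance both controllers end with weight 7, so the metric $\sigma$ is 0 after initialization.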
As the system runs, traffic may vary frequently and disturb the balanced status. Correspondingly, we design the centralized migration algorithm (LBDC-CM) to alleviate the situation.
Centralized Regional Migration: Since we must assess when a controller should execute migration, we set a threshold to judge the traffic status. When a controller's traffic degree exceeds the threshold, we regard this controller as unbalanced and in need of migration. Some measurement studies [18] [19] of data center traffic have shown that data center traffic is expected to be linear. We set the threshold based on the current traffic sample and the historical record, mimicking the estimation of RTT and Timeout in TCP [20]. This linear expectation uses two factors $\alpha$ and $\beta$ that depend on the traffic features of the DCN, where $0 \le \alpha \le 1$ and $\beta > 1$. We divide time into several rounds and run LBDC-CM periodically. We use $Thd$ and $Efn$ to denote the parameters of threshold and effluence, and $Avg_{last}$ and $Avg_{now}$ to represent the average workload of the last and the current sample round. In each round, we sample the current weight of each node and calculate $Avg_{now} = \sum w(c_i)/m$.
The Linear Expectation can be computed as follows:
\[ Thd = \alpha \cdot Avg_{now} + (1 - \alpha) \cdot Avg_{last}, \qquad Efn = \beta \times Thd \]
The core principle of LBDC-CM is to migrate heavy switches to light controllers greedily. Figure 2 and Alg. 3 illustrate the workflow and procedure of centralized migration.
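The Linear Expectation above amounts to an exponentially weighted average followed by a scaling. A minimal sketch, with the default values of $\alpha$ and $\beta$ borrowed from the evaluation setup in Section V and a hypothetical function name:

```python
def linear_expectation(avg_now, avg_last, alpha=0.8, beta=1.2):
    """Return (Thd, Efn) from the current and the previous average workload."""
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must lie in [0, 1]")
    if beta <= 1:
        raise ValueError("beta must be greater than 1")
    thd = alpha * avg_now + (1 - alpha) * avg_last
    efn = beta * thd
    return thd, efn


# Hypothetical sample: the average workload rose from 40 to 50 flows.
thd, efn = linear_expectation(avg_now=50, avg_last=40)
print(thd, efn)
```

With these inputs the threshold is about 48 and the effluence about 57.6, so a controller is treated as overloaded only once its weight exceeds roughly 57.6.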
Algorithm 3: Centralized Migration (LBDC-CM)
Input: $S$ with $w(s_i)$; $C$ with $w(c_i)$; PendList = OverList = $\emptyset$;
Step 1: Add $c_i$ to OverList if $w(c_i) > Efn$;
Step 2: Find $c_m$ of max weight in OverList;
    if $\exists\, c_n \in AN(c_m) : w(c_n) < Thd$ then
        repeat
            Pick $s_m$ of max weight in $c_m$, refer to $PC(s_m)$:
            if $\exists\, c_f \in AN(c_m)$ with $w(c_f) < Thd$ then
                Send $s_m \to c_f$;
            else
                Ignore the current $s_m$ in $c_m$;
        until $w(c_m) \le Thd$ or $w(c_f) \ge Thd$;
        if still $w(c_m) > Efn$, move $c_m$ to PendList;
    else
        Move $c_m$ from OverList to PendList;
Step 3: Repeat Step 2 until OverList = $\emptyset$;
    Let OverList = PendList, repeat Step 2 until PendList becomes stable;
Step 4: Now PendList has several connected components $CC_i$ ($1 \le i \le |CC|$);
    for each $CC_i \in CC$ do
        Search $\bigcup_{c_j \in CC_i} AN(c_j)$;
        Compute $avg_{local} = \frac{w(CC_i \cup AN(CC_i))}{|CC_i| + |AN(CC_i)|}$;
        if $w(c_j) > \gamma \cdot avg_{local}$, where $c_j \in CC_i$ then
            Migrate $s_{max} \in RS(c_j)$ to $c_{min} \in AN(CC_i)$ repeatedly until $w(c_j) \le \gamma \cdot avg_{local}$;
            Remove $c_j \in CC_i$ from PendList;
Step 5: Repeat Step 4 until PendList becomes stable.
LBDC-CM searches OverList to find $c_m$ in Step 2, which takes $O(n)$. Next, it migrates switches from the controllers in OverList, which takes $O(n^2)$. Step 3 invokes Step 2 several times until OverList is empty and PendList becomes stable, which takes $O(n^3)$. Steps 4 and 5 balance PendList locally in the same way as Steps 2 and 3, so the worst-case running time is $O(n^3)$. By storing OverList and PendList in priority heaps, we can reduce the complexity to $O(n^2)$.
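The inner loop of Step 2 can be sketched as follows. This is a simplified illustration, not the full LBDC-CM: it handles a single overloaded controller, assumes a receiving neighbor must belong to $PC(s_m)$ and be below $Thd$ before it accepts a switch, picks the lightest such neighbor, and leaves out OverList, PendList, and Steps 3 to 5. All names and the toy data are hypothetical.

```python
def migrate_from(cm, real_switches, switch_weight, potential_controllers, neighbors, thd):
    """Move heavy switches away from controller cm until its weight is <= thd.

    Mutates real_switches and returns the list of (switch, target) moves.
    """
    def load(c):
        return sum(switch_weight[s] for s in real_switches[c])

    moved = []
    # Heaviest switches first; sorted() copies, so mutating the list below is safe.
    for s in sorted(real_switches[cm], key=lambda s: -switch_weight[s]):
        if load(cm) <= thd:
            break
        candidates = [
            c for c in neighbors[cm]
            if c in potential_controllers[s] and load(c) < thd
        ]
        if not candidates:
            continue  # no eligible neighbor: ignore this switch
        target = min(candidates, key=load)
        real_switches[cm].remove(s)
        real_switches[target].append(s)
        moved.append((s, target))
    return moved


# Hypothetical instance: c1 is overloaded, c2 and c3 are its neighbors.
switch_weight = {"a": 6, "b": 5, "c": 4, "d": 3, "e": 2}
real_switches = {"c1": ["a", "b", "c", "d"], "c2": ["e"], "c3": []}
potential_controllers = {
    "a": {"c1"}, "b": {"c1", "c2"}, "c": {"c1", "c3"},
    "d": {"c1", "c2", "c3"}, "e": {"c2"},
}
neighbors = {"c1": ["c2", "c3"]}
moved = migrate_from("c1", real_switches, switch_weight, potential_controllers, neighbors, thd=8)
print(moved, real_switches)
```

Switch a cannot leave c1 because c1 is its only potential controller; the remaining switches are handed to c2 and c3, leaving c1 with weight 6.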
Fig. 2. Dynamic Load Balancing Workflow of LBDC
[Flowchart nodes: Network Initialization; Traffic Varies; Unbalanced State; w(c) > Efn (Yes/No); Regional Migration; check next controller; check all? (Yes/No); Migration Completed]
B. Distributed Migration
The centralized algorithm is suitable for relatively small-scale networks because of its accuracy. But for large-scale networks, we need to design a faster and more practical distributed algorithm [21]. We assume a synchronous environment to perform our two-phase algorithm.
Distributed Initialization: During this phase, we assign each switch a controller randomly through message communication.
Alg. 4 illustrates the distributed initialization procedure.
Algorithm 4: Distributed Initialization (LBDC-DI)
Send “CONTROL” message to my own $PS(c_{my})$;
Each $s_i$ replies to the first-come “CONTROL” message with “YES”, and to all later messages with “NO”;
Move each $s_i$ that replied “YES” from $PS(c_{my})$ to $RS(c_{my})$;
Wait until all the switches in $PS(c_{my})$ reply, then terminate;
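The message exchange of Alg. 4 can be mimicked in a single process. The sketch below is a simulation under assumptions: message arrival order is modeled by a seeded random shuffle, and the “YES”/“NO” replies are implicit in whether a switch is already owned. It is not a networked implementation, and the function name is hypothetical.

```python
import random


def distributed_initialization(potential_switches, seed=0):
    """Simulate LBDC-DI: each switch accepts the first CONTROL message it sees."""
    rng = random.Random(seed)
    # One CONTROL message per (controller, potential switch) pair.
    messages = [(c, s) for c, switches in potential_switches.items() for s in switches]
    rng.shuffle(messages)  # stands in for an arbitrary arrival order

    owner = {}
    real_switches = {c: [] for c in potential_switches}
    for c, s in messages:
        if s not in owner:  # first-come message: the switch replies YES
            owner[s] = c
            real_switches[c].append(s)
        # otherwise the switch replies NO and nothing changes
    return real_switches


# Hypothetical instance: s2 and s3 are reachable from both controllers.
potential_switches = {"c1": ["s1", "s2", "s3"], "c2": ["s2", "s3", "s4"]}
real_switches = distributed_initialization(potential_switches, seed=1)
print(real_switches)
```

Whatever the arrival order, every switch ends up with exactly one controller from its potential set, but no attention is paid to load balance at this stage.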
After initialization, we design the distributed migration algorithm (LBDC-DM) to balance the workload dynamically.
Distributed Regional Migration: In this phase, the controller uses the threshold to decide whether it should start migration. Since it only accesses its neighborhood, the threshold is not a global one but an independent value computed by each controller locally. The algorithm runs periodically in several rounds. In each round, each controller samples $AN(c_i)$ and applies Linear Expectation again:
\[ Avg = \frac{\sum_{c_k \in AN(c_i) \cup \{c_i\}} w(c_k)}{|AN(c_i)| + 1} \]
\[ Thd = \alpha \cdot Avg_{now} + (1 - \alpha) \cdot Avg_{last}, \qquad Efn = \beta \times Thd \]
A controller monitors its traffic status against its local threshold. When the traffic degree is larger than $Efn$, it enters the sending state and initiates a transaction to transfer heavy switches to neighbors.
Alg. 5 illustrates the distributed migration procedure.
Algorithm 5: Distributed Migration (LBDC-DM)
if $w(c_{my}) \ge Efn$ (sending state) then
    if $\exists\, c_i \in AN(c_{my})$ in receiving or idle state then
        add $c_i \to$ RList (receiving > idle);
    repeat
        Pick $s_{max}$ with max weight, refer to $PC(s_{max})$, find $c_j$ (in RList) with min weight, send “HELP[$c_i$, $s_{max}$]” to $c_j$, then check response:
        if response = “ACC” then
            $c_{my}$ starts migration with $c_j$ and $s_{max}$.
        else if response = “REJ” then
            remove $c_j$ from RList, find next $c_j$, send “HELP” again, check response.
    until $w'(c_i) \le Efn$;
else if $w(c_{my}) \le Thd$ (receiving state) then
    When receiving “HELP” messages:
    repeat
        receive switches for $c_j$ and send back “ACC”;
    until $w(c_j) + w(s_{max}) \ge Thd$;
    Now all “HELP” messages will be answered with “REJ”
else if $w(c_{my}) \in (Thd, Efn)$ (idle state) then
    When receiving a “HELP” message:
    repeat receiving state until $w(c_j) + w(s_{max}) > Efn$;
    Now facing other “HELP”s, the controller will reply “REJ” and enter the sending state;
It is easy to prove the features of a distributed algorithm such as termination, agreement, and validity for Alg. 5, which indicates its correctness and efficiency.
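The local threshold and the three controller states used by LBDC-DM can be sketched as follows. This is a minimal illustration assuming each controller knows the current weights of its 1-hop neighbors; the message handling ("HELP", "ACC", "REJ") is not modeled, and the function name and data are hypothetical.

```python
def local_state(ci, load, neighbors, avg_last, alpha=0.8, beta=1.2):
    """Classify controller ci as sending, receiving, or idle from local data."""
    group = [ci] + list(neighbors[ci])  # AN(c_i) together with c_i itself
    avg_now = sum(load[c] for c in group) / len(group)
    thd = alpha * avg_now + (1 - alpha) * avg_last
    efn = beta * thd
    if load[ci] >= efn:
        state = "sending"
    elif load[ci] <= thd:
        state = "receiving"
    else:
        state = "idle"
    return state, thd, efn


# Hypothetical neighborhood of three mutually adjacent controllers.
load = {"c1": 70, "c2": 40, "c3": 55}
neighbors = {"c1": ["c2", "c3"], "c2": ["c1", "c3"], "c3": ["c1", "c2"]}
states = {c: local_state(c, load, neighbors, avg_last=50)[0] for c in load}
print(states)
```

All three controllers see a local average of 55, hence a threshold near 54 and an effluence near 64.8, so c1 sends, c2 receives, and c3 stays idle.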
V. PERFORMANCE EVALUATION
We evaluate the performance of our scheme by considering the case of traffic demand changes and examine whether the metric of balanced workload is minimized. We also take the number of migrated switches into consideration. Furthermore, we check how different parameters will impact the results.
A. Environment Setup
We place 10000 switches and 100 controllers in a 100 × 100 m² square. Switches are evenly distributed, that is, each switch is 1 m away from its neighbors. Controllers are also evenly distributed, each 10 m away from its neighbors. Each controller can control the switches within 30 m and can communicate with other controllers within a range of 40 m. We assume the weight of each switch follows a Pareto distribution with parameter α = 3. For the threshold parameters, we set α = 0.8, β = 1.2, γ = 1.5 by default.
B. Controller Number
Figure 5 uses the default configuration described above, except that the number of controllers varies from 20 to 100.
We first apply LBDC-CI and change the traffic demands dynamically to emulate unpredictable user requests. Then we apply LBDC-CM to ease the spot congestion. We use the metric described in Section II to evaluate the performance. In Fig. 5, we compare the standard deviation of the initial bursty traffic state with that of the state after migration. After migration, the metric decreases. As the number of controllers increases, the improvement ratio also increases, since more controllers can share the jobs to reach a balanced load. This figure also shows that our algorithm performs well as the number of controllers grows, which indicates that our scheme is suitable for huge DCNs.
Figure 6 shows that our distributed algorithm also reduces the metric. The performance of LBDC-DM is poor when the number of controllers is relatively small. This is because devolved controllers can only cover switches within 30 m: when the number of controllers is small, more switches can be controlled by only one particular controller, leaving little choice.
As the number of controllers increases, LBDC-DM achieves better performance and a larger improvement ratio.
Fig. 3. Colormap before migration (axes: x and y, each from 0 to 9; color scale from 34 to 50)
Fig. 4. Colormap after migration (axes: x and y, each from 0 to 9; color scale from 34 to 50)
C. Centralized Protocol vs. Distributed Protocol
Next, we compare our centralized and distributed protocols by changing the number of controllers from 30 to 210 with a step of 20. The results are shown in Fig. 7. The centralized version performs much better than the distributed version, but the difference between them decreases as the number of controllers increases.
We then assess the migration phase. Figures 3 and 4 show the effectiveness of our migration algorithms. We consider the following scenario: at the beginning of each time slot, the weights of switches are updated and we run the migration algorithm.
The weight of switches follows a Pareto distribution with parameter α = 3. Figure 8 shows the performance of LBDC-CM over time, which significantly reduces the metric. Figure 9 shows the performance of LBDC-DM. As expected, the metric is large right after initialization because LBDC-DI just assigns each switch a controller without considering load balancing. As time goes by, the performance improves greatly.
Next we compare the number of migrated switches of the centralized and distributed schemes. Fig. 10 shows that the number for the centralized algorithm fluctuates around a certain value, while the number for the distributed algorithm decreases and becomes stable as time goes by. This is because LBDC-CM can migrate switches from a global perspective.
Therefore the number of migrated switches is almost the same every time. LBDC-DM, in contrast, only migrates locally, so the number of migrated switches reduces significantly and slowly becomes stable.
D. Parameter Specification
Next we explore the impact of the threshold parameters α, β, γ. Here α is a parameter that balances conservativeness and radicalness. We examine the impact of changing α. Due to Step 4 in LBDC-CM, the impact is relatively small. β is a crucial parameter which decides whether to migrate or not. We set different values for β and observe the impact. TABLE II lists the statistics for β ranging from 1.1 to 1.5. Clearly, the improvement rate and the number of migrated switches decrease as β increases, which is consistent with the threshold definition. γ is used in Step 4 of LBDC-CM and its effect is similar to that of β, so we omit the discussion.
Fig. 5. Improvement of different number of controllers in centralized migration (x-axis: Controller #; y-axes: Standard Deviation and Improvement (%); series: Initial State, After Migration, Improvement)
Fig. 6. Improvement of different number of controllers in distributed migration (x-axis: Controller #; y-axes: Standard Deviation and Improvement (%); series: Initial State, After Migration, Improvement)
Fig. 7. Performance comparison between centralized and distributed algorithm (x-axis: Controller #; y-axis: Standard Deviation; series: Centralized, Distributed)
Fig. 8. Centralized migration traffic statistics at different time slot (x-axis: Time Slot; y-axis: Standard Deviation; series: Traffic Change, After Migration)
Fig. 9. Distributed migration traffic statistics at different time slot (x-axis: Time Slot; y-axis: Standard Deviation; series: Traffic Change, After Migration)
Fig. 10. Migrated switches of centralized and distributed migration at different time slot (x-axis: Time Slot; y-axis: Migrated Switch #; series: Centralized, Distributed)
TABLE II. INFLUENCE OF β FACTOR
β | Initial | After Migration | Rate | Switch no.
1.1 | 150.279 | 96.165 | 0.541 | 376
1.2 | 157.080 | 107.749 | 0.414 | 356
1.3 | 166.194 | 123.509 | 0.365 | 316
1.4 | 166.904 | 130.265 | 0.327 | 259
1.5 | 151.475 | 119.928 | 0.287 | 196
VI. CONCLUSION
As DCNs evolve, the use of a centralized controller becomes the performance bottleneck of the DCN and the traffic management problem becomes more severe. In this paper, we have explored the usage of devolved controllers to manage the data center effectively as well as to alleviate the scalability issue.
In order to monitor and manage the traffic of data centers, we have developed a new implementable scheme to overcome shortcomings such as workload congestion and reconfiguration complexity. We have further defined the Load Balancing problem for Devolved Controllers (LBDC) and proved its NP-completeness. We have also provided an f-approximation solution and designed applicable centralized and distributed algorithms to balance the workload among controllers in dynamic situations. Traffic load balancing ensures efficient scaling, enhances responsiveness to clients' requests, and improves the throughput and availability of DCNs. Our performance evaluation validates our design, which offers a solution to monitor, manage, and coordinate large-scale data centers.
REFERENCES
[1] M. Al-Fares, A. Loukissas, and A. Vahdat. “A scalable, commodity data center network architecture,” ACM SIGCOMM, 63-74, 2008.
[2] B. Heller, S. Seetharaman, et al. “ElasticTree: Saving Energy in Data Center Networks,” USENIX NSDI, 249-264, 2010.
[3] S. Kandula, J. Padhye, and P. Bahl. “Flyways to de-congest data center networks,” ACM Hotnets-VIII, 2009.
[4] M. Al-Fares, S. Radhakrishnan, et al. “Hedera: Dynamic flow scheduling for data center networks,” USENIX NSDI, 19-19, 2010.
[5] J. Mudigonda, et al. “SPAIN: COTS Data-Center Ethernet for Multipathing over Arbitrary Topologies,” USENIX NSDI, 265-280, 2010.
[6] A. Greenberg, J. Hamilton, N. Jain, et al. “VL2: a scalable and flexible data center network,” ACM SIGCOMM, 51-62, 2009.
[7] AS-W. Tam, K. Xi, and H. Chao. “Use of devolved controllers in data center networks,” IEEE INFOCOM, 596-601, 2011.
[8] AS-W. Tam, K. Xi, and H. Chao. “Scalability and Resilience in Data Center Networks: Dynamic Flow Reroute as an Example,” IEEE GLOBECOM, 1-6, 2011.
[9] A. Dixit, F. Hao, et al. “Towards an elastic distributed sdn controller,” ACM SIGCOMM, 7-12, 2013.
[10] A. Tootoonchian, Y. Ganjali. “HyperFlow: A distributed control plane for OpenFlow,” USENIX INM/WREN, 3-3, 2010.
[11] C. Macapuna, C. Rothenberg, M. Magalhaes. “In-packet Bloom filter based data center networking with distributed OpenFlow controllers,” IEEE GLOBECOM, 584-588, 2010.
[12] S. Yeganeh and Y. Ganjali. “Kandoo: a framework for efficient and scalable offloading of control applications,” ACM HotSDN, 19-24, 2012.
[13] T. Koponen, et al. “Onix: A Distributed Control Platform for Large-scale Production Networks,” USENIX NSDI, 1-6, 2010.
[14] N. McKeown, T. Anderson, H. Balakrishnan, et al. “OpenFlow: enabling innovation in campus networks,” ACM SIGCOMM, 69-74, 2008.
[15] T. Benson, A. Anand, A. Akella, and M. Zhang. “Microte: fine grained traffic engineering for data centers,” ACM CONEXT, 8, 2011.
[16] R. Karp. “Reducibility among Combinatorial Problems,” Springer US, 85-103, 1972.
[17] D. Williamson and D. Shmoys. “The design of approximation algorithms,” Cambridge University Press, 2011.
[18] S. Kandula, S. Sengupta, et al. “The nature of data center traffic: measurements & analysis,” ACM SIGCOMM, 202-208, 2009.
[19] T. Benson, A. Akella, and D. Maltz. “Network traffic characteristics of data centers in the wild,” ACM SIGCOMM, 267-280, 2010.
[20] D. Comer. “Internetworking with TCP/IP,” Vol 1, Page 226, 2000.
[21] N. Lynch. “Distributed algorithms,” Morgan Kaufmann, 1996.