Network Protocols:
Design and Analysis
Polly Huang EE NTU
http://cc.ee.ntu.edu.tw/~phuang phuang@cc.ee.ntu.edu.tw
Internet Routing III
[Tsuchiya88a]
Landmark Routing
[Tsuchiya88a]
Context
• fairly early in the Internet's life
– before BGP-3
– before CIDR
Key Idea
• Self-configuring hierarchy for routing with many routers
Why Landmark Routing?
• area routing requires knowledge of topology, and may not get the best aggregation possible
• LM knows about internal structure of nearby nodes, even if in different AS
• dynamic address assignment: easier to manage
• reduce size of routing table: because addresses are automatic, and reassigned on demand, can get better aggregation than area hierarchy
• could be more reliable under congestion because it supports multiple (?)
Landmark Routing
Disadvantages
• don’t always get shortest path [but true of all routing protocols that have aggregation/policy]
• admin control? (paper hints at approaches, but not fully explored)
• performance not fully explored?
– less info further away from destination, therefore more likely to get poor quality routes to it [but no different from area routing]
– performance of LM placement/config algorithms?
• combines routing and address (but so does area routing)
• addressing
– address may not be stable
Landmark hierarchy
• Details about things nearby and less information about things far away
• Not defined by arbitrary boundaries
– thus, not well suited to the real world that does have administrative boundaries
– (although he says something about adding admin boundaries)
A Landmark
[Figure: example network of numbered routers; Router 1 is a landmark of radius 2]
Landmark Overview
• Landmark routers have “height” which determines how far away they can be seen (visibility)
• Routers within Radius n can see a landmark router LM(n)
• See means that those routers have LM(n)’s address and know next hop to reach it.
– Router x has an entry for router y if x is within the radius of y
• Distance vector style routing with simple metric
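The visibility rule above can be sketched in a few lines. This is only an illustration: it uses a centralized breadth-first search as a stand-in for the distance-vector exchange, and the graph, the function name `landmark_entries`, and the hop-count metric are assumptions, not taken from the paper.

```python
from collections import deque

def landmark_entries(graph, landmark, radius):
    """Return {router: next_hop} for every router within `radius` hops of `landmark`.

    graph is an adjacency dict {router: [neighbors]} for an undirected network.
    Routers outside the radius get no entry, i.e. they cannot "see" the landmark.
    """
    dist = {landmark: 0}
    next_hop = {}
    queue = deque([landmark])
    while queue:
        u = queue.popleft()
        if dist[u] == radius:       # do not advertise beyond the radius
            continue
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                next_hop[v] = u     # v reaches the landmark through u
                queue.append(v)
    return next_hop

# Hypothetical line network 1 - 2 - 3 - 4, with router 1 a landmark of radius 2.
line = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(landmark_entries(line, 1, 2))  # {2: 1, 3: 2}; router 4 is too far to see it
```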
LM Hierarchy Definition
• Each LM (Li) associated with level (i) and radius (ri)
• Every node is an L0 landmark
• Recursion: some Li are also Li+1
– Every Li is seen by at least one Li+1
• Terminating state when all level j LMs see entire network
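A small checker can make the definition concrete. It rests on one reading of the bullets, stated here as an assumption: "Li is seen by an Li+1" is taken to mean that some Li+1 lies within ri hops of the Li, and the terminating state is taken to mean that every top-level landmark's radius covers the whole network. The names `hops_from` and `is_valid_hierarchy` and the input layout are hypothetical.

```python
from collections import deque

def hops_from(graph, src):
    """Hop count from src to every reachable router (plain BFS)."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def is_valid_hierarchy(graph, top_level, radii):
    """top_level[r] is the highest level router r holds; radii[i] is r_i."""
    inf = float("inf")
    dist = {r: hops_from(graph, r) for r in graph}
    max_level = max(top_level.values())
    for i in range(max_level):
        uppers = [u for u, lvl in top_level.items() if lvl > i]
        for r, lvl in top_level.items():
            # every Li needs some Li+1 within its own radius r_i
            if lvl >= i and not any(dist[r].get(u, inf) <= radii[i] for u in uppers):
                return False
    tops = [t for t, lvl in top_level.items() if lvl == max_level]
    # top-level landmarks must be visible from every router
    return all(dist[t].get(r, inf) <= radii[max_level] for t in tops for r in graph)

# Hypothetical line network 1 - 2 - 3 - 4 - 5 with router 3 promoted to L1.
line = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
levels = {1: 0, 2: 0, 3: 1, 4: 0, 5: 0}
print(is_valid_hierarchy(line, levels, [2, 4]))  # True
print(is_valid_hierarchy(line, levels, [1, 4]))  # False: routers 1 and 5 see no L1
```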
LM addresses
• LM(2).LM(1).LM(0) (x.a.b and y.a.b)
• LM level maps to radius (part of configuration), e.g.:
– LM level 0: radius 2
– LM level 1: radius 4
– LM level 2: radius 8
• If destination is more than two hops away, will not have complete routing information; refer to LM(1) portion of address, and if that is not known either, refer to LM(2)
(c would forward based on y in y.a.b)
LM Routing
• LM does not imply hierarchical forwarding
• It is not a source route
• En route to LM(1) may encounter router that is within LM(0) radius of destination address
(like longest match)
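The "like longest match" lookup can be sketched as follows. This is an illustrative reading, not the paper's forwarding code: the table layout keyed by (level, landmark) and the function name `forward` are assumptions. Each router independently picks the lowest-level component of the destination address that it can see, which is why a packet headed for LM(1) may be diverted early.

```python
def forward(table, dest):
    """Pick a next hop for dest = (LM2, LM1, LM0), written highest level first.

    table maps (level, landmark_name) -> next hop. The most specific
    (lowest-level) landmark of the address that appears in the table wins.
    """
    for level, name in enumerate(reversed(dest)):   # level 0 first
        hop = table.get((level, name))
        if hop is not None:
            return hop
    return None                                     # no component is visible

# Hypothetical tables for routers on the way to address y.a.b.
far_router = {(2, "y"): "n1"}                       # only sees LM2[y]
near_router = {(2, "y"): "n1", (1, "a"): "n2"}      # also sees LM1[a]
last_router = {(1, "a"): "n2", (0, "b"): "n3"}      # within b's radius

print(forward(far_router, ("y", "a", "b")))    # n1
print(forward(near_router, ("y", "a", "b")))   # n2
print(forward(last_router, ("y", "a", "b")))   # n3
```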
LM self-configuration
• Bottom-up hierarchy construction algorithm
– goal to bound number of children
• Every router is L0 landmark
• All routers advertise themselves over a distance
• All Li landmarks run election to self-promote one or more Li+1 landmarks
• Dynamic algorithm to adapt to topology changes: efficient hierarchy
Landmark Routing: Basic Idea
Source wants to reach LM0[a], whose address is c.b.a:
• Source can see LM2[c], so sends packet towards c
• Entering LM1[b] area, first router diverts packet to b
- Not shortest path
- Packet does not necessarily follow specified landmarks
Routing table for Router g (r0 = 2, r1 = 4, r2 = 8 hops)
Landmark   Level   Next hop
LM2[d]     2       f
LM1[i]     1       k
LM0[e]     0       f
LM0[k]     0       k
LM0[f]     0       f
How to go from d.i.g to d.n.t?
Evaluation
• analytic results
– but bounds not very helpful
• simulation
– routing table size (R)
– mean path length
– distance to nearby landmark
– (seems weak)
[Figure: routing table size and mean path length vs. r/d = radius/distance]
BGP Routing Convergence Times
[Labovitz00a]
Context
• BGP widely deployed in the Internet
• but poorly understood
Key Idea
• convergence takes longer than we expected
• observes 2-3 minute convergence times (6x longer than expected!)
• bounds on BGP convergence: O(n!) worst case, O((n-3)*30s) [n is number of ASes]
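To get a feel for the two bounds, the arithmetic can be worked for a few values of n. This sketch assumes the 30 s in the second bound is the MinRouteAdver timer interval; the function names are hypothetical, and the numbers illustrate growth rates only, not measured convergence times.

```python
import math

def factorial_bound(n):
    """Order of the worst-case number of steps: n!."""
    return math.factorial(n)

def timer_bound_seconds(n, timer_s=30):
    """(n - 3) rounds of an assumed 30 s advertisement timer."""
    return (n - 3) * timer_s

for n in (4, 7, 10):
    print(n, factorial_bound(n), timer_bound_seconds(n))
# 4 24 30
# 7 5040 120
# 10 3628800 210
```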
Why is Convergence Important?
• robustness
– PSTN (telephone) failover times are in milliseconds
– Internet failover times are in 10s of seconds
– open research question: how can Internet
Methodology
• experiments over Internet: manually injected faults propagate across net
• simulation to study worst case behavior
• theoretical analysis: helps understand worst case bounds
Methodology Picture
Internet-scale experimentation. What kinds of complexities arise?
Have to be careful with real routes;
([Labovitz00a] Figure 1)
[Figure: cumulative percentage of events vs. seconds until convergence, with curves for Tup, Tshort, Tlong, and Tdown. Labels: short->long fail-over (Tlong); new route and long->short fail-over (Tup and Tshort); failure (Tdown)]
Observed Convergence Latency
Labovitz00a Figure 2a
Other Observations
• No correlation between network distance (latency, router, or AS hops) and convergence times
Effects on Traffic
([Labovitz00a] figure 4a)
Why does loss go up? There’s always a direct path?
some people use old paths, routing loops
How To Tell What’s Going On?
• Simulate BGP
– model one router per AS
– assume full routing mesh
– ignore latency
– synchronous processing via global queue
simple model that captures key details
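A toy version of such a model fits in a short script. This is a sketch under stated assumptions, not the paper's simulator: AS 0 originates the destination R and then withdraws it, every AS prefers the shortest AS path (lowest neighbor number breaks ties), loop detection happens at the receiver, there is no MinRouteAdver timer, and all names (`simulate_withdrawal`, `rib_in`) are hypothetical.

```python
from collections import deque

def simulate_withdrawal(n):
    """Full mesh of n ASes, one router each; AS 0 withdraws its route to R.

    Returns (messages_processed, best), where best[i] is AS i's final path.
    """
    others = range(1, n)
    # rib_in[i][j]: AS path to R that i last heard from neighbor j (converged start)
    rib_in = {i: {j: ((0,) if j == 0 else (j, 0)) for j in range(n) if j != i}
              for i in others}

    def select(i):
        usable = [(len(p), j, p) for j, p in rib_in[i].items() if p is not None]
        return min(usable)[2] if usable else None

    best = {i: select(i) for i in others}
    queue = deque((0, i, None) for i in others)     # AS 0 sends withdrawals
    processed = 0
    while queue:                                    # single global FIFO queue
        src, dst, path = queue.popleft()
        processed += 1
        if path is not None and dst in path:        # receiver-side loop detection
            path = None
        rib_in[dst][src] = path
        new_best = select(dst)
        if new_best != best[dst]:
            best[dst] = new_best
            advert = None if new_best is None else (dst,) + new_best
            queue.extend((dst, k, advert) for k in others if k != dst)
    return processed, best

for n in (3, 4, 5):
    print(n, simulate_withdrawal(n)[0])             # messages until all routes are gone
```

Even in this stripped-down setting each AS falls back to stale, longer paths through its peers before giving up, so the message count grows much faster than the number of ASes.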
What’s going on?
• there are many possible routes (indirect through other ASes) and it takes a long time w/ BGP to figure out that none work
– BGP can try all paths of length 2, then 3, then 4 => O(n!) steps
– even with min-route-adver it still can take O(n) steps
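The growth in candidate routes can be seen by enumerating them. The sketch below lists every loop-free AS path between two ASes in a full mesh, shortest first; the function name `candidate_paths` and the AS numbering are assumptions made for illustration.

```python
from itertools import permutations

def candidate_paths(n, src, dst):
    """Yield loop-free AS paths from src to dst in a full mesh of n ASes,
    ordered by increasing number of intermediate ASes."""
    middle = [a for a in range(n) if a not in (src, dst)]
    for k in range(len(middle) + 1):
        for mid in permutations(middle, k):
            yield (src, *mid, dst)

print(list(candidate_paths(4, 0, 3)))
# [(0, 3), (0, 1, 3), (0, 2, 3), (0, 1, 2, 3), (0, 2, 1, 3)]

for n in (4, 6, 8):
    print(n, sum(1 for _ in candidate_paths(n, 0, n - 1)))
# 4 5
# 6 65
# 8 1957
```

With n - 2 possible intermediate ASes the count is a sum of falling factorials, which is where the factorial worst case comes from.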
BGP Convergence Example
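[Figure: destination R and ASes AS0, AS1, AS2, AS3. Routes to R held at each AS; * marks the selected route]
AS0: *B R via 3, B R via 13, B R via 23
AS1: *B R via 3, B R via 03, B R via 23
AS2: *B R via 3, B R via 03, B R via 13
What about MinRouteAdver?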
• BGP has a minimum advertisement interval timer
– designed to limit updates
– and to encourage aggregation
• How does it affect convergence?
– by delaying announcements, routers figure out the pain sooner
– see section 5.2
Does this explain measurements?
• Tup/Tshort converge quickly because they shorten path length and therefore are quickly accepted
• Tdown/Tlong converge slowly because BGP tries hard to find all alternatives
– Tlong actually sometimes goes quicker if it’s “not long enough” and can preempt some of the thrashing
Other Observations
• Could do loop detection at sender side and not just receiver side