```COMS W4995-1
Lecture 6
Dynamic routing protocols II
1. Dynamic Routing Protocols: Link State Routing
2. Intra-Domain Routing Protocols: OSPF & BGP
Dynamic Routing Protocols
The Gang of Four
IGP
EGP
OSPF
IS-IS
Vectoring
RIP
BGP

Based on Dijkstra’ s Shortest-Path-First algorithm.

Each router starts by knowing:



Each router advertises to the entire network (flooding):



Prefixes of its attached networks.
Prefixes of its directly connected networks.
Each router learns:

A complete topology of the network (routers, links).

Each router computes shortest path to each destination.

In a stable situation, all routers have the same graph, and compute
the same paths.
Dijkstra’s Shortest Path Algorithm for a Graph
Input: Graph (N,E) with
N the set of nodes and E the set of edges
cvw
link cost (cvw = 1 if (v,w)  E, cvv = 0)
s
source node.
Output: Dn
cost of the least-cost path from node s to node n
M = {s};
for each n  M
Dn = csn;
while (M  all nodes) do
Find w  M for which Dw = min{Dj ; j  M};
for each neighbor n of w and n  M
Dn = min[ Dn, Dw + cwn ];
Update route;
end for
end while
end for
Global view:
b
3
a
a’s view:
3
a
d
b
6
b
d’s view:
c
c
1
a
c’s view:
2
c
6
a
b’s view: 3
1
2
d
c
b
1
c
2
d
6
Collecting all views yield a global & complete view of the network!
Operation of a Link State Routing protocol
LSAs
Database
Dijkstra’s
Algorithm
LSAs are flooded
to other interfaces
IP Routing
Table

Each node requires complete topology information

Link state information must be flooded to all nodes

Guaranteed to converge
Distance Vector vs. Link State Routing

With distance vector routing, each node has information only about
the next hop:






Node A: to reach F go to B
Node B: to reach F go to D
Node D: to reach F go to E
Node E: go directly to F
Distance vector routing makes
poor routing decisions if
directions are not completely
correct
(e.g., because a node is down).
A
B
C
D
E
If parts of the directions incorrect, the routing may be incorrect until the
routing algorithms has re-converged.
F
Distance Vector vs. Link State Routing

In link state routing, each node has a complete map of
the topology
A


If a node fails, each
node can calculate
the new route
B
C
D
E
A
F
A
Difficulty: All nodes need to
have a consistent view of the
network
A
B
C
D
E
B
C
D
E
A
F
B
C
D
E
B
C
D
E
A
F
B
C
D
E
F
F
A
F
B
C
D
E
F
Distance Vector vs. Link State Routing
•
•
•
•
•
•
Vectoring
Topology information is flooded
within the routing domain
Best end-to-end paths are
computed locally at each
router.
Best end-to-end paths
determine next-hops.
•
Based on minimizing some
notion of distance
Works only if policy is shared
and uniform
Examples: OSPF, IS-IS
•
•
•
•
•
network topology
Only best next-hops are chosen
by each router for each
destination network.
Best end-to-end paths result
from composition of all nexthop choices
Does not require any notion of
distance
Does not require uniform
policies at all routers
Examples: RIP, BGP
Dynamic Routing Protocols
Open Shortest Path First
OSPF



OSPF = Open Shortest Path First
The OSPF routing protocol is the most important link
state routing protocol on the Internet (another link state
routing protocol is IS-IS (intermediate system to
intermediate system)
The complexity of OSPF is significant



RIP (RFC 2453 ~ 40 pages)
OSPF (RFC 2328 ~ 250 pages)
History:





1989: RFC 1131 OSPF Version 1
1991: RFC1247 OSPF Version 2
1994: RFC 1583 OSPF Version 2 (revised)
1997: RFC 2178 OSPF Version 2 (revised)
1998: RFC 2328 OSPF Version 2 (current version)
Features of OSPF

Provides authentication of routing messages

Enables load balancing by allowing traffic to be split
evenly across routes with equal cost

Type-of-Service routing allows to setup different routes
dependent on the TOS field

Supports subnetting

Supports multicasting

Allows hierarchical routing
Hierarchical OSPF
Hierarchical OSPF

Two-level hierarchy: local area, backbone.
 each nodes has detailed area topology; only know
direction (shortest path) to nets in other areas.

Area border routers: “summarize” distances to nets
in own area, advertise to other Area Border routers.

Backbone routers: run OSPF routing limited to
backbone.
Example Network
10.1.1.2
.1
4
.2
.2
3
2
• Metric is in the range [0 , 216]
• Metric can be asymmetric
3
.6
1
.5
.3
5
.6
10.1.7.0 / 24
.4
.3
.3
1
.4
.2
.5
.5
10.1.5.0/24
10.1.2.3
• Link costs are called Metric
.4
10.1.4.0 / 24
10.1.1.0 / 24
Router IDs can be
selected
independent of
but usually chosen to
be the smallest
2
10.1.3.0 / 24
.1
10.1.7.6
10.1.4.4
10.1.6.0 / 24
10.1.1.1
10.1.5.5
10.1.1.1
10.1.1.2
10.1.1.0 / 24
3
2
.2
.2
.4
.4
.4
.3
.5
.3
.3
.5
10.1.5.5

The LSA of router 10.1.1.1 is as follows:

10.1.1.1

10.1.1.1 = Router ID
3 = 2 links plus router itself
Link ID = 10.1.1.2, Metric = 4
Link ID = 10.1.2.2, Metric = 3
Link ID = 10.1.1.1, Metric = 0




.6
.5
10.1.5.0/24
10.1.2.3
.6
10.1.7.0 / 24
10.1.4.0 / 24
10.1.6.0 / 24
.1
.2
10.1.3.0 / 24
4
.1
10.1.7.6
10.1.4.4
= Router ID
10.1.1.1
10.1.1.0 / 24
Each router has a
database which
contains the LSAs
from all other routers
.2
.2
.4
10.1.4.0 / 24
.4
.4
.3
.5
.3
.3
.6
10.1.7.0 / 24
10.1.6.0 / 24
.1
.2
10.1.7.6
10.1.4.4
10.1.3.0 / 24
.1
10.1.1.2
.6
.5
.5
10.1.5.0/24
10.1.5.5
10.1.2.3
LS Type
Checksum
LS SeqNo
LS Age
Router-LSA
10.1.1.1
10.1.1.1
0x9b47
0x80000006
0
Router-LSA
10.1.1.2
10.1.1.2
0x219e
0x80000007
1618
Router-LSA
10.1.2.3
10.1.2.3
0x6b53
0x80000003
1712
Router-LSA
10.1.4.4
10.1.4.4
0xe39a
0x8000003a
20
Router-LSA
10.1.5.5
10.1.5.5
0xd2a6
0x80000038
18
Router-LSA
10.1.7.6
10.1.7.6
0x05c3
0x80000005
1680

The collection of all LSAs is called the link-state
database

Each router has an identical link-state database
 Useful for debugging: Each router has a complete description of the
network

If neighboring routers discover each other for the first
time, they will exchange their link-state databases

The link-state databases are synchronized using reliable
flooding
OSPF Packet Format
O S P F M essage
IP h e a d e r
OS PF M essag e
OSPF packets are not
OSPF has its own IP
protocol number: 89
TTL: set to 1 (in most cases)
B ody o f O S P F M essage
M essage T ype
S p e c ific D a ta
LSA
LSA
Destination IP: neighbor’s IP address or 224.0.0.5
(ALLSPFRouters) or 224.0.0.6 (AllDRouters)
LSA
LSA
D a ta
... ...
LSA
OSPF Packet Format
O S P F M essag e
2: current version
is OSPF V2
versio n
Message types:
1: Hello (tests reachability)
2: Database description
Standard IP checksum taken
over entire packet
Authentication passwd = 1:
Authentication passwd = 2:
B o d y o f O S P F M essag e
typ e
m essag e len g th
so u rce ro u ter IP ad d ress
ID of the Area
from which the
packet originated
A rea ID
ch ecksu m
au th en ticatio n typ e
au th en ticatio n
au th en ticatio n
3 2 b its
0x0000 (16 bits)
KeyID (8 bits)
Length of MD5 checksum (8 bits)
Nondecreasing sequence number (32 bits)
0: no authentication
1: Cleartext
2: MD5 checksum
packet)
Prevents replay
attacks
OSPF LSA Format
LSA
L in k A g e
L in k S ta te ID
LSA
LSA
L in k T y p e
a d v e rtis in g ro u te r
LSA
D a ta
lin k s e q u e n c e n u m b e r
checksum
le n g th
L in k ID
L in k D a ta
L in k T y p e
# T O S m e tric s
M e tric
L in k ID
L in k D a ta
L in k T y p e
# T O S m e tric s
M e tric
Discovery of Neighbors


Routers multicasts OSPF Hello packets on all OSPFenabled interfaces.
If two routers share a link, they can become neighbors, and
10.1.10.1
10.1.10.2
Scenario:
Router 10.1.10.2 restarts
O SPF Hello
O SPF Hello: I heard 10.1.10.2

After becoming a neighbor, routers exchange their link state
databases
Neighbor discovery and database synchronization
Scenario:
Router 10.1.10.2
restarts
Discovery of
1 0 .1 .1 0 .1
1 0 .1 .1 0 .2
O S P F H e llo
O S P F H e llo : I h e a rd 1 0 .1 .1 0 .2
After neighbors are discovered the nodes exchange their databases
D a ta b a s e D e s c rip tio n : S e q u e n c e = X
Sends database
description.
(description only
contains LSA
Acknowledges
receipt of
description
D a ta b a s e D e s c rip tio n : S e q u e n c e = X , 5 L S A h e a d e rs =
R o u te r-L S A , 1 0 .1 .1 0 .1 , 0 x 8 0 0 0 0 0 0 6
R o u te r-L S A ,
1 0 .1 .1 0 .2 , 0 x 8 0 0 0 0 0 0 7
R o u te r-L S A ,
1 0 .1 .1 0 .3 , 0 x 8 0 0 0 0 0 0 3
R o u te r-L S A ,
1 0 .1 .1 0 .4 , 0 x 8 0 0 0 0 0 3 a
R o u te r-L S A ,
1 0 .1 .1 0 .5 , 0 x 8 0 0 0 0 0 3 8
R o u te r-L S A ,
1 0 .1 .1 0 .6 , 0 x 8 0 0 0 0 0 0 5
D a ta b a s e D e s c rip tio n : S e q u e n c e = X +1 , 1 L S A h e a d e r=
R o u te r-L S A ,
1 0 .1 .1 0 .2 , 0 x 8 0 0 0 0 0 0 5
D a ta b a s e D e s c rip tio n : S e q u e n c e = X +1
Sends empty
database
description
Database
description of
10.1.10.2
Regular LSA exchanges
10.1.10.1
Link State Request packets, LSAs =
Router-LSA,
10.1.10.1,
Router-LSA,
10.1.10.2,
Router-LSA,
10.1.10.3,
Router-LSA,
10.1.10.4,
Router-LSA,
10.1.10.5,
Router-LSA,
10.1.10.6,
10.1.10.1 sends
requested LSAs
Link State Update Packet, LSAs =
Router-LSA, 10.1.10.1, 0x80000006
Router-LSA, 10.1.10.2, 0x80000007
Router-LSA, 10.1.10.3, 0x80000003
Router-LSA, 10.1.10.4, 0x8000003a
Router-LSA, 10.1.10.5, 0x80000038
Router-LSA, 10.1.10.6, 0x80000005
10.1.10.2
10.1.10.2 explicitly
requests each LSA
from 10.1.10.1
Dissemination of LSA-Update



A router sends and refloods LSA-Updates, whenever the
not contain new information, the router will not flood the
packet)
Exception: Infrequently (every 30 minutes), a router will
flood LSAs even if there are not new changes.



explicit ACK, or
implicit via reception of an LSA-Update
Question: If a new node comes up, it could build the
database from regular LSA-Updates (rather than
exchange of database description). What role do the
database description packets play?
Dynamic Routing Protocols (Inter-domain)
Border Gateway Protocol
BGP Quick View


BGP = Border Gateway Protocol . Currently in version 4,
specified in RFC 1771. (~ 60 pages)
Note: In the context of BGP, a gateway is nothing else
but an IP router that connects autonomous systems.

Interdomain routing protocol for routing between
autonomous systems

Uses TCP to establish a BGP session and to send
routing messages over the BGP session

BGP is a path vector protocol. Routing messages in BGP
contain complete routes.

Network administrators can specify routing policies
BGP Policy-based Routing

Each node is assigned an AS number (ASN)

BGP’s goal is to find any AS-path (not an optimal one).
Since the internals of the AS are never revealed, finding
an optimal path is not feasible.

Network administrator sets BGP’s policies to determine
the best path to reach a destination network.
How Many ASNs are there today?
20,570
14,588
origin
only (no
transit)
Thanks to Geoff Huston. http://bgp.potaroo.net on October 9, 2005
ARDs versus ASes
Autonomous Routing Domains Don’t Always Need BGP or an ASN
Qwest
Nail up routes 130.132.0.0/16
pointing to Yale
Nail up default routes 0.0.0.0/0
pointing to Qwest
Yale University
130.132.0.0/16
Static routing is the most common way of connecting an
autonomous routing domain to the Internet.
This helps explain why BGP is a mystery to many …
ASNs Can Be “Shared” (RFC 2270)
AS 701
UUNet
AS 7046
Crestar
Bank
AS 7046
NJIT
AS 7046
Hood
College
128.235.0.0/16
ASN 7046 is assigned to UUNet. It is used by
Customers single homed to UUNet, but needing
BGP for some reason (load balancing, etc..) [RFC 2270]
ARDs and ASes: Summary

Most ARDs have no ASN (statically routed at Internet edge)

Some unrelated ARDs share the same ASN (RFC 2270)

Some ARDs are implemented with multiple ASNs (example:
Worldcom)
ASes are just an implementation detail of Inter-domain routing
How many prefixes today?
221,002
33.3%
23%
Thanks to Geoff Huston. http://bgp.potaroo.net on October 9, 2005
Policy-Based vs. Distance-Based Routing?
Minimizing
“hop count” can
violate commercial
relationships that
constrain interdomain routing.
Host 1
Cust1
YES
ISP1
NO
ISP3
ISP2
Cust3
Host 2
Cust2
Thanks to Tim Griffin
http://www.cl.cam.ac.uk/users/tgg22
Customer versus Provider
provider
provider
customer
IP traffic
customer
Why not minimize “AS hop Count”?
National
ISP1
National
ISP2
YES
NO
Regional
ISP3
Cust3
Regional
ISP2
Cust2
Regional
ISP1
Cust1
Shortest path routing is not compatible with commercial relations
The “Peering” Relationship
peer
provider
peer
customer
Peers provide transit between
their respective customers
Peers do not provide transit
between peers
traffic
allowed
traffic NOT
allowed
Peers (often) do not exchange \$\$\$
Peering Provides Shortcuts
Peering also allows connectivity between
the customers of “Tier 1” providers.
peer
provider
peer
customer
Peering Wars
Peer



Reduces upstream transit costs
Can increase end-to-end
performance
May be the only way to connect
your customers to some part of
the Internet (“Tier 1”)
Don’t Peer



You would rather have
customers
Peers are usually your
competition
Peering relationships may
require periodic renegotiation
Peering struggles are by far the most
contentious issues in the ISP world!
Peering agreements are often confidential.
The Border Gateway Protocol (BGP)
BGP =
+
RFC 1771
“optional” extensions
RFC 1997 (communities) RFC 2439 (damping) RFC 2796 (reflection) RFC3065 (confederation) …
+
routing policy configuration
languages (vendor-specific)
+
Current Best Practices in
management of Interdomain Routing
BGP was not DESIGNED.
It EVOLVED.
BGP Route Processing
Open ended programming.
Constrained only by vendor configuration language
filter routes &
BGP
attributes
Apply Import
Policies
Based on
Attribute
Values
Best
Routes
Best Route
Selection
Best Route
Table
Apply Policy =
filter routes &
tweak
attributes
Apply Export
Policies
Install forwarding
Entries for best
Routes.
IP Forwarding Table
Transmit
BGP
BGP Attributes
Value
----1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
...
255
Code
--------------------------------ORIGIN
AS_PATH
NEXT_HOP
MULTI_EXIT_DISC
LOCAL_PREF
ATOMIC_AGGREGATE
AGGREGATOR
COMMUNITY
ORIGINATOR_ID
CLUSTER_LIST
DPA
RCID_PATH / CLUSTER_ID
MP_REACH_NLRI
MP_UNREACH_NLRI
EXTENDED COMMUNITIES
Reference
--------[RFC1771]
[RFC1771]
[RFC1771]
[RFC1771]
[RFC1771]
[RFC1771]
[RFC1771]
[RFC1997]
[RFC2796]
[RFC2796]
[Chen]
[RFC1863]
[RFC1863]
[RFC2283]
[RFC2283]
[Rosen]
Most
important
attributes
reserved for development
From IANA: http://www.iana.org/assignments/bgp-parameters
Not all attributes
need to be present in
every announcement
ASPATH Attribute
AS 1129
135.207.0.0/16
AS Path = 1755 1239 7018 6341
135.207.0.0/16
AS Path = 1239 7018 6341
AS 1239
Sprint
AS 1755
Ebon
e
AS 6341
AT&T Research
135.207.0.0/16
Prefix Originated
135.207.0.0/16
AS Path = 1129 1755 1239 7018 6341
AS 12654
RIPE NCC
RIS project
135.207.0.0/16
AS Path = 7018 6341
AS7018
135.207.0.0/16
AS Path = 6341
Global Access
135.207.0.0/16
AS Path = 3549 7018 6341
AT&T
135.207.0.0/16
AS Path = 7018 6341
AS 3549
Global Crossing
Shorter Doesn’t Always Mean Shorter
In fairness:
could you do
this “right” and
still scale?
Mr. BGP says that
path 4 1 is better
than path 3 2 1
Duh!
AS 4
AS 3
Exporting internal
state would
dramatically
increase global
instability and
amount of routing
state
AS 2
AS 1
Routing Example 1
Thanks to Han Zheng
Routing Example 2
Thanks to Han Zheng
Tweak Tweak Tweak (TE)

For inbound traffic



Filter outbound routes
Tweak attributes on
outbound routes in the
hope of influencing your
neighbor’s best route
selection
inbound
traffic
outbound
routes
For outbound traffic


Filter inbound routes
Tweak attributes on
inbound routes to
influence best route
selection
outbound
traffic
inbound
routes
In general, an AS has more control over outbound traffic
Backup Links with Local Preference (Outbound Traffic)
AS 1
Set Local Pref = 100
for all routes from AS 1
AS 65000
Set Local Pref = 50
for all routes from AS 1
Multihomed Backups (Outbound Traffic)
AS 1
AS 3
provider
provider
Set Local Pref = 100
for all routes from AS 1
Set Local Pref = 50
for all routes from AS 3
AS 2
Shedding Inbound Traffic with ASPATH Prepending
AS 1
Prepending will (usually)
force inbound
traffic from AS 1
provider
192.0.2.0/24
ASPATH = 2 2 2
192.0.2.0/24
ASPATH = 2
primary
backup
customer
AS 2
192.0.2.0/24
Yes, this is a
Glorious Hack …
… But Padding Does Not Always Work
AS 1
AS 3
provider
provider
192.0.2.0/24
ASPATH = 2
192.0.2.0/24
ASPATH = 2 2 2 2 2 2 2 2 2 2 2 2 2
primary
backup
customer
AS 2
192.0.2.0/24
AS 3 will send
traffic on “backup”
customer routes and local
preference is considered
before ASPATH length!
Padding in this way is often
used as a form of load
balancing
COMMUNITY Attribute to the Rescue!
AS 1
AS 3
provider
provider
AS 3: normal
customer local
pref is 100,
peer local pref is 90
192.0.2.0/24
ASPATH = 2
COMMUNITY = 3:70
192.0.2.0/24
ASPATH = 2
primary
backup
customer
AS 2
192.0.2.0/24
Customer import policy at AS 3:
If 3:90 in COMMUNITY then
set local preference to 90
If 3:80 in COMMUNITY then
set local preference to 80
If 3:70 in COMMUNITY then
set local preference to 70
BGP Issues - What is a BGP Wedgie?
 BGP
¾ wedgie
Full
wedgie
policies make sense locally
 Interaction of local policies allows
multiple stable routings
 Some routings are consistent with
intended policies, and some are not
 If an unintended routing is
installed (BGP is “wedged”), then
manual intervention is needed to
change to an intended routing
 When
an unintended routing is
installed, no single group of network
operators has enough knowledge to
debug the problem
Dynamic Routing Protocols: Summary

Dynamic routing protocols: RIP, OSPF, BGP

RIP uses distance vector algorithm, and converges slow
(the count-to-infinity problem)

OSPF uses link state algorithm, and converges fast. But
it is more complicated than RIP.

Both RIP and OSPF finds lowest-cost path.

BGP uses path vector algorithm, and its path selection
algorithm is complicated, and is influenced by policies.

BGP has its own problems see WIDGI by Tim Griffin
BGP Wedgies: Bad Routing Policy Interactions that
Cannot be Debugged
JI’s Intro to interdomain routing.
"Interdomain Setting of PlanetLab Nodes."
PlanetLab Meeting, May 14, 2004.
Understanding the Border Gateway Protocol (BGP)
ICNP 2002 Tutorial Session
References






–
–
[VGE1996, VGE2000] Persistent Route Oscillations in Inter-Domain Routing.
Kannan Varadhan, Ramesh Govindan, and Deborah Estrin. Computer
Networks, Jan. 2000. (Also USC Tech Report, Feb. 1996)
[GW1999] An Analysis of BGP Convergence Properties. Timothy G. Griffin,
Gordon Wilfong. SIGCOMM 1999
[GSW1999] Policy Disputes in Path Vector Protocols. Timothy G. Griffin, F.
Bruce Shepherd, Gordon Wilfong. ICNP 1999
[GW2001] A Safe Path Vector Protocol. Timothy G. Griffin, Gordon Wilfong.
INFOCOM 2001
[GR2000] Stable Internet Routing without Global Coordination. Lixin Gao,
Jennifer Rexford. SIGMETRICS 2000
[GGR2001] Inherently safe backup routing with BGP. Lixin Gao, Timothy G.
Griffin, Jennifer Rexford. INFOCOM 2001
[GW2002a] On the Correctness of IBGP Configurations. Griffin and
Wilfong.SIGCOMM 2002.
[GW2002b] An Analysis of the MED oscillation Problem. Griffin and Wilfong.
ICNP 2002.
```