Interdomain Routing and The
Border Gateway Protocol (BGP)
Timothy G. Griffin
Intel Research,
Cambridge UK
[email protected]
CL
Oct 27, 2004
Architecture of Dynamic Routing
IGP = Interior Gateway Protocol
Metric based: OSPF, IS-IS, RIP, EIGRP (Cisco)
EGP = Exterior Gateway Protocol
Policy based: BGP
[Diagram: AS 1 and AS 2 each run an IGP internally and are connected by an EGP (= BGP)]
The Routing Domain of BGP is the entire Internet
Technology of Distributed Routing
Link State
• Topology information is flooded within the routing domain
• Best end-to-end paths are computed locally at each router.
• Best end-to-end paths determine next-hops.
• Based on minimizing some notion of distance
• Works only if policy is shared and uniform
• Examples: OSPF, IS-IS
Vectoring
• Each router knows little about network topology
• Only best next-hops are chosen by each router for each destination network.
• Best end-to-end paths result from composition of all next-hop choices
• Does not require any notion of distance
• Does not require uniform policies at all routers
• Examples: RIP, BGP
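A minimal Python sketch of the two ideas above, assuming a hypothetical four-router topology with made-up link costs (none of these names come from the slides). Each router runs Dijkstra locally over the flooded topology to obtain its own next-hop table, as in the link-state column. Following the per-router next-hops one after another then yields the end-to-end path, which is the "composition of next-hop choices" described in the vectoring column.

```python
import heapq

# Hypothetical topology: router -> {neighbor: link cost}. Costs are illustrative.
TOPOLOGY = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 1, "C": 1},
    "C": {"A": 5, "B": 1, "D": 1},
    "D": {"C": 1},
}

def next_hops(topology, source):
    """Run Dijkstra at `source` and return {destination: next hop}."""
    dist = {source: 0}
    table = {}
    done = set()
    heap = [(0, source, source)]  # (distance, node, first hop on the path)
    while heap:
        d, node, hop = heapq.heappop(heap)
        if node in done:
            continue  # stale heap entry
        done.add(node)
        if node != source:
            table[node] = hop
        for neighbor, cost in topology[node].items():
            nd = d + cost
            if neighbor not in done and nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                first = neighbor if node == source else hop
                heapq.heappush(heap, (nd, neighbor, first))
    return table

def forward(tables, source, destination):
    """Compose per-router next-hop choices into an end-to-end path."""
    path = [source]
    while path[-1] != destination:
        path.append(tables[path[-1]][destination])
    return path

# Every router computes its own table from the same flooded topology.
TABLES = {router: next_hops(TOPOLOGY, router) for router in TOPOLOGY}
print(forward(TABLES, "A", "D"))  # ['A', 'B', 'C', 'D']
```

The sketch relies on every router using the same metric, which is the "shared and uniform policy" condition on the slide.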
The Gang of Four
             IGP            EGP
Link State   OSPF, IS-IS
Vectoring    RIP            BGP
How do you connect to the
Internet?
Physical connectivity is
just the beginning of the
story….
Partial View of www.cl.cam.ac.uk
(128.232.0.20) Neighborhood
AS 3356
Level 3
AS 5459
LINX
AS 6461
AboveNet
AS 20965
GEANT
AS 786
ja.net
(UKERNA)
Originates > 180 prefixes,
Including 128.232.0.0/16
AS 7
UK Defense
Research Agency
AS 1239
Sprint
AS 702
UUNET
AS 1213
HEAnet
(Irish academic
and research)
AS 4373
Online Computer
Library Center
How Many ASNs are there today?
15,981
Thanks to Geoff Huston. http://bgp.potaroo.net on October 24, 2003
How Many ASNs are there today?
18,217
12,940
origin
only (no
transit)
Thanks to Geoff Huston. http://bgp.potaroo.net on October 26, 2004
AS Numbers (ASNs)
ASNs are 16 bit values.
64512 through 65535 are “private”
Currently over 15,000 in use.
• Genuity: 1
• MIT: 3
• JANET: 786
• UC San Diego: 7377
• AT&T: 7018, 6341, 5074, …
• UUNET: 701, 702, 284, 12199, …
• Sprint: 1239, 1240, 6211, 6242, …
• …
ASNs represent units of routing policy
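As a small illustration of the numbering rules on this slide, the following Python sketch checks whether a value fits in 16 bits and whether it falls in the private range as stated above (64512 through 65535). The function names are hypothetical helpers, not part of any BGP library.

```python
ASN_MAX = 2**16 - 1                # largest value that fits in 16 bits
PRIVATE_ASNS = range(64512, 65536)  # "private" range as stated on the slide

def is_valid_asn(asn: int) -> bool:
    """True if the value fits in a 16-bit AS number field."""
    return 0 <= asn <= ASN_MAX

def is_private_asn(asn: int) -> bool:
    """True if the value lies in the private range given on the slide."""
    return asn in PRIVATE_ASNS

for name, asn in [("JANET", 786), ("AT&T", 7018), ("private example", 65000)]:
    kind = "private" if is_private_asn(asn) else "public"
    print(f"{name}: AS {asn} is {kind}")
```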
Autonomous Routing Domains Don’t Always
Need BGP or an ASN
Qwest
Nail up routes 130.132.0.0/16
pointing to Yale
Nail up default routes 0.0.0.0/0
pointing to Qwest
Yale University
130.132.0.0/16
Static routing is the most common way of connecting an
autonomous routing domain to the Internet.
This helps explain why BGP is a mystery to many …
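The static arrangement above can be sketched as two small forwarding tables with longest-prefix matching. This is only an illustration using Python's standard `ipaddress` module; the table layout and the `lookup` helper are assumptions made for the example, while the prefixes and the Yale/Qwest roles come from the slide.

```python
import ipaddress

def lookup(table, address):
    """Return the next hop of the longest matching prefix, or None."""
    addr = ipaddress.ip_address(address)
    matches = [(net, hop) for net, hop in table if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]

# Yale: its own block is local, everything else follows the default route.
yale_table = [
    (ipaddress.ip_network("130.132.0.0/16"), "local"),
    (ipaddress.ip_network("0.0.0.0/0"), "Qwest"),
]

# Qwest: a static route for Yale's block (its other routes are omitted here).
qwest_table = [
    (ipaddress.ip_network("130.132.0.0/16"), "Yale"),
]

print(lookup(yale_table, "128.232.0.20"))  # Qwest (default route)
print(lookup(qwest_table, "130.132.1.1"))  # Yale
```

Neither side needs BGP or an ASN for this to work: both routes are configured by hand.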
ASNs Can Be “Shared” (RFC 2270)
AS 701
UUNet
AS 7046
Crestar
Bank
AS 7046
NJIT
AS 7046
Hood
College
128.235.0.0/16
ASN 7046 is assigned to UUNet. It is used by
customers single homed to UUNet, but needing
BGP for some reason (load balancing, etc.) [RFC 2270]
Autonomous Routing Domain != Autonomous System (AS)
• Most ARDs have no ASN
(statically routed at Internet
edge)
• Some unrelated ARDs share the
same ASN (RFC 2270)
• Some ARDs are implemented
with multiple ASNs (example:
Worldcom)
ASes are an implementation detail of Interdomain routing
How many prefixes today?
154,894
Note: numbers
actually depend on
point of view…
29%
Address space
covered
23%
Thanks to Geoff Huston. http://bgp.potaroo.net on October 24, 2003
How many prefixes today?
179,903
Note: numbers
actually depend on
point of view…
31%
Address space
covered
23%
Thanks to Geoff Huston. http://bgp.potaroo.net on October 26, 2004
Policy-Based vs. Distance-Based Routing?
Minimizing
“hop count” can
violate commercial
relationships that
constrain interdomain routing.
Host 1
Cust1
YES
ISP1
NO
ISP3
ISP2
Cust3
Host 2
Cust2
Why not minimize “AS hop count”?
National
ISP1
National
ISP2
YES
NO
Regional
ISP3
Cust3
Regional
ISP2
Cust2
Regional
ISP1
Cust1
Shortest path routing is not compatible with commercial relations
Customers and Providers
provider
provider
customer
IP traffic
customer
Customer pays provider for access to the Internet
The “Peering” Relationship
peer
provider
peer
customer
Peers provide transit between
their respective customers
Peers do not provide transit
between peers
traffic
allowed
traffic NOT
allowed
Peers (often) do not exchange $$$
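The transit rules on the customer/provider and peering slides are commonly summarised as a single export rule: routes learned from customers are announced to everyone, while routes learned from peers or providers are announced only to customers. The Python sketch below encodes that convention; the function name and the string labels are assumptions for illustration, and real policies are written in vendor configuration languages.

```python
def may_export(learned_from: str, send_to: str) -> bool:
    """Decide whether a route may be announced to a neighbor.

    learned_from: "customer", "peer", "provider", or "self" (own prefixes).
    send_to:      "customer", "peer", or "provider".
    """
    if learned_from in ("customer", "self"):
        return True              # paying customers get transit to everyone
    return send_to == "customer"  # peer/provider routes go only to customers

# Peers provide transit between their respective customers ...
print(may_export("customer", "peer"))  # True
print(may_export("peer", "customer"))  # True
# ... but not between peers, and not between a peer and a provider.
print(may_export("peer", "peer"))      # False
print(may_export("peer", "provider"))  # False
```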
Peering Provides Shortcuts
Peering also allows connectivity between
the customers of “Tier 1” providers.
peer
provider
peer
customer
Peering Wars
Peer
• Reduces upstream
transit costs
• Can increase end-to-end performance
• May be the only way to
connect your
customers to some
part of the Internet
(“Tier 1”)
Don’t Peer
• You would rather have
customers
• Peers are usually your
competition
• Peering relationships
may require periodic
renegotiation
Peering struggles are by far the most
contentious issues in the ISP world!
Peering agreements are often confidential.
The Border Gateway Protocol (BGP)
BGP =
RFC 1771
+ “optional” extensions: RFC 1997 (communities), RFC 2439 (damping), RFC 2796 (reflection), RFC 3065 (confederation), …
+ routing policy configuration languages (vendor-specific)
+ Current Best Practices in management of Interdomain Routing
BGP was not DESIGNED.
It EVOLVED.
BGP Route Processing
Open ended programming.
Constrained only by vendor configuration language
Receive BGP Updates
→ Apply Import Policies (Apply Policy = filter routes & tweak attributes)
→ Best Route Selection (based on attribute values)
→ Best Route Table
→ Apply Export Policies (Apply Policy = filter routes & tweak attributes)
→ Transmit BGP Updates
Best Route Table → Install forwarding entries for best routes → IP Forwarding Table
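The processing stages on this slide can be sketched as a three-step pipeline: import policy, best route selection, export policy. The Python below is a simplified illustration only. The `Route` class, the local AS number 65000, the specific policies, and the two-step ranking (local preference, then AS path length) are all assumptions chosen for the example; real routers apply a longer decision process and vendor-specific policy languages.

```python
from dataclasses import dataclass, replace
from typing import Optional

MY_AS = 65000  # hypothetical local AS number

@dataclass(frozen=True)
class Route:
    prefix: str
    as_path: tuple
    local_pref: int = 100

def import_policy(route: Route) -> Optional[Route]:
    """Filter routes and tweak attributes on the way in (illustrative rules)."""
    if MY_AS in route.as_path:
        return None                           # filter: our own AS is in the path
    if route.as_path[0] == 3:
        return replace(route, local_pref=50)  # tweak: neighbor AS 3 is a backup
    return route

def select_best(routes):
    """Keep one best route per prefix: higher local pref, then shorter path."""
    best = {}
    for route in routes:
        current = best.get(route.prefix)
        key = (route.local_pref, -len(route.as_path))
        if current is None or key > (current.local_pref, -len(current.as_path)):
            best[route.prefix] = route
    return best

def export_policy(route: Route) -> Route:
    """Prepend our AS; local preference is not carried to the neighbor AS."""
    return replace(route, as_path=(MY_AS,) + route.as_path, local_pref=100)

def process(updates):
    accepted = [r for r in map(import_policy, updates) if r is not None]
    best = select_best(accepted)
    announcements = [export_policy(r) for r in best.values()]
    return best, announcements

updates = [
    Route("192.0.2.0/24", (3, 7)),
    Route("192.0.2.0/24", (1, 8, 7)),
    Route("198.51.100.0/24", (1, 65000, 9)),
]
best, announcements = process(updates)
print(best["192.0.2.0/24"].as_path)  # (1, 8, 7): local pref beats path length
print(announcements[0].as_path)      # (65000, 1, 8, 7)
```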
BGP Attributes
Value  Code                      Reference
-----  ------------------------  ---------
1      ORIGIN                    [RFC1771]
2      AS_PATH                   [RFC1771]
3      NEXT_HOP                  [RFC1771]
4      MULTI_EXIT_DISC           [RFC1771]
5      LOCAL_PREF                [RFC1771]
6      ATOMIC_AGGREGATE          [RFC1771]
7      AGGREGATOR                [RFC1771]
8      COMMUNITY                 [RFC1997]
9      ORIGINATOR_ID             [RFC2796]
10     CLUSTER_LIST              [RFC2796]
11     DPA                       [Chen]
12     ADVERTISER                [RFC1863]
13     RCID_PATH / CLUSTER_ID    [RFC1863]
14     MP_REACH_NLRI             [RFC2283]
15     MP_UNREACH_NLRI           [RFC2283]
16     EXTENDED COMMUNITIES      [Rosen]
...
255    reserved for development
Most important attributes
From IANA: http://www.iana.org/assignments/bgp-parameters
Not all attributes
need to be present in
every announcement
ASPATH Attribute
AS 1129
135.207.0.0/16
AS Path = 1755 1239 7018 6341
135.207.0.0/16
AS Path = 1239 7018 6341
AS 1239
Sprint
AS 1755
135.207.0.0/16
AS Path = 1129 1755 1239 7018 6341
Ebone
AS 12654
AS 6341
AT&T Research
RIPE NCC
RIS project
135.207.0.0/16
AS Path = 7018 6341
AS7018
135.207.0.0/16
AS Path = 6341
Global Access
135.207.0.0/16
AS Path = 3549 7018 6341
AT&T
135.207.0.0/16
AS Path = 7018 6341
AS 3549
Global Crossing
135.207.0.0/16
Prefix Originated
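The AS paths in this figure grow by one AS number at each hop. A minimal Python sketch of that behaviour, using the AS numbers from the slide, follows. The helper names are hypothetical. The loop check (an AS rejects a route whose path already contains its own number) is standard BGP behaviour rather than something shown on the slide.

```python
def advertise(as_path, own_asn):
    """Path as announced by `own_asn`: its number is prepended."""
    return (own_asn,) + tuple(as_path)

def accepts(as_path, own_asn):
    """Loop prevention: reject paths that already contain our AS number."""
    return own_asn not in as_path

# AS 6341 originates 135.207.0.0/16; the route then passes through
# AS 7018 (AT&T), AS 1239 (Sprint), AS 1755 (Ebone) and AS 1129.
path = advertise((), 6341)
for asn in (7018, 1239, 1755, 1129):
    assert accepts(path, asn)
    path = advertise(path, asn)

print(" ".join(map(str, path)))  # 1129 1755 1239 7018 6341
print(accepts(path, 1239))       # False: AS 1239 would see a loop
```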
Shorter Doesn’t Always Mean Shorter
In fairness:
could you do
this “right” and
still scale?
Mr. BGP says that
path 4 1 is better
than path 3 2 1
Duh!
AS 4
AS 3
Exporting internal
state would
dramatically
increase global
instability and
amount of routing
state
AS 2
AS 1
Routing Example 1
Thanks to Han Zheng
Routing Example 2
Thanks to Han Zheng
Tweak Tweak Tweak (TE)
• For inbound traffic
– Filter outbound routes
– Tweak attributes on outbound routes in the hope of influencing your neighbor’s best route selection
• For outbound traffic
– Filter inbound routes
– Tweak attributes on inbound routes to influence best route selection
In general, an AS has more control over outbound traffic
[Diagram labels: inbound traffic / outbound routes; outbound traffic / inbound routes]
Implementing Backup Links with Local Preference
(Outbound Traffic)
AS 1
primary link
Set Local Pref = 100
for all routes from AS 1
backup link
AS 65000
Set Local Pref = 50
for all routes from AS 1
Forces outbound traffic to take primary link, unless link is down.
Multihomed Backups
(Outbound Traffic)
AS 1
AS 3
provider
provider
primary link
backup link
Set Local Pref = 100
for all routes from AS 1
Set Local Pref = 50
for all routes from AS 3
AS 2
Forces outbound traffic to take primary link, unless link is down.
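The backup configurations above can be sketched as follows for the multihomed case, where AS 2 assigns local preference 100 to routes from AS 1 and 50 to routes from AS 3. The `choose_exit` helper, the destination AS 9, and the AS paths are hypothetical; the comparison considers only local preference and then AS path length, which is a simplification of the full BGP decision process.

```python
# Local preference assigned by AS 2's import policy, per neighbor (from the slide).
LOCAL_PREF_BY_NEIGHBOR = {1: 100, 3: 50}

def choose_exit(offers, sessions_up):
    """Pick the neighbor whose route wins.

    offers:      {neighbor ASN: AS path offered by that neighbor}
    sessions_up: set of neighbor ASNs whose BGP session is currently up
    """
    candidates = [
        (LOCAL_PREF_BY_NEIGHBOR[n], -len(path), n)
        for n, path in offers.items()
        if n in sessions_up
    ]
    return max(candidates)[2] if candidates else None

# Hypothetical destination in AS 9; the backup provider even has a shorter path.
offers = {1: (1, 8, 9), 3: (3, 9)}

print(choose_exit(offers, {1, 3}))  # 1: primary wins on local preference
print(choose_exit(offers, {3}))     # 3: backup is used only when primary is down
```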
Shedding Inbound Traffic with
ASPATH Prepending
AS 1
Prepending will (usually)
force inbound
traffic from AS 1
to take primary link
provider
192.0.2.0/24
ASPATH = 2 2 2
192.0.2.0/24
ASPATH = 2
primary
backup
customer
AS 2
192.0.2.0/24
Yes, this is a
Glorious Hack …
… But Padding Does Not Always Work
AS 1
AS 3
provider
provider
192.0.2.0/24
ASPATH = 2
192.0.2.0/24
ASPATH = 2 2 2 2 2 2 2 2 2 2 2 2 2 2
primary
backup
customer
AS 2
192.0.2.0/24
AS 3 will send
traffic on “backup”
link because it prefers
customer routes and local
preference is considered
before ASPATH length!
Padding in this way is often
used as a form of load
balancing
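A short Python sketch of why padding fails here: AS 3 compares local preference before AS path length. The local preference values (customer 100, peer 90) are the ones this deck gives for AS 3 on the COMMUNITY slide; treating AS 1 as a peer of AS 3, and the two-step comparison itself, are simplifying assumptions.

```python
def best(routes):
    """Higher local preference wins; AS path length only breaks ties."""
    return max(routes, key=lambda r: (r["local_pref"], -len(r["as_path"])))

padded = (2,) * 14  # ASPATH = 2 2 2 ... (14 copies) sent on the backup link

as3_routes = [
    {"via": "AS 2 (customer, backup link)", "local_pref": 100, "as_path": padded},
    {"via": "AS 1 (assumed peer)", "local_pref": 90, "as_path": (1, 2)},
]
print(best(as3_routes)["via"])  # AS 2 (customer, backup link)

# Padding only matters when local preferences are equal:
tied = [dict(r, local_pref=100) for r in as3_routes]
print(best(tied)["via"])        # AS 1 (assumed peer)
```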
COMMUNITY Attribute to the Rescue!
AS 1
AS 3
provider
provider
AS 3: normal
customer local
pref is 100,
peer local pref is 90
192.0.2.0/24
ASPATH = 2
COMMUNITY = 3:70
192.0.2.0/24
ASPATH = 2
primary
backup
customer
AS 2
192.0.2.0/24
Customer import policy at AS 3:
If 3:90 in COMMUNITY then
set local preference to 90
If 3:80 in COMMUNITY then
set local preference to 80
If 3:70 in COMMUNITY then
set local preference to 70
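The customer import policy at AS 3 can be sketched in Python as a sequence of community checks, mirroring the three "If ... then" rules above. The dictionary-based route representation and helper names are assumptions for illustration. The final comparison assumes AS 3 also hears the prefix from AS 1 with the peer local preference of 90.

```python
# Rules from the slide, applied in order (community -> local preference).
COMMUNITY_RULES = [("3:90", 90), ("3:80", 80), ("3:70", 70)]
CUSTOMER_LOCAL_PREF = 100  # AS 3's normal customer local preference
PEER_LOCAL_PREF = 90       # AS 3's peer local preference

def customer_import(route):
    """Apply AS 3's customer import policy and return the updated route."""
    local_pref = CUSTOMER_LOCAL_PREF
    for community, value in COMMUNITY_RULES:
        if community in route["communities"]:
            local_pref = value
    return {**route, "local_pref": local_pref}

def best(routes):
    return max(routes, key=lambda r: (r["local_pref"], -len(r["as_path"])))

backup = customer_import(
    {"via": "AS 2 (backup link)", "as_path": (2,), "communities": {"3:70"}}
)
via_as1 = {"via": "AS 1", "as_path": (1, 2), "local_pref": PEER_LOCAL_PREF}

print(backup["local_pref"])          # 70
print(best([backup, via_as1])["via"])  # AS 1: traffic now avoids the backup link
```

Unlike padding, the community lets the customer lower its route below the peer route inside AS 3, so the backup link is used only when the route via AS 1 disappears.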
BGP Wedgies ---- Bad Policy Interactions
that Cannot be Debugged
[email protected]
http://www.cambridge.intel-research.net/~tgriffin/
What is a BGP Wedgie?
¾ wedgie
full
wedgie
• BGP policies make sense
locally
• Interaction of local policies
allows multiple stable routings
• Some routings are consistent
with intended policies, and
some are not
– If an unintended routing is installed
(BGP is “wedged”), then manual
intervention is needed to change to
an intended routing
• When an unintended routing is
installed, no single group of
network operators has enough
knowledge to debug the
problem
¾ Wedgie Example
AS 3
peer
peer
provider
AS 4
provider
customer
AS 2
provider
primary link
backup link
customer
customer
AS 1
• AS 1 implements
backup link by
sending AS 2 a
“depref me”
community.
• AS 2 implements this
community so that
the resulting local
pref is below that of
routes from its
upstream provider
(AS 3 routes)
And the Routings are…
AS 3
AS 4
AS 2
AS 3
AS 4
AS 2
AS 1
Intended Routing
Note: this would be the ONLY
routing if AS2 translated its
“depref me” community to a
“depref me” community of AS 3
AS 1
Unintended Routing
Note: This is easy to reach from
the intended routing just by “bouncing”
the BGP session on the primary link.
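One way to check that the ¾ wedgie example has exactly these two stable routings is to model it in the style of a stable paths problem: each AS ranks the paths it is willing to use, and a routing is stable when every AS holds its highest-ranked path among those its neighbors currently offer. The Python sketch below enumerates all candidate routings by brute force. The rankings are an interpretation of the policies described on the example slide, and the data layout and function names are assumptions for illustration.

```python
from itertools import product

# Ranked permitted paths to AS 1 (most preferred first):
#   AS 2: provider route via AS 3 beats the "depref'd" direct backup route
#   AS 3: customer route via AS 2 beats peer route via AS 4
#   AS 4: direct customer route beats peer route via AS 3
RANKED = {
    2: [(2, 3, 4, 1), (2, 1)],
    3: [(3, 2, 1), (3, 4, 1)],
    4: [(4, 1), (4, 3, 2, 1)],
}
NEIGHBORS = {2: [1, 3], 3: [2, 4], 4: [1, 3]}

def best_choice(node, assignment):
    """Highest-ranked permitted path among those the neighbors offer."""
    offers = {
        (node,) + assignment[n]
        for n in NEIGHBORS[node]
        if assignment[n] is not None
    }
    return next((p for p in RANKED[node] if p in offers), None)

def stable_routings():
    nodes = sorted(RANKED)
    solutions = []
    for choice in product(*[RANKED[n] + [None] for n in nodes]):
        assignment = {1: (1,), **dict(zip(nodes, choice))}
        if all(best_choice(n, assignment) == assignment[n] for n in nodes):
            solutions.append({n: assignment[n] for n in nodes})
    return solutions

for routing in stable_routings():
    print(routing)
```

The enumeration finds two solutions: the intended one, where AS 2 and AS 3 reach AS 1 through AS 4, and the unintended one, where AS 2 and AS 3 use the backup link.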
Recovery
AS 3
AS 4
AS 2
AS 3
AS 4
AS 2
AS 1
Bring down AS 1-2 session
AS 3
AS 4
AS 2
AS 1
AS 1
Bring it back up!
• Requires manual intervention
• Can be done in AS 1 or AS 2
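The wedging and recovery steps for the ¾ wedgie can be simulated with a simple model in which each AS repeatedly re-selects its most preferred path from what its neighbors currently hold, until nothing changes. This Python sketch is a simplification under stated assumptions: the path rankings are an interpretation of the policies in the example, ASes are re-evaluated in a fixed round-robin order, and real BGP message timing is ignored.

```python
# Ranked permitted paths to AS 1 (most preferred first), as in the example:
RANKED = {
    2: [(2, 3, 4, 1), (2, 1)],   # provider route preferred over depref'd backup
    3: [(3, 2, 1), (3, 4, 1)],   # customer route preferred over peer route
    4: [(4, 1), (4, 3, 2, 1)],   # customer route preferred over peer route
}
NEIGHBORS = {2: [1, 3], 3: [2, 4], 4: [1, 3]}
PRIMARY = frozenset({1, 4})      # AS 1 - AS 4 session
BACKUP = frozenset({1, 2})       # AS 1 - AS 2 session

INTENDED = {1: (1,), 2: (2, 3, 4, 1), 3: (3, 4, 1), 4: (4, 1)}
UNINTENDED = {1: (1,), 2: (2, 1), 3: (3, 2, 1), 4: (4, 1)}

def converge(state, down=()):
    """Re-run best path selection at every AS until the routing is stable."""
    state = dict(state)
    changed = True
    while changed:
        changed = False
        for node in sorted(RANKED):
            offers = {
                (node,) + state[n]
                for n in NEIGHBORS[node]
                if frozenset({node, n}) not in down and state.get(n) is not None
            }
            best = next((p for p in RANKED[node] if p in offers), None)
            if best != state.get(node):
                state[node] = best
                changed = True
    return state

def bounce(state, link):
    """Take a session down, let routing settle, then bring it back up."""
    return converge(converge(state, down={link}))

print(bounce(INTENDED, PRIMARY) == UNINTENDED)  # True: BGP is now wedged
print(bounce(UNINTENDED, BACKUP) == INTENDED)   # True: manual recovery works
```

In this model, bouncing the primary session again from the unintended routing leaves it unchanged, which is why recovery requires resetting the AS 1-2 session.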
Load Balancing Example
AS 3
peer
provider
peer
AS 4
provider
customer
customer
AS 2
AS 5
primary link for prefix P2
backup link for prefix P1
primary link for prefix P1
backup link for prefix P2
AS 1
• Recovery for prefix P1 may cause
a BGP wedgie for prefix P2 …
Full Wedgie Example
peer
peer
AS 3
AS 4
provider
provider
customer
customer
AS 2
peer
provider
peer
AS 5
backup links
primary link
customer
customer
AS 1
• AS 1 implements backup links by sending AS 2 and AS 5 “depref me” communities.
• AS 2 implements its community so that the resulting local pref is below that of its upstream providers and its peers (AS 3 and AS 5 routes)
• AS 5 implements its community so that the resulting local pref is below its peers (AS 2) but above that of its providers (AS 3)
And the Routings are…
AS 3
AS 4
AS 5
AS 2
AS 3
AS 4
AS 5
AS 2
AS 1
AS 1
Intended Routing
Unintended Routing
Recovery??
AS 3
AS 4
AS 5
AS 2
AS 3
AS 4
AS 5
AS 2
AS 1
AS 1
Bring down AS 1-2 session
Bring up AS 1-2 session
Recovery
AS 3
AS 2
AS 4
AS 5
AS 3
AS 4
AS 5
AS 2
AS 1
AS 3
AS 2
AS 1
Bring down AS 1-2 session
AND AS 1-5 session
Try telling AS 5 that it has
to reset a BGP session that is
not associated with a BEST route!
AS 4
AS 5
AS 1
Bring up AS 1-2 session
AND AS 1-5 session
Larry Speaks
Is this any
way to run an
Internet?
http://www.larrysface.com/
References
• [VGE1996, VGE2000] Persistent Route Oscillations in Inter-Domain Routing. Kannan Varadhan, Ramesh Govindan, and Deborah Estrin. Computer Networks, Jan. 2000. (Also USC Tech Report, Feb. 1996)
• [GW1999] An Analysis of BGP Convergence Properties. Timothy G. Griffin, Gordon Wilfong. SIGCOMM 1999
• [GSW1999] Policy Disputes in Path Vector Protocols. Timothy G. Griffin, F. Bruce Shepherd, Gordon Wilfong. ICNP 1999
• [GW2001] A Safe Path Vector Protocol. Timothy G. Griffin, Gordon Wilfong. INFOCOM 2001
• [GR2000] Stable Internet Routing without Global Coordination. Lixin Gao, Jennifer Rexford. SIGMETRICS 2000
• [GGR2001] Inherently safe backup routing with BGP. Lixin Gao, Timothy G. Griffin, Jennifer Rexford. INFOCOM 2001
• [GW2002a] On the Correctness of IBGP Configurations. Griffin and Wilfong. SIGCOMM 2002
• [GW2002b] An Analysis of the MED oscillation Problem. Griffin and Wilfong. ICNP 2002
Pointers
• Interdomain routing links
– http://www.cambridge.intel-research.net/~tgriffin/interdomain/
• These slides
– http://www.cambridge.intel-research.net/~tgriffin/talks_tutorials/CL_2031024.ppt