Challenges of Voice-over-IP –
The Second Quarter Century
Henning Schulzrinne
Dept. of Computer Science
Columbia University
1
Credits

Members of the IRT lab and project students:









Clayton Chen
Wenyu Jiang
Jonathan Lennox
Sankaran Narayanan
Jonathan Rosenberg
Kundan Singh
Xin Wang
Xiaotao Wu
IETF SIMPLE, SIP, SIPPING working groups
2
Outline


A brief history of packet voice
Challenges:









QoS
Security
NATs
Service creation
Scaling
Interworking
Emergency calls
CINEMA project at Columbia
Events as new Internet service
3
A brief history

August 1974: Real-time packet voice between USC/ISI and MIT/LL, using CVSD and NVP
December 1974: Packet voice between CHI and MIT/LL, using LPC and NVP
January 1976: Live packet voice conferencing between USC/ISI, MIT/LL, SRI, using LPC and NVCP
Approximately 1976: First packetized speech over SATNET between Lincoln Labs and NTA (Norway) and UCL (UK)
1990: ITU recommendation G.764 (Voice packetization – packetized voice protocols)
4
A brief history

February 1991: DARTnet voice experiments
August 1991: LBL's audio tool vat released for DARTnet use
March 1992: First IETF MBONE broadcast (San Diego)
January 1996: RTP standardized (RFC 1889/1890)
November 1996: H.323v1 published
February/March 1999: SIP standardized (RFC 2543)
5
VoIP applications

Trunk replacements between PBXs
  Ethernet trunk cards for PBXs
  T1/E1 gateways
IP centrex – outsourcing the gateway
  Denwa, Worldcom
Enterprise telephony
  Cisco Avvid, 3Com, Mitel, ...
Consumer calling cards (phone-to-phone)
  net2phone, iConnectHere (deltathree), ...
PC-to-phone, PC-to-PC
  net2phone, dialpad, iConnectHere, mediaring, ...
6
VoIP protocol components

RTP for data transmission
  ROHC, CRTP for header compression
SIP or H.323 for call setup (signaling)
  sometimes, H.248 (Megaco) for control of gateways
ENUM for mapping E.164 numbers to (SIP) URIs
TRIP for large gateway clouds
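A minimal sketch of the first step of the ENUM mapping listed above: the digits of the E.164 number are reversed, separated by dots, and placed under e164.arpa, and the resulting DNS name is then queried for records that yield URIs. The function name and the sample number are hypothetical, and the DNS lookup itself is left out.

```python
def e164_to_enum_domain(number: str, suffix: str = "e164.arpa") -> str:
    """Turn an E.164 number such as '+1-212-555-0123' into its ENUM domain name."""
    # Keep only the digits; '+', '-' and spaces are presentation characters.
    digits = [ch for ch in number if ch.isdigit()]
    if not digits:
        raise ValueError("no digits in number")
    # Reverse the digits so the most specific digit becomes the leftmost label.
    return ".".join(reversed(digits)) + "." + suffix


# Hypothetical number, used only for illustration.
print(e164_to_enum_domain("+1-212-555-0123"))
```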
7
Where are we?

Variety of robust SIP phones (and lots
of proprietary ones)


SIP carriers terminate LAN VoIP




not yet in Wal-Mart
number portability?
911
50+ vendors at SIPit
Building blocks: media servers, unified
messaging, conferencing, VoiceXML, …
8
Status in 2002



2000: 6 billion minutes wholesale, 15 billion minutes retail
2001: 10 billion minutes worldwide – 6% of traffic (phone-to-phone only)
e.g., net2phone: 341 million minutes per quarter
9
Where are we?

Not quite what we had in mind

initially, SIP for initiating multicast conferencing




in progress since 1992
still small niche
even the IAB and IESG meet by POTS conference…
then VoIP



written-off equipment (circuit-switched) vs. new
equipment (VoIP)
bandwidth is (mostly) not the problem
“can’t get new services if other end is POTS” <-> “why use VoIP if I can’t get new services”
10
Where are we?

VoIP: avoiding the installed base issue



cable modems – lifeline service
3GPP – vaporware?
Finally, IM/presence and events



probably, first major application
offers real advantage: interoperable IM
also, new service
11
VoIP at Home


Lifeline (power)
Multiple phones per household





expensive to do over PNA or 802.11
BlueTooth range too short
need wireless SIP base station + handsets
PDAs with 802.11 and GSM? (Treo++)
Incentives

SMS & IM services
12
SIP phones

Hard to build really basic phones


need real multitasking OS
need large set of protocols:






IP, DNS, DHCP, maybe IPsec, SNTP and SNMP
UDP, TCP, maybe TLS
HTTP (configuration), RTP, SIP
user-interface for entering URLs is a pain
see “success” of Internet appliances
“PCs with handset” cost $500 and still have a
Palm-size display
13
Challenges: QoS


Bottlenecks: access and interchanges
Backbones: e.g., Worldcom Jan. 2002




50 ms US, 79 ms transatlantic RTT
0.067% US, 0.042% transatlantic packet
loss
Keynote 2/2002: “almost all had error rates less than 0.25%” (but some up to 1%)
LANs: generally, less than 0.1% loss,
but beware of hubs
14
15
Challenges: QoS


Not lack of protocols – RSVP, diff-serv
Lack of policy mechanisms and complexity






which traffic is more important?
how to authenticate users?
cross-domain authentication
may need for access only – bidirectional traffic
DiffServ: need agreed-upon code points
NSIS WG in IETF – currently, requirements
only
16
RNAP: price-based admission
and adaptation



Model: users adjust multimedia
bandwidth according to price sensitivity
Generally, automatically based on
profile
DiffServ or IntServ model
17
RNAP network model
[Figure: RNAP messages exchanged between HRNs and LRNs (Local Resource Negotiators) across Access Domain A, a Transit Domain, and Access Domain B; edge routers and internal routers are marked.]
18
RNAP performance
19
RNAP performance
20
QoS: Voice quality evaluation

Traditional: use lots of human subjects to rate speech
quality (mean-opinion score) or signal-processing
approximations
We: Use automatic speech recognizer to do the job
[Figure: human mean recognition ratio (0% to 90%) versus loss rate (0% to 20%).]
21
QoS: voice-quality
relative performance as universal MOS predictor
[Figure: mean MOS (1 to 4) versus relative recognition ratio (50% to 110%), plotted for speaker A and speaker B.]
22
QoS: voice quality
G.729 quality with and without FEC, T=20 or 40 ms
[Figure: MOS (1 to 4.5) versus loss rate (0% to 20%). Curves: without FEC at T=20 ms, with FEC at T=20 ms, without FEC at T=40 ms, with FEC at T=40 ms.]
23
QoS: voice quality
Quality comparison of FEC vs. LBR for G.729 codec, T=30 ms, Gilbert loss, p_c=30%
[Figure: MOS (1 to 4.5) versus loss rate (0% to 15%). Curves: G.729+G.723.1 LBR, G.729 FEC (2,1).]
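The Gilbert loss model named in the caption can be sketched as a two-state Markov chain. This sketch assumes that p_c is the conditional loss probability (the chance that a packet is lost given that the previous one was lost) and that the plotted loss rate is the long-run average; both readings, and the function names, are assumptions rather than definitions taken from the slide.

```python
import random


def gilbert_transitions(loss_rate, p_c):
    """Return (p, q): P(good -> lost) and P(lost -> good) for a Gilbert model."""
    if not (0 <= loss_rate < 1 and 0 <= p_c < 1):
        raise ValueError("loss_rate and p_c must lie in [0, 1)")
    q = 1.0 - p_c                          # leave the lost state
    p = loss_rate * q / (1.0 - loss_rate)  # from stationary loss = p / (p + q)
    if p > 1.0:
        raise ValueError("loss_rate too high for this p_c")
    return p, q


def simulate_losses(loss_rate, p_c, n, seed=0):
    """Generate n loss indicators (True = lost) with bursty, correlated losses."""
    p, q = gilbert_transitions(loss_rate, p_c)
    rng = random.Random(seed)
    lost = False
    pattern = []
    for _ in range(n):
        if lost:
            lost = rng.random() >= q  # stay lost with probability p_c
        else:
            lost = rng.random() < p   # enter the lost state with probability p
        pattern.append(lost)
    return pattern
```

With p_c=30% as in the caption, a 10% average loss rate corresponds to roughly a 7.8% chance of moving from the good state to the lost state.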
24
Challenges: Security


Classical model of restricted access
systems -> cryptographic security
Objectives:




identification for access control & billing
phone/IM spam control (black/white lists)
call routing
privacy
25
SIP security




Bar is higher than for email – telephone
expectations (albeit wrong)
SIP carries media encryption keys
Potential for nuisance – phone spam at
2 am
Safety – prevent emergency calls
26
System model
[Figure: SIP trapezoid; labels: outbound proxy, registrar, binding [email protected]: 128.59.16.1.]
27
SIP session setup
[Figure: SIP session setup showing REGISTER, INVITE, and BYE; label: [email protected]: 128.59.16.1.]
28
Threats


Bogus requests (e.g., fake From)
Modification of content







REGISTER Contact
SDP to redirect media
Insertion of requests into existing dialogs:
BYE, re-INVITE
Denial of service (DoS) attacks
Privacy: SDP may include media session keys
Inside vs. outside threats
Trust domains – can proxies be trusted?
29
Threats

third-party
  not on path
  can generate requests
passive man-in-middle (MIM)
  listen, but not modify
active man-in-middle
  replay
  cut-and-paste
30
Challenges: NATs and firewalls

NATs and firewalls reduce Internet to
web and email service





firewall, NAT: no inbound connections
NAT: no externally usable address
NAT: many different versions -> binding
duration
lack of permanent address (e.g., DHCP)
not a problem -> SIP address binding
misperception: NAT = security
31
Challenges: NAT and firewalls

Solutions:


longer term: IPv6
longer term: MIDCOM for firewall control?


control by border proxy?
short term:




NAT: STUN and SHIPWORM
send packet to external server
server returns external address, port
use that address for inbound UDP packets
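The three short-term steps above can be sketched with the message layout of classic STUN (RFC 3489): a 20-byte header followed by attributes, with the server reporting the observed source address in a MAPPED-ADDRESS attribute. The sketch does no network I/O; the helper names are hypothetical, and `build_binding_response` only stands in for what an external server would send back.

```python
import os
import socket
import struct

BINDING_REQUEST = 0x0001
BINDING_RESPONSE = 0x0101
ATTR_MAPPED_ADDRESS = 0x0001
HEADER_LEN = 20  # type (2) + length (2) + transaction ID (16)


def build_binding_request(transaction_id=None):
    """Step 1: the packet the client sends to the external server."""
    tid = os.urandom(16) if transaction_id is None else transaction_id
    if len(tid) != 16:
        raise ValueError("transaction ID must be 16 bytes")
    return struct.pack("!HH", BINDING_REQUEST, 0) + tid


def build_binding_response(transaction_id, ip, port):
    """Step 2 (server side, for illustration): echo the observed address and port."""
    value = struct.pack("!BBH", 0, 0x01, port) + socket.inet_aton(ip)
    attr = struct.pack("!HH", ATTR_MAPPED_ADDRESS, len(value)) + value
    return struct.pack("!HH", BINDING_RESPONSE, len(attr)) + transaction_id + attr


def parse_mapped_address(message, transaction_id):
    """Step 3: extract the external (ip, port) to advertise for inbound UDP."""
    if len(message) < HEADER_LEN:
        raise ValueError("message too short")
    msg_type, length = struct.unpack("!HH", message[:4])
    if msg_type != BINDING_RESPONSE or message[4:20] != transaction_id:
        raise ValueError("not the response to our request")
    offset, end = HEADER_LEN, HEADER_LEN + length
    while offset + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", message[offset:offset + 4])
        value = message[offset + 4:offset + 4 + attr_len]
        if attr_type == ATTR_MAPPED_ADDRESS and len(value) == 8:
            _, family, port = struct.unpack("!BBH", value[:4])
            if family == 0x01:  # IPv4
                return socket.inet_ntoa(value[4:8]), port
        offset += 4 + attr_len
    raise ValueError("no MAPPED-ADDRESS attribute")
```

In a real client the request would go out over the same UDP socket that later carries media, since the NAT binding is per socket.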
32
Challenges: service creation


Can’t win by (just) recreating PSTN
services
Programmable services:




equipment vendors, operators: JAIN Java
API
web-like (Perl scripts): sip-cgi
proxy-based call routing: CPL
voice-based interaction: VoiceXML
33
Call Processing Language


XML rule set for handling calls
Intentionally not Turing-complete
<cpl>
  <subaction id="voicemail">
    <location url="sip:[email protected]">
      <proxy />
    </location>
  </subaction>
  <incoming>
    <location url="sip:[email protected]">
      <proxy timeout="8">
        <busy><sub ref="voicemail" /></busy>
        <noanswer><sub ref="voicemail" /></noanswer>
      </proxy>
    </location>
  </incoming>
</cpl>
34
sip-cgi: scripting phone calls
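use DB_File;

# Emit a SIP error response and stop the script.
sub fail {
    my($status, $reason) = @_;
    print "SIP/2.0 $status $reason\n\n";
    exit 0;
}

# Open the address database; answer 500 if it cannot be opened.
tie %addresses, 'DB_File', 'addresses.db'
    or fail("500", "Address database failure");

# The To header is passed in the environment; reject requests without one.
$to = $ENV{'HTTP_TO'};
if (! defined( $to )) {
    fail("400", "Missing Recipient");
}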
35
Emergency calls

Opportunity for enhanced services:
  video, biometrics, IM
Finding the right emergency call center (PSAP)
  VoIP admin domain may span multiple 911 calling areas
Common emergency address
User location
  GPS doesn’t work indoors
  phones can move easily – IP address does not help
36
Emergency calls
common emergency identifier: [email protected]
EPAD
REGISTER sip:sos
302 Moved
Contact: sip:[email protected]
Contact: tel:+1-201-911-1234
Location: 07605
INVITE sip:sos
Location: 07605
SIP
proxy
INVITE sip:[email protected]
Location: 07605
37
Scaling and redundancy

Single host can handle 10-100 calls + registrations/second -> 18,000-180,000 users
  assuming 1 call, 1 registration/hour per user
Conference server: about 50 small
conferences or large conference with
100 users
For larger system and redundancy,
replicate proxy server
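The user range quoted on this slide follows from simple arithmetic: a user who makes 1 call and 1 registration per hour generates 2 requests per hour, so a host that handles R requests per second supports R × 3600 / 2 users. A short sketch, with a hypothetical function name:

```python
def supported_users(requests_per_second, requests_per_user_per_hour=2):
    """Users one host can serve, given its request rate and the per-user load."""
    return int(requests_per_second * 3600 / requests_per_user_per_hour)


# 10 and 100 requests/second, with 1 call + 1 registration per user per hour.
print(supported_users(10), supported_users(100))
```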
38
Scaling and redundancy

DNS SRV records allow static load
balancing and fail-over



but failed systems increase call setup delay
can also use IP address “stealing” to mask
failed systems, as long as load < 50%
Still need common database


can separate REGISTER
make rest read-only
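As a rough sketch of how a client might turn SRV records into a try-order: lower priority values are tried first, and within one priority the order is a weighted random draw, which spreads load and leaves the remaining targets as fail-over candidates. Records are assumed to be (priority, weight, target) tuples; the function name is hypothetical and the selection is simplified relative to the full SRV algorithm.

```python
import random


def order_srv_targets(records, rng=None):
    """Return targets in the order a client would try them."""
    rng = rng or random.Random()
    by_priority = {}
    for priority, weight, target in records:
        by_priority.setdefault(priority, []).append((weight, target))

    ordered = []
    for priority in sorted(by_priority):          # lowest priority value first
        remaining = list(by_priority[priority])
        while remaining:
            total = sum(w for w, _ in remaining)
            if total == 0:
                # All weights zero: plain random choice among equals.
                index = rng.randrange(len(remaining))
            else:
                # Weighted draw: walk the running sum until it covers the pick.
                pick = rng.uniform(0, total)
                running = 0
                index = len(remaining) - 1
                for i, (w, _) in enumerate(remaining):
                    running += w
                    if pick <= running:
                        index = i
                        break
            ordered.append(remaining.pop(index)[1])
    return ordered


records = [(0, 0, "sip1.example.com"), (0, 0, "sip2.example.com"),
           (1, 0, "sip3.example.com")]
print(order_srv_targets(records))
```

A client that gets no answer from the first target moves to the next one, which is where the extra call setup delay noted above comes from.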
39
Large system
stateless proxies
sip1.example.com
a1.example.com
a2.example.com
sip2.example.com
sip:[email protected]
sip:[email protected]
b1.example.com
sip3.example.com
b2.example.com
_sip._udp SRV 0 0 sip1.example.com
0 0 sip2.example.com
_sip._udp SRV 0 0 b1.example.com
0 0 b2.example.com
0 0 sip3.example.com
40
Enterprise VoIP



Allow migration of enterprises to IP
multimedia communication
Add capacity to existing PBX, without
upgrade
Allow both



IP centrex: hosted by carrier
“PBX”-style: locally hosted
Unlike classical centrex, transition can be
done transparently
41
Motivation



Not cheaper phone calls
Single number, follow-me – even for analog
phone users
Integration of presence



person already busy – better than callback
physical environment (IR sensors)
Integration of IM



no need to look up IM address
missed calls become IMs
move immediately to voice if IM too tedious
42
Migration strategy
1. Add IP phones to existing PBX or Centrex system – PBX as gateway
   Initial investment: $2k for gateway
2. Add multimedia capabilities: PCs, dedicated video servers
3. “Reverse” PBX: replace PSTN connection with SIP/IP connection to carrier
4. Retire PSTN phones
43
Example: Columbia Dept. of CS

About 100 analog phones on small PBX







DID
no voicemail
T1 to local carrier
Added small gateway and T1 trunk
Call to 7134 becomes sip:[email protected]
Ethernet phones, soft phones and conference
room
CINEMA set of servers, running on 1U
rackmount server
44
CINEMA components
[Figure: CINEMA architecture. Servers: sipd (proxy/redirect server), sipconf (conferencing server, MCU), rtspd (RTSP media server), sipum (unified messaging server), sipvxml (VoiceXML server), sip-h323 (SIP-H.323 converter), backed by a MySQL user database and an LDAP server. Other labels: sipc, PhoneJack interface, Cisco 7960, Pingtel, plug'n'sip, wireless 802.11b, Cisco 2600, Nortel Meridian PBX, T1 links, SIP, RTSP.]
45
Experiences

Need flexible name mapping
  [email protected] -> [email protected]
  sources: database, LDAP, sendmail aliases, …
Automatic import of user accounts:
  In university, thousands each September
  /etc/passwd
  LDAP, ActiveDirectory, …
  much easier than most closed PBXs
Integrate with Ethernet phone configuration
  often, bunch of tftp files
Integrate with RADIUS accounting
46
Experiences

Password integration difficult



Digest needs plain-text, not hashed
Different user classes: students, faculty,
admin, guests, …
Who pays if call is forwarded/proxied?


authentication and billing behavior of PBX
and SIP system may differ
but much better real-time rating
47
SIP doesn’t have to be in a
phone
48
Event notification


Missing new service in the Internet
Existing services:



get & put data, remote procedure call:
HTTP/SOAP (ftp)
asynchronous delivery with delayed pickup: SMTP (+ POP, IMAP)
Do not address asynchronous
(triggered) + immediate
49
Event notification

Very common:




operating systems (interrupts, signals,
event loop)
SNMP trap
some research prototypes (e.g., Siena)
attempted, but ugly:


periodic web-page reload
reverse HTTP
50
SIP event notification

Uses beyond SIP and IM/presence:


Alarms (“fire on Elm Street”)
Web page has changed




cooperative web browsing
state update without Java applets
Network management
Distributed games
51
Conclusion

Transition to VoIP will take much longer than anticipated -> replacement service




digital telephone took 20 years...
3G (UMTS R5) as driver?
combination with IM, presence, event
notification
Emphasis: protocols -> operational infrastructure



security
service creation
PSTN interworking
52
L3/L4 security options

IPsec





Provides keying mechanism
but IKE is complex and has interop problems
works for all transport protocols (TCP, SCTP, UDP, …)
no credential-fetching API
TLS



provides keying mechanism
good credential binding mechanism
no support for UDP; SCTP in progress
53
Hop-by-hop security: TLS


Server certificates well-established for
web servers
Per-user certificates less so


email return-address (class 1) certificate
not difficult (Thawte, Verisign)
Server can challenge client for certificate -> last-hop challenge
54
HTTP Digest authentication

Allows user-to-user (registrar)
authentication



mostly client-to-server
but also server-to-client (Authentication-Info)
Also, Proxy-Authenticate and Proxy-Authorization

May be stacked for multiple proxies on
path
55
HTTP Digest authentication
REGISTER
  To: sip:[email protected]
401 Unauthorized
  WWW-Authenticate: Digest realm="[email protected]", qop=auth, nonce="dcd9"
REGISTER
  To: sip:[email protected]
  Authorization: Digest username="alice", nc=00000001, cnonce="defg", response="9f01"
REGISTER
  To: sip:[email protected]
  Authorization: Digest username="alice", nc=00000002, cnonce="abcd", response="6629"
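The `response` value in the exchange above is an MD5 hash over the shared password, the server nonce, the nonce count, and the client nonce. A minimal sketch of the `qop=auth` calculation from RFC 2617 follows; the function names are hypothetical, and the sample values are the ones from the RFC's own HTTP example, not from this slide. For SIP, the method would be REGISTER and the URI the Request-URI.

```python
import hashlib


def md5_hex(text):
    """Hex MD5 of a string, as used throughout Digest authentication."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri,
                    nonce, nc, cnonce, qop="auth"):
    """Compute the Digest 'response' parameter for qop=auth."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")  # who is asking
    ha2 = md5_hex(f"{method}:{uri}")                 # what is being asked
    # The nonce count (nc) changes on every request, so each response differs.
    return md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")


# Values from the worked example in RFC 2617.
print(digest_response("Mufasa", "testrealm@host.com", "Circle Of Life",
                      "GET", "/dir/index.html",
                      "dcd98b7102dd2f0e8b11d0f600bfb0c093",
                      "00000001", "0a4f113b"))
```

Because the server can recompute the same hash from its copy of the credentials, the password itself never crosses the network.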
56
End-to-end authentication

What do we need to prove?




Person sending BYE is same as sending
INVITE
Person calling today is same as yesterday
Person is indeed "Alice Wonder, working for
Deutsche Bank"
Person is somebody with account at MCI
Worldcom
57
End-to-end authentication

Why end-to-end authentication?




prevent phone/IM spam
nuisance callers
trust: is this really somebody from my
company asking about the new widget?
Problem: generic identities are cheap

filtering [email protected] doesn't prevent calls from [email protected] (new day, same person)
58
End-to-end authentication and
confidentiality

Shared secrets



only scales (N^2) to very small groups
OpenPGP chain of trust
S/MIME-like encapsulation

CA-signed (Verisign, Thawte)



every end point needs to have list of CAs
need CRL checking
ssh-style
59
Ssh-style authentication


Self-signed (or unsigned) certificate
Allows active man-in-middle to replace
with own certificate


always need secure (against modification)
way to convey public key
However, safe once established
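The ssh-style approach amounts to remembering the first key seen for an identity and flagging any later change. A minimal sketch, assuming keys arrive as raw bytes and identities as strings; the class and method names are hypothetical.

```python
import hashlib


class KeyPinStore:
    """Remember the first public key seen per identity (trust on first use)."""

    def __init__(self):
        self._pins = {}

    def check(self, identity, public_key):
        """Return 'first-use', 'match', or 'mismatch' for the presented key."""
        fingerprint = hashlib.sha256(public_key).hexdigest()
        known = self._pins.get(identity)
        if known is None:
            # First contact: nothing to compare against, so an active
            # man-in-middle could substitute a key at this point.
            self._pins[identity] = fingerprint
            return "first-use"
        # Later contacts: any change of key is detected.
        return "match" if known == fingerprint else "mismatch"


store = KeyPinStore()
print(store.check("sip:alice", b"key-A"))  # first contact
print(store.check("sip:alice", b"key-A"))  # same key again
print(store.check("sip:alice", b"key-B"))  # key changed
```

What to do on a mismatch (reject, warn, ask the user) is a policy choice that the sketch leaves open.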
60
DOS attacks



CPU complexity: get SIP entity to
perform work
Memory exhaustion: SIP entity keeps
state (TCP SYN flood)
Amplification: single message triggers group of messages to target

even easier in SIP, since Via not subject to
address filtering
61
DOS attacks: amplification

Normal SIP UDP operation:
  one INVITE with fake Via
  retransmit 401/407 (to target) 8 times
Modified procedure:
  only send one 401/407 for each INVITE
Suggestion: have null authentication
  prevents amplification of other responses
  E.g., user "anonymous", password empty
62
DOS attacks: memory



SIP vulnerable if state kept after
INVITE
Same solution: challenge with 401
Server does not need to keep challenge
nonce, but needs to check nonce
freshness
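One way to check nonce freshness without storing each challenge is to put a timestamp in the nonce and protect it with a keyed hash that only the server can compute. This is a sketch of that general idea, not a format defined by SIP or HTTP Digest; the function names, the `timestamp:mac` layout, and the 60-second lifetime are assumptions.

```python
import hashlib
import hmac
import time


def make_nonce(secret, now=None):
    """Create a nonce that carries its own issue time, signed with the server secret."""
    ts = str(int(time.time() if now is None else now))
    mac = hmac.new(secret, ts.encode(), hashlib.sha256).hexdigest()
    return f"{ts}:{mac}"


def nonce_is_fresh(secret, nonce, max_age=60, now=None):
    """Verify the signature and the age of a returned nonce; no stored state needed."""
    now = time.time() if now is None else now
    ts, sep, mac = nonce.partition(":")
    if not sep or not ts.isdigit():
        return False
    expected = hmac.new(secret, ts.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(mac.encode(), expected.encode()):
        return False
    return 0 <= now - int(ts) <= max_age
```

The server's only persistent state is the secret, so a flood of unauthenticated INVITEs does not grow its memory use.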
63