TU/e
technische universiteit eindhoven
Web Server Programming
2. Building Applications
/ architecture of information systems
http://wwwis.win.tue.nl/
/ department of mathematics and computer science
webserver-side-specific development issues
Differences with “traditional” software applications:
• user interface is webpage based
• choice of programming languages, libraries and tools
• client/server “ping pong”
=> need for session management
• user identification
• access control, security; resource control
• issues with debugging and testing
the Web as a user interface layer
Web applications turn the web browser into a universal user interface, e.g. for:
• specific databases, low-level database management
• forums, wikis, blogs, mail servers, mailing lists, newsgroups
• other webservers (e.g. with rewriting proxies)
• train sets, weather stations; any application
If you develop a multiuser application, make it a server-side web application!
web-based user interface paradigm
• formatted text + images + hyperlinks + forms
(forms are special hyperlinks)
=>
URLs for input (+ some extras)
HTML for output (+ some extras)
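A minimal sketch of "URLs for input, HTML for output", in Python rather than the servlets/JSP used in the course. The function name `handle`, the parameter `name` and the example.org URL are assumptions made for illustration only.

```python
from html import escape
from urllib.parse import urlsplit, parse_qs

def handle(url):
    """Read the input from the URL's query string, answer with an HTML page."""
    query = parse_qs(urlsplit(url).query)
    # parse_qs maps each parameter to a list of values; take the first one.
    name = query.get("name", ["stranger"])[0]
    # User input is escaped before it is placed in the HTML output.
    return "<html><body><p>Hello, " + escape(name) + "!</p></body></html>"

page = handle("http://example.org/greet?name=Ada")
```

A form submitted with GET produces exactly this kind of URL, which is why forms can be seen as special hyperlinks.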
WWW = URLs + HTTP + HTML
[Figure: layered view of a web connection.
 web browser <-> web server: web protocol (URL, HTTP, HTML)
 TCP/IP (Internet) support on both sides: TCP/IP (Internet) protocols
 operating system on both sides: low-level protocol
 computer hardware on both sides: cables
 The web server draws on documents, scripts, a database, or another web server.]
issue: HTML != HTML
candidate document formats to send to the client:
• HTML (versions: 2.0 , 3.2 , 4.0 , 4.01, XHTML)
• CSS 1 , CSS 2
• JavaScript
• plugins: Java applets, Flash, Real/QuickTime/WMP
• non-embeddable: any other file format
issue: HTML != HTML (2)
all web browsers have bugs, quirks, missing bits
and their own special features in their support of
• HTML
• CSS
• JavaScript
• various image formats
webserver-side-specific development issues
Differences with “traditional” software applications:
✓ user interface is webpage based
• choice of programming languages, libraries and tools
• client/server “ping pong”
=> need for session management
• user identification
• access control, security; resource control
• issues with debugging and testing
programming languages and tools
• which programming languages to use?
  • interpreted vs. compiled
  • portability
  • available libraries
  • available frameworks
  • prototype applications
• use HTML templates with “holes” filled in by code? (e.g. JSP)
• program / server interface?
  • CGI
  • built into server (e.g. Java servlets)
programming languages and tools
PHP Perl Java ASP ASP.NET ColdFusion Python Ruby Lua
XML XSLT SOAP RDF J2EE Typo3 Zope phpNuke
Mediawiki JBoss etc. etc. etc. etc. etc. etc.
• in this course: Java servlets, JSP
• all the rest: not in this course, sorry
• use the Web to look things up: for software, paper is obsolete
webserver-side-specific development issues
Differences with “traditional” software applications:
✓ user interface is webpage based
✓ choice of programming languages, libraries and tools
• client/server “ping pong”
=> need for session management
• user identification
• access control, security; resource control
• issues with debugging and testing
session management
webserver applications are inherently multi-user!
HTTP is stateless (requests are separate)
=>
sessions are not automatic
example: tic-tac-toe
multiuser tic-tac-toe
[Figure: players “rp” and “pl” interact with the tic-tac-toe script on the web server, which keeps auxiliary files. Arrows: rp's move, rp's new position, pl's move, pl's new position.]
session management: techniques
• stateless “session”:
server doesn’t remember any state but puts it
in HTTP request/response (see tic-tac-toe)
• session: server remembers state, only puts pointer
in HTTP request/response
• client may also remember state (e.g. cookies)
• client and server can store state across sessions
session management and HTTP
options to communicate state info
(the full state, or a pointer to it) with HTTP:
• encode state (or pointer to it) in URLs
• use extra POST info (when URL grows too big)
• use cookies (e.g. to persist across sessions)
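The difference between sending the full state and sending a pointer to it can be sketched as follows. This is an illustration in Python, not course material; `BASE`, the parameter names `board`, `turn` and `sid`, and the in-memory `sessions` dictionary are all assumptions.

```python
import secrets
from urllib.parse import urlencode, urlsplit, parse_qs

BASE = "http://example.org/game"  # hypothetical application URL
sessions = {}                     # server-side store: session id -> state

def url_with_full_state(board, turn):
    """Stateless variant: the whole state travels in the URL."""
    return BASE + "?" + urlencode({"board": board, "turn": turn})

def url_with_pointer(board, turn):
    """Session variant: the server keeps the state, the URL carries an id."""
    sid = secrets.token_urlsafe(16)  # random, so the pointer is hard to guess
    sessions[sid] = {"board": board, "turn": turn}
    return BASE + "?" + urlencode({"sid": sid})

def state_from_url(url):
    """Recover the state from either kind of URL."""
    query = {k: v[0] for k, v in parse_qs(urlsplit(url).query).items()}
    if "sid" in query:
        return sessions[query["sid"]]
    return {"board": query["board"], "turn": query["turn"]}
```

The same two choices apply to POST data and cookies: only the transport changes, not the question of where the state lives.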
designing session management with URLs
• a user action activates a URL
• the application generates pages with URLs to itself
1. determine the application state machine: its states and the transitions between them
2. define a mapping: URLs -> transitions
3. make the hyperlinks in the output page implement this mapping
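The three steps can be sketched for the tic-tac-toe example. This is a hedged illustration in Python: the board encoding (nine characters, "-" for an empty square), the path `/game` and all function names are assumptions, and win detection is left out.

```python
from html import escape
from urllib.parse import urlencode

EMPTY = "-"

def moves(board):
    """Step 1: the transitions available in this state, one per empty square."""
    return [i for i, cell in enumerate(board) if cell == EMPTY]

def apply_move(board, turn, square):
    """Take one transition: place the mark and pass the turn."""
    if board[square] != EMPTY:
        raise ValueError("square already taken")
    new_board = board[:square] + turn + board[square + 1:]
    return new_board, ("O" if turn == "X" else "X")

def links(board, turn):
    """Steps 2 and 3: each hyperlink encodes the state reached by one move."""
    result = []
    for square in moves(board):
        next_board, next_turn = apply_move(board, turn, square)
        query = urlencode({"board": next_board, "turn": next_turn})
        href = escape("/game?" + query)  # "&" must become "&amp;" in HTML
        result.append('<a href="%s">play square %d</a>' % (href, square))
    return result
```

Because every link carries the complete next state, the server needs no memory between requests. The price is that a user can edit the URL, so a real application would validate the incoming state.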
session management with cookies
• a request arrives that isn’t part of a session yet
• the server generates a “cookie” with state info
• the server sends it with the document (in an HTTP response header)
• the client remembers it (in memory / on disk)
• the client sends it along with subsequent requests (in an HTTP request header)
• the server identifies the client by the cookie
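The exchange above can be sketched with Python's standard `http.cookies` module. In this illustration the cookie holds only a pointer (a session id named `sid`, an assumed name) and the state itself, a visit counter, stays in a server-side dictionary; both choices are assumptions, not the course's implementation.

```python
import secrets
from http.cookies import SimpleCookie

sessions = {}  # session id -> per-client state

def respond(cookie_header):
    """Handle one request; return (extra response headers, session state)."""
    jar = SimpleCookie(cookie_header)   # parse the request's Cookie header
    morsel = jar.get("sid")
    headers = []
    if morsel is not None and morsel.value in sessions:
        sid = morsel.value              # known cookie: the client is identified
    else:
        # Not part of a session yet: create one and hand out a cookie.
        sid = secrets.token_urlsafe(16)
        sessions[sid] = {"visits": 0}
        out = SimpleCookie()
        out["sid"] = sid
        out["sid"]["httponly"] = True   # keep the cookie away from page scripts
        headers.append(("Set-Cookie", out["sid"].OutputString()))
    state = sessions[sid]
    state["visits"] += 1
    return headers, state
```

A first call such as `respond("")` yields a `Set-Cookie` header; passing the `sid=...` part of that header back in as the next request's cookie returns the same state with no new cookie.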
multiuser tic-tac-toe
[Figure: players “rp” and “pl” interact with the tic-tac-toe script on the web server, which keeps auxiliary files. Arrows: rp's move, rp's new position, pl's move, pl's new position.]
URLs vs. cookies
• URLs must be short,
cookies can be somewhat longer
• URLs are public (can be guessed)
• URLs are always supported by the client
• cookies can be unsupported
or refused by the client
• cookies can be saved to disk
webserver-side-specific development issues
Differences with “traditional” software applications:
✓ user interface is webpage based
✓ choice of programming languages, libraries and tools
✓ client/server “ping pong”
=> need for session management
• user identification
• access control, security; resource control
• issues with debugging and testing
user identification & access control
(security): making sure someone/something can do exactly the things you want them to do
issues for web application development:
• server-side authorization of client-side users
• client-side authorization of server-side providers
• securing the connection
• separating multiple users/services on the server
• separating multiple users on the client
• copyright
multiuser tic-tac-toe
[Figure: players “rp” and “pl” interact with the tic-tac-toe script on the web server, which keeps auxiliary files. Arrows: rp's move, rp's new position, pl's move, pl's new position.]
server-side authorization of client-side users
A service provider wants to identify a user. Why?
• personalization: remembering user preferences
• assigning legal responsibilities to the user
  • purchase (Amazon)
  • liability for use of services (e.g. in a chatbox)
server-side authorization of client-side users
A service provider wants to identify a user. How?
• by Internet host (unreliable)
• by something the user told you (not 100% reliable)
e.g. username/password
• by external identification (not 100% reliable)
e.g. credit card number; chip card
server-side authorization of client-side users
How does username/password identification work?
• “new user” screen: user selects a name and possibly a password
• if the user enters into a contract/obligation, demand external identification data (e.g. name, home address, e-mail address)
• use it to send them some unguessable data (usually by sending e-mail with a special sign-on URL)
• “confirm registration” screen: user enters the unguessable data
• store the user registration data on the server side
• “log in” screen: user enters data – server matches the user and starts a session
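The registration flow above can be sketched with the Python standard library. Storing a salted password hash instead of the password itself is an assumption added here (the slides do not say how the data is stored), and the function names and in-memory dictionaries are hypothetical.

```python
import hashlib
import hmac
import secrets

pending = {}  # confirmation token -> (username, salt, password hash)
users = {}    # confirmed users: username -> (salt, password hash)

def _hash(password, salt):
    # Assumption: keep a salted PBKDF2 hash, never the plain password.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100000)

def register(username, password):
    """'New user' screen: returns the unguessable token to send by e-mail."""
    salt = secrets.token_bytes(16)
    token = secrets.token_urlsafe(32)
    pending[token] = (username, salt, _hash(password, salt))
    return token

def confirm(token):
    """'Confirm registration' screen: a valid token activates the account."""
    entry = pending.pop(token, None)
    if entry is None:
        return False
    username, salt, digest = entry
    users[username] = (salt, digest)
    return True

def log_in(username, password):
    """'Log in' screen: match the entered data against the stored data."""
    if username not in users:
        return False
    salt, digest = users[username]
    return hmac.compare_digest(digest, _hash(password, salt))
```

The token would normally be embedded in the sign-on URL mailed to the user, so that following the link calls `confirm`. A successful `log_in` is the point at which a session would be started.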
server-side authorization of client-side users
Username/password transmission technology:
• put user+password in URLs (very unsafe)
• web passwords (“basic authentication”)
(unsafe connection: eavesdropping on the network)
• HTTPS / SSL with password
(safe connection; public key cryptography)
• SSL without password (third party authority)
(e.g. “het elektronische paspoort”, the Dutch electronic passport)
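Why basic authentication is unsafe on a plain connection can be shown in a few lines: the credentials are only base64-encoded, not encrypted. The sketch below is an illustration in Python; the function names are hypothetical, and the user name and password are the sample values from the HTTP Basic authentication specification.

```python
import base64

def basic_auth_header(username, password):
    """Value of the Authorization header a browser sends for basic auth."""
    raw = (username + ":" + password).encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")

def eavesdrop(header_value):
    """What anyone who can read the network traffic is able to recover."""
    scheme, _, token = header_value.partition(" ")
    decoded = base64.b64decode(token).decode("utf-8")
    username, _, password = decoded.partition(":")
    return username, password

header = basic_auth_header("Aladdin", "open sesame")
```

Over HTTPS the same header is sent, but inside the encrypted connection, which is what makes the third option in the list safe against network eavesdropping.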
server-side authorization of client-side users
A service provider wants to identify a user. How?
How to use external verification?
• exchange external identification data on the web (e.g.
web-based payment) – data may be compromised so this
requires trust from both parties
• exchange data through another means (e.g. don’t offer
web-based payment) – cumbersome; requires trust in
alternative communication media
user identification & access control
(security): making sure someone/something can do exactly the things you want them to do
issues for web application development:
✓ server-side authorization of client-side users
• client-side authorization of server-side providers
• securing the connection
• separating multiple users/services on the server
• separating multiple users on the client
• copyright
client-side authorization of server-side providers
A user wants to identify a service provider. How?
• provide external identification data (name;
address; phone; etc.)
• SSL without password (electronic identification
certificates via third-party authority)
securing the connection
The connection can be protected against compromise (e.g. with SSL).
The server cannot prevent the client from being compromised.
The client cannot prevent the server from being compromised.
You can have a safe pipe but the ends may leak.
separating multiple users/services on the client
Two users can use the same computer.
They have access to each other’s data.
e.g. TU/e electronic voting blooper
The server end cannot prevent this!
The user is responsible. The user must know this.
separating multiple users/services on the server
Two services can use the same webserver.
They must not have access to each other’s private
data.
-> security restrictions on server-side programs
The webserver provider is responsible. The service
provider / programmer must accept this.
copyright - “users steal my work!”
• page content (once they have it they can spread it)
• software code (one reason for keeping it server-side)
Everything you put on the web is
• copyrighted
• easy for the user to steal
What to do:
• don’t place value on your copyrights / waive them, or
• don’t publish anything you do value, or
• put it on a secure connection and sue the abuser (hard)
user identification & access control
(security): making sure someone/something can exactly do the
things you want them to do
issues for web application development:
 server-side authorization of client-side users
 client-side authorization of server-side providers
 securing the connection
 separating multiple users/services on the server
 separating multiple users on the client
 copyright
webserver-side-specific development issues
Differences with “traditional” software applications:
✓ user interface is webpage based
✓ choice of programming languages, libraries and tools
✓ client/server “ping pong”
=> need for session management
✓ user identification
• access control, security; resource control
• issues with debugging and testing
resource control
overflow of resources
(CPU; memory; hard disk; network bandwidth)
• set predefined limits on resource use (if the software supports it)
• automatic cleanup
• manual checking / monitoring
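Applied to session storage, the three measures could look like the sketch below: a predefined limit on the number of sessions, automatic cleanup of idle ones, and a size that can be monitored. This is an illustration in Python; the class name, the default limits and the injectable `clock` parameter are assumptions.

```python
import time

class BoundedSessionStore:
    """Session store with a size limit and automatic cleanup of idle sessions."""

    def __init__(self, max_sessions=1000, max_age=1800.0, clock=time.monotonic):
        self.max_sessions = max_sessions  # predefined limit on memory use
        self.max_age = max_age            # seconds a session may stay idle
        self.clock = clock
        self._data = {}                   # session id -> (last used, state)

    def cleanup(self):
        """Automatic cleanup: drop sessions idle for longer than max_age."""
        now = self.clock()
        stale = [sid for sid, (used, _) in self._data.items()
                 if now - used > self.max_age]
        for sid in stale:
            del self._data[sid]
        return len(stale)

    def put(self, sid, state):
        """Store a session; return False if the limit has been reached."""
        self.cleanup()
        if sid not in self._data and len(self._data) >= self.max_sessions:
            return False
        self._data[sid] = (self.clock(), state)
        return True

    def __len__(self):
        """Number of live sessions, for manual checking / monitoring."""
        return len(self._data)
```

Refusing new sessions at the limit is only one possible policy; evicting the oldest session instead would be an equally reasonable choice.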
debugging and testing
Options for testing / debugging:
• run program outside webserver environment
• use webserver facilities for debugging
• some debuggers have “remote debugging”
(Visual Studio .NET, Komodo)
• set up a test webserver
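The first option, running the program outside the webserver environment, can be sketched in Python with a WSGI application that is called directly as a function. The application `app` and the helper `call` are hypothetical stand-ins; `setup_testing_defaults` is the standard-library helper that fills in the environment variables a server would normally provide.

```python
from wsgiref.util import setup_testing_defaults

def app(environ, start_response):
    """A tiny WSGI application standing in for the real program."""
    body = ("path=" + environ["PATH_INFO"]).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8"),
                              ("Content-Length", str(len(body)))])
    return [body]

def call(application, path):
    """Run the application in-process, with no web server involved."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # adds REQUEST_METHOD, SERVER_NAME, ...
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(application(environ, start_response))
    return captured["status"], body

result = call(app, "/board")
```

For the last option, the same `app` can be served locally with `wsgiref.simple_server.make_server("localhost", 8000, app).serve_forever()`, which gives a throwaway test webserver.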
webserver-side-specific development issues
Differences with “traditional” software applications:
✓ user interface is webpage based
✓ choice of programming languages, libraries and tools
✓ client/server “ping pong”
=> need for session management
✓ user identification
✓ access control, security; resource control
✓ issues with debugging and testing