Building High Throughput, Multithreaded Servers in C#/.NET
David Buksbaum
Who Am I?
Wall Street Wage Slave
Developer – I still write code
Tech-Lead – My bugs are harder to find
Pointy Haired Manager – I live by the KISS principle
What have I built?
• User Time to Near Real Time Systems
• User Time: event time is based on user interaction
• Near Real Time: event timing is as fast as possible
without guarantee
• Server to Server
• Client - Server
• Small Scale: 1-25 concurrent users
• Large Scale: 250+ concurrent users
• Distributed Systems (now called SOA)
C# Crash Course
Application Servers
Connection Stack
Further Concepts
C# Crash Course
• [external]
Application Servers
• What I mean by App Servers
• A stateful server to service client requests for high-level business functions in near user time
• RRE Messaging Model (request, response, event)
• Middle Tier Intra-Net Systems
• May contain 1 or more logical servers
• What I do not mean by App Servers
• Not the back-end infrastructure systems
• Broadcast Servers (pricing, curve servers, etc…)
• Bus Servers (messaging platforms like Jabber)
• Database Servers
Concepts
• Endian-ness
• Does not matter unless your network is exactly ½ little-endian and ½ big-endian
• Pick the model used more and make the other convert
• Only the smaller group will have the performance hit
• Most clients to your servers will be little-endian
• Many middle-tier systems are Intel based
• Solaris is being moved further back or out due to costs
• Typical configurations will be WinTel clients with either
WinTel or LinTel servers
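A minimal sketch of "pick the model used more and make the other convert", assuming little-endian is chosen as the wire order. `WireInt32` is a hypothetical helper name, not something from the talk; `BinaryPrimitives` lives in `System.Buffers.Binary` (.NET Core 2.1 and later).

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical helper: the wire format is fixed as little-endian, so
// little-endian hosts pay nothing and only big-endian hosts swap bytes.
public static class WireInt32
{
    public static byte[] Encode(int value)
    {
        byte[] buffer = new byte[4];
        BinaryPrimitives.WriteInt32LittleEndian(buffer, value);
        return buffer;
    }

    public static int Decode(byte[] buffer)
    {
        return BinaryPrimitives.ReadInt32LittleEndian(buffer);
    }
}
```

Because the byte order is stated explicitly, the same code gives the same bytes on any host, whatever `BitConverter.IsLittleEndian` reports.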
Concepts – cont.
• Threading
• Your server functionality will determine threading or not
• Guideline: If you use threads on WinTel, then thread count target
should be (2 * CPUs) + 1
• Explained in detail by Jeffrey Richter in "Programming Server Side
Applications for Windows 2000"
• Exceptions: Lots of I/O bound threads for example, so use your
• Burst servers should be guided towards single threaded per
connection (min concurrency)
• Servers with minimal state, fast response time, and no persistent
• HTTP 1.0 defaulted to a burst server (no keepalives)
• What is a burst server? echo, time, etc…
• Pattern: UDP or short lived TCP
Concepts – cont.
• Threading – cont.
• Stateful servers need threads – so use them
• State operations are slower than network
operations, so minimize it in the network thread
• If you use threads, use thread pools
• Creating threads is expensive, so don’t do it
• .NET has a built in thread pool
• System.Threading.ThreadPool
• ThreadPool.QueueUserWorkItem(new
WaitCallback(myThreadFunc), myContextObject);
• In large systems, use multiple specialized pools
• Different Priorities, Order Guarantees, etc…
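A small sketch of the `ThreadPool.QueueUserWorkItem` call shown above, with the context object passed to each work item. `PoolDemo` and `RunWorkItems` are hypothetical names used only for illustration, and the `CountdownEvent` is one possible way to wait for completion.

```csharp
using System;
using System.Threading;

public static class PoolDemo
{
    // Queues `count` work items on the built-in pool and waits for all of them.
    // Returns the sum of the context values the work items received.
    public static int RunWorkItems(int count)
    {
        int sum = 0;
        using (var done = new CountdownEvent(count))
        {
            for (int i = 1; i <= count; i++)
            {
                // The second argument is the context object handed to the callback.
                ThreadPool.QueueUserWorkItem(new WaitCallback(state =>
                {
                    Interlocked.Add(ref sum, (int)state);
                    done.Signal();
                }), i);
            }
            done.Wait();
        }
        return sum;
    }
}
```

No thread is created by this code; the pool decides how many threads run the callbacks.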
Concepts – cont.
• Queues
• Rule 1: If you use threads – you use queues
• Queues break performance coupling between two
independent tasks
• Task 1: Read / Write Data From / To Network
• Task 2: Do Something with the Data
• Queues allow for complex tasks to be sub-divided into
simpler sub-tasks
• These sub-tasks can be worked on concurrently (parallelism or grids)
or synchronously
• Reusability of sub-tasks between different parent tasks
• eg: fetch from database, calculate distance between two points on a
map, execute a trade
• Simpler code is easier to build, debug, & maintain
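The two tasks above can be sketched with a queue between them. This is a minimal illustration using `BlockingCollection<T>`; `QueueDemo`, the bounded capacity of 100, and the upper-casing "work" are stand-ins chosen for the example.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

public static class QueueDemo
{
    // Task 1 (the "network" side) only enqueues; Task 2 (the worker) does the slow part.
    public static List<string> Process(IEnumerable<string> incoming)
    {
        var results = new List<string>();
        using (var queue = new BlockingCollection<string>(100))
        {
            var worker = new Thread(() =>
            {
                // Blocks until an item arrives; ends once CompleteAdding has been
                // called and the queue has drained.
                foreach (string message in queue.GetConsumingEnumerable())
                    results.Add(message.ToUpperInvariant()); // stand-in for real work
            });
            worker.Start();

            foreach (string message in incoming)
                queue.Add(message);   // hand off and go straight back to the network
            queue.CompleteAdding();

            worker.Join();            // results is read only after the worker has finished
        }
        return results;
    }
}
```

The producer never waits for the work itself, only for room in the queue, which is the decoupling Rule 1 is after.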
Concepts – cont.
• Synchronicity
• Blocking Synchronous I/O
• Read or Write w/o timeouts
• Select w/o timeouts
• Non-Blocking Synchronous I/O
• Read or Write w/ timeouts
• Select w/ timeouts
• Polling
• Asynchronous I/O
• BeginReceive or BeginSend
• I/O Completion Ports & Overlapped I/O
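A sketch of the "select w/ timeouts" style of non-blocking synchronous I/O, assuming a UDP socket. `TimedReceive.TryReceive` is a hypothetical helper, and the 1024-byte buffer is an arbitrary choice.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Text;

public static class TimedReceive
{
    // Waits up to timeoutMs for a datagram; returns null on timeout
    // instead of blocking forever.
    public static string TryReceive(Socket socket, int timeoutMs)
    {
        // Poll takes microseconds; SelectRead is true when data is ready to read.
        if (!socket.Poll(timeoutMs * 1000, SelectMode.SelectRead))
            return null;

        byte[] buffer = new byte[1024];
        EndPoint from = new IPEndPoint(IPAddress.Any, 0);
        int length = socket.ReceiveFrom(buffer, ref from);
        return Encoding.ASCII.GetString(buffer, 0, length);
    }
}
```

The calling thread gets control back after the timeout, so it can check a shutdown flag or service another connection before polling again.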
Concepts – cont.
• Synchronicity – cont.
• Which do you use?
• Rule 2: Never, Ever, Use Blocking I/O
• Server Threading model defines synchronicity
• Thread per connection → Non-Blocking Synchronous I/O
• Thread pools → Asynchronous I/O
• Clients can usually use Non-Blocking Synchronous
I/O in a dedicated thread
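For the thread pool case, a minimal sketch of asynchronous I/O with `BeginReceive` on a `UdpClient`. `AsyncReceive.Start` and the `onMessage` callback are hypothetical names; a real server would also handle exceptions from `EndReceive`.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Text;

public static class AsyncReceive
{
    // Starts one asynchronous receive and returns immediately;
    // onMessage runs on a pool thread when a datagram arrives.
    public static void Start(UdpClient server, Action<string> onMessage)
    {
        server.BeginReceive(result =>
        {
            IPEndPoint from = new IPEndPoint(IPAddress.Any, 0);
            byte[] data = server.EndReceive(result, ref from);
            onMessage(Encoding.ASCII.GetString(data));
        }, null);
    }
}
```

This handles a single datagram. To keep listening, the callback would call `Start` again before handing the message to a queue.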
Concepts – cont.
• State
• Definition
• Data and Methods encapsulating a logic block related
• An App Server has different types of state
• Server State
• State managed on a per logical server basis
• Used anywhere the server needs to administer itself
• Usually the container of all other state in the server
• Connection State
• State managed on a per user connection basis
• Used only in a context where the connection needs to be known
• Reading and Writing to the network
• Verifying credentials
Concepts – cont.
• State – cont.
• Business State
• Contains all the cached data used to support the business
• Provides access to all non-cached data transparently
• eg: BusinessState.GetAllAccounts() may go to the cache or some
external store. Follow good OO procedures here.
• State Rules
• Minimize surface area (fewer public methods), but no side effects
• Everything public MUST be reentrant or thread-safe
• Encapsulate – no public variables
• Design for threading
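One way the `BusinessState.GetAllAccounts()` example and the state rules could look in code. This is a sketch under assumptions: accounts are plain strings, and the external store is represented by a hypothetical `loadFromStore` delegate.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical BusinessState: callers never learn whether data came from the cache.
public sealed class BusinessState
{
    private readonly object _sync = new object();
    private readonly Func<List<string>> _loadFromStore; // stand-in for the external store
    private List<string> _accounts;                     // cached copy, null until first use

    public BusinessState(Func<List<string>> loadFromStore)
    {
        _loadFromStore = loadFromStore;
    }

    // The only public member: thread-safe, and it returns a copy so no caller
    // can modify the cached list outside the lock.
    public List<string> GetAllAccounts()
    {
        lock (_sync)
        {
            if (_accounts == null)
                _accounts = _loadFromStore();
            return new List<string>(_accounts);
        }
    }
}
```

The surface area is one method, there are no public fields, and the lock makes the method safe to call from any worker thread.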
Concepts – cont.
• State – cont.
• Rule 3: State will dominate server design &
• Questions to ask when defining state
• Who needs access to it?
• When do they need access to it?
• Special Cases – frequent & pervasive
• Logon, Logoff
• Credentials Check
Concepts – cont.
• State – cont.
• Techniques
• Synchronized Collections
• Equal punishment to readers & writers
• Reader / Writer Locks
• Favors reads over writes
• External Data Store
• Push the locking burden on an expert
• Embedded
• All state is passed between the client and server in every message
• eg: URL’s, Cookies, etc…
• Tokens
• A token for looking up the state is passed between the client & server in every message
• Usually used to add state to a stateless design (eg: HTTP)
• Hybrids
• Combinations of the above
• eg: Tokens, Embedded, & External are used by many web servers
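As an illustration of the reader / writer lock technique, here is a small cache guarded by `ReaderWriterLockSlim`. `PriceCache` and its members are hypothetical names chosen for the example.

```csharp
using System.Collections.Generic;
using System.Threading;

// Hypothetical price cache guarded by a reader/writer lock.
public sealed class PriceCache
{
    private readonly ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();
    private readonly Dictionary<string, decimal> _prices = new Dictionary<string, decimal>();

    public bool TryGet(string symbol, out decimal price)
    {
        _lock.EnterReadLock();          // many readers may hold this at once
        try { return _prices.TryGetValue(symbol, out price); }
        finally { _lock.ExitReadLock(); }
    }

    public void Set(string symbol, decimal price)
    {
        _lock.EnterWriteLock();         // exclusive: waits for current readers to leave
        try { _prices[symbol] = price; }
        finally { _lock.ExitWriteLock(); }
    }
}
```

With a synchronized collection, every `TryGet` would queue behind every other caller; here only `Set` is exclusive.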
Concepts – cont.
• Performance
• User Time vs. Actual Time
• User Time is the time between the initiation of a process
by the user on the client and when the result of that action
is visible to the user
• Actual Time is the actual time it takes to perform the
given action on the server from receipt of a request to the
sending of a response
• Throughput of the server is often defined in terms of
actual time (messages per second)
• Rule 4a: The user does not care about Throughput
• Rule 4b: User Time is all that matters in interactive systems
Concepts – cont.
• Performance – cont.
• Users vs. Actions
How many concurrent users?
How many concurrent actions?
Similar to Time vs. Memory tradeoff
eg: 1 user with 100 actions or 100 users with 1 action
Concepts – cont.
• Case Example: Massively Multiplayer Online Role-Playing Games (MMORPG) – Everquest 2
• 2500 concurrent users per server
• Each user requires significant state management
Location in the game world
Character statistics
Relationships to other characters, non-player characters,
• State is a hybrid combining tokens & external data store
• User time is between 11 & 140 ms, with the bulk of the users
being at around 100 ms – derived from frame rates
• Actual time peak is 25000 / second (2500 * 10)
• Assumes all 2500 are not just concurrent, but active
Architecture @ 50,000 Feet
[Diagram labels: User Time, Initial Response Time, Actual Time]
• P1, P2 and P3 represent the areas that have the possibility for
high contention due to locking or congestion
• A fast initial response time, followed by incremental updates
(streaming or events), can give the illusion of faster user time
Protocol Stack @ 50,000 Feet
• Wire
• TCP, UDP, or other
network protocol
• Transport
• Envelope to encapsulate
the application message
• Fields:
• Length, Checksum, Flags,
• Application
• Message ID
• Message Data
Wire Protocols
• UDP
• Not reliable
• Small blocks of data
• Connectionless
• TCP
• Reasonably reliable
• Any reasonable amount of data
• No information about content
Wire Protocols – cont.
• Message Queues
• Queue to Queue communications
• Each end point has a queue
• Messages are moved from one queue to another
• A centralized queue can have 1 or more listeners in non-exclusive mode
• A centralized queue in exclusive mode can have auto-failover
• Reliability level is controllable
• Products
• JMS – Java Message Service
• .NET bindings exist
• Derivative works like Tibco’s EMS are very performant versions
• Deployable on top of most network infrastructures
• MSMQ – Microsoft Message Queuing
• MS has .NET bindings in the System.Messaging namespace
• Not available on all Windows platforms by default
• Many organizations have restricted this product due to tight integration with the
Windows platform
Wire Protocols – cont.
• Message Queues – cont.
• Alternative Systems
• Book: Message Passing Server Internals by Bill Blunden
• Contains the design and implementation of Bluebox; an
enterprise ready messaging queuing system in Java
• Tibco Rendezvous
• Basically a serialized hash table (key, item)
• Jabber
• Open XML-based protocol (now XMPP) originally geared for Instant Messaging,
but now supporting robust bus based communications
• Still XML based, so not a great fit for high performance
Wire Protocols – cont.
• Why not UDP?
• Packet size limitations
• Connectionless
• Reliability
• Why not TCP?
• Reasonable reliability is not always good enough
• Performance in broadcast scenarios (1 to many)
Wire Protocols – cont.
• Why messaging systems?
• Reliable & recoverable queues are great for when
the message absolutely must get to its destination
• Trading systems
• Banking systems
• Transactional
• Queues can be used as the backbone of a system that
supports distributed functionality with distributed
transactions (XA)
• Your app server can participate in a transaction that spans
all systems between the client and the database
Wire Protocols – cont.
• Why messaging systems? – cont.
• Support tools
• Many messaging systems provide rich tools for sniffing
and/or capturing the messages
• Captured messages can be used for playback
• Robust infrastructure
• Messaging systems have been built by networking
experts to accommodate a number of different
infrastructures, topologies, and geographical
• You have someone to scream at when it goes wrong
• Rule 5: If you can, use a commercial transport
Transport Protocol
• Why do you need a protocol between the
application and the wire?
• Wire Insulation: The wire may change
• TCP → Message Queues
• The application protocol may not play nicely into a
wire protocol
• MB sized files
• Streaming data
• Variable length records
Transport Protocol – cont.
• Remember our goal at the wire?
• Get the data off the wire and onto the queue as fast
as possible
• We need information as we get the data off the
wire to speed up processing
• Length of the total message
• Message data
• Smallest transport protocol (4 bytes)
• 4 byte length | n byte data block
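The "4 byte length | n byte data block" layout can be sketched over any `Stream`. This is an illustration under assumptions: the length is encoded little-endian, and `Framing`, `WriteFrame`, and `ReadFrame` are hypothetical names.

```csharp
using System;
using System.Buffers.Binary;
using System.IO;

public static class Framing
{
    // Writes a 4-byte little-endian length followed by the data block.
    public static void WriteFrame(Stream stream, byte[] data)
    {
        byte[] header = new byte[4];
        BinaryPrimitives.WriteInt32LittleEndian(header, data.Length);
        stream.Write(header, 0, header.Length);
        stream.Write(data, 0, data.Length);
    }

    // Reads one frame; returns null if the stream ends cleanly before a new frame.
    public static byte[] ReadFrame(Stream stream)
    {
        byte[] header = new byte[4];
        if (!ReadFully(stream, header)) return null;
        int length = BinaryPrimitives.ReadInt32LittleEndian(header);
        if (length < 0) throw new InvalidDataException("negative frame length");
        byte[] data = new byte[length];
        if (!ReadFully(stream, data)) throw new EndOfStreamException("truncated frame");
        return data;
    }

    // Stream.Read may return fewer bytes than asked for, so loop until the buffer is full.
    // Returns false only when the stream ends before the first byte of the buffer.
    private static bool ReadFully(Stream stream, byte[] buffer)
    {
        int offset = 0;
        while (offset < buffer.Length)
        {
            int read = stream.Read(buffer, offset, buffer.Length - offset);
            if (read == 0)
            {
                if (offset == 0) return false;
                throw new EndOfStreamException("stream ended mid-block");
            }
            offset += read;
        }
        return true;
    }
}
```

Knowing the length up front lets the network thread allocate once, read the block, and push it onto a queue without inspecting the contents. A production reader would probably also cap the accepted length.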
Transport Protocol – cont.
• What else do we need?
• We have:
• Length
• Data
• We need (loosely used):
• Message type information
• Message ID
• Sequencing information
• Beginning
• Middle
• End
• Flags or Options
Transport Protocol – cont.
• DIME Lite
• Protocol created by Microsoft to allow for easier
transmission of binary data over SOAP
• Can be used independently of SOAP as a unidirectional
transport block
Transport Protocol – cont.
Version (5 bit)
Specifies the version of the DIME message
MB (1 bit)
Specifies that this record is the first record of the message
ME (1 bit)
Specifies that this record is the last record of the message
CF (1 bit)
Specifies that the contents of the message have been chunked
TYPE_T (4 bit)
Specifies the structure and format of the TYPE field
RESERVED (4 bit)
Reserved for future use
OPTIONS_LENGTH (16 bit)
Specifies the length (in bytes) of the OPTIONS field,
excluding any necessary padding (up to 3 bytes)
ID_LENGTH (16 bit)
Specifies the length (in bytes) of the ID field, excluding any
necessary padding (up to 3 bytes)
TYPE_LENGTH (16 bit)
Specifies the length (in bytes) of the TYPE field, excluding
any necessary padding (up to 3 bytes)
DATA_LENGTH (32 bit)
Specifies the length (in bytes) of the DATA field, excluding
any necessary padding (up to 3 bytes)
OPTIONS
Contains any optional information used by a DIME parser
ID
Contains a URI for uniquely identifying a DIME payload with
any additional padding; the length of this field is specified by
ID_LENGTH
TYPE
Specifies the encoding for the record based on a type
reference URI or a MIME media-type; reference type is
specified by TYPE_T, and the length of this field is specified
by TYPE_LENGTH
DATA
Contains the actual data payload for the record; format of the
data depends on the TYPE specified for the record; length of
this field is specified by DATA_LENGTH
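A sketch of parsing the fixed part of such a record header. It assumes the fields are packed in the order listed, most significant bit first, with multi-byte lengths in big-endian (network) order, which gives a 12-byte fixed header. `DimeHeader` is a hypothetical type written for this illustration, not a library class.

```csharp
using System;
using System.Buffers.Binary;

public sealed class DimeHeader
{
    public int Version;        // 5 bits
    public bool MessageBegin;  // MB
    public bool MessageEnd;    // ME
    public bool Chunked;       // CF
    public int TypeT;          // 4 bits
    public int OptionsLength;  // 16 bits
    public int IdLength;       // 16 bits
    public int TypeLength;     // 16 bits
    public uint DataLength;    // 32 bits

    // Parses the first 12 bytes of a record (assumed layout, see the note above).
    public static DimeHeader Parse(byte[] b)
    {
        if (b.Length < 12) throw new ArgumentException("need 12 header bytes");
        return new DimeHeader
        {
            Version       = b[0] >> 3,
            MessageBegin  = (b[0] & 0x04) != 0,
            MessageEnd    = (b[0] & 0x02) != 0,
            Chunked       = (b[0] & 0x01) != 0,
            TypeT         = b[1] >> 4,   // low 4 bits of b[1] are RESERVED
            OptionsLength = BinaryPrimitives.ReadUInt16BigEndian(b.AsSpan(2)),
            IdLength      = BinaryPrimitives.ReadUInt16BigEndian(b.AsSpan(4)),
            TypeLength    = BinaryPrimitives.ReadUInt16BigEndian(b.AsSpan(6)),
            DataLength    = BinaryPrimitives.ReadUInt32BigEndian(b.AsSpan(8)),
        };
    }
}
```

After the fixed header, a reader would consume the OPTIONS, ID, TYPE, and DATA fields in turn, each rounded up to a 4-byte boundary to skip the padding.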
Application Protocol
• What is it?
• It is anything meaningful to your server's business
• Hash Tables
• Schemas can be used to define required keys
• eg: Default schema may mandate key 100 in all
• Key 100 might be the schema id for the rest of the message
• Allows for complex record based pseudo-objects with mandatory,
optional, default fields
• Compatible cross language / platform
• Efficient
• Client objects can wrap hash table
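A minimal sketch of a hash table message with a mandatory schema key. The integer keys, string values, and the `HashMessage` class are assumptions for illustration; only "key 100 carries the schema id" comes from the slide.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class HashMessage
{
    public const int SchemaIdKey = 100; // mandatory key holding the schema id

    public static byte[] Encode(Dictionary<int, string> fields)
    {
        if (!fields.ContainsKey(SchemaIdKey))
            throw new ArgumentException("mandatory key 100 (schema id) is missing");
        using (var buffer = new MemoryStream())
        using (var writer = new BinaryWriter(buffer, Encoding.UTF8))
        {
            writer.Write(fields.Count);
            foreach (KeyValuePair<int, string> field in fields)
            {
                writer.Write(field.Key);
                writer.Write(field.Value); // BinaryWriter length-prefixes the string
            }
            writer.Flush();
            return buffer.ToArray();
        }
    }

    public static Dictionary<int, string> Decode(byte[] data)
    {
        using (var reader = new BinaryReader(new MemoryStream(data), Encoding.UTF8))
        {
            int count = reader.ReadInt32();
            var fields = new Dictionary<int, string>();
            for (int i = 0; i < count; i++)
            {
                int key = reader.ReadInt32();
                fields[key] = reader.ReadString();
            }
            if (!fields.ContainsKey(SchemaIdKey))
                throw new InvalidDataException("mandatory key 100 (schema id) is missing");
            return fields;
        }
    }
}
```

`BinaryWriter`'s string encoding is specific to .NET, so a cross-language deployment would need to document its own field encoding. The key / value shape stays the same.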
Application Protocol – cont.
• What is it? – cont.
• Serialized Objects
• Complex message object or hierarchy of objects
• Very rich distributed object functionality
• Limited to .NET environment and possible subset of languages within
the .NET world
• eg: Java RMI, some transports for .NET Remoting
• SOAP, Raw XML, or other text based transport
• Schema or well defined set of tag / value pairs
• String based
• Changes require new memory block for the whole message since strings
are immutable
• Slow to parse (~350 / sec) [validating, msxml]
• Even the fastest is less than 1000 / sec
Application Protocol – cont.
• What is it? – cont.
• Hybrids
• Serialized objects encoded in hash tables
• Flexibility of hash tables
• Expandable through new keys
• Cross platform / language support
• Versionable objects
• Main limitation is the primitive types supported
• Requires well known schemas
Complete Example
Simple RPC Client / Server
Further Thoughts
• Lock / Wait – Free Data Structures
• Used in graphics pipelines and media streaming
• Reduces contention
• Aimed at small # of writers and any number of readers
(where you would use Reader / Writer locks)
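To make the lock-free idea concrete, here is a minimal stack that uses compare-and-swap instead of a lock (often called a Treiber stack). `LockFreeStack<T>` is written for this illustration; .NET also ships `ConcurrentStack<T>`, which would be the practical choice.

```csharp
using System.Threading;

// Minimal lock-free stack: callers retry with compare-and-swap instead of locking.
public sealed class LockFreeStack<T>
{
    private sealed class Node
    {
        public T Value;
        public Node Next;
    }

    private Node _head;

    public void Push(T value)
    {
        var node = new Node { Value = value };
        Node oldHead;
        do
        {
            oldHead = Volatile.Read(ref _head);
            node.Next = oldHead;
        }   // the swap succeeds only if nobody changed _head in the meantime
        while (Interlocked.CompareExchange(ref _head, node, oldHead) != oldHead);
    }

    public bool TryPop(out T value)
    {
        Node oldHead;
        do
        {
            oldHead = Volatile.Read(ref _head);
            if (oldHead == null)
            {
                value = default(T);
                return false;
            }
        }
        while (Interlocked.CompareExchange(ref _head, oldHead.Next, oldHead) != oldHead);
        value = oldHead.Value;
        return true;
    }
}
```

No thread ever blocks: a thread that loses the race simply rereads the head and tries again.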
• Protocols
• Current trends:
• Ignore wire performance and go for B2B protocols (SOAP, et al)
• Text based protocols, such as ASCII encoded key / value pairs
• Serialization
• No simple way to work cross language / platform
• Primitives are not always portable
• Most research is focused on B2B
Further Thoughts – cont.
• Memory Pools
• Pre-allocate pools of objects that are reused
rather than freed
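A small sketch of a memory pool along these lines. `ObjectPool<T>` here is a hypothetical class built on `ConcurrentBag<T>`; for byte buffers specifically, .NET's `ArrayPool<byte>.Shared` covers the same need.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical pool: objects are created up front and handed back instead of freed.
public sealed class ObjectPool<T>
{
    private readonly ConcurrentBag<T> _items = new ConcurrentBag<T>();
    private readonly Func<T> _factory;

    public ObjectPool(Func<T> factory, int preallocate)
    {
        _factory = factory;
        for (int i = 0; i < preallocate; i++)
            _items.Add(factory());
    }

    // Takes a pooled object, or creates a new one if the pool is empty.
    public T Rent()
    {
        T item;
        return _items.TryTake(out item) ? item : _factory();
    }

    // The caller must not touch the object after returning it.
    public void Return(T item)
    {
        _items.Add(item);
    }
}
```

A receive loop could rent a buffer per message and return it once the worker is done, which keeps steady-state allocation close to zero.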
Rules Summary
• Rule 1: If you use threads – you use queues
• Rule 2: Never, Ever, Use Blocking I/O
• Rule 3: State will dominate server design &
• Rule 4a: The user does not care about Throughput
• Rule 4b: User Time is all that matters in
interactive systems
• Rule 5: If you can, use a commercial transport
References & Links
• HTTP 1.0 – RFC 1945
• HTTP 1.1 – RFC 2616
• Echo Protocol – RFC 862
• Time Protocol – RFC 868
• Superseded by Network Time Protocol (NTP)
References & Links – cont.
• Endianness
• .NET Thread Pools
• MSDN Docs for ThreadPool Class
• Stephen Toub’s Managed Thread Pool
• Enhanced version of Stephen Toub’s Thread Pool
References & Links – cont.
• Overlapped I/O
• I/O Completion Ports
References & Links – cont.
• Message Queues
• Message Passing Server Internals by Bill Blunden
• Jabber
• Lock / Wait Free Data Structures
Extra Slides
UDP Echo Server – the shell
using System;
namespace EchoServer {
    class Program {
        static void Main(string[] args) {
            try { . . . }
            catch (Exception x) { Console.WriteLine(x.Message); }
        }
    }
}
UDP Echo Server – the code
// needs: using System.Net; using System.Net.Sockets; using System.Text;
try
{ // create a udp server on port 8666
    UdpClient server = new UdpClient(8666);
    // create an end point that maps to any ip and any port
    IPEndPoint endPoint = new IPEndPoint(IPAddress.Any, 0);
    // receive data from our end point
    byte[] data = server.Receive(ref endPoint);
    // convert from bytes to string
    string msg = Encoding.ASCII.GetString(data);
    // display the message and who it is from
    Console.WriteLine("Received message from {0} ==> {1}",
        endPoint.ToString(), msg);
}
UDP Echo Client in Java
// needs: import java.net.*;
public static void main(String[] args)
{
    try
    { // set up our locals
        InetAddress host = InetAddress.getByName("");
        int port = 8666;
        String msg = "Hello World from Java";
        // get the bytes for string
        byte[] data = msg.getBytes();
        // create a udp client
        DatagramSocket client = new DatagramSocket();
        // create our data packet
        DatagramPacket packet = new DatagramPacket(data, data.length, host, port);
        // send the data
        client.send(packet);
    }
    catch(Exception x)
    { // report the failure
        x.printStackTrace();
    }
}
UDP Echo Client in C#
// needs: using System; using System.Net.Sockets; using System.Text;
static void Main(string[] args)
{ // set up locals
    string hostname = "";
    int port = 8666;
    string msg = "Hello World from C#";
    try
    { // get ascii bytes for our message
        byte[] data = Encoding.ASCII.GetBytes(msg);
        // create a udp client
        UdpClient client = new UdpClient();
        // send the bytes to a udp listener
        client.Send(data, data.Length, hostname, port);
    }
    catch(Exception x)
    { // report the failure
        Console.WriteLine(x.Message);
    }
}
