Multi-threaded Lisp:
Challenges and Solutions
Roger Corman
Corman Technologies
International Lisp Conference
2002
Threads: Why Bother?



Most modern applications use multiple threads of
execution.
Nearly all servers require threads for reasonable
performance.
New languages have threads support built into
the language.

Most modern operating systems support threads.

Lisp is cool; threads are cool.
Right Way To Think About Threads

All threads run simultaneously.

If they don’t, it doesn’t matter.

If you have more than one processor, they
sometimes do.
This model leads you to consider, early, issues
such as synchronization, data sharing,
performance and messaging.
Wrong Way to Think About
Threads

Multiple execution contexts which run one at a
time.

Thread-switching can be intercepted to perform
arbitrary logic.

Thread-switching can be forced to happen only at
appropriate times.
This model will never support multiple
processors, and will not even perform as well on
a single CPU system.
It is easy to develop bad programming habits,
such as WITHOUT-INTERRUPTS.
Next up: Thread
What a
thread
looks
like:
Personal Background



Developed multi-threaded applications in
C/C++/Java.
Designed Corman Lisp to include all the
capabilities of these other language platforms.
Had to make a lot of decisions about how to
implement multiple threads in Lisp.
Threads Are an OS Function



Just as the file system is a function of the
operating system, threads should be allocated
and scheduled by the operating system.
While lisp-implemented threads can be useful,
they will never be as powerful as operating
system threads.
Many Lisp implementations have supported lispimplemented threads, and a standard API has
developed which is supported by several
implementations. It implies lisp-implemented
threads, with fine-grained control over task
scheduling.
Lisp Process:
Lisp-Implemented Threads
Operating System Thread 0
Lisp Scheduled Thread 0
Lisp Scheduled Thread 1
Lisp Scheduled Thread 2
Processor
A
Processor
B
Processor
C
Lisp Process:
Operating System Threads
Operating System Thread 0
Operating System Thread 1
Operating System Thread 2
Processor
A
Processor
B
Processor
C
Corman Lisp Threading Model
In the Corman Lisp implementation, we chose to
implement only operating system threads.


These give Lisp applications the maximum
capabilities provided by the operating system.
The Win32 API allows the application very limited
control over thread scheduling. We don’t view
this as a OS limitation, but rather as an OS
feature.
Threads and the Common Lisp
Standard


While lightweight processes have been supported
in many Lisp implementations, the standard does
not include any discussion of threads.
The standard contains language which could
imply non-conformance of multi-threaded
implementations.
Thread Implementation Issues

Special variables

Heap access

Garbage collection

Foreign calls and callbacks

Code generation

Debugging

Lisp Image saving and loading

Foreign Threads
From ANSI Specification:
3.1.2.1.1.2 Dynamic Variables
The effect of binding a dynamic variable is to create
a new binding to which all references to that
dynamic variable in any program refer for the
duration of the evaluation of the form that creates
the dynamic binding.
From ANSI Specification Glossary:


dynamic extent n. an extent whose duration is
bounded by points of establishment and
disestablishment within the execution of a
particular form. See indefinite extent. “Dynamic
variable bindings have dynamic extent.”
extent n. the interval of time during which a
reference to an object, a binding, an exit point, a
tag, a handler, a restart, or an environment is
defined.
Special Variables And Threads



Top level bindings should be shared between
threads.
Dynamic bindings (via LET, etc.) should only
apply to the thread the created the binding.
Is there ever a reason for dynamic bindings to be
visible to other threads? (I haven’t found one.)
Implementation of Dynamic Binding



Many Lisp implementations store special variable
bindings in a symbol’s value slot. This is a
problem in a multi-threaded system.
Special variable bindings need to be stackspecific, yet efficient.
“Deep binding” where bindings are kept in a list,
can be inefficient.
Dynamic Bindings in Corman Lisp



Top level (shared) bindings are stored in symbol’s
value slot.
Dynamic bindings are stored in a per-thread
symbol table.
Only symbols which are actually dynamically
bound (in loaded, compiled code) need a slot in
the per-thread symbol table.
Special Variable Bindings
*X*
binding
*Y*
binding
*X* *Y* *Z* etc
*Z*
binding
Master symbol table
Special Variable Bindings
*X*
*Y*
*Z*
*X* *Y* *Z* etc
binding
binding
binding
Thread
local
binding
*X* *Y* *Z* etc
*X* *Y* *Z* etc
Master symbol table
Thread 0 Symbol Table
Thread
local
binding
Thread 1 Symbol Table
Thread Local Variables


If you implement efficient per-thread special
variable binding, you end up with an
implementation of thread local variables (à la C#,
Java).
Surprise: they are far easier to use, elegant, and
fit into the language better than most other
thread local variable models in other languages.
Example: Java




public class MyThread extends Thread
{
// static Hashtable stuff;
public static ThreadLocal stuff = new ThreadLocal();
public void run()
{
// each thread must have a different hashtable
stuff.set(new Hashtable);








}
}
if (((Hashtable)stuff.get()).containsValue("Foo"))
doSomething(); // run code which can refer to hash table
Example: Common Lisp
(defvar *hash-table*)
(defun run ()
(let ((*hash-table* (make-hash-table)))
(do-something))) ; run code which can refer to hash table
Next Up: Binding
Dynamic
Binding
Garbage Collection
2 Big Issues:
1) Do you stop the mutator, i.e. all threads,
to perform garbage collection?
2) Impact on code generation.
Garbage Collection (cont.)
Stopping the mutator:

Overall performance hit (higher cost) when
multiple processors are available (unless your
garbage collection algorithm is a parallel
algorithm).

Generally lower overall cost on single processors.

Simpler algorithms.
Garbage Collection (cont.)
Parallel execution:

Threads keep running.

Scales better to many processors.

Higher overall cost on single processor due to
code generation, extra root scanning.

Minimizes pauses for collection.

Complex algorithms, more error-prone.
Corman Lisp Garbage Collector




We chose to use a generational collector which
stops the mutator (all threads) but keeps pauses
short (a few milliseconds, usually).
Most target systems have only one processor.
Windows systems do not support more than 4 (or
8) processors.
Does not require a software write barrier, or any
type of read barrier.
Code generation is minimally impacted.
Corman Lisp Garbage Collector (cont.)
1.
2.
3.
4.
5.
Suspends all executing threads.
If any thread is in an atomic block, resume it
for a few milliseconds and go to step 1.
Performs root analysis on all stacks, heaps, and
static memory.
Copies live data from generation 0 to 1
(triggering other levels if necessary).
Resumes all executing threads.
Garbage Collection and Code
Generation



In a single-threaded system, garbage collection
will only occur at “safe states” (usually when a
heap allocation is requested).
In a multi-threaded system, garbage collection
can occur at any time, potentially between any
pair of instructions. Restricting it to safe states
can be difficult and require code generator
assistance.
Some code optimizations are impractical.
Next up: Me
Me, after a
long night
debugging
the garbage
collector.
Foreign Calls and Callbacks (FFI)
Foreign Call:
Lisp code calls operating system, C function or any
external library.
Foreign Callback:
Operating system, C function, or any other code
calls Lisp function directly.
FFI and Threads



A thread may be executing foreign code when the
garbage collector is called. It may not be in a
predictable state for the collector.
Need a way for the collector to determine which
threads were executing foreign code and which
were executing Lisp code. Lisp code could have
been called by foreign code.
Tricky issue: “thunking” from Lisp-to-foreign or
foreign-to-Lisp code. The collector can run any
time during the process.
Multi-threading and Code
Generation
The code generator must cooperate with the
collector to ensure that executing code can be
analyzed by the collector as necessary.
Areas of interest:






Register Conventions
Tagging Conventions
Stack frame links
Function calling/returning
Foreign code transitions
Stack-allocated data
Multi-threading and Code
Generation
Many common optimizations are difficult or
impossible if the collector can run at any time.
Implementation Strategies:
a) Restrict the collector to only run at “safe states”
and allow these optimizations at any other places in
the code.
b) Never allow optimizations or generated code
which would trip up the collector at any point.
c) Allow code generator to designate portions of the
code which are non-safe.
Corman Lisp Code Generation
We implemented option (c) using a pair of
instructions:
BEGIN_ATOMIC
END_ATOMIC
These mark blocks of code which are protected
from garbage collection.
The atomic sections may never span calls or
branches, and typically should only include a few
instructions.
Corman Lisp Code Generation (cont.)
An atomic block is either:
a)
b)
Instruction sequence between
BEGIN_ATOMIC/END_ATOMIC instructions.
Instructions that build up or tear down a stack
frame (beginning or end of function call).
The latter is not safe because the stack frame
list may be invalid, and the collector could be
tripped up. It is impractical to use
BEGIN_ATOMIC/END_ATOMIC in all these places,
so the collector has special logic to detect that
the instruction pointer is at one of these points.
Corman Lisp Code Generation (cont.)
A safe state is any point during code execution in
which the instruction pointer is not in an atomic
block, or is executing foreign code (which cannot
access the heap).
The collector will delay collection until all threads
are in safe states. Some sequences of instructions
can only be performed in safe states.
Next up: Code Generator
Corman
Lisp Code
Generator
Lisp Debugger


The ANSI specification does not go into detail
about what the common Lisp debugger does.
Implementations commonly will enter a special
READ-EVAL-PRINT loop when an error is signaled
or a BREAK is executed.

This works well for checking the stack of the
current thread, but what should it do about other
threads?
Lisp Debugger (cont.)
Issues
Should other threads be halted?
Should it be possible to examine stack frames of
other threads, and to switch logical threads in the
debugger?
Which thread are you dropped into when you press
control-break?
Lisp Debugger (cont.)
I believe the best debugger will do all these things,
but to do so it really needs to run in a separate
thread.
I don’t think debugger behavior ought to be part of
a standard, but it certainly is something an
implementer has to deal with.
Next Up: Debugger
Corman
Lisp
Debugger
Lisp Image Saving and Loading
A non-standard feature which many Common Lisp
implementations support is loading and saving of
the lisp heap as an image.
It is important that other threads be at least
suspended when saving or loading an image file,
and possibly you need to disallow other threads
from even existing.
I found the implementation of this to be a bit tricky.
Foreign Threads
When the Lisp system allocates a thread, it can
initialize it with any necessary data for the garbage
collector to use when scanning it.
If you want to allow threads not allocated by the
lisp system to call lisp threads, there must be some
way for the collector to know about them and figure
out what to do with them.
If you don’t allow foreign-created threads to call
into Lisp, then it impairs the ability to embed the
Lisp engine in other applications.
Corman Lisp DirectCall
DirectCall allows foreign-created threads to call
directly into Lisp code.
BlessThread()
UnblessThread()
These must be called by any thread that calls into
Lisp directly.
DLLs compiled with Corman Lisp perform this
transparently as needed.
Next up: Foreign Thread
Foreign
Thread
(Before
being
blessed)
Common Lisp Standard Threading
Suggestions
I propose that threading be incorporated in the
Common Lisp standard, at least as an optional
feature specified by the standard. This involves
amending some of the existing language. Mostly it
involves adding a few new macros, functions and
probably some standard classes.
Standard Common Lisp Threading
Suggested Additions





Dynamic variable specification.
Specify which objects/structures are safe for use
by multiple threads (such as hash-tables,
packages, readtables, etc.).
Specify which language features are safe for
multi-threaded operations (read, eval, print,
etc.).
Standard special variable bindings for process id,
thread id.
WITH-LOCK macro for synchronization on objects
and structures.
Standard Common Lisp Threading
Suggested Additions (cont.)


Start with some of the features of the Franz MP
API.
Do not include scheduling functions or macros,
such as WITHOUT-SCHEDULING and WITHOUT-
INTERRUPTS.

Standard of classes for synchronization such as
Mutex, Critical Section, Semaphore.
Summary

Threads are good.

Threads, by design, run in parallel.



There are many system implementation issues
involved.
Common Lisp standard should add a section on
standard multiple thread support.
Illustrations help to clarify abstract concepts.
Original Illustrations by Emmett Corman
Copyright (c) Corman Technologies 2002
Descargar

Multi-threaded Lisp: Challenges and Solutions