Parrot in Detail
Dan Sugalski
[email protected]
Parrot in detail
1
What is Parrot
• The interpreter for perl 6
• A multi-language virtual machine
• An April Fools joke gotten out of
hand
VMs in a nutshell
• Platform independence
• Impedance matching
• High-level base platform
• Good target for one or more classes of languages
Platform independence
• Allows emulation of missing platform features
• Allows unified view of common but differently implemented features
• Isolates platform-specific underlying code
Impedance matching
• Can be halfway between the hardware
and the software
• Provides another layer of abstraction
• Allows a smoother connection between
language and CPU
High-level base platform
• Provide single point of abstraction for many useful things, like:
  • Async I/O
  • Threads
  • Events
  • Objects
Good HLL target
• Provide features that map well to a
class of language constructs
• Reduce the “thought” load for compiler
writers
• Allow tasks to be better partitioned and
placed
Parts and Pieces
The chunks of parrot
Parts and pieces
• Parser
• Compiler
• Optimizer
• Interpreter
Parser
• Turns source into an AST
• $a = $b + $c becomes
      =
     / \
   $a   +
       / \
     $b   $c
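As a rough illustration of the tree above, the following Python sketch builds the AST for `$a = $b + $c` by hand and prints it in prefix form. The `Node` class, its field names, and `to_sexpr` are hypothetical names invented for this example; they are not Parrot's actual AST structures.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical AST node: an operator or variable name plus children."""
    value: str
    children: List["Node"] = field(default_factory=list)


def to_sexpr(node: Node) -> str:
    """Render the tree as a prefix expression."""
    if not node.children:
        return node.value
    inner = " ".join(to_sexpr(child) for child in node.children)
    return f"({node.value} {inner})"


# Hand-built tree for: $a = $b + $c
tree = Node("=", [Node("$a"), Node("+", [Node("$b"), Node("$c")])])
print(to_sexpr(tree))  # prints (= $a (+ $b $c))
```

A real parser would produce this tree from source text; the sketch only shows the shape the later stages consume.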
Overriding the Parser
• New tokens can be added
• Existing tokens can have their
meaning changed
• Entire languages can be swapped
in or out
• All done with Perl 6 grammars
Grammars
• Perl 6 grammars are immensely
powerful
• Combination of perl 5 regexes, lex, and
yacc
• Grammars are object-oriented and
overridable at runtime
Useful side-effects
• Modifying existing grammars lexically is
straightforward
• If you have a perl 6 rule set for a
language you’re halfway to getting it up
on Parrot
• Conveniently, we have a yacc and BNF
grammar to Perl regex converter
Compiler
• Turns AST into bytecode
• Like the parser, is overridable
• Essentially a fancy regex engine,
with some extras
• No optimizations done here
Mostly standard
• The compiler’s mostly standard
• Compiling to a machine, even a virtual
one, is a well-known thing
• There’s not much in the way of surprise
here
• Still a fair amount of work
Optimizer
• Takes AST and bytecode, and
produces better bytecode
• Levels will depend on how perl’s invoked
• Works best with less uncertainty
Optimizing is difficult
• Lots of uncertainty at compile time
• Active data (tied/overloaded) kills
optimization
• Late code loading and creation
causes headaches
Optimizing is difficult
When can
    $x = 0;
    foreach (1..10000) {
        $x++;
    }
become
    $x = 10000;
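One way to see why the answer is "only when the compiler knows what `$x` is": the Python sketch below runs the same loop over a plain integer and over an "active" value whose increment has a side effect. `LoggedCounter` and `run_loop` are hypothetical names standing in for tied or overloaded data; they are assumptions made for this example, not Perl or Parrot code.

```python
class LoggedCounter:
    """Hypothetical 'active' value: each increment records a side effect."""

    def __init__(self):
        self.value = 0
        self.log = []

    def __iadd__(self, n):
        self.value += n
        self.log.append(self.value)  # side effect an optimizer must preserve
        return self


def run_loop(x, times):
    """The loop from the slide: increment x once per iteration."""
    for _ in range(times):
        x += 1
    return x


plain = run_loop(0, 10000)                 # folding to 10000 would be safe here
active = run_loop(LoggedCounter(), 10000)  # folding would skip 10000 log entries

print(plain, active.value, len(active.log))
```

For the plain integer, replacing the loop with a constant changes nothing observable. For the active value, the final number is the same but the 10000 recorded side effects would be lost, so the rewrite is only valid when the optimizer can rule that case out.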
Optimizing ironies
• Perl may well end up one of the least-optimized languages running on Parrot
• Perl 6 has some constructs to help with
that
• The more you tell the compiler, the
faster your code will run
Interpreter
• Bytecode comes in, and something
magic happens
• End destination for bytecode
• May not actually execute the
bytecode, but generally will
Interpreter
• As final destination, may do other things
  • Save to disk
  • Transform to an alternate form (like C code)
  • JIT
  • Mock in iambic pentameter
Interpreter Design Details
In which we tell you far more than anyone
sane wants to know about the insides of our
interpreter.
(And this is the short form)
Parrot in buzzwords
• Bytecode driven
• Runtime extensible
• Register based
• Language neutral
• Garbage collected
• Event capable
• Object oriented
• Introspective
• Interpreter
• With continuations
Bytecode engine
• Well, almost. 32 bit words
• Precompiled form of your program
• Generally loaded from disk (though
not always)
• Generally needs no transformation
to run
Opcode functions
• Opcode function table is lexically
scoped
• Functions return the next opcode to run
• Most opcodes can throw an exception
• Opcode libraries can be loaded in on
demand
• Most opcodes overridable
• Bytecode loader is overridable
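The points above can be sketched as a tiny dispatch loop in Python: each opcode is a function looked up in a table, and each function returns the position of the next opcode to run. The opcode names, their numbering, the operand layout, and the use of `None` to stop the loop are all assumptions made for this sketch; they do not reflect Parrot's real opcode set or bytecode format.

```python
def op_end(code, pc, regs):
    return None  # assumption: None stops the run loop in this sketch


def op_set(code, pc, regs):
    regs[code[pc + 1]] = code[pc + 2]  # set register to a constant
    return pc + 3


def op_add(code, pc, regs):
    regs[code[pc + 1]] = regs[code[pc + 2]] + regs[code[pc + 3]]
    return pc + 4


def op_dec(code, pc, regs):
    regs[code[pc + 1]] -= 1
    return pc + 2


def op_jump_if(code, pc, regs):
    # Branch to an absolute position if the register is non-zero.
    return code[pc + 2] if regs[code[pc + 1]] != 0 else pc + 3


END, SET, ADD, DEC, JUMP_IF = range(5)
OP_TABLE = [op_end, op_set, op_add, op_dec, op_jump_if]


def run(code, op_table=OP_TABLE):
    """Run until an opcode function returns None; return the registers."""
    regs = [0] * 32
    pc = 0
    while pc is not None:
        pc = op_table[code[pc]](code, pc, regs)
    return regs


# Sum 5 + 4 + 3 + 2 + 1 into register 1, counting down in register 0.
code = [SET, 0, 5,  SET, 1, 0,  ADD, 1, 1, 0,  DEC, 0,  JUMP_IF, 0, 6,  END]
regs = run(code)
print(regs[1])  # prints 15
```

Because `run` takes the table as a parameter, a caller can pass a copy with one entry replaced, which is the simplest form of an overridable opcode.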
Fun tricks with dynamic opcodes
• Load in rarely needed functions only when we have to
• Allow piecemeal upgrading of a Parrot install
• We can be someone else cheaply
  • JVM
  • .NET
  • Z machine
  • Python
  • Perl 5
  • Ruby
Registers
• 4 Sets of 32: Integer, String, Float,
PMC
• Fast set of temporary locations
• All opcodes operate on registers
Stacks
• Six stacks
• One per set of registers
• One generic stack
• One call stack
• Stacks are segmented, and have no size limit
Strings
Buffer pointer
Buffer length
Flags
Buffer used
String start
String length
Encoding
Character set
Strings
• Strings are encoding-neutral
• Strings are character set neutral
• Engine knows how to ask character
sets to convert between
themselves
• Unicode is our fallback (and
sometimes pivot) set
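The pivot idea can be sketched in a few lines of Python: to convert between two character sets that know nothing about each other, decode the source into Unicode and encode from Unicode into the target. The `transcode` function is a hypothetical helper written for this example. Python's codecs fold encoding and character set into one name, so the sketch does not model Parrot's separation of the two.

```python
def transcode(data: bytes, src: str, dst: str) -> bytes:
    """Convert between two encodings by pivoting through Unicode."""
    text = data.decode(src)   # source character set -> Unicode (the pivot)
    return text.encode(dst)   # Unicode -> destination character set


ebcdic = "hello".encode("cp037")             # cp037 is one EBCDIC code page
ascii_bytes = transcode(ebcdic, "cp037", "ascii")
print(ascii_bytes)  # prints b'hello'
```

With a pivot, supporting N character sets needs N converters to and from Unicode rather than a converter for every pair.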
PMCs
vtable
Property hash
flags
Data pointer
Cache data
Synchronization
GC data
PMCs
• Parrot’s equivalent of perl 5’s
variables
• Tying, overloading, and magic all
rolled together
• Arrays, hashes, and scalars all
rolled into one
PMCs are more than they seem
• Lots of behaviour’s delegated to PMCs
• PMC structures are generally opaque to the
VM
• Lots of the power and modularity of Parrot
comes from PMCs
• Engine doesn’t distinguish between scalar,
hash, and array variables at this level
• Done with the magic of vtables and
multimethod dispatch
Vtables
• Table of pointers to required functions
• Allows each variable to have a custom
set of functions to do required things
• Removes a lot of uncertainty from the various functions, which speeds things up
• Allows very customized behaviour
• All load, store, and informational
functions here
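A minimal Python sketch of the vtable idea follows: every value carries a pointer to a table of functions, and the engine calls through that table without knowing what kind of value it holds. The `PMC` class, the two tables, and the entry names `get_integer` and `get_string` are assumptions chosen for this illustration, not a description of Parrot's actual vtable layout.

```python
class PMC:
    """Hypothetical PMC: opaque data plus a pointer to a function table."""

    def __init__(self, vtable, data):
        self.vtable = vtable
        self.data = data


def int_get_integer(pmc):
    return pmc.data


def int_get_string(pmc):
    return str(pmc.data)


def str_get_integer(pmc):
    return int(pmc.data)


def str_get_string(pmc):
    return pmc.data


# One table per kind of variable; every table offers the same entries.
INTEGER_VTABLE = {"get_integer": int_get_integer, "get_string": int_get_string}
STRING_VTABLE = {"get_integer": str_get_integer, "get_string": str_get_string}


def add_as_integers(left, right):
    """Engine-side code: no type checks, just calls through the vtables."""
    return left.vtable["get_integer"](left) + right.vtable["get_integer"](right)


total = add_as_integers(PMC(INTEGER_VTABLE, 40), PMC(STRING_VTABLE, "2"))
print(total)  # prints 42
```

The engine function never inspects the data itself, which is what lets new variable types be added by supplying a new table.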
Vtables bring multimethods
• Core engine has support for dispatch
based on argument types
• Necessary for proper dispatch of many
vtable functions
• Support extends to language level
Multimethod dispatch core
• All binary PMC operations use MMD
• Relatively recent change
• Makes life simpler and vtables smaller
• Things are faster, too, since everything of interest did MMD anyway.
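Multimethod dispatch for binary operations can be sketched as a lookup table keyed on the operation and the types of both operands. In the Python sketch below, `MMD_TABLE`, `register`, and `dispatch` are hypothetical names, and the element-wise behaviour chosen for lists is an assumption made only to give the table more than one entry.

```python
MMD_TABLE = {}


def register(op, left_type, right_type, func):
    """Record the function to use for op on this pair of operand types."""
    MMD_TABLE[(op, left_type, right_type)] = func


def dispatch(op, left, right):
    """Pick the implementation from the types of BOTH operands."""
    func = MMD_TABLE.get((op, type(left), type(right)))
    if func is None:
        raise TypeError(f"no {op} for {type(left).__name__}, {type(right).__name__}")
    return func(left, right)


register("mul", int, int, lambda a, b: a * b)
register("mul", list, int, lambda a, b: [x * b for x in a])
register("mul", list, list, lambda a, b: [x * y for x, y in zip(a, b)])

print(dispatch("mul", 6, 7))                   # prints 42
print(dispatch("mul", [1, 2], 3))              # prints [3, 6]
print(dispatch("mul", [1, 2, 3], [4, 5, 6]))   # prints [4, 10, 18]
```

Single dispatch would choose on the left operand alone and then need type checks inside each function; keying on both types moves that decision into the table.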
Aggregate PMCs
• All PMCs can potentially be treated
as aggregates
• All vtable entries have a _keyed
variant
• Up to vtable to decide what’s done
if an invalid key is passed
Keys for Aggregates
• List of
  • key type
  • key value
  • use key as x
• Also plain one-dimensional integer
index
• Keys are inherently multidimensional
• Aggregate PMCs may consume multiple
keys
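A key list of this shape can be sketched in Python as a list of (key type, key value, use key as) tuples that is walked one entry per level of nesting. The tuple layout, the `"name"` and `"index"` labels, and the `fetch_keyed` function are assumptions invented for this example. In this sketch each aggregate consumes exactly one key, although the slide notes that an aggregate may consume several.

```python
def fetch_keyed(aggregate, keys):
    """Walk nested containers, consuming one key per level."""
    current = aggregate
    for key_type, value, use_as in keys:
        if use_as == "index":
            current = current[int(value)]   # treat the key as an array index
        elif use_as == "name":
            current = current[str(value)]   # treat the key as a hash key
        else:
            raise KeyError(f"unsupported key use: {use_as}")
    return current


data = {"rows": [[1, 2, 3], [4, 5, 6]]}

# Three-dimensional lookup; the last key is a string used as an index.
keys = [
    ("string", "rows", "name"),
    ("int", 1, "index"),
    ("string", "2", "index"),
]
print(fetch_keyed(data, keys))  # prints 6
```

Keeping "what the key is" separate from "how to use it" is what allows a string key to index an array without the caller converting it first.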
Advantages of keys
• Multidimensional aggregates
• No-overhead tied hashes and
arrays
• Allows potentially interesting tied
behavior
PMCs even hide aggregation
• @foo = @bar * @baz
Turns into
mul foo, bar, baz
Objects
• Parrot has a low-level view of objects
• Things with methods and an array of
attributes
• Both of which are ultimately delegated
to the object
• Except when we cheat, of course
Objects
• Objects may or may not be accessed by
reference
• The provided object type is class-based
• Handles mixed-type inheritance with
encapsulation and delegation
• Base system handles mixed-type
dispatch properly
Exceptions
• An exception handler may be put in
place at any time
• Exception handlers remember their
state (they capture a continuation)
• Handlers may decline any exception
• Exceptions propagate outward
• Exception handlers may target specific
classes of exceptions
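The handler rules above can be sketched in Python as a stack that is searched from the innermost handler outward. A handler can be restricted to one exception class, and it can decline by returning `False`, in which case the search continues. `HandlerStack` and the handler functions are hypothetical names for this example, and the sketch leaves out the continuation each real handler would capture.

```python
class HandlerStack:
    """Hypothetical handler stack; the innermost handler is pushed last."""

    def __init__(self):
        self._handlers = []

    def push(self, handler, exc_class=None):
        # exc_class=None means the handler accepts every class.
        self._handlers.append((exc_class, handler))

    def throw(self, exc_class, message):
        """Return the name of the handler that accepted, or None."""
        for target, handler in reversed(self._handlers):
            if target not in (None, exc_class):
                continue                      # handler targets another class
            if handler(exc_class, message):   # False means "declined"
                return handler.__name__
        return None                           # propagated past every handler


def catch_all(exc_class, message):
    return True


def io_only(exc_class, message):
    return True


def picky(exc_class, message):
    return "retry" in message  # declines anything it cannot retry


stack = HandlerStack()
stack.push(catch_all)
stack.push(io_only, "IO")
stack.push(picky, "Math")

print(stack.throw("IO", "disk full"))         # prints io_only
print(stack.throw("Math", "divide by zero"))  # prints catch_all
print(stack.throw("Math", "retry later"))     # prints picky
```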
Exceptions are:
• Typed
  • Information
  • Warning
  • Severe
  • Fatal
  • We’re Doomed
• Classed
  • IO
  • Math
• Languaged
  • Perl
  • Ruby
Throwing an Exception
• Any opfunc that returns 0 triggers an
exception
• The throw opcode also throws an
exception
• The exception itself is stored in the
interpreter
• Exceptions and exception handlers are
cheap, but not free
Realities of memory management
• Memory and structure allocation is
a huge pain
• Terribly error prone
• We have full knowledge of what’s
used if we choose to use it
Arena allocation of core structures
• All PMCs and Strings are allocated
from arenas
• Makes allocation faster and more
memory efficient
• Allows us to trace all the core
structures as we need for GC and
DOD
Pool allocation of memory
• All ‘random’ chunks of memory are
allocated from memory pools
• Allocation is extremely fast,
typically five or six machine
instructions
• Free memory is handled by the
garbage collector
Garbage Collection
• Parrot has a tracing, compacting
garbage collector
• No reference counting
• Live objects are found by tracing
the root set
Garbage Collection
• All memory must be pointed to by a
Buffer struct (A subset of a String)
• All Buffers must be pointed to by
PMCs or string registers
• All PMCs must be pointed to by
other PMCs or the root set
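Finding live structures by tracing from the root set can be sketched in Python as a simple graph walk: anything reachable is live, and anything else is dead, even if dead structures still point at each other. The `Node` class and `trace_live` function are hypothetical stand-ins for PMCs, strings, and buffers; the sketch does not model Parrot's actual headers or flags.

```python
class Node:
    """Hypothetical stand-in for a PMC, String, or Buffer header."""

    def __init__(self, name):
        self.name = name
        self.refs = []  # other structures this one points to


def trace_live(root_set):
    """Return the names of every structure reachable from the root set."""
    live = {}
    pending = list(root_set)
    while pending:
        node = pending.pop()
        if id(node) in live:
            continue              # already visited; also breaks cycles
        live[id(node)] = node
        pending.extend(node.refs)
    return {node.name for node in live.values()}


pmc, string, buffer = Node("pmc"), Node("string"), Node("buffer")
pmc.refs.append(string)
string.refs.append(buffer)

# Two unreachable structures that reference each other.
x, y = Node("x"), Node("y")
x.refs.append(y)
y.refs.append(x)

everything = [pmc, string, buffer, x, y]
live = trace_live([pmc])
dead = {node.name for node in everything} - live
print(sorted(live), sorted(dead))
```

The `x` and `y` pair would never be freed under reference counting, since each keeps the other's count above zero. A tracing pass never reaches them, so they are found dead.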
DOD and GC are separate
• DOD (dead object detection) finds dead structures
• GC compacts memory
• Programs typically chew up more memory than structures.
I/O
• Fully asynchronous I/O system by
default
• Synchronous overlays for easier
coding
• Perl 5/TCL/SysV style streams
• C’s STDIO is dead
I/O streams
• All streams can potentially be filtered
• No limit to the number of filters on a
stream
• Filters may run asynchronously, or in
their own threads
• Filters may be sources or sinks as need
be
I/O Stream examples
• UTF8->UTF32 conversion
• EBCDIC->ShiftJIS conversion
• Auto-chomping
• Tee-style fanout
• GIF->PNG conversion
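Three of the filters listed above can be sketched as Python generators, each wrapping the stream it reads from so that filters stack to any depth. The function names are hypothetical, and the sketch is purely synchronous: it does not model asynchronous filters or filters running in their own threads.

```python
def chomp(lines):
    """Auto-chomping: strip the trailing newline from each line."""
    for line in lines:
        yield line.rstrip("\n")


def tee(items, sink):
    """Tee-style fanout: copy each item to sink and pass it on."""
    for item in items:
        sink.append(item)
        yield item


def utf8_to_utf32(chunks):
    """Re-encode each UTF-8 chunk as little-endian UTF-32."""
    for chunk in chunks:
        yield chunk.decode("utf-8").encode("utf-32-le")


# Stack two filters on one stream.
copy = []
lines = list(tee(chomp(["one\n", "two\n"]), copy))
print(lines, copy)

converted = list(utf8_to_utf32([b"hi"]))
print(converted)
```

Each filter sees only an iterable and yields another, so the stream's consumer cannot tell how many filters sit between it and the source.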
Unified I/O & Event system
• I/O requests and events are versions of
the same thing
• Event generators are just autonomous
streams
Subs and sub calling
• Several sub types
  • Regular subs
  • Closures
  • Co-routines
  • Continuations
• All done with CPS, though it can be hidden
• Caller-save scheme for easier tail-calls
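CPS (continuation-passing style) means a sub never returns to its caller in the usual sense; it is handed a continuation, "what to do next", and passes its result to that. The Python sketch below shows the style with three small functions whose names are invented for this example. It illustrates the calling style only and says nothing about how Parrot represents continuations internally.

```python
def add_cps(a, b, k):
    """Add, then hand the result to the continuation k."""
    return k(a + b)


def square_cps(x, k):
    """Square, then hand the result to the continuation k."""
    return k(x * x)


def sum_of_squares_cps(a, b, k):
    # Every call is the last thing its caller does (a tail call), so
    # nothing has to be kept around from the calling sub.
    return square_cps(
        a, lambda a2: square_cps(b, lambda b2: add_cps(a2, b2, k))
    )


result = sum_of_squares_cps(3, 4, lambda value: value)
print(result)  # prints 25
```

Because the continuation is an ordinary value, a sub can also store it, call it later, or call it more than once, which is the starting point for co-routines and exception handlers.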
Parrot has calling conventions
• One standard set
• All languages that want to interoperate
should use them
• Only use them for globally exposed
routines
• Terribly boring except when you don’t
have them
Common object interface
• The OO part of the calling conventions
• Method calling and attribute
storage/retrieval is standardized
• Method dispatch is ultimately delegated
• Attribute storage is also ultimately
delegated
• Support multimethod dispatch,
prototype checking, and runtime
mutability
Threads
• Three models
  • Shared dependent
  • Shared independent
  • Completely independent
• Built-in interpreter thread safety
• Primitives to allow for language-level
thread safety and inter-thread
communication
Introspection
• All language data structures are accessible from parrot programs
  • Scratchpads
  • Global variable tables
  • Stacks
• Interpreter level stuff is accessible
  • Statistics
  • Internal pools
Runtime Mutability
• Full runtime mutability is supported
• Parser and compiler are generally
available
• Needed for things like perl’s string eval