Banana Algebra:
Syntactic Language Extension via an
Algebra of Languages and Transformations
Jacob Andersen
Claus Brabrand
[ [email protected] ]
[ [email protected] ]
Aarhus University
IT University of Copenhagen
Claus Brabrand, ITU
BANANA ALGEBRA
[1]
ETAPS/LDTA, York
March 28, 2009
Abstract
We propose an algebra of languages and transformations as a means
for extending languages syntactically. The algebra provides a layer of
high-level abstractions built on top of languages (captured by CFGs) and
transformations (captured by constructive catamorphisms).
The algebra is self-contained in that any term of the algebra
specifying a transformation can be reduced to a constant catamorphism,
before the transformation is run. Thus, the algebra comes "for free"
without sacrificing the strong safety and efficiency properties of
constructive catamorphisms.
The entire algebra as presented in the paper is implemented as the
Banana Algebra Tool which may be used to syntactically extend
languages in an incremental and modular fashion via algebraic
composition of previously defined languages and transformations. We
demonstrate and evaluate the tool via several kinds of extensions.
Claus Brabrand, ITU
BANANA ALGEBRA
[2]
ETAPS/LDTA, York
March 28, 2009
Outline
Introduction: "What is a Banana?"
Bananas for Language Transformation
Language Extension Pattern
Banana Algebra
Examples
Implementation
Related Work
Conclusion
Claus Brabrand, ITU
BANANA ALGEBRA
[3]
ETAPS/LDTA, York
March 28, 2009
What is a 'Banana' ?
Datatype; "list":
list
=
Num N
|
Cons N * list
list  N
Banana ("sum-of-list"):
[Num n]
[Cons n l]
=
=
n
n + [l]
(aka. "Catamorphism" )
Separation of recursion and evaluation
Implicit recursion on input structure
bottom-up re-combination of intermediate results
(| n.n , (n,l ).n+l |)
Claus Brabrand, ITU
BANANA ALGEBRA
[4]
ETAPS/LDTA, York
March 28, 2009
Language Transformation

Bananas (statically typed):
(| LS -> LT [] c |)
LS -> LT
Source language: 'LS'
list
=
Num N
|
Cons N * list
Target language: 'LT'
tree
=
Nil
|
Leaf N
|
Node N * tree * tree
Nonterminal-typing: ' '
[list -> tree]
Reconstructors: 'c '
[Num n] = Leaf n
[Cons n l] = Node n (Nil) [l]
Claus Brabrand, ITU
BANANA ALGEBRA
[5]
Type-check'able!
ETAPS/LDTA, York
March 28, 2009
"Growing Languages with Metamorphic Syntax Macros"
[ Claus Brabrand | Michael Schwartzbach ] ( PEPM 2002 )
Banana Properties
"The metafront System: Safe and Extensible Parsing and Transformation"
[ Claus Brabrand | Michael Schwartzbach ] ( LDTA 2003, SCP J. 2007 )
Banana properties:
Simple
Safe
Efficient
(Expressive)
(corresponds to: “simple recursion”)
(syntactically safe + always terminate)
(linear time in size of input + output)
(…enough for interesting extensions)
Banana Algebra “for free” (16 banana ops):
Modular
Incremental
Simple
Safe
Efficient
(Expressive)
Claus Brabrand, ITU
Statically reduce:
Banana Algebra (term)  Banana (term)
BANANA ALGEBRA

[6]
ETAPS/LDTA, York
March 28, 2009
Outline
Introduction: "What is a Banana?"
Bananas for Language Transformation
Language Extension Pattern
Banana Algebra
Examples
Implementation
Related Work
Conclusion
Claus Brabrand, ITU
BANANA ALGEBRA
[7]
ETAPS/LDTA, York
March 28, 2009
Language Extension Pattern
Numeral extension: 'LS'
Exp :
:
:
:
:
:
var Id
lam Id * Exp
app Exp * Exp
zero
succ Exp
pred Exp
Lambda-Calculus: 'LT'
Exp : var Id
: lam Id * Exp
: app Exp * Exp
Nonterminal typing: ' '
[Exp -> Exp]
Reconstructors: 'c '
[var V]
[lam V E]
[app E1 E2]
[zero]
[succ E]
[pred E]
Claus Brabrand, ITU
=
=
=
=
=
=
var
lam
app
lam
lam
app
V
Catamorphism:
V [E ]
(| LS -> LT [] c |)
[E1] [E2]
z (var z)
Using very simple
s [E ]
numeral encoding
[E ] (lam z (var z))
BANANA ALGEBRA
[8]
ETAPS/LDTA, York
March 28, 2009
Algebraic Solution
ln+l  l
+
ln  l
ll
(| ln -> l [Exp -> Exp]
[zero] = lam z (var z)
[succ E] = lam s [E ]
[pred E] = app [E ] ...
idx
|)
l
Exp : var Id
: lam Id * Exp
: app Exp * Exp
Claus Brabrand, ITU
BANANA ALGEBRA
ln
Exp : zero
: succ Exp
: pred Exp
[9]
ETAPS/LDTA, York
March 28, 2009
Banana Algebra
Languages (L):
Transformations (X):
{ CFG }
(| L -> L [] c |)
l
x
v
L\L
L+L
src ( X )
tgt ( X )
let v = L in L
letx w = X in L
w
X\L
X+X
X X
idx ( L )
let v = L in X
letx w = X in X
Claus Brabrand, ITU
BANANA ALGEBRA
[ 10 ]
ETAPS/LDTA, York
March 28, 2009
Algebraic Laws
Idempotency of '+':
L  L + L
Commutativity of '+':
L1 + L2  L2 + L1
Associativity of '+':
L1 + (L2 + L3)  (L1 + L2) + L3
Source-identity:
Target-identity:
L  tgt(idx(L))
L  src(idx(L))
…
Claus Brabrand, ITU
BANANA ALGEBRA
[ 11 ]
ETAPS/LDTA, York
March 28, 2009
Outline
Introduction: "What is a Banana?"
Bananas for Language Transformation
Language Extension Pattern
Banana Algebra
Examples
Implementation
Related Work
Conclusion
Claus Brabrand, ITU
BANANA ALGEBRA
[ 12 ]
ETAPS/LDTA, York
March 28, 2009
Example Revisited
--- "ln2l.x" --let l = "l.l"
in let ln = "ln.l"
in
idx(l) +
(| ln -> l [Exp
Exp.zero =
Exp.succ =
Exp.pred =
|)
--- "l.l" --{
-> Exp]
'\z.z'
;
'\s.$1'
;
'($1 \z.z)' ;
--- "ln.l" --{
Id
= [a-z] [a-z0-9]* ;
Exp.var : Id
;
Exp.lam : "\\" Id "." Exp ;
Exp.app : "(" Exp Exp ")" ;
Exp.zero : "zero"
;
Exp.succ : "succ" "(" Exp ")" ;
Exp.pred : "pred" "(" Exp ")" ;
}
}
Claus Brabrand, ITU
BANANA ALGEBRA
[ 13 ]
ETAPS/LDTA, York
March 28, 2009
Numerals + Booleans
…with Nums & Bools?
…with Nums
+
+
lb+l  l
ln+l  l
ln l
idx
ln
l
Claus Brabrand, ITU
l+ln+lb  l
BANANA ALGEBRA
ll
[ 14 ]
…with Bools
+
idx
lb  l
l
lb
ETAPS/LDTA, York
March 28, 2009
Java + Repeat
575 lines
--- "java.l" --{
Java
...
"try" Stm "catch"
...
Name.id : Id ;
}
--- "repeat.l" --{
Stm.repeat
:
"repeat" Stm "until" "(" Exp ")" ";"
;
}
--- "repeat2java.x" --let java = "java.l"
in let repeat = "repeat.l"
in
idx(java) +
(| repeat -> java [Exp -> Exp, Stm -> Stm]
7 lines !
Stm.repeat
=
'do $1 while (!($2));'
;
|)
Claus Brabrand, ITU
BANANA ALGEBRA
[ 15 ]
ETAPS/LDTA, York
March 28, 2009
Concrete vs. Abstract Syntax
Concrete syntax:
Stm.repeat
=
'do $1 while (!($2));' ;
Exp (with explicit assoc./prec.):
Exp.or
.exp1
Exp1.and
.exp2
Exp2.add
.exp3

Exp7.neg
.exp8
Exp8.par
.var
.num
:
:
:
:
:
:
Exp1 "||" Exp ;
Exp1
;
Exp2 "&&" Exp1 ;
Exp2
;
Exp3 "+" Exp2 ;
Exp3
;
:
:
:
:
:
"!" Exp8
Exp8
"(" Exp ")"
Id
IntConst
;
;
;
;
;
NB: Tool supports BOTH !
Claus Brabrand, ITU
BANANA ALGEBRA
Abstract syntax:
Stm.repeat =
Stm.do(<1>,
Exp.exp1(
Exp1.exp2(
Exp2.exp3(
Exp3.exp4(
Exp4.exp5(
Exp5.exp6(
Exp6.exp7(
Exp7.neg(
Exp8.par(<2>)
))))))))) ;
(unambiguous: concrete  abstract)
[ 16 ]
ETAPS/LDTA, York
March 28, 2009
"FUN" Example
The "FUN" Language: (at Aarhus University)
used for Teaching Functional Programming
Fun
Fun grammar transform
Literals
Basically The Lambda Calculus with…:
numerals, booleans, arithmetic, boolean
logic, local definitions, pairs, literals, lists,
signs, comparisons, dynamic types,
fixed-point combinators, …
Literals→Nums
Unsigned arithmetic + booleans + definitions + pairs
Nums→ + Bools→ + Defs→
+ Pairs→
Lambda Calculus
Claus Brabrand, ITU
BANANA ALGEBRA
[ 17 ]
ETAPS/LDTA, York
March 28, 2009
"FUN" Example
Fun
Component re-use
Fun grammar transform
Fun + FunSigned
Fun grammar transform + FunSigned GT
Literals
Literals→Nums
Literals→Nums
Signed arith→Nums
Unsigned arithmetic + booleans + definitions + pairs
Nums→ + Bools→ + Defs→
+ Pairs→
Lambda Calculus
Claus Brabrand, ITU
BANANA ALGEBRA
[ 18 ]
ETAPS/LDTA, York
March 28, 2009
"FUN" Example
Fun + FunSigned + FunCompare + FunTypesafe
Fun GT + FunSigned GT + FunCompare GT + FunTypesafe GT
245x Banana Algebra ops  4 MB Banana !
Unsigned arithmetic + booleans + definitions + pairs
Nums→ + Bools→ + Defs→
+ Pairs→
Lambda Calculus
Claus Brabrand, ITU
BANANA ALGEBRA
[ 19 ]
ETAPS/LDTA, York
March 28, 2009
"FUN" Usage Statistics
Usage statistics (245x operators) in "FUN":
58x
51x
28x
23x
17x
17x
14x
10x
9x
8x
4x
4x
2x
Claus Brabrand, ITU
{ …cfg… }
Constant languages
"file.l"
Language inclusions
L + L
Language additions
v
Language variables
(|L  L [] c|) Constant transformations
X + X
Transformation additions
"file.x"
Transformation inclusions
let-in
Local definitions
idx(L)
Identity transformations
X
X
Compositions
L \ L
Language restriction
w
Transformation variables
src(X)
Source extractions
BANANA ALGEBRA
[ 20 ]
ETAPS/LDTA, York
March 28, 2009
Other Examples
Self-Application (The tool on itself!):
[L1 << L2]
[X1 << X2]
=
=
'(L1 \ L2) + L2'
'(X1 \ src(X2)) + X2'
SQL embedding (in <bigwig>):
Stm.select =
'factor (<2>) {
if (<3>) return ( # \+ (<1>) );
}'
My-Java (endless variations):
((java \ loops) + sql) o syntaxe_francais
Claus Brabrand, ITU
BANANA ALGEBRA
[ 21 ]
ETAPS/LDTA, York
March 28, 2009
Implementation
The 'Banana Algebra' Tool:
(3,600 lines of O'Caml)
[ http://www.itu.dk/people/brabrand/banana-algebra/ ]
Uses (underlying technologies):
'dk.brics.grammar': for parsing, unparsing, and ambiguity analysis !
'XSugar': for transformation: "concrete syntax  abstract XML syntax"
'XSLT': for transformation: "XML  XML"
Claus Brabrand, ITU
BANANA ALGEBRA
[ 22 ]
ETAPS/LDTA, York
March 28, 2009
Outline
Introduction: "What is a Banana?"
Bananas for Language Transformation
Language Extension Pattern
Banana Algebra
Examples
Implementation
Related Work
Conclusion
Claus Brabrand, ITU
BANANA ALGEBRA
[ 23 ]
ETAPS/LDTA, York
March 28, 2009
Related Work (I/III)
Macro Systems:
"Growing Languages with Metamorphic Syntax Macros"
[ Claus Brabrand | Michael Schwartzbach ] ( PEPM 2002 )
"The metafront System: Safe and Extensible Parsing and Transformation"
[ Claus Brabrand | Michael Schwartzbach ] ( LDTA 2003 , SCP J. 2007 )
Claus Brabrand, ITU
BANANA ALGEBRA
[ 24 ]
ETAPS/LDTA, York
March 28, 2009
Related Work (II/III)
Attribute Grammars:
Language transformation (and extension)…
…via computation on AST's
(using "inherited" or "synthesized" or … attributes)
E.g., Eli, JastAdd, Silver, Both;
… compared to bananas:
More ambitious
(expressivity)
No termination guarantees (safety)
Transformation "indirect" (simplicity)
Rewrite Systems:
Language transformation (and extension)…
…via syntactic rewriting, using encodings…:
S  T gradually rewrite "S-syntax" to "T-syntax"
S  T
gradually rewrite "S-syntax" to "T-syntax"
E.g., Elan, TXL, ASF+SDF, Stratego/XT, …
Claus Brabrand, ITU
BANANA ALGEBRA
[ 25 ]
ETAPS/LDTA, York
March 28, 2009
Related Work (III/III)
Functional Programming:
Catas mimicked by "disciplined style" of fun. programming
…aided by:
Basically ye
olde issue:
GPL vs. DSL
Traversal functions (auto-synthesized from datatypes)
Combinator libraries
"Shortcut fusion" (to eliminate ' ' at compile-time)
Category Theory:
A lot of this work can be
viewed as Category Theory:
Claus Brabrand, ITU
BANANA ALGEBRA
[ 26 ]
ETAPS/LDTA, York
March 28, 2009
Conclusion
IF bananas are sufficiently:
"Niche"
(Expressive)
THEN you get…:
Banana Algebra “for free” (16 banana ops):
Incremental
Modular
Simple
Safe
Efficient
Claus Brabrand, ITU
Statically reduce:
Banana Algebra (term)  Banana (term)
BANANA ALGEBRA

[ 27 ]
ETAPS/LDTA, York
March 28, 2009
BONUS SLIDES
- Reduction Semantics If you want all the details:
"Syntactic Language Extension via an Algebra of Languages and Transformations"
[ Jacob Andersen | Claus Brabrand ] ( ITU Technical Report, Dec. 2008 )
Claus Brabrand, ITU
BANANA ALGEBRA
[ 28 ]
ETAPS/LDTA, York
March 28, 2009
Reduction Semantics
Environments:
ENVL = VARL  EXPL
environment of languages
ENVX = VARX  EXPX
environment of transformations
Reduction relations:
ENVL  ENVX  EXPL  EXPL  'L'
ENVL  ENVX  EXPX  EXPX  'X'
Abbreviations:
, |- L L l ...as a short-hand for: (,,L,l)  'L'
, |- X X x ...as a short-hand for: (,,X,x)  'X'
Claus Brabrand, ITU
BANANA ALGEBRA
[ 29 ]
ETAPS/LDTA, York
March 28, 2009
Semantics (L)
[CONL]
,
l L l
,
,
[RESL]
,
,
,
[ADDL]
,
Claus Brabrand, ITU
wfl
l
[VARL]
L L l
L' L l'
L \ L' L l
L L l
L' L l'
L + L' L l
BANANA ALGEBRA
[ 30 ]
l
,
l
l'
v L (v)
l'
l ~l l'
ETAPS/LDTA, York
March 28, 2009
Semantics (L)
[SRCL]
[TGTL]
[LETL]
X X (| lS -> lT [] c |)
,
,
X X (| lS -> lT [] c |)
,
,
Claus Brabrand, ITU
src (X) L lS
,
L
,
tgt (X) L lT
L l
[v=l],
L' L l'
let v=L in L' L l'
BANANA ALGEBRA
[ 31 ]
ETAPS/LDTA, York
March 28, 2009
Semantics (X)
[CONX]
[VARX]
,
(| lS -> lT [] c |)
,
LS
,
(| LS -> LT [] c |) X (| lS -> lT [] c |)
,
L lS
w X (w)
,
,
[ADDX]
,
Claus Brabrand, ITU
LT
L lT
,
,
[RESX]
,
X X x
X' X x'
X + X' X x
BANANA ALGEBRA
[ 32 ]
x
wfx
X X x
L L l
X \ L X x
x'
x
l
x ~x x'
ETAPS/LDTA, York
March 28, 2009
Semantics (X)
[COMPX]
[IDXX]
,
,
X X (| lS -> lT [ ] c |)
X' X (| lS' -> lT' ['] c' |)
,
X'
X X (| lS -> lT' [' ] c' c |)
,
,
[LETL]
lT
l lS'
L L l
idx (L) X (| l -> l [id(l)] idc(l) |)
,
Claus Brabrand, ITU
X X x
,
,[w=x]
X' X x'
letx w=X in X' X x'
BANANA ALGEBRA
[ 33 ]
ETAPS/LDTA, York
March 28, 2009
BONUS SLIDES
- More Examples -
Claus Brabrand, ITU
BANANA ALGEBRA
[ 34 ]
ETAPS/LDTA, York
March 28, 2009
Numeral & Boolean Extension
Numeral Extension (catamorphism):
[var V]
[lam V E]
[app E1 E2]
[zero]
[succ E]
[pred E]
=
=
=
=
=
=
var
lam
app
lam
lam
app
[V ]
[V ] [E ]
[E1] [E2]
z (var z)
s [E ]
[E ] (lam z (var z))
Exp :
:
:
:
:
:
var Id
lam Id * Exp
app Exp * Exp
zero
succ Exp
pred Exp
Exp : var Id
: lam Id * Exp
: app Exp * Exp
Boolean Extension (catamorphism):
[var V]
[lam V E]
[app E1 E2]
[true]
[false]
[if E1 E2 E3]
Claus Brabrand, ITU
=
=
=
=
=
=
var
lam
app
lam
lam
app
[V ]
[V ] [E ]
[E1] [E2]
a (lam b (var a))
a (lam b (var b))
(app [E1] [E2]) [E3]
BANANA ALGEBRA
[ 35 ]
ETAPS/LDTA, York
Exp :
:
:
:
:
:
var Id
lam Id * Exp
app Exp * Exp
true
false
if Exp Exp Exp
Exp : var Id
: lam Id * Exp
: app Exp * Exp
March 28, 2009
Lambda with Booleans
+
lb+l  l
lb  l
ll
(| lb -> l [Exp -> Exp]
idx
[true] = '\a.\b . a'
[false] = '\a.\b . b'
[if E1 E2 E3] = '(([E1] [E2]) [E3])'
|)
l
Exp : var Id
: lam Id * Exp
: app Exp * Exp
Claus Brabrand, ITU
BANANA ALGEBRA
lb
Exp : true
: false
: if Exp Exp Exp
[ 36 ]
ETAPS/LDTA, York
March 28, 2009
Incremental Development
--- "li.l" ---
--- "l.l" --{
Id
Exp.var
Exp.lam
Exp.app
=
:
:
:
[a-z] [a-z0-9]*
Id
"\\" Id "." Exp
"(" Exp Exp ")"
;
;
;
; }
--- "ln.l" --{
Exp.zero
Exp.succ
Exp.pred
:
:
:
"zero"
"succ" Exp
"pred" Exp
;
;
;
}
{
--- "li2l.x" --let l = "l.l" in
idx(l) +
(| "li.l" -> l [Exp -> Exp]
Exp.id : '\z.z' ;
|)
--- "ln2l.x" --let l = "l.l" in
idx(l) +
(| "ln.l" -> l [Exp -> Exp]
Exp.zero : '\z.z'
Exp.succ : '\x.$1'
Exp.pred : '($1 \z.z)'
|)
Exp.id : "id" ; }
--- "ln2li.x" ---
;
;
;
let l = "l.l" in
idx(l) +
(| ln -> l+"li.l" [Exp -> Exp]
Exp.zero : 'id'
;
Exp.succ : '\x.$1'
;
Exp.pred : '($1 id)' ;
|)
--- "ln2l.x" --Claus Brabrand, ITU
BANANA ALGEBRA
[ 37 ]
"li2l.x" York
o "ln2li.x"
March 28, 2009
ETAPS/LDTA,
Example cont'd
Both statically reduce to same catamorphism:
(|
{ Id
=
[a-z] [0-9a-z]*
Exp.app
Exp.lam
Exp.pred
Exp.succ
Exp.var
Exp.zero
:
:
:
:
:
:
;
"(" Exp Exp ")" ;
"\" Id "." Exp ;
"pred" Exp ;
"succ" Exp ;
Id ;
"zero" ;
}
{ Id
->
=
[a-z] [0-9a-z]*
;
Exp.app : "(" Exp Exp ")" ;
Exp.lam : "\" Id "." Exp ;
Exp.var : Id ;
}
[Exp -> Exp, Id->Id]
Exp.app
Exp.lam
Exp.pred
Exp.succ
Exp.var
Exp.zero
:
:
:
:
:
:
Exp.app($1, $2) ;
Exp.lam($1, $2) ;
Exp.app($1, Exp.lam(Id("z"), Exp.var(Id("z"))))
Exp.lam(Id("x"), $1) ;
Exp.var($1) ;
Exp.lam(Id("z"), Exp.var(Id("z"))) ;
;
|)
Claus Brabrand, ITU
BANANA ALGEBRA
[ 38 ]
ETAPS/LDTA, York
March 28, 2009
Usage Scenarios
Programmers:
May extend existing languages (~ syntax macros)
Developers:
May embed DSLs into host languages (SQL in Java)
Developers (and teachers):
May incrementally specify multi-layered languages
Compiler writers:
May rely on tool and implement only a small core
(and then specify the rest externally as extensions)
Claus Brabrand, ITU
BANANA ALGEBRA
[ 40 ]
ETAPS/LDTA, York
March 28, 2009
BONUS SLIDES
- Parsing & Error Reporting -
Claus Brabrand, ITU
BANANA ALGEBRA
[ 41 ]
ETAPS/LDTA, York
March 28, 2009
Parsing
Parsing (XSugar):
Variant of Earley's algorithm: O ( ||3 )
Can parse any context-free grammar
Closed under union of languages
Support for production priority
Tool easily adapts to other parsing algorithms
Claus Brabrand, ITU
BANANA ALGEBRA
[ 42 ]
ETAPS/LDTA, York
March 28, 2009
Ambiguity: parsingunparsing
.
.
.
L
Parsing:
ASTL / ~L

.
L
Unparsing:
BANANA ALGEBRA
.
Canonical whitespace
Grammar ambiguity
Claus Brabrand, ITU
ASTL / ~L
[ 43 ]
ETAPS/LDTA, York
March 28, 2009
Ambiguity Analysis
Ambiguity Analysis:
"Analyzing Ambiguity of Context-Free Grammars"
[ Claus Brabrand | Robert Giegerich | Anders Møller ] ( CIAA 2007 )
" ) on:
Using implementation ( "dk.brics.grammar
[ by Anders Møller ]
Source language;
Target language; and/or
…all intermediate languages (somewhat expensive)
(Note: Ambiguity analysis comes with XSugar tool)
Claus Brabrand, ITU
BANANA ALGEBRA
[ 44 ]
ETAPS/LDTA, York
March 28, 2009
Error Reporting
Error reporting:
Prototype
Static parse-error (O'Caml-lex):
*** In ln2l.x (4,4)-(4,7):
Parse error at "Exp"
Static transformation error (XSugar):
*** Parse error at character 6 (line 1, column 7)
in /tmp/shape84e645.txt
Could be
improved
(is actually a parse-error in a cata reconstructor)
Dynamic parse-error (XSugar):
*** Parse error at character 23 (line 1, column 24)
in /dev/stdin
Dynamic transformation error:
impossible :-)
Claus Brabrand, ITU
BANANA ALGEBRA
[ 45 ]
ETAPS/LDTA, York
March 28, 2009
Descargar

Document