```Metatheoretic Approaches to
Programming Language
Specification & Verification
Rob Simmons
Princeton University
Introduction
• “A precise description of a programming language
is a prerequisite for its implementation and its
use” (The Definition of Standard ML, 1990)
Introduction
• “A precise description of a programming language
is a prerequisite for its implementation and its
use” (The Definition of Standard ML, 1990)
• “We hold that ‘formal’ means ‘admits logical
reasoning’” (C formalised in HOL, 1998)
Introduction
• “A precise description of a programming language
is a prerequisite for its implementation and its
use” (The Definition of Standard ML, 1990)
• “We hold that ‘formal’ means ‘admits logical
reasoning’” (C formalised in HOL, 1998)
• “Use computers to check your proof. Note that
[this] task is not trivial. A whole new scientific
discipline may be needed. Let me call it
Pedantics.” (Yuri Gurevich)
Introduction
• Lots of theorem proving systems!
–
–
–
–
–
–
–
–
ACL2
Coq
HOL
Isabelle
Nuprl
LEGO
PVS
Twelf
• Twelf can do things others can’t
Outline of Talk
• Introduction to Twelf
• Semantics in Twelf
– Denotational semantics
– Axiomatic semantics
• Proofs as programs
• Metalogic proofs
• Safety for the simply typed lambda calculus
Introduction to Twelf syntax
• Define new categories
nat : type.
exp : type.
tp : type.
animal : type.
_
Introduction to Twelf syntax
• Define new categories
nat : type.
exp : type.
tp : type.
animal : type.
_
Introduction to Twelf syntax
• Define new categories
nat : type.
• Define elements in the categories
z : nat.
s : nat -> nat.
four : nat =
s (s (s (s z))).
% datatype nat = Z
%
| S of nat
% val four = S (S (S (S Z)))
n : nat -> exp.
int : tp.
plus : exp -> exp -> exp.
test : exp = plus (n four) (n (s z)).
Introduction to Twelf syntax
• Define new categories
nat : type.
• Define elements in the categories
z : nat.
s : nat -> nat.
four : nat =
s (s (s (s z))).
% datatype nat = Z
%
| S of nat
% val four = S (S (S (S Z)))
n : nat -> exp.
int : tp.
plus : exp -> exp -> exp.
test : exp = plus (n four) (n (s z)).
Introduction to Twelf syntax
• Define new categories
nat : type.
• Define elements in the categories
z : nat.
s : nat -> nat.
four : nat =
s (s (s (s z))).
% datatype nat = Z
%
| S of nat
% val four = S (S (S (S Z)))
• Define relationships between categories
sum : nat -> nat -> nat -> type.
sum : {a: nat}{b: nat}{c: nat} type. % Equivalent
Introduction to Twelf syntax
• Define new categories
nat : type.
• Define elements in the categories
z : nat.
s : nat -> nat.
four : nat =
s (s (s (s z))).
% datatype nat = Z
%
| S of nat
% val four = S (S (S (S Z)))
• Define relationships between categories
sum : nat -> nat -> nat -> type.
sum : {a: nat}{b: nat}{c: nat} type. % Equivalent
typed : exp -> tp -> type.
steps : exp -> exp -> type.
What does a judgment ‘mean’?
• Right now, we know nothing about sum, other
than that it is a three-part relation.
• I called it “sum,” so I apparently have some
expected meaning for it.
What does a judgment ‘mean’?
• Right now, we know nothing about sum, other
than that it is a three-part relation.
• I called it “sum,” so I apparently have some
expected meaning for it.
• How do we specify that 2 + 1 = 3 ?
i.e. sum (s (s z)) (s z) (s (s (s z)))
• How do we specify that addition is commutative?
What does a judgment ‘mean’?
• Right now, we know nothing about sum, other
than that it is a three-part relation.
• I called it “sum,” so I apparently have some
expected meaning for it.
• How do we specify that 2 + 1 = 3 ?
i.e. sum (s (s z)) (s z) (s (s (s z)))
• How do we specify that addition is commutative?
• (At least) two ways of looking at this
– Assign an underlying semantic meaning, write a proof
(denotational semantics, object logic)
– Write an inductive definition, write an inductive proof
(operational semantics, metalogic)
Assigning semantic meaning
nat : type = tm num.
z : nat = const 0.
s : nat -> nat = [x] const 1 + x.
Assigning semantic meaning
nat : type = tm num.
z : nat = const 0.
s : nat -> nat = [x] const 1 + x.
sum : nat -> nat -> nat -> type =
[a][b][c] pf (a + b == c).
Assigning semantic meaning
nat : type = tm num.
z : nat = const 0.
s : nat -> nat = [x] const 1 + x.
sum : nat -> nat -> nat -> type =
[a][b][c] pf (a + b == c).
sum1+1 : sum (s z) (s z) (s (s z)) =
cut (plus_cong plus_zero plus_zero) [p1]
trans3 p1 (symm plus_zero) plus_assoc.
sum_comm : sum N1 N2 N3 -> sum N2 N1 N3 =
[p1 : pf (N1 + N2 == N3)]
Assigning semantic meaning
nat : type = tm num.
z : nat = const 0.
s : nat -> nat = [x] const 1 + x.
sum : nat -> nat -> nat -> type =
[a][b][c] pf (a + b == c).
sum1+1 : sum (s z) (s z) (s (s z)) =
cut (plus_cong plus_zero plus_zero) [p1]
Has Been Checked By A Computer
trans3 p1This
(symm
plus_zero) plus_assoc.
sum_comm : sum N1 N2 N3 -> sum N2 N1 N3 =
[p1 : pf (N1 + N2 == N3)]
This Has Been Checked By A Computer
Assigning axiomatic meaning
nat : type.
z : nat.
s : nat -> nat.
sum : nat -> nat -> nat -> type.
Assigning axiomatic meaning
nat : type.
z : nat.
s : nat -> nat.
sum : nat -> nat -> nat -> type.
sum_z : sum z N N.
sum_s : sum (s N1) N2 (s N3) <- sum N1 N2 N3.
sum N1 N2 N3
sum (s N1) N2 (s N3)
sum_s
sum_z
sum z N N
Assigning axiomatic meaning
1 = s z.
2 = s 1.
3 = s 2.
sum2+1 : sum 2 1 3 =
???.
sum (s (s z)) (s z) (s (s (s z)))
2
1
3
Assigning axiomatic meaning
1 = s z.
2 = s 1.
3 = s 2.
sum2+1 : sum 2 1 3 =
sum_s (sum_s sum_z).
sum_z
sum z (s z) (s z)
sum_s
sum (s z) (s z) (s (s z))
sum_s
sum (s (s z)) (s z) (s (s (s z)))
2
1
3
Assigning axiomatic meaning
1 = s z.
2 = s 1.
3 = s 2.
sum2+1 : sum 2 1 3 =
sum_s (sum_s sum_z).
Both of these represent an
OBJECT
sum_z
sum z (s z) (s z)
sum_s
sum (s z) (s z) (s (s z))
sum_s
sum (s (s z)) (s z) (s (s (s z)))
2
1
3
Axiomatic Relations
Relations (which represent judgments or proofs)
sum, mul,
typed, steps, etc…
Axiomatic Relations
Relations (which represent judgments or proofs)
sum, mul,
typed, steps, etc…
Relations
representing proofs
Axiomatic Relations
Relations (which represent judgments or proofs)
sum, mul,
typed, steps, etc…
Relations
representing proofs
Relations
AS PROGRAMS
Proofs as programs
• The input to Twelf
%solve sum2+1 : sum 2 1 _.
Proofs as programs
• The input to Twelf
%solve sum2+1 : sum 2 1 _.
• Twelf’s response
sum2+1 : sum 2 1 (s (s 1)) =
sum_s (sum_s sum_z).
%% OK %%
Proofs as programs
• The input to Twelf
%solve sum2+1 : sum 2 1 _.
• Twelf’s response
sum2+1 : sum 2 1 (s (s 1)) =
sum_s (sum_s sum_z).
%% OK %%
• The input to Twelf
%solve sum2+1 : sum _ 1 _.
• Twelf’s response
sum2+1 : sum z 1 1 = sum_z.
%% OK %%
Proofs as programs
• It is sometimes useful to think at these definitions
as a function that recursively calls itself - so the
second line (the subgoal) acts like a recursive call
to sum.
sum_s : sum (s N1) N2 (s N3)
<- sum N1 N2 N3.
• Mode declarations
%mode sum +N1 +N2 –N3.
• “If the first two positions contain well-defined (i.e.
ground, without metavariables) inputs, then IF the
search succeeds in a finite amount of time, the
search will generate well-defined outputs”
• Mode declarations
%mode sum +N1 +N2 –N3.
• “If the first two positions contain well-defined (i.e.
ground, without metavariables) inputs, then IF the
search succeeds in a finite amount of time, the
search will generate well-defined outputs”
sum_z_bad : sum z N N'.
Occurrence of variable N' in output (-) argument not
necessarily ground
%% ABORT %%
• World declarations
%mode sum +N1 +N2 –N3.
%worlds () (sum _ _ _).
• The program is using the “closed worlds
assumption” - no new things (a new natural
number, or a new rule for sum) will appear during
execution.
• World declarations
%mode sum +N1 +N2 –N3.
%worlds () (sum _ _ _).
• The program is using the “closed worlds
assumption” - no new things (a new natural
number, or a new rule for sum) will appear during
execution.
sum_s_bad : sum (s N1) N2 (s N3)
<- {n: nat} sum N1 n N3.
World violation
%% ABORT %%
• Totality declarations
%mode sum +N1 +N2 –N3.
%worlds () (sum _ _ _).
%total N1 (sum N1 N2 N3).
• The program acts like a total function
–
–
–
–
This involves three things:
Termination
Input coverage
Output coverage
• Totality declarations & termination
%mode sum +N1 +N2 –N3.
%worlds () (sum _ _ _).
%total N1 (sum N1 N2 N3).
• The program always terminates (has a term which
always strictly decreases in size)
sum_bad : sum N1 N2 N3
<- sum N2 N1 N3.
• Totality declarations & termination
%mode sum +N1 +N2 –N3.
%worlds () (sum _ _ _).
%total N1 (sum N1 N2 N3).
• The program always terminates (has a term which
always strictly decreases in size)
sum_bad : sum N1 N2 N3
<- sum N2 N1 N3.
Termination violation:
---> (N2) < (N1)
%% ABORT %%
• Totality declarations & input coverage
%mode sum +N1 +N2 –N3.
%worlds () (sum _ _ _).
%total N1 (sum N1 N2 N3).
• All possible inputs are covered (case analysis)
• Totality declarations & input coverage
%mode sum +N1 +N2 –N3.
%worlds () (sum _ _ _).
%total N1 (sum N1 N2 N3).
• All possible inputs are covered (case analysis)
• If we remove the sum_z rule, Twelf will complain
Coverage error --- missing cases:
{X1:nat} {X2:nat} |- sum z X1 X2.
%% ABORT %%
• Totality declarations & output coverage
%mode sum +N1 +N2 –N3.
%worlds () (sum _ _ _).
%total N1 (sum N1 N2 N3).
• All possible OUTPUTS are covered
• Totality declarations & output coverage
%mode sum +N1 +N2 –N3.
%worlds () (sum _ _ _).
%total N1 (sum N1 N2 N3).
• All possible OUTPUTS are covered
sum_s_bad : sum (s N1) N2 (s N3)
<- sum N1 N2 (s N3).
Totality: Output of subgoal not covered
Output coverage error --- missing cases:
{X1:nat} {X2:nat} |- sum X1 X2 z.
%% ABORT %%
• Totality declarations
%mode sum +N1 +N2 –N3.
%worlds () (sum _ _ _).
%total N1 (sum N1 N2 N3).
• The sum function is TOTAL
–
–
–
–
–
It has clearly defined inputs and outputs
No new terms will be defined during execution
It always terminates
It will always be able to handle any inputs
Recursive calls place no constraints on outputs
Metatheorems in Twelf
Relations (which represent judgments or proofs)
sum, mul,
typed, steps, etc…
Relations
representing proofs
Relations
AS PROGRAMS
Metatheorems in Twelf
Metatheorems in Twelf
• If N1 + N2 = N3, then N2 + N1 = N3
Metatheorems in Twelf
• If N1 + N2 = N3, then N2 + N1 = N3
• If I can derive “sum N1 N2 N3,”
then “sum N2 N1 N3” should also be derivable
Metatheorems in Twelf
• If N1 + N2 = N3, then N2 + N1 = N3
• If I can derive “sum N1 N2 N3,”
then “sum N2 N1 N3” should also be derivable
sum_comm: sum N1 N2 N3 -> sum N2 N1 N3
-> type.
%mode sum_comm +SUM -SUM'.
• Can we show this is total?
Metatheorems in Twelf
sum_comm: sum N1 N2 N3 -> sum N2 N1 N3 -> type.
%mode sum_comm +SUM -SUM'.
sum_ident : {N: nat} sum N z N -> type.
%mode sum_ident +N -SUM.
sum_ident_z : sum_ident z (sum_z : sum z z z).
sum_ident_s : sum_ident (s N) ((sum_s SUM): sum (s N) z (s N))
<- sum_ident N (SUM: sum N z N).
%worlds () (sum_ident _ _).
%total N (sum_ident N _).
sum_inc : sum A B C -> sum A (s B) (s C) -> type.
%mode sum_inc +SUM -SUM'.
sum_inc_z : sum_inc (sum_z : sum z N N) (sum_z : sum z (s N) (s N)).
sum_inc_s : sum_inc (sum_s D) (sum_s D')
<- sum_inc D D'.
%worlds () (sum_inc _ _).
%total SUM (sum_inc SUM _).
& : sum_comm (sum_z : sum z N N) SUM
<- sum_ident N SUM.
& : sum_comm (sum_s SUM) SUM'‘
<- sum_comm SUM SUM‘
<- sum_inc SUM' SUM''.
%worlds () (sum_comm _ _).
%total SUM (sum_comm SUM SUM').
Metatheorems in Twelf
sum_comm: sum N1 N2 N3 -> sum N2 N1 N3 -> type.
%mode sum_comm +SUM -SUM'.
sum_ident : {N: nat} sum N z N -> type.
%mode sum_ident +N -SUM.
sum_ident_z : sum_ident z (sum_z : sum z z z).
sum_ident_s : sum_ident (s N) ((sum_s SUM): sum (s N) z (s N))
<- sum_ident N (SUM: sum N z N).
%worlds () (sum_ident _ _).
%total N (sum_ident N _).
sum_inc : sum A B C -> sum A (s B) (s C) -> type.
%mode sum_inc +SUM
-SUM'.
This
Has Been Checked By A Computer
sum_inc_z : sum_inc (sum_z : sum z N N) (sum_z : sum z (s N) (s N)).
sum_inc_s : sum_inc (sum_s D) (sum_s D')
<- sum_inc D D'.
%worlds () (sum_inc _ _).
%total SUM (sum_inc SUM _).
& : sum_comm (sum_z : sum z N N) SUM
<- sum_ident N SUM.
& : sum_comm (sum_s SUM) SUM'‘
<- sum_comm SUM SUM‘
<- sum_inc SUM' SUM''.
%worlds () (sum_comm _ _).
%total SUM (sum_comm SUM SUM').
Now what?
•Introduction of the simply typed
lambda calculus in Twelf
•Proof of progress (it turns out to be the
most interesting one in several
respects)
•The full safety proof is on the web at
http://www.princeton.edu/~rsimmons/talk.elf
The simply typed λ-calculus
•Expressions
e := ()
| λ x:t.e
|ee
•Types
t := unit
|t→t
The simply typed λ-calculus
•Expressions
exp : type.
unit : exp.
lam : tp -> (exp -> exp) -> exp.
@ : exp -> exp -> exp.
%infix left 10 @.
e := ()
| λ x:t.e
|ee
•Types
t := unit
|t→t
The simply typed λ-calculus
•Expressions
exp : type.
unit : exp.
lam : tp -> (exp -> exp) -> exp.
@ : exp -> exp -> exp.
%infix left 10 @.
e := ()
| λ x:t.e
|ee
•Types
tp : type.
unittp : tp.
=> : tp -> tp -> tp.
%infix right 10 =>.
t := unit
|t→t
The simply typed λ-calculus
Γ├ () : unit
•The typing relation
Γ[x:t]├ e : t’
Γ├ e : t → t’
Γ├ λ x:t.e : t → t’
Γ├ e : t
Γ├ e e’ : t’
Γ(x) = t
Γ├ x : t
The simply typed λ-calculus
•The typing relation
Γ[x:t]├ e : t’
Γ├ λ x:t.e : t → t’
Γ├ () : unit
Γ├ e : t → t’
Γ├ e : t
Γ├ e e’ : t’
typed : exp -> tp -> type.
t_unit : typed unit unittp.
t_lam : typed (lam T ([x] E x)) (T => T')
<- ({x} typed x T -> typed (E x) T').
t_app : typed (E1 @ E2) T
<- typed E1 (T' => T)
<- typed E2 T'.
The simply typed λ-calculus
•The value relation
isval : exp -> type.
v_unit : isval unit.
v_lam : isval (lam T ([x] E x)).
The simply typed λ-calculus
•The step relation
(λ x:t.e) v  [x←v]e
e 1  e 1’
e 2  e 2’
e 1 e 2  e 1’ e 2
v e2  v e 2 ’
The simply typed λ-calculus
•The step relation
(λ x:t.e) v  [x←v]e
e 1  e 1’
e 2  e 2’
e 1 e 2  e 1’ e 2
v e2  v e 2 ’
eval : exp -> exp -> type.
s_app1 : eval (E1 @ E2) (E1' @ E2)
<- eval E1 E1'.
s_app2 : eval (E1 @ E2) (E1 @ E2')
<- isval E1
<- eval E2 E2'.
e_app : eval (lam T ([x] E x) @ V) (E V)
<- isval V.
The simply typed λ-calculus
• The Progress Theorem
– If an expression e has type t, then the expression is not stuck (the
expression steps or is a value)
– Proof by induction and case analysis on the last step of the typing
derivation
– If the last step in the typing derivation is t_unit or t_lam, then the
expression is a value
– If the last step in the typing derivation is t_app, then the inductive
case establishes that the two subexpressions aren’t stuck
– If either subexpression steps, then either s_app1 or s_app2 will
allow the expression to step
– If both subexpressions are values, then because the first
subexpression has type t → t’, the canonical forms lemma says
that it must be a function, so the expression can step by e_app
The simply typed λ-calculus
• The Progress Theorem
– If an expression e has type t, then the expression is not stuck (the
expression steps or is a value)
– Proof by induction and case analysis on the last step of the typing
derivation
notstuck : exp -> type.
ns_isval : notstuck E
<- isval E.
ns_steps : notstuck E
<- eval E E'.
progress : typed E T -> notstuck E -> type.
%mode progress +T -E.
The simply typed λ-calculus
• The Progress Theorem
– If the last step in the typing derivation is t_unit or t_lam, then the
expression is a value
& : progress t_unit (ns_isval v_unit).
& : progress (t_lam _) (ns_isval v_lam).
The simply typed λ-calculus
• The Progress Theorem
– If the last step in the typing derivation is t_app, then the inductive
case establishes that the two subexpressions aren’t stuck
– If either subexpression steps, then either s_app1 or s_app2 will
allow the expression to step
– If both subexpressions are values, then because the first
subexpression has type t → t’, the canonical forms lemma says
that it must be a function, so the expression can step by e_app
& : progress (t_app T2 T1) (ns_steps (s_app1 E))
<- progress T1 (ns_steps E).
The simply typed λ-calculus
• The Progress Theorem
– If the last step in the typing derivation is t_app, then the inductive
case establishes that the two subexpressions aren’t stuck
– If either subexpression steps, then either s_app1 or s_app2 will
allow the expression to step
– If both subexpressions are values, then because the first
subexpression has type t → t’, the canonical forms lemma says
that it must be a function, so the expression can step by e_app
& : progress (t_app T2 T1) (ns_steps (s_app1 E))
<- progress T1 (ns_steps E).
The simply typed λ-calculus
• The Progress Theorem
– If either subexpression steps, then either s_app1 or s_app2 will
allow the expression to step
– If both subexpressions are values, then because the first
subexpression has type t → t’, the canonical forms lemma says
that it must be a function, so the expression can step by e_app
lemma : typed E (T' => T) -> notstuck E
-> notstuck E' -> notstuck (E @ E') -> type.
%mode lemma +T +NS1 +NS2 -NS.
& : lemma _ (ns_steps E) _ (ns_steps (s_app1 E)).
& : lemma _ (ns_isval v_lam) (ns_steps E)
(ns_steps (s_app2 E v_lam)).
& : lemma _ (ns_isval v_lam) (ns_isval V)
(ns_steps (e_app V)).
%worlds () (lemma _ _ _ _).
%total {} (lemma _ _ _ _).
The simply typed λ-calculus
• The Progress Theorem
– If the last step in the typing derivation is t_app, then the inductive
case establishes that the two subexpressions aren’t stuck
– If either subexpression steps, then either s_app1 or s_app2 will
allow the expression to step
– If both subexpressions are values, then because the first
subexpression has type t → t’, the canonical forms lemma says
that it must be a function, so the expression can step by e_app
& : progress (t_app T2 T1) NS
<- progress T1 NS1
<- progress T2 NS2
<- lemma T1 NS1 NS2 NS.
%worlds () (progress _ _).
%total T (progress T E).
• MiniML
– Distributed with Twelf in the examples directory
– The simply typed polymorphic lambda calculus w/ type
reconstruction
– Many tutorials discuss this, such as The Twelf User’s Guide
• Chapter 2 of my thesis report
– Develops what is more or less the simply typed lambda calculus
with a mutable store
– Describes the full proof in more detail and in a different way
– http://www.princeton.edu/~rsimmons/thesis.pdf
• A metalogical approach to foundational proof carrying
code (Crary and Sarkar, 2003)
– Presents an introductory approach starting with the sum – I
adapted the sum examples from that code
– Concerned with assembly code more than formalizing
programming languages
Conclusions
• Twelf metatheory is a good system for formalizing
programming languages, and seems well suited
for a variety of other tasks as well
• Twelf is a pretty good system to ‘tinker’ with
formal programming language descriptions
– Chapter 3 of my thesis describes some strategies for
keeping descriptions ‘tinkerable’ as they get larger.
• The tools are hopefully going to keep getting
better
– Delphin is a very interesting upcoming tool to keep an
eye on
```