```Checking Validity of Quantifier-Free
Formulas in Combinations of First-Order
Theories
Clark W. Barrett
Ph.D. Dissertation Defense
Department of Computer Science
Stanford University
August 2001
The Problem: First-Order Logic



First-Order Logic is a mathematical system for making
precise statements.
Statements in first-order logic are made up of the following
pieces:
 Variables
x, y
 Constants
0, John, 
 Functions
f (x ), x + y
 Predicates
p (x ), x > y, x = y
 Boolean connectives , , , 
 Quantifiers
, 
Example: “Every rectangle is a square”
x. (Rectangle (x )  Square(x))
The Problem: First-Order Theories

A first-order theory is a set of first-order statements about a
related set of constants, functions, and predicates.

A theory of arithmetic might include the following
x. ( x + 0 = x )
x,y. (x + y = y + x )
The Problem: Validity

An expression is valid if every possible way of interpreting
it results in a true statement.
x=x
p(x )  p(x )
x = y  f (x ) = f (y )
f (x ) = f (y )  x = y

Valid
Valid
Valid
Invalid
An expression is valid in a theory if every possible way of
interpreting it in that theory results in a true statement.
x  0
Invalidinin
Valid
positive
the theory
real arithmetic
of real arithmetic
The Problem: Validity Checking

Suppose T is a first-order theory and  is a first-order
formula
 We write T =  as an abbreviation for “ is valid in T ”
 A classical result in Computer Science states that in
general, the question of whether T =  is undecidable.
 It is impossible to write a program that can always
figure out whether T = 


However, given appropriate restrictions on T and  , a
program can automatically decide T = 
We consider theories T such that T =  is decidable
when  is quantifier-free.
Motivation

Many interesting and practical problems can be solved by
checking the validity of a formula in some theory.

As evidence of this claim, consider the following widelyused tools tools which include decision procedures for
checking validity
 PVS [Owre et al. ‘92]
 STeP [Manna et al. ‘96, Bjørner ‘99]
 ESC [Detlefs et al. ‘98]
 Mona [Klarlund and Møller ‘98]
 SVC [Barrett et al. ‘96]
The SVC Story



Roots in processor verification
 [Burch and Dill ‘94]
 [Jones et al. ‘95]
Internal use at Stanford
 Symbolic simulation [Su et al. ‘98]
 Software specification checking [Park et al. ‘98]
 Infinite-state model checking [Das and Dill ‘01]
External use since public release in 1998
 Model Checking [Boppana et al. ‘99]
 Theorem prover proof assistance [Heilmann ‘99]
 Integration into programming languages [Day et al. ‘99]
 Many others
The SVC Story

Despite its success, SVC has many limitations
 Gaps in theoretical understanding
 Outgrown its original software architecture
 Unnecessarily slow performance in some cases

This thesis is the result of ongoing efforts to address these
limitations.
 New contributions to underlying theory
 A flexible and efficient implementation
 Techniques for faster and more robust performance
Outline

Validity Checking Overview

The Problem

Motivation

The SVC Story

Top-Level Algorithm

Methods for Combining Theories

Implementation


Contributions and Conclusions
Top-Level Algorithm

Consider the following formula in the theory of arithmetic
x>y

true  y > x  x = y
true



y>x

x=y
false  y > x  x = y
y>x  x=y
Step 1: Choose an atomic formula
Step 2: Consider two cases:
 Replace the atomic formula with true
 Replace the atomic formula is with false
Step 3: Simplify
Top-Level Algorithm

Consider the following formula in the theory of arithmetic
x>y

y>x
true  y > x  x = y
x=y
false  y > x  x = y
y>x  x=y
true
xy  yx  xy

true
x=y
true
false
This formula is unsatisfiable
Validity Checking Overview

A literal is an atomic formula or its negation

The validity checker is built on top of a core decision
procedure for satisfiability in T of a set of literals.

The method for checking satisfiability will vary greatly
depending on the theory in question

The most powerful technique for producing a satisfiability
procedure is by combining other satisfiability procedures
Outline

Validity Checking Overview

Methods for Combining Theories

The Problem

Shostak’s Method

The Nelson-Oppen Method

A Combined Method

Implementation


Contributions and Conclusions
The Problem

Consider the following theories:
 Real linear arithmetic: +,-,0,1,…,
 Arrays: s[i], update(s,i,v)
 Uninterpreted functions and predicates: f (x ), p(x ),…

And the following set of literals in the combined theory:
p (y )  s = update (t, i, 0 )  x - y - z = 0 
z + s[i ] = f (x - y )  p (x - f (f (z ) ) )

Question:
a method
towith
decide
satisfiability
Two
main Given
approaches,
each
and of literals
in each theory, how do we decide the satisfiability of literals
in the
combined
theory?
Shostak
[Shostak
‘84]
 Nelson-Oppen [Nelson and Oppen ‘79]
Shostak’s Method




Has formed an ongoing strand of research
 Originally published in 1984 [Shostak ‘84]
 Several clarifying papers since then
 [Cyrluk et al. ‘96]
 [Ruess and Shankar ‘01]
Used in several automated deduction systems
 PVS, STeP, SVC
Unfortunately, remains difficult to understand
 Details are nonintuitive
 Simple proof of correctness has been especially elusive
Contribution : A new presentation of a key subset of
Shostak’s original algorithm.
Shostak’s Method: Canonizer

There are two main components in a Shostak satisfiability
procedure: the canonizer and the solver.

The canonizer rewrites terms into a unique form


T = a = b  canon (a ) = canon (b )
Example: canonizer for linear arithmetic
 Combines like terms
 canon (x + x ) = 2x
 Imposes an ordering on the variables
 canon (y + x ) = x + y
Shostak’s Method: Solver

A set of equations E is said to be in solved form if the lefthand side of each equation is a variable which appears only
once in E
in solved form
x = y+z
w=z-a
v = 3y + b

not in solved form
x =y+z
w =z+x
2v = 3y + b
 S  means replace each left-hand side variable occurring
in S with its corresponding right-hand side
E (w + x + y + z ) = z - a + y + z + y + z
Shostak’s Method: Solver

The solver transforms an equation into an equisatisfiable
set of equations in solved form
If T = a  b , then solve (a = b ) = { false }
 Otherwise:
 solve (a = b ) = a set of equations E in solved form



T = (a = b  x. E )
 x is a set of fresh variables appearing in E, but not
in a or b.
Example: solver for real linear arithmetic
 solve (x - y - z = 0 ) = { x = y + z }
 solve (x + 1 = x - 1 ) = { false }
The Simplified Algorithm

Given a set of equations  and disequations 
 Step 1: Use the solver to convert  into an equisatisfiable
set of equations E in solved form
 Use a generalization of Gaussian elimination with
back substitution
The Simplified Algorithm

Given a set of equations  and disequations 
 Step 1: Use the solver to convert  into an equisatisfiable
set of equations E in solved form
Choose matrix row
  Select an equation  from 
 Apply E as a substitution to 
 Solve to get E’
 Apply E’ as a substitution to E

-x - 3y + 2z = -1
x - y - 6z = 1
2x + y - 10z = 3
E
 1  3 2

1 1  6

 2 1  10

 1

1

3
The Simplified Algorithm

Given a set of equations  and disequations 
 Step 1: Use the solver to convert  into an equisatisfiable
set of equations E in solved form
 Select an equation  from 
Choose matrix row
  Apply E as a substitution to  Apply previous rows
 Solve to get E’
 Apply E’ as a substitution to E

-x - 3y + 2z = -1
x - y - 6z = 1
2x + y - 10z = 3
E
 1  3 2

1 1  6

 2 1  10

 1

1

3
The Simplified Algorithm

Given a set of equations  and disequations 
 Step 1: Use the solver to convert  into an equisatisfiable
set of equations E in solved form
 Select an equation  from 
Choose matrix row
Apply previous rows
 Apply E as a substitution to 
Make pivot 1
  Solve to get E’
  Apply E’ as a substitution to E Apply to previous rows

x = -3y + 2z +1
x - y - 6z = 1
2x + y - 10z = 3
E
1  3  2

1 1  6

 2 1  10

 1

1

3
The Simplified Algorithm

Given a set of equations  and disequations 
 Step 1: Use the solver to convert  into an equisatisfiable
set of equations E in solved form
Choose matrix row
  Select an equation  from 
Apply previous rows
 Apply E as a substitution to 
Make pivot 1
 Solve to get E’
 Apply E’ as a substitution to E Apply to previous rows
  Add E’ to E

E
x - y - 6z = 1
2x + y - 10z = 3
x = -3y + 2z +1
 1 3 2

1 1  6

 2 1  10

1

1

3
The Simplified Algorithm

Given a set of equations  and disequations 
 Step 1: Use the solver to convert  into an equisatisfiable
set of equations E in solved form
 Select an equation  from 
Choose matrix row
  Apply E as a substitution to  Apply previous rows
Make pivot 1
 Solve to get E’
 Apply E’ as a substitution to E Apply to previous rows

E
-3y +2z +1-y -6z =1 x = -3y + 2z +1
2x + y - 10z = 3
 1 3 2

0 4 4

 2 1  10

1

0

3
The Simplified Algorithm

Given a set of equations  and disequations 
 Step 1: Use the solver to convert  into an equisatisfiable
set of equations E in solved form
 Select an equation  from 
Choose matrix row
Apply previous rows
 Apply E as a substitution to 
Make pivot 1
  Solve to get E’
 Apply E’ as a substitution to E Apply to previous rows

E
y = -z
2x + y - 10z = 3
x = -3y + 2z +1
 1

0

 2

3
2
1
1
1  10
1

0

3
The Simplified Algorithm

Given a set of equations  and disequations 
 Step 1: Use the solver to convert  into an equisatisfiable
set of equations E in solved form
 Select an equation  from 
Choose matrix row
Apply previous rows
 Apply E as a substitution to 
Make pivot 1
 Solve to get E’
  Apply E’ as a substitution to E Apply to previous rows

E
 1
x = -3(-z) +2z +1 
y = -z
0
2x + y - 10z = 3

 2

0
5
1
1
1  10
1

0

3
The Simplified Algorithm

Given a set of equations  and disequations 
 Step 1: Use the solver to convert  into an equisatisfiable
set of equations E in solved form
Choose matrix row
  Select an equation  from 
Apply previous rows
 Apply E as a substitution to 
Make pivot 1
 Solve to get E’
 Apply E’ as a substitution to E Apply to previous rows
  Add E’ to E

2x + y - 10z = 3
E
x = 5z +1
y = -z
 1

0

 2

0
5
1
1
1  10
1

0

3
The Simplified Algorithm

Given a set of equations  and disequations 
 Step 1: Use the solver to convert  into an equisatisfiable
set of equations E in solved form
 Select an equation  from 
Choose matrix row
  Apply E as a substitution to  Apply previous rows
Make pivot 1
 Solve to get E’
 Apply E’ as a substitution to E Apply to previous rows

E
2(5z +1)+(-z )-10z=3 x = 5z +1
y = -z
 1

0

 0

0 5
1
1
0 1
1

0

1
The Simplified Algorithm

Given a set of equations  and disequations 
 Step 1: Use the solver to convert  into an equisatisfiable
set of equations E in solved form
 Select an equation  from 
Choose matrix row
Apply previous rows
 Apply E as a substitution to 
  Solve to get E’
Make pivot 1
 Apply E’ as a substitution to E Apply to previous rows

z = -1
E
x = 5z +1
y = -z
 1

0

 0

0 5
1
1
0
1
1

0

 1
The Simplified Algorithm

Given a set of equations  and disequations 
 Step 1: Use the solver to convert  into an equisatisfiable
set of equations E in solved form
 Select an equation  from 
Choose matrix row
Apply previous rows
 Apply E as a substitution to 
Make pivot 1
 Solve to get E’
  Apply E’ as a substitution to E Apply to previous rows

z = -1
E
x = 5(-1) +1
y = -(-1)
1 0 0

0 1 0

0 0 1

 4

1

 1
The Simplified Algorithm

Given a set of equations  and disequations 
 Step 1: Use the solver to convert  into an equisatisfiable
set of equations E in solved form
 Select an equation  from 
Choose matrix row
Apply previous rows
 Apply E as a substitution to 
Make pivot 1
 Solve to get E’
 Apply E’ as a substitution to E Apply to previous rows
  Add E’ to E

E
x = -4
y =1
z = -1
1 0 0

0 1 0

0 0 1

 4

1

 1
The Simplified Algorithm

Given a set of equations  and disequations 
 Step 1: Use the solver to convert  into an equisatisfiable
set of equations E in solved form
 Step 2: Use this set of equations together with the
canonizer to check if any disequality is violated
  For each a  b  
  Check if canon (E (a ) ) = canon (E (b ) )
E
x = -4
y =1
z = -1

42  42
2y - 10x
6(z - 2x)
2(1)-10(-4)6(-1-2(-4))
The Simplified Algorithm

Given a set of equations  and disequations 
 Step 1: Use the solver to convert  into an equisatisfiable
set of equations E in solved form
 Step 2: Use this set of equations together with the
canonizer to check if any disequality is violated
  For each a  b  
  Check if canon (E (a ) ) = canon (E (b ) )
E
x = 5z +1
y = -z

4z
1 (5z
4z
+ z1-z
1 -+4y
x1-4(-z)
+1)
The Simplified Algorithm


Given a set of equations  and disequations 
 Step 1: Use the solver to convert  into an equisatisfiable
set of equations E in solved form
 Step 2: Use this set of equations together with the
canonizer to check if any disequality is violated
 For each a  b  
 Check if canon (E (a ) ) = canon (E (b ) )
Technical detail:
 If there is more than one disequality, the theory must be
convex
Shostak’s Method: Combining Theories


In what sense is this algorithm a method for combining
theories?
Two Shostak theories T1 and T2 can often be combined to
form a new Shostak theory T = T2  T2
 Compose canonizers: canon = canon1 o canon2
 Often, solvers can also be combined
 Treat terms from other theory as variables
 Repeatedly apply solvers from each theory until
resulting set of equations is in solved form
Shostak’s Method: Contributions

Shostak’s original algorithm is much more complicated
because it includes a decision procedure for the theory of
pure equality with uninterpreted functions

Why is the simplified version a contribution?
 Can be applied directly to produce decision procedures,
even combinations of decision procedures
 Much easier to understand and prove correct
 Provides intuition for understanding the original
algorithm
 Provides the foundation for a generalization of the
original Shostak method based on a variation of
Nelson-Oppen
Nelson-Oppen

Developed for the Stanford Pascal Verifier
 [Nelson and Oppen ‘79]
 [Nelson ‘80, Oppen ‘80]

Tinelli and Harandi discovered a new (simpler) proof and
an important optimization
 [Tinelli and Harandi ‘96]

Used in real systems
 ESC
 EHDM [von Henke et al. ‘88]
Nelson-Oppen


Unlike Shostak, Nelson-Oppen does not impose a specific
strategy on individual theories

Instead of a solver and canonizer,

Each theory provides a complete satisfiability procedure

Technical detail: Each theory must be stably infinite
There are two phases in the version of Nelson-Oppen
presented by Tinelli and Harandi

Purification phase

Check phase
Nelson-Oppen: Purification Phase

Transform a set of literals in a combined theory to an
equisatisfiable set of literals such that each literal is pure :
 A pure literal contains symbols from only a single theory

Consider again the following set of literals in a combined
theory of arithmetic, arrays, and uninterpreted functions
p (y )
s = update (t, i, 0 )
x-y-z=0
z + s[i ] = f (x - y )
p (x - f (f (z ) ) )
j =0
Nelson-Oppen: Purification Phase

Transform a set of literals in a combined theory to an
equisatisfiable set of literals such that each literal is pure :
 A pure literal contains symbols from only a single theory

Consider again the following set of literals in a combined
theory of arithmetic, arrays, and uninterpreted functions
p (y )
j =0
s = update (t, i, j )
k = s[i ]
x-y-z=j
z + s[i ] = f (x - y )
p (x - f (f (z ) ) )
Nelson-Oppen: Purification Phase

Transform a set of literals in a combined theory to an
equisatisfiable set of literals such that each literal is pure :
 A pure literal contains symbols from only a single theory

Consider again the following set of literals in a combined
theory of arithmetic, arrays, and uninterpreted functions
p (y )
j =0
s = update (t, i, j )
k = s[i ]
x-y-z=j
l =x-y
z + k = f (x - y )
m=z +k
p (x - f (f (z ) ) )
Nelson-Oppen: Purification Phase

Transform a set of literals in a combined theory to an
equisatisfiable set of literals such that each literal is pure :
 A pure literal contains symbols from only a single theory

Consider again the following set of literals in a combined
theory of arithmetic, arrays, and uninterpreted functions
p (y )
j =0
s = update (t, i, j )
k = s[i ]
l-z=j
l =x-y
m = f (l )
m=z +k
p (x - f (f (z ) ) )
Nelson-Oppen: Purification Phase

Transform a set of literals in a combined theory to an
equisatisfiable set of literals such that each literal is pure :
 A pure literal contains symbols from only a single theory

Consider again the following set of literals in a combined
theory of arithmetic, arrays, and uninterpreted functions
p (y )
j =0
s = update (t, i, j )
k = s[i ]
l-z=j
l =x-y
m = f (l )
m=z +k
p (v )
n = f (f (z ) ) )
v =x-n
Nelson-Oppen: Purification Phase

Transform a set of literals in a combined theory to an
equisatisfiable set of literals such that each literal is pure :
 A pure literal contains symbols from only a single theory

Consider again the following set of literals in a combined
theory of arithmetic, arrays, and uninterpreted functions
l-z=j
s = update (t, i, j )
p (y )
j =0
k = s[i ]
m = f (l )
l =x-y
p (v )
m=z +k
n = f (f (z ) ) )
v =x-n
Nelson-Oppen: Check Phase Definitions

Shared variables are variables that appear in literals from
more than one theory


Shared: l, z, j, y, m, k, v, n

Unshared: x, s, t, i
An arrangement of a set is a set of equalities that
partitions the set into equivalence classes p (y )
s = update (t, i, j )
l-z=j
 Suppose S = { a , b , c }
m = f (l )
k = s[i ]
j =0
 Some arrangements of S
p (v )
l =x-y
{ a  b , a  c , b  c }
{{a},{b},{c}}
n = f (f (z ) ) )
m=z +k
{ a = b , a  c , b  c }
{{a,b},{c}}
v =x-n
{ a = b , a = c , b = c }
{{a,b,c}}
Nelson-Oppen: Check Phase



Choose an arrangement A of the shared variables
For each theory, check if the set of literals pure in that
theory together with the arrangement A is satisfiable
If an arrangement exists that is compatible with each set of
literals, then the original set of literals is satisfiable in the
combined theory
Arithmetic
l-z=j
j =0
l =x-y
m=z+k
v =x-n
Arrays
s = update (t, i, j )
k = s[i ]
A (l, z, j, y, m, k, v, n )
Uninterpreted
p (y )
m = f (l )
p (v )
n = f (f (z ) ) )
Nelson-Oppen: A Variation

Contribution : A Variation of Nelson-Oppen
The purification phase can be eliminated
 Instead, simply partition the formulas according to the
outer-most symbol

p (y )
s = update (t, i, 0 )
Arithmetic
Arrays
Uninterpreted
x - yx--zy=-0z = 0 s = update (t, i, 0 ) p (y )
z + s[i
p (x - f (f (z ) ) )
z +] =
s[if ](x=-f y
(x)- y )
p (x - f (f (z ) ) )
Nelson-Oppen: A Variation

Contribution : A Variation of Nelson-Oppen
The purification phase can be eliminated
 Instead, simply partition the formulas according to the
outer-most symbol
 Choose an arrangement A of the shared terms which
appear in a term or formula belonging to another theory
 For each theory, check if the set of literals assigned to
that theory together with the arrangement is satisfiable


Terms with foreign symbols are treated as variables
Arithmetic
Arrays
Uninterpreted
x-y-z=0
s = update (t, i, 0 ) p (y )
z + s[i ] = f (x - y )
p (x - f (f (z ) ) )
A (s[i ], x - y, f (x - y ), 0, y, z, f (f (z ) ), x - f (f (z ) ) )
Nelson-Oppen: A Variation

Contribution : A Variation of Nelson-Oppen
The purification phase can be eliminated
 Instead, simply partition the formulas according to the
outer-most symbol
 Choose an arrangement A of the shared terms which
appear in a term or formula belonging to another theory
 For each theory, check if the set of literals assigned to
that theory together with the arrangement is satisfiable



Terms with foreign symbols are treated as variables
Contributions of this variation
 Fewer formulas given to each theory
 Easier to implement
 Easier to combine with Shostak
Combining Shostak and Nelson-Oppen

Theory requirements
 Shostak requires convexity
 Nelson-Oppen requires stable-infiniteness
 Contribution : The following theorem relates the two
Every convex first-order theory
with no trivial models is stably-infinite
The proof is based on first-order compactness
 Note: if a convex theory does admit trivial models, it can
usually be modified to include the non-triviality axiom:
x,y. x  y

Combining Shostak and Nelson-Oppen

Contribution : An algorithm for combining the two methods

Equalities are processed according to the Shostak algorithm
to get a set of equalities E in solved form

All literals are partitioned as in the Nelson-Oppen variation

The key idea is to consider the partial arrangement induced
on the shared terms S by canon and E :


A= : { a = b  a,b  S  canon (E(a )) = canon (E(b )) }
An arrangement A is chosen as in the Nelson-Oppen
variation, but this arrangement must include A=


This arrangement is automatically consistent with E
The non-Shostak theories are checked for consistency
with the arrangement as before
Outline

Validity Checking Overview

Methods for Combining Theories

Implementation


Contributions and Conclusions
Implementation: Approach




Based on Nelson-Oppen and Shostak combination
Online algorithm
Optimizations
 A Union-Find data structure and an Update List are used
to efficiently keep track of both E and A simultaneously
 Each new formula is simplified
 Enables rewrites that can reduce the number of shared
terms
Flexible theory interface
 Accommodates Nelson-Oppen theories, Shostak theories,
and more
Implementation: Interface

Recall the top-level algorithm
x>y

true  y > x  x = y




y>x

x=y
false  y > x  x = y
true
y>x  x=y
Choose an atomic formula 
Consider two cases:
Repeat until formula is true or set of choices is unsatisfiable
Interface from top-level : AddFact, Simplify, Satisfiable
Top-level code
Satisfiable
Setup
Term
Assert
Formula
Assert
Assert
Equalities
Setup
CheckSat
Assert
Solve Update
Theory-specific code
Simplify
Rewrite
Rewrite
p(y), s = update(t, i, 0), x -y -z = 0, z + s[i] = f (x - y), p(x -f (f (z)))
Top-level code
p(y)
Satisfiable
p(y)
p(y)
Assert
Simplify
p(y)
p(y)
Setup
Term
y
Assert
Formula
Assert
Equalities
Rewrite
p(y)
Uninterpreted
y
p(y)
Arrays
Arithmetic (Shostak)
Update List
E
p(y), s = update(t, i, 0), x -y -z = 0, z + s[i] = f (x - y), p(x -f (f (z)))
Top-level code
s = update(t, i, 0)
Satisfiable
Assert
Simplify
s = update(t, i, 0)
Setup
Term
0
Assert
Formula
s = ...
Assert
Equalities
Rewrite
s = update(t, i, 0)
Uninterpreted
y
p(y)
Arrays
Arithmetic (Shostak)
Update List
E
0
s = update(t, i, 0) s = update(t, i, 0)
p(y), s = update(t, i, 0), x -y -z = 0, z + s[i] = f (x - y), p(x -f (f (z)))
Top-level code
x -y -z = 0
Satisfiable
Assert
x -y
= z0
x =-z
y+
Setup
Term
y+z
Assert
Formula
x = ...
Simplify
x=y+z
Assert
Equalities
Rewrite
x=y+z
Uninterpreted
y
p(y)
Arrays
Arithmetic (Shostak)
Update List
E
0
s = update(t, i, 0) s = update(t, i, 0) y + z
x=y+z
p(y), s = update(t, i, 0), x -y -z = 0, z + s[i] = f (x - y), p(x -f (f (z)))
Top-level code
z + s[i] = f (x - y)
z+s[i]= ...
Satisfiable
Assert
Simplify
z = f (z)
z = f (z)
f (z)
Setup
Term
Assert z=f (z)
Assert
Formula
Equalities
z = f (z)
Uninterpreted
y
p(y)
z
f (z )
z = f (z)
Arrays
y
s[i]
z x0
Rewrite
xz
- s[i]
y
0
Arithmetic (Shostak)
Update List
E
0
s = update(t, i, 0) s = update(t, i, 0) y + z
x=y+z
z = f (z)
p(y), s = update(t, i, 0), x -y -z = 0, z + s[i] = f (x - y), p(x -f (f (z)))
Top-level code
p(x -f (f (z)))
p(x -…)
Satisfiable
Assert
Simplify
p (y )
-f
(z)
z (z))
fy(z)xf (f
p (y )
Setup
Term
Assert
Formula
p (y )
Uninterpreted
y
p(y)
z
f (z )
z = f (z)
p (y )
Arrays
Assert
Equalities
Rewrite
z(z))
f (f
fx
(z)
-f
(z)
f (z)
y
Arithmetic (Shostak)
Update List
E
0
s = update(t, i, 0) s = update(t, i, 0) y + z = y + f (z)
x = y + f (z)
z = f (z)
Implementation: Contributions


Better implementation of Nelson-Oppen
 Online algorithm
 Each theory only needs to consider a subset of the
shared terms
 Simplify phase
 Can reduce number of shared terms
 Equality reasoning is only done once
 Simple algorithm with detailed proof
 Flexible theory interface
Combined with Shostak
 Generalizes original Shostak algorithm
 Efficient: same data structure for E and A
Outline

Validity Checking Overview

Methods for Combining Theories

Implementation



The Problem

Combining with SAT

Results
Contributions and Conclusions
The Problem

Recall the top-level algorithm
x>y

true  y > x  x = y



y>x

x=y
false  y > x  x = y
true
y>x  x=y
Choose an atomic formula 
Consider two cases:
Repeat until formula is true or set of choices is unsatisfiable
The Problem

The choice of which atomic formula to try next can make a
dramatic difference in performance

SVC includes clever heuristics that improve performance
significantly

We are convinced that better performance is possible
 Equivalent formulas can vary significantly in
performance
 Research in a related area, Boolean satisfiability (SAT),

Strategy : Find a way to apply SAT techniques to first-order
validity checking
Combining with SAT: Approach

Generate SAT problem from validity-checking problem
 Negate the formula whose validity is in question
 Extract Boolean structure from resulting formula
 Convert to CNF [Larabee ‘92]
 Run SAT on converted formula

If SAT reports unsatisfiabile, the formula is valid

The inverse is not true
 A satisfying assignment must be checked for first-order
consistency
Combining with SAT: Initial Results

Implementation
 GRASP SAT engine [Silva ‘96]
 SVC2

Initial results were disappointing
 Examples of interest could not be proved by just
considering Boolean structure
 SAT techniques do not compensate for the loss of
information resulting from translation to SAT

Idea :

Combining with SAT: Conflict Clauses

A conflict clause captures a minimal set of decisions that
lead to a conflict and keeps SAT from ever making the same
set of choices
f (x ) = f (y )  y > x  x  y
true  y > x  x  y
false  y > x  x  y
y>x  xy
true
Unsatisfiable
f (x )  f (y )
yx
x=y
true
xy
true
false
Combining with SAT: Conflict Clauses


How do we get a conflict clause from the first-order
satisfiability algorithm
 Using all decisions too slow
 Black-box minimization methods too slow
Solution : Use proof-production!
Aaron Stump has extended several SVC decision
procedures to produce a proof for every result deduced
 By looking at what assumptions are used in a proof of
inconsistency, a conflict clause can be obtained

Results
Test Case
SVC (no heuristics)
Decisions
SVC (current heuristics)
Time (s)
Decisions
Time(s)
SVC2 with SAT
Decisions
Time(s)
fb_var_12_11
17484
6.8
14386
6.0
257
0.8
fb_var_5_11
73484
29.0
60236
25.3
279
0.8
fb_var_6_12
25156
8.0
19533
5.9
79
0.1
93637
55.4
902
1.9
894
5.8
344893
292.9
35491
18.5
629
4.1
pp-dmem2
361854
293.6
47989
26.3
775
6.0
3547
2
3484
1.9
174
0.5
260
0.3
384
0.4
1244
10.0
dlx-dmem
2809
1.8
655
0.8
2149
30.1
dlx-regfile
989
0.9
936
1.1
40999
1132
pp-invariant
dlx-pc
Results: Preliminary Conclusions

Naïve approach does not work well

Adding conflict clauses results in dramatic speed-ups on
several examples

Most helpful on formulas with more Boolean structure

Still more work to be done
 Find out source of performance problems
 Compare to related work
 [Goel et al. ‘98]
 [Bryant et al. ‘99]
Outline

Validity Checking Overview

Methods for Combining Theories

Implementation


Contributions and Conclusions
Thesis Contributions

A new presentation of the core of Shostak’s algorithm
 Easier to understand and prove correct
 Can be applied directly to produce decision procedures
 Forms the foundation of a generalization

A new variation of Nelson-Oppen
 Eliminates purification phase
 Fewer formulas given to each theory
 Easier to implement
 Easier to combine with Shostak

A new algorithm combining Shostak and Nelson-Oppen
 Theoretical result relating convex and stable-infinite
 Generalization of Shostak’s original method
Thesis Contributions

A detailed and provably correct implementation
 Online
 Optimized to eliminate redundant equality reasoning
 Optimized to reduce number of shared terms
 Flexible theory API

Faster search by combining with SAT
 Methodology and implementation for extracting CNF
 Better performance via conflict clauses
 Conflict clauses from proofs (with Aaron Stump)
 Dramatic improvements on several examples
Future Work

Relaxing restrictions on theories and formulas
 Non-disjoint signatures
 Non-stably-infinite theories
 Formulas with quantifiers

Individual Theories
 Efficient implementation for Presburger arithmetic
 Better techniques for accommodating third-party
decision procedures

SAT
 Understand cases where combination with SAT fails
Acknowledgements






Orals Committee: John Gill, Zohar Manna, John Mitchell,
Natarajan Shankar
Stanford Associates: Aaron Stump, Jeremy Levitt, Satyaki
Das, Jeffrey Xsu, Robert Jones, Vijay Ganesh, Kanna
Shimizu, Husam Abu-Haimed, Jens Skakkebæk, David
Park, Shankar Govindaraju, Madan Musuvathi, Chris
Wilson
Others: Cesare Tinelli
SVC Users
Personal: Friends and family
Validity Checking Overview

Top-level Algorithm
CheckValid(h,c)
IF c = true THEN RETURN TRUE;
CheckValid(h,c)
IF !Satisfiable(h) THEN RETURN FALSE;
IF
IF c
c =
= true
falseTHEN
THENRETURN
RETURNTRUE;
FALSE;
IF
!Satisfiable(h)
THEN
RETURN
subgoals := ApplyTactic(h,c); FALSE;
IF
c = false
RETURN FALSE;
FOREACH
(h,c)THEN
in subgoals
DO
subgoals
:=
ApplyTactic(h,c);
IF !CheckValid(h,c) THEN RETURN FALSE;
FOREACH
(h,c) in subgoals DO
RETURN TRUE;
IF !CheckValid(h,c) THEN RETURN FALSE;
RETURN TRUE;
ApplyTactic(h,c)

Let e be an atomic formula appearing in c;
If CheckValid(
T,  ) = TRUE , then T = 
c1 := Simplify(h1,c);
c2 := Simplify(h2,c);
RETURN {(h1,c1),(h2,c2)};
Shostak’s Method: Convexity


A set of literals S is convex in a theory T if T  S does not
entail any disjunction of equalities without entailing one of
the equalities itself
A theory T is convex if every set of literals in the language
of T is convex in T
Shostak’s Method: Requirements on T

Shostak Theory T
 Signature of T contains no predicate symbols
 T is convex

Canonizer  such that a,b. T = a =b iff a  = b 

Solver  such that if T = a b , then a =b   { false }
Otherwise:
 a =b  = a set of equations E in solved form

T = a =b  x. E, where x is the set of variables
appearing in E, but not in a or b.
 The variables in x are guaranteed to be fresh.

The Simplified Algorithm


Given a set of equations  and disequations 
 Step 1: Use the solver to convert  into an equisatisfiable
set of equations E in solved form
 Step 2: Use this set of equations together with the
canonizer to check if any disequality is violated
 Suppose a  b  
 canon (E (a ) ) = canon (E (b ) )

T = E (a ) = E (b )

T  E = a = b

T  E  { a  b } is unsatisfiable
Technical detail:
The method is complete only for convex theories
Shostak’s Method: The Algorithm
Shostak,,,
 := ;
WHILE    DO BEGIN
Remove some equality a = b from ;
Let a’:=  a and b’:=  b;
Let ’:= a’= b’;
IF ’ = false THEN RETURN FALSE;
Let  := ’  U ’;
END
IF a = b for some a  b in 
THEN RETURN FALSE;
ELSE RETURN TRUE;
Shostak(,,,) = TRUE iff
   is satisfiable in T
Nelson-Oppen: Definitions


Theories must be stably-infinite
 A theory T is stably-infinite if every quantifier-free
formula is satisfiable in T iff it is satisfiable in an infinite
model of T
Terminology for combinations of theories
 Theories T1, T2, … Tn with signatures 1, 2, … n
 As with Shostak, signatures must be disjoint
 Members of i are called i-symbols
 An expression containing only i-symbols is called pure
 An i-term is a constant i-symbol, an application of a
functional i-symbol, or an i-variable
 Each variable is associated arbitrarily with a theory
Nelson-Oppen: Definitions


Terminology for combinations of theories (continued)
 An i-predicate is the application of a predicate i-symbol
 An atomic i-formula is an i-predicate or an equation
whose left-hand side is an i-term
 An i-literal is an atomic i-formula or its negation
 An occurrence of a term is i-alien if it is a j-term (i  j)
and all its super-terms are i-terms
If S is a set of terms, then an arrangement of S is a set of
equations and disequations induced by a partition of S
S = { a , b , c }
 Partition P = { { a , b } , { c } }
 Arrangement : { a = b , a  c , b  c }
Nelson-Oppen: Purification Phase
NO-Purify()
WHILE  !=  DO BEGIN
Let  be some i-literal in ;
IF  is pure THEN
Remove  from ;
i := i U {};
ELSE
Let t be an i-alien j-term in ;
Replace every occurrence of t in 
with a new j-variable z;
 :=  U { j = t };
ENDIF
END
RETURN 1^…^n;

 is satisfiable in T iff 1 ^ 2 ^ … n is satisfiable in T
Nelson-Oppen: Check Phase
NO-Check(1,...n,Sat1,…,Satn)
Let S be the set of variables which appear
in more than one i;
Let A be an arrangement of S;
sat := TRUE;
FOREACH i DO BEGIN
sat := sat ^ Sati(i^A);
END
RETURN sat;


The second step is non-deterministic
1 ^ 2 ^ … n is satisfiable in T iff
it is possible for NO-Check to return TRUE
If the theories are convex, the algorithm can be
determinized inexpensively
Nelson-Oppen: A Variation
NO-Check(,Sat1,…,Satn)
Let S be the set of terms which are i-alien in
either an i-literal or an i-term in ;
Let A be an arrangement of S;
sat := TRUE;
FOREACH set of i-literals i in  DO BEGIN
sat := sat ^ Sati(i^A);
END
RETURN sat;



The purification phase can be eliminated
S is a set of terms rather than a set of variables
In calls to Sati , i-alien terms are treated as variables
Combining Shostak and Nelson-Oppen
NO-Shostak(,,,SatNO)
Let S be the set of shared terms;
Let  be the 1-equalities,  the 1-disequalities,
and NO the 2-literals in ;
 := ;
LOOP BEGIN
IF !SatNO(NO^A=) THEN RETURN FALSE;
ELSE IF !SatNO(NO^A) THEN Choose a,b from S
such that T2NOA |= a=b, but a=b  A=
ELSE IF  =  THEN BREAK;
ELSE Remove some equality a = b from ;
Let a’:= (a) and b’:= (b);
Let ’:= (a’= b’);
IF ’ = {false} THEN RETURN FALSE;
Let  := ’() U ’;
END
IF   A THEN RETURN TRUE;
ELSE RETURN FALSE;
Combining Shostak and Nelson-Oppen
NO-Shostak(,,)
 := ;S := ;
LOOP BEGIN
IF t1=f(x1,…,xn), t2=f(y1,…,yn) with t1,t2 in S
and norm(xi)=norm(yi) but norm(t1) != norm(t2)
THEN a := t1, b := t2;
ELSE IF  =  THEN RETURN TRUE;
ELSE Remove some equality a = b from ;
Let a’:= can(a) and b’:= can(b);
Add each sub-term of a’,b’ to S;
Let ’:= (a’= b’);
IF ’ = {false} THEN RETURN FALSE;
Let  := ’() U ’;
END
RETURN TRUE;
Individual Theories

SVC contains decision procedures for a number of
individual theories
 Pure equality with uninterpreted functions
 Real linear arithmetic
 Arrays
 Bit-vectors
 Records

In our efforts to revisit and improve these decision
procedures, a number of interesting issues were uncovered
 Finite domains
 Strategies for arithmetic
Finite Domains

Theoretical technicalitiy
 Cannot directly combine a theory with only finite models
 Not stably-infinite
 Union of theories likely to actually be inconsistent
 Solution: Form an extended theory whose relativized
reduct with respect to a new predicate P is the theory
with a finite domain.

Implementation strategy for nonconvexity
 Keep track of the terms for which P holds
 Use graph coloring to determine satisfiability
Arithmetic


Suppose we want to handle linear arithmetic formulas with
mixed variable types: some real and some integer.
One approach is the following:
 Split weak inequalities into the disjunction of an
equation and a strong inequality
 Use Shostak-style solver to eliminate all equations that
can be solved for a real variable
 Use Fourier-Motzkin techniques to eliminate all real
variables from inequalities
 Eliminate disequalities which can be solved for a real
variable
 What’s left can be done with Presburger decision
procedures
Math symbols
()
```