272: Software Engineering
Fall 2012
Instructor: Tevfik Bultan
Lecture 9: Test Generation from Models
Test Data Generation from Models
• Object-Role Modeling (ORM): A modeling language for specifying
relational data models
– Kind of like UML class diagrams or entity-relationship diagrams
– Has formal semantics
• It allows the developer specify a data model with various constraints
• Used in industry
• Problem:
– Satisfiability: Given an ORM diagram, are there any instances that
satisfy the constraints specified in the diagram
• Test generation: Given a satisfiable ORM diagram, generate
instances that satisfy the constraints specified in the diagram
ORM diagrams
• Ovals show entities or values (basic types)
– These are sets of objects or values
• Rectangles show predicates
– These represent the relations among the objects
– They correspond to the tables in a relational database
ORM constraints
• Uniqueness constraint: Occurrence of a value in any column of a
predicate is unique
• Mandatory constraint: Every object of an entity is implicitly assumed
to participate in some of the predicates in which the entity has a role
• Frequency constraint: Specifies how many times a tuple can occur in
a predicate
• Subset, equality constraints: It specifies that the tuples in the roles of
the first predicate have to be among the tuples in the roles of the
• Subtype constraints: Used to model subtyping among entities
• Value and cardinality constraints: Specify a range of values for
primitive type or a cardinality (which can be a number or a range) for
an entity
• Exclusion constraints: specifies that a particulate value can occur
online in one table
• Ring constraints: Constraints on predicates (symmetric, acyclic,
transitive, intransitive, reflexive, irreflexive, asymmetric)
• For the most general case where constraints on the diagrams are
specified using arbitrary queries
– Satisfiability is undecidable
• For a more restricted class of diagrams, satisfiability checking is NPhard
• Brute-force checking of satisfiability (bounded-exhaustive testing)
does not scale
• For a class of diagrams that the authors define called ORM-, the
satisfiability can be checked in polynomial time
NP-Hardness of ORM satisfiability
• NP-hardness proof shows that well-known NP-complete boolean
satisfiability problem, 3SAT, can be reduced to an instance of ORM
model satisfiability
• A 3-CNF formula has the following form:
(c1,1  c1,2  c1,3)  (c2,1 c2,2  c2,3) … where each ci,j represents
expressions of the form qi or qi over a set of boolean variables q1,
q2, …, qn. We call (ci,1  ci,2  ci,3) clause ci.
• Determining satisfiability of a 3-CNF formula is an NP-complete
• A 3-CNF formula can be translated to a ORM diagram, such that, the
3-CNF formula is satisfiable if and only if the generated ORM diagram
is satisfiable
NP-Hardness of ORM satisfiability
3-CNF formula to ORM diagram translation
• Introduce an object type for each qi
• Introduce an object type for each clause ci
• Introduce a value constraint for each of the object types above where
their value is restricted to a single value qi or ci
• For each variable qi let n be the number of times it occurs in the
clauses. If n > 0, create a (n+1)-ary predicate qi-true
• For each expression qi let m be the number of times it occurs in the
clauses. If m > 0, create a (m+1)-ary predicate qi-false
• Connect type qi to predicates qi-true and qi-false
• Impose and exclusion constraint on the roles played by qi in qi-true
and qi-false (meaning that qi is either true or false, but not both)
• Connect each clause ci to the predicate qi-true if it contains qi or qifalse if it contains qi
Brute-force approach
• As it is possible to translate satisfiability of boolean formulas to
satisfiability of ORM diagrams, it is also possible to translate
satisfiaibility of ORM diagrams, satisfiability of boolean formulas
• The authors tried this approach
They translated ORM diagrams to input languages of CoBaSa and
Alloy Analyzer
– These are tools that can check satisfiability queries about
bounded models
– They use SAT solvers as back-end solvers
• Experiments with this approach shows that ORM satisfiability
checking using a SAT solver is not efficient and does not scale to
large diagrams
ORMGiven the theoretical difficulty of the general ORM satisfiability problem
and the practical inefficiency of the brute-force approach
– the authors define a subset of the ORM language for which
satisfiability can be determined efficiently.
ORM- allows only a subset of ORM constraints
• Uniqueness constraints cannot overlap
• Mandatory constraints are allowed on a single role
• Frequency constraints cannot overlap
• Value and cardinality constraints are allowed
• Subtype constraints are allowed
According the authors majority of constraints used in practice are
covered by this subset.
Note that, subset constraints, exclusion constraints and ring constraints
are not allowed.
Testing Satisfiability of ORM• Using a transformation from the ORM- to integer constraints the
authors show that the satisfiability of the ORM- can be checked by
checking satisfiability of the resulting integer constraints
• They also show that the satisfiability of the resulting integer
constraints can be checked in polynomial time
Converting to integer constraints
• For each type A introduce a variable a representing its cardinality. For
each predicate R, introduce a variable r denoting its cardinality. For
each role in R played by A introduce a variable ra representing the
number of unique elements from A in this role
• Produce inequalities: ra ≤ a and ra ≤ r
• For each cardinality constraint for a type A such as min … max
produce inequalities: min ≤ a and a ≤ max
– Do the same for the predicates. The default min value is 1 and the
default max value is M (which could be set to MaxInt)
• For each value constraint let ds denote the domain size for the value
domain, generate the inequality: a ≤ ds
• For each mandatory constraint for a role played by type A in predicate
R, generate the inequality: a ≤ ra
Converting to integer constraints
• For each frequency constraint with frequency with range fmin, … , fmax
on the roles played in predicate R by types A, B, …, K, introduce the
variable rab…k representing the number of unique tuples in these roles
and produce the following inequalities:
r ≤ fmax . rab…k
fmin . rab…k ≤ r
rab…k ≤ ra . rb . … . rk
ra ≤ rab…k , rb ≤ rab…k , … , rk ≤ rab…k
• For each subtype constraint between types A and S produce the
inquality: a ≤ s
Converting to integer constraints
• For each predicate R with roles played by types A, B, … , N express
the implicit uniqueness constraint over all roles by producing the
r ≤ rab…k . rlm…p . … . rn
where right hand side are variables that correspond to ranges of
roles that participate in some frequency constraints, followed by the
variables corresponding to roles that are under no frequency
• For each non-independent type A that plays roles in predicates R, S,
…., V introduce the constraint: a ≤ ra + sa + … + va
which captures the implicit disjunctive mandatory constraint that each
object of a non-independent type needs to appear in some predicate
Properties of this conversion
• The set of constraints generation by this conversion are satisfiable if
and only if the ORM- diagram itself is satisfiable
– Hence this approach is both sound and complete
• This means that we reduced the satisfiability of the ORM- diagrams to
satisfiability of a set of integer constraints
• Satisfiability of integer constraints is a difficult problem itself.
• However, the integer constraints generated with this conversion are of
a specific type for which the satisfiability can be checked efficiently as
the authors show
Checking satisfiability of constraints
• Rewrite all inequalities of the form c . x ≤ y1 . y2 . … . yn as form
x ≤ (y1 . y2 . … . yn) / c
• For each variable maintain its current upper bound. In every step
substitute the current upper bounds for all variables on the rhs and
after arithmetic manipulation obtain upper bound candidates for each
variable and choose the smallest one
• Repeat until either an upper bound crosses a lower bound or all
upper bounds remain unchanged
The algorithm runs in polynomial time because if there are n variables in
the system, each inequality can yield a different upper bound at most
n times.
Test Generation
• Any set of integer values satisfying the generated integer constraints
can be used to construct an instance of the ORM- data model
• These instances can then be used as test cases for testing a
• So the overall approach would be:
– Given a database, specify an ORM- diagram that characterizes a
set of test cases for testing the database
– First check the satisfiability of the ORM- diagram by generating
integer constraints as described above and then check the
satisfiability of the generated integer constraints
– If the integer constraints are satisfiable, find integer values that
satisfy the constraints and convert those integer values the
instances of the ORM- model