Extensible Semantics for XrML
Vicky Weissman
Joint work with Joe Halpern
The big picture


A policy says that under certain
conditions an action, such as
downloading a file, is permitted or
forbidden.
Digital content providers want


to write policies about how their works
may be accessed, and
to have their policies enforced.
Diverse apps – same need
Because we can’t regulate access to
online content with precision:



Digital libraries can’t put certain content
online; it might violate IP laws.
The Greek Orthodox Archdiocese of
America is wary of defamation.
Cultural traditions aren’t respected.
(Australian Aboriginal communities often
restrict access to a clan or gender.)
XrML to the rescue



XrML is a language for writing policies.
Syntax is XML-based.
Semantics is given in 2 ways.
1. An English interpretation of the syntax.
2. An English description of an algorithm
that says if a set of XrML policies imply a
permission.

Bottom line: write policies in XrML,
enforce using the algorithm.
Industry likes XrML



XrML endorsed by Adobe, HewlettPackard, Microsoft, Xerox,
Barnesandnoble.com, MPEG
International Standards Committee…
Microsoft and others plan to make
XrML-compliant products.
Will tomorrow’s OS, DVD player, …
enforce XrML policies?
Improving XrML

MPEG International Standards Committee





generalized some of the concepts and
extended the language (slightly).
Released their version in March 2003.
We refer to the 2003 version as XrML.
They released another version in 2004,
which we discuss later in this talk.
XrML Shortcomings

No formal semantics.



Policies can be ambiguous.
The interpretation of the syntax doesn’t
quite match the algorithm.
The algorithm’s behavior on some
(realistic) input is unintuitive and
unintended by language designers.

E.g. If Alice is a student and every student
may eat lunch, may Alice? Alg. says no.
Improving XrML (cont)


Fix the algorithm to match developers’ intent.
Translate XrML policies to formulas in modal
first-order logic.



Formal semantics given in a fairly standard way.
Lets us compare XrML with languages in CS
literature, borrow complexity results, extensions,…
Prove our translation matches the algorithm.

Algorithm says policies imply a permission iff
translated policies imply translated permission.
XrML syntax

XrML is an XML-based language.


XrML policies are verbose.
We present a syntax that is


more concise and
easy to map to XrML syntax.
Syntax

Principals


Resources


Digital content (e.g., a movie, an article)
Rights


Agents (e.g., Alice, Bob)
Actions (e.g., play, edit)
Properties

Describe a principal (e.g., adult, trusted)
Syntax (cont.)

Statements



Stmt ::= Pr(p) | Perm(p, r, s) | true
Pr(p) means principal p has property Pr.
Perm(p, r, s) means p is permitted to
exercise right r over resource s.
Syntax (cont.)

Statements



Stmt ::= Pr(p) | Perm(p, r, s) | true
Pr(p) means principal p has property Pr.
Perm(p, r, s) means p is permitted to
exercise right r over resource s.
Syntax (cont.)

Statements



Stmt ::= Pr(p) | Perm(p, r, s) | true
Pr(p) means principal p has property Pr.
Perm(p, r, s) means p is permitted to
exercise right r over resource s.
Syntax (cont.)

Statements



Stmt ::= Pr(p) | Perm(p, r, s) | true
Pr(p) means principal p has property Pr.
Perm(p, r, s) means p is permitted to
exercise right r over resource s.
Stmt ::= Pr(p) | Perm(p, r, s) | true
Syntax (cont.)

grant ::=x1…xn(Stmt
…
Stmt  Stmt)
condition



If condition holds, then conclusion holds.
In our fragment, grants are closed
free variables).
license ::= (grant, principal)

conclusion
(g, p) means p issues/says g.
(no
Examples

Can write:
 `Joe is a professor’ as
true  Prof(Joe) and
 Vicky says `Every professor who
gives a talk may have a cookie’ as
(x (Prof(x)  GivesTalk(x) 
Perm(x, eat, cookie)), Vicky).
Examples

Can write:
 `Joe is a professor’ as
true  Prof(Joe) and
 Vicky says `Every professor who
gives a talk may have a cookie’ as
(x (Prof(x)  GivesTalk(x) 
Perm(x, eat, cookie)), Vicky).
Examples

Can write:
 `Joe is a professor’ as
true  Prof(Joe) and
 Vicky says `Every professor who
gives a talk may have a cookie’ as
(x (Prof(x)  GivesTalk(x) 
Perm(x, eat, cookie)), Vicky).
Principals – in some detail

Set of principals is the set of primitive
principals (e.g., Alice, Bob) closed under
union. E.g., Alice  Bob is a principal.


Often written as {Alice, Bob}.
According to the XrML doc, {p1,.., pn}
represents p1, …, pn “acting together as
one holistic identified entity”.

But what does this mean?
Groups/members relationship


Suppose that Alice has property PrA and
the group {Alice, …} has property Prg.
What should we infer?
 Option 1: nothing.
 Option 2: {Alice, …} has property PrA.
 Option 3: Alice has property Prg.
Groups/members relationship



Suppose that Alice has property PrA and
group {Alice, …} has property Prg.
What should we infer?
 Option 1: nothing.
 Option 2: {Alice, …} has property PrA.
 Option 3: Alice has property Prg.
XrML chooses each of these options
(at different points in the specification).
Groups/members relationship



Suppose that Alice has property PrA and
group {Alice, …} has property Prg.
What should we infer?
 Option 1: nothing.
 Option 2: {Alice, …} has property PrA.
 Option 3: Alice has property Prg.
XrML chooses each of these options
(at different points in the specification).
No formal semantics  language is inconsistent!
Our fix

Since XrML is inconsistent…


We do not assume that a group has the
properties of its members or vice-versa.
But can easily write policies to force either
relationship (or both).
The syntax given here is a fragment of
XrML. (See paper for details.)
XrML Algorithm

Query(s,L,G) algorithm





s is a closed statement.
L is a set of licenses.
G is a set of grants that implicitly hold.
Returns true if s “follows” from L and G.
Query calls Auth and Holds.
Auth Algorithm

Recall





s is a closed statement.
L is a set of licenses.
G is a set of grants that implicitly hold.
A condition is a conjunction of statements.
Auth(s, L, G) returns a set D of closed
conditions; s “follows” from L and G if a
condition in D “holds”.
Holds Algorithm

Holds(d,L) algorithm



d is a closed condition.
L is a set of licenses.
Returns true if d “follows” from L.
Query(s, L, G) overview
Query(s, L, G)
Set D to the output of Auth(s, L, G)
Return dD Holds(d, L)



s is a closed statement, L is a set of licenses,
and G is a set of grants that hold implicitly.
s “follows” from L and G if a condition in the
output of Auth(s, L, G) “holds”.
Holds(d, L) returns true if d “follows” from L.
Problem



Let g = true  Student(Alice),
g’ = x (Student(x)  Perm(x, eat, lunch))
May Alice eat lunch?
Query(Perm(Alice, eat, lunch), , {g, g’})
Problem



Let g = true  Student(Alice),
g’ = x (Student(x)  Perm(x, eat, lunch))
May Alice eat lunch?
Query(Perm(Alice, eat, lunch), , {g, g’})


Query calls Auth(Perm(Alice, eat, lunch), , {g, g’}).
Auth returns {Student(Alice)}.
Problem



Let g = true  Student(Alice),
g’ = x (Student(x)  Perm(x, eat, lunch))
May Alice eat lunch?
Query(Perm(Alice, eat, lunch), , {g, g’})



Query calls Auth(Perm(Alice, eat, lunch), , {g, g’}).
Auth returns {Student(Alice)}.
Query calls Holds(Student(Alice), ).
lost g!
Problem



Let g = true  Student(Alice),
g’ = x (Student(x)  Perm(x, eat, lunch))
May Alice eat lunch?
Query(Perm(Alice, eat, lunch), , {g, g’})




Query calls Auth(Perm(Alice, eat, lunch), , {g, g’}).
Auth returns {Student(Alice)}.
Query calls Holds(Student(Alice), ).
lost g!
Holds returns false; so, Query returns false.
The fix


To correct the problem, pass G to Holds
and modify Holds to use the new info.
Bug is easy to find and easy to fix, but
still made it into the released March
2003 version of the spec.
Another bug

Query(s, , {x (Perm(p, issue, x)  s)})
 Query calls Auth on same input.
 Auth returns {Perm(p, issue, g) | g is a grant}.


Recall: Auth output is a set of closed conditions.
Query calls Holds on each returned condition.
Another bug

Query(s, , {x (Perm(p, issue, x)  s)})
 Query calls Auth on same input.
 Auth returns {Perm(p, issue, g) | g is a grant}.




Recall: Auth output is a set of closed conditions.
Query calls Holds on each returned condition.
The set of grants is infinite.
 g0 = true  Student(Alice)
 gi = true  Perm(Bob, issue, gi-1), i = 1, …
D is an infinite set; so, Query doesn’t terminate.
Our fix

Restrict the grants in the language.



If a grant g has a condition d, d mentions
a resource variable x, and x is free in d,
then x is free in g’s conclusion.
Easy to prove that if the restriction is met,
then Auth always returns a finite set.
Can make an empirical argument for why
this restriction is okay.
But that’s not all…
In this small fragment of XrML, there
are 2 other bugs. See paper for
details.
The translation


The translation is fairly straightforward.
Two points worth noting:


Query assumes that a grant g holds, if it’s
issued by some trusted principal (i.e., a
principal whose permitted to issue g).
Holds(d, L, G) returns true iff d logically
follows from L and G. So, a condition d
holds iff d logically follows from L and G.

The translation depends on L and G.
Correctness

Thm: the fixed Query(s, L, G) returns
true iff lLl L,G  gG gL,G  sL,G is true
in every model that satisfies the union
properties (p1p2 = p2p1, …).

tL,G is the translation of t wrt L and G.
Complexity



The XrML alg. runs in exponential time.
The XrML document says that the
language implementers are responsible
for optimizations.
But using the translation, we can prove
that…
Complexity

Determining if a set of XrML grants
imply a statement is NP-hard.


This is because the language supports sets
of primitive principals.
If we remove  from the language…


XrML translates (essentially) to Datalog,
which is a well-known tractable language.
Given the translation, finding a tractable,
fairly expressive fragment is easy.
Accomplishments


We have proposed the first formal
semantics for XrML and, in the process,
found significant problems with the
spec.
But the importance of this work hinges
on whether the results actually lead to a
better language.
Impact

When we found bugs, we told Xin Wang
and Thomas DeMartini from the MPEG
standards committee.
Impact



When we found bugs, we told Xin Wang
and Thomas DeMartini from the MPEG
standards committee.
In March 2004, the committee released a
new version, which is an ISO standard.
They addressed all of our concerns!
Key Change


The algorithm is replaced by a formal
description of when a permission
follows from L and G.
A permission p follows from L and G iff
there’s a tree of finite depth such that



leaves are statements that implicitly hold,
non-leaves are implied by a single grant
and their children, and
the root is p.
Example


Alice is a good student and all good
students may play. May Alice play?
Yes.
Alice may play
Alice is good
follows from children and grant
`all good students may play’
Alice is a student
known facts
Observations

The standard has formal semantics!


That’s what the formal description is.
Also, some corrections are no longer
necessary. We no longer need to


restrict the class of grants so that the
algorithm terminates or
modify the algorithm so that complete
information is passed from one routine to
the next.
But, wait a minute


At the end of the day, we want to
implement XrML.
How do we do this without an algorithm?


Tweak our translation to match the
standard, then use Datalog techniques to
solve the validity problem (assuming no
union op).
First-order semantics still useful, even
though standard has its own.
Another change

The standard assumes that a group has
the properties of its members.


Recall: 2003 version is inconsistent.
Why build this relationship into the
language?
Hidden assumption


The XrML designers assume that
(essentially) all facts about the world
come from individuals presenting
certificates.
Examples:


Alice is a student if she presents her
student id.
A principal is the group {Alice, Bob} if the
principal presents both Alice and Bob’s IDs.
A consequence

In certificate passing systems it is rare that
an individual loses a right by presenting a
certificate.



{Alice, Bob} gets some rights by presenting
the Alice ID and does not lose these rights by
presenting Bob’s.
The assumption guarantees this.
Assuming a certificate passing system
leads to other design decisions….
Negation is unnecessary

Most policies for certificate passing
systems are negation-free.

Certificates usually give the properties that
their holders have.


E.g. students are given student ids, instead of
everyone else being given `not a student’ ids.
It’s silly to have a policy that forbids an action
if certain credentials are presented.

No one will present them when wanting to do the
forbidden task.
Problem


Certificate passing systems are not
appropriate for all applications.
Many policies need negation.


E.g. `if a child is not grounded, then she
may play outside’ and `smoking is not
permitted on the airplane.’
These policies cannot be captured
explicitly in the standard.
A partial solution


Assume that an action is forbidden unless
it’s explicitly permitted.
Problem: Can’t distinguish forbidden
actions from unregulated ones.


As a result, policy sets can’t be merged.
E.g. A university’s policies talk about who’s
permitted to get tenure. The policies for
Alice’s new outreach program don’t.
 Alice’s policies contradict the university’s.
We may want functions too.


Functions often occur naturally when translating
policies from English to first-order logic.
E.g. `Classified information may be copied from
one secure server to another’:
x1, x2, x3, x4
(Classified(x1)  Secure(x2)  Secure(x3) 
Permitted(x4, copySrcDst(x2, x3), x1))
We may want functions too.


Functions often occur naturally when translating
policies from English to first-order logic.
E.g. `Classified information may be copied from
one secure server to another’:
x1, x2, x3, x4
(Classified(x1)  Secure(x2)  Secure(x3) 
Permitted(x4, copySrcDst(x2, x3), x1))
Extending the standard




We want the standard to support
negations and functions.
Extending the syntax is easy.
It’s not clear how to modify the
semantics given in the standard.
But, extending the translation is easy.

Policies simply translate to a fragment of
modal first-order logic that includes
functions and negation.
Complexity


The validity problem for first-order
formulas with functions and negation is
undecidable.
Idea: Restrict functions and negation so
that policies translate to a tractable
fragment.
Datalog

There are tractable fragments of Datalog
that support some functions and
negation, but


the use of functions is severely restricted and
negation cannot appear in the conclusion of
Datalog rules, so no support for policies that
forbid actions.
Lithium

The Lithium language
[Halpern/Weissman] is a tractable
fragment of first-order logic that


supports unlimited use of functions and
allows a fair amount of negation in both
the premise and conclusions of statements.
Expressivity

Experimental results suggest that
Lithium is sufficiently expressive for
many applications.

We collected a number of policies from
libraries and government databases, and
have written them in Lithium.
Summary



We proposed the first formal semantics
for XrML.
This helped the language developers to
find several bugs, which they fixed.
They have since given XrML formal
semantics; however, our approach
seems better suited to extending and
analyzing the language using known
techniques.
Summary


Industry wants to implement XrML but …
XrML has no formal semantics



We give formal semantics to a representative
fragment of XrML.
Even a small fragment is intractable.


and needs them!
We can leverage results in the CS literature to find
fairly expressive, tractable options.
Next step: Add negation to XrML.

This is critical for merging policies.
 Two minor bugs in paper (don’t effect results); corrected version online.
talk ends on preceding slide
Sample XrML policy


Consider the policy `anyone may play
the movie `Big Hit’ for $2 (per use)’.
We could write this policy in XrML as…
<license>
<grant> <forAll varName="anyone" />
<!-- This is saying that anyone can use this grant. -->
<principal varRef="anyone" />
<!-- The right to play the movie is granted -->
<cx:play />
<!-- This is the movie that we are giving access to. -->
<cx:digitalWork>
<cx:title>Big Hit </cx:title>
</cx:digitalWork>
<!-- $2.00 each -->
<sx:fee>
<sx:paymentPerUse>
<sx:rate currency="USD">2.00</ sx:rate>
</sx:paymentPerUse
</sx:fee>
</grant>
</license>
The translation



We now translate XrML licenses and
grants to “equivalent” formulas in modal
first-order logic.
The translation relies on which licenses
have been issued and which grants
implicitly hold.
Let sL,G be the translation of any string s
wrt the input parameters L and G.
Translation (cont.)

Except for licenses and grants, translation
is easy.


We assume a constant cg for each grant g
Perm(p, issue, g)L,G = Perm(p, issue, cg)

(d1  d2)L,G = d1L,G  d2L,G,
Pr(p)L,G = Pr(p), and

trueL,G = true

Translating licenses


Recall: (g, p) means p said g.
According to Query,



Option 1: (g, p)L,G = Said(p, cg),


if p may issue g, then (g,p) means that g holds
otherwise, (g, p) is meaningless
restrict to models satisfying the axiom scheme
Said(p, cg)  Perm(p, issue, cg)  gL,G
Option 2: (g, p)L,G = Perm(p, issue, cg)  gL,G
Translating grants

x1…xn(d  e)L,G =
x1…xn (Holds(d, L, G)  eL,G)


Holds(d, L, G) returns true iff d is a logical
consequence of L and G.
Define a modal operator Val, where Val() is
true in a model m iff  is true in all models.
Holds(d, L, G)=Val(lL l L,G  gG gL,G  dL,G )
Descargar

A Formal Foundation for XrML