Extensible Semantics for XrML Vicky Weissman Joint work with Joe Halpern The big picture A policy says that under certain conditions an action, such as downloading a file, is permitted or forbidden. Digital content providers want to write policies about how their works may be accessed, and to have their policies enforced. Diverse apps – same need Because we can’t regulate access to online content with precision: Digital libraries can’t put certain content online; it might violate IP laws. The Greek Orthodox Archdiocese of America is wary of defamation. Cultural traditions aren’t respected. (Australian Aboriginal communities often restrict access to a clan or gender.) XrML to the rescue XrML is a language for writing policies. Syntax is XML-based. Semantics is given in 2 ways. 1. An English interpretation of the syntax. 2. An English description of an algorithm that says if a set of XrML policies imply a permission. Bottom line: write policies in XrML, enforce using the algorithm. Industry likes XrML XrML endorsed by Adobe, HewlettPackard, Microsoft, Xerox, Barnesandnoble.com, MPEG International Standards Committee… Microsoft and others plan to make XrML-compliant products. Will tomorrow’s OS, DVD player, … enforce XrML policies? Improving XrML MPEG International Standards Committee generalized some of the concepts and extended the language (slightly). Released their version in March 2003. We refer to the 2003 version as XrML. They released another version in 2004, which we discuss later in this talk. XrML Shortcomings No formal semantics. Policies can be ambiguous. The interpretation of the syntax doesn’t quite match the algorithm. The algorithm’s behavior on some (realistic) input is unintuitive and unintended by language designers. E.g. If Alice is a student and every student may eat lunch, may Alice? Alg. says no. Improving XrML (cont) Fix the algorithm to match developers’ intent. Translate XrML policies to formulas in modal first-order logic. Formal semantics given in a fairly standard way. Lets us compare XrML with languages in CS literature, borrow complexity results, extensions,… Prove our translation matches the algorithm. Algorithm says policies imply a permission iff translated policies imply translated permission. XrML syntax XrML is an XML-based language. XrML policies are verbose. We present a syntax that is more concise and easy to map to XrML syntax. Syntax Principals Resources Digital content (e.g., a movie, an article) Rights Agents (e.g., Alice, Bob) Actions (e.g., play, edit) Properties Describe a principal (e.g., adult, trusted) Syntax (cont.) Statements Stmt ::= Pr(p) | Perm(p, r, s) | true Pr(p) means principal p has property Pr. Perm(p, r, s) means p is permitted to exercise right r over resource s. Syntax (cont.) Statements Stmt ::= Pr(p) | Perm(p, r, s) | true Pr(p) means principal p has property Pr. Perm(p, r, s) means p is permitted to exercise right r over resource s. Syntax (cont.) Statements Stmt ::= Pr(p) | Perm(p, r, s) | true Pr(p) means principal p has property Pr. Perm(p, r, s) means p is permitted to exercise right r over resource s. Syntax (cont.) Statements Stmt ::= Pr(p) | Perm(p, r, s) | true Pr(p) means principal p has property Pr. Perm(p, r, s) means p is permitted to exercise right r over resource s. Stmt ::= Pr(p) | Perm(p, r, s) | true Syntax (cont.) grant ::=x1…xn(Stmt … Stmt Stmt) condition If condition holds, then conclusion holds. In our fragment, grants are closed free variables). license ::= (grant, principal) conclusion (g, p) means p issues/says g. (no Examples Can write: `Joe is a professor’ as true Prof(Joe) and Vicky says `Every professor who gives a talk may have a cookie’ as (x (Prof(x) GivesTalk(x) Perm(x, eat, cookie)), Vicky). Examples Can write: `Joe is a professor’ as true Prof(Joe) and Vicky says `Every professor who gives a talk may have a cookie’ as (x (Prof(x) GivesTalk(x) Perm(x, eat, cookie)), Vicky). Examples Can write: `Joe is a professor’ as true Prof(Joe) and Vicky says `Every professor who gives a talk may have a cookie’ as (x (Prof(x) GivesTalk(x) Perm(x, eat, cookie)), Vicky). Principals – in some detail Set of principals is the set of primitive principals (e.g., Alice, Bob) closed under union. E.g., Alice Bob is a principal. Often written as {Alice, Bob}. According to the XrML doc, {p1,.., pn} represents p1, …, pn “acting together as one holistic identified entity”. But what does this mean? Groups/members relationship Suppose that Alice has property PrA and the group {Alice, …} has property Prg. What should we infer? Option 1: nothing. Option 2: {Alice, …} has property PrA. Option 3: Alice has property Prg. Groups/members relationship Suppose that Alice has property PrA and group {Alice, …} has property Prg. What should we infer? Option 1: nothing. Option 2: {Alice, …} has property PrA. Option 3: Alice has property Prg. XrML chooses each of these options (at different points in the specification). Groups/members relationship Suppose that Alice has property PrA and group {Alice, …} has property Prg. What should we infer? Option 1: nothing. Option 2: {Alice, …} has property PrA. Option 3: Alice has property Prg. XrML chooses each of these options (at different points in the specification). No formal semantics language is inconsistent! Our fix Since XrML is inconsistent… We do not assume that a group has the properties of its members or vice-versa. But can easily write policies to force either relationship (or both). The syntax given here is a fragment of XrML. (See paper for details.) XrML Algorithm Query(s,L,G) algorithm s is a closed statement. L is a set of licenses. G is a set of grants that implicitly hold. Returns true if s “follows” from L and G. Query calls Auth and Holds. Auth Algorithm Recall s is a closed statement. L is a set of licenses. G is a set of grants that implicitly hold. A condition is a conjunction of statements. Auth(s, L, G) returns a set D of closed conditions; s “follows” from L and G if a condition in D “holds”. Holds Algorithm Holds(d,L) algorithm d is a closed condition. L is a set of licenses. Returns true if d “follows” from L. Query(s, L, G) overview Query(s, L, G) Set D to the output of Auth(s, L, G) Return dD Holds(d, L) s is a closed statement, L is a set of licenses, and G is a set of grants that hold implicitly. s “follows” from L and G if a condition in the output of Auth(s, L, G) “holds”. Holds(d, L) returns true if d “follows” from L. Problem Let g = true Student(Alice), g’ = x (Student(x) Perm(x, eat, lunch)) May Alice eat lunch? Query(Perm(Alice, eat, lunch), , {g, g’}) Problem Let g = true Student(Alice), g’ = x (Student(x) Perm(x, eat, lunch)) May Alice eat lunch? Query(Perm(Alice, eat, lunch), , {g, g’}) Query calls Auth(Perm(Alice, eat, lunch), , {g, g’}). Auth returns {Student(Alice)}. Problem Let g = true Student(Alice), g’ = x (Student(x) Perm(x, eat, lunch)) May Alice eat lunch? Query(Perm(Alice, eat, lunch), , {g, g’}) Query calls Auth(Perm(Alice, eat, lunch), , {g, g’}). Auth returns {Student(Alice)}. Query calls Holds(Student(Alice), ). lost g! Problem Let g = true Student(Alice), g’ = x (Student(x) Perm(x, eat, lunch)) May Alice eat lunch? Query(Perm(Alice, eat, lunch), , {g, g’}) Query calls Auth(Perm(Alice, eat, lunch), , {g, g’}). Auth returns {Student(Alice)}. Query calls Holds(Student(Alice), ). lost g! Holds returns false; so, Query returns false. The fix To correct the problem, pass G to Holds and modify Holds to use the new info. Bug is easy to find and easy to fix, but still made it into the released March 2003 version of the spec. Another bug Query(s, , {x (Perm(p, issue, x) s)}) Query calls Auth on same input. Auth returns {Perm(p, issue, g) | g is a grant}. Recall: Auth output is a set of closed conditions. Query calls Holds on each returned condition. Another bug Query(s, , {x (Perm(p, issue, x) s)}) Query calls Auth on same input. Auth returns {Perm(p, issue, g) | g is a grant}. Recall: Auth output is a set of closed conditions. Query calls Holds on each returned condition. The set of grants is infinite. g0 = true Student(Alice) gi = true Perm(Bob, issue, gi-1), i = 1, … D is an infinite set; so, Query doesn’t terminate. Our fix Restrict the grants in the language. If a grant g has a condition d, d mentions a resource variable x, and x is free in d, then x is free in g’s conclusion. Easy to prove that if the restriction is met, then Auth always returns a finite set. Can make an empirical argument for why this restriction is okay. But that’s not all… In this small fragment of XrML, there are 2 other bugs. See paper for details. The translation The translation is fairly straightforward. Two points worth noting: Query assumes that a grant g holds, if it’s issued by some trusted principal (i.e., a principal whose permitted to issue g). Holds(d, L, G) returns true iff d logically follows from L and G. So, a condition d holds iff d logically follows from L and G. The translation depends on L and G. Correctness Thm: the fixed Query(s, L, G) returns true iff lLl L,G gG gL,G sL,G is true in every model that satisfies the union properties (p1p2 = p2p1, …). tL,G is the translation of t wrt L and G. Complexity The XrML alg. runs in exponential time. The XrML document says that the language implementers are responsible for optimizations. But using the translation, we can prove that… Complexity Determining if a set of XrML grants imply a statement is NP-hard. This is because the language supports sets of primitive principals. If we remove from the language… XrML translates (essentially) to Datalog, which is a well-known tractable language. Given the translation, finding a tractable, fairly expressive fragment is easy. Accomplishments We have proposed the first formal semantics for XrML and, in the process, found significant problems with the spec. But the importance of this work hinges on whether the results actually lead to a better language. Impact When we found bugs, we told Xin Wang and Thomas DeMartini from the MPEG standards committee. Impact When we found bugs, we told Xin Wang and Thomas DeMartini from the MPEG standards committee. In March 2004, the committee released a new version, which is an ISO standard. They addressed all of our concerns! Key Change The algorithm is replaced by a formal description of when a permission follows from L and G. A permission p follows from L and G iff there’s a tree of finite depth such that leaves are statements that implicitly hold, non-leaves are implied by a single grant and their children, and the root is p. Example Alice is a good student and all good students may play. May Alice play? Yes. Alice may play Alice is good follows from children and grant `all good students may play’ Alice is a student known facts Observations The standard has formal semantics! That’s what the formal description is. Also, some corrections are no longer necessary. We no longer need to restrict the class of grants so that the algorithm terminates or modify the algorithm so that complete information is passed from one routine to the next. But, wait a minute At the end of the day, we want to implement XrML. How do we do this without an algorithm? Tweak our translation to match the standard, then use Datalog techniques to solve the validity problem (assuming no union op). First-order semantics still useful, even though standard has its own. Another change The standard assumes that a group has the properties of its members. Recall: 2003 version is inconsistent. Why build this relationship into the language? Hidden assumption The XrML designers assume that (essentially) all facts about the world come from individuals presenting certificates. Examples: Alice is a student if she presents her student id. A principal is the group {Alice, Bob} if the principal presents both Alice and Bob’s IDs. A consequence In certificate passing systems it is rare that an individual loses a right by presenting a certificate. {Alice, Bob} gets some rights by presenting the Alice ID and does not lose these rights by presenting Bob’s. The assumption guarantees this. Assuming a certificate passing system leads to other design decisions…. Negation is unnecessary Most policies for certificate passing systems are negation-free. Certificates usually give the properties that their holders have. E.g. students are given student ids, instead of everyone else being given `not a student’ ids. It’s silly to have a policy that forbids an action if certain credentials are presented. No one will present them when wanting to do the forbidden task. Problem Certificate passing systems are not appropriate for all applications. Many policies need negation. E.g. `if a child is not grounded, then she may play outside’ and `smoking is not permitted on the airplane.’ These policies cannot be captured explicitly in the standard. A partial solution Assume that an action is forbidden unless it’s explicitly permitted. Problem: Can’t distinguish forbidden actions from unregulated ones. As a result, policy sets can’t be merged. E.g. A university’s policies talk about who’s permitted to get tenure. The policies for Alice’s new outreach program don’t. Alice’s policies contradict the university’s. We may want functions too. Functions often occur naturally when translating policies from English to first-order logic. E.g. `Classified information may be copied from one secure server to another’: x1, x2, x3, x4 (Classified(x1) Secure(x2) Secure(x3) Permitted(x4, copySrcDst(x2, x3), x1)) We may want functions too. Functions often occur naturally when translating policies from English to first-order logic. E.g. `Classified information may be copied from one secure server to another’: x1, x2, x3, x4 (Classified(x1) Secure(x2) Secure(x3) Permitted(x4, copySrcDst(x2, x3), x1)) Extending the standard We want the standard to support negations and functions. Extending the syntax is easy. It’s not clear how to modify the semantics given in the standard. But, extending the translation is easy. Policies simply translate to a fragment of modal first-order logic that includes functions and negation. Complexity The validity problem for first-order formulas with functions and negation is undecidable. Idea: Restrict functions and negation so that policies translate to a tractable fragment. Datalog There are tractable fragments of Datalog that support some functions and negation, but the use of functions is severely restricted and negation cannot appear in the conclusion of Datalog rules, so no support for policies that forbid actions. Lithium The Lithium language [Halpern/Weissman] is a tractable fragment of first-order logic that supports unlimited use of functions and allows a fair amount of negation in both the premise and conclusions of statements. Expressivity Experimental results suggest that Lithium is sufficiently expressive for many applications. We collected a number of policies from libraries and government databases, and have written them in Lithium. Summary We proposed the first formal semantics for XrML. This helped the language developers to find several bugs, which they fixed. They have since given XrML formal semantics; however, our approach seems better suited to extending and analyzing the language using known techniques. Summary Industry wants to implement XrML but … XrML has no formal semantics We give formal semantics to a representative fragment of XrML. Even a small fragment is intractable. and needs them! We can leverage results in the CS literature to find fairly expressive, tractable options. Next step: Add negation to XrML. This is critical for merging policies. Two minor bugs in paper (don’t effect results); corrected version online. talk ends on preceding slide Sample XrML policy Consider the policy `anyone may play the movie `Big Hit’ for $2 (per use)’. We could write this policy in XrML as… <license> <grant> <forAll varName="anyone" /> <!-- This is saying that anyone can use this grant. --> <principal varRef="anyone" /> <!-- The right to play the movie is granted --> <cx:play /> <!-- This is the movie that we are giving access to. --> <cx:digitalWork> <cx:title>Big Hit </cx:title> </cx:digitalWork> <!-- $2.00 each --> <sx:fee> <sx:paymentPerUse> <sx:rate currency="USD">2.00</ sx:rate> </sx:paymentPerUse </sx:fee> </grant> </license> The translation We now translate XrML licenses and grants to “equivalent” formulas in modal first-order logic. The translation relies on which licenses have been issued and which grants implicitly hold. Let sL,G be the translation of any string s wrt the input parameters L and G. Translation (cont.) Except for licenses and grants, translation is easy. We assume a constant cg for each grant g Perm(p, issue, g)L,G = Perm(p, issue, cg) (d1 d2)L,G = d1L,G d2L,G, Pr(p)L,G = Pr(p), and trueL,G = true Translating licenses Recall: (g, p) means p said g. According to Query, Option 1: (g, p)L,G = Said(p, cg), if p may issue g, then (g,p) means that g holds otherwise, (g, p) is meaningless restrict to models satisfying the axiom scheme Said(p, cg) Perm(p, issue, cg) gL,G Option 2: (g, p)L,G = Perm(p, issue, cg) gL,G Translating grants x1…xn(d e)L,G = x1…xn (Holds(d, L, G) eL,G) Holds(d, L, G) returns true iff d is a logical consequence of L and G. Define a modal operator Val, where Val() is true in a model m iff is true in all models. Holds(d, L, G)=Val(lL l L,G gG gL,G dL,G )

Descargar
# A Formal Foundation for XrML