CSA3180: Natural Language Processing
Semantics I – Truth Conditions, FOL, Quantified Sentences, XML and Taxonomies

• Truth Conditions and First Order Logic
• Quantified Sentences
• Translating English into FOL and vice-versa
• XML in an NLP context
• Semantic Web
• Taxonomies

November 2005 CSA3180: Semantics I

Introduction

• Quantification and FOL/English translation slides partly based on the Introduction to Logic lectures given by Angelo Dalli in 2000
• Quotes from the W3C website and NLPRS 2001, Tokyo
• Will introduce the concept of linking semantics to syntactic objects
• Taxonomies and the use of XML in an NLP context

Quantification

Predicate Logic addresses shortcomings of Propositional Logic mainly by introducing predicates.

Atomic or compound propositional statements like “This whiteboard is white” do not allow us to get to more generic/lower-level concepts, like “You can write on all whiteboards”.

Propositional to Predicate

Predicate logic uses the notion of variables. Variables are used as placeholders that indicate relationships between quantifiers and argument positions of predicates.

So apart from statements like father(Max) and mother(Claire) we can have father(X) and mother(Y).

Propositional Logic

Propositional logic is thus similar to algebra using constants only (like 1+(2/3)), while predicate logic uses variables (like x+(y/z)).

Variables

• Variables are named symbolically, e.g. a, b, c. In Prolog they usually start with an uppercase letter.
• Variables can appear in argument lists, e.g. big(i)
• Variables can appear in place of constants, e.g. student(x), noisy(x)
• With the help of variables we can produce wffs, e.g. man(x), mortal(x)

Formulae vs. Sentences

A formula like man(x) is not a sentence because it does not make an identifiable claim.
To make such claims we require quantifiers to bind the variables (in this case ‘x’).

Examples of atomic wffs:
cube(a)
big(a)
green(a)
doctor(x)
expensive(x)

Examples of sentences which we would like to represent in FOL:
All green cubes are green
Some doctors are expensive

Quantifiers

The need for quantifiers has therefore been argued from the lack of expressiveness of Propositional logic, and from the need to turn wffs with variables into sentences in Predicate logic.

Quantifiers tell us about the number or quantity of things that satisfy the conditions within the scope of the quantifier. They are also used to bind variables to values within a universe of discourse.

The universe of discourse is the domain of the interpretation under consideration, or, more formally, ‘the set of individual objects which we are discussing now’.

UNIVERSAL Quantifier

The first of the two quantifiers is the universal quantifier ∀: “for all” or “for every”.

The domain of the quantifier when we say (∀x) includes all those objects that can take up the value of ‘x’ in the universe of discourse; all of them have to bind.

Renaming the bound variable consistently within the scope of ∀ changes nothing:
(∀x)(is_integer(x) → has_prime_fac(x))
is exactly equivalent to
(∀y)(is_integer(y) → has_prime_fac(y))

However, the following is not equivalent, because ‘y’ is left unbound:
(∀x)(is_integer(x) → has_prime_fac(y))

UNIVERSAL Quantifier

E.g.1. Every (all) student is noisy.
That is, for all x, if x is a student, then x is noisy.
For all x, (student(x) → noisy(x))
(∀x)(student(x) → noisy(x))

E.g.2. All men are mortal. Socrates is a man. Therefore Socrates is mortal.
For all y, (is_a_man(y) → is_mortal(y))
(∀y)(is_a_man(y) → is_mortal(y))

EXISTENTIAL Quantifier

The second quantifier is the existential quantifier ∃, meaning “there exists”: at least one object in the domain binds with the variable to satisfy the wff.
The scope of ∃, that is, the part of the formula to which it applies, is determined in the same way as for ∀: it is exactly where the variable is bound to some value or object within the domain of discourse.

So the use of brackets and the order of the quantifiers are very important, as seen in this example:
∀x ∃y (y = 2x)
Is it O.K. if:
∃y ∀x (y = 2x)

More about scope in the ‘Free vs. Bound’ slide.

EXISTENTIAL Quantifier

E.g.1. Some persons never learn.
That is, there exists at least one x such that x is a person and x will never learn.
(∃x)(person(x) ∧ never_learns(x))

E.g.2. Some footballers will never play in the Premier or First division.
Reformulating, there exists y such that y is a footballer and y will not play in the Premier or First division.
There exists at least one person, y, for whom ftball(y) ∧ ~(prem(y) ∨ div1(y))
(∃y)(ftball(y) ∧ ~(prem(y) ∨ div1(y)))

Free vs. Bound Variables

If P is a wff and ‘v’ is a variable, then ∀v P and ∃v P are wffs too, and every occurrence of ‘v’ in P is bound.
E.g. ∀x (student(x) → noisy(x))
‘x’ is bound within the scope of the ∀.

A variable which is not bound in P is said to be unbound or free in P.
E.g. ∀x (student(x) → noisy(y))
‘y’ is unbound within the scope of the ∀.

A sentence is a wff with NO unbound variables.

Points to Remember

• Quantified sentences make claims about some intended domain of discourse.
• A sentence of the form ∀x P(x) is TRUE iff the wff P(x) is satisfied by every object in the domain of discourse.
• A sentence of the form ∃x P(x) is TRUE iff the wff P(x) is satisfied by some object (at least one) in the domain of discourse.

Translating Quantified Sentences

∀ is often used in sentences like the following:
Every P is a Q
∀x (P(x) → Q(x))

While ∃ is normally used as follows:
There is a P which also has property Q.
∃x (P(x) ∧ Q(x))

It is often tempting to translate the latter sentence as:
∃x (P(x) → Q(x))
but this means something rather different, being true just in case there is an object which is either not a P or else is a Q; in particular, it is true when there is no object satisfying P(x).

Vacuously True Sentences

• Suppose we try to evaluate the sentence:
  ∀x (student(x) → noisy(x))
  in a world where there are no students. Nobody satisfies the antecedent student(x), so by the truth table for implication every instance comes out True; hence the universal statement holds.
• From this we can conclude that any sentence of the form:
  ∀x (P(x) → Q(x))
  is vacuously true in a world where nothing satisfies the antecedent P(x).

Complex Noun Phrases

• Most of the time we use ∀ to translate sentences with “every” or “all”.
  Every small dog that is at home is happy.
  ∀x ((small(x) ∧ dog(x) ∧ at_home(x)) → happy(x))
• and we use ∃ to translate sentences involving “a”.
  A small happy dog is at home.
  ∃x (small(x) ∧ happy(x) ∧ dog(x) ∧ at_home(x))
• However, sometimes “a” also has a universal sense, as in:
  A dog is a kind of mammal.
  ∀x ∃y (dog(x) → (kind_of(x,y) ∧ mammal(y)))

Quantifier Equivalence

• If it is a known fact that not everything has some property, then it follows that there is something that does not have that property.
• Symbolically, ~∀x P(x) ↔ ∃x ~P(x)
• Similar to ~(A ∧ B ∧ …) ↔ (~A ∨ ~B ∨ …)
• ~(P(x1) ∧ P(x2) ∧ …) ↔ (~P(x1) ∨ ~P(x2) ∨ …)
• Similarly, if it is a known fact that it’s not the case that something has a property, then all things do not have that property.
• Symbolically, ~∃x P(x) ↔ ∀x ~P(x)
• Similar to ~(A ∨ B ∨ …) ↔ (~A ∧ ~B ∧ …)
• ~(P(x1) ∨ P(x2) ∨ …) ↔ (~P(x1) ∧ ~P(x2) ∧ …)

Multiple Quantifiers

• Some cube is to the left of some tetrahedron.
  ∃x ∃y (cube(x) ∧ tet(y) ∧ leftof(x,y))
  Expressing the logical formula precisely as an English sentence, reading from left to right:
  ‘There exists x, there exists y, such that x is a cube, y is a tetrahedron and x is to the left of y’
• All cubes are to the left of all tetrahedrons.
  ∀x ∀y ((cube(x) ∧ tet(y)) → leftof(x,y))
  ‘For all x, for all y, if x is a cube and y is a tetrahedron, then x is to the left of y’

Prenex Form

• When translating from English to FOL, quantifiers and connectives usually end up mixed together.
• In prenex form all quantifiers are put at the start of the sentence, followed by a wff that is quantifier-free.
  Q1v1 Q2v2 … Qnvn P
• where every Qi is either ∀ or ∃, each vi is a variable and P is a quantifier-free wff.

Restrictions and Sets

Restricted quantifiers: quantifiers that are restricted to the members of some set.
E.g. let P(x) denote the predicate that is true when x is a person. Then the set P generated by P(x) is the set of all persons.
This is denoted formally by (∀x ∈ P).
Alternatively you can define P(x) and then state that x ∈ P. Then you can simply write down ∀x.

Restrictions and Sets

P(x) generates P, which is the set of all people: (∀x ∈ P)

FOL to English Translation

• Two main steps:
  1. Translate the formula by writing the literal meanings of the logical symbols and predicates as they occur.
  2. Reword the sentence so that it has the same logical meaning (the truth or falsity of the sentence should not change) but is written in more ‘acceptable’ English. This usually involves avoiding the use of variable names.
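The two translation steps above can be sketched in code. This is only a minimal illustration, assuming a hypothetical nested-tuple encoding of formulas that is not part of the course notation: `literal_reading` spells out the symbols in order (step 1), and `holds` evaluates a sentence over a small finite domain, which is one way to check that a rewording in step 2 has not changed the truth value. The names `literal_reading`, `holds`, `domain` and `facts` are all assumptions made for this example.

```python
# Hypothetical encoding (illustrative only):
#   ("pred", name, var, ...)            atomic wff, e.g. ("pred", "cube", "x")
#   ("not", f), ("and", f, g), ("or", f, g), ("implies", f, g)
#   ("forall", var, f), ("exists", var, f)

def literal_reading(f):
    """Step 1: write out each logical symbol and predicate as it occurs."""
    op = f[0]
    if op == "pred":
        return f"{f[1]}({', '.join(f[2:])})"
    if op == "not":
        return f"it is not the case that {literal_reading(f[1])}"
    if op == "and":
        return f"({literal_reading(f[1])} and {literal_reading(f[2])})"
    if op == "or":
        return f"({literal_reading(f[1])} or {literal_reading(f[2])})"
    if op == "implies":
        return f"(if {literal_reading(f[1])} then {literal_reading(f[2])})"
    if op == "forall":
        return f"for all {f[1]}, {literal_reading(f[2])}"
    if op == "exists":
        return f"there exists {f[1]} such that {literal_reading(f[2])}"
    raise ValueError(f"unknown operator: {op}")

def holds(f, domain, facts, env=None):
    """Evaluate f over a finite domain; facts is a set of (name, obj, ...) tuples.

    A free variable raises KeyError: only sentences have a truth value.
    """
    env = env or {}
    op = f[0]
    if op == "pred":
        return (f[1], *(env[v] for v in f[2:])) in facts
    if op == "not":
        return not holds(f[1], domain, facts, env)
    if op == "and":
        return holds(f[1], domain, facts, env) and holds(f[2], domain, facts, env)
    if op == "or":
        return holds(f[1], domain, facts, env) or holds(f[2], domain, facts, env)
    if op == "implies":
        return (not holds(f[1], domain, facts, env)) or holds(f[2], domain, facts, env)
    if op == "forall":
        return all(holds(f[2], domain, facts, {**env, f[1]: d}) for d in domain)
    if op == "exists":
        return any(holds(f[2], domain, facts, {**env, f[1]: d}) for d in domain)
    raise ValueError(f"unknown operator: {op}")

# "All cubes are to the left of all tetrahedrons."
all_cubes_left = ("forall", "x", ("forall", "y",
    ("implies",
     ("and", ("pred", "cube", "x"), ("pred", "tet", "y")),
     ("pred", "leftof", "x", "y"))))

domain = ["a", "b", "c"]  # a toy universe of discourse
facts = {("cube", "a"), ("tet", "b"), ("tet", "c"),
         ("leftof", "a", "b"), ("leftof", "a", "c")}

print(literal_reading(all_cubes_left))
print(holds(all_cubes_left, domain, facts))
```

In a world with no cubes or tetrahedrons (an empty `facts` set) the same sentence still evaluates to true, which matches the vacuous-truth case discussed earlier.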
Alternative Notations

Course Notation    Alternative Notations
¬P                 ~P, !P, Np
P ∧ Q              P&Q, P&&Q, P.Q, PQ, Kpq
P ∨ Q              P|Q, P||Q, P+Q, Apq
P → Q              Cpq
P ↔ Q              Epq
∀X, ∃Y

Some simple exercises…

Let
van(x) represent ‘x is a van’,
car(x) represent ‘x is a car’,
bike(x) represent ‘x is a bike’,
exp(x,y) represent ‘x is more expensive than y’,
faster(x,y) represent ‘x is faster than y’.

Translate the following formulae into natural language:
1. ∀x (bike(x) → ∃y (car(y) ∧ exp(y,x)))
2. ∀x ∀y ((van(x) ∧ bike(y)) → faster(x,y))
3. ∃z (car(z) ∧ ∀x ∀y ((van(x) ∧ bike(y)) → (faster(z,x) ∧ faster(z,y) ∧ exp(z,x) ∧ exp(z,y))))

English to FOL Translation

• Inverse translation is much more challenging. Three main steps:
• Identify predicates in the sentence.
• Rearrange the sentence into a logical formulation. Capture the essential meaning of the sentence using predicates, quantifiers and connectives.
• Cater for expressions involving time such as ‘always’, ‘afterwards’, etc.

Some more simple exercises…

• Translate the following natural language statements into predicate logic:
1. Every school boy thinks that Robin Hood is a hero.
2. Some people will never learn to keep their mouth shut or to respect other people.
3. A person’s mother is always older than that same person.

eXtensible Markup Language (XML)

• Universal structured data representation language
• Framework for web publishing
• E-Commerce Applications (B2B/B2C)
• “Point of Creation” Bottleneck – people are lazy!
• Too time-consuming to mark up NLP texts manually

eXtensible Markup Language (XML)

• NLP applications should help in automatic markup of texts using XML
• Gives back much richer text structure and documents
• Adds intelligence to documents
• Disambiguation and search functionalities

Semantic Web

The Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries. It is a collaborative effort led by W3C with participation from a large number of researchers and industrial partners. It is based on the Resource Description Framework (RDF), which integrates a variety of applications using XML for syntax and URIs for naming.

"The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation." -- Tim Berners-Lee, James Hendler, Ora Lassila, The Semantic Web, Scientific American, May 2001

Semantic Web

• Next generation Web?
• http://www.w3.org/2001/sw/
• Many small applications, lots of hype, few widespread uses
• Most notable: RDF/RSS/Atom for blogs and news syndication (also for podcasting)

NLP for XML (NLPRS 2001)

• Ontology extraction into XML-based structured languages using XML Schema
• Message translation for multilingual B2B, B2C e-commerce applications
• Automatic XML to XML schema mapping by XML vocabulary translators with morphological analyzers
• Web (XHTML) resource discovery and indexing
• Automatic hyperlink (XLink) generation
• Multimodal techniques to take advantage of XML compound documents (e.g. search the key string in XHTML, MathML, SVG and SMIL components at the same time)

XML for NLP (NLPRS 2001)

• NL corpora representation languages and the conversions among them, from and to RDB, and from raw text
• XML-based Machine Translation / Interlingua
• XML-based multilingual Web contents management system
• Tree transducers implemented by XSLT
• IR powered by both NLP and XML
• Task-oriented summarization using XML Schemas
• VoiceXML applications and the dialogue scenario generation
• Foreign language e-Education (CALL) material (texts, drills, grading systems etc.) generation by XML

Taxonomies

Taxonomy (from Greek ταξινομία (taxinomia) from the words taxis = order and nomos = law) may refer to either the classification of things, or the principles underlying the classification. Almost anything, animate objects, inanimate objects, places, and events, may be classified according to some taxonomic scheme.
Wikipedia Definition

Taxonomies/Ontologies

• Used to mark up texts
• Define XML tags (or SGML) used to mark up semantic objects
• Example: use a <noun> tag to mark up “nouns”
• Frequently hierarchical
• Confusion with ontologies: often referring to the same thing (ontologies used more in Knowledge Management)
• Ontologies sometimes seen as being broader in scope than taxonomies

Scientific vs. Folk

Scientific taxonomies:
• Objective
• Universal
• Example: Biological Taxonomy (Linnaean/Evolutionary Tree)

Folk taxonomies:
• Subjective
• Vernacular naming system
• Social knowledge representation
• Example: Flickr, del.icio.us, podcast labels
• More or less the same thing as folksonomies

Taxonomies/Ontologies

• Formally represent an acyclic graph/tree
• XML or SGML frequently used as base language
• Prolog can also be used (80’s AI projects)
• FOL can also be used (Cyc)
• Modern standards: OWL, RDF, RDFS, OIL, DAML, DAML+OIL
• Welcome to acronym world!

Stuff to look up

• RDF, DAML+OIL
• RSS
• Podcasting – behind the scenes
• (Non-comprehensive) list of NLP-related projects using ontologies: http://www.cs.utexas.edu/users/mfkb/related.html
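The taxonomy slides describe a taxonomy as a tree that is frequently encoded in XML. The following is a minimal sketch of that idea, not a standard format: the `<taxon>` element, its `name` attribute and the helper names `parent_map` and `ancestors` are all hypothetical choices made for this example. It parses a tiny hierarchy with Python's standard `xml.etree.ElementTree` module and walks up the tree to recover the "is a kind of" chain for a term.

```python
import xml.etree.ElementTree as ET

# Hypothetical taxonomy markup: element and attribute names are illustrative only.
TAXONOMY_XML = """
<taxon name="animal">
  <taxon name="mammal">
    <taxon name="dog"/>
    <taxon name="cat"/>
  </taxon>
  <taxon name="bird"/>
</taxon>
"""

def parent_map(root):
    """Map each child element to its parent (ElementTree stores no parent links)."""
    return {child: parent for parent in root.iter() for child in parent}

def ancestors(root, name):
    """Return the chain of broader terms above `name`, nearest first.

    The result is empty for the root term and for a name that is not present.
    """
    parents = parent_map(root)
    node = next((e for e in root.iter("taxon") if e.get("name") == name), None)
    chain = []
    while node is not None and node in parents:
        node = parents[node]
        chain.append(node.get("name"))
    return chain

root = ET.fromstring(TAXONOMY_XML)
print(ancestors(root, "dog"))  # ['mammal', 'animal']
```

Because the structure is a tree, every term has exactly one chain of broader terms; an ontology language such as RDFS or OWL would be needed once a term can have several parents.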
