Practical Natural Language Processing
CPSC 533 Artificial Intelligence
Caroline Hrouda & Marilena Rossi
April 6, 2000

Contents
• Scaling Up the Grammar
• Ambiguity
• Discourse Understanding

Scaling Up the Grammar
Making sense of real-life language requires much more sophistication at every step of the language interpretation process.
Grammar - the study of the classes of words, their inflections, and their functions and relations in the sentence.

Nominal Compounds and Apposition
• Nominal Compounds - strings of nouns that combine to form a larger unit, which can still combine with an article to form an NP.
• For the larger noun unit we need a rule:
  Rule Noun -> Noun Noun
  [Noun [Noun [Noun POSTSCRIPT language] code] [Noun input file]]
• Nominal Compound - “input file”, with semantics:
  λf ∃i Input(i) ∧ File(f) ∧ NN(i,f)
• Achieved by the rule:
  Rule Noun(λy ∃x sem1(x) ∧ sem2(y) ∧ NN(x,y)) -> Noun(sem1) Noun(sem2)
• Apposition - a construction of two noun phrases concatenated together in which both noun phrases refer to the same thing.
• Restrictive Apposition - restricts the set of possible referents, so that the thing referred to is not confused with something else, e.g. “David MacDonald” “the insane professor”
• A simplified rule for apposition is:
  Rule NP([q x sem1 ∧ sem2]) -> NP([q x sem1]) NP([q x sem2])

Adjective Phrases
• Adjective - serves as a modifier of a noun to denote a quality of the thing named, to indicate its quantity or extent, or to specify a thing as distinct from something else.
• Intersective Semantics - formed by a conjunction of the semantics contributed by the adjective and by the noun.
  “the foot is stinky” or “the stinky foot”
  λw Stinky(w) ∧ Foot(w)
• If all adjectives had intersective semantics, this rule would be enough:
  Rule Noun(λx sem1(x) ∧ sem2(x)) -> Adjective(sem1) Noun(sem2)
Note: the semantic relation between adjective and noun is often more complicated than just intersection.
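The two semantic rules above (adjective plus noun, and noun plus noun) can be pictured as functions that build a new one-place predicate from two existing ones. The sketch below is only an illustration: it assumes the rules are read as a lambda abstraction over the noun's variable, with the first noun of a compound existentially quantified, and the toy world (DOMAIN, FACTS, NN) and all helper names are hypothetical.

```python
# Hypothetical toy world: every name below is an assumption for illustration.
DOMAIN = ["foot1", "foot2", "file1", "input1"]
FACTS = {
    "Stinky": {"foot1"},
    "Foot": {"foot1", "foot2"},
    "File": {"file1"},
    "Input": {"input1"},
}
# NN(x, y): some unspecified relation linking the two nouns of a compound.
NN = {("input1", "file1")}


def pred(name):
    """Return the one-place predicate called `name` as a Python function."""
    return lambda x: x in FACTS[name]


def adjective_noun(sem1, sem2):
    """Intersective rule: the result holds of x when both sem1(x) and sem2(x) hold."""
    return lambda x: sem1(x) and sem2(x)


def noun_noun(sem1, sem2):
    """Compound rule: the result holds of y when sem2(y) and some x has sem1(x) and NN(x, y)."""
    return lambda y: any(sem1(x) and sem2(y) and (x, y) in NN for x in DOMAIN)


def extension(sem):
    """List the domain entities that satisfy a predicate."""
    return [x for x in DOMAIN if sem(x)]


stinky_foot = adjective_noun(pred("Stinky"), pred("Foot"))
input_file = noun_noun(pred("Input"), pred("File"))
```

Calling extension(stinky_foot) lists the entities that are both stinky and feet; an adjective whose meaning is not intersective could not be built with adjective_noun, which is the point of the note above.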
Determiners
• Determiner - a word belonging to a group of limiting noun modifiers characterized by occurrence before descriptive adjectives modifying the same noun.
• Articles are just one type of determiner, e.g. “a”, “the”
• A simple example:
  Quasi-logical form: [3 x Sasquatch(x)]
  Gives the following grammar rules:
  Det(q) -> Article(q)
  Det(q) -> Number(q)
  NP([q x noun(x)]) -> Det(q) Noun(noun)

Noun Phrases
• Change Article to Determiner, and include case information and agreement in person and number.
  Rule NP(case, Person(3), number, [q x sem(x)]) -> Det(number, q) Noun(number, sem)
  - the case variable is unbound, so the NP can be used in either subjective or objective case
  - number can be singular or plural, but the rule says Det and Noun must have the same number (there are exceptions: Det = “the” or Noun = “sheep” can be singular or plural)
• To enforce subject / verb agreement:
  Rule S(rel(obj)) -> NP(Subject, person, number, obj) VP(person, number, rel)
  e.g. “I am” vs. the ungrammatical “We am”
• It is possible to form an NP from a noun with no determiner, such as a name.
  “Alice ate” is:
  ∃e e ∈ Eat(Alice, Past)
  Since many things can be named Alice:
  ∃e e ∈ Eat([∃!x Name(x) = Alice], Past)
  Rule NP(case, Person(3), number, [∃!x Name(x) = name]) -> Name(number, name)
  Name(Singular, Alice) -> Alice

Clausal Complements
• So far, all verbs have taken only noun phrases and prepositional phrases as complements, but some verbs accept clauses.
• Clause - a group of words containing a subject and predicate and functioning as a member of a complex or compound sentence.
• The same subcategorization mechanism from before handles this:
  VP(subcat) -> VP([S|subcat]) S
  VP(subcat) -> VP([VP|subcat]) VP
  Verb([S]) -> believe ( I believe [he has left] )
  Verb([VP]) -> want ( I want [to go there] )
• Infinitive - a verb form normally identical in English with the first person singular that performs some functions of a noun and at the same time displays some characteristics of a verb.
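The subcategorization rules for clausal complements can be sketched as list processing: each verb carries a list of the complement categories it still needs, and attaching a complement removes the first entry. This is a minimal sketch, not a parser; the LEXICON table and the function names are hypothetical, and complements are represented only by their category labels.

```python
# Hypothetical lexicon: each verb lists the complement categories it still needs.
LEXICON = {
    "believe": ["S"],   # Verb([S])  -> believe
    "want": ["VP"],     # Verb([VP]) -> want
}


def attach(subcat, complement_category):
    """Consume the first expected complement, as in VP(subcat) -> VP([C|subcat]) C.

    Returns the remaining subcat list, or None if the complement does not match.
    """
    if subcat and subcat[0] == complement_category:
        return subcat[1:]
    return None


def is_complete_vp(verb, complement_categories):
    """True if the complements, taken in order, exactly exhaust the verb's subcat list."""
    subcat = LEXICON[verb]
    for category in complement_categories:
        subcat = attach(subcat, category)
        if subcat is None:
            return False
    return subcat == []
```

Under these assumptions is_complete_vp("believe", ["S"]) accepts “I believe [he has left]”, while is_complete_vp("want", ["S"]) rejects a full clause after “want”.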
Relative Clauses
• Gap - the _ symbol indicates the place where the head noun phrase (the person) would logically appear to complete the sentence:
  the person [that I saw _]
• Filler - the head noun phrase is the filler of the gap:
  the person [that I looked [PP at _]]
• Long-distance dependency - a filler-gap relation that reaches down a potentially unbounded number of nodes into the parse tree:
  [the person]i [that [S you said [S you thought [S I gave the book to _i]]]]
  The subscript i on parse nodes shows that there is an identity relationship: the person is the same as the recipient of the book.
• Relative Clauses - an NP can be modified by following it with a relative clause. A relative clause consists of a relative pronoun followed by a sentence that contains an NP gap.
  “the person that I saw _”
  Rule NP(gap) -> NP(gap) RelClause
  RelClause -> Pronoun(Relative) S(Gap(NP))
• The empty string comprises an NP with an NP gap in it.
  Rule NP(Gap(NP)) -> ε
  The gap has to be passed along in the rest of the grammar, e.g.
  S(Gap(Concat(g1, g2))) -> NP(Gap(g1)) VP(Gap(g2))
  If g1 and g2 are both empty, the S as a whole has no gap.

Questions? huh?
• In English there are two main types of questions:
1) Yes / No - “Did you do that?”
  Subject-aux inversion - like a declarative sentence, but with an auxiliary verb that appears before the subject NP. Sinv denotes a sentence that has this inversion.
  Auxiliary - a verb functioning in a subsidiary capacity, accompanying another verb and typically expressing person, number, mood, or tense.
  Rule S -> Question
  Question -> Sinv
  Sinv -> Aux NP VP
2) Wh (gapped) - “What did you see _?”
  Expects a noun phrase as an answer.
  In the simplest case, it is an interrogative pronoun followed by a gapped Sinv.
  Interrogative Pronoun - who, what, where, when, why and how
  Rule Question -> Pronoun(Interrogative) Sinv(Gap(NP))
• Other question constructions, which are less common:
1) Echo - “You did what?”
2) Rising Intonation - “You smell something?”
3) Yes / No with “be” - “Is it dead?”
4) Wh Subject - “When is this class over, Prof. Jacob?”
5) Wh NP - “[What trashy novel] did you read _ ?”
6) Wh PP - “[With what] did you write it _ ?”

Ambiguity
Handling agrammatical strings:
• Syntactic Evidence
• Lexical Evidence
• Semantic Evidence
• It is difficult to find the correct interpretation, especially if one can only use lexical, syntactic and semantic rules.
• Try to use probabilistic inference with models such as belief networks and hidden Markov models.
• Belief networks help determine how to combine lexical, syntactic and semantic evidence.
• The difficulty lies in selecting the appropriate evidence and deciding how to implement it.
• Thus, it’s very important to note the difference between the evidence and the source of ambiguity.

Syntactic Evidence
• Source of ambiguity: adverbs and prepositional phrases (a.k.a. modifiers) can be applied to many different ‘heads’.
• Adverb - a word belonging to one of the major form classes, typically serving as a modifier of a verb, an adjective, another adverb, a preposition, a phrase, a clause, or a sentence and expressing some relation of manner or quality, place or time.
• Solution: conclude that the modifier should be applied to the most recent head, e.g.
  I walked through the sludge near my house.
• ‘near my house’ can be applied to either ‘I’ or ‘the sludge’. According to the solution, apply it to ‘the sludge’.

Lexical Evidence
• Source of ambiguity: a prepositional phrase can modify either the verb or a noun in the sentence, changing the meaning of the whole sentence.
• e.g. Lee positioned the dress on the rack.
  Kim wanted the dress on the rack.
• ‘On the rack’ determines the interpretation of the sentence: in the first sentence, it modifies the verb; in the second sentence, it modifies the noun ‘dress’.
• Solution: use the verb’s preference for sub-categorisation (‘positioned’ prefers a prepositional phrase complement, ‘wanted’ does not).

Semantic Evidence
• Source of ambiguity: lexical ambiguity, where the favoured word sense depends on the context of the sentence.
• e.g. Ball, diamond, bat, base together imply baseball, but each word individually has a different, more common meaning.

Semantic Evidence
• Another common ambiguity: phrases introduced by ‘with’, where ‘with’ can express many different relations to the noun phrase that follows.

  Sentence                            Relation
  I ate macaroni with ketchup         (ingredient of macaroni)
  I ate macaroni with dessert         (side dish of macaroni)
  I ate macaroni with abandon         (manner of macaroni eating)
  I ate macaroni with a chopstick     (instrument of difficult eating)
  I ate macaroni with my dog          (accompanier of macaroni eating)

• Solution: prefer the interpretations that refer to the most likely events, although the correct interpretation must still be sought out.

Metonymy
• Metonymy - a figure of speech where one object is used to represent another.
• It’s a frequent occurrence in spoken language and difficult to represent grammatically, e.g.
  Microsoft announced a loss of 17 billion dollars.
• We know that “Microsoft” here really stands for a spokesperson for the company Microsoft.

Metonymy
• Solution: introduce a new level of ambiguity to represent the new semantics.
• Provide two objects for the semantic interpretation:
  one for the object the phrase literally refers to
  one for the metonymic reference.
• Then state that there is a relation between the two objects.

Metonymy
• In current grammar:
  ∃x,e Microsoft(x) ∧ e ∈ Announce(x, Past)
• Needs to be altered to:
  ∃m,x,e Microsoft(x) ∧ Metonymy(m,x) ∧ e ∈ Announce(m, Past)
• This is only a representation of the problem. We still need to define constraints for the metonymy relation.
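The two-object representation can be pictured with a small sketch in which m is the thing actually meant and x is the thing literally named. The two constraints used for the relation here (identity, and a spokesperson standing behind an organisation) and every name and fact in the toy knowledge base are assumptions chosen for illustration, not part of the grammar.

```python
# Hypothetical knowledge base; all names and facts are assumptions for illustration.
ORGANISATIONS = {"Microsoft"}
SPOKESPERSON_OF = {"Pat": "Microsoft"}   # hypothetical spokesperson -> organisation
CAN_ANNOUNCE = {"Pat"}                   # only people can literally announce things


def metonymy(m, x):
    """Assumed constraints on Metonymy(m, x): m is what is meant, x is what is named."""
    if m == x:
        # No metonymy: the phrase is read literally.
        return True
    # An organisation's name can stand for one of its spokespeople.
    return x in ORGANISATIONS and SPOKESPERSON_OF.get(m) == x


def announcers(x, candidates):
    """Find every m with Metonymy(m, x) that can be the agent of Announce(m, Past)."""
    return [m for m in candidates if metonymy(m, x) and m in CAN_ANNOUNCE]
```

With these assumed facts, announcers("Microsoft", ["Microsoft", "Pat", "Kim"]) keeps only the spokesperson: the literal reading is allowed by the relation but ruled out because a company cannot itself announce anything.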
Metonymy
• Case 1: No metonymy, where x and m are identical:
  ∀m,x (m = x) -> Metonymy(m,x)
• Case 2: Representational reference for an organisation:
  ∀m,x Organisation(x) ∧ Spokesperson(m,x) -> Metonymy(m,x)

Metonymy
• Other examples:
  – referring to an author for his/her works
  – referring to a producer instead of the product
  – referring to a group name for the whole (e.g. a team)
  – slang

Metaphor
• Metaphor - a figure of speech where a phrase with one literal meaning is used to suggest another meaning through analogy.
  – Plays a large part in everyday language, not just poetry, e.g.
    The system has crashed.

Metaphor
• Solution 1: define all known/common metaphors in the lexicon, adding a new sense to the meaning of words (e.g. fallen, dipped refer to some scale other than altitude).
• Note that this doesn’t necessarily yield the correct interpretation of the sentence.
• Solution 2: include explicit knowledge of common metaphors and use it to derive a new sense for the sentence.

Discourse Understanding
• Discourse or Text - any string of language, usually one that is more than one sentence long. Understanding it requires understanding text longer than one sentence.
• It is easier to break language down into individual sentences, but we need to grasp the relations between all the sentences in a given text.
• Discourse is produced through three steps:
  – intention
  – generation
  – synthesis

Discourse Understanding
• Discourse understanding is done through:
  – perception
  – analysis (semantic, syntactic, lexical)
  – disambiguation
  – incorporation
• This all depends on the amount of knowledge that an agent has. Two different knowledge bases will lead to two different understandings of a text.

Discourse Understanding
• General Discourse Equation:
  KB’ = DiscourseUnderstanding(text, KB)
  The two versions of KB being:
  KB = knowledge base of the agent
  KB’ = agent’s knowledge after understanding the text.
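The general discourse equation can be read as a function from a text and a knowledge base to a new knowledge base. The sketch below is a deliberately crude illustration of that shape: the rules table stands in for perception, analysis, disambiguation and incorporation, and the facts, the rule and all function names are hypothetical.

```python
def discourse_understanding(text, kb, rules):
    """Return KB' = DiscourseUnderstanding(text, KB) without modifying KB.

    `rules` maps (sentence, background fact) -> inferred fact. It is a stand-in
    for the real perception, analysis, disambiguation and incorporation steps.
    """
    new_kb = set(kb)
    for sentence in text:
        new_kb.add(("said", sentence))           # incorporate the sentence itself
        for (trigger, background), inferred in rules.items():
            if trigger == sentence and background in new_kb:
                new_kb.add(inferred)             # inference depends on prior knowledge
    return frozenset(new_kb)


# Hypothetical example: the same text read against two different knowledge bases.
RULES = {
    ("The system has crashed.", "a system is a computer"): "the computer stopped working",
}
TEXT = ["The system has crashed."]
kb_expert = frozenset({"a system is a computer"})
kb_novice = frozenset()

understood_by_expert = discourse_understanding(TEXT, kb_expert, RULES)
understood_by_novice = discourse_understanding(TEXT, kb_novice, RULES)
```

Both agents record that the sentence was said, but only the agent that already holds the background fact draws the further inference, which illustrates how two knowledge bases lead to two understandings of one text.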
Discourse Understanding
• Six types of knowledge are needed to achieve understanding:
1) General knowledge about the world
2) General knowledge about the structure of coherent discourse
3) General knowledge about syntax and semantics
4) Specific knowledge about the situation being discussed
5) Specific knowledge about the beliefs of the characters
6) Specific knowledge about the beliefs of the speaker

Discourse Understanding
• Interpretation tends towards a priori knowledge of meaning.
• Let’s look at an example discourse:
  – Pete went to a car race.
  – He shouted very loudly.
  – He had to leave at 6pm.
• There can be many interpretations of this discourse, depending on the hearer’s knowledge base.

Structure of Coherent Discourse
• First, conjunction is NOT commutative in natural languages.
• For example:
  Dr. Monroe went golfing. It started to rain. He was struck by lightning.
  versus:
  Dr. Monroe went golfing. He was struck by lightning. It started to rain.
• temporal ordering is important
• purpose is important

Structure of Coherent Discourse
• Segment - a clause, a complete sentence, or a group of consecutive sentences. Discourses are composed of segments.
• Coherence relation - each segment in a discourse is related to a previous segment by a coherence relation, which determines the role of each segment in the discourse.
• The hearer must discern the relations between segments, not just resolve the ambiguities.
• Coherence relations constrain the possible meanings of each sentence (single sentences have many meanings, but together only a few).

Structure of Coherent Discourse
• Hobbs’ Theory - a speaker does four things to make discourse:
1) Wants to convey a message.
2) Has a motivation or goal for doing number one.
3) Wants to make it easy to understand the message.
4) Links information to what the hearer already knows.
• A sentence is a coherent extension to discourse if it serves one of the four points above.
Structure of Coherent Discourse
• Example:
1) A funny thing happened yesterday
2) Wendy went to a fast food restaurant
3) Wendy is a sandwich
4) The clerk said “we don’t serve food here”
5) Wendy was shocked and hurt
6) The clerk said they’d make an exception this time
7) She was very embarrassed by her forgetfulness
• Thus, two adjacent segments si and sj stand in the evaluation coherence relation if one can infer from si that sj is a step in the speaker’s plan to achieve a discourse goal.

Structure of Coherent Discourse
• Different types of Coherence Relations:
1) Evaluation
2) Enablement
3) Causal
• These come from the speaker’s goal.
• Therefore, understanding has two levels of plan recognition:
  – the speaker’s plans
  – the characters’ plans in the discourse.

Structure of Coherent Discourse
• Other Coherence Relations:
  – elaboration, used by the speaker to make discourse easier to understand by saying something differently
  – explanation, where the speaker adds new details to the hearer’s existing knowledge to help understand the discourse at hand.
• A more elaborate set of coherence relations was developed by Mann and Thompson: solutionhood, evidence, justification, motivation, reason, sequence, enablement, elaboration, restatement, condition, circumstance, cause, concession, background, thesis-antithesis.

Structure of Coherent Discourse
• Grosz and Sidner’s theory tracks where attention is focussed during a discourse by the speaker and the hearer.
• Attention/focus changes according to which segment is added (pushed) onto the stack or removed (popped) from it, and when.
• This alters the direction of the focus.

Structure of Coherent Discourse

  Discourse (A)                 Discourse (B)
  I went to Edmonton            I went to Edmonton
  I bought you Perogies         Then I hitch-hiked home
  Then I hitch-hiked home       I went to K-Mart
  I went to K-Mart              I bought you Perogies
  I bought some underwear       I bought some underwear

Chit - Chat with Chatterbot
Go to Chatterbot!
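The push/pop view of attention in Grosz and Sidner's theory can be illustrated with the two Edmonton discourses. The sketch below uses toy rules that are assumptions, not part of the theory: “I went to X” pushes a place onto the focus stack, a sentence about going home pops it, and each purchase is attributed to whatever place is on top of the stack. The sentence order assumed for Discourse (A) and Discourse (B) is spelled out in the code.

```python
def where_bought(discourse):
    """Attribute each purchase to the place on top of a focus stack (toy rules)."""
    stack, purchases = [], {}
    for sentence in discourse:
        if sentence.startswith("I went to "):
            stack.append(sentence[len("I went to "):])    # push a new focus
        elif "home" in sentence and stack:
            stack.pop()                                   # leaving pops the focus
        elif sentence.startswith("I bought "):
            item = sentence[len("I bought "):]
            purchases[item] = stack[-1] if stack else None
    return purchases


# Assumed ordering of the two example discourses.
discourse_a = [
    "I went to Edmonton", "I bought you Perogies", "Then I hitch-hiked home",
    "I went to K-Mart", "I bought some underwear",
]
discourse_b = [
    "I went to Edmonton", "Then I hitch-hiked home", "I went to K-Mart",
    "I bought you Perogies", "I bought some underwear",
]
```

The same five sentences give different readings: in discourse_a the Perogies are attributed to Edmonton, while in discourse_b the Edmonton focus has already been popped and they are attributed to K-Mart.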
Summary
1) NLP techniques make it practical to develop programs that make queries, extract information from texts, translate and so on.
2) It is possible to parse sentences efficiently using an algorithm.
3) There has been a shift from grammar to the lexicon.
4) Natural languages have a huge variety of syntactic forms.
5) Choosing the right interpretation requires evidence from many sources.
6) Interesting language comes from connected discourse rather than in isolated sentences.