Practical Natural Language Processing
CPSC 533 Artificial Intelligence
Caroline Hrouda & Marilena Rossi
April 6, 2000
• Scaling Up the Grammar
• Ambiguity
• Discourse Understanding
Scaling up the Grammar
Making sense of a real-life language requires much more
sophistication at every step of the language interpretation process.
Grammar - the study of the classes of words, their inflections,
and their functions and relations in the sentence.
Nominal Compounds and Apposition
• Nominal Compounds - strings of nouns combining to form a
larger unit that still can combine with an article to form an NP.
• For the larger noun unit we need a rule:
Rule: Noun -> Noun Noun
[Noun [Noun [Noun POSTSCRIPT language] code] [Noun input file]]
• Nominal Compound - “input file”
Semantics: λf ∃i Input(i) ∧ File(f) ∧ NN(i,f)
• Achieved by the rule:
Rule: Noun(λy ∃x sem1(x) ∧ sem2(y) ∧ NN(x,y)) -> Noun(sem1) Noun(sem2)
• Apposition - a construction of two noun phrases
concatenated together in which both noun phrases refer to
the same thing
• Restrictive Apposition - the second noun phrase restricts the
set of possible referents, so that the intended one is not
confused with something else
e.g. “David MacDonald” “the insane professor”
• A simplified rule for apposition is:
Rule: NP([q x sem1 ∧ sem2]) -> NP([q x sem1]) NP([q x sem2])
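As a rough illustration of the nominal compound rule above, the following Python sketch builds the semantics of “input file” as a string. The helper names `lexical` and `compound`, the `x0`, `x1` variable naming and the ASCII `exists` / `&` notation are assumptions made for this example only.

```python
def lexical(pred):
    """Semantics of a single noun: maps a variable name to Pred(var)."""
    return lambda var, depth=0: f"{pred}({var})"


def compound(sem1, sem2):
    """Noun -> Noun Noun: bind the modifier (sem1) existentially and relate
    it to the head (sem2) through the relation NN."""
    def sem(y, depth=0):
        x = f"x{depth}"  # depth keeps nested compounds from reusing a variable
        return (f"(exists {x}. {sem1(x, depth + 1)} & "
                f"{sem2(y, depth + 1)} & NN({x},{y}))")
    return sem


# "input file": Input is the modifier, File is the head noun.
input_file = compound(lexical("Input"), lexical("File"))
print(input_file("f"))
```

Because `compound` returns the same kind of function it accepts, it can be applied again to build longer compounds such as the bracketed POSTSCRIPT example.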
Adjective Phrases
• Adjective - serves as a modifier of a noun to denote a quality of
the thing named, to indicate its quantity or extent, or to specify
a thing as distinct from something else.
• Intersective Semantics - formed by a conjunction of the
semantics contributed by the adjective and by the noun.
“the foot is stinky” or “the stinky foot”
λw Stinky(w) ∧ Foot(w)
• If all adjectives had intersective semantics, the rule would be:
Rule: Noun(λx sem1(x) ∧ sem2(x)) -> Adjective(sem1) Noun(sem2)
Note: the semantic relation between adjective and noun is often more complicated than
just intersection.
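A minimal sketch of intersective semantics, assuming predicates are modelled as Python functions over a small made-up set of objects. The names `intersective`, `STINKY` and `FEET` are hypothetical and only serve the example.

```python
def intersective(adj_sem, noun_sem):
    """Noun -> Adjective Noun: x must satisfy both the adjective and the noun."""
    return lambda x: adj_sem(x) and noun_sem(x)


# Hypothetical toy world: which objects are stinky and which are feet.
STINKY = {"left_foot", "old_cheese"}
FEET = {"left_foot", "right_foot"}

stinky_foot = intersective(lambda x: x in STINKY, lambda x: x in FEET)

# Only objects in both sets count as a "stinky foot".
print([obj for obj in sorted(STINKY | FEET) if stinky_foot(obj)])
```

This only covers the intersective case; adjectives whose meaning depends on the noun would need a different treatment.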
• Determiner - a word belonging to a group of limiting noun
modifiers characterized by occurrence before descriptive
adjectives modifying the same noun.
• Articles are just one type of determiner, e.g. “a”, “the”
• A simple example:
Quasi-logical form: [3x Sasquatch(x)]
Gives the following grammar rules:
Det(q) -> Article(q)
Det(q) -> Number(q)
NP([qx noun(x)]) -> Det(q) Noun(noun)
Noun Phrases
• Change Article to Determiner, and include case information
and agreement in person and number.
Rule:
NP(case, Person(3), number, [qx sem(x)]) -> Det(number, q) Noun(number, sem)
- the case variable is unbound: the NP can be used in either the subjective or the objective case
- number can be singular or plural, but the rule says Det and Noun must have the same number
(There are exceptions: Det = “the” or Noun = “sheep” can be singular or plural.)
• To enforce subject / verb agreement:
Rule: S(rel(obj)) -> NP(Subject, person, number, obj) VP(person, number, rel)
e.g. “I am” vs. “We am” (the second violates agreement)
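The agreement constraint in the S rule can be sketched as a simple feature check. This is only an illustration: the `Phrase` type, the `sentence` function and the feature values are assumptions for the example, not part of the grammar on the slide.

```python
from typing import NamedTuple, Optional


class Phrase(NamedTuple):
    words: str
    person: int   # 1, 2 or 3
    number: str   # "singular" or "plural"


def sentence(np: Phrase, vp: Phrase) -> Optional[str]:
    """S -> NP VP, allowed only when person and number are the same."""
    if (np.person, np.number) != (vp.person, vp.number):
        return None  # agreement failure, no sentence is built
    return f"{np.words} {vp.words}"


i = Phrase("I", 1, "singular")
we = Phrase("We", 1, "plural")
am = Phrase("am", 1, "singular")

print(sentence(i, am))   # agreement holds
print(sentence(we, am))  # number mismatch
```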
• It is possible to form an NP from a noun with no determiner:
“Alice ate” is ∃e e ∈ Eat(Alice, Past)
Since many things can be named Alice:
∃e e ∈ Eat([∃!x Name(x) = Alice], Past)
Rule:
NP(case, Person(3), number, [∃!x Name(x) = name]) -> Name(number, name)
Name(Singular, Alice) -> Alice
Clausal Complements
• So far, all verbs have taken only noun phrases and prepositional
phrases as complements, but some verbs accept clauses.
• Clause - a group of words containing a subject and
predicate and functioning as a member of a complex or
compound sentence.
• The same subcategorization mechanism from before:
VP(subcat) -> VP([S|subcat])S
VP(subcat) -> VP([VP|subcat])VP
Verb([S]) -> believe
( I believe [he has left] )
Verb([VP]) -> want
( I want [to go there] )
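The subcategorization rules above can be read as consuming one expected complement at a time. The sketch below assumes a subcat list is a plain Python list of category names; `LEXICON`, `take_complement` and `is_complete` are hypothetical helper names.

```python
# Lexical entries from the rules above: the complements each verb still needs.
LEXICON = {"believe": ["S"], "want": ["VP"]}


def take_complement(subcat, category):
    """VP(subcat) -> VP([category|subcat]) category.

    Returns the remaining subcat list, or None when the verb phrase does not
    expect this category next.
    """
    if subcat and subcat[0] == category:
        return subcat[1:]
    return None


def is_complete(subcat):
    """A verb phrase is complete once its subcat list is empty."""
    return subcat == []


# "I believe [he has left]": believe takes a sentence complement.
print(is_complete(take_complement(LEXICON["believe"], "S")))
# "want" expects a VP, so a sentence complement is rejected.
print(take_complement(LEXICON["want"], "S"))
```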
• Infinitive - a verb normally identical in English with the first person
singular that performs some functions of a noun and at the same time
displays some characteristics of a verb.
Relative Clauses
• Gap - the _ symbol marks the place where the head
noun phrase (the person) would logically appear to complete
the sentence
the person [that I saw _ ]
• Filler - the head noun phrase is the filler of the gap
the person [that I looked [PP at _ ]]
• Long distance dependency - a filler-gap relation that reaches
down a potentially unbounded number of nodes into the
parse tree
[the person]_i [that [S you said [S you thought [S I gave the book to _i ]]]]
The index i on the parse nodes shows that there is an identity relationship:
the person is the same as the recipient of the book.
• Relative Clauses - an NP can be modified by following it
with a relative clause. A relative clause consists of a
relative pronoun followed by a sentence that contains an NP gap
“the person that I saw _”
Rule: NP(Gap) -> NP(Gap) RelClause
RelClause -> Pronoun(Relative) S(Gap(NP))
• The empty string ε comprises an NP with an NP gap in it.
Rule: NP(Gap(NP)) -> ε
The gap has to be passed along in the rest of the grammar
e.g. S(Gap(Concat(g1, g2))) -> NP(Gap(g1)) VP(Gap(g2))
If g1 and g2 are both empty, the S as a whole has no gap.
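One way to picture how the gap is passed along is to treat each gap value as a tuple of missing categories, so that Concat becomes tuple concatenation. This representation, and the names `concat`, `sentence_gap` and `is_relative_clause_body`, are assumptions for this sketch.

```python
NO_GAP = ()        # constituent with nothing missing
NP_GAP = ("NP",)   # constituent with one missing NP


def concat(g1, g2):
    """Concat from the S rule: join the gaps of the NP and the VP."""
    return g1 + g2


def sentence_gap(np_gap, vp_gap):
    """S(Gap(Concat(g1, g2))) -> NP(Gap(g1)) VP(Gap(g2))."""
    return concat(np_gap, vp_gap)


def is_relative_clause_body(s_gap):
    """RelClause -> Pronoun(Relative) S(Gap(NP)): exactly one NP gap."""
    return s_gap == NP_GAP


# "that I saw _": the subject "I" has no gap, the VP "saw _" has an NP gap.
saw_gap = sentence_gap(NO_GAP, NP_GAP)
# "that I saw the dog": no gap anywhere, so it cannot follow "that".
saw_dog = sentence_gap(NO_GAP, NO_GAP)

print(is_relative_clause_body(saw_gap), is_relative_clause_body(saw_dog))
```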
Questions? huh?
• In English there are two main types of questions:
1) Yes / No - Did you do that?
Subject - aux inversion - like a declarative sentence, but with
an auxiliary verb that appears before the subject NP.
Sinv denotes a sentence with subject-aux inversion.
Auxiliary - functioning in a subsidiary capacity of a verb,
accompanying another verb and typically expressing
person, number, mood, or tense.
Rule 
S -> Question
Question -> Sinv
Sinv -> Aux NP VP
2) Wh (gapped) - What did you see _?
Will expect a noun phrase as an answer. It is an interrogative
pronoun followed by a gapped Sinv (in the simplest case)
Interrogative Pronoun - who, what, where, when, why and how
Rule  Question -> Pronoun(Interrogative) Sinv(Gap(NP))
• Other, less common question constructions:
1) Echo - “You did what?”
2) Rising Intonation - “You smell something?”
3) Yes / No with “be” - “Is it dead?”
4) Wh Subject - “When is this class over, Prof. Jacob?”
5) Wh NP - “[What trashy novel] did you read _ ?”
6) Wh PP - “[With what] did you write it _ ?”
Handling agrammatical strings:
• Syntactic Evidence
• Lexical Evidence
• Semantic Evidence
• Difficult to find the correct interpretation, especially if one
can only use lexical, syntactic, semantic rules.
• Try to use logical inference through probabilistic models,
such as belief networks and hidden Markov models
• Belief networks help determine how to combine lexical,
syntactic, semantic evidence.
• Difficulty lies in selecting the appropriate evidence and how
to implement it.
• Thus, it’s very important to note the difference between the
evidence and source of ambiguity.
Syntactic Evidence
• Source of ambiguity: adverbs and prepositional phrases
(a.k.a. modifiers) can be applied to many different ‘heads’.
• Adverb - a word belonging to one of the major form
classes, typically serving as a modifier of a verb, an
adjective, another adverb, a preposition, a phrase, a
clause, or a sentence and expressing some relation of
manner or quality, place or time.
• Solution: prefer to apply the modifier to the most
recent head.
e.g. I walked through the sludge near my house.
• ‘near my house’ can be applied to either ‘I’ or ‘the sludge’.
According to the solution, apply it to ‘the sludge’.
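The "most recent head" preference can be sketched in a few lines. The function name `attach_modifier` and the hand-written list of candidate heads are assumptions for the example; a real parser would produce the candidates itself.

```python
def attach_modifier(heads, modifier):
    """Attach the modifier to the most recent (rightmost) candidate head."""
    if not heads:
        raise ValueError("no head available for the modifier")
    return heads[-1], modifier


# Candidate heads in the order they appear in
# "I walked through the sludge near my house."
heads = ["I", "the sludge"]
attachment = attach_modifier(heads, "near my house")
print(attachment)
```

This is only a preference based on syntactic evidence; lexical and semantic evidence may still point to another head.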
Lexical Evidence
• Source of ambiguity: a prepositional phrase can modify
either the verb or a noun in the sentence, changing the
meaning of the whole sentence.
• e.g. Lee positioned the dress on the rack.
Kim wanted the dress on the rack.
• ‘On the rack’ determines the interpretation of the sentence:
in the first sentence, it modifies the verb; in the second
sentence, it modifies the noun ‘dress’.
• Solution: use the verb’s sub-categorisation preferences.
Semantic Evidence
• Source of ambiguity: lexical ambiguity, where the
favoured word sense depends on the context of the sentence.
• e.g. ball, diamond, bat and base together imply baseball, but
each word individually has a more common, different meaning.
• Another common ambiguity: phrases introduced by the
word ‘with’, where ‘with’ can have many different
meanings depending on the related noun phrase.
I ate macaroni with ketchup (ingredient of macaroni)
I ate macaroni with dessert (side dish of macaroni)
I ate macaroni with abandon (manner of macaroni eating)
I ate macaroni with a chopstick (instrument of difficult eating)
I ate macaroni with my dog (accompanier of macaroni eating)
• Solution: prefer the interpretations that refer to the most
likely events, although the correct interpretation must still
be sought out.
• Metonymy - a figure of speech where one object is used to
represent another.
• It’s a frequent occurrence in spoken language and difficult
to represent grammatically.
e.g. Microsoft announced a loss of 17 billion dollars.
• We know that “Microsoft” really refers to a spokesperson for
the company Microsoft.
• Solution: introduce a new level of ambiguity to
represent the new semantics.
• Provide two objects for the semantic interpretation:
one for the object the phrase literally refers to
one for the metonymic reference.
• Then state that there is a relation between the two objects.
• In current grammar:
∃x,e Microsoft(x) ∧ e ∈ Announce(x, Past).
• Needs to be altered to:
∃m,x,e Microsoft(x) ∧ Metonymy(m,x) ∧ e ∈ Announce(m, Past).
• This is only a representation of the problem. Need to define
constraints for the metonymy relation.
• Case 1: No metonymy, where x and m are identical:
∀m,x (m = x) -> Metonymy(m,x)
• Case 2: Representational reference for an organisation:
∀m,x Organisation(x) ∧ Spokesperson(m,x) -> Metonymy(m,x)
• Other examples:
– referring to an author for his/her works
– referring to a producer instead of the product
– referring to a group name for the whole (e.g. a team)
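The two metonymy constraints (identity, and a spokesperson standing for an organisation) can be sketched as a small relation check. The sketch follows the argument order Metonymy(m, x), where m is the object actually meant and x the object literally named. The facts in `ORGANISATIONS` and `SPOKESPERSON`, and the placeholder individuals, are hypothetical.

```python
# Hypothetical facts for the example.
ORGANISATIONS = {"Microsoft"}
SPOKESPERSON = {("a_spokesperson", "Microsoft")}  # (m, x): m speaks for x


def metonymy(m, x):
    """True if m may be the intended referent when x is literally named."""
    if m == x:
        return True  # Case 1: no metonymy, m and x are identical
    # Case 2: a spokesperson stands in for an organisation
    return x in ORGANISATIONS and (m, x) in SPOKESPERSON


def possible_referents(x, candidates):
    """All candidates m that satisfy Metonymy(m, x)."""
    return [m for m in candidates if metonymy(m, x)]


print(possible_referents("Microsoft", ["Microsoft", "a_spokesperson", "Kim"]))
```

The check only lists the candidate readings; choosing between them is still a disambiguation problem.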
• Metaphor - a figure of speech where a phrase with one
literal meaning is used to suggest another meaning through
analogy.
– Has a large part in everyday language, not just poetry.
e.g. The system has crashed.
• Solution 1: define all known/common metaphors in the
lexicon, adding a new sense to the meaning of words (e.g.
fallen, dipped refer to some scale other than altitude).
• Note that this doesn’t necessarily yield the correct
interpretation of the sentence
• Solution 2: include explicit knowledge of common
metaphors and use it to interpret a new sense to the
Discourse Understanding
• Discourse or Text - any string of language, usually one
that is more than one sentence long. Understanding it
requires an understanding of text longer than one sentence.
• Easier to break down language into individual sentences,
but need to grasp the relations between all the sentences in
a given text.
• Discourse produced through three steps:
– intention
– generation
– synthesis
• Discourse understanding is done through:
– perception
– analysis (semantic, syntactic, lexical)
– disambiguation
– incorporation
• This all depends on the amount of knowledge that an agent
has. Two different knowledge bases will lead to two
different understandings of a text.
• General Discourse Equation:
KB’ = Discourse Understanding (text, KB)
The two versions of KB being:
KB = knowledge base of the agent
KB’= agent’s knowledge after understanding the text.
• Six types of knowledge to achieve understanding:
General knowledge about the world
General knowledge about the structure of coherent discourse
General knowledge about syntax and semantics
Specific knowledge about situation being discussed
Specific knowledge about beliefs of the characters
Specific knowledge about the beliefs of the speaker
• Interpretation tends towards a priori knowledge of
• Let’s look at an example discourse:
– Pete went to a car race.
– He shouted very loudly.
– He had to leave at 6pm.
• There can be many interpretations to this discourse due to
the hearer’s knowledge base.
Structure of Coherent Discourse
• First, conjunction is NOT commutative in natural language.
• For example:
Dr. Monroe went golfing.
It started to rain.
He was struck by lightning.
• Compare the same sentences in a different order:
Dr. Monroe went golfing.
He was struck by lightning.
It started to rain.
• Temporal ordering is important.
• Purpose is important.
• Segment - clause, complete sentence, group of consecutive
sentences. Discourses are composed of segments.
• Coherence relation - each segment in a discourse is
related to a previous segment and determines the role of
each segment in the discourse.
• The hearer must discern the relations of segments, not just
ascertain the ambiguities.
• Coherence relations constrain the possible meanings of
each sentence (i.e. single sentences have many meanings,
but together only a few).
• Hobbs’ Theory - a speaker does four things to make
a discourse:
1) Wants to convey a message.
2) Has a motivation or goal for doing number one.
3) Wants to make it easy to understand the message.
4) Links information to what the hearer already knows.
• A sentence is a coherent extension to a discourse if it
serves one of the four points above.
Structure of Coherent Discourse
1) A funny thing happened yesterday
2) Wendy went to a fast food restaurant
3) Wendy is a sandwich
4) The clerk said “we don’t serve food here”
5) Wendy was shocked and hurt
6) The clerk said they’d make an exception this time
7) She was very embarrassed by her forgetfulness
• Thus, two adjacent segments s_i and s_j stand in the evaluation
coherence relation if one can infer from s_i that s_j is a step in
the speaker’s plan to achieve a discourse goal.
• Different types of Coherence Relations:
1) Evaluation
2) Enablement
3) Causal
• These come from the speaker’s goal.
• Therefore, understanding has 2 levels of plan recognition:
– the speaker’s plans and the character’s plans in the
• Other Coherence Relations:
– elaboration, used by the speaker to make discourse easier to
understand by saying something differently
– explanation, where the speaker adds new details to the hearer’s
existing knowledge to help understand the discourse at hand.
• A more elaborate set of coherence relations was developed
by Mann and Thompson: solutionhood, evidence,
justification, motivation, reason, sequence, enablement,
elaboration, restatement, condition, circumstance, cause,
concession, background, thesis-antithesis.
• Grosz and Sidner’s theory tracks where the attention
is focussed during a discourse by the speaker and the
hearer.
• The focus changes as segments are added to (pushed onto)
and removed from (popped off) a stack.
• This alters the direction of the focus.
Discourse (A)
I went to Edmonton
I bought you Perogies
Then I hitch-hiked home
I went to K-Mart
I bought some underwear
Discourse (B)
I went to Edmonton
Then I hitch-hiked home
I went to K-Mart
I bought you Perogies
I bought some underwear
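The push and pop behaviour of the focus can be sketched with a small stack. The sketch reads the first five sentences above as Discourse (A) and the last five as Discourse (B). Which sentences push or pop a segment is annotated by hand here, which is an assumption of this example; `FocusStack` and `focus_trace` are hypothetical names.

```python
class FocusStack:
    """Stack of discourse segments; the top segment is the current focus."""

    def __init__(self):
        self._segments = []

    def push(self, segment):
        self._segments.append(segment)

    def pop(self):
        return self._segments.pop()

    @property
    def focus(self):
        return self._segments[-1] if self._segments else None


def focus_trace(events):
    """events: (sentence, op, segment) triples; op is 'push', 'pop' or None.
    Returns the focus in force after each sentence."""
    stack = FocusStack()
    trace = {}
    for sentence, op, segment in events:
        if op == "push":
            stack.push(segment)
        elif op == "pop":
            stack.pop()
        trace[sentence] = stack.focus
    return trace


DISCOURSE_A = [
    ("I went to Edmonton", "push", "Edmonton"),
    ("I bought you Perogies", None, None),
    ("Then I hitch-hiked home", "pop", None),
    ("I went to K-Mart", "push", "K-Mart"),
    ("I bought some underwear", None, None),
]
DISCOURSE_B = [
    ("I went to Edmonton", "push", "Edmonton"),
    ("Then I hitch-hiked home", "pop", None),
    ("I went to K-Mart", "push", "K-Mart"),
    ("I bought you Perogies", None, None),
    ("I bought some underwear", None, None),
]

print(focus_trace(DISCOURSE_A)["I bought you Perogies"])
print(focus_trace(DISCOURSE_B)["I bought you Perogies"])
```

Under these annotations the same sentence about the Perogies falls under a different focus in each discourse, which is how the ordering changes the interpretation.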
Chit - Chat with Chatterbot
Go to Chatterbot!
1) NLP techniques make it practical to develop programs that make
queries, extract information from texts, translate and so on.
2) It is possible to parse sentences efficiently using an algorithm.
3) There has been a shift from grammar to the lexicon.
4) Natural languages have a huge variety of syntactic forms.
5) Choosing the right interpretation requires evidence from many
sources.
6) Interesting language comes in connected discourse rather than in
isolated sentences.
