Question about the reading
• What are clitics?
• They are not words.
– Evidence: they can’t be stressed
• They are not prefixes or suffixes.
– Evidence: they don’t cause certain changes in the
word that a prefix or suffix would cause.
– Evidence: any given prefix or suffix can attach to one
kind of word (for example, only nouns or only verbs).
Some clitics attach to whatever is nearby.
Example: Spanish clitic pronouns
(Data to be supplied by the class)
• Word stress in Spanish
– Stress the second to last or last syllable
– Examples:
• When you add a suffix like –able or –mente, the stress
goes on the new second to last syllable:
– Examples:
• Clitic pronoun:
– Example: I am reading it.
– When the clitic is added, the stress stays on the old second to
last syllable.
• Clitic pronoun:
– Example: I see him.
– Can it be stressed?
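The contrast between suffixes and clitics can be sketched in a few lines of Python. This is an illustration only, not the class data: words are given as hand-syllabified lists, stress is assumed to fall on the second to last syllable (the last-syllable case is ignored), and the example words compra, comprable, and cómpralo, like the function names, are assumptions introduced here.

```python
# Minimal sketch: suffixes shift stress, clitics do not.
# Assumptions: words arrive already split into syllables, and the
# default stress falls on the second to last syllable.

def default_stress(syllables):
    """Index of the second to last syllable (0 for a one-syllable word)."""
    return max(len(syllables) - 2, 0)

def add_suffix(syllables, suffix_syllables):
    """A suffix becomes part of the word, so stress is recomputed."""
    word = syllables + suffix_syllables
    return word, default_stress(word)

def add_clitic(syllables, clitic):
    """A clitic attaches to the word, but stress stays where it was."""
    old_stress = default_stress(syllables)
    return syllables + [clitic], old_stress

def show(syllables, stress):
    """Mark the stressed syllable with capitals."""
    return "-".join(s.upper() if i == stress else s
                    for i, s in enumerate(syllables))

base = ["com", "pra"]  # compra 'buy!' (illustrative example, syllabified by hand)
print(show(base, default_stress(base)))   # COM-pra
print(show(*add_suffix(base, ["ble"])))   # com-PRA-ble
print(show(*add_clitic(base, "lo")))      # COM-pra-lo
```

The only difference between the two functions is whether stress is recomputed over the longer form, which is the point of the test on the slide.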
A Distributional Approach to Parts
of Speech
Grammars and Lexicons
September 5, 2007
Categories of Words:
Parts of Speech
Determiner (Article)
Modal ?
Parts of Speech
This boy must seem incredibly stupid to that girl.
This:Det boy:Noun must:Modal seem:Verb incredibly:Adverb stupid:Adjective to:Prep. that:Det girl:Noun
Scientific method in linguistics
• Theories (hypotheses) must be testable
and falsifiable.
• Results must be reproducible.
Reproducible Results: Chomsky, 1957
The search for rigorous formulation in linguistics has a much more
serious motivation than mere concern for logical niceties or the
desire to purify well-established methods of linguistic analysis.
Precisely constructed models for linguistic structure can play an
important role, both negative and positive, in the process of
discovery itself. By pushing a precise but inadequate formulation to
an unacceptable conclusion, we can often expose the exact source
of the inadequacy and, consequently, gain a deeper understanding
of the linguistic data. More positively, a formalized theory may
automatically provide solutions for many problems other than those
for which it was explicitly designed. Obscure and intuition-bound
notions can neither lead to absurd conclusions nor provide new and
correct ones, and hence they fail to be useful in two important respects.
In language technologies, imprecise
definitions lead to poor intercoder
reliability, which leads to poor training, etc.
A traditional theory of parts of speech
Verbs denote actions
Nouns denote entities
Adjectives denote states
Adverbs denote manner
Prepositions denote location
Determiners specify
• The same concept can function in several
parts of speech.
– Pinker, page 98
• Her interest in fungi (noun)
• Fungi are starting to interest her more and
more. (verb)
• She seems interested in fungi. (adjective)
• Interestingly, the fungi grew an inch in an
hour. (adverb)
The distributional theory of parts of speech
• “A part of speech, then, is not a kind of
meaning; it is a kind of token that obeys
certain formal rules, like a chess piece or a
poker chip.”
– Pinker, page 98
• Testable and falsifiable
• Assumes discrete categories
The distributional theory of parts of speech
• Distribution
– The contexts where the word can appear
• Morphology
– Prefixes, suffixes, and other changes to the
structure of the word.
Identifying parts of speech by their morphology
• Morphology: The form of words
• Affixes: Prefixes, suffixes, infixes
• Stem changes: swim/swam
Morphological properties of English nouns
• Count nouns
– Cup/cups
– Book/books
• Mass nouns
– Attention/?attentions
– Sand/?sands
– Water/?waters
– Coffee/?coffees
Morphological Properties of English adjectives
• Monosyllabic (one syllable) adjectives
– Tall/taller/tallest
– Fast/faster/fastest
• Multi-syllabic adjectives
– Intelligent/more intelligent/most intelligent
• Except for adjectives that have nongradable meanings:
– Alphabetical, unique, pregnant
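The pattern on this slide can be written as a small rule. The sketch below is a simplification under stated assumptions: the syllable count is supplied by the caller, spelling adjustments are ignored, and the slide's one-syllable versus multi-syllable split is taken at face value even though many two-syllable adjectives also take -er. The function name is hypothetical.

```python
# Nongradable examples taken from the slide.
NONGRADABLE = {"alphabetical", "unique", "pregnant"}

def degree_forms(adjective, syllable_count):
    """Return (comparative, superlative), or None for nongradable adjectives."""
    if adjective in NONGRADABLE:
        return None
    if syllable_count == 1:
        # Monosyllabic adjectives take the suffixes -er and -est.
        return adjective + "er", adjective + "est"
    # Longer adjectives use the separate words "more" and "most".
    return "more " + adjective, "most " + adjective

print(degree_forms("tall", 1))         # ('taller', 'tallest')
print(degree_forms("intelligent", 4))  # ('more intelligent', 'most intelligent')
print(degree_forms("unique", 2))       # None
```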
Invariant words: no prefixes or suffixes
in English
• Prepositions (in, on, at, about, across,
beyond, etc.)
• Modals (may, might, can, could, must,
shall, should, etc.)
Morphological Properties of English verbs
Past Participle
Present Participle
Third person singular subject
What are participles?
• Verb forms that act like adjectives or nouns
– Mown grass
• Participle in an adjective position
– Mowing is fun
• Participle in a noun position
Other uses of English Participles
• The grass was mown.
– Passive verb
• I was mowing the grass.
– Present progressive verb
Distributional criteria for parts of speech
Template 1: adjectives
Great ideas spread quickly.
Interesting ideas spread quickly.
Stupid ideas spread quickly.
Colorless ideas spread quickly.
Words of the same category have the
same distribution. For example, adjectives
can come before nouns.
Template 2: adjectives
They are very adjective.
They are very nice/gentlemanly/ladylike.
*They are very gentlemen/ladies/faxes.
*They are very starve/die.
*They are very to/at/on.
• They are very in.
• They are very off.
Template 3: adjectives and adverbs
Very adverb or adjective
Very slow
Very slowly
Very badly
Very happy
Template 4: adverb
He treats her adverb.
He treats her well.
He treats her arrogantly.
He treats her nicely.
• He treats her nice.
• He treats her good.
Template 5: nouns
noun can be a pain in the neck.
Television can be a pain in the neck.
Linguistics can be a pain in the neck.
This can be a pain in the neck.
*Happy can be a pain in the neck.
*From can be a pain in the neck.
*The can be a pain in the neck.
*Breathe can be a pain in the neck.
Template 6: verbs
They/it can verb.
They/it can stay/leave/die/cry.
*They/it can gorgeous/cute/trendy.
*They/it can from/to/in/off/on.
*They/it can door/bible/gold/camera.
Template 7: Modals
Modal I be frank?
Can I be frank?
Must I be frank?
Should I be frank?
Need I be frank?
Template 8: determiner
• He wrote determiner other works.
• He wrote the/all/these/no/few/many other works.
• *He wrote despair/be/have other works.
• *He wrote student other works.
• ?He wrote successful other works.
Template 9: prepositions
• Right preposition.
– Right is an intensifier.
Right up/down/in/on/across the street
Right down the stairs
Right in the drawer
Right from school
Right across the street
*He right despaired.
*She chose right this one.
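The template tests above can be sketched as a small procedure: fill a slot, then ask a speaker for a judgment. In the sketch below the templates come from the slides, but the function names are hypothetical, and the "speaker" is only a stand-in: a set of sentences that the slides present as acceptable. A real test needs a native speaker's judgment for every filled template.

```python
# Templates from the slides, with "{}" marking the slot to fill.
TEMPLATES = {
    "adjective": "They are very {}.",
    "adverb": "He treats her {}.",
    "noun": "{} can be a pain in the neck.",
    "verb": "They can {}.",
    "modal": "{} I be frank?",
}

def fill(category, word):
    """Build the test sentence for one category and one candidate word."""
    sentence = TEMPLATES[category].format(word)
    return sentence[0].upper() + sentence[1:]

def classify(word, judge):
    """Return every category whose template the judge accepts for this word."""
    return [cat for cat in TEMPLATES if judge(fill(cat, word))]

# Stand-in for a native speaker: sentences the slides present as acceptable.
ACCEPTED = {
    "They are very nice.",
    "He treats her well.",
    "Television can be a pain in the neck.",
    "They can stay.",
    "Must I be frank?",
}

def judge(sentence):
    return sentence in ACCEPTED

print(fill("noun", "television"))  # Television can be a pain in the neck.
print(classify("nice", judge))     # ['adjective']
print(classify("stay", judge))     # ['verb']
```

Because classify returns a list, a word that passes several templates shows up with several categories.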
• Problems with Radford’s templates
• Problems for the assumption of discrete categories
– Words that evade categorization
Template 1 problem
• Templates need to be more exact:
– Great ideas spread quickly.
– The ideas spread quickly.
• Do great and the have the same part of speech?
Template 5: need subcategories
• Cat can be a pain in the neck.
• The template only works for
– Plural nouns (e.g., cats)
– Mass nouns (e.g., water)
– Pronouns (e.g., he)
– Proper nouns (e.g., Sam)
• Cat is a singular count noun.
Count and mass nouns
• Singular count nouns must occur with a determiner:
– The cat was a pain in the neck.
– A cat can be a pain in the neck.
– *Cat was a pain in the neck.
• Plural nouns and mass nouns can occur without a determiner:
– Cats can be a pain in the neck.
– Water can be a pain in the neck.
• Singular mass nouns change their meaning when they
occur with “a”
– a water
– a coffee
– ?An information
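The subcategories needed to repair Template 5 can be sketched as a lexicon lookup. This is a minimal illustration: the subcategory labels, the mini-lexicon, and the function names are assumptions made here, and each word is given a single subcategory even though a word like water can also be used as a count noun.

```python
# Subcategories that can appear without a determiner, per the slides.
BARE_OK = {"plural", "mass", "pronoun", "proper"}

def can_be_bare(subcategory):
    """True if a noun of this subcategory can occur without a determiner."""
    return subcategory in BARE_OK

# Hypothetical mini-lexicon built from the slide examples.
LEXICON = {
    "cat": "singular_count",
    "cats": "plural",
    "water": "mass",
    "he": "pronoun",
    "Sam": "proper",
}

def passes_template_5(word):
    """Predict whether '<word> can be a pain in the neck.' is acceptable."""
    return can_be_bare(LEXICON[word])

for word in LEXICON:
    mark = "" if passes_template_5(word) else "*"
    print(f"{mark}{word.capitalize()} can be a pain in the neck.")
```

Only the singular count noun is predicted to fail, matching the starred example on the slide.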
Other things to take into account
He can be a pain in the neck.
*Him can be a pain in the neck.
This music rocks.
These CDs rock.
Template 6: Need subcategories
• *They can handle.
• *They can accommodate.
• *They can harbor.
• The template only works for intransitive verbs.
• These verbs need another noun after them.
– They can handle boredom.
– They can accommodate changes.
– They can harbor criminals.
Template 9: prepositions
• She looked at him right strangely. (dialect)
• She is right pretty. (dialect)
• You look a right clown. (Oxford English Dictionary)
• The government made a right mess of it.
(Oxford English Dictionary)
Words can have more than one
part of speech
• He needs to see a doctor. (verb)
• Need I be frank? (modal)
• I feel a need to explore my roots. (noun)
Importance to you
• The distributional theory of parts of speech
is problematic, but it is your best bet for
your grammar writing project.
• When you are building a lexicon, you will
decide on parts of speech for words by
using template tests and morphological tests.
In-class exercise
• Goals:
– Interpret the results of distributional tests for
parts of speech.
– Discover that some words are problematic for
the distributional theory of parts of speech.
– Reminder:
• When you know a language, you know a complex
body of unconscious knowledge.
Words that evade classification
• More tests for prepositions and adjectives
– Cambridge Grammar of the English
Language, Chapter 7, Section 2.2
• Attempt to categorize like, worth, near,
opposite, due, close, far
Predicative and non-predicative modifiers
• Cambridge Grammar of the English Language, page 604
• Adjectives: predicative modifiers
– Tired of the ship, the captain saw an island on which
to land.
• Tired is predicated of the captain.
– *Tired of the ship, there was a small island.
• Prepositions: non-predicative modifiers
– Ahead of the ship, the captain saw an island on which
to land.
– Ahead of the ship, there was an island on which to land.
Become, Feel, Seem, Look
• Adjectives
– He became/seemed/felt/looked happy
• Prepositions
– *He became/seemed/felt/looked in the park.
– Exceptions
• He became/seemed/felt/looked under the weather
• He became/seemed/felt/looked out of his mind
Degree modification
• Adjectives
Very smart
Smart enough
*very much smart
• Prepositions
– *very in the room
– ?very much in the room
– *more on the table
• ?This book is more on the table than that one.
?This book is enough on the table not to fall.
?This book is on the table enough not to fall.
This book is very much on the table.
?This book is more about linguistics than that one.
Followed by bare NP or PP
• Adjectives: Cannot be followed by bare NP
– Fond of Sam
– *Fond Sam
– Happy about the promotion
– *Happy the promotion
• Prepositions: Can be followed by bare NP
– In the room
– About linguistics
Right and Straight
• Adjectives:
– *right red
– *right conspicuous
– ?right smart
• Prepositions
– Straight into the room
– Right on the table
Coming with a question word when it moves
(Pied Piping, from a story where kids and rats followed a piper)
Relative clause
– I saw a man
– The man who I saw ___
Embedded question
– I know that you saw someone.
– I don’t know who you saw ___.
She cut the bread with a knife
The knife with which she cut it ___
The knife she cut it with
I know that you are referring to someone.
I don’t know to whom you are referring ___
I don’t know who you are referring to.
She is fond of Sam.
?The boy fond of whom she is ___
The boy of whom she is fond __
The boy who she is fond of ___
*I don’t know fond of whom she is.
*I don’t know of whom she is fond ___.
I don’t know who she is fond of ___.
• Predication:
– Worth over a million dollars, the jewels were
kept under surveillance.
– *Worth over a million dollars, there will be
ample opportunity for a lavish lifestyle.
• Become
– What might have been a $200 first edition
suddenly became worth perhaps ten times
that much.
• Degree modification
– *It was very worth the effort.
– It was very much worth the effort.
– ?It was enough worth the effort.
– ?It was worth the effort enough.
• Followed by a bare NP
– yes
• Right and straight
– *The land is right worth $100K.
• Comes with a question word?
– She thought the land was worth $100K.
– This was far less than the amount which she
thought the land was worth ___.
– *This was far less than the amount worth
which she thought the land was ___.
• Degree modification
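One way to summarize the tests applied to worth is to record, for each test, which category worth patterns with, and then count. The labels below are one reading of the judgments on the slides, not a settled analysis, and the names are hypothetical.

```python
from collections import Counter

# For each test: the category whose behavior "worth" matches.
WORTH_RESULTS = {
    "predication": "adjective",            # *Worth ..., there will be ...
    "become": "adjective",                 # became worth ... is fine
    "degree modification": "preposition",  # *very worth, very much worth
    "bare NP complement": "preposition",   # worth $100K
    "right/straight": "adjective",         # *right worth $100K
    "pied piping": "adjective",            # *the amount worth which ...
}

def tally(results):
    """Count how many tests point to each category."""
    return Counter(results.values())

counts = tally(WORTH_RESULTS)
print(counts["adjective"], counts["preposition"])  # 4 2
```

The split result is the point: worth does not fall cleanly into either category, which is a problem for the assumption of discrete categories.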
Parts of Speech in Language Technologies
Part of Speech Tagging
• Input: string of words
• Output: string of words with a part of speech
associated with each word.
• Example:
– This:det boy:N likes:V that:det girl:N
• Use statistical or rule-based knowledge about
• Usually use a long list of parts of speech, e.g.,
around 40.
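The input and output described above can be sketched with the simplest possible rule-based tagger: a lexicon lookup. This is a toy under stated assumptions: the lexicon and function names are hypothetical, every word has exactly one tag, and context is ignored. Practical taggers need statistical or rule-based knowledge to choose among several possible tags.

```python
# Toy lexicon: one tag per word (real words are often ambiguous).
LEXICON = {"this": "det", "that": "det", "boy": "N", "girl": "N", "likes": "V"}

def tag(sentence, unknown="UNK"):
    """Pair each word with its tag; words missing from the lexicon get `unknown`."""
    return [(word, LEXICON.get(word.lower(), unknown))
            for word in sentence.split()]

def render(tagged):
    """Format tagged words in the word:tag style used on the slide."""
    return " ".join(f"{word}:{t}" for word, t in tagged)

print(render(tag("This boy likes that girl")))
# This:det boy:N likes:V that:det girl:N
```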
Part of speech tags used in the
Penn Treebank
CC: Coordinating conjunction
CD: Cardinal number
FW: Foreign word
IN: Preposition/subordinating conjunction
JJR: Comparative adjective
JJS: Superlative adjective
LS: List item marker
Part of speech tags used in the
Penn Treebank
NN: Singular noun or mass noun
NNS: Plural noun
NNP: Singular proper noun
NNPS: Plural proper noun
POS: Possessive ending
PRP: Personal pronoun
PRP$: Possessive pronoun
RBR: Comparative adverb
RBS: Superlative adverb
Part of speech tags used in the
Penn Treebank
VB: Base form verb
VBD: Past tense verb
VBG: Gerund or present participle verb
VBN: Past participle verb
VBP: Verb not 3rd person singular present
VBZ: Verb 3rd singular present
WP$: Possessive wh-pronoun
A different theory of Parts of Speech
Theory of Propositional Acts and
Parts of Speech
(William Croft, Radical Construction Grammar, Chapter 2)
• Refer
• Modify
• Predicate
• Nouns are words that refer without additional marking.
• Adjectives and adverbs modify without additional marking.
• Verbs predicate without additional marking.
Additional Marking
• Predication > reference
– Destroy > destruction
– The destruction of the city
• Predication > modification
– Destroy > that destroyed
– The hurricane that destroyed New Orleans
• Modification > predication
– Red > is red
– The book is red
• Modification > reference
– red > the red one
– The red one is on the shelf
• Reference > predication
– Teacher > is a teacher
– He is a teacher
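The pairing of word classes with propositional acts can be sketched as a lookup table. The sketch below is a simplified reading of the slides, not Croft's own formalization: the data structure and function name are assumptions, and the examples are the ones listed above.

```python
# Pairings that need no additional marking, per the slides.
UNMARKED = {
    ("noun", "refer"),
    ("adjective", "modify"),
    ("adverb", "modify"),
    ("verb", "predicate"),
}

def needs_marking(word_class, act):
    """True if using this word class for this act is predicted to need extra marking."""
    return (word_class, act) not in UNMARKED

# Marked uses from the slides: (word class, act) -> example.
EXAMPLES = {
    ("verb", "refer"): "destroy > destruction",
    ("verb", "modify"): "destroy > that destroyed",
    ("adjective", "predicate"): "red > is red",
    ("adjective", "refer"): "red > the red one",
    ("noun", "predicate"): "teacher > is a teacher",
}

for (word_class, act), example in EXAMPLES.items():
    print(f"{word_class} used to {act}: {example}")

print(needs_marking("verb", "predicate"))  # False
print(needs_marking("noun", "modify"))     # True
```

The last line states the prediction that a noun needs extra marking in order to modify.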
Problems with propositional acts
and additional marking
• Modification > reference without additional marking
– Robin Hood stole from the rich and gave to
the poor.
• Reference > modification without marking
– Toy house
Variation across languages
• World Atlas of Language Structures
Things that are marked on verbs in
other languages
• Aspect
– Perfect and imperfect
• Mood
– Subjunctive
• Voice
– Passive
