What do we need to do, to
understand an utterance?
Cartoon-head figures
from Jackendoff (1994),
Patterns in the Mind
Speech Perception
Segment the auditory stream into
words, made up of particular
– What are the
– Where are the word
Word Recognition
Recognize individual words, resolving ambiguities
• Meaning
Chris walked near the bank.
• Syntactic category (noun, verb, etc.)
She saw her duck.
Buffalo buffalo buffalo buffalo.
• Determine the structure
of the sentence
– Constituents
– Hierarchical structure
• Buffalo buffalo buffalo buffalo.
• Put the ball in the box on the table.
(ball in the box) on (the table)
(ball) in (the box on the table)
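A minimal sketch of the two bracketings above, assuming nested tuples as a stand-in for parse trees. The labels (VP, NP, PP) and the helper name `leaves` are illustrative assumptions, not part of the slides. Both trees yield the same word string; only the grouping differs.

```python
# Each tree is (label, child, child, ...); a bare string is a word.

# Reading 1: (ball in the box) on (the table)
# The ball is already in the box; put it on the table.
ball_in_box = (
    "VP", "put",
    ("NP", ("NP", "the", "ball"), ("PP", "in", ("NP", "the", "box"))),
    ("PP", "on", ("NP", "the", "table")),
)

# Reading 2: (ball) in (the box on the table)
# The box is on the table; put the ball into it.
box_on_table = (
    "VP", "put",
    ("NP", "the", "ball"),
    ("PP", "in",
        ("NP", ("NP", "the", "box"), ("PP", "on", ("NP", "the", "table")))),
)


def leaves(tree):
    """Return the words of a tree in left-to-right order."""
    if isinstance(tree, str):
        return [tree]
    _label, *children = tree
    words = []
    for child in children:
        words.extend(leaves(child))
    return words


print(" ".join(leaves(ball_in_box)))
print(" ".join(leaves(box_on_table)))
```

The listener hears only the flat word string, so the parser has to choose between structures that the signal itself does not distinguish.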
Semantic & Pragmatic
• What does the sentence mean?
– Who did what to whom?
– Truth conditions
• What is the speaker trying to convey?
– Can you pass the salt?
The Big Question:
How Do We Accomplish
Linguistic Communication?
Map sounds to stored
1. A bird was in the
tree yesterday.
2. Are there any
birds in the tree?
3. A bird might be
in the tree.
4. That tree looks
like a bird.
What do we need?
Linguistic Knowledge as
Categorical Rules for
Related Questions
• How do those categorical rules map onto cognitive
• Are the same cognitive processes used to produce
speech as to understand it?
• Does modality (reading/listening) matter?
• Are our brains specialized for language?
• If I form a hypothesis about language
understanding/production, how would I test it?
• What would count as data?
• Let’s give it a shot…
• Philosophical interest in language processing
& language acquisition goes back to ancient
– E.g., Aristotle on relations among thought,
language, & external world
• Modern experimental approach quite new
(most paradigms developed within last 50
• Modern theories reflect contributions from
modern linguistic theory, cognitive
psychology, computer science, & cognitive
– Human language is unique among animal
Early Psychological study of
• Wundt (1832-1920)
– One of psychology’s
founding fathers
– Primarily used
introspection for
studying mental
behavior, but one of
first to use RT
– Published on language
in 1911
Wundt’s (1911) hypotheses
about Language
• Sentence (defined intuitively) is the
primary unit of language
– “Leave!” is one; “Days of the week” is not
– “I filled the water with bottle.”
• Production converts a thought into a
sequential string of sounds
• Comprehension is simply the reverse
What’s wrong with
Wundt’s Approach?
• Is introspection a reliable,
replicable, objective tool for science?
• Do Wundt’s hypotheses lead to any clear
predictions about behavior?
• Constructs (e.g., sentence) not carefully defined
• The stimuli/inputs for production and the
response/outputs for comprehension are neither
well-defined nor directly observable.
– How can one develop a hypothesis and test it
Would a different approach be
more productive?
• Need to develop hypotheses that lead to
clear predictions about behavior
– Link environmental conditions to observable
• Need experimental techniques that can be
clearly described and replicated in
different laboratories.
Dominant paradigm in
psychology 1927-1960.
(Skinner, Pavlov)
– How often does a behavior
occur and with what
– All behavior shaped by the
environment using classical
and operant conditioning.
– No mental representations
– Introspection devalued
Skinner’s (1957)
Verbal Behavior
• Language is a difficult phenomenon
for a behaviorist account.
– In 1934, at a dinner party, philosopher A. N.
Whitehead challenged Skinner to “account for my
behavior as I sit here saying ‘No black scorpion is
falling upon this table.’”
– Skinner began the book the next morning, and spent
20+ years working on it.
• Skinner often called this book his most
important work
Skinner’s (1957)
Verbal Behavior
• Emphasis on production, rather than
• A sentence is a chain of associative links, “like
beads on a string”
There – is – no – black – scorpion…
• Speech is a learned response to environmental
stimuli (reinforcement, punishment)
Experimental Evidence
• Speech is a learned response to
environmental stimuli (reinforcement,
punishment)
– Use of plural nouns increases if reinforced
with “mmm-hmm” (Greenspoon, 1954,
– Proportion of opinion statements increases
if paraphrased/agreed with (Verplanck, 1955)
Language research during the
Reign of Behaviorism
• Operant studies (e.g., Verplanck)
• Classical Conditioning experiments (e.g.,
• Practical research, much of which was
funded by the defense department
– George Miller: understanding speech in noisy
radio transmissions
George Miller’s Lab
• Interested in speech and hearing
• Trained as a behaviorist in the 1940s
• In the 1950s, investigated radio-based
– How high does the signal-to-noise ratio need
to be, for adequate transmission of the
• Amount of noise
• Characteristics of the message
• Characteristics of the speaker
How do we (the military) ensure
adequate transmission of the message?
One strategy: Limit the
vocabulary/possible messages
– Digits are easy: 0-9 have 8 different
nuclear vowels (only 5 and 9 share their
– Nonsense syllables are opposite
extreme—need to hear each phoneme
George’s Ground-Breaking
Miller et al. (1951)
Miller & Selfridge (1950)
demonstrated analogous
pattern in free-recall test.
Why are words in sentences
easier to perceive and
easier to remember than
words in lists?
What does “sentence-advantage”
Miller et al. (1951) maintain that sentences
effectively restrict the number of alternative
words, similarly to small vocabularies.
“In 1951, I apparently still hoped to gain
scientific respectability by swearing
allegiance to behaviorism. Five years later,
inspired by such colleagues as Noam
Chomsky and Jerry Bruner [a social
psychologist], I had stopped pretending to
be a behaviorist.” (Miller, 2003)
Miller (1962) re-examines Miller et
al. (1951)
• Words in a sentence are not as distinct as words
in isolation…
– Less carefully pronounced (splice-test)
– Words run together
• Why is there no extra cost for these?
• Speech rate of 2-3 words per second leaves little
time for deducing set of alternatives after each
• “Reduction of alternatives” explanation is
Miller (1962)
Evolution of Speech and
August 4, 2009
How did human vocal tract evolve?
All mammals produce vocal sounds in
essentially the same way…
Source – Oscillator (voicing) – Filter
YouTube - vocal tract model synthesis
YouTube - Vocal formants
In human speech, formants are the
most informative parameter. They
make speech intelligible.
• E.g., whispered speech lacks voicing and
pitch, but has normal formants
Human speech requires fine, rapid
motor control during articulation
Role of formants in animal
Primates & birds perceive formants as
accurately as humans
– Individual identification via vocal
– Provide cues to body size of “speaker”
Differences between ape & human vocal tracts
1. Human larynx lowers in throat during 1st year of life
– Allows more tongue movement, for broad range of
discriminable formant patterns
– Lowers formant frequencies, giving impression of larger size
2. Human oral cavity
shorter, nasal cavity
3. Humans lack
laryngeal air sacs
– Little known about
Vocal Imitation
• Except for humans, primates are poor at
• Apes raised like human kids
• Monkeys raised with other species
• Little evidence for learned vocal behavior
Humans clearly learn language(s)
Human whistling
Human bird calls
Human imitation of animal noises
Vocal Imitation
• Whales, seals and dolphins are somewhat
better than most primates.
– Whales learn their songs
• Passerine Birds are terrific at this, even
– Songbirds learn their songs
– Mockingbirds learn other species’ songs, as well
as environmental sounds (insects, car alarms,
– Parrots can mimic human speech and even
specific voices
– Irene Pepperberg has trained African Grey Parrots
to use human speech communicatively
Primates are poor candidates for
production of spoken language
• Lack of rapid, fine motor control of
vocal articulation
• Structure of the vocal tract
• Limited ability for vocal imitation
Ape Language studies
(1950s to present)
• No luck training chimpanzees to produce
spoken language
• Some success with manual/visual
• Chimpanzees, gorillas,
& bonobos approximate
linguistic skill of a 3-year-old
human
• Ceiling on language potential?
Koko (gorilla.org)
Gorilla trained in sign language
by Penny Patterson
Video Clip
Has Koko acquired a
language? What evidence is
necessary to answer this
Summing Up: How is human
language special?
Vocal tract anatomy
Vocal imitation
Rapid, fine motor control of articulators
Creative recombination of phonemes,
morphemes, words for expression of nearly any
• But what about Koko?
• Compare Koko to Nicaraguan deaf kids
Spontaneous emergence of
Nicaraguan Sign Language
• Clip from “Birth of a Language”
• How is this signed communication the
same/different from Koko’s?
• What kind of tests would you need to
conduct to compare them?
Special Features of Human
• Specialized vocal tract: Broad range of
formants for producing many distinct
• Vocal imitation/Social Learning
• Rapid, fine articulation
• Hierarchical structure
• Rule Learning
Hierarchical Structure of Language
Human speech has hierarchical structure,
which is necessary to produce utterances of
arbitrary complexity. Structure is distinct
from content (specific phonemes).
Syllable = (onset) + rhyme
Rhyme = nucleus + (coda)
Onset = one or more consonants
Nucleus = one or more vowels
Coda = one or more consonants
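The rules above can be sketched as a small pattern matcher. This is only an illustration under loud assumptions: letters stand in for phonemes, the vowels are taken to be a, e, i, o, u, and the names `split_syllable` and `SYLLABLE` are hypothetical. Real syllabification works on phonemes and language-specific constraints.

```python
import re

VOWELS = "aeiou"  # assumption: spelling stands in for phonemes

# Syllable = (onset) + rhyme; Rhyme = nucleus + (coda)
SYLLABLE = re.compile(
    rf"^([^{VOWELS}]*)"   # optional onset: zero or more consonants
    rf"([{VOWELS}]+)"     # nucleus: one or more vowels
    rf"([^{VOWELS}]*)$"   # optional coda: zero or more consonants
)


def split_syllable(syllable):
    """Split one syllable into onset and rhyme (nucleus + coda)."""
    match = SYLLABLE.match(syllable.lower())
    if match is None:
        raise ValueError(f"not a single syllable under these rules: {syllable!r}")
    onset, nucleus, coda = match.groups()
    return {"onset": onset, "rhyme": {"nucleus": nucleus, "coda": coda}}


print(split_syllable("strand"))
print(split_syllable("at"))
```

The same template fits syllables with very different content, which is the sense in which structure is distinct from the specific phonemes that fill it.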
Compositionality and the Rate of
Data Transmission
• Small set of phonemes can be recombined very
productively (but in a constrained way) to form
– Morpheme = one or more syllables (meaning unit)
– Signed language morphemes are also made up of “phonological”
constituents (e.g., hand shape, movement, location)
• Morphemes can be combined productively (but
constrained) to create words.
– (prefix) + stem + (suffix)
– Stem = (prefix) + stem + (suffix)
• Words can be combined productively to create
Hierarchical Structure of Language
Syntactic Hierarchies & Center-embedding
• The man read Chaucer.
• The man who the woman despised read
Chaucer.
• The man who the woman the children
loved despised read Chaucer.
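A minimal sketch of how these sentences are built, assuming each subject is paired with its own verb and the verbs surface in reverse order. The function name `center_embed` and its argument layout are hypothetical choices for illustration.

```python
def center_embed(subjects, verbs, obj):
    """Build a center-embedded sentence.

    subjects[i] is the subject of verbs[i]. Subjects appear
    outermost-first; their verbs appear innermost-first, so each
    subject-verb dependency is nested inside the previous one.
    """
    if not subjects or len(subjects) != len(verbs):
        raise ValueError("need one verb per subject")
    head, *inner = subjects
    parts = [head]
    if inner:
        parts.append("who")
        parts.extend(inner)
    parts.extend(reversed(verbs))
    parts.append(obj)
    sentence = " ".join(parts) + "."
    return sentence[0].upper() + sentence[1:]


subjects = ["the man", "the woman", "the children"]
verbs = ["read", "despised", "loved"]
for depth in range(1, 4):
    print(center_embed(subjects[:depth], verbs[:depth], "Chaucer"))
```

Each added level pushes the first subject further from its verb, which is why the deepest version is grammatical but hard to process.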
What are the preconditions for
learning hierarchical structure?
1. Fixed sequences (linear order):
– Idioms & stock phrases (once upon a time) are fixed word
– Words are fixed phoneme sequences
2. Statistical Learning is probably important for:
– discovering words in speech stream
– identifying syntactic category and subcategory of words
– resolving lexical and syntactic ambiguity
– Within phrases (e.g. NP), there is a predictable ordering of
categories (e.g., the predicts a noun in the next word or two)
Predictive constraints on sequences may allow us to
learn hierarchical relationships
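One way statistical learning could find words in a speech stream is by tracking how well each syllable predicts the next. The sketch below is a toy illustration, not a model from the slides: the three nonsense words, the fixed word order, the two-letter syllables, and the 0.9 threshold are all assumptions chosen so the example is deterministic.

```python
from collections import Counter


def transitional_probabilities(syllables):
    """Estimate P(next syllable | current syllable) from adjacent pairs."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}


def segment(syllables, threshold=0.9):
    """Place a word boundary wherever predictability drops below threshold."""
    if not syllables:
        return []
    tps = transitional_probabilities(syllables)
    words, current = [], [syllables[0]]
    for prev, nxt in zip(syllables, syllables[1:]):
        if tps[(prev, nxt)] < threshold:
            words.append("".join(current))
            current = []
        current.append(nxt)
    words.append("".join(current))
    return words


def syllabify(word):
    """Toy assumption: every syllable is exactly two letters."""
    return [word[i:i + 2] for i in range(0, len(word), 2)]


# Continuous stream with no pauses between the made-up words.
WORD_ORDER = ["bidaku", "padoti", "golabu", "padoti", "bidaku",
              "golabu", "bidaku", "golabu", "padoti"]
STREAM = [s for word in WORD_ORDER for s in syllabify(word)]

print(segment(STREAM))
```

Within a word, each syllable is always followed by the same syllable, so predictability is perfect. Across a word boundary the next syllable varies, so predictability dips, and the dips line up with the boundaries.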
Jenny Saffran & colleagues
Marc Hauser & colleagues
Shared neural underpinnings to
syntax & sequence learning?
• Broca’s aphasics who have severe syntactic deficits
also exhibit deficits in sequence learning
(Christiansen et al., 2001 unpubl)
• Incongruent musical sequences elicit P600s, just
like syntactic anomalies (Patel et al., 1998)
• MEG shows that Broca’s area is involved in
processing music sequences (Maess et al., 2001)
• All higher organisms must learn about sequential
events. How does human sequential learning
compare with that of other primates?
“The rat the cat the dog
bit chased died.”
What limits the kinds of
rules that Tamarins can
learn about sequences?
Fitch & Hauser (2004) suggest that
tamarins can master Finite State
Grammars, but not Phrase Structure
Grammars.
Types of Grammars
All human languages allow for an infinite
number of different utterances. What kinds
of grammars allow this?
Finite State Grammars: A finite number of
states (e.g., words, calls, syntactic
categories), with rules for getting from one
state to the next.
FSGs provide rules for concatenation
Types of Grammars
A phrase structure grammar allows for long
distance dependencies.
(Last year, (Demi Moore (took (that cute dumb guy who’s about 20
years old from “That 70’s Show”) out for a while))).
NP took out NP.
NP took NP out.
NP = NP + PP
NP = NP + S
The intervening NP can have an
arbitrary amount of internal
Types of Grammars
Phrase structure grammars allow for center
embedded constructions:
((The rat ((the cat (the dog bit _cat ))chased _rat)) died.)
The water someone I know carried spilled.
F&H 2004
A simple FSG could have two categories of
“words”, A & B, with the rule that A and B must
alternate, starting with A: (AB)^n
A: no ba la wu, etc. (high pitch)
B: li pa mo, etc. (low pitch)
No li ba pa
Ba pa ba pa ba pa…
*No ba la mo
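A minimal sketch of this alternating grammar as a two-state machine. The function name `accepts_ab_n` is hypothetical, and the A and B sets contain only the syllables listed above (the "etc." is left unfilled). The point is that the machine only needs to remember which class it expects next.

```python
A = {"no", "ba", "la", "wu"}  # only the A syllables listed on the slide
B = {"li", "pa", "mo"}        # only the B syllables listed on the slide


def accepts_ab_n(syllables):
    """Accept strings of the form (AB)^n with n >= 1.

    Two states are enough: either we expect an A next or a B next.
    No counting or memory of earlier items is required.
    """
    state = "expect_A"
    for syllable in syllables:
        if state == "expect_A" and syllable in A:
            state = "expect_B"
        elif state == "expect_B" and syllable in B:
            state = "expect_A"
        else:
            return False
    # A complete string is non-empty and ends after a B.
    return len(syllables) > 0 and state == "expect_A"


print(accepts_ab_n("no li ba pa".split()))
print(accepts_ab_n("ba pa ba pa ba pa".split()))
print(accepts_ab_n("no ba la mo".split()))
```

The first two strings are accepted and the starred string is rejected, matching the examples above.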
Tamarins & humans easily
learn the simple finite state
grammar. In F&H, were the
learning conditions
comparable across the 2
F&H 2004
A simple phrase structure grammar might require
n A syllables followed by the same number of B syllables: A^nB^n
A: no ba la wu, etc. (high pitch)
B: li pa mo, etc. (low pitch)
((The rat ((the cat (the dog bit _cat ))chased _rat)) died.)
The water someone I know carried spilled.
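A minimal sketch of a recognizer for the A^nB^n pattern, for contrast with a two-state alternation check. The function name `accepts_an_bn` is hypothetical, and the A and B sets contain only the syllables listed on the slide. Unlike a finite state machine, this recognizer has to keep a count that can grow without bound.

```python
A = {"no", "ba", "la", "wu"}  # only the A syllables listed on the slide
B = {"li", "pa", "mo"}        # only the B syllables listed on the slide


def accepts_an_bn(syllables):
    """Accept n A syllables followed by exactly n B syllables (n >= 1)."""
    # Count the leading run of A syllables.
    n_a = 0
    for syllable in syllables:
        if syllable not in A:
            break
        n_a += 1
    rest = syllables[n_a:]
    # The remainder must be all B syllables and match the A count.
    return n_a > 0 and len(rest) == n_a and all(s in B for s in rest)


print(accepts_an_bn("no ba li pa".split()))
print(accepts_an_bn("no li ba pa".split()))
print(accepts_an_bn("no ba la li pa".split()))
```

The alternating string "no li ba pa" fails here, so a learner that had only acquired the alternation pattern would be expected to treat A^nB^n strings and their violations differently from a learner tracking the counts.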
What were the results for humans and tamarins in
the PSG condition? What are the implications?
Why does human performance
surpass monkey performance?
1. Humans have Universal Grammar (UG)
2. Humans have general cognitive abilities
superior to monkeys (Look for evidence of
this in non-language context.)
Sequences can be learned via linear associations
(as in FSG) or by learning ordinal positions of
items (e.g., syllable structure, word position in a
• Ordinal sequence learning in rhesus monkeys (Chen et
al., 1997)
• Strategies for nesting cups
• Monkeys learned 4 sets of
4 pictures each on touchscreen.
• Trained to press pics w/in
each set in a particular
order. (Spatial config
• Then the 16 pics were
reorganized into new sets
& monkeys re-trained.
– In “maintained” sets, the
pictures were in the same
ordinal slot [A,B,C,D].
– In “changed” sets, the
pictures were in a new
ordinal slot.
– Maintained sets were much
easier to learn, suggesting
that ordinal position had
been encoded.
Chen et al. (1997)
strategies: do
they involve
Human speech probably
evolved as a result of …
• Rapid, fine motor control of articulators
[frontal lobe, hypoglossal nerve]
• Ability to analyze sounds in terms of
hierarchical structure [Broca’s area? UG?]
• Changes in the vocal tract & enhanced role of
• Increased ability to imitate auditory input
[arcuate fasciculus?]

Psychology of Language