Speech Perception:
Theoretical approaches
Scope of the problem
• Speech perception involves the mapping
of speech acoustic signals onto linguistic
messages (e.g., phonemes, distinctive
features, syllables, words, phrases…)
Why is the problem theoretically
hard to solve?
• Acoustic variability due to differences in
context, talker, dialect, rate, prosody, and
other factors.
• Segmentation problems: the continuous
signal has no clear boundaries between
successive phonemes or words.
Three theoretical approaches:
• Motor theory
• Direct realism
• General approach
Motor theory of speech perception
(Liberman & Mattingly, 1985)
• Listeners perceive gestures (more
specifically, intended gestures, or
neuromotor commands).
• Speech is perceived in humans by means
of a specialized speech module.
How the speech module works:
…“the candidate signal descriptions are computed
by an analogue of the production process—an
internal, innately specified vocal-tract
synthesizer…—that incorporates complete
information about the anatomical and
physiological characteristics of the vocal tract
and also about the articulatory and acoustic
consequences of linguistically significant
gestures” (Liberman & Mattingly, 1985, p. 26).
Spectrograms of /di/ and /du/
Direct realist theory of speech
perception (C. Fowler)
• Derived from James J. Gibson’s
perceptual theory.
• Objects of speech perception are actual
gestures.
• No special mechanisms are required.
How direct realism works:
• “Perceptual systems have a universal function. They
constitute the sole means by which animals can know
their niches. Moreover, they appear to serve this
function in one way: They use structure in the media
that has been lawfully caused by events in the
environment as information for the events. Even though
it is the structure in media (light for vision, skin for touch,
air for hearing) that sense organs transduce, it is not the
structure in those media that animals perceive. Rather,
essentially for their survival, they perceive the
components of their niche that caused the structure.”
(Fowler, 1996, p. 1732)
General approach to
speech perception
(Diehl, Lotto, & Holt, 2004)
• Objects of speech perception are
(primarily) acoustic/auditory events.
• Speech perception relies on general
mechanisms of audition and perceptual
learning.
Variation in VOT
Spectrograms of English /ba/ and /pa/
VOT frequency histograms of voicing categories
across six languages (Lisker & Abramson, 1964)
English identification functions for VOT stimuli
superimposed on VOT frequency histograms
English discrimination functions for VOT stimuli
Thai identification functions for VOT stimuli
superimposed on Thai VOT frequency histograms
Thai discrimination functions for VOT stimuli
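The identification and discrimination functions named in the captions above can be illustrated with a toy model. The sketch below assumes a logistic identification function along the VOT continuum and derives a predicted ABX discrimination score from labeling probabilities alone, using the classic two-category prediction 0.5 + 0.5(p1 - p2)^2. That formula is not stated in these slides. The boundary (25 ms), slope (5 ms), and the function names are hypothetical values chosen for illustration, not read off the figures.

```python
import math

# Hypothetical values for illustration only; not taken from the figures.
BOUNDARY_MS = 25.0  # assumed category boundary on the VOT continuum
SLOPE_MS = 5.0      # assumed steepness of the identification function


def p_voiceless(vot_ms, boundary_ms=BOUNDARY_MS, slope_ms=SLOPE_MS):
    """Toy identification function: probability of a voiceless (/pa/) label."""
    return 1.0 / (1.0 + math.exp(-(vot_ms - boundary_ms) / slope_ms))


def predicted_abx(vot_a_ms, vot_b_ms):
    """Predicted ABX percent correct if listeners rely only on category labels."""
    diff = p_voiceless(vot_a_ms) - p_voiceless(vot_b_ms)
    return 100.0 * (0.5 + 0.5 * diff ** 2)


if __name__ == "__main__":
    # Step through stimulus pairs separated by 30 ms of VOT.
    for a in range(-50, 60, 10):
        b = a + 30
        print(f"{a:+4d} vs. {b:+4d} ms: {predicted_abx(a, b):5.1f}% correct")
```

Under these assumptions, pairs that straddle the assumed boundary are predicted to be discriminated well, while pairs within a category stay near chance (50%). This is the peaked pattern usually associated with categorical perception.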
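Tone onset time (TOT): a nonspeech analog of voice onset
time (VOT) (Pisoni, 1977; Holt, Lotto, & Diehl, 2004)
[Figure: schematic TOT stimuli at -50 ms, 0 ms, and +50 ms TOT;
x-axis: Time (ms), y-axis: Frequency (Hz); the 50 ms onset
asynchrony is marked in the -50 and +50 panels]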
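A TOT stimulus can be sketched in code as two sine tones whose onsets are offset by the TOT value. The sketch below is a minimal illustration under stated assumptions, not the procedure used in the cited studies. The tone frequencies (500 and 1500 Hz), duration, sample rate, and function names are hypothetical. It also assumes, by analogy with VOT, that negative TOT means the lower tone starts first and positive TOT means it starts later.

```python
import math


def tone_onsets(tot_ms):
    """Return (low_onset_ms, high_onset_ms) for a given TOT.

    Assumed convention: tot_ms < 0 means the low tone leads the high tone,
    tot_ms > 0 means the low tone lags it.
    """
    return max(tot_ms, 0.0), max(-tot_ms, 0.0)


def tot_stimulus(tot_ms, low_hz=500.0, high_hz=1500.0,
                 duration_ms=230.0, sample_rate=16000):
    """Synthesize a two-tone stimulus as a list of float samples.

    All parameter defaults are illustrative assumptions. Both tones end
    together; only their onsets differ.
    """
    low_on, high_on = tone_onsets(tot_ms)
    total_ms = duration_ms + abs(tot_ms)
    n_samples = int(round(total_ms * sample_rate / 1000.0))
    samples = []
    for i in range(n_samples):
        t_ms = i * 1000.0 / sample_rate
        value = 0.0
        if t_ms >= low_on:   # low tone is sounding
            value += 0.5 * math.sin(2 * math.pi * low_hz * (t_ms - low_on) / 1000.0)
        if t_ms >= high_on:  # high tone is sounding
            value += 0.5 * math.sin(2 * math.pi * high_hz * (t_ms - high_on) / 1000.0)
        samples.append(value)
    return samples


if __name__ == "__main__":
    for tot in (-50, 0, 50):
        print(tot, tone_onsets(tot), len(tot_stimulus(tot)))
```

Stepping `tot_ms` from -50 to +50 in 10 ms steps would give a continuum comparable in layout to a VOT series, which is what allows the same discrimination task to be run on nonspeech sounds.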
Discrimination functions for TOT stimuli
(Holt, Lotto, & Diehl, 2004)
[Figure: y-axis: Percent Correct Discrimination (0 to 100);
x-axis: TOT Stimulus Pair (ms): -50 vs. -20, -40 vs. -10,
-30 vs. 0, -20 vs. 10, -10 vs. 20, 0 vs. 30, 10 vs. 40,
20 vs. 50]
VOT “identification” by chinchillas
(Kuhl & Miller, 1981)
/ga/-/ka/ identification by typically developing children and
dyslexic children (with and without ADHD)
Back to the three approaches to speech
perception
• Motor theory
• Direct realism
• General approach