Next major topic: Consonant articulation.
What’s a vowel? A speech sound produced with a
(relatively) unimpeded air stream.
What’s a consonant? A speech sound produced with air
stream impeded, constricted, diverted, or obstructed.
As before: Vowels are open-ish, consonants are closed-ish.
Classification system for vowels:
tongue height, advancement, and lip rounding
Classification system for consonants:
place, manner, and voicing
A. Place (also called place of articulation): Where is the breath
stream impeded, constricted, diverted, or obstructed? For
example: lips, teeth, alveolar ridge, palate, velum, …
(These are the articulatory landmarks that we reviewed earlier. More on place later.)
B. Manner: How is the breath stream impeded, constricted,
diverted, or obstructed? For example:
1. stop or plosive: complete obstruction of air stream
[b], [d], [g], [p], [t], [k], [ʔ] (glottal stop, as in “uh-oh”)
2. fricative: air passed through a narrow channel, creating
turbulence
[s], [ʃ] (as in “shoe”), [f], [θ] (as in “theory”), [h],
[z], [ʒ] (as in “Zsa Zsa”), [v], [ð] (as in “this”).
3. nasal: air stream redirected through the nasal cavity.
[m], [n], [ŋ] (as in “sing”)
4. affricate: complete obstruction of air stream followed by
fricative release.
[tʃ] (as in “choke”), [dʒ] (as in “joke”)
5. approximants: consonants that are almost like vowels
[r] [l] [w] [j] (as in “yellow”)
These are the “open-est” of the closed-ish sounds – breath
stream is fairly unimpeded. But, these sounds “pattern” like
consonants; i.e., speakers treat them like consonants, not vowels:
a rat or an rat?
a lake or an lake ?
a walk or an walk?
a yak or an yak ?
So, these are consonants and that’s that, even if we can’t
supply a neat definition separating vowels from consonants.
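The a/an test can be sketched in a few lines of Python. This is only an illustration: the mini word list, the hand-assigned first sounds, and the function name `indefinite_article` are assumptions made up for the example, not part of the course material.

```python
# Hypothetical mini-lexicon: each word is paired with its first sound,
# assigned by hand for this illustration.
FIRST_SOUND = {
    "rat": "r", "lake": "l", "walk": "w", "yak": "j",  # approximants
    "apple": "æ", "eel": "i",                           # vowels
}

# Sounds that count as vowels for the a/an choice (illustrative set only).
VOWEL_SOUNDS = {"æ", "i"}


def indefinite_article(word):
    """Return 'an' before a vowel sound and 'a' before a consonant sound."""
    return "an" if FIRST_SOUND[word] in VOWEL_SOUNDS else "a"


for w in ("rat", "lake", "walk", "yak", "apple", "eel"):
    print(indefinite_article(w), w)
```

The approximants all come out with “a,” the same as any other consonant, which is the sense in which they “pattern” like consonants.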
Two Types of Approximants
Liquids
[r] [l]
Glides (also called semivowels)
[w] [j]
Why are [r], [l] called liquids and [w], [j] called glides?
Easy: They just are. If there’s a good reason for this I don’t
know it. But, you’ll have to learn it same as everyone else.
6. flap: Like a stop, but closure is very brief
[ɾ] (as in “kitty,” “butter,” “Betty,” “later”)
There are other manner classes, but the 6 I listed are the
ones needed for English.
C. Voicing
Are the vocal folds vibrating?
English has many pairs of consonants that are identical
in all other ways except for voicing. Some examples:
[b]-[p], [d]-[t], [g]-[k], [z]-[s], [ʒ]-[ʃ], [v]-[f], [ð]-[θ]
These are called voiced-voiceless cognates.
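A rough sketch of how place, manner, and voicing pick out cognate pairs. The `Consonant` class, the `INVENTORY` list, and the `cognate` function are hypothetical names invented for this example; the inventory is deliberately partial (stops, fricatives, and one nasal), and the place labels follow the ones used in these notes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Consonant:
    symbol: str
    place: str
    manner: str
    voiced: bool


# Partial, illustrative inventory; not a complete chart of English.
INVENTORY = [
    Consonant("b", "bilabial", "stop", True),
    Consonant("p", "bilabial", "stop", False),
    Consonant("d", "alveolar", "stop", True),
    Consonant("t", "alveolar", "stop", False),
    Consonant("g", "velar", "stop", True),
    Consonant("k", "velar", "stop", False),
    Consonant("v", "labiodental", "fricative", True),
    Consonant("f", "labiodental", "fricative", False),
    Consonant("ð", "linguadental", "fricative", True),
    Consonant("θ", "linguadental", "fricative", False),
    Consonant("z", "alveolar", "fricative", True),
    Consonant("s", "alveolar", "fricative", False),
    Consonant("ʒ", "alveopalatal", "fricative", True),
    Consonant("ʃ", "alveopalatal", "fricative", False),
    Consonant("m", "bilabial", "nasal", True),
]


def cognate(symbol):
    """Return the symbol with the same place and manner but opposite voicing,
    or None if the inventory has no such partner."""
    target = next(c for c in INVENTORY if c.symbol == symbol)
    for c in INVENTORY:
        same_place_manner = (c.place, c.manner) == (target.place, target.manner)
        if same_place_manner and c.voiced != target.voiced:
            return c.symbol
    return None


print(cognate("b"), cognate("ʃ"), cognate("m"))
```

In this sketch [m] has no cognate because the inventory contains no voiceless bilabial nasal.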
1. Stops in English
Voiced: [b] [d] [g]
Unvoiced: [p] [t] [k] [ʔ] ([ʔ] is a bit of an odd bird – later)
Notes on Stop Consonants (Plosives)
a. Dental (rather than alveolar) stops are rather common in
some dialects of American English – working-class dialects in NY, NJ,
Philly, etc. Symbols: [d̪] and [t̪]. The diacritic that indicates a
dental stop is a little dealie that looks like a tooth. [d̪] and [t̪]
are allophones of /d/ and /t/ in some dialects of English.
They occur as distinct phonemes in some languages.
b. Stops involve a build up of pressure behind the occlusion
followed by release. The velum has to be in the up position
for the pressure to build; i.e., the V-P (velopharyngeal) port
needs to be closed.
What problems might speakers with cleft palate have in
producing stops?
Summary of IPA Consonant Symbols
(excluding the obvious ones – b, d, g, p, t, k, l, w, r, etc.)
[j]: yellow (not the sound association typical for the letter ‘j’ in English)
[ʃ] / [š]: shoe (either symbol may be used for this sound; [ʃ] preferred here)
[ʒ] / [ž]: measure (ditto – but learn both because you’ll see both)
[ʔ]: uh-oh / button
[tʃ] / [č]: church (symbols interchangeable; [tʃ] preferred here; learn both)
[dʒ] / [ǰ]: judge (ditto – [dʒ] preferred here; learn both)
[ʍ]: which / whether (for those speakers who distinguish which/witch)
c. Glottal stops occur in a few “exclamatory” words like
“uh-uh” (no) or “uh-oh” (whoops). They’re more
common than you might think, though. Glottal stops
often serve as separators, as in:
no notion vs. known ocean
[no noʃən] vs. [non ʔoʃən]
353-7200: Phone number with “00” spoken as “oh-oh.” A glottal stop will almost always be inserted to
separate the two “oh”s.
Glottal stops also appear as an allophone of /t/, as in “button.”
d. Aspiration
Voiced stops (in English) are never aspirated.
Voiceless stops are sometimes aspirated and sometimes not.
These voiceless stops will be aspirated:
a. Word-initial, regardless of stress:
tap, cat, Topeka (stop precedes an unstressed vowel), command (ditto)
[tʰæp] [kʰæt] [tʰəpʰikə] [kʰəmænd]
b. Intervocalic (between 2 vowels) but only when
preceding a stressed vowel.
meticulous, repair, recalcitrant, return
These voiceless stops will be unaspirated:
a. Following /s/
stop, skate, stick, stare, spike
b. Intervocalic, preceding an unstressed vowel
napping, camper, sicken, supper, thirsty
(Note: Sometimes these are unaspirated,
sometimes they are lightly aspirated.)
See Table 5-2 (p. 96) of MacKay for a nice summary
with examples.
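The aspiration rules above can be written out as a small decision function. This is a sketch under stated assumptions: the function name `aspiration` and its boolean parameters are invented for the example, and it covers only the contexts listed in these notes (positions such as word-final stops are not addressed and should not be fed to it).

```python
VOICED_STOPS = {"b", "d", "g"}
VOICELESS_STOPS = {"p", "t", "k"}


def aspiration(stop, word_initial=False, after_s=False, next_vowel_stressed=False):
    """Predict aspiration for an English stop in the contexts listed in the notes."""
    if stop in VOICED_STOPS:
        return "unaspirated"          # voiced stops are never aspirated
    if stop not in VOICELESS_STOPS:
        raise ValueError(f"not an oral stop covered by this sketch: {stop!r}")
    if after_s:
        return "unaspirated"          # stop, skate, spike
    if word_initial:
        return "aspirated"            # tap, cat, Topeka (regardless of stress)
    if next_vowel_stressed:
        return "aspirated"            # repair, return
    return "unaspirated or lightly aspirated"  # napping, supper


print(aspiration("t", word_initial=True))        # the /t/ in "tap"
print(aspiration("p", after_s=True))             # the /p/ in "spike"
print(aspiration("p", next_vowel_stressed=True)) # the /p/ in "repair"
print(aspiration("p"))                           # the /p/ in "napping"
```

The /s/ check comes first because a stop after /s/ is unaspirated even when the cluster begins the word.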
Voice Onset Time (VOT)
VOT = Interval between articulatory release and onset of voicing.
(Figure: release and voicing onset marked for two examples. One has VOT ~85 ms; in the other, voicing onset and release are ~simultaneous, VOT ~0 ms.)
(Figure: VOT ~10 ms vs. VOT ~85 ms. [spɑt] with the [s] edited out has an unaspirated [p]: very short delay between release and voicing onset, ~10 ms.)
(Figure: pack [pʰæk], aspirated [p], /p/ precedes stressed vowel; capping [kʰæpɪŋ], lightly aspirated [p], /p/ precedes unstressed vowel (unaspirated or lightly aspirated).)
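The VOT definition is just a subtraction, sketched below. The function name `vot_ms` and the time stamps are hypothetical; the time stamps were chosen only to reproduce the approximate values quoted in these notes (~0, ~10, and ~85 ms) and are not measurements.

```python
def vot_ms(release_ms, voicing_onset_ms):
    """VOT = time of voicing onset minus time of articulatory release (ms)."""
    return voicing_onset_ms - release_ms


# Hypothetical (release, voicing onset) time stamps in ms, not real data.
examples = {
    "VOT ~0 ms": (100.0, 100.0),
    "VOT ~10 ms": (100.0, 110.0),
    "VOT ~85 ms": (100.0, 185.0),
}

for label, (release, onset) in examples.items():
    print(label, "->", vot_ms(release, onset), "ms")
```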
2. Fricatives
Mechanism of sound production is simple: Air is passed
through a narrow channel, creating turbulence. Turbulence
= noise.
When you look at white water on a river or stream you are
looking at turbulence. (You can also hear this turbulence; this is the
noise you hear when white water passes between boulders and whatnot.)
All fricatives involve this turbulence-generating mechanism.
English fricatives:
Voiceless: [f] [θ] (“theory”) [s] [ʃ] (“shoe”) [h]
Voiced: [v] [ð] (“this”) [z] [ʒ] (“Zsa Zsa”)
All English fricatives except (maybe) [h] form voiced-voiceless cognates:
[v]-[f] [ð]-[θ] [z]-[s] [ʒ]-[ʃ]
For each pair: Same place, same manner, different voicing.
WEAK (not very loud): Slit Fricatives
[f] [v] [θ] [ð] [h]
Constriction shape for weak fricatives: long flat constriction = inefficient noise generator (noise is weak)
STRONG (comparatively loud): Groove Fricatives
[s] [z] [ʃ] [ʒ]
Constriction shape for strong fricatives: more circular constriction = efficient noise generator (noise is strong)
[f]-[v]:
Place = Labiodental (lips-teeth)
Flat constriction (slit fricatives); flat (rather than round or
grooved) constrictions produce a weak noise.
No resonator in front of the constriction; spectrum has a
pretty flat shape (no well-defined resonant peaks)
spectrum during [f] noise (flat)
Narrow band spectrum during [v] noise
(flat, but with harmonics in the lows)
[θ]-[ð]:
Place = Linguadental (tongue-teeth) or
interdental (linguadental & interdental are synonyms)
Flat constriction (slit fricatives); flat (rather than
round or grooved) constrictions produce a
weak noise
No resonator in front of the constriction; like [f]
and [v], spectrum has a pretty flat shape (no
well-defined resonant peaks)
NOTE: Place is always listed as linguadental/
interdental, but for [ð] in particular the tongue is
often behind the top teeth; i.e., [ð] is more often
dental than linguadental/interdental.
[s]-[z]:
Place = alveolar
Round-ish, grooved constriction; these produce a
strong noise
Short resonator in front of the constriction formed by
the lips; spectrum has a strong high-frequency peak.
Why high freq? Short tubes have high-frequency resonances.
spectrum during [s] noise (hi-freq peak)
[z]: Spectrogram for [z] (not shown) is very similar, except
that voicing (a glottal buzz) will be mixed in with the noise,
just like [v] and [ð].
[ʃ]-[ʒ] (also [š] and [ž]; small wedge over [s]/[z] = hachek):
Place = Alveopalatal/Palatoalveolar/Prepalatal
Round-ish, grooved constriction; these produce a strong noise.
Relative to [s]-[z]: Place further back and lips are rounded.
Result: Longer resonator in front of the constriction; longer
tubes have lower resonant freq’s. So, [ʃ] has more low freq
energy than [s]; [ʒ] has more low freq energy than [z].
(Figure: more low freq energy for [ʃɑ] than [sɑ]. Same deal for [ʒ] and [z].)
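The tube-length argument can be made concrete with the standard quarter-wavelength formula for a tube closed at one end and open at the other, f = c / (4L). The formula, the speed of sound (about 35,000 cm/s), and the two front-cavity lengths below are assumptions added for illustration; none of them come from these notes, and the lengths are not measurements of real [s] or [ʃ] articulations.

```python
SPEED_OF_SOUND_CM_PER_S = 35000.0  # assumed approximate value


def lowest_resonance_hz(tube_length_cm):
    """Lowest resonance of a tube closed at one end: f = c / (4 * L)."""
    return SPEED_OF_SOUND_CM_PER_S / (4.0 * tube_length_cm)


# Hypothetical front-cavity lengths, chosen only to show the trend.
shorter_cavity_cm = 2.0  # stands in for an [s]-like front cavity
longer_cavity_cm = 4.0   # stands in for a [ʃ]-like front cavity

print(lowest_resonance_hz(shorter_cavity_cm))  # 4375.0
print(lowest_resonance_hz(longer_cavity_cm))   # 2187.5
```

Doubling the length halves the resonant frequency, which is the direction of the [s] vs. [ʃ] difference described above.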
[h]:
Place = Glottal (whisper)
Tongue, lips & jaw don’t have anything in
particular to do in the production of [h] since it
is a glottal articulation.
Since the vocal tract can do whatever it pleases
during [h], the tongue, lips & jaw will take the
position of the following vowel.
[h], then, is simply a whispered vowel:
he [hi]: [h] = whispered [i]
who [hu]: [h] = whispered [u]
hoe [ho]: [h] = whispered [o]
[ɦ]: Voiced glottal fricative, which may seem impossible.
When /h/ (the slashes here are deliberate) occurs between
two vowels, as in “ahoy,” the glottal fricative can be breathy
(partially voiced) rather than whispered. In breathy voice, the glottis is
simultaneously producing hiss and buzz. Phonetically,
the resulting sound is called a voiced glottal fricative,
though voicing (periodic) and hiss (aperiodic)
elements from the glottis are mixed. The symbol is [ɦ].
hoy [hɔi]: spectrum during [h] – no harmonics
ahoy [əɦɔi]: spectrum during [ɦ] – note the harmonics
3. Nasals
Vocal tract is closed (at the lips, alveolar ridge, or
velum); velum is lowered; acoustic energy flows
through the nose rather than mouth.
[m]: bilabial
[n]: alveolar
[ŋ]: velar
•[ŋ]: Symbol called engma or long n
•[ŋ] can end words (sing [sɪŋ]; lung [lʌŋ], bang
[beŋ], etc.) or appear in the middles of words
(singer [sɪŋɚ], sinker [sɪŋkɚ], languid [leŋgwɪd]),
but [ŋ] cannot begin words.
NOTE: Spelling convention: ng = [ŋ], but there is no [g]
and no [n] in sing, singer, song, hanger, stirring, bang, etc.
A [g] may follow the [ŋ], though, as in languid [leŋgwɪd].
[k] following [ŋ] is also common, as in sinker [sɪŋkɚ].
Q: Will the phonetic sequence [iŋ] (e.g.,
[siŋ]) ever be correct?
A: Nah. Never.
Q: What should it be?
A: [ɪŋ] (e.g., [sɪŋ])
Q: Is there an [n] in words like stinking or blinker?
A: No. These are engmas (e.g., [stɪŋkɪŋ]
not [stɪnkɪŋ], [blɪŋkɚ] not [blɪnkɚ]).
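The two Q&A points above ([ɪŋ] rather than [iŋ], and [ŋk] rather than [nk]) can be expressed as a simple transcription check. The function name `fix_engma` is made up for this sketch, and plain string replacement is a deliberately crude assumption: it only handles these two patterns and knows nothing about word or morpheme boundaries.

```python
def fix_engma(transcription):
    """Apply the two corrections from the notes to a transcription string."""
    fixed = transcription.replace("nk", "ŋk")  # [n] before [k] should be [ŋ]
    fixed = fixed.replace("iŋ", "ɪŋ")          # [iŋ] should be [ɪŋ]
    return fixed


for attempt in ("stɪnkɪŋ", "blɪnkɚ", "siŋ", "sɪŋ"):
    print(attempt, "->", fix_engma(attempt))
```

A transcription that is already correct, such as [sɪŋ], passes through unchanged.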
4. Affricates
There are only 2 of these in English:
[tʃ] & [dʒ] (also [č] and [ǰ])
church [tʃɚtʃ] (or [čɚč])
judge [dʒʌdʒ] (or [ǰʌǰ])
The mechanism of sound production: (1) the vocal
tract is completely occluded (with the velum up); (2) the
occlusion is released into a short fricative: [ʃ] or [ʒ].
Affricates are stops followed by short fricatives.
Place: Alveopalatal/Palatoalveolar/Prepalatal; the
same as [ʃ]-[ʒ], not the same as [t]-[d].
Place is not alveolar, as indicated in the text.
5. Approximants (note the spelling)
Two Types of Approximants
Liquids
[r] [l]
red [rɛd]
led [lɛd]
Glides (also called semivowels)
[w] [j]
wed [wɛd]
yet [jɛt]
These sounds are vowel-ish consonants, though they
are definitely consonants. For [r w j] (i.e., all but [l]),
there is a vowel with the same sound quality:
[r] : [ɚ] [w] : [u] [j] : [i]
[r] is the consonant version of [ɚ]
[w] is the consonant version of [u]
[j] is the consonant version of [i]
[l] is called a lateral: the tongue is on the alveolar ridge, and
acoustic energy flows along the two sides (lateral margins) of the
tongue. This is how [l] gets the name lateral. It’s all by itself; i.e.,
[l] is the only lateral consonant in English.
[r w j]: these are produced in the same way as [ɚ u i]
[r]: retroflex or bunched, somewhat rounded (like [ɚ])
[w]: high, back, rounded (like [u])
[j]: high, front, retracted lips (like [i])
Notice that these are features of vowel articulation, not features
of consonant articulation. But since these really are consonants,
somehow we have to force these onto a consonant articulation
chart using features such as alveolar, palatal, alveopalatal, etc.
It’s cumbersome and a bit forced, but it’s done.
[r] = alveolar (sometimes palatal); [w] = bilabial and velar; [j] = palatal
Classifications are somewhat arbitrary, but you still have to learn them.
One Other Way to Classify Approximants
[r w j]: These are central approximants. Sound
energy travels through the center of the
vocal tract.
[l]: This is a lateral approximant. Sound energy
travels around the sides of the tongue.
(Why? The tongue is in contact with the alveolar
ridge, forcing sound energy to go around the sides.)
That’s all there is to it.
This is an important thing to know because MacKay
uses this central vs. lateral approximant
distinction in his place-manner-voicing system.
Last Point on Approximants
The symbol we’ve been using in here for
consonant R is [r].
In the IPA, [r] is used for a trilled R, as in Spanish
(and many other languages).
The official IPA symbol for the rhotic R that occurs
in English is [ɹ] (lower case ‘r’ rotated 180°).
This is a headache to write, and since English does
not have a trilled R, it’s convenient just to borrow
the [r] symbol. But you will sometimes see the [ɹ] symbol.
6. Flap
Alveolar place; like a [d], but with very brief contact
with the alveolar ridge. In English, flaps occur as an
allophone of /t/ and /d/ between vowels when the
following vowel is unstressed (e.g., kitty, butter, Betty, later).
How are [d] and [ɾ] different? [d] and [ɾ] are essentially the
same sound, but [ɾ] has a very brief contact with the
alveolar ridge.
The word “identity” (spoken as it typically is in ordinary
conversational speech) has one [d] and one [ɾ]. Try
transcribing this word.
identity: [aidɛnəɾi] (Only the last /t,d/ becomes a [ɾ]. Why?)
Note the very brief contact for the flap – much shorter than for
the [d]. Also note: Flaps precede unstressed syllables. Also:
Wendy [wɛndi] vs. wedding [wɛɾɪŋ]
Andes [ændiz] vs. attic [æɾɪk] (or [æɾək])
One of the most common mistakes students (& some
teachers) make is using a [t] rather than a [ɾ] in words like:
phonetics, butter, plotter, heterogeneity, pattern, crater
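The flapping environment described in this section can be sketched as a single condition. The function name `surface_form` and its boolean parameters are hypothetical, and the sketch models only the stated rule (between vowels, preceding an unstressed vowel); it does not model other things that happen in casual speech, such as the dropped first /t/ in “identity.”

```python
def surface_form(phoneme, prev_is_vowel, next_is_vowel, next_vowel_stressed):
    """Return the flap for /t/ or /d/ in the flapping environment, else the phoneme."""
    in_flap_context = prev_is_vowel and next_is_vowel and not next_vowel_stressed
    if phoneme in {"t", "d"} and in_flap_context:
        return "ɾ"
    return phoneme


# /t/ in "butter": between vowels, following vowel unstressed
print(surface_form("t", True, True, False))
# /d/ in "Wendy": preceded by [n], not a vowel
print(surface_form("d", False, True, False))
# /t/ in "return": following vowel is stressed
print(surface_form("t", True, True, True))
```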
Last point: Many languages use a flap as their
R sound; e.g., Japanese.
[ɑkiɾɑ] (not [ɑkidɑ])
[hiɾoʃi] (not [hidoʃi])
[ɑɾigɑto] (not [ɑdigɑto])
