Sound and Unsound Documentation:
Questions about the roles of audio in language documentation
David Nathan
Endangered Languages Archive
School of Oriental and African Studies
University of London
www.hrelp.org
A paradigm shift?
 From evidence to performance…
Documentation output
Wittenburg & Mosel (following Himmelmann):
“… the corpus should consist of a variety of text types
and genres.
Multimedia (sound and video) recordings form the
basis of the documentation work. These recordings
should be associated with an orthographic or
phonemic transcription, a translation in one of the
major languages of the world, and/or glossings in a
local lingua franca and English…”
Documentation output
Johnson & Dwyer:
“Genre
Interaction: conversation, verbal contest, interview, meeting/gathering,
riddling, consultation, greeting/leave-taking, humor, insult/praise, letter
Explanation: procedure, recipe, description, instruction, commentary, essay,
report/news
Performance: narrative, oratory, ceremony, poetry, song, drama, prayer,
lament, joke
Teaching: textbook, primer, workbook, reader, exam, guide, problems
Analysis: dictionary, word-list, grammar, sketch, field notes
Register informal/conversational, formal, honorific, jargon,
baby/caretaker talk, joking, foreigner talk
Style ordinary speech, code-switching, play language, metrical
organization, parallelism, rhyming, nonsense/unintelligible speech”
A paradigm shift?
 Sound as evidence in documentary linguistics …
data is not independent of the theory that uses it
what is it? Disk, sound recording, file, file + metadata, transcription, etc.
how to represent and store it
how to present it
what to do with it
Recorded/recording events as performances
 Reifications of pattern or ideal
 Distinguish between event and record of it –
(fundamental for documentary linguistics)
 Repeatable, comparable; implies genre, audience
 Assists with protocol (attributes and participation)
 Allows editing to be methodologically possible
 Links us to existing fields’ knowledge and experience,
e.g. radio, cinematography, performing arts, music,
musicology, ethnography …
Archivism
 However, what we got was archivism
Archivism: capitulation of language documenters to the
agenda and priorities of archives and information
technology
 Why did this happen?
for historical reasons
rapid changes in technology
we left a vacuum
From evidence to archivism
 Positive aspects of archivism: for some, for now, the endangered languages field is luckier than others
 clear imperative to archive data
 benefits of new technologies (media, storage, convergence)
 funding and resources: DoBeS, EMELD, HRELP etc
 However
 may be short-lived
 we are thrust into competing with entities like banks
 not enough contribution to language strengthening etc
 not nurturing documentary linguistics
 a 'productivity paradox' as experienced by the financial sector?
What have we missed?
 Contact with wisdom and experience of established fields, e.g.
 radio/broadcasting (e.g. mics, MD)
 cinematography (e.g. quality and specialisation)
 journalism (e.g. equipment handling)
 audio archives (linguists had input to IASA before the 1980s or so)
What have we missed?
 Woodbury: most developments are "what's been
happening around the emergence of a documentary
linguistics", particularly technology, which has raised
expectations more than changed practices
Examples
 (Schüller) audio professionals use the trained ear as evaluator of quality, while linguists prefer waveforms etc.
cf. value of binaural recording
 media people know that signals emanate from events but do not represent them
 recording to edit
Lost opportunities?
 Technical
 stereo, binaural
 monitoring while recording (headphones)
 environment and psychoacoustics
 microphones and handling
 editing
 Content
 everyday expressions, eg Yuwaalaraay ngarigaa
 capturing environment/eliminating environment
 preludes to stories that explain who is talking and why etc.
 Wider question is: in a mature documentary linguistics, is
there a clear, or even valid, boundary between these two?
Did we get what we needed?
What did we get?
advice about formats, parameters, what to avoid
'silver bullet' equipment and formats
fundamentalism and format wars
 What do we need? If we continue to be 'lone wolf'
fieldworkers, how to get good quality signals? Quality is
relative to purpose. But given exhortations to make 'best
record', what influences quality?
What influences audio quality?
 A large number of factors:
 physical environment (inside, outside)
 control/management of environment
 acoustics - room, objects
 microphone selection, placement, handling, compatibility
 mono/stereo/binaural
 sources of noise/interference
 recorder and recording medium handling
 Clearly these span fields: do they tell us anything about
the scope of documentary linguistics?
Disappearing recorders
 Zounds! Where’s my recorder?
 storage (eg iPod etc)
 A-D and storage (eg laptop)
 transducer (microphone)
 Reasons for using a recorder (not laptop)
 workflow
 quality assurance
 consistency
 power
 There are principles involved!
How much sound?
 Under archivism, repositories are seen to determine
amount as well as quality of data
 ELDP experience
 some applicants propose amounts of audio in terms of technologies, e.g. flash cards only hold a few hours; or (on the other hand) voice recorders can hold hundreds!
 to get a grant!
 Understandable lurching back and forth between
extremes
 rapid changes in technology, and advice about it
 more information available about documentation agenda and
technologies
 competition for grants as opportunities in linguistics decrease?
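The capacity claims above ("flash cards only hold a few hours") come down to simple arithmetic on recording parameters. The sketch below is illustrative only: the function name `recording_hours` and the default parameters (16-bit, 44.1 kHz, stereo, uncompressed PCM) are assumptions, not ELDP or ELAR requirements.

```python
def recording_hours(capacity_gb, sample_rate_hz=44100, bit_depth=16, channels=2):
    """Hours of uncompressed PCM audio that fit in capacity_gb.

    Assumes decimal gigabytes (1 GB = 1,000,000,000 bytes) and ignores
    file headers and filesystem overhead.
    """
    bytes_per_second = sample_rate_hz * (bit_depth // 8) * channels
    seconds = capacity_gb * 1_000_000_000 / bytes_per_second
    return seconds / 3600


# Hypothetical card sizes, chosen only to show how the figure scales.
for capacity in (1, 4, 32):
    hours = recording_hours(capacity)
    print(f"{capacity} GB holds about {hours:.1f} h of 16-bit/44.1 kHz stereo")
```

Under these assumptions a gigabyte holds a little over an hour and a half of stereo audio, so the amount of sound a medium can hold says something about the medium and the chosen parameters, not about how much recording a project needs.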
How much sound?
 Determined by lists of output types and genres?
Wittenburg & Mosel:
“… the corpus should consist of a variety of text types
and genres.
Multimedia (sound and video) recordings form the
basis of the documentation work. These recordings
should be associated with an orthographic or
phonemic transcription, a translation in one of the
major languages of the world, and/or glossings in a
local lingua franca and English…”
How much sound?
Johnson & Dwyer:
“Genre
Interaction: conversation, verbal contest, interview, meeting/gathering,
riddling, consultation, greeting/leave-taking, humor, insult/praise, letter
Explanation: procedure, recipe, description, instruction, commentary, essay,
report/news
Performance: narrative, oratory, ceremony, poetry, song, drama, prayer,
lament, joke
Teaching: textbook, primer, workbook, reader, exam, guide, problems
Analysis: dictionary, word-list, grammar, sketch, field notes
Register informal/conversational, formal, honorific, jargon,
baby/caretaker talk, joking, foreigner talk
Style ordinary speech, code-switching, play language, metrical
organization, parallelism, rhyming, nonsense/unintelligible speech”
How much sound?
 Possible answers:
distinguish recording from outputs/products (incl
archive deposit as one output)
ELDP/ELAR: demonstrate 10% commitment
let language community members and academic
peers judge, not archives or technologies
Un-sound documentation?
 Johnston & Schembri: Documenting AUSLAN
no writing or widely-used transcription system
no standardization associated with the culture and
history of writing
no written literature; little known about genres etc
no possibility of processing, e.g. corpus work or 'text mining'
Un-sound documentation?
 Johnston sees tools like MPI’s ELAN as the
equivalent of 'writing' for signed languages
 Problems annotating video for SL also raise issues being questioned in mainstream linguistics, e.g. existence and atomicity of grammatical categories
Sound interfaces
 Spoken Karaim and ShoeHorn
Is audio the prime representation?
Multi-tiered, multi-scoped annotation; cf. recent ELAP workshop, where meaning in documentation was seen as:
at different linguistic levels
changing and ongoing over time
messy, irreconcilable, contested
drawing on meanings and texts outside the text in
question
Suggests that audio recording is merely one
(important!) aspect of the documenter’s toolset
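One way to picture multi-tiered annotation in which audio is just one component is a structure where the media file is a single field and any number of tiers, authors and competing interpretations attach to time spans. This is a minimal sketch: the names `Annotation`, `Recording`, `annotate` and `at` are hypothetical and do not reflect ELAN's or any archive's actual data model, and the values are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class Annotation:
    start: float      # seconds from the start of the recording
    end: float
    value: str
    author: str = ""  # who contributed this interpretation


@dataclass
class Recording:
    media: str                                 # the audio file is one field among others
    tiers: dict = field(default_factory=dict)  # tier name -> list of Annotation

    def annotate(self, tier, start, end, value, author=""):
        """Add an annotation to a tier, creating the tier if needed."""
        self.tiers.setdefault(tier, []).append(Annotation(start, end, value, author))

    def at(self, time):
        """Return, per tier, every annotation whose span covers `time`."""
        found = {}
        for tier, annotations in self.tiers.items():
            hits = [a for a in annotations if a.start <= time < a.end]
            if hits:
                found[tier] = hits
        return found


# Placeholder content: two annotators offer different translations of one span.
rec = Recording("session.wav")
rec.annotate("transcription", 0.0, 2.5, "utterance 1")
rec.annotate("translation", 0.0, 2.5, "interpretation A", author="annotator 1")
rec.annotate("translation", 0.0, 2.5, "interpretation B", author="annotator 2")

for tier, hits in rec.at(1.0).items():
    print(tier, [(a.value, a.author) for a in hits])
```

Keeping contested interpretations side by side, rather than forcing one value per span, is the design choice that matches the "messy, irreconcilable, contested" view of meaning above.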
Other questions
 Who does the recording?
 Can community members only use cassettes?
 What changes if we shoot video as well?
 Are community members more motivated if they can
shoot video?
 Would we collect data by phone if there was sufficient
bandwidth?
 What audio resources are most effective for language
strengthening?
 Have we conflated fieldwork methodology with
documentation’s outputs?
Conclusions
 In language documentation, a twin shift to
data orientation and
digitisation
has led us into domains where there is a wealth of existing experience, which we cannot easily tap into, while competing against those whom we can't possibly match
 Treat audio as a way to capture various kinds of
performances, not as the object of description
 We are lacking interfaces and software for working
with and presenting audio
Thank you