Internationalising SSML
Perspectives from the Local Language Speech
Technology Initiative
Ksenia Shalonova & Roger Tucker
Outside Echo Ltd
04 October 2015
Local Language Speech Technology Initiative
Mission – provide tools, support and training for developing Speech and
Language systems for indigenous languages mainly in developing
countries.
kiSwahili: Ukimkirimu Mola wako hukosi fungu lako. (If you are generous to your God, you will not miss your share.)
isiZulu: IMalaria idalwa amagciwane ahlasela igazi lomuntu. (Malaria is caused by parasites that infect human red blood cells.)
Hindi: Maleria kithnaa gambheer hai? (How serious is Malaria?)
Local Language TTS (Text to Speech)
UK: Outside Echo, Bristol; CSTR, Edinburgh
Germany: University of Bielefeld
India: IIIT Hyderabad; HP Labs; IISc Bangalore
Nigeria: University of Uyo
Kenya: University of Nairobi
South Africa: Meraka, Pretoria
Decomposition of a Word into its Constituents (1)
Required for the following TTS modules:
• Proper grapheme-to-sound rules (agglutinating languages)
• Proper tone assignment (agglutinating tonal Bantu languages)
<morph decompose_as="dep+pe">deppe</morph>
(Ibibio reduplication dep – “buying” and deppe – “not buying”)
The following Turkish word is quite easily decomposed into its constituents:
osman+li+las+tir+ama+yabil+ecek+ler+imiz+den+mis+siniz+cesine
(“as if you were of those whom we might consider converting into an Ottoman”)
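The decomposition markup above can be consumed mechanically. The following is a minimal Python sketch, assuming the proposed <morph> element and its decompose_as attribute (neither is part of standard SSML). The helper name split_morphemes is hypothetical, and the check that the parts rebuild the surface form is an assumption that only holds for purely concatenative cases.

```python
import xml.etree.ElementTree as ET


def split_morphemes(fragment):
    """Return (surface_form, morphemes) for one proposed <morph> element.

    <morph> and decompose_as come from the proposal on this slide;
    they are not standard SSML.
    """
    elem = ET.fromstring(fragment)
    if elem.tag != "morph":
        raise ValueError("expected a <morph> element")
    surface = (elem.text or "").strip()
    # Fall back to the whole word when no decomposition is given.
    morphemes = elem.get("decompose_as", surface).split("+")
    # Assumption: the constituents concatenate back to the surface form.
    if "".join(morphemes) != surface:
        raise ValueError("decomposition does not match surface form")
    return surface, morphemes


surface, parts = split_morphemes('<morph decompose_as="dep+pe">deppe</morph>')
print(surface, parts)
```

A grapheme-to-sound or tone module could then work on each constituent in turn rather than on the undivided word.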
Decomposition of a Word into its Constituents (2)
(Schwa deletion in Hindi Compound Words)
Each consonant in Hindi carries an inherent schwa, which is either deleted or
preserved in pronunciation. To provide proper schwa deletion rules, the
following decompositions are required:
• Decomposition of Hindi compound words into single words
• Decomposition of Hindi non-compound words into morphemes
lok (“public”) + sabhA (“gathering”) => loksəbhA (“lower house of the parliament”)
<morph decompose_as="lok+sabhA">loksabhA</morph>
Another option for specifying Hindi schwa deletion could be the <say-as> element.
Decomposition of a Word into its Constituents (3)
(Moving Lexical Stress in Russian)
Proper lexical stress assignment is based on the stem type and the
morphological class.
Possible solutions in SSML annotation:
1. Adapting a morphological lexicon (a pronunciation lexicon is not helpful as
the number of wordforms in Russian is enormous)
2. Decomposing a word into its constituents.
For naïve speakers, decomposing words into their constituents is much more
difficult in inflecting languages than in agglutinating languages.
3. Inserting an explicit tag <lexical_stress> would be the easiest way of
handling a moving lexical stress. E.g. b<lexical_stress>e</lexical_stress>gal
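To illustrate how a synthesiser front end might read the explicit stress tag, here is a minimal Python sketch. It assumes the proposed <lexical_stress> tag wraps exactly one stressed span per word; the function name stressed_span and the dummy <w> wrapper element are hypothetical and used only to make the fragment parse as XML.

```python
import xml.etree.ElementTree as ET


def stressed_span(marked_word):
    """Return (plain_word, start, end) for the stressed span of one word.

    marked_word uses the proposed <lexical_stress> tag, e.g.
    'b<lexical_stress>e</lexical_stress>gal'.
    """
    # Wrap in a dummy root so the mixed content parses as XML.
    root = ET.fromstring("<w>" + marked_word + "</w>")
    stress = root.find("lexical_stress")
    if stress is None:
        raise ValueError("no <lexical_stress> tag found")
    before = root.text or ""
    vowel = stress.text or ""
    after = stress.tail or ""
    start = len(before)
    return before + vowel + after, start, start + len(vowel)


word, start, end = stressed_span("b<lexical_stress>e</lexical_stress>gal")
print(word, start, end)
```

The returned character offsets let later modules mark the stressed vowel without needing any morphological analysis of the word.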
Prosody
(general remarks)
Prosody is mainly realised on the syllabic level
Tags either for all syllables or only for particular syllables are required
for proper assignment of prosodic features
A tag for a particular syllable:
good <syllable prosody_rate="+10%" stress="yes" emphasis_level="strong">mor</syllable>ning
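A syllable-level tag of this kind could be turned into per-segment prosody settings roughly as follows. This is a minimal Python sketch, assuming the proposed <syllable> tag and its attribute names exactly as written on the slide; the function name syllable_segments and the dummy <s> wrapper are hypothetical, and any child element other than <syllable> is ignored apart from the text that follows it.

```python
import xml.etree.ElementTree as ET


def syllable_segments(utterance):
    """Split an utterance into (text, attributes) segments.

    Plain text gets an empty attribute dict; text inside the proposed
    <syllable> tag carries that tag's attributes.
    """
    # Wrap in a dummy root so the mixed content parses as XML.
    root = ET.fromstring("<s>" + utterance + "</s>")
    segments = []
    if root.text:
        segments.append((root.text, {}))
    for child in root:
        if child.tag == "syllable":
            segments.append((child.text or "", dict(child.attrib)))
        if child.tail:
            segments.append((child.tail, {}))
    return segments


example = ('good <syllable prosody_rate="+10%" stress="yes" '
           'emphasis_level="strong">mor</syllable>ning')
for text, attrs in syllable_segments(example):
    print(repr(text), attrs)
```

Untagged stretches keep an empty attribute set, so default prosody applies to them and only the marked syllable is overridden.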
Prosody
(African Tonal Languages)
1. Lexical tones (tones function as phonemes).
These require phonemic tone markup as used for Mandarin.
1.1. Floating tone (a morpheme that contains only tone)
<tone floating_tone= "yes" >ba</tone>
2. Grammatical tones (tones define grammatical categories).
<morph decompose_as="dep+pe" tone= "h+l" >deppe</morph>
3. Terraced tones (realisations of grammatical tones on the basis of a
finite state model).
<morph decompose_as="dep+pe" terrace_pos= "1+2" >deppe</morph>
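The pairing of morphemes with tone labels implied by these tags can be sketched in Python. This assumes the proposed decompose_as, tone and terrace_pos attributes each use "+" as the separator and list one value per morpheme; the function name align_tones is hypothetical.

```python
import xml.etree.ElementTree as ET


def align_tones(fragment, attr="tone"):
    """Pair each morpheme with its label from a '+'-separated attribute.

    Works for the proposed tone and terrace_pos attributes alike.
    """
    elem = ET.fromstring(fragment)
    morphemes = elem.get("decompose_as", "").split("+")
    labels = elem.get(attr, "").split("+")
    # Assumption: exactly one label per morpheme.
    if len(morphemes) != len(labels):
        raise ValueError("need exactly one label per morpheme")
    return list(zip(morphemes, labels))


print(align_tones('<morph decompose_as="dep+pe" tone="h+l">deppe</morph>'))
print(align_tones('<morph decompose_as="dep+pe" terrace_pos="1+2">deppe</morph>',
                  attr="terrace_pos"))
```

Keeping the labels aligned by position means a finite state model for terraced tones could consume the morphemes in order, one label at a time.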
Dialects and Styles
1. Tags for the Dialects
<lang= "kiSwahili" region= "Comoros Islands" >
<lang= "kiSwahili" dialect="Kingozi" normative= "yes" >
2. Tags for the Styles
Ibibio culture requires a TTS in a gentle voice. Is the available attribute
volume="soft" enough?
<voice type = "gentle" >
Summary
1. Decomposition of words into either morphemes or syllables is
required for
• tone assignment
• pitch & duration assignment
• grapheme to phoneme rules
2. Moving lexical stress may need to be tagged explicitly
3. Dialects and styles need to be supported