High-quality Speech Translation for Language Learning
Chao Wang and Stephanie Seneff
June 24, 2004
Spoken Language Systems Group
MIT Computer Science and Artificial Intelligence Lab
Outline
• Motivation and introduction
• Component technologies
– Language understanding
– Language generation
• Translation by generation
• Translation by example
• Evaluation
• Summary and future work
Background
• Language teachers have limited time to interact with
students in dialogue exchanges
• Computers can provide a non-threatening environment in
which to practice communicating
• Our group has been developing multi-lingual spoken
conversational systems since 1990
– Concentrating on domains related to travel
– Can easily be adapted for language learning applications
– A translation capability from the native language (L1) to the target
language (L2) can greatly improve their usability for language
learning
Introduction
• Goal: provide translation aids for language learning
– Must be high quality
– Must be robust to speech recognition errors
• Strategies for achieving high quality and robustness
– Interlingua-based translation using formal generation rules
– Restricted conversational domains (lesson plans)
* Emphasis on mechanisms to enable rapid porting to
new domains and languages
– Use parsability to assess quality of translation outputs
– Back off to example-based method when parse fails
Language Understanding: TINA
Approach: Context-free rules + constraints + probabilities
Rules:
– Define permissible linguistic patterns in the language and
domain
– Encode both syntactic and semantic information
Constraints:
– Eliminate patterns that violate known syntactic/semantic
restrictions (e.g., number agreement)
– Account for movement of constituents in surface realization
Probabilities:
– Support prediction of next word given preceding context
TINA has been used in many systems over the last 10 years:
– Domains: weather, air travel, restaurant guide, hotel
reservations, urban navigation, . . .
– Languages: English, Mandarin, Japanese, Spanish, French, . . .
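The rule component can be illustrated with a toy top-down parser. This is only a sketch under stated assumptions, not TINA itself: the rule table, the extra words (snow, seattle, tomorrow) and the function names are hypothetical, the category names are borrowed from the example parse tree in this deck, and the constraint and probability components are not modeled.

```python
# Toy context-free rules (hypothetical; NOT TINA's grammar format).
# Any symbol that is not a key of RULES is treated as a terminal word.
RULES = {
    "sentence": [["question"]],
    "question": [["will", "subject", "predicate"]],
    "subject": [["it"]],
    "predicate": [
        ["intr_verb", "locative", "temporal"],
        ["intr_verb", "locative"],
        ["intr_verb"],
    ],
    "intr_verb": [["rain"], ["snow"]],
    "locative": [["in", "city_name"]],
    "city_name": [["boston"], ["seattle"]],
    "temporal": [["this", "weekend"], ["tomorrow"]],
}


def parse(symbol, words, i):
    """Yield (tree, next_index) for every way symbol matches words from i."""
    if symbol not in RULES:  # terminal: must equal the next word
        if i < len(words) and words[i] == symbol:
            yield symbol, i + 1
        return
    for rhs in RULES[symbol]:
        for children, j in expand(rhs, words, i):
            yield (symbol, children), j


def expand(rhs, words, i):
    """Match the right-hand-side symbols one after another."""
    if not rhs:
        yield [], i
        return
    for tree, j in parse(rhs[0], words, i):
        for rest, k in expand(rhs[1:], words, j):
            yield [tree] + rest, k


def parses(sentence):
    """True if the whole sentence is covered by the toy grammar."""
    words = sentence.lower().rstrip("?").split()
    return any(j == len(words) for _, j in parse("sentence", words, 0))


print(parses("Will it rain in Boston this weekend?"))
print(parses("Will it rain Boston in?"))
```

A full system would add feature constraints (such as number agreement) and rule probabilities on top of this kind of rule table.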
Process to Automate Grammar Development
[Figure: flow diagram. Labels: existing domains Orion, Mercury, Pegasus, Jupiter, Voyager; “scrubbed” sentences, e.g. “Are there any <noun> from <proper_name> to <proper_name>”; merged “seed” grammar; generic grammar; domain-dependent semantics; grammar for new domain.]
• Merge several grammars into shared rules,
predominantly syntax-based
• Once generic grammar is available, creating derivative
domain-dependent grammars is straightforward
Example Parse Tree
[Figure: parse tree for “will it rain in boston this weekend”. Node labels: sentence, question, will, subject, predicate, intr_verb_phrase, intr_verb, intr_verb_args, locative (in, a_city, city_name), temporal (day_list).]
• Utilizes pre-existing sub-grammars for time and location
• Selected parse categories contribute to a hierarchical
semantic frame (interlingua)
Semantic Frame for Example
Will it rain in boston this weekend?
{c verify
   :aux “will”
   :subject “it”
   :pred {p rain
      :pred {p locative
         :prep “in”
         :topic {q city
            :name “boston” } }
      :pred {p temporal
         :topic {q weekday
            :quantifier “this”
            :name “weekend” } } } }
Semantic frame encodes syntactic structure and features in
addition to semantic information
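One way to hold such a frame in a program is as a small tree of typed nodes with ordered key/value pairs. The sketch below is an assumption made for illustration, not the group's internal representation: the `Frame` class, `get_all` and `render` are hypothetical names. An ordered list of pairs is used because a key such as `:pred` can occur more than once at the same level.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union

Value = Union[str, "Frame"]


@dataclass
class Frame:
    """One frame node: a type tag (c, p or q), a name, and ordered slots."""
    kind: str
    name: str
    slots: List[Tuple[str, Value]] = field(default_factory=list)

    def get_all(self, key: str) -> List[Value]:
        """Return every value stored under key, in order."""
        return [v for k, v in self.slots if k == key]


def render(f: Frame) -> str:
    """Serialize a frame back to the bracketed notation used on the slide."""
    parts = ["{" + f.kind + " " + f.name]
    for key, value in f.slots:
        text = render(value) if isinstance(value, Frame) else '"' + value + '"'
        parts.append(":" + key + " " + text)
    return " ".join(parts) + " }"


# The frame for "Will it rain in boston this weekend?"
frame = Frame("c", "verify", [
    ("aux", "will"),
    ("subject", "it"),
    ("pred", Frame("p", "rain", [
        ("pred", Frame("p", "locative", [
            ("prep", "in"),
            ("topic", Frame("q", "city", [("name", "boston")])),
        ])),
        ("pred", Frame("p", "temporal", [
            ("topic", Frame("q", "weekday", [
                ("quantifier", "this"),
                ("name", "weekend"),
            ])),
        ])),
    ])),
])

rain = frame.get_all("pred")[0]
print([p.name for p in rain.get_all("pred")])
print(render(frame))
```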
Language Generation: GENESIS
• Generates a surface string from the semantic frame
• Accomplishes many tasks in dialogue system development
– In the same language (paraphrasing & response generation)
– In a different language (translation)
– Other formal languages (key-value pairs, SQL queries, etc.)
• Utilizes recursive formal rules along with a lexicon encoding
appropriate surface form realizations in context
Challenges in Cross-language Generation for Translation
• Some expressions have very different syntactic structures
in different languages
– What is your name? / 你(you) 叫(call) 什么(what) 名字(name)?
– I like her. / Ella me gusta.
• Syntactic features are expressed in many different ways
– Determiners (English but not Chinese)
* Where is a bank nearby? / 附近(vicinity) 哪儿(where) 有(have) 银行(bank)?
– Particles (Chinese but not English)
* that hotel / 那(that) 家(<particle>) 旅馆(hotel)
* I lost my key. / 我(I) 丢(lose) 了(<past tense>) 我的(my) 钥匙(key).
– Gender (extensive in Spanish)
Generation Procedures
• Constituent order specified in recursive rules
– “Pull” and “Push” mechanisms support major structural
reorganization
• Lexical selection controlled by feature propagation
– Inflectional forms based on syntactic features
– Lexical realization (word sense) influenced by surrounding
semantic context
• Infers missing features
• Can generate multiple surface strings for the same
semantic frame
A Generation Example
{c verify
   :aux “will”                  (“will” conditioned by “verify”)
   :subject “it”
   :pred {p rain
      :pred {p locative
         :prep “in”
         :topic {q city
            :name “boston” } }
      :pred {p temporal         (pulled to the front)
         :topic {q weekday
            :quantifier “this”
            :name “weekend” } } } }
Output 1: bo1 shi4 dun4 zhe4 zhou4 mo4 hui4 bu2 hui4 xia4 yu3 ?
( Boston this weekend will-not-will rain ? )
Output 2: zhe4 zhou4 mo4 bo1 shi4 dun4 hui4 xia4 yu3 ma5 ?
( this weekend Boston will rain <question-particle> ? )
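The two pinyin outputs on this slide can be reproduced with a very small generator. This is a minimal sketch under stated assumptions, not GENESIS rule syntax: the flattened dictionary frame, the `LEXICON` table and the `generate` function with its two flags are hypothetical. It shows the two ideas named on the slide: the temporal and locative elements are pulled in front of the verb, and “will” is realized according to the “verify” clause type.

```python
# Toy lexicon mapping frame values to tone-numbered pinyin (from the slide).
LEXICON = {
    "boston": "bo1 shi4 dun4",
    "this weekend": "zhe4 zhou4 mo4",
    "rain": "xia4 yu3",
}

# Simplified, flattened version of the semantic frame (hypothetical layout).
FRAME = {
    "clause": "verify",
    "aux": "will",
    "pred": {
        "name": "rain",
        "locative": {"prep": "in", "city": "boston"},
        "temporal": {"quantifier": "this", "name": "weekend"},
    },
}


def generate(frame, temporal_first, a_not_a):
    """Return one Mandarin surface string for a yes/no weather question."""
    if frame["clause"] != "verify":
        raise ValueError("this sketch only covers yes/no (verify) questions")
    pred = frame["pred"]
    temporal = pred["temporal"]
    time = LEXICON[f"{temporal['quantifier']} {temporal['name']}"]
    # The preposition "in" has no surface form in these outputs.
    city = LEXICON[pred["locative"]["city"]]
    verb = LEXICON[pred["name"]]
    # "Pull": elements nested under the verb are moved in front of it.
    fronted = [time, city] if temporal_first else [city, time]
    # "will" conditioned by "verify": A-not-A form, or plain form plus "ma5".
    if a_not_a:
        tail = ["hui4 bu2 hui4", verb]
    else:
        tail = ["hui4", verb, "ma5"]
    return " ".join(fronted + tail) + " ?"


print(generate(FRAME, temporal_first=False, a_not_a=True))
print(generate(FRAME, temporal_first=True, a_not_a=False))
```

Producing both strings from one frame corresponds to the point that multiple surface strings can be generated for the same semantic frame.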
Generation-based Translation
[Figure: English input → Parse (English grammar) → semantic frame → Generate (Chinese rules) → Chinese sentence → Parse? (Chinese grammar) → accepted: Chinese output; rejected: example-based translation.]
• Semantic frame serves as interlingua
• Translation achieved by parsing and generation
• Use Chinese grammar to detect potential problems
• Rejected sentences routed to example-based translation for
a second chance
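The accept/reject routing can be sketched as a short control flow. This is a sketch only: `translate` and the three callables are hypothetical stand-ins for the generation-based translator, the Chinese grammar check and the example-based translator, and the lookup tables are toy data built from sentences that appear in this deck.

```python
from typing import Callable, Optional


def translate(
    sentence: str,
    generate: Callable[[str], Optional[str]],
    parses: Callable[[str], bool],
    by_example: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Prefer the generated translation; back off when it is rejected."""
    candidate = generate(sentence)
    if candidate is not None and parses(candidate):
        return candidate          # accepted by the target-language grammar
    return by_example(sentence)   # second chance; None if this fails too


# Toy stand-ins (hypothetical data, for illustration only).
GENERATED = {
    "will it rain in Boston tomorrow?":
        "bo1 shi4 dun4 ming2 tian1 hui4 xia4 yu3 ma5?",
    "Is there any chance of rain in San Francisco?":
        "<ill-formed output>",    # placeholder for a rejected generation
}
WELL_FORMED = {"bo1 shi4 dun4 ming2 tian1 hui4 xia4 yu3 ma5?"}
EXAMPLES = {
    "Is there any chance of rain in San Francisco?":
        "jiu4 jin1 shan1 hui4 bu2 hui4 xia4 yu3?",
}


def demo(sentence: str) -> Optional[str]:
    return translate(sentence, GENERATED.get,
                     WELL_FORMED.__contains__, EXAMPLES.get)


for s in ["will it rain in Boston tomorrow?",
          "Is there any chance of rain in San Francisco?",
          "an utterance neither method covers"]:
    print(s, "->", demo(s))
```

Returning `None` when both methods fail leaves it to the caller to decide how to respond to the user.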
Example-based Translation
• Requires translation pairs and a retrieval mechanism
– Corpus automatically obtained via the generation-based approach
– Retrieval based on lean semantic information
* Encoded as key-value pairs
* Obtained from semantic frame via simple generation rules
* Generalizes words to classes (e.g., city name, weekday, etc.) to
overcome data sparseness
Example-based Translation Procedure
[Figure: English input → Parser (English grammar) → semantic frame → Generator (key-value rules) → KV string → KV-Chinese table → Chinese output.]
Example:
Input: Is there any chance of rain in San Francisco?
KV string: WEATHER: rain CITY: San Francisco (city name masked as <CITY>)
City translation: { <CITY> : jiu4 jin1 shan1 }
Retrieved: <CITY> hui4 bu2 hui4 xia4 yu3?
Output: jiu4 jin1 shan1 hui4 bu2 hui4 xia4 yu3?
• Key-value string serves as interlingua
• Translation achieved by parsing and table lookup
• City name masked during retrieval and recovered in final
surface string
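The masking and recovery step can be sketched as follows. This is a minimal illustration under stated assumptions: `CITY_NAMES`, `KV_TABLE` and `translate_by_example` are hypothetical names, and the table holds a single toy entry taken from the San Francisco example.

```python
from typing import Dict, Optional

# Toy data (hypothetical tables; pinyin taken from this deck).
CITY_NAMES = {
    "san francisco": "jiu4 jin1 shan1",
    "boston": "bo1 shi4 dun4",
}
KV_TABLE = {
    "WEATHER: rain CITY: <CITY>": "<CITY> hui4 bu2 hui4 xia4 yu3?",
}


def translate_by_example(kv: Dict[str, str]) -> Optional[str]:
    """Look up a translation by key-value string, masking the city name."""
    city = kv.get("CITY")
    masked = dict(kv)
    if city is not None:
        masked["CITY"] = "<CITY>"   # generalize the word to its class
    key = " ".join(f"{k}: {v}" for k, v in masked.items())
    template = KV_TABLE.get(key)
    if template is None:
        return None                 # no stored example for this meaning
    if city is not None:
        chinese_city = CITY_NAMES.get(city.lower())
        if chinese_city is None:
            return None             # cannot restore an unknown city
        template = template.replace("<CITY>", chinese_city)
    return template


print(translate_by_example({"WEATHER": "rain", "CITY": "San Francisco"}))
print(translate_by_example({"WEATHER": "rain", "CITY": "Boston"}))
```

Because the key contains the class token rather than the city itself, one stored example serves every city in the name table.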
Complete Translation Procedure
[Figure: two phases, creation and retrieval. English sentence input → Parse (English grammar) → semantic frame → Generate (Chinese rules) → Chinese translation → Parses? (Chinese grammar) → yes / no. Key-value rules produce the key for the key-value index database.]
Example: will it rain in Boston tomorrow? → bo1 shi4 dun4 ming2 tian1 hui4 xia4 yu3 ma5?
Key-value string: WEATHER: rain CITY: boston (the city name and its translation are both tagged <CITY>)
• Only parsed sentences go into key-value database
• Indexed by semantic information encoded as key-value string
• Unparsed translations replaced by key-value option
• Use word classes to overcome data sparseness
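The creation phase can be sketched as a filter followed by an indexing step. The names `mask_city` and `build_index`, the `CITY_ZH` table and the stub grammar check are all hypothetical; the key-value string and the translation come from the Boston example on this slide.

```python
# Hypothetical city table used to mask the city on both sides of a pair.
CITY_ZH = {"boston": "bo1 shi4 dun4"}


def mask_city(kv, chinese):
    """Replace the city with <CITY> in both the key and the translation."""
    city = kv.get("CITY")
    key_kv = dict(kv)
    if city in CITY_ZH and CITY_ZH[city] in chinese:
        key_kv["CITY"] = "<CITY>"
        chinese = chinese.replace(CITY_ZH[city], "<CITY>")
    key = " ".join(f"{k}: {v}" for k, v in key_kv.items())
    return key, chinese


def build_index(entries, parses):
    """Keep only translations the target grammar accepts, indexed by key."""
    index = {}
    for kv, chinese in entries:
        if not parses(chinese):
            continue                      # unparsed output never enters the table
        key, template = mask_city(kv, chinese)
        index.setdefault(key, template)   # first accepted example wins
    return index


# Toy input: one well-formed translation and one placeholder for a bad one.
entries = [
    ({"WEATHER": "rain", "CITY": "boston"},
     "bo1 shi4 dun4 ming2 tian1 hui4 xia4 yu3 ma5?"),
    ({"WEATHER": "rain", "CITY": "boston"}, "<ill-formed output>"),
]
ACCEPTED = {"bo1 shi4 dun4 ming2 tian1 hui4 xia4 yu3 ma5?"}  # stub grammar

index = build_index(entries, ACCEPTED.__contains__)
print(index)
```

Filtering at creation time is what gives every stored entry guaranteed coverage by the Chinese grammar.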
Evaluation: English to Mandarin Weather Domain
• Evaluation data
– Drawn from the publicly available Jupiter weather system
– Telephone recordings; conversational speech
– Unparsable utterances (English grammar) were excluded
– Total of 695 utterances, with 6.5 words per utterance on average
• System configuration
– Text input or speech input
* Recognizer achieved 6.9% word error rate, and 19.0% sentence
error rate
– Generation-based method preferred over example-based method
– NULL output if both failed
• Evaluation criteria
– Yield of each translation method
– Human judgment of translation quality
Evaluation Results (I)
Yield            Text            Speech
By generation    606 (87.2%)     592 (85.2%)
By example        59 (8.5%)       48 (6.9%)
Failed            30 (4.3%)       55 (7.9%)
Total            695 (100%)      695 (100%)
• Majority of the utterances are successfully translated using
formal generation rules, which are likely to achieve high
fidelity and quality
• A greater percentage of the utterances fail in the speech
mode, due to recognition errors
– System will apologize for not understanding the utterance and
invite the user to try again
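The yield percentages follow directly from the counts. The short check below recomputes them; the dictionary layout and the `shares` helper are illustrative choices, while the counts are the ones reported on this slide.

```python
# Utterance counts by translation method, as reported for 695 utterances.
YIELD = {
    "text": {"generation": 606, "example": 59, "failed": 30},
    "speech": {"generation": 592, "example": 48, "failed": 55},
}


def shares(counts):
    """Percentage of the total for each outcome, rounded to one decimal."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


for mode, counts in YIELD.items():
    print(mode, sum(counts.values()), shares(counts))
```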
Evaluation Results (II)
Quality       Text            Speech
Perfect       613 (88.2%)     577 (83.0%)
Acceptable     43 (6.2%)       50 (7.2%)
Wrong           9 (1.3%)       13 (1.9%)
Failed         30 (4.3%)       55 (7.9%)
Total         695 (100%)      695 (100%)
• Human judgment of translation quality based on
grammaticality and fidelity
• Three categories: perfect, acceptable, or wrong
• Fewer than 2% of the utterances produce incorrect
translation outputs
– A concurrent English paraphrase provides context for the
Chinese translation
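The claim that fewer than 2% of utterances produce incorrect output can be checked against the reported counts. In this sketch the `QUALITY` layout and the helper names are illustrative; the combined perfect-or-acceptable share is a derived figure, not one stated on the slide.

```python
# Human quality judgments, as reported for 695 utterances per input mode.
QUALITY = {
    "text": {"perfect": 613, "acceptable": 43, "wrong": 9, "failed": 30},
    "speech": {"perfect": 577, "acceptable": 50, "wrong": 13, "failed": 55},
}


def percent(part, total):
    """Share of total as a percentage, rounded to one decimal."""
    return round(100 * part / total, 1)


def summarize(counts):
    """Return (wrong %, perfect-or-acceptable %) over all utterances."""
    total = sum(counts.values())
    usable = counts["perfect"] + counts["acceptable"]
    return percent(counts["wrong"], total), percent(usable, total)


for mode, counts in QUALITY.items():
    wrong, usable = summarize(counts)
    print(f"{mode}: wrong {wrong}%, perfect or acceptable {usable}%")
```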
Summary and Future Work
• We have demonstrated a capability to produce high-quality
spoken-language translations from English to Mandarin
– Evaluation restricted to weather domain
– Fewer than 2% of the translations were incorrect
Future Plans:
• Integrate into spoken dialogue systems
• Incorporate framework into classroom environment
• Assess effectiveness in second-language acquisition
• Port to other domains and languages
– Develop tools to enable rapid porting
Thank you!
Translation Corpus
[Figure: English input → Parser (English grammar) → semantic frame → Generator (Chinese rules) → Parser (Chinese grammar) → accepted: Chinese output; key-value rules index the pair in the KV-Chinese table.]
Example: will it rain in Boston tomorrow? → bo1 shi4 dun4 ming2 tian1 hui4 xia4 yu3 ma5?
Key-value string: WEATHER: rain CITY: boston (the city name and its translation are both tagged <CITY>)
• Guaranteed coverage by the Chinese grammar
• Indexed by semantic information encoded as key-value string
• Use word classes to overcome data sparseness
Interlingua-based Speech Translation
Common meaning representation: semantic frame
[Figure: English or Chinese speech → SUMMIT (recognition) → TINA (NLU) → interlingua → GENESIS (NLG) → ENVOICE (synthesis) → English or Chinese speech. Supporting resources: models, parsing rules, generation rules, speech corpora.]
Understanding and Generation: Procedural Strategy
• Develop end-to-end English system
– Solicit example utterances from SLS members
• Create generation rules for Chinese paraphrase
– Generated sentences become initial Chinese corpus
• Develop understanding component for Chinese input
– Map to identical semantic frame as much as possible
• Adjust English generation for Chinese inputs
– Deal with missing function words, etc.
– Translation loop now possible:
English → Chinese → English
• Evaluation based on English-to-translated-English
• Similar strategy for other languages
Strategies for Translation
• Grammar design strategies
– Preserve as much information as necessary for accurate
translation
* Semantic frames are much more detailed than those in human-computer interaction applications
– Maintain consistency of semantic frame representation
across different languages whenever possible
* Seed grammar rules for each new language on English
grammar rules
* Mapping from parse tree to semantic frame preserved
• Remaining language dependent aspects in semantic
frame are addressed by generation rules
An Example: English/Chinese
How long does it take to take a taxi there
( take taxi go there need how long )
坐 出租车 去 那里 要 多久
• Function words disappear in Chinese
• Two instances of “take” have different translations
• Verb “go” omitted in English
• Sentence structure is very different