Translation Divergence
LING 580MT
Fei Xia
1/10/06
Papers
• Bonnie Dorr (1994): Machine Translation
Divergences: a Formal Description and
Proposed Solution
Outline
• Formal definition of translation divergence
• Seven types of divergence
• Discussion
• Remaining questions
Formal definition of
translation divergence
Distinction between the source and
target languages
Two categories (Barnett et al., 1991):
• Translation divergence: same information,
different structures
• Translation mismatches: different
information → important, but outside of the
scope of the paper
How to define translation
divergence formally?
Define the language-to-language divergence
via language-to-interlingua divergence:
Interlingua: lexical conceptual structure
(LCS)
Language-to-interlingua: mapping from
syntactic form to LCS
Lexical conceptual structure (LCS)
$$[_{T(X')}\ X'\ ([_{T(W')}\ W'],\ [_{T(Z'_1)}\ Z'_1] \ldots [_{T(Z'_n)}\ Z'_n]\ [_{T(Q'_1)}\ Q'_1] \ldots [_{T(Q'_m)}\ Q'_m])]$$
[Tree diagram: root X’ of type T(X’), with children W’, Z1’ … Zn’ and Q1’ … Qm’, each labeled with its type T(·)]
X’: logical head
W’: logical subject
Z1’…Zn’: logical arguments
Q1’…Qm’: logical modifiers
T(Φ’) is the logical type (Event, Path, …) of the
primitive Φ’ (CAUSE, LET, GO, …)
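As a rough illustration of the definition above, an LCS node can be sketched as a small recursive data structure. The class name `LCS`, its field names, and the `render` method are assumptions made for this sketch; they are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LCS:
    """One LCS node; all names here are illustrative, not from the paper."""
    logical_type: str                                      # T(X'), e.g. "Event", "Path"
    head: str                                              # X', the primitive, e.g. "GOLoc"
    subject: Optional["LCS"] = None                        # W'
    arguments: List["LCS"] = field(default_factory=list)   # Z'1 ... Z'n
    modifiers: List["LCS"] = field(default_factory=list)   # Q'1 ... Q'm

    def render(self) -> str:
        """Return the node in the bracketed form [T(X') X' (...)]."""
        children = [self.subject] if self.subject is not None else []
        children = children + self.arguments + self.modifiers
        inner = ", ".join(child.render() for child in children)
        return f"[{self.logical_type} {self.head}" + (f" ({inner})]" if inner else "]")


# A toy node with a logical subject and one logical modifier.
john = LCS("Thing", "JOHN")
event = LCS("Event", "GOLoc", subject=john, modifiers=[LCS("Manner", "HAPPILY")])
print(event.render())
```

The recursion in `render` mirrors the recursive definition: every subject, argument, and modifier is itself an LCS.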
Root LCS (RLCS)
• An RLCS is an uninstantiated LCS that is
associated with a word definition in the
lexicon (i.e., an LCS with unfilled variable
positions)
• LCSs are recursively defined.
RLCS representation for go
[Event GOLoc ([Thing X], [Path TOLoc ([Position ATLoc ([Thing X], [Location Z])])])]
→ It is different from dependency structure
Composed LCS (CLCS)
• A CLCS is an instantiated LCS that is the
result of combining two or more RLCSs by
means of unification (roughly).
• This is the interlingua form that serves as
the pivot between the source and target
languages.
CLCS representation for
“John went happily to school”
[Event GOLoc ([Thing John], [Path TOLoc ([Position ATLoc ([Thing John], [Location School])])], [Manner Happily])]
The operations of combining are not defined in this paper.
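Since the combining operations are left undefined, the following is only a guess at the simplest possible reading: plain substitution of variables in the RLCS for go, followed by attaching a modifier. The nested-tuple encoding and the helper names `instantiate` and `add_modifier` are hypothetical, and real unification would do more than this.

```python
# Each node is a tuple: (logical_type, primitive, *children).
# X and Z are the unfilled variable positions of the RLCS for "go".
RLCS_GO = (
    "Event", "GOLoc",
    ("Thing", "X"),
    ("Path", "TOLoc",
        ("Position", "ATLoc", ("Thing", "X"), ("Location", "Z"))),
)


def instantiate(node, bindings):
    """Replace variable leaves (e.g. X, Z) with the values in `bindings`."""
    logical_type, primitive, *children = node
    if not children:  # only leaves can be variables in this sketch
        primitive = bindings.get(primitive, primitive)
    return (logical_type, primitive, *(instantiate(c, bindings) for c in children))


def add_modifier(node, modifier):
    """Attach a logical modifier as an extra child of the top node."""
    return (*node, modifier)


clcs = add_modifier(
    instantiate(RLCS_GO, {"X": "John", "Z": "School"}),
    ("Manner", "Happily"),
)
print(clcs)
```

Both occurrences of X receive the same filler, which is the one property of unification that this substitution-only sketch preserves.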
Syntactic phrase
X: syntactic head
W: external argument
Z-MAXi: internal arguments
Q-MAXi: syntactic adjuncts
→ Similar to X-bar theory, GB theory, etc.
An example
Mapping between LCS and
syntactic form
• Generalized linking routine (GLR):
– X’ ↔ X (logical head ↔ syntactic head)
– W’ ↔ W (logical subject ↔ external argument)
– Z’ ↔ Z (logical argument ↔ internal argument)
– Q’ ↔ Q (logical modifiers ↔ syntactic adjunct)
• Canonical syntactic realization (CSR)
– Relate T(Φ’) to CAT(Φ): (logical type ↔ syntactic
category)
Ex: THING ↔ N, EVENT ↔ V
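The two default mappings can be sketched as lookup tables. This is a minimal illustration, assuming a flat list of constituents; the role labels, the `realize` function, and the toy clause are hypothetical, and the CSR table holds only the two pairs given above.

```python
# GLR: logical role -> syntactic position (the four default links).
GLR = {"head": "X", "subject": "W", "argument": "Z", "modifier": "Q"}

# CSR: logical type -> syntactic category (only the pairs stated on the slide).
CSR = {"THING": "N", "EVENT": "V"}


def realize(constituents):
    """Map (role, logical_type, word) triples to (position, category, word).

    Types missing from the CSR table are marked "?" rather than guessed.
    """
    return [
        (GLR[role], CSR.get(logical_type, "?"), word)
        for role, logical_type, word in constituents
    ]


# Hypothetical toy input: a head event and its logical subject.
clause = [("head", "EVENT", "went"), ("subject", "THING", "John")]
print(realize(clause))
```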
Divergence problem
• Translation divergences occur when there
is an exception either to the GLR or to the
CSR (or to both) in one language, but not
in the other.
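Read literally, this definition is a comparison of two linkings against the default. The sketch below assumes each language's lexical entry can be summarized as a table from logical to syntactic positions; `has_exception`, `is_divergence`, and the example tables are illustrative names, not part of the paper.

```python
# Default GLR links: logical position -> syntactic position.
DEFAULT_GLR = {"X'": "X", "W'": "W", "Z'": "Z", "Q'": "Q"}


def has_exception(linking):
    """True if any logical position is realized somewhere other than its default."""
    return any(DEFAULT_GLR[logical] != syntactic for logical, syntactic in linking.items())


def is_divergence(source_linking, target_linking):
    """Divergence as stated above: an exception in one language but not the other."""
    return has_exception(source_linking) != has_exception(target_linking)


# English "like" follows the defaults; Spanish "gustar" swaps W' and Z'.
like = dict(DEFAULT_GLR)
gustar = {**DEFAULT_GLR, "W'": "Z", "Z'": "W"}

print(is_divergence(like, gustar))   # exception on one side only
print(is_divergence(like, like))     # no exception on either side
```

Under this literal reading, two languages that both carry an exception are not counted as divergent, which is one reason the definition deserves a closer look.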
T1: Thematic divergence
• The repositioning of arguments w.r.t. a head.
• GLR: W’ ↔ Z and Z’ ↔ W
• Example: I like Mary ↔ María me gusta
:INT and :EXT
General Solution
T2: Promotional Divergence
• Promoting a logical modifier into a main verb position (or
vice versa)
• GLR: X’ ↔ Z and Q’ ↔ X
• Ex: John usually goes home ↔ Juan suele ir a casa
:PROMOTE
General Solution
T3: Demotional Divergence
• Demoting a logical head into an internal
argument (adjunct?) position (or vice
versa).
• GLR: X’ ↔ Q and Z’ ↔ X
• Ex: I like to eat ↔ Ich esse gern
:DEMOTE
General Solution
T4: Structural divergence
• It does not alter the positions used in GLR mapping
• But it changes the nature of the relation between
different positions (i.e., the “↔” correspondence)
• Ex: John entered the house ↔ Juan entró en la casa
* marker
Marker forces logical constituents to be realized
compositionally at different levels
General solution
T5: Conflational Divergence
• The suppression of a CLCS constituent (or
the inverse of the process)
• GLR: the ↔ correspondence of step (3) or (4)
(Z’ ↔ Z or Q’ ↔ Q) is changed.
Example
I stabbed John ↔ Yo le di puñaladas a Juan
:CONFLATED
General solution
T6: Categorical divergence
• CAT(Φ) is different from CSR(T(Φ’)).
• Ex: I am hungry ↔ Ich habe Hunger
:CAT
General solution
T7: Lexical divergence
• As a side effect of other divergences.
• Ex: John broke into the room ↔ Juan forzó la entrada al cuarto
Summary of seven types
• Repositioning (GLR mappings): thematic,
promotional, demotional divergences
• Changing the ↔ correspondence: structural,
conflational divergences
• Category: categorical divergence
• ??: Lexical divergence
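The repositioning group in this summary can be sketched as pattern matching on GLR exceptions, using the GLR lines given for T1 to T3. The table `REPOSITIONING`, the function `classify`, and the per-word linkings are assumptions for illustration; the other four types are not modeled here.

```python
# GLR exception patterns for the three repositioning divergences:
# logical position -> syntactic position where it is actually realized.
REPOSITIONING = {
    "thematic":    {"W'": "Z", "Z'": "W"},
    "promotional": {"X'": "Z", "Q'": "X"},
    "demotional":  {"X'": "Q", "Z'": "X"},
}


def classify(linking):
    """Return the repositioning types whose pattern is contained in `linking`."""
    return [
        name
        for name, pattern in REPOSITIONING.items()
        if all(linking.get(logical) == syntactic for logical, syntactic in pattern.items())
    ]


# Hypothetical exception tables for the trigger words in the examples.
print(classify({"W'": "Z", "Z'": "W"}))   # gustar
print(classify({"X'": "Z", "Q'": "X"}))   # soler
print(classify({"X'": "Q", "Z'": "X"}))   # gern
```

A linking with no exceptions matches none of the patterns, so `classify({})` returns an empty list.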
Discussion
• Limits on Repositioning Divergences
• Promotional vs. Demotional Divergences
• Lexical Selection: Full Coverage
Constraint
• Interacting Divergence Types
Limits on Repositioning
divergences
• Three types to cover all repositioning
divergences:
– Thematic: W’ ↔ Z, Z’ ↔ W
– Promotional: X’ ↔ Z, Q’ ↔ X
– Demotional: X’ ↔ Q, Z’ ↔ X
• (X, W, Z, Q) ↔ (X’, W’, Z’, Q’)
– W has a special status: 4^4 = 256 → 3^3 = 27
– a CLCS must contain exactly one head:
3^3 = 27 → 12
Limits on Repositioning
Divergences (cont)
• Z can never be associated with Q’, and Q
can never be associated with Z’: 12 → 5
• Modifying relation cannot be reversed:
5 → 4 (Q’ ↔ X, X’ ↔ Q, Z’ ↔ Z)
• Argument relation cannot be reversed: 4 →
3 (Z’ ↔ X, X’ ↔ Z, Q’ ↔ Q)
• Canonical positions: 3 → 2
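The counts in this argument can be checked by brute-force enumeration. The sketch below is one reading of the steps, stated as assumptions: each syntactic position is assigned one logical position, "W has a special status" is modeled by dropping W and W’ from the enumeration, and the two "cannot be reversed" steps each remove the single configuration named in parentheses. All variable names are illustrative.

```python
from itertools import product


def assignments(syntactic, logical):
    """All ways to assign one logical position to each syntactic position."""
    return [dict(zip(syntactic, combo)) for combo in product(logical, repeat=len(syntactic))]


everything = assignments(("X", "W", "Z", "Q"), ("X'", "W'", "Z'", "Q'"))
without_w = assignments(("X", "Z", "Q"), ("X'", "Z'", "Q'"))

# A CLCS must contain exactly one head: X' is used exactly once.
one_head = [m for m in without_w if list(m.values()).count("X'") == 1]

# Z is never linked to Q', and Q is never linked to Z'.
no_zq_mix = [m for m in one_head if m["Z"] != "Q'" and m["Q"] != "Z'"]

# The excluded configurations, keyed by syntactic position.
reversed_modifier = {"X": "Q'", "Q": "X'", "Z": "Z'"}   # Q' <-> X, X' <-> Q, Z' <-> Z
reversed_argument = {"X": "Z'", "Z": "X'", "Q": "Q'"}   # Z' <-> X, X' <-> Z, Q' <-> Q
canonical = {"X": "X'", "Z": "Z'", "Q": "Q'"}           # the default GLR links

step4 = [m for m in no_zq_mix if m != reversed_modifier]
step3 = [m for m in step4 if m != reversed_argument]
remaining = [m for m in step3 if m != canonical]

counts = [len(s) for s in (everything, without_w, one_head, no_zq_mix, step4, step3, remaining)]
print(counts)
print(remaining)
```

Under these assumptions the two surviving assignments place X’ in Z with Q’ in X, and X’ in Q with Z’ in X, which are the promotional and demotional patterns listed above.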
Promotional vs. Demotional
Divergences
• Promotion is triggered by a main verb
(e.g., soler in soler-usually)
• Demotion is triggered by an adverb (e.g.,
gern in like-gern)
Interacting Divergence Types
• Promotional and thematic divergence:
S: Leer libros le suele gustar a Juan
‘reading books (him) tends to please (to) John’
E: John usually likes reading books
Remaining questions
Remaining questions: Interlingua
• How to build RLCS?
– What are logical head, subject, arguments and
modifiers? Ex: like ↔ likingly
– How to represent a verb: stab → CAUSE GOPoss
KNIFE-WOUND
• How are RLCSs combined to form CLCSs?
– Unification = substitution?
• Are CLCSs really sufficient to handle all the
languages?
Remaining issues: divergences
• Are the seven types really sufficient to
cover all the divergences?
– Is the “proof” for limits on repositioning
divergences convincing?
– “Translation divergences occur when there is
an exception to GLR/CSR in one language,
but not the other”: what if there are exceptions
in both languages?
– Can a dependent of X become a dependent
of Y?
Remaining issues: MT
• How to build a real MT system with this
approach?