Tree Bimorphisms and Their
Relevance in the Theory of
Translations
Cătălin-Ionuţ Tîrnăucă
GRLMC, Rovira i Virgili University
[email protected]
CLG 08, Madrid, 28/06/2008
Outline
Motivation
 Tree transformations
 Ways to define tree transformations
 What is a tree bimorphism?
 What can we do with a bimorphism?
 Conclusions
 References

CLG 08, Madrid, 28/06/2008: C. Tîrnăucă, Tree bimorphisms and their relevance in translations
Motivation: why using trees in NLP?




Strings and finite-state machines are extremely useful
in speech recognition, text recognition, etc.
NOT suitable for translations between natural
languages:
 don’t capture syntax-sensitive transformations;
 don’t execute certain reorderings of parts of
sentences.
Hence, researchers from computational linguistic
community have shown an increasing interest in more
powerful formalisms, ones that can model trees and
tree transformations.
The new field of syntax-based machine translation
was established: (Knight and Graehl 2005).
CLG 08, Madrid, 28/06/2008: C. Tîrnăucă, Tree bimorphisms and their relevance in translations
Motivation: formal properties required


Accuracy is an important issue in machine translation,
so the whole theory should rely on a solid
mathematical background (formal languages).
A perfect model: (Knight 2008), should have at least:
 expressiveness (reordering parts of sentences, i.e.,
local rotation),
 inclusiveness (generalizing the finite state
machines),
 modularity (closure under composition, i.e.,
breaking the initial problem into smaller pieces
easier to solve), and
 teachability (efficient training).
CLG 08, Madrid, 28/06/2008: C. Tîrnăucă, Tree bimorphisms and their relevance in translations
Motivation: defining tree transformations



Many of the formalisms proposed until now fail in at
least one of the above criteria.
We mention the weak and strong parts of some of
them, as well as the relation between them.
We will see (something about and types of):





tree transducers (formal language theory),
tree homomorphisms (formal language theory),
synchronous grammars (linguistics), and
TREE BIMORPHISMS (formal language theory).
We speak more about modularity: think of what
success finite-state machines had because of the
noisy-channel concept!
CLG 08, Madrid, 28/06/2008: C. Tîrnăucă, Tree bimorphisms and their relevance in translations
Tree transformations: first trees





Ranked alphabet :
={S/2, NP/1, NP/3
symbols of a given rank,
N/0,VP/2,V/1}
i.e., number of branches (if
not unique you can
X=Spanish alphabet
rename); represent parts
Yield = Vamos a la playa
of sentences (NP, VP)
Leaf alphabet X: usual
S
alphabets of “words”
called variables (Spanish,
VP
NP
Russian)
T(X): set of all trees
V
NP
N
Tree language: subset of
T(X)
Yield: concatenated
Vamos
la
a
playa
sequence of leaf variables
read from left to right
CLG 08, Madrid, 28/06/2008: C. Tîrnăucă, Tree bimorphisms and their relevance in translations
Tree transformations and translations


Tree transformation  : a collection of pairs of parse
(syntax) trees of natural language sentences.
Its translation ( ): a set of pairs of words from the
two natural languages considered: just take the yield
of such a pair of trees.
VP
NP
D
The
N
V
castle burns
S
t  T (Y)
s  T (X)
 transforms s into t, i.e., (s,t)  
yd(s) is translated into yd(t)
VP
NP
V
N
Arde
castelul
X= English words
Y= Romanian words
={S/2,NP/2,VP/1,NP/1,D/1, N/1}
={S/2,NP/2,VP/1,NP/1, N/1}
T (X)= English parse trees

Output Romanian
Input English
S
T (Y)=Romanian parse trees
CLG 08, Madrid, 28/06/2008: C. Tîrnăucă, Tree bimorphisms and their relevance in translations
Define tree transformations: tree transducers
Input tree
Output tree
Tree
Transducer


A tree transducer uses a set of states to encode information, and
given an input tree over the input ranked alphabet, computes a (set
of) output tree(s) over the output ranked alphabet in two ways:
 Top-down: starts at root, finishes at leaves;
 Bottom-up: starts at leaves, finishes at root.
Many types introduced (in general, modularity doesn’t hold or
cannot be proved), at least two look very promising:
 Extended top-down tree transducers: (Knight and Graehl 2005)
 Multi bottom-up tree transducers: (Maletti, Engelfriet and Lilin
2008), (Maletti 2007)
CLG 08, Madrid, 28/06/2008: C. Tîrnăucă, Tree bimorphisms and their relevance in translations
Define tree transformations: tree
homomorphisms (I)

A tree homomorphism :T(X)T(Y) is:


a mapping which, based on certain rules defined for
every input symbol, transforms recursively an input
tree into a (totally or not) different output tree, or
A top-down tree transducer with one state.

Defines a simple tree transformation
={(s,(s))s T(X)}; usually one just
identifies  with  treating  as a relation.
 We need auxiliary variables  to indicate an
occurrence of a subtree in a tree during the
processing.
CLG 08, Madrid, 28/06/2008: C. Tîrnăucă, Tree bimorphisms and their relevance in translations
Define tree transformations: tree
homomorphisms (II)

:T(X)T(Y) is defined by the rules:
 (x)= X(x): each x in the input replaced by a tree X(x) from output;
 (c)= 0(c): each constant c in the input replaced by a tree 0(x);
 (f(t1,…,tm))= m(f)(1 (t1),…, m (tm)): each root f in the input
tree replaced by a tree m(f) from the output, where auxiliary variables

appear as leaf symbols;
X={in, blue}
={NP/3,ADJ/1,D/0}
T (X)= English parse trees
NP
in
1
D
ADJ
2
3
blue
Input tree s in T(X)
Y= {the, red, pen, out, of}
3(NP)=VP( 2, 3, 1, of)`
1(ADJ)=N(big,  1, pen)
0(D)=D(the)
X(in)=out
X(blue)=red
D VP
3(NP)=
X(in)=out
3 1 of
X2the
(bleu)=red
N
1(ADJ)=
big 1
pen
={VP/4,N/3,D/1}
T (Y)=English parse trees
VP
D
N
out
of
the big red pen
Output tree (t) in T(Y)
CLG 08, Madrid, 28/06/2008: C. Tîrnăucă, Tree bimorphisms and their relevance in translations
Define tree transformations: tree
homomorphisms (III)

Several types of tree homomorphisms exist depending
on the restrictions imposed on each tree m(f) and
implicitly on auxiliary variables:






Linear: in each m(f) no copy of a subtree is allowed (no i
appears twice or more);
Non-deleting: in each m(f) no subtree information is lost
(each i appears at least once);
Strict: no m(f) can be reduced to an auxiliary variable i;
Quasi-alphabetic: linear, non-deleting, strict, each constant is
mapped to a constant or a tree of height 1 (all leaves are
symbols from the output leaf alphabet), and each m(f) is a
tree of height 1 in which the subtrees are reordered and
symbols from the output leaf alphabet may appear as leaves.
H: the class of all tree homomorphisms.
lH, nH, sH, qH: the corresponding restricted class.
CLG 08, Madrid, 28/06/2008: C. Tîrnăucă, Tree bimorphisms and their relevance in translations
Define tree transformations:
synchronous grammars (I)



Use the idea of synchronous rewriting: two formal grammars work
in parallel, productions being linked by some relation and applied
synchronously.
Pairs of recursively related words are generated simultaneously.
Syntax-directed translators, first used in compilers, use the power
of context-free rewriting and can be viewed as a three-step
process one of which is a tree transformation:
1. For a given input sentence u, construct a parse tree for u.
2. Transform the parse tree into a tree in the output grammar.
3. Take the yield of the output tree as a translation for u.
S VP1 P2 you VP3 ; P2 VP3 VP1
P  before ; antes de
VP
VP  speak ; hablar
VP  think ; pensar
think
S
S
P
before
you
VP
speak
,
P
VP
antes de hablar
VP
pensar
CLG 08, Madrid, 28/06/2008: C. Tîrnăucă, Tree bimorphisms and their relevance in translations
Define tree transformations:
synchronous grammars (III)

Many types of syntax-directed translations:
 Syntax Directed Translation Schemata: (Aho and Ullman
1972),
 Synchronous CFGs: (Satta and Peserico 2005),
 Inversion Transduction Grammars: (Wu 1997),
 Tree-to-string models: (Yamada and Knight 2001),
 Hierarchical Phrase-Based Models: (Chiang 2007), etc

Beyond the generative capacity of
CF rewriting (multi-level rules):
 Synchronous Tree Substitution
Grammars: (Shieber 2004).
Involve discontinuous constituents:
 Synchronous TAGs: (Shieber and
Schabes 1990),
 Generalized Multitext Grammars:
(Melamed et all 2004).

CLG 08, Madrid, 28/06/2008: C. Tîrnăucă, Tree bimorphisms and their relevance in translations
Define tree transformations:
synchronous grammars (III)

Synchronous grammars derivations can be viewed as
pairs of trees, just as usual derivations in formal
grammars can be viewed as trees.
 From the way of defining them, we can see that they:
 easily capture syntax-sensitive transformations,
 execute local rotations, and
 are in general trainable.
 Unfortunately no modularity results were known or
could be proved until recently: (Shieber 2004).
 What helped us improve the modularity? Answer:
TREE BIMORPHISMS.
 Why? Don’t be impatient and don’ fell asleep. It follows
right now. This is the best part and not too much
mathematics.
CLG 08, Madrid, 28/06/2008: C. Tîrnăucă, Tree bimorphisms and their relevance in translations
What is a tree bimorphism? (I)

A mathematical, nice alternative way to define
tree transformations.
 Excellent tool for proving mathematical
properties of classes of tree transformations.
 Consists of two tree homomorphisms defined
on the same (regular) tree language, i.e.,
domains of tree transducers that computes a
partial identity function.
CLG 08, Madrid, 28/06/2008: C. Tîrnăucă, Tree bimorphisms and their relevance in translations
What is a tree bimorphism? (II)
t
T(X) 
RT(Z)


Input language
(t)
yd((t))
Common abstract
language: Interlingua
transforms into
is translated into
(t) T(Y)
yd((t))
Output language



B(H1,Rec,H2): a class of tree bimorphisms.
B(H1,Rec,H2): the class of tree transformations defined by B.
B=(,R,): a tree bimorphism from the class B, with H1, RRec, H2.
 B(lnH,Rec,qH): the class of all tree bimorphisms in which the first tree
homomorphism component is linear and non-deleting, the second is a
quasi-alphabetic tree homomorphism.
CLG 08, Madrid, 28/06/2008: C. Tîrnăucă, Tree bimorphisms and their relevance in translations
What can we do with a bimorphism?



Concept very similar with synchronous grammars:
from derivation trees to derived trees.
This way, we can encode the derivations into the
common tree language, and by using the tree
homomorphisms to get the derivations in input/output
grammars.
In 2004, Shieber was the first one who linked
synchronous grammars and tree bimorphisms; he was
trying to improve modularity and other mathematical
properties of classes of tree transformations defined
by synchronous grammars.
CLG 08, Madrid, 28/06/2008: C. Tîrnăucă, Tree bimorphisms and their relevance in translations
Results obtained

The tree bimorphisms B(lnH,Rec,lnH):



The tree bimorphisms B(lnsH,Rec,lnsH):



not closed under composition: (Maletti 2007);
define the same class of tree transformations as STSGs:
(Shieber 2004).
not closed under composition: (Arnold and Dauchet 1982);
define the same class of tree transformations as a special
STAG: (Shieber 2006).
The (quasi-alphabetic) tree bimorphisms
B(qH,Rec,qH):



closed under composition: (Steinby and Tîrnăucă 2007);
define the same translations as STDSs: (Steinby and
Tîrnăucă 2007), SCFGs and ITGs: (Tîrnăucă 2007);
the tree transformations are encoded in their bimorphism
counterpart.
CLG 08, Madrid, 28/06/2008: C. Tîrnăucă, Tree bimorphisms and their relevance in translations
Conclusions; future work




Properties of/ modeling tree transformations: a
new challenge in machine translation.
Modularity, or closure under composition, a big
issue.
There are several ways to define tree
transformations. One connects all of them and
helps us to prove modularity: tree bimorphisms.
We saw some results obtained recently. What
about other useful properties of synchronous
grammars, other synchronous grammars and
other tree bimorphism? Can we obtain more
improvements?
CLG 08, Madrid, 28/06/2008: C. Tîrnăucă, Tree bimorphisms and their relevance in translations
References (I)








Aho, Alfred and Jeffrey Ullman (1972). The theory of parsing, translation, and compiling,
Volume I: Parsing. New Jersey: Prentice Hall Professional Technical Reference.
Arnold, André and Max Dauchet (1982). “Morphismes et bimorphismes d’arbres”, Theoretical
Computer Science 20: 33–93.
Chiang, David (2007). “Hierarchical phrase-based translation”, Computational Linguistics
33(2):201–228.
Engelfriet, Joost, Eric Lilin and Andreas Maletti (2008). “Extended multi bottom-up tree
transducers”. Manuscript. URL: http://wwwtcs.inf.tu-dresden.de/~maletti/pub/engmal08.pdf.
Knight, Kevin (2008). “Capturing practical natural language transformations”. Manuscript. URL:
http://www.isi.edu/natural-language/mt/capturing.pdf.
Knight, Kevin and Jonathan Graehl (2005). “An overview of probabilistic tree transducers for
natural language processing”. In: Alexander F. Gelbukh, ed., Computational Linguistics and
Intelligent Text Processing 6th International Conference, CICLing 2005, Proceedings, vol.
3406 of LNCS. Berlin: Springer-Verlag, pp. 1–24.
Maletti, Andreas (2007). “Compositions of extended top-down tree transducers”. In: Remco
Loos, Szilárd Z. Fazekas and Carlos Martín Vide, eds., Proceedings of the 1st International
Conference on Language and Automata Theory and Applications, LATA 2007, vol. 35/07 of
GRLMC Reports. Tarragona: Universitat Rovira i Virgili, pp. 379–390.
Melamed, I. Dan (2003). “Multitext grammars and synchronous parsers”. In: HLT-NAACL 2003,
Proceedings of the 2003 Conference of the North American Chapter of the Association for
Computational Linguistics on Human Language Technology - Volume 1. Morristown:
Association for Computational Linguistics, pp. 79–86.
CLG 08, Madrid, 28/06/2008: C. Tîrnăucă, Tree bimorphisms and their relevance in translations
References (II)







Satta, Girogio and Enoch Peserico (2005). “Some computational complexity results for
synchronous context-free grammars”. In: HLT/EMNLP 2005, Human Language Technology
Conference and Conference on Empirical Methods in Natural Language Processing,
Proceedings of the Conference. Morristown: Association for Computational Linguistics, pp.
803–810.
Shieber, Stuart M. (2004). “Synchronous grammars as tree transducers”. In: Proceedings of
the TAG+7: Seventh International Workshop on Tree Adjoining Grammar and Related
Formalisms, 2004, pp. 88–95.
Shieber, Stuart M. (2006). “Unifying synchronous tree-adjoining grammars and tree
transducers via bimorphisms”. In: EACL 2006, 11st Conference of the European Chapter of
the Association for Computational Linguistics, Proceedings of the Conference. Association for
Computer Linguistics, pp. 377-384.
Steinby, Magnus and Cătălin I. Tîrnăucă (2007). “Syntax-directed translations and quasialphabetic tree bimorphisms”. In Jan Holub and Jan Zdárek, eds,, Implementation and
Application of Automata, 12th International Conference, CIAA 2007. Berlin: Springer-Verlag,
pp. 265–276.
Tîrnăucă, Cătălin I. (2007). “Synchronous context-free grammars by means of tree
bimorphisms”. In: Gemma Bel Enguix and Maria Dolores Jiménez Lopez, eds., Proceedings of
the 1st International Workshop on Non-Classical Formal Languages in Linguistics (ForLing
2007), vol. 36/07 of GRLMC Reports. Tarragona: Universitat Rovira i Virgili, pp. 97–107.
Yamada, Kenji and Kevin Knight (2001). “A syntax-based statistical translation model”. In:
Proceedings of the 39th Annual Meeting on Association for Computational Linguistics.
Morristown: Asociation of Computational Linguistics, pp. 523-530.
Wu, Dekai (1997). “Stochastic inversion transduction grammars and bilingual parsing of
parallel corpora”, Computational Linguistics 23(3): 377–403.
CLG 08, Madrid, 28/06/2008: C. Tîrnăucă, Tree bimorphisms and their relevance in translations
That’s all folks!
Graçias!
Mila esker!
Moltes gràcies!
謝謝你。
Grazie!
Kiitos
CLG 08, Madrid, 28/06/2008: C. Tîrnăucă, Tree bimorphisms and their relevance in translations
Descargar

Document