OT12 online seminar: Translation Memory Tools
Paul Filkin, Director of Client Communities,
SDL Language Technologies
The Agenda… or things we’ll cover
• how Trados was developed and established itself as industry leader
• how translation memory tools work
• what their benefits for open (and professional) translators are
• what the particular distinguishing features of SDL Trados Studio are
• what the future is for translation memory software
SDL Trados… a brief history
Translation Production
Content is either …
• Translated by a professional translator
• Or translated by the “occasional” translator
– Non-linguist, subject matter specialist (reviewer), crowd-sourced, …
• Or left untranslated
– Not relevant, too costly, too much overhead involved, …
This presentation focuses on content produced by professional translators
Productivity Environments
• Today, content workers utilize specialized productivity environment(s)
Content Worker                Application Class                    Prominent Example
Graphic Designers             Graphic tools                        Adobe Photoshop
Audio Producers, Musicians    DAW (Digital Audio Workstation)      Steinberg Cubase
Architects                    3D modeling program                  Google SketchUp
Engineers                     CAD (Computer Aided Design)          Autodesk AutoCAD
Game Developer                Game Engine                          Epic Games Unreal Engine
Translators                   CAT (Computer Aided Translation)     SDL TRADOS TWB / SDL Studio
All mentioned trademarks are property of their respective owners.
Translation Editor is at the core of any CAT
Professional Translation can be done …
• In principle, in any authoring editor (desktop/browser)
– However, productivity is limited (in the range of 800 to 1500 words per day) and maintaining consistency and accuracy takes considerable effort
• Using Microsoft Word + Plug-ins
– Plug-in to translation productivity tool
– Structured content is hard to handle
• Using a Dedicated Translation Editor (CAT or TEnT)
– Depending on various factors: productivity boost to the range of 2000 to 5000 words per day
– Well established market for professionals
What is CAT Technology?
• CAT: Computer-Aided Translation
– A generic term used to describe software which assists users during the
localization/translation process
– Sometimes referred to as TEnT : Translation Environment Tool
• Our CAT technology is an integrated toolset, offering:
– Translation Memory (TM)
– Termbase
– Editing environments
– Project Management functionality
– Software Localization
– OpenExchange
Public ProZ poll (August 24): replies from 1670 translators, http://www.proz.com/polls/5474
What is CAT Technology?
• CAT technology incorporates the concept of translation memory and
termbase
• Translation memory: a database consisting of translation units
– Translation unit: source and translated sentence or paragraph
– During translation, the technology searches for exact or similar matches to the current source
segment for translation
– Matches found can be reused or edited
• Termbase: multilingual database consisting of term entries
– Term entries: terms, synonyms, acronyms, etc.
– Contextual data: definition, part of speech, gender, etc.
• Translators work with a translation memory and termbase to reuse previous
translations and ensure consistency of terminology during translation
Translation Memory Overview
• A translation memory is a searchable database containing source and translated sentences or paragraphs
– A segment or phrase needs to be translated only once, because each translation is stored in the database
– During a translation project, when the source segment re-occurs, the translation memory finds the stored translation (by searching the database) and inserts it into the new document
– The translator may accept the previous translation or edit it, if necessary
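The lookup described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not how SDL Trados Studio is implemented: the class name `TranslationMemory`, the 0.75 threshold and the character-based similarity from the standard library's `difflib` are all assumptions made for the example.

```python
from difflib import SequenceMatcher


class TranslationMemory:
    """Toy in-memory TM (hypothetical): maps source segments to translations."""

    def __init__(self):
        self.units = {}  # source segment -> target segment

    def store(self, source, target):
        # A translated segment is stored once and can be reused afterwards.
        self.units[source] = target

    def lookup(self, source, threshold=0.75):
        """Return (score, stored_source, target) for the best match, or None.

        The score is difflib's similarity ratio: 1.0 for an identical
        segment, lower for similar ("fuzzy") segments. The threshold is
        an assumed value for this sketch.
        """
        best = None
        for stored_source, target in self.units.items():
            score = SequenceMatcher(None, source, stored_source).ratio()
            if best is None or score > best[0]:
                best = (score, stored_source, target)
        if best is not None and best[0] >= threshold:
            return best
        return None


tm = TranslationMemory()
tm.store("Press the power button.", "Drücken Sie die Ein/Aus-Taste.")

exact = tm.lookup("Press the power button.")     # identical source: reuse as-is
fuzzy = tm.lookup("Press the reset button.")     # similar source: reuse and edit
no_match = tm.lookup("Close the lid.")           # too different: translate anew
```

The translator would accept `exact` unchanged, adapt the proposal in `fuzzy`, and translate the last segment from scratch.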
Terminology Management Overview
• A termbase is a searchable database
which contains a list of multilingual terms
and contextual term data
– Term data gives details about the origin and use of
the term, such as definition, gender, context, etc.
– The termbase can be used in monolingual form
during source content creation
• Ensure consistency of terminology in source
documentation
• Facilitate translation for the global marketplace
– The termbase can be used in bilingual form in
conjunction with translation memory technology to
increase translation accuracy
• Ensure consistency of terminology in translated
documentation
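As a rough illustration of bilingual term recognition, the sketch below checks a source segment against a small termbase. The `TermEntry` fields, the function name `recognise_terms` and the sample entries are hypothetical, chosen only to mirror the term data listed above.

```python
import re
from dataclasses import dataclass


@dataclass
class TermEntry:
    """One simplified termbase entry (hypothetical structure)."""
    source: str
    target: str
    definition: str = ""


def recognise_terms(segment, termbase):
    """Return the entries whose source term occurs as a whole word in the segment."""
    hits = []
    for entry in termbase:
        pattern = r"\b" + re.escape(entry.source) + r"\b"
        if re.search(pattern, segment, flags=re.IGNORECASE):
            hits.append(entry)
    return hits


# Illustrative entries only.
termbase = [
    TermEntry("translation memory", "Translation Memory", "database of translation units"),
    TermEntry("termbase", "Terminologiedatenbank"),
    TermEntry("segment", "Segment"),
]

hits = recognise_terms(
    "The termbase is used together with the translation memory.", termbase
)
found = [entry.source for entry in hits]
```

Offering the approved target term for every hit is what keeps terminology consistent across a translated document.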
Key Productivity Accelerators
Topic Level (document, page, fragment, chunk, …)
• Exclusion from translation through markup
• “Perfect Matching” utilizing bi-lingual representations
Segment Level (sentence, header, footnote, table cell, …)
• Translation Memory
• Automated Translation
• Auto-propagation
Subsegment Level (phrase, word, …)
• Auto-suggest (dictionary based auto-completions)
• Placeables, Terms
• Concordance
Impact on effective handling of update translations
Impact on effective handling of new translations
Impact on effective handling of document internal redundancies
Impact on consistency & quality
Topic (Document, …) Level
“Don’t translate if it hasn’t changed”
(but show it to provide context for the text that has actually changed/added)
Markup exclusions
• Use ITS / other convention to lock text
• Custom arrangements between CMS + Translation System
Perfect Matching
• Compare text with predecessor translation project and lock what hasn’t changed
• But, high overhead in managing corresponding projects
Significant productivity gains dependent on update frequency
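The idea behind Perfect Matching (compare with the predecessor project and lock what has not changed) can be sketched as follows. This is an assumption-laden illustration: the function name `perfect_match` is hypothetical, and aligning segments with `difflib.SequenceMatcher` is a stand-in for whatever comparison a real tool performs.

```python
from difflib import SequenceMatcher


def perfect_match(previous, new_sources):
    """Carry over and lock translations of segments that are unchanged.

    previous:    list of (source, target) pairs from the predecessor
                 project, in document order.
    new_sources: list of source segments of the updated document.
    Segments are aligned in order, so an unchanged segment is only
    locked where it lines up with its counterpart in the old version.
    """
    prev_sources = [source for source, _ in previous]
    rows = [{"source": s, "target": None, "locked": False} for s in new_sources]
    matcher = SequenceMatcher(None, prev_sources, new_sources, autojunk=False)
    for block in matcher.get_matching_blocks():
        for offset in range(block.size):
            row = rows[block.b + offset]
            row["target"] = previous[block.a + offset][1]
            row["locked"] = True
    return rows


previous = [
    ("Welcome.", "Willkommen."),
    ("Insert the battery.", "Legen Sie den Akku ein."),
    ("Close the cover.", "Schließen Sie die Abdeckung."),
]
new_sources = ["Welcome.", "Insert the new battery.", "Close the cover."]

rows = perfect_match(previous, new_sources)
locked_flags = [row["locked"] for row in rows]
```

Only the changed middle segment stays open for translation; the locked segments remain visible to give it context.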
Segment Level : TM
“Don’t re-translate if you can reuse an (approved) existing translation”
(but adapt as you need)
• Increasingly sophisticated match type differentiation
– 100% matches, fuzzy matches, Context Matches (CM), In-Context Exact (ICE) matches
• Cascaded TMs, Ranking of TMs
• Significant productivity gains dependent on
– Availability of relevant TMs
– Similar content produced again and again
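Match type differentiation and cascaded TMs can be sketched together. The sketch below is a simplification under stated assumptions: a Context Match is modelled as an identical segment whose preceding segment is also identical, the 0.75 fuzzy threshold is arbitrary, and the names `classify`, `cascaded_lookup` and `RANK` are hypothetical.

```python
from difflib import SequenceMatcher

# Assumed ranking for this sketch: context match beats 100%, which beats fuzzy.
RANK = {"CM": 3, "100%": 2, "fuzzy": 1}


def classify(segment, prev_segment, unit, fuzzy_threshold=0.75):
    """Classify one TM unit against the current segment.

    unit is a dict with 'source', 'target' and 'prev_source' (the segment
    that preceded the source when the unit was stored).
    """
    if unit["source"] == segment:
        if unit.get("prev_source") == prev_segment:
            return "CM", 1.0
        return "100%", 1.0
    score = SequenceMatcher(None, segment, unit["source"]).ratio()
    if score >= fuzzy_threshold:
        return "fuzzy", score
    return None, score


def cascaded_lookup(segment, prev_segment, tms):
    """Search several TMs given in priority order.

    Returns (kind, score, target, tm_index) of the best match, or None.
    Better match types win; ties go to the higher-priority (earlier) TM.
    """
    best_key, best = None, None
    for tm_index, tm in enumerate(tms):
        for unit in tm:
            kind, score = classify(segment, prev_segment, unit)
            if kind is None:
                continue
            key = (RANK[kind], score, -tm_index)
            if best_key is None or key > best_key:
                best_key, best = key, (kind, score, unit["target"], tm_index)
    return best


project_tm = [
    {"source": "Click OK.", "target": "Klicken Sie auf OK.", "prev_source": "Select a file."},
]
master_tm = [
    {"source": "Click OK.", "target": "Auf OK klicken.", "prev_source": "Enter your name."},
]

in_context = cascaded_lookup("Click OK.", "Enter your name.", [project_tm, master_tm])
plain = cascaded_lookup("Click OK.", "Save the document.", [project_tm, master_tm])
```

In the first lookup the context match from the lower-priority TM outranks the plain 100% match; in the second, both TMs offer a 100% match and the TM order decides.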
Segment Level : Automated Translations
“Adapt an automated translation proposal”
(instead of translating from scratch)
• Increasingly accepted by professional translators
– Especially using Statistical Machine Translation (SMT)
• Significant Productivity gains depending on
– SMT engine trained with sufficient, relevant (in-domain), high quality
(professional translator output) data
– Translators are able to dynamically select “in-domain” trained engine [e.g.
“Touchpoints”]
– Trust scores
Segment Level : Auto-propagation
“Auto-propagate translations for identical source segments”
(and ripple through any changes when you change your translation)
• Productivity gain if text has internal repetitions
– Simplifies updating identical segments throughout the content
• Requires parameters to control behavior
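A minimal sketch of auto-propagation, including one example of a parameter that controls its behavior. The function name `auto_propagate`, the segment dictionaries and the `overwrite_confirmed` flag are assumptions for illustration, not the options of any particular tool.

```python
def auto_propagate(segments, index, new_target, overwrite_confirmed=False):
    """Set the translation of segments[index] and copy it to every other
    segment with an identical source.

    overwrite_confirmed is an assumed control parameter: when False,
    segments the translator has already confirmed are left untouched.
    Returns the indices of the segments that were updated by propagation.
    """
    segments[index]["target"] = new_target
    segments[index]["confirmed"] = True
    source = segments[index]["source"]
    updated = []
    for i, seg in enumerate(segments):
        if i == index or seg["source"] != source:
            continue
        if seg["confirmed"] and not overwrite_confirmed:
            continue
        seg["target"] = new_target
        updated.append(i)
    return updated


doc = [
    {"source": "Save your work.", "target": "", "confirmed": False},
    {"source": "Open the menu.", "target": "", "confirmed": False},
    {"source": "Save your work.", "target": "Speichern.", "confirmed": True},
    {"source": "Save your work.", "target": "", "confirmed": False},
]

# First pass: confirmed segments are protected.
first = auto_propagate(doc, 0, "Speichern Sie Ihre Arbeit.")
kept = doc[2]["target"]

# Changing the translation later ripples through, here including confirmed segments.
second = auto_propagate(doc, 0, "Sichern Sie Ihre Arbeit.", overwrite_confirmed=True)
```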
Subsegment Level : Auto-suggest
“While I type, provide a list of relevant candidates so that I can
quickly auto-complete this part of my translation”
• Productivity gain highly dependent on available data-sources and proposal
strategy
– Optimal configurations reduce keystrokes by 30% to 50%
– Avoidance of typos, impact on consistency
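The “while I type” behavior can be sketched as a prefix search over target-language candidates. The function name `suggest`, the minimum prefix length and the ordering (shortest completion first) are assumptions; in practice the candidates would come from the data sources mentioned above, such as TMs or termbases.

```python
def suggest(prefix, candidates, limit=5, min_chars=2):
    """Return up to `limit` candidates that complete the typed prefix.

    Matching ignores case; nothing is proposed until `min_chars`
    characters have been typed (both values are assumed defaults).
    """
    if len(prefix) < min_chars:
        return []
    typed = prefix.lower()
    hits = [
        c for c in candidates
        if c.lower().startswith(typed) and len(c) > len(prefix)
    ]
    # Shortest completion first, then alphabetical.
    return sorted(hits, key=lambda c: (len(c), c))[:limit]


# Illustrative target-language candidates.
candidates = ["traduction", "traducteur", "terminologie", "mémoire de traduction"]

proposals = suggest("tra", candidates)
too_short = suggest("t", candidates)

# Keystrokes saved if the translator accepts the first proposal.
saved = len(proposals[0]) - len("tra")
```

Accepting a proposal instead of typing the word out is where both the keystroke savings and the avoidance of typos come from.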
Subsegment Level : Placeables, terms
“While I type, make it easy for me to place tags, recognised
terms and other placeables so I can focus on the translatable
text.”
• Productivity gain highly dependent on available data-sources for terminology or
translator diligence, and the complexity of the tags
– Avoidance of typos, impact on consistency, robust target documents
Subsegment Level : Concordance
“Make it easy for me to search through Translation Memories,
in both source or target text and from wherever I am in the
document I’m translating”
• Biggest impact is in being able to find things you’ve translated before that are
similar, or the same, as the current text and make it easy to reuse
– Impacts the quality of the work you deliver
– Impacts the time it takes to find the right words for complicated texts
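A concordance search can be pictured as a substring search over the TM, in either the source or the target text. The sketch below assumes a TM held as a plain list of (source, target) pairs and a hypothetical function name `concordance`; real tools typically add fuzzy and word-based matching.

```python
def concordance(query, tm, search_in="source"):
    """Return the translation units whose source (or target) text
    contains the query, ignoring case.

    tm is a list of (source, target) pairs.
    """
    if search_in not in ("source", "target"):
        raise ValueError("search_in must be 'source' or 'target'")
    side = 0 if search_in == "source" else 1
    needle = query.lower()
    return [unit for unit in tm if needle in unit[side].lower()]


tm = [
    ("Press the power button.", "Appuyez sur le bouton d'alimentation."),
    ("Release the button slowly.", "Relâchez lentement le bouton."),
    ("Close the lid.", "Fermez le couvercle."),
]

by_source = concordance("button", tm)
by_target = concordance("couvercle", tm, search_in="target")
```

Seeing every earlier rendering of “button” side by side is what helps the translator pick wording consistent with previous work.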
Key technology advances…
• Whereas the key technology advances are in the area of subsegment reuse and statistical machine translation (SMT), the actual productivity gains for a Professional Translator relate to the ergonomics of how systems allow users to interact with, control and automate the various data sources:
– Access, creation, chaining, weighting and sharing of TMs
– Access to SMT pointing to specific engines
– Compilation of phrase dictionaries on the fly
What Happens When Teams Grow?
When teams of three or more work together, new factors must be
considered to work effectively and properly collaborate
[Diagram: Project Managers, Reviewers, Translators]
Typical Package-based Workflows
[Diagram: Project Manager, Translator, Reviewer; or Translator, Project Manager, Reviewer; ...x 5 languages...]
Typical Project Workflow
with SDL Studio GroupShare
1. Project Manager creates a project
– Performs analysis, pre-translation using SDL Trados Studio connected to a TM on TM Server
2. Project Manager publishes project
– Uses Publish command in Studio, selects server and location, and Studio takes care of the rest
– Contacts team via email, phone
Typical Project Workflow
with SDL Studio GroupShare
3. Team Accesses Project
– Use Studio 2011 to open project
– Check out files as required for translation, review, or signoff
• Studio only gets files as needed
• Project Server tracks file versions
– Studio and Project Server synchronize metadata
Looking forward…
• Current theme for CAT tools – reviewer productivity
– Inclusion of track changes and commenting mechanisms in translation editor
• Automation in the broader production chain
… and the Studio “Platform” which includes the OpenExchange
The SDL OpenExchange… current state of affairs
57 Apps on the OpenExchange
42 are completely free
29,804 downloads (August 2012)
7,141 app users (August 2012)
396 developers (August 2012)
Copyright © 2008-2012 SDL plc. All rights reserved. All company names, brand names, trademarks,
service marks, images and logos are the property of their respective owners.
This presentation and its content are SDL confidential unless otherwise specified, and may not be
copied, used or distributed except as authorised by SDL.