Introduction to MetaTexis

MetaTexis
Designed and developed by: Hermann Bruns
www.metatexis.com
http://groups.yahoo.com/group/MetaTexis/

Goals
I would like to show MetaTexis as a CAT tool characterized by the following three principal features:
● Simplicity
● Versatility
● Configurability
They are of particular importance for freelance translators who do not specialize exclusively in IT-related texts.

What is a CAT tool?
A Computer Aided Translation program creates a database (Translation Memory, TM) containing translation units, each of which combines a source segment (usually a sentence) with the corresponding target segment. In other words, a TM is a list of bilingual sentence pairs.
During translation, a CAT tool searches the database for similar segments (sentences or parts of sentences). If any are found, they are displayed so that the translator can select the best match and edit it if need be.
A CAT tool also provides the functionality of a dictionary or terminology database, which is searched during translation for possible matches with previously defined terms.
A CAT tool can function as a bilingual text database, a kind of linguistic corpus, or a contextual dictionary.

What can a CAT tool help you do?
● Avoid re-translating repeated fragments of text.
● Maintain consistent terminology in a large body of texts.
● Search for the context of a term or phrase.
Note: A CAT tool does not translate texts. It is not an automatic translator.

Possible applications and advantages
● Translation of long or multiple documents that require absolutely consistent terminology.
● Translation of highly repetitive documents, such as technical documentation or specifications.
● Translation of highly specialized texts with difficult terminology that is provided in the form of a bilingual text file.
● A CAT tool can be used as a “contextual dictionary”.
● Translation of texts provided in the form of tagged documents.
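The lookup described under "What is a CAT tool?" can be pictured with a small sketch. It is only an illustration: the sample TM, the function names and the use of Python's difflib as the similarity measure are all assumptions, and nothing here reflects how MetaTexis actually computes matches.

```python
from difflib import SequenceMatcher

# Hypothetical translation memory: source segment -> target segment.
SAMPLE_TM = {
    "Open the file.": "Öffnen Sie die Datei.",
    "Close the file.": "Schließen Sie die Datei.",
    "Save the document.": "Speichern Sie das Dokument.",
}


def similarity(a, b):
    """Return a rough similarity between two segments as a percentage."""
    return round(100 * SequenceMatcher(None, a.lower(), b.lower()).ratio())


def search_tm(segment, tm, threshold=70):
    """Return (percent, source, target) hits at or above the threshold, best first."""
    hits = [(similarity(segment, src), src, tgt) for src, tgt in tm.items()]
    hits = [hit for hit in hits if hit[0] >= threshold]
    return sorted(hits, reverse=True)


if __name__ == "__main__":
    for percent, src, tgt in search_tm("Open the files.", SAMPLE_TM):
        print(f"{percent}%  {src}  ->  {tgt}")
```

The translator would then pick the best hit and edit it. The threshold of 70 is an arbitrary value chosen for the sketch.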
Requirements & versions
• Operating system: Windows 98®, Windows ME®, Windows NT®, Windows 2000®, Windows XP®
• Microsoft Word 2000® (Service Release 1a), Microsoft Word XP®, Microsoft Word 2003®
MetaTexis versions:
MetaTexis Lite (basic functions)
MetaTexis Pro (advanced functions)
MetaTexis NET/Office (MS Office import, network)

SIMPLICITY

MetaTexis in MS Word
All the most commonly used commands are available in the tool bar, visible in the editor. They can also be triggered with shortcuts. All the commands are available in the menu.
Learning three shortcuts considerably speeds up the translation process: Alt+Down (go to the next segment), Alt+Shift+Enter (select a segment from the database) and Ctrl+Alt+T (display the main translation memory).
As formatting operations are already familiar to MS Word users, a user new to MetaTexis starts work in a well-known environment.
The strong/weak points of MS Word are also the strong/weak points of MetaTexis. For instance, if additional functionality is needed, a user-defined macro can be launched when a segment is opened/closed.

MetaTexis in MS Word
● All MetaTexis commands are available in a menu that is added to the regular editor menus.
● Commands are logically grouped to facilitate their use.
● All important shortcuts are shown in the menu as well.

BASICS
● Preparation of documents for translation
● The process of translation
  – Document navigation
  – Searching databases
  – Handling search results
● Final version
● Statistics and cost calculation

Preparation of documents for translation
Step 1: Defining the type of the document to be translated. [Mostly automatic]

Preparation of documents
Step 2: Defining the source and target languages
Obligatory, yet...
Preparation of documents
Step 3: Creation of a database (Translation Memory)
Important for the leverage effect

Preparation of documents
Step 4: Creation of a dictionary (Terminology DataBase)
Useful, yet not obligatory

Preparation of documents
Step 5: Defining the information about the translator
Not obligatory, but it may be useful to provide the data.

Preparation of documents
Step 6: What should be done once the document is prepared
The original document is left intact. MetaTexis (MT) works on a new document, named DocumentName [MetaTexis].doc

The process of translation
● How to move through a translated document?
● How to handle the results of automatic database searches?
● How to manually search translation memories and terminology databases?

The process of translation
All commands used during the process of translation can be executed by means of:
● tool bar buttons,
● menu commands,
● key combinations (shortcuts).
Tip: It is best to learn the shortcuts of the most common commands gradually and use them during translation, as they speed up the entire process considerably. Trados/Wordfast users can easily adjust the shortcuts according to their personal preferences and habits.

Document navigation
● Opening/closing translation units
● Searching for units to be revised for formal reasons
  – segment length
  – numbers
  – watch list items
● Displaying hidden text
● Activating dialog mode
Apart from the navigation commands, the menu contains an advanced Search for text command, which is very useful when you need to verify whether terminology is used consistently.

Document navigation
● Go to the next segment (Alt+Down). TM search will be carried out automatically.
● If only one TM search result is available, edit the translation and press Alt+Shift+Enter.
● If there are more TM search results, place the cursor in the translation box with the best match, edit the translation, and press Alt+Shift+Enter.
● The new translation unit is ready to be saved in the TM, which is done by pressing Alt+Down again.
Note: Search results show the matching subsegments and their order.

Document navigation
Menu: MetaTexis | Navigation | Search for text (Alt+Shift+I)

Copying and deleting
● Copying source text
● Management of special text elements:
  – footnotes
  – comments
  – fields
  – hyperlinks
  – images
  – tags
● Deleting translation

Copying and deleting
● If the entire source text should be copied into the translation box, press Alt+Shift+C.
● If there are special objects, e.g. inline images, place the cursor where the object should appear and press Alt+Shift+Y (for inline images) to insert the object.
  – Note: Each type of special object is copied with a separate command.
  – Objects are copied in their original sequence.
● To delete the selected TM search results, the entire translation or copied text, press Alt+Shift+Delete.

Segment manipulation
● Segment manipulation
  – Expanding/shortening text
  – Combining/dividing translation units
● (Re-)segmentation of the entire document
These options are particularly useful when the results of automatic segmentation do not correspond to the logical segmentation of a text portion.

Searching in translation memories (TMs)
● If the default search options are checked in the Start Assistant, translation units are saved and the TM is searched automatically.
● However, it is occasionally necessary to search a TM for a phrase. To do so:
  – select the phrase to search for
  – press Ctrl+Alt+T
● The Database Center displays the list of translation units containing the phrase.
● The selected unit is displayed in the right section (where it can be edited).
● The translation of the selected unit can be taken over by pressing the Take over button (at the bottom).

Searching in terminology databases – dictionaries (TDBs)
● If the default search options are checked in the Start Assistant, the TDB is searched automatically.
● However, it is occasionally necessary to search a dictionary for a term. To do so:
  – select the phrase to search for
  – press Ctrl+Alt+G
● The Database Center displays the list of terms found.
● The selected term pair is displayed in the right section (where it can be edited).
● The term can be taken over by pressing the Take over button (at the bottom).

Preparing the final version
Update TM: Useful when the document was edited after translation, e.g. during proofreading. The TM should then contain the approved translation.
Segment whole document: Recommended to make sure segmentation is correct and no text has been skipped.
Create Trados document: This option automatically exports the translation into a Trados-style document.

Preparing the final version
Post-production allows you to:
  – remove unnecessary characters [e.g. double spaces] from the translation
  – streamline quotations
  – ensure consistent formal characteristics of the translated text
The option spares you some burdensome manual clean-up operations that would otherwise be needed to give the text its proper formal characteristics.

Statistics and cost calculation
● Word count
● Character count
● Time statistics
● Segment statistics
● Cost calculation

SIMPLICITY
● Document preparation is automatic and quick.
● Basic database configuration is automatic.
● Document navigation and the selection of database search results require one to learn two shortcuts: Alt+Down (to go to the next segment) and Alt+Shift+Enter (to select a translation).
● The less frequently used operations can be triggered by means of tool bar buttons or menu commands.
● The final version is prepared with a single command. Documents edited during proofreading can be used to update TMs during the preparation of the final version, so that the TM always contains the best translation, which is important if it is to be used extensively in the future.
CONFIGURABILITY

ADVANCED OPTIONS
● Configuration of MetaTexis – General Options
● Configuration of TMs and TDBs – Document Options
● Export and import of documents and databases
● Alignment
● Multi-document projects

Configuration of MetaTexis – General Options
● Options affecting the entire program:
  – Interface language
    – English, German, French, Spanish, Portuguese, Russian, Czech, Chinese, Polish
  – Dialog/document mode
    – especially important in the translation of interfaces [in rc files]
  – External programs launched from MetaTexis
    – Cooperation with external dictionaries and translation machines
  – Shortcuts
    – important for those accustomed to working in Trados or Wordfast
    – personal habits can be retained
  – Quality control
  – Color coding of search results
    – easy recognition of the match percentage

General Options
General Options - Shortcuts
General Options – External programs
General Options – the “looks”

Colors and frames
● Colors defined in this tab “code” the percentage of similarity between the text being translated and the database search results.
● It is also possible to define colors for identical subsegments and the indexes indicating their order.
● The color “coding” makes it easier to find one's way around database search results. Important data start to be transmitted in a “subliminal” manner.
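The idea of color “coding” the match percentage can be illustrated with a short sketch. The bands and color names below are invented for the example; in MetaTexis they are whatever the user defines in the Colors and frames tab.

```python
# Hypothetical color bands: (lowest percentage in the band, color name).
# The list is ordered from the highest band to the lowest.
COLOR_BANDS = [
    (100, "green"),
    (90, "blue"),
    (70, "orange"),
    (0, "red"),
]


def color_for_match(percent):
    """Return the color of the first band whose lower bound the match reaches."""
    if not 0 <= percent <= 100:
        raise ValueError("match percentage must be between 0 and 100")
    for lower_bound, color in COLOR_BANDS:
        if percent >= lower_bound:
            return color


if __name__ == "__main__":
    for match in (100, 93, 75, 40):
        print(match, color_for_match(match))
```

With a small fixed set of colors, the translator can judge the quality of a match at a glance without reading the percentage.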
Document Options
● Miscellaneous tab:
  – Definition of the source and target languages;
  – Showing segmentation marks: important if a document is to be proofread by someone who does not have MetaTexis;
  – Access to documents included in the same multi-document project;
  – Translator's data;
  – Watch list: a very important and useful function:
    – helps preserve the coherence of the translation
    – helps ensure consistent terminology usage in a verified text that contains erroneous or misleading terminology
● Watch lists can be stored in *.txt files and imported when needed.

Document Options

Document Options
● The Segmentation tab allows you to:
  – define (a set of) segmentation marks;
    – Important to reduce “expand segment/combine segments” operations
  – indicate styles that should not be segmented/translated;
    – Important when the translated text contains numerous elements that should be left as they are, e.g. formulas, commands of a programming language, etc.
  – define a list of abbreviations that should not be segmented;
    – Important to reduce “expand segment/combine segments” operations
  – decide how numbers should be treated.
    – Important to reduce “expand segment/combine segments” operations

Document Options

Configuration of TMs and TDBs – Document Options
● The Databases tab allows you to:
  – fine-tune the operation of the search engine on translation memories and dictionaries:
    – threshold percentage of similarity
    – ignoring internal tags
    – saving/inserting formatted text
  – indicate additional, secondary translation memories and dictionaries (theoretically, up to 254):
    – useful when an “inherited” TM is not to be changed or there is a need for separate TMs for different, yet similar projects
  – define the manner of displaying TM/TDB search results:
    – the number of alternatives to be displayed
    – the manner of displaying/inserting terminology
  – cross-link TMs and TDBs.

TM - Configuration
Saving tab
➢ Saving a translation unit automatically ensures the leverage effect is achieved.
➢ Saving RTF text allows one to avoid most tasks related to formatting.
➢ Saving alternatives provides flexibility.
➢ Inverse saving of translation units allows the TM to be searched in both directions.

TM - Configuration
Search 1 tab
➢ Automatic search: the translator can concentrate on selecting the right or most helpful translation.
➢ Language classes allow the search engine to treat, for instance, UK and US English as one language (no matter how strange it may sound to the speakers of these languages).

TM - Configuration
Search 2 tab
➢ The number of results found in the TM depends on the minimum similarity threshold for entire segments and subsegments.
➢ Search results are better, for some purposes, if index fields/internal tags are ignored.
➢ The TM can be used as a dictionary.
➢ The TM can be searched both ways.

TM - Configuration
Results 1 tab
➢ 100% matches can be inserted automatically.
➢ The number of found segments can be adjusted according to current needs.
➢ Marking identical subsegments and their order is very helpful, as both the text to be translated and its place in a segment are clearly visible.

TM - Configuration
Results 2 tab
➢ If an identical segment with different numbers is found, MetaTexis uses the found segment and tries to intelligently change the numbers.

TDB - Configuration
Search 1 tab
➢ Automatic search for terminology that can be inserted or just displayed.
➢ Language classes allow the search engine to treat, for instance, UK and US English as one language.

TDB - Configuration
Search 2 tab
➢ If there is a 100% match, there may be no need for a terminology search. However, it may still be important to search the dictionary if the TM might contain incorrect terminology.
➢ The dictionary can also be used as a TM.

TDB - Configuration
Results 1 tab
➢ The results can be displayed in the source segment in the form of pairs of words. Such a segment can be copied and overwritten.
➢ The results can be displayed in a separate terminology section.
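The number handling mentioned for the TM Results 2 tab (reusing a segment that differs only in its numbers) could work roughly as sketched here. This is a guess at the general technique, not MetaTexis's actual logic; the function name and the regular expression are assumptions made for the illustration.

```python
import re

# Assumed notion of a "number": digits, optionally with . or , inside.
NUMBER = re.compile(r"\d+(?:[.,]\d+)*")


def adapt_numbers(new_source, tm_source, tm_target):
    """Reuse tm_target for new_source if the two sources differ only in numbers.

    Returns the adapted target, or None when the case is not clear-cut.
    """
    # The text around the numbers must be identical.
    if NUMBER.sub("#", new_source) != NUMBER.sub("#", tm_source):
        return None
    old_numbers = NUMBER.findall(tm_source)
    new_numbers = NUMBER.findall(new_source)
    mapping = dict(zip(old_numbers, new_numbers))
    # A repeated old number must always map to the same new number.
    for old, new in zip(old_numbers, new_numbers):
        if mapping[old] != new:
            return None
    # Every number in the stored target must be one we know how to replace.
    if not set(NUMBER.findall(tm_target)) <= set(mapping):
        return None
    return NUMBER.sub(lambda m: mapping[m.group(0)], tm_target)


if __name__ == "__main__":
    print(adapt_numbers("Page 12 of 40", "Page 3 of 25", "Seite 3 von 25"))
```

The sketch refuses to guess in ambiguous cases, which mirrors the cautious wording of the slide ("tries to").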
TDB - Configuration
Results 2 tab
➢ The terminology can be marked, for convenience.
➢ The marks can later be useful during proofreading.
➢ They can also be deleted (in most cases, it is better to check this option).

CONFIGURABILITY
● Comprehensive options for configuring:
  – Translation Memories:
    – similarity thresholds, search direction
  – Terminology DataBases:
    – flexibility of displaying terminology
  – Segmentation rules
    – Logical and semantic segmentation takes precedence over automatic segmentation prepared by the program.
● A wide range of options is available for defining the manner of displaying TM search results.
● Personal habits can be respected, as the interface is flexible:
  – Program shortcuts can be defined by the user
  – The tool bar can be customized
➔ The user (and his or her habits) takes precedence over the program.

VERSATILITY & Compatibility

Export and import
● Export/import of Trados/Wordfast documents
● Import of tagged documents – preparation for translation
● MS Office documents:
  – Excel
  – PowerPoint
● Export/import of Translation Memories
  – Formats: tmx, txt, mdb
● Export/import of Terminology Databases
  – Formats: tmx, txt, mdb

Export/import: Trados/Wordfast documents
● Trados documents are recognized in the Start Assistant and automatically prepared for translation in MetaTexis.
● A document can easily be exported to Trados/Wordfast format.

Export/import: tagged documents
● Tagged documents are recognized in the Start Assistant and automatically prepared for translation in MT.
● Supported formats include:
  – HTML
  – PageMaker
  – FrameMaker
  – Interleaf
  – Ventura
  – QuarkXPress
  – XML
  – OpenTag
  – XLIFF

Export/import: tagged documents
The import of tagged documents, e.g. HTML, can be fine-tuned automatically or customized to particular needs by anyone who knows the relevant language structure (HTML, XML, etc.). If done well, it ensures better segmentation.
Export/import: tagged documents
It is also possible to define
➢ the meta-character set of the source document
➢ the meta-character set of the target document
MetaTexis works in Unicode, but it may be important to ensure the right encoding (other than Unicode) in documents to be published on the Internet.

Export/import: tagged documents
● Tagged documents are automatically exported to the native format when the final version is prepared.
● It is possible to automatically encode special characters in the final version.
● If some segments have been overlooked, a warning appears.

Export/import: MS Office documents
● MS Excel and MS PowerPoint files can be translated directly in MetaTexis.
● A data sheet or presentation is imported into a MetaTexis document and, once translated, it is exported back to the original file.

Export/import: Translation Memories & Dictionaries
Menu: MetaTexis | Import/export | Import/export TMs
The dialog box allows the user to select an existing TM, create a new TM, view an existing TM and export it to (or import it from) a selected format.
An identical dialog box is used to perform all the actions for dictionaries (TDBs).

Translation Memories: Export TMX
● If TMX is the target format of the database, the basic structure is created automatically.
● Here the user can also create a consistent database for translation in the opposite direction.

Translation Memories: Export TXT
● If TXT is the target format of the database, it is necessary to define:
  – the field separator
  – the content delimiter
  – the fields to be exported
● It is possible to change the order of languages in the exported segments as well.

Translation Memories: Export Access
● If MDB is the target format of the database, it is possible to define certain options that facilitate further processing of the TM in MS Access.
● The fields to be exported can also be selected.
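The three TXT export settings (field separator, content delimiter, fields to be exported) map naturally onto a delimited text file. The sketch below shows the idea with Python's standard csv module. The field names, the sample unit and the function name are hypothetical, and the real MetaTexis TXT layout may differ.

```python
import csv
import io


def export_tm_txt(units, fields, field_separator="\t", content_delimiter='"'):
    """Write the chosen fields of each translation unit as delimited text.

    units  -- list of dicts, one per translation unit (hypothetical structure)
    fields -- names of the fields to export, in the desired column order
    """
    buffer = io.StringIO()
    writer = csv.writer(
        buffer,
        delimiter=field_separator,    # character placed between fields
        quotechar=content_delimiter,  # character wrapped around each field
        quoting=csv.QUOTE_ALL,
        lineterminator="\n",
    )
    writer.writerow(fields)  # header row naming the exported fields
    for unit in units:
        writer.writerow([unit.get(name, "") for name in fields])
    return buffer.getvalue()


if __name__ == "__main__":
    sample = [{"source": "Hello", "target": "Hallo", "user": "hb"}]
    # Listing "target" first swaps the order of languages in the export.
    print(export_tm_txt(sample, ["target", "source"], ";", "'"))
```

Choosing the fields explicitly is also what makes it possible to reverse the language order, as the slide notes.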
Translation Memories: Import
Highly configurable import procedure

Translation Memories: Import TXT
● If TXT is the source format of the database (e.g. Wordfast TMs), the user can select the appropriate field separator and content delimiter to ensure the database structure is copied correctly.

Translation Memories: Import TXT
● The user can select the fields to be imported and assign them to the fields of the target MetaTexis database.

Translation Memories: Import TXT
● The user can also define how the imported fields are to be treated.

TMs: Export/Import
● Export/Import formats:
  – txt [the format used by Wordfast]
  – tmx [universal exchange format]
  – mdb [MS Access database format]
● Flexibility of field assignment
  – Possibility of selecting only the data that are needed.
  – The user can be sure the database structure is correctly mapped from the source file / onto the target file.
● TXT and MDB formats allow the databases to be processed easily in MS Excel or MS Access, which greatly facilitates database maintenance.

Terminology Databases: Export/Import
Options and dialog boxes analogous to those that control export/import of TMs:
● The same flexibility of field assignment:
➔ Possibility of selecting fields to be imported/exported
● Possibility of exporting to Access databases that can be processed further.
➔ Easy database maintenance
● Possibility of exporting to/importing from *.txt files (e.g. Wordfast databases or *.csv files)
● It is possible to export a dictionary into a txt or xls file, process it and re-import it into MT.
➔ Easy maintenance if Access is not available.

VERSATILITY and compatibility
● Files in multiple formats can be prepared for translation in MetaTexis:
  – HTML, PageMaker, Interleaf, Ventura, QuarkXPress, XML, OpenTag, XLIFF, Trados document, Trados TagEditor, Windows resource files
  – MS Excel, MS PowerPoint
● It is possible to export documents to Trados RTF format and Wordfast documents.
● Import/export of TMs and TDBs from/to *.tmx, *.txt and *.mdb formats.
➔ Possibility of flexible selection and assignment of fields to be exported/imported.
➔ High compatibility with Trados and Wordfast.

Other advanced options
● Alignment
  – Preparation of TMs from separate texts (source and translation).
● Projects
  – Sets of documents that are treated as a single entity. Searching, indexing and statistical calculations are carried out for all the documents.
● Batch processing
  – The option allows the user to pre-translate a set of documents (e.g. to prepare documents for other translators) or save all translation units in a set of documents in a TM (e.g. after proofreading, to obtain an updated TM).
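Alignment, listed above as the preparation of TMs from separate source and translation texts, can be pictured in its simplest form as pairing sentences by position. The sketch below assumes the two texts contain the same number of sentences in the same order. Real alignment tools have to cope with merged, split and reordered sentences, and nothing here describes the MetaTexis implementation.

```python
import re


def split_sentences(text):
    """Naive segmentation: split after ., ! or ? when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text)
    return [part.strip() for part in parts if part.strip()]


def align(source_text, target_text):
    """Pair source and target sentences one-to-one, in order."""
    source = split_sentences(source_text)
    target = split_sentences(target_text)
    if len(source) != len(target):
        raise ValueError("texts have different sentence counts; manual alignment needed")
    return list(zip(source, target))


if __name__ == "__main__":
    pairs = align("Open the file. Save it.", "Datei oeffnen. Speichern.")
    for src, tgt in pairs:
        print(src, "=>", tgt)
```

Each resulting pair corresponds to one translation unit that could be stored in a TM.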