Open Translation Tools

Translation Memory

When you translate with a CAT tool (computer-assisted translation tool), your work is stored in a database of bilingual segments. This database is called a translation memory, abbreviated TM. Working with translation memories has two major advantages. If you are working with highly repetitive texts or on updates of earlier translations, all material that has already been translated can be found in the memory. For less repetitive texts, the TM can be used to look up terminology, and the advantage here is that you see not only the term but also how it was used in context.

A translation memory of a text is created by breaking the text down into so-called segments and storing each segment together with its translation. Segmentation follows certain rules, and these rules differ from language to language, because not every full stop marks the end of a sentence (a full stop can also end an abbreviation such as "e.g."). Good segmentation rules are therefore essential to creating a good and reusable translation memory.
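A minimal sketch of rule-based segmentation is shown here to illustrate why a full stop alone is not a reliable boundary. The abbreviation list and the function name segment are assumptions made for this example. Real CAT tools use much richer, per-language rule sets.

```python
import re

# Hypothetical abbreviation list for English; real tools ship per-language rules.
ABBREVIATIONS = {"e.g.", "i.e.", "Dr.", "Mr.", "etc."}

def segment(text, abbreviations=ABBREVIATIONS):
    """Split text into sentence-like segments, skipping known abbreviations."""
    segments = []
    start = 0
    # A candidate break is '.', '!' or '?' followed by whitespace.
    for match in re.finditer(r"[.!?]\s+", text):
        end = match.start() + 1  # keep the punctuation mark in the segment
        candidate = text[start:end]
        last_word = candidate.split()[-1]
        if last_word in abbreviations:
            continue  # the full stop belongs to an abbreviation, not a sentence end
        segments.append(candidate.strip())
        start = match.end()
    tail = text[start:].strip()
    if tail:
        segments.append(tail)
    return segments

print(segment("Dr. Smith arrived. He was late."))
```

A naive split on every full stop would produce "Dr." as its own segment, and that segment would be useless for later matching.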

When accepting a new job, translators are often given an existing translation memory built from texts of the same company or from texts dealing with the same domain. The translator can use the TM for terminology research, and when the TM comes from the same company it also helps to maintain a consistent style of translation. To support this, a style guide is normally provided in addition to the translation memory.

There are both proprietary and open formats for translation memories. One of the best-known standards among translators is TMX (Translation Memory eXchange), an XML-based format.
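To give a rough idea of what a TMX file contains, the sketch below reads a small, simplified TMX 1.4-style document with Python's standard library. The sample content, the header values and the function name read_tmx are made up for illustration. Real TMX files usually carry more metadata and inline markup.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Illustrative sample only: one translation unit (tu) per segment,
# with one variant (tuv) per language.
SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1.0" segtype="sentence"
          o-tmf="none" adminlang="en" srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Open the file.</seg></tuv>
      <tuv xml:lang="fr"><seg>Ouvrez le fichier.</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="en"><seg>Close the door.</seg></tuv>
      <tuv xml:lang="fr"><seg>Fermez la porte.</seg></tuv>
    </tu>
  </body>
</tmx>"""

def read_tmx(tmx_text, source_lang, target_lang):
    """Return a list of (source, target) segment pairs for one language pair."""
    root = ET.fromstring(tmx_text)
    pairs = []
    for tu in root.iter("tu"):
        by_lang = {}
        for tuv in tu.findall("tuv"):
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also collects text nested inside inline elements.
                by_lang[tuv.get(XML_LANG)] = "".join(seg.itertext())
        if source_lang in by_lang and target_lang in by_lang:
            pairs.append((by_lang[source_lang], by_lang[target_lang]))
    return pairs

for source, target in read_tmx(SAMPLE_TMX, "en", "fr"):
    print(source, "->", target)
```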

Local Versus Global Translation Memories

Translation memories can be used in a range of contexts, from a personal productivity tool to a global memory that is shared across many projects or companies. Translation memory started out as part of a desktop productivity tool, and was primarily used by individual translators to archive and re-use their own work. As Internet connectivity has become ubiquitous, translation memories are now often networked, so that many translators within a team can share their work. More recently, global translation memories have appeared that act as SaaS (software as a service) tools.

Local translation memories and small networks work best for teams of translators who work for a specific client or in a specific domain (e.g. automotive parts documentation). You decide which translation memory you want to use or join based on the project you are working on, and the types of translations you're likely to need or re-use.

Global translation memories, such as the Worldwide Lexicon, collect translations from a wide range of projects and publications, spanning many language pairs and domains. This type of translation memory is not suitable for domain-specific translation, but it does work well for more general content, such as newspaper articles, because the vocabulary and writing level target a general audience.

It is also possible to combine both types, by searching first for translations in a domain-specific translation memory, and then falling back to a general-purpose translation memory.
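The combined lookup can be sketched as a search through an ordered list of memories. In this example each memory is modelled as a plain dictionary from source segment to translation, which is a deliberate simplification. The names DOMAIN_TM, GENERAL_TM and lookup are hypothetical.

```python
# Illustrative data: a small domain-specific memory and a general-purpose one.
DOMAIN_TM = {"Replace the brake pads.": "Remplacez les plaquettes de frein."}
GENERAL_TM = {
    "Replace the brake pads.": "Changez les plaquettes.",
    "Thank you for your order.": "Merci pour votre commande.",
}

def lookup(segment, memories):
    """Return (translation, memory_name) from the first memory containing the segment.

    `memories` is an ordered list of (name, dict) pairs, most specific first.
    """
    for name, memory in memories:
        translation = memory.get(segment)
        if translation is not None:
            return translation, name
    return None, None

MEMORIES = [("domain", DOMAIN_TM), ("general", GENERAL_TM)]

print(lookup("Replace the brake pads.", MEMORIES))    # found in the domain memory
print(lookup("Thank you for your order.", MEMORIES))  # falls back to the general memory
```

Because the list is searched in order, the domain-specific translation wins whenever both memories contain the same segment.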

Exact Versus Fuzzy Memory

Translation memory tools offer two types of search: exact and fuzzy matching. In an exact match, the translation memory only returns translations whose source text precisely matches the text being translated. In a fuzzy match, the translation memory returns approximate matches. Results from a fuzzy match cannot be used as-is, but must be reviewed and edited by a translator, as even a single word can change the meaning of a whole sentence. Fuzzy matches are very useful, however, because there is a lot of repetition, especially in domain-specific material such as manuals and documentation.
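The difference between the two search types can be sketched with Python's standard difflib module. The character-based similarity ratio and the 0.75 threshold are assumptions chosen for this example. Translation memory tools use their own scoring methods, which this sketch does not try to reproduce.

```python
from difflib import SequenceMatcher

# Illustrative memory: source segment -> stored translation.
MEMORY = {
    "Press the red button.": "Appuyez sur le bouton rouge.",
    "Close the door.": "Fermez la porte.",
}

def exact_match(source, memory):
    """Return the stored translation only if the source matches precisely."""
    return memory.get(source)

def fuzzy_matches(source, memory, threshold=0.75):
    """Return (score, stored_source, translation) tuples at or above the threshold, best first."""
    results = []
    for stored_source, translation in memory.items():
        score = SequenceMatcher(None, source.lower(), stored_source.lower()).ratio()
        if score >= threshold:
            results.append((round(score, 2), stored_source, translation))
    return sorted(results, reverse=True)

query = "Press the green button."
print(exact_match(query, MEMORY))    # no exact hit for this query
print(fuzzy_matches(query, MEMORY))  # suggests the "red button" segment for review
```

The fuzzy result here shows why review is required: the stored translation still says "rouge" (red), so a translator has to correct the one word that changed.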