Open Translation Tools

Global Voices Lingua


Global Voices (http://www.globalvoicesonline.org) is an international network of bloggers who translate, report on and defend blogs and citizen media from around the world. Since 2005 the project's contributors have posted summaries and reports of what bloggers and other producers of citizen media around the world are discussing. As of June 2009 Global Voices has published over 50,000 posts and its community comprises over 300 members. Global Voices' content is freely available for re-distribution and use in derivative works under a Creative Commons Attribution license. The project actively promotes the reposting of its content by both citizen media organisations and mainstream commercial media.

The seeds of Global Voices' social translation project "Lingua" (http://globalvoicesonline.org/lingua/) were sown when a group of Taiwanese fans of the site began independently translating articles into Chinese and posting them online. As their body of translated work grew, they decided to organize it into a Chinese-language mirror of the site. Interactions between members of the Global Voices community and the Chinese translators made it clear that there was impetus and excitement around the idea of creating similar translation sites for other languages. Global Voices set out to build a scalable infrastructure that would enable the organization to create sites for new translation communities as these came into being.

Choosing a distributed network of sites

The decision to use separate sites for each language group rather than integrated translation in the content management system (CMS) was based on several factors. For one, WordPress, the CMS used for the site, lacked content translation support in the core code, and the available plugins either lacked the relevant features or did not permit the level of scalability required (at the time of writing in June 2009, Global Voices has over 20 language sites). In addition, the existing translation plugins were all pet projects of individuals, which meant there was no guarantee of long-term stability (WordPress core upgrades invariably break complex plugins, so the programmers responsible need to remain vigilant over time).

It was also important that the translation project be kept as simple as possible so that it could be replicated and updated quickly. Integrating the translations into the main site with its thousands of posts, hundreds of users and years' worth of hacks and customizations would have been a slow process involving lots of testing before it could be launched. By contrast, the use of external sites to house the translations allowed the translation system to be effectively 'beta' released, with no serious risk of embarrassment to the main (English-language) site if issues arose. A simplified version of the site's front-end design template was used for the translation sites, speeding up launch time and avoiding the design conversions that would have been needed in order to internationalize the theme. The translation template was also much less dependent on the content skew that was taken for granted in the design of the English site.

Despite the decentralized nature of Global Voices' translation section, the sites still needed a means of recording and displaying the source-to-translation relationships on posts and in the database. To this end, a pinging model based on blog trackbacks was used. This allowed translations to be associated with the source post, and the relationship to be recorded in the translation database in the process. This involved entering the URL of the source post in the editing interface of the translation site, which pinged the source when the translation post was saved. Metadata fields for original author name and original author profile URL were added to the translation interface so that both the translator and the original author were credited separately on the translation post. The system was improved over time with enhancements that simplified the process, such as the addition of a hidden section in the post-viewing interface of English-language posts where all post metadata was easily accessible to translators.
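
The flow described above can be sketched roughly as follows. This is only an illustration in Python, not the project's actual WordPress code: the class names, field names and the save_translation function are all hypothetical, and a plain function call stands in for the HTTP ping.

```python
from dataclasses import dataclass, field


@dataclass
class TranslationPost:
    """A post on a translation site (all field names are hypothetical)."""
    url: str
    language: str
    translator: str
    source_url: str               # URL of the source post, entered by the translator
    original_author: str          # credited separately from the translator
    original_author_profile: str  # profile URL of the original author


@dataclass
class SourceSite:
    """Stands in for the English-language site and its translations database."""
    translations: dict = field(default_factory=dict)  # source URL -> list of records

    def receive_ping(self, post: TranslationPost) -> None:
        """Record that `post` is a translation of the post at `post.source_url`."""
        record = {"url": post.url, "language": post.language}
        records = self.translations.setdefault(post.source_url, [])
        if record not in records:  # re-saving a post must not duplicate the link
            records.append(record)


def save_translation(post: TranslationPost, source_site: SourceSite) -> None:
    """Saving a translation pings the source site (a call stands in for HTTP)."""
    source_site.receive_ping(post)
```

In this sketch the relationship lives only on the source site, which mirrors the centralization discussed in the next section.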

Pros and cons of the distributed sites model

Even though the use of external sites was primarily a practical decision made for technical reasons, many within the Global Voices community consider the separation to be one of the project's greatest strengths. While integrated translations are appealing in many ways, and while some people complain that the translations are "ghettoized" in relation to the main site, separating the content and translation communities by language has in many ways helped motivate translators to remain committed to the success of their translation communities. Having administrative rights to the site appears to give editors a sense of ownership over the content. In the case of new or less active translation sites, the separation has allowed the community to grow organically and at its own pace, rather than being shoved into the noisy and crowded space of the English-language site.

Despite the Lingua project's apparent success at facilitating both participation by translators and the act of translation itself, the system was lacking in a number of ways on both the conceptual and technical fronts. For one, the system was entirely centralized on the English-language site and its database. Translations could flow only away from English and not the other way around. This was partly an editorial decision, but some translators had in fact found ways around it. The English-language site was also the only one with access to the translations database, meaning that the translation sites were unable to identify versions of a post other than the English source. These technical limitations were often interpreted as a form of Anglocentrism on the part of Global Voices and considered to be misaligned with the organization's values.

Revising the model: Decentralizing power and simplifying workflow

After considerable discussion, and once it became possible to dedicate the programming resources required for the task, Global Voices' Technical Director began re-designing the system architecture to solve the problems outlined above. Though still under development at the time of writing in June 2009, the new system is designed to manage translations using the same basic model as before, with separate sites pinging each other to record translation relationships, but in a far less centralized way. The database, rather than being part of the English site, was moved to a separate space and can be accessed by all of the sites in the network, which can now save and fetch translation data directly from it. The English-language site is no longer the hub for all content translation, but merely a member of the translation network alongside the various language sites. Under the new horizontal structure, any Global Voices site can be the source or destination of a translation. The new structure also facilitates translations of translations (e.g. en->fr->zh) by translators who may be interested in a story that was not written in a language they know.
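
As a rough illustration of the horizontal structure, the shared database can be thought of as a registry that maps each translation to the post it was translated from. The Python sketch below is an assumption about how such a registry might behave, not a description of the real schema; the TranslationRegistry class, its method names and the example.org URLs are all hypothetical.

```python
from typing import Dict, List


class TranslationRegistry:
    """Hypothetical shared store that every site in the network can read and write."""

    def __init__(self) -> None:
        # translation URL -> URL of the post it was translated from
        self._source_of: Dict[str, str] = {}

    def record(self, translation_url: str, source_url: str) -> None:
        """Any site may be the source or the destination of a translation."""
        if translation_url == source_url:
            raise ValueError("a post cannot be a translation of itself")
        self._source_of[translation_url] = source_url

    def root(self, url: str) -> str:
        """Follow the chain (e.g. zh -> fr -> en) back to the original post."""
        seen = {url}
        while url in self._source_of:
            url = self._source_of[url]
            if url in seen:
                raise ValueError("translation chain contains a cycle")
            seen.add(url)
        return url

    def versions(self, url: str) -> List[str]:
        """Return every known version of a post, including the original."""
        root = self.root(url)
        known = set(self._source_of) | {root}
        return sorted(u for u in known if self.root(u) == root)


registry = TranslationRegistry()
registry.record("https://fr.example.org/post", "https://en.example.org/post")
registry.record("https://zh.example.org/post", "https://fr.example.org/post")
print(registry.versions("https://zh.example.org/post"))
```

Because every site queries the same registry, the Chinese site in this example can list the French and English versions of the post, which the original English-centred database did not allow.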



Alongside these changes the Global Voices Lingua project has worked on simplifying and streamlining the translation process, notably to make better use of the pinging system that automates the migration of content and metadata from the source site to the translation. The necessity of copying and pasting content and re-selecting categories from the original post has been the most common complaint from translators, as well as the cause of inaccuracies in translated posts. Under the updated system, the source site receives a ping from the translation to which it replies with serialized data about the post that the translation site can use to auto-populate the relevant fields in the translation interface. This makes it easy for translators to create a perfect copy of the source post on their site before starting their translations.
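
The exchange described above might look something like the following Python sketch. The text does not specify the serialization format, so JSON is used here purely as an assumption, and the function names and field names are hypothetical.

```python
import json

# Fields the source site is assumed to send back (hypothetical names).
SHARED_FIELDS = ("url", "title", "content", "categories", "author", "author_profile")


def reply_to_ping(source_post: dict) -> str:
    """Source site: answer a translation ping with serialized post data."""
    payload = {key: source_post[key] for key in SHARED_FIELDS}
    return json.dumps(payload)


def prefill_translation(reply: str, language: str) -> dict:
    """Translation site: build a draft whose fields are copied from the source."""
    data = json.loads(reply)
    return {
        "language": language,
        "source_url": data["url"],
        "title": data["title"],          # to be translated by hand
        "content": data["content"],      # to be translated by hand
        "categories": list(data["categories"]),
        "original_author": data["author"],
        "original_author_profile": data["author_profile"],
    }
```

The draft starts as an exact copy of the source post, so the translator only has to replace the title and content rather than re-enter categories and author details.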

The decentralized pinging model for translations is perhaps the one that should have been used from the beginning, but waiting to see what problems arose with the original system made it possible to identify the biggest issues between the system and our community before making large-scale modifications. It is also worth noting that in many ways the older system, as simple and insufficient as it was, may have been the right tool for getting the Global Voices Lingua translation network off the ground. Now that the project has achieved critical mass it has become clear that all sites should be treated equally. However, the centralized nature of the original system allowed the managers of the main (English-language) site to maintain control over the translation sites and communities during the period when the community was learning how they might work and what kind of management they might require.