History: WS08Paper:Introduction

Preview of version: 31

Massively collaborative sites like Wikipedia are revolutionizing the way we think about and deal with content creation. This is bound to have a profound impact on the way we think about and deal with content translation as well (Desilets 2007a... this would be my Aslib keynote).

In particular, it raises the question of how to collaboratively author and translate content in several languages at the same time, in an organic, continuous fashion. Many community-built sites find themselves in this situation. For example, SUMO, the Mozilla Support community, had been using traditional techniques for documentation until recently. When the community decided to move to wiki documentation, it faced an important problem with content localization.

To provide quality documentation to a very large user base, the SUMO community requires the content to be translated into at least eight major languages. To the Mozilla Foundation, widespread adoption of the web browser is a critical business objective. Without quality documentation available to a wide audience, that objective is impossible to reach.

In a collaborative context, allowing authors to create original content is crucial. However, this freedom causes serious synchronization problems when it comes to translation. Moreover, through the crowdsourcing effects offered by wikis, the Foundation may be able to offer the documentation in languages that would have been unthinkable in the past. To do so, reliable translation tracking tools are required.

TODO: LPH wrote some excellent introductory French material in his project report. Use some of it here

Translating content in these collaborative environments presents a number of unique challenges compared to more traditional environments (Desilets et al., 2005). The primary difference is that in a collaborative environment, the process is much less controlled and more "chaotic". Traditional translation processes and tools operate under a number of assumptions which simply do not hold in a collaborative environment. These assumptions are listed below.

NOTE: We should choose concise yet highly descriptive names for these assumptions so we can refer to them later in the paper.

  • Assumption 1 - Master language: In a traditional context, original content is typically created in a master language, usually English. This is not realistic in a collaborative context, because authors may not all be fluent enough in English to write high quality content in that language.
  • Assumption 2 - Limit original changes during translation: In a traditional context, once translation of content in the master language has started, there is a strong tendency to limit changes to the master language version until it has been translated to all other languages. This is not realistic for a collaborative context, since content often never reaches a final stage.
  • Assumption 3 - Can ensure timely translation: In a traditional context, one can assume that timely translation of content can be ensured through contractual obligations with the translator. In a collaborative context, this is not possible, since translators are often unpaid volunteers working in their own free time.
  • Assumption 4 - Focus on small list of languages: In a traditional context, there is a tendency to focus on a small list of "core" languages, in order to minimize the number of language pairs for translation. In a collaborative context, members of the community are usually allowed to create content in any language, including minority languages.
  • Assumption 5 - Strong coordination of authors and translators: In a traditional context, the community of authors and translators is a "closed" world, where everyone knows each other, and there is some central authority that coordinates everything. In a collaborative context, authors and translators contributing to the documents are not coordinated and often do not know each other.
  • Assumption 6 - Trained translators: In a traditional context, translators are usually professionally trained, and can be "enculturated" into the organisation's tools and processes. In a collaborative context, translators are often amateurs, and the amount of tool and process training that one can impose on them is limited.
  • Assumption 7 - Separation of Authoring and Translation: In a traditional environment, authoring and translation are clearly segregated, and there is very little chance for the two to interfere with each other. Authors do not have to know about translation processes and translators do not have to know about authoring processes. In a collaborative environment, it is difficult to separate those two processes, and the people carrying them out are often the same. As a consequence, there is a risk that introducing a translation process into a wiki community will complicate the basic authoring operations that have underpinned the success of many wiki communities.


TODO: Eliminate or merge sentences which are redundant in the argumentation below.

Thus, the main technological challenge of collaborative translation is to come up with tools and processes that do not depend on those assumptions. What relaxing these assumptions requires can be summarized in a few words: change must be embraced rather than constrained. At the same time, the tools need to offer sufficient structure to allow volunteer translators to be effective in their work without obstructing the content authors in their creative processes. With appropriate tool support, authors and translators could together reduce the effort required to create content and make it widely available.

Such tools would be useful for many communities and organizations.

In this paper, we describe an approach that addresses the limitations of the traditional model by relaxing all of these assumptions, sometimes partially, sometimes completely. In effect, this enables true collaborative translation. The solution described is implemented in a fully-featured content management system and is ready to be deployed. It allows efficient translation workflows in a collaborative translation environment and can help organizations reduce costs through crowdsourcing.

To our knowledge, our system is the first one to go this far in supporting collaborative authoring and translation of content, and to be usable in actual production settings.

TODO : Section summary

History

Information Version
Sat 12 of Apr, 2008 16:54 GMT alain_desilets 53
Sat 12 of Apr, 2008 16:08 GMT lphuberdeau 52
Sat 12 of Apr, 2008 15:21 GMT alain_desilets 51
Fri 11 of Apr, 2008 01:40 GMT alain_desilets 50
Fri 11 of Apr, 2008 01:39 GMT alain_desilets 49
Fri 11 of Apr, 2008 01:37 GMT alain_desilets 48
Fri 11 of Apr, 2008 01:36 GMT alain_desilets 47
Fri 11 of Apr, 2008 01:30 GMT alain_desilets 46
Fri 11 of Apr, 2008 01:18 GMT alain_desilets 45
Fri 11 of Apr, 2008 01:17 GMT alain_desilets 44
Fri 11 of Apr, 2008 01:17 GMT alain_desilets 43
Fri 11 of Apr, 2008 01:16 GMT alain_desilets 42
Fri 11 of Apr, 2008 01:15 GMT alain_desilets 41
Fri 11 of Apr, 2008 01:14 GMT alain_desilets 40
Fri 11 of Apr, 2008 01:13 GMT alain_desilets 39
Fri 11 of Apr, 2008 01:12 GMT alain_desilets 38
Fri 11 of Apr, 2008 01:11 GMT alain_desilets 37
Fri 11 of Apr, 2008 01:11 GMT alain_desilets 36
Fri 11 of Apr, 2008 01:02 GMT alain_desilets 35
Fri 11 of Apr, 2008 00:50 GMT alain_desilets 34
Fri 11 of Apr, 2008 00:42 GMT alain_desilets 33
Fri 11 of Apr, 2008 00:18 GMT alain_desilets 32
Fri 11 of Apr, 2008 00:10 GMT alain_desilets 31
Thu 10 of Apr, 2008 15:45 GMT lphuberdeau 30
Wed 09 of Apr, 2008 11:33 GMT alain_desilets 29
Wed 09 of Apr, 2008 00:14 GMT alain_desilets 27
Wed 09 of Apr, 2008 00:13 GMT alain_desilets 26
Wed 09 of Apr, 2008 00:11 GMT alain_desilets 25
Tue 08 of Apr, 2008 23:40 GMT alain_desilets 24
Tue 08 of Apr, 2008 23:39 GMT alain_desilets 23
Tue 08 of Apr, 2008 23:35 GMT alain_desilets 22
Tue 08 of Apr, 2008 22:42 GMT alain_desilets 21
Tue 08 of Apr, 2008 22:42 GMT alain_desilets 20
Tue 08 of Apr, 2008 22:41 GMT alain_desilets 19
Tue 08 of Apr, 2008 22:38 GMT alain_desilets 18
Tue 08 of Apr, 2008 22:35 GMT alain_desilets 17
Tue 08 of Apr, 2008 22:10 GMT alain_desilets 16
Tue 08 of Apr, 2008 19:35 GMT alain_desilets 15
Tue 08 of Apr, 2008 19:33 GMT alain_desilets 14
Tue 08 of Apr, 2008 18:18 GMT alain_desilets 13