History: WS08Paper:Supporting concurrent editing and translation


NOTE from AD: I moved a whole chunk about what the problem is to the Introduction, because I think it serves the introduction well. We can use this section to discuss the solution.


A key technical challenge that must be addressed to support such an unconstrained translation and editing workflow is how to track the original contributions and translations made in the different pages, in such a manner that they can easily be reproduced in other linguistic versions of those same pages. We now describe how we addressed this challenge in the CLWE project.

Tracking and Translating Edits


TODO: I find this part very hard to follow. LPH, maybe you need to verbally explain what you mean to either Seb or me. I often find this helps me formulate my thoughts in a way that is easier for other people to understand.

In adding the required features to support change tracking, a few guidelines had to be followed:
  • Any contribution is worth translating until a translator says otherwise.
  • Only the final result matters. Intermediate steps can be ignored.
  • Content contributors should not have to worry about translation.

These three guidelines work together toward a single solution. As mentioned before, attempting to translate every change individually into every linguistic version is impractical because translation resources are limited. By focussing on the final result, a translator can catch up on the content of a source page in a single step and abstract away the intermediate steps that led to the final content. In most cases, the translator can simply observe the changes that were made on the source page since the last synchronization.

In a wiki content creation process, a page goes through multiple edits, and not all of them are worth translating. Rather than having a single author, wiki pages are a collaborative work between many people. Someone initially writes the base content. Others add to it. In the process, many people make minor contributions that improve the quality of the content but may not be relevant for translators. Such changes include grammatical corrections and syntactic improvements. Some changes may only affect the formatting of the page. Determining which changes have to be translated is a complex task, and the line is very hard to draw.

A simple change to the syntax of a phrase may seem trivial, but if the previous formulation was ambiguous, a translator may already have misinterpreted it, in which case the translation would need to be updated. A potential solution would be to ask the content contributor to say whether or not a change should be translated. However, the content contributor may not have sufficient knowledge of the translation process to make the right decision. By requesting no information from content contributors, the translation process can be made as invisible to them as possible.

Because translators translate an aggregate of changes rather than individual changes, it does not matter if a few trivial changes slip in. Very active translation communities that propagate changes frequently may end up translating very small changes that do not affect their version of the content. However, it is unlikely that a translation gets updated multiple times a day to match each change on a page.

The result of these guidelines is a very simplified model for change tracking. In the model, each change represents an idea by its author. As far as tracking is concerned, all changes are equal, as they all have to be propagated to the other linguistic versions. When a change is incorporated in a given linguistic version, all pages updating from that page also inherit the change.

In simplified terms, change propagation can be represented as a directed graph in which nodes are page versions and arcs are page evolutions. Arcs represent both the evolution between versions of the same page and the translation of content from a source language to a target language. In the graph, original content creation is attached to page versions. By following all arcs from the page version node where a change originated, all pages containing the change can be found.
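The graph model described above can be sketched in a few lines of Python. This is only an illustration, not the CLWE implementation: the class name ChangeGraph, the arc kinds, and the version labels (en_v1, fr_v2, and so on) are assumptions made for the example, and the arcs listed are one plausible reading of the three-language scenario used in this section.

```python
from collections import deque

# Arc kinds (names are assumptions made for this sketch).
EVOLUTION = "evolution"      # next version of the same page
TRANSLATION = "translation"  # content carried from a source to a target language


class ChangeGraph:
    """Directed graph whose nodes are page versions (hypothetical structure)."""

    def __init__(self):
        self._arcs = {}  # version label -> list of (target label, arc kind)

    def add_arc(self, source, target, kind):
        if kind not in (EVOLUTION, TRANSLATION):
            raise ValueError(f"unknown arc kind: {kind}")
        self._arcs.setdefault(source, []).append((target, kind))
        self._arcs.setdefault(target, [])

    def versions_containing(self, origin):
        """Follow all arcs from the version where a change originated."""
        seen = {origin}
        queue = deque([origin])
        while queue:
            current = queue.popleft()
            for target, _kind in self._arcs.get(current, []):
                if target not in seen:
                    seen.add(target)
                    queue.append(target)
        return seen


# One plausible encoding of a three-language scenario (labels are illustrative).
graph = ChangeGraph()
for source, target, kind in [
    ("en_v1", "en_v2", EVOLUTION),
    ("en_v2", "en_v3", EVOLUTION),
    ("en_v3", "fr_v1", TRANSLATION),
    ("en_v3", "es_v1", TRANSLATION),
    ("fr_v1", "fr_v2", EVOLUTION),
    ("en_v3", "en_v4", EVOLUTION),
    ("fr_v2", "en_v4", TRANSLATION),
    ("es_v1", "es_v2", EVOLUTION),
    ("es_v2", "es_v3", EVOLUTION),
    ("en_v4", "es_v3", TRANSLATION),
]:
    graph.add_arc(source, target, kind)

print(sorted(graph.versions_containing("fr_v2")))
```

With these arcs, the change introduced in fr_v2 reaches en_v4 through a translation arc and then es_v3 through a second one, which is the "inherit the change" behaviour of the model. A breadth-first traversal is used here purely for simplicity; any reachability search would do.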

Consider this sample case using three languages. Any number of languages is supported, but the resulting scenarios and representations would be too large to be practical for demonstration purposes.

  1. A page gets created in English {en_v1}.
  2. In a second edit, some content is added {en_v2}. A third edit is then made.
  3. After this point, the page gets an original translation to both French {fr_v1} and Spanish {es_v1}.
  4. Afterwards, a French contributor decides to add the list of required sections {fr_v2}. After this modification, both English and Spanish versions indicate that they are not fully up to date {en_v3_2}. They provide links to view the other versions and update the content.
  5. An English translator responds to the request and includes the French addition in the English version {fr_v2_to_en_v4}. The English page correctly indicates that it is now up to date, but the Spanish version is still behind {en_v4}.
  6. A Spanish contributor adds the exact dates of the event along with the location {es_v2}. The Spanish page indicates that it contains additional content, but that more content can be obtained from the French and English versions.
  7. A Spanish translator decides to update the Spanish version from the English source. In doing so, the Spanish version becomes fully up to date and includes the changes first made to the French version {es_v3}. Both the French and English versions now indicate that content can be obtained from the Spanish version.



As can be seen in the scenario, the translation process is invisible to a content contributor. Like any visitor of the website, the contributor sees the "Page Translation" box presenting the different alternatives and status information, but is free to ignore it. When a content contributor makes a change, a new original content contribution is recorded and the other linguistic versions of the page are updated with that information.

The "Page Translation" box provides links for translators to view the relevant changes made to the page. When using those links, the translator is brought to a slightly different version of the edit page. The page displays the changes to be translated along with the text area. When the translator indicates that the translation of the changes is completed, the translation target gets marked as containing the changes provided by the translation source. Again, other linguistic versions of the page get updated with the new information.

The directed graph representation of the described scenario is illustrated in figure {architecture_graph.dot.png}. In the graph, white nodes are original content contributions and gray nodes are versions resulting from a translation effort. Solid arcs are page evolutions from version to version and dashed arcs are translations from source to target. On each node, the original content contributions included in the version are listed on the second row.
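The per-node list of included contributions suggests a simple way to derive the status shown to visitors. The sketch below is an assumption-laden illustration rather than the actual CLWE code: the Page class and the helpers contribute, translate and missing_from are hypothetical names, and each original contribution is identified by the label of the version that introduced it.

```python
class Page:
    """Latest state of one linguistic version of a page (hypothetical)."""

    def __init__(self, lang):
        self.lang = lang
        self.contributions = set()  # ids of original contributions included


def contribute(page, contribution_id):
    """Record a new original content contribution on a page."""
    page.contributions.add(contribution_id)


def translate(source, target):
    """Mark the target as containing every change the source provides."""
    target.contributions |= source.contributions


def missing_from(page, other):
    """Contributions that `page` could still obtain from `other`."""
    return other.contributions - page.contributions


en, fr, es = Page("en"), Page("fr"), Page("es")

# Steps 1-2: three English edits.
for cid in ("en_v1", "en_v2", "en_v3"):
    contribute(en, cid)

# Step 3: original translations to French and Spanish.
translate(en, fr)
translate(en, es)

# Step 4: a French contributor adds content.
contribute(fr, "fr_v2")
after_step_4 = (missing_from(en, fr), missing_from(es, fr))

# Step 5: the English version is updated from the French one.
translate(fr, en)

# Step 6: a Spanish contributor adds content.
contribute(es, "es_v2")

# Step 7: the Spanish version is updated from the English one.
translate(en, es)

print(missing_from(es, en), missing_from(en, es), missing_from(fr, es))
```

After step 4, both the English and Spanish pages are missing the French contribution. After step 7, the Spanish page is missing nothing, while the English and French pages can each still obtain the Spanish contribution, which matches the status messages described in the scenario. Plain set union is used because, in the model, a translation makes the target contain everything the source contains.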

The graph representation is in fact very close to the internal representation used. Beyond providing useful information for site visitors and supporting translators, it preserves the entire translation history. Figure {en_history} presents the page history of the English page in the scenario. The information from the translation history will make it possible to analyze the translation patterns and the evolution of the communities around the different linguistic versions of a page.

en_history


Maybe not really relevant


Maybe all we need is to reformat LPH's scenario a bit to use the format below, which communicates better than the impersonal approach. Also, add a bit about how prior art (LizzyWiki) succeeded in removing some of the constraints, but not all.

John creates an English page Welcome to this wiki. Pierre then translates it to the French page Bienvenue à ce wiki. Later on, John adds ten sentences to the English page Welcome to this wiki. Now Josée, who does not speak English, also wants to add two words to the French page Bienvenue à ce wiki.

With a standard wiki, the two pages would be distinct and no one would be aware of the content evolution in the other linguistic versions. With the LizzyWiki approach, Josée is not allowed to add her two words to Bienvenue à ce wiki before she has translated the ten sentences added by John to the English version Welcome to this wiki. But Josée cannot do this because she does not read English. Even if she could, she might not be in the mood to translate ten English sentences just to be allowed to add two words to the French version.

In (Désilets et al., 2006) the authors also postulated that in order to support collaborative authoring and translation in more than two languages at a time, it might be necessary to impose the use of pivot languages as intermediaries between other languages, in order to provide stable points of references in an otherwise chaotic environment.

With the CLWE project, we blindly ignore these constraints and allow authors to create original content on any linguistic version of any page, at any time.

