History: WS08Paper:Assessing Translation Progress

Preview of version: 4 (current)

The technique explained in the previous section can indicate whether a page requires updates. However, this information alone does not give readers or potential translators enough insight into the content to decide whether action should be taken at this point. A reader may not want to switch to a second or third language because of a minor change. A translator may want to wait for more significant changes before updating the content.

To provide this insight, it was necessary to find a measure that quantifies the size of the changes. Finding an accurate measure is not an easy task. In natural languages, a single-character change can completely alter the meaning, while entire paragraphs can be meaningless. For this reason, the measure is bound to be imprecise and can only be used as an estimate.

The screenshots presented in the previous section contain percentages of "Up-to-date-ness". This measure indicates how many segments have been modified in the other linguistic versions. A modified segment is a segment that is either added or removed, and segments are counted as sentences. To account for the difference in effort required to add a segment and to remove one, additions and removals are weighted differently in the calculation.
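As an illustration only, the measure described above could be sketched as follows. The paper does not give the exact formula, so the weights (`ADD_WEIGHT`, `REMOVE_WEIGHT`), the naive sentence splitter, and the ratio used to derive a percentage are all assumptions made for this example, not the CLWE implementation.

```python
import re

# Hypothetical weights: the text only states that additions and removals
# are weighted differently, not what the actual values are.
ADD_WEIGHT = 1.0
REMOVE_WEIGHT = 0.5


def split_sentences(text):
    """Naive segmentation: split after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def up_to_date_ness(old_text, new_text,
                    add_weight=ADD_WEIGHT, remove_weight=REMOVE_WEIGHT):
    """Estimate, as a percentage, how up to date old_text is relative to new_text.

    Assumed formula: unchanged / (unchanged + weighted changes) * 100.
    """
    old = split_sentences(old_text)
    new = split_sentences(new_text)
    old_set, new_set = set(old), set(new)

    unchanged = sum(1 for s in old if s in new_set)
    added = sum(1 for s in new if s not in old_set)
    removed = sum(1 for s in old if s not in new_set)

    change = add_weight * added + remove_weight * removed
    if unchanged + change == 0:
        return 100.0  # two empty texts: nothing to update
    return 100.0 * unchanged / (unchanged + change)


old = "A cat sat. It was warm. The end."
new = "A cat sat. It was hot. The end."
print(round(up_to_date_ness(old, new), 1))
```

In this sketch, replacing a single word in a three-sentence text counts as one removed and one added sentence, so the estimate changes sharply even though the edit is small.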

Alternatively, segments could have been counted as words or paragraphs. However, a sentence corresponds more closely to a single idea, and initial tests demonstrated that other techniques would make variations too large or too small. Using sentences is not perfect either: especially on small articles, replacing a word with its synonym may cause significant variations in the measure displayed.

Multiple techniques could be used to resolve this issue. However, until the CLWE has been deployed on a large number of sites, it has been preferred to leave the measure as-is and wait for more comparative data and user feedback. The potential solutions are:

  • Change segment size
  • Adapt change weight
  • Adaptive segment size or weight depending on the length of the article
  • Deeper content analysis to determine if the change modified the meaning
  • Use symbolic representation of the value rather than numeric to better represent the imprecise nature of the value
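The last option in the list, a symbolic representation, could look like the minimal sketch below. The function name `symbolic_level`, the thresholds, and the labels are hypothetical and chosen only for illustration; the paper does not specify any.

```python
def symbolic_level(percentage):
    """Map a numeric up-to-date-ness estimate to a coarse label.

    Thresholds (90 and 60) and label wording are assumptions for this example.
    """
    if not 0 <= percentage <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percentage >= 90:
        return "up to date"
    if percentage >= 60:
        return "minor changes pending"
    return "needs update"


for value in (100, 75, 30):
    print(value, "->", symbolic_level(value))
```

Coarse labels hide small fluctuations in the number, which suits a value that is only an estimate.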

History

Information Version
Tue 08 of Apr, 2008 12:58 GMT lphuberdeau 4
Mon 31 of Mar, 2008 17:32 GMT lphuberdeau 3
Sun 23 of Mar, 2008 15:47 GMT lphuberdeau 2
Wed 19 of Mar, 2008 16:45 GMT lphuberdeau 1
