History: Business case

Preview of version: 8

This breakout session looked at the business issues related to collaborative translation and translation crowdsourcing.

A first question raised was whether crowdsourcing mostly works for non-profit organizations. In other words, are there really that many people willing to provide free translation services that companies will then use to make a profit?

There are examples of free crowdsourcing being used to translate commercial content produced by for-profit organizations (ex: Adobe, Facebook), but it's still quite marginal. It is not clear that this would work at a larger scale.

On the other hand, it was pointed out that crowdsourcing does not necessarily mean that translators work for free, or for peanuts. The more controversial flavours of it (controversial for professional translators, that is) do work that way, but they are not the only possible flavours of crowdsourcing. One could use the same processes and techniques to parallelise the translation of very large documents across a large group of professional, well-remunerated translators. This could be a way to decrease lead time. As an example of decreased lead time, Facebook found that for some language pairs, they could have all 350 000 words of their user interface translated in a matter of a day by leveraging the crowd of users (who, in this case, were unpaid volunteers).

Next, we talked about the fact that translation crowdsourcing is just one of many approaches that businesses and organisations can use for translation. We wondered what the place of that new approach in the toolbox is.

If you want translation done today, your options are:
  • Professional human translation
    • Large expensive shops
    • Mom and pop shops
  • Crowd-sourcing
  • Machine Translation

It was noted that crowdsourcing sits somewhere in between professional human translation and MT in terms of:
  • Volume
  • Cost
  • Quality

So, crowdsourcing is appropriate in situations where you need a moderate level of those attributes. If you need to translate VERY, VERY LARGE amounts of text in a matter of hours, but can deal with moderate levels of quality, then raw MT is probably the only way to go. If you need VERY TOP QUALITY translations, and the volume is small and/or you can afford to wait several days, then you should go with professional human translation. If your needs fall between those two in terms of those three criteria, then maybe translation crowdsourcing is a good option.

We also talked about the extent to which crowdsourcing might or might not be able to make a dent in the growing gap between the demand for translation services and the supply of human translation. We did not have a clear sense or consensus about this. On the one hand, crowdsourcing does increase the supply of human translators, but the demand is growing so fast (mostly due to the internet and user-generated content) that it's not clear that this increase in supply is sufficient.

Another topic that was discussed is the relationship between crowdsourcing and what has been happening for years on Wikipedia. It was noted that on Wikipedia, contributions follow a power law, with a relatively small number of people doing most of the contributions. But the long tail is still very relevant, because these small changes and corrections add up to a lot. Also, in order to find this small core of very motivated Wikipedians, the Wikipedia folks had to open up their site to the whole wide world, and solicit contributions from the whole population of internet users. Will similar trends apply to translation crowdsourcing?

It was noted that an important difference between Wikipedia and translation crowdsourcing is that the latter can be parallelised at a much finer grain. Indeed, in translation crowdsourcing, it's possible (but not necessarily desirable, see below) to split a document into sentences, and have different members of the crowd translate different sentences. In contrast, on Wikipedia, one cannot ask the crowd to write an article about a topic by writing one sentence each. It's true that once a first draft of an article has been written, people can contribute to it in a more parallel way, by adding or modifying specific parts of it, fixing typos, style, etc. But one suspects that the original first version of an article is probably written by one person, or a few people (though that probably needs to be verified empirically).

Different communities and organisations involved in translation crowdsourcing use different levels of granularity. Facebook parallelises at the sentence level, while Kiva parallelises at the page level. Both approaches are probably appropriate for different situations, and each has its own advantages and disadvantages. But it's not 100% clear what the tradeoff space looks like. For example, one would expect coarser parallelisation to lead to higher quality, as it is recognized that translating out of context leads to poorer quality. But it could be that, on the contrary, parallelising at the sentence level leads to higher quality, because you can have more than one person translate a particular phrase, and use voting to choose the translation that seems best. This question of granularity is probably a good theme for empirical research.
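To make the sentence-level option a bit more concrete, here is a minimal sketch of "split into sentences, collect votes, keep the most voted translation". It is only an illustration under stated assumptions: the function names (split_into_sentences, choose_by_vote, assemble) are hypothetical, the sentence splitter is deliberately naive, and nothing here is meant to describe how Facebook or Kiva actually implement their platforms.

```python
from collections import Counter


def split_into_sentences(document):
    """Naive splitter (assumption: every sentence ends with a period)."""
    return [s.strip() + "." for s in document.split(".") if s.strip()]


def choose_by_vote(votes):
    """Return the candidate translation that received the most votes.

    `votes` is a list where each entry is the candidate one voter prefers.
    Ties go to the candidate that received a vote first.
    """
    if not votes:
        raise ValueError("no votes cast for this sentence")
    counts = Counter(votes)
    top = max(counts.values())
    return next(v for v in votes if counts[v] == top)


def assemble(document, votes_per_sentence):
    """Rebuild a translated document from per-sentence vote winners.

    `votes_per_sentence` maps each source sentence to its list of votes
    (hypothetical data structure, chosen for illustration only).
    """
    sentences = split_into_sentences(document)
    return " ".join(choose_by_vote(votes_per_sentence[s]) for s in sentences)


if __name__ == "__main__":
    doc = "Hello world. Good night."
    votes = {
        "Hello world.": ["Bonjour le monde.", "Salut monde.", "Bonjour le monde."],
        "Good night.": ["Bonne nuit."],
    }
    print(assemble(doc, votes))
```

Note that each sentence is chosen independently of its neighbours, which is exactly the "translation out of context" risk raised above; a page-level approach would skip the splitting step and hand the whole document to one translator.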

We had a brief discussion about "indirect" crowdsourcing. For example, one can think of Google's PageRank as a very indirect form of crowdsourcing, where people in the crowd "vote" on the importance of a page by creating links to it. Google then harvests these opinions from the crowd and uses them to prioritize the list of hits. Similarly, many organisations (including Google) crawl the web looking for bilingual web pages, and use that data to train their MT systems. Is this a form of "indirect" crowdsourcing for translation? Can we think of other forms of "indirect" translation crowdsourcing?

We then went into a discussion of what translation crowdsourcing actually buys you, as an organization. It was noted that organizations often do not see cost reduction as the biggest benefit. In the talk she gave at ATA the week before, Naomi Baer listed the following benefits which have been mentioned by several organizations:

  • Community involvement
  • Reduced time to publication
  • Translating content that traditionally doesn’t get translated
  • Expanding market reach to additional languages

The second point, reduced time to publication, got us thinking about whether crowdsourcing-style processes could also be used to decrease lead time in 100% professional translation shops.
  • Ex: If ProZ deployed a platform for parallelizing large translation tasks to professional members of their network.

Will crowdsourcing dramatically affect the livelihood of professional translators?
  • It will put downward pressure on cost for sure.
  • But it may create opportunities for new kinds of work (like Open Source did in software development).
    • QQ
    • Coaching the crowd
    • Terminology management, creating resources to help the crowd
    • Coordination

What kinds of tasks are not being done by professionals now:
  • Short lived, dynamic content
  • Languages with small markets, but with motivated native crowd
  • Companies with small international outreach
  • Content produced by non-profit orgs

What kinds of content ARE being translated by C-S now:
  • Fan translation: Content that you are personally attached to
  • TED Talks: high profile content
  • Your favourite app (ex: Facebook)
  • Customer support articles, manuals (Adobe) - Q: what motivates people to translate stuff there?
    • A: A crowd of value-added developers with a special, privileged relationship with the commercial organization. In those scenarios, good MT is crucial, because the crowd is probably less motivated to contribute.
  • Organisations whose mission is inspiring (ex: Kiva)

What content will ALWAYS be translated by professionals:
  • Material where cost of mistake is high
    • Safety issue (legal, medical)
  • Creative material
    • Marketing
    • Literary

Maybe translators will become 100% revisors
  • Revise crowdsourced translations
  • Revise MT outputs
  • There will be higher volumes of stuff to revise, probably enough to occupy all professional translators that exist today.

History

Information Version
Tue 30 of Nov, 2010 16:23 GMT alain_desilets 14
Tue 30 of Nov, 2010 16:20 GMT alain_desilets 13
Tue 30 of Nov, 2010 15:34 GMT alain_desilets 12
Thu 25 of Nov, 2010 19:41 GMT alain_desilets 11
Mon 22 of Nov, 2010 15:47 GMT alain_desilets 10
Mon 22 of Nov, 2010 15:33 GMT alain_desilets 9
Mon 22 of Nov, 2010 15:14 GMT alain_desilets 8
Mon 22 of Nov, 2010 14:37 GMT alain_desilets 7
Mon 22 of Nov, 2010 14:21 GMT alain_desilets 6
Thu 18 of Nov, 2010 16:00 GMT alain_desilets 5
Sun 31 of Oct, 2010 18:33 GMT alain_desilets 4
Sun 31 of Oct, 2010 18:30 GMT alain_desilets 3
Sun 31 of Oct, 2010 18:16 GMT alain_desilets 2
Sun 31 of Oct, 2010 16:50 GMT alain_desilets 1
