The work described in this paper is part of an Open Source project called the Cross Lingual Wiki Engine (CLWE) project:
http://www.wiki-translation.com/tiki-index.php?page=Cross+Lingual+Wiki+Engine+Project&bl=y
This project aims at designing, developing and testing lightweight wiki tools that can be used to translate content in the new frontier of massively collaborative environments. Our aim is to develop and evaluate processes and tools that can be used with any wiki engine. However, as a first step, we elected to use TikiWiki as our initial development platform. The hope is that once we have figured out how to enable translation the wiki-way inside of TikiWiki, other wiki engines will be able to follow suit by emulating what we have done.
NOTE FROM AD: I copied and pasted stuff above from the CLWE wiki site. Some of it is redundant with stuff that LPH wrote below. We should try to merge, co-articulate them together. Also, it seems that a lot of what is said below has already been said in the intro. We should see if some of it should be eliminated. Also, would parts of this fit better in the intro? The way I see this Background section, it's not trying to describe the problem. It's talking more about the specific platforms, people, lead users who are involved in the project. Kind of putting an actual face on the project kind of thing.
In designing the Cross Lingual Wiki Engine project, our objectives were to improve the capacity to translate collaboratively and to bring a true wiki experience to translation. We had identified translation as a key issue in content democratization, and the solutions offered by wiki engines at the time were insufficient. For content creators, they meant duplication of effort; for site visitors, they meant inaccuracy and uncertainty.
In our vision, authors from different linguistic communities can collaborate on content across the language barrier. They can contribute in the language they are most comfortable with, without fearing that their work will never be read. Site visitors, in turn, get a clear picture of where the information is and can reach it easily, without having to scan excessive amounts of text.
To reach a wide public, it was necessary to integrate the solution into an existing product. Developing a reliable, fully featured wiki engine with multilingual support that is ready for production use is a colossal effort. With so many open source implementations available and ready to receive contributions, embarking on a new implementation would have been a waste of time.
Among the many engines available, one had to be selected. To reach global acceptance, MediaWiki, the engine behind Wikipedia, would have been the best choice by far. However, the engine's popularity forces its maintainers to be very conservative about the changes they incorporate. It is very likely that changes like those required by CLWE would have been left out, restricting availability to a small group of users running our patched version.
TikiWiki CMS/Groupware was finally selected as the first engine to be modified. The openness of its community and its commitment to multilingual support made it an excellent candidate. Moreover, around the same time, the Support Mozilla community (SUMO) selected it from among many other content management systems to run its new support site. The knowledge base contained in the support site had to be available in multiple languages to reach as large a user base as possible. The SUMO group's ambition is to initially support 8 languages for the most important pages of the knowledge base. Without specific tools to support this task, it would be nearly impossible to ensure content quality.
In the context of the Cross Lingual Wiki Engine Project, the SUMO knowledge base appeared to be an excellent primary test case. Because of the number of languages to be supported and the large number of potential content and translation contributors, the knowledge base is an ideal scenario in which to test our tools and study collaborative translation behavior.
However, the SUMO group has specific requirements that had to be met beyond the initial goals of content synchronization. The SUMO group requires very high quality content. For this reason, they use an approval workflow to make sure that all content visible on the website has been approved. Moreover, they needed a way to mark translations as potentially out of date, to warn readers that significant changes have been made to another translation.
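To make the out-of-date requirement concrete, the following is a minimal sketch of one way such a marker could be computed. It is not the TikiWiki implementation: the `Page` class, its fields, and the `outdated_against` function are hypothetical names introduced only for illustration, and the sketch assumes that each language version keeps a simple version counter and remembers which version of each sibling it last incorporated.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Page:
    """One language version of an article (hypothetical model, for illustration)."""

    lang: str
    version: int = 0
    # Highest version of this page that contains a significant (non-minor) change.
    last_significant: int = 0
    # For each sibling language, the sibling version this page last incorporated.
    synced_with: dict[str, int] = field(default_factory=dict)

    def edit(self, significant: bool = True) -> None:
        """Record a new revision; minor edits do not affect sibling pages."""
        self.version += 1
        if significant:
            self.last_significant = self.version

    def mark_synced(self, other: Page) -> None:
        """Record that this page now reflects `other` up to its current version."""
        self.synced_with[other.lang] = other.version


def outdated_against(page: Page, siblings: list[Page]) -> list[str]:
    """Return the languages that hold significant changes `page` has not incorporated."""
    return [
        s.lang
        for s in siblings
        if s.lang != page.lang
        and s.last_significant > page.synced_with.get(s.lang, 0)
    ]


# Usage: the French page is flagged only after a significant English edit.
en, fr = Page("en"), Page("fr")
en.edit()                      # initial English content
fr.mark_synced(en)             # French translation of that content
en.edit(significant=False)     # typo fix: no warning needed
en.edit()                      # substantial rewrite: French is now stale
stale_sources = outdated_against(fr, [en, fr])
```

Under these assumptions, a reader of the French page would see a warning whenever `outdated_against` returns a non-empty list, and the warning would disappear once a translator calls `mark_synced` after updating the page. Because every page tracks every sibling, a significant change may originate in any language, not only in a designated source language.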
Tags are used extensively to categorize the pages in the knowledge base. To improve the experience of readers, these tags also need to support multiple languages correctly. However, this aspect is not covered in this paper.