Possible Research Questions

Starting in September 2008, Marta Stojanovic and Alain Désilets of the National Research Council of Canada will begin a 12-month R&D effort around the Cross Lingual Wiki Engine Project.

As many of you know, choosing a good research question is a very difficult task, so please help us by reading the possible ideas below, providing comments, and rating them. A good research question is one for which:

  • The answer is not known already, and cannot be found easily.
  • The answer matters and has large practical consequences for a particular community.

Thanks for your help. We are aiming to choose one of them by mid-September.

BTW: When you rate ideas, make sure you make up your own mind and write your answer down before looking at ratings from other folks.


Contexts of use


While collaborative translation has applications in a wide range of situations, we are particularly interested in research that will have impact in the following contexts:

  • Government organizations that have some sort of legal obligation to provide content in multiple languages (e.g., Canadian Government, UN departments, European Commission departments).
  • Companies that need to produce user documentation for their products in multiple languages, and who want to outsource this work to the community of users.



Q1: What is the current state of collaborative translation practices and technologies?


Description


There are lots of sites that are doing collaborative translation, and many technologies that are used to support them. A partial list can be found here:


At this point in time, nobody seems to have a good handle on everything that is happening. It would be useful to write a synthesis of the current state of affairs.

For example, we could write a survey that analyzes the different communities and tools in terms of the extent to which they operate without relying on the Assumptions of conventional translation processes.

Why is this question important?


This is important so we know what has been done already, so we can figure out what the important unresolved problems are, and can focus on solving those instead of re-inventing the wheel.

What makes this a research question?


This is not hardcore quantitative research, but it falls in the category of qualitative research. It will involve gathering information, writing and analysing surveys, and synthesizing the information into a big picture.

Proposal assessment (please prefix your scores with your initials)


Please help us by providing your own assessment of this research question, on three levels.

Importance: To what degree do you feel that the answer to this question has important, concrete consequences for the community of people doing collaborative translation?
1 = Not important, 5 = Critical importance

Workload: How many person-months do you think it will take to answer this question?

Research level: To what degree do you feel that this qualifies as research?
1 = This is not research at all, 5 = This is definitely research.

Make sure you make up your own mind before looking at other people's assessments (you can view them by clicking on the minus sign below).

Your assessment

[+]

Q2: How to best integrate computer-assisted tools into a collaborative translation platform?


Description


Professional translators have all sorts of Computer Assisted Translation (CAT) tools at their disposal (e.g., terminology databases, translation memories), which amateur translators working in a collaborative fashion often do not have.

Which of these tools should be integrated into collaborative translation platforms, and how?

In this project, we would integrate open source CAT tools into TikiWiki, have them used in an actual environment involving amateur translators, and gather feedback about their usefulness, limitations, and suggested improvements.

Why is this question important?


This is important because CAT tools have great potential for increasing the productivity of volunteer translators in a collaborative environment.

What makes this a research question?


CAT tools are pretty mature, and we know how to build them for professional translators. We also know that they have a good impact on productivity.

But it's not clear to what extent the tools need to be different to help amateur translators, or to what extent they will actually improve their productivity.

Moreover, the more open and unpredictable technical environment in which amateur translators work poses a number of design questions that will be interesting research from a Human Computer Interaction perspective.

Proposal assessment (please prefix your scores with your initials)


Please help us by providing your own assessment of this research question, on three levels.

Importance: To what degree do you feel that the answer to this question has important, concrete consequences for the community of people doing collaborative translation?
1 = Not important, 5 = Critical importance

Workload: How many person-months do you think it will take to answer this question?

Research level: To what degree do you feel that this qualifies as research?
1 = This is not research at all, 5 = This is definitely research.

Make sure you make up your own mind before looking at other people's assessments (you can view them by clicking on the minus sign below).

Your assessment

[+]

Q3: How can Machine Translation help collaborative translation communities?


Description


Collaborative translation communities often do not have sufficient human resources to cover all language pairs, and to provide translation of all content in a timely fashion.

Machine Translation might help in several ways:

  • Automatically provide a "gist quality" translation of new content. This would be only a temporary measure until a human translator finds the time to fix it.
  • Allow volunteer translators to translate content from a source language that they can't read. For example, the MT system would provide a "bad" English translation of a page written in Japanese, and the user could fix that bad English without having to actually read the original Japanese.

Why is this question important?


This is important because communities don't want to spend most of their human resources and energy on translation as opposed to the creation of original content. MT has the potential to provide "good enough" translation at a fraction of the cost in human resources that fully manual translation requires.

What makes this a research question?


MT is still bleeding edge technology, so any application that uses it is definitely research.

While there have been studies of the use of MT outputs for the purpose of gisting, and as first drafts to be post-edited by human translators, those have focused on translation of whole documents.

In the context of a collaborative community, we are more likely to want to apply MT to updates to pages. There are some interesting new issues with that context.

For example, consider a French page that is perfectly translated by a human. Someone adds two sentences to the English page. Wouldn't it be nice to be able to insert an MT translation of just those two sentences into the French page, maybe highlighting them in yellow with a warning saying that they were MT translated? Could it be that two potentially poorly MT-translated sentences are more easily understandable when presented in the context of a perfectly translated document? Also, how do we go about reliably inserting those two sentences at the right place in the French page (e.g., using alignment technology)?
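To make the insertion problem concrete, here is a minimal sketch in Python. It assumes (very optimistically) that the old English page and the French page are already split into sentences and aligned one-to-one, which is exactly what alignment technology would have to provide in practice. The function names and the `fake_mt` stand-in for an MT system are hypothetical and only serve the illustration.

```python
from difflib import SequenceMatcher


def fake_mt(sentence):
    # Hypothetical stand-in for a real MT system: it only tags the sentence
    # so that a reader can see which parts would be machine translated.
    return "[MT] " + sentence


def propagate_insertions(old_src, new_src, target, translate=fake_mt):
    """Insert MT output for sentences added to the source page.

    Assumption: old_src and target are lists of sentences that are aligned
    one-to-one (sentence i of old_src corresponds to sentence i of target).
    """
    if len(old_src) != len(target):
        raise ValueError("old source and target must be aligned one-to-one")

    result = []
    matcher = SequenceMatcher(a=old_src, b=new_src, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "insert":
            # New source sentences: add a flagged MT translation in place.
            result.extend(translate(s) for s in new_src[j1:j2])
        else:
            # "equal", "replace" and "delete": keep the existing human
            # translation untouched and leave those cases to a human.
            result.extend(target[i1:i2])
    return result


old_en = ["Hello.", "Goodbye."]
new_en = ["Hello.", "How are you?", "Goodbye."]
fr = ["Bonjour.", "Au revoir."]
print(propagate_insertions(old_en, new_en, fr))
```

The sketch only handles pure insertions. Modified or deleted sentences are deliberately left alone, since those are the cases where the questions raised in this section become hard.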

Also, suppose I have an English page that is initially all translated by MT to French. Then, I manually correct the bad MT translation to make it perfect. In particular, I modify the structure of sentence 2 to make it sound more like a French sentence (the MT translation used an English-like sentence structure). Then, someone changes the English sentence number 2. What should the MT system do? Should it replace French sentence 2 by an MT translation of the newly modified English sentence 2? If so, chances are that I will have to redo the structure modification in the French sentence 2. Is there a way that the MT system could learn from my correction made to the original French sentence 2, and use the same sentence structure to retranslate the updated English sentence 2?

There may also be some "softer" Human Computer Interaction types of issues. For example, how best to entice readers of a bad MT translation (either of a whole page, or just of part of a page) to become active participants in the community by fixing the translation?

Proposal assessment (please prefix your scores with your initials)


Please help us by providing your own assessment of this research question, on three levels.

Importance: To what degree do you feel that the answer to this question has important, concrete consequences for the community of people doing collaborative translation?
1 = Not important, 5 = Critical importance

Workload: How many person-months do you think it will take to answer this question?

Research level: To what degree do you feel that this qualifies as research?
1 = This is not research at all, 5 = This is definitely research.

Make sure you make up your own mind before looking at other people's assessments (you can view them by clicking on the minus sign below).

Your assessment

[+]

Q4: How useful is the current implementation of CLWE?


Description


We have made real progress in the CLWE project, thanks to the excellent work by Louis-Philippe Huberdeau. For a demo, see:


How useful is this to end users as it is now? What are the remaining problems to be addressed?

Why is this question important?


CLWE is still at the beta stage, and it is crucial to evaluate it in real-use situations in order to improve it.

What makes this a research question?


This is not hardcore, quantitative research, but it falls within the realm of more qualitative Human Computer Interaction research.

Proposal assessment (please prefix your scores with your initials)


Please help us by providing your own assessment of this research question, on three levels.

Importance: To what degree do you feel that the answer to this question has important, concrete consequences for the community of people doing collaborative translation?
1 = Not important, 5 = Critical importance

Workload: How many person-months do you think it will take to answer this question?

Research level: To what degree do you feel that this qualifies as research?
1 = This is not research at all, 5 = This is definitely research.

Make sure you make up your own mind before looking at other people's assessments (you can view them by clicking on the minus sign below).

Your assessment

[+]

Q5: How to better isolate textual elements in a page that need translation?


Description


The CLWE system does a pretty good job of knowing when, say, the French page is missing some edits that have been made in the English and Spanish pages.

But it does not do a great job at identifying the actual textual elements in the English and Spanish pages that need to be reproduced in French.

The actual issues are complex and a bit hard to explain, but are described in the paper entitled "The Cross-Lingual Wiki Engine: Enabling Collaboration Across Language Barriers" (soon to be available on the web... Google for the title). See the Limitations section of that paper for a description of the problem, and the Future research section for a description of a potential solution.

Why is this question important?


The current implementation of displaying what needs to be translated is based on diff technology, which can cause a lot of confusion if new page changes are interleaved with translations from another language. For example, when translating a change from English to French, the system might, in cases where there are interleaved modifications to the English page, indicate that certain portions of the English page need translation into French when in fact these English passages were originally created in French and then translated to English.
This can cause users to completely lose faith in the system.

What makes this a research question?


While diff technology is pretty straightforward, patching technology isn't, and often requires that the human be kept in the loop. The main challenge of this project is to find a way to:

  • Take a diff between, say, versions v5 and v6 of the English page.
  • Show those diffs in the context of the current version of the English page, say v9.

As far as we know, this is not a trivial problem. More advanced isolation of textual elements in a page that need translation significantly complicates the range of possible translation workflows.
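The two steps above can be illustrated with a naive Python sketch. It assumes pages are lists of lines and that a line added between v5 and v6 can only be located in v9 if it survives verbatim. That assumption is precisely what breaks down in practice, which is why this is a research problem. The function names are hypothetical.

```python
from difflib import SequenceMatcher


def added_lines(old, new):
    """Return the lines that were inserted or rewritten between old and new."""
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    added = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag in ("insert", "replace"):
            added.extend(new[j1:j2])
    return added


def locate_in_current(added, current):
    """Pair each added line with its index in the current version.

    The index is None when the line no longer appears verbatim, meaning a
    human (or a smarter patching algorithm) has to step in.
    """
    return [
        (line, current.index(line) if line in current else None)
        for line in added
    ]


v5 = ["Intro.", "Install the wiki."]
v6 = ["Intro.", "Check the requirements.", "Install the wiki."]
v9 = ["Welcome!", "Intro.", "Check the requirements.", "Install the wiki."]

print(locate_in_current(added_lines(v5, v6), v9))
```

As soon as a later version edits or moves the added text, the verbatim lookup returns None, so a realistic solution would need some form of fuzzy matching or patch rebasing.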

Proposal assessment (please prefix your scores with your initials)


Please help us by providing your own assessment of this research question, on three levels.

Importance: To what degree do you feel that the answer to this question has important, concrete consequences for the community of people doing collaborative translation?
1 = Not important, 5 = Critical importance

Workload: How many person-months do you think it will take to answer this question?

Research level: To what degree do you feel that this qualifies as research?
1 = This is not research at all, 5 = This is definitely research.

Make sure you make up your own mind before looking at other people's assessments (you can view them by clicking on the minus sign below).

Your assessment

[+]

Q6: What is the value of supporting cross-lingual searching, and how best to implement it?


Description


In a site that is collaboratively translated, some of the information may be available only in particular languages and not in others.

When searching for information, users probably want to find the information no matter in which language it is present. But obviously they don't want to write the same query in different languages.

There are experimental technologies for doing cross-lingual search. For example, a user writes a query in English, and the system searches for it in all languages (usually by automatically translating the query to the different languages). Combined with a Machine Translation system for translating the hits found in different languages, this might be good enough for people to find and understand information in pages written in languages that they can't read.
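The query-translation approach can be sketched as a toy example in Python. Everything here is an assumption made for illustration: the tiny `QUERY_TRANSLATIONS` lexicon stands in for a real MT or bilingual dictionary component, pages are plain (id, language, text) tuples, and matching is simple keyword containment rather than a real search index.

```python
# Hypothetical toy lexicon standing in for automatic query translation.
QUERY_TRANSLATIONS = {
    "translation": {"fr": "traduction", "es": "traducción"},
    "collaborative": {"fr": "collaborative", "es": "colaborativa"},
}


def translate_query(terms, lang):
    # Terms missing from the lexicon are passed through unchanged.
    return [QUERY_TRANSLATIONS.get(t, {}).get(lang, t) for t in terms]


def cross_lingual_search(query, pages, source_lang="en"):
    """Return (page_id, lang) for pages containing every query term.

    pages is a list of (page_id, lang, text) tuples. The query is written
    in source_lang and translated into each page's language before matching.
    """
    terms = query.lower().split()
    hits = []
    for page_id, lang, text in pages:
        local_terms = terms if lang == source_lang else translate_query(terms, lang)
        words = set(text.lower().split())
        if all(t in words for t in local_terms):
            hits.append((page_id, lang))
    return hits


pages = [
    ("p1", "en", "collaborative translation wiki"),
    ("p2", "fr", "wiki de traduction collaborative"),
    ("p3", "es", "motor de búsqueda"),
]
print(cross_lingual_search("translation wiki", pages))
```

A real system would also have to translate the hits back for the reader, which is where the quality of the MT output starts to matter.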

Does such a feature have value for collaborative translation communities? If so, where does it lie? How can we best implement such features?

Why is this question important?


This is another way to deal with the fact that in collaboratively translated sites, it is not always possible to translate all relevant information to all languages in a timely fashion.

What makes this a research question?


Cross Lingual Search technology is still bleeding edge, so it's not clear that it will work to a sufficient level to provide value to end users. We plan to find out by building it and trying it out with real end users.

Proposal assessment (please prefix your scores with your initials)


Please help us by providing your own assessment of this research question, on three levels.

Importance: To what degree do you feel that the answer to this question has important, concrete consequences for the community of people doing collaborative translation?
1 = Not important, 5 = Critical importance

Workload: How many person-months do you think it will take to answer this question?

Research level: To what degree do you feel that this qualifies as research?
1 = This is not research at all, 5 = This is definitely research.

Make sure you make up your own mind before looking at other people's assessments (you can view them by clicking on the minus sign below).

Your assessment

[+]

Q: How could bilingual alignment technology decrease reliance on users for assessing translation completion?


Description


The CLWE system currently relies heavily on the user to tell it when a particular translation task is complete (Complete Translation versus Partial Translation buttons for saving). If the user mistakenly pushes the wrong button, this may result in changes not being propagated to other languages, or in substantial confusion for subsequent translators of the same page.

One way to alleviate this problem would be to use automatic bilingual sentence alignment technologies to perform a basic sanity check on the alignment of the saved target page with the source page. The system could then notify the user when the alignment does not seem to correspond to his or her choice of Complete Translation versus Partial Translation button.
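As a rough illustration of what such a sanity check could look like, here is a Python sketch. It does not perform real sentence alignment. Instead it uses a much cruder heuristic, comparing sentence counts and character lengths of the two pages, on the assumption that a complete translation has roughly as many sentences and roughly as much text as its source. The thresholds, function names, and naive sentence splitter are all hypothetical.

```python
import re


def split_sentences(text):
    # Naive splitter: break after sentence-final punctuation plus whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def looks_complete(source, target, min_ratio=0.7, max_ratio=1.5):
    """Heuristic guess at whether target fully translates source."""
    src_n = len(split_sentences(source))
    tgt_n = len(split_sentences(target))
    if src_n == 0:
        return tgt_n == 0
    count_ratio = tgt_n / src_n
    length_ratio = len(target) / len(source)
    return (min_ratio <= count_ratio <= max_ratio
            and min_ratio <= length_ratio <= max_ratio)


def check_save(source, target, claimed_complete):
    """Return a warning when the user's choice and the heuristic disagree."""
    if claimed_complete != looks_complete(source, target):
        return "Your choice of Complete/Partial Translation looks inconsistent with the page."
    return None


english = "The cat sleeps. The dog barks. The bird sings."
french_full = "Le chat dort. Le chien aboie. L'oiseau chante."
french_partial = "Le chat dort."

print(check_save(english, french_full, claimed_complete=True))
print(check_save(english, french_partial, claimed_complete=True))
```

A check like this should only ever warn, never block the save, since length ratios vary a lot between language pairs and a false alarm is cheap for the user to dismiss.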

Why is this question important?




What makes this a research question?




Proposal assessment (please prefix your scores with your initials)


Please help us by providing your own assessment of this research question, on three levels.

Importance: To what degree do you feel that the answer to this question has important, concrete consequences for the community of people doing collaborative translation?
1 = Not important, 5 = Critical importance

Workload: How many person-months do you think it will take to answer this question?

Research level: To what degree do you feel that this qualifies as research?
1 = This is not research at all, 5 = This is definitely research.

Make sure you make up your own mind before looking at other people's assessments (you can view them by clicking on the minus sign below).

Your assessment

[+]

Q: ???


Description




Why is this question important?




What makes this a research question?




Proposal assessment (please prefix your scores with your initials)


Please help us by providing your own assessment of this research question, on three levels.

Importance: To what degree do you feel that the answer to this question has important, concrete consequences for the community of people doing collaborative translation?
1 = Not important, 5 = Critical importance

Workload: How many person-months do you think it will take to answer this question?

Research level: To what degree do you feel that this qualifies as research?
1 = This is not research at all, 5 = This is definitely research.

Make sure you make up your own mind before looking at other people's assessments (you can view them by clicking on the minus sign below).

Your assessment

[+]
