Alain's comments on translation tracking architecture, 2008-02-05

These are Alain's comments on an architecture document written by LPH. The document describes how the system will track the relationships between changes made and translated in different language versions of the same page.

The architecture is attached at the bottom of this page.

Very flexible framework


I like the fact that the framework is highly flexible. In particular, it is completely agnostic about whether or not we have a master language or pivot language or none. This will give us flexibility to experiment with different types of workflows.

I also like the way you address the need for preserving the complete translation relationship record, while allowing fast display (using translation bit flags).

Response: Everything is kept, no worries. In the aerospace world, DELETE is not valid.


User still needs to manually label synchronisation points


On page 2, the document says:

Trying to identify the point in time when two pages are synchronized would usually require human validation. Instead, this model keeps track of which unique modifications the page has been exposed to. Pages that have been exposed to the same content are considered equivalent. Similarly, pages that have been exposed to more content can be considered superior.


The question I have is: what does "exposed" mean in this context? It seems to me that you are assuming that as soon as the user enters a transaction to translate a change from, say, En to Fr and clicks on save, the Fr page is deemed to have been exposed to the En change.

But that assumes that translators will always translate a whole change in one go and will not want to do an interim save. This may not be the case when translating long changes.

I wrote a bit about this on the page Alain's test and feedback, 2008-02-05; see Issue #9.

Response: There will be a need to support partial translation. The limitation currently comes from the user interface, which propagates as soon as a translation is saved. The solution would probably be to allow the translator to mark the current translation as incomplete. As far as the architecture goes, nothing would change. No bits get propagated for an incomplete change, so the need for translation stays active.

There is no way to select which of the bits got translated. While it could in theory be possible, the only way to do it is to look at the diff made by the original edit, which really could be in any language. Moreover, if no translations are made for a long period, nothing of what was originally added may remain. A translation bit really only represents an edit on the content. It could be adding, removing or reorganizing. For propagation, it's all-or-nothing.
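A minimal sketch of the all-or-nothing behaviour described in this response, assuming each page simply carries the set of edit ids ("translation bits") it has been exposed to. The names `Page`, `save_translation` and the `complete` flag are hypothetical and only illustrate the idea of marking a translation as incomplete; they are not taken from the architecture document.

```python
class Page:
    """Hypothetical page model: a language code plus the bits it was exposed to."""

    def __init__(self, lang, bits=()):
        self.lang = lang
        self.bits = set(bits)


def save_translation(source, target, complete=True):
    """Save a translation from source into target.

    A complete translation propagates every bit of the source (all-or-nothing).
    An incomplete (interim) save propagates nothing.
    Returns the bits the target still needs to be exposed to.
    """
    if complete:
        target.bits |= source.bits
    return source.bits - target.bits


en = Page("en", {1, 2, 3})
fr = Page("fr", {1})

# Interim save: nothing propagates, so the need for translation stays active.
pending_after_interim = save_translation(en, fr, complete=False)

# Final save: all bits propagate at once.
pending_after_final = save_translation(en, fr, complete=True)
```

With this model, an interim save needs no change to the data structure at all, which matches the claim that the architecture itself would stay the same.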

This was actually planned from the start. I just couldn't figure out a way to handle it properly right away. I will try to expand on this.


Can translators decide not to incorporate a particular change from a language?


Often there are some things that are too cultural and are not appropriate in other languages. In cases like this, translators must have the option to simply not integrate a particular change in their language version.

I don't think there are limitations in the architecture w.r.t. that, but just wanted to raise the issue.

Response: In fact, you just pointed out one of the largest limitations of the architecture: it does not know about the social dynamics. It does not know if the change was applied properly or not. It simply propagates the bits and updates itself.

The solution to this is probably to allow the translator to indicate that an interpretation of the content was made. While the translation was made and the translated page was exposed to the other bits in a way that is suitable, the change made cannot be used as a reference for other pages to translate from. In fact, an additional translation bit should probably be created as part of the operation, just in case other languages want to benefit from the interpretation.
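One possible reading of this proposal, sketched under assumptions: the translated page is still exposed to all of the source's bits, and flagging the save as an interpretation creates one additional bit on the target. The function `translate`, the `interpreted` flag and the id counter are hypothetical. The sketch does not model the rule that an interpreted change should not serve as a reference for other pages.

```python
import itertools

# Hypothetical id generator for newly created bits.
_next_bit = itertools.count(100)


def translate(source_bits, target_bits, interpreted=False):
    """Return the target's bit set after a translation from the source.

    The target is exposed to every source bit. If the translator flags the
    result as an interpretation, one extra bit is created on the target.
    """
    updated = target_bits | source_bits
    if interpreted:
        updated = updated | {next(_next_bit)}
    return updated


en = {1, 2}
fr = translate(en, {1}, interpreted=True)

# The French page now carries a bit the English page has not seen,
# so other languages can notice and pick up the interpretation.
new_for_en = fr - en
```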

In a community where this type of interpretation is welcome, this is probably a good solution. In other situations, the translation might just have to be reverted. Again, I don't currently have a way to revert propagations. Some sort of review process might also be required.

I tried to mention this in quite a few places, but really couldn't make a strong point out of it. I think I will just dedicate a complete section to it.


Formulae are not clear


Even though I am a mathematician by training, I find I have no stamina for formulae anymore ;-).

BTW: What you show at the bottom of page 2 are formulae, not equations.

Response: Good point, my mistake. Easy fix.


I had trouble understanding them. In particular, you say that alpha is the source page, and omega is the set of translations for a given page. Given that, how can you say that alpha is an element of omega when the elements of omega are not pages? Also, you didn't define what Phi was.

Response: Elements in a set of translations really are pages. I should make this clear.

Phi is defined by formulae 3 and 4. I just didn't think it was necessary to give more details. Basically, it's the set of all translation bits available for a given translation set. Theta is simply any page in the set. Source and target did not matter in that case.
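As an illustration of this verbal definition only (formulae 3 and 4 themselves are not reproduced here), Phi can be sketched as the union of the bits of every page in the translation set. The dictionary layout and the function name `all_bits` are assumptions made for the example.

```python
def all_bits(translation_set):
    """Phi: every translation bit available anywhere in the translation set."""
    phi = set()
    for bits in translation_set.values():
        phi |= bits
    return phi


# omega: a translation set, here keyed by language (hypothetical data).
omega = {
    "en": {1, 2, 3},
    "fr": {1, 2},
    "es": {1, 4},
}

phi = all_bits(omega)

# For any page theta in omega, the bits it has not been exposed to yet.
missing = {lang: phi - bits for lang, bits in omega.items()}
```

No page plays the role of source or target here, which is consistent with the remark that the distinction does not matter for Phi.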


Associating translation bits with parts of pages


I can see how the translation bit flag hack will allow you to quickly determine if a page in one language needs to be exposed to changes from another language.

It's not clear to me though how you will determine WHICH changes, i.e. which parts of the source page need to be brought into the target page.

Will there be some sort of version number associated with a particular bit in the bit mask?

Response: This is mostly part of the graph section. When a bit is propagated, it is bound to a version. By digging in the history and looking for junctions, you can find which revision matches.

The actual implementation does not use bits. It's only a way to explain it.
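In the same explanatory spirit, here is a sketch of how binding bits to versions could answer the "which changes" question. It assumes the source page's history is an ordered list of `(version, bits)` pairs; that layout and the function `versions_to_diff` are hypothetical, not the actual implementation.

```python
def versions_to_diff(source_history, target_bits):
    """Find the source revisions whose diff covers what the target is missing.

    source_history is ordered oldest to newest; each entry is (version, bits),
    where bits is the set the source page had been exposed to at that version.
    Returns (base_version, latest_version), or None if nothing is missing.
    base_version is None when the missing content starts at the first version.
    """
    latest_version, latest_bits = source_history[-1]
    missing = latest_bits - target_bits
    if not missing:
        return None
    for index, (version, bits) in enumerate(source_history):
        if bits & missing:
            # The revision just before the first missing bit appeared.
            base = source_history[index - 1][0] if index > 0 else None
            return (base, latest_version)


en_history = [(10, {1}), (11, {1, 2}), (12, {1, 2, 3})]

diff_for_fr = versions_to_diff(en_history, {1})
diff_when_up_to_date = versions_to_diff(en_history, {1, 2, 3})
```

The translator would then be shown the diff between the two returned versions of the source page.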


Will consecutive translation bits be merged together?


It seems to me that every time the user saves a page (while authoring original content), he generates a new translation bit. Does this mean that when the time comes to propagate the changes to other languages, the translator will need to do a translation transaction for each of those consecutive bits? Or will "consecutive" translation bits (exact meaning of "consecutive" to be determined) be somehow aggregated into a single translation bit?

I wrote a bit about this on the page Alain's test and feedback, 2008-02-05, under Comment #7.

Response: This was caused by a problem that existed until this morning. The history lookup would only search for updates from source to target while target to source also mattered. It caused quite a lot of confusion.

Any translation operation propagates all bits, so they are really merged together from a user standpoint.
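A small sketch of both points in this response, again using plain sets of bit ids as a stand-in for the real data: consecutive saves create separate bits, a single translation operation clears them all, and pending changes have to be checked in both directions. The helper `pending` is a hypothetical name.

```python
def pending(source_bits, target_bits):
    """Bits the target has not yet been exposed to."""
    return source_bits - target_bits


en, fr = set(), set()

# Three consecutive saves on the English page create three bits.
for bit_id in (1, 2, 3):
    en.add(bit_id)

# An independent edit on the French page creates its own bit.
fr.add(4)

# Both directions matter when looking for updates.
en_to_fr_before = pending(en, fr)
fr_to_en_before = pending(fr, en)

# One translation operation from En to Fr propagates all bits at once.
fr |= en

en_to_fr_after = pending(en, fr)
fr_to_en_after = pending(fr, en)
```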
