Generating a translation memory and running a quality test


by José Miguel Andonegi Martínez -
Number of replies: 6

Hello:

Translation memories (TMs) are very useful in the translation process. Unfortunately, AMOS doesn't work with TMs. A .tmx file holds all the translations, which makes it easier to consult similar texts.

This tutorial explains how to create a .tmx file from the Moodle language packs and how to run a quality test on that file to detect translation errors. I have used it to improve a few texts in the Basque language pack.

I have done the same process with the Spanish language and this is the translation memory generated.

Is anybody using TMs or translation software like OmegaT? I wonder whether it would be worth using that kind of tool and importing the translated file into AMOS.

In reply to José Miguel Andonegi Martínez

Re: Generating a translation memory and running a quality test

by koen roggemans -

Hi José,

Thanks a lot for this. My translation memory sits on top of my shoulders, and that is probably not the best place for it. The automatic quality checks you managed to run are very interesting - I didn't know that was possible.

I can imagine it being possible to generate a .tmx file with AMOS from what is already in the language pack. I don't think it is technically possible to import strings from a .tmx file back into AMOS, because some important information is missing - unless I overlooked it.

Let me explain: if you look at the beginning of the file:

<?xml version="1.0" encoding="UTF-8"?>
<tmx version="1.4"><header creationtool="net.sf.okapi.steps.idaligner.IdBasedAlignerStep" creationtoolversion="unknown" segtype="paragraph" o-tmf="unknown" adminlang="en" srclang="en" datatype="unknown"></header><body>
<tu tuid="access">
<tuv xml:lang="en"><seg>Accessibility</seg></tuv>
<tuv xml:lang="es"><seg>Accesibilidad</seg></tuv>
</tu>
<tu tuid="accesshelp">
<tuv xml:lang="en"><seg>Accessibility help</seg></tuv>
<tuv xml:lang="es"><seg>Ayuda sobre accesibilidad</seg></tuv>
</tu>

You can see there is some important information missing: the file the string comes from. That filename is needed to know which component a string originates from. If I count the occurrences of the tuid "pluginname" with cat moodle28_en_es.tmx | grep '<tu tuid="pluginname">' | wc -l, the result is 448. Which plugin name belongs to which plugin, and which file should it go in? You can never figure that out, because the information is missing (unless I missed something).
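The same count can be sketched in Python for every tuid at once, which shows how many identifiers are ambiguous. This is only an illustration: the sample units below (Assignment/Tarea, Quiz/Cuestionario) are made up for the example and are not taken from the real file.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Tiny made-up TMX sample; the real file would be read from disk instead.
SAMPLE_TMX = """<?xml version="1.0" encoding="UTF-8"?>
<tmx version="1.4"><header srclang="en" datatype="unknown"></header><body>
<tu tuid="pluginname">
<tuv xml:lang="en"><seg>Assignment</seg></tuv>
<tuv xml:lang="es"><seg>Tarea</seg></tuv>
</tu>
<tu tuid="pluginname">
<tuv xml:lang="en"><seg>Quiz</seg></tuv>
<tuv xml:lang="es"><seg>Cuestionario</seg></tuv>
</tu>
<tu tuid="access">
<tuv xml:lang="en"><seg>Accessibility</seg></tuv>
<tuv xml:lang="es"><seg>Accesibilidad</seg></tuv>
</tu>
</body></tmx>
"""


def count_tuids(tmx_text):
    """Return a Counter mapping each tuid to the number of <tu> elements using it."""
    # Parse from bytes so the encoding declaration in the XML header is honoured.
    root = ET.fromstring(tmx_text.encode("utf-8"))
    return Counter(tu.get("tuid") for tu in root.iter("tu"))


counts = count_tuids(SAMPLE_TMX)
# Every tuid that occurs more than once cannot be mapped back to a single file.
ambiguous = {tuid: n for tuid, n in counts.items() if n > 1}
print(ambiguous)
```

Any tuid that appears in the ambiguous dictionary is one that could not be imported back without knowing its source file.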



In reply to koen roggemans

Re: Generating a translation memory and running a quality test

by koen roggemans -

Step 1 of your manual for Linux/Mac users (see attachment):


In reply to koen roggemans

Er: Re: Generating a translation memory and running a quality test

by José Miguel Andonegi Martínez -

Hi koen:

Thanks for your script. I will include it in the tutorial and I will also try to write a similar .bat file.

You are right about that limitation in the described process. It comes from my limited knowledge of filters in Rainbow :-). It would be better to generate an id composed of the filename + id.
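A minimal sketch of what such a composite id could look like, in Python. The "/" separator and the function names make_tuid and split_tuid are assumptions made for this example; Rainbow itself may offer a different way to build the id.

```python
# Hypothetical separator; it must not occur in language file names.
SEPARATOR = "/"


def make_tuid(filename, string_id):
    """Build a composite id such as 'quiz/pluginname' from 'quiz.php' and 'pluginname'."""
    component = filename[:-4] if filename.endswith(".php") else filename
    return f"{component}{SEPARATOR}{string_id}"


def split_tuid(tuid):
    """Recover (filename, string_id) from a composite id built by make_tuid."""
    component, string_id = tuid.split(SEPARATOR, 1)
    return component + ".php", string_id


# The same string id in two files now yields two distinct translation units.
print(make_tuid("quiz.php", "pluginname"))
print(make_tuid("assign.php", "pluginname"))
print(split_tuid("quiz/pluginname"))
```

With ids of this form, the file a string belongs to can be recovered from the .tmx alone.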

I'm not an expert in this kind of technology, but if I am not wrong, a translation memory is something like a repository where you store your translations. It is useful as an input to suggest similar texts or to run quality checks. Just having an export option in AMOS would be very useful.

I'm new to AMOS and I find it a great tool for managing the translation process (proposal, review, approval), but it would be nice to have translation proposals based on similar texts.

Another possible way:

1) Use an alignment tool like Rainbow to convert each pair of .php files into an .xliff file (using an improved alignment process that includes the file name in the id).

2) Translate the .xlf files with OmegaT using the .tmx as an input (along with other tools such as glossaries, integrated quality checks, ...)

3) Reverse the alignment process with Rainbow to generate a completely translated .php file

4) Import the translated file into AMOS

Regards!

In reply to José Miguel Andonegi Martínez

Er: Re: Generating a translation memory and running a quality test

by José Miguel Andonegi Martínez -

Hi Koen:

I have found an easier way to generate the .tmx: just export it from the language customization tables. This is the procedure:

  1. Load the up-to-date language packs in a Moodle instance where you have access to the database (it can be on your own computer)
  2. Run the language customization option, so that the texts are loaded in the database.
  3. The attached files have the SQL request to be run and a template of the translation memory.
  4. Change the language from eu to your language.
  5. Copy and paste the result of the query in the template
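The copy-and-paste step can also be scripted. The following Python sketch assumes that the query returns four columns per row (component, string id, English original, translation); the attached SQL and template are not reproduced here, so the row layout, the header values and the function name rows_to_tmx are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# ElementTree writes attributes in this namespace as xml:lang.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def rows_to_tmx(rows, source_lang="en", target_lang="eu"):
    """Turn (component, string_id, source_text, target_text) rows into a TMX string."""
    tmx = ET.Element("tmx", version="1.4")
    # Header attributes mirror the ones seen in the thread; the values are placeholders.
    ET.SubElement(tmx, "header", {
        "creationtool": "sketch",
        "creationtoolversion": "0",
        "segtype": "paragraph",
        "o-tmf": "unknown",
        "adminlang": "en",
        "srclang": source_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for component, string_id, source_text, target_text in rows:
        # Keep the component in the id so the string can be traced back to its file.
        tu = ET.SubElement(body, "tu", tuid=f"{component}/{string_id}")
        for lang, text in ((source_lang, source_text), (target_lang, target_text)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")


# Illustrative row only; real rows would come from the database query.
sample_rows = [("access", "access", "Accessibility", "Accesibilidad")]
print(rows_to_tmx(sample_rows, target_lang="es"))
```

Building the file with an XML library rather than by pasting text also takes care of escaping characters such as & and < inside the strings.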

The resulting file can be analysed with Checkmate.
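As an illustration of the kind of check that can be automated on such a file (this is a sketch, not a description of how Checkmate works internally), the following Python snippet compares the Moodle placeholders such as {$a} or {$a->name} in the source and target texts. The function name placeholder_mismatch and the sample sentences are made up for the example.

```python
import re

# Matches Moodle string placeholders of the form {$a} or {$a->property}.
PLACEHOLDER = re.compile(r"\{\$a(?:->\w+)?\}")


def placeholder_mismatch(source, target):
    """Return the placeholders that appear in only one of the two texts."""
    in_source = set(PLACEHOLDER.findall(source))
    in_target = set(PLACEHOLDER.findall(target))
    return sorted(in_source ^ in_target)


# An empty list means both texts use the same set of placeholders.
print(placeholder_mismatch("Hello {$a->name}", "Hola {$a->name}"))
print(placeholder_mismatch("You have {$a} messages", "Tienes mensajes"))
```

A non-empty result points at a translation that dropped or altered a placeholder and deserves a manual look.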

In reply to José Miguel Andonegi Martínez

Re: Generating a translation memory and running a quality test

by Ralf Hilgenstock -

I have been working with the Moodle translation process for a long time, and also on the translation of several other tools. I have also worked with different translation tools.

Here are the screens from two tools I have used over the last months:

http://translations.launchpad.net

Launchpad serves different software projects. You can see the translations from the tool you are working with and from other tools that use exactly the same strings, and copy them into your translation window.


https://www.transifex.com/

This is a commercial service. You see not only the same term and its translation in your tool, but also other strings that may be similar.

What are the main problems in translation:

  • Consistent use of terminology
    • A long time has passed since the last translation of the string for the same activity/element
    • Different translators working independently
    • The context the translator has in mind when translating
  • Context
    • Finding the best wording in context. Context can be about the tool, but also about the target group. In Germany we use different words for teachers and students in schools, higher education and adult training programs.
  • General, language related
    • English uses different terms where German uses only one, e.g. answer/response: Antwort
    • English uses one term where German uses different words in different situations.
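The terminology problem in particular lends itself to a simple automated check on a translation memory. The Python sketch below groups source/target pairs by source text and reports sources that have more than one translation. The function name and the sample pairs are invented for the example, and a hit is only a candidate for review, since different wording can be correct in different contexts.

```python
from collections import defaultdict


def inconsistent_translations(pairs):
    """Map each source text (lower-cased) to its translations when there are several."""
    targets_by_source = defaultdict(set)
    for source, target in pairs:
        targets_by_source[source.strip().lower()].add(target.strip())
    return {
        source: sorted(targets)
        for source, targets in targets_by_source.items()
        if len(targets) > 1
    }


# Made-up sample pairs; real pairs would be read from the .tmx file.
sample_pairs = [
    ("Answer", "Antwort"),
    ("Response", "Antwort"),
    ("Teacher", "Lehrer/in"),
    ("Teacher", "Trainer/in"),
]
print(inconsistent_translations(sample_pairs))
```

Note that the check only catches one source with several targets; the reverse case (answer and response both becoming Antwort) is not an error and is deliberately not reported.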
In reply to Ralf Hilgenstock

Re: Generating a translation memory and running a quality test

by Daniel Neis Araujo -

Hello,


nice work being done!

I have been translating Moodle and other software and content: first Ubuntu on Launchpad, back in 2009 or earlier, then Moodle of course, and more recently I tried Transifex for the GNU MediaGoblin project (http://mediagoblin.org/), which has recently moved to Pootle (https://chapters.gnu.org/projects/). They use GNU PO files, and Pootle lets you download the text file and translate offline, as well as translate and review online. I used it for just an afternoon and it seems to be a very good tool. One point to note is that they count words to translate, not strings. It would be nice to have this in AMOS too =)


Last year I also took part in the Mozilla translation quality project (https://blog.mozilla.org/l10n/2014/06/16/translation-quality-at-mozilla/), which is a very good approach to quality review. They created a framework for it (http://www.qt21.eu/mqm-definition/definition-2014-06-06.html) and used a tool to carry out the work (http://scorecard2.gevterm.net/).

They have very good instructional videos on how to use the tools, it is really quick to review what needs work, and you get a better metric than "100% translated". With this review tool you can see how many well and poorly translated strings you have.


I've also played with video subtitling on Amara.org, which is an awesome, jaw-dropping tool. I was really amazed at how easy it is to translate and sync subtitles.


Last month I also tried the new Wikipedia Content Translation tool, which eases the process of translating Wikipedia pages from one language to another when the page does not exist in the destination language. Interesting links are:

https://blog.wikimedia.org/2015/01/20/try-content-translation/

https://pt.wikipedia.org/wiki/Especial:ContentTranslation

https://www.mediawiki.org/wiki/Content_translation/Translation_tools


It would be very nice if Moodle let users translate strings in the regular interface; it would eliminate the major problem with translations, which is context. On AMOS you often don't know whether a phrase is "infinitive" or "declarative" or things like that.

We already have an option that lets you pass "&strings=1" in the URL and get the identifiers for the strings. What about implementing a web service in AMOS that lets people click on a text, which then turns into a text input with confirm and cancel buttons attached, so that clicking confirm sends it as a contribution to AMOS? A side benefit would be to put an end to 300-string contributions that are hard (time-consuming) to review.


What do you say?


Kind regards,

Daniel