Using Amazon Mechanical Turk for Translating Computer Software

Summary

It’s possible to translate software using Amazon Mechanical Turk, but it’s not ideal.

What is Amazon Mechanical Turk?

It’s a website run by Amazon (yes, the shopping company). Anyone can create an account and post questions or a set of tasks, agreeing to pay a certain amount for each completed task. Then someone else finds your task in the list, decides it’s worth their time, and does whatever is requested.

The tasks could be anything. Here’s an example that’s well-suited to Mechanical Turk: “Draw bounding boxes around objects in images — Draw a box around counter: table consisting of a horizontal surface over which business is transacted.” And then there is a picture of a kitchen and you’re supposed to draw an outline around the counter.

It’s a pretty cool concept.

What I Did

Late last year I decided to try translating software (often described as “localizing”) using Mechanical Turk. I chose to translate Meebo’s Android IM application into Spanish. The application is written in a way that allows it to be localized: individual chunks of text (usually called “strings”) live in a single XML file, and the application’s source code looks up a string in that file whenever it needs to show text to the user.
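To give a rough idea of what pulling the strings out of that file looks like, here is a minimal Python sketch. It assumes the usual Android layout of a resources element containing named string elements. The sample strings and the load_strings helper are made up for illustration and are not taken from Meebo’s application.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the usual Android strings-file layout.
# These entries are illustrative, not Meebo's actual strings.
SAMPLE_XML = """<resources>
    <string name="sign_in">Sign in</string>
    <string name="buddy_list_title">Buddies</string>
    <string name="error_network">Could not connect. Check your network and try again.</string>
</resources>
"""


def load_strings(xml_text):
    """Return a dict mapping each string's name to its English text."""
    root = ET.fromstring(xml_text)
    return {el.get("name"): (el.text or "") for el in root.iter("string")}


strings = load_strings(SAMPLE_XML)
for name, text in strings.items():
    print(name, "->", text)
```

Each name/text pair can then become one translation task, and the name is what you use to put the Spanish text back into a translated copy of the file.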

I posted each string to Mechanical Turk as a separate task. I offered $0.10 for short strings and $0.20 for longer strings. To improve the quality of the results, I asked for each shorter string to be translated by 3 different people and each longer string by 5 different people. That way I could check whether everyone gave the same response, and pick and choose between different phrasing and word choice when they didn’t.
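The pricing and the agreement check can be sketched in a few lines of Python. This is only an illustration under assumptions: the cutoff between “short” and “longer” strings (SHORT_MAX_WORDS) is a guess, since I didn’t define one above, the function names are hypothetical, the Spanish responses are invented examples, and the cost ignores any fees Mechanical Turk adds on top of the rewards.

```python
from collections import Counter

# Assumption: the post does not say where "short" ends; 4 words is a guess.
SHORT_MAX_WORDS = 4


def plan_task(text):
    """Return (reward_in_cents, number_of_translators) for one string."""
    if len(text.split()) <= SHORT_MAX_WORDS:
        return 10, 3   # $0.10 each, 3 translators
    return 20, 5       # $0.20 each, 5 translators


def total_cost_cents(texts):
    """Sum reward * translators over all strings (fees not included)."""
    total = 0
    for text in texts:
        reward, translators = plan_task(text)
        total += reward * translators
    return total


def tally(responses):
    """Group responses that differ only in case or spacing.

    Returns (most_common_response, votes, unanimous).
    """
    normalized = [" ".join(r.split()).lower() for r in responses]
    top, votes = Counter(normalized).most_common(1)[0]
    return top, votes, votes == len(responses)


texts = ["Sign in", "Could not connect. Check your network and try again."]
print(total_cost_cents(texts))  # 10*3 + 20*5 = 130 cents

# Invented responses for a single short string.
print(tally(["Iniciar sesión", "iniciar sesión ", "Entrar"]))
```

A tally like this only tells you which strings the translators disagreed on. Choosing between the competing versions still takes someone who knows the language.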

It worked, but I don’t have a high degree of confidence in the translation. I ended up having to use an online English-Spanish dictionary, Google Translate, and my own rusty knowledge of Spanish to decide between different versions of strings. I’m sure I didn’t do justice to my high school Spanish teachers, and I’m sure the result isn’t perfect.

Lessons Learned

  • It’s easy to translate text poorly and hard to translate it well. It’s an art. Each word choice carries subtle nuances, and it’s very difficult to preserve those when translating to another language. Translating computer text is also usually different from translating a book.
  • Translators do a better job with context. For example, should the word “login” be translated as a noun or a verb? Should “sign in” mean that you’re connecting to another computer, or that you’re writing your name on a sign-in sheet at a bridal show?
  • Having the same person translate all strings will provide better consistency between the strings.

Better Options

  • If you can afford it, pay a professional software translation team. Professionals know what verb tenses to use for buttons versus menu titles versus dialog titles, etc. You’ll have better consistency if you have just one or two people translate an entire language.
  • Have your users translate for you. If you’re a popular open source project this will work well. The translators may or may not be as talented as a professional who translates software for a living, but if they’re users of the software then they’ll be self-motivated to translate it, and will hopefully maintain the translation into the future.
  • Have your translators create a glossary of words commonly used in your application (“username,” “e-mail,” “buddy”) and translate this glossary first. That way current and future translators can reference this glossary and maintain consistency across their translations.
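
Once a glossary exists, a small script can flag translations that ignore it. The sketch below is hypothetical: the function name is made up, the Spanish glossary entries are illustrative choices that a real translator would need to approve, and the plain substring matching is crude (it won’t cope well with inflected forms or with terms that appear inside longer words).

```python
# Illustrative glossary; a real one should come from your translators.
GLOSSARY = {
    "username": "nombre de usuario",
    "e-mail": "correo electrónico",
    "buddy": "amigo",
}


def find_glossary_violations(source, translation, glossary):
    """Return glossary terms used in the source string whose approved
    translation does not appear in the translated string."""
    src = source.lower()
    out = translation.lower()
    return [
        term
        for term, approved in glossary.items()
        if term.lower() in src and approved.lower() not in out
    ]


# Uses the approved term for "buddy", so nothing is flagged.
print(find_glossary_violations("Add a buddy", "Añadir un amigo", GLOSSARY))

# Translates "username" differently from the glossary, so it is flagged.
print(find_glossary_violations("Enter your username", "Introduce tu usuario", GLOSSARY))
```

Flagged strings aren’t necessarily wrong. They’re just the ones worth a second look from whoever owns the glossary.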

4 Responses to Using Amazon Mechanical Turk for Translating Computer Software

  1. Gabe says:

    I like your new masthead image, you look like an evil genius now.

  2. Marcus says:

    Since when was Mark not an evil genius?

  3. adam says:

    Thanks for your review, helpful.

  4. Lee says:

    Thanks for this. We have had issues with context and have given our translators a ‘context document’ to show them how the software works and what it looks like. We are also toying with the idea of adding them as testers.
