Monday, March 10, 2008

Translate the net for the whole world

I know, it's simply too easy to ridicule existing machine translation services. But Microsoft Research is trumpeting a new advance, so I thought I'd look again.

Here is the result of translating their first few paragraphs into German:
Leute benutzen auf der ganzen Erde das tägliche Internet, um Waren oder Dienstleistungen zu kaufen, um zu Information zu suchen, Ablenkungen zu finden.

Aber ist das World Wide Web wirklich weltweit?

Es ist schwierig, den Fall zu bilden. Schätzungen behaupten, dass ungefähr 70 Prozent Webseiten heute in der englischen Sprache verursacht werden, während der Prozentsatz der nicht-Englischen Lautsprecher schneller als der der englischen Lautsprecher wächst. So was, wenn Sie nicht Englisch sprechen? Oder was, wenn Sie tun und Sie finden eine interessante Seite geschrieben auf Deutsch? Oder Russe? Oder Chinese?

Microsoft-Forschungsziele zu bitte.

If you find that hard to read, here's their translation back to English:
People use the daily Internet on whole earth, in order to buy goods or services to find in order to search to information, diversons.

But the World Wide Web is really world-wide?

It is difficult to form the case. Estimations maintain that approximately 70 per cent of web pages is caused today in the English language, while the percentage of the non-English loudspeakers grows faster than that of the English loudspeakers. So which, if you do not speak English? Or which, if you do and you find an interesting side written on German? Or Russian? Or Chinese?

Microsoft aims of research too ask.

If German is too hard, let's try Portuguese:
Os povos usam pelo mundo inteiro o Internet diário, para comprar bens ou serviços, para procurarar pela informação, para encontrar diversões.

Mas é o World Wide Web verdadeiramente mundial?

É difícil fazer o caso. As estimativas reivindicam que aproximadamente 70 por cento dos Web pages estão criados hoje na língua inglesa, quando a porcentagem de altofalantes não-Ingleses crescer mais rapidamente do que aquele de altofalantes ingleses. Assim que se você não fala o inglês? Ou que se você faz e você encontra uma página interessante escrita no alemão? Ou russo? Ou chinês?

Alvos de pesquisa de Microsoft a por favor.

And back:
The peoples use for the entire world the daily Internet, to buy good or services, to procurarar for the information, to find diversions.

But it is the truily world-wide World Wide Web?

It is difficult to make the case. The 70 estimates demand that approximately percent of the Web pages are created today in the English language, when the percentage of not-English speakers to grow more quickly of what that one of English speakers. As soon as if you do not say the English? Or that if you make and you find an interesting page writing in the German? Or Russian? Or Chinese?

Targets of research of Microsoft please.

I have two concerns about such machine translation services:
  1. Under copyright law, a translation is a "derivative work." Wholesale translation and publication without the permission of the copyright holder appears to be a violation of copyright law.
  2. If someone reads one of these translations and acts on it, who is legally responsible for any errors? For example: accidental libels, inaccurate descriptions of products, risible renderings of serious literary works or policy whitepapers, ...

