Thursday, May 08, 2008

The translation problem solved again

If it’s Thursday, there must be another tech developer promising reliable machine translation for the masses. FujiSankei Business i has a piece (the top story in the paper edition this morning) on a new handheld translation gadget and software that promises great things for a Japan that’s home to non-Japanese-speaking foreigners (“Everyone will be able to chat with them with ease!” blares the headline) and a China that’s host to the Olympics this summer.

The National Institute of Information and Communications Technology, or NICT, is now wasting a pile of taxpayers’ money developing the system described above. In a nutshell, it takes data from NHK news broadcasts, manufacturers’ technical manuals, and various other sources to create a vocabulary database. This can be updated to give the system’s users access to the latest terminology, and the voice-recognition function can even handle the various regional flavors of Japanese, according to the article.

There’s been no shortage of software that promises to do all this for users, or of handheld gadgets with such software onboard. US soldiers in the Middle East have small computers with phrase dictionaries ready to go, and you can even get travel phrasebooks for your Nintendo DS or Sony PSP that let you chat with the natives, if you believe the breathless literature on display at your neighborhood Bic Camera. The article says that the differences between previous efforts in the field and this new NICT project are: (1) database creation is now automated, so you don’t need humans manually adding phrases to the system bit by bit, and (2) the government-affiliated group has avoided copyright problems (previously an obstacle to getting lengthy texts included in databases) by partnering ahead of time with the broadcasters and companies that provide documents for the system.

It remains to be seen whether this will result in a system that’s actually useful for more than simple phrases, only slightly mangled. My money’s on “no,” since not even the technical might of Google, which takes a similar approach to automated building of massive multilingual databases for its translation tools, can keep boneheaded glosses out of the system. (I wrote a bit about this on my own site last month.) Not to mention the fact that the article completely ignores the question of voice-recognition accuracy. The fancy toy you can see in the photo accompanying the FujiSankei article has a microphone, but who knows how well it will pick up all those Japanese dialects rural shopkeepers will throw at lost tourists.

Oh well. Even if this one doesn’t solve all our communication problems, there’s always next Thursday . . .

Posted by Peter Durfee on 05/08 at 06:14 PM