Computers bring translation to the telephone

February 01, 1993 | By Andrew Pollack, New York Times News Service

KYOTO, Japan -- Toshiyuki Takezawa sat down at a bank of powerful computers here recently and spoke into a microphone. "Moshimoshi," he said.

Instantly, the computers whirred to life, furiously digesting and analyzing this morsel of Japanese speech. Twelve seconds later and half a world away, a computer in Pittsburgh spoke, conveying Mr. Takezawa's message in English. "Hello," it said, in its electronic voice.

That simple greeting began what researchers here describe as the first public demonstration of an overseas telephone conversation being automatically translated from one language to another.

Such a capability, which engineers call automatic interpreting telephony, has been a long-sought goal of researchers, who see it as a way of overcoming language barriers.

The day of being able to dial up anyone in the world and chat freely is probably two decades or more away, researchers say. But they think that by the end of this decade, interpreter telephones will be used for limited applications, such as making travel reservations.

"Today's experiment is just one tiny, tiny step," said Kohei Habara, executive vice president of the Advanced Telecommunications Research Institute International, the Japanese research center in Kyoto that conducted the demonstration. "But we think historically it's a very important one."

As the test showed, the systems are very limited. Any system that requires 12 seconds to translate "hello" from one language to another (more complex utterances could take 20 seconds or longer) obviously needs improvement. But that will come with time and faster computers.

A much more fundamental problem with the test system is that conversation must be limited to a very narrow topic, in this case registering for a conference. And the vocabulary is limited to between 500 and 700 words, to make it easier for the computers to recognize what is said.

While the test computer understood the sentence "It costs $200 per person," for instance, it would not have been able to handle a variation such as "That'll be 200 bucks a head." Moreover, the speakers had to use grammatical sentences.

The computers, to their credit, understood sentences spoken at a normal or near normal pace and flawlessly translated such sentences as "The proceedings and the reception are included in the application fee."

Thursday's test was the result of an international collaboration involving Japan's ATR institute, Carnegie-Mellon University in Pittsburgh, and a team from the company Siemens AG and Karlsruhe University in Germany. The German team participated from Munich.

ATR, which is 70 percent financed by Japan's government and 30 percent by corporations, has spent 16 billion yen, or about $130 million, on a seven-year interpreting telephony project.

That project is now coming to an end, although ATR hopes to win financing for a new one to continue the work.

Speech translation involves three technologies, each of which is difficult in its own right -- speech recognition, translation and speech synthesis.

When Mr. Takezawa, an ATR researcher, spoke a sentence in Japanese, a computer workstation analyzed the sound patterns and converted the speech into Japanese text. The text was displayed on the computer screen, allowing him to verify that the computer had understood him.

Next, another computer translated the Japanese text into English. That English text was transmitted over the telephone line to a computer at Carnegie-Mellon, which converted the text into English speech, using a speech-synthesis device.

By using modems to transmit text, rather than voices, over the phone line, the system tested Thursday avoided the problem of having a computer try to discern speech over a telephone line; phone transmission degrades voice quality and makes recognition harder for a machine.
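The three stages described above, with text rather than voice crossing the phone line, can be sketched roughly as follows. This is only a toy illustration under loud assumptions: the function names, the one-entry phrase table and the lookup-based "translation" are all hypothetical and are not drawn from ATR's or Carnegie-Mellon's actual software, and the recognition and synthesis stages are stand-ins that work on plain strings instead of audio.

```python
# Toy sketch of a recognize -> translate -> transmit -> synthesize pipeline.
# Every name and table entry below is hypothetical, chosen for illustration.

# A tiny phrase table standing in for a narrow-domain translator.
PHRASE_TABLE = {
    "moshimoshi": "Hello",
}


def recognize(utterance: str) -> str:
    """Stage 1 stand-in: pretend the audio has already become text."""
    return utterance.strip().lower()


def translate(japanese_text: str) -> str:
    """Stage 2: look the sentence up; reject anything outside the table."""
    if japanese_text not in PHRASE_TABLE:
        raise KeyError(f"outside the limited vocabulary: {japanese_text!r}")
    return PHRASE_TABLE[japanese_text]


def transmit(text: str) -> bytes:
    """Send text, not voice, over the line (here it is simply encoded)."""
    return text.encode("utf-8")


def synthesize(payload: bytes) -> str:
    """Stage 3 stand-in: the receiving side turns the text into 'speech'."""
    return payload.decode("utf-8")


def interpret(utterance: str) -> str:
    """Run the stages in order, sender side first, receiver side last."""
    return synthesize(transmit(translate(recognize(utterance))))
```

Calling `interpret("Moshimoshi")` passes the greeting through each stage in turn, while a sentence missing from the table raises an error, loosely mirroring the restricted vocabulary of the test system.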

Several other companies have already demonstrated speech translation systems that can operate within one room but would not work well over international phone lines.

NEC Corp. has a system that works faster and more smoothly than ATR's, though with an even more limited vocabulary. Matsushita of Japan and American Telephone & Telegraph Co. in the United States have also demonstrated experimental systems.

Thursday's demonstration took place at the auditorium of ATR. Satellite video-conference hookups allowed viewers to see and hear what was happening at Carnegie-Mellon and in Munich. A huge computer screen showed the speech recognition and translation in progress.

Because each group developed the processing system for its own language, the systems differed in their capabilities. Both the English and Japanese speech output sounded almost like human voices. But the German speech output sounded like a robot.
