User-friendly voice dictation? Don't speak too soon

Personal Computers

January 29, 1996 | By Stephen Manes

COMPUTER understanding of human speech always seems to be just around the corner. Until recently, only a fool or an optimist would try to dictate an entire column to a little computer, and I am not an optimist.

But I wrote that last sentence, and some of the ones that follow, on the smallest of IBM Thinkpad subnotebooks with the help of an IBM system called Voicetype Dictation. It costs about $1,000 to $1,100, depending on configuration.

Voicetype Dictation began life as a package of hardware and software that ran only on IBM's OS/2 operating system. That version is still available, but now the software has been reworked for Windows. Although the hardware still comes as a circuit board that plugs inside a typical desktop computer, it has also been crammed for portability into a credit-card-size PC (formerly PCMCIA) Card.

The idea of a laptop dictation taker remains almost as exciting as it was in the era of "How to Succeed in Business Without Really Trying," but it comes with at least as many complications. Although Voicetype is supposed to work with Windows 95, getting the two to coexist is painful, largely because proper software drivers are unavailable.

With no help from the manual and little from help files, I wasted a couple of frustrating hours before calling (800) TALK-2-ME. An hour later, a technician diagnosed the problem as bad hardware, but a new card displayed identical problems.

I eventually discovered on my own that the trick was to use the Windows 95 control panel's PC Card program to disable the card slots, leaving the job to DOS-based programs invoked by editing the AUTOEXEC.BAT file. Unless you are extremely masochistic (or, as Voicetype's first pass would have it, "Massachusetts"), make sure Windows 95 drivers are available. The manual suggests that using Windows 3.1 will be even less amusing.

Once safely past the basic installation, you don the headset and "enroll" your voice by reading text that appears on the screen. As you train the machine to understand you, it trains you in the dictation technique IBM calls "isolated speech." Each word you dictate must be separated from the next by a brief pause.

It. Is. A. Little. Tricky. To. Get. Used. To. Still, the human's enrollment tasks take only about an hour and a quarter. The computer does the rest.

The manual warns: "The process will exceed the battery power capacity of any known laptop system." I plugged my machine in overnight. In the morning it was ready to accept dictation.

The dictation window displays words as it figures them out. If you talk fast, it falls behind, but you can keep talking while it keeps working.

The dictionary included with the program contains more than 20,000 words; optional dictionaries for journalism, medicine and the law, none of which I tried, cost about $500 each.

Reviewing the transcription is simple but time-consuming. Click on a word you find questionable, and the program displays a list of alternatives while replaying your pronunciation.

Often the first or second word in the list is the one you meant, but if all the choices are wrong, you can type the right one in. The program learns from these corrections.

In part because of poor software design, this editing method is far slower than the standard way of correcting errors by deleting and retyping. You may do that, but only when a mistake is yours and not the program's, since Voicetype will not learn from these directly made corrections. Although the system transcribed most of this column in less than 15 minutes, correcting it took three times as long.

This dictation program comes closer to usefulness than any other I have seen, but I am still no optimist. For most people, Voicetype will be not quite good enough.

Stephen Manes is a columnist for the New York Times.

