Product Review: Dragon™ NaturallySpeaking® 9
I didn't write this blog post. My computer wrote it for me. I dictated it.
I wrote this blog post with the assistance of Dragon™ NaturallySpeaking® 9, a very versatile speech-to-text application that I paid $100 for at Circuit City a few weeks back. Around that time, I was reading The Age of Spiritual Machines by Ray Kurzweil for the second or third time. The book was written in 1999, and he predicted that by the year 2009, rather than use keyboards, we would start to dictate words to our computers. A lot of Ray's 2009 predictions are definitely coming to pass, such as wires disappearing from hardware and most financial transactions taking place between a human and a computer. Other than some cheesy attempts at voice commands on the automated phone systems I use when calling customer service lines (staffed with robots that try embarrassingly hard to act human), I haven't seen much use of speech recognition. But Kurzweil has a good track record as far as predicting the future, so I did some web searching to see what was out there and whether there was some huge tech revolution that I was missing. I found jott.com, a website that's being beta-tested, where you use your cell phone to e-mail audio reminders to yourself or others, and also get a text transcription of your audio message. A trial of it proved underwhelming. You could only leave yourself short messages, and the text of your message would end up very garbled. And then I found NaturallySpeaking 9, a software application that needs to be installed on your computer and needs a really large amount of memory to work, but produces much better results.
I've heard anecdotally that researchers have been working on speech-to-text for years. A professor I had at law school told me that he went to a seminar back in the early '80s where the thinking was that these programs would become available any year now. But computer speech recognition proved to be a tough nut to crack. How do you teach a computer to transcribe "How bare is a bare bear?" (Actually, that one was hard to dictate using this software.) It seemed like high-quality speech recognition was something only sentient beings were capable of, and even we have trouble with it sometimes. The big question is whether identification, not just of words but of Being itself, is something that only God-breathed creatures can perform.
That question remains, but it does appear that speech-to-text software is getting better. Much better. As in "actually functional" better. I can use this program at the office without getting it to work right being a complete drain on my time. And since buying it, I've done just that.
NaturallySpeaking is software that learns. The application has training modes where you read text, and the program learns the sound of your voice. Also, when you correct bad dictation, the program supposedly learns from its mistakes. Like any employee, it needs training. And like any boss, you need to train yourself to manage it. So it can take a while to get good at, just like it takes a while to get good at typing. The company that markets the software says that while the average person can type 30 to 40 words per minute, you may be able to speak over 120 words per minute. So it stands to reason that once you teach the program to take dictation efficiently, NaturallySpeaking can be a boon to productivity, or so they claim.
It's not quite that easy. The 99% accuracy rating that Dragon brags about may be closer to 70 to 80% in real usage. Dictation can still be very clunky, and you may still have to use your keyboard to correct errors. If the computer doesn't make them, you will, simply by not being a natural-born Paul Harvey or Don Pardo. Going over this blog entry, I've had to make just as many revisions as if I had drafted it by typing. The technology hasn't gotten to the point yet where it's as good as dictating to a secretary, and you can't just go stream of consciousness and expect the letter or essay you wrote to turn out perfect. In other words, this program will not think for you.
But it can change the way you write. I hadn't noticed it before, but writing while slumped over the keyboard with your eyes vacillating between the monitor and your fingers can really wear away at your concentration and patience for writing. When you're dictating using the software, by contrast, you can actually look at the computer screen, or even get up and stretch your legs while dictating. The visual focus leads to a lot fewer errors in your writing, not to mention that the computer doesn't mistype or misspell. If the computer recognizes that you said "concupiscence" or "amalgamated", it's not going to misspell those words like you might if you were typing sloppily or got less than a 500 verbal score on your SATs. So the trade-off is fewer typos and a better chance of catching them since, unlike most typists, you can now actually look at the word processor while processing words. Not to mention it's liberating to be able to lean back, move your hands, look at things, and fiddle with something in your hand while writing. It makes for more creativity.
This software is not perfect, but it's good enough that it could really change people's lifestyles and computer usage and be the tipping point for greater things. I think this will really catch on in the next few years, especially once NaturallySpeaking 10 or 11 comes along and we get better versions of the software and computers with enough memory to handle it.