Archive for March, 2008

Audio player at end of text.

MacSpeech Dictate, Nuance NaturallySpeaking

In past episodes, I bemoaned the lack of good speech-to-text software for the Mac. At the time I began writing this blog and podcast, the only product available for Intel Macintosh computers was iListen, produced and marketed by MacSpeech. I had been accustomed to using Dragon NaturallySpeaking by Nuance and the excellent speech-recognition and translation software bundled with all flavors of Vista. In contrast, I saw iListen as a flawed product which, in my mind, should have been discontinued years ago.

In January 2008, MacSpeech announced a licensing agreement with Nuance to market a port of the Nuance speech engine to Intel Macintosh computers. MacSpeech offered many of its iListen customers what it referred to as a “Cross Grade” from iListen to Dictate for $79. The list price for MacSpeech Dictate is $200 US. Given the very sad performance of iListen, I felt that MacSpeech should have made the “Cross Grade” available at no charge. I suspect significant royalty obligations to Nuance may have made it impossible for MacSpeech to do the right thing and absorb the cost of providing free updates to iListen customers.

MacSpeech Dictate ships with two CDs, one to install the actual application and the other to import the appropriate vocabulary tailored to your selected language and country. Installation was typically Mac, in that once the CD had been recognized by my computer, I simply moved the MacSpeech icon to the Applications folder. After the application was installed, a pop-up requested the insertion of the data disk.

When I tried to insert the data disk, I realized, of course, that the program disk had not been automatically ejected. I felt a blush of embarrassment, but my instinctive move to insert the second disk was born of habit from working with Windows machines, which typically eject a disk before requesting the insertion of additional CDs or DVDs.

After the software and data had been installed, I was prompted to create a user profile, which required me to calibrate my USB headset and read approximately five minutes of text. Although the Windows Dragon NaturallySpeaking application can actually be used without any training at all, the five minutes of training required by Dictate is innocuous and a huge improvement over the two-week training period foisted upon iListen users.

Once Dictate was installed and “trained”, I immediately noted that accuracy had improved dramatically over that seen in iListen and was the equal of what I am accustomed to when using the Windows version of Dragon NaturallySpeaking.

The Dictate Quick-start flyer repeatedly refers to the “amazing accuracy” you should expect when using the product. Those who have struggled with trying to make iListen work will be blown away. Those using other products such as Dragon and Vista Speech Recognition will be less impressed, but will breathe a sigh of relief that Nuance has finally ported its speech engine to the Macintosh.

The only documentation shipped with Dictate is a three-page flyer detailing installation and listing some common voice commands. A separate pop-up window displaying additional commands is launched when Dictate is activated. Given the complexity and novelty of speech-to-text software, the lack of comprehensive documentation is a glaring omission that I hope is addressed quickly.

There is essentially no usable correction scheme, which in my opinion severely limits Dictate's usefulness and certainly makes it impractical for many who are physically disabled. It seems to have some of the same quirks I noted in iListen. The Do Select, Do Delete, Insert before, etc., commands become useless if the text is manually edited. Manual edits seem to confuse Dictate, and it loses track of dictated text. Several times Dictate deleted complete paragraphs when I was seeking to modify one word.

Additionally, correcting text manually seems to cause previously deleted text to randomly reappear when dictation is resumed. Once, after manually modifying text, the application stopped responding. The Dictate application showed that my speech was being monitored but no text was displayed. The only way I was able to resume dictation was by shutting down Dictate and restarting.

Interestingly, Chuck Rogers, the Chief Evangelist at MacSpeech, blames the Mac operating system for the company's difficulties in creating a correction scheme that works at least as well as that available in Windows products. He goes on to suggest that Nuance had similar problems in developing a workable correction scheme for Windows but has been able to resolve them because it has had many additional years to iron out the kinks. Is he suggesting Windows is indeed a more mature OS offering solutions not currently available to Apple users? This would make for an interesting Windows versus Apple commercial if Microsoft were so inclined.

David Pogue, in his last review of Dictate, suggested several bugs would need to be squashed before shipping; however, it appears that more work needs to be done. Although accuracy is greatly improved, a workable correction scheme needs to be implemented before Dictate can truly compete with the Windows version.

All of this said, Dictate is light years ahead of iListen, and with improved correction options and the extermination of a few bugs, Dictate may finally offer Mac users a speech-to-text technology competitive with what we have grown accustomed to in the Windows world.

Link to YouTube demo of MacSpeech Dictate illustrating limited edit commands and extraneous text bug.

Dictated using MacSpeech Dictate, recorded using M-Audio and MixCraft 3, and tags edited with JetAudio.

NOTE:

Chuck Rogers disagrees with my characterization of his comments concerning issues in implementing speech technology on the Apple platform. He disclaims any allusion to Mac OS X as being “less robust or functional than Windows”. Please see his comment to get his insight into the technical issues facing MacSpeech.

Listen Now:


Standard Podcasts [8:39m]