I really enjoy talking and writing about technology and tech-related things, but one of my big hang-ups after using a computer for close to 15 years is that I'm just not as fast at typing as I probably should be. When I started thinking about ways to make the writing and blogging process more efficient, one of the first ideas I came to was dictating instead of typing, using a speech-to-text engine.
When I started researching speech-to-text engines, the one that almost everybody has heard of was Dragon NaturallySpeaking. It is probably the industry standard, but it is also not exactly cheap. My primary goal was to create blog posts more efficiently, and I didn't want to put a lot of money into an experiment just to find out what works, so Dragon NaturallySpeaking was out because it's not free. One point kept coming up in the reviews and comments about Dragon NaturallySpeaking, and it likely holds for any voice interface: you need a high-quality external microphone that produces great audio. The idea is that better audio improves the accuracy with which the software processes what you say.
With all the current hype around voice interfaces like Okay Google, Apple's Siri, Amazon's Alexa, and Microsoft's Cortana, it's easy to expect a lot more than what's currently possible with a voice interface. In movies, these machines tend to understand humans perfectly in pretty much any language, but that is not the case in real life. For a variety of reasons, audible sounds just don't mean a whole lot to computers yet. Even with all the computing power and resources going into areas like natural language processing and machine learning right now, we're just not quite there.
I did not want to spend a long time training a speech-to-text engine on my voice. That can get time-consuming and tedious very quickly. Then I stumbled onto the voice typing feature in Google Docs, and it seems to be working quite well, because that's what I used to make this post. By working well, I mean it's capturing most of what I say and getting the correct words and spelling. Punctuation and grammar could be improved, but that is one of those things that is still hard for machines to understand. The privacy caveat, with this being a Google product, is that the audio snippets are probably being saved in perpetuity someplace for further ad-based analysis. Even with those weaknesses, I would say the potential is here for it to cut the time my writing process takes at least in half, which is a big win for me.