Write with your mouth and then use your fingers. Lately, I have been trying a new technique while writing articles: begin with voice memos, load them into a transcription app and, finally, flesh out the skeletal draft on the keyboard. Suddenly, most articles appear conversational, all thanks to transcription apps and programmes. They are available in plenty; some are expensive and many goof up badly, but a few get it spot on, or almost. It's not only articles. These digital transcribers work well with interviews too, especially the long ones.
Voice to text is not a new phenomenon. Even a decade ago, it was possible to knock out a few paragraphs using voice, but there was always the issue of correction, which went beyond adding a comma or the authoritative period. Short dictation for a post on Facebook is not what I had been aiming for. The technology that’s currently available doesn’t come free, but it doesn’t cost the earth either.
Usually the process begins with recording voice memos using an app like RecUp, which costs around Rs 120 and syncs well with Dropbox. I also use Voice Memos on the iPhone, which can record conversations with enough depth, cutting out most unwanted environmental noise. These voice memos can be emailed or shared easily. Then comes the part in which users are spoilt for choice.
The good thing with voice memos is that they can be recorded anywhere: while driving, idling in the park or a cafe, or while washing dishes. To get the best quality, I usually use AirPods, which pick up voice better than most wireless earbuds on the market. Even the sound of running water is kept in check while recording.
Next, the options. Closest to my heart are Word as part of Office 365, Descript, Otter.ai and Temi.
Transcribe via Office 365
As an Office 365 subscriber, I find the Transcribe feature a no-brainer. For a few hundred bucks per month, you can use the entire Office suite, which has some excellent Cloud-related features; Transcribe is the high point for me. All you need to do is upload an audio file, which takes a few minutes to transcribe. For example, a few days ago an interview with Magic Johnson was transcribed using this feature. In terms of accuracy, I would say it gets 70-80 per cent correct if the diction is American or British. Names of cities, personalities or popular food items usually turn out correct, but when it comes to pauses or all the ‘ums’, ‘ahs’ and ‘oohs’, it does a shoddy job. What would earlier have taken me 45 minutes to transcribe was done in 10-12 minutes.
Otter.ai
The popular transcription product comes from a team whose members have led the development of technologies in mobile, search, speech and data analytics at companies like Google, Facebook, Uber, Cisco and Nuance. It has a slight edge over Office 365’s Transcribe option in terms of accuracy, although at $8.33 a month it doesn’t come cheap. It understands varied diction far better than most rival options, and transcription is quick. Until the RBI ruling on recurring payments using credit cards came into play, Otter.ai had me in its spell.
Descript
The versatile app is billed as a “word processor for audio”. The voice memos I record often get worked on by Descript’s tech muscle, which truncates silences easily while keeping the text searchable. The output is not something I would be able to publish without changes, but it can easily be hammered into something meaningful. Descript also has a powerful podcast angle: it supports simultaneous and collaborative multitrack editing, with changes synced in real time to the cloud. The feature to look out for is Overdub, which lets you create what Descript calls an AI voice that can be used to overdub flubbed words or phrases. Is the transcription up to scratch? Well, it captured my wife’s thoughts clearly: “My husband likes camembert cheese. If you ask me, I think camembert should be fed to the dogs and maybe, (just) maybe I might allow my cat to eat it. (No) Sniff it.” The commas had to be added. The same goes for the words within brackets. Though accurate, the transcription took time.
Temi
Temi does well on both grammar and word accuracy, jargon included. It is available on iOS, Android and the web; a 12-minute audio file took roughly five minutes to transcribe, with around 75-80 per cent accuracy. We recommend it because it’s quick and relatively inexpensive (at 25¢ per minute of audio), though you have to do some clean-up work. There is a trial available to see if Temi suits your needs.
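For anyone weighing pay-per-minute pricing against a monthly plan, here is a rough back-of-the-envelope sketch in Python. It uses only the two prices quoted in this article (25¢ per minute for Temi, $8.33 a month for Otter.ai). The function names are made up for illustration, and it assumes the subscription covers all the audio you record in a month, which may not hold for every plan.

```python
# Prices quoted in this article, in cents, to avoid rounding surprises.
# The helper names below are hypothetical, not part of any app's API.
TEMI_CENTS_PER_MINUTE = 25      # Temi: 25 cents per minute of audio
OTTER_CENTS_PER_MONTH = 833     # Otter.ai: $8.33 a month


def pay_per_minute_cost(audio_minutes: int) -> int:
    """Cost in cents of transcribing audio at the per-minute rate."""
    return audio_minutes * TEMI_CENTS_PER_MINUTE


def cheaper_option(audio_minutes_per_month: int) -> str:
    """Name the cheaper option for a given monthly volume of audio.

    Assumes the subscription has no cap on minutes (an assumption,
    not something stated in the article).
    """
    if pay_per_minute_cost(audio_minutes_per_month) <= OTTER_CENTS_PER_MONTH:
        return "per-minute"
    return "subscription"


if __name__ == "__main__":
    for minutes in (12, 33, 34, 120):
        dollars = pay_per_minute_cost(minutes) / 100
        print(f"{minutes} min/month: ${dollars:.2f} per-minute -> {cheaper_option(minutes)}")
```

At these rates, the 12-minute file mentioned above works out to $3, and the break-even point sits at a little over 33 minutes of audio a month. Beyond that, a flat monthly fee starts to look better on paper.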
Art of using voice memos
You can take a voice memo anywhere, anytime — from driving to washing dishes.
It allows you to spend time without screens.
Ideas and thoughts can come anytime. Most phones do a good job of turning in crystal clear recordings.
If voice memos are cleverly put together, they can make articles appear more conversational.
The cost of transcription is falling every year, and the time you save on transcribing interviews can be used elsewhere.
With something like Office 365, transcription comes as part of a larger set of useful features, so you are not stuck with an app that does only one thing.