voice memos speech recognition / transcription
Posted by Paul Korm
Jan 4, 2015 at 07:12 PM
And there’s also Dragon Dictate for iOS
Posted by Dr Andus
Jan 5, 2015 at 01:37 AM
jimspoon wrote:
>I would like to be able to grab a voice recorder, and without even
>looking at it, push a button to start recording, start talking, push
>another button to end the recording. Then I’d like the recording to be
>automatically uploaded to Google’s or Dragon’s servers, and have the
>recognized text appended to a text file with an appropriate time stamp.
I agree that this would be great, but I also doubt whether this is technically possible yet, or rather, whether the results are sufficiently reliable.
The main issue I’ve run into (as a long-term Dragon user) is that even if recognition is 99% accurate, there will be mistakes. If you don’t spot and correct them right away, then days, weeks or years down the line the text may become unintelligible, as you won’t remember what you meant to say (one could check the recording again, but that is extra work).
I use Dragon on the PC to dictate book passages, and even after I check, months later I find sentences that don’t make sense, and I have to go back to the original to figure out what’s going on.
As for personal notes with a digital recorder, I find that they often contain background noise (if you’re recording them in a public space) or are not sufficiently succinct or coherent. So what I tend to do these days is 1) record the note, 2) listen back to it at the office, 3) transcribe it in a more coherent and succinct form (which could be dictated with Dragon), and 4) delete the audio file. Anything else, such as uploading the files to the PC and having them automatically transcribed by Dragon, ended up being too much hassle to be worth it.
Posted by jimspoon
Jan 6, 2015 at 01:29 PM
Thanks all for your input - I will investigate your suggestions.
What surprises me is that the Google and Nuance recognizers produce such good results when entering text using voice in real time on my Android phone. So it seems like it should be possible to get equally good results by submitting previously recorded sound files to the same servers/software. I mean quick voice recordings that would be made without even looking at the device, simply by holding down a button or something like that.
There ought to be a dedicated voice recorder device out there that works like that - grab it, hold down button, wait for beep, talk, release button. Device automatically uploads recorded audio to servers via 3G/4G or wifi if available. Text together with recorded sound is available via smartphone app or web app - instantly searchable. For correcting recognition errors, the interface would need to have something like the desktop Dragon’s “play this back” function - highlight text, click “Play this Back”, and hear the portion of the audio recording which the recognizer transcribed into the selected text. The data plan for such a device shouldn’t be that expensive, after all - it wouldn’t take that many megabytes to transmit voice recordings to the servers. I think such a device would be a killer product. I’ve looked at the websites for various voice recorder manufacturers (Olympus, Sony), but haven’t found anything like it. Some high-end recorders have some wifi capability.
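The "recognized text appended to a text file with a time stamp, with the recording kept alongside" part of this workflow can be sketched in a few lines of Python. This is only a rough sketch under assumptions: `append_note` and its parameters are hypothetical names, and `transcribe` stands in for whatever recognizer is actually used (it is not the API of Google's or Nuance's services).

```python
from datetime import datetime
from pathlib import Path
from typing import Callable


def append_note(notes_file: Path, audio_file: Path,
                transcribe: Callable[[Path], str],
                recorded_at: datetime) -> str:
    """Transcribe one recording and append it to the notes file.

    `transcribe` is a placeholder (an assumption): any function that
    takes the path of an audio file and returns the recognized text.
    """
    text = transcribe(audio_file).strip()
    stamp = recorded_at.strftime("%Y-%m-%d %H:%M")
    # Keep the audio file name next to the text so that a doubtful
    # passage can be checked against the original recording later.
    entry = f"[{stamp}] {text} (audio: {audio_file.name})\n"
    with notes_file.open("a", encoding="utf-8") as fh:
        fh.write(entry)
    return entry
```

Called with a stand-in recognizer such as `lambda path: "buy milk"` and a recording time of 6 Jan 2015, 13:29, this appends the line `[2015-01-06 13:29] buy milk (audio: memo1.wav)` to the notes file. Keeping the file name in each entry is what would make a "play this back" style correction possible later.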
I did find an interesting Android app called Speech to Text Notepad - for me the interesting feature was the ability to delete words by saying, for example, “delete 4” to delete the four words preceding the cursor. This makes it easy to delete mis-recognized words. Just tap at the appropriate location before issuing the command.
https://play.google.com/store/apps/details?id=com.heterioun.HandsFreeNotes
The speech recognition built into the Google Keyboard and the Swype keyboard doesn’t recognize many commands. While it properly interprets “new line”, “comma”, “period”, “exclamation point” and “question mark”, “backspace” and “delete” do not work as they should. It should also recognize and respond appropriately to “insert date” and “insert time” (in configurable formats).
Has anyone seen anything like this, for Android or iOS?
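To make the wished-for command handling concrete, here is a minimal Python sketch of how a notepad might treat each recognized utterance either as a command or as ordinary dictation. It is not how Speech to Text Notepad or the Google/Swype keyboards work internally. `apply_utterance` and `PUNCTUATION` are hypothetical names, the cursor is assumed to sit at the end of the text, and the recognizer is assumed to return digits ("delete 4", not "delete four").

```python
from datetime import datetime
from typing import List

# Spoken punctuation names and the characters they stand for.
PUNCTUATION = {
    "comma": ",",
    "period": ".",
    "exclamation point": "!",
    "question mark": "?",
}


def apply_utterance(words: List[str], utterance: str, now: datetime,
                    date_format: str = "%Y-%m-%d",
                    time_format: str = "%H:%M") -> List[str]:
    """Return a new word list after applying one recognized utterance."""
    spoken = utterance.strip().lower()
    parts = spoken.split()

    # "delete N": drop the N words before the cursor (here, the end).
    if len(parts) == 2 and parts[0] == "delete" and parts[1].isdecimal():
        keep = max(0, len(words) - int(parts[1]))
        return words[:keep]

    # Punctuation attaches to the previous word.
    if spoken in PUNCTUATION:
        if not words:
            return [PUNCTUATION[spoken]]
        return words[:-1] + [words[-1] + PUNCTUATION[spoken]]

    if spoken == "new line":
        return words + ["\n"]
    if spoken == "insert date":
        return words + [now.strftime(date_format)]
    if spoken == "insert time":
        return words + [now.strftime(time_format)]

    # Anything else is ordinary dictation.
    return words + utterance.split()
```

For example, dictating "call the dentist tomorrow morning", then "delete 2", then "period" leaves the words `call the dentist.`, and the two format parameters cover the "configurable formats" wish for dates and times.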
Posted by MadaboutDana
Jan 6, 2015 at 05:48 PM
Evernote haters among us will be amused by the fact that, yes, Evernote for Android does indeed convert spoken input to text, and preserves both the audio and text files, using Google’s speech recognition engine.
Posted by MadaboutDana
Jan 6, 2015 at 05:54 PM
There’s also a voice recognition function built into iOS 8. There’s a nice article on it here:
http://www.makeuseof.com/tag/type-superfast-real-time-voice-dictation-ios-8/
I have a couple of iOS 8 devices, but haven’t experimented with this. I could have a go, I suppose, and see what happens to the audio files…