Tuesday, November 5, 2019

intro

Title says it all (sigh).  So here we go...

(Spoiler alert: pocketsphinx, at least out-of-the-box, is a joke.  Totally useless.)

The "free" server/cloud-based systems, such as Google's, are encumbered by licensing BS, and/or they generally want to do *anything but* provide the simple service that everyone wants, i.e., convert an arbitrary sound file into a plain-text file: speech-to-text transcription.  Google and their ilk want to force you to use their browser (Chrome), or the input can only come from your mobile device's microphone (if any!) and not from a sound file, or the text output is embedded in some kind of graphical web interface that lets you do all sorts of colourful manipulations, yet somehow makes it remarkably hard to actually download a plain-text file of your results.  I can't stand that kind of... stuff.

The little virtual keyboard that pops up on my Galaxy tablet when I want to type something has a little "microphone" button, which I had never tried before.  I assume it links up to Google's cloud.  I tried it, and it gave by far the best results I've seen so far!  But again, I just can't bring myself to do the project at hand by playing my audio files back on one little gizmo's speaker while holding my tablet up to that speaker so that I can type an email to myself...  Which century is this again?  Reminds me of an acoustic modem...

So, I will try to use one of the standalone Linux systems, and see if I can make it work acceptably.  There seems to be very little useful documentation out there pertaining to this simple question:
How can I convert speech to text on Linux for absolutely free, without restrictions, limitations, trial periods, or subscriptions of any kind?
Given that, I hope that the record of my own bumbling efforts towards a solution may help others in this world.  Good luck to us all!
