Elliott C. Back: Technology FTW!

Google is Deaf

Posted in How to Blog, Search by Elliott Back on November 26th, 2004.

Matt points out this article, “Google is deaf.” Its point, while more subtle, can be reduced to:

Podcasting, that is radio web shows, is the new craze. And we’re never going to transcribe them. However, Google is, for all intents, a deaf user. A billionaire deaf user with tens of millions of friends, all of whom hang on his every word.

WordPress already generates an RSS enclosure for your podcast with no additional effort: just include the mp3 in your post. That ease has made podcasting more popular. The next step for podcasting tools will be to generate a “best match” transcript using voice recognition technology. Pipe the podcast into your voice recognition software, and output the transcript behind a “read more” cut directly in the post. It doesn’t matter if you only match 70% of the words, because you’re playing for Google. Getting a few of your keywords out there in the right order is all that matters.
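As a rough sketch of the last step, here is how a tool might tuck a machine-generated transcript behind the cut. It assumes the post body is plain HTML and that the cut is WordPress's `<!--more-->` marker; the function name `add_transcript` and the wording of the transcript label are made up for illustration, not part of any WordPress API.

```python
import html

MORE_TAG = "<!--more-->"  # WordPress's marker for the "read more" cut


def add_transcript(post_body: str, transcript: str) -> str:
    """Return the post body with the transcript placed after the cut.

    Hypothetical helper: a real plugin would hook into the publishing
    step instead of rewriting strings by hand.
    """
    # Escape the recognizer output so a stray < or & cannot break the markup.
    safe = html.escape(transcript.strip())
    block = (
        "<p>Transcript (machine-generated, may contain errors):</p>\n"
        f"<p>{safe}</p>"
    )
    if MORE_TAG in post_body:
        # The post already has a cut, so append the transcript below it.
        return f"{post_body.rstrip()}\n\n{block}\n"
    # Otherwise add the cut ourselves, then the transcript.
    return f"{post_body.rstrip()}\n\n{MORE_TAG}\n\n{block}\n"
```

Readers see the normal post with a "read more" link, while a crawler that fetches the full page gets the words the recognizer managed to catch.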

Technically, it shouldn’t be hard to do, but I don’t have any voice recognition software myself. Does anyone know of any open source packages?

Update: Sphinx (cmusphinx.sourceforge.net/html/cmusphinx.php) from Carnegie Mellon University has potential. It’s written in Java, so it could easily be installed on a web server, and it has modes for reading .wav files. Get an application that can handle the chain mp3 -> wav -> Sphinx, and you’d be all set.
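The mp3 -> wav -> Sphinx chain could be glued together along these lines. This is a sketch under assumptions: it uses ffmpeg for the decoding step (any mp3 decoder would do), it resamples to 16 kHz mono on the guess that the recognizer's acoustic model wants that, and the recognizer itself is passed in as `recognizer_cmd`, a hypothetical command that takes a .wav path as its last argument and prints the transcript to standard output. No particular Sphinx command line is assumed.

```python
import subprocess
from pathlib import Path


def mp3_to_wav_command(mp3_path, wav_path, sample_rate=16000):
    """Build the ffmpeg call that decodes an mp3 to mono PCM wav."""
    return [
        "ffmpeg", "-y",            # overwrite the output file if it exists
        "-i", str(mp3_path),       # input podcast
        "-ac", "1",                # one audio channel (mono)
        "-ar", str(sample_rate),   # output sample rate in Hz (assumed 16 kHz)
        str(wav_path),
    ]


def transcribe(mp3_path, recognizer_cmd):
    """Decode the mp3, run the recognizer on the wav, return its output.

    recognizer_cmd is a placeholder list such as ["my-sphinx-wrapper"];
    the wav path is appended as the final argument.
    """
    wav_path = Path(mp3_path).with_suffix(".wav")
    subprocess.run(mp3_to_wav_command(mp3_path, wav_path), check=True)
    result = subprocess.run(
        [*recognizer_cmd, str(wav_path)],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()
```

Keeping the recognizer as a parameter means the same glue works whether Sphinx is wrapped in a shell script, a small Java launcher, or something else entirely.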

This entry was posted on Friday, November 26th, 2004 at 4:51 pm and is tagged with carnegie mellon university, voice recognition technology, voice recognition software, open source packages, new craze, radio web, google, intents, wav files, billionaire, podcasting, sphinx, podcast, wordpress, tens, popularity, match, modes, mp3.

 
