Resources for Transcribing (voice to text)

Mark Boas

Nov 18, 2011, 4:22:26 AM

to hyper...@googlegroups.com

Free Services

Universal Subtitles : http://www.universalsubtitles.org/ --> SRT Converter : http://happyworm.com/clientarea/hyperaudio/hap/v22/parserSRT.htm *

* Estimates word-level alignment from subtitles

Free Resources

Open Source Toolkit For Speech Recognition : http://cmusphinx.sourceforge.net/

Large Vocabulary Continuous Speech Recognition : http://shout-toolkit.sourceforge.net/

Paid Services

3playmedia : http://3playmedia.com

DotSub : http://dotsub.com

Dragon Speech Engine : http://www.nuance.com/

PlyMedia : http://www.plymedia.com/

Ramp : http://www.ramp.com/

SpeakerText : http://speakertext.com/

VoiceBase (freemium) : http://www.voicebase.com/

Eric Pugh

Jan 26, 2012, 11:19:06 AM

to hyper...@googlegroups.com

I talked to Mark a bit about this last week, and I've posted a rough SRT -> Hypertranscript converter of my own, rewritten in PHP. I'm not trying to reinvent the wheel here, and I understand wanting to keep this on the "front end", but there do seem to be some advantages to doing it server-side, and well... I'm just a bit more familiar with PHP. :-)

Anyway, I've posted the function here:
http://www.graphicsilence.com/web-development/convert-subtitles-srt-files-into-other-formats.html

A form similar to Mark's:
http://www.graphicsilence.com/demos/srt-converter/

And, at the bottom of the above post, there is a very simple example that takes an SRT file from the Universal Subtitles API and converts it, which someone might find useful for one of their projects.
thx
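
For readers who want to see the general idea, here is a minimal Python sketch of an SRT to hypertranscript conversion. It is not Eric's PHP function or Mark's JavaScript parser. It assumes that word timings are estimated by spreading each subtitle's duration evenly across its words, and the `<span data-m="...">` markup and all function names are illustrative assumptions rather than a confirmed Hyperaudio format.

```python
import html
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,000".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def to_ms(h, m, s, ms):
    """Convert hour/minute/second/millisecond strings to milliseconds."""
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)


def parse_srt(srt_text):
    """Return a list of (start_ms, end_ms, text) cues from SRT text."""
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", srt_text.replace("\r\n", "\n").strip()):
        lines = block.split("\n")
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                g = match.groups()
                # Everything after the timing line is the subtitle text.
                text = " ".join(l.strip() for l in lines[i + 1:] if l.strip())
                cues.append((to_ms(*g[:4]), to_ms(*g[4:]), text))
                break
    return cues


def estimate_word_times(cues):
    """Spread each cue's duration evenly across its words (an estimate only)."""
    words = []
    for start, end, text in cues:
        tokens = text.split()
        if not tokens:
            continue
        step = (end - start) / len(tokens)
        for i, token in enumerate(tokens):
            words.append((token, round(start + i * step)))
    return words


def to_hypertranscript(words):
    """Render words as spans carrying a start time in ms (assumed markup)."""
    return " ".join(
        f'<span data-m="{ms}">{html.escape(word)}</span>' for word, ms in words
    )
```

Calling `to_hypertranscript(estimate_word_times(parse_srt(srt_text)))` on the contents of an SRT file gives one span per word. Because the timings are interpolated, they are only as good as the subtitle boundaries; a forced aligner would give true word-level times.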

Mark Boas

Jan 27, 2012, 4:44:11 AM

to hyper...@googlegroups.com

Nice work Eric, thanks very much for sharing that.

One thought is that Universal Subtitles could return data as JSONP or enable CORS on their servers, so that we could avoid same-origin policy restrictions and do it all client-side! I posted this on the Universal Subtitles group here : https://groups.google.com/d/topic/universal-subtitles-development/EWTrhMV8s34/discussion

In the meantime I'm thinking of creating a CORS enabled proxy on our own jplayer.org servers.
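
As a rough illustration of what such a proxy involves, here is a minimal Python sketch using only the standard library. It is not the jplayer.org setup; the function names, the port, and the choice to relay a single fixed upstream URL (so the proxy cannot be used to fetch arbitrary sites) are all assumptions made for the example. The key step is adding an `Access-Control-Allow-Origin` header to the relayed response so that a script on another origin is allowed to read it.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.request import urlopen


def fetch_upstream(url):
    """Fetch the upstream resource; returns (content_type, body_bytes)."""
    with urlopen(url, timeout=10) as response:
        return response.headers.get("Content-Type", "text/plain"), response.read()


def cors_response_headers(content_type, body):
    """Headers for the relayed response, including the CORS permission."""
    return [
        ("Content-Type", content_type),
        # Lets a browser script on any other origin read this response.
        ("Access-Control-Allow-Origin", "*"),
        ("Content-Length", str(len(body))),
    ]


def make_handler(upstream_url, fetch=fetch_upstream):
    """Build a handler that relays one fixed upstream URL with CORS headers."""

    class CorsProxyHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            try:
                content_type, body = fetch(upstream_url)
            except Exception:
                self.send_error(502, "Upstream fetch failed")
                return
            self.send_response(200)
            for name, value in cors_response_headers(content_type, body):
                self.send_header(name, value)
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep the sketch quiet

    return CorsProxyHandler


def serve(upstream_url, port=8000):
    """Run the proxy on localhost until interrupted (hypothetical entry point)."""
    server = ThreadingHTTPServer(("127.0.0.1", port), make_handler(upstream_url))
    server.serve_forever()
```

A page on any domain could then request the proxy's address with an ordinary XMLHttpRequest and parse the SRT in the browser, which keeps the conversion itself client-side.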

Why client-side? 1. Portability of code: it's easy for people to get up and running because everything is in one place. 2. Scalability: processing is distributed amongst the clients, so there is less for the server to do.

Looks like they are going to expose each user's list of subtitles, which is cool. In general the Universal Subtitles project is very, very cool. Folk that haven't tried it yet should definitely give it a whirl - it's a lot of fun!

Please keep any developments coming. Great to see!

Cheers

Mark

Mark Boas

Jan 27, 2012, 4:52:13 AM

to hyper...@googlegroups.com

I just wanted to add another resource that Norman mentioned on another thread : https://groups.google.com/d/topic/hyperaudio/J-qu68YFCvE/discussion

A service called Tropo : http://blog.tropo.com/tag/transcription/ It doesn't seem to give word-aligned timings, and the accuracy could be better, but it looks useful nonetheless as a possible starting point. Their code is also available on GitHub : https://github.com/tropo/pamfaxr

Thanks Norman!

J. Reuben Wetherbee

Mar 29, 2012, 8:32:00 AM

to hyper...@googlegroups.com

I have been working on a text/audio player for the University of Pennsylvania PennSounds project and found a free aligner provided by the Linguistic Data Consortium at Penn. You can either submit the audio file online and get the result emailed to you, or download and install the aligner and run it yourself. For longer files (more than 15 minutes of audio) I found it can take quite a few hours, but it outputs the alignment by word. You do have to do a bit of work to parse the output to get it to line back up with your text, though. The link is:

http://www.ling.upenn.edu/phonetics/p2fa/

-Reuben Wetherbee
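
The "line back up with your text" step might look something like the following Python sketch. It is not Reuben's code, and it deliberately skips parsing the aligner's output file: it assumes the word timings have already been extracted into `(word, start_seconds, end_seconds)` tuples, and that the aligner's words are uppercase and unpunctuated and may include extra markers (the `sp` pause marker in the usage note is an assumed example). The function names are hypothetical.

```python
import re
from difflib import SequenceMatcher


def normalize(token):
    """Uppercase and strip punctuation so text tokens compare with aligner words."""
    return re.sub(r"[^A-Z0-9']", "", token.upper())


def realign(original_text, aligned_words):
    """Attach aligner timings to the tokens of the original text.

    aligned_words is a list of (word, start_seconds, end_seconds) tuples,
    assumed to be already extracted from the aligner's output.
    Returns a list of (original_token, start, end); start and end are None
    for tokens that could not be matched to an aligned word.
    """
    tokens = original_text.split()
    text_keys = [normalize(t) for t in tokens]
    aligned_keys = [normalize(w) for w, _, _ in aligned_words]
    result = [(t, None, None) for t in tokens]
    # Find runs of words that appear in the same order in both sequences.
    matcher = SequenceMatcher(a=text_keys, b=aligned_keys, autojunk=False)
    for block in matcher.get_matching_blocks():
        for offset in range(block.size):
            _, start, end = aligned_words[block.b + offset]
            result[block.a + offset] = (tokens[block.a + offset], start, end)
    return result
```

For example, `realign("Hello, world!", [("HELLO", 0.1, 0.5), ("sp", 0.5, 0.6), ("WORLD", 0.6, 1.0)])` keeps the original punctuation and capitalisation on each token while borrowing the timings from the aligner, and simply ignores the pause entry. Tokens left with `None` timings could be filled in afterwards by interpolating between their matched neighbours.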
