--
You received this message because you are subscribed to the Google Groups "kaldi-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kaldi-help+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
I'm interested in this sort of thing. I've just done some googling: if you want the video and both sets of subtitles, the command you need is:
$ youtube-dl --write-sub --write-auto-sub 'https://www.youtube.com/watch?v=-l03NVQf-D8'
(You may need to run apt-get install youtube-dl first, or get it from https://github.com/rg3/youtube-dl.)
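Once both sets of subtitles are downloaded, one rough way to put a number on "how many words were right" is a word error rate between the manual and the automatic text. The following is only a minimal sketch: it assumes the plain text has already been extracted from the two subtitle files, and the names `words` and `word_error_rate` are made up for illustration, not part of youtube-dl or Kaldi.

```python
import re


def words(text):
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words.

    Hypothetical helper: `reference` would be the manual subtitle text,
    `hypothesis` the auto-generated one.
    """
    ref, hyp = words(reference), words(hypothesis)
    if not ref:
        raise ValueError("reference transcript is empty")
    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    manual = "The Prime Minister will answer the question now."
    auto = "the prime minister will answer a question"
    print(word_error_rate(manual, auto))  # 2 errors over 8 reference words
```

A score of about 0.1 would correspond to the "9/10 words were right" impression, although speaker labels and formatting in the manual subtitles would need to be stripped first for the comparison to be fair.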
Regarding real time and latency: everything appears with no significant latency because either the ASR (in the case of automatic subtitles) or the alignment (in the case of manual transcription) is done in advance. If you have a look at the downloads, you can see the formats used. These are then stitched together after the ASR/alignment, which is why there is no significant latency.
Kaldi can do real-time recognition, but the latency is on the order of 10s or so.
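To see for yourself that the timings are computed ahead of playback, you can read the start and end time of each cue straight out of a downloaded subtitle file. The sketch below assumes the file is in WebVTT format (the thread does not say which format you will get), and `cue_times` and `to_seconds` are illustrative names only.

```python
import re

# A WebVTT cue timing line looks like "00:00:01.500 --> 00:00:04.000";
# the hours field is optional, and cue settings may follow the end time.
CUE_TIMING = re.compile(
    r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})\s+-->\s+"
    r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})"
)


def to_seconds(hours, minutes, seconds, millis):
    """Convert the captured timestamp fields to seconds."""
    return (int(hours or 0) * 3600 + int(minutes) * 60
            + int(seconds) + int(millis) / 1000)


def cue_times(vtt_text):
    """Return a (start, end) pair in seconds for every cue timing line."""
    cues = []
    for line in vtt_text.splitlines():
        match = CUE_TIMING.match(line.strip())
        if match:
            fields = match.groups()
            cues.append((to_seconds(*fields[:4]), to_seconds(*fields[4:])))
    return cues


if __name__ == "__main__":
    sample = (
        "WEBVTT\n\n"
        "00:00:01.500 --> 00:00:04.000\nfirst cue\n\n"
        "01:02.250 --> 01:05.000 align:start\nsecond cue\n"
    )
    for start, end in cue_times(sample):
        print(start, end)
```

Because every cue already carries its own start and end time, the player only has to display text at the right moment; no recognition happens while the video plays.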
On 29/10/16 17:59, Danijel Korzinek wrote:
You can actually turn on both the manual and the auto-generated transcriptions in that example. The auto-generated one seems real, especially when you see it stop transcribing when more than one person talks (yells) at once. However, we have no way of knowing whether the auto-generated subtitles use the manual ones for tuning. That would give the system a huge advantage and make it completely useless as a benchmark.
I suggest you upload a file of something that doesn't exist anywhere on YouTube and check that out.
On Friday, October 28, 2016 at 9:49:31 PM UTC+2, Dan Povey wrote:
Those are not automatically generated subtitles: notice that it says who is speaking, e.g. 'The Prime Minister: .... '. They were generated by a human.
You'd probably only get reasonably accurate subtitles (e.g. >90% accurate) if the acoustic model was trained on British English speech and the language model contained suitable data (e.g. parliamentary debates). This is true of any ASR system, not just Kaldi.
Dan
On Fri, Oct 28, 2016 at 7:05 AM, Sabr Tasbolatov <sabrtas...@gmail.com> wrote:
With YouTube's auto-generated English subtitles, the result looked nearly perfect to me: roughly 9 out of 10 words were right, and even at the UK Prime Minister's fluent speaking pace there was no significant latency.
Just curious: how far can Kaldi go with online ASR decoding?
Thanks
--
Speechmatics is a trading name of Cantab Research Limited
We are hiring: www.speechmatics.com/careers
Dr A J Robinson, Founder, Cantab Research Ltd
Phone direct: 01223 794096, office: 01223 794497
Company reg no GB 05697423, VAT reg no 925606030
51 Canterbury Street, Cambridge, CB4 3QG, UK