Closed Captions (CC) & Subtitles - How do you handle this?

Sven Stauber

Jul 21, 2017, 7:30:06 AM
to Opencast Users
Dear Opencast Adopters,

We are thinking about ways to deal with closed captions and subtitles. Since there seems to be knowledge out there in our community, I would like to ask some questions to get a better understanding of the options available.

1. What sources do you use for CC & Subtitles?
- Are you doing OCR on the presentation slides to show the slide text as CC?
- Are you doing Speech-To-Text Recognition and showing the results as CC?
- How do you handle subtitle tracks in video containers (e.g. MP4) of existing videos?
- Are you using external tools to generate CC? Which tools?
- Are you using other sources?

2. How do you ingest those CC/Subtitles into Opencast?
- Are you uploading CC/Subtitles as separate files or are they contained in a video container file (e.g. MP4) or both?
- How do you add CC/Subtitles to existing recordings already ingested in Opencast?
- What formats are you ingesting in case separate files are used (e.g. WebVTT)?

3. How do you edit CC/Subtitles managed by Opencast?
- Once you have CC/Subtitles in Opencast, how would you change them (e.g. fix a typo)?

4. How do you process subtitles?
- Are you extracting subtitles/CC from video containers?
- Are you adding subtitles/CC to video containers?
- Are you transforming subtitle/CC formats into others (e.g. any format to WebVTT)?

5. Visualization of CC and subtitles?
- Which video player are you using to visualize CC and/or subtitles?
- Which CC/subtitle formats does your player support?
- Does the player get the CC/subtitles directly from the video container file (e.g. MP4), or is there an additional file containing the CC/subtitles?
- Does the solution work with adaptive streaming (HLS, MPEG-DASH)?

6. What are the biggest headaches you have encountered when dealing with CC & Subtitles?

Looking forward to getting some insights into CC & Subtitles in the Opencast world.

Best regards,
Sven

Dietmar Zenker

Jul 21, 2017, 8:01:11 AM
to Opencast Users
Hi Sven,

I can answer at least one of your questions:


Are you using external tools to generate CC? Which tools?

IMO one of the most sophisticated (and free!) tools is Amara. I've used this web service from time to time.

Greets,
Dietmar

Rute Santos

Jul 21, 2017, 10:22:16 AM
to us...@opencast.org

Hi Sven,


On 7/21/17 7:30 AM, Sven Stauber wrote:
1. What sources do you use for CC & Subtitles?
- Are you doing OCR on the presentation slides to show the slide text as CC?
No

- Are you doing Speech-To-Text Recognition and showing the results as CC?
Yes

- How do you handle subtitle tracks in video containers (e.g. MP4) of existing videos?
We don't have that.

- Are you using external tools to generate CC? Which tools?
Yes, IBM Watson speech-to-text

- Are you using other sources?
Human-generated text for special courses that require 100% accuracy, provided by 3Play.


2. How do you ingest those CC/Subtitles into Opencast?
- Are you uploading CC/Subtitles as separate files or are they contained in a video container file (e.g. MP4) or both?
We are ingesting them as a separate file (DFXP at this point).

- How do you add CC/Subtitles to existing recordings already ingested in Opencast?
We use a WOH (workflow operation handler) to attach the captions to the media package and re-publish it.

- What formats are you ingesting in case separate files are used (e.g. WebVTT)?
DFXP, but we will probably use WebVTT in the future.
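
For simple DFXP files, a move to WebVTT can be largely mechanical. The following is a minimal Python sketch under stated assumptions, not the workflow described in this reply: it assumes every caption is a `<p>` element whose `begin` and `end` attributes are already in `HH:MM:SS.mmm` clock time, and it ignores styling and positioning. The function name `dfxp_to_webvtt` and its helpers are hypothetical.

```python
import xml.etree.ElementTree as ET


def _local(tag: str) -> str:
    """Return an element's tag name without its XML namespace."""
    return tag.rsplit("}", 1)[-1]


def _cue_text(p: ET.Element) -> str:
    """Collect the text of a <p> element, turning <br/> into line breaks."""
    parts = [p.text or ""]
    for child in p:
        if _local(child.tag) == "br":
            parts.append("\n")
        else:
            parts.append("".join(child.itertext()))
        parts.append(child.tail or "")
    text = "".join(parts).strip()
    # WebVTT cue text must not contain a raw "&" or "<".
    return text.replace("&", "&amp;").replace("<", "&lt;")


def dfxp_to_webvtt(dfxp_text: str) -> str:
    """Convert simple DFXP/TTML text to WebVTT text.

    Assumption: begin/end are clock times such as 00:00:01.000, which
    WebVTT accepts unchanged. Offset times ("12.5s") or frame-based
    times are NOT handled by this sketch.
    """
    root = ET.fromstring(dfxp_text)
    cues = []
    for elem in root.iter():
        # Match on the local name so any DFXP/TTML namespace version works.
        if not isinstance(elem.tag, str) or _local(elem.tag) != "p":
            continue
        begin, end = elem.get("begin"), elem.get("end")
        if not begin or not end:
            continue  # skip paragraphs without explicit timing
        cues.append(f"{begin} --> {end}\n{_cue_text(elem)}")
    return "WEBVTT\n\n" + "\n\n".join(cues) + "\n"
```

Real-world DFXP often carries timing on parent elements or uses other time expressions, so a production conversion would need more than this.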


3. How do you edit CC/Subtitles managed by Opencast?
- Once you have CC/Subtitles in Opencast, how would you change them (e.g. fix a typo)?
Currently manually: download the captions file, edit, run a workflow to replace the captions file and re-publish. We are planning to implement a captions editor, accessible from the player.


4. How do you process subtitles?
- Are you extracting subtitles/CC from video containers?
- Are you adding subtitles/CC to video containers?
- Are you transforming subtitle/CC formats into others (e.g. any format to WebVTT)?

5. Visualization of CC and subtitles?
- Which video player are you using to visualize CC and/or subtitles?
Paella

- Which CC/subtitle formats does your player support?
- Does the player get the CC/subtitles directly from the video container file (e.g. MP4), or is there an additional file containing the CC/subtitles?
Additional file.

- Does the solution work with adaptive streaming (HLS, MPEG-DASH)?
We are not using adaptive yet so maybe Paella folks can answer this?


6. What are the biggest headaches you have encountered when dealing with CC & Subtitles?
Accuracy for automated captions :)

    Thanks,

    Rute




Carlos Turro Ribalta

Jul 26, 2017, 6:46:22 AM
to us...@opencast.org

Dear Sven

 

Sorry for the delay in answering

 

1. What sources do you use for CC & Subtitles?

- Are you doing OCR on the presentation slides to show the slide text as CC?

No

- Are you doing Speech-To-Text Recognition and showing the results as CC?

Yes

- How do you handle subtitle tracks in video containers (e.g. MP4) of existing videos?

We don’t use them

- Are you using external tools to generate CC? Which tools?

We use poliTrans (http://politrans.upv.es) through a REST API

- Are you using other sources?

No

 

2. How do you ingest those CC/Subtitles into Opencast?

- Are you uploading CC/Subtitles as separate files or are they contained in a video container file (e.g. MP4) or both?

We have had an ugly hack in place since 1.4 that sends the audio to poliTrans, and we then serve the files to Paella directly from poliTrans. Our plan is to move to uploading them as separate files in 4.0.

 

- How do you add CC/Subtitles to existing recordings already ingested in Opencast?

The same

 

- What formats are you ingesting in case separate files are used (e.g. WebVTT)?

DFXP

 

3. How do you edit CC/Subtitles managed by Opencast?

- Once you have CC/Subtitles in Opencast, how would you change them (e.g. fix a typo)?

poliTrans has an embedded web editor (external to Opencast)

 

4. How do you process subtitles?

- Are you extracting subtitles/CC from video containers?

No

- Are you adding subtitles/CC to video containers?

No

- Are you transforming subtitle/CC formats into others (e.g. any format to WebVTT)?

Also .srt
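
When .srt files are in the mix, converting them to WebVTT mostly means adding the `WEBVTT` header, dropping the cue counters, and changing the millisecond separator from a comma to a dot. A minimal Python sketch, assuming well-formed SubRip input; the name `srt_to_webvtt` is hypothetical and is not part of Opencast, Paella, or poliTrans.

```python
import re

# Matches an SRT timestamp such as 00:01:02,500 (comma before milliseconds).
_SRT_TIME = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3})")


def srt_to_webvtt(srt_text: str) -> str:
    """Convert SubRip (.srt) text to WebVTT text."""
    # Drop a leading byte order mark, if any, and normalize line endings.
    text = srt_text.lstrip("\ufeff").replace("\r\n", "\n").replace("\r", "\n")
    cues = []
    for block in re.split(r"\n{2,}", text.strip()):
        lines = block.split("\n")
        # SRT cues start with a numeric counter; WebVTT does not need it.
        if lines and lines[0].strip().isdigit():
            lines = lines[1:]
        if not lines or "-->" not in lines[0]:
            continue  # skip anything that is not a timed cue
        # WebVTT uses a dot, not a comma, before the milliseconds.
        lines[0] = _SRT_TIME.sub(r"\1.\2", lines[0])
        cues.append("\n".join(lines))
    return "WEBVTT\n\n" + "\n\n".join(cues) + "\n"
```

Usage would be along the lines of reading the .srt file as UTF-8 text, passing it to `srt_to_webvtt`, and writing the result to a `.vtt` file that the player loads as an additional track.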

 

5. Visualization of CC and subtitles?

- Which video player are you using to visualize CC and/or subtitles?

Paella

 

- Which CC/subtitle formats does your player support?

DFXP and WebVTT

 

- Does the player get the CC/subtitles directly from the video container file (e.g. MP4), or is there an additional file containing the CC/subtitles?

No, it gets the subtitles from an additional file

 

- Does the solution work with adaptive streaming (HLS, MPEG-DASH)?

Yes

 

6. What are the biggest headaches you have encountered when dealing with CC & Subtitles?

1- Audio quality: if there are issues with audio levels, there is nothing that can be done

2- Permissions to edit. Usually it is not the author who has to review the subtitles

3- Audio quality ;-)

 

Hope this helps

 

Carlos
