CCExtractor version: 0.94
ccextractorwinfull.exe -autoprogram -out=srt -bom -utf8 file.ts
After running countless DVB recordings through ccextractor to extract the subtitles from the teletext, this file was the first that was not processed at all. Instead, I got this console output:
```
ccextractorwinfull.exe -out=srt -bom -utf8 file.ts
CCExtractor 0.94, Carlos Fernandez Sanz, Volker Quetschke.
Teletext portions taken from Petr Kutalek's telxcc
Input: K:\file.ts
[Extract: 1] [Stream mode: Autodetect] [Program : Auto ] [Hauppage mode: No] [Use MythTV code: Auto] [CEA-708: 63 decoders active] [CEA-708: using charset "none" for all services] [Timing mode: Auto] [Debug: No] [Buffer input: Yes] [Use pic_order_cnt_lsb for H.264: No] [Print CC decoder traces: No] [Target format: .srt] [Encoding: UTF-8] [Delay: 0] [Trim lines: No] [Add font color data: Yes] [Add font typesetting: Yes] [Convert case: No][Filter profanity: No] [Video-edit join: No] [Extraction start time: not set (from start)] [Extraction end time: not set (to end)] [Live stream: No] [Clock frequency: 90000] [Teletext page: Autodetect] [Start credits text: None] [Quantisation-mode: CCExtractor's internal function]
Opening file: K:\file.ts
Detected MP4 box with name: moov
File seems to be a MP4
Analyzing data with GPAC (MP4 library)
Opening 'K:\file.ts':
[iso file] Incomplete box 0000B00D - start 0 size 479044969
[iso file] Incomplete file while reading for dump - aborting parsing
Failed to open input file (gf_isom_open() returned error)
Total frames time: 00:00:00:000 (0 frames at 29,97fps)
Done, processing time = 0 seconds
```
Forcing the input file format with `-in=ts` worked and the subtitle file was created successfully, but I wanted to get to the root cause of the problem.
After going through the source to see how the format detection works, I saw that CCE scans the input for certain strings to determine the container format. At least that is how I understood it.
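To illustrate why a string-based check can misfire on a file like this, here is a minimal sketch. It is not CCExtractor's actual detection code: the function names, the synthetic packets, and the PID value `0x100` are all made up for the example. It only shows that a buffer can pass a structural transport stream check (sync byte `0x47` every 188 bytes) while still containing the four bytes `moov` somewhere in a payload.

```python
TS_PACKET_SIZE = 188
TS_SYNC_BYTE = 0x47


def contains_box_name(data: bytes, name: bytes = b"moov") -> bool:
    """Naive check: does the 4-byte box name appear anywhere in the buffer?"""
    return name in data


def looks_like_ts(data: bytes, min_packets: int = 5) -> bool:
    """Structural check: sync byte at the start of consecutive 188-byte packets."""
    if len(data) < min_packets * TS_PACKET_SIZE:
        return False
    return all(
        data[i * TS_PACKET_SIZE] == TS_SYNC_BYTE for i in range(min_packets)
    )


def make_ts_packet(payload: bytes) -> bytes:
    """Build a synthetic payload-only packet on hypothetical PID 0x100."""
    # Header bytes: sync, PID high bits, PID low bits, payload-only flag.
    header = bytes([TS_SYNC_BYTE, 0x01, 0x00, 0x10])
    body = payload[: TS_PACKET_SIZE - 4]
    return header + body.ljust(TS_PACKET_SIZE - 4, b"\xff")


# Nine well-formed packets, one of which happens to carry "moov" in its payload.
sample = b"".join(make_ts_packet(b"\x00" * 50) for _ in range(4))
sample += make_ts_packet(b"\x12\x34moov\x56")
sample += b"".join(make_ts_packet(b"\x00" * 50) for _ in range(4))

print("contains 'moov':", contains_box_name(sample))
print("looks like TS:  ", looks_like_ts(sample))
```

Both checks succeed on the synthetic buffer, so a detector that trusts the string match over the packet structure would pick the wrong format.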
I opened the TS file in a hex editor and searched for `moov`:
Position 727131 (0xB185B)
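For anyone who wants to repeat this check without a hex editor, the following sketch maps a byte offset to its TS packet and decodes the 4-byte packet header. It assumes plain 188-byte packets with the file starting on a packet boundary. The helper names `locate`, `parse_ts_header`, and `inspect_offset` are hypothetical and not part of CCExtractor.

```python
from typing import Tuple

TS_PACKET_SIZE = 188


def locate(offset: int, packet_size: int = TS_PACKET_SIZE) -> Tuple[int, int]:
    """Return (packet index, byte position inside that packet) for a file offset."""
    return divmod(offset, packet_size)


def parse_ts_header(packet: bytes) -> dict:
    """Decode the fixed 4-byte transport stream packet header."""
    if len(packet) < 4 or packet[0] != 0x47:
        raise ValueError("not aligned on a TS sync byte")
    pid = ((packet[1] & 0x1F) << 8) | packet[2]
    afc = (packet[3] >> 4) & 0x03  # adaptation_field_control
    return {
        "pid": pid,
        "payload_unit_start": bool(packet[1] & 0x40),
        "has_adaptation_field": bool(afc & 0x02),
        "has_payload": bool(afc & 0x01),
        "continuity_counter": packet[3] & 0x0F,
    }


def inspect_offset(path: str, offset: int) -> dict:
    """Read the packet containing `offset` and describe its header."""
    index, within = locate(offset)
    with open(path, "rb") as f:
        f.seek(index * TS_PACKET_SIZE)
        packet = f.read(TS_PACKET_SIZE)
    info = parse_ts_header(packet)
    info.update(packet_index=index, offset_in_packet=within)
    return info


print(locate(0xB185B))
```

Under those assumptions, offset 727131 falls in packet 3867 at byte 135. A packet counts as payload-only when `has_payload` is true and `has_adaptation_field` is false, in which case everything after the 4-byte header is payload.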
Luckily, that was a payload-only TS packet on the video PID, so I was free to overwrite the text with something else. I then ran the modified file through ccextractor, which worked:
```
ccextractorwinfull.exe -autoprogram -out=srt -bom -utf8 file_edit.ts
CCExtractor 0.94, Carlos Fernandez Sanz, Volker Quetschke.
Teletext portions taken from Petr Kutalek's telxcc
Input: K:\file_edit.ts
[Extract: 1] [Stream mode: Autodetect] [Program : Auto ] [Hauppage mode: No] [Use MythTV code: Auto] [CEA-708: 63 decoders active] [CEA-708: using charset "none" for all services] [Timing mode: Auto] [Debug: No] [Buffer input: Yes] [Use pic_order_cnt_lsb for H.264: No] [Print CC decoder traces: No] [Target format: .srt] [Encoding: UTF-8] [Delay: 0] [Trim lines: No] [Add font color data: Yes] [Add font typesetting: Yes] [Convert case: No][Filter profanity: No] [Video-edit join: No] [Extraction start time: not set (from start)] [Extraction end time: not set (to end)] [Live stream: No] [Clock frequency: 90000] [Teletext page: Autodetect] [Start credits text: None] [Quantisation-mode: CCExtractor's internal function]
Opening file: K:\file_edit.ts
File seems to be a transport stream, enabling TS mode
Analyzing data in general mode
VBI/teletext stream ID 2701 (0xa8d) for SID 2004 (0x7d4)
- Programme Identification Data = ProSieben.at
- Universal Time Co-ordinated = Tue Mar 8 15:33:44 2022
Notice: Teletext page with possible subtitles detected: 149
- No teletext page specified, first received suitable page is 149, not guaranteed
100% | 34:00
Teletext decoder: 51004 packets processed
Number of NAL_type_7: 0
Number of VCL_HRD: 0
Number of NAL HRD: 0
Number of jump-in-frames: 0
Number of num_unexpected_sei_length: 0
Min PTS: 25:29:18:443
Max PTS: 26:03:18:563
Length: 00:34:00:120
Done, processing time = 4 seconds
```