Sorting transcripts


ajames

Mar 11, 2021, 4:43:55 PM
to BBEdit Talk
Complete novice here exploring an option for a workflow, thank you for the help. Totally out of my area here.

Trying to combine on-set television logging scripts into a master script. 2 separate loggers concentrating on different talent, producing separate scripts, e.g.:

LOGGER 1

[15:57:13.22]
PRODUCER:
WHERE ARE YOU FROM? 

[18:57:15:00]
CONTESTANT 1:
I'm from Austin Texas. Wait should I say........

[18:57:16.13]
PRODUCER:
I THINK YOU SHOULD SAY AUSTIN AS ONE. YOU ARE EARING A COBOYS HAT. 

[15:57:27:03]
CONTESTANT 2:
Or should I say both? I'm from Austin but I live in Vegas. 

LOGGER 2

[15:57:14.22] 
PRODUCER 2:
TELL ME, TELL ME.

[15:57:15.22] 
CONTESTANT 2: 
I run the junior program at my country club, I help with event set ups and I do all the retail buying for the country club.

[15:57:20.22] 
PRODUCER 2:
CAN WE NOW PUT YOUR INTRO ALL TOGETHER? SO YOU'RE BASICALLY IN ONE THOUGHT 'MY NAME'S CONTESTANT 2, I'M FROM CALIFORNIA AND THIS IS WHAT I DO.'

[15:57:31.22] 
CONTESTANT 2: 
Okay, um. I'm contestant 2, I'm from California and I'm a golf shop manager in New York.

I need to combine these scripts into a single document for our edit software to read, but I need to keep the spacing and format of each small block of text. I just need to sort by time, e.g.:

[15:57:13.22]
PRODUCER:
WHERE ARE YOU FROM? 

[15:57:14.22] 
PRODUCER 2:
TELL ME, TELL ME.

[18:57:15:00]
CONTESTANT 1:
I'm from Austin Texas. Wait should I say........

[15:57:15.22] 
CONTESTANT 2: 
I run the junior program at my country club, I help with event set ups and I do all the retail buying for the country club.

[18:57:16.13]
PRODUCER:
I THINK YOU SHOULD SAY AUSTIN AS ONE. YOU ARE EARING A COBOYS HAT. 

[15:57:20.22] 
PRODUCER 2:
CAN WE NOW PUT YOUR INTRO ALL TOGETHER? SO YOU'RE BASICALLY IN ONE THOUGHT 'MY NAME'S CONTESTANT 2, I'M FROM CALIFORNIA AND THIS IS WHAT I DO.'

[15:57:27:03]
CONTESTANT 2:
Or should I say both? I'm from Austin but I live in Vegas. 

[15:57:31.22] 
CONTESTANT 2: 
Okay, um. I'm contestant 2, I'm from California and I'm a golf shop manager in New York.

Thank you.

James Reynolds

Mar 11, 2021, 4:47:05 PM
to bbe...@googlegroups.com
When I do something like this I usually get rid of the returns by searching for "\n" or "\r" and replacing it with garbage text like asdf. Then I would search for "[" and replace it with "\n[". Then sort. Then replace "asdf" with "\r". You'll be back to where you started but everything will be sorted.

James Reynolds
Sr Systems Administrator
School of Biological Sciences
The University of Utah
801-585-3086

ajames

Mar 11, 2021, 5:39:37 PM
to BBEdit Talk
Perfect! Thank you very much.

jj

Mar 12, 2021, 4:22:20 AM
to BBEdit Talk
@ajames,

Maybe this can help you manually execute the workflow described by James.

1. Create an empty BBEdit text document:

    File > New > Text Document
    
2. Insert your files into the new document:

    Edit > Insert > File Contents...

3. Check that the imports are separated by a blank line.

4. Copy this command:

    perl -pe 's/(.)\n/$1\t/g' | sort | perl -pe 's/\t/\n/g'
    
5. Paste the command in:

    Text > Run Unix Command...
    
6. Choose your preferred output options and click OK.

If you need to further automate the process you could create a shell script:

    ```
    #!/bin/sh
    
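    cd "/PATH/TO/SOURCE/DIRECTORY" || exit 1 # Replace with the directory path; stop if it cannot be entered.
    # Keep the output file outside the source directory so `cat *` does not read it back in.
    cat * | perl -pe 's/(.)\n/$1\t/g' | sort | perl -pe 's/\t/\n/g' > "/PATH/TO/OUTPUT_FILE.txt" # Replace with the output file path.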
    ```
    
HTH

Best regards,

Jean Jourdain

MediaMouth

Mar 12, 2021, 9:27:36 AM
to bbe...@googlegroups.com

Did you get a solution?

Your question suggested you were looking to combine the transcripts, then sort by the given (source?) time codes, but in your example it shows the combined transcript ordered strangely -- the hour volleying between 15 and 18, yet the mm:ss:ff ordered chronologically.

Is that just a typo or is your challenge a little more technically involved?

Christopher Stone

Apr 19, 2021, 11:45:06 PM
to BBEdit-Talk
Hey James,

I'm late to this party, but I wanted to see how fun it would be to do the job in Perl.

It turned out to be really simple:



#!/usr/bin/env perl -0777 -nsw
# --------------------------------------------------------------------------------
# Auth: Christopher Stone
# dCre: 2012/11/27 08:12
# dMod: 2021/04/19 21:36 
# Task: Sort CC Script Segments for a Video.
# Tags: @ccstone, @Shell, @Script, @CC, @Sort, @Video
# --------------------------------------------------------------------------------
use v5.010;

s!^LOGGER\h*\d+\h*!!gim;        # Remove “Logger ##” lines if necessary.
s!\A\s+|\s+\Z!!;                # Remove leading and trailing vertical whitespace.
s!^\h+|\h+$!!gm;                # Remove leading and trailing horizontal whitespace on every line.
my @array = split(/\n\s+/, $_); # Split script segments into an array.
chomp(@array);                  # Remove trailing linefeeds from records.
@array = sort(@array);          # Sort array.
$, = "\n\n";                    # Set Output Field Separator.
print @array;                   # Print array.



Save the script in:

~/Library/Application Support/BBEdit/Text Filters/<YourScriptName>.pl

Give it a keyboard shortcut in BBEdit's Menus & Shortcuts preferences – or activate it from the Text > Apply Text Filter menu.

Note: script segments must have at least one blank line between them.

--
Best Regards,
Chris



@lbutlr

Apr 25, 2021, 1:35:04 AM
to BBEdit Talk
On 19 Apr 2021, at 21:46, Christopher Stone <listmei...@gmail.com> wrote:
> use v5.010;

Excellent script, but is this needed? And what are the consequences if you have 5.32 installed?


--
"I know she's in there," said Verence, holding his crown in his hands
in the famous Ai-Se-or-Mexican-Bandits-Have-Raided-Our-Village
position

Christopher Stone

Apr 26, 2021, 2:23:42 AM
to BBEdit-Talk
On 04/25/2021, at 00:34, @lbutlr <kre...@kreme.com> wrote:
> On 19 Apr 2021, at 21:46, Christopher Stone <listmei...@gmail.com> wrote:
>> use v5.010;
>
> Excellent script, but is this needed? And what are the consequences if you have 5.32 installed?


Hey Lewis,

Thank you.

No, `use v5.010;` is not required in this case.

It does not designate the Perl version used per se – it designates the lowest version of Perl that can run the script.

I frequently set the lowest version to 5.010, because it allows `say` to be used in addition to `print`.  (Without the use statement `say` will throw an error.)

You can run this to demonstrate to yourself that your current Perl version is the one being used.



#!/usr/bin/env perl -sw
use v5.010;

print "Perl $^V\n";



Appended is a slight mod of my script – I hadn't made the strip vertical whitespace code global, and under certain circumstances that could lead to an extra linefeed between a couple of dialog blocks.

I also added a regex to normalize the spacing between dialog blocks just for good measure.

--
Best Regards,
Chris



#!/usr/bin/env perl -0777 -nsw
# --------------------------------------------------------------------------------
# Auth: Christopher Stone
# dCre: 2012/11/27 08:12
# dMod: 2021/04/26 00:07
# Task: Sort CC Script Segments for a Video.
# Tags: @ccstone, @Shell, @Script, @CC, @Sort, @Video
# --------------------------------------------------------------------------------
# use v5.010;

s!^LOGGER\h*\d+\h*!!gim;        # Remove “Logger ##” lines if necessary.
s!^\h+|\h+$!!gm;                # Remove leading and trailing horizontal whitespace on every line.
s!\n{2,}!\n\n!gm;               # Normalize spacing between script blocks.
s!\A\s+|\s+\Z!!g;               # Remove leading and trailing vertical whitespace.
my @array = split(/\n\s+/, $_); # Split script segments into an array.
chomp(@array);                  # Remove trailing linefeeds from records.
@array = sort(@array);          # Sort array.
$, = "\n\n";                    # Set Output Field Separator.
print @array;                   # Print array.