Regular expressions aren't well suited to tasks like checking line lengths and moving line contents based on differences in those lengths.
A better method is a text filter written in a scripting language, which can measure text lengths and change strings based on runtime evaluations.
Below is a Perl script text filter that takes as input a selection or a whole file of SRT-formatted text. It finds every SRT sequence entry with two lines of dialog text and re-wraps those lines to more equal lengths, leaving the second line longer if necessary for proper word wrapping.
I've named it reformat_subtitle_text.pl and saved it in BBEdit's Text Filters folder so that it is listed in BBEdit's Text Filters palette. If desired, you can also set a keyboard shortcut for it.
You'll probably want to enhance the reformatting logic in the fixup_dialog subroutine to handle cases where simple two-line word wrapping produces awkward results. For example, what appears to be two-person dialog text like:
- Shall I get you something, Micke?
- No, I don't have time.
or
- Whose turn is it today?
- Malin's, isn't it?
with your simple word wrapping rule gets reformatted as:
- Shall I get you something,
Micke? - No, I don't have time.
- Whose turn is it
today? - Malin's, isn't it?
In the SRT formatting rules I found, "-" has no defined markup rule, so perhaps it is just an informal convention that people use to indicate multiple people speaking.
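One possible way to handle that case is to detect the leading dashes and leave such entries alone. The following is only a sketch: is_two_speaker_dialog is a hypothetical helper that is not part of the script below, and the second sample pair is made up for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not part of reformat_subtitle_text.pl):
# returns 1 when both dialog lines start with a dash, the informal
# "two speakers" convention, and 0 otherwise.
sub is_two_speaker_dialog {
    my ($line1, $line2) = @_;
    return ($line1 =~ /^\s*-/ && $line2 =~ /^\s*-/) ? 1 : 0;
}

# Try it on one two-speaker entry and one ordinary (made-up) entry
for my $pair (
    [ "- Shall I get you something, Micke?\n", "- No, I don't have time.\n" ],
    [ "I was wondering whether you\n",         "could help me with this?\n" ],
) {
    my ($l1, $l2) = @$pair;
    my $verdict = is_two_speaker_dialog($l1, $l2) ? "leave as is" : "rewrap";
    print "$verdict: $l1";
}
```

If this check suits your files, fixup_dialog could call it first and return $line1 . $line2 unchanged whenever it reports a two-speaker entry.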
SRT formatting rules also allow simple markup annotations (e.g., bold: <b> </b>), which make the displayed length of the text differ from the length of a subtitle entry's raw dialog text. This script doesn't try to deal with that complicating issue.
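If you do want to account for markup, one approach is to measure each line with the tags stripped out. This is a rough sketch under the assumption that the markup is limited to simple angle-bracket tags; visible_length is a hypothetical helper and is not part of the script below.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not part of reformat_subtitle_text.pl):
# length of a dialog line as displayed, ignoring simple angle-bracket tags
# such as <b>, </i> or <font color="...">.
sub visible_length {
    my ($text) = @_;
    (my $plain = $text) =~ s/<\/?[A-Za-z][^>]*>//g;   # work on a copy
    return length($plain);
}

my $raw = "This is <b>very</b> important";
printf "raw length: %d, visible length: %d\n", length($raw), visible_length($raw);
```

This only measures the text. Wrapping would still have to be done on the raw line so that the tags are kept, which is where most of the extra work would be.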
reformat_subtitle_text.pl:
#!/usr/bin/env perl
use strict;
use warnings;
use Text::Wrap;
use POSIX qw/ceil/;
my $subtitles = '';
# regex to dissect one subtitle entry 1) sequence number and time range, 2) first dialog text line,
# and 3) second dialog text line
my $seq_item_re = qr/(\d+\n\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}\n)(.+\n)(.+\n)/;
# read in all the input subtitle text
$subtitles = do { local $/; <STDIN> };
# extract each and all subtitle entries with two lines of dialog text
# and replace them with reformatted version
$subtitles =~ s/$seq_item_re/$1 . fixup_dialog($2, $3)/mge;
# output the reformatted subtitles
print $subtitles;
# reformat two lines of dialog text to have more equal line lengths with line two the longer if
# necessary for proper word wrapping
sub fixup_dialog {
    my ($line1, $line2) = @_;
    # trim trailing white space
    $line1 =~ s/\s+$//;
    $line2 =~ s/\s+$//;
    # ideal column width: half the combined length, rounded up, plus 1 because
    # Text::Wrap breaks lines at columns - 1; line two ends up the longer when needed
    my $ideal_col_width = ceil((length($line1) + length($line2))/2) + 1;
    my $total_text = $line1 . " " . $line2 . "\n";
    # locally set wrapping parameters: no space-to-tab conversion, plus the column width constraint
    local($Text::Wrap::unexpand) = 0;
    local($Text::Wrap::columns) = $ideal_col_width;
    my $wrapped_text = wrap('', '', $total_text);
    # if word wrapping creates a third line, move it to the end of the second line,
    # restoring the space that wrap() dropped at the line break
    if ( $wrapped_text =~ m/(.+\n.+)\n(.+\n)/ ) {
        $wrapped_text = $1 . " " . $2;
    }
    return $wrapped_text;
}
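
To see how the column width arithmetic plays out, here is a small standalone sketch that applies the same calculation to one of the example entries above. It only repeats the width and wrap steps from fixup_dialog; the variable names are mine.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Text::Wrap;
use POSIX qw/ceil/;

my $line1 = "- Whose turn is it today?";
my $line2 = "- Malin's, isn't it?";

# Same arithmetic as fixup_dialog: half the combined length, rounded up, plus one
my $width = ceil((length($line1) + length($line2)) / 2) + 1;

# Text::Wrap breaks lines at columns - 1, which is what the "+ 1" compensates for
$Text::Wrap::unexpand = 0;
$Text::Wrap::columns  = $width;
my $wrapped = wrap('', '', "$line1 $line2\n");

print "columns = $width\n";
print $wrapped;
```

The raw output of wrap can run to more than two lines, which is why fixup_dialog folds any third line back onto the second.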