Count characters with regex?

Doug Pinkerton

Apr 22, 2024, 1:31:18 PM
to BBEdit Talk
I would like to create a text factory to change each instance of

# This is a level 1 heading
to
This is a level 1 heading
=========================

And I would like to change each instance of

## This is a level 2 heading
to
This is a level 2 heading
-------------------------

My grep skills are rudimentary. I’m using this.

Find: ^## ([A-Za-z: ,]+)\r
Replace: \1\r----------

It works as desired, almost. But instead of ten hardcoded dashes, I would like for the number of dashes to equal the number of characters in the previous line (once the initial hashmarks are removed). I’m in over my head. Can someone help?

Thanks.

Rich Siegel

Apr 22, 2024, 3:02:37 PM
to BBEdit Talk
On 22 Apr 2024, at 13:31, Doug Pinkerton wrote:

> I would like to create a text factory to change each instance of
>
> # This is a level 1 heading
> to
> This is a level 1 heading
> =========================
>
> And I would like to change each instance of
>
> ## This is a level 2 heading
> to
> This is a level 2 heading
> -------------------------

Grep is not really the right tool for this sort of transformation, but it could certainly be scripted.

I asked ChatGPT, which said:

===

Unfortunately, PCRE (Perl Compatible Regular Expressions) does not support
variable-length replacement strings based on the length of the matched string.
This is because regular expressions are designed for pattern matching, not for
complex string manipulation.

However, you can achieve this in a programming language that supports regular
expressions and string manipulation. Here's an example in Python:

```python
import re

def replace_headings(text):
    def repl(match):
        level = len(match.group(1))   # number of leading '#' characters
        heading = match.group(2)      # heading text without the hash marks
        return heading + '\n' + ('=' if level == 1 else '-') * len(heading)

    return re.sub(r'^(#{1,2}) (.*)$', repl, text, flags=re.MULTILINE)

text = """
# This is a level 1 heading

## This is a level 2 heading
"""

print(replace_headings(text))
```

This script defines a function `replace_headings` that takes a string of text
and returns a new string with the headings replaced. The `re.sub` function is
used to search for headings and replace them. The replacement function `repl` is
called for each match, and it generates a string of equals or hyphens of the
same length as the heading. The `flags=re.MULTILINE` argument makes the `^` and
`$` anchors match the start and end of each line, not just the start and end of
the whole string.

===

So you could try something like that.

R.

--
Rich Siegel Bare Bones Software, Inc.
<sie...@barebones.com> <https://www.barebones.com/>

Someday I'll look back on all this and laugh... until they sedate me.

GP

Apr 22, 2024, 4:06:58 PM
to BBEdit Talk
For a human generated solution... 

Since simple grep search and replace can't easily handle generating variable length strings whose length is determined by the length of a found text group, something like a simple perl text filter is an easier way to handle it.

A simple perl text filter like:

#!/usr/bin/perl -w
use strict;

# Set up the underline character for each heading level
my $ulCharL1 = "=";
my $ulCharL2 = "-";

# For each line, check for and reformat level 1 and level 2 headings
while (<>) {
    chomp;
    if ($_ =~ /^## (.+)$/) {
        print $1, "\n", $ulCharL2 x length($1), "\n";
    } elsif ($_ =~ /^# (.+)$/) {
        print $1, "\n", $ulCharL1 x length($1), "\n";
    } else {
        print $_, "\n";
    }
}

will quickly do the job.

I saved it as "underline_headings.pl" in BBEdit's Text Filters support folder. (If desired, you can assign it a keyboard shortcut in the Text Filters palette.)

Then you can apply the text filter to either the frontmost whole file/text window or just to selected text in the frontmost file/text window.

In the match patterns for heading lines, I'm assuming heading lines will always start at the beginning of a line with # or ## followed by just one space character followed by the rest of the heading text to the end of that line. If those assumptions don't hold for the headings found in your real-world text, you'll need to modify the match patterns.
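
If the headings turn out to be messier than that, the pattern can be loosened. Here is a minimal sketch of one way to do it, written in Python rather than Perl. It assumes you want to tolerate extra spaces or tabs after the hash marks, trailing whitespace, and an optional closing run of hashes (as in `## Title ##`). The names `HEADING` and `underline_headings` are made up for illustration and are not part of the filter above.

```python
import re

# Hypothetical loosened pattern (an assumption, not the filter above):
#   1-2 hashes, one or more spaces/tabs, the heading text,
#   then an optional closing run of hashes and trailing whitespace.
HEADING = re.compile(
    r'^(#{1,2})[ \t]+(\S.*?)(?:[ \t]+#+)?[ \t]*$',
    re.MULTILINE,
)

def underline_headings(text):
    def repl(match):
        # One hash means level 1 ('='), two hashes mean level 2 ('-').
        char = '=' if len(match.group(1)) == 1 else '-'
        title = match.group(2)
        return title + '\n' + char * len(title)
    return HEADING.sub(repl, text)
```

With this pattern, `##   Title ##` becomes `Title` over five dashes, a heading that ends in a hash without a preceding space (such as `# Learning C#`) keeps its final `#`, and lines starting with three or more hashes are left alone.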

Also, if you have additional levels of heading besides the two, to handle those just declare the underline character for that heading level and add another elsif clause to find/match that heading level and print the reformatted heading text with the underline text line.
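
The same idea can be written as a table lookup, so that adding a level is one more entry instead of another branch. This is only a sketch, in Python rather than Perl. The `~` for level 3 is purely an assumption for illustration (standard Markdown only defines underlined headings for levels 1 and 2), and `UNDERLINES`, `convert_line` and `convert` are made-up names.

```python
import re

# Heading level -> underline character. Levels 1 and 2 are the ones from
# this thread; level 3 ('~') is a hypothetical addition, not standard Markdown.
UNDERLINES = {1: '=', 2: '-', 3: '~'}

def convert_line(line):
    """Underline a heading line; return any other line unchanged."""
    match = re.match(r'^(#+) (.+)$', line)
    if match and len(match.group(1)) in UNDERLINES:
        title = match.group(2)
        return title + '\n' + UNDERLINES[len(match.group(1))] * len(title)
    return line

def convert(text):
    # Process the text line by line, as the Perl filter does.
    return '\n'.join(convert_line(line) for line in text.split('\n'))
```

Heading levels that have no entry in the table (here, four or more hashes) pass through untouched.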

Doug Pinkerton

Apr 22, 2024, 7:19:46 PM
to BBEdit Talk
I think I can make the perl work. Thanks very much to both of you.

Doug Pinkerton

Apr 23, 2024, 1:23:20 PM
to BBEdit Talk
I just implemented the perl filter. It works perfectly. 
Thanks, GP.
