Regex to find blank lines and lines with only spaces


Mike Pullen

Jan 9, 2017, 10:27:20 AM
to BBEdit Talk

1. This regular expression works in BBEdit to find all blank lines and all lines containing only whitespace:   ^\n|^\s+\n


2. However, this regular expression does not find all of those same lines:  ^\s+$


I've attached the file (test.txt) that I am using to test both regexes.


I would like to understand why the second regex doesn't work.


Help, please.


Thanks.





test.txt

Marc Simpson

Jan 9, 2017, 12:48:29 PM
to bbe...@googlegroups.com
\s+ matches one or more whitespace characters; as such, blank lines
proper (i.e. without trailing whitespace) won't be matched by ^\s+$.

Try ^\s*$ (lines containing 0 or more whitespace characters).
> --
> This is the BBEdit Talk public discussion group. If you have a
> feature request or would like to report a problem, please email
> "sup...@barebones.com" rather than posting to the group.
> Follow @bbedit on Twitter: <http://www.twitter.com/bbedit>
> ---
> You received this message because you are subscribed to the Google Groups
> "BBEdit Talk" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to bbedit+un...@googlegroups.com.
> To post to this group, send email to bbe...@googlegroups.com.
> Visit this group at https://groups.google.com/group/bbedit.

David Wagner

Jan 9, 2017, 12:48:29 PM
to BBEdit Talk
Change the plus to a star (^\s*$) and it will pick up lines with just a carriage return, or with spaces and a carriage return... ;)

Wags ;)
WagsWorld
Hebrews 4:15
Ph(primary) : 408-914-1341
Ph(secondary): 408-761-7391

Bruce Linde

Jan 9, 2017, 12:48:30 PM
to BBEdit Talk
your first expression says find all blank lines or those containing only white space

your second expression says find all lines containing one or more (specified by the plus sign) white space characters. you've specifically told it to skip empty lines, because they contain no white space characters at all.
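
To make the difference between the two quantifiers concrete, here is a minimal sketch in Python rather than BBEdit (Python's `re` module is only a stand-in, and the sample lines are made up). Each line is tested on its own, so line endings do not enter into it.

```python
import re

# Hypothetical sample lines: text, an empty line, a spaces-only line, text.
lines = ["one", "", "    ", "two"]

# \s+ needs at least one whitespace character, so the empty line fails.
plus_hits = [bool(re.fullmatch(r"\s+", line)) for line in lines]

# \s* accepts zero or more, so the empty line passes too.
star_hits = [bool(re.fullmatch(r"\s*", line)) for line in lines]

print(plus_hits)  # [False, False, True, False]
print(star_hits)  # [False, True, True, False]
```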

Mike Pullen

Jan 12, 2017, 7:19:44 AM
to BBEdit Talk
^\s*$ does not work for me in BBEdit.  BBEdit does not find the empty lines that precede the line containing "four" or the line containing "six" in my test file.

This perl one-liner works: perl -pe 's/^\s*$//' test.txt.  I think the same regex should work in BBEdit but doesn't on my system.  I'm using BBEdit 11.6.4 and had the same results with 11.6.3.
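
A plausible reason the Perl one-liner behaves differently is that `perl -p` hands the substitution one line at a time, so \s* can never run across a line boundary. Below is a rough Python sketch of that line-by-line behavior. The sample text is an assumption, not the contents of test.txt, and Python's `re` module stands in for Perl.

```python
import re

# Assumed sample text; the real test.txt is not reproduced here.
text = "one\n\n    \ntwo\n"

kept = []
for line in text.splitlines(keepends=True):
    # Like perl -p, the pattern only ever sees a single line (with its newline).
    kept.append(re.sub(r"^\s*$", "", line))

result = "".join(kept)
print(repr(result))  # 'one\ntwo\n'
```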

Could someone try the regex in BBEdit and let me know if it works for them?


test.txt

Sam Hathaway

Jan 12, 2017, 7:38:42 AM
to bbe...@googlegroups.com, Mike Pullen
Doesn't work for me either, with latest bbedit. I'm not in front of my computer now, but iirc bbedit would select an empty line, including that line's linefeed, and the following line, consisting of 4 spaces, and that line's linefeed.

That is, in the document represented by this C string: "one\n\n    \n\ntwo\n"

It would match chars 5 through 10 (the first empty line AND the line with 4 spaces) in one fell swoop, but would not match char 11 (the 2nd empty line).

My uninformed guess is that bbedit is trying to be smart about whether it's matching in single line or multiline mode, and doesn't always get it right. (Personally, I'd like for \n to NEVER be a member of \s. You can always write [\s\n] if that's what you want.)
-sam
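
The behavior described above can be reproduced outside BBEdit. The sketch below uses Python's `re` module with `re.MULTILINE` as a stand-in (an assumption; BBEdit's own engine may differ in details). Because \s also matches \n, the greedy \s* runs across line breaks and then backs off to the last position where $ can match.

```python
import re

# The document from the post: "one", an empty line, four spaces,
# an empty line, "two".
doc = "one\n\n    \n\ntwo\n"

first = re.search(r"^\s*$", doc, re.MULTILINE)

# Offsets are 0-based with an exclusive end, so (4, 10) corresponds to
# chars 5 through 10 in the 1-based counting used in the post.
print(first.span())         # (4, 10)
print(repr(first.group()))  # '\n    \n'
```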

Rod Buchanan

Jan 12, 2017, 9:14:43 AM
to bbe...@googlegroups.com

Try this:

^\s*?$



-- 
Rod Buchanan
Kelly Supply Company
1004 W Oklahoma Ave
Grand Island, NE 68802-1328
308 382-8764 x1120

Dmitry Markman

Jan 12, 2017, 9:15:41 AM
to bbe...@googlegroups.com, Mike Pullen
try

^\s*?$




Dmitry Markman


Patrick Woolsey

Jan 12, 2017, 10:46:23 AM
to bbe...@googlegroups.com
On 1/11/17 at 8:59 PM, mike....@gmail.com (Mike Pullen) wrote:

>^\s*$ does not work for me in BBEdit. BBEdit does not find the
>empty lines that precede the line containing "four" or the
>line containing "six" in my test file.

It isn't expected to, since $ explicitly matches the position
_preceding_ the nearest line end:

[Chapter 8: Searching with Grep / page 166]

It is important to note that ^ and $ do not actually match return
characters. They match zero-width positions after and before returns,
respectively. So, if you are looking for "foo" at the end of a line,
the pattern "foo$" will match the three characters "f", "o", and "o".
If you search for "foo\r", you will match the same text, but the
match will contain four characters: "f", "o", "o", and [the
linebreak which follows].
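
The zero-width point in the passage above can be checked with a small sketch. It uses Python's `re` module as a stand-in for BBEdit's grep, and \n rather than \r because the made-up sample string uses linefeeds; both choices are assumptions for illustration.

```python
import re

# Made-up sample: "foo" sits at the end of the first line.
text = "a foo\nbar"

anchored = re.search(r"foo$", text, re.MULTILINE)  # $ is a zero-width position
explicit = re.search(r"foo\n", text)               # \n consumes the line break

print(len(anchored.group()))  # 3
print(len(explicit.group()))  # 4
```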



Regards,

Patrick Woolsey
==
Bare Bones Software, Inc. <http://www.barebones.com/>

David Wagner

Jan 12, 2017, 2:03:30 PM
to bbe...@googlegroups.com
Well, what can I say, but being an old guy who used perl for over 20 years, it worked for me in Perl, but as stated, does not in bbedit. I tried ^\s*\n and ^\s*\r and both replaced all lines.

I do apologize for not trying it in bbedit; I've used it so extensively in my Perl scripting that I just made the assumption, and in my environment it worked as I expected… ;)

Wags ;)
WagsWorld
Hebrews 4:15

Sam Hathaway

Jan 12, 2017, 2:22:29 PM
to bbe...@googlegroups.com
Patrick,

Say I have this file:
----8<-cut-here----
one



two
----8<-cut-here----

(Lines 2 and 4 are empty; line 3 consists of four spaces.)

And this pattern: ^\s*$

Shouldn’t it match these three ranges?
- zero chars on line 2
- four spaces on line 3
- zero chars on line 4

Instead, it seems to match only one range:
- 6 chars: the linefeed at the end of line 2, the four spaces on line
three, and the linefeed at the end of line 3.

I’m still not seeing why ^\s*$ would match chars 5 through 10 (the
first empty line AND the line with 4 spaces) in one fell swoop, but
would not match char 11 (the 2nd empty line).

Shouldn’t it match these three ranges?
- zero chars after char 4
- four chars (“    ”) starting with char 6 and ending with char 9
- zero chars after char 10

Confused!
-sam

Fletcher Sandbeck

Jan 12, 2017, 2:42:40 PM
to bbe...@googlegroups.com
I think it does what you're expecting if you change the pattern so the whitespace match isn't greedy: ^\s*?$

\s matches returns and newlines, so your pattern ^\s*$ should match any block of lines which contain only whitespace. With the non-greedy modifier it instead stops at the first end-of-line it finds, so it cycles through each line which contains only whitespace in turn.

I am seeing some strange behavior, which you mention, where if the cursor is at the start of the file it only matches lines 2 through 3, but upon wrapping around it matches lines 2 through 4 as I'd expect.

[fletcher]

Sam Hathaway

Jan 12, 2017, 3:02:45 PM
to bbe...@googlegroups.com
Ah, I think I see now. Thanks for the explanation. Looks like I usually
want \h instead of \s (so as to exclude \n, \r, and a few other, more
esoteric vertical whitespace characters).
-sam
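
A small sketch of that last point, again with Python as a stand-in: Python's built-in `re` module has no \h, so the class [ \t] approximates horizontal whitespace here (an assumption that ignores less common horizontal whitespace characters). With line breaks excluded from the class, each empty or spaces-only line is matched separately. The sample omits the trailing newline to keep the result short.

```python
import re

# "one", an empty line, four spaces, an empty line, "two" (no trailing newline).
doc = "one\n\n    \n\ntwo"

# [ \t] cannot match a line break, so no match can span two lines.
spans = [m.span() for m in re.finditer(r"^[ \t]*$", doc, re.MULTILINE)]

print(spans)  # [(4, 4), (5, 9), (10, 10)]
```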