--------------------------------------------------------------------
4.32: How do I strip blank space from the beginning/end of a string?
(contributed by brian d foy)
A substitution can do this for you. For a single line, you want to
replace all the leading or trailing whitespace with nothing. You can do
that with a pair of substitutions.
    s/^\s+//;
    s/\s+$//;
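If you do this in more than one place, you might wrap the pair of
substitutions in a subroutine. The following is only a sketch: the name
"trim" is made up for this example and is not a Perl built-in, and it
returns a trimmed copy rather than changing its argument.

```perl
use strict;
use warnings;

# Hypothetical helper (not a built-in): return a trimmed copy of the
# argument and leave the caller's variable alone.
sub trim {
    my ($text) = @_;
    $text =~ s/^\s+//;    # remove leading whitespace
    $text =~ s/\s+$//;    # remove trailing whitespace, final newline included
    return $text;
}

# The brackets make the edges of the result visible.
print '[', trim("  \t hello world \n"), "]\n";
```

Working on a copy costs a little extra, but it keeps the original value
around if you still need it.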
You can also write that as a single substitution, although it turns out
the combined statement is slower than the separate ones. That might not
matter to you, though.
    s/^\s+|\s+$//g;
In this regular expression, the alternation matches either at the
beginning or the end of the string, since alternation has a lower
precedence than the anchors: the pattern groups as "(?:^\s+)|(?:\s+$)".
With the "/g" flag, the substitution makes all possible matches, so it
gets both. Remember, the trailing newline matches the "\s+", and the "$"
anchor can match at the physical end of the string, so the newline
disappears too. Just add the newline back in the output. That has the
added benefit of preserving "blank" lines (those consisting entirely of
whitespace), which the "^\s+" would otherwise remove all by itself.
    while( <> )
        {
        s/^\s+|\s+$//g;
        print "$_\n";
        }
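If you want to check the speed difference between the two forms on your
own perl, the core Benchmark module can compare them. This is a rough
sketch, not a definitive measurement: the sample string and the
subroutine names are invented for this example, and the numbers will
vary with your perl version and your data.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Made-up sample: padding on both sides plus a trailing newline.
my $sample = (' ' x 20) . 'some text in the middle' . (' ' x 20) . "\n";

# Two separate anchored substitutions.
sub trim_two_steps {
    my ($text) = @_;
    $text =~ s/^\s+//;
    $text =~ s/\s+$//;
    return $text;
}

# The same job as a single alternation with /g.
sub trim_one_step {
    my ($text) = @_;
    $text =~ s/^\s+|\s+$//g;
    return $text;
}

# Run each candidate for at least one CPU second and print a comparison table.
cmpthese(-1, {
    two_steps => sub { trim_two_steps($sample) },
    one_step  => sub { trim_one_step($sample) },
});
```

Try it with strings that look like your real input, since the gap
depends on how much text sits between the two ends.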
For a multi-line string, you can apply the regular expression to each
logical line in the string by adding the "/m" flag (for "multi-line").
With the "/m" flag, the "$" matches *before* an embedded newline, so it
doesn't remove it. It still removes the newline at the end of the
string.
    $string =~ s/^\s+|\s+$//gm;
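Here is a small sketch of what the "/m" version does to a three-line
string. The subroutine name "trim_each_line" and the sample data are
made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: trim every logical line inside one string.
sub trim_each_line {
    my ($string) = @_;
    $string =~ s/^\s+|\s+$//gm;
    return $string;
}

my $string  = "  alpha  \n\tbeta\t\n  gamma  \n";
my $trimmed = trim_each_line($string);

# The embedded newlines survive; the final newline does not.
print "[$trimmed]\n";
```

The output shows "alpha", "beta", and "gamma" on separate lines with no
surrounding whitespace, and the closing bracket directly after "gamma"
because the last newline is gone.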
Remember that lines consisting entirely of whitespace will disappear,
since the first part of the alternation can match the entire string and
replace it with nothing. If you need to keep embedded blank lines, you
have to do a little more work. Instead of matching any whitespace (which
includes a newline), match only the other whitespace characters.
    $string =~ s/^[\t\f ]+|[\t\f ]+$//mg;
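To see the difference, compare the two patterns on a string with a
whitespace-only line in the middle. This is a sketch with invented
subroutine names and sample data.

```perl
use strict;
use warnings;

# Hypothetical helper: "\s" also matches newlines, so a run of trailing
# whitespace can swallow a following whitespace-only line.
sub trim_any_whitespace {
    my ($string) = @_;
    $string =~ s/^\s+|\s+$//gm;
    return $string;
}

# Hypothetical helper: only tabs, form feeds, and spaces are removed,
# so every newline stays where it is.
sub trim_keep_newlines {
    my ($string) = @_;
    $string =~ s/^[\t\f ]+|[\t\f ]+$//mg;
    return $string;
}

my $string = "one  \n   \n  two\n";

print "with \\s:        [", trim_any_whitespace($string), "]\n";
print "with [\\t\\f ]:   [", trim_keep_newlines($string), "]\n";
```

With "\s" the result is "one\ntwo": the middle line and the final
newline are gone. With the character class the result is "one\n\ntwo\n":
the blank line is kept (now truly empty), and so is the final newline.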
--------------------------------------------------------------------
The perlfaq-workers, a group of volunteers, maintain the perlfaq. They
are not necessarily experts in every domain where Perl might show up,
so please include as much information as possible and relevant in any
corrections. The perlfaq-workers also don't have access to every
operating system or platform, so please include relevant details for
corrections to examples that do not work on particular platforms.
Working code is greatly appreciated.
If you'd like to help maintain the perlfaq, see the details in
perlfaq.pod.
> You can also write that as a single substitution, although it
> turns out the combined statement is slower than the separate ones.
> That might not matter to you, though.
> s/^\s+|\s+$//g;
One might think that it would be relatively trivial to optimize a
situation like this: if all alternations are anchored, just go to the
next anchor if the previous match fails (or something to that effect).
Does this make any sense?
--
szr
> > s/^\s+|\s+$//g;
>
> One might think that it would be relatively trivial to optimize a
> situation like this: if all alternations are anchored, just go to the
> next anchor if the previous match fails (or something to that effect).
>
> Does this make any sense?
Sure, that makes sense. Now just make the patch and send it to p5p so I
can get back to my single regex instead of two statements :)
Yes, a special case in the engine: check whether all the alternatives
are anchored when parsing. Shall we start a new thread for "special
cases" in the engine? Sounds interesting, why not?
sln
I will look into that, then. I already have the source code for 5.10.0
(and earlier releases), so I'll make some time to take a crack at it and
see if I can dish up something useful :-)
--
szr