
Removing spaces in a string

Gopala Tumuluri

Jun 24, 1996

Hello Folks.....

I am reading a line of input that is as follows " 20".
When I read it using :


$num = <some_file>;
and then
print $num;

It prints it as " 20" as expected.

I want to remove those spaces and then print $num which should print "20".

How would one do this?
I read a book, but it did not help much.


Mike Mitchell

Jun 24, 1996

$num=~s/^\s{1,}//;
will do it.
Mike
--
These opinions are those of the author only and do not represent
the official or unofficial corporate policy of BellSouth. Whew!

Paul de Werk

Jun 24, 1996
to Gopala Tumuluri

Have you read the FAQ?

How about:

$num =~ s/^\s+//;

--
#include <std/disclaimer.h>
Paul de Werk, BSCS
MCI SGUS
pde...@campus.mci.net

ev...@storm.stud.unit.no

Jun 24, 1996

In article <4qmlpa$b...@shell.fore.com>, Gopala Tumuluri wrote:
>I am reading a line of input that is as follows " 20".
>When I read it using :
>
>$num = <some_file>;
>and the
>print $num;
>
>It prints it as " 20" as expected.
>
>I want to remove those spaces and then print $num which should print "20".
>

If you like, you can just go ahead and use $num as a number. That is,

print $num + 5;

gives the output

25

In fact the following also works:

$num = <>;
$num += 0;
print ">$num<";

>How would one do this.
>Read a book but well it did not help much.
>

The solutions that strip the spaces are alternatives, but in some cases
you can simply go ahead and use $num as a number... :-)
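
A minimal sketch of the numeric-context approach, assuming the line read from the file is " 20\n" (the names $line, $sum and $num are illustrative only). When Perl converts a string to a number it skips leading whitespace and ignores the trailing newline:

```perl
use strict;
use warnings;

# Hypothetical input line, as it might arrive from a file read.
my $line = " 20\n";

# Using the string in arithmetic converts it to a number;
# the leading space and the trailing newline are ignored.
my $sum = $line + 5;

# Adding 0 replaces the string with its numeric value.
my $num = $line;
$num += 0;

print "$sum\n";     # prints 25
print ">$num<\n";   # prints >20<
```

This only helps when the value really is a number. For arbitrary text, one of the substitutions is still needed.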

--
: Even Holen, Berg Prestegård, Jonsvannsvn 45, N-7016 Trondheim
\|/ : mailto:ev...@stud.ntnu.no, Even....@unimed.sintef.no
/¯¯¯\ : www: http://www.stud.ntnu.no/~evenh/
m |. .| m : 'Christians are not perfect they are forgiven'


Kirk Haines

Jun 24, 1996
to Gopala Tumuluri

> I am reading a line of input that is as follows " 20".
> When I read it using :
>
>
> $num = <some_file>;
> and the
> print $num;
>
> It prints it as " 20" as expected.
>
> I want to remove those spaces and then print $num which should print "20".
>
> How would one do this.
> Read a book but well it did not help much.

Read about regular expressions. That should help a lot.

To do this, a simple way would be:

$num = <some_file>;
$num =~ s/\s//;
print "$num\n";

Kirk Haines
OSH Consulting


Clay Shirky

Jun 25, 1996

>> Read a book but well it did not help much.

Try R. Schwartz's "Learning Perl". It has a very good chapter on Perl
regular expressions.

>$num =~ s/\s//;

This will only remove the first space. To remove all the leading
spaces you could do

$num =~ s/^\s+//; # ^\s+ says "one or more spaces at the beginning of $num"

To remove all spaces anywhere

$num =~ s/\s+//g; # the 'g' says "do this globally"
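
To make the difference concrete, here is a small sketch that runs all three substitutions on one sample string. The sample value and the variable names are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical sample: two leading spaces and one space in the middle.
my $input = "  20 30";

# Copy the input, then apply each substitution to its own copy.
(my $first_only = $input) =~ s/\s//;     # removes only the first whitespace character
(my $leading    = $input) =~ s/^\s+//;   # removes the whole leading run
(my $everywhere = $input) =~ s/\s+//g;   # removes every run of whitespace

print "[$first_only]\n";   # prints [ 20 30]
print "[$leading]\n";      # prints [20 30]
print "[$everywhere]\n";   # prints [2030]
```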

--
Clay Shirky


Mark-Jason Dominus

Jun 25, 1996

In article <31CEE9...@bsc.bls.com>,
Mike Mitchell <mitche...@bsc.bls.com> wrote:
>$num=~s/^\s{1,}//;

{1,} is weird. Most people would write `+' instead. If you write
`{1,}', people will wonder what you are doing that they don't
understand.

--

Mark-Jason Dominus m...@plover.com

Frank Varnavas

Jun 25, 1996

Hey does anyone have a way to trim leading and trailing blanks
from a string in 1 line? I keep doing:

$str =~ s/^\s*//;
$str =~ s/\s*$//;

And I can't find a better way to do it.

thanks,
frank (varn...@ny.ubs.com)

Mike Mitchell

Jun 25, 1996

> >$num=~s/^\s{1,}//;
>
> {1,} is weird. Most people would write `+' instead. If you write
> `{1,}', people will wonder what you are doing that they don't
> understand.

Sorry, this is a habit from a split command I had written often,
which split on two characters, or {2,2}. Actually, I had originally
written it as \s+ but changed it later. They both work.

Quentin Fennessy

Jun 25, 1996

In article <4qp0bu$k...@ns2.ny.ubs.com>, Frank Varnavas <varnavas@news> wrote:
>Hey does anyone have a way to trim leading and trailing blanks
>from a string in 1 line? I keep doing:
>
>$str =~ s/^\s*//;
>$str =~ s/\s*$//;

How about

$str =~ s/^\s+(\S*)\s+$/$1/;


--
Quentin Fennessy AMD, Austin Texas

Clay Shirky

Jun 25, 1996

>Hey does anyone have a way to trim leading and trailing blanks
>from a string in 1 line? I keep doing:

>$str =~ s/^\s*//;
>$str =~ s/\s*$//;

>And I can't find a better way to do it.

$str =~ s/(^\s|\s$)*//g;

Note however, that not only is this not a better way to do it, it is
in many ways a worse one. While I do not know perl's guts well enough
to know whether you come out fractionally ahead or behind on compile
or run time by doing it in one pass, I know for sure that you lose out
on readability and maintainability by not doing it exactly as you
wrote it.

Good code is understandable, maintainable, concise, and elegant, in
that order.

--
Clay Shirky


Clay Shirky

Jun 25, 1996

In <DtKHy...@txnews.amd.com> que...@benson.amd.com (Quentin Fennessy) writes:

> $str =~ s/^\s+(\S*)\s+$/$1/;

This will break on phrases which have internal whitespace. It will
also break on phrases which don't have one or more leading _and_
trailing whitespaces.

$str =~ s/^\s*(.*?)\s*$/$1/;
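
A short sketch comparing the two substitutions on a few made-up strings. The sub names trim_strict and trim_lazy are hypothetical, chosen only for this illustration, and the strings are assumed to contain no embedded newlines:

```perl
use strict;
use warnings;

# The earlier suggestion: requires whitespace on both ends and
# allows no whitespace in the middle.
sub trim_strict {
    my ($s) = @_;
    $s =~ s/^\s+(\S*)\s+$/$1/;
    return $s;
}

# The replacement: padding is optional and the middle is matched lazily.
sub trim_lazy {
    my ($s) = @_;
    $s =~ s/^\s*(.*?)\s*$/$1/;
    return $s;
}

for my $sample ("  hello world  ", " hello", "hello") {
    print "[", trim_strict($sample), "] [", trim_lazy($sample), "]\n";
}
```

On the first two samples the strict form fails to match and leaves the string untouched, while the lazy form strips the padding.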

--
Clay Shirky

Quentin Fennessy

Jun 26, 1996

Whoops. I stand corrected. (Also by Alan Su <al...@postgres.Berkeley.EDU>).

Clay's solution is correct. Thanks.

Tad McClellan

Jun 28, 1996

Mike Mitchell (mitche...@bsc.bls.com) wrote:
: > >$num=~s/^\s{1,}//;
: >
: > {1,} is weird. Most people would write `+' instead. If you write
: > `{1,}', people will wonder what you are doing that they don't
: > understand.

: sorry, this is a habit from a split command I had
: written often which split on two characters or
: {2,2}. Actually, I had originally written it \s+ but
  ^^^^^
: changed it later. They both work.


{2,2} is weird too (IMHO). Why not just {2} ?


--
Tad McClellan, Logistics Specialist (IETMs and SGML guy)
email: mccle...@lfwc.lockheed.com

The "Battle of the Sexes" is perpetuated by fraternizing with the enemy.

Mike Mitchell

Jun 28, 1996

Tad,

This is because :: was a delimiter that was used in
a file. Often the field would be blank, so you
might have:

field1::field2::::::field5::::field7\n

If I split like this
(@fields)=split(/:{2,2}/,$record);

I get the right break out regardless of the blank
fields.
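
As a rough sketch of that split, using the sample record above without its trailing newline (the variable names are only illustrative). It uses /:{2}/, which matches exactly two colons just as /:{2,2}/ does:

```perl
use strict;
use warnings;

# Sample record from above, minus the trailing newline.
my $record = "field1::field2::::::field5::::field7";

# Each pair of colons is one delimiter; blank fields come back
# as empty strings.
my @fields = split /:{2}/, $record;

print scalar(@fields), " fields\n";   # prints 7 fields
print join("|", @fields), "\n";       # prints field1|field2|||field5||field7
```

One thing to keep in mind: split drops trailing empty fields unless it is given a negative limit, as in split(/:{2}/, $record, -1).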

Mike

Jeffrey Friedl

Jul 1, 1996

Clay Shirky wrote:
> >Hey does anyone have a way to trim leading and trailing blanks
> >from a string in 1 line? I keep doing:
> >
> >$str =~ s/^\s*//;
> >$str =~ s/\s*$//;
> >
> >And I can't find a better way to do it.
>
> $str =~ s/(^\s|\s$)*//g;

You probably mean s/(^\s+|\s+$)+//g;
                       -^---^--^-

> Note however, that not only is this not a better way to do it, it is
> in many ways a worse one. While I do not know perl's guts well enough
> to know whether you come out fractionally ahead or behind on compile

If you think about what the alternation requires Perl to do, it becomes
rather obvious that the second way can likely be much slower than the first,
"two-pass" way. The one-pass way is much less efficient because it has
to try each alternative at each position in the string. The two-pass way
will "just do it" and be done.

Jeffrey
----------------------------------------------------------------------------
Jeffrey Friedl <jfr...@omron.co.jp> Omron Corp, Nagaokakyo, Kyoto 617 Japan
See my Jap<->Eng dictionary at http://www.wg.omron.co.jp/cgi-bin/j-e
or at mirrors at [enterprise.ic.gc.ca] and [www.itc.omron.com]

Eric D. Friedman

Jul 4, 1996

[mailed, posted]

In article <31D74C8F...@omron.co.jp>,
Jeffrey Friedl <jfr...@omron.co.jp> wrote:
>Clay Shirky wrote:

>You probably mean s/(^\s+|\s+$)+//g;
>                       -^---^--^-
>
>> Note however, that not only is this not a better way to do it, it is
>> in many ways a worse one. While I do not know perl's guts well enough
>> to know whether you come out fractionally ahead or behind on compile
>
> If you think what the alternation must require Perl to do, it becomes
>rather obvious that the second way can likely be much slower than the first,
>"two-pass" way. The one-pass way is much less efficient because it has
>to try each alternative at each position in the string. The two-pass way
>will "just do it" and be done.

Doesn't the presence of the anchors (^ and $) prevent this? Just
curious....

--
Eric D. Friedman
frie...@uci.edu

Ian Alderman

Jul 10, 1996

>> >Hey does anyone have a way to trim leading and trailing blanks
>> >from a string in 1 line? I keep doing:

>> >$str =~ s/^\s*//;
>> >$str =~ s/\s*$//;

>> $str =~ s/(^\s|\s$)*//g;

>You probably mean s/(^\s+|\s+$)+//g;

Actually this doesn't work, because it doesn't remove both leading and
trailing blanks.

To get both, try the following (perl 5 required for the ?).

$str =~ s/^\s*(.*?)\s*$/$1/;

Pythagoras C. Watson

Jul 10, 1996

In article <4s0hnm$i...@blather.cs.cornell.edu>,
Ian Alderman <i...@cs.cornell.edu> wrote:
:>> >Hey does anyone have a way to trim leading and trailing blanks
:>> >from a string in 1 line? I keep doing:
:
:>> >$str =~ s/^\s*//;
:>> >$str =~ s/\s*$//;
:
:>> $str =~ s/(^\s|\s$)*//g;
:
:>You probably mean s/(^\s+|\s+$)+//g;
:
:Actually this doesn't work, because it doesn't remove both leading and
:trailing blanks.

Well, actually it does:
perl -e '$_=" hello "; s/(^\s+|\s+$)+//g; print "[$_]\n"'
prints:
[hello]

Note that the last plus accomplishes nothing, except to slow the regex
down. A better form is:
s/^\s+|\s+$//g;

:To get both, try the following (perl 5 required for the ?).
:
:$str =~ s/^\s*(.*?)\s*$/$1/;

However, this doesn't always give the same results when $str contains
newlines. For example:
perl -e '$_=" hello \n world "; s/^\s*(.*?)\s*$/$1/; print "[$_]\n"'
prints:
[ hello 
 world ]
(i.e. it fails), while:
perl -e '$_=" hello \n world "; s/^\s+|\s+$//g; print "[$_]\n"'
prints:
[hello 
 world]

And just for the fun of it, I benchmarked these three regexes plus the
plain old two-line version, and got:

Benchmark: timing 1000 iterations of ORd_1, ORd_2, separate, subMiddle...
ORd_1: 8 secs ( 8.32 usr 0.03 sys = 8.35 cpu)
ORd_2: 16 secs (15.20 usr 0.00 sys = 15.20 cpu)
separate: 4 secs ( 3.98 usr 0.00 sys = 3.98 cpu)
subMiddle: 11 secs (10.59 usr 0.00 sys = 10.59 cpu)

Benchmark: timing 10 iterations of ORd_1, ORd_2, separate, subMiddle...
ORd_1: 11 secs (10.83 usr 0.00 sys = 10.83 cpu)
ORd_2: 20 secs (20.18 usr 0.00 sys = 20.18 cpu)
separate: 6 secs ( 5.36 usr 0.00 sys = 5.36 cpu)
subMiddle: 15 secs (14.73 usr 0.01 sys = 14.74 cpu)

Given these results, I recommend the two-line version, unless speed and
readability aren't concerns.
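
For anyone who wants the two-line version without repeating it everywhere, one option is to wrap it in a small sub. This is only a sketch, and the name trim is an assumption made for this example:

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the two-statement version.
# It returns a trimmed copy and leaves the argument untouched.
sub trim {
    my ($str) = @_;
    $str =~ s/^\s+//;   # strip leading whitespace
    $str =~ s/\s+$//;   # strip trailing whitespace
    return $str;
}

print "[", trim("  hello world  "), "]\n";   # prints [hello world]
```

Callers then get their one line, $str = trim($str);, while the regexes stay simple.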

For those that care, here is the benchmark source:

#!/usr/local/bin/perl -w

use Benchmark;

# Sample strings: no padding, leading only, trailing only, both,
# and one with an embedded newline.
@X = ( 'none', ' leading', 'trailing ',
       ' leading/trailing ', " embeded newline\n<-- here ");

# First show what each approach produces on the sample data.
print "Data\n----\n ";
for (@X) { print " [$_]"; }
print "\nORd_1\n-----\n ";
@Y = @X; for (@Y) { s/^\s+|\s+$//g; print " [$_]"; }
print "\nORd_2\n-----\n ";
@Y = @X; for (@Y) { s/(^\s+|\s+$)+//g; print " [$_]"; }
print "\nseparate\n--------\n ";
@Y = @X; for (@Y) { s/^\s+//; s/\s+$//; print " [$_]"; }
print "\nsubMiddle\n---------\n ";
@Y = @X; for (@Y) { s/^\s*(.*?)\s*$/$1/; print " [$_]"; }
print "\n\n";

# Double the data four times (5 -> 80 strings), then time each approach.
for (1..4) {
    push @X, @X;
}
timethese( 1000, {
    ORd_1     => q{ @Y = @X; for (@Y) { s/^\s+|\s+$//g; } },
    ORd_2     => q{ @Y = @X; for (@Y) { s/(^\s+|\s+$)+//g; } },
    separate  => q{ @Y = @X; for (@Y) { s/^\s+//; s/\s+$//; } },
    subMiddle => q{ @Y = @X; for (@Y) { s/^\s*(.*?)\s*$/$1/; } }
});

# Double seven more times (80 -> 10240 strings) and time fewer iterations.
print "\n";
for (1..7) {
    push @X, @X;
}
timethese( 10, {
    ORd_1     => q{ @Y = @X; for (@Y) { s/^\s+|\s+$//g; } },
    ORd_2     => q{ @Y = @X; for (@Y) { s/(^\s+|\s+$)+//g; } },
    separate  => q{ @Y = @X; for (@Y) { s/^\s+//; s/\s+$//; } },
    subMiddle => q{ @Y = @X; for (@Y) { s/^\s*(.*?)\s*$/$1/; } }
});

--
Py -- 3.141592653589793238462643383279502884197169399375105...
Pythagoras Watson -- "Live long and may all your kernels pop."
INET: p...@ecst.csuchico.edu ============ COMPUSERVE: 72162,2676

Clay Shirky

Jul 14, 1996

>>>> $str =~ s/^\s*//;
>>>> $str =~ s/\s*$//;

>>> $str =~ s/(^\s|\s$)*//g;

>>You probably mean s/(^\s+|\s+$)+//g;

>Actually this doesn't work, because it doesn't remove both leading and
>trailing blanks.

Yes it does. The 'g' modifier says match multiple times. The first
match is against leading spaces, the second is against trailing
spaces.
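
One way to see the two matches is to look at the return value of the substitution, which is the number of replacements made. A small sketch, using the simpler s/^\s+|\s+$//g form on a made-up sample string:

```perl
use strict;
use warnings;

# Hypothetical sample with padding on both sides.
my $str = "  hello  ";

# s///g returns the number of substitutions it performed.
my $count = ($str =~ s/^\s+|\s+$//g);

print "$count substitutions, result [$str]\n";   # prints 2 substitutions, result [hello]
```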

-clay shirky

