Need IP Address Sort Subroutine


Joe Williams

Sep 29, 1998, 3:00:00 AM
Does anyone have, or can anyone point me toward a routine that will sort IP
addresses? Many thanks.

John Porter

Sep 29, 1998, 3:00:00 AM
Joe Williams wrote:
>
> Does anyone have, or can anyone point me toward a routine that will sort IP
> addresses? Many thanks.

I assume you want to sort them numerically.

You will therefore want to be able to convert dotted-quad
notation to an integer. The following sub does it:

sub dotted_to_int {
    return undef if $_[0] =~ /[^0-9.]/;
    my @a = split /\./, $_[0];
    my $n = pop @a;
    return undef unless $n < (256 ** ( 4 - @a ));
    for ( 0..$#a ) {
        return undef unless $a[$_] < 256;
        $n += ( $a[$_] << (8*(3-$_)) );
    }
    $n;
}

(It actually accepts all valid dotted-number notations,
not just dotted-quad.)
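
As a cross-check on the arithmetic, here is a minimal sketch (not from the thread) that handles full dotted quads only and uses pack/unpack instead of shifts. The name quad_to_int is made up for illustration; it is not a replacement for the sub above, which also accepts the shorter notations.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the original post: converts a full
# dotted quad to its 32-bit value. Short forms such as 10.12 are rejected.
sub quad_to_int {
    my ($ip) = @_;
    my @octets = split /\./, $ip, -1;          # -1 keeps trailing empty fields
    return undef unless @octets == 4;
    for my $octet (@octets) {
        return undef unless $octet =~ /\A\d{1,3}\z/ && $octet < 256;
    }
    # "C4" packs four unsigned bytes; "N" reads them back as one
    # unsigned 32-bit big-endian integer.
    return unpack 'N', pack 'C4', @octets;
}

# 200*2**24 + 255*2**16 + 255*2**8 + 255
print quad_to_int('200.255.255.255'), "\n";    # prints 3372220415
```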

Then you can use a Schwartzian Transform to do the sort:

@ips_sorted =
map { $_->[0] }
sort { $a->[1] <=> $b->[1] }
map { $_, dotted_to_int( $_ ) }
@ips;

Hope this helps.

--
John "Many Jars" Porter

Michal Rutka

Sep 30, 1998, 3:00:00 AM
John Porter <jdpo...@min.net> writes:

> Joe Williams wrote:
> >
> > Does anyone have, or can anyone point me toward a routine that will sort IP
> > addresses? Many thanks.
>
> I assume you want to sort them numerically.

Why? An IP address is a string of four characters. It is better to
sort them as strings.

> You will therefore want to be able to convert dotted-quad
> notation to an integer. The following sub does it:

Why? All these calculations only cost time. BTW, your code does not
work for IP 200.255.255.255. On my machine it returns -1. Anyway, every
address above 127.255.255.255 is 'unsortable' with your code. That is just 50% of
all addresses, so who cares?

>
> sub dotted_to_int {
> return undef if $_[0] =~ /[^0-9.]/;
> my @a = split /\./, $_[0];
> my $n = pop @a;
> return undef unless $n < (256 ** ( 4 - @a ));
> for ( 0..$#a ) {
> return undef unless $a[$_] < 256;
> $n += ( $a[$_] << (8*(3-$_)) );
> }
> $n;
> }
>
> (It actually accepts all valid dotted-number notations,
> not just dotted-quad.)
>
> Then you can use a Schwartzian Transform to do the sort:
>
> @ips_sorted =
> map { $_->[0] }
> sort { $a->[1] <=> $b->[1] }
> map { $_, dotted_to_int( $_ ) }
> @ips;
>
> Hope this helps.

Horror.

>
> --
> John "Many Jars" Porter

Hymm, how about this:

@ips_sorted = sort {pack("C4",split(/\./,$a)) cmp pack("C4",split(/\./,$b))}
@ips;

Should be faster and without this Schwartzian thing (what is it, anyway)?
BTW, your 'Schwartzian' code does not work for me, i.e. lots of uninitialized
values, and the result is empty.

Regards,

Michal

sjag...@my-dejanews.com

Sep 30, 1998, 3:00:00 AM

> Why? IP address is a string of four characters. It is better to
> sort them as strings.

Won't work. If you sorted addresses as strings, you'd have 130.4.aaa.bbb listed
AFTER 130.111.xxx.yyy while arranging in ascending order, because 1 comes
before 4.

One method would be to do a numerical comparison on each element of the IP address,
i.e. each block.

jagadish


Michal Rutka

Sep 30, 1998, 3:00:00 AM
to sjag...@my-dejanews.com
sjag...@my-dejanews.com writes:

> > Why? IP address is a string of four characters. It is better to
> > sort them as strings.
>
> Wont work.

Are you sure you know what you are talking about?

> If u sorted addreses as strings, u'd have 130.4.aaa.bbb listed
> AFTER 130.111.xxx.yyy while arranging in ascending order because 1 comes
> before 4

OK. I've run the following script (the sort routine to compare strings
is from my previous post):

@ips = ('130.111.0.0','130.4.0.0');


@ips_sorted = sort {pack("C4",split(/\./,$a)) cmp pack("C4",split(/\./,$b))}
@ips;

foreach(@ips_sorted){
print "$_\n";
}


And guess which result I got? Here it is:
130.4.0.0
130.111.0.0

Again, an IP address is a string of four 8-bit characters!

> one method wud be to do a numerical comparison on each element of the ip addr,
> ie each block.

Not true. I can compare a complete address, as in my code. I don't need to
do four comparisons.

Lack Mr G M

Sep 30, 1998, 3:00:00 AM
In article <6urq6t$pe2$1...@callisto.clark.net>, "Joe Williams" <will...@clark.net> writes:
|>
|> Does anyone have, or can anyone point me toward a routine that will sort IP
|> addresses? Many thanks.

No, but this simple script should give you an idea.

Basically, you use inet_aton to convert it to a 4-char item (so it
can sort out that 10.12 is really 10.0.0.12 and hence get it between
10.0.0.1 and 10.0.1.1 in the example) then sort that (*as a character
string*).

The sort uses the Orcish Maneuver (see Effective Perl at
http://www.effectiveperl.com/) so you only call inet_aton once
per-address, in case you have a lot of them...


==================================================================
use Socket;

@addr = ( '100.1.2.3',
'10.12',
'10.0.0.1',
'10.0.1.1',
'200.1.2.3',
'100.2.3.4',
'100.11.12.13',
'100.19.18.17');

{my %hc; # Localize cache hash
@sorted = sort {
($hc{$a} ||= inet_aton($a)) cmp ($hc{$b} ||= inet_aton($b))
} @addr;
}

{local $"=', ';
print "Original: @addr\n";

print "Sorts to: @sorted\n";
}

==================================================================

--
----------- Gordon Lack ----------------- gml...@ggr.co.uk ------------
The contents of this message *may* reflect my personal opinion. They are
*not* intended to reflect those of my employer, or anyone else.

John Porter

Sep 30, 1998, 3:00:00 AM
Michal Rutka wrote:
>
> BTW your code do not
> work for IP 200.255.255.255. On my machine it returns -1.

Then something is wrong with your machine.
I do not get that result. I get 3372220415.


> Anyway, every address above 127.255.255.255 is 'unsortable'
> by you. It just 50% of all addresses so who cares?

Not on my machine. What's your platform?


> Horror.

4q.


> @ips_sorted = sort {
> pack("C4",split(/\./,$a)) cmp pack("C4",split(/\./,$b))
> } @ips;

Good idea.
But your code causes the redundant splitting and packing
of some (possibly very many) of the strings.
You could wrap that little operation with a sub
which you can then memoize:

use Memoize;
sub pack_split { pack("C4",split(/\./,shift)) }
memoize 'pack_split';
@ips_sorted = sort { pack_split($a) cmp pack_split($b) } @ips;

Or you could use a Schwartzian Transform:

@ips_sorted =
map { $_->[0] }

sort { $a->[1] cmp $b->[1] }
map { $_, pack("C4",split(/\./,$_)) }
@ips;


> Should be faster and without this Schwartzian thing (what is it, anyway)?

Get a life. (Which is to say, read this newsgroup for a while.)

> BTW. Your 'Schwartzian' code does not work for me. I.e. lot of uninitialized
> values and result is empty.

Like I said...

John Porter

Sep 30, 1998, 3:00:00 AM
sjag...@my-dejanews.com wrote:
>
> > Why? IP address is a string of four characters. It is better to
> > sort them as strings.
>
> Wont work. If u sorted addreses as strings, u'd have 130.4.aaa.bbb listed
> AFTER 130.111.xxx.yyy while arranging in ascending order because 1 comes
> before 4

No, that's not what's going on.
He's converting each number (0..255) into the corresponding character,
and packing the four resulting characters into a string.
Remember, Perl doesn't care what bits are set in a byte, it can
still treat it as a "character".
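
A minimal sketch of that point, assuming nothing beyond core Perl: the plain string sort compares the digits of the text, while the packed sort compares the four byte values. The variable names are only illustrative.

```perl
use strict;
use warnings;

my @ips = ('130.111.0.0', '130.4.0.0');

# Plain string sort compares the text: "130.1..." sorts before "130.4...".
my @as_text = sort @ips;

# Packed sort compares four bytes per address: 130,4,0,0 sorts
# before 130,111,0,0.
my @as_bytes = sort {
    pack('C4', split /\./, $a) cmp pack('C4', split /\./, $b)
} @ips;

print "text:  @as_text\n";     # text:  130.111.0.0 130.4.0.0
print "bytes: @as_bytes\n";    # bytes: 130.4.0.0 130.111.0.0

# Each packed key is exactly four characters long.
print length(pack('C4', split /\./, '130.4.0.0')), "\n";   # 4
```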

Michal Rutka

Sep 30, 1998, 3:00:00 AM
to jdpo...@min.net
John Porter <jdpo...@min.net> writes:
> Michal Rutka wrote:
> >
> > BTW your code do not
> > work for IP 200.255.255.255. On my machine it returns -1.
>
> Then something is wrong with your machine.
> I do not get that result. I get 3372220415.

Sorry, typo, it should be 'for IP 255.255.255.255'.

[...]


> Not on my machine. What's your platform?

This is what uname says:
SunOS erhs86 5.5.1 Generic_103640-08 sun4u sparc SUNW,Ultra-Enterprise

[...]


> use Memoize;
> sub pack_split { pack("C4",split(/\./,shift)) }
> memoize 'pack_split';
> @ips_sorted = sort { pack_split($a) cmp pack_split($b) } @ips;
>

Hym. Never heard about Memorize. It is not present in my distribution.

> Or you could use a Schwartzian Transform:
>
> @ips_sorted =
> map { $_->[0] }
> sort { $a->[1] cmp $b->[1] }
> map { $_, pack("C4",split(/\./,$_)) }
> @ips;
>
>
> > Should be faster and without this Schwartzian thing (what is it, anyway)?
>
> Get a life. (Which is to say, read this newsgroup for a while.)

I don't have so much time to follow this group. I have some real work
to do too..

> > BTW. Your 'Schwartzian' code does not work for me. I.e. lot of uninitialized
> > values and result is empty.
>
> Like I said...

The point is that this thing is not portable; at least it does not
work on my Sparc. Therefore I will not bother to learn about things
which can cause compatibility problems. I would rather put it on the banned
syntax list. The same goes for integer conversion.

>
> --
> John "Many Jars" Porter

Regards

Michal

Uri Guttman

Sep 30, 1998, 3:00:00 AM
to John Porter
>>>>> "JP" == John Porter <jdpo...@min.net> writes:

JP> Michal Rutka wrote:
>>
>> BTW your code do not
>> work for IP 200.255.255.255. On my machine it returns -1.

JP> Then something is wrong with your machine.
JP> I do not get that result. I get 3372220415.

>> Anyway, every address above 127.255.255.255 is 'unsortable'
>> by you. It just 50% of all addresses so who cares?

JP> Not on my machine. What's your platform?


>> Horror.

JP> 4q.

you tell him, john! :-)

>> @ips_sorted = sort {
>> pack("C4",split(/\./,$a)) cmp pack("C4",split(/\./,$b))
>> } @ips;

JP> Good idea.

his split and pack is a good idea but you are right, of course, about
the redundancy that schwartzian removes.

JP> Or you could use a Schwartzian Transform:

JP> @ips_sorted =
JP> map { $_->[0] }
JP> sort { $a->[1] cmp $b->[1] }
JP> map { $_, pack("C4",split(/\./,$_)) }
JP> @ips;

i think that should be
JP> map { [$_, pack("C4",split(/\./,$_))] }

you never created the anon array.

>> Should be faster and without this Schwartzian thing (what is it, anyway)?

JP> Get a life. (Which is to say, read this newsgroup for a while.)

>> BTW. Your 'Schwartzian' code does not work for me. I.e. lot of
>> uninitialized values and result is empty.

he did find your bug, but he didn't fix it.


but why use the temp array when you can do it
linearly? you save the allocation of the temp arrays and dereferences in
the sort vs. the unpack at the end. and i use the socket routines to do
the work.

try this (tested):

perl -MSocket -le 'print join( "\n", map{ inet_ntoa($_) } sort map
{inet_aton $_ } <>)'

uri

--
Uri Guttman Fast Engines -- The Leader in Fast CGI Technology
u...@fastengines.com http://www.fastengines.com

Uri Guttman

Sep 30, 1998, 3:00:00 AM
>>>>> "MR" == Michal Rutka <erh...@erh.ericsson.se> writes:

MR> [...]


>> use Memoize;
>> sub pack_split { pack("C4",split(/\./,shift)) }
>> memoize 'pack_split';
>> @ips_sorted = sort { pack_split($a) cmp pack_split($b) } @ips;
>>

MR> Hym. Never heard about Memorize. It is not present in my
MR> distribution.

it's Memoize. no R. and it is on CPAN where all of the rest of perl
resides besides your distribution.

MR> I don't have so much time to follow this group. I have some real
MR> work to do too..

then why are you posting to this group? and john was trying to help you
improve your code and speed. he just had a typo in his schwartzian (see
my earlier followup).

>> > BTW. Your 'Schwartzian' code does not work for me. I.e. lot of
>> > uninitialized values and result is empty.
>>

>> Like I said...

he had a minor bug. the transform is incredibly useful if you would take
the 5 minutes to understand it. it is general purpose and has nothing to
do with IP numbers.

MR> The point is that this thing is not portable, at least it does not
MR> work on my Sparc. Therefore I will not bother to learn about
MR> things which can cause compability problems. I put it rather on
MR> the banned syntax list. The same counts for integer conversion.

it works on all perls. study it. it will save your job some day.

and what do you mean by integer conversion being banned syntax?

John Porter

Sep 30, 1998, 3:00:00 AM
For anyone who may be interested...
I have performed some benchmarks, and based on the results,
I make the following observations.

1. Michal gave the following formula for converting a dotted-quad
string into a string of four chars:
pack( "C4", split(/\./,$_) )
This is so fast that memoization has no effect.

2. The canonical way to do this conversion is:
inet_aton($_)
And it's about as fast.

3. Uri suggested using inet_ntoa to convert the sorted strings
back into dotted-quad notation:
map { inet_ntoa($_) } sort map { inet_aton($_) }
But my benchmarks indicate that this is about 18 times slower
than either of the above.

4. Using the Schwartzian Transform adds a TON of overhead no matter
what else you do, including memoize.


BTW, I had an error (bug) in my ST, which Michal ran into,
but didn't know how to diagnose.
This is how it should have looked:

@ips_sorted =
map { $_->[0] }

sort { $a->[1] <=> $b->[1] }

map { [ $_, dotted_to_int( $_ ) ] } # list REF!
@ips;

The difference is that the first map (that's the LOWER one!)
should have square brackets in it, so that it returns a
list ref, rather than a list of two things.
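
To make the difference concrete, here is a small sketch. It swaps in the pack/split key from earlier in the thread instead of dotted_to_int so that it runs on its own; the sample addresses are made up.

```perl
use strict;
use warnings;

my @ips = ('10.0.1.1', '200.1.2.3', '10.0.0.1');

# Without brackets, map returns a flat list: ip, key, ip, key, ...
my @flat  = map { ($_, pack('C4', split /\./, $_)) } @ips;
# With brackets, map returns one array reference per address.
my @pairs = map { [ $_, pack('C4', split /\./, $_) ] } @ips;

print scalar(@flat),  "\n";    # 6 plain strings, nothing to dereference
print scalar(@pairs), "\n";    # 3 references to [ip, key]

# Read bottom-up: build pairs, sort on the key, pull the address back out.
my @ips_sorted =
    map  { $_->[0] }
    sort { $a->[1] cmp $b->[1] }
    map  { [ $_, pack('C4', split /\./, $_) ] }
    @ips;

print "@ips_sorted\n";         # 10.0.0.1 10.0.1.1 200.1.2.3
```

With the flat list, `$_->[0]` and `$a->[1]` are applied to plain strings rather than references, which fits the empty result and the uninitialized-value warnings reported earlier.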

Uri Guttman

Sep 30, 1998, 3:00:00 AM
>>>>> "JP" == John Porter <jdpo...@min.net> writes:

JP> For anyone who may be interested...
JP> I have performed some benchmarks, and based on the results,
JP> I make the following observations.


JP> 1. Michal gave the following formula for converting a dotted-quad
JP> string into a string of four chars:
JP> pack( "C4", split(/\./,$_) )
JP> This is so fast that memoization has no effect.

JP> 2. The canonical way to do this conversion is:
JP> inet_aton($_)
JP> And it's about as fast.


JP> 3. Uri suggested using inet_ntoa to convert the sorted strings
JP> back into dotted-quad notation:
JP> map { inet_ntoa($_) } sort map { inet_aton($_) }
JP> But my benchmarks indicate that this is about 18 times slower
JP> than either of the above.

i wasn't trying to speed things up. but i wonder what is slow about
it. if inet_aton is about the same speed as pack/split, then inet_ntoa
must be a pig. maybe an unpack/join would be much faster. john could
you benchmark that and post all the results?

JP> 4. Using the Schwartzian Transform adds a TON of overhead no matter
JP> what else you do, including memoize.

i thought the anon array, derefs, etc were slow. in most cases the
savings on conversions in the sort compare wins with ST. in this case no
conversion and anon array is needed since you can convert on the fly and
still sort.

John Porter

Sep 30, 1998, 3:00:00 AM
Lack Mr G M wrote:
>
> The sort uses the Orcish Maneuver (see Effective Perl at
> http://www.effectiveperl.com/) so you only call inet_aton once
> per-address, in case you have a lot of them...

The Orcish Maneuver is a nice idea.
It's basically what Memoize encapsulates... although Memoize
has lots of other nifty features, like persistent caching
to disk.

By the way, did you benchmark your solution?
I did, and my benchmarks show that using the or-cache
actually slows down the code a tiny bit -- IN THIS CASE.

Moral: don't assume that techniques intended to speed up your
code necessarily do. use Benchmark;
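
For anyone who wants to repeat the measurement, a rough sketch with the core Benchmark module might look like the following. The data set, its size, the iteration count, and the sub names are all made up here; this is not the script behind the numbers above, and results will differ by machine and Perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Made-up test data: 500 pseudo-random dotted quads.
srand(42);
my @ips = map { join '.', map { int rand 256 } 1 .. 4 } 1 .. 500;

# Comparator that converts both operands on every comparison.
sub sort_plain {
    return sort {
        pack('C4', split /\./, $a) cmp pack('C4', split /\./, $b)
    } @ips;
}

# Same comparator with an or-cache (Orcish Maneuver) around the conversion.
sub sort_orcish {
    my %cache;
    return sort {
        ($cache{$a} ||= pack('C4', split /\./, $a))
            cmp
        ($cache{$b} ||= pack('C4', split /\./, $b))
    } @ips;
}

# Run each candidate 50 times and print a comparison table.
cmpthese(50, {
    plain  => sub { my @sorted = sort_plain()  },
    orcish => sub { my @sorted = sort_orcish() },
});
```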

John Porter

Sep 30, 1998, 3:00:00 AM
Michal Rutka wrote:
>
> > > BTW your code do not
> > > work for IP 200.255.255.255. On my machine it returns -1.
> Sorry, typo, it should be 'for IP 255.255.255.255'.

Even so, I get 4294967295 on my machine.


> This is what uname says:
> SunOS erhs86 5.5.1 Generic_103640-08 sun4u sparc SUNW,Ultra-Enterprise

Mine:
SunOS pollux 5.4 generic sun4m sparc


> I don't have so much time to follow this group. I have some real work
> to do too..

I understand.
But the Schwartzian Transform is well worth knowing.
You can read about it in section 4.15 of The Perl Cookbook.


> The point is that this thing is not portable, at least it does not
> work on my Sparc. Therefore I will not bother to learn about things
> which can cause compatibility problems.

You are wrong to assume that the error you got was a result of
incompatibility. There was simply a bug in my code, which
would have generated an error on *any* machine.

Kevin Reid

Sep 30, 1998, 3:00:00 AM
Joe Williams <will...@clark.net> wrote:

> Does anyone have, or can anyone point me toward a routine that will sort IP
> addresses? Many thanks.

#!perl -w

@ips = qw(
213.135.67.36
192.45.120.4
192.46.240.45
14.30.16.0
108.162.155.80
213.53.67.242
108.162.33.65
14.150.47.46
);

@sips =
map {join '.', unpack 'C4', $_}
sort
map {pack 'C4', split /\./, $_}
@ips;

print join "\n", @sips, '';

__END__

Doesn't handle all cases, but pretty fast.

BTW, I didn't use inet_ntoa() because when I tried it, it started my PPP
connection (probably because it wanted to talk to a DNS).

--
Kevin Reid. | Macintosh.
"I'm me." | Think different.

John Porter

Sep 30, 1998, 3:00:00 AM
Uri Guttman wrote:
>
> i wasn't trying to speed things up. but i wonder what is slow about
> it. if inet_aton is about the same speed as pack/split, then inet_ntoa
> must be a pig. maybe an unpack/join would be much faster. john could
> you benchmark that and post all the results?

Here are the functions; benchmarks follow.

use Socket;   # provides inet_aton and inet_ntoa

sub ntoa_aton {
    my @args = @_;
    map{ inet_ntoa($_) }
    sort
    map{ inet_aton($_) }
    @args;
}

sub ntoa_pack {
    my @args = @_;
    map{ inet_ntoa($_) }
    sort
    map{ pack("C4",split(/\./,$_)) }
    @args;
}

sub unpk_aton {
    my @args = @_;
    map{ join '.',unpack("C4",$_) }
    sort
    map{ inet_aton($_) }
    @args;
}

sub unpk_pack {
    my @args = @_;
    map{ join '.',unpack("C4",$_) }
    sort
    map{ pack("C4",split(/\./,$_)) }
    @args;
}

sub sort_aton {
    my @args = @_;
    sort {
        inet_aton($a)
        cmp
        inet_aton($b)
    } @args
}

sub sort_pack {
    my @args = @_;
    sort {
        pack("C4",split(/\./,$a))
        cmp
        pack("C4",split(/\./,$b))
    } @args
}

(4 iterations, sorting a list of 10,000 addresses:)

          aton   pack
unpk_       22     32
ntoa_       16     25
sort_        1      1

From this we can see that inet_aton() is faster than pack(split()),
in these cases, and that inet_ntoa() is faster than join(unpack()).

But we also see that doing a map/sort/map is MUCH slower than doing
a sort with an appropriate comparison inside... probably because of
the temporary lists that have to be created -- among other reasons.

(400 iterations:)
sort_aton: 103
sort_pack: 96

Interestingly, pack(split()) appears to be faster than inet_aton()
in this situation.

Mark-Jason Dominus

Sep 30, 1998, 3:00:00 AM
In article <36124B48...@min.net>, John Porter <jdpo...@min.net> wrote:
>You could wrap that little operation with a sub
>which you can then memoize:
>
> use Memoize;
> sub pack_split { pack("C4",split(/\./,shift)) }
> memoize 'pack_split';
> @ips_sorted = sort { pack_split($a) cmp pack_split($b) } @ips;

The `Orcish Maneuver' is usually better than memoization for sort
comparators. Here's why: Suppose your list @ips has 1,000 elements.
`sort' will want to do about 8,700 comparisons. It will make about
15,000 calls to `pack_split'.

With no memoization, you get 15,000 calls to pack_split, 15,000
splits, and 15,000 packs.

With `Memoize', you only have 1,000 splits and 1,000 packs, but you
still have 15,000 calls to `pack_split', although 14,000 of them
return immediately.

If you use the Orcish Maneuver, like this:

@ips_sorted = sort compare_ips @ips;
{ my %cache;
  sub compare_ips {
    ($cache{$a} ||= pack_split($a))
      cmp
    ($cache{$b} ||= pack_split($b))
  }
}

Then you have only 1,000 calls to pack_split.
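
The call counts can be checked with a counter. This is only a sketch under made-up assumptions (pseudo-random addresses, a counter bolted onto pack_split); the exact number of comparisons depends on the data and on the sort implementation, so no figures are quoted here.

```perl
use strict;
use warnings;

my $calls = 0;
sub pack_split { $calls++; return pack 'C4', split /\./, shift }

# Made-up data: 1,000 pseudo-random dotted quads.
srand(1);
my @ips = map { join '.', map { int rand 256 } 1 .. 4 } 1 .. 1000;

# No caching: two conversions per comparison.
my @plain = sort { pack_split($a) cmp pack_split($b) } @ips;
my $plain_calls = $calls;

# Orcish Maneuver: at most one conversion per distinct address.
$calls = 0;
my %cache;
my @orcish = sort {
    ($cache{$a} ||= pack_split($a)) cmp ($cache{$b} ||= pack_split($b))
} @ips;
my $orcish_calls = $calls;

print "plain:  $plain_calls calls\n";
print "orcish: $orcish_calls calls\n";
```

The or-cache is safe here because a packed key is always a four-character string, which Perl never treats as false.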

The big benefit of `Memoize' is that you can have it store the results
of `pack_split' on disk, and then the next time you run your program
it will run even faster:


Michal Rutka

Oct 1, 1998, 3:00:00 AM
to u...@camel.fastserv.com
Uri Guttman <u...@camel.fastserv.com> writes:
> >>>>> "MR" == Michal Rutka <erh...@erh.ericsson.se> writes:
> MR> Hym. Never heard about Memorize. It is not present in my
> MR> distribution.
>
> it's Memoize. no R. and it is on CPAN where all of the rest of perl
> resides besides your distribution.

Sorry, another typo. I know what CPAN is, and I know what a standard distribution is, and I
know that when you have more than 100 computers on which your script must
run, it is much better to use only a standard distribution.

> then why are you posting to this group?

Just to have a little fun for now. Soon I will stop posting.

> and john was trying to help you
> improve your code and speed.

No. It was the other way around. He posted code with two bugs:
1. Integer overflow
2. Schwartzian thing.

I was trying to help him see that there is another, safer (read: portable) way
to do this.

> he had a minor bug. the transform is incredibly useful if you would take
> the 5 minutes to understand it.

If you need as much as 5 minutes then...
It took me about 10 seconds to understand the syntax. I didn't find it pretty.
And one more thing: in the job I do, I learned to use the word 'transform' for
more complicated things than a simple data manipulation. That's why I call it
a Schwartzian thing.

> it is general purpose and has nothing to
> do with IP numbers.

I would never guess.

[..]


> and what do you mean by integer conversion being banned syntax?

Well. Do you know that most bugs in embedded software are caused by integer
overflow? Do you know that those bugs are very hard to find when the software
is in the testing phase? Say, how long would it take you to find an integer overflow
in John's code in this scenario:

1. He writes code and tests it on his machine, where it works OK.
2. You take it to a machine which has 32-bit signed integers... and bingo,
the code does not work, in some weird way.

Think about it.
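
The thread never establishes what actually produced -1 on the other machine. As one possible reading only, the sketch below shows what happens when a correct unsigned 32-bit result is reinterpreted as a signed 32-bit value; the helper name as_signed_32 is made up for illustration.

```perl
use strict;
use warnings;

# Reinterpret an unsigned 32-bit value as a signed 32-bit one.
# pack "L" writes exactly 32 unsigned bits; unpack "l" reads them as signed.
sub as_signed_32 {
    my ($unsigned) = @_;
    return unpack 'l', pack 'L', $unsigned;
}

print as_signed_32(4294967295), "\n";   # 255.255.255.255 -> -1
print as_signed_32(2147483648), "\n";   # 128.0.0.0       -> -2147483648
print as_signed_32(2147483647), "\n";   # 127.255.255.255 -> 2147483647
```

Under that assumption every address from 128.0.0.0 upward turns negative, which would match both the -1 and the "above 127.255.255.255" remark.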

Regards,


Michal

Lack Mr G M

Oct 1, 1998, 3:00:00 AM
In article <lapvccy...@erh.ericsson.se>, Michal Rutka <erh...@erh.ericsson.se> writes:
|> >
|> > it's Memoize. no R. and it is on CPAN where all of the rest of perl
|> > resides besides your distribution.
|>
|> Sorry, another typo. I know what CPAN is, and I know what standard distribution is and I
|> know that when you have more than 100 computers on which your script must
|> run than much better is to use only a standard distribution.

No it isn't. The much MUCH better thing to do is to get all of these
systems to use a *single* distribution. That's what network file
systems are for. Then you just have to install additional modules once
and they appear on all systems.

There is far more to administering systems than writing Perl - you
also need to know how to make the admin simple by using the available
options.

|> It took me about 10 seconds to understand a syntax. I didn't found it pretty.
|> And one more thing, in a job which I do I learned to use a word 'transform' to
|> more complicated things than a simple data manipulation. That's why I call it
|> a Schwartzian thing.

I suggest you look up transform in a dictionary.

Lack Mr G M

Oct 1, 1998, 3:00:00 AM
In article <361269E7...@min.net>, John Porter <jdpo...@min.net> writes:
|> For anyone who may be interested...
|> I have performed some benchmarks, and based on the results,
|> I make the following observations.
|>
|> 1. Michal gave the following formula for converting a dotted-quad
|> string into a string of four chars:
|> pack( "C4", split(/\./,$_) )
|> This is so fast that memoization has no effect.

However, the address format for IP does not require you to use 4
fields, you can also specify network.host. This is why it is better to
use inet_aton, since it handles this.

Eg: 127.1 is the same as 127.0.0.1
172.16.3 -> 172.16.0.3
172.16.56239 -> 172.16.219.175

You may not use this, but why not make use of the code which is
already written that *can* handle it? Stop re-inventing the wheel (or
rather, in this case, re-inventing it as an arc!).
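
The arithmetic behind those examples can be sketched in plain Perl. This is an illustration only, not how inet_aton is implemented: it handles decimal fields only, and the name expand_short_form is made up.

```perl
use strict;
use warnings;

# Hypothetical helper: expands a decimal short form to a full dotted quad.
# The last field fills all the bytes the earlier fields leave free.
sub expand_short_form {
    my ($addr) = @_;
    my @parts = split /\./, $addr, -1;
    return undef if @parts < 1 || @parts > 4;
    return undef if grep { !/\A\d+\z/ } @parts;

    my $last = pop @parts;
    my $room = 4 - @parts;                     # bytes left for the last field
    return undef if grep { $_ > 255 } @parts;
    return undef if $last >= 256 ** $room;

    my @tail;
    for (1 .. $room) {
        unshift @tail, $last % 256;            # peel off the low byte
        $last = int($last / 256);
    }
    return join '.', @parts, @tail;
}

print expand_short_form($_), "\n" for qw(127.1 172.16.3 172.16.56239);
# 127.0.0.1
# 172.16.0.3
# 172.16.219.175
```

The last case works out as 56239 = 219 * 256 + 175.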

Steffen Beyer

Oct 1, 1998, 3:00:00 AM
In article <6urq6t$pe2$1...@callisto.clark.net>, Joe Williams <will...@clark.net> wrote:

> Does anyone have, or can anyone point me toward a routine that will sort IP
> addresses? Many thanks.

Easy:

foreach $item (@list)
{
    if ($item =~ /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})/)
    {
        $host = sprintf("%.2X%.2X%.2X%.2X", $1, $2, $3, $4);
        $addr = sprintf("%d.%d.%d.%d", $1, $2, $3, $4); # removes leading 0
        $addr{$host} = $addr;
        # $name{$host} = $name; # wherever this comes from
    }
}
foreach $host (sort keys(%hash))
{
    printf("%-40.40s %s\n", $name{$host}, $addr{$host});
}

HTH.

Yours,
--
Steffen Beyer <s...@engelschall.com>
Free Perl and C Software for Download: www.engelschall.com/u/sb/download/

Michal Rutka

Oct 1, 1998, 3:00:00 AM
gml...@ggr.co.uk (Lack Mr G M) writes:
> |> Sorry, another typo. I know what CPAN is, and I know what standard distribution is and I
> |> know that when you have more than 100 computers on which your script must
> |> run than much better is to use only a standard distribution.
>
> No it isn't. The much MUCH better thing to do is to get all of these
> systems to use a *single* distribution. That's what network file
> systems are for. Then you just have to install additional modules once
> and they appear on all systems.

Yeah, if they are on a single network. But I am not an admin (what boring work,
anyway). What I am talking about is 100 customers who are using my software. Do
you have any 'bright' idea to solve this?

> There is far more to administering systems than writing Perl - you
> also need to know how to make the admin simple by using the available
> options.

Yeah, that is good for people who want to work as admins and be paid as admins. I am
not.

> |> It took me about 10 seconds to understand a syntax. I didn't found it pretty.
> |> And one more thing, in a job which I do I learned to use a word 'transform' to
> |> more complicated things than a simple data manipulation. That's why I call it
> |> a Schwartzian thing.
>
> I suggest you look up transform in a dictionary.

And start calling grep a 'transformation program'? No thanks.

Regards,

Michal

Michal Rutka

Oct 1, 1998, 3:00:00 AM
gml...@ggr.co.uk (Lack Mr G M) writes:
[...]

> |> 1. Michal gave the following formula for converting a dotted-quad
> |> string into a string of four chars:
> |> pack( "C4", split(/\./,$_) )
> |> This is so fast that memoization has no effect.
>
> However, the address format for IP does not require you to use 4
> fields, you can also specify network.host. This is why it is better to
> use inet_aton, since it handles this.
>
> Eg: 127.1 is the same as 127.0.0.1
> 172.16.3 -> 172.16.0.3
> 172.16.56239 -> 172.16.219.175
>
> You may not use this, but why not make use of the code which is
> already written that *can* handle it? Stop re-inventing the wheel (or
> rather, in this case, re-inventing it as an arc!).

And therefore, when you run this code at home, it starts a dialup even when you
know that your addresses are in canonical form? No thanks. BTW, inventing the
wheel, or even an arc, takes more effort than writing a few lines of Perl code.

Michal

John Porter

Oct 1, 1998, 3:00:00 AM
Michal Rutka wrote:
>
> And therefore, when you run this code at home then start a dialup
> even when you
> know that your addresses are in a cannonical form? No thanks.
> BTW invention the
> wheel, or even an arc takes more effort than writing a few lines of
> Perl code.

Not sure what your point is.
inet_aton() is not only at least as portable, but MORE correct
than your solution. And takes less typing besides.

--
John "Many Jars" Porter

baby mother hospital scissors creature judgment butcher engineer

John Porter

Oct 1, 1998, 3:00:00 AM
Steffen Beyer wrote:
>
> In article <6urq6t$pe2$1...@callisto.clark.net>, Joe Williams <will...@clark.net> wrote:
>
> > Does anyone have, or can anyone point me toward a routine that will sort IP
> > addresses? Many thanks.
>
> Easy:
>
> foreach $item (@list)
> {
> if ($item =~ /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})/)
> {
> $host = sprintf("%.2X%.2X%.2X%.2X", $1, $2, $3, $4);
> $addr = sprintf("%d.%d.%d.%d", $1, $2, $3, $4); # removes leading 0
> $addr{$host} = $addr;
> # $name{$host} = $name; # wherever this comes from
> }
> }
> foreach $host (sort keys(%hash))
> {
> printf("%-40.40s %s\n", $name{$host}, $addr{$host});
> }

I do not see a hash named %hash being created anywhere.
I assume (i.e. I hope) you meant to say %addr.

Anyway, this sorts the addresses as strings in their
original form, not numerically, so that 127.0 is put
before 2.0. I don't think that's what he was looking for.

dr...@copyright.com

Oct 1, 1998, 3:00:00 AM
In article <lahfxox...@erh.ericsson.se>,

Michal Rutka <erh...@erh.ericsson.se> wrote:
> > |> It took me about 10 seconds to understand a syntax. I didn't found it pretty.
> > |> And one more thing, in a job which I do I learned to use a word 'transform' to
> > |> more complicated things than a simple data manipulation. That's why I call it
> > |> a Schwartzian thing.
> >
> > I suggest you look up transform in a dictionary.
>
> And start calling grep as 'transformation program'. No thanks.
>

If Laplace and Fourier can have transforms named after them, why can't Randal?

In the traditional use of the word in math, a pair of inverse transforms
convert the problem from a domain where it's difficult to a domain where it
can be handled more easily and then convert the result back. This is exactly
what the algorithm is doing, so it's an appropriate use of the word, at least
to a mathematician.

grep, on the other hand, is not a transform, and I don't think I heard anyone
refer to it as such. It's plenty useful, but not a transform.

Now, what are these extraordinarily complex things that you call transforms?

--
Don Roby


Michal Rutka

Oct 2, 1998, 3:00:00 AM
dr...@copyright.com writes:
> If Laplace and Fourier can have transforms named after them, why can't Randal?

OK. He can. Are you happy now?

Regards,

Michal

Michal Rutka

Oct 2, 1998, 3:00:00 AM
John Porter <jdpo...@min.net> writes:
> Steffen Beyer wrote:
[...]

> > if ($item =~ /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})/)
> > {
> > $host = sprintf("%.2X%.2X%.2X%.2X", $1, $2, $3, $4);
> > $addr = sprintf("%d.%d.%d.%d", $1, $2, $3, $4); # removes leading 0
> > $addr{$host} = $addr;
> > # $name{$host} = $name; # wherever this comes from
> > }
> > }
> > foreach $host (sort keys(%hash))
> > {
> > printf("%-40.40s %s\n", $name{$host}, $addr{$host});
> > }
>
> I do not see a hash named %hash being created anywhere.
> I assume (i.e. I hope) you meant to say %addr.
>
> Anyway, this sorts the addresses as strings in their
> original form, not numerically, so that 127.0 is put
> before 2.0. I don't think that's what he was looking for.

John, please no bad feelings, you are wrong. Strings are not in their
original form. Those addresses will look like this: 7F.00 and 02.00 and
2.0 will be put before 127.0
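
A small sketch of why the fixed-width hex keys sort correctly as strings. It is a trimmed variant of the approach rather than the posted code: it uses %02X and a made-up %addr_for hash, and the sample addresses are invented.

```perl
use strict;
use warnings;

my @ips = qw(127.0.0.1 2.0.0.1 130.111.0.0 130.4.0.0);

# Map each fixed-width hex key back to its address.
my %addr_for;
for my $ip (@ips) {
    my $key = sprintf '%02X%02X%02X%02X', split /\./, $ip;
    $addr_for{$key} = $ip;
}

# 127.0.0.1 becomes 7F000001 and 2.0.0.1 becomes 02000001, so a plain
# string sort of the keys puts 2.0.0.1 first.
my @sorted = map { $addr_for{$_} } sort keys %addr_for;
print "@sorted\n";    # 2.0.0.1 127.0.0.1 130.4.0.0 130.111.0.0
```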

Michal

John Porter

Oct 2, 1998, 3:00:00 AM
Michal Rutka wrote:

> John Porter <jdpo...@min.net> writes:
> > Anyway, this sorts the addresses as strings in their
> > original form, not numerically, so that 127.0 is put
> > before 2.0. I don't think that's what he was looking for.
>
> John, please no bad feelings, you are wrong. Stings are not in their
> original form. Those addresses will look like this: 7F.00 and 02.00 and
> 2.0 will be put before 127.0

Oh, yeah! wups... :-)

dr...@copyright.com

Oct 2, 1998, 3:00:00 AM
In article <la3e97q...@erh.ericsson.se>,

Michal Rutka <erh...@erh.ericsson.se> wrote:
> dr...@copyright.com writes:
> > If Laplace and Fourier can have transforms named after them, why can't Randal?
>
> OK. He can. Are you happy now?
>

Yes. Thank you ever so much.

John Klassa

Oct 2, 1998, 3:00:00 AM
to
Sorry to jump into the middle of this...

On 01 Oct 1998 14:56:40 +0200, Michal Rutka <erh...@erh.ericsson.se> wrote:
> > |> It took me about 10 seconds to understand a syntax. I didn't found
> > |> it pretty. And one more thing, in a job which I do I learned to
> > |> use a word 'transform' to more complicated things than a simple
> > |> data manipulation. That's why I call it a Schwartzian thing.
> >
> > I suggest you look up transform in a dictionary.
>
> And start calling grep as 'transformation program'. No thanks.

The syntax of a Schwartzian Transform may be a bit daunting at first, but
it makes perfect sense if you read it from the bottom up. There are a
number of useful explanations of this powerful technique lying about (one
is at http://www.5sigma.com/perl/schwtr.html). The "black transform" (as
Tom Christiansen calls it) is a keeper...

As for whether "transform" is the right word, I defer to the dictionary:

@trans.form \trans-'fo_.rm\ vb
1: to change in structure, appearance, or character
2: to change (an electric current) in potential or type
SYN: transmute, transfigure
-- trans.for.ma.tion \.trans-f*r-'ma_--sh*n\ n
-- trans.form.er \trans-'fo_.r-m*r\ n

It would seem, then, that the Schwartzian Transform is aptly named...
(Furthermore, "grep" is not a transform.)

--
John Klassa / Alcatel Telecom / Raleigh, NC, USA <><

Larry Rosler

Oct 2, 1998, 3:00:00 AM
to
[Posted to comp.lang.perl.misc and a copy mailed.]

In article <6v3cjn$7f4$1...@aurwww.aur.alcatel.com> on 2 Oct 1998 20:21:11
GMT, John Klassa <kla...@aur.alcatel.com> says...


...
> (Furthermore, "grep" is not a transform.)

Sheesh, folks. Principles of Unix. 'grep' is a *filter*.

fil·ter n. ... 2. Any of various electric, electronic, acoustic, or
optical devices used to reject signals, vibrations, or radiations of
certain frequencies while passing others.

--
(Just Another Larry) Rosler
Hewlett-Packard Laboratories
http://www.hpl.hp.com/personal/Larry_Rosler/
l...@hpl.hp.com

Jonathan Stowe

Oct 2, 1998, 3:00:00 AM
to
On 30 Sep 1998 12:03:45 -0400 Uri Guttman <u...@camel.fastserv.com> wrote:

> try this (tested):

> perl -MSocket -le 'print join( "\n", map{ inet_ntoa($_) } sort map
> {inet_aton $_ } <>)'

Now that was the one I was waiting for.
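
Spelled out as a script, the quoted one-liner might look like the sketch below. It assumes the input is already a list of dotted quads (the sample addresses are borrowed from elsewhere in this thread); for such input inet_aton should not need a name lookup. An explicit comparison block is used here only for readability.

```perl
use strict;
use warnings;
use Socket qw(inet_aton inet_ntoa);

# Sample addresses (illustrative only).
my @ips = ('205.128.134.105', '99.1.13.12', '129.1.144.88', '199.77.210.211');

my @sorted_ips =
    map  { inet_ntoa($_) }   # unpack the 4-byte strings back to dotted quads
    sort { $a cmp $b }       # plain string comparison on the packed keys
    map  { inet_aton($_) }   # pack each dotted quad into a 4-byte string
    @ips;

print "$_\n" for @sorted_ips;
```

Read from the bottom up, as with any map/sort/map chain: pack, sort, unpack.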

/J\
--
Jonathan Stowe <j...@btinternet.com>
Some of your questions answered:
<URL:http://www.btinternet.com/~gellyfish/resources/wwwfaq.htm>

Joe Williams

Oct 3, 1998, 3:00:00 AM
to
This is in answer to my original post. First, many thanks for all the
comments. I will be traveling over the next couple of days, but look forward
to checking them out. Meantime, I'm posting a solution that is certainly not
the most elegant or efficient, but it does work, and is fairly transparent.
I used John Porter's solution to the IP string conversion with some changes
as noted. It is probably over-commented, but what the hell!

Joe Williams
will...@clark.net

#sortest_v1.pl

@list =
("\n0179 Tue 01Sep98 06:17:55 - Address of Site: 129.1.144.88#0179 Tue (more)",
"\n0560 Tue 01Sep98 08:05:57 - Address of Site: 99.1.13.12#0186 Tu (more)",
"\n0800 Tue 01Sep98 06:38:42 - Address of Site: 205.128.134.105#0180 Tu (more)",
"\n0460 Tue 01Sep98 07:41:32 - Address of Site: 99.1.13.12#0184 Tue 01S (more)",
"\n0187 Tue 01Sep98 08:37:16 - Address of Site: 199.77.210.211#0187 Tue (more)");

print "\n";
print @list;
print "\n";

# Extract the IP address and convert it into an integer and
# assign as a key to a hash with the line from which it was taken
# as the value. This code requires that the lines have a known,
# consistent structure.
#

foreach $list_item (@list)
{
$x = index($list_item, "#");
$w = (index ($list_item, "ite:") + 5);
$ip_dotted_quad = substr( $list_item, $w, ($x - $w) );
$ip_int = dotted_to_int($ip_dotted_quad);
# Must hash with $list_item as key, because $ip_int's are not unique
$hashed_by_ip{$list_item} = $ip_int; # $list_item is the key; $ip_int, the value
}

print @sorted_lines = sort by_ip_int keys(%hashed_by_ip);

# Subroutines
# -------------------------------------------------------------------------


# Subroutine to convert a dotted quad IP address into a sortable number

sub dotted_to_int {
return undef if $_[0] =~ /[^0-9.]/; # undef if anything but digits and dots
my @a = split /\./, $_[0]; # Each octet is in an array element of @a
my $n = pop @a; # $n equals the last octet
# undef unless the last part fits in the remaining bytes
# (< 256 for a full dotted quad)
return undef unless $n < (256 ** ( 4 - @a ));

# Cycles over the remaining leading octets (indexes 0 to 2 for a dotted quad).
for ( 0..$#a ) { # $#a = index of the last array element
return undef unless $a[$_] < 256; # Checks that each leading octet < 256

# $n starts equal to the last octet, which is not scaled.
# The original John Porter version used << instead of *; << returns the
# value of its left argument shifted left by the number of bits specified
# by its right argument. The problem is that it can also produce negative
# integers, which could cause trouble.

$n += ( $a[$_] * (256**(3-$_)) ); # scales each octet by its power of 256 & adds it
}
return $n;
}

# -------------------------------------------------------------------------

sub by_ip_int
{
# Sorts by values instead of keys (gecko book, p. 163).
# It could easily be placed inline with the sort.

return $hashed_by_ip{$a} <=> $hashed_by_ip{$b};
}

# End program code and subroutines section -------------------------------
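
For comparison, here is a minimal sketch of the same job done with a Schwartzian Transform. As originally posted in this thread, the first map returned a flat list ($_, number) rather than one array reference per item, which would be consistent with the "uninitialized value" complaints; the square brackets below are the difference. The names quad_to_int and @log_lines are hypothetical, and the sample lines are shortened versions of the test data above.

```perl
use strict;
use warnings;

# Hypothetical helper: dotted quad -> integer, using multiplication
# rather than << so the result cannot go negative.
sub quad_to_int {
    my ($ip) = @_;
    my @octets = split /\./, $ip;
    return undef unless @octets == 4;
    my $n = 0;
    for my $octet (@octets) {
        return undef if $octet !~ /^\d+$/ || $octet > 255;
        $n = $n * 256 + $octet;
    }
    return $n;
}

my @log_lines = (
    "0179 Tue 01Sep98 06:17:55 - Address of Site: 129.1.144.88#0179",
    "0560 Tue 01Sep98 08:05:57 - Address of Site: 99.1.13.12#0186",
    "0800 Tue 01Sep98 06:38:42 - Address of Site: 205.128.134.105#0180",
    "0187 Tue 01Sep98 08:37:16 - Address of Site: 199.77.210.211#0187",
);

# Assumes every line contains "Site: <dotted quad>#".
my @sorted_lines =
    map  { $_->[0] }                        # 3. keep only the original line
    sort { $a->[1] <=> $b->[1] }            # 2. compare the precomputed keys
    map  { my ($ip) = /Site: ([\d.]+)#/;    # 1. pair each line with its key
           [ $_, quad_to_int($ip) ] }       #    (note the anonymous array)
    @log_lines;

print "$_\n" for @sorted_lines;
```

The key is computed once per line instead of once per comparison, which is the whole point of the technique.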

Michal Rutka wrote in message ...
>John Porter <jdpo...@min.net> writes:


>
>> Joe Williams wrote:
>> >
>> > Does anyone have, or can anyone me toward a routine that will sort IP
>> > addresses? Many thanks.
>>

>> I assume you want to sort them numerically.
>
>Why? IP address is a string of four characters. It is better to
>sort them as strings.
>
>> You will therefore want to be able to convert dotted-quad
>> notation to an integer. The following sub does it:
>
>Why? It cost only time all this calculations. BTW your code do not
>work for IP 200.255.255.255. On my machine it returns -1. Anyway, every


>address above 127.255.255.255 is 'unsortable' by you. It just 50% of
>all addresses so who cares?
>
>>

>> sub dotted_to_int {
>> return undef if $_[0] =~ /[^0-9.]/;
>> my @a = split /\./, $_[0];
>> my $n = pop @a;
>> return undef unless $n < (256 ** ( 4 - @a ));
>> for ( 0..$#a ) {
>> return undef unless $a[$_] < 256;
>> $n += ( $a[$_] << (8*(3-$_)) );
>> }
>> $n;
>> }
>>
>> (It actually accepts all valid dotted-number notations,
>> not just dotted-quad.)
>>
>> Then you can use a Schwartzian Transform to do the sort:


>>
>> @ips_sorted =
>> map { $_->[0] }
>> sort { $a->[1] <=> $b->[1] }
>> map { $_, dotted_to_int( $_ ) }

>> @ips;
>>
>> Hope this helps.
>
>Horror.


>
>>
>> --
>> John "Many Jars" Porter
>

>Hymm, how about this:
>
>@ips_sorted = sort {pack("C4",split(/\./,$a)) cmp pack("C4",split(/\./,$b))}
> @ips;


>
>Should be faster and without this Schwartzian thing (what is it, anyway)?

>BTW. Your 'Schwartzian' code does not work for me. I.e. lot of uninitialized
>values and result is empty.
>

>Regards,
>
>Michal

John Klassa

Oct 4, 1998, 3:00:00 AM
to
On Fri, 2 Oct 1998 13:44:26 -0700, Larry Rosler <l...@hpl.hp.com> wrote:
> Sheesh, folks. Principles of Unix. 'grep' is a *filter*.

Right you are... Wrong I am. :-)

Anyway, I'd still call the S.T. a "transform", though (even if I
was mistaken about grep). :-)

Michal Rutka

Oct 5, 1998, 3:00:00 AM
to
"Joe Williams" <will...@clark.net> writes:
> This is in answer to my original post. First, many thanks for all the
> comments. I will be traveling over the next couple of days, but look forward
> to checking them out. Meantime, I'm posting a solution that is certainly not
> the most elegant or efficient, but it does work, and is fairly transparent.
> I used John Porter's solution to the IP string conversion with some changes
> as noted. It is probably over-commented, but what the hell!

You are welcome. Your code is quite impressive. However, you can consider
some improvements. The most important benefit is that the code becomes much
shorter and clearer (at least to me), and therefore easier to maintain.

Improvement 1:

@sorted_lines = sort {get_ipa($a) cmp get_ipa($b)} @list;
# Where get_ipa() can be:
sub get_ipa{
return udef unless $_[0] =~ /.*: ([\d.]+)#/;
return pack("C4",split(/\./,$1));
}
# Or more correct:
use Socket;
sub get_ipa{
return udef unless $_[0] =~ /.*: ([\d.]+)#/;
return inet_aton($1);
}

This solution might suffer when you have a lot of lines to compare, because of
the routine passed to sort. As John Porter benchmarked, this gives an extra
overhead. However, your original solution also uses a routine, so
you should still notice a speed improvement. (Note that the ST solution will
also suffer from this, as it too uses a sub call to dereference indexes.)

Improvement 2:
@sorted_lines = map {substr($_,4)} sort map {get_ipa($_).$_} @list;

This solution is quite similar to the ST but should be faster. It does not
dereference during the sort, and the final map should be faster too.

Nevertheless, when I try to benchmark all the solutions, it seems that the first
improvement is only slightly faster than the original solution and the
second one is no improvement at all (i.e. in speed)! Some more realistic
benchmarking should be done, for which I do not have enough time. In my
opinion, as map creates a new array it will always suffer a speed
penalty, since on most systems memory allocation is the most time-consuming
operation. The above applies to the ST too.
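
A runnable sketch of Improvement 2, under two assumptions: every line matches the pattern, and the packed key is always exactly 4 bytes (which pack "C4" gives for a dotted quad). The sample lines are shortened from Joe's test data, and undef is spelled out in full here.

```perl
use strict;
use warnings;

# Extract the dotted quad from a log line and pack it into a 4-byte key.
sub get_ipa {
    my ($line) = @_;
    return undef unless $line =~ /: ([\d.]+)#/;
    return pack "C4", split /\./, $1;
}

my @list = (
    "0179 Tue 01Sep98 06:17:55 - Address of Site: 129.1.144.88#0179",
    "0560 Tue 01Sep98 08:05:57 - Address of Site: 99.1.13.12#0186",
    "0800 Tue 01Sep98 06:38:42 - Address of Site: 205.128.134.105#0180",
);

my @sorted_by_prefix =
    map  { substr($_, 4) }       # 3. strip the 4-byte key again
    sort                         # 2. default string sort, no comparison sub
    map  { get_ipa($_) . $_ }    # 1. prepend the packed key to each line
    @list;

print "$_\n" for @sorted_by_prefix;
```

Because the key is a fixed-width prefix, the default sort orders the lines by address first and by the rest of the line only when two addresses are equal.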

Regards,

Michal

John Porter

Oct 5, 1998, 3:00:00 AM
to
Michal Rutka wrote:
>
> Improvement 1:
>
> @sorted_lines = sort {get_ipa($a) cmp get_ipa($b)} @list;
> # Where get_ipa() can be:
> sub get_ipa{
> return udef unless $_[0] =~ /.*: ([\d.]+)#/;
> return pack("C4",split(/\./,$1));
> }
>
> # Or more correct:
> use Socket;
> sub get_ipa{
> return udef unless $_[0] =~ /.*: ([\d.]+)#/;
> return inet_aton($1);
> }

Thank you for noting the difference.

Hey, what's 'udef'? I suppose you didn't actually run this code?


> Improvement 2:
> @sorted_lines = map {substr($_,4)} sort map {get_ipa($_).$_} @list;
>
> This solution is quite similar to ST but faster. It does not use
> dereference during sort and finall map should be faster too.

Indeed, very similar to ST.

> as map creates a new array it will always suffer from speed
> penalty, as on most systems memory allocation is the most time consuming
> operation.

This does indeed seem to be the case.

--
John "Many Jars" Porter

John Porter

Oct 5, 1998, 3:00:00 AM
to
Joe Williams wrote:
>
> This is in answer to my original post. First, many thanks for all the
> comments. I will be traveling over the next couple of days, but look forward
> to checking them out. Meantime, I'm posting a solution that is certainly not
> the most elegant or efficient, but it does work, and is fairly transparent.
> I used John Porter's solution to the IP string conversion with some changes
> as noted.

Joe, thanks for the nod, but I must formally disclaim the
appropriateness of that routine. It is a dog, besides which the
functionality it attempts to implement is already handled faster and
more correctly by Socket::inet_aton.


> ("\n0179 Tue 01Sep98 06:17:55 - Address of Site: 129.1.144.88#0179 Tue

> foreach $list_item (@list)
> {
> $x = index($list_item, "#");
> $w = (index ($list_item, "ite:") + 5);
> $ip_dotted_quad = substr( $list_item, $w, ($x - $w) );
> $ip_int = dotted_to_int($ip_dotted_quad);
> # Must hash with $list_item as key, because $ip_int's are not unique
> $hashed_by_ip{$list_item} = $ip_int;
> }

You should learn to use regular expressions.
For your two index()s, one substr(), and two temporary variables,
you could have:

( $ip_dotted_quad ) = ( $list_item =~ /ite:(.*)#/ );
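
A slightly tighter variant of the same idea, as a sketch: it assumes the address is always followed by "#", skips the space after the colon, and captures only digits and dots. The sample line is one of the test lines from the program above.

```perl
use strict;
use warnings;

my $list_item = "\n0179 Tue 01Sep98 06:17:55 - Address of Site: 129.1.144.88#0179 Tue (more)";

# List assignment: the parentheses put the match in list context, so the
# captured group lands in $ip_dotted_quad (or undef if there is no match).
my ($ip_dotted_quad) = $list_item =~ /Site:\s*([\d.]+)#/;

print defined $ip_dotted_quad ? "$ip_dotted_quad\n" : "no address found\n";
```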


One other note: please edit down the quoted message as much
as you can. The net will appreciate it.

Michal Rutka

Oct 6, 1998, 3:00:00 AM
to
Hi John,

John Porter <jdpo...@min.net> writes:
[...]


> > return udef unless $_[0] =~ /.*: ([\d.]+)#/;
> > return inet_aton($1);
> > }
>
> Thank you for noting the difference.
>
> Hey, what's 'udef'? I suppose you didn't actually run this code?

Hym... I did run the code but without -w, so I didn't notice this bug, but
after all the behaviour is almost the same because 'udef' is in any case
undefined ;-).

Michal

Michal Rutka

Oct 6, 1998, 3:00:00 AM
to
John Porter <jdpo...@min.net> writes:
> Michal Rutka wrote:
> >
> > And therefore, when you run this code at home then start a dialup
> > even when you
> > know that your addresses are in a cannonical form? No thanks.
> > BTW invention the
> > wheel, or even an arc takes more effort than writing a few lines of
> > Perl code.
>
> Not sure what your point is.
> inet_aton() is not only at least as portable, but MORE correct
> than your solution.

Eh, nothing. There was a bug in the Socket implementation for the Win32 platform
(probably you don't use this platform, and don't unless you must).
It meant that when you called an inet_* function, it tried to connect
to the network (by dialing a modem). It was quite annoying, but the
problem seems to be solved now, so there is no excuse not to use the inet_*
functions anymore... unless... somebody has a client with an old version.

Some guy pointed out how marvelous it is to keep software up to date,
with all the newest libraries. In practice it is impossible! There are big
companies which sell their own software with a bundled Perl (e.g. Mentor
Graphics). The bad part is that you cannot update the shipped Perl version,
because the software which uses it will not run anymore. It is not
always possible to update the original software. Moreover, I am talking about
software which costs > 100k$ per package, probably more than that bright
guy makes in a year...

Maybe this explains my point of view a little.

> And takes less typing besides.

This is a good point ;-).

>
> --
> John "Many Jars" Porter
> baby mother hospital scissors creature judgment butcher engineer

Michal

Kevin Reid

Oct 7, 1998, 3:00:00 AM
to
Michal Rutka <erh...@erh.ericsson.se> wrote:

Actually, it's a bareword, therefore it's the string 'udef'.
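
A small sketch of what that means in practice, assuming no sub named udef exists and strict 'subs' is off (as it would be in a script without "use strict"). The sub name return_udef is made up for the example. Running it with -w should additionally produce a warning about the unquoted string.

```perl
no strict 'subs';    # barewords are only accepted when strict 'subs' is off

# With no sub called udef in scope, the bareword is treated as the string
# 'udef', so this returns a defined four-character string, not undef.
sub return_udef { return udef }

my $value = return_udef();
print defined $value ? "defined: '$value'\n" : "undefined\n";   # defined: 'udef'
```

So a line that fails the match would be sorted under the key 'udef' rather than under an undefined key.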

--
Kevin Reid. | Macintosh.
"I'm me." | Think different.

Joe Williams

Oct 11, 1998, 3:00:00 AM
to
The info below is a partial summary of the ways suggested to sort an IP
number. It may be useful to someone new to the problem.

If the comments are bothersome, use @code_lines = grep !/^#/, @all_lines to
eliminate them.

# I originally posed the question of how to sort an array of strings by the
# IP address contained in each string.
# There are several issues: how to pull the IP address
# out of the string, how to convert it to sortable form, and then how to
# sort the array. There were a lot of answers, some of which I probably
# missed while traveling. The examples below concern only the conversion
# problem.

# It helps to understand that an IP address in the dotted quad form is a
# 32 bit number which has been broken into four 8 bit parts, each written
# as a decimal number: 10.45.126.10. To sort on it, it has to be
# reconstructed as a number or structure which has all the information
# contained in the original address, including the position of the quads.

# If you run the first example below,
# it will print out the IP number converted to 199077210067.
# A numerical sort on this as a number works, although the number is not
# really the IP address. It works because the info in the position of the
# quads is retained.

print "\n";
print "Results of sprintf approach: "; print sprintf("%.3d%.3d%.3d%.3d",
"199.77.210.67" =~ /(\d+)/g);
print "\n";

# This one takes an approach that I don't understand. It uses a subroutine
# to convert the IP address to 199 77 210 67 (using the split function)
# which is assigned to an anonymous array used as the argument to the pack
# function. Pack's "C4" takes the anonymous four element string and using
# the template "C4", converts each number (the 4) to an unsigned binary
# character (the C) and packs the four numbers into a "binary structure"
# (the exact form of which is what I don't understand). This binary
# structure is then sorted as a string. This works.
# I can't print the structure
# without unpacking it first, which the example does.

$temp = get_ipa("199.77.210.67");
print "\n";
print "Unpacked packed approach: "; print unpack("C4", $temp);print
"\n";print "\n";

sub get_ipa{
return undef unless $_[0] =~ /([\d.]+)/;


return pack("C4",split(/\./,$1));
}

# Still another approach uses the inet_aton function which is part of the
# standard distribution library. However, when I tried it, the function logs
# into the network, probably because it can take "somehost.net" as well as
# the IP address, and it is looking for a DNS. So, that approach doesn't work
# too well for my purposes. A really good conversion routine would exclude
# bogus addresses, such as 0.0.0.0, and be error trapped. inet_aton
# presumably does all this, and is efficient as well.
# Does anyone know where the code for this routine actually resides?

# No elegance in the final example, but it converts the actual IP address to
# a decimal.

print "Actual IP Address as a decimal: ";print
dotted_to_int("199.77.210.67");
print "\n";

sub dotted_to_int
{
return undef if $_[0] =~ /[^0-9.]/; # undef if anything but digits and dots
my @a = split /\./, $_[0]; # Each octet is in an array element of @a

# Cycles from 0 to 3--the four octets. The .. operator expands to 0..$#a
my $n = 0;
for ( 0..($#a) ) # $#a = index of the last array element
{
return undef unless (0 <= $a[$_] && $a[$_] < 256); # Checks 0 <= octet < 256
$n = $n + ( $a[$_] * (256**(3-$_)) ) ; # scales each octet by its power of 256 & adds them
}
return $n;
}


Michal Rutka

Oct 12, 1998, 3:00:00 AM
to will...@clark.net
Good work Joe.

"Joe Williams" <will...@clark.net> writes:
> The info below in a partial summary of the ways suggested to sort an IP
> number. It may be useful to some one new to the problem.

[...]


> # This one takes an approach that I don't understand. It uses a subroutine

Which part do you not understand?

> # to convert the IP address to 199 77 210 67 (using the split function)
> # which is assigned to an anonymous array used as the argument to the pack
> # function. Pack's "C4" takes the anonymous four element string and using
> # the template "C4", converts each number (the 4) to an unsigned binary
> # character (the C) and packs the four numbers into a "binary structure"
> # (the exact form of which is what I don't understand). This binary

This binary structure is a simple string. If you run this code:

perl -e 'print pack("C4",48,49,50,51),"\n"'

it will print the string "0123". The "C4" means: take 4 arguments and treat
each one as an ASCII character code. One can rewrite this pack as follows:

sub my_c_pack{
my $tmp = "";
foreach(@_){
$tmp .= chr($_);
}
$tmp;
}

but it will, of course, be less efficient.
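
A quick way to check that the two really agree, as a sketch (my_c_pack is repeated here in a compact form so the snippet runs on its own):

```perl
use strict;
use warnings;

# Same idea as the hand-written version: one character per argument.
sub my_c_pack {
    return join '', map { chr($_) } @_;
}

my $printable = pack "C4", 48, 49, 50, 51;      # the character codes of "0123"
my $address   = pack "C4", 199, 77, 210, 67;    # bytes of 199.77.210.67

print "$printable\n";                            # 0123
print length($address), " bytes\n";              # 4 bytes
print "same result\n" if $address eq my_c_pack(199, 77, 210, 67);
```

The packed address is just as much a 4-character string as "0123" is; its characters simply are not all printable.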

[...]


> # Still another approach uses the inet_aton function which is part of the
> standard
> # distribution libary. However, when I tried it, the function logs into the
> # network, probably because it can take "somehost.net" as well as the IP
> address,
> # and it is looking for a DNS. So, that approach doesn't work too well for
> my

Well, this function should not log on to the network when not needed. I've
noticed this behaviour with an old version of ActiveState Perl (build 310,
I believe). However, with the latest version the problem is gone. Maybe
you have the same problem?

Regards,

Michal

Joe Williams

Oct 12, 1998, 3:00:00 AM
to
Michal - This is very helpful. Many TXs.

I see better what the code is doing now. It is taking the four quads of an
IP address and converting them to binary (I assume this is in 8-bit chunks
since it is using the ASCII format.) Then this is stuffed into a string.
Elementary, my dear Watson--but not until you gave me a few hints!

This approach actually recreates the 32-bit IP address in binary form, then
when it is sorted as characters, that works fine, because it is sorting on
the four octets giving the leftmost octets the most importance. Actually,
you could sort on the binary number as well, but I assume Perl's built in
sort routine sorts on decimal numbers written in ASCII, and that won't work
well on this kind of structure. Right?

Here is why I was confused:

Your example prints out fine:
print pack("C4",48,49,50,51),"\n";
print "\n";

However, the following is what I used as an example, and this doesn't print
out so fine--for reasons now obvious.
print pack("C4",199,77,210,67),"\n";

This is OK of course:
print unpack ("C4", get_ipa ("199.77.210.67") );

sub get_ipa{
$_[0] =~ /([\d.]+)/;print "\n";


return pack("C4",split(/\./,$1));
}


I can't print it
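
One way to look at the packed value without printing raw bytes, as a sketch: unpack it with a different template. "H8" shows the four bytes as hex digits, and "N" reads them as a single unsigned 32-bit integer in network (big-endian) order, which could also serve as a key for a numeric sort with <=>. The variable names are made up for the example.

```perl
use strict;
use warnings;

my $packed = pack "C4", split /\./, "199.77.210.67";

my $as_hex    = unpack "H8", $packed;              # two hex digits per byte
my $as_quad   = join ".", unpack "C4", $packed;    # back to a dotted quad
my $as_number = unpack "N", $packed;               # one unsigned 32-bit integer

print "$as_hex\n";      # c74dd243
print "$as_quad\n";     # 199.77.210.67
print "$as_number\n";   # 3343766083
```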


Michal Rutka wrote in message ...

>Good work Joe.
>
>"Joe Williams" <will...@clark.net> writes:
>> The info below in a partial summary of the ways suggested to sort an IP
>> number. It may be useful to some one new to the problem.
>[...]
>> # This one takes an approach that I don't understand. It uses a subroutine
>
>Which part do you not understand?
>

Garry T. Williams

Oct 13, 1998, 3:00:00 AM