Hash Domain Sort (Schwartzian Transform)


JohnShep

Jul 22, 2002, 12:59:02 PM
I'm trying to speed up my mailing list by using envelope and sorting email
addresses by domain.
How do I get the key of the hash to be the address, so that I can use this
Schwartzian Transform method of sorting?

TIA John


while (my $row = $sth->fetchrow_hashref) {

push @hash, $row->{email_add}{$row}; # I know this is wrong

@temp{$row->{email_add}} = $row ;
push @hash, \%temp; # Am I getting any closer ?


}


my @sorted_keys =
map { $_->[0] }
sort {
my $id = 1;
my $cmp;
{
return $a->[0] cmp $b->[0] if $id > $#$a and $id > $#$b;
return $cmp if $cmp = $a->[$id] cmp $b->[$id];
$id++;
redo;
}
}
map { [$_, reverse map lc split /\.\@/] }
keys %hash;


Benjamin Goldberg

Jul 22, 2002, 2:16:14 PM
JohnShep wrote:
>
> Trying to speed up my mailing list by using envelope and sorting email
> addresses by domain.
> How do I get the key of the hash to be the address in order to use
> this Schwartzian Transform method of sorting.
>
> TIA John
>
> while (my $row = $sth->fetchrow_hashref) {
>
> push @hash, $row->{email_add}{$row}; # I know this is wrong
>
> @temp{$row->{email_add}} = $row ;
> push @hash, \%temp; # Am I getting any closer ?
>
> }

Try one of:

while( my $row = $sth->fetchrow_hashref ) {

push @array_of_hashrefs, { %$row };
}

Or:

while( my $row = $sth->fetchrow_hashref ) {

push @{$hash_of_array_of_hashrefs{ $row->{email_add} }}, {%$row};
}

Or:

while( my $row = $sth->fetchrow_hashref ) {

$hash_of_hashrefs{ $row->{email_add} } = {%$row};
}

Note that in all cases you should store { %$row }, which is a copy of the
row returned by fetch, rather than storing $row directly.
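
Why the copy matters can be sketched without a database. The DBI documentation warns against relying on fetchrow_hashref returning a fresh hash for every row, so the example below simulates the worst case. make_fetcher and collect are hypothetical helpers invented for this illustration, not part of DBI; the fetcher deliberately hands back the same hash reference on every call.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a driver that reuses one hash for every row.
sub make_fetcher {
    my @rows = @_;
    my %buffer;                       # the single, reused row hash
    return sub {
        my $next = shift @rows or return;
        %buffer = %$next;             # overwrite the buffer in place
        return \%buffer;              # same reference on every call
    };
}

# Collect all rows, either copying each one or storing the reference as is.
sub collect {
    my ($fetch, $copy) = @_;
    my @out;
    while (my $row = $fetch->()) {
        push @out, $copy ? { %$row } : $row;
    }
    return @out;
}

my @data = ({ email_add => 'a@x.com' }, { email_add => 'b@y.org' });

my @aliased = collect(make_fetcher(@data), 0);
my @copied  = collect(make_fetcher(@data), 1);

print join(' ', map { $_->{email_add} } @aliased), "\n";   # b@y.org b@y.org
print join(' ', map { $_->{email_add} } @copied),  "\n";   # a@x.com b@y.org
```

Without the copy, every stored element points at the same hash and so shows only the last row fetched. { %$row } is a shallow copy, which is enough here because the row values are plain scalars.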

> my @sorted_keys =
> map { $_->[0] }
> sort {
> my $id = 1;
> my $cmp;
> {
> return $a->[0] cmp $b->[0] if $id > $#$a and $id > $#$b;
> return $cmp if $cmp = $a->[$id] cmp $b->[$id];
> $id++;
> redo;
> }
> }
> map { [$_, reverse map lc split /\.\@/] }
> keys %hash;

my @sorted_keys = map $$_[0], sort {
my ($i, $cmp) = 1;
{
return $cmp if $cmp = $$a[$i] cmp $$b[$i];
++$i;
redo if $i < @$a and $i < @$b;
};
return $$a[0] cmp $$b[0];
} map [ $_, reverse split /\.\@/, lc ], <whatever>;

Where <whatever> is one of:
map $_->{email_add}, @array_of_hashrefs;
or:
keys %hash_of_array_of_hashrefs;
or:
keys %hash_of_hashrefs;


--
tr/`4/ /d, print "@{[map --$| ? ucfirst lc : lc, split]},\n" for
pack 'u', pack 'H*', 'ab5cf4021bafd28972030972b00a218eb9720000';

John

Jul 22, 2002, 8:14:15 PM
From: "Benjamin Goldberg" <gol...@earthlink.net>

> my @sorted_keys = map $$_[0], sort {
> my ($i, $cmp) = 1;
> {
> return $cmp if $cmp = $$a[$i] cmp $$b[$i];
> ++$i;
> redo if $i < @$a and $i < @$b;
> };
> return $$a[0] cmp $$b[0];
> } map [ $_, reverse split /\.\@/, lc ], <whatever>;
>
> Where <whatever> is one of:
> map $_->{email_add}, @array_of_hashrefs;
> or:
> keys %hash_of_array_of_hashrefs;
> or:
> keys %hash_of_hashrefs;

Thanks for the hash, Benjamin. Now that I've tried it out, the sort keys on
the prefix of the email address and not on the domain. Any help welcome.

John
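
A likely cause, offered as a guess: the pattern /\.\@/ as posted matches only the literal two-character sequence ".@". Ordinary addresses do not contain that sequence, so split returns the whole address and the comparison ends up running on the full string, prefix first. Assuming a character class was intended (that is an assumption, the post may simply have been mangled in transit), here is a minimal sketch of a domain-first sort. It is a variation rather than the code as posted: the local part is kept out of the key so that addresses of one domain stay together. domain_key, compare_keys and sort_by_domain are names made up for this example.

```perl
use strict;
use warnings;

# Build a sort key from the domain only: labels split on '.', reversed so
# the top-level domain comes first (aol.com becomes ['com', 'aol']).
sub domain_key {
    my ($addr) = @_;
    my $domain = lc(substr($addr, rindex($addr, '@') + 1));
    return [ reverse split /\./, $domain ];
}

# Compare two keys label by label; if one is a prefix of the other,
# the shorter key sorts first.
sub compare_keys {
    my ($x, $y) = @_;
    my $last = $#$x < $#$y ? $#$x : $#$y;
    for my $i (0 .. $last) {
        my $cmp = $x->[$i] cmp $y->[$i];
        return $cmp if $cmp;
    }
    return @$x <=> @$y;
}

# Schwartzian Transform: pair each address with its key, sort on the key
# (full address as tie-breaker), then strip the key again.
sub sort_by_domain {
    return map  { $_->[0] }
           sort { compare_keys($a->[1], $b->[1]) or $a->[0] cmp $b->[0] }
           map  { [ $_, domain_key($_) ] } @_;
}

print "$_\n" for sort_by_domain(qw(zed@aol.com amy@yahoo.com bob@aol.com al@mail.aol.com));
```

With this input the two aol.com addresses come first (bob, then zed), followed by al@mail.aol.com and then amy@yahoo.com.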

Rich

Jul 24, 2002, 12:50:22 PM

"JohnShep" <jo...@princenaseem.com> wrote in message
news:ahhdkm$t0v$1...@paris.btinternet.com...

> Trying to speed up my mailing list by using envelope and sorting email
> addresses by domain.
> How do I get the key of the hash to be the address in order to use this
> Schwartzian Transform method of sorting.
>
> TIA John
<snip>

You can accomplish it while you are fetching the addresses from the db. There is
no need to create a hash for the addresses you fetch from the db and then an
array for the domains.

Just create a hash with the available domains as the keys and the addresses for
each domain as a list stored in the value of that key.

#!/usr/bin/perl -w

use strict;
use DBI;

my $dbh = DBI->connect("DBI:mysql:my_database", "user","password",
{RaiseError => 1});

my $sth = $dbh->prepare("SELECT email_add FROM list");

$sth->execute;

my ($email,%sorted);

$sth->bind_col(1,\$email);

## Create a hash with each unique domain as a key with the value
## of each key being a list of the addresses for that domain:

while ($sth->fetch) {
# lc() not necessary if you lc()'d the addresses before storing
push @{$sorted{substr(lc($email), rindex($email, '@') + 1)}}, lc($email);
}

## the list of addresses for each domain would
## be in @{$sorted{'domain_name'}}

foreach my $domain (sort keys %sorted) {

my $domain_list = join ', ', @{$sorted{$domain}};
my $address_count = scalar(@{$sorted{$domain}});

print "$domain has $address_count addresses.\n";
print "Addresses for $domain are: $domain_list\n\n";

}

## %sorted would look something like:
## %sorted = (
## 'yahoo.com' => ['addr...@yahoo.com', 'addr...@yahoo.com'],
## 'hotmail.com' => ['addr...@hotmail.com', 'addr...@hotmail.com'],
## );


$dbh->disconnect;
__END__

A better solution may be to have a separate table for the unique domains (one
column, say 'domains', as a primary key) and update it when you enter a new
address into the list table.


John

Jul 24, 2002, 9:13:21 PM
> You can accomplish it while you are fetching the addresses from the db. No need
> to create hash for the addresses you fetch from the db then an array for the
> domains.
>
> Just create a hash with the available domains as the keys and the addresses for
> each domain as a list stored in the value of that key.
>
> #!/usr/bin/perl -w
>
> use strict;
> use DBI;
>
> my $dbh = DBI->connect("DBI:mysql:my_database", "user","password",
> {RaiseError => 1});
>
> my $sth = $dbh->prepare("SELECT email_add FROM list");
>
> $sth->execute;
>

Thanks Rich,
I actually managed to come up with a similar solution myself,

$sth = $dbh->prepare(" SELECT email FROM db ORDER BY REVERSE(email) ");

I extract the email addresses sorted in reverse character order, which
effectively sorts them by domain, i.e. the sort is on
moc.loa@aaa
moc.loa@bbb
etc.
I'm really quite chuffed with it: it works, it's fast and I can understand
it!

John
www.boxrec.com
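
For addresses that are already in memory, the same reversed-string idea can be sketched in Perl. This is only an illustration of the trick, not the SQL above, and sort_by_reversed is a name invented for the example. Addresses that share a domain also share the start of their reversed spelling, so they end up next to each other. The domains themselves come out ordered by their reversed spelling, not alphabetically.

```perl
use strict;
use warnings;

# Sort addresses by their lowercased, reversed spelling, so that
# addresses sharing a domain end up next to each other.
sub sort_by_reversed {
    return map  { $_->[1] }
           sort { $a->[0] cmp $b->[0] }
           map  { [ scalar reverse(lc $_), $_ ] } @_;
}

print "$_\n" for sort_by_reversed(qw(bbb@aol.com aaa@msn.com aaa@aol.com));
```

This prints aaa@aol.com, bbb@aol.com, aaa@msn.com. The scalar is needed because reverse in list context reverses the order of a list instead of the characters of a string.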

