I wrote a perl script but I'm not sure if there's a better way of
doing it.
The script runs very slow because the input file (txt file) I'm
reading from is very big. The script reads the input text file and if
it finds any location that equals to the list in array, it will
replace it with location code.
Here's a part of the script:
...
....
%LOCATION = (
'Gaborone,,BC', "GBE",
'Cairo,,EG', "CAI",
'Nairobi,,KN', "NBO",
'Dar es Salaam,,TN', "DAR",
'Kampala,,UG', "KLA",
'Cape Town,,ZA', "CPT",
'Johannesburg,,ZA', "JNB",
'Pretoria,,ZA', "PRY",
'Harare,,ZW', "HRE",
'Windhoek,,NM', "ERS",
'London,,UK', "LON",
'Vienna,,OS', "VIE",
'Brussels,,BX', "BRU",
'Copenhagen,,DN', "CPH",
'Berlin,,DL', "BER",
'Frankfurt,,DL', "FRA",
'Athens,,GR', "ATH",
'Tallinn,,EO', "TLL",
'Dublin,,IE', "DUB",
'Madrid,,SP', "MAD",
);
while (<INPUT_FILE>) {
chomp;
( $city, $state, $country, @data ) = split(/,/);
$location = "$city,$state,$country";
if ( defined @data[0] && ! defined $StartDD ) {
$StartDD = @data[0];
}
$day3 = $data[8]; $max3 = $data[9]; $min3 = $data[10];
$wx3 = $CONDITION{$data[11]};
foreach $location_name (%LOCATION ) {
$location_code = $LOCATION{$location_name};
if ($location_name eq $location ) {
printf FILE "$location_code\,";
printf FILE "%4s", "$min3\,";
printf FILE "%4s", "$max3\,";
printf FILE "%1s", "$wx3\n";
}
}
close(INPUT_FILE);
close(FILE);
}
Thanks in advance.
Regards,
Venus
You don't need that loop - all you have to do is to see wether you
have an element with key $location in %LOCATION:
if( exists $LOCATION{$location} ){
[...]
HTH,
--
use strict;my($i,$t,@r)=(0,'5 -.@BHJPT4acd6e2hk2lmn2o4r2s3tuz',map{ord}
split//,unpack('u*','L#`T&)QD5#0`#!!`#%1D)#08`#P05!!(3``$$"``#"0L&``('.
'"`P<!`````0$`'));$t=~s/(\d)(.)/$2x$1/eg;map{$t.=substr$t,$i,1,''while
$_--;$i++}@r;print"$t\n";# Tho...@Baetzler.de - http://baetzler.de/perl
venus wrote:
>
[...]
> reading from is very big. The script reads the input text file and if
> it finds any location that equals to the list in array, it will
> replace it with location code.
[...]
> $location = "$city,$state,$country";
[...]
> foreach $location_name (%LOCATION ) {
> $location_code = $LOCATION{$location_name};
[...]
> }
Shouldn't this do it instead of the foreach loop:
$location_code = $LOCATION{$location};
if(defined($location_code)) {
printf FILE "$location_code\,";
printf FILE "%4s", "$min3\,";
printf FILE "%4s", "$max3\,";
printf FILE "%1s", "$wx3\n";
}
Regards, Rainer
> foreach $location_name (%LOCATION ) {
> $location_code = $LOCATION{$location_name};
>
> if ($location_name eq $location ) {
>
This is like going up to a group of people and trying to find John by
asking each person in turn to tell you their name rather than simply
asking John to step forward.
if (my $location_code = $LOCATION{$location}) {
Also your use of printf is also pathological:
> printf FILE "$location_code\,";
> printf FILE "%4s", "$min3\,";
> printf FILE "%4s", "$max3\,";
> printf FILE "%1s", "$wx3\n";
Is location_code really a printf() template?
Why are you putting fixed punctuation in the data rather than the
template?
Why are you using 4 printf()s?
What't the point of specifying a minimum field width of one for the
last field when the data you are putting therein is certain to be
non-null anyhow? ("$wx3\n" must at least contain "\n"!).
printf FILE "%s,%3s,%3s,%s\n", $location_code, $min3, $max3, $wx3;
--
\\ ( )
. _\\__[oo
.__/ \\ /\@
. l___\\
# ll l\\
###LL LL\\
>$location_code = $LOCATION{$location};
>if(defined($location_code)) {
> printf FILE "$location_code\,";
> printf FILE "%4s", "$min3\,";
> printf FILE "%4s", "$max3\,";
> printf FILE "%1s", "$wx3\n";
>}
Don't use printf where you don't have to.
print FILE "$location_code,";
And why not put it all in one statement?
if(defined(my $code = $LOCATION{$location})) {
printf FILE "%s,%3s,%3s,%1s\n", $code, $min3, $max3, $wx3;
}
--
Bart.
Just copied the inner loop of the original author.
Did not think longer about this.
> And why not put it all in one statement?
> if(defined(my $code = $LOCATION{$location})) {
> printf FILE "%s,%3s,%3s,%1s\n", $code, $min3, $max3, $wx3;
> }
Agreed for the printfs. For my point of view
on readability, $a = $b; if(defined($a)) { ..
is ok too.
Regards, Rainer
What if I have an array that has different keys pointing to the same
value?
%LOCATION = (
"GBE", 'Gaborone,,BC',
"GBC", 'Gaborone,,BC',
"ADL", 'Amman,,JD', );
instead of :
%LOCATION = (
'Gaborone,,BC', "GBE",
'Amman,,JD', "ADL", );
I tried this:
%LOCATIONS = reverse %LOCATION;
while (<INPUT_FILE>) {
($city, $statecode, $countrycode, $date1, $max1, $min1,
$cond1, $date2, $max2, $min2, $cond2,
$date3, $max3, $min3, $cond3, $date4, $max4, $min4, $cond4,
$date5, $max5, $min5, $cond5, $date6, $max6, $min6, $cond6 )
= split (/,/);
$location = "$city,$statecode,$countrycode";
if( exists $LOCATIONS{$location} ){
print OUTPUT_FILE "$LOCATIONS{$location},$max1,$min1,$cond1\n";
but this will eliminate those repeated values and only prints one of
the associated key. Any other way to do it?
Appreciate if you could help.
regards,
venus
> What if I have an array that has different keys pointing to the same
>value?
>
> %LOCATION = (
> "GBE", 'Gaborone,,BC',
> "GBC", 'Gaborone,,BC',
> "ADL", 'Amman,,JD', );
> %LOCATIONS = reverse %LOCATION;
...
>but this will eliminate those repeated values and only prints one of
>the associated key. Any other way to do it?
>
>Appreciate if you could help.
Indeed. That's the effect you get if you do:
%hash = ( a => 1, a => 2, a => 3 );
all that is left is $hash{a} = 3 (the rightmost entry for "a" on the
list).
Instead, try something like:
while(my($k, $v) = each %LOCATION) {
push @{$LOCATIONS{$v}}, $k;
}
This will make an anonymous array entry for each old value, with all
associated keys pushed onto each.
You'll have to loop through them:
foreach (@{$LOCATIONS{$location}} {
print OUTPUT_FILE "$_,$max1,$min1,$cond1\n";
}
--
Bart.
my %LOCATION = ....;
my $StartDD;
while( <INPUT_FILE> ) {
chomp;
my @data = split /,/;
my $loc_code = $LOCATION{join ",", splice(@data, 0, 3)};
my ($min3, $max3, $wx3) = (@data[10,9], $CONDITION{$data[11]});
$StartDD = $data[0] if !defined $StartDD;
printf FILE "%s,%4s,%4s,%1s\n", $loc_code, $min2, $max3, $wx3;
}
close INPUT_FILE;
close FILE;
--
"I think not," said Descartes, and promptly disappeared.
The script runs halfway and stopped with the message:
Segmentation fault (core dumped)
while (<INPUT_FILE>) {
s/^\s+//;
s/\r?\n$//;
($city, $statecode, $countrycode, $date1, $max1, $min1, $cond1,
$date2, $max2, $min2, $cond2,
$date3, $max3, $min3, $cond3, $date4, $max4, $min4, $cond4,
$date5, $max5, $min5, $cond5, $date6, $max6, $min6, $cond6 )
= split (/,/);
$location = "$city,$statecode,$countrycode";
while(my($k, $v) = each %LOCATION) {
push @{$LOCATIONS{$v}}, $k;
}
foreach (@{$LOCATIONS{$location}}) {
if(exists $LOCATIONS{$location}){
print OUTPUT_FILE"$_,$date1,$max1,$min1,$cond1\n";
}
}
Am I missing something here?
Thank you and regards...
On 25 Sep 2001 20:16:34 -0700,
venus <train...@hotmail.com> wrote:
> Hi...
>
> The script runs halfway and stopped with the message:
>
> Segmentation fault (core dumped)
This means that your Perl is broken. Perl should never bomb out like
that, and if it does, it is either a bug in Perl, a bug in your OS
libraries, a bug in a module that uses external libraries, or internal
compiled machine code [1]. Since I can't see any use of external modules
in your code, I can only assume that your Perl is broken (or some
system library it relies upon).
Very hard to say anything sensible about it. I would advise you to
recompile Perl.
> while (<INPUT_FILE>) {
>
>
> s/^\s+//;
> s/\r?\n$//;
> ($city, $statecode, $countrycode, $date1, $max1, $min1, $cond1,
> $date2, $max2, $min2, $cond2,
> $date3, $max3, $min3, $cond3, $date4, $max4, $min4, $cond4,
> $date5, $max5, $min5, $cond5, $date6, $max6, $min6, $cond6 )
>= split (/,/);
yuck. Why don't you use an array and indexes?
> $location = "$city,$statecode,$countrycode";
>
> while(my($k, $v) = each %LOCATION) {
> push @{$LOCATIONS{$v}}, $k;
> }
What is this supposed to do? It more or less inverts what is in
%LOCATION, and puts it in %LOCATIONS, but why is that done for each
line of input? The least that you should do is move this out of the
loop.
> foreach (@{$LOCATIONS{$location}}) {
> if(exists $LOCATIONS{$location}){
> print OUTPUT_FILE"$_,$date1,$max1,$min1,$cond1\n";
> }
> }
>
> Am I missing something here?
Nothing that has anything to do with the core dump. As I said, a core
dump in Perl most likely signifies that there is something wrong with
your perl binary.
Martien
[1] I suspect you can probably get this stuff as well if you're not
careful in signal handlers, and there may be one or more other, more
obscure, possibilities.
--
Martien Verbruggen |
Interactive Media Division | Hi, Dave here, what's the root
Commercial Dynamics Pty. Ltd. | password?
NSW, Australia |
Not too surprising, given that %LOCATIONS must be pretty huge by
then. You're stuffing new values into it on every iteration of the
outer loop. Two changes:
1. The inner while loop should only be done once, before you enter
the "while (<INPUT_FILE>)" loop.
2. You should invert the if and the foreach. There's no need to do
the test on every iteration if the thing you're testing doesn't
change:
while(my($k, $v) = each %LOCATION) {
push @{$LOCATIONS{$v}}, $k;
}
while (<INPUT_FILE>) {
s/^\s+//;
s/\r?\n$//;
($city, $statecode, $countrycode, $date1, $max1, $min1, $cond1,
$date2, $max2, $min2, $cond2,
$date3, $max3, $min3, $cond3, $date4, $max4, $min4, $cond4,
$date5, $max5, $min5, $cond5, $date6, $max6, $min6, $cond6 )
= split (/,/);
$location = "$city,$statecode,$countrycode";
if (exists $LOCATIONS{$location}) {
foreach (@{$LOCATIONS{$location}}) {
print OUTPUT_FILE "$_,$date1,$max1,$min1,$cond1\n";
}
}
}
-- Dave Tweed
Still surprising. Perl should not dump core simply because it can't get
memory anymore. In fact, it should exit with the message
Out of memory!
or one of the related messages documented in perldiag. If it dumps core
because malloc returned NULL, it is a bug.
Martien
--
Martien Verbruggen |
Interactive Media Division | Little girls, like butterflies, need
Commercial Dynamics Pty. Ltd. | no excuse - Lazarus Long
NSW, Australia |
Yes, you are right. I've tried your suggestion and it works.
Thanks to you and others as well for giving me many options...