$more input
3308191
The following program reads the number from the file named 'input'
and builds a URL from it. I then have lynx dump the page into a file
called 'out' and grep the entire thing for the Product ID, Item ID,
SKU, UPC, and weight.
m-net% more parse.pl
#!/usr/bin/perl -w
my (@shit, $read, $build, @product, @id, @sku, @upc, @weight);
my $temp;
open(IN, '<', 'input') || die "cant open: $!";
$read = <IN>;
chomp($read);
$build = "http://www.doba.com/members/catalog/".$read.".html";
$temp = `lynx -accept_all_cookies -dump $build`;
open(OUTFILE, '>out');
print OUTFILE $temp;
close OUTFILE;
open(OUT, '<', 'out') || die "cant open: $!";
@shit = <OUT>;
@product = grep(/Product ID/, @shit);
@id = grep(/Item ID/, @shit);
@sku = grep(/SKU/, @shit);
@upc = grep(/UPC/, @shit); # this part doesn't grep UPC correctly. I get some extra data after UPC.
@weight = grep(/Weight/, @shit);
print @product;
print @id;
print @sku;
print @upc;
print @weight;
% ./parse.pl
Product ID: 3308191
Item ID: 3653992
SKU: 8930
UPC: 896207999816 Condition: refurbished
Weight: 4.7 lbs.
i have to know if you could write this mess any slower? you are doing
everything possible to slow you down.
c> open(IN, '<', 'input') || die "cant open: $!";
c> $read = <IN>;
c> chomp($read);
c> $build = "http://www.doba.com/members/catalog/".$read.".html";
c> $temp = `lynx -accept_all_cookies -dump $build`;
why are you calling out to a program when perl can load web pages just
fine with LWP? did you even look for web stuff on cpan?
c> open(OUTFILE, '>out');
c> print OUTFILE $temp;
c> close OUTFILE;
c> open(OUT, '<', 'out') || die "cant open: $!";
c> @shit = <OUT>;
why are you writing out the output of lynx JUST TO READ IT BACK IN
AGAIN? this is the most absurd part of this program.
you have the text in $temp. you know how to use backticks but why do you
do the file write and reading back in? if you assigned the backticks to
an array you would get the same thing as in @shit without the wasted
effort.
also calling it @shit is not a good thing.
c> @product = grep(/Product ID/, @shit);
c> @id = grep(/Item ID/, @shit);
c> @sku = grep(/SKU/, @shit);
c> @upc = grep(/UPC/, @shit); #this part doesn't grep UPC correctly. I
c> get some extra data after UPC.
that is a problem with the format of the html page. html isn't line
oriented and you are grepping over lines. the proper way to deal with
html is with a parser. or in special very well defined cases with
regexes to actually grab what you want from the text. whole html lines
are almost never what you want.
uri
--
Uri Guttman ------ u...@stemsystems.com -------- http://www.sysarch.com --
----- Perl Code Review , Architecture, Development, Training, Support ------
--------- Free Perl Training --- http://perlhunter.com/college.html ---------
--------- Gourmet Hot Cocoa Mix ---- http://bestfriendscocoa.com ---------
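To make the regex suggestion above concrete, here is a minimal sketch. It assumes the page text looks like the lynx output shown earlier in the thread; the sample string is copied from that output, and the per-field patterns and the %pattern and %data names are illustrative, not taken from any post. Each pattern captures only the value, so the trailing "Condition: refurbished" on the UPC line is not picked up.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample text copied from the output shown earlier in the thread; a real
# run would take this from the fetched page instead.
my $text = <<'END';
Product ID: 3308191
Item ID: 3653992
SKU: 8930
UPC: 896207999816 Condition: refurbished
Weight: 4.7 lbs.
END

# One capturing pattern per field: grab only the value, not the whole line.
# These patterns are assumptions about the layout, not a general parser.
my %pattern = (
    'Product ID' => qr/Product ID:\s*(\d+)/,
    'Item ID'    => qr/Item ID:\s*(\d+)/,
    'SKU'        => qr/SKU:\s*(\S+)/,
    'UPC'        => qr/UPC:\s*(\d+)/,
    'Weight'     => qr/Weight:\s*([\d.]+\s*lbs)/,
);

my %data;
for my $field ( keys %pattern ) {
    $data{$field} = $1 if $text =~ $pattern{$field};
}

print "$_: $data{$_}\n" for sort keys %data;
```

This only holds up while the layout stays fixed; if the page changes, an HTML parser is the safer route.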
I know I shouldn't criticize free help, but you seem to have some anger
management issues.
>
> c> open(IN, '<', 'input') || die "cant open: $!";
> c> $read = <IN>;
> c> chomp($read);
> c> $build = "http://www.doba.com/members/catalog/".$read.".html";
> c> $temp = `lynx -accept_all_cookies -dump $build`;
>
> why are you calling out to a program when perl can load web pages just
> fine with LWP? did you even look for web stuff on cpan?
>
Would using LWP speed up the code? By the way, this code is meant to
run on a server with restricted access. Ie, I can't install stuff from
cpan on that server.
> c> open(OUTFILE, '>out');
> c> print OUTFILE $temp;
> c> close OUTFILE;
>
> c> open(OUT, '<', 'out') || die "cant open: $!";
> c> @shit = <OUT>;
>
> why are you writing out the output of lynx JUST TO READ IT BACK IN
> AGAIN? this is the most absurd part of this program.
>
> you have the text in $temp. you know how to use backticks but why do you
> do the file write and reading back in? if you assigned the backticks to
> an array you would get the same thing as in @shit without the wasted
> effort.
>
> also calling it @shit is not a good thing.
>
Huh? Are you saying I don't need the 'out' file?
Maybe something like this?
% more parse.pl
#!/usr/bin/perl -w
my (@shit, $read, $build, @product, @id, @sku, @upc, @weight);
my @temp;
open(IN, '<', 'input') || die "cant open: $!";
$read = <IN>;
chomp($read);
$build = "http://www.doba.com/members/catalog/".$read.".html";
@temp = `lynx -accept_all_cookies -dump $build`;
@product = grep(/Product ID/, @temp);
@id = grep(/Item ID/, @temp);
@sku = grep(/SKU/, @temp);
@upc = grep(/UPC/, @temp);
@weight = grep(/Weight/, @temp);
print @product;
print @id;
print @sku;
print @upc;
print @weight;
However, I don't know how to use LWP. Again, would the code run faster
if I used LWP?
c> On May 15, 1:37 pm, Uri Guttman <u...@stemsystems.com> wrote:
>> >>>>> "c" == chadda <cha...@lonemerchant.com> writes:
>>
>> i have to know if you could write this mess any slower? you are doing
>> everything possible to slow you down.
c> I know I shouldn't critize free help, but you seem to have some anger
c> management issues.
nope. i have bad code anger issues. i deal with this in code reviews all
the time. i just don't get how people come up with wacky and slow ways
to do things. i have seen worse code that read in files, parsed them,
wrote them out (untouched) and read them in again.
>>
c> open(IN, '<', 'input') || die "cant open: $!";
c> $read = <IN>;
c> chomp($read);
c> $build = "http://www.doba.com/members/catalog/".$read.".html";
c> $temp = `lynx -accept_all_cookies -dump $build`;
>>
>> why are you calling out to a program when perl can load web pages just
>> fine with LWP? did you even look for web stuff on cpan?
>>
c> Would using LWP speed up the code? By the way, this code is meant to
c> run on a server with restricted access. Ie, I can't install stuff from
c> cpan on that server.
if you have access to load scripts you can load pure perl modules
too. this is an FAQ.
c> open(OUTFILE, '>out');
c> print OUTFILE $temp;
c> close OUTFILE;
>>
c> open(OUT, '<', 'out') || die "cant open: $!";
c> @shit = <OUT>;
>>
>> why are you writing out the output of lynx JUST TO READ IT BACK IN
>> AGAIN? this is the most absurd part of this program.
>>
>> you have the text in $temp. you know how to use backticks but why do you
>> do the file write and reading back in? if you assigned the backticks to
>> an array you would get the same thing as in @shit without the wasted
>> effort.
>>
>> also calling it @shit is not a good thing.
>>
c> Huh? Are you saying I don't need the 'out' file?
yes. why do you think you need that file? you call backticks and get the
html page in $temp. why do you think you need a file to process that
data? you already have it inside perl.
c> Maybe something like this?
c> % more parse.pl
c> #!/usr/bin/perl -w
c> my (@shit, $read, $build, @product, @id, @sku, @upc, @weight);
c> my @temp;
c> open(IN, '<', 'input') || die "cant open: $!";
c> $read = <IN>;
c> chomp($read);
c> $build = "http://www.doba.com/members/catalog/".$read.".html";
c> @temp = `lynx -accept_all_cookies -dump $build`;
c> @product = grep(/Product ID/, @temp);
c> @id = grep(/Item ID/, @temp);
c> @sku = grep(/SKU/, @temp);
c> @upc = grep(/UPC/, @temp);
c> @weight = grep(/Weight/, @temp);
c> print @product;
c> print @id;
c> print @sku;
c> print @upc;
c> print @weight;
c> However, I don't know how to use LWP. Again, would the code run faster
c> if I used LWP?
better but forking off lynx is still slow. LWP should be much faster. if
you want speed (and with the data size you have, you want it), use LWP.
depending on how fast you need it (cpu usage will spike with the greps
you have) you can also change all that to parse out what you want with
regexes. (again, that assumes a known fixed html page layout which you
seem to have).
> > i have to know if you could write this mess any slower? you are
> > doing
> > everything possible to slow you down.
> I know I shouldn't critize free help, but you seem to have some anger
> management issues.
He seems to constantly come across this way. I really wish he could see
things from other points of view.
...
As a simple answer, take a look at LWP::UserAgent
(http://search.cpan.org/~gaas/libwww-perl-5.812/lib/LWP/UserAgent.pm),
as a good start in the right direction.
--
G.Etly
All the OP needs is LWP::Simple and HTML::TableExtract.
In fact, I wrote a whole script that took only 0.8 seconds to download
and parse a single page (of course, with more id's in a file, the only
real limit on the speed is the network latency and transfer speed) but I
have decided not to post it as I do not know what his intentions are.
As for you, pick a posting id and stick with it.
PLONKETY PLONK!
Sinan
--
A. Sinan Unur <1u...@llenroc.ude.invalid>
(remove .invalid and reverse each component for email address)
comp.lang.perl.misc guidelines on the WWW:
http://www.rehabitation.com/clpmisc/
I just tried LWP, and now I can't get the code to work for the life of
me. Here is what I attempted
#!/usr/bin/perl -w
use LWP::UserAgent;
use HTTP::Request;
use HTTP::Cookies;
my ($read, $build, @product, @id, @sku, @upc, @weight);
my @temp;
open(IN, '<', 'input') || die "cant open: $!";
$read = <IN>;
chomp($read);
$build = 'http://www.doba.com/members/catalog/'.$read.'.html';
#@temp = `lynx -accept_all_cookies -dump $build`;
my $ua = LWP::UserAgent->new;
$ua->agent("OMEGA SPARC DESTROYER/69");
my $request = HTTP::Request->new('GET');
$request->url($build);
my $cookie_jar = HTTP::Cookies->new;
$cookie_jar->add_cookie_header($request);
my $response = $ua->request($request);
my $code = $response->code;
print $code;
@temp = $request->content;
@product = grep(/Product ID/, @temp);
@id = grep(/Item ID/, @temp);
@sku = grep(/SKU/, @temp);
@upc = grep(/UPC/, @temp);
@weight = grep(/Weight/, @temp);
print @product;
print @id;
print @sku;
print @upc;
print @weight;
% ./parse.pl
500%
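Two things stand out in the attempt above, whatever the cause of the 500 (the post does not show the status message, so that part cannot be determined from here). First, `@temp = $request->content` reads the body of the outgoing request, which is empty for a GET, not the body of `$response`. Second, calling `add_cookie_header` on a brand-new jar has nothing to add; the jar normally gets attached to the user agent instead. A minimal sketch under those assumptions follows. The agent string and the `fetch_lines` helper are hypothetical, and IDs are taken from the command line rather than from the 'input' file.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Cookies;

# Attach the cookie jar to the agent so cookies are stored and sent
# automatically on every request, including redirects.
my $ua = LWP::UserAgent->new(
    agent      => 'parse.pl/0.1',        # hypothetical agent string
    cookie_jar => HTTP::Cookies->new,
);

# Fetch a URL and return the response body as a list of lines.
sub fetch_lines {
    my ($url) = @_;
    my $response = $ua->get($url);
    # status_line gives the code plus the reason, e.g. why a 500 happened.
    die "GET $url failed: " . $response->status_line . "\n"
        unless $response->is_success;
    return split /^/, $response->decoded_content;
}

# IDs from the command line (an assumption; the original reads 'input').
for my $id (@ARGV) {
    my $url  = "http://www.doba.com/members/catalog/$id.html";
    my @temp = fetch_lines($url);
    print grep { /Product ID|Item ID|SKU|UPC|Weight/ } @temp;
}
```

Note that this returns raw HTML rather than the rendered text that `lynx -dump` produces, so the matching lines will still contain markup.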
> On May 15, 3:16 pm, "Gordon Etly" <ge...@bentsys-INVALID.com> wrote:
>> cha...@lonemerchant.com wrote:
>> > On May 15, 1:37 pm, Uri Guttman <u...@stemsystems.com> wrote:
>> > chadda <cha...@lonemerchant.com> writes:
>> > > i have to know if you could write this mess any slower? you are
>> > > doing
>> > > everything possible to slow you down.
>> > I know I shouldn't critize free help, but you seem to have some
>> > anger management issues.
...
>> As a simple answer, take a look at LWP::UserAgent
>> (http://search.cpan.org/~gaas/libwww-perl-5.812/lib/LWP/UserAgent.pm),
>> as a good start in the right direction.
...
> I just tried LWP, and now I can't get the code to work for the life of
> me. Here is what I attempted
As I mentioned elsewhere, all you need is LWP::Simple.
So, here is a fish for you:
C:\Temp> cat p.pl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TokeParser;
use LWP::Simple;
my ($input_file) = @ARGV;
die "No input file specified\n" unless defined $input_file;
open my $INPUT, '<', $input_file
    or die "Cannot open '$input_file': $!";
ID:
while ( my $id = <$INPUT> ) {
    chomp $id;
    my $url = make_url( $id );
    my $html = get $url;
    unless ( defined $html ) {
        warn "Error downloading from '$url'\n";
        next ID;
    }
    my $parser = HTML::TokeParser->new( \$html );
    TABLE:
    while ( my $token = $parser->get_tag('table') ) {
        if ( lc $token->[1]{id} eq 'product_details' ) {
            my $td = $parser->get_tag('td');
            last TABLE unless $td;
            my $cell = $parser->get_text('/td');
            my %data;
            while ( $cell =~ /\s*([^:]+?):\s+(\d+)\s+/g ) {
                $data{$1} = $2;
            }
            use Data::Dumper;
            print Dumper \%data;
        }
    }
}
sub make_url {
    return
        sprintf q{http://www.doba.com/members/catalog/%s.html}, $_[0];
}
__END__
C:\Temp> timethis p list
$VAR1 = {
'Product ID' => '3308191',
'UPC' => '896207999816',
'Item ID' => '3653992',
'SKU' => '8930'
};
TimeThis : Command Line : p list
TimeThis : Start Time : Thu May 15 18:19:28 2008
TimeThis : End Time : Thu May 15 18:19:29 2008
TimeThis : Elapsed Time : 00:00:01.062
Comparing this to the overhead of an empty script:
C:\Temp> cat t.pl
#!/usr/bin/perl
use strict;
use warnings;
C:\Temp> timethis t
TimeThis : Command Line : t
TimeThis : Start Time : Thu May 15 18:20:38 2008
TimeThis : End Time : Thu May 15 18:20:38 2008
TimeThis : Elapsed Time : 00:00:00.218
It took 0.844 seconds to retrieve and parse the required information. Of
course, the time cost would be better amortized if you ran a lot of
these queries.
GE> cha...@lonemerchant.com wrote:
>> On May 15, 1:37 pm, Uri Guttman <u...@stemsystems.com> wrote:
>> chadda <cha...@lonemerchant.com> writes:
>> > i have to know if you could write this mess any slower? you are
>> > doing
>> > everything possible to slow you down.
>> I know I shouldn't critize free help, but you seem to have some anger
>> management issues.
GE> He seems to constantly come across this way. I really wish he could see
GE> things from other points of view.
GE> ...
as usual, no help from you.
GE> As a simple answer, take a look at LWP::UserAgent
GE> (http://search.cpan.org/~gaas/libwww-perl-5.812/lib/LWP/UserAgent.pm),
GE> as a good start in the right direction.
which i already told him and we have already improved his code a good
deal. try to keep up.
When I try to run this code, I keep getting a blank url.
[ Do not quote in full. Do not quote sigs. ]
...
> When I try to run this code, I keep getting a blank url.
Well, did you provide it with a file containing the id numbers? How do
you know the URL is blank? Did you modify the code? If you did, why did
you not post the relevant modifications?
I would have normally put the id number in the __DATA__ section, but
since you implied that you already had an input file with id numbers, I
followed your example.
In any case, unless you take active steps to help others help you, this
will be the sum total of the help I will provide you.
Sinan
--
A. Sinan Unur <1u...@llenroc.ude.invalid>
I suppose you will want to turn that line into a while loop once you have
more than one single item to process.
However, considering network latency and response times it may very well
be worthwhile to trigger multiple HTTP requests in parallel, such that
your processing code will never have to wait for network responses.
Others have already mentioned the other issues: shelling out to an
expensive external process, that expensive but useless temporary file,
and trying to parse HTML code using REs.
jue
Are you the same moron who went into my killfile a few days ago as
Gordon Etly <g...@bentsys.com>? I guess everyone had filtered you so you
had to create a new identity, right? Back you go where you came from!
>As a simple answer, take a look at LWP::UserAgent
>(http://search.cpan.org/~gaas/libwww-perl-5.812/lib/LWP/UserAgent.pm),
>as a good start in the right direction.
Yeah, it's easy enough to copy what other people had mentioned already.
jue
This may depend on many parameters, but the overhead of system()ing
may be quite low. The overhead of opening a new HTTP connection for
each line may be larger. LWP will have a chance to use persistent
connections...
Yours,
Ilya
> I know I shouldn't critize free help, but you seem to have some anger
> management issues.
*plonk*
--
Affijn, Ruud
"Gewoon is een tijger."
[please don't left pad quoted text with spaces]
> > cha...@lonemerchant.com wrote:
> > > On May 15, 1:37 pm, Uri Guttman <u...@stemsystems.com> wrote:
> > > chadda <cha...@lonemerchant.com> writes:
> > > > i have to know if you could write this mess any slower? you are
> > > > doing
> > > > everything possible to slow you down.
> > > I know I shouldn't critize free help, but you seem to have some
> > > anger management issues.
> > He seems to constantly come across this way. I really wish he could
> > see things from other points of view.
> > ...
> as usual, no help from you.
I'm just pointing out what is. It's you who keep bringing this upon
yourself. You are constantly rude and arrogant to people, then you
wonder why people sometimes post back, like the OP did. If you can't
handle receiving comments about what you post, then don't post. If you
can't take it, don't dish it out.
> > As a simple answer, take a look at LWP::UserAgent
> > (http://search.cpan.org/~gaas/libwww-perl-5.812/lib/LWP/UserAgent.pm),
> > as a good start in the right direction.
> which i already told him and we have already improved his code a good
> deal. try to keep up.
I would think someone who has been on UseNet as long as you would know
that posts don't always come down at the same time (or order) from every
server. Case in point, I had not seen such a post mentioning it until
later on.
--
G.Etly
Changing your identity again because everyone filtered you?
jue
> > He seems to constantly come across this way. I really wish he could
> > see things from other points of view.
> I guess everyone had filtered you so you had to create a new identity
I have not changed my identity. My name is Gordon Etly. I have not
changed that part, nor made any attempt to hide it, so your statement is
false.
I happen to be a sys op for the company I work for, including our mail
server, so I am able to add entries to /etc/aliases (which I commonly
use to publish variants of my main email address so that any unwanted
mailings can be easily stopped). I've never seen any rule saying "never
change your email field", as that is anyone's right.
> > As a simple answer, take a look at LWP::UserAgent
> > (http://search.cpan.org/~gaas/libwww-perl-5.812/lib/LWP/UserAgent.pm),
> > as a good start in the right direction.
> Yeah, it's easy enough to copy what other people had mentioned
> already.
I had not seen that mentioned at all before I posted. Funny, I see you
and your fellows do exactly this all the time (posting essentially the
same answer that was already given by someone else), but now it's
suddenly a bad thing. Please make up your minds.
In this case, there were no replies mentioning LWP::UserAgent. Uri did
mention LWP very briefly, but LWP has several modules. I was more
specific.
--
G.Etly
IZ> [A complimentary Cc of this posting was sent to
IZ> Uri Guttman
IZ> <u...@stemsystems.com>], who wrote in article <x7prrn9...@mail.sysarch.com>:
>> better but forking off lynx is still slow. LWP should be much faster. if
>> you want speed (and with the data size you have, you want it), use LWP.
IZ> This may depend on many parameters, but the overhead of
IZ> system()ing may be quite low. The overhead of opening a new HTTP
IZ> connection for each line may be larger. LWP will have a chance to
IZ> use persistent connections...
i highly doubt forking lynx and it doing a fetch with passing the page
back via a pipe would be faster than a direct call to lwp and getting
the page in ram. it would have to be a very odd system for the lynx
solution to be faster.
and lynx would have to always open a new connection as forked procs have
no memory.
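The persistent-connection point can be sketched as follows. This is a minimal example, assuming the server honors keep-alive at all; whether a connection is actually reused depends on the server and on the LWP version. IDs are taken from the command line here, which is an assumption (the original script reads a file named 'input').

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;

# keep_alive => 1 gives the agent a connection cache holding one idle
# connection, which later requests to the same host can reuse.
my $ua = LWP::UserAgent->new( keep_alive => 1 );

# One agent for the whole run; a forked lynx starts from scratch each time.
for my $id (@ARGV) {
    my $url      = "http://www.doba.com/members/catalog/$id.html";
    my $response = $ua->get($url);
    if ( $response->is_success ) {
        printf "%s: %d characters\n", $id, length $response->decoded_content;
    }
    else {
        warn "$id: ", $response->status_line, "\n";
    }
}
```

The design point is simply that the agent object outlives each request, so it has somewhere to keep an open connection.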
> Jürgen Exner wrote:
>> "Gordon Etly" <ge...@bentsys-INVALID.com> wrote:
>
>> > He seems to constantly come across this way. I really wish he could
>> > see things from other points of view.
>
>> I guess everyone had filtered you so you had to create a new identity
>
> I have not changed my identity. My name is Gordon Etly. I have not
> changed that part, nor made any attempt to hide it, so your statement
> is false.
>
> I happen to be a sys op for the company I work for, including our mail
> server, so I am able to add entries to /etc/aliases (which I commonly
> use to public variants of my main email address that any unwanted
> mailings can be easily stopped.) I've never seen any rule saying
> "never change your email field", as that is anyone's right.
Noting from the Anti-Troll FAQ:
Subject: 7.6 Morphed Identity
A morphed identity is when a poster has one usenet identity,
which changes in detail, to outwit killfiles. For instance the
name may remain the same and the email address change, or the
name and/or email address may contain accented characters which
are changed for different versions of the same letter.
Here are all of your selves as recorded by my killfile:
gordon etly g...@bentsys.com 0
gordon etly g...@invalidbentsys.com 0
gordon etly ge...@bentsys-invalid.com 0
I have added another one with this post.
Now, I don't know about Individual.NET's policies regarding morphing,
but their terms of use seems to explicitly prohibit using domains you do
not own as your from address:
http://www.individual.net/rules.php
Sender Address
The e-mail addresses given in "From:", "Reply-To:", and "Sender:" should
be valid (= should not bounce because of invalidity). Using addresses
and name space of other people without their permission is prohibited.
It does not look like you own bent-INVALID-sys.com or invalidbentsys.com
or bentsys-invalid.com.
Don't morph. Pick one identity and stick with it.
1) My identity has never changed. It has always been Gordon Etly, which
is my name.
2) Why are you trying to speak for everyone. While certain people may
share your view (and vice versa), it doesn't mean you speak for the
whole of the group.
--
G.Etly
> > Jürgen Exner wrote:
> Noting from the Anti-Troll FAQ:
>
> Subject: 7.6 Morphed Identity
>
> A morphed identity is when a poster has one usenet identity,
An email address is not an identity. It's an email address. The "Name"
field is your identity, and I have not changed that. I am free to
change my email address field however I wish, as are you and anyone
else.
> Sender Address
> The e-mail addresses given in "From:", "Reply-To:", and "Sender:"
> should
> be valid (= should not bounce because of invalidity). Using addresses
> and name space of other people without their permission is prohibited.
Being in control of your mail server actually allows you to fulfill the
"should not bounce because of invalidity" if you want to get down to
that. How a poster writes their email address is completely up to that
person. A rather large amount of people munge their email addresses, so
this isn't even an issue.
Lastly, attempting to pose that "identity" on a medium like UseNet
actually meaning something is idiotic at best. There is no guarantee
that a name you see is a real name, and in many cases it is not. Many
people use a "nick" name of sorts, and it is quite common to use a false
or munged email address to thwart spammer email harvesting.
--
G.Etly
There is no "Name" field. The From: header often includes both a name
and an email address. Changing one's From: header as often as you have
is a strong indicator of a troll.
> Lastly, attempting to pose that "identity" on a medium like UseNet
> actually meaning something is idiotic at best. There is no guarantee
> that a name you see is a real name, and in many cases it is not. Many
> people use a "nick" name of sorts, and it is quite common to use a false
> or munged email address to thwart spammer email harvesting.
It is not common to alter the From: header as often as you have done, no
matter whether your name is Gordon Etly, Gordon Gekko, or Trolly
McTroll.
--keith
--
kkeller...@wombat.san-francisco.ca.us
(try just my userid to email me)
AOLSFAQ=http://www.therockgarden.ca/aolsfaq.txt
see X- headers for PGP signature information
I do not think you understood what I wrote.
I'm not claiming that *this* overhead is small. What I say is that
*other* overheads may be not negligible.
Anyway, all overheads I know are in favor on LWP.
Hope this helps,
Ilya
> > Any email address is not an identity. It's an email address. The
> > "Name" field is your identity), and I have not changed that.
> There is no "Name" field. The From: header often includes both a name
> and an email address.
Many readers separate the "name" and "email" fields. I never changed my
name. The email address part of the From: line is not a static entity;
one can always change their email address. It's anyone's right to do so,
as it's their info. You're not suggesting an email address is a reliable
way of tracking someone, are you?
> Changing one's From: header as often as you have is a strong
> indicator of a troll.
Or someone who does not wish to satisfy someone's false notion that they
can force the last word using that tired old method. If they are going to
reply and then inform you that you're killfiled, as if the public really
needs to know (#1), then it is no less wrong to circumvent their
killfile; it's attack and counter, something that's existed as long as
man.
If one really wants to ignore me, they can either not read my posts or
block my name, as that remains constant.
> > Lastly, attempting to pose that "identity" on a medium like UseNet
> > actually meaning something is idiotic at best. There is no guarantee
> > that a name you see is a real name, and in many cases it is not.
> > Many
> > people use a "nick" name of sorts, and it is quite common to use a
> > false or munged email address to thwart spammer email harvesting.
> It is not common to alter the From: header
This is untrue. I see many people post one day with one name and/or
email and the next time I see a variant of their Name (or a nick name)
and/or a differing email address.
> no matter whether your name is Gordon Etly, Gordon Gekko, or Trolly
> McTroll.
My name has always been Gordon Etly. That is my identity; my name. If
one wishes to killfile me using that, then they are welcome to do so. If
they killfile me by email address then
(#1)
If you truly need to ignore someone, you don't need to announce the fact,
or for that matter, one doesn't need a killfile either, though it can be
nice.
--
G.Etly
> I 'll eventually have the input file filled with 350 million items.
Incidentally, if you could do three pages in a second, this corresponds to
about 3.7 years of continuous scraping.
If you try to do this in massively parallel way, then it might be
considered a denial of service attack.
Of course, if you could do that, then the performance constraints of the
web server on the other end of the connection kick in.
I am not sure if it is a good idea for you to invest any more time &
resources into improving the performance of your script.
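The arithmetic behind the 3.7 year estimate, as a quick check. The item count and the rate come from the posts above; the 365.25-day year is an assumption made here.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $items            = 350_000_000;   # size of the planned input file
my $pages_per_second = 3;             # assumed sustained fetch rate

my $seconds          = $items / $pages_per_second;
my $seconds_per_year = 60 * 60 * 24 * 365.25;   # 365.25-day year (assumption)
my $years            = $seconds / $seconds_per_year;

printf "%.1f years\n", $years;   # prints "3.7 years"
```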
> A. Sinan Unur wrote:
>> "Gordon Etly" wrote in
>
>> > Jürgen Exner wrote:
>
>> Noting from the Anti-Troll FAQ:
>>
>> Subject: 7.6 Morphed Identity
>>
>> A morphed identity is when a poster has one usenet identity,
>
> Any email address is not an identity. It's an email address. The
> "Name" field is your identity), and I have not changed that.
> I am free to change my email address field however I wish,
> as are you and anyone else.
In newsgroups, your identity is your full handle. It does not matter if
that does not correspond to your real life identity. So, so long as you
pick one, and stick with it, no one has a problem with it.
Except,
>> Sender Address
>> The e-mail addresses given in "From:", "Reply-To:", and "Sender:"
>> should be valid (= should not bounce because of invalidity). Using
>> addresses and name space of other people without their permission is
>> prohibited.
You snipped the source of that rule. That is a rule stated by the
service provider you chose.
> Being in control of your mail server actually allows you to fulfill
> the "should not bounce because of invalidity" if you want to get down
> to that.
That's funny because most of the domain names you use are not
registered. I am not sure how you are running mail servers for
non-existent domains.
Second, some of the domains you use are registered but do not seem to be
owned by someone named Gordon Etly.
> How a poster writes their email address is completely up to
> that person. A rather large amount of people munge their email
> addresses, so this isn't even an issue.
From other users' perspective, what matters is that you pick one and
stick with it. It seems like your service provider has explicit policies
prohibiting you from using non-existent domains or domains owned by
others. So, you should argue this point with them.
> Lastly, attempting to pose that "identity" on a medium like UseNet
> actually meaning something is idiotic at best. There is no guarantee
> that a name you see is a real name, and in many cases it is not. Many
> people use a "nick" name of sorts, and it is quite common to use a
> false or munged email address to thwart spammer email harvesting.
And that is completely irrelevant.
Oh really? So
Author: Gordon Etly <g...@bentsys.com>
Author: Gordon Etly <ge...@bentsys-INVALID.com>
Author: Gordon Etly <g.e...@bent-INVALID-sys.com>
was not you? How come that I don't believe you?
And now using identity number 4:
Author: Gordon Etly <g.e...@bentsys.INVALID.com>?
You must have a _REALLY_ bad reputation that you feel the need to change
your ID every other day.
>2) Why are you trying to speak for everyone. While certain people may
>share your view (and vice versa), it doesn't mean you speak for the
>whole of the group.
I never claimed to speak for anyone but myself.
jue
Nonsense. There is a From header. And maybe a ReplyTo header. And maybe
a FollowupTo header. But there is no such thing as a "Name" or an
"Email" header field in the first place.
>I never changed my
>name. The email address part of the From: line is not a atatic entity;
>one can always change their email address. It's anyone's right to do so,
>as it's their info. You're not suggesting an email address is a reliable
>way of tracking someone, are you?
If someone has to change it frequently then it is a very good indication
that that person has something to hide in their past. Why else would
they change their ID frequently?
Back you go to where you crawled out from
jue
> > > > I'm just pointing out what is. It's you who keep bringing this
> > > > upon
> > > > yourself. You are constantly rude and arrogant to people, then
> > > > you
> > > Changing your identity again because everyone filtered you?
> > 1) My identity has never changed.
> Oh really? So
> Author: Gordon Etly <g...@bentsys.com>
> Author: Gordon Etly <ge...@bentsys-INVALID.com>
> Author: Gordon Etly <g.e...@bent-INVALID-sys.com>
> was not you?
My name never changed. Email address is not an identity, it's an email
address. They are a variable field. One can always change it, so stop
trying to use that as an argument here. I said before my name never
changed and you just proved that for me.
> > 2) Why are you trying to speak for everyone. While certain people
> > may
> > share your view (and vice versa), it doesn't mean you speak for the
> > whole of the group.
> I never claimed to speak for anyone but myself.
Not true:
( from above )
> > > Changing your identity again because everyone filtered you?
You clearly implied you knew -everyone- had done it. Stop trying to
misrepresent things in order to formulate your arguments.
--
G.Etly
> > > There is no "Name" field. The From: header often includes both a
> > > name and an email address.
> > Many readers separate the "name" and "email" fields.
> Nonsense. There is a From header. And maybe a ReplyTo header. And
> maybe a FollowupTo header. But there is no such thing as a "Name" or
> an "Email" header field in the first place.
No, most readers that I've used give separate fields for Name and Email.
It writes the From: header behind the scenes. Either way, it doesn't
change the fact that Email part is a variable field that can change at
any time. Whether it's from changing email providers, or any number of
reasons (which one is not required to disclose), it is a person's own
choice what they want to display to the public as an email address.
Hell, some providers don't even require an email address. (I once had one
when I was in Europe for a few months that allowed "Name < >" (a space
for an email), which I realized when I forgot to enter an email.) Granted
most don't allow it, but the point is whatever it is, it's up to the
poster.
> If someone has to change it frequently then it is a very good
> indication that that person has something to hide in their past.
Err... I never changed my name, so how could I possibly be trying to
hide? Actually quite the opposite, I change the way my email appears in
the From: line so I am -NOT- hidden :)
--
G.Etly
So what? I am not violating it.
> > Being in control of your mail server actually allows you to fulfill
> > the "should not bounce because of invalidity" if you want to get
> > down to that.
> That's funny because most of the domain names you use are not
> registered.
Please stop playing stupid. I am not the first to add "invalid" or
"nospam" or the like to my email address. It's a common practice and it
has never been prohibited by any provider I've come across. Bottom line:
the email address you enter is for public display, and that's what many
harvesters look for.
> Second, some of the domains you use are registered
I only use one domain. You know very well about munging practices so
please stop feigning ignorance so suddenly.
> but do not seem to be owned by someone named Gordon Etly.
Come on, really. How many @aol, @yahoo, etc. users own those domains?
You know better than to make such an argument. Most people -don't- own
the domain their email is in.
> > How a poster writes their email address is completely up to
> > that person. A rather large amount of people munge their email
> > addresses, so this isn't even an issue.
> From other users' perspective, what matters is that you pick one and
> stick with it.
One does not have to use the same email address. One is free to change
that to whatever they wish.
--
G.Etly
+-------------------+ .:\:\:/:/:.
| PLEASE DO NOT | :.:\:\:/:/:.:
| FEED THE TROLLS | :=.' - - '.=:
| | '=(\ 9 9 /)='
| Thank you, | ( (_) )
| Management | /`-vvv-'\
+-------------------+ / \
| | @@@ / /|,,,,,|\ \
| | @@@ /_// /^\ \\_\
@x@@x@ | | |/ WW( ( ) )WW
\||||/ | | \| __\,,\ /,,/__
\||/ | | | (______Y______)
/\/\/\/\/\/\/\/\//\/\\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
==================================================================
Ben
--
Joy and Woe are woven fine,
A Clothing for the Soul divine William Blake
Under every grief and pine 'Auguries of Innocence'
Runs a joy with silken twine. b...@morrow.me.uk
> One does not have to use the same email address. One is free to change
> that to whatever they wish.
One is also free to fart in an elevator whenever one wishes.
The point being, not everything that's allowed is polite. Most of us learned
manners in grade school - please try to catch up.
sherm--
--
My blog: http://shermspace.blogspot.com
Cocoa programming in Perl: http://camelbones.sourceforge.net
> > One does not have to use the same email address. One is free to
> > change that to whatever they wish.
> One is also free to fart in an elevator whenever one wishes.
Not that I would recommend it.
> The point being, not everything that's allowed is polite.
Agreed.
But let's go back a second: I have not changed my name. I have not
attempted to hide myself. I only slightly altered the appearance of my
email address, which is a variable field and can change at any time,
such as when one moves from one ISP to another. It has nothing to do
with politeness. It's my info that I set and it's my choice. No one
else's.
Please don't be so naive as to think that all of Usenet uses real email
addresses in the From line. It's not uncommon to see munged or fake
emails, especially if you've been spammed to high hell in the past when
using a valid email address. If you really want to get in direct contact
with me, reply saying so and I'll provide you with my real address (and
how to decode it.)
--
G.Etly
IZ> [A complimentary Cc of this posting was sent to
IZ> Uri Guttman
IZ> <u...@stemsystems.com>], who wrote in article <x7tzgy1...@mail.sysarch.com>:
>> >> better but forking off lynx is still slow. LWP should be much faster. if
>> >> you want speed (and with the data size you have, you want it), use LWP.
>>
IZ> This may depend on many parameters, but the overhead of
IZ> system()ing may be quite low. The overhead of opening a new HTTP
IZ> connection for each line may be larger. LWP will have a chance to
IZ> use persistent connections...
>>
>> i highly doubt forking lynx and it doing a fetch with passing the page
>> back via a pipe would be faster than a direct call to lwp and getting
>> the page in ram. it would have to be a very odd system for the lynx
>> solution to be faster.
>>
>> and lynx would have to always open a new connection as forked procs have
>> no memory.
IZ> I do not think you understood what I wrote.
so make it clearer the next time you write.
IZ> I'm not claiming that *this* overhead is small. What I say is that
IZ> *other* overheads may be not negligible.
IZ> Anyway, all overheads I know are in favor of LWP.
my point too.
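
For anyone following along, here is a rough sketch of the LWP approach
being argued for above: one user agent with keep-alive turned on, reused
for every product number, so requests to the same host can share a
connection instead of forking lynx once per page. The sub names
build_url and fetch_all are made up for illustration, the URL pattern is
the one from the original parse.pl, and the 30 second timeout is an
arbitrary choice. Whether a connection is actually reused depends on the
server.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;

# Build the catalog URL for one product number. The pattern is copied
# from the original parse.pl; build_url is a hypothetical helper name.
sub build_url {
    my ($id) = @_;
    return "http://www.doba.com/members/catalog/$id.html";
}

# Fetch every id with a single user agent and return a hash ref of
# id => page content. keep_alive => 1 gives the agent a connection
# cache, so requests to the same host may reuse one connection.
# fetch_all is a hypothetical helper name.
sub fetch_all {
    my (@ids) = @_;
    my $ua = LWP::UserAgent->new(keep_alive => 1, timeout => 30);
    my %page;
    for my $id (@ids) {
        my $res = $ua->get(build_url($id));
        if ($res->is_success) {
            $page{$id} = $res->decoded_content;
        }
        else {
            warn "fetch failed for $id: ", $res->status_line, "\n";
        }
    }
    return \%page;
}
```

Calling it would look something like my $pages = fetch_all(3308191);
and then the greps run against $pages->{3308191}. Note that LWP hands
back the raw HTML, not the rendered text that lynx -dump produces, so
the patterns from the original script would probably need adjusting.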