could you provide a suggestion about how to verify the validity of
an email address?
there are two levels of validity one could check:
- could the address provided be valid? by this, i mean - is this a
syntactically valid address?
- is the address provided actually valid? this means connecting to
the mail server on port 25 and using the VRFY command. the problem
here, however, is that this is not always possible. you have to
end up looking at mx records and it all gets very ugly.
anyone out there have a suggested solution?
thanks,
yash
This is an FAQ. What it basically comes down to is that if you
really want to be sure of its validity, you'll have to send them
an email and wait for a response.
: - could the address provided be valid? by this, i mean - is this a
: syntactically valid address?
Check the FAQ. You can get close, but not exact. Even if you
build a to-spec Internet email parser (I forget the RFC for them)
you still can't be sure. Sendmail will accept a lot of addresses
that are "valid" but will not parse under the strict RFC.
: - is the address provided actually valid? this means connecting to
: the mail server on port 25 and using the VRFY command. the problem
: here, however, is that this is not always possible. you have to
: end up looking at mx records and it all gets very ugly.
Not to mention UUCP, POP splitting domains, et al., which make this
method very unreliable.
: anyone out there have a suggested solution?
See the FAQ.
--
-Zenin
ze...@archive.rhps.org
This doesn't mean the host is valid, however, which is
more important:
$ whois archive.rhps.org
No match for "ARCHIVE.RHPS.ORG".
--
-Zenin
ze...@archive.rhps.org
In comp.lang.perl.misc, Yash Khemani <khe...@plexstar.com> writes:
:could you provide a suggestion about how to verify the validity of
:an email address?
Absolutely: you STOP TRYING. You can't do it.
And then you go read the damned FAQ where I tell you why.
Mail the address supplied a unique cookie, and ask them to reply with
some convolution of the cookie. If they do it, you have a real
person. Otherwise you don't.
YOU CANNOT VERIFY AN EMAIL ADDRESS.
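A minimal sketch of the cookie scheme described above, in Perl. It assumes the cookie is mailed out and the reply collected by some other means. The names make_cookie, cookie_ok and the $SECRET value are hypothetical, and rand is used only for brevity (it is not a cryptographically strong source).

```perl
use strict;
use warnings;
use Digest::SHA qw(hmac_sha256_hex);

# Hypothetical server-side secret; in practice load it from protected config.
my $SECRET = 'change-me';

# Build a cookie tied to one address: a random nonce plus an HMAC over
# "address:nonce", so a matching reply cannot be forged without the secret.
sub make_cookie {
    my ($address) = @_;
    my $nonce = join '', map { sprintf '%02x', int rand 256 } 1 .. 8;
    return $nonce . '-' . hmac_sha256_hex("$address:$nonce", $SECRET);
}

# Accept a reply only if the cookie it carries was issued for this address.
sub cookie_ok {
    my ($address, $cookie) = @_;
    my ($nonce, $mac) = split /-/, $cookie // '', 2;
    return 0 unless defined $nonce && defined $mac;
    return $mac eq hmac_sha256_hex("$address:$nonce", $SECRET) ? 1 : 0;
}
```

A reply that passes cookie_ok shows only that someone could read mail at that address when the cookie was sent; it says nothing about whether the address stays valid.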
--tom
--
Tom Christiansen tch...@jhereg.perl.com
It is, of course, written in Perl. Translation to C is left as an
exercise for the reader. :-) --Larry Wall in <74...@jpl-devvax.JPL.NASA.GOV>
A. Don't bother trying.
> - could the address provided be valid? by this, i mean - is this a
> syntactically valid address?
Impossible.
> - is the address provided actually valid? this means connecting to
> the mail server on port 25 and using the VRFY command. the problem
> here, however, is that this is not always possible. you have to
> end up looking at mx records and it all gets very ugly.
Yes, frequently not possible, and unreliable in the extreme, as a
solution.
The only thing that comes close to a reasonable solution -- and
which is not useful in many cases -- is to send a message to
the address asking for a reply, and see if you ever get one.
John Porter
But DO bother to check for shell escapes in their address, if your
script will be working with system commands. If you do something (very
undesirable) like:
system("cat temp.txt | mail $email_addr");
then all the person has to do is put in the email address
"no...@nobody.net;/bin/rm -r ~/*" .
steve
--
----------------------------------------
Domain name for replying is "inconnect".
----------------------------------------
In comp.lang.perl.misc, using a rude pain-in-the-ass spam-shrouded
address Nerv...@inconnect.com (whose real address has now been added
to a despammification table) writes:
: But DO bother to check for shell escapes in their address, if your
:script will be working with system commands. If you do something (very
:undesirable) like:
:
:system("cat temp.txt | mail $email_addr");
:
You give poor advice, sir. Checking for shell escapes is wrong.
Not calling the shell at all is right. Never use one-arged system
or exec, or backticks, or pipe opens with user input data. Period.
Consider the fred&bar...@stonehenge.com address, for example. Perfectly
valid and deliverable.
The right way to do this is explained in perlsec. You have to use a
fork open and a manual, comma-separated (list-form) exec.
You are running under -T aren't you?
And that was a useless use of cat.
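A minimal sketch of the fork-open-plus-list-exec idiom referred to above. The helper name run_with_stdin is hypothetical, and the mailer path in the usage comment is an assumption about the local system, not something given in this thread.

```perl
use strict;
use warnings;

# Run @cmd with $input on its standard input, never involving a shell.
# Returns true if the command exited with status 0.
sub run_with_stdin {
    my ($input, @cmd) = @_;

    # Two-argument open with "|-" forks: it returns the child's pid in the
    # parent and 0 in the child, whose STDIN is the read end of the pipe.
    my $pid = open(my $to_child, '|-');
    die "cannot fork: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: list-form exec hands each argument to the program as-is,
        # so characters like ';' or '&' are data, not shell syntax.
        exec { $cmd[0] } @cmd or die "cannot exec $cmd[0]: $!";
    }

    # Parent: feed the input, then close, which waits for the child.
    print {$to_child} $input;
    return close($to_child) ? 1 : 0;
}

# Hypothetical usage; the mailer path and its arguments are assumptions:
# run_with_stdin($message, '/usr/bin/mail', $email_addr);
```

Avoiding the shell does not stop the called program from interpreting its own arguments: an address beginning with "-" may still be read as an option, so that case needs separate handling.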
--tom
--
Please don't send spam to Nerv...@inconnect.com
--
Tom Christiansen tch...@jhereg.perl.com
echo "I can't find the O_* constant definitions! You got problems."
--The Configure script from the perl distribution
>> > could you provide a suggestion about how to verify the validity of
>> > an email address?
>>
>> A. Don't bother trying.
>
> But DO bother to check for shell escapes in their address, if your
>script will be working with system commands. If you do something (very
>undesirable) like:
>
>system("cat temp.txt | mail $email_addr");
>
>then all the person has to do is put in the email address
>"no...@nobody.net;/bin/rm -r ~/*" .
>
>steve
>
>--
>
>----------------------------------------
>Domain name for replying is "inconnect".
>----------------------------------------
I like it !
All those poor spammers..:->
NerveGas> But DO bother to check for shell escapes in their address,
NerveGas> if your script will be working with system commands.
No. Some *valid* email addresses have things in them that you would
call "shell escapes".
NerveGas> If you
NerveGas> do something (very undesirable) like:
NerveGas> system("cat temp.txt | mail $email_addr");
NerveGas> then all the person has to do is put in the email address
NerveGas> "no...@nobody.net;/bin/rm -r ~/*" .
Right. So don't do that. You *never* have to hand an email address
to the shell. Read the Web Security FAQ. Attend one of my Perl
Security lectures. There are *ways* to do it *right*.
(And that's a useless use of Cat, by the way. :-)
print "Just another Perl hacker," # but not what the media calls "hacker!" :-)
## legal fund: $20,990.69 collected, $186,159.85 spent; just 158 more days
## before I go to *prison* for 90 days; email fu...@stonehenge.com for details
--
Name: Randal L. Schwartz / Stonehenge Consulting Services (503)777-0095
Keywords: Perl training, UNIX[tm] consulting, video production, skiing, flying
Email: <mer...@stonehenge.com> Snail: (Call) PGP-Key: (finger mer...@teleport.com)
Web: <A HREF="http://www.stonehenge.com/merlyn/">My Home Page!</A>
Quote: "I'm telling you, if I could have five lines in my .sig, I would!" -- me
+ jdpo...@min.net (John Porter) wrote on 26.03.98 in <351A69...@min.net>:
+
+ > > - could the address provided be valid? by this, i mean - is this a
+ > > syntactically valid address?
+ >
+ > Impossible.
+
+ Quite possible.
Ok, SmartGuy. What's the regex that does this?
James
--
Consulting Minister for Consultants, DNRC
The Bill of Rights is paid in Responsibilities - Jean McGuire
To cure your perl CGI problems, please look at:
<url:http://www.perl.com/CPAN-local/doc/FAQs/cgi/idiots-guide.html>
> > - could the address provided be valid? by this, i mean - is this a
> > syntactically valid address?
>
> Impossible.
Quite possible.
(Oh, and btw: at the very least, you need to look at RFC 1123, not only at
RFC 822. You probably also want to look at the DNS RFCs; look at
rfc-index.txt to find them and anything else that might be related.)
Of course, that's hard work (see the FAQ, there's example code).
The FAQ claims there are deliverable, but syntactically invalid addresses;
I've never seen any, if you don't count special local forms that only work
on the same host, and I don't remember this coming up in the working group
currently drafting a successor to RFC 822 (see
draft-ietf-drums-msg-fmt-*.txt in the usual places), so unless someone has
some actual examples, I'm going to call that a myth.
Kai
--
Internet: k...@khms.westfalen.de
Bang: major_backbone!khms.westfalen.de!kai
http://www.westfalen.de/private/khms/
> Quite possible.
Interesting. My biz address at chasecreek....@usa.net is valid and works
:-)
But webm...@astro.fccj.org fails.
How would you determine a successful address using the methods described?
DNS, et al? astro.fccj.org is a valid address and webmaster is a valid user but
the two together won't work... Bounce city...
Curious,
Sneex :-)
There isn't any. Just because it's (trivially) solvable doesn't mean
there's a regex. Regexes aren't the ultimate solution engine. Regexes are
limited (though 5.005 will make them better). RFC 822 has a grammar.
In BNF. It wouldn't be hard to translate the BNF into a *parser*.
That is, a state machine, with memory, and transitions.
Syntactic validation is easy. (If you can do it in Basic or Tcl,
you should be able to do it in Perl as well)
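To illustrate the "state machine, with memory" point, here is a minimal sketch that handles just one piece of the grammar: RFC 822 comments, which may nest. It is not a full address parser. The name strip_comments is hypothetical, and the sketch deliberately ignores domain literals in square brackets.

```perl
use strict;
use warnings;

# Remove RFC 822 comments, which may nest, from an address string.
# Returns the stripped string, or undef if parentheses or quotes
# are left unbalanced.
sub strip_comments {
    my ($text) = @_;
    my ($out, $depth, $in_quote) = ('', 0, 0);
    my @chars = split //, $text;

    while (@chars) {
        my $c = shift @chars;
        if ($c eq '\\') {
            # A backslash quotes the next character, whatever it is.
            return undef unless @chars;
            my $next = shift @chars;
            $out .= $c . $next if $depth == 0;
        }
        elsif ($in_quote) {
            # Inside a quoted string, parentheses are ordinary characters.
            $in_quote = 0 if $c eq '"';
            $out .= $c;
        }
        elsif ($depth > 0) {
            # Inside a comment: only track nesting, emit nothing.
            $depth++ if $c eq '(';
            $depth-- if $c eq ')';
        }
        elsif ($c eq '(') { $depth = 1 }
        elsif ($c eq ')') { return undef }    # close without open
        else {
            $in_quote = 1 if $c eq '"';
            $out .= $c;
        }
    }
    return ($depth == 0 && !$in_quote) ? $out : undef;
}
```

The $depth counter is the "memory" a plain regex of that era lacks: it lets the loop accept comments nested to any depth.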
Abigail
--
perl -we '%_ = map {local $_ = $_; y/a-z/n-za-m/; ($_, $_)} @_ = map {lc} <>;
print grep {$_{$_}} @_' < /usr/dict/words
>I R A Aggie (fl_a...@thepentagon.com) wrote on MDCLXX September MCMXCIII
>in <URL: news:fl_aggie-270...@aggie.coaps.fsu.edu>:
>++ In article <6qkBA...@khms.westfalen.de>,
>++ kaih=6qkBA...@khms.westfalen.de (Kai Henningsen) wrote:
>++
>++ + jdpo...@min.net (John Porter) wrote on 26.03.98 in <351A69...@min.net>:
>++ +
>++ + > > - could the address provided be valid? by this, i mean - is this a
>++ + > > syntactically valid address?
>++ + >
>++ + > Impossible.
>++ +
>++ + Quite possible.
>++
>++ Ok, SmartGuy. What's the regex that does this?
>
>
>There isn't any. Just because it's (trivially) solvable doesn't mean
>there's a regex. Regexes aren't the ultimate solution engine. Regexes are
>limited (though 5.005 will make them better). RFC 822 has a grammar.
>In BNF. It wouldn't be hard to translate the BNF into a *parser*.
>That is, a state machine, with memory, and transitions.
I agree with Abigail.
The fact is that the MRE book attempts to show an email address syntax
validity regex, but after creating a regex of over 6000 bytes the
author also admits that his concoction still won't permit nested
comments within the email address (something that is permissible). He
settles for a 6000+ byte regex that will accept one level of comments
in the address.
He also reminds us that his regex is only (mostly) valid for Internet
email addresses. It would fail to match a local email address.
Frankly, if the author of Mastering Regular Expressions took over 6000
bytes in a regex that validates only MOST Internet email address
syntax, I'd say that for anyone with less expertise than him the
project would be futile, and the product would be ill-conceived.
>Syntatical validation is easy. (If you can do it in Basic or TCL,
>you should be able to do it in Perl as well)
It is a possible task but not with regexes alone, as you pointed out.
Dave
+ On 28 Mar 1998 09:07:54 GMT, abi...@fnx.com (Abigail) wrote:
+ >There isn't any. Just because it's (trivially) solvable doesn't mean
+ >there's a regex. Regexes aren't the ultimate solution engine. Regexes are
+ >limited (though 5.005 will make them better). RFC 822 has a grammar.
+ >In BNF. It wouldn't be hard to translate the BNF into a *parser*.
+ >That is, a state machine, with memory, and transitions.
Ok.
+ The fact is that the MRE book attempts to show an email address syntax
+ validity regex, but after creating a regex of over 6000 bytes the
+ author also admits that his concoction still won't permit nested
+ comments within the email address (something that is permissible). He
+ settles for a 6000+ byte regex that will accept one level of comments
+ in the address.
Right. I'm not disagreeing. But will it pass the "fred&bar...@stonehenge.com"
test?
But let us drop down to the bottom line, here: what is the cost/benefit
ratio to checking an email address? IMHO, not terribly high. In fact,
it's so low as to become a worthless endeavor.
About the only people who would truly benefit from such a thing are
spammers. Except they don't care if their throwaway account generates
100,000 bounces and turns some poor ISP's server into a smoking crater.
A valid use would be setting up a mailing list. But I suspect the running
and maintenance of such a thing is better served by setting up a proper
list server -- like majordomo.
Perl isn't necessarily the best solution for every problem. This may well
be one of them.
Yes. The regex is written according to RFC 822. The only thing it's missing
is nested comments.
http://enterprise.ic.gc.ca/~jfriedl/regex/code.html
--
_ / ' _ / - aka - r...@coos.dartmouth.edu
( /)//)//)(//)/( Ronald J. Kimball chip...@m-net.arbornet.org
/ http://www.ziplink.net/~rjk/
"It's funny 'cause it's true ... and vice versa."
> But let us drop down to the bottom line, here: what is the cost/benefit
> ratio to checking an email address? IMHO, not terribly high. In fact,
> it's so low as to become a worthless endeavor.
Nitpick:
If the cost/benefit ratio is low, that should mean it's worth
doing, right?
> James
--
//Tom Grydeland <Tom.Gr...@phys.uit.no>
I R A Aggie wrote:
>
> Perl isn't necessarily the best solution for every problem. This may well
> be one of them.
Infidel.
hand,
John Porter
> But let us drop down to the bottom line, here: what is the cost/benefit
> ratio to checking an email address? IMHO, not terribly high. In fact,
> it's so low as to become a worthless endeavor.
That depends upon what you're doing and what tools you're doing it with -
once you've got your syntax checker it's easy enough to call, and won't
take an excessive amount of time to return its results.
> About the only people who would truly benefit from such a thing are
> spammers. Except they don't care if their throwaway account generates
> 100,000 bounces and turns some poor ISP's server into a smoking crater.
>
> A valid use would be setting up a mailing list. But I suspect the running
> and maintenance of such a thing is better served by setting up a proper
> list server -- like majordomo.
Another valid use is mail and news programs - some incorrect
addresses can be lethal to programs which haven't anticipated them.
Sending mail (evil) or news (bad) with munged addresses isn't good to
start with; causing the other person's machine to crash and burn to boot
is downright uncivilized.
--
John Moreno
I am trying to convince the author of YA-NewsWatcher that the latest
version should be released to the public. He doesn't think there's much
interest in a new version. Help me prove him wrong. To do so, send me
mail <mailto:phe...@interpath.com> with a Subject of New YA. Comments
on what you like/dislike in the current version will be appreciated.
Okay, I find munged addresses pretty darn annoying, and I won't bother
unmunging an address to send an email reply. But I find the above argument
against munging addresses to be rather silly. If someone is using a buggy
program that crashes on a bad email address, that's their problem. One thing
we shouldn't have to do is compose our posts to conform to potential bugs in
miscellaneous news readers.
It doesn't cost much to do a quick check of a list of addresses that are
going on to a listserv. For each address you get one of three results:
definitely OK, obviously bogus, or who knows. For the "who
knows" addresses (pretty rare in the real world) you just shrug and move
on, but when you see something like:
Bill.C...@dyselxia.com
you reach for the "bogus" stamp and the ink-pad. The pay-off in my case
is reduced stress for those of my users who aren't *really* up to
dealing with listserv administration, even though it's their list.
--
Jaime Metcher
John Moreno wrote:
>
> I R A Aggie <fl_a...@thepentagon.com> wrote:
>
> > But let us drop down to the bottom line, here: what is the cost/benefit
> > ratio to checking an email address? IMHO, not terribly high. In fact,
> > it's so low as to become a worthless endeavor.
>
> That depends upon what you're doing and what tools you're doing it with -
> once you've got your syntax checker it's easy enough to call, and won't
> take an excessive amount of time to return its results.
>
> > About the only people who would truly benefit from such a thing are
> > spammers. Except they don't care if their throwaway account generates
> > 100,000 bounces and turns some poor ISP's server into a smoking crater.
> >
> > A valid use would be setting up a mailing list. But I suspect the running
> > and maintenance of such a thing is better served by setting up a proper
> > list server -- like majordomo.
>
> Another valid use is mail and news programs - some incorrect
> addresses can be lethal to programs which haven't anticipated them.
> Sending mail (evil) or news (bad) with munged addresses isn't good to
> start with; causing the other person's machine to crash and burn to boot
> is downright uncivilized.
>
Ooh! I'm salivating! What are the planned enhancements?
Brand
John Stanley wrote:
>
> In article <35219ED3...@spider.herston.uq.edu.au>,
> Jaime Metcher <met...@spider.herston.uq.edu.au> wrote:
> >It doesn't cost much to do a quick check of a list of addresses that are
> >going on to a listserv. For each address you get one of three results:
> >either definitely OK, obviously bogus, and who knows.
>
> Nope, you get one of two results: "obviously bogus" and "who knows".
> Most of the addresses, if properly formed, are "who knows", until you
> send mail and get a response back from the recipient. At that point,
> they enter the "previously valid but may change to invalid at any time"
> category.
>
In more detail, my three results are (in order): got an OK result from an
EXPN or RCPT (after necessary MX lookups), couldn't find the hostname at
all and/or the address looks weird, and neither of the above. I might
not have mentioned that this system is a machine/human hybrid - you have
to eyeball the rejects.
You're right in that the first one definitely isn't definitely OK. I
use it as a first step - can we *send* mail to that address. In some
cases, as you and everyone else has pointed out, the answer to that is
not the same as the answer to "will the mail *get* there".
My point is that, in giving addresses a plausibility rating, there's a
big difference between the first case and the third.
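A minimal sketch of the three-way plausibility rating described above. The actual DNS and SMTP probes are not shown; they are passed in as code refs, and the names rate_address, host_found and rcpt_ok are hypothetical. The "looks weird" test here is reduced to a missing or misplaced "@"; the human eyeballing step is outside the code.

```perl
use strict;
use warnings;

# Rate an address as 'accepted', 'bogus', or 'unknown'.
# $probe is a hypothetical hash of code refs supplied by the caller:
#   host_found => sub { ... }  true if DNS (MX or A) resolves the domain
#   rcpt_ok    => sub { ... }  true if the mail host answers OK to RCPT/EXPN
sub rate_address {
    my ($address, $probe) = @_;

    # No "@", or an empty side: weird enough to need a human look.
    my ($local, $domain) = $address =~ /\A(.+)\@([^\@]+)\z/s
        or return 'bogus';

    return 'bogus'    unless $probe->{host_found}->($domain);
    return 'accepted' if     $probe->{rcpt_ok}->($address, $domain);
    return 'unknown';
}
```

As the thread notes, 'accepted' only means a server agreed to take the message, not that it will reach a mailbox, and a rule like this will misfile UUCP-style addresses that have no DNS entry.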
> The value of simple checks to discard "obviously bogus" addresses is
> that the person entering it gets the clue that you are somewhat serious
> about entering a valid email address, even if you can't test it
> immediately.
>
> >on, but when you see something like:
> >
> >Bill.C...@dyselxia.com
> >
> >you reach for the "bogus" stamp and the ink-pad.
>
> The domain exists. You have no way of knowing if the mailbox exists or
> not. Just what criteria are you applying to this address to brand it
> "obviously bogus"?
>
I can't get my SMTP host to accept mail for that address. InterNIC says
the domain doesn't exist. Eyeballing says that two of the components in
that address could be regarded as misspelt variations of common
words/names. You assumed I'd made a typo in the domain name - I'd
assume the same. I'd say that's plenty of justification for getting on
the phone and asking Mr Cilnton if he's given me the right address.
> >The pay-off in my case
> >is reduced stress for those of my users who aren't *really* up to
> >dealing with listserv administration, even though it's their list.
>
> At the expense of poor Mr. Cilnton at dyslexia.com...
That's dyselxia.com.
Poor Mr Cilnton had better get himself an account on an RFC822 compliant
mail server in a registered domain if he wants to have his address taken
seriously. Or at least get used to writing "I know this looks weird but
it really is my address" after it.
--
Jaime Metcher
(Reply to the group one last time, even though straying further
off-topic, because email to you bounced).
John Stanley wrote:
>
> In article <352352CE...@spider.herston.uq.edu.au>,
> Jaime Metcher <met...@spider.herston.uq.edu.au> wrote:
> >> Nope, you get one of two results: "obviously bogus" and "who knows".
> >
> >In more detail, my three results are (in order): got an OK result from a
> >EXPN or RCPT (after necessary MX lookups),
>
> Then you aren't really trying to verify the email address, you are
> doing something I can't even really figure out a name for. An OK from
> a RCPT can mean simply "I will accept it for delivery", which is NOT
> verification of the address.
>
You're right of course. I still think it's a valuable first step - if
you can't send an email, it certainly won't get to the person you're trying
to reach! It certainly catches Mr Cilnton/Clinton's error (I know it's
an error, because I made it up!). Sure, it might not be an error - but
it's certainly worth checking up on.
> >couldn't find the hostname at all and/or the address looks weird,
>
> Again, not the same as verifying the address. Please do an MX lookup on
> a .UUCP address. You can't. As for "looks weird", you better stay away
> from X.400 addresses.
>
I'm happy to stay away from both UUCP and X.400. Maybe these things are
more common in the U.S., but in two years of running our departmental
mail server, I've never seen one. I'm happy to say that 90% of the
addresses that fail my checks are suffering from typos. For the other
10%, I'll spend a few minutes verifying by other means if possible. If
they turn out to be correct, fine. That still makes the checks more
than useful.
Please note that I'm not posting a module that I claim will verify email
addresses. I just think a blanket statement (not from you, but common
in c.l.p.m) that "You can't verify an email address" is misleading.
As I R A Aggie wrote:
>But let us drop down to the bottom line, here: what is the cost/benefit
>ratio to checking an email address? IMHO, not terribly high. In fact,
>it's so low as to become a worthless endeavor.
There's a lot of useful checking you *can* do. You can't give a
definite answer, but that's not the only kind of answer there is.
Tom Christiansen wrote an email address checking script. Why did he do
that? Just an idle exercise?
> >I can't get my SMTP host to accept mail for that address. InterNIC says
> >the domain doesn't exist.
>
> Yep, I tried dyslexia.com. This one is bogus. But how about
> dyslexia.UUCP?
>
> >words/names. You assumed I'd made a typo in the domain name - I'd
>
> Please don't try to guess what I assumed. You are wrong.
I don't know what I assumed you assumed. I'm confused: I said
dyselxia.com is bogus. You said dyslexia.com exists. In this last post
you seem to be saying that dyslexia.com *is* bogus. Well, my check of
InterNIC says that dyslexia.com *does* exist as a registered domain. I
don't know if it is accepting mail - is that what you mean? Anyway,
it's beside the point, because that's not the domain name I originally
typed. I typed dyselxia.com, *not* dyslexia.com. So rather than assume
(you're right, that was rude), I'll ask - *did* you assume that I'd made
a typo in the domain name?
--
Jaime Metcher
That might be because you didn't recognize them.
abi...@mars.ic.iaf.nl will be delivered by UUCP, even though it
has an RFC 822 addressing style.
Abigail
--
perl -weprint\<\<EOT\; -eJust -eanother -ePerl -eHacker -eEOT