I figured I'd bring this up for possible discussion...
The virus attack that hit just before Christmas made me take a
closer look at the mailing list archives and just how
secure they are, both from a scenario where someone decides to
attack a list by harvesting addresses and mailing list members
directly, and the more general anti-spammer harvesting issue.
The problem is how to make archives easily accessible, without
leaving them wide open to anyone. It's an interesting tradeoff.
I use two sets of archives. One is web based, using Web Crossing
(www.webcrossing.com), which keeps threaded archives for about 30
days. I found it was possible to access e-mail addresses as guest, so
I'm in the process of recoding it so that guests can't access that info.
Guests will still be able to browse, but can't access key identifying
data without logging in and registering on the site.
My other archive is via FTP, making the digested versions of things
available (and that is accessible via a search engine). This, of
course, is wide open. I've considered a number of ways to put some
better controls on this. The easy one, obviously, is to put it behind
a password, and make the password available in the list documentation.
But -- that fails any number of sniff tests. It's a step up from no
protection at all, but anyone motivated enough to target the
archives specifically won't get slowed down significantly. It's a
false security.
What I've decided to do for now is to move the archives from FTP to
HTTP, on an Apache server, and then to write an Apache
authentication module. When you try to access the archives, you'd
have to give your e-mail address, and you'll be let in only if
that e-mail address is a subscribed user. That puts the archives at
the same level of security as the list itself -- they can only be
accessed by someone who has gone through the subscription validation
process (so by definition, they can get your e-mail simply by reading
the list). It locks out anyone who isn't subscribed, so it locks out
anyone you've kicked off the list or who isn't willing to give you a
valid e-mail (assuming subscriptions are mailback-validated).
Anyone see any problems with this? I didn't want Yet Another
Password, and it seems to me an authentication scheme that ties
into the subscriber database is the easiest way to close off access
without significantly raising complexity for the end user. Anyone see
any real flaws here?
--
Chuq Von Rospach - Plaidworks Consulting (mailto:chu...@plaidworks.com)
Apple Mail List Gnome (mailto:ch...@apple.com)
Pokemon is a game where children go into the woods and capture furry
little creatures and then bring them home and teach them to pit fight.
Well, if your user base is like mine, users THINK their e-mail address
is Joe....@foo.bar when it is really joe...@piddly.foo.bar. If
you're extracting the subscriber address from headers and then they're
hand-typing in their e-mail address as a verification step, a goodly portion
of them aren't gonna get in.
--
Mike Nolan
> Well, if your user base is like mine, users THINK their e-mail address
> is Joe....@foo.bar when it is really joe...@piddly.foo.bar.
That's fixed by moving to something like what Lyris does, or a more
generalized VERP mailing. I'm currently wrangling with a replacement
for bulk_mailer that'll put the user's subscribed address back into
the To: line, and give me some other nice customization features (as
we speak, I'm doing some performance testing on Net::DNS, and finding
it's a real performance pig. Sigh, I may have to move that to C...)
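For readers who haven't met VERP: the idea is to give every outgoing copy its own envelope sender with the recipient's subscribed address encoded into it, so the list software always knows which subscription a message or bounce belongs to. A minimal sketch, assuming the common "list+user=domain@listhost" convention; the function names and the example list address are made up for illustration and are not taken from Lyris or bulk_mailer.

```python
def verp_encode(list_local: str, list_domain: str, subscriber: str) -> str:
    """Build a per-recipient envelope sender that embeds the subscriber address."""
    local, _, domain = subscriber.rpartition("@")
    if not local or not domain:
        raise ValueError(f"not an e-mail address: {subscriber!r}")
    # The '@' of the subscriber address is replaced by '=' so that the
    # whole thing fits into the local part of the list's own address.
    return f"{list_local}+{local}={domain}@{list_domain}"


def verp_decode(envelope_sender: str) -> str:
    """Recover the subscriber address from a sender built by verp_encode."""
    local_part, _, _ = envelope_sender.rpartition("@")
    _, sep, encoded = local_part.partition("+")
    if not sep or "=" not in encoded:
        raise ValueError(f"not a VERP address: {envelope_sender!r}")
    # Split on the last '=': a domain name cannot contain one.
    local, _, domain = encoded.rpartition("=")
    return f"{local}@{domain}"


sender = verp_encode("mylist-bounces", "lists.example.org", "joe@piddly.foo.bar")
subscribed_as = verp_decode(sender)
```

The same per-recipient pass is where the subscribed address could be written into the To: line, so users can see which address they are really subscribed as.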
> Well, they have to give _a_ e-mail address, but I don't see where it
> makes them give _theirs_. If they only have to know the address of
> _someone_ who is subscribed to the list then it doesn't really lock
> out anyone who was once on the list but since kicked off.
Hmm.
> For anyone
> else, if the list info gives your address anywhere (maybe it doesn't)
> and you're a subscriber, then everyone can be assured of knowing one
> valid address.
That's easy enough to take care of, simply by denying access to admin
addresses.
Hmm. You have a good point. While this would nuke out the spammers,
since they couldn't get an email address without first subscribing
SOME legal address to the list, it doesn't solve the "kicked out
getting even" attack scenario, because they would have had access to
mail where they could get someone else's address from.
So it's no better than the "password on the web site" solution, but a
lot more work.
Anyone see a way to fix this? I don't, unfortunately. Thanks, Mitch.
Saves me a buncha work for little real benefit.
Well, they have to give _a_ e-mail address, but I don't see where it
makes them give _theirs_. If they only have to know the address of
_someone_ who is subscribed to the list then it doesn't really lock
out anyone who was once on the list but since kicked off. For anyone
else, if the list info gives your address anywhere (maybe it doesn't)
and you're a subscriber, then everyone can be assured of knowing one
valid address.
-Mitch
I don't think there are any good solutions to that problem. In
order to make the archives accessible to casual use by human beings,
it has to be fairly easy to authenticate yourself. In order to
make it sufficiently easy for the clueless to authenticate, the
authentication instructions need to be fairly prominent, enough
that it would not deter someone specifically interested in harvesting
your archives.
Here we make the authentication pretty easy. The list archives
are behind a password-protected server. Anyone can create their
own account and password -- it's trivial. I have never been able
to find evidence of someone targeting our archive directly for
e-mail addresses and just don't worry about it.
--
Regards,
Tim Pierce
RootsWeb.com lead system admonsterator
and Chief Hacking Officer
I think it's 'only' necessary to make the archives as safe as being
subscribed is (and that's another discussion entirely!) -- which is
why authenticating against whether the person is subscribed or not
is where I'm headed.
Hmm. Here's a thought. You have a web page, where you type in your
e-mail address. That's validated against the subscriber lists, and if
you authenticate, you e-mail the access info to the user. Then, you
change the password on a regular basis (daily?) or even on a per-user
basis, if you want. With an SQL backend, adding a password field
isn't that bad, and allowing a user to set a password (and e-mailing
it to them again if they forget) isn't terribly difficult.
Hmm. That has potential.
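A rough sketch of the per-user variant, assuming a SQL subscriber table with an added password field. The table layout, function names and the use of SQLite are all assumptions made for illustration; the actual mailing of the password is left to the caller.

```python
import hashlib
import hmac
import secrets
import sqlite3


def open_db() -> sqlite3.Connection:
    """Create an in-memory stand-in for the subscriber database."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE subscribers (email TEXT PRIMARY KEY, salt BLOB, pw_hash BLOB)"
    )
    return db


def _hash(password: str, salt: bytes) -> bytes:
    # Store a salted hash rather than the password itself.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


def issue_password(db: sqlite3.Connection, email: str):
    """Return a fresh password for a subscribed address, or None.

    The caller is expected to e-mail the result to `email`, never to
    show it on the web page. Assumes addresses are stored lowercased.
    """
    email = email.strip().lower()
    row = db.execute("SELECT 1 FROM subscribers WHERE email = ?", (email,)).fetchone()
    if row is None:
        return None
    password = secrets.token_urlsafe(9)
    salt = secrets.token_bytes(16)
    db.execute(
        "UPDATE subscribers SET salt = ?, pw_hash = ? WHERE email = ?",
        (salt, _hash(password, salt), email),
    )
    db.commit()
    return password


def check_password(db: sqlite3.Connection, email: str, password: str) -> bool:
    """True only if `password` matches the one last issued to `email`."""
    row = db.execute(
        "SELECT salt, pw_hash FROM subscribers WHERE email = ?",
        (email.strip().lower(),),
    ).fetchone()
    if row is None or row[0] is None:
        return False
    return hmac.compare_digest(row[1], _hash(password, row[0]))
```

Calling issue_password again for the same address replaces the old password, which covers both the "forgot it" case and a regular rotation.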
> I have never been able
> to find evidence of someone targeting our archive directly for
> e-mail addresses and just don't worry about it.
I haven't, either, but I do worry about it, because the only thing I
can guarantee is if/when someone DOES target it, it'll be at the time
I can least afford to have to deal with it...
> What I've decided to do for now is to move the archives from FTP to
> HTTP, on an Apache server, and then to write an Apache
> authentication module. When you try to access the archives, you'd
> have to give your e-mail address, and you'll be let in only if
> that e-mail address is a subscribed user. That puts the archives at
> the same level of security as the list itself -- they can only be
> accessed by someone who has gone through the subscription validation
> process (so by definition, they can get your e-mail simply by reading
> the list). It locks out anyone who isn't subscribed, so it locks out
> anyone you've kicked off the list or who isn't willing to give you a
> valid e-mail (assuming subscriptions are mailback-validated).
I have been working on writing a mod_auth_listar, which will check the
HTTP user/pass against an e-mail and a web interface password (since
Listar does allow passwords for the web interface, though I think the
cookie method is more secure). I don't want to use just the e-mail, since
then if you knew even one e-mail of someone on the list, you could harvest
all the others... though I don't want to require people to set a web
password just to access the archives. I have been considering an
intermediate login page that would create a Listar authentication cookie,
but that is starting to just get frighteningly wrong.
> Anyone see any problems with this? I didn't want Yet Another
> Password, and it seems to me an authentication scheme that ties
> into the subscriber database is the easiest way to close off access
> without significantly raising complexity for the end user. Anyone see
> any real flaws here?
Other than the one I point out, no. But say someone forwards a message
from the list and you take the 'From' field in the forwarded message,
enter that in the 'E-mail' login portion of your web authentication box,
and voila, given one e-mail you can harvest all. It is still a better
approach than simply leaving them open to the world, though.
--
Jeremy Blackman - lo...@maison-otaku.net / lo...@listar.org / jer...@lith.com
Lithtech Team, Monolith Productions -- http://www.lith.com
Listar Developer -- http://www.listar.org
> Anyone see a way to fix this? I don't, unfortunately. thanks, Mitch.
> Saves me a buncha work for little real benefit.
It sounds to me like the goal is to restrict access to the archives to
only list members, correct? If that's the case, then that means you have
to authenticate an incoming user as a list member. To do authentication,
you have to have a shared secret between you and the person you're
authenticating, however indirectly. In the absence of personal certs
issued by a trusted authority or something else extremely complicated, in
practice I think this pretty much means either a password equivalent of
some sort or a confirmation handshake (which is essentially a one-time
password leveraged off the security of the person's e-mail account).
The scheme of using their e-mail address and checking against the
subscriber list reduces to using their e-mail address as a password. It's
not necessary to join a mailing list to know the e-mail address of one of
the subscribers; there are other ways of obtaining that information, down
to someone just happening to mention in public that they're on a
particular mailing list and making some guesses about what address they
would be subscribed as.
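To make the difference concrete, here is a minimal sketch of the confirmation handshake, as opposed to the address-as-password check. Knowing a subscriber's address only gets a token mailed to that subscriber; the token, not the address, is what opens the archive. The class name, the in-memory storage and the 15-minute lifetime are assumptions for illustration only.

```python
import secrets
import time


class ConfirmationHandshake:
    """One-time tokens that are only ever delivered by e-mail."""

    def __init__(self, subscribers, ttl_seconds=900):
        self.subscribers = {s.strip().lower() for s in subscribers}
        self.ttl = ttl_seconds
        self.pending = {}  # token -> (email, expiry timestamp)

    def request_access(self, email, now=None):
        """Return a token to mail to `email`, or None for non-subscribers.

        The token must go out by mail to the subscribed address; handing
        it back in the web response would defeat the handshake.
        """
        email = email.strip().lower()
        if email not in self.subscribers:
            return None
        now = time.time() if now is None else now
        token = secrets.token_urlsafe(16)
        self.pending[token] = (email, now + self.ttl)
        return token

    def confirm(self, token, now=None):
        """Consume a token; return its address if still valid, else None."""
        now = time.time() if now is None else now
        entry = self.pending.pop(token, None)  # one-time: removed on first use
        if entry is None:
            return None
        email, expiry = entry
        return email if now <= expiry else None
```

Someone who was kicked off the list can still type in a current subscriber's address, but the token lands in that subscriber's mailbox, not theirs.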
--
Russ Allbery (r...@stanford.edu) <URL:http://www.eyrie.org/~eagle/>
Someone once pointed out to me that forging an email address has
no security risk if it results in the file being sent to the person
whose address was forged... so if you're emailing out the file,
using the list of current subscribers is just fine. If you're
showing it on the web, you'd have to do something like email a
cookie before authorizing viewing for a period of time. Of course
if I know that someone ELSE has been viewing your archives, I can
use their address to view until the cookie expires.
> It sounds to me like the goal is to restrict access to the archives to
> only list members, correct?
That seems to be the cleanest way to limit access to a group of
approved users -- given that being a subscriber means they've been
authorized at some level. I have no problem with a wider audience,
except it's really hard to define where "okay" ends, and exactly how
to authorize it. I'm open to suggestions. The two groups I'm
specifically trying to lock out are the e-mail address harvesters who
won't abide by a robots.txt restriction, and the occasional troll
that gets kicked off a list and goes looking for ways to create
havoc, where, by definition, a rule like "don't do this" won't work.
it's an interesting issue that I think has some real need -- it's
just not all that easy, if only because defining "good" and "not
good" are so difficult, especially programmatically.
> The scheme of using their e-mail address and checking against the
> subscriber list reduces to using their e-mail address as a password. It's
> not necessary to join a mailing list to know the e-mail address of one of
> the subscribers; there are other ways of obtaining that information, down
> to someone just happening to mention in public that they're on a
> particular mailing list and making some guesses about what address they
> would be subscribed as.
But the dilemma faced by some of us, which I haven't seen discussed, is
one where you have public mailing list archives (like in support of a
software product, as in my case). Our organization uses those mailing
lists as a form of support ("Check the archives, as your question may have
already been asked, before subscribing/posting to the list itself"). The
problem lies in that we just can't restrict access to these archives to
subscribers; that negates a valuable resource to the user that may be able
to get a quick answer to his question via the archives rather than spend
time subscribing to the list, and then searching (or to ask the question
on the list itself).
Currently what we have resorted to is munging the headers via a function in
MHonArc to add a string, so that at least you couldn't auto-harvest the
addresses, but it would be easy to compensate <sigh>. We have thought
about just removing the addresses in the header, but there may be
situations where someone would want to contact someone on the list
directly (which is why we went the munging direction). This solution is
also incomplete - it doesn't take care of the addresses that show up in
.sig's or in the body of a message.
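As an illustration of what closing that gap might look like, here is a small sketch that munges anything address-shaped in the whole message text, not only in the headers. This is not the MHonArc function mentioned above; the regular expression is deliberately simple and the "NOSPAM." marker is an arbitrary choice.

```python
import re

# Deliberately loose pattern: good enough to catch typical addresses in
# headers, .sigs and bodies, not a full RFC 822 parser.
ADDRESS = re.compile(
    r"\b([A-Za-z0-9._%+-]+)@([A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+)\b"
)


def munge(text: str, marker: str = "NOSPAM.") -> str:
    """Insert `marker` after the '@' of every address-like string in `text`."""
    return ADDRESS.sub(lambda m: f"{m.group(1)}@{marker}{m.group(2)}", text)


message = "From: Joe <joe@piddly.foo.bar>\n\nMail me at joe@piddly.foo.bar.\n"
munged = munge(message)
```

A human can still strip the marker by hand to contact the poster, and a harvester that knows the marker can of course do the same, which is the weakness already noted.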
In my head, I have thought of a better solution, to somehow have an "entry
gate" where they enter their email addresses and click 'ok' to a notice
saying that they will not harvest the email addresses. Thus a
dynamically-generated URL (w/ cookie) is created and is sent to the email
address with an expiry date of 2-3 hours. Thus the general public can view
the archives with relative ease, and while it's not bullet-proof in the
case of spamming (nothing is, IMO), it at least balances that concern with
public access.
But it would be a hard thing to bring together (as an Apache Module
perhaps?)
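The token part of such a gate may be less work than it sounds. A minimal sketch, assuming a server-side secret and a signed token that carries its own expiry, so nothing needs to be stored per visitor; the key, the function names and the 3-hour default are placeholders, and the mailing and the web-server hookup are left out.

```python
import base64
import hashlib
import hmac
import time

SECRET_KEY = b"change-me"  # placeholder; a real deployment needs a private random key


def make_token(email, ttl_seconds=3 * 3600, now=None):
    """Return a signed token for `email`, to be mailed as part of a URL."""
    now = int(time.time() if now is None else now)
    payload = f"{email}|{now + ttl_seconds}".encode()
    sig = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    return (
        base64.urlsafe_b64encode(payload).decode()
        + "."
        + base64.urlsafe_b64encode(sig).decode()
    )


def check_token(token, now=None):
    """Return the e-mail address if the token is genuine and unexpired, else None."""
    now = time.time() if now is None else now
    try:
        payload_b64, sig_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        sig = base64.urlsafe_b64decode(sig_b64)
    except ValueError:  # wrong shape or bad base64
        return None
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(sig, expected):
        return None
    email, _, expiry = payload.decode().rpartition("|")
    return email if now <= int(expiry) else None
```

The mailed link would carry the token as a parameter, and the archive front end would call check_token on each request (or once, then set a cookie).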
I am curious to hear how others have dealt with this, and if they have any
other ideas...
Best Wishes - Peter
| Peter Losher | SysAdmin - Nominum, Inc. | Peter....@nominum.com |
We had the same problem with harvesters going through the archives,
in spite of the norobots.txt + some filtering based on user_agent
in Apache (I recommend a good article on this subject:
http://www.csc.ncsu.edu/~brabec/antispam.html).
Harvesters jump from one page to another via anchors (following
an <A HREF="">) but don't submit forms as far as we know. We have
inserted a basic FORM at the entry of archives. The user needs to
submit the form to access the archives. We have no direct link to
our archives on our web server. This method has proven effective
over the last 2 years.
Here is an example : http://listes.cru.fr/arc/http-...@cru.fr/
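For anyone who wants to try the form trick, the idea fits in a few lines. This is only a sketch written as a plain WSGI application, not the code behind the page above; the path, the page contents and the names are invented for illustration.

```python
FORM = b"""<html><body>
<form method="POST" action="/archive">
<input type="submit" value="Enter the archives">
</form>
</body></html>"""

INDEX = b"<html><body>(archive index would be generated here)</body></html>"


def app(environ, start_response):
    """Serve the archive index only in response to a submitted form.

    A robot that merely follows <A HREF> links issues GET requests and
    therefore only ever sees the form page.
    """
    is_post = environ.get("REQUEST_METHOD") == "POST"
    if is_post and environ.get("PATH_INFO") == "/archive":
        body = INDEX
    else:
        body = FORM
    start_response(
        "200 OK",
        [("Content-Type", "text/html"), ("Content-Length", str(len(body)))],
    )
    return [body]
```

It can be tried locally with the standard library's wsgiref.simple_server.make_server("localhost", 8000, app).serve_forever(). Note that this guards only the entry point: pages behind it need the same treatment, or must not be reachable by a direct link.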
> and the occasional troll that gets kicked off a list and goes looking
> for ways to create havoc, where, by definition, a rule like "don't do
> this" won't work.
We have developed a web interface to the Sympa MLM, which has an
interesting way of managing archives:
The authentication scheme is based on e-mail addresses
and passwords. When subscribing to a list, a user is
allocated an initial password, which he/she can change later.
He/she needs this password to access some private list
functions. Once he/she provides his/her password, an
HTTP cookie will do the auth job.
Web archives are managed with MHonArc but they are not
directly accessible through our web server (ie not in
the web hierarchy). This job is performed by a CGI
which has therefore complete control over who has
access to an archive. Depending on the "web_archive_access"
list parameter (public|private|owner|listmaster|closed),
the CGI awaits a password or requires specific privileges.
Subscriber information is stored in a relational database,
so the CGI and the MLM may work on the same set of data.
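
The decision such a CGI has to make can be sketched as a single function. The five parameter values are the ones listed above; what each value is taken to mean here (in particular that owners and listmasters also pass the "private" check) is an assumption made for illustration, not a description of Sympa's actual code.

```python
def may_read_archive(policy, user, subscribers, owners, listmasters):
    """Decide whether `user` may read a list's web archive.

    `user` is the authenticated e-mail address, or None for an anonymous
    visitor. The meaning given to each policy value is an assumption.
    """
    if policy == "public":
        return True
    if policy == "closed":
        return False
    if user is None:
        return False  # every remaining policy needs an authenticated user
    if policy == "private":
        return user in subscribers or user in owners or user in listmasters
    if policy == "owner":
        return user in owners or user in listmasters
    if policy == "listmaster":
        return user in listmasters
    raise ValueError(f"unknown web_archive_access value: {policy!r}")


subs = {"joe@piddly.foo.bar"}
owners = {"owner@example.org"}
masters = {"root@example.org"}
allowed = may_read_archive("private", "joe@piddly.foo.bar", subs, owners, masters)
```

Because the archive files sit outside the web hierarchy, this check is the only way in, and removing someone from the subscriber table removes their archive access in the same step.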
Olivier Salaun
Comite Reseaux des Universites
http://www.cru.fr/