<<...generate file $myfile in some way...>>
ns_returnfile 200 text/plain $myfile
ns_unlink $myfile
If this snippet is executed in a tight loop on a Linux system, the chances
of returning the wrong results are very high due to AOLserver's fastpath
caching, which requires the following four attributes to be identical to
consider a new file to be a cache hit (as per the FastReturn function in
fastpath.c):
1) Same device number
2) Same inode number
3) Same modification time (within one second)
4) Same size
Assuming $myfile is always on the same filesystem, number 1 is taken care
of, and Linux reuses inode numbers, so the creation and deletion of
$myfile will typically result in a file with the same inode. So in this
example, files created within a given second that contains the same amount
of data as a preceding file created within that same second will be
considered identical, and will be erroneously served from cache.
This isn't just a hypothetical, BTW; a client of mine ran into this issue
and spent many weeks trying to figure out what was happening before
tracing it back to AOLserver's fastpath caching. And the issue had
existed for many years without being detected.
I'm mainly bringing this up to shine a light on the issue and see what
other people's views are. It's potentially a very serious issue given
that it may silently "corrupt" data, and the fact that fastpath caching is
enabled by default means that people may run into it without even knowing
they're exposed to the danger. The best workaround I can think of (short
of a checksum, which would defeat the purpose of caching in the first
place) would be to check that the mtime or ctime of the file is some
threshold number of seconds (e.g. 1 or 2) less than the current time, and
not serve the file from cache if it's not. In other words, a file would
have to be at least X seconds old (which could be a configurable value)
before it could be served from the cache rather than from disk.
Thoughts?
- John
--
AOLserver - http://www.aolserver.com/
To Remove yourself from this list, simply send an email to <list...@listserv.aol.com> with the
body of "SIGNOFF AOLSERVER" in the email message. You can leave the Subject: field of your email blank.
But in general it is not a good idea to do things the way you are, which
is reading and writing the same file at the same time, which has nothing
to do with fastpath. You should use a cond/mutex to serialize access.
tom jackson
No, you've misunderstood the scenario. The file name needn't be the same
to trigger this issue, and the "corruption" doesn't come from serving data
out of a file that's changing, but rather because fastpath caching
mistakenly identifies a new file as being identical to a previously-cached
file (for the reasons I outlined) and erroneously serves the
previously-cached data to the user.
This is a design limitation and arguably a bug in the fastpath caching
implementation, which is potentially quite serious since it silently
serves the wrong data to the user. If you want a more straightforward
(albeit contrived) demonstration of the problem, here you go:
set file [open "/var/tmp/myfile" "w"]
puts $file "ABC123"
close $file
ns_returnfile 200 text/plain "/var/tmp/myfile"
ns_unlink -nocomplain "/var/tmp/myfile"
set file [open "/var/tmp/myotherfile" "w"]
puts $file "XYZ987"
close $file
ns_returnfile 200 text/plain "/var/tmp/myotherfile"
ns_unlink -nocomplain "/var/tmp/myotherfile"
Assuming that /var/tmp/myfile and /var/tmp/myotherfile are created within
the same second, the fastpath caching algorithm will misidentify them as
the same file, and ns_returnfile will therefore erroneously return the
(previously cached) contents of /var/tmp/myfile when it should be
returning the (uncached) contents of /var/tmp/myotherfile.
Just to be clear: fastpath is for serving static content. This is not
what you are doing here, you are creating a temporary file to store
dynamic content. For your bug to work you must delete the old file and
create a new one within the same second, etc.
Also, your code sequence below will leave temporary files around in the
case of a crash. If you want to safely serve the content from this
temporary storage, you should unlink after you finish creating it (no
other thread or process will be able to access the content, or you can
unlink before you write the content and even local users will not be
able to see the file.
Then just send out the contents directly using the fd not the file
name.
(maybe something like:
ns_return 200 [ns_guesstype $myfile] [read $fd]
Then you can close the fd, although AOLserver does that automatically at
the end of each request.
Now: why are you writing the content to disk? Can't you use a temp
variable.
tom jackson
From the server side, this could be fixed by:
- adding in the filename to the hash key or checking that it is the same
- making ns_unlink flush the entry from the fastpath cache
- restricting what fastpath will cache - e.g., don't cache anything in
/var/tmp or /tmp or a configuration-specified directory.
- adding a "-nocache" flag to ns_returnfile
All of these have pros and cons.
I don't think your suggestion of waiting for cache entries to age a
second or two would work well, it just moves the race condition around
and adds a whole lot of disk activity when a busy server is warming up -
static files might be read a few dozen times instead of once.
Fixing it from the application side is much easier. Just use
ns_returnfp instead of ns_returnfile, on the open handle if you
generated the file from tcl code and it's convenient to get the handle,
otherwise by opening the file right there:
<<...generate file $myfile in some way...>>
set fp [open $myfile]
ns_returnfp 200 text/plain $fp
close $fp
ns_unlink $myfile
You'd probably lose some efficiency by not mmap-ing the file, but that's
likely to be noise compared to generating the file in the first place.
-J
It is a security issue mostly because the code sequence is incorrect.
(which also means that ns_returnfile should not be used for temp file
return)
The safe way to do this is to open the temp file, then immediately
unlink it! Then write to the fd.
BTW, this same bug exists in the ns_form/ns_conn files code which should
use fd's instead of files. We need a little code cleanup here.
tom jackson
I'd agree that that's the intent, but the caching is hidden within
ns_returnfile and it's not clear at all from the user's perspective that
this alligator is lurking in the swamp. Using ns_returnfile in this way
may not be the best approach in any particular situation, but it's
nonetheless a completely valid usage and isn't contraindicated in any
AOLserver docs I've seen.
It's not difficult to come up with examples where it might happen,
BTW...say, a web service that returns the result of an operating system
command to a user.
I think Jade makes a good point that this is not only a bug but
potentially a security issue.
>It happens to be used in ns_returnfile since that is the normal use
>case. On unix the fastpath cache is keyed off the dev/inode probably to
>keep the hash key shorter. Windows doesn't have device and inode numbers
>so it uses the filename as the hashkey, so it wouldn't run into this
>problem.
No, it can still easily run into this problem--it's just that the file
name needs to be the same in both cases (which actually did apply in my
client's case, and caused confusion in the early debugging of the problem,
since the assumption was that using the same file name and/or path name
was the source of the problem).
> From the server side, this could be fixed by:
>- adding in the filename to the hash key or checking that it is the same
No go, as observed above.
>- making ns_unlink flush the entry from the fastpath cache
Nope, since the file can be removed via (e.g.) exec rm.
>- restricting what fastpath will cache - e.g., don't cache anything in
>/var/tmp or /tmp or a configuration-specified directory.
>- adding a "-nocache" flag to ns_returnfile
This last is the one I'd considered as well, but the problem is that it
puts the onus on the user to know that they should use the flag, and
that's unlikely to be clear to them.
>I don't think your suggestion of waiting for cache entries to age a
>second or two would work well, it just moves the race condition around
>and adds a whole lot of disk activity when a busy server is warming up -
>static files might be read a few dozen times instead of once.
Nope, not at all. The only files that would get read more than once would
be those that were served within one second of being generated--which
wouldn't apply to any content that fits the definition of "static".
So this is actually a fairly non-intrusive fix. The main limitation is
that it relies on the file timestamps and the server timestamps being
synchronized, which may not always be true. But I can't think of a better
solution. Simply put, fastpath caching is inherently broken because it's
not possible to guarantee that the file in question really should be
served from cache (again, short of a cache-defeating checksum).
>Fixing it from the application side is much easier. Just use ns_returnfp
>instead of ns_returnfile, on the open handle if you generated the file
>from tcl code and it's convenient to get the handle, otherwise by opening
>the file right there:
Yep, and that's more or less the workaround I'd suggested to my
client. But my point here wasn't to ask about potential workarounds but
to highlight the issue itself, since I haven't seen it mentioned before.
I don't think it is a bug in fastpath.
Think about the case where multiple logical files are actually the same
physical file. Using the name would result in caching the same object
under different names. This is a much more likely situation than this so
called bug.
tom jackson
The command is named ns_returnfile.
The expectation is that you are returning a "file", not a web service
resource.
The expectation is that the file will be around for longer than one
second before being deleted and replaced.
The fact that the documentation doesn't say this is unimportant. Inodes
are reused, this is part of how the filesystem works. You could run into
the same problem with an archive program. A file of the same inode,
name, size and age is created replacing the old file. Most archive
programs would not understand that the file contents had changed. Is it
a bug? No. It is called a practical limitation.
Anyway: no bug, just how it works. The only bug is how ns_returnfile is
being used in the example.
tom jackson
> I'd agree that that's the intent, but the caching is hidden within
> ns_returnfile and it's not clear at all from the user's perspective that
> this alligator is lurking in the swamp. Using ns_returnfile in this way
> may not be the best approach in any particular situation, but it's
> nonetheless a completely valid usage and isn't contraindicated in any
> AOLserver docs I've seen.
This then is the real fix: mention it in the docs. I put a blurb on the
appropriate wiki pages; feel free to suggest something better :)
The docs in the distribution should be updated too.
>> It happens to be used in ns_returnfile since that is the normal use
>> case. On unix the fastpath cache is keyed off the dev/inode probably
>> to keep the hash key shorter. Windows doesn't have device and inode
>> numbers so it uses the filename as the hashkey, so it wouldn't run
>> into this problem.
>
> No, it can still easily run into this problem--it's just that the file
> name needs to be the same in both cases (which actually did apply in my
> client's case, and caused confusion in the early debugging of the
> problem, since the assumption was that using the same file name and/or
> path name was the source of the problem).
The system needs to be free to do some things to improve performance
with the understanding that the user needs to be aware of those things
or risk bad behaviour. I wouldn't call it an unreasonable assumption
that a file with the same name (and same modtime etc) is the same file.
You can run into a very similar problem with NFS (i.e., attribute
caching causing a modified file to appear not so) and people have
learned to deal with that.
>> - making ns_unlink flush the entry from the fastpath cache
>
> Nope, since the file can be removed via (e.g.) exec rm.
True, but I'd still put this in the "system needs to be able to ..."
category above. The system does some things and the developer should be
aware of those things.
>> I don't think your suggestion of waiting for cache entries to age a
>> second or two would work well, it just moves the race condition around
>> and adds a whole lot of disk activity when a busy server is warming up
>> - static files might be read a few dozen times instead of once.
>
> Nope, not at all. The only files that would get read more than once
> would be those that were served within one second of being
> generated--which wouldn't apply to any content that fits the definition
> of "static".
It would work in your exact case, where the file is always removed
immediately after being served and generated. But if not, it would
still come up with the wrong answer.
13:50:21 - create file
13:50:21 - serve file (gets cached)
13:50:21 - delete file
13:50:21 - create file again (reuses inode)
... time passes ...
13:55:11 - serve file
In this case the file modtime is more than a few seconds old, but the
cached mtime, inode, etc. are still matching the file on disk, so the
stale cache entry would get delivered.
There is also at least one clever optimization where "static" content
does get served within a second of being created, where the 404 page is
used to generate something like an image from something like a database
and writes it to a file where it is subsequently served by fastpath.
> So this is actually a fairly non-intrusive fix. The main limitation is
> that it relies on the file timestamps and the server timestamps being
> synchronized, which may not always be true. But I can't think of a
> better solution. Simply put, fastpath caching is inherently broken
> because it's not possible to guarantee that the file in question really
> should be served from cache (again, short of a cache-defeating checksum).
The same can be said about nearly any caching system: it is unable to
handle changes in the data that happen outside of the cache's control or
knowledge. This is just the bargain you make when you use a cache.
> But my point here wasn't to ask about potential workarounds but to
> highlight the issue itself, since I haven't seen it mentioned before.
I think you highlighting it is most of the fix. From there, get the
caveat inserted into the documentation and the knowledge into the
community so that the next person who runs into this problem will have
an easier, or at least less frustrating time solving it.
-J
Huh, hard links - I sometimes forget about those. It's a much more
believable reason (than my previous suggestion of shortening the key)
for why the inode was used instead of the filename for the hash key.
-J
fastpath is making assumptions about what means something is the
"same file", and those assumptions are not consistent with unix
filesystem semantics - how is this not a bug?
sure, the original use case that triggered this seems non-optimal,
and could be done in other ways that don't trigger the bug, but that
doesn't mean fastpath is behaving correctly...
Russell
Nope, because in this case the inodes for the files would be different, so
fastpath caching would distinguish them.
>What about this scenario:
> * You have a web application that allows administrators on various
> sites hosted on your application to download a list of user names and
> passwords (this is a slightly contrived example). They can download it
> to CSV.
> * Admin #1 generates this file. You create a unique filename for
> their site_id, because you want a unique filename to return back to the
> user: site-1234-passwords.csv. You return this file to the admin.
> * Admin #2 generates their file. You create a unique filename for
> their site_id, because you want a unique filename to return back to the
> user: site-5000-passwords.csv. You attempt to return this file to the
> admin. Because their request was in the same second, however, they get
> site-1234-passwords.csv?
Yep, it could happen in this case, assuming the files are deleted after
they're returned to the user via ns_returnfile.
As I mentioned, this bug wasn't discovered through code review or any
theoretical process--it was causing problems in live code, and it was
extremely difficult to track down. And the damage assessment is still
underway.
No, because each file has a different inode. The "bug" requires that you
create and destroy one file and create another one within one second (so
they have the same timestamp) also required that the same inode is used
and that the file is the same exact size.
But beyond that, hopefully your git checkout will maintain the original
timestamp with the file.
> What about this scenario:
> * You have a web application that allows administrators on
> various sites hosted on your application to download a list of
> user names and passwords (this is a slightly contrived
> example). They can download it to CSV.
> * Admin #1 generates this file. You create a unique filename for
> their site_id, because you want a unique filename to return
> back to the user: site-1234-passwords.csv. You return this
> file to the admin.
> * Admin #2 generates their file. You create a unique filename
> for their site_id, because you want a unique filename to
> return back to the user: site-5000-passwords.csv. You attempt
> to return this file to the admin. Because their request was in
> the same second, however, they get site-1234-passwords.csv?
> Do I understand the problem correctly? I think both of these scenarios
> are pretty common examples of the way people use Aolserver currently,
> but I'm not sure if I'm understanding correctly the bug.
>
The filename doesn't matter, neither does the source of the information.
Two different requests could create files. The requirement is that the
first is created and destroyed and the second is created within the same
second as the first, reuses the inode, has the exact same size.
This is why you should not use linked files (with path names) as
temporary storage. Instead, open the file then unlink it (delete it from
the filesystem), then use it via the fd.
In short: there is no bug.
tom jackson
the problem is that this can occur even if the filename is changed,
and I'd argue that pretty convincingly violates the principle of
least surprise.
yes, of course the system needs to make some assumptions about what
it can optimise, but if the contents of /tmp/userinfo-71562 might get
served back when I've asked for /tmp/userinfo-61453 then there's
something wrong.
Russell
No, fastpath is making the exact same assumptions that any archive
program would make, which is to record certain attributes at the time
something is cached and then compare them with the same attributes at a
later time. Unless you do a checksum or some other comparison, the cache
system doesn't work very well for the intended purpose.
> sure, the original use case that triggered this seems non-optimal,
> and could be done in other ways that don't trigger the bug, but that
> doesn't mean fastpath is behaving correctly...
The "use case" is a bug. You can't violate the essential granularity of
the support system and call it a bug. The granularity is: inode, size,
timestamp. Now, if we could just slow down AOLserver so that this never
happens, that would be a great fix.
This is like claiming that a checksum collision is a bug. No, it is
expected. We don't use things like checksums, or inode,size,time as a
key as a guarantee of anything. They are a compromise, in other words,
engineering.
tom jackson
It's not a bug because no one ever said that it *was* strictly following
unix filesystem semantics, which isn't even a single thing (ufs is
slightly different than nfs, is slightly different than ext2 -noatime,
is slightly different than afs, etc.) It is following a particular
definition: if the file still exists and has the same
dev/inode/mtime/size as it did when you last checked, then it is the
same file. This of it as a if-modified-since or if-none-match
conditional GET.
It is a bug in that it's not what you expect. However in that case, the
location of the bug is subject to debate.
-J
If it were not for the fact that the same system is entirely responsible
for the situation, then I would agree.
What you are really hoping for here is an idiot proof system. The big
hole in the reasoning here is that the important thing is the file name
with path, and that somehow this name is immutably linked to some
content. This is delusion. You want a transactional database but you are
using a filesystem. Grow up.
BTW, fastpath has configuration parameters. Maybe bone up on those
first.
tom jackson
Also, would it be possible to tell ns_returnfile to not use fastpath,
if it is for one time use?
The alternative in this scenario would of course be to simply read the
file and just ns_return it.
Bas.
I'd say "nearly any" is going too far, and in fact I'd say that for most
caching systems to fail to return the correct data is a serious bug. The
NFS example you bring up isn't really analogous since it's only about
attributes, which are frequently not a concern; were NFS to return
incorrect *data* for a file, that would be a serious bug. And in this
case we're talking about a web server that may silently return data that's
completely incorrect, which I'd say is very serious misbehavior.
The core problem here is that AOLserver is attempting to use the tuple of
(dev, inode, mtime, size) as a unique determiner of a file's identity, and
that's an inherently broken assumption--particularly so since the
granularity of mtime is one second and inodes are reused on many
filesystems (e.g. very common ones like ext3 and ufs).
>I think you highlighting it is most of the fix. From there, get the
>caveat inserted into the documentation and the knowledge into the
>community so that the next person who runs into this problem will have an
>easier, or at least less frustrating time solving it.
That'd be an improvement over the current situation, but it's still the
case that AOLserver as currently shipped has a file cache mechanism built
into it which 1) may return incorrect data and 2) is enabled by
default. Given the risk, I'd say fastpath caching should be disabled by
default rather than enabled.
- John
yes, that's exactly what I said - fastpath should be removed.
snark aside, if I say "ns_returnfile /tmp/foo-abcd" but nsd sends the
contents of the now-deleted /tmp/bar-wxyz to the client then it's not
doing what I've explicitly asked, and it's a bug.
just because the correct (imo) response is "tag WONTFIX, document as
a gotcha, document workaround" doesn't mean that the behaviour is
correct.
cheers
Russell
The "bug" conditions are actually slightly looser than this, because
fastpath checks mtime and not ctime. So a malicious user (or your
version control system, if it makes the local files have the same
timestamps as those in the repo) could overwrite a file at any point in
the future, utime() it back to the same time and fastpath would still
consider it the same. So would any number of unix utilities, like
rsync, tar, zip, etc.
Going back to my previous solutions, the only one on the server side
that I still think is reasonable (names break hardlinks, cache flushing
on unlink wasn't good in the first place, -nocache - why bother?) is to
add a configuration option to exclude particular paths from fastpath.
Actually not even a configuration option; that would involve a bit too
much overhead for a marginal case; maybe a patch to fix this problem for
users for whom it is a problem.
Using an unlinked file as a temporary is the right thing to do most of
the time, but I imagine ti could be difficult to do when you need to
pass the filename around to uncooperative external programs.
-J
> That'd be an improvement over the current situation, but it's still
> the case that AOLserver as currently shipped has a file cache
> mechanism built into it which 1) may return incorrect data and 2)
> is enabled by default. Given the risk, I'd say fastpath caching
> should be disabled by default rather than enabled.
if someone's application is at risk of triggering this behaviour,
that'd just delays any problem until their load is high enough that
they need to turn on fastpath - and surely that's an even worse
scenario.
cheers
Russell
If your application wasn't the responsible party which violated the
expectation you state, I would agree (maybe).
The problem is that you think that the contents of a file remains
unchanged as long as the filename itself remains unchanged.
Actually the problem is that someone is using a file to store volatile
data and then feeding this file through a cache.
You really need to think about this insanity. Because it is insanity.
1. You waste time writing data to a file.
2. You use ns_returnfile to send this data (reading from disk).
3. Fastpath puts this information into memory (taking space).
4. ns_returnfile uses the memory copy on later requests (but none
valid).
5. meanwhile the file is deleted, cache still exists taking up memory.
The above are "ideal" conditions.
The bug is not in ns_returnfile.
tom jackson
> You want a transactional database but you are using a filesystem.
> Grow up.
and
> If your application wasn't the responsible party which violated the
> expectation you state, I would agree (maybe).
please go and re-read this thread, and get your parties straight.
Sorry, I don't follow.
Until someone explains to me why we need to be able to create and delete
a file (then return it via fastpath), then create another file in the
same second, I'll maintain that there is no bug in fastpath.
The whole thing is a waste of time and space. We don't need to fix
ns_returnfile so that it is easier to waste time or space.
tom jackson
I'd say it's still better, because it requires explicit action on the
user's part to enable the flawed caching mechanism in that case. And
actually I don't think fastpath in its default configuration would be of
much help in performance terms these days, given that the cache is only
5MB large and file data is typically cached by the OS anyway (and servers
generally have far more RAM than they did even five years ago).
I do think this should have been considered (and steps taken to address
it) when the fastpath caching mechanism was initially developed, since
it's a glaring flaw. I've designed things that rely on shaky underlying
assumptions in the past, but only in controlled circumstances where those
assumptions were guaranteed to obtain. I can think of situations in which
a caching mechanism with this type of design limitation wouldn't be an
issue, but in my opinion it has no place being a default-enabled mechanism
in an enterprise-grade web server.
- John
ok, I'll spell it out.
it's not my application that's violated the expectation I state. you
haven't been paying attention to the From: headers, and seem to have
mistaken me for the original poster of this thread.
all I've been saying is that "ns_returnfile <filename>" returning the
content of something other than <filename>, contrary to the
documentation and common sense, is a bug. given that fastpath exists
for a (good) reason, and that the behaviour which triggers the bug is
marginal anyway, the correct response is "the bug will not be fixed,
here's why, and here's how to work around it".
cheers
Russell
Ah, okay. I didn't mean to point to any particular application, by "your" I didn't mean any particular you or your.
> all I've been saying is that "ns_returnfile <filename>" returning the
> content of something other than <filename>, contrary to the
> documentation and common sense, is a bug. given that fastpath exists
> for a (good) reason, and that the behaviour which triggers the bug is
> marginal anyway, the correct response is "the bug will not be fixed,
> here's why, and here's how to work around it".
It is an interesting point. But it isn't a bug. The purpose of the API
is to return a static file, not one which changes in under a second. It
is not a bug to not support code which is guaranteed to be slower than
common alternatives.
Fastpath is designed to support return of smallish static content. It
isn't some ancient way of speeding up stuff that was slow, it was for
speeding up stuff that was already fast but was easy to make even
faster.
If you want to avoid use of fastpath, just set the configuration lower
than your dynamic content:
#
# Fastpath
#
ns_section "ns/server/${server}/fastpath"
ns_param cache [set cache 10] ;# max entries ??
ns_param cachemaxsize [set cachemaxsize [expr 5 * 1024 * 1024]]
ns_param cachemaxentry [expr round(floor($cachemaxsize/$cache))]
Or, if the dynamic content is very small, or customized, don't write it
to a file in the first place. In general you are probably doing
something wrong if you write small content to a file and immediately
delete it. You are also likely doing something wrong if you are caching
large files.
tom jackson
> I'd say it's still better, because it requires explicit action on the
> user's part to enable the flawed caching mechanism in that case. And
> actually I don't think fastpath in its default configuration would be of
> much help in performance terms these days, given that the cache is only
> 5MB large and file data is typically cached by the OS anyway (and servers
> generally have far more RAM than they did even five years ago).
>
fastpath is for small static content. You don't need to cache large
files, and that is why the cachemaxsize parameter gives you a cutoff on
the largest size to cache.
AOLserver has great performance on small files, fastpath speeds it up
further, plus the overall scheme handles directory files, internal
redirects, etc.
> I do think this should have been considered (and steps taken to address
> it) when the fastpath caching mechanism was initially developed, since
> it's a glaring flaw. I've designed things that rely on shaky underlying
> assumptions in the past, but only in controlled circumstances where those
> assumptions were guaranteed to obtain. I can think of situations in which
> a caching mechanism with this type of design limitation wouldn't be an
> issue, but in my opinion it has no place being a default-enabled mechanism
> in an enterprise-grade web server.
Why not just write another API which strips out all the things you don't
like. I think you misjudge fastpath in every way, but whatever.
tom jackson
> That'd be an improvement over the current situation, but it's still the
> case that AOLserver as currently shipped has a file cache mechanism built
> into it which 1) may return incorrect data and 2) is enabled by
> default. Given the risk, I'd say fastpath caching should be disabled by
> default rather than enabled.
Sounds right to me. Either robustify Fastpath somehow against this
corner case, or don't have Fastpath turned on by default.
--
Andrew Piskorski <a...@piskorski.com>
http://www.piskorski.com/
For what it's worth, it seems to me that if it has a measurable benefit, it's worth leaving on by default, as long as developers are properly educated about design issues (flaws, bugs, tradeoffs, whatever) that they need to deal with. If it's off by default it may as well be removed entirely. I say on by default, but well-documented so that developers are forced to have at least a cursory understanding of it when doing anything that might relate to it.
Titi Ala'ilima
Lead Architect
MedTouch LLC
1100 Massachusetts Avenue
Cambridge, MA 02138
617.621.8670 x309
Having it disabled by default would be like not using computers because
they fail sometimes. That would be too extreme, isn't it? ;-)
As long as it's well documented, and there are alternatives to avoid the
problems, i think it's ok to leave Fastpath activated by default.
Regards,
Juan José
--
Juan José del RÃo
Chief of Commerce
Simple Option S.L.
Avda. Editor Angel Caffarena 11, B11, 1B
Málaga, 29010, Spain
+34 616 512 340 cell
+34 951 930 122 tel/fax
This is not a corner case. The exact same thing could happen without
fastpath.
What is that thing? That the contents of a file changes after a request
is made and before the file is returned. In fact, there is no guarantee
that it won't change mid-return. This is a fact of life with files on
any filesystem.
In fact, with the HTTP caching mechanisms, you could fail to get
up-to-date contents of a file, since the If-Modified-Since mechanism
will also fail.
The problem here is that the application is using this static file
handling API to serve dynamic content. Wondering why it doesn't work is
pointless.
Just to summarize again, this case requires that a file is created then
destroyed and another file created within the same second that has the
same size. Also, the original file must get into the cache, and the only
way that can happen is for the application to treat it as a long lived
static file.
We have other means to cache dynamic data, and large chunks of dynamic
content saved as a file can avoid the fastpath cache by setting the
cachemaxsize parameter. Writing smaller content to disk doesn't make any
sense if your goal is speed...or security.
It is probably even more important to tamp down these misconceptions
about how AOLserver works. Static and dynamic content are handled by
different API. The reason is that it has long been recognized by the
developers of AOLserver that different techniques are required to
maintain high performance based upon how the content is generated, its
expected lifespan, its size, and its potential for reuse.
tom jackson
On Tue, 2008-08-19 at 03:00 -0400, Andrew Piskorski wrote:
> On Mon, Aug 18, 2008 at 06:06:23PM -0700, John Caruso wrote:
>
> > That'd be an improvement over the current situation, but it's still the
> > case that AOLserver as currently shipped has a file cache mechanism built
> > into it which 1) may return incorrect data and 2) is enabled by
> > default. Given the risk, I'd say fastpath caching should be disabled by
> > default rather than enabled.
>
> Sounds right to me. Either robustify Fastpath somehow against this
> corner case, or don't have Fastpath turned on by default.
>
--
That'd be a pretty easy and efficient way to discard fastpath items in
case they have been deleted and/or modified.
Just my two cents ;-)
-
Juan José del RÃo |
(+34) 616 512 340 | juan...@simpleoption.com
Simple Option S.L.
Tel: (+34) 951 930 122
Fax: (+34) 951 930 122
http://www.simpleoption.com
The description of the parameters here is a little confusing. Browsing
the source, it appears that "cache" is a flag to enable or disable
fastpath, "cachemaxsize" is the maximum size of the cache, and
"cachemaxentry" is the largest size of a file that will get cached.
There is no setting for the max number of entries, the use of $cache in
the settings above (reflecting the server defaults) is really a minimum
number of cache entries (i.e., the default cache will hold at least 10
entries of the max 512k size, but it could also hold 1000 5k files).
I didn't dig deep enough to see how the cache flushing works, but on
casual perusal it looks like the cache is pruned by removing the oldest
entries (not largest, least hit, or least recently hit).
-J
I wrote the code. The explanation below is correct -- I chose inode/
dev combination to cache the same file even with multiple names which
was the case at AOL -- hundreds of symlinks and hard links to the same
file. The same strategy is used for ADP templates. I think the code
uses just filenames on Windows because the inode/dev don't really
exist in Win32 weirdness (or at least I didn't care enough to find the
proper analog).
As for whether this is a bug or not, opinions vary. I would suggest
the code snippet of create temp file and use fastpath to return
contents is not a use case I was solving for or recommend. The
suggestion to open a temp fd, unlink it, dump content into and send
from the open fd seems the better approach for a few reason including
proper cleanup after a crash. In fact, there is an API for such cases
-- Ns_GetTemp or something -- and it's used internally, for example,
to spool large file uploads. It re-uses open and unlinked fd's -- in
practice file create is expensive and is avoided by just keeping a
cache of open fd's around, truncating the content at the end of the
connection. I'm not sure if the docs are up to date and/or if there
are useful Tcl commands but something could be added easily if needed.
Having said all that, a note in the docs that "ns_returnfile is
designed for truly static content..." and comments on how the cache
works and can be disabled would make sense.
-Jim
On Aug 18, 2008, at 7:37 PM, Tom Jackson wrote:
> On Mon, 2008-08-18 at 15:38 -0700, Jeff Rogers wrote:
>> While I'd agree this is a bug in fastpath, the real problem is that
>> fastpath is being used at all in this case.
>
> I don't think it is a bug in fastpath.
>
> Think about the case where multiple logical files are actually the
> same
> physical file. Using the name would result in caching the same object
> under different names. This is a much more likely situation than
> this so
> called bug.
>
> tom jackson
It's also not the use case in question--just a simple illustration of the
problem. Here's a more realistic template of a use case (which closely
mirrors the actual code that led to the discovery of the bug):
eval exec /some/external/program --output-file $tempfile
ns_returnfile 200 text/plain $tempfile
ns_unlink -nocomplain $tempfile
In other words, run an external program that writes its output to
$tempfile, return that file to the user, and delete the file. This is a
case in which ns_returnfile seems like the obvious and appropriate
call--but if this procedure is run on behalf of users A and B within the
same second (which is common on an active web server), and the results in
$tempfile are the same length, B will get A's output. Depending on what
information the external program writes to $tempfile, this could easily
represent a security breach.
That example involves timing between two different users, but something
like the following will also trigger the bug:
foreach user $users {
eval exec /some/external/program --output-file $tempfile --user
$user
ns_returnfile 200 text/plain $tempfile
}
Again, this code looks perfectly appropriate, but it's very likely to
return incorrect data due to this bug. Note that the ns_unlink isn't even
required in this case.
Also, regarding "use fastpath to return content": the developer in this
case didn't know fastpath from a hole in the ground--after all, they were
calling ns_returnfile, not fastpath. fastpath is just the
behind-the-scenes mechanism that was making "ns_returnfile X" return a
file other than X. And generally speaking, I'd say it's perfectly
reasonable for a developer to believe that "ns_returnfile X" actually will
return file X.
- John
It is not a bug in ns_returnfile.
tom jackson
Actually that's not analogous, for the same reason that the analogies to
caching of attributes in NFS, rsync or tar not noticing content changes if
attributes stay the same, etc, don't apply: because this bug can happen
*even with two files that have completely different names or
paths*. Again, in this example...:
set file [open "/var/tmp/myfile" "w"]
puts $file "ABC123"
close $file
ns_returnfile 200 text/plain "/var/tmp/myfile"
ns_unlink -nocomplain "/var/tmp/myfile"
set file [open "/var/tmp/myotherfile" "w"]
puts $file "XYZ987"
close $file
ns_returnfile 200 text/plain "/var/tmp/myotherfile"
ns_unlink -nocomplain "/var/tmp/myotherfile"
...AOLserver will almost always return the contents of /var/tmp/myfile
rather than /var/tmp/myotherfile in response to the second ns_returnfile.
I think the analogies to other systems aren't really germane
anyway--AOLserver's behavior has to be judged on its own merits. But
adopting that standard, I can't think of any other program that would
confuse /var/tmp/myfile with /var/tmp/myotherfile.
- John
>> Think of it as a if-modified-since or if-none-match
>> conditional GET.
>
> Actually that's not analogous, <...>
I didn't mean to say it was exactly the same, just similar in that given
a particular system that makes particular assumptions it is possible to
construct a situation where the results are unexpected or incorrect in a
particular way.
I think by now everyone reading this understands the problem. What's
not clear is what you are expecting to happen now.
Documentation has been updated to reflect awareness of this problem and
caution against using ns_returnfile in this situation and suggesting
alternate solutions in the client code.
Some code fixes have been proposed, which for various reasons are
undesirable or simply won't fix the problem.
A default configuration change was suggested which seems generally
viewed as undesirable.
What more are you looking for?
-J
There is no bug in ns_returnfile.
It is possible that you can modify your program (something like):
set tempfile myfile.txt
set tfd [open $tempfile w+]
file delete $tempfile
exec /some/external/program >@ $tfd
set tlen [tell $tfd]
seek $tfd 0
ns_returnfp 200 text/plain $tfd $flen
close $tfd
Tclsh Example:
tom@boron:~/Software/tcl/fd$ tclsh
% set filename hi.txt
hi.txt
% set tfd [open $filename w+]
file5
% ls
hi.txt
% file delete $filename
% ls
% exec echo "hi there" >@ $tfd
% tell $tfd
9
% seek $tfd 0
% gets $tfd
hi there
% gets $tfd
% tell $tfd
9
% seek $tfd
wrong # args: should be "seek channelId offset ?origin?"
% seek $tfd 0
% gets $tfd
hi there
% ls
% close $tfd
% exit
Working AOLserver example:
set directory [file dirname [info script]]
set filename [file join $directory tmp.txt]
set tfd [open $filename w+]
file delete $filename
exec echo "hi there" >@ $tfd
set flen [tell $tfd]
seek $tfd 0
ns_returnfp 200 text/plain $tfd $flen
close $tfd
## Trying this via telnet:
tom@boron:~/Software/tcl/fd$ telnet rmadilo.com 80
Trying 216.211.130.179...
Connected to rmadilo.com.
Escape character is '^]'.
GET /bugs/ns_returnfile.tcl HTTP/1.1
Host: rmadilo.com
HTTP/1.1 200 OK
Set-Cookie: SessionID = "C9C1F5F33E1B4FC8A462C794F994A7180A65E906" ;
Max-Age = 832044870 ; Path=/
Set-Cookie2: SessionID = "C9C1F5F33E1B4FC8A462C794F994A7180A65E906" ;
Max-Age = 832044870 ; Path=/ ; Version = 1
MIME-Version: 1.0
Date: Tue, 19 Aug 2008 21:25:30 GMT
Server: AOLserver/4.5.0
Content-Type: text/plain; charset=UTF-8
Content-Length: 9
Connection: close
hi there
Connection closed by foreign host.
tom jackson
My impression was that support was split about evenly, actually. I take
it that means you're against changing the default? I'm a bit surprised,
since you started out agreeing that it's a bug. Personally I can't
imagine any persuasive argument that a caching mechanism that can easily
confuse /usr/local/private/var/rootpass and
/var/tmp/verisign/certs/webcert.txt should be enabled by default in a web
server.
For anyone thinking, well, you're the only one who's ever seen this bug,
I'd say no, we're just the first ones to discover this bug. It's quite
possible that other people have run into it without knowing it, since
AOLserver will just silently serve the wrong data.
As for what I want, as I said, I was mainly bringing this up to shine a
light on the issue and see what other people's thoughts were. That's been
helpful in particular because I hadn't considered the security
implications, which are quite serious; I may raise this issue on security
forums as well so that people using ns_returnfile are aware of the danger
of silent data corruption and/or information leaks and can review their
code accordingly.
- John
This isn't a democracy. You have to demonstrate some understanding of
how things work.
The only real security issue is your misuse/abuse of ns_returnfile to
serve dynamic data.
Nobody is going to guarantee that you can't shoot yourself in the foot
due to your lack of understanding of writing robust code, or how to
configure and maintain a secure internet application, or take advice on
how to do so.
But please, go tell the security police about our insecure file
commands.
tom jackson
Oh, come on. Only if you're rapidly creating and deleting these files.
I think it's interesting, at least, that this topic has created more
traffic than the list usually sees in a whole year.
Rusty
Yes, I've explained the conditions several times. The point was that the
files can be in completely different locations in the filesystem with
completely different names, and may have secure contents.
Again: this is not an academic point. This is an actual bug encountered
in actual code, resulting in data corruption (effectively) and possible
information leakage--and all because "ns_returnfile X" may not actually
return file X. I don't doubt that there are other people who are also at
risk due to this behavior of ns_returnfile/fastpath.
If it's no big deal for you, great, but the security implications are
nonetheless serious.
- John
I haven't looked at a "directory change notification" type scheme in a
long time but that could be very clever. Aside from addressing issues
discussed here, the key benefit would be to avoid the repeated "stat"
syscalls. Those stat calls always bothered me conceptually but the
performance of the underlying systems always improved faster than my
irritation would grow to do something about it. However, we were
always careful to run websites against local filesystems - I would be
more concerned with the overhead if we were using NFS or some other
shared filesystem thing.
Somewhat related, the "dci module" (a series of AOL extensions we open
sourced awhile back) includes some content fetch/caching features
called "sob". That had the model you described -- things stayed in
the cache until either space was needed or the server received an
explicit flush message on a publish event. That approach worked well
and scaled well but it wasn't entirely general nor naive, i.e., it was
key that we understood how it worked under the covers and to make sure
the flush message links were reliable to avoid stale content problems.
Anyway, I've been pondering this whole discussion some more and agree
with Tom -- the fastpath isn't broken. It just does a certain thing
-- serves static files with a reasonable balance of performance and
stability -- and shouldn't be modified except to add notes about how
it works in the docs. I'm having trouble thinking through how it
could be modified to plug all possible race conditions. I'd suggest
the code snippets using fastpath for dynamic content should be
modified, perhaps some new Tcl commands could be added to make it
convenient, but otherwise it seems a mismatch between capabilities and
requirements.
-Jim
BTW: I believe the ns_returnfile command didn't use the fastpath
originally -- I think it just opened and sent the content. It was
changed because folks asked for it to go faster I think -- can't recall.
Anyway, for your app, it might be easiest to not change your code but
instead write a new ns_returnfile to override the builtin -- maybe
just with open and ns_returnfp.
-Jim
> I think it's interesting, at least, that this topic has created
> more traffic than the list usually sees in a whole year.
most of which isn't actually about the issue at hand, but rather
whether John is an idiot for expecting ns_returnfile to behave as
documented.
cheers
Russell
After all the talks, i think it's clear that the problem is with
fastpath, not with any other thing. Caches are not perfect. If you don't
like it, you can fix it, or deactivate it. You choose what to do.
If you don't want to deactivate it, and have some C skills, I would
recommend you to make the needed changes to fastpath code to enable it
to use the kernel facilities of the operating system (in case you're
using linux, then that'll be "epoll" system call; in FreeBSD case it's
kqueue; etc.).
For Linux, you'd only need to subscribe to the events on the files that
are stored in fastpath, so if a "bad" event happens to them (they're
deleted, modified, etc.), you can remove them from the fastpath cache.
It's as easy as that. And you'll fix this bug with ns_returnfile, and
with other functions that use fastpath and may be affected by this bug.
I am not being sarcastic. I am just trying to get the best exit to this
issue, so we all can get a good solution to this problem.
Regards,
Juan José
-
Juan José del RÃo |
(+34) 616 512 340 | juan...@simpleoption.com
Simple Option S.L.
Tel: (+34) 951 930 122
Fax: (+34) 951 930 122
http://www.simpleoption.com
It's not a race condition, actually; the code in
that example was serialized, so there's no race involved.
>...fastpath isn't broken.
It's designed in such a way that it can return
incorrect results (and ones that are wildly
outside of reasonable expectations). Whether or
not that's broken is a judgment call, and there
we apparently differ, though I find that
surprising--just because bad behavior can be
documented and avoided doesn't mean it's not bad behavior.
>Anyway, for your app, it might be easiest to not change your code but
>instead write a new ns_returnfile to override the builtin -- maybe
>just with open and ns_returnfp.
Yep, that was essentially my original suggestion
to the developers. I can guarantee you that all
uses of ns_returnfile will be receiving close scrutiny. :-)
On Tuesday 05:59 PM 8/19/2008, Juan José del RÃo wrote:
>If you don't want to deactivate it, and have some C skills, I would
>recommend you to make the needed changes to fastpath code to enable it
>to use the kernel facilities of the operating system (in case you're
>using linux, then that'll be "epoll" system call; in FreeBSD case it's
>kqueue; etc.).
This is an interesting suggestion, but from a
quick scan of the epoll man page it doesn't look
like it would work in this case since it acts on
an open file descriptor, but fastpath associates
file data with a (dev, inode, mtime, size) tuple
without keeping an open file descriptor (and it'd
be pretty wonky for AOLserver to keep open file
descriptors for all files currently in the fastpath cache).
No matter, though, we've got plenty of
workarounds, and we'll probably just disable
fastpath entirely since the benefits are likely vanishingly small anyway.
#ifdef _WIN32
key = file;
#else
ukey.dev = stPtr->st_dev;
ukey.ino = stPtr->st_ino;
key = (char *) &ukey;
#endif
Note similar code is in ADP. The downside to filename-based cache is
the keys are bigger strings instead of small fixed sized structures
(tiny) and double caching for files which are actually the same via
symlinks (could be large or nothing if you're using symlinks). Both
cases result in more memory but perhaps safer results. One positive
is the ns_cache_keys, size, etc. commands would show results for
fastpath objects by filename (I think they ignore non-string based
cache keys now).
In fact, I suppose you could just add this to the top of fastpath.c
and recompile to try it out:
#define _WIN32 1 /* Get Win32 filename-based cache keys */
As long as your filenames are unique names, this may give you the
results you're looking for (although I still think a dynamic app using
files should use open/cached fd's).
-Jim
Sorry John, i said "epoll", but i meant "inotify".
This sounds like the problem. Not a bug with fastpath, but a questionable decision to take an apparently generic command and apply a situation-specific optimizer to it that returns bad results when applied outside of its design parameters. It seemed like a valid assumption at the time, just as John's code seemed to be making valid assumptions, but it just seems to have led to asses being made left and right.
Is there any way to make ns_returnfile's use of fastpath configurable, and turn _that_ off by default? Or would that be essentially the same as just turning fastpath off entirely? I must confess my ignorance of where fastpath gets used.
Titi Ala'ilima
Lead Architect
MedTouch LLC
1100 Massachusetts Avenue
Cambridge, MA 02138
617.621.8670 x309
Titi Ala'ilima
Lead Architect
MedTouch LLC
1100 Massachusetts Avenue
Cambridge, MA 02138
617.621.8670 x309
> -----Original Message-----
> From: AOLserver Discussion [mailto:AOLS...@LISTSERV.AOL.COM] On
> Behalf Of Jim Davidson
> Sent: Tuesday, August 19, 2008 8:39 PM
> To: AOLS...@LISTSERV.AOL.COM
> Subject: Re: [AOLSERVER] Data "corruption" with fastpath caching
>
In my opinion, switching the caching index from inodes to file names
(like in windows)
is a very reasonable solution getting rid of most problems (although
less cache efficient).
It would be additionally a nice feature to provide a configuration
option for getting
back the current behavior (for people with tons of links). This option would
guarantee backward compatibiity. This way, one could savely let fastpath
switched on by default.
-gustaf neumann
-- respond "not found" if file doesn't exist
-- sets "last modified header" based on mtime
-- returns "not modified" if client sent "if-modified-since"
-- returns headers only if a HEAD request
-- sets mimetype and contentlength
-- caches, mmap's, or simply opens the fd and sends, chunk by chunk
The code is a bit nested, with a few layers of code which handle
similar cases (send file, FILE *, fd, etc.). It's not 100% perfect
because the file could change between the stat to check existence and
get size/mtime for header construction and the actual send. There's
even a note that say something like "silently ignore truncated
file..." deeper down so that if fastpath stats a file of size x but
opens and starts sending a smaller file, it'll just silently quit
after sending what it found. By comparison, the ADP template read/
parse/cache code attempts to catch that case -- it tries up to 10
times in ParseFile to get a consistent fstat/read on the open file
(you can see that in adpeval.c).
Overall, this means ns_returnfile is designed to mimic the core file-
send/fastpath stuff as if there was no Tcl code involved at all --
performance, pitfalls and all. Whether fastpath is significantly
smarter or dumber sending static content than, say, Apache, I can't
say. This also means that a new "-cache 1/0" option to
ns_returnfile (default 0 or 1, subject to debate) could be useful but
not quite the same as "just send this file..." One would still get
the result AOLserver would have generated (not found, not modified,
etc.) unless all that was disabled as well.
By comparison, ns_returnfp is more basic -- it assumes it's sending
stuff from an open channel, just sets basic headers, and has none of
the "not found", "not modified", etc. type logic. Because the docs
don't mention this, I think some folks expect ns_returnfile to be just
this:
set fp [open file]
ns_returnfp $fp
close $fp
BTW: I mentioned that I thought it worked that way in the past but I
think I was wrong. Checking the CVS logs and diff'ing the files with
rev 1.1 way back in 2000 ns_returnfile was calling the fastpath and
fastpath was using the dev/inode pair for the cache key on Unix.
Overall, it seems one thing to do would be to switch to filename-based
cache keys by default, leaving the dev/inode pair as an option for
folks who run sites with large symlinks and want to benefit from
caching objects just once. I think that should avoid the data
corruption cases John pointed out with minimal downside. We could
also modify ns_returnfile to behave as above, calling ns_returnfp, but
I'm thinking that may break more than it fixes if folks are assuming
the behavior it has now (e.g, an ns_returnfile of an image where they
expect the not-modified responses).
-Jim
Actually that wouldn't have fixed the problem in the code that led us to
find out about this in the first place. The change that I suggested does
fix that problem, though, and it directly addresses the limitation of
mtime's one-second granularity--which is the crux of the issue. The patch
below (really just a one-line change) implements this fix:
------- 8<
---------------------------------------------------------------------
--- aolserver-4.5.0-orig/nsd/fastpath.c 2006-04-19 10:48:47.000000000
-0700
+++ aolserver-4.5.0/nsd/fastpath.c 2008-08-19 21:22:26.000000000
-0700
@@ -507,9 +507,11 @@
}
if (servPtr->fastpath.cache == NULL
- || stPtr->st_size > servPtr->fastpath.cachemaxentry) {
+ || stPtr->st_size > servPtr->fastpath.cachemaxentry
+ || (time(NULL) - stPtr->st_mtime) < 2) {
/*
- * Caching is disabled or the entry is too large for the cache
+ * Caching is disabled, the entry is too large for the cache,
+ * or the file was modified too recently to be cached safely,
* so just open, mmap, and send the content directly.
*/
------- 8<
---------------------------------------------------------------------
We've tested this fix extensively against the code that was hitting the
bug before, and I can verify that it resolves the problem there. As far
as I can tell this fix would resolve the issue in any standard scenario
(and certainly all of the ones I've outlined thus far).
Given that this is a straightforward, user-transparent change that would
have only a negligible impact on fastpath caching, and considering the
security implications, I'd suggest that this change be applied to the
AOLserver code in CVS.
BTW, Jeff, the scenario you'd outlined that you thought would trip this
up...:
13:50:21 - create file
13:50:21 - serve file (gets cached)
13:50:21 - delete file
13:50:21 - create file again (reuses inode)
... time passes ...
13:55:11 - serve file
...actually wouldn't, because the file would NOT be cached in the second
line. The whole point of this strategy is that a file won't be cached if
it's been modified within the threshold time (2 seconds in the patch
above).
- John
> Actually that wouldn't have fixed the problem in the code that led
> us to find out about this in the first place. The change that I
> suggested does fix that problem, though, and it directly addresses
> the limitation of mtime's one-second granularity--which is the crux
> of the issue. The patch below (really just a one-line change)
> implements this fix:
all this traffic to use functionality in a way that wasn't intended.
i fail to see how the right problem is being solved here.
Fine, then change that first timestamp to 13:50:18 (say if you ran
another external program after creating the file but before serving it
that took more than 2 seconds, or if your external program backdated the
file mtime.) It's still a race condition that you'll hit if all the
stars are in the wrong place. And it still hurts the optimization of
using a 404 adp page to generate a heavyweight file only once that gets
cached.
If your patch solves your problem, that's great, and that's the whole
point of OSS. But it does nothing to solve the problem generally and
has negative side effects, so I think it would be a mistake to add it to
the general distribution.
-J
With the fix below with resolution for mtime at 1 second and the
"grace period" at 2 seconds I can see how it would work but it would
make me a bit queasy -- fixes which have assumptions of timing can be
fragile. With filename-based keys and unique filenames (which would
seem like a natural requirement from someone writing similar code),
the inode would be ignored and you'd get a consistent view, regardless
of timing or what other threads would be up to. I think you could try
this approach as quickly as your fix -- just define _WIN32 after the
"include nsd.h" to get the filename behavior of Win32. You could run
your test and see if it was stable as well -- I'd be curious.
Again, this whole issue is interesting and the problem report quite
subtle, justifying some sort of defensive fix but using ns_returnfile
for short, dynamic content still seems like the wrong approach. Ideas
to use a cache of open fd's via Ns_GetTemp or Tcl channels via
ns_returnfp seem closer to what's needed here.
BTW: Which OS is re-using inodes so quickly? I can't get my Mac OS/X
laptop to do that -- figured the inode re-use/prediction thing was
plugged years ago, e.g., when fsirand was introduced for scrambling
NFS vnodes.
-Jim
> BTW: Which OS is re-using inodes so quickly? I can't get my Mac OS/X
> laptop to do that -- figured the inode re-use/prediction thing was
> plugged years ago, e.g., when fsirand was introduced for scrambling NFS
> vnodes.
Linux. This tcl page:
set fn "/tmp/tmpfile[expr rand()]"
set f [open $fn w]
puts $f [ns_queryget data]
close $f
after 2
ns_returnfile 200 text/plain $fn
ns_unlink $fn
being hit at the same time in 2 windows:
$ while true; do res=`curl -s http://localhost:8000/crap.tcl?data=wxyz`;
if [ $res != 'wxyz' ]; then echo $res; break ; fi; echo -n . ; done
$ while true; do res=`curl -s http://localhost:8000/crap.tcl?data=wxyz`;
if [ $res != 'wxyz' ]; then echo $res; break ; fi; echo -n . ; done
Will cause one or the other test script to get the error typically
withing 5 requests.
-J
Perhaps, but not in this case. This is just another way to circumvent use
of the cache on an object-by-object basis (like the size restriction
that's already there), which recognizes the fact that using mtime as a
determiner of uniqueness is limited by the fact that mtime has a
granularity of only one-second. As for fragility, the fastpath algorithm
is fragile *now*, thanks precisely to its assumptions about timing. This
very simple change removes the bulk of that fragility.
>Again, this whole issue is interesting and the problem report quite
>subtle, justifying some sort of defensive fix but using ns_returnfile
>for short, dynamic content still seems like the wrong approach.
Whether or not that's so, the fact is that everyone on this list appeared
to share the same utterly natural assumption that "ns_returnfile X" really
will return file X--which turns out to be untrue solely because of
fastpath caching's design limitation. This fix resolves that design
limitation in any standard circumstance.
>BTW: Which OS is re-using inodes so quickly?
The ext3 filesystem on Linux and the ufs filesystem on Solaris both re-use
inodes in this way.
Jeff,
On Wednesday 10:56 AM 8/20/2008, Jeff Rogers wrote:
>John Caruso wrote:
>>BTW, Jeff, the scenario you'd outlined that you thought would trip this
>>up...:
>> 13:50:21 - create file
>> 13:50:21 - serve file (gets cached)
>> 13:50:21 - delete file
>> 13:50:21 - create file again (reuses inode)
>> ... time passes ...
>> 13:55:11 - serve file
>>...actually wouldn't, because the file would NOT be cached in the second
>>line. The whole point of this strategy is that a file won't be cached
>>if it's been modified within the threshold time (2 seconds in the patch
>>above).
>
>Fine, then change that first timestamp to 13:50:18 [...]
No, you're still not understanding how the patch works. If you change the
first timestamp to 13:50:18, the file will indeed be cached at
13:50:21--but with an mtime of 13:50:18. When the new file is served at
13:55:11, it will *not* result in a cache hit because the mtime will be
different. That's exactly how the patch fixes this issue.
>If your patch solves your problem, that's great, and that's the whole
>point of OSS. But it does nothing to solve the problem generally and has
>negative side effects, so I think it would be a mistake to add it to the
>general distribution.
I'm surprised you're taking such an all-or-nothing view now, given that
you started out being open to discussion. This patch certainly does solve
the problem generally--in all but what I'd say are pathological cases, and
certainly in any standard usage (like the various examples I've
posted). And it does it by directly addressing the fastpath algorithm's
reliance on mtime, which has only one-second granularity.
On Wednesday 10:56 AM 8/20/2008, Jeff Rogers wrote:
>And it still hurts the optimization of using a 404 adp page to generate a
>heavyweight file only once that gets cached.
...which you'd explained elsewhere as...:
>There is also at least one clever optimization where "static" content
>does get served within a second of being created, where the 404 page is
>used to generate something like an image from something like a database
>and writes it to a file where it is subsequently served by fastpath.
...this fix doesn't break this functionality. You can still do it and
it'll still work. And in fact, others have been arguing (and I believe
you've been agreeing) that serving anything other than truly static
content with ns_returnfile is immoral anyway--so it seems more than a bit
contradictory to use that exact case to argue against the fix.
In any case, though, assuming that this once-generated heavyweight file is
actually reused multiple times, it *will* be cached...just not for 1-2
seconds. I think the negligible cost of not having caching for
just-created files for a period of 1-2 seconds is more than justified by
the necessity to patch such a serious data corruption and security hole.
(I used two seconds out of an excess of caution, BTW; one would be
sufficient.)
- John
> Whether or not that's so, the fact is that everyone on this list
> appeared to share the same utterly natural assumption that
> "ns_returnfile X" really will return file X--which turns out to be
> untrue solely because of fastpath caching's design limitation. This
> fix resolves that design limitation in any standard circumstance.
use ns_returnfile for static data as it was intended or you will have
problems.
how is this thread still alive.
The patch below is not at all unreasonable as far as your stated goal of
not caching a newly modified file.
BTW, I think that fastpath is not enabled by default, your config has to
have the fastpath config section with the cache parameter set to true.
Also, ns_returnfile is really just an internal C api which has been
exposed to Tcl so that developers can create specific file handlers. In
general it is "dangerous" to serve content from outside of pageroot. But
that just means that the developer has to put a lot of care into
avoiding problems. In other words, ns_returnfile is not really intended
as an end user API. It usually ends up as some more friendly procedure
that handles things like relative paths, security, etc.
tom jackson
All, I've been on vacation or I would have chimed in earlier, but as John's
client and CTO of the company who found the problem (and is now faced with a
fairly extensive and difficult impact assessment to determine whether the
confidentiality and integrity of our customers' data has been compromised),
I find the suggestion that this is not a bug to be utterly baffling.
Perhaps if the procedure in question was called "ns_returnfromcache", I
could see the arguments against the behavior being considered a bug, but the
name of the procedure is "ns_returnfile", and it takes an argument which is
a filename. Our objective in using the procedure was not to return a
dynamic file through the cache, it was to return a dynamically generated
file (which was produced by an exec of an OS-level command) from the
filesystem...and the documentation for the procedure certainly did not
suggest that its functionality did not support this usage.
Obviously we'll work around the problem in the future, but it is
disheartening to find a fairly subtle bug, report it with a reproducible
test case, and be challenged so aggressively on the whether it was a poor
decision to use "ns_returnfile" to...um...return a file.
Eric
__________________________
Eric Larkin
Chief Technology Officer
Arena Solutions
ela...@arenasolutions.com
4100 E. Third Ave.| Suite 300 | Foster City | CA 94404
tel: 650.513.3502 | fax: 650.513.3511
> how is this thread still alive.
I think this bikeshed should be painted blue.
See http://www.bikeshed.com/ if you don't understand this.
-J
> No, you're still not understanding how the patch works.
Ok, I'll admit that I misread it at first, but you're also not
understanding why I'm saying why it will still break.
> I'm surprised you're taking such an all-or-nothing view now,
I don't think I'm taking an all-or-nothing view at all, I just think
your solution isn't the right one.
> given that
> you started out being open to discussion. This patch certainly does
> solve the problem generally--in all but what I'd say are pathological
> cases, and certainly in any standard usage (like the various examples
> I've posted). And it does it by directly addressing the fastpath
> algorithm's reliance on mtime, which has only one-second granularity.
"pathological cases" are exactly the problem. I'm only speaking for
myself of course, but I suspect that at least a few others would agree
that your case is pathological itself, and not at all standard usage.
I can very easily come up with a scenario that breaks your patched
fastpath just as easily as the original, to which you can rightly say,
"but why would you do it that way?". And you would be right.
That is the exact same thing that has been said repeatedly on this
thread to you: why are you doing it that way? You probably have valid
reasons and in any case I'm in no position to question your reasons.
That doesn't make your case any less pathological than some other one.
-J
Eric,
I'm not sure what your qualifications are to determine if it is a bug or not. The author of the code
doesn't seem to think it is a bug. Everyone agrees that the code works as intended. It was no secret
at the time the code was written that the file mtime granularity is one second. When fastpath was added
many years ago, it was documented in the changelogs. There are configuration parameters in the config
file.
I just sent an email responding to John's suggested patch. It is a great suggestion for several reasons,
the most important is that it doesn't change the intended purpose of the cache or the API. As John said
there is no visible impact on the user. I would even go so far as to suggest that the wait time (2 sec)
be added as a configuration parameter. Although the semantics should be discussed.
This patch may fix your initial problem, but it does nothing to fix the broken use of ns_returnfile. If you
are serious about not exposing sensitive information, don't write it to disk as a file. Most security breaches
don't happen by accident. I have outlined how you can avoid the problem using ns_returnfp, _AND_ a
particular series of commands. No single API will serve as some kind of shield of protection, it takes
a lot of effort. Anything involving files opens up a whole series of problems. They are not bugs.
tom jackson
Do it, then. This is the simplest example I've given that exhibits this
bug:
eval exec /some/external/program --output-file $tempfile
ns_returnfile 200 text/plain $tempfile
ns_unlink $tempfile
This is ALL that's required. No external meddling, no munging of file
modification times, nothing else. Three lines of Tcl code. By all means,
show me an example that defeats this patch that's anywhere near as simple
as that.
And far, far more to the point: that's as NATURAL as that. I'd assert
that that example code is perfectly intuitive--and right up to the point
where I pointed out this bug in the first place, it would have been
accepted without remark on this mailing list. *Now*, of course, serving
anything but 5-year old files with ns_returnfile is proof that one should
give up computers for a living, and not recognizing the holy contract we
enter into with AOLserver when we invoke ns_returnfile that our files must
continue to exist for at least one more second is a valid reason to
contemplate suicide to end our worthless existence.
Look: we discovered a serious bug in AOLserver's fastpath caching
mechanism that can cause both data corruption and information
leakage. I've explained that bug carefully, in the face of confusion,
obfuscation, and a continual stream of utterly unnecessary abuse. I've
offered a tested patch that provides a minimal, correct-by-inspection,
user-transparent, massive improvement over the current behavior, despite
my near certainty (based on previous experience) that it'd just lead to
another round of carping, finger-pointing, and refusal to accept that
there's even a problem here in the first place. I've explained why the
patch makes sense (and in fact addresses exactly the limitation that
should have been considered when the fastpath caching mechanism was
initially designed). I've responded to all serious concerns, respectfully
and without returning any of the bile that's been sent my way.
For anyone who's serious about securing your installation, you have the
patch now, and I'd strongly suggest that you apply it to your AOLserver
sources. I'll still respond to serious concerns that don't just rehash
the same excuses, but otherwise I'm done.
- John
I agree with Eric, even though I wrote the original code and was one
of the first to suggest is wasn't a bug. This thread has surprised me
in a few ways:
-- The bug was indeed subtle and curious
-- The debate on dynamic vs. static and underlying assumptions and
performance was well reasoned
-- The name of the command does indicate something it is not in a way
that matters
-- Folks generally underestimated the impact of this bug except for
those affected
-- The direct personal attacks were a bit embarrassing to watch
Overall, seems like:
-- Patching underlying fastpath to be filename-based keys makes sense
if it's confirmed to solve the problem
-- What ns_returnfile does is good for some things if you know what
it's doing
-- Sending dynamic content with the current ns_returnfile isn't a good
idea
For Eric and John, I'd recommend a Tcl-based ns_retunfile wrapper
using open/ns_returnfp/close as a quick first step.
BTW: The underlying cause -- the rapid re-use of inodes -- is indeed
a behavior of at least Linux but not Mac OS/X. Compare:
Linux:
jimbo@ubuntu:~$ rm -f foo ; touch foo ; stat -c "%d.%i" foo ; rm foo ;
touch foo ; stat -c "%d.%i" foo
2049.712963
2049.712963
Mac OS/X:
[JimBook:~] jimbo% rm -f foo ; touch foo ; stat -f "%d.%i" foo ; rm
foo ; touch foo ; stat -f "%d.%i" foo
234881026.4090565
234881026.4090566
I find this very interesting.... But, maybe I'm odd, I know most of
you have probably long since begun to ignore this thread and it should
probably die now...
Cheers,
-Jim
I just think that you're introducing complexity to a common and widely
used infraestructure (with a hack), just to hide the flaws on your side.
I don't like this hack. It's not the correct thing to do, not even if
you make some parameters (like the "timeout") changeable via the
configuration file. It just doesn't feel good to me.
But, of course, you are free to patch your own AOLServer. I can't (and
won't) complain about it.
Someone said that this was not an academic discussion... but I am afraid
that to some extent it is. We've been speaking that the cache can't know
if its contents are good or not, if the changes happen out if its scope,
and it's never notified of it.
But, since Linux version 2.6.13, inotify is into the kernel, and
aolserver can subscribe to a path, so know if that file has been
deleted, modified, or anything else. That's the way a cache can know if
the file has been altered in any way, and it should be marked as dirty.
As easy as that... but I know it is harder than that one-line-patch.
Now you can do whatever you want... as long as you don't ask me to patch
my code with the dirty hack you proposed. I don't like having
AOLServer's SVN version patched with this.
Best Regards,
Juan José
-
Juan José del RÃo |
(+34) 616 512 340 | juan...@simpleoption.com
Simple Option S.L.
Tel: (+34) 951 930 122
Fax: (+34) 951 930 122
http://www.simpleoption.com
inotify isn't available as of Redhat Enterprise
Linux 4, and I'm sure there are other major
distributions that are missing it as well--so
while it may be the best approach eventually it
won't solve the problem right now. Also, not all
platforms use inotify, so adding filesystem
monitoring support would require complex
cross-platform code (which in some cases won't
even be available). So while I'd agree with you
that the eventual (ideal) solution is a major
rewrite of the fastpath caching code to use some
sort of filesystem monitoring technique, that won't work now.
By comparison, the fix I offered will work just
fine on all the platforms that AOLserver runs on,
right now, and it fixes all non-pathological use
cases (i.e., all but ones that are artifically
designed to trip it up). As to whether it's a
hack, possibly, but so is the fastpath
code--that's the whole problem. The patch
corrects a specific flaw in the original code:
that fastpath never should have been caching
files that were modified within the last second,
since the underlying OS mechanism (mtime) doesn't
provide the resolution necessary to distinguish
reliably between two such files. In other words,
the patch is in the same spirit as the original
code--so if you really don't want such ugliness
in your code you should probably just strip out
fastpath caching entirely. :-)
[after 2] would wait 2 milliseconds. [after 2000] or [ns_sleep 2]
might make a difference(?).
Oops, sorry. I thought RedHat Linux had everything an Enterprise
needed ;-)
(Take the above line with a lot of salt. I don't want to discuss about
linux distros... yet ;-)
Then what about adding a "-nocache" parameter to the function? That way
you won't modify the original behaviour...
As far as the solution is not perfect, if we keep pushing up patches,
it'll end being a hell consisting on a pile of patches. In fact, I feel
that checking the filename instead of last modification time is a better
way to "patch" the "if" clause. It's safer to be sure that will not be
collisions by that way.
Anyways, as someone said, I am amazed on the fact of how ext3 reuses the
inode numbers... I didn't know it was so aggresive :-)
Regards, and hope you finally can serve your customers right (no matter
how you solve this problem),
Juan José
-
Juan José del RÃo |
(+34) 616 512 340 | juan...@simpleoption.com
Simple Option S.L.
Tel: (+34) 951 930 122
Fax: (+34) 951 930 122
http://www.simpleoption.com
Your last patch suggestion seems good, not caching something that looks
like a new file is fully in line with the intent of fastpath and
ns_returnfile.
I'm not sure everyone is commenting on this new patch idea, maybe a
previous idea?
Anyway if the cache is for serving static (and likely older than a few
seconds) content, why cache it until it looks static? I think that is
the basic thrust of the patch. It would be impossible to poison the
cache by accident if you wait until a file ages a few seconds before you
stick it in.
But the poison is really self-inflicted. The only data on the server
that can be unintentionally exposed is something that made it into the
cache in the first place. Stuff only makes its way into the cache by
filename. So if your application sends out secret files via
ns_returnfile, you could have a problem, but no long lived secret file
will ever fall into this trap by accident, it would have to cease to
exist (giving up its inode) after being place into cache, then a new
file would have to be created and served. All of this assumes that your
webserver process can read the secret file.
BUT! Please pay attention to this: ns_returnfile is by definition "not
safe" in the context of most webserver APIs. It returns content that is
outside of pageroot. It exposes (to Tcl) an internal API which handles
returning files under pageroot. The reason it is exposed is so that
developers can easily create their own filehandlers and virtual servers.
The internal API handles a handful of annoyingly picky but important and
standard website and HTTP features. It also provides a number of hooks
for customization related to these features.
tom jackson
> I looked at the code a bit closer. The ns_returnfile and
> ns_respond commands both call Ns_ConnReturnFile, the public API to
> the underlying FastPath. It does more than just blast the content
> -- it handles:
>
...
> -- caches, mmap's, or simply opens the fd and sends, chunk by chunk
not directly related to the issue at hand, but if this is being
worked on shouldn't the file returning be handled by sendfile() on
platforms that support it? that'd bypass fastpath, but on linux at
least you'd not be wasting RAM by buffering in both the page cache
and fastpath, and spitting the data back out the connection socket
gets handled entirely in the kernel...
just a thought...
cheers
Russell
I've remained silent on this issue because I didn't want to be accused
of "stifling the community," etc., but I have to agree: the majority of
responses is disappointing. As I see it:
1) This is a defect. Jim has explained that the caching strategy
implemented for fastpath solved a specific problem and may fail in
situations not meant to be solved by the strategy.
2) Attacking John, and by extension, Eric and his company's product, is
poor form, regardless of whether this is or isn't a defect.
3) While I agree that serving short-lived data from a disk file in
response to a dynamic request is a poor strategy, it _should work_
correctly--which is why I agree that this is clearly a defect. If it
works correctly at a cost to performance is a decision that Eric's team
must evaluate and accept or find alternatives to. The trade-off should
not include "program execution correctness" as a consequence.
4) I see the simplest (best?) solution here being a configurable
parameter that controls fastpath's cache key generation. As Jim points
out, one can quickly test whether this would solve the problem at hand
by temporarily #define'ing _WIN32 in the appropriate place. If this
proves successful, we change it from using #ifdef's to regular if()
statements and define a new configuration parameter. End of discussion.
--
Dossy Shiobara | do...@panoptic.com | http://dossy.org/
Panoptic Computer Network | http://panoptic.com/
"He realized the fastest way to change is to laugh at your own
folly -- then you can let go and quickly move on." (p. 70)
Overall, a key design assumption for AOLserver has always been that,
while respectable, it's not focused on sending static files (not
withstanding this thread about sending *dynamically created* files).
For example, at AOL the static stuff was always on separate systems --
a specialized "artblaster" (architecturally an AOLserver opposite) and/
or some commercial, geographically distributed, and generally smart
CDN -- leaving AOLserver to mostly ADP tasks.
-Jim
...
> End of discussion.
Accused. Guilty.
----
Don Baccus
http://donb.photo.net
http://birdnotes.net
http://openacs.org
However, to let it continue to spin over and over is unproductive. I'd love to hear other solutions other than configurabe cache key strategies or a time-based delay caching strategy, but the time to debate whether this is a defect or not is officially over: it is a defect. Let's find the right technical solution to make fastpath more robust, please.
------Original Message------
From: Don Baccus
Sender: AOLserver Discussion
To: AOLS...@LISTSERV.AOL.COM
ReplyTo: AOLserver Discussion
Sent: Aug 21, 2008 12:25 PM
Subject: Re: [AOLSERVER] Data "corruption" with fastpath caching
...
> End of discussion.
Accused. Guilty.
--
Dossy Shiobara
do...@panoptic.com
I'm not sure if I've mentioned this on the list, but the initial case that
prompted us to discover the bug would not have been helped by this change;
the main difference is that it would have shortened the debugging
efforts. And this change would also alter current behavior in a major
way, by defeating the explicit goal of having hard-linked files served
from the same cache entry. A fix that doesn't resolve the initial problem
and which has undesirable side effects isn't worth pursuing.
The change I offered is the only one that corrects the problem in the
original code and all the other examples I've mentioned, transparently and
without any serious side effects. Simply put, fastpath as designed should
not be caching items that have been modified within the current second,
because mtime's granularity (in combination with inode reuse) doesn't
allow it to distinguish them from other items in the cache. Based on what
Jim's said here, I'm guessing he wasn't thinking much about the mtime
granularity issue because he thought inodes would be more than enough to
ensure uniqueness, but that's not always the case (in fact it's usually
*not* the case, since the two most widely-used filesystems reuse inodes).
In thinking about this further, barring a complete rewrite I don't think
the fastpath caching mechanism *can* be fixed completely--it can only be
improved.
- John
I have responded twice to John's newest patch idea, which is a one line
patch. It appears to completely eliminate any problem with cache
poisoning. It is simple, it doesn't change the semantics of the command
or anything else. It simply works around a known limitation of the stat
mtime granularity.
The only security issue that was exposed was the misuse of
ns_returnfile. All of the data put into cache were entirely under the
control of the AOLserver process. The developer / maintainer of that
process is responsible for everything the process does. ns_returnfile is
an inherently dangerous API, there is no handholding involved. You have
to understand what it is doing and why it exists.
In fact, John even pointed out that the original code which wrote out
the contents of the file reused the same name over and over. Assuming
that you can know that the contents of a file have not changed just
because it has the same name, same mtime and same size is an invalid
assumption, it will always be invalid. All caches have the same
limitation. By definition they are not in sync with the true copy.
Anyone who uses a cache needs to understand this.
So, this is important, John is not interested in the cache, he actually
wants to avoid the cache. So talking about how stuff is stored in the
cache, and under what key, is unimportant for John. He wants to keep his
newly created file from ever getting into the cache.
And this is where he has a point, a very good one. Why put newly created
files into a cache, if the point of the cache is to handle static files?
We can wait for evidence that it is static. In this case, we can wait
until it is a few seconds old, at least. John's patch does exactly this
and nothing more. It is actually a very ingenious change.
There is no difference between the inode and the filename under unix.
Both offer equal opportunity to screw up due to a race condition. It can
still happen even in the patched ns_returnfile. Jim mentioned this.
After a file is stat'ed, the open might find a different (maybe
truncated) file. There is no guarantee that you won't get something
else, especially if you have multiple processes/threads creating files
in an non-synchronized way. It is not part of ns_returnfile to guarantee
that the contents/age of a file remains unchanged during the course of
execution, and when you throw in an external process it is nearly
impossible to come up with any code which can provide that guarantee. If
data integrity is really important to you, don't try to provide it using
named files as temporary storage.
tom jackson
> However, to let it continue to spin over and over is unproductive.
> I'd love to hear other solutions other than configurabe cache key
> strategies or a time-based delay caching strategy, but the time to
> debate whether this is a defect or not is officially over: it is a
> defect. Let's find the right technical solution to make fastpath more
> robust, please.
The solutions I've seen proposed are
* configurable cache key
+ solves the problem for the most common and surprising case
- does not solve the OP's problem
? what to make the default?
* time-delay
+ solves OP's problem
- still easily broken
* flush files from cache with ns_unlink
- easily defeated (by using different command to remove files)
- links mostly unrelated commands
* exclude some paths from caching (e.g., don't cache if the file is
outside of pageroot, or is in some configured list of non-cached
directories, like /tmp)
+ keeps temporary files from taking up cache memory
- still easily broken
- developer needs to know to use excluded paths
* add a -nocache flag to ns_returnfile (or add a new command like
ns_returntmpfile) to indicate that the file shouldn't be cached
+ makes intent clearer
+ keeps temp files out of cache
- bloats api
- doesn't fix existing cases
* change fastpath default to off to be turned on only when desired and
understood
+ simple, effective
- default performance hit
* use system-provided change notifications to flush cache (e.g., inotify)
+ effective, efficient
- non-portable
Did I leave any out?
-J
This is a very good point, actually. Initially I didn't think having the
value be configurable was a good way to go--it should just be set to 1,
since that's the threshold value (yeah, I used 2 out of an excess of
caution, but 1 would have worked fine and would probably have created less
confusion).
But having this as a configurable (something like fastpath.min_cache_age?)
would give people the ability to say: do NOT cache anything unless it's at
least X seconds old. Which actually does seem like a genuinely useful
thing on its own (completely independent of this issue), since it would
allow people to prevent new/"dynamic" files from kicking good data out of
the cache. And enforcing a minimum value of 1 would serve the dual
purpose of resolving this problem as well.
- John
The remaining issue is whether something called "ns_returnfile" which takes a pathname as a parameter should have some guarantee that you will return what at least at some point was the contents of a file with that pathname. It's perfectly acceptable in dealing with caching systems that the cached value could be out of sync, but not that the cached value could be for something entirely different from what you were looking for. Even with the mtime fix there's no guarantee that systems which muck around with mtime (such as tar) won't cause separate files to collide. For a contrived example:
1. tar xf foo.tar (creating two files "a" and "b" with the same size and same mtime)
2. ns_returnfile b
3. Delete files "a" and "b"
4. tar xf foo.tar
5. ns_returnfile b (this could return the contents of "a" because the inode was reused)
I don't think this example violates any of the stated principles of using ns_returnfile for only "static" data. Both "a" and "b" could have completely stable contents and due to some minor issue of system administration (for example) their inodes could end up swapped and the cache poisoned.
So I think we need both fixes, one to eliminate caching unless a certain criterion of "static-ness" has been met, and the other to prevent the cache from returning completely unrelated data. Other caveats about ns_returnfile use still apply, and the documentation should reflect them.
Now the only people this wouldn't satisfy are those who are concerned about pathnames taking up space in the cache or slowing it down. The option has been suggested to make pathname inclusion optional, though I would advise against it unless the configuration option is named in such a way as to indicate its "unsafe"-ness.
Titi Ala'ilima
Lead Architect
MedTouch LLC
1100 Massachusetts Avenue
Cambridge, MA 02138
617.621.8670 x309
> -----Original Message-----
> From: AOLserver Discussion [mailto:AOLS...@LISTSERV.AOL.COM] On
> Behalf Of Tom Jackson
> Sent: Thursday, August 21, 2008 12:25 PM
> To: AOLS...@LISTSERV.AOL.COM
> Subject: Re: [AOLSERVER] Data "corruption" with fastpath caching
>
Rusty
> what you were looking for. Even with the mtime fix there's no
> guarantee that systems which muck around with mtime (such as tar)
> won't cause separate files to collide. For a contrived example:
I think the best you can do is to use ctime instead of mtime, or maybe
btime on *bsd. You can still run into problems if you have clock skew,
but there's only so much you can account for.
-J
Titi Ala'ilima
Lead Architect
MedTouch LLC
1100 Massachusetts Avenue
Cambridge, MA 02138
617.621.8670 x309
> -----Original Message-----
> From: AOLserver Discussion [mailto:AOLS...@LISTSERV.AOL.COM] On
> Behalf Of Rusty Brooks
> Sent: Thursday, August 21, 2008 3:25 PM
> To: AOLS...@LISTSERV.AOL.COM
> Subject: Re: [AOLSERVER] Data "corruption" with fastpath caching
>
> I don't have any opinion on the fix, but I think the actual objection
> to
> using the filename in the fix is that this would cause hard links to
> files, which are for all intents and purposes The Same File, to be
> considered different files by fastpath. (Hard links have different
> names, but the same inode)
>
> Rusty
But this is a cache, which is a copy. There is never any way to
guarantee that the content is the same as what is currently on disk,
unless you compare the files directly. Of course that totally negates
the purpose of the cache. If this is what you want, you should disable
the cache completely.
BTW, if you just delete the fastpath config parameters from the config
file fastpath will be disabled, so it is disabled by default right now.
I'm also wondering if the inode/dev key just catches hard links. I think
it also works via indirection with symlinks? Both stat and open follow
symbolic links, so the inode is probably more stable than the filename
on unix.
Anyway, trying to guarantee anything about files is a lost cause.
tom jackson
On Thu, 2008-08-21 at 12:46 -0700, Jeff Rogers wrote:
> Titi Alailima wrote:
>
> > what you were looking for. Even with the mtime fix there's no
> > guarantee that systems which muck around with mtime (such as tar)
> > won't cause separate files to collide. For a contrived example:
>
> I think the best you can do is to use ctime instead of mtime, or maybe
> btime on *bsd. You can still run into problems if you have clock skew,
> but there's only so much you can account for.
Yes -- the original reason for dev/inode on Unix instead of filename
was to reduce memory consumed in the case of a large # of symlinks or
hardlinks to the same file. This was the case for AOL's
digitalcity.com back in 1999. For better or worse, the "AOL" in
AOLserver means AOL was used as the model and we never had an example
of dynamically creating files to be returned by fastpath (this isn't
100% true -- see below).
Now, had we known 10 years ago that inodes could be rapidly reused as
John pointed out, we would have never written the code to use dev/
inode as opposed to filename keys. It simply cannot be strictly
correct given the non-atomic nature of the stat() before the open().
It may have been available as an option, but certainly off by default
and likely undocumented.
So, here's what I'd suggest:
-- Cache by filename key should be the default. This is technically
the correct fix to enable temporary, uniquely named files, to be
returned via ns_returnfile.
-- John's "grace period" code is a clever optimization if fastpath is
being used in this way and could also be an option, default off.
To clarify one point: There is no technical solution to creating temp
files with the same name and avoiding the race condition without
additional synchronization. This is expecting a "private per-thread
view" of the global filesystem namespace which isn't strictly possible
even if some sort of heuristics (such as the grace period trick above)
nearly eliminate the possibility in practice. The Unix file system
has always had atomic features which have traditionally been used to
solve for race conditions using the uniqueness of filenames (via
O_EXCL open/creat flag and atomic rename). It's reasonable to assume
programmers understand and leverage that fact; it's not reasonable to
attempt to solve for cases they don't. You can see some careful
tempfile creation code in Ns_GetTemp in fd.c if interested.
Finally, for those still awake and sober: The "not 100% true" comment
above was that there was one solution at AOL where ADP code would on-
demand create new ADP code. The idea was to do some "hard" work
(query database, fetch from remote, etc.) while leaving other ADP code
to do "easy" stuff on the fly (select a promo, etc.). AOLserver now
has direct support for this with it's new caching stuff in 4.5 but
back in '99 that didn't exists. So, technically this was a case where
we dynamically created code which was later read by ADP (which had the
same dev/inode cache stuff as fastpath). However, this was done
carefully:
-- Tcl-level mutex/condition variables to ensure only one thread did
the "hard" work even if several were interested in the result
-- Careful write to a non .adp extension, unique temp file
-- Atomic rename in place when ready
It was a combination of traditional atomic Unix filesystem semantics
and newer thread synchronization at the Tcl level used to avoid ever
getting some mutant result.
BTW: If you're still interested and awake, check the ADP cache/parse
code -- it has some code to detect modification in place during parse
even though at AOL we would have never allowed that possibility (we'd
use the atomic rename technique describe above). That's one case were
code was carefully written to mitigate poor programming practice we
ourselves would never have allowed (a programmer who had written such
code would have ended up in my office for an uncomfortable chat). I
seem to remember trying for a very long time to exercise the fail
cases as things rarely would trip up even under high load but we made
sure the code was in there anyway because it was closer to correct and
blindly writing a file in place is sadly quite common.
-Jim
Yep. I think that aspect of the issue has been getting lost--it's not
just about getting stale data from a given file, but getting data from an
entirely *different* file, which I'd agree violates any reasonable
expectation of a caching system.
>So I think we need both fixes, one to eliminate caching unless a certain
>criterion of "static-ness" has been met, and the other to prevent the
>cache from returning completely unrelated data.
You make a good point. The fix I suggested is intended to make fastpath
caching behave well in all cases where time proceeds monotonically (which
I'd guess is by far the most common use case, especially for a web server
that's unlikely to call utilities like tar/rsync that would munge file
times). In fact that's essentially what I mean by "pathological". But to
protect against the time-travelling scenarios causing fastpath to confuse
two different files, you'd have to use the filename-as-key fix as well.
Using the filename as a key is a bummer for sites like AOL that want the
cache to respect hard links when serving data from the cache, though. It
won't matter for us either way, since we're not that concerned about the
cache in the first place--just about the ways in which it might return the
wrong data.
(In case anyone's wondering: Arena's web application pre-caches large
amounts of data when it starts, and web servers aren't put into rotation
until they've finished this pre-caching step. The web servers are also
bathed in RAM, so it's unlikely that fastpath caching is offering much of
a performance boost over the Linux page cache--especially when that time
is set against the overhead of database access.)
- John
Here is an example using Tcl level commands (although a hidden use of
mutex/condition vars:
# save datastore to file
# Note: a data.tmp file is created. If writing to this
# succeeds, this is renamed to the data file, hopefully
# atomically replacing it
# Note2: the data.tmp file is not a lock file, it is used
# to avoid a half written file in the event of power loss
# or process exit.
proc ::datastore::save { store } {
....
set LockID [lock $store]
if {[catch {
set FD [open ${dataFileroot}.tmp w+]
fconfigure $FD -translation binary -encoding binary
puts $FD "$out"
close $FD
file rename -force ${dataFileroot}.tmp $dataFileroot
} err ]} {
unlock $store $LockID
error $err "::datastore::save error saving store $store"
}
unlock $store $LockID
}
But it is very difficult (impossible) to safely read/write files unless
you can synchronize access (you need cooperation) and/or use atomic
file operations (serialize access). The above example uses both.
John,
Your patch fixes this issue as best it can be fixed. The issue that Titi
is addressing cannot be fixed. With your patch you can be sure of the
following:
1. All cache entries are unique. You can't create and cache two files
with the same inode and mtime and have both be over 1 second old.
2. When an inode is reused (by the filesystem) the associated file mtime
will be larger than the mtime of the cached entry. The new entry will
replace the old entry.
3. If several files point to the same inode, updating this entry updates
all of them (there is only one cache entry if the inode is used.)
4. If a file changes on disk, and it is less than 2 sec old, it is
served directly, skipping the cache.
tom jackson
To clarify as well: the original code didn't involve a race condition--it
was effectively serialized, as though it were like this snippet:
foreach object $objects {
eval exec /some/external/program --output-file $tempfile --object
$object
ns_returnfile 200 text/plain $tempfile
}
(As I mentioned to you, this was basically a batch process driven by a
client-side Java applet making sequential HTTP requests to an
AOLserver-driven API web server, one transaction at a time, with the
results being returned by ns_returnfile on the server. Also, the temp
file in question was in a secure directory.)
So the bug can (and did) manifest itself with serialized access.
>So, here's what I'd suggest:
>
>-- Cache by filename key should be the default. This is technically
>the correct fix to enable temporary, uniquely named files, to be
>returned via ns_returnfile.
>-- John's "grace period" code is a clever optimization if fastpath is
>being used in this way and could also be an option, default off.
Again, this wouldn't have resolved Arena's initial problem; the original
code would still have hit the bug, and it would have been just as
difficult to detect that that was happening (though slightly easier to
debug). That's why I'd recommend having the mtime workaround code active
with a default of 1--otherwise people running a default config of
AOLserver will still be open to the same issue.
(That's my only stake in this, BTW; Arena is already using the mtime fix
and will continue to do so, but I'd really rather not have someone else
run into this issue in the future.)
In thinking about it today I realized that it's useful to think about the
four scenarios in which the bug can currently occur (which I believe
partition the bug space):
1) Monotonically increasing time with a different filename
2) Monotonically increasing time with the same filename
3) Time travelling with a different filename
4) Time travelling with the same filename
("Time travelling" here means mucking with the mtime artifically, a la
rsync, and "filename" means fully-qualified filename.)
The mtime workaround resolves scenarios 1 and 2, and using the filename as
the cache key resolves scenarios 1 and 3. Nothing suggested so far
resolves scenario 4--and in fact I don't think it's possible to resolve
scenario 4 short of a major rewrite of the code (like Juan's suggestion of
using inotify or similar functionality). So combining both fixes resolves
all of the resolvable issues.
For reference: the bug occurred in scenario 2, and subsequently in
scenario 1. And the security implications apply to all four scenarios,
though they're arguably worst in scenarios 1 and 3.
- John
So, this clarifies to me:
-- cache by filename key is correct and good for most cases and should
be on by default
-- the "grace period" is a clever solution for the rapid-changing,
same filename case you described and deserves to be on by default
-- ns_returnfile shouldn't use the cache but already does -- some
config and/or command flags can be added to toggle the behavior
I'll update the code with the options above.
-Jim
Can I ask why the filename is important for the cache key? With the
cache delay, the inode/dev + *time + size should do it all.
In fact, I finally understood the difference between mtime and ctime, if
any change is made, it should be the change to ctime.
Why ctime? ctime is unique in that it isn't something that can be set by
user level programs. It changes whenever the content of the file changes
or permissions, owners, or any of the metadata of the file.
So, for instance, if someone replaces a file with an identical file, the
ctime would still change. If you check the ctime, you can also skip
checking the size.
But none of this has to do with the filename. On Unix, filenames are
especially squishy. both stat and open follow symbolic links, and you
could therefore use a symlink to point to different files over time, but
the files could have identical mtime, atime, size, owner, etc. using
stat/open and comparing with what is in cache based upon the filename
would never detect this slight of hand. This is why the inode is most
important on unix. On windows, you can't do this, so filenames are safe.
My recommendation for changes are these:
1. use John's ingenious 1-2 second delay, optionally allow config of the
cache delay.
2. base the age for the above on the ctime, not the mtime. ctime is
always younger or the same age as mtime, and covers changes to metadata,
and is immune from easy modification.
Maybe: remove the fastpath config options from the basic config file, if
it is even there, other example configs could be set to cache = off.
Happy Weekend everyone.
tom jackson