fs.write Data Munging


Andy Kent

Nov 5, 2010, 11:52:40 AM
to nodejs
We are running node 0.2.4 in a production system.

We write line-separated JSON objects to disk for every request using
something like

fs.write(fd, JSON.stringify(obj)+'\n')

The server receives ~100 req/sec through a single process.

On the whole this works without issue, but roughly every 3,000 lines
written to the output file, the first 3 characters of a line will be
munged into garbage.

Any ideas? Is this a bug? Or is blindly calling fs.write()
without a callback inherently dangerous?

Ben Noordhuis

Nov 5, 2010, 12:00:48 PM
to nod...@googlegroups.com
On Fri, Nov 5, 2010 at 16:52, Andy Kent <andy...@me.com> wrote:
> Any ideas? is this a bug? or is just blindly calling fs.write()
> without a callback inherently dangerous?

Yes, insofar as you won't know if the write actually happened.
Still, you shouldn't be seeing garbage. Would it be possible to
distill a test case from your application?

Andy Kent

Nov 5, 2010, 12:42:28 PM
to nodejs
Knowing whether the write happened or not is a bit of a non-issue, as
this write is basically just a log. If the odd write fails there is
not really any other action to be taken, and we can safely ignore it
in my particular use case.

I now have a test case for the issue we are seeing. It can be
recreated quite reliably on the two machines I have here. There is
write.js which will create a file with errors. I have also included
read.js which will read the file and point out where the errors are.

https://gist.github.com/664408

The error only seems to become an issue at very high rates of
concurrent fs.write() calls.
If you don't see the issue at first then you can try increasing the
loop size in write.js and rerunning.

Thanks for your help.


Ryan Dahl

Nov 5, 2010, 12:41:09 PM
to nod...@googlegroups.com

You should use fs.WriteStream if possible. fs.write() can return
having written only some of the bytes, in which case you must call it
again. fs.WriteStream takes care of this for you for append-only
files (or at least it's a bug if it does not).
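
For illustration, a hand-rolled retry loop might look something like this (an untested sketch; writeAll is just an illustrative name, not a node API):

var fs = require('fs');

// Keep calling fs.write() until every byte of the buffer has been accepted.
function writeAll(fd, buf, offset, cb) {
  fs.write(fd, buf, offset, buf.length - offset, null, function (err, written) {
    if (err) return cb(err);
    if (offset + written < buf.length) {
      // Partial write: retry with the remainder.
      writeAll(fd, buf, offset + written, cb);
    } else {
      cb(null);
    }
  });
}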

Isaac Schlueter

Nov 5, 2010, 1:13:57 PM
to nod...@googlegroups.com
Also, if you're getting a lot of requests, are you writing to the same
file? If so, I would expect that you may see issues with interleaved
chunks when using WriteStream.

--i


Andy Kent

Nov 5, 2010, 1:41:50 PM
to nodejs
@Isaac: All writes are from the same process; how would interleaved
chunks occur, and how would you recommend avoiding them?

@Ryan: Thanks for the heads-up, I've rewritten my code to use
fs.WriteStream.
This seems to have solved the issue for me, but it is a massive
performance hit. I'm not really sure what is going on, but the stats
are below. Is there a more performant way to do this? Or am I doing
something silly?

# fs.WriteStream ascii

real 7m38.853s
user 7m7.013s
sys 0m26.723s



# fs.WriteStream utf8

real 7m43.411s
user 7m8.971s
sys 0m27.186s


# fs.write

real 0m3.542s
user 0m3.224s
sys 0m1.288s

WriteStream code...

var fs = require('fs');
var stream = fs.createWriteStream('output.json', { flags: 'a', encoding: 'ascii', mode: 0666 });
var i = 0;
while (i < 500000) stream.write(JSON.stringify({ foo: 'bar', iteration: i++ }) + '\n');



Andy Kent

Nov 5, 2010, 2:08:19 PM
to nodejs
As a followup to this, what's the best way to gracefully close a
writable stream? We do log rotation and switch out file descriptors on
SIGHUP, but I was worried about calling stream.end() or
stream.destroy() on something that may or may not still have data in
a buffer. Ideally anything left in the buffer would be flushed to disk
and then the stream closed and destroyed, whilst new items flow into a
newly created WriteStream. What's the best way to handle closing the
old stream given this use case?

Isaac Schlueter

Nov 5, 2010, 2:30:02 PM
to nod...@googlegroups.com
On Fri, Nov 5, 2010 at 10:41, Andy Kent <andy...@me.com> wrote:
> @Isaac: All writes are from the same process; how would interleaved
> chunks occur, and how would you recommend avoiding them?

Well, it would happen if you were creating a WriteStream for each
request, rather than having all the requests use the same WriteStream
object.

Consider this:

http.createServer(function (req, res) {
  var s = fs.createWriteStream("file.log")
  s.write(JSON.stringify({foo:"bar"}) + "\n")
  s.end()
  res.writeHead(200, {})
  res.end("ok")
}).listen(80)

The timeline goes:
1. server starts.
2. client A connects, creates write stream, calls write()
3. '{"fo' gets written to disk, the rest is queued. WriteStreamA
returns false (meaning not all data was flushed).
4. client B connects, creates write stream, calls write()
5. '{"foo":"b' gets written to disk, the rest is queued. WriteStreamB
returns false.
6. WriteStreamA writes 'oo":"bar"}\n' to disk, emits "drain"
7. WriteStreamB writes 'ar"}\n' to disk, emits "drain"

Resulting file:
{"fo{"foo":"boo":"bar"}
ar"}

Alternatively, if the client is uploading the data, and that's what
you're writing, you may run into problems as well if you don't buffer
the data.

var s = fs.createWriteStream("file.log")
http.createServer(function (req, res) {
  req.on("data", function (chunk) {
    if (!s.write(chunk)) {
      req.pause()
      s.on("drain", function D () {
        req.resume()
        s.removeListener("drain", D)
      })
    }
  })
  res.writeHead(200, {})
  res.end("ok")
})

In short, you can't assume that any write succeeded, or succeeded
completely, and you also can't assume that the client will upload its
entire payload in a single "data" event.

--i

Isaac Schlueter

Nov 5, 2010, 2:33:38 PM
to nod...@googlegroups.com
Also, regarding the performance numbers, I suspect that fs.write is so
fast because you're not waiting around for verification that anything
was written, or that all the bytes got written. You may well be
writing only the first chunk to disk and then dropping all the others
on the floor. fs.write() does absolutely no buffering for you; it just
calls your cb function with an integer telling you how many bytes were
flushed to disk (which might be zero).

--i

Jorge

Nov 5, 2010, 5:44:53 PM
to nod...@googlegroups.com
I think that, as each call to fs.write() ends up in an I/O thread, if you do enough write()s to the same fd fast enough, you may easily end up writing to the same fd from a number (I'm not sure how many) of concurrent threads (IOW: in parallel), and these writes may even happen in an unexpected order. It does not sound very good, no, not really.

To avoid the problem, you could set up a 'currentlyWriting' flag and buffer any attempts to write() while it's set: https://gist.github.com/664843
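
A minimal sketch of the idea (the real gist differs in details; this assumes the 0.2-era string form of fs.write()):

var fs = require('fs');
var fd = fs.openSync('output.json', 'a');

var currentlyWriting = false;
var pending = [];

// Serialize writes: only one fs.write() in flight at a time;
// anything that arrives meanwhile waits in the pending queue.
function write(str) {
  if (currentlyWriting) return pending.push(str);
  currentlyWriting = true;
  fs.write(fd, str, null, 'utf8', function () {
    currentlyWriting = false;
    if (pending.length) write(pending.shift());
  });
}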

$ time node wrong.js
real 0m12.367s
user 0m11.305s
sys 0m5.910s

$ cat output.json | grep -ace ^[^{]
40  // # of lines with errors

$ cat output.json | grep -ace ^{
499960

Some writes even went unordered!:

$ cat output.json | grep -ae ^[^{]
..
eration":263977}
eration":264225}
eration":264473}
n?eration":263729}
..

$ rm output.json
$ time node better.js
real 0m3.430s
user 0m2.642s
sys 0m0.230s

$ cat output.json | grep -ace ^{
500000

$ cat output.json | grep -ace ^[^{]
0

-- 
Jorge.

billywhizz

Nov 5, 2010, 7:06:04 PM
to nodejs
wrong.js works absolutely fine for me on the 0.3.0 release on Fedora 12.
Everything is in order and no bad munging...

billywhizz

Nov 5, 2010, 8:38:20 PM
to nodejs
As far as I can see, WriteStream does not do any checking of the
returns from fs.write() either, so the number of bytes written should
always be checked, either as the return value from fs.writeSync() or
in the callback from fs.write() or writeStream.write().

Is that correct?

Shouldn't WriteStream be node's buffered I/O library on top of the
POSIX non-buffered I/O functions?

Jorge

Nov 6, 2010, 9:36:12 AM
to nod...@googlegroups.com, nodej...@googlegroups.com

Hi billywhizz,

I think that we've found a bug here.

In short, I think that the problem is that upon return from the fs.write() call, no reference to the buffer passed to the C library's write call is retained anywhere. And we all know what the fate of unreferenced JS objects is...

It seems to me that this bug may be present in other methods of the fs module, and perhaps in other modules too. Who knows?

It's a nasty bug because one can't tell for sure when it's going to bite.
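
In the meantime, a userland mitigation is simply to keep a JS reference to the buffer alive until the callback fires. A sketch (gcSafeWrite is just an illustrative name):

var fs = require('fs');

var inFlight = []; // live references so the GC can't collect a buffer mid-write

function gcSafeWrite(fd, buf, cb) {
  inFlight.push(buf);
  fs.write(fd, buf, 0, buf.length, null, function (err, written) {
    inFlight.splice(inFlight.indexOf(buf), 1); // write finished, drop the reference
    if (cb) cb(err, written);
  });
}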

I'm working on a patch. This could be a way to fix it:

0001-fs-dangling-buffer-bug-hack.patch

Ben Noordhuis

Nov 6, 2010, 10:11:13 AM
to nodej...@googlegroups.com, nod...@googlegroups.com
On Sat, Nov 6, 2010 at 14:36, Jorge <jo...@jorgechamorro.com> wrote:
> In short, I think that the problem is that upon return from the fs.write() call, no reference to the buffer passed to the C library's write call is retained anywhere. And we all know what the fate of unreferenced JS objects is...
> It seems to me that this bug may be present in other methods of the fs module, and perhaps in other modules too. Who knows ?

Yep, that's it. Good catch, Jorge.

fs.read() / Read() also suffers from this but the other functions in
node_file.cc appear to be safe - libeio makes a copy of the string
arguments.

This should probably be solved by turning the Local<Buffer> into a
Persistent<Buffer> for the duration of the async operation.

billywhizz

Nov 7, 2010, 12:33:28 AM
to nodejs
I've been digging around in the 0.3 code to try to understand this,
and it seems confusing to me that the net.Stream object handles the
return value from write() while the fs.WriteStream object doesn't seem to.

In net.Stream._writeOut, the following code deals with the case where
write() returns less than the expected length:

if (charsWritten != data.length) {
  // debug("couldn't fit " + (data.length - charsWritten) + " bytes into the pool\n");
  // Unshift whatever didn't fit onto the buffer
  this._writeQueue.unshift(data.slice(charsWritten));
  this._writeQueueEncoding.unshift(encoding);
  this._writeWatcher.start();
  queuedData = true;
}

But in fs.WriteStream.flush there is no code to check the return value
of the call to write(); this is left to the caller to handle in the
callback. If the caller doesn't check it and take appropriate action,
then those bytes get dropped...

Wouldn't it be better if these two stream objects worked the same way?

Maybe I'm missing something in the code - please point me in the right
direction if I am...

Jorge

Nov 7, 2010, 6:15:12 PM
to nod...@googlegroups.com

Ok. A persistent handle for the buffer object.

The callback's persistent handle is made by a call to cb_persist(callback), which is then passed to eio_write() in eio_req->data. Upon completion, libev calls After(), which cb_unwrap()s req->data and cb_destroy()s it (the persistent handle).

The question is: now we'd need to pass 2 handles in eio_req->data: the one for the callback plus another one for the buffer, so that in After() we can destroy them both.

What I'd do is:

malloc() a struct to hold both, pass its pointer in eio_req->data, then in After() destroy the handles (eio_req->data.cb and eio_req->data.buf) and free() the struct.

Does that sound good?
How would you do it?

Thanks,
--
Jorge.

Sent from my fridge

Isaac Schlueter

Nov 7, 2010, 10:12:52 PM
to nod...@googlegroups.com
I just wrote a quick patch to do the following:

1. instead of passing a persistent function handle, it passes a
persistent object, which has a "cb" member.
2. pass the object to the callback
3. hang the buffer on the object as well.

Then I wrote a test to do this:

1. write a kb of "a" 1000 times to a file.
2. on each cb, assert that the object's buffer member is the same as
the buffer I wrote.
3. walk through each buffer in the cb, and assert that every byte is 97 (ASCII 'a').
4. after the last write, walk through the file contents and verify
that every byte is 97.

The write cb assert passes.
Bad data ends up in the file.

This is absolutely, definitely, without any shadow of a doubt, *not* a
result of the buffer being GC'ed while the request is on the thread
pool.

When I stack the writes so that each one waits for the previous to
complete first, then you end up with a few failures, but no bad data.

I think this is an issue with too many eio_write calls getting piled up, maybe?

--i

0001-Demonstrating-that-this-is-not-GC.patch

Ben Noordhuis

Nov 8, 2010, 5:14:03 AM
to nod...@googlegroups.com
On Mon, Nov 8, 2010 at 04:12, Isaac Schlueter <i...@izs.me> wrote:
> The write cb assert passes.
> Bad data ends up in the file.

Seems to work for me, I get no bad data (repeatedly, of course).

$ perl -ne '/^a{80}\n$/ or print' out.txt | wc -l
0

I did notice another bug where Node quits before libeio's task queue
has drained, so you end up with an out.txt that is anywhere between 0
and 82944 bytes.

Andy Kent

Nov 8, 2010, 6:15:07 AM
to nodejs
Massive thanks to everyone for looking into this for me. Sorry I seem
to have stumbled onto a pretty epic issue here, but we have invested
pretty heavily in using node for a new project and this is a bit of a
showstopper for us. We are currently only running a small percentage
of traffic through node, but hopefully by the end of this week we'll be
pushing around 20 million requests per day through it, so it's a fairly
reasonable volume.

I do have a follow-up question around write streams. I was wondering
how best to safely close a write-heavy stream, given that data may or
may not be waiting to be written to it. I'm considering a few
scenarios, but none seems correct. Any advice would be appreciated.

1) end then destroy the stream immediately?

// lots of writes from various places, then...
stream.end()
stream.destroy()

2) wait for a drain? but will this always occur?

// lots of writes from various places, then...
stream.on('drain', function() { stream.destroy() })
stream.end()

3) just end the stream and reassign the var; does the file descriptor
get cleaned up eventually?

// lots of writes from various places, then...
stream.end()
stream = null

This requirement comes about because we re-use one writable stream for
logging across several requests, and then on a SIGHUP we rotate to
a new stream. Opening the new stream is trivial; I'm just a bit stumped
on the best way to close the old one (current approach sketched below).
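
For context, my rotation handler currently looks roughly like this (a sketch; the end() behaviour is exactly what I'm unsure about):

var fs = require('fs');
var stream = fs.createWriteStream('current.log', { flags: 'a' });

process.on('SIGHUP', function () {
  var old = stream;
  // New writes go to a fresh stream immediately...
  stream = fs.createWriteStream('current.log', { flags: 'a' });
  // ...while the old one is asked to flush whatever it still buffers, then close.
  old.end();
});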

Additionally, once they are completed, is it likely that the fixes to
fs.write() will make it back into the "stable" 0.2 branch? Or has all
work on that stopped in favour of 0.3?

Thanks again!
Andy

Jorge

Nov 8, 2010, 2:06:14 PM
to nod...@googlegroups.com
On 08/11/2010, at 04:12, Isaac Schlueter wrote:
> The write cb assert passes.
> Bad data ends up in the file.

Yeah, I see bad data here too:

$ cat -v out.txt | grep -aEv '^a{1,}$'
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa


But the tests we were doing before (wrong.js) were *appending* to the file, and now, instead of 'a', you are opening it 'w'.

That there's (was) a problem with the dangling buffers when appending is for sure (try passing just a noop to see it reappear). And there's another problem when writing a lot in parallel to a file opened 'w'. That's 2 different problems, I think.

I've touched your patch a bit.

The idea is that you can now pass (*from JS*) either a function or an object, like so:

binding.write(fd, buffer, offset, length, position, {cb: callback || noop, buf: buffer}); // An object

or like so:

binding.write(fd, buffer, offset, length, position, callback || noop); // A function

If there's any reference (in addition to the callback) that you want/need/would like to refcount+1 for the duration of the call, just refer to it in the object. If not, just pass the function alone.

In any case ASYNC_CALL will obj_persist whatever you pass.

Later, upon arrival in After(), it checks whether the ref was a function or an object. If it was a function, the steps to get the callback are about the same as before; if it's an object, we expect the callback to be in the .cb property.

The main difference with your patch is that you can refcount+1 as many things as you want, and that the decision is made in JSLand.

I've rewritten the test to check the syntax of each and every line, and even the order, but it does it in 'a' mode. There isn't any garbage, but the write()s are not always executed in the same order as they're requested. Is there anything we can do to enforce the proper order?

Please take a look at it. What do you think ?

0001-my-fs.write-buffers-bug-patch.patch

Ryan Dahl

Nov 16, 2010, 7:57:34 PM
to nod...@googlegroups.com
On Fri, Nov 5, 2010 at 9:42 AM, Andy Kent <andy...@me.com> wrote:
> I now have a test case for the issue we are seeing. It can be
> recreated quite reliably on the two machines I have here.
>
> https://gist.github.com/664408

44fa89cda815e749cda9493ea313ef3679689a61 Fix applied in v0.2
cea3a95f9fe6faaa504542d4f03349739d08a0f3 Fix applied to master

(Two patches were needed because of the Buffer changes in master)


These fixes will be in the v0.2.5 and v0.3.1 releases (should be out
later today)

Ryan Dahl

Nov 17, 2010, 1:01:57 AM
to nod...@googlegroups.com

I believe there is a bug in OSX (a race condition in the file system)
which causes this to fail. If you can get
test/simple/test-fs-sir-writes-alot.js to fail on a non-Macintosh
system, please report.

Jorge

Nov 17, 2010, 4:48:27 AM
to nod...@googlegroups.com
On 17/11/2010, at 07:01, Ryan Dahl wrote:
>
> I believe there is a bug in OSX (a race condition in the file system)
> which causes this to fail. If you can get
> test/simple/test-fs-sir-writes-alot.js to fail on a non-macintosh
> system - please report.


In the meantime, here's a fix:

0001-threaded-writes-bug-fix-for-apples.patch

Jorge Chamorro

Nov 18, 2010, 8:59:00 AM
to nod...@googlegroups.com
On 17/11/2010, at 19:58, Jorge Chamorro wrote:
>> <0001-threaded-writes-bug-fix-for-apples.patch>
>>
>> https://github.com/ry/node/pull/425
>> https://github.com/xk/node/commits/appleWriteBug
>
> Indeed, it's reproducible not only in node: AppleWriteBug.c: https://gist.github.com/703744
> Does it work in the Linuxes?

It's a bug that's not a bug, or something...

http://lwn.net/Articles/180387/
--
Jorge.

Jorge Chamorro

Nov 18, 2010, 10:43:20 AM
to nod...@googlegroups.com

And, Ryan, given http://lwn.net/Articles/180396/, I think the mutex/lock patch (only for Macs) would be the way to go. What do you think?
--
Jorge.

Ryan Dahl

Nov 21, 2010, 11:40:55 PM
to nod...@googlegroups.com

The issue discussed in LWN is a similar issue, but not the same. I do
not expect write() to be thread-safe - I expect it to write to
whatever position in the file. That is, the file is of undefined
length after the operation in the test. However, I do not expect
multiple writes to introduce random data into the file, as is happening
on Macintosh.

I think your lock patch is reasonable. I will test it.

Jorge

Nov 22, 2010, 7:45:52 AM
to nod...@googlegroups.com

The problem on the Macs, ISTM, is that some write()s properly update the file pointer but end up writing the data somewhere else, leaving 'holes' of zeroes of data.length bytes.

> I think your lock patch is reasonable. I will test it.

commit 6aa92d5289996780834e Yahooooooo! One more contribution by Jorge. (•jorge: is happy)

Could you please revisit/reconsider the über-awesome sys^H^H^Hutil.inspect patch thing, too?
--
Jorge.

Ben Noordhuis

Nov 22, 2010, 8:01:03 AM
to nod...@googlegroups.com
On Mon, Nov 22, 2010 at 13:45, Jorge <jo...@jorgechamorro.com> wrote:
> The problem on the macs ISTM is that some write()s properly update the file pointer but end up writing the data somewhere else, leaving 'holes' of strings of zeroes of data.length length.

Shouldn't this be solved by always, always, always calling pwrite()?
Using write() in a concurrent environment is asking for trouble.

Jorge

Nov 22, 2010, 8:32:02 AM
to nod...@googlegroups.com

ISTM that this:

pwrite(fd, data, data.length, fd.filepointer);
fd.filepointer += data.length;

(without any mutex) isn't thread-safe (not at all), but this instead:

write(fd, data, data.length);

ought to be thread-safe (and Linus took care - personally - to make it so in ~2006).
It seems that Apple instead has not, at least not yet.
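
In node terms, since the JS side runs in a single thread, you can claim each region synchronously before dispatching and pass an explicit (non-null) position, so the positioned form of the write is used instead of racing on the shared file pointer. A sketch (appendAt is just an illustrative name):

var fs = require('fs');
var fd = fs.openSync('output.json', 'w');
var filePos = 0; // our own file pointer, advanced synchronously in JS

function appendAt(buf, cb) {
  var offset = filePos; // claim this region before the next write is queued
  filePos += buf.length;
  fs.write(fd, buf, 0, buf.length, offset, cb); // explicit position, pwrite-style
}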
--
Jorge.
