[erlang-questions] Reading a file before it has been completely written


David Mercer

Mar 7, 2012, 12:25:33 PM
to erlang-q...@erlang.org

While this isn’t an Erlang-specific question, the problem arises from my using Richard Carlsson’s file_monitor (https://github.com/richcarl/eunit/blob/master/src/file_monitor.erl), which sends messages when a file or directory is changed.  I have found that it is not unusual to get a message about a new file before the file has been completely written.

 

I had thought that by doing a file:open(Filepath, [read]) and making sure I got back {ok, _} rather than {error, eacces} I could avoid those cases, but that approach has failed for me: this morning, I got back {ok, _}, but the file was not completely written yet.

 

Another approach I tried was to attempt to obtain an exclusive lock (I think it was file:open(Filepath, [read, exclusive])). In my testing, though, I came across a bizarre scenario: I would copy a file into the monitored directory and the file_monitor would send the message, but the Erlang process doing the file open didn’t see the file, so it created it (the documentation says it creates the file if it does not exist). The window where I was copying then told me that the file already exists and asked whether I wanted to overwrite it.

 

Another approach I tried was renaming the file to itself.  All my tests indicated that that approach would work, but all my tests also indicated that just doing the file:open(Filepath, [read]) would work, too, so I chose it, as it seemed cleaner.  I could revert to the rename approach, but I’m not even sure now that that will work.

 

I imagine others among us have encountered this issue, and rather than reinvent the wheel, what is the favored approach to handling this issue?

 

Cheers,

 

David Mercer

 

 

Tony Rogvall

Mar 7, 2012, 12:39:41 PM
to David Mercer, erlang-q...@erlang.org

- Create and open a file with a temporary name.
-  Write the file content.
- Close the file.
- Rename the file to the name/place you want.

Would that work?
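
A minimal Erlang sketch of the steps above, for the case where the writer is under your control. The module name `atomic_write`, the function `write_file/2`, and the ".tmp" suffix are hypothetical choices for illustration; only `file:write_file/2`, `file:rename/2`, and `file:delete/1` are standard library calls. The temporary file is placed in the same directory as the target so that the rename stays within one file system.

```erlang
-module(atomic_write).
-export([write_file/2]).

%% Write Data to Path so that a reader never observes a partial file
%% under the final name: write to a temporary name first, then rename.
%% The ".tmp" suffix is an assumption made for this sketch.
-spec write_file(string(), iodata()) -> ok | {error, term()}.
write_file(Path, Data) ->
    Tmp = Path ++ ".tmp",
    %% file:write_file/2 creates the file, writes Data, and closes it.
    case file:write_file(Tmp, Data) of
        ok ->
            case file:rename(Tmp, Path) of
                ok ->
                    ok;
                {error, _} = RenameErr ->
                    _ = file:delete(Tmp),
                    RenameErr
            end;
        {error, _} = WriteErr ->
            _ = file:delete(Tmp),
            WriteErr
    end.
```

Note that a monitor watching the same directory will still see the temporary file appear, so the reader would need to ignore names ending in ".tmp", or the writer would need to stage files in a separate directory on the same file system.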
 
/Tony

On 7 mar 2012, at 18:25, David Mercer wrote:

While this isn’t an Erlang-specific question, the problem arises from my using Richard Carlsson’s file_monitor (https://github.com/richcarl/eunit/blob/master/src/file_monitor.erl), which sends messages when a file or directory is changed.  I have found that it is not unusual to get a message about a new file before the file has been completely written.

"Installing applications can lead to corruption over time. Applications gradually write over each other's libraries, partial upgrades occur, user and system errors happen, and minute changes may be unnoticeable and difficult to fix"



Richard Carlsson

Mar 7, 2012, 12:54:08 PM
to erlang-q...@erlang.org
On 03/07/2012 06:25 PM, David Mercer wrote:
> While this isn’t an Erlang-specific question, the problem arises from my
> using Richard Carlsson’s /file_monitor/

Hey, a user! I haven't had any reports about this module before (and the
fact that it's still in my development branch of eunit is more of a
historical accident; it's not shipped with OTP). I don't know of any
real issues with it though.

In this case, I think the problem is just the underlying file system
semantics. I presume it's Linux, and in Unix-like file systems a file can
be seen to exist and can be opened for reading as soon as it has been
created. Trying to fiddle with exclusive locks is probably always going
to have corner cases. The only techniques you can trust to practically
always work and be portable across file systems are directory creation
and file renaming. So what Tony suggested is likely to be the best
solution: create the file under another name or in a separate directory,
and when it's completely written, rename it.

/Richard

David Mercer

Mar 7, 2012, 1:06:05 PM
to Tony Rogvall, erlang-q...@erlang.org

I’m not the one writing the file.  I’m the one reading it.  I have no control over the writing.

 

Thanks for the thoughts, though.

 

DBM

David Mercer

Mar 7, 2012, 1:08:44 PM
to Richard Carlsson, erlang-q...@erlang.org
On Wednesday, March 07, 2012, Richard Carlsson wrote:

> Hey, a user! I haven't had any reports about this module before (and the
> fact that it's still in my development branch of eunit is more of a
> historical accident; it's not shipped with OTP). I don't know of any
> real issues with it though.

It works fine. If you know of a better one, I'd be OK switching. This was
just the one that came up when I Googled.

> In this case, I think the problem is just the underlying file system
> semantics. I presume it's Linux, and in Unix:y file systems a file can
> be seen to exist and can be opened for reading as soon as it has been
> created. Trying to fiddle with exclusive locks is probably always going
> to have corner cases. The only techniques you can trust to practically
> always work and be portable across file systems are directory creation
> and file renaming. So what Tony suggested is likely to be the best
> solution: create the file under another name or in a separate directory,
> and when it's completely written, rename it.

I might go back to my renaming approach, which also had no failures during
testing. I just attempt to rename the file to itself. If it fails, I try
again 5 seconds later.
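
A rough sketch of that retry loop, under stated assumptions: the module name `wait_ready`, the function `wait_until_renamable`, and the retry limit are hypothetical, and the 5000 ms default mirrors the 5-second delay mentioned above. Be aware that this check is platform-dependent. The Erlang documentation for `file:rename/2` notes that `eacces` is returned on some platforms when the file is open, whereas POSIX specifies that renaming a file to itself succeeds and does nothing, so on Unix-like systems a successful result says nothing about whether a writer is still active.

```erlang
-module(wait_ready).
-export([wait_until_renamable/2, wait_until_renamable/3]).

%% Try to rename Path to itself, retrying every 5 seconds,
%% at most Retries times.
wait_until_renamable(Path, Retries) ->
    wait_until_renamable(Path, Retries, 5000).

%% Same as above, with an explicit delay in milliseconds.
wait_until_renamable(_Path, 0, _DelayMs) ->
    {error, timeout};
wait_until_renamable(Path, Retries, DelayMs) when Retries > 0 ->
    case file:rename(Path, Path) of
        ok ->
            ok;
        {error, enoent} = Gone ->
            %% The file no longer exists; retrying will not help.
            Gone;
        {error, _Reason} ->
            %% Possibly still held open by the writer; wait and retry.
            timer:sleep(DelayMs),
            wait_until_renamable(Path, Retries - 1, DelayMs)
    end.
```

A caller would invoke something like `wait_ready:wait_until_renamable(Filepath, 12)` after receiving the file_monitor message, and open the file for reading only once it returns `ok`.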

Thanks.

Cheers,

DBM

Daniel Luna

Mar 7, 2012, 1:09:32 PM
to erlang-q...@erlang.org
(Sorry, Richard, for getting this multiple times; I forgot to include the list.)

Just as an added caveat to what Richard mentions: you probably want to
make sure that the temp file is created on the same file system as
your test directory.  A rename between file systems is really just a
copy and delete, with the exact same problems you had previously
(depending on OS I guess), while a rename within a file system is a
link/unlink with the contents being untouched.
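
As an illustration of this caveat in Erlang: `file:rename/2` is documented to return `{error, exdev}` when source and destination are on different file systems, so a writer can detect that case instead of silently falling back to a copy. The sketch below is only an assumption about how one might structure this; the module name `publish`, the function `publish/2`, and the `different_file_system` error tuple are hypothetical.

```erlang
-module(publish).
-export([publish/2]).

%% Move a fully written staging file into the watched directory.
%% Deliberately does not fall back to copy-and-delete when the rename
%% would cross file systems, because a copy is not atomic.
-spec publish(string(), string()) -> {ok, string()} | {error, term()}.
publish(StagedPath, WatchedDir) ->
    Dest = filename:join(WatchedDir, filename:basename(StagedPath)),
    case file:rename(StagedPath, Dest) of
        ok ->
            {ok, Dest};
        {error, exdev} ->
            %% Staging area and watched directory are on different
            %% file systems; report it rather than copying.
            {error, {different_file_system, StagedPath, Dest}};
        {error, Reason} ->
            {error, Reason}
    end.
```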

/Daniel

Richard Carlsson

Mar 7, 2012, 1:30:06 PM
to David Mercer, erlang-q...@erlang.org
On 03/07/2012 07:08 PM, David Mercer wrote:
> On Wednesday, March 07, 2012, Richard Carlsson wrote:
>
>> Hey, a user! I haven't had any reports about this module before (and the
>> fact that it's still in my development branch of eunit is more of a
>> historical accident; it's not shipped with OTP). I don't know of any
>> real issues with it though.
>
> It works fine. If you know of a better one, I'd be OK switching. This was
> just the one that came up when I Googled.

No, I think it's pretty good and I don't know any other portable
implementation. I'd just like to add optional inotify support (and
whatever it's called on Windows) on supported platforms. Right now it
only works by polling. Which is actually good enough in a lot of cases.

/Richard

David Mercer

Mar 8, 2012, 9:23:30 AM
to Richard Carlsson, erlang-q...@erlang.org
On Wednesday, March 07, 2012, I wrote:
> On Wednesday, March 07, 2012, Richard Carlsson wrote:
>
> > Trying to fiddle with exclusive locks is probably always going
> > to have corner cases. The only techniques you can trust to practically
> > always work and be portable across file systems are directory creation
> > and file renaming. So what Tony suggested is likely to be the best
> > solution: create the file under another name or in a separate directory,
> > and when it's completely written, rename it.
>
> I might go back to my renaming approach, which also had no failures during
> testing. I just attempt to rename the file to itself. If it fails, I try
> again 5 seconds later.

For closure here, I went back to my approach of attempting to rename the
file to itself before reading it. I'll let y'all know if I encounter any
more corner cases.
