
locking file access OR exchanging messages bw Matlab Instances


Max

Jul 18, 2010, 12:20:05 AM
Hi,

I'm writing code to allow multithreading on a cluster. For that I start several instances of Matlab: one of them is the "main" one, which puts a request for jobs into a message file, "JobSubmit", while the rest of the instances (threads) are supposed to wait until the jobs are posted. The number of jobs is supposed to be larger than the number of threads, and the threads should be able to leave a message saying which job number they are currently running, so that other threads do not take it.

Here comes the problem. I need to be able to lock the access to the file, while one of the threads is reading it and is leaving its marks, to exclude the possibility of two threads accessing the same file at once and thus confusing/corrupting the operation. I'm wondering if there is a way to implement this in Matlab? Any advice on how it could be realized otherwise is much appreciated!

Thank you,
Max

Jan Simon

Jul 19, 2010, 4:08:05 AM
Dear Max,

> I need to be able to lock the access to the file, while one of the threads is reading it and is leaving its marks, to exclude the possibility of two threads accessing the same file at once and thus confusing/corrupting the operation.

If you search for "file lock thread" in this newsgroup, you find:
http://www.mathworks.com/matlabcentral/newsreader/view_thread/279510

Perhaps this helps, Jan

Max

Jul 19, 2010, 12:32:05 PM
Dear Jan,

Thank you for the link. Your question in the thread was indeed close to but more restrictive than mine. As I understood from reading your thread, the conclusion to your conundrum was to implement a database solution, right? Unfortunately, I don't think it will work for me - I run the program under Condor Management system and don't think I can work with databases there. But the lack of "simpler" solutions in your thread seems to indicate that locking might not be an option.

For my purposes, though, it would be sufficient to simply check whether the common file is being accessed at the moment, and if so, try again in a second. I considered using the lsof command from bash to obtain such information but, unfortunately, it is slow and requires root access, which I don't have...


Jan Simon

Jul 19, 2010, 4:27:06 PM
Dear Max,

This was suggested by Ashish:
http://java.sun.com/j2se/1.4.2/docs/api/java/nio/channels/FileLock.html
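For reference, a minimal sketch of how that Java class might be used from Matlab through its built-in Java interface. The lock-file path is a hypothetical placeholder; all instances would need to point at the same file. Note the caveats in the Java documentation: the lock is held on behalf of the whole JVM, and its behaviour on networked file systems is platform-dependent, so this is an illustration rather than a guaranteed solution for a cluster.

```matlab
% Sketch: take a lock with java.nio.channels.FileLock from Matlab.
% lockPath is a hypothetical location chosen for illustration only.
lockPath = fullfile(tempdir, 'JobSubmit.lock');

raf     = java.io.RandomAccessFile(lockPath, 'rw');  % creates the file if missing
channel = raf.getChannel();
gotLock = false;
try
    lock = channel.tryLock();     % non-blocking; a Java null arrives in Matlab as []
    while isempty(lock)
        pause(1);                 % held by another process: try again in a second
        lock = channel.tryLock();
    end
    gotLock = true;
    % <-- read and update the shared job file here -->
    lock.release();
catch err
    channel.close();              % closing the channel also drops any lock held on it
    raf.close();
    rethrow(err);
end
channel.close();
raf.close();
```

The non-blocking tryLock() is used here so that the "try again in a second" behaviour stays explicit; channel.lock() would block instead.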

And another idea was:
create a file with a tempname
if renaming (MOVEFILE) to a specific lock file works, the caller gets access
else wait for some time and try again
delete the lock file
While renaming can be triggered to work only if the destination does not exist, the creation/opening of a file cannot be done thread-safely, as far as I understood.

Kind regards, Jan

Max

Jul 19, 2010, 5:34:03 PM
> And another idea was:
> create a file with a tempname
> if renaming (MOVEFILE) to a specific lock file works, the caller gets access
> else wait for some time and try again
> delete the lock file
> While renaming can be triggered to work only if the destination does not exist, the creation/opening of a file cannot be done thread-safely, as far as I understood.

That's a great idea! But did you mean that the renaming (moving) can be triggered to work only if the ORIGIN exists?

One could use the presence of the destination file as a break on the access of other files:
fileattrib('test.lock','-w') % change the attribute so that if the file exists, the use of it as a destination will cause an error (status of copyfile==0). Windows platforms?
status = copyfile('test.lock','test.lock.lock') % copy the file. if cannot - try again later
<-- handling of a file -->
delete('test.lock.lock') % at the end: remove the roadblock

It seems to me that these two variants have the same chances to work...

Jan Simon

Jul 19, 2010, 5:56:04 PM
Dear Max,

> > While renaming can be triggered to work only if the destination does not exist, the creation/opening of a file cannot be done thread-safely, as far as I understood.
>
> That's a great idea! But did you mean that the renaming (moving) can be triggered to work only if the ORIGIN exists?

No, I meant destination.

> One could use the presence of the destination file as a break on the access of other files:
> fileattrib('test.lock','-w') % change the attribute so that if the file exists, the use of it as a destination will cause an error (status of copyfile==0). Windows platforms?
> status = copyfile('test.lock','test.lock.lock') % copy the file. if cannot - try again later

COPYFILE(Source, Dest)
Then FILEATTRIB(Source, '-w') protects the source from writing - but is this useful?
The existence of a non-protected destination file is enough to let MOVEFILE or COPYFILE return a 0.

Jan

Max

Jul 19, 2010, 6:51:04 PM
Dear Jan,

> > One could use the presence of the destination file as a break on the access of other files:
> > fileattrib('test.lock','-w') % change the attribute so that if the file exists, the use of it as a destination will cause an error (status of copyfile==0). Windows platforms?
> > status = copyfile('test.lock','test.lock.lock') % copy the file. if cannot - try again later
>
> COPYFILE(Source, Dest)
> Then FILEATTRIB(Source, '-w') protects the source from writing - but is this useful?
> The existence of a non-protected destination file is enough to let MOVEFILE or COPYFILE return a 0.

No, you can try it. The attribute of the file is preserved when using "copyfile" command (at least in Linux). Thus the destination file will also be protected against copying into it, and copyfile will produce status == 1 (successful copying) only if the destination file does not exist.

> > > While renaming can be triggered to work only if the destination does not exist, the creation/opening of a file cannot be done thread-safely, as far as I understood.
> >
> > That's a great idea! But did you mean that the renaming (moving) can be triggered to work only if the ORIGIN exists?
>
> No, I meant destination.

Sorry, but then I don't get it. If the DESTINATION file (the file you copy TO) exists, but it's not set to be read-only, Matlab just copies the content of the SOURCE file into the destination file, overwriting its content. I couldn't find a way to tell Matlab to move (or copy) the file only if the destination does not exist. On the other hand, if the ORIGIN does not exist then there is nothing to move and it will cause problems. So, the solution would look something like this:

% try to move the file until it succeeds:
while ~movefile(lockSource,lockDest)
    pause(.1)
end
<-- file processing -->
movefile(lockDest,lockSource) % make the lock file available for the next process

Right?

Max

Jan Simon

Jul 27, 2010, 5:48:04 PM
Dear Max,

> while ~movefile(lockSource,lockDest)
>     pause(.1)
> end
> <-- file processing -->
> movefile(lockDest,lockSource)
>

> Right?

Right, this works (as far as I can see).
I thought of another mechanism:
file1 = tempname; fclose(fopen(file1, 'w'));
lockfile = 'D:\Temp\locked';
while ~movefile(file1, lockfile)
    pause(0.1);
end
% Now file has been moved to lockfile.
% Another instance cannot do this again:
file2 = tempname; fclose(fopen(file2, 'w'));
disp(movefile(file2, lockfile)) % *must* be 0 until:
...
delete(lockfile);
% Now other instances can move another file to lockfile.

So you toggle the name of the file, I push the file to the lock and delete it afterwards. I cannot see a strong advantage for one of the solutions.
As far as I can see both methods are robust, but in the former thread (mentioned already) it was stated that the problem has not been solved for decades. What am I missing?

Jan

Max

Jul 28, 2010, 2:51:04 PM
> So you toggle the name of the file, I push the file to the lock and delete it afterwards. I cannot see a strong advantage for one of the solutions.
Agree. It looks like it's mostly a matter of taste which method to use.

> As far as I can see both methods are robust, but in the former thread (mentioned already) it was stated that the problem has not been solved for decades. What am I missing?

If you mean, why others didn't adopt the methods, I don't know... Works perfectly fine for me.

Thanks,
Max

Walter Roberson

Jul 29, 2010, 2:48:42 PM

If you examine the unix definition of "rename",
http://www.opengroup.org/onlinepubs/000095399/functions/rename.html
you will see that it does not exactly match the functionality of Matlab's
movefile() according to "help movefile", the contents of which differ from
"doc movefile":

MOVEFILE Move file or directory.
[STATUS,MESSAGE,MESSAGEID] = MOVEFILE(SOURCE,DESTINATION,MODE) moves the
file or directory SOURCE to the new file or directory DESTINATION. Both
SOURCE and DESTINATION may be either an absolute pathname or a pathname
relative to the current directory. When MODE is used, MOVEFILE moves SOURCE
to DESTINATION, even when DESTINATION is read-only. The DESTINATION's
writable attribute state is preserved. See NOTE 1.


In particular, the Unix definition has it that the destination is removed and
then the renaming happens, a process that does not preserve or examine any
attributes such as the "writable" attribute.

Examining this, we see that Matlab's movefile() cannot be implemented
atomically in Unix, and is thus open to race conditions.

Unix's rename() system call is atomic (according to POSIX.1-1990), but the
assumption made in saying that it is atomic is that any shared file systems
ensure the atomicity against multiple accesses (possibly from different
nodes), and that is a guarantee that SMB between dissimilar systems doesn't
even try to make, and which NFSv2 doesn't try to make, and which NFSv3 tries
to make but real implementations tend to fail at.


The Unix definition of rename() says,

"If the old argument points to the pathname of a file that is not a directory,
the new argument shall not point to the pathname of a directory. If the link
named by the new argument exists, it shall be removed and old renamed to new.
In this case, a link named new shall remain visible to other processes
throughout the renaming operation and refer either to the file referred to by
new or old before the operation began."

Note the possibility there that other processes shall continue to see either
the old file or the new file while the rename is taking place. This makes it
dodgy to write code without race conditions.


As Matlab's movefile() cannot be implemented atomically in underlying
operating system semantics, the implication is that to do a proper rename
requires a lock of some sort -- but if a lock of some sort existed, you would
be using _that_ instead of trying to fudge things by using movefile().

David Portabella

Aug 10, 2010, 2:28:24 PM
On Jul 18, 6:20 am, "Max " <nikitchmPub...@gmail.com> wrote:
> Hi,
>
> I'm writing code to allow multithreading on a cluster. For that I start several instances of Matlab: one of them is the "main" one, which puts a request for jobs into a message file, "JobSubmit", while the rest of the instances (threads) are supposed to wait until the jobs are posted. The number of jobs is supposed to be larger than the number of threads, and the threads should be able to leave a message saying which job number they are currently running, so that other threads do not take it.

>
> Here comes the problem. I need to be able to lock the access to the file, while one of the threads is reading it and is leaving its marks, to exclude the possibility of two threads accessing the same file at once and thus confusing/corrupting the operation. I'm wondering if there is a way to implement this in Matlab? Any advice on how it could be realized otherwise is much appreciated!
>
> Thank you,
> Max


See a workaround here:
http://stackoverflow.com/questions/3451343/automically-writing-a-file-in-matlab/3452143#3452143


Regards,
David

Walter Roberson

Aug 10, 2010, 3:00:08 PM
David Portabella wrote:
> On Jul 18, 6:20 am, "Max " <nikitchmPub...@gmail.com> wrote:

>> I'm writing a code which would allow multithreading on a cluster.

> See a workaround here:
> http://stackoverflow.com/questions/3451343/automically-writing-a-file-in-matlab/3452143#3452143

No, as I showed in detail before, Matlab's movefile() *cannot* be atomic on
Unix, and the reality is that rename() cannot be guaranteed to be atomic
on any of the common shared file systems such as would be in use for the
cluster the original poster asked about.

Trying to do this is like trying to define "simultaneity" on two different
clocks moving at different speeds.

James O'Connell

Oct 31, 2010, 8:20:04 AM
I have used lock-files for years to allow multiple MATLAB sessions to pull jobs from a list. The solution involves a shared lock-file and data (job) file. First, each MATLAB session is given an ID. Since they are not likely to be spawned simultaneously, their start time to millisecond accuracy should ensure unique IDs.

To access the data file, a MATLAB session first waits until the lock file doesn't exist, then attempts to create the lock-file and write its ID. It then waits for a given amount of time and checks to see if the lock-file still contains its ID. If it does, it removes a job from the data file then deletes the lock-file. If it doesn't, then another session has obviously written the lock-file a fraction of a second after it and it will concede and try again.

A complete implementation for lock and data file access is as follows:

% Lock the job list
lockSuccess = 0;
while ~lockSuccess
    % Wait for lock-file to disappear
    fprintf('Checking job-list availability...');
    while exist([DIR_JOBS JLCK],'file')
        pause(0.1);
    end
    fprintf(' Available.\n');
    % Attempt to create a lock
    fprintf('Attempting to secure job-list lock file...');
    fid = fopen([DIR_JOBS JLCK],'w');
    fprintf(fid,'%s',myID);
    fclose(fid);
    pause(1);
    % Was I successful?
    if strcmp(textread([DIR_JOBS JLCK],'%s'),myID)
        lockSuccess = 1;
        fprintf(' Successful.\n');
    else
        fprintf(' Unsuccessful, trying again.\n');
    end
end
% Grab the next job
fprintf('Pulling job from stack...');
load([DIR_JOBS 'matchlist']);
if isempty(matchlist)
    unix(['rm ' DIR_JOBS JLCK]);
    fprintf(' No more jobs to complete. Exiting.\n');
    return;
end
nextjob = matchlist(1);
if length(matchlist) == 1
    matchlist = [];
else
    matchlist = matchlist(2:end);
end
save([DIR_JOBS 'matchlist'],'matchlist');
fprintf(' Job assigned.\n');
% Remove the job lock
unix(['rm ' DIR_JOBS JLCK]);
fprintf('Lock file deleted.\n\n');

My jobs take a few minutes to complete, so the 0.1 and 1 second pauses do not impact efficiency. You could safely lower these though. I have run this script for days and never experienced a deadlock, albeit with only 4 simultaneous MATLAB sessions.
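The script above uses three names it never defines: DIR_JOBS, JLCK and myID. A minimal sketch of how they might be set up is given here. The directory and file name are hypothetical placeholders, not the values actually used; the ID follows the start-time idea described earlier, and it is kept free of whitespace because the script reads it back with textread(...,'%s'), which splits on whitespace.

```matlab
% Hypothetical setup for the names used by the lock script (illustrative values only).
DIR_JOBS = [fullfile(tempdir, 'jobs') filesep];  % assumed shared job directory, with trailing separator
JLCK     = 'joblist.lock';                       % assumed lock-file name
if ~exist(DIR_JOBS, 'dir')
    mkdir(DIR_JOBS);
end

% Session ID built from the start time, to millisecond resolution.
% Digits only, so it survives the round trip through the lock file.
myID = datestr(now, 'yyyymmddHHMMSSFFF');
```

On a real cluster DIR_JOBS would have to be a directory that every session can see, and two sessions started within the same millisecond would still collide, so appending something like the host name may be worth considering.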

I hope this helps,
James



Josh Porter

Feb 23, 2012, 11:51:17 AM
I tried this code with 80 MATLAB instances running on ten 8-core computers connected to a common (NFS) filesystem. It brought the file server to its knees - it crashed so badly that it wouldn't boot afterwards. So this may work well with a couple instances of MATLAB, but be warned that it doesn't scale well.

-- Josh