Trying to open a file for writing that is already open for writing
should result in an exception.
It's all too easy to accidentally open a shelve for writing twice, and
this can lead to hard-to-track-down database corruption errors.
Amir
Paolo
--
if you have a minute to spare, please visit my photography site:
http://mypic.co.nr
But if this is usually a serious bug, shouldn't an exception be raised?
Amir
executing "rm -rf /" via subprocess is usually also a bad idea. So? No
language can prevent you from doing such mistake. And there is no way to
know if a file is opened twice - it might that you open the same file
twice via e.g. a network share. No way to know that it is the same file.
Diez
I'd have a fixed field at the beginning of the file that can hold the
hostname, process number, and access time of a writing process, together
with a sentinel value that means "no process has access to the file".
A program would:
1. Wait a random time.
2. Open the file for update.
3. Read the locking data.
4. If it is already being used by another process, then go to 1.
5. Write the process's locking data and time into the lock field.
6. Modify the file's other fields.
7. Write the sentinel value to the locking field.
8. Close and flush the file to disk.
I have left what to do if a process has locked the file for too long as
a simple exercise for you ;-).
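In rough Python, the scheme might look like this (an untested sketch of
the above; the 64-byte lock field, its layout, and the helper names are
my own assumptions, and note that nothing makes steps 3 through 5 atomic,
so two processes can still race between the read and the write):

import os, time, random, socket

LOCK_FIELD = 64                       # fixed-size lock field at file start
SENTINEL = '\0' * LOCK_FIELD          # "no process has access to the file"

def acquire(fname):
    while True:
        time.sleep(random.random())           # 1. wait a random time
        f = open(fname, 'r+b')                # 2. open the file for update
        field = f.read(LOCK_FIELD)            # 3. read the locking data
        if field != SENTINEL:                 # 4. in use by another process,
            f.close()
            continue                          #    so go back to 1
        data = '%s:%d:%f' % (socket.gethostname(), os.getpid(), time.time())
        f.seek(0)                             # 5. write our locking data
        f.write(data.ljust(LOCK_FIELD, '\0'))
        f.flush()
        return f                              # caller does 6: modify fields

def release(f):
    f.seek(0)                                 # 7. write back the sentinel
    f.write(SENTINEL)
    f.flush()                                 # 8. flush and close
    f.close()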
- Paddy.
The scenario I have in mind is something like this:
def f():
    db = shelve.open('test.db', 'c')
    # do some stuff with db
    g()
    db.close()

def g():
    db = shelve.open('test.db', 'c')
    # do some stuff with db
    db.close()
I think it would be easy for Python to check for this problem in
scenarios like this.
Amir
You are requesting a general solution for a very particular problem. As
I pointed out, such a solution is unlikely to work reliably, if it is
feasible at all.
If you really have problems like the above, use a custom wrapper for
shelve that prevents _you_ from making that mistake.
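For example (a sketch only; the module-level registry and the
AlreadyOpenError name are made up for illustration, and this only guards
against double opens within a single process):

import shelve

_open_files = set()

class AlreadyOpenError(Exception):
    pass

def open_shelf(fname, flag='c'):
    # refuse a second open of the same path within this process
    if fname in _open_files:
        raise AlreadyOpenError('%s is already open' % fname)
    db = shelve.open(fname, flag)
    _open_files.add(fname)
    real_close = db.close
    def close():
        real_close()
        _open_files.discard(fname)
    db.close = close          # unregister the path again on close
    return db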
Diez
The right solution is file locking. Unfortunately, the Python
standard distribution doesn't have a portable file lock, but you
can do it on Unix and on Win NT or better. See:
http://mail.python.org/pipermail/python-win32/2005-February/002957.html
and/or
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/65203.
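Both approaches boil down to a small shim like this (an untested sketch
along the lines of those recipes; the lock/unlock names are mine):

import os

if os.name == 'nt':
    import msvcrt
    def lock(f):
        # lock one byte at the current position; retries for ~10 seconds
        msvcrt.locking(f.fileno(), msvcrt.LK_LOCK, 1)
    def unlock(f):
        msvcrt.locking(f.fileno(), msvcrt.LK_UNLCK, 1)
else:
    import fcntl
    def lock(f):
        fcntl.flock(f.fileno(), fcntl.LOCK_EX)   # blocking exclusive lock
    def unlock(f):
        fcntl.flock(f.fileno(), fcntl.LOCK_UN)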
--
--Bryan
> Trying to open a file for writing that is already open for writing
> should result in an exception.
MS Windows seems to do something similar, and it pisses me off
no end. Trying to open a file and read it while somebody else
has it open for writing causes an exception. If I want to open
a file and read it while it's being written to, that's my
business.
Likewise, if I want to have a file open for writing twice,
that's my business as well. I certainly don't want to be
hobbled to prevent me from wandering off in the wrong direction.
> It's all too easy to accidentally open a shelve for writing
> twice and this can lead to hard to track down database
> corruption errors.
It's all too easy to delete the wrong element from a list. It's
all too easy to re-bind the wrong object to a name. Should
lists be immutable and names be permanently bound?
--
Grant Edwards                   grante at visi.com
Yow! I'm in a twist contest!! I'm in a bathtub! It's on Mars!! I'm in tip-top condition!
How often do you need to open a file multiple times for writing?
As a high-level language, Python should prevent people from corrupting
data as much as possible.
Amir
#!/usr/bin/env python
import fcntl, shelve, time, bsddb
from os.path import exists

class fLocked:
    def __init__(self, fname):
        if exists(fname):
            # verify it is not corrupt
            bsddb.db.DB().verify(fname)
        self.fname = fname
        self.have_lock = False
        self.db = shelve.open(self.fname)
        # file descriptor of the underlying bsddb file, for flock()
        self.fileno = self.db.dict.db.fd()

    def __del__(self):
        try:
            self.db.close()
        except Exception:
            pass

    def acquire_lock(self, timeout=5):
        if self.have_lock:
            return True
        started = time.time()
        while not self.have_lock and (time.time() - started < timeout):
            try:
                # non-blocking exclusive lock; raises IOError if held
                fcntl.flock(self.fileno, fcntl.LOCK_EX | fcntl.LOCK_NB)
                self.have_lock = True
            except IOError:
                # wait for it to become available
                time.sleep(.5)
        return self.have_lock

    def release_lock(self):
        if self.have_lock:
            fcntl.flock(self.fileno, fcntl.LOCK_UN)
            self.have_lock = False
        return not self.have_lock

    def get(self, key, default={}):
        if self.acquire_lock():
            record = self.db.get(key, default)
            self.release_lock()
        else:
            raise IOError, "Unable to lock %s" % self.fname
        return record

    def set(self, key, value):
        if self.acquire_lock():
            self.db[key] = value
            self.release_lock()
        else:
            raise IOError, "Unable to lock %s" % self.fname

if __name__ == '__main__':
    fname = 'test.db'
    dbs = []
    for i in range(2):
        dbs.append(fLocked(fname))
    print dbs[0].acquire_lock()
    print dbs[1].acquire_lock(1)   # should fail getting flock
    dbs[0].release_lock()
    print dbs[1].acquire_lock()    # should be able to get lock
--Tim
That doesn't really work; you still have a race condition.
Locking the file is the good solution, but operating systems
vary in how it works. Other reasonable solutions are to rename
the file, work with the renamed version, then change it back
after closing; and to use "lock files", which Wikipedia explains
near the bottom of the "File locking" article.
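The lock-file trick relies on O_EXCL creation being atomic; roughly
(an illustrative sketch, with the .lock suffix being my own convention):

import os, time, errno

def acquire_lockfile(fname):
    lockname = fname + '.lock'
    while True:
        try:
            # O_CREAT | O_EXCL fails atomically if the file already exists
            fd = os.open(lockname, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            os.close(fd)
            return lockname        # caller os.remove()s it when done
        except OSError, e:
            if e.errno != errno.EEXIST:
                raise
            time.sleep(0.1)        # somebody else holds the lock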
--
--Bryan
Windows is actually much more sophisticated. It does allow shared
write access; see the FILE_SHARE_WRITE option for Win32's CreateFile.
You can also lock specific byte ranges in a file.
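With the pywin32 extensions, for instance, that looks something like
this (a sketch; error handling omitted):

import win32file, win32con

# open test.db while explicitly allowing other readers and writers
handle = win32file.CreateFile(
    'test.db',
    win32con.GENERIC_READ | win32con.GENERIC_WRITE,
    win32con.FILE_SHARE_READ | win32con.FILE_SHARE_WRITE,  # sharing mode
    None,                     # default security attributes
    win32con.OPEN_EXISTING,
    0,
    None)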
--
--Bryan
> How often do you need to open a file multiple times for writing?
Not very often, but I don't think it should be illegal. That's
probably a result of being a 25-year user of Unix, where it's
assumed that the user knows what he's doing.
> As a high-level language, Python should prevent people from
> corrupting data as much as possible.
For somebody with a Unix background it seems overly restrictive.
--
Grant Edwards                   grante at visi.com
Yow! Youth of today! Join me in a mass rally for traditional mental attitudes!
> On Sun, 27 Aug 2006 14:41:05 -0000, Grant Edwards <gra...@visi.com>
> declaimed the following in comp.lang.python:
>
>>
>> MS Windows seems to do something similar, and it pisses me off
>> no end. Trying to open a file and read it while somebody else
>> has it open for writing causes an exception. If I want to open
>> a file and read it while it's being written to, that's my
>> business.
>>
> Though strangely, Windows seems to permit one to make a COPY of that
> open file, and then open that with another application...
Yes, so long as the file hasn't been opened so as to deny reading, you can
open it for reading, but you do have to specify the sharing mode. Microsoft,
too, follows the rule that "Explicit is better than implicit."
> How often do you need to open a file multiple times for writing?
How often do you write code that you don't understand well enough to
fix? This issue is clearly a problem within *your* application.
I'm curious how you could possibly think this could be solved in any
case. What if you accidentally open two instances of the application?
How would Python know? You are asking Python to perform an OS-level
operation (and a questionable one at that).
My suggestion is that you use a real database if you need concurrent
access. If you don't need concurrent access, then fix your application.
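For instance, with pysqlite (sqlite3 in the standard library as of
Python 2.5) the engine handles the locking for you; a minimal sketch,
with a made-up table:

import sqlite3

db = sqlite3.connect('test.db', timeout=5)  # wait up to 5s for a lock
db.execute('CREATE TABLE IF NOT EXISTS shelf (key TEXT PRIMARY KEY, value TEXT)')
db.execute('INSERT OR REPLACE INTO shelf VALUES (?, ?)', ('spam', 'eggs'))
db.commit()
db.close()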
> As a high-level language, Python should prevent people from corrupting
> data as much as possible.
"Data" is application-specific. Python has no idea how you intend to
use your data and therefore should not (even if it could) try to protect
you.
Regards,
Cliff