Rename file without overwriting existing files

Steve D'Aprano

Jan 29, 2017, 9:49:54 PM
This code contains a Time Of Check to Time Of Use bug:

    if os.path.exists(destination):
        raise ValueError('destination already exists')
    os.rename(oldname, destination)


In the microsecond between checking for the existence of the destination and
actually doing the rename, it is possible that another process may create
the destination, resulting in data loss.
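To make the window concrete, here is a minimal sketch that forces the race deterministically. The `interloper` callback is a hypothetical stand-in for the other process and is not part of the original snippet. On POSIX systems the final os.rename silently replaces whatever the interloper created.

```python
import os

def racy_rename(oldname, destination, interloper=None):
    """Check-then-rename, with an optional hook run inside the race window."""
    if os.path.exists(destination):
        raise ValueError('destination already exists')
    if interloper is not None:
        # Hypothetical stand-in for another process acting between
        # the check and the rename.
        interloper()
    # On POSIX this replaces any file the interloper just created.
    os.rename(oldname, destination)
```

If the interloper writes its own file at `destination`, the check has already passed, so on POSIX that file is lost without any error being raised.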

Apart from keeping my fingers crossed, how should I fix this TOCTOU bug?



--
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.

Chris Angelico

Jan 29, 2017, 10:27:23 PM
On Mon, Jan 30, 2017 at 1:49 PM, Steve D'Aprano
<steve+...@pearwood.info> wrote:
> This code contains a Time Of Check to Time Of Use bug:
>
>     if os.path.exists(destination):
>         raise ValueError('destination already exists')
>     os.rename(oldname, destination)
>
>
> In the microsecond between checking for the existence of the destination and
> actually doing the rename, it is possible that another process may create
> the destination, resulting in data loss.
>
> Apart from keeping my fingers crossed, how should I fix this TOCTOU bug?

The Linux kernel (sorry, I don't know about others) provides a
renameat2() system call that has the option of failing if the
destination exists. However, I can't currently see any way to call
that from CPython. Seems like an excellent feature request - another
keyword-only argument for os.rename(), like the directory file
descriptors.
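Until such an argument exists, one possible workaround is to call renameat2() through ctypes. This is only a sketch under several assumptions: Linux, a C library that exports a renameat2 wrapper (glibc does from 2.28, as far as I know), and a file system that supports the flag. The function name `rename_noreplace` is made up for illustration; the two constants are the Linux values.

```python
import ctypes
import os

AT_FDCWD = -100        # Linux value: resolve relative paths against the cwd
RENAME_NOREPLACE = 1   # Linux value: fail instead of replacing the destination

def rename_noreplace(src, dst):
    """Rename src to dst, raising FileExistsError if dst already exists."""
    libc = ctypes.CDLL(None, use_errno=True)
    try:
        renameat2 = libc.renameat2
    except AttributeError:
        raise NotImplementedError("this C library does not export renameat2")
    renameat2.argtypes = [ctypes.c_int, ctypes.c_char_p,
                          ctypes.c_int, ctypes.c_char_p, ctypes.c_uint]
    renameat2.restype = ctypes.c_int
    result = renameat2(AT_FDCWD, os.fsencode(src),
                       AT_FDCWD, os.fsencode(dst), RENAME_NOREPLACE)
    if result != 0:
        err = ctypes.get_errno()
        # OSError picks the matching subclass (e.g. FileExistsError) from errno.
        raise OSError(err, os.strerror(err), src, None, dst)
```

Because the kernel performs the existence check and the rename in one call, there is no window between the two.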

ChrisA

MRAB

Jan 29, 2017, 10:45:49 PM
On 2017-01-30 03:27, Chris Angelico wrote:
> On Mon, Jan 30, 2017 at 1:49 PM, Steve D'Aprano
> <steve+...@pearwood.info> wrote:
>> This code contains a Time Of Check to Time Of Use bug:
>>
>>     if os.path.exists(destination):
>>         raise ValueError('destination already exists')
>>     os.rename(oldname, destination)
>>
>>
>> In the microsecond between checking for the existence of the destination and
>> actually doing the rename, it is possible that another process may create
>> the destination, resulting in data loss.
>>
>> Apart from keeping my fingers crossed, how should I fix this TOCTOU bug?
>
> The Linux kernel (sorry, I don't know about others) provides a
> renameat2() system call that has the option of failing if the
> destination exists. However, I can't currently see any way to call
> that from CPython. Seems like an excellent feature request - another
> keyword-only argument for os.rename(), like the directory file
> descriptors.
>
On Windows it raises FileExistsError if the destination already exists.

shutil.move, on the other hand, replaces if the destination already exists.

Cameron Simpson

Jan 29, 2017, 11:34:05 PM
On 30Jan2017 13:49, Steve D'Aprano <steve+...@pearwood.info> wrote:
>This code contains a Time Of Check to Time Of Use bug:
>
>     if os.path.exists(destination):
>         raise ValueError('destination already exists')
>     os.rename(oldname, destination)
>
>
>In the microsecond between checking for the existence of the destination and
>actually doing the rename, it is possible that another process may create
>the destination, resulting in data loss.
>
>Apart from keeping my fingers crossed, how should I fix this TOCTOU bug?

For files this is a problem at the Python level. At the UNIX level you can play
neat games with open(2) and the various O_* modes.
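Presumably the relevant mode here is O_CREAT combined with O_EXCL, which makes "check and create" a single atomic step. A minimal sketch follows; the helper name `create_exclusive` is hypothetical. It only claims a name, it does not rename anything by itself.

```python
import os

def create_exclusive(path, mode=0o600):
    """Create an empty file at path; raise FileExistsError if it exists."""
    # O_CREAT | O_EXCL: the kernel refuses to open an existing name,
    # so there is no gap between the check and the creation.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, mode)
    os.close(fd)
```

The built-in open(path, 'x') gives the same exclusive-creation behaviour from Python 3.3 on.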

However, with directories things are more cut and dried. Do you have much freedom
here? What's the wider context of the question?

Cheers,
Cameron Simpson <c...@zip.com.au>

Steve D'Aprano

Jan 30, 2017, 5:16:39 AM
The wider context is that I'm taking from 1 to <arbitrarily huge number>
path names to existing files as arguments, and for each path name I
transform the file name part (but not the directory part) and then rename
the file. For example:

foo/bar/baz/spam.txt

may be renamed to:

foo/bar/baz/ham.txt

but only provided ham.txt doesn't already exist.

Peter Otten

Jan 30, 2017, 5:40:23 AM
Google finds

http://stackoverflow.com/questions/3222341/how-to-rename-without-race-conditions

and from a quick test it appears to work on Linux:

$ echo foo > foo
$ echo bar > bar
$ python3
Python 3.4.3 (default, Nov 17 2016, 01:08:31)
[GCC 4.8.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> def rename(source, dest):
...     os.link(source, dest)
...     os.unlink(source)
...
>>> rename("foo", "baz")
>>> os.listdir()
['bar', 'baz']
>>> rename("bar", "baz")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in rename
FileExistsError: [Errno 17] File exists: 'bar' -> 'baz'


Jussi Piitulainen

Jan 30, 2017, 5:55:54 AM
It doesn't seem to be documented. I looked at help(os.link) on Python
3.4 and the corresponding current library documentation on the web. I
saw no mention of what happens when dst exists already.

Also, creating a hard link doesn't seem to work between different file
systems, which may well be relevant to Steve's case. I get:

OSError: [Errno 18] Invalid cross-device link: [snip]

And that also is not mentioned in the docs.
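The two failure modes seen so far can be told apart by exception type and errno. The following is a small sketch; `try_link` and its return strings are made up for illustration, and the errno numbers in the comments are the Linux values.

```python
import errno
import os

def try_link(src, dst):
    """Attempt os.link and report the outcome as a short string."""
    try:
        os.link(src, dst)
    except FileExistsError:
        return "exists"            # errno 17 (EEXIST): dst is already there
    except OSError as exc:
        if exc.errno == errno.EXDEV:
            return "cross-device"  # errno 18: src and dst on different file systems
        raise                      # anything else is unexpected here
    return "linked"
```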

Wolfgang Maier

Jan 30, 2017, 6:25:30 AM
On 01/30/2017 03:49 AM, Steve D'Aprano wrote:
> This code contains a Time Of Check to Time Of Use bug:
>
>     if os.path.exists(destination):
>         raise ValueError('destination already exists')
>     os.rename(oldname, destination)
>
>
> In the microsecond between checking for the existence of the destination and
> actually doing the rename, it is possible that another process may create
> the destination, resulting in data loss.
>
> Apart from keeping my fingers crossed, how should I fix this TOCTOU bug?
>

There is a rather extensive discussion of this problem (with no good
cross-platform solution if I remember correctly):

https://mail.python.org/pipermail/python-ideas/2011-August/011131.html

which is related to http://bugs.python.org/issue12741

Wolfgang

Jon Ribbens

Jan 30, 2017, 8:38:06 AM
On 2017-01-30, Jussi Piitulainen <jussi.pi...@helsinki.fi> wrote:
> It doesn't seem to be documented. I looked at help(os.link) on Python
> 3.4 and the corresponding current library documentation on the web. I
> saw no mention of what happens when dst exists already.
>
> Also, creating a hard link doesn't seem to work between different file
> systems, which may well be relevant to Steve's case. I get:
>
> OSError: [Errno 18] Invalid cross-device link: [snip]
>
> And that also is not mentioned in the docs.

Nor *should* either of those things be mentioned in the Python docs.

A lot of the functions of the 'os' module do nothing but call the
underlying OS system call with the same name. It would not only be
redundant to copy the OS documentation into the Python documentation,
it would be misleading and wrong, because of course the behaviour may
vary slightly from OS to OS.

Peter Otten

Jan 30, 2017, 8:55:33 AM
However, the current Python version of link() is sufficiently different from
<https://linux.die.net/man/2/link>, say, to warrant its own documentation.

Peter Otten

Jan 30, 2017, 9:00:27 AM
Jussi Piitulainen wrote:

> Peter Otten writes:
>
>> Steve D'Aprano wrote:

>>> The wider context is that I'm taking from 1 to <arbitrarily huge number>
>>> path names to existing files as arguments, and for each path name I
>>> transform the file name part (but not the directory part) and then rename
>>> the file. For example:
>>>
>>> foo/bar/baz/spam.txt
>>>
>>> may be renamed to:
>>>
>>> foo/bar/baz/ham.txt
>>>
>>> but only provided ham.txt doesn't already exist.
>>
>> Google finds
>>
>> http://stackoverflow.com/questions/3222341/how-to-rename-without-race-conditions
>>
>> and from a quick test it appears to work on Linux:
>
> It doesn't seem to be documented.

For functions with a C equivalent a look into the man page is usually
helpful.

> I looked at help(os.link) on Python
> 3.4 and the corresponding current library documentation on the web. I
> saw no mention of what happens when dst exists already.
>
> Also, creating a hard link doesn't seem to work between different file
> systems, which may well be relevant to Steve's case.

In his example above he operates inside a single directory. Can one
directory spread across multiple file systems?

Jon Ribbens

Jan 30, 2017, 9:12:52 AM
On 2017-01-30, Peter Otten <__pet...@web.de> wrote:
> Jon Ribbens wrote:
>> A lot of the functions of the 'os' module do nothing but call the
>> underlying OS system call with the same name. It would not only be
>> redundant to copy the OS documentation into the Python documentation,
>> it would be misleading and wrong, because of course the behaviour may
>> vary slightly from OS to OS.
>
> However, the current Python version of link() is sufficiently different from
><https://linux.die.net/man/2/link>, say, to warrant its own documentation.

What are you referring to here? As far as I can see, the current
Python implementation of link() just calls the underlying OS call
directly.

Chris Angelico

Jan 30, 2017, 9:18:53 AM
On Tue, Jan 31, 2017 at 12:58 AM, Peter Otten <__pet...@web.de> wrote:
>> I looked at help(os.link) on Python
>> 3.4 and the corresponding current library documentation on the web. I
>> saw no mention of what happens when dst exists already.
>>
>> Also, creating a hard link doesn't seem to work between different file
>> systems, which may well be relevant to Steve's case.
>
> In his example above he operates inside a single directory. Can one
> directory spread across multiple file systems?

Yep. Try unionfs.

... make ourselves some scratch space ...
$ mkdir space modifier
$ dd if=/dev/zero of=space.img bs=4096 count=65536
$ mkfs space.img
$ sudo mount space.img space
$ dd if=/dev/zero of=modifier.img bs=4096 count=1024
$ mkfs modifier.img
$ sudo mount modifier.img modifier

... put some content into the base directory ...
$ sudo -e space/demo.txt

... and now the magic:
$ unionfs modifier=RW:space=RO joiner/
$ cd joiner

At this point, you're in a directory that is the union of the two
directories. One of them is read-only, the other is read/write. It is
thus possible to view a file that you can't hard-link to a new name,
because the new name would have to be created in the 'modifier' file
system, but the old file exists on the 'space' one.

ChrisA

Peter Otten

Jan 30, 2017, 9:38:38 AM
The current signature differs from that of link()

os.link(src, dst, *, src_dir_fd=None, dst_dir_fd=None, follow_symlinks=True)


but it looks like you are right in so far as link() is still called by
default:

    if ((src_dir_fd != DEFAULT_DIR_FD) ||
        (dst_dir_fd != DEFAULT_DIR_FD) ||
        (!follow_symlinks))
        result = linkat(src_dir_fd, src->narrow,
            dst_dir_fd, dst->narrow,
            follow_symlinks ? AT_SYMLINK_FOLLOW : 0);
    else
#endif /* HAVE_LINKAT */
        result = link(src->narrow, dst->narrow);


Jussi Piitulainen

Jan 30, 2017, 9:40:26 AM
Peter Otten writes:

> Jussi Piitulainen wrote:
>
>> Peter Otten writes:
>>
>>> Steve D'Aprano wrote:
>
>>>> The wider context is that I'm taking from 1 to <arbitrarily huge number>
>>>> path names to existing files as arguments, and for each path name I
>>>> transform the file name part (but not the directory part) and then rename
>>>> the file. For example:
>>>>
>>>> foo/bar/baz/spam.txt
>>>>
>>>> may be renamed to:
>>>>
>>>> foo/bar/baz/ham.txt
>>>>
>>>> but only provided ham.txt doesn't already exist.
>>>
>>> Google finds
>>>
>>> http://stackoverflow.com/questions/3222341/how-to-rename-without-race-conditions
>>>
>>> and from a quick test it appears to work on Linux:
>>
>> It doesn't seem to be documented.
>
> For functions with a C equivalent a look into the man page is usually
> helpful.

Followed by a few test cases to see what Python actually does, at least
in those particular test cases, I suppose. Yes.

But is it a bug in Python if a Python function *doesn't* do what the
relevant man page in the user's operating system says? Or whatever the
user's documentation entry is called. For me, yes, it's a man page.

>> I looked at help(os.link) on Python
>> 3.4 and the corresponding current library documentation on the web. I
>> saw no mention of what happens when dst exists already.
>>
>> Also, creating a hard link doesn't seem to work between different file
>> systems, which may well be relevant to Steve's case.
>
> In his example above he operates inside a single directory. Can one
> directory spread across multiple file systems?

Hm, you are right, he does say he's working in a single directory.

But *I'm* currently working on processes where results from a batch
system are eventually moved to another directory, and I have no control
over the file systems. So while it was interesting to learn about
os.link, I cannot use os.link here; on the other hand, I can use
shutil.move, and in my present case it will only accidentally overwrite
a file if I've made a programming mistake myself, or if the underlying
platform is not working as advertised, so I'm in a different situation.

[- -]

Terry Reedy

Jan 30, 2017, 10:00:39 AM
On 1/30/2017 8:58 AM, Peter Otten wrote:
> Jussi Piitulainen wrote:

>> It doesn't seem to be documented.
>
> For functions with a C equivalent a look into the man page is usually
> helpful.

Man pages do not exist on Windows. I suspect that there are more
individual Python programs on Windows than *nix. I am more sure of this
as applied to beginners, most of whom have no idea what a 'man' page is
(sexist docs? ;-).

--
Terry Jan Reedy

Peter Otten

Jan 30, 2017, 10:14:38 AM
Chris Angelico wrote:

> On Tue, Jan 31, 2017 at 12:58 AM, Peter Otten <__pet...@web.de> wrote:
>>> I looked at help(os.link) on Python
>>> 3.4 and the corresponding current library documentation on the web. I
>>> saw no mention of what happens when dst exists already.
>>>
>>> Also, creating a hard link doesn't seem to work between different file
>>> systems, which may well be relevant to Steve's case.
>>
>> In his example above he operates inside a single directory. Can one
>> directory spread across multiple file systems?
>
> Yep. Try unionfs.
>
> ... make ourselves some scratch space ...
> $ mkdir space modifier
> $ dd if=/dev/zero of=space.img bs=4096 count=65536
> $ mkfs space.img
> $ sudo mount space.img space
> $ dd if=/dev/zero of=modifier.img bs=4096 count=1024
> $ mkfs modifier.img
> $ sudo mount modifier.img modifier
>
> ... put some content into the base directory ...
> $ sudo -e space/demo.txt
>
> ... and now the magic:
> $ unionfs modifier=RW:space=RO joiner/
> $ cd joiner
>
> At this point, you're in a directory that is the union of the two
> directories. One of them is read-only, the other is read/write. It is
> thus possible to view a file that you can't hard-link to a new name,
> because the new name would have to be created in the 'modifier' file
> system, but the old file exists on the 'space' one.

Interesting example, thanks!

Jon Ribbens

Jan 30, 2017, 10:41:58 AM
On 2017-01-30, Peter Otten <__pet...@web.de> wrote:
> Jon Ribbens wrote:
>> On 2017-01-30, Peter Otten <__pet...@web.de> wrote:
>>> However, the current Python version of link() is sufficiently different
>>> from
>>><https://linux.die.net/man/2/link>, say, to warrant its own documentation.
>>
>> What are you referring to here? As far as I can see, the current
>> Python implementation of link() just calls the underlying OS call
>> directly.
>
> The current signature differs from that of link()
>
> os.link(src, dst, *, src_dir_fd=None, dst_dir_fd=None, follow_symlinks=True)
>
> but it looks like you are right in so far as link() is still called by
> default:

Yeah, it's been extended to call linkat() if the extra parameters are
provided, but it still calls link() if you call it like link().

So basically it's been extended analogously to how Linux has been,
and the OS manpage is still the right place to look to understand
what it does. (linkat() is documented as part of the same manpage as
link() anyway.)
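As an illustration of the extended form, the sketch below passes directory file descriptors so that both names are resolved relative to one already-opened directory, which is the path that ends up in linkat(). The helper name `link_in_directory` is hypothetical, and the os.supports_dir_fd check is there because not every platform accepts these arguments.

```python
import os

def link_in_directory(directory, old_name, new_name):
    """Hard-link old_name to new_name, both relative to one directory."""
    if os.link not in os.supports_dir_fd:
        raise NotImplementedError("os.link does not accept dir_fd arguments here")
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        # Supplying the dir_fd arguments selects linkat() rather than link().
        os.link(old_name, new_name, src_dir_fd=dir_fd, dst_dir_fd=dir_fd)
    finally:
        os.close(dir_fd)
```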

Grant Edwards

Jan 30, 2017, 10:48:39 AM
On 2017-01-30, Jussi Piitulainen <jussi.pi...@helsinki.fi> wrote:

> It doesn't seem to be documented. I looked at help(os.link) on Python
> 3.4 and the corresponding current library documentation on the web. I
> saw no mention of what happens when dst exists already.

The functions in the os module are thin-as-possible wrappers around
the OS's libc functions. The authors of the os module don't really
have any way of knowing the details of what your os/libc combination
is going to do.

If you're calling os.foo(), you're sort of expected to know what foo()
does on your OS.

> Also, creating a hard link doesn't seem to work between different file
> systems, which may well be relevant to Steve's case. I get:
>
> OSError: [Errno 18] Invalid cross-device link: [snip]
>
> And that also is not mentioned in the docs.

Again, that's a detail that depends on your particular
OS/libc/filesystem implementation. It's not determined by nor
knowable by the authors of the os module.

--
Grant Edwards               grant.b.edwards at gmail.com
Yow! I'm ANN LANDERS!! I can SHOPLIFT!!

Grant Edwards

Jan 30, 2017, 10:56:35 AM
On 2017-01-30, Terry Reedy <tjr...@udel.edu> wrote:
> On 1/30/2017 8:58 AM, Peter Otten wrote:
>> Jussi Piitulainen wrote:
>
>>> It doesn't seem to be documented.
>>
>> For functions with a C equivalent a look into the man page is usually
>> helpful.
>
> Man pages do not exist on Windows. I suspect that there are more
> individual Python programs on Windows than *nix. I am more sure of this
> as applied to beginners, most of whom have no idea what a 'man' page is
> (sexist docs? ;-).

IMO, beginners shouldn't be using the os module. This is implied by
the first paragraph of the doc:

If you just want to read or write a file see open(), if you want
to manipulate paths, see the os.path module, and if you want to
read all the lines in all the files on the command line see the
fileinput module. For creating temporary files and directories
see the tempfile module, and for high-level file and directory
handling see the shutil module.

When you're using the OS module, you're just writing C code in Python.
[Which, BTW, is a _very_ useful thing to be able to do, but it's
probably not what beginners are trying to do.]

I always found the first sentence to be a bit funny:

This module provides a portable way of using operating system
dependent functionality.

I understand what they're trying to say, but I always found it amusing
to say you're going to provide a portable way to do something
non-portable.

--
Grant Edwards               grant.b.edwards at gmail.com
Yow! Somewhere in DOWNTOWN BURBANK a prostitute is OVERCOOKING a LAMB CHOP!!

Marko Rauhamaa

Jan 30, 2017, 11:12:01 AM
Grant Edwards <grant.b...@gmail.com>:

> IMO, beginners shouldn't be using the os module.

Hard to know. Depends on what the beginner wants to accomplish.

> I always found the first sentence to be a bit funny:
>
> This module provides a portable way of using operating system
> dependent functionality.
>
> I understand what they're trying to say, but I always found it amusing
> to say you're going to provide a portable way to do something
> non-portable.

One of the best things in Python is that it has exposed the operating
system to the application programmer. I don't think a programming
language should abstract the operating system away.


Marko

Ben Finney

Jan 30, 2017, 7:18:23 PM
Peter Otten <__pet...@web.de> writes:

> http://stackoverflow.com/questions/3222341/how-to-rename-without-race-conditions
>
> and from a quick test it appears to work on Linux:

By “works on Linux”, I assume you mean “works on filesystems that use
inodes and hard links”. That is not true for all filesystems, even on
Linux.

--
\ “[Entrenched media corporations will] maintain the status quo, |
`\ or die trying. Either is better than actually WORKING for a |
_o__) living.” —ringsnake.livejournal.com, 2007-11-12 |
Ben Finney

Steve D'Aprano

Feb 1, 2017, 7:21:43 PM
On Tue, 31 Jan 2017 02:56 am, Grant Edwards wrote:

> On 2017-01-30, Terry Reedy <tjr...@udel.edu> wrote:
>> On 1/30/2017 8:58 AM, Peter Otten wrote:
>>> Jussi Piitulainen wrote:
>>
>>>> It doesn't seem to be documented.
>>>
>>> For functions with a C equivalent a look into the man page is usually
>>> helpful.
>>
>> Man pages do not exist on Windows. I suspect that there are more
>> individual Python programs on Windows than *nix. I am more sure of this
>> as applied to beginners, most of whom have no idea what a 'man' page is
>> (sexist docs? ;-).
>
> IMO, beginners shouldn't be using the os module.

Do you mean beginners to Python or beginners to programming, or both?


[...]
> I always found the first sentence to be a bit funny:
>
> This module provides a portable way of using operating system
> dependent functionality.
>
> I understand what they're trying to say, but I always found it amusing
> to say you're going to provide a portable way to do something
> non-portable.

Fortunately, as Python has matured as a language, it has moved away from
that simplistic "dumb interface to OS specific functions".

For example, os.urandom:

- calls the getrandom() syscall if available;
- falls back on /dev/urandom if not;
- calls CryptGenRandom() on Windows;
- and getentropy() on BSD.

https://docs.python.org/3.6/library/os.html#os.urandom

Similarly, quite a few os functions either come in a pair of flavours, one
returning a string, the other returning bytes, or they take either a byte
argument or a string argument. Some even accept pathlib path objects.
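A short sketch of what that looks like in practice. The helper names are made up for illustration, and passing a pathlib path to os.stat assumes Python 3.6 or later.

```python
import os
import pathlib

def listing_types(directory):
    """Return the element types os.listdir yields for str and bytes input."""
    str_names = os.listdir(os.fsdecode(directory))
    bytes_names = os.listdir(os.fsencode(directory))
    return ({type(n) for n in str_names}, {type(n) for n in bytes_names})

def stat_via_pathlib(path):
    """os.stat accepts a pathlib.Path directly on Python 3.6+."""
    return os.stat(pathlib.Path(path))
```

os.getcwd() and os.getcwdb() are an example of the paired-function style: the first returns a str, the second bytes.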


Python is a high-level, platform-independent language, and the more
platform-independent wrappers around platform-dependent functions, the
better.

Steve D'Aprano

Feb 9, 2017, 6:46:47 AM
On Mon, 30 Jan 2017 09:39 pm, Peter Otten wrote:

> >>> def rename(source, dest):
> ...     os.link(source, dest)
> ...     os.unlink(source)
> ...
> >>> rename("foo", "baz")
> >>> os.listdir()
> ['bar', 'baz']
> >>> rename("bar", "baz")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "<stdin>", line 2, in rename
> FileExistsError: [Errno 17] File exists: 'bar' -> 'baz'


Thanks Peter!

That's not quite ideal, as it isn't atomic: it is possible that the link
will succeed, but the unlink won't. But I prefer that over the alternative,
which is over-writing a file and causing data loss.

So to summarise, os.rename(source, destination):

- is atomic on POSIX systems, if source and destination are both on the
same file system;

- may not be atomic on Windows?

- may over-write an existing destination on POSIX systems, but not on
Windows;

- and it doesn't work across file systems.

os.replace(source, destination) is similar, except that it may over-write an
existing destination on Windows as well as on POSIX systems.
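For comparison, a tiny sketch of the os.replace behaviour described above. The function name and file names are only illustrative.

```python
import os

def overwrite_demo(directory):
    """Show that os.replace silently replaces an existing destination."""
    src = os.path.join(directory, "spam.txt")
    dst = os.path.join(directory, "ham.txt")
    with open(src, "w") as f:
        f.write("new")
    with open(dst, "w") as f:
        f.write("old")
    os.replace(src, dst)   # no error, even though dst already existed
    with open(dst) as f:
        return f.read()    # the old contents of dst are gone
```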


The link/unlink trick:

- avoids over-writing existing files on POSIX systems at least;

- but maybe not Windows?

- isn't atomic, so in the worst case you end up with two links to
the one file;

- but os.link may not be available on all platforms;

- and it won't work across file systems.


Putting that all together, here's my attempt at a version of file rename
which doesn't over-write existing files:


import os
import shutil

def rename(src, dest):
    """Rename src to dest only if dest doesn't already exist (almost)."""
    if hasattr(os, 'link'):
        try:
            os.link(src, dest)
        except OSError:
            pass
        else:
            os.unlink(src)
            return
    # Fallback to an implementation which is vulnerable to a
    # Time Of Check to Time Of Use bug.
    # Try to reduce the window for this race condition by minimizing
    # the number of lookups needed between one call and the next.
    move = shutil.move
    if not os.path.exists(dest):
        move(src, dest)
    else:
        raise shutil.Error("Destination path '%s' already exists" % dest)



Any comments? Any bugs? Any cross-platform way to slay this TOCTOU bug once
and for all?
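One observation on the draft: `except OSError: pass` also swallows FileExistsError, so a destination that os.link has just reported as existing is handed to the racy fallback. A possible variant, offered as a sketch rather than a vetted implementation, lets that case propagate and only falls back for other link failures. The name `rename_noclobber` is made up, and it raises FileExistsError where the draft raises shutil.Error.

```python
import os
import shutil

def rename_noclobber(src, dest):
    """Rename src to dest, refusing to replace an existing dest where possible."""
    if hasattr(os, 'link'):
        try:
            os.link(src, dest)
        except FileExistsError:
            raise      # dest is already there: report it, do not fall back
        except OSError:
            pass       # e.g. cross-device link or no hard-link support
        else:
            os.unlink(src)
            return
    # Racy fallback, same Time Of Check to Time Of Use caveat as the draft.
    if os.path.exists(dest):
        raise FileExistsError("Destination path '%s' already exists" % dest)
    shutil.move(src, dest)
```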

Steve D'Aprano

Feb 9, 2017, 6:52:03 AM
On Tue, 31 Jan 2017 11:17 am, Ben Finney wrote:

> Peter Otten <__pet...@web.de> writes:
>
>> http://stackoverflow.com/questions/3222341/how-to-rename-without-race-conditions
>>
>> and from a quick test it appears to work on Linux:
>
> By “works on Linux”, I assume you mean “works on filesystems that use
> inodes and hard links”. That is not true for all filesystems, even on
> Linux.


Indeed it is not, and we're often very sloppy about describing file system
differences as if they were OS differences.

Jussi Piitulainen

Feb 9, 2017, 7:54:52 AM
To claim the filename before crossing a filesystem boundary, how about:

1) create a temporary file in the target directory (tempfile.mkstemp)

2) link the temporary file to the target name (in the same directory)

3) unlink the temporary name

4) now it should be safe to move the source file to the target name

5) set permissions and whatever other attributes there are?

Or maybe copy the source file to the temporary name, link the copy to
the target name, unlink the temporary name, unlink the source file;
failing the link step: unlink the temporary name but do not unlink the
source file.
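The second variant (copy to a temporary name, then link) might look roughly like this. It is a sketch under assumptions: the target file system supports hard links, and shutil.copy2 carries over enough metadata for the purpose. The name `move_noreplace` is hypothetical.

```python
import os
import shutil
import tempfile

def move_noreplace(src, dst):
    """Move src to dst, possibly across file systems, without replacing dst."""
    dst_dir = os.path.dirname(os.path.abspath(dst))
    # The temporary name lives in the target directory, so the later
    # hard link never crosses a file system boundary.
    fd, tmp = tempfile.mkstemp(dir=dst_dir)
    os.close(fd)
    try:
        shutil.copy2(src, tmp)   # copy data and metadata onto the temp name
        os.link(tmp, dst)        # raises FileExistsError if dst exists
    finally:
        os.unlink(tmp)           # the temporary name is removed either way
    os.unlink(src)               # reached only if the link step succeeded
```

If the link step fails, the exception propagates before the source is unlinked, which matches the "do not unlink the source file" clause.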

eryk sun

Feb 9, 2017, 8:08:46 AM
On Thu, Feb 9, 2017 at 11:46 AM, Steve D'Aprano
<steve+...@pearwood.info> wrote:
>
> So to summarise, os.rename(source, destination):
>
> - is atomic on POSIX systems, if source and destination are both on the
> same file system;
> - may not be atomic on Windows?
> - may over-write an existing destination on POSIX systems, but not on
> Windows;
> - and it doesn't work across file systems.

On Windows in 2.7 and prior to 3.3, os.rename will silently copy and
delete when the destination isn't on the same volume. It may even
silently leave the original file in place in some cases -- e.g. when
the file is read-only and the user isn't allowed to modify the file
attributes.

If the destination is on the same volume, renaming should be atomic
via the system calls NtOpenFile and NtSetInformationFile. Ultimately
it depends on the file system implementation of
IRP_MJ_SET_INFORMATION, FileRenameInformation [1].

> The link/unlink trick:
> - avoids over-writing existing files on POSIX systems at least;
> - but maybe not Windows?

This works for renaming files on Windows as long as the file system
supports hard links (e.g. NTFS). It's not documented on MSDN, but
WinAPI CreateHardLink is implemented by calling NtSetInformationFile
to set the FileLinkInformation, with ReplaceIfExists set to FALSE, so
it fails if the destination exists. Note that this does not allow
renaming directories. See the note for FileLinkInformation [1]; NTFS
doesn't allow directory hard links. But why bother with this 'trick'
on Windows?

[1]: https://msdn.microsoft.com/en-us/library/ff549366

Steve D'Aprano

Feb 11, 2017, 11:09:26 PM
On Fri, 10 Feb 2017 12:07 am, eryk sun wrote:

> On Thu, Feb 9, 2017 at 11:46 AM, Steve D'Aprano
> <steve+...@pearwood.info> wrote:
>>
>> So to summarise, os.rename(source, destination):
>>
>> - is atomic on POSIX systems, if source and destination are both on the
>> same file system;
>> - may not be atomic on Windows?
>> - may over-write an existing destination on POSIX systems, but not on
>> Windows;
>> - and it doesn't work across file systems.
>
> On Windows in 2.7 and prior to 3.3, os.rename will silently copy and
> delete when the destination isn't on the same volume.


Will the copy overwrite an existing file?


> It may even
> silently leave the original file in place in some cases -- e.g. when
> the file is read-only and the user isn't allowed to modify the file
> attributes.
>
> If the destination is on the same volume, renaming should be atomic
> via the system calls NtOpenFile and NtSetInformationFile. Ultimately
> it depends on the file system implementation of
> IRP_MJ_SET_INFORMATION, FileRenameInformation [1].
>
>> The link/unlink trick:
>> - avoids over-writing existing files on POSIX systems at least;
>> - but maybe not Windows?
>
> This works for renaming files on Windows as long as the file system
> supports hard links (e.g. NTFS). It's not documented on MSDN, but
> WinAPI CreateHardLink is implemented by calling NtSetInformationFile
> to set the FileLinkInformation, with ReplaceIfExists set to FALSE, so
> it fails if the destination exists. Note that this does not allow
> renaming directories. See the note for FileLinkInformation [1]; NTFS
> doesn't allow directory hard links. But why bother with this 'trick'
> on Windows?

I don't know, that's why I'm asking for guidance. I don't have a Windows
system to test on.

On Windows, how would you implement a file rename (potentially across file
system boundaries) which will not overwrite existing files? Just by calling
os.rename()?

eryk sun

Feb 12, 2017, 10:32:11 PM
On Sun, Feb 12, 2017 at 4:09 AM, Steve D'Aprano
<steve+...@pearwood.info> wrote:
> On Fri, 10 Feb 2017 12:07 am, eryk sun wrote:
>
>> On Thu, Feb 9, 2017 at 11:46 AM, Steve D'Aprano
>> <steve+...@pearwood.info> wrote:
>>>
>>> So to summarise, os.rename(source, destination):
>>>
>>> - is atomic on POSIX systems, if source and destination are both on the
>>> same file system;
>>> - may not be atomic on Windows?
>>> - may over-write an existing destination on POSIX systems, but not on
>>> Windows;
>>> - and it doesn't work across file systems.
>>
>> On Windows in 2.7 and prior to 3.3, os.rename will silently copy and
>> delete when the destination isn't on the same volume.
>
> Will the copy overwrite an existing file?

2.7/3.2 calls MoveFile. This is effectively the same as (but not
necessarily implemented by) MoveFileEx with the flag
MOVEFILE_COPY_ALLOWED. It will not replace an existing file whether or
not the target is on the same volume. For a cross-volume move, it's
effectively a CopyFile followed by DeleteFile. If deleting the source
file fails, it tries to reset the file attributes and retries the
delete.

> On Windows, how would you implement a file rename (potentially across file
> system boundaries) which will not overwrite existing files? Just by calling
> os.rename()?

I'd prefer to always call shutil.move on Windows. In 2.7 the os.rename
call in shutil.move will copy and delete for a cross-volume move of a
file, but fail for a directory. In the latter case shutil.move falls
back on shutil.copytree and shutil.rmtree. The OS won't allow
replacing an existing file with a directory, so that's not a problem.

In 3.3+ os.rename doesn't move files across volumes on any platform --
I think. In this case the default copy2 function used by shutil.move
is a problem for everyone. Ideally we want "x" mode instead of "w"
mode for creating the destination file; "x" mode raises FileExistsError if the file already exists.
3.5+ allows using a custom copy function to implement this.
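A sketch of such a copy function follows. The names `copy_exclusive` and `move_exclusive` are hypothetical. Note that shutil.move tries os.rename first, so the custom function is only reached when the rename fails, for instance across file systems; on POSIX a same-file-system move can still replace an existing file.

```python
import shutil

def copy_exclusive(src, dst):
    """Copy src to dst like shutil.copy2, but fail if dst already exists."""
    # 'x' mode is exclusive creation: open() raises FileExistsError
    # instead of truncating an existing file.
    with open(src, 'rb') as fsrc, open(dst, 'xb') as fdst:
        shutil.copyfileobj(fsrc, fdst)
    shutil.copystat(src, dst)
    return dst

def move_exclusive(src, dst):
    """shutil.move with the exclusive copy function (Python 3.5+)."""
    return shutil.move(src, dst, copy_function=copy_exclusive)
```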