
No os.copy()? Why not?


John Ladasky

Mar 28, 2012, 4:12:30 PM
I'm looking for a Python (2.7) equivalent to the Unix "cp" command.
Since the equivalents of "rm" and "mkdir" are in the os module, I
figured I'd look there. I haven't found anything in the documentation.
I am also looking through the Python source code in os.py and its
child, posixfile.py.

Any help? Thanks.

alex23

Mar 29, 2012, 12:50:52 AM
On Mar 29, 6:12 am, John Ladasky <john_lada...@sbcglobal.net> wrote:
> I'm looking for a Python (2.7) equivalent to the Unix "cp" command.
> Any help?  Thanks.

Try the shutil module: http://docs.python.org/library/shutil.html

John Ladasky

Mar 30, 2012, 5:25:05 AM
Many thanks! That's what I was looking for.

Ian Kelly

Apr 2, 2012, 4:48:38 PM
to John Ladasky, pytho...@python.org
The os module wraps system calls, not shell commands. You want the
shutil module, not the os module.
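For reference, a minimal sketch of the shutil approach. It is written for Python 3 rather than the 2.7 asked about, and the file names and temporary directory are placeholders chosen for illustration only.

```python
import os
import shutil
import tempfile

# Work in a throwaway directory so the example has no side effects.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "src.txt")    # placeholder name
dst = os.path.join(workdir, "sink.txt")   # placeholder name

with open(src, "w") as f:
    f.write("hello\n")

# shutil.copy copies the file's data and permission bits,
# much like a plain "cp src sink".
shutil.copy(src, dst)

with open(dst) as f:
    copied = f.read()

shutil.rmtree(workdir)
```

If the destination is an existing directory, shutil.copy places the file inside it under the source's base name, which also matches what cp does.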

HoneyMonster

Apr 2, 2012, 5:11:52 PM
One way:
import os

os.system ("cp src sink")

Thomas Rachel

Apr 3, 2012, 2:24:53 AM
Am 02.04.2012 23:11 schrieb HoneyMonster:

> One way:
> import os
>
> os.system ("cp src sink")

Yes. The worst way you could imagine.

Why not the much much better

import subprocess
subprocess.call(['cp', 'src', 'sink'])

?

Then you can call it with (really) arbitrary file names:


def call_cp(src, dst):
    import subprocess
    subprocess.call(['cp', '--', src, dst])

Try that with os.system() and src="That's my file"...


Thomas
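To make the quoting problem concrete, here is a small sketch (Python 3; the file name is the one from the post, everything else is illustrative). It uses shlex.split as a stand-in for how a POSIX shell would tokenize the command string, so nothing is actually copied.

```python
import shlex

filename = "That's my file"

# Naive string concatenation, as one might do for os.system():
command = "cp " + filename + " sink"

try:
    shlex.split(command)   # tokenize roughly the way a POSIX shell would
    parsed_ok = True
except ValueError:
    # The lone apostrophe opens a quote that is never closed.
    parsed_ok = False

# If a command string really is needed, shlex.quote escapes one argument.
quoted_command = "cp -- " + shlex.quote(filename) + " sink"
quoted_args = shlex.split(quoted_command)

# An argument list (what subprocess.call takes) needs no quoting at all:
# each element reaches the program exactly as written.
argv = ["cp", "--", filename, "sink"]
```

The list form sidesteps the whole issue, which is why it is usually the safer default.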

John Ladasky

Apr 3, 2012, 5:34:33 AM
I use subprocess.call() for quite a few other things.

I just figured that I should use the tidier modules whenever I can.

Ian Kelly

Apr 3, 2012, 12:29:24 PM
to pytho...@python.org
On Tue, Apr 3, 2012 at 12:24 AM, Thomas Rachel
<nutznetz-0c1b6768-bfa9...@spamschutz.glglgl.de>
wrote:
> Am 02.04.2012 23:11 schrieb HoneyMonster:
>
>
>> One way:
>> import os
>>
>> os.system ("cp src sink")
>
>
> Yes. The worst way you could imagine.
>
> Why not the much much better
>
> import subprocess
> subprocess.call(['cp', 'src', 'sink'])


In any case, either of these approaches will only work in UNIX,
whereas shutil is cross-platform.

D'Arcy Cain

Apr 3, 2012, 3:46:31 PM
to john_l...@sbcglobal.net, pytho...@python.org
cp is not a system command, it's a shell command. Why not just use the
incredibly simple and portable

>>>open("outfile", "w").write(open("infile").read())

put it into a method if you find that too much to type:

def cp(infile, outfile):
    open(outfile, "w").write(open(infile).read())

--
D'Arcy J.M. Cain <da...@druid.net> | Democracy is three wolves
http://www.druid.net/darcy/ | and a sheep voting on
+1 416 425 1212 (DoD#0082) (eNTP) | what's for dinner.
IM: da...@Vex.Net

Tycho Andersen

Apr 3, 2012, 4:10:32 PM
to D'Arcy Cain, pytho...@python.org, john_l...@sbcglobal.net
On Tue, Apr 03, 2012 at 03:46:31PM -0400, D'Arcy Cain wrote:
> On 03/28/12 16:12, John Ladasky wrote:
> cp is not a system command, it's a shell command. Why not just use the
> incredibly simple and portable
>
> >>>open("outfile", "w").write(open("infile").read())

Note, though, that this reads the whole file into memory. As many
others have said, shutil is the most idiomatic option.
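As a rough sketch of the difference (Python 3; copy_contents and its chunk_size default are names made up for this example), a copy that never holds more than one chunk in memory can lean on shutil.copyfileobj:

```python
import shutil


def copy_contents(infile, outfile, chunk_size=64 * 1024):
    """Copy file data in fixed-size chunks (hypothetical helper).

    Only the data is copied; permissions and timestamps are not.
    """
    # Binary mode avoids newline translation, and the with-block
    # closes both files even if the copy fails part-way.
    with open(infile, "rb") as fsrc, open(outfile, "wb") as fdst:
        shutil.copyfileobj(fsrc, fdst, chunk_size)
```

Memory use stays near chunk_size regardless of how large the input is.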

\t

Evan Driscoll

Apr 3, 2012, 4:21:25 PM
to Tycho Andersen, pytho...@python.org, john_l...@sbcglobal.net
On Apr 3, 2012, Tycho Andersen wrote:
> Note, though, that this reads the whole file into memory. As many
> others have said, shutil is the most idiomatic option.

* most idiomatic
* clearest in terms of showing intent
* potentially fastest
* hardest to screw up (unlike concatenating file names into a
'system' call)
* has at least a snowball's chance of persisting metadata associated
with the file (even if shutil doesn't now, it could theoretically
change)
* portable

There's basically nothing NOT to like about shutil, and at least one
significant problem with each of the other suggestions.

Evan

Steven D'Aprano

Apr 4, 2012, 1:53:44 AM
On Tue, 03 Apr 2012 15:46:31 -0400, D'Arcy Cain wrote:

> On 03/28/12 16:12, John Ladasky wrote:
>> I'm looking for a Python (2.7) equivalent to the Unix "cp" command.
>> Since the equivalents of "rm" and "mkdir" are in the os module, I
>> figured I look there. I haven't found anything in the documentation. I
>> am also looking through the Python source code in os.py and its child,
>> posixfile.py.
>
> cp is not a system command, it's a shell command. Why not just use the
> incredibly simple and portable
>
> >>>open("outfile", "w").write(open("infile").read())
>
> put it into a method if you find that too much to type:
>
> def cp(infile, outfile):
>    open(outfile, "w").write(open(infile).read())


Because your cp doesn't copy the FILE, it copies the file's CONTENTS,
which are not the same thing.

Consider:

* permissions
* access times
* file ownership
* other metadata
* alternate streams and/or resource fork, on platforms that support them
* sparse files


By the time you finish supporting the concept of copying the file itself,
rather than merely its content, you will have something similar to the
shutil.copy command -- only less tested.



--
Steven

Chris Angelico

Apr 4, 2012, 4:37:20 AM
to pytho...@python.org
On Wed, Apr 4, 2012 at 3:53 PM, Steven D'Aprano
<steve+comp....@pearwood.info> wrote:
> On Tue, 03 Apr 2012 15:46:31 -0400, D'Arcy Cain wrote:
>
>> def cp(infile, outfile):
>>    open(outfile, "w").write(open(infile).read())
>
> Because your cp doesn't copy the FILE, it copies the file's CONTENTS,
> which are not the same thing.

And, as a subtle point: This method can't create the file "at size". I
don't know how it'll end up allocating space, but certainly there's no
opportunity to announce to the OS at file open/create time "please
allocate X bytes for this file". That may be an utterly trivial point,
or a crucially vital one.

ChrisA
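For what it's worth, later Python versions (3.3 and up, on Unix) expose os.posix_fallocate, which can make such a request after the file has been opened. The sketch below is only an illustration under that assumption; create_at_size is a made-up helper name, and the fallback merely sets the length without reserving space.

```python
import os


def create_at_size(path, size):
    """Create path and try to reserve size bytes for it (hypothetical helper)."""
    with open(path, "wb") as f:
        if hasattr(os, "posix_fallocate"):   # Unix only, Python 3.3+
            try:
                # Ask the OS to allocate bytes [0, size) for this file.
                os.posix_fallocate(f.fileno(), 0, size)
                return
            except OSError:
                pass  # e.g. the filesystem does not support the request
        # Fallback: sets the file length, but the result may be sparse.
        f.truncate(size)
```

Whether the allocation is contiguous, or even honored, is still up to the operating system and filesystem.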

Alain Ketterlin

Apr 4, 2012, 5:22:45 AM
Steven D'Aprano <steve+comp....@pearwood.info> writes:

> On Tue, 03 Apr 2012 15:46:31 -0400, D'Arcy Cain wrote:
>
>> On 03/28/12 16:12, John Ladasky wrote:

>>> I'm looking for a Python (2.7) equivalent to the Unix "cp" command.

>> >>>open("outfile", "w").write(open("infile").read())

> Because your cp doesn't copy the FILE, it copies the file's CONTENTS,
> which are not the same thing.
> Consider:
> * permissions
> * access times
> * file ownership
> * other metadata
> * alternate streams and/or resource fork, on platforms that support them
> * sparse files
> By the time you finish supporting the concept of copying the file itself,
> rather than merely its content, you will have something similar to the
> shutil.copy command -- only less tested.

A minor point, but shutil.copy only "copies" contents and permissions
(no access times, etc.) You probably mean shutil.copy2.

And sparse files are really hard to reproduce, at least on Unix: on
Linux even the system's cp doesn't guarantee sparseness of the copy (the
manual mentions a "crude heuristic").

But of course shutil.copy is the best solution to mimic a raw cp.

-- Alain.
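A small sketch of the copy versus copy2 difference (Python 3; the file names and the backdated timestamp are arbitrary choices for the demonstration):

```python
import os
import shutil
import tempfile

workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "src.txt")
plain = os.path.join(workdir, "plain.txt")
full = os.path.join(workdir, "full.txt")

with open(src, "w") as f:
    f.write("data\n")

# Backdate the source so a preserved timestamp is easy to tell apart.
old_time = 1000000000  # seconds since the epoch (September 2001)
os.utime(src, (old_time, old_time))

shutil.copy(src, plain)   # data + permission bits
shutil.copy2(src, full)   # data + permission bits + timestamps, where possible

plain_mtime = os.stat(plain).st_mtime   # roughly "now"
full_mtime = os.stat(full).st_mtime     # carried over from src

shutil.rmtree(workdir)
```

Neither function copies ownership, so copy2 is closer to "cp -p" than to a full archive copy.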

Roy Smith

Apr 4, 2012, 8:08:31 AM
On Tue, 03 Apr 2012 15:46:31 -0400, D'Arcy Cain wrote:
> > cp is not a system command, it's a shell command. Why not just use the
> > incredibly simple and portable
> >
> > >>>open("outfile", "w").write(open("infile").read())

In article <4f7be1e8$0$29999$c3e8da3$5496...@news.astraweb.com>,
Steven D'Aprano <steve+comp....@pearwood.info> wrote:

> Because your cp doesn't copy the FILE, it copies the file's CONTENTS,
> which are not the same thing.

Not to mention that this will read the entire contents of the file into
memory at once. Probably don't want to do that with 100 GB of data.

Slightly off-topic, but are there file systems these days which support
off-line copying? If I have a disk at the other end of a network link,
it would be nice to tell the disk to copy a file and tell me when it's
done. As opposed to dragging all that data over the network just so I
can buffer it in local memory and shove it right back out the network
port to the same disk. That kind of stuff used to be standard practice
in the neanderthalic days of IBM mainframes.

Roy Smith

Apr 4, 2012, 8:14:18 AM
In article <87fwcj4...@dpt-info.u-strasbg.fr>,
Alain Ketterlin <al...@dpt-info.u-strasbg.fr> wrote:

> And sparse files are really hard to reproduce, at least on Unix: on
> Linux even the system's cp doesn't guarantee sparseness of the copy (the
> manual mentions a "crude heuristic").

I imagine the heuristic is to look for blocks of all zeros. The problem
is, unless you know the block size of the file system, you can only
guess as to how many zeros in a row you need to look for.

In the old days, dump/restore used to know about sparse files. But
things like dump/restore really get inside the file system's kimono. In
today's world of SANs, WANs, and all sorts of virtual file-system-ish
things, I would expect that's less common.

Chris Angelico

Apr 4, 2012, 8:17:20 AM
to pytho...@python.org
On Wed, Apr 4, 2012 at 10:08 PM, Roy Smith <r...@panix.com> wrote:
> Slightly off-topic, but are there file systems these days which support
> off-line copying?  If I have a disk at the other end of a network link,
> it would be nice to tell the disk to copy a file and tell me when it's
> done.

Depends on your network protocol. One of the coolest and oldest tricks
with FTP is initiating a file transfer from one remote host to
another; I've never done it but it ought to work with localhost (ie
two sessions to the same host).

ChrisA

Thomas Rachel

Apr 4, 2012, 8:30:35 AM
Am 03.04.2012 11:34 schrieb John Ladasky:
> I use subprocess.call() for quite a few other things.
>
> I just figured that I should use the tidier modules whenever I can.

Of course. I only wanted to point out that os.system() is an even worse
approach. shutil.copy() is by far better, of course.

Steve Howell

Apr 4, 2012, 11:15:54 AM
On Apr 4, 1:37 am, Chris Angelico <ros...@gmail.com> wrote:
> On Wed, Apr 4, 2012 at 3:53 PM, Steven D'Aprano
>
FWIW shutil.py doesn't do anything particularly fancy with respect to
creating files "at size", unless I'm missing something:

http://hg.python.org/cpython/file/2.7/Lib/shutil.py

Only one level away from copyfile, you have copyfileobj, which is a
read/write loop:

def copyfileobj(fsrc, fdst, length=16*1024):
    """copy data from file-like object fsrc to file-like object fdst"""
    while 1:
        buf = fsrc.read(length)
        if not buf:
            break
        fdst.write(buf)

...and that gets called by copyfile, which only does a little bit of
"os"-related stuff:

def copyfile(src, dst):
    """Copy data from src to dst"""
    if _samefile(src, dst):
        raise Error("`%s` and `%s` are the same file" % (src, dst))

    for fn in [src, dst]:
        try:
            st = os.stat(fn)
        except OSError:
            # File most likely does not exist
            pass
        else:
            # XXX What about other special files? (sockets, devices...)
            if stat.S_ISFIFO(st.st_mode):
                raise SpecialFileError("`%s` is a named pipe" % fn)

    with open(src, 'rb') as fsrc:
        with open(dst, 'wb') as fdst:
            copyfileobj(fsrc, fdst)

The "value add" vs. a simple read/write loop depends on whether you
want OSError suppressed. The _samefile guard is nice to have, but
probably unnecessary for many apps.

I'm sure shutil.copyfile() makes perfect sense for most use cases, and
it's nice that you can see what it does under the covers pretty
easily, but it's not rocket science.


Nobody

Apr 4, 2012, 3:37:49 PM
On Wed, 04 Apr 2012 08:14:18 -0400, Roy Smith wrote:

>> And sparse files are really hard to reproduce, at least on Unix: on
>> Linux even the system's cp doesn't guarantee sparseness of the copy (the
>> manual mentions a "crude heuristic").
>
> I imagine the heuristic is to look for blocks of all zeros.

Yes. Although it's not really accurate to describe it as a "heuristic".

With --sparse=always, it will try to make the output sparse regardless of
whether the input was sparse, replacing any all-zeros block with a hole.

The default of --sparse=auto will only create a sparse file if the input
itself is sparse, i.e. if the length of the file rounded up to the nearest
block exceeds its disk usage.

Regardless of the --sparse= setting and whether the input was sparse, if
it tries to create a sparse file it will create holes wherever possible
rather than attempting to preserve the exact pattern of holes in a sparse
input file.
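The "is the input sparse" test can be approximated from Python as well. This is a simplified sketch, not cp's actual code: it skips the rounding to a whole block, the function names are invented here, and it assumes a Unix platform where st_blocks counts 512-byte units.

```python
import os


def looks_sparse(size_bytes, blocks_allocated, block_unit=512):
    """Return True if the allocated space is smaller than the file length."""
    return blocks_allocated * block_unit < size_bytes


def file_looks_sparse(path):
    """Apply the check to a real file (Unix only: needs st_blocks)."""
    st = os.stat(path)
    return looks_sparse(st.st_size, st.st_blocks)
```

For example, a file reporting a length of 1,000,000 bytes but only 8 allocated blocks (4096 bytes) would be flagged, while a 4096-byte file occupying 8 blocks would not.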
