Do we need "cp -a", or would 'cp -pR' do?

Dr. David Kirkby

Nov 7, 2009, 3:48:49 AM
to sage-...@googlegroups.com
My recent attempt to create a binary distribution for Solaris (see thread
"What directories should go into a binary distribution?") failed because the
'-a' option is passed to the 'cp' command in

SAGE_ROOT/local/bin/sage-bdist.

See trac http://trac.sagemath.org/sage_trac/ticket/7407

A look at the top of sage-bdist shows:

if [ $UNAME = "Darwin" ]; then
    OPT="Rp"
else
    OPT="ra"
fi

Later on, '$OPT' is passed to the 'cp' commands:

cp -$OPT examples local makefile *.txt *.sage sage ipython data "$TMP"/
cp -L$OPT devel/sage-main "$TMP"/devel/sage-main
cp -$OPT installed $TMP/$PKGDIR/
cp -$OPT standard $TMP/$PKGDIR/
cp install README.txt gen_html $TMP/$PKGDIR/
cp sage/local/bin/sage-README-osx.txt README.txt


Evidently OS X is happy with the options '-Rp'. I can't see why Linux
should not be too, as those are standard options.

Do we know why the '-L' option is used once?

"cp -L$OPT devel/sage-main "$TMP"/devel/sage-main"

I read the POSIX standard, and although this is a required option, I can't
really work out exactly what the option is supposed to do. To quote from the
2004 standard:

----------------------------
"-L
Take actions based on the type and contents of the file referenced by any
symbolic link specified as a source_file operand or any symbolic links
encountered during traversal of a file hierarchy."
-----------------------------

I do not know exactly what action it is supposed to take, though!

I'd like to get rid of the '-L' option too, unless there is a need for it, as it
is not supported on HP-UX 11.11. However, '-L' is a POSIX option, so I'm not
suggesting '-L' is removed if '-L' has any use in Sage. I'm just wondering what
the reason for its inclusion is.

If there is a good reason for it, I'll change the code to:

if [ `uname` != "HP-UX" ] ; then
    cp -LpR devel/sage-main "$TMP"/devel/sage-main
else
    cp -pR devel/sage-main "$TMP"/devel/sage-main
fi
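
An alternative to testing `uname` would be to probe the local cp directly. This is only a minimal sketch: the helper name cp_accepts and the scratch paths are made up for illustration and are not part of sage-bdist.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the local cp accepts the given option
# string when copying a small scratch directory.
cp_accepts() {
    probe_dir=${TMPDIR:-/tmp}/cp-probe.$$
    rm -rf "$probe_dir"
    mkdir -p "$probe_dir/src" || return 1
    : > "$probe_dir/src/file"
    cp "$1" "$probe_dir/src" "$probe_dir/dst" >/dev/null 2>&1
    probe_status=$?
    rm -rf "$probe_dir"
    return $probe_status
}

if cp_accepts -LpR; then
    L_OPT="L"      # cp understands -L, so keep following symlinks
else
    L_OPT=""       # e.g. a cp that rejects -L, as reported for HP-UX 11.11
fi
echo "would run: cp -${L_OPT}pR devel/sage-main \"\$TMP\"/devel/sage-main"
```

The advantage of probing is that the script does not need a list of operating system names; the cost is one extra scratch copy per probe.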


I have just checked four Unix systems. Not one of them accepts the '-a' option.
However, three of the four support '-L', which is required by POSIX.

1) OpenSolaris
bash-3.2$ uname -a
SunOS hawk 5.11 snv_111b i86pc i386 i86pc
bash-3.2$ touch b
bash-3.2$ cp -a b c
cp: illegal option -- a
Usage: cp [-f] [-i] [-p] [-@] [-/] f1 f2
cp [-f] [-i] [-p] [-@] [-/] f1 ... fn d1
cp -r|-R [-H|-L|-P] [-f] [-i] [-p] [-@] [-/] d1 ... dn-1 dn


2) Solaris 10
drkirkby@kestrel:~$ cp -a b c
cp: illegal option -- a
Usage: cp [-f] [-i] [-p] [-@] f1 f2
cp [-f] [-i] [-p] [-@] f1 ... fn d1
cp -r|-R [-H|-L|-P] [-f] [-i] [-p] [-@] d1 ... dn-1 dn

3) AIX 6.1 (runs on IBM Unix boxes)
$ uname -a
AIX client9 1 6 00C6B7C04C00
$ touch b
$ cp -a b c
cp: illegal option -- a
Usage: cp [-fhipHILPU][-d|-e] [-r|-R] [-E{force|ignore|warn}] [--] src target
or: cp [-fhipHILPU] [-d|-e] [-r|-R] [-E{force|ignore|warn}] [--] src1 ...
srcN directory


4) HP-UX hpbox B.11.11 U 9000/785 2016698240 unlimited-user license
$ touch b
$ cp -a b c
cp: illegal option -- a
Usage: cp [-f|-i] [-p] [-S] [-e warn|force|ignore] source_file target_file
cp [-f|-i] [-p] [-S] [-e warn|force|ignore] source_file ... target_directory
cp [-f|-i] [-p] [-S] -R|-r [-e warn|force|ignore] source_directory ...
target_directory

As you can see, '-a' is not well supported, and I think it is best removed.

Dave

MaxTheMouse

Nov 7, 2009, 6:30:43 AM
to sage-devel

>
> Do we know why the '-L' option is used once?
>
> "cp -L$OPT devel/sage-main "$TMP"/devel/sage-main"
>
> I read the POSIX standard, and although this is a required option, I can't
> really work out exactly what the option is supposed to do. To quote from the
> 2004 standard:
>
> ----------------------------
> "-L
>      Take actions based on the type and contents of the file referenced by any
> symbolic link specified as a source_file operand or any symbolic links
> encountered during traversal of a file hierarchy."
> -----------------------------
>
> I do not know exactly that action it is supposed to take though!
>


If it helps, my cp reports:

-L, --dereference always follow symbolic links in SOURCE

This at least makes more sense. I don't know if it is needed in the
specific case. This is with cp (GNU coreutils) 7.4.
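
To see what "follow symbolic links" means in practice, one can compare -L with -P on a scratch tree. A minimal sketch, assuming a cp that implements both options as POSIX describes; all file names here are invented for the demonstration:

```shell
#!/bin/sh
# Scratch tree: a directory holding one regular file and a symlink to it.
work=${TMPDIR:-/tmp}/cp-L-demo.$$
rm -rf "$work"
mkdir -p "$work/src"
echo "data" > "$work/src/real.txt"
ln -s real.txt "$work/src/link.txt"

cp -RL "$work/src" "$work/followed"   # -L: follow symlinks, copy the target
cp -RP "$work/src" "$work/kept"       # -P: copy the symlink itself

if [ -L "$work/followed/link.txt" ]; then followed=symlink; else followed=regular; fi
if [ -L "$work/kept/link.txt" ]; then kept=symlink; else kept=regular; fi
echo "with -RL: link.txt is a $followed file"
echo "with -RP: link.txt is a $kept file"
rm -rf "$work"
```

With -L the copy of link.txt should be an ordinary file holding the same data as real.txt, while with -P it should still be a symlink.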

Adam

Gonzalo Tornaria

Nov 7, 2009, 11:09:14 AM
to sage-...@googlegroups.com
On Sat, Nov 7, 2009 at 6:48 AM, Dr. David Kirkby
<david....@onetel.net> wrote:
> "cp -L$OPT devel/sage-main "$TMP"/devel/sage-main"

Maybe this is done to handle the case where "sage-main" is a symlink
to an actual directory. The option -L means to copy symlinks as real
files. Otherwise, the symlink may be copied (when using -a, at least
--- unspecified by posix when using -Rp).

As a matter of fact, using "cp -Lra" does NOT work as claimed above,
because the -a option implies -P which overrides the -L option. Using
"cp -raL" would follow symlinks, though.

The actual meaning of "-a" in gnu cp is really "-dRp", not just "-Rp".
The "-d" option should be replaced with "-P" posix option (preserve
symlinks), except the "-d" also preserves hard links. I don't think
posix has an option to preserve hard links.

For the line with -L, if the only motivation is to follow the symlink
in case sage-main itself is a symlink, the correct option is -H rather
than -L, but it may be possible to use (instead of -H or -L):

cp -$OPT devel/sage-main/ "$TMP"/devel/sage-main

The extra "/" at the end of the source operand makes cp resolve the
symlink, if any.

Maybe somebody (wstein?) can comment on why the -L was added to the script?

----

WRT -d option (preserve symlinks + hardlinks). It's clearly necessary
to preserve symbolic links. For instance, dynamic libraries use
symlinks.

Is it really necessary for sage-bdist to preserve hardlinks?

[ ... checking a bdist tarball of 4.1.1 ... ]

there is exactly one hardlink in this bdist tarball:
"local/bin/python" is a hardlink to "local/bin/python2.6".

IOW, using -P instead of -d would produce a tarball with two copies of
the python binary. Shouldn't this be handled with a symbolic link
instead?

-------------------------------------

My suggestion would be:

a. fix installation of python so that a symlink is used instead of a hard link
b. use -PRp as options for cp (this is posix!)
c. for the sage-main directory, use the trailing / trick so the -L /
-H option is not necessary (double check this with whoever wrote the
sage-bdist script to use -L option)
d. for systems where -P is not supported, figure out a way to copy
preserving symlinks.

[a. is not critical, but as long as it's not done, sage-bdist should
keep using -a or -d on gnu systems, to avoid bloat in the bdist
tarfile]
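
Suggestions b. and d., together with the note in brackets, could be sketched as a probe that picks the richest option set the local cp handles correctly. This is an illustration only: the function name cp_keeps_symlinks and the candidate order are assumptions, not anything taken from sage-bdist.

```shell
#!/bin/sh
# Hypothetical probe: succeed if "cp <options>" copies a scratch tree
# containing a symlink and the copy still contains a symlink.
cp_keeps_symlinks() {
    probe=${TMPDIR:-/tmp}/cp-opt-probe.$$
    rm -rf "$probe"
    mkdir -p "$probe/src" || return 1
    : > "$probe/src/file"
    ln -s file "$probe/src/link"
    cp "$1" "$probe/src" "$probe/dst" >/dev/null 2>&1 && [ -L "$probe/dst/link" ]
    result=$?
    rm -rf "$probe"
    return $result
}

# Try -a first (keeps hard links on GNU cp), then the POSIX set,
# then plain -Rp for a cp that lacks -P.
OPT=""
for candidate in a PRp Rp; do
    if cp_keeps_symlinks "-$candidate"; then
        OPT=$candidate
        break
    fi
done

if [ -n "$OPT" ]; then
    echo "selected: cp -$OPT"
else
    echo "no tested option set preserves symlinks" >&2
fi
```

The probe only checks symlinks; whether hard links survive would need a separate check.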

Gonzalo
PS: I've attached a shell script which exhibits the different
behaviours of cp with different options. You can try something like
that in HP-UX to see if there is a way to preserve symbolic links in a
copy.

cp-test.sh

Dr. David Kirkby

Nov 7, 2009, 8:00:54 PM
to sage-...@googlegroups.com
Gonzalo Tornaria wrote:
> On Sat, Nov 7, 2009 at 6:48 AM, Dr. David Kirkby
> <david....@onetel.net> wrote:
>> "cp -L$OPT devel/sage-main "$TMP"/devel/sage-main"
>
> Maybe this is done to handle the case where "sage-main" is a symlink
> to an actual directory. The option -L means to copy symlinks as real
> files. Otherwise, the symlink may be copied (when using -a, at least
> --- unspecified by posix when using -Rp).
>
> As a matter of fact, using "cp -Lra" does NOT work as claimed above,
> because the -a option implies -P which overrides the -L option. Using
> "cp -raL" would copy symlinks, though.
>
> The actual meaning of "-a" in gnu cp is really "-dRp", not just "-Rp".
> The "-d" option should be replaced with "-P" posix option (preserve
> symlinks), except the "-d" also preserves hard links. I don't think
> posix has an option to preserve hard links.
>
> For the line with -L, if the only motivation is to follow the symlink
> in case sage-main itself is a symlink, the correct option is -H rather
> than -L, but it may be possible to use (instead of -H or -L):
>
> cp -$OPT devel/sage-main/ "$TMP"/devel/sage-main"
>
> the extra "/" at the end of the source operand makes it to expand the
> symlink, if any.
>
> Maybe somebody (wstein?) can comment on why the -L was added to the script?


That would be very useful if William commented here.

> ----
>
> WRT -d option (preserve symlinks + hardlinks). It's clearly necessary
> to preserve symbolic links. For instance, dynamic libraries use
> symlinks.
>
> Is it really necessary for sage-bdist to preserve hardlinks?
>
> [ ... checking a bdist tarball of 4.1.1 ... ]
>
> there is exactly one hardlink in this bdist tarball:
> "local/bin/python" is a hardlink to "local/bin/python2.6".
>
> IOW, using -P instead of -d would produce a tarball with two copies of
> the python binary. Shouldn't this be handled with a symbolic link
> instead?

I would have thought so too. But I'm puzzled, as python-2.6.2.p4/spkg-install
creates a symbolic link. It has the line:

ln -s python2.6 python


I see that local/bin/python and local/bin/python2.6 are hard links, as they have
the same inode.


drkirkby@kestrel:~/sage-4.2/local/bin$ ls -i python
8980 python
drkirkby@kestrel:~/sage-4.2/local/bin$ ls -i python2.6
8980 python2.6

But how did you find out these two were hard links? I'm not aware of any way to
find if A is a hard link of B, unless one finds the inodes and compares them,
which would be next to impossible where there are a lot of files. I assume there
is some way you do this.

> -------------------------------------
>
> My suggestion would be:
>
> a. fix installation of python so that a symlink is used instead of a hard link

Do you know where this bit of code is? As I say, from what I can see, the link
should be created as a symbolic link, not a hard link.

> b. use -PRp as options for cp (this is posix!)
> c. for the sage-main directory, use the trailing / trick so the -L /
> -H option is not necessary (double check this with whoever wrote the
> sage-bdist script to use -L option)
> d. for systems where -P is not supported, figure out a way to copy
> preserving symlinks.
>
> [a. is not critical, but as long as it's not done, sage-bdist should
> keep using -a or -d on gnu systems, to avoid bloat in the bdist
> tarfile]

Though the bloat will already exist on OS X, as OS X uses -pR, and no -a.

> Gonzalo
> PS: I've attached a shell script which exhibits the different
> behaviours of cp with different options. You can try something like
> that in HP-UX to see if there is a way to preserve symbolic links in a
> copy.

You have a far better understanding of this than I do. If I gave you an account on
the HP-UX box, could you try this out? (If so, let me know a username.) But do
not waste much time over it. Clearly the use of this non-POSIX option '-a' needs
to be removed asap, as it stops a binary being created on Solaris. I'm reluctant
to put tests in the script to handle Solaris differently to Linux, when POSIX
options should be suitable for either. If it's possible to do something which
works on all platforms (HP-UX etc), so much the better. But that is not very
important.

Gonzalo Tornaria

Nov 8, 2009, 9:32:52 AM
to sage-...@googlegroups.com
On Sat, Nov 7, 2009 at 11:00 PM, Dr. David Kirkby
<david....@onetel.net> wrote:
> But how did you find out these two were hard links? I'm not aware of any way to
> find if A is a hard link of B, unless one finds the inodes and compares them,
> which would be next to impossible where there are a lot of files. I assume there
> is some way you do this.

$ ls -l sage-4.1.1/local/bin/python*
-rwxr-xr-x 2 tornaria tornaria 5528068 2009-09-04 21:45 sage-4.1.1/local/bin/python
-rwxr-xr-x 2 tornaria tornaria 5528068 2009-09-04 21:45 sage-4.1.1/local/bin/python2.6
-rwxr-xr-x 1 tornaria tornaria    1419 2009-09-04 23:54 sage-4.1.1/local/bin/python2.6-config
lrwxrwxrwx 1 tornaria tornaria      16 2009-10-30 01:25 sage-4.1.1/local/bin/python-config -> python2.6-config

The "2" in the second column indicates the number of hard link
references; you can guess they are the same file b/c they have the
same metadata --- you can confirm by looking at the inode numbers.
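
For a tree with many files, the link count can also be searched for directly instead of reading ls output by eye. A minimal sketch on a scratch directory; the file names only mimic the ones above and nothing here touches a real Sage tree:

```shell
#!/bin/sh
# Scratch directory with one hard-linked pair and one unrelated file.
work=${TMPDIR:-/tmp}/hardlink-demo.$$
rm -rf "$work"
mkdir -p "$work/bin"
echo "binary" > "$work/bin/python2.6"
ln "$work/bin/python2.6" "$work/bin/python"      # hard link (no -s)
echo "other" > "$work/bin/python2.6-config"

# -links +1 selects files whose link count is greater than one;
# ls -i prints the inode number, so sorting groups names of the same file.
linked=$(find "$work" -type f -links +1 -exec ls -i {} \; | sort -n)
echo "$linked"
linked_count=$(echo "$linked" | wc -l | tr -d ' ')
rm -rf "$work"
```

Names that share an inode number in the output are hard links to the same file.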


What I actually did was check the tarball:

$ tar tvf sage-4.1.1-core2-jsmath_fonts-x86_64-Linux.tar.gz | grep ^h
hrwxr-xr-x tornaria/tornaria 0 2009-09-04 21:45 sage-4.1.1-core2-jsmath_fonts-x86_64-Linux/local/bin/python link to sage-4.1.1-core2-jsmath_fonts-x86_64-Linux/local/bin/python2.6


WRT symlinks:

$ tar tvf sage-4.1.1-core2-jsmath_fonts-x86_64-Linux.tar.gz | grep ^l | wc -l
125
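
The same check can be reproduced on a small scratch archive. A sketch, assuming a tar whose verbose listing marks hard links with a leading "h" and symlinks with a leading "l", as in the output above; the tree and file names are invented:

```shell
#!/bin/sh
# Build a tiny tree with one hard link and one symlink, then archive it.
work=${TMPDIR:-/tmp}/tar-links-demo.$$
rm -rf "$work"
mkdir -p "$work/tree"
echo "binary" > "$work/tree/python2.6"
ln "$work/tree/python2.6" "$work/tree/python"     # hard link
ln -s python2.6 "$work/tree/python-link"          # symlink

(cd "$work" && tar cf tree.tar tree)

# Count archive members by the type character at the start of each line.
hard=$(tar tvf "$work/tree.tar" | grep -c '^h')
soft=$(tar tvf "$work/tree.tar" | grep -c '^l')
echo "hard links: $hard, symlinks: $soft"
rm -rf "$work"
```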


>> a. fix installation of python so that a symlink is used instead of a hard link
>
> Do you know where this bit of code is? As I say, from what I can see, the link
> should be created as a symbolic link, not a hard link.

I suppose in the python spkg...

>> [a. is not critical, but as long as it's not done, sage-bdist should
>> keep using -a or -d on gnu systems, to avoid bloat in the bdist
>> tarfile]
>
> Though the bloat will already exist on OS X, as OS X uses -pR, and no -a.

Their choice ;-)

> not waste much time over it. Clearly the use of this non-POSIX option '-a' needs
> to be removed asap, as it stops a binary being created on Solaris. I'm reluctant
> to put tests in the script to handle Solaris differently to linux, when POSIX
> options should be suitable for either. If its possible to do something which
> works on all platforms (HP-UX etc), so much the better. But that is hardly that
> important.

<rant>
POSIX is not suitable. There's no posix way to copy hardlinks. The
standard is just too restrictive, like a dinosaur: big, fat, and slow.
We can try to adjust to posix, for the sake of portability, but that
doesn't make it suitable. We can do it because we are small, lean, and
fast :-)
</rant>

Gonzalo

Gonzalo Tornaria

Nov 8, 2009, 6:28:57 PM
to sage-...@googlegroups.com
On Sat, Nov 7, 2009 at 11:00 PM, Dr. David Kirkby
<david....@onetel.net> wrote:

> Gonzalo Tornaria wrote:
>> Is it really necessary for sage-bdist to preserve hardlinks?
>>
>> [ ... checking a bdist tarball of 4.1.1 ... ]
>>
>> there is exactly one hardlink in this bdist tarball:
>> "local/bin/python" is a hardlink to "local/bin/python2.6".
>>
>> IOW, using -P instead of -d would produce a tarball with two copies of
>> the python binary. Shouldn't this be handled with a symbolic link
>> instead?
>
> I would have thought so too. But I'm puzzled, as python-2.6.2.p4/spkg-install
> creates a symbolic link. It has the line:
>
> ln -s python2.6 python

This is actually creating a local/lib/python symlink to local/lib/python2.6.

It seems to me the line at fault is in file
python-2.6.2.p4/src/Makefile.pre.in, line 763:

(cd $(DESTDIR)$(BINDIR); $(LN) python$(VERSION)$(EXE) $(PYTHON))

Adding an "-s" option there could make a difference, I think. (I didn't try it.)
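
The difference the "-s" would make can be seen on scratch files. A sketch under the assumption that $(LN) in that Makefile expands to a plain "ln"; the directories and files below are made up and unrelated to the real python spkg:

```shell
#!/bin/sh
work=${TMPDIR:-/tmp}/ln-demo.$$
rm -rf "$work"
mkdir -p "$work/hard" "$work/soft"
echo "binary" > "$work/hard/python2.6"
echo "binary" > "$work/soft/python2.6"

(cd "$work/hard" && ln python2.6 python)       # plain ln: hard link
(cd "$work/soft" && ln -s python2.6 python)    # ln -s: symbolic link

if [ -L "$work/hard/python" ]; then hard_kind="symlink"; else hard_kind="hard link"; fi
if [ -L "$work/soft/python" ]; then soft_kind="symlink"; else soft_kind="hard link"; fi
echo "ln:    python is a $hard_kind"
echo "ln -s: python is a $soft_kind"
rm -rf "$work"
```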

>> My suggestion would be:
>>
>> a. fix installation of python so that a symlink is used instead of a hard link
>
> Do you know where this bit of code is? As I say, from what I can see, the link
> should be created as a symbolic link, not a hard link.
>
>> b. use -PRp as options for cp (this is posix!)
>> c. for the sage-main directory, use the trailing / trick so the -L /
>> -H option is not necessary (double check this with whoever wrote the
>> sage-bdist script to use -L option)

Actually, use a trailing "/." as this is more portable.
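
A small sketch of the "/." form on a scratch tree, where sage-main is only a symlink to the real directory. The names real-main, out and setup.py are invented for the demonstration:

```shell
#!/bin/sh
work=${TMPDIR:-/tmp}/slashdot-demo.$$
rm -rf "$work"
mkdir -p "$work/real-main" "$work/out"
echo "code" > "$work/real-main/setup.py"
ln -s real-main "$work/sage-main"           # sage-main is only a symlink

# The trailing "/." names the directory behind the symlink, so neither -L
# nor -H is needed. The destination is created beforehand so the result
# does not depend on how cp treats a missing target.
mkdir "$work/out/sage-main"
cp -pR "$work/sage-main/." "$work/out/sage-main"

if [ -L "$work/out/sage-main" ]; then top=symlink; else top=directory; fi
if [ -f "$work/out/sage-main/setup.py" ]; then inner=present; else inner=missing; fi
echo "copied sage-main is a $top; setup.py is $inner"
rm -rf "$work"
```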

>> d. for systems where -P is not supported, figure out a way to copy
>> preserving symlinks.

On HP-UX, it turns out that "cp -Rp" is good enough. It preserves
symlinks and, as it turned out, hard links too.

Gonzalo

David Kirkby

Nov 8, 2009, 7:04:22 PM
to sage-...@googlegroups.com
2009/11/8 Gonzalo Tornaria <torn...@math.utexas.edu>:

>
> On Sat, Nov 7, 2009 at 11:00 PM, Dr. David Kirkby
> <david....@onetel.net> wrote:
>> But how did you find out these two were hard links? I'm not aware of any way to
>> find if A is a hard link of B, unless one finds the inodes and compares them,
>> which would be next to impossible where there are a lot of files. I assume there
>> is some way you do this.
>
> $ ls -l sage-4.1.1/local/bin/python*
> -rwxr-xr-x 2 tornaria tornaria 5528068 2009-09-04 21:45
> sage-4.1.1/local/bin/python
> -rwxr-xr-x 2 tornaria tornaria 5528068 2009-09-04 21:45
> sage-4.1.1/local/bin/python2.6
> -rwxr-xr-x 1 tornaria tornaria    1419 2009-09-04 23:54
> sage-4.1.1/local/bin/python2.6-config
> lrwxrwxrwx 1 tornaria tornaria      16 2009-10-30 01:25
> sage-4.1.1/local/bin/python-config -> python2.6-config
>
> The "2" in the second column indicates the number of hard link
> references; you can guess they are the same file b/c they have the
> same metadata --- you can confirm by looking at the inode numbers.
>
>
> What I actually did, is check out the tarball:
>
> $ tar tvf sage-4.1.1-core2-jsmath_fonts-x86_64-Linux.tar.gz  | grep ^h
> hrwxr-xr-x tornaria/tornaria        0 2009-09-04 21:45
> sage-4.1.1-core2-jsmath_fonts-x86_64-Linux/local/bin/python link to
> sage-4.1.1-core2-jsmath_fonts-x86_64-Linux/local/bin/python2.6


Sorry, I overlooked you had answered this, and asked it again.

Dave

Gonzalo Tornaria

Nov 8, 2009, 7:08:47 PM
to sage-...@googlegroups.com
On Sun, Nov 8, 2009 at 9:31 PM, David Kirkby <david....@onetel.net> wrote:
> Hopefully, -pR may work for any POSIX system if the reason for the
> hard link is known. I can't see what creates that link myself. You
> clearly have a much greater understanding of the issues than me.

Just to clarify, the option "-pR" is not enough, even disregarding the
hard link issue. In fact, for solaris, using "-pR" means that also the
symlinks are not handled properly!

IOW, it's really necessary to use "-pRP" (both "p" and "P") --- this
should work for any posix system (except for the hard link issue).
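
A way to check this on a given system is to copy a scratch tree with "-pRP" and inspect the result. A minimal sketch; the libfoo names are hypothetical, and the link count printed at the end will differ between cp implementations:

```shell
#!/bin/sh
work=${TMPDIR:-/tmp}/pRP-demo.$$
rm -rf "$work"
mkdir -p "$work/src"
echo "library" > "$work/src/libfoo.so.1.0"
ln -s libfoo.so.1.0 "$work/src/libfoo.so"        # symlink, as dynamic libraries use
ln "$work/src/libfoo.so.1.0" "$work/src/alias"   # hard link

cp -pRP "$work/src" "$work/dst"

if [ -L "$work/dst/libfoo.so" ]; then sym=preserved; else sym=lost; fi
# Second column of ls -l is the link count: 2 means the hard link survived,
# 1 means cp wrote two independent copies.
nlink=$(ls -l "$work/dst/alias" | awk '{print $2}')
echo "symlink $sym; link count of alias in the copy: $nlink"
rm -rf "$work"
```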

> This is quite an old release of HP-UX. The more modern versions are
> probably better, but they will not run on my PA-RISC machine. I
> believe they will only run on Itanium systems.

Then I would guess a newer release of HP-UX actually supports the -P
option to cp.

Gonzalo
