Test failure(s) on Cygwin due to ACLs set on folders

12 views
Skip to first unread message

mle...@gmail.com

unread,
Dec 22, 2022, 7:44:42 AM12/22/22
to bup-list
Hi,

executing "make test" on Cygwin shows that most tests fail. To cut down on that a bit, I changed ./pytest to abort on the first failure (the -x argument to pytest) and I found that the failure is caused by test/ext/test_get.py, line 339 (in _test_replace):

$ ./pytest
============== test session starts =============
platform cygwin -- Python 3.9.10, pytest-7.2.0, pluggy-1.0.0 -- /tmp/bup/dev/bup-python.exe
cachedir: .pytest_cache
rootdir: /tmp/bup, configfile: pytest.ini, testpaths: test/int, test/ext
collected 127 items / 1 deselected / 126 selected

test/ext/test_get.py::test_get[get-replace] FAILED                         [  0%]

====== FAILURES ============
______ test_get[get-replace] _________

tmpdir = b'/tmp/bup/test/tmp/ext-test_get-py-test_getizfx_zpe'
disposition = 'get', category = 'replace'

    @pytest.mark.parametrize("disposition,category", product(dispositions_to_test, categories))
    def test_get(tmpdir, disposition, category):
        chdir(tmpdir)
        try:
            src_info = create_get_src()
>           globals().get('_test_' + category)(disposition, src_info)

test/ext/test_get.py:966:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
test/ext/test_get.py:339: in _test_replace
    validate_tagged_save(b'obj', getcwd() + b'/src',
test/ext/test_get.py:198: in validate_tagged_save
    _validate_save(orig_value, b'tmp-branch-for-tag/latest' + restore_subpath,
test/ext/test_get.py:149: in _validate_save
    exr = verify_rcz((bup_cmd, b'-d', b'get-dest',
test/ext/test_get.py:58: in verify_rcz
    wvcheck(rc == 0, 'process exit %d == 0' % rc)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

cond = False, msg = 'process exit 1 == 0'

    def WVCHECK(cond, msg):
>       assert cond, msg
E       AssertionError: process exit 1 == 0

test/lib/wvpytest.py:38: AssertionError
------------------------------ Captured stdout call ------------------------------
(.... large amout of output of previous tests....)
------------------------------ Captured stderr call ------------------------------
(.... large amout of output of previous tests....)

git --git-dir get-dest branch tmp-branch-for-tag refs/tags/obj
/tmp/bup/lib/cmd/bup -d get-dest restore -C restore tmp-branch-for-tag/latest/tmp/bup/test/tmp/ext-test_get-py-test_getizfx_zpe/src/.
POSIX1e ACL: can't create [b'u::rwx\ng::r-x\no::r-x', b'u::rwx\ng::r-x\no::r-x', b'u::rwx\ng::r-x\no::r-x', b'u::rwx\ng::r-x\no::r-x'] for 'y'
POSIX1e ACL: can't create [b'u::rwx\ng::r-x\no::r-x', b'u::rwx\ng::r-x\no::r-x', b'u::rwx\ng::r-x\no::r-x', b'u::rwx\ng::r-x\no::r-x'] for 'x'
POSIX1e ACL: can't create [b'u::rwx\ng::r-x\no::r-x', b'u::rwx\ng::r-x\no::r-x', b'u::rwx\ng::r-x\no::r-x', b'u::rwx\ng::r-x\no::r-x'] for '.'
warning: 3 errors encountered

---------------------------- Captured stderr teardown ----------------------------

Preserving: b'test/tmp/ext-test_get-py-test_getizfx_zpe'

(... some final output...)

The ACLs seem strange to me because they are repeated. So what I would understand is trying to set "u::rwx\ng::r-x\no::r-x", even though I am unsure about the newline characters in there. But what is shown in the error message is a python ist with three times this value.

The test repository (bup folder "get-dest") looks like this:
src/
src/1
src/tiny-file
src/x/
src/x/2
src/y/

The error happens when a directory is restored. For files, it works:

$ /tmp/bup/lib/cmd/bup -d get-dest restore -C restore tmp-branch-for-tag/latest/tmp/bup/test/tmp/ext-test_get-py-test_getizfx_zpe/src/x/y
POSIX1e ACL: can't create [b'u::rwx\ng::r-x\no::r-x', b'u::rwx\ng::r-x\no::r-x', b'u::rwx\ng::r-x\no::r-x', b'u::rwx\ng::r-x\no::r-x'] for 'y'
Restoring: 1, done.
warning: 1 errors encountered

$ /tmp/bup/lib/cmd/bup -d get-dest restore -C restore tmp-branch-for-tag/latest/tmp/bup/test/tmp/ext-test_get-py-test_getizfx_zpe/src/x/2
Restoring: 1, done.



I will try to dig into it, but still report it here in case you have hints where to look in the code and especially what the format of the ACLs in the error message mean (the fact that it is a list with (byte)strings that contain newlines)

Best,
Moritz


mle...@gmail.com

unread,
Dec 24, 2022, 7:38:57 AM12/24/22
to bup-list
Solved.

Cygwin differs from Linux in the format that it accepts an ACL string (functions acl_from_text, acl_set_file in sys/acl.h):

There are two formats how to represent POSIX ACLs, abbreviated and not abbreviated. Bup as of now requests and saves
the abbreviated form (bup/_helpers.c > bup_read_acl_to_text(); sys/acl.h > acl_to_any_text(), TEXT_ABBREVIATE). Examples for the abbreviated form:

u::rwx,g::r-x,o::---

u::rwx
g::r-x
o::---

However, only the first form is documented in "man acl" but the second form is accepted by acl_set_file as well. At least on Linux. On Cygwin,
it causes errno EINVAL Illegal Argument.

Solution: Do not store abbreviated ACLs to avoid this ambiguity and keep newline-separated records. Patch follows.

Merry Christmas,
Moritz

Rob Browning

unread,
Dec 30, 2022, 3:38:04 PM12/30/22
to mle...@gmail.com, bup-list
"mle...@gmail.com" <mle...@gmail.com> writes:

> However, only the first form is documented in "man acl"

Hmm, looks like both are documented here in acl(5), i.e. in the debian
manpages, e.g.: https://man7.org/linux/man-pages/man5/acl.5.html#ACL_TEXT_FORMS

Wonder if this is just a limitation of, or difference in cygwin's
implementation. I also saw your proposed patches, but before
considering those carefully, I want to try to make sure I understand the
situation well.

Thanks
--
Rob Browning
rlb @defaultvalue.org and @debian.org
GPG as of 2011-07-10 E6A9 DA3C C9FD 1FF8 C676 D2C4 C0F0 39E9 ED1B 597A
GPG as of 2002-11-03 14DD 432F AE39 534D B592 F9A0 25C8 D377 8C7E 73A4

Rob Browning

unread,
Dec 30, 2022, 3:59:47 PM12/30/22
to mle...@gmail.com, bup-list
Rob Browning <r...@defaultvalue.org> writes:

> "mle...@gmail.com" <mle...@gmail.com> writes:
>
>> However, only the first form is documented in "man acl"
>
> Hmm, looks like both are documented here in acl(5), i.e. in the debian
> manpages, e.g.: https://man7.org/linux/man-pages/man5/acl.5.html#ACL_TEXT_FORMS
>
> Wonder if this is just a limitation of, or difference in cygwin's
> implementation. I also saw your proposed patches, but before
> considering those carefully, I want to try to make sure I understand the
> situation well.

OK, so it's starting to appear like cygwin acl_from_text() does intend
to support POSIX (never actually approved by POSIX, but what everyone
adopted anyway?) short form, so I'm wondering if something else might be
going on.

Specifically, I'm wondering if we might be including a trailing newline
and that might be confusing it. If it's feasible, could you add some
print statements to see if we do, and/or try an acl.trimr() or similar
to see if that might fix the problem?

Rob Browning

unread,
Dec 31, 2022, 2:33:57 AM12/31/22
to mle...@gmail.com, bup-list
Rob Browning <r...@defaultvalue.org> writes:

> OK, so it's starting to appear like cygwin acl_from_text() does intend
> to support POSIX (never actually approved by POSIX, but what everyone
> adopted anyway?) short form, so I'm wondering if something else might be
> going on.

Hmm, that wasn't quite right. I forgot we use acl_to_any_text() not
acl_to_text(). I'll need to reappraise.

Rob Browning

unread,
Dec 31, 2022, 2:42:34 PM12/31/22
to mle...@gmail.com, bup-list
Rob Browning <r...@defaultvalue.org> writes:

> Hmm, that wasn't quite right. I forgot we use acl_to_any_text() not
> acl_to_text(). I'll need to reappraise.

Ahh, so I think it was a delimiter issue, but ours, not cygwin's.
i.e. acl(5) clearly says that the short form is comma delimited, but
we've been specifying '\n' as the delimiter for acl_to_any_text(). I'll
post a potential fix for review/testing in a bit.

(And now that I think about it, unless there are some escaping syntaxes
not documented in acl(5), neither of the POSIX formats can handle
user/group names that contain a colon?)

M L

unread,
Jan 1, 2023, 12:16:44 PM1/1/23
to Rob Browning, bup-list

31.12.2022 20:42:34 Rob Browning <r...@defaultvalue.org>:
> (And now that I think about it, unless there are some escaping syntaxes
> not documented in acl(5), neither of the POSIX formats can handle
> user/group names that contain a colon?)
>

I thought about this, too. On Windows, colons might
be disallowed, however, the only proof I found so far is this screenshot of a Windows error message listing the characters / \ " [ ] : < > | + = ; , ? * @. Also the handling of local vs. Active Directory usernames might be a bit different, where AD usernames might treat e.g. ć and c as the same letter.
https://stackoverflow.com/questions/8732008/local-user-name-disallowed-chars

For Linux, the regex for a valid user name can be changed, but there is a notion of POSIX portable user names that do not have a colon.

https://serverfault.com/questions/73084/what-characters-should-i-use-or-not-use-in-usernames-on-linux

Rob Browning

unread,
Jan 1, 2023, 3:14:18 PM1/1/23
to M L, bup-list
M L <mle...@gmail.com> writes:

> For Linux, the regex for a valid user name can be changed, but there is a notion of POSIX portable user names that do not have a colon.

At the lowest level, in theory, I suspect bup should record *anything*
(presumably except null) on save, but that'll have to wait for later.

And practically speaking, yeah, I imagine platforms vary in what they
allow, but might not allow the really troublesome characters (easily).
For example, looks like, Debian forbids colon, newline, ...., via
useradd, etc.

In any case, I think I have a more correct patch about ready. I'll send
it later. It's still suceptible to some of the theoretical corner
cases, but it might be plausible for now. Though we'll need some
additional tests before it'll be ready for merge.

In the longer run, given the time, I think we'd probably want to drop
the "built in" text formats entirely, i.e. read the data ourselves via
the acl(5) functions and encode it completely unambiguously in a new
metadata record format.

But if "all the relevant platforms" ban the problematic characters, then
that might not be worth the effort, compared to other things we need to
do.

Thanks again
Reply all
Reply to author
Forward
0 new messages