[PATCH 1/1] Fix ACL metadata format; delimit short form entries with commas

5 views
Skip to first unread message

Rob Browning

unread,
Dec 31, 2022, 2:49:54 PM12/31/22
to bup-...@googlegroups.com
Previously, ACL entries were delimited by newlines, which is incorrect
according to acl(5), and the POSIX1e standard, and which is rejected
with EINVAL by acl_from_text() on some platforms (e.g. cygwin).

To fix the problem, change read_acl to use comma delimiters when
reading new ACLs for a path, and write those in a new "v1" ACL
metadata record. When loading the previous newline delimited v0
records and when it appears safe[1], convert newlines delimiters to
commas. If it doesn't appear safe, defer an error and ignore the
entire ACL record.

[1] For example, if a user name or group had contained a newline,
which is entirely valid on some platforms (e.g. linux).

Thanks to Moritz Lell for reporting the problem.

cf. (via acl(5)) http://wt.tuxomania.net/publications/posix.1e/download.html

Signed-off-by: Rob Browning <r...@defaultvalue.org>
Tested-by: Rob Browning <r...@defaultvalue.org>
---

It also looks like the standard "requires" platforms to do escaping
if required, but since it doesn't specify *how*, it's not portable,
so if we want to fix this "right", we'll need to add our own format.
From the standard:

If an implementation allows the colon character ":" to be present
in an ACL entry qualifer, then that implementation shall provide a
method for distinguishing between a colon character as a field
separator in an ACL entry definition and a colon character as a
compoinent of the ACL entry qualifer value.

lib/bup/_helpers.c | 4 ++--
lib/bup/metadata.py | 31 +++++++++++++++++++++++++++++--
2 files changed, 31 insertions(+), 4 deletions(-)

diff --git a/lib/bup/_helpers.c b/lib/bup/_helpers.c
index 4de72547a..c0c048d93 100644
--- a/lib/bup/_helpers.c
+++ b/lib/bup/_helpers.c
@@ -1945,9 +1945,9 @@ static int bup_read_acl_to_text(const char *name, acl_type_t type,
}

*num = NULL;
- *txt = acl_to_any_text(acl, "", '\n', TEXT_ABBREVIATE);
+ *txt = acl_to_any_text(acl, "", ',', TEXT_ABBREVIATE);
if (*txt)
- *num = acl_to_any_text(acl, "", '\n', TEXT_ABBREVIATE | TEXT_NUMERIC_IDS);
+ *num = acl_to_any_text(acl, "", ',', TEXT_ABBREVIATE | TEXT_NUMERIC_IDS);

if (*txt && *num)
return 0;
diff --git a/lib/bup/metadata.py b/lib/bup/metadata.py
index b2e3ed996..33e93cbe7 100644
--- a/lib/bup/metadata.py
+++ b/lib/bup/metadata.py
@@ -187,13 +187,14 @@ _rec_tag_end = 0
_rec_tag_path = 1
_rec_tag_common_v1 = 2 # times, user, group, type, perms, etc. (legacy/broken)
_rec_tag_symlink_target = 3
-_rec_tag_posix1e_acl = 4 # getfacl(1), setfacl(1), etc.
+_rec_tag_posix1e_acl_v0 = 4 # (broken \n delimited format, see v2 below)
_rec_tag_nfsv4_acl = 5 # intended to supplant posix1e? (unimplemented)
_rec_tag_linux_attr = 6 # lsattr(1) chattr(1)
_rec_tag_linux_xattr = 7 # getfattr(1) setfattr(1)
_rec_tag_hardlink_target = 8 # hard link target path
_rec_tag_common_v2 = 9 # times, user, group, type, perms, etc. (current)
_rec_tag_common_v3 = 10 # adds optional size to v2
+_rec_tag_posix1e_acl = 11 # getfacl(1), setfacl(1), etc.

_warned_about_attr_einval = None

@@ -545,10 +546,34 @@ class Metadata:
else:
return None

- def _load_posix1e_acl_rec(self, port):
+ @staticmethod
+ def _correct_posix1e_v0_delimiters(acl, path):
+ assert acl
+ # The v0 format had newline delimiters which are incorrect for
+ # the ACL short text format, and which are rejected with
+ # EINVAL by some platforms' acl_from_text(). This function is
+ # intended to be correct unless the original user or group
+ # name (or any other field) had commas or colons in it, and it
+ # also assumes that platforms didn't define and append any
+ # additional fields (technically allowed by the withdrawn
+ # standard). For now we just give up entirely, dropping all
+ # the ACLs if any entry doesn't match the restrictions.
+ for i in range(len(acl)):
+ entry = acl[i]
+ fields = entry.split(b'\n')
+ if len(fields) == 3:
+ acl[i] = b','.join(fields)
+ else: # Not ideal - no path information
+ add_error(f'Unexpected ACL entry; entirely ignoring {acl!r}\n')
+ return None
+ return acl
+
+ def _load_posix1e_acl_rec(self, port, *, version=None):
acl_rep = vint.unpack('ssss', vint.read_bvec(port))
if acl_rep[2] == b'':
acl_rep = acl_rep[:2]
+ if version == 0:
+ acl_rep = self._correct_posix1e_v0_delimiters(acl_rep)
self.posix1e_acl = acl_rep

def _apply_posix1e_acl_rec(self, path, restore_numeric_ids=False):
@@ -838,6 +863,8 @@ class Metadata:
result._load_hardlink_target_rec(port)
elif tag == _rec_tag_posix1e_acl:
result._load_posix1e_acl_rec(port)
+ elif tag == _rec_tag_posix1e_acl_v0:
+ result._load_posix1e_acl_rec(port, version=0)
elif tag == _rec_tag_linux_attr:
result._load_linux_attr_rec(port)
elif tag == _rec_tag_linux_xattr:
--
2.30.2

Rob Browning

unread,
Dec 31, 2022, 2:53:27 PM12/31/22
to mle...@gmail.com, bup-...@googlegroups.com
Rob Browning <r...@defaultvalue.org> writes:

> Previously, ACL entries were delimited by newlines, which is incorrect
> according to acl(5), and the POSIX1e standard, and which is rejected
> with EINVAL by acl_from_text() on some platforms (e.g. cygwin).

If you have time, could you test this patch and see if it fixes the ACL
EINVAL problem for you?

(I'd recommend against using it for any important data yet, in case we
decide we want to make further changes, since it alters the metadata
format.)

Thanks
--
Rob Browning
rlb @defaultvalue.org and @debian.org
GPG as of 2011-07-10 E6A9 DA3C C9FD 1FF8 C676 D2C4 C0F0 39E9 ED1B 597A
GPG as of 2002-11-03 14DD 432F AE39 534D B592 F9A0 25C8 D377 8C7E 73A4

Rob Browning

unread,
Dec 31, 2022, 10:13:08 PM12/31/22
to mle...@gmail.com, bup-...@googlegroups.com
Rob Browning <r...@defaultvalue.org> writes:

> If you have time, could you test this patch and see if it fixes the ACL
> EINVAL problem for you?

Actually, nevermind? I think I may have gotten in too big a hurry, and
this isn't right. I'll look in to it more carefully over the next day
or two.

Johannes Berg

unread,
Apr 28, 2023, 2:17:03 PM4/28/23
to Rob Browning, bup-...@googlegroups.com
Now that I'm looking at this ...

On Sat, 2022-12-31 at 13:49 -0600, Rob Browning wrote:
> Previously, ACL entries were delimited by newlines, which is incorrect
> according to acl(5), and the POSIX1e standard, and which is rejected
> with EINVAL by acl_from_text() on some platforms (e.g. cygwin).
>
> To fix the problem, change read_acl to use comma delimiters when
> reading new ACLs for a path, and write those in a new "v1" ACL
> metadata record. When loading the previous newline delimited v0
> records and when it appears safe[1], convert newlines delimiters to
> commas. If it doesn't appear safe, defer an error and ignore the
> entire ACL record.
>
> [1] For example, if a user name or group had contained a newline,
> which is entirely valid on some platforms (e.g. linux).
>
> Thanks to Moritz Lell for reporting the problem.
>
> cf. (via acl(5)) http://wt.tuxomania.net/publications/posix.1e/download.html

FWIW the approach seems fine, it breaks for colons in user/group names,
but I don't see how that'd be allowed anywhere (Linux has /etc/passwd
and /etc/group, Windows documents no : in
https://learn.microsoft.com/en-us/previous-versions/windows/it-
pro/windows-2000-server/bb726984(v=technet.10)
and I can't find anything for OSX but POSIX also says it shouldn't be
used:
https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_437
and same for group)

> -_rec_tag_posix1e_acl = 4 # getfacl(1), setfacl(1), etc.
> +_rec_tag_posix1e_acl_v0 = 4 # (broken \n delimited format, see v2 below)

you called this v0 so you could've used v1 for the newer one :)

> _rec_tag_nfsv4_acl = 5 # intended to supplant posix1e? (unimplemented)
> _rec_tag_linux_attr = 6 # lsattr(1) chattr(1)
> _rec_tag_linux_xattr = 7 # getfattr(1) setfattr(1)
> _rec_tag_hardlink_target = 8 # hard link target path
> _rec_tag_common_v2 = 9 # times, user, group, type, perms, etc. (current)
> _rec_tag_common_v3 = 10 # adds optional size to v2
> +_rec_tag_posix1e_acl = 11 # getfacl(1), setfacl(1), etc.

and maybe add _v2 (or _v1) to this?

> - def _load_posix1e_acl_rec(self, port):
> + @staticmethod
> + def _correct_posix1e_v0_delimiters(acl, path):

You didn't use the path argument ...

> + assert acl
> + # The v0 format had newline delimiters which are incorrect for
> + # the ACL short text format, and which are rejected with
> + # EINVAL by some platforms' acl_from_text(). This function is
> + # intended to be correct unless the original user or group
> + # name (or any other field) had commas or colons in it,

I'm not sure I understand the comment about commas, it's still possible
to parse unambiguously with a comma in the user/group name since the
fields are colon-separated, just not with a simple "split(',')", no?

> and it
> + # also assumes that platforms didn't define and append any
> + # additional fields (technically allowed by the withdrawn
> + # standard). For now we just give up entirely, dropping all
> + # the ACLs if any entry doesn't match the restrictions.
> + for i in range(len(acl)):
> + entry = acl[i]
> + fields = entry.split(b'\n')
> + if len(fields) == 3:
> + acl[i] = b','.join(fields)
> + else: # Not ideal - no path information
> + add_error(f'Unexpected ACL entry; entirely ignoring {acl!r}\n')

... maybe you meant to use it in the error message?

johannes

Rob Browning

unread,
Apr 29, 2023, 2:35:06 PM4/29/23
to Johannes Berg, bup-...@googlegroups.com
Johannes Berg <joha...@sipsolutions.net> writes:

> On Sat, 2022-12-31 at 13:49 -0600, Rob Browning wrote:

>> -_rec_tag_posix1e_acl = 4 # getfacl(1), setfacl(1), etc.
>> +_rec_tag_posix1e_acl_v0 = 4 # (broken \n delimited format, see v2 below)
>
> you called this v0 so you could've used v1 for the newer one :)

>> _rec_tag_common_v2 = 9 # times, user, group, type, perms, etc. (current)
>> _rec_tag_common_v3 = 10 # adds optional size to v2
>> +_rec_tag_posix1e_acl = 11 # getfacl(1), setfacl(1), etc.
>
> and maybe add _v2 (or _v1) to this?

I'll have to remember why I did both of those -- not sure, but suspect
it wasn't for any substantial reasons.

>> + assert acl
>> + # The v0 format had newline delimiters which are incorrect for
>> + # the ACL short text format, and which are rejected with
>> + # EINVAL by some platforms' acl_from_text(). This function is
>> + # intended to be correct unless the original user or group
>> + # name (or any other field) had commas or colons in it,
>
> I'm not sure I understand the comment about commas, it's still possible
> to parse unambiguously with a comma in the user/group name since the
> fields are colon-separated, just not with a simple "split(',')", no?

I don't think I considered the semantics extensively, but I was likely
concerned about cases where a user or group name might contain arbitrary
colons or commas, and then you can't tell which ones were the actual
delimiters.

# useradd ',group:srsly:r-x'

Though perhaps vanishingly unlikely, both because "dont do that", and
because with our current format it'd have to be more like this?

# useradd "$(echo -e '\ngroup::r-x')"

...and I suspect very few systems allow (or if they do, survive for long
with) newlines in user/group names. I'd guess that the general
user-land handling would be some exciting flavor of "this is fine".

>> + acl[i] = b','.join(fields)
>> + else: # Not ideal - no path information
>> + add_error(f'Unexpected ACL entry; entirely ignoring {acl!r}\n')
>
> ... maybe you meant to use it in the error message?

The path argument, right -- perhaps. I'll check that too.

Thanks

Rob Browning

unread,
May 28, 2023, 6:45:52 PM5/28/23
to bup-...@googlegroups.com
Rob Browning <r...@defaultvalue.org> writes:

> Previously, ACL entries were delimited by newlines, which is incorrect
> according to acl(5), and the POSIX1e standard, and which is rejected
> with EINVAL by acl_from_text() on some platforms (e.g. cygwin).
>
> To fix the problem, change read_acl to use comma delimiters when
> reading new ACLs for a path, and write those in a new "v1" ACL
> metadata record. When loading the previous newline delimited v0
> records and when it appears safe[1], convert newlines delimiters to
> commas. If it doesn't appear safe, defer an error and ignore the
> entire ACL record.
>
> [1] For example, if a user name or group had contained a newline,
> which is entirely valid on some platforms (e.g. linux).
>
> Thanks to Moritz Lell for reporting the problem.
>
> cf. (via acl(5)) http://wt.tuxomania.net/publications/posix.1e/download.html

Pushed the 0.33.1 version of this to main.

Rob Browning

unread,
May 28, 2023, 6:47:09 PM5/28/23
to Johannes Berg, bup-...@googlegroups.com
Johannes Berg <joha...@sipsolutions.net> writes:

> Now that I'm looking at this ...

I think the 0.33.1 version I pushed should have addressed all the
comments. Thanks again for the help.
Reply all
Reply to author
Forward
0 new messages