Alternate Data Stream Support in FreeBSD (was Re: O_XATTR support in FreeBSD?)

Lionel Cons

Nov 26, 2013, 7:27:45 AM
On 26 November 2013 11:19, Jordan Hubbard <jordan....@gmail.com> wrote:
>
> On Nov 26, 2013, at 1:51 AM, Cedric Blancher <cedric....@gmail.com> wrote:
>
>> 1. You do not need more syscalls. Solaris uses the plain openat()
>> syscall for this, with the O_XATTR flag passed to the normal
>> open()/openat() flags to open a named attribute. Likewise read(),
>> write(), mmap() etc work, too.
>
> I don’t know if I’d go so far as to say “you do not need more syscalls”;
> there are additional functions for manipulating EAs that go well beyond
> the Solaris extensions to the directory and file I/O functions. Assuming you
> want to be able to get/set as well as enumerate or remove EAs, then
> you might just as well add getxattr(2), listxattr(2), removexattr(2), setxattr(2)
> too and follow the herd (Linux and OS X, so far).

You mean 'follow the lemmings down into the abyss'? :)

> We’re also glossing over ACLs and where they get to live. I don’t know if Robert
> and friends have stuck them in a separate namespace on FreeBSD or if they’re
> in system-protected EAs, as they are in OS X, but ACL preservation across
> serialization / deserialization is just as important as it is for EAs.

Could we first agree what we are talking about, please? I'm a bit new
to this thread, but AFAIK we are talking about the Windows Alternate
Data Streams as they appear in networked filesystems like NFSv4 and
CIFS and physical filesystems like NTFS, ZFS and Solaris UFS, right?
ACLs have no direct relation to those streams.

The attribute support in Linux has proven (at least from CERN's
viewpoint) pretty useless because of its size constraints and
crappy API (e.g. no mmap(), no sparse support, no access with normal
tools, ...) so IMHO the herd to follow is the herd which
implements at least the following requirements:
1) A proper implementation, which includes access using the normal
system utilities (in Solaris there is the runat(1) utility to access
the hidden directory containing the attribute files, and bash4.3 and
ksh have cd -@ to change into the hidden directories containing the
attribute files. From that point on (inside the hidden directory)
ls(1) and even chown(1) and chmod(1) work as usual. You can even stick
ZFS and NFSv4 ACLs on the files in the hidden directory containing the
attribute files)
2) read(), write() and mmap() access, i.e. the normal POSIX API (of
course with the minor extension to flag an access to an alternate data
stream or the directory containing the alternate data streams)
3) Support in networked filesystems (i.e. NFSv4, CIFS)
4) No size restrictions (just to explain, at CERN the alternate data
streams are often precompiled caches or index files of the main file's
contents, and can easily be in the TB range)
5) Support for sparse data (i.e. SEEK_HOLE and SEEK_DATA)
6) More than one implementation available

AFAIK Solaris, Nexenta, Illumos (NFSv4, ZFS, UFS) and Windows
Alternate Data Streams (CIFS, NTFS) fit these requirements.
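
The sparse-data requirement (SEEK_HOLE and SEEK_DATA) can be probed from user space. What follows is only a minimal Python sketch: an ordinary file stands in for an alternate data stream, and the helper names make_sparse_file() and data_segments() are made up for illustration. On a filesystem without hole support the whole file is simply reported as one data segment.

```python
import errno
import os
import tempfile


def make_sparse_file(path, hole_size, payload):
    """Create a file with a hole of hole_size bytes followed by payload."""
    with open(path, "wb") as f:
        f.seek(hole_size)      # skip ahead without writing anything
        f.write(payload)       # only this region needs allocated blocks


def data_segments(path):
    """Return the (start, end) byte ranges the filesystem reports as data."""
    segments = []
    with open(path, "rb") as f:
        fd = f.fileno()
        size = os.fstat(fd).st_size
        offset = 0
        while offset < size:
            try:
                start = os.lseek(fd, offset, os.SEEK_DATA)
            except OSError as e:
                if e.errno == errno.ENXIO:   # nothing but a hole up to EOF
                    break
                raise
            end = os.lseek(fd, start, os.SEEK_HOLE)  # EOF counts as a hole
            segments.append((start, end))
            offset = end
    return segments


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "index.cache")
        make_sparse_file(path, 1 << 20, b"payload")
        print(data_segments(path))
```

A reader of a TB-sized cache or index stream would use the same loop to skip the unallocated ranges instead of reading zeros.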

Lionel
_______________________________________________
freebsd...@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hacke...@freebsd.org"

Jordan Hubbard

Nov 26, 2013, 12:08:29 PM

On Nov 26, 2013, at 4:27 AM, Lionel Cons <lionelc...@gmail.com> wrote:

>> I don’t know if I’d go so far as to say “you do not need more syscalls”;
>> there are additional functions for manipulating EAs that go well beyond
>> the Solaris extensions to the directory and file I/O functions. Assuming you
>> want to be able to get/set as well as enumerate or remove EAs, then
>> you might just as well add getxattr(2), listxattr(2), removexattr(2), setxattr(2)
>> too and follow the herd (Linux and OS X, so far).
>
> You mean 'follow the lemmings down into the abyss'? :)

Well, I don’t know that it’s an “abyss” - EAs may or may not be useful, depending on how you employ them!

In the first version of OS X to support them, in fact, I believe they were limited in size to 4883 bytes (don’t ask me why that number) and they were still used to apply various “tags” to files (Finder metadata, some index values into the search database, etc). General pressure to use them for more things eventually got this size bumped up to 128K, and now it’s actually 2GB(!) (http://support.apple.com/kb/HT5983) so I think it’s fair to say that EAs in OS X are now essentially equivalent to forked files, more or less.

> Could we first agree what we are talking about, please? I'm a bit new
> to this thread, but AFAIK we are talking about the Windows Alternate
> Data Streams as they appear in networked filesystem like NFSv4 and
> CIFS and physical filesystems like NTFS, ZFS and Solaris UFS, right?
> ACLs have no direct relation to those streams.

Actually, I didn’t think we were talking about alternate data streams myself. Conceptually they’re equivalent, I guess, but I’ve always thought they were somewhat overkill and I’ve yet to encounter an application that seriously uses them. I’m sure they’re out there somewhere, but even back in the days when EAs were limited to just over 4K, we found them very useful for what was essentially their original purpose - an extension to the file attribute data that Unix already provides. The only reason that ACLs crept into the discussion is because of where they’re stored. I don’t know about Linux, but Apple has chosen to store ACLs in EAs, which is pretty useful because this gives you an easy way of serializing the ACLs too - you just serialize them from a suitably privileged process.

The main point I was trying to make is that if you’re going to have EAs at all, you need to commit fully. The various Unix tools need to support them (we’ve already talked about the archivers and compressors) and tools like ls(1) need to be able to show them on files. You need a way of dealing with them on foreign filesystems that don’t support EAs. Most folks just cram EAs into the filesystem, add a few decorations to existing system calls and then shout “done!” and do a victory dance. Then when nobody actually uses EAs, they go “See? I always told you EAs were crap! Terrible idea! Never should have added them!”

This is tantamount to building a car with an engine but no wheels, dashboard or steering wheel and then declaring that the world just isn’t ready for cars since they’re not buying yours.

I know you cite Solaris’ integration as an example of such a “full solution”, and maybe the “~@” syntax was awesome in practice, I dunno, all I can say is that the only namespace trick we needed to pull at Apple was the AppleDouble (._) sidecar files. There was an earlier filename/..rfork/ syntax for addressing resource forks, which predated EAs in OS X, and some folks used it quite a bit, but it was eventually deprecated in favor of the single sidecar file. I never found a need to “cd” into the namespace of a file’s EAs when I had the xattr(1) command so handy for deleting / changing them, and ls’s -@ argument would also display them for me. I suppose it is all a matter of taste. If someone wanted to do the namespace thing in FreeBSD, I wouldn’t argue against it.
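
The AppleDouble naming convention mentioned above amounts to a "._" prefix on the file name, in the same directory as the file. A small Python sketch of that mapping follows; the function names are hypothetical, and it only handles the name, not the sidecar's contents.

```python
import os


def appledouble_sidecar(path):
    """Return the sidecar path for `path`: same directory, "._" + file name."""
    directory, name = os.path.split(path)
    return os.path.join(directory, "._" + name)


def is_appledouble_sidecar(path):
    """True if the file name follows the "._" sidecar convention."""
    return os.path.basename(path).startswith("._")


if __name__ == "__main__":
    print(appledouble_sidecar("docs/report.txt"))
    print(is_appledouble_sidecar("docs/._report.txt"))
```

A tool that moves or archives files on a filesystem without EAs would have to carry the sidecar along with the file it belongs to.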

I also wouldn’t argue against fully parallel “forks” being a superset of EAs, since I guess at CERN where folks are routinely looking at Petabytes of data from a detector like ATLAS or CMS, anything that puts size constraints on their data is just the devil, but again, that wasn’t actually the point I was trying to make. I was simply trying to say that NFSv4 or ZFS “native EA support” is the easy part. The harder part is in making sure that the EAs don’t get stripped out in transit or during routine file manipulations, and this requires that everything from cp(1) to rsync(8) becomes EA-aware. Most of the implementations I’ve seen don’t bother to do that last mile of integration, and as a result EAs are just basically untrustworthy beasts that users shy away from.
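
To make the "last mile" concrete, here is a hedged Python sketch of what an EA-aware copy has to do beyond copying the data. It assumes a Linux system, where Python exposes os.listxattr(), os.getxattr() and os.setxattr(); the function name copy_with_xattrs() is made up for illustration, and on platforms without those calls it degrades to a data-only copy.

```python
import errno
import os
import shutil


def copy_with_xattrs(src, dst):
    """Copy file data, then every extended attribute that can be carried over.

    Returns the names of the attributes that were copied.
    """
    shutil.copyfile(src, dst)            # data only; attributes are lost here
    copied = []
    if not hasattr(os, "listxattr"):     # the os.*xattr calls are Linux-only
        return copied
    try:
        names = os.listxattr(src)
    except OSError as e:
        if e.errno in (errno.ENOTSUP, errno.ENOSYS):
            return copied                # filesystem has no EA support
        raise
    for name in names:
        try:
            os.setxattr(dst, name, os.getxattr(src, name))
            copied.append(name)
        except OSError:
            # Assumption: attributes we may not write (e.g. privileged
            # namespaces) are skipped rather than failing the whole copy.
            continue
    return copied
```

A plain data copy such as shutil.copyfile() on its own is exactly the kind of routine file manipulation that silently strips the attributes.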

- Jordan

Matthew Fleming

Nov 26, 2013, 12:17:43 PM
On Tue, Nov 26, 2013 at 9:08 AM, Jordan Hubbard <jordan....@gmail.com> wrote:

>
> On Nov 26, 2013, at 4:27 AM, Lionel Cons <lionelc...@gmail.com> wrote:
> > Could we first agree what we are talking about, please? I'm a bit new
> > to this thread, but AFAIK we are talking about the Windows Alternate
> > Data Streams as they appear in networked filesystem like NFSv4 and
> > CIFS and physical filesystems like NTFS, ZFS and Solaris UFS, right?
> > ACLs have no direct relation to those streams.
>
> Actually, I didn’t think we were talking about alternate data streams
> myself. Conceptually they’re equivalent, I guess, but I’ve always thought
> they were somewhat overkill and I’ve yet to encounter an application that
> seriously uses them.


Anyone implementing a SMB/CIFS filesystem on top of FreeBSD (e.g. a vendor)
will need to come up with support for ADS. Whether or not any "FreeBSD"
application has use for them is maybe beside the point -- vendors have a
use for support. Though probably every vendor who wants them has already
coded their own variant.

Cheers,
matthew

Cedric Blancher

Nov 26, 2013, 12:34:23 PM
On 26 November 2013 18:17, Matthew Fleming <m...@freebsd.org> wrote:
> On Tue, Nov 26, 2013 at 9:08 AM, Jordan Hubbard <jordan....@gmail.com>
> wrote:
>>
>>
>> On Nov 26, 2013, at 4:27 AM, Lionel Cons <lionelc...@gmail.com> wrote:
>> > Could we first agree what we are talking about, please? I'm a bit new
>> > to this thread, but AFAIK we are talking about the Windows Alternate
>> > Data Streams as they appear in networked filesystem like NFSv4 and
>> > CIFS and physical filesystems like NTFS, ZFS and Solaris UFS, right?
>> > ACLs have no direct relation to those streams.
>>
>> Actually, I didn’t think we were talking about alternate data streams
>> myself. Conceptually they’re equivalent, I guess, but I’ve always thought
>> they were somewhat overkill and I’ve yet to encounter an application that
>> seriously uses them.
>
>
> Anyone implementing a SMB/CIFS filesystem on top of FreeBSD (e.g. a vendor)
> will need to come up with support for ADS. Whether or not any "FreeBSD"
> application has use for them is maybe beside the point -- vendors have a use
> for support. Though probably every vendor who wants them has already coded
> their own variant.

+1

BTW: FreeBSD uses ZFS as main file system, and ZFS has native O_XATTR
support (aka Alternate Data Streams). How hard is it to add O_XATTR,
VFS_XATTR and FXATTR in the FreeBSD vfs layer?

Ced
--
Cedric Blancher <cedric....@gmail.com>
Institute Pasteur

Cedric Blancher

Nov 26, 2013, 12:48:26 PM
On 26 November 2013 13:27, Lionel Cons <lionelc...@gmail.com> wrote:
> On 26 November 2013 11:19, Jordan Hubbard <jordan....@gmail.com> wrote:
>>
>> On Nov 26, 2013, at 1:51 AM, Cedric Blancher <cedric....@gmail.com> wrote:
>>
>>> 1. You do not need more syscalls. Solaris uses the plain openat()
>>> syscall for this, with the O_XATTR flag passed to the normal
>>> open()/openat() flags to open a named attribute. Likewise read(),
>>> write(), mmap() etc work, too.
>>
>> I don’t know if I’d go so far as to say “you do not need more syscalls”;
>> there are additional functions for manipulating EAs that go well beyond
>> the Solaris extensions to the directory and file I/O functions. Assuming you
>> want to be able to get/set as well as enumerate or remove EAs, then
>> you might just as well add getxattr(2), listxattr(2), removexattr(2), setxattr(2)
>> too and follow the herd (Linux and OS X, so far).
>
> You mean 'follow the lemmings down into the abyss'? :)
>
>> We’re also glossing over ACLs and where they get to live. I don’t know if Robert
>> and friends have stuck them in a separate namespace on FreeBSD or if they’re
>> in system-protected EAs, as they are in OS X, but ACL preservation across
>> serialization / deserialization is just as important as it is for EAs.
>
> Could we first agree what we are talking about, please? I'm a bit new
> to this thread, but AFAIK we are talking about the Windows Alternate
> Data Streams as they appear in networked filesystem like NFSv4 and
> CIFS and physical filesystems like NTFS, ZFS and Solaris UFS, right?
> ACLs have no direct relation to those streams.
>
> The attributes support from Linux has been proven (at least from CERNs
> viewpoint) as pretty useless because of their size constraints and
> crappy API (i.e. no mmap(), no sparse support, no normal tools can
> access them, ...) so IMHO the herd to follow is the herd which
> implements at least the following requirements:
> 1) A proper implementation, which includes access using the normal
> system utilities (in Solaris there is the runat(1) utility to access
> the hidden directory containing the attribute files, and bash4.3 and
> ksh have cd -@ to cwd into the hidden directories containing the
> attribute files. From that point on (inside the hidden directory)
> ls(1) and even chown(1) and chmod(1) work as usual. You can even stick
> ZFS and NFSv4 ACLs on the files in the hidden directory containing the
> attribute files)
> 2) read(), write() and mmap() access, i.e. the normal POSIX API (of
> course with the minor extension to flag an access to an alternate data
> stream or the directory containing the alternate data streams)
> 2) Support in networked filesystems (i.e. NFSv4, CIFS)
> 3) No size restrictions (just to explain, at CERN the alternate data
> streams are often precompiled caches or index files of the main file's
> contents, and can easily in the TB range)
> 4) Support for sparse data (i.e. SEEK_HOLE and SEEK_DATA)
> 5) More than one implementation available
>
> AFAIK Solaris, Nexenta, Illumos (NFSv4, ZFS, UFS) and Windows
> Alternate Data Streams (CIFS, NTFS) fit these requirements.

+1

Another argument for Alternate Data Streams: they are
a superset of the Linux extended attributes (and can thus be used to
emulate them in libc), with all their strengths but none of their
weaknesses (like the hideously duplicated VFS APIs and the lack of
support in POSIX utilities).
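
As a rough illustration of the emulation idea, here is a Python sketch that layers get/set/list/remove calls on top of plain files in a per-file directory. Everything in it is an assumption made for illustration: the directory "<file>.attrs" merely stands in for the hidden attribute directory that O_XATTR would open, and the class and method names are invented.

```python
import os


class AttrDir:
    """Emulate attribute calls on top of ordinary files.

    Assumption: "<path>.attrs" stands in for the hidden attribute
    directory; each attribute is one regular file inside it.
    """

    def __init__(self, path):
        self.root = path + ".attrs"

    def _stream(self, name):
        # Attribute names must be single path components.
        if "/" in name or name in ("", ".", ".."):
            raise ValueError("invalid attribute name: %r" % (name,))
        return os.path.join(self.root, name)

    def set(self, name, value):
        os.makedirs(self.root, exist_ok=True)
        # open(..., "wb") truncates and then writes: two separate steps.
        with open(self._stream(name), "wb") as f:
            f.write(value)

    def get(self, name):
        with open(self._stream(name), "rb") as f:
            return f.read()

    def names(self):
        try:
            return sorted(os.listdir(self.root))
        except FileNotFoundError:
            return []

    def remove(self, name):
        os.remove(self._stream(name))
```

Because each attribute is just a file, the usual read(), write() and mmap() paths apply to it without further API; the four methods are thin wrappers.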

IMHO the Solaris/Illumos/Nexenta solution of O_XATTR integrates better
into the Unix filesystem philosophy (everything is a file)
and has already reached such market penetration that common
off-the-shelf shells like bash and ksh have integrated support for it.

Jordan Hubbard

Nov 26, 2013, 3:58:05 PM

On Nov 26, 2013, at 9:17 AM, Matthew Fleming <m...@FreeBSD.org> wrote:

> Anyone implementing a SMB/CIFS filesystem on top of FreeBSD (e.g. a vendor) will need to come up with support for ADS. Whether or not any "FreeBSD" application has use for them is maybe beside the point -- vendors have a use for support. Though probably every vendor who wants them has already coded their own variant.

Oh, you mean like some provider of a NAS appliance based on FreeBSD? :-)

I do take your point, though fortunately (?) there really is only one "vendor" we all need to concern ourselves with here, and that's Samba. I doubt that anyone in the BSD world save Apple has the resources to go their own direction with respect to CIFS.

A fair amount of googling around doesn't reveal many folks hacking this sort of integration into samba. I'll have to ask Jeremy. Now you have me curious. :-)

- Jordan

Rick Macklem

Nov 26, 2013, 5:42:15 PM
Not exactly, as I understand it. Linux (and FreeBSD) extended attributes
support atomic replacement of the entire attribute value. At least for
NFSv4 named attributes, this cannot be emulated, since the named attribute
is replaced by a "Setattr size=0, write offset=0" and there is no way
to do those 2 operations as one atomic operation.

I suppose a syscall could be implemented as an atomic update (in FreeBSD
an exclusive vnode lock could be used) for local file systems, but since
the VOPs for the atomic, size-limited extended attributes already exist, I
don't see emulation as useful.

The NFSv4 working group seems to have decided to add support for atomically
set extended attributes (Linux, FreeBSD style) to a future minor revision,
due to it not being possible to accurately emulate them with named attributes.
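
The window between the two operations can be shown with an ordinary file standing in for a named attribute. This is only a Python sketch under that assumption; replace_in_two_steps() and its between_steps hook are hypothetical names, with the hook marking the point where another client could read.

```python
import os
import tempfile


def replace_in_two_steps(path, new_value, between_steps=None):
    """Replace a value as two operations: truncate to 0, then write at 0."""
    fd = os.open(path, os.O_WRONLY)
    try:
        os.ftruncate(fd, 0)             # step 1: like "Setattr size=0"
        if between_steps is not None:
            between_steps()             # another client could read here
        os.pwrite(fd, new_value, 0)     # step 2: like "write offset=0"
    finally:
        os.close(fd)


def read_value(path):
    with open(path, "rb") as f:
        return f.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        attr = os.path.join(d, "attr")
        with open(attr, "wb") as f:
            f.write(b"old value")
        seen = []
        replace_in_two_steps(attr, b"new value",
                             lambda: seen.append(read_value(attr)))
        print(seen, read_value(attr))
```

The reader that runs between the steps sees neither the old nor the new value but an empty one, which is what an atomically replaced extended attribute rules out.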

rick

> IMHO the Solaris/Illumos/Nexenta solution of O_XATTR provides a
> better
> integration into the Unix filesystem philosophy (everything is a
> file)
> and already reached such a common market penetration that common of
> the shelf shells like bash and ksh integrated support for them.
>
> Ced
> --
> Cedric Blancher <cedric....@gmail.com>
> Institute Pasteur
>

Joshuah Hurst

Nov 26, 2013, 5:53:10 PM
On Tue, Nov 26, 2013 at 11:42 PM, Rick Macklem <rmac...@uoguelph.ca> wrote:
> Cedric Blancher wrote:
>> On 26 November 2013 13:27, Lionel Cons <lionelc...@gmail.com>
>> wrote:
> The NFSv4 working group seems to have decided to add support for atomically
> set extended attributes (Linux, FreeBSD style) to a future minor revision,
> due to it not being possible to accurately emulate them with named attributes.

ABSOLUTELY not. Any such attempt will be voted down by NetApp,
Oracle, and others. The extended attributes are nonstandard and not
even backed by ANY other standard (e.g. POSIX, Single UNIX Specification).
This will not happen, except for vendor-specific extensions not part
of any NFSv4 RFC.

Whatever shit or FUD Linux may invent, Linux-style extended attributes
will NOT be part of the official NFS4.x *standard*

Josh

Rick Macklem

Nov 26, 2013, 6:12:40 PM
Joshuah Hurst wrote:
> On Tue, Nov 26, 2013 at 11:42 PM, Rick Macklem <rmac...@uoguelph.ca>
> wrote:
> > Cedric Blancher wrote:
> >> On 26 November 2013 13:27, Lionel Cons <lionelc...@gmail.com>
> >> wrote:
> > The NFSv4 working group seems to have decided to add support for
> > atomically
> > set extended attributes (Linux, FreeBSD style) to a future minor
> > revision,
> > due to it not being possible to accurately emulate them with named
> > attributes.
>
> ABSOLUTELY not. Any such attempt will be voted down, as of NetApp,
> Oracle, and others. The extended attributes are nonstandard and not
> even backed by ANY other standard (e.g. POSIX, Single UNIX Standard).
> This will not happen, except for vendor-specific extensions not part
> of any NFSv4 RFC.
>
> Whatever shit or FUD Linux may invent, Linux-style extended
> attributes
> will NOT be part of the official NFS4.x *standard*
>
Well, here's a URL for one of the messages in the mailing list
thread. Several people from NetApp and Oracle participate in
the working group and I don't see them complaining about it in
the mailing list thread.
http://www.ietf.org/mail-archive/web/nfsv4/current/msg12498.html

You are welcome to join the working group mailing list and comment.

rick

Wojciech Puchar

Nov 26, 2013, 6:01:39 PM
>> there are additional functions for manipulating EAs that go well beyond
>> the Solaris extensions to the directory and file I/O functions. Assuming you
>> want to be able to get/set as well as enumerate or remove EAs, then
>> you might just as well add getxattr(2), listxattr(2), removexattr(2), setxattr(2)
>> too and follow the herd (Linux and OS X, so far).
>
> You mean 'follow the lemmings down into the abyss'? :)

Seems so. Why not just make two files when two "data streams" are needed?

If compatibility is needed with other OSes (like Windows with SMB export),
just add this to Samba.

Joshuah Hurst

Nov 26, 2013, 6:24:33 PM
On Tue, Nov 26, 2013 at 11:42 PM, Rick Macklem <rmac...@uoguelph.ca> wrote:
> Cedric Blancher wrote:
>> On 26 November 2013 13:27, Lionel Cons <lionelc...@gmail.com>
>> wrote:
>> > On 26 November 2013 11:19, Jordan Hubbard
>> > <jordan....@gmail.com> wrote:
>> >>
>> >> On Nov 26, 2013, at 1:51 AM, Cedric Blancher
>> >> <cedric....@gmail.com> wrote:
>> >>
>> >>> 1. You do not need more syscalls. Solaris uses the plain openat()
>> >>> syscall for this, with the O_XATTR flag passed to the normal
>> >>> open()/openat() flags to open a named attribute. Likewise read(),
>> >>> write(), mmap() etc work, too.
>> >>
>> >> I don’t know if I’d go so far as to say “you do not need more
>> >> syscalls”;
>> >> there are additional functions for manipulating EAs that go well
>> >> beyond
>> >> the Solaris extensions to the directory and file I/O functions.
>> >> Assuming you
>> >> want to be able to get/set as well as enumerate or remove EAs,
>> >> then
>> >> you might just as well add getxattr(2), listxattr(2),
>> >> removexattr(2), setxattr(2)
>> >> too and follow the herd (Linux and OS X, so far).
>> >
>> > You mean 'follow the lemmings down into the abyss'? :)
>> >
Sun once proposed a new flag to renameat() to atomically swap two
files. That would solve this problem without new syscalls or extra
APIs. It would even fit into existing NFSv4 requests without major
protocol changes.
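
The proposed swap flag cannot be called from this sketch, so as a stand-in it uses the plain rename that already exists: write the new value to a temporary file in the same directory, then rename it over the old one. That replaces rather than swaps, so the old value is discarded instead of being kept under the other name. The function name replace_atomically() is hypothetical, and an ordinary file again stands in for the attribute.

```python
import os
import tempfile


def replace_atomically(path, new_value):
    """Write new_value to a temporary file, then rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    # Same directory, so the rename stays within one filesystem.
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(new_value)
            f.flush()
            os.fsync(f.fileno())    # make the data durable before renaming
        os.replace(tmp, path)       # rename: readers see old or new, not a mix
    except BaseException:
        try:
            os.unlink(tmp)          # do not leave the temporary file behind
        except FileNotFoundError:
            pass
        raise
```

Unlike a truncate followed by a write, there is no point at which a reader opening the path finds an empty or half-written value.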

Josh