On Tue, 8 Apr 2025 at 20:17, 'Richard W.M. Jones' via libnfs <lib...@googlegroups.com> wrote:
>
> Hi,
>
> Firstly I couldn't work out how to subscribe to the libnfs mailing
> list, so hopefully this message finds its way to the right people.
>
> I'm trying to add libnfs bindings to our pluggable Network Block
> Device (NBD) server (https://gitlab.com/nbdkit/nbdkit) and I have a
> few technical questions. This could eventually be an alternative to /
> replacement for the qemu block layer libnfs driver
> (https://gitlab.com/qemu-project/qemu/-/blob/master/block/nfs.c?ref_type=heads).
> I want performance to be the best that is reasonably possible. I have
> a few questions about the best way to structure this.
>
> (1) nbdkit is multithreaded, with each NBD client read/write request
> being handled from a pool of threads. An easy way to add libnfs
> support would simply be to use the libnfs synchronous API from the
> thread that handles the request.
>
> Another possibility (which we have used in other plugins) is to start
> one or more background worker threads, and use the libnfs asynchronous
> API from those worker thread(s) only.
>
> Do you have an opinion on which of these would have better performance?
> And if the second, how many worker threads to use?
Async operations I think will always give the best performance. They
allow you to have really high concurrency without a ridiculous number
of threads. There are some users who need to replicate enormous
amounts of data, and sometimes they reach tens of thousands of
concurrent operations.
(I also personally think async/event-driven designs are nicer than
multithreaded ones.)
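
To make the shape of that concrete, here is a minimal sketch of the
async pattern: queue a request, then drive the context from a poll()
loop with nfs_get_fd()/nfs_which_events()/nfs_service() until the
callback fires. It assumes the pre-5.0 nfs_pread_async() signature
(offset, count, callback, private data); libnfs 5.x instead takes a
caller-supplied buffer, so check the nfs.h you build against. The
server, export and file names are placeholders, and error handling is
abbreviated.

/*
 * Sketch of the libnfs async pattern: issue a read without blocking,
 * then service the context from a poll() loop until the callback runs.
 * Assumes the pre-5.0 nfs_pread_async() signature; libnfs 5.x passes a
 * caller-supplied buffer instead.
 */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <poll.h>
#include <nfsc/libnfs.h>

struct cb_state {
        int finished;
};

static void
pread_cb (int err, struct nfs_context *nfs, void *data, void *private_data)
{
        struct cb_state *state = private_data;

        if (err < 0)
                fprintf (stderr, "pread failed: %s\n", nfs_get_error (nfs));
        else
                printf ("read %d bytes\n", err); /* data points at the bytes read */
        state->finished = 1;
}

int
main (void)
{
        struct nfs_context *nfs = nfs_init_context ();
        struct nfsfh *fh;
        struct cb_state state = { 0 };
        struct pollfd pfd;

        /* Server, export and file name are placeholders. */
        if (nfs_mount (nfs, "nfs.example.com", "/mnt") != 0 ||
            nfs_open (nfs, "/disk.img", O_RDONLY, &fh) != 0) {
                fprintf (stderr, "%s\n", nfs_get_error (nfs));
                exit (EXIT_FAILURE);
        }

        /* Queue the read; many of these can be outstanding at once. */
        if (nfs_pread_async (nfs, fh, 0, 4096, pread_cb, &state) != 0) {
                fprintf (stderr, "%s\n", nfs_get_error (nfs));
                exit (EXIT_FAILURE);
        }

        /* Event loop: poll the context's fd and let libnfs make progress. */
        while (!state.finished) {
                pfd.fd = nfs_get_fd (nfs);
                pfd.events = nfs_which_events (nfs);
                if (poll (&pfd, 1, -1) < 0)
                        break;
                if (nfs_service (nfs, pfd.revents) < 0)
                        break;
        }

        nfs_close (nfs, fh);
        nfs_destroy_context (nfs);
        return 0;
}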
>
> (2) For specifying the connection options, we could map each libnfs
> feature (eg. server name, NFS version, etc) into a separate nbdkit
> option, which would look to the user like:
>
> nbdkit nfs server=nfs.example.com mount=/mnt file=disk.img version=4
>
> or we could use the libnfs URI format:
>
> nbdkit nfs 'nfs://nfs.example.com/mnt?disk.img?version=4'
>
> The second one seems like the best option, but any opinions / catches
> we should be aware of?
No issue as far as I can see.
Which approach you use is more a policy question for your app.
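
If you go with the URI form, libnfs can do the parsing for you. A
small sketch, assuming nfs_parse_url_full() behaves as in current
releases (it splits the URL into server/export/file and applies query
arguments such as version=4 to the context); the struct nfs_url field
names below should be double-checked against the libnfs version you
target.

/*
 * Sketch of consuming the URI form with libnfs's own parser, so the
 * nbdkit plugin would not have to re-implement the option grammar.
 */
#include <stdio.h>
#include <stdlib.h>
#include <nfsc/libnfs.h>

int
main (int argc, char **argv)
{
        struct nfs_context *nfs = nfs_init_context ();
        struct nfs_url *url;

        if (argc < 2) {
                fprintf (stderr, "usage: %s nfs://server/export/file?opts\n",
                         argv[0]);
                exit (EXIT_FAILURE);
        }

        /* The URI string would come straight from the nbdkit command line. */
        url = nfs_parse_url_full (nfs, argv[1]);
        if (url == NULL) {
                fprintf (stderr, "%s\n", nfs_get_error (nfs));
                exit (EXIT_FAILURE);
        }

        printf ("server=%s export=%s file=%s\n",
                url->server, url->path, url->file);

        nfs_destroy_url (url);
        nfs_destroy_context (nfs);
        return 0;
}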
>
> (3) NBD has a property called "multiconn" which is quite critical to
> performance. When this property is advertised it allows a single
> client to safely make multiple connections to the server. However we
> can only advertise this property safely if 'fsync' on one connection
> also persists writes that have been completed by other connections.
> The exact wording from the spec is:
You cannot yet do multiple sessions for one context, but you can use
multiple contexts, each connected to the same server/share, and just
round-robin across them. See for example examples/nfs-pthreads-writefile.c.

To do the kind of fsync you mention, you would need to create a
wrapper that sends a sync across all the sessions. The filehandles are
shared across all clients and sessions on the server side, so you can
take the nfsfh from one session and use it on the other sessions, and
it is still guaranteed to map to the same open file in memory on the
server.

I strongly doubt that you will need to do this in the case of NFS
though, as all servers already guarantee that if you sync a filehandle
on one connection, this flushes the data on ALL connections
server-side.

For example, kernel NFS clients often open multiple sessions, and when
you write to a file, all writes are round-robin distributed across the
different sessions. Then when the app does a sync, a single COMMIT is
sent on whichever session is next in the round-robin scheme, and
everything is updated and flushed correctly on the server.

The behavior you mention sounds like an NBD-specific requirement.
Maybe some NBD servers have a cache that is local to each connection;
that is not the case for NFS.
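
For reference, a rough sketch of that multi-context arrangement: a
small pool of contexts all mounted on the same server/export, a
round-robin pick per request, and the belt-and-braces
flush-on-every-session wrapper mentioned above. It uses the
synchronous nfs_fsync() for brevity (a real nbdkit plugin would use
the async calls), and the pool struct and helper names are only
illustrative, not libnfs API.

/*
 * Sketch of round-robin dispatch over several nfs_contexts connected
 * to the same server/export, plus a flush-across-all-sessions wrapper.
 */
#include <stdatomic.h>
#include <stdio.h>
#include <nfsc/libnfs.h>

#define NR_CONTEXTS 4

struct nfs_pool {
        struct nfs_context *ctx[NR_CONTEXTS];  /* all mounted on the same export */
        atomic_uint next;                      /* round-robin counter */
};

/* Pick the next context for a read/write request. */
struct nfs_context *
pool_next (struct nfs_pool *pool)
{
        unsigned i = atomic_fetch_add (&pool->next, 1) % NR_CONTEXTS;
        return pool->ctx[i];
}

/*
 * Flush on every session.  As noted above, this is probably
 * unnecessary for NFS, because a COMMIT on one connection already
 * covers writes made on the others; it only illustrates the wrapper
 * that was mentioned.
 */
int
pool_fsync_all (struct nfs_pool *pool, struct nfsfh *fh)
{
        for (int i = 0; i < NR_CONTEXTS; i++) {
                /* An nfsfh opened on one context maps to the same
                 * server-side file when used on the others. */
                if (nfs_fsync (pool->ctx[i], fh) != 0) {
                        fprintf (stderr, "fsync: %s\n",
                                 nfs_get_error (pool->ctx[i]));
                        return -1;
                }
        }
        return 0;
}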
>
> bit 8, NBD_FLAG_CAN_MULTI_CONN: Indicates that the server operates
> entirely without cache, or that the cache it uses is shared among
> all connections to the given device. In particular, if this flag is
> present, then the effects of NBD_CMD_FLUSH and NBD_CMD_FLAG_FUA MUST
> be visible across all connections when the server sends its reply to
> that command to the client. In the absence of this flag, clients
> SHOULD NOT multiplex their commands over more than one connection to
> the export.
> [https://github.com/NetworkBlockDevice/nbd/blob/master/doc/proto.md]
>
> Determining this property usually involves examining the server side
> of whatever we are connecting to - an NFS server in this case - but I
> wonder if you would know the answer here?
You do not have to worry about it. This is normal for NFS. All
connections share the same cache, so a flush/COMMIT on one session
will do the right thing for all sessions.
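
On the nbdkit side, that guarantee is what would let the plugin answer
the multiconn question unconditionally. A hypothetical fragment, with
the storage stubbed out as an in-memory buffer so it stays
self-contained (a real nfs plugin would back these callbacks with
libnfs calls):

/*
 * Hypothetical nbdkit plugin showing can_multi_conn, given the NFS
 * guarantee described above.  The in-memory buffer stands in for the
 * NFS-backed file.
 */
#include <stdint.h>
#include <string.h>

#define NBDKIT_API_VERSION 2
#include <nbdkit-plugin.h>

#define THREAD_MODEL NBDKIT_THREAD_MODEL_PARALLEL

static char disk[1024 * 1024];          /* stand-in for the NFS file */

static void *
example_open (int readonly)
{
        return disk;                    /* any non-NULL handle */
}

static int64_t
example_get_size (void *handle)
{
        return sizeof disk;
}

static int
example_pread (void *handle, void *buf, uint32_t count, uint64_t offset,
               uint32_t flags)
{
        memcpy (buf, disk + offset, count);
        return 0;
}

/* Safe to advertise because NFS flushes are visible to every connection. */
static int
example_can_multi_conn (void *handle)
{
        return 1;
}

static struct nbdkit_plugin plugin = {
        .name           = "example",
        .open           = example_open,
        .get_size       = example_get_size,
        .pread          = example_pread,
        .can_multi_conn = example_can_multi_conn,
};

NBDKIT_REGISTER_PLUGIN (plugin)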
>
> This question also depends on the answer to (1) since we may be able
> to serialize fsync through a single worker thread.
>
> (4) NBD supports: trim/discard (hole punching) NBD_CMD_TRIM; and
> writing zeroes NBD_CMD_WRITE_ZEROES. I may be missing something, but
> I don't see anything like that in the API. Is that not supported? By
> NFS itself or just by libnfs?
NFSv3 does not have this.
NFSv4 might be able to support it, but I have not looked into it.
Open an issue for this and I can see whether discard or write-zero is
possible on v4.