BeeGFS and network configuration

adrien....@epfl.ch

Aug 22, 2017, 11:21:16 AM

to beegfs-user

Dear BeeGFS users, dev-team,

I am currently evaluating a BeeGFS cluster for use in the Blue Brain Project ( http://bluebrain.epfl.ch/ ) with a moderately sized instance.

So far we are very happy with the usability of the system. However, there is some information related to network usage that I do not see documented and that I would like the dev team to confirm if possible.

1 - What are the protocols and port ranges used by all the BeeGFS services (mgmt, meta, storage, client)?

From my brief investigation, it requires entire port ranges to be open for both UDP and TCP on the server side, and also on the client side.
Could this be documented formally somewhere, if possible with some firewall / NAT configuration recommendations from the dev team?

2 - Does BeeGFS support any kind of authentication? Client -> server authentication, but also server <-> server?

Even something as primitive as pre-shared key authentication would be good enough for me.

3 - Is there any way to protect the data integrity of a BeeGFS client over an untrusted network?

Something in the style of IPsec AH, without having to enforce IPsec manually on every node.

4 - Does BeeGFS support IPv6?

The BeeGFS protocols seem quite sensitive to NAT to me; it would be good if we could simply set up BeeGFS over IPv6 in an infrastructure that mixes private and public IPv4 ranges.

Thank you in advance,

Best Regards,

Adrien Devresse
Blue Brain Project, Switzerland

Sven Breuner

Aug 27, 2017, 3:58:58 PM

to fhgfs...@googlegroups.com, adrien....@epfl.ch

Hi Adrien,

please find my answers inline...

adrien....@epfl.ch wrote on 22.08.2017 17:21:
> Dear BeeGFS users, dev-team,
>
> I am currently evaluating a BeeGFS cluster for use in the Blue Brain
> Project ( http://bluebrain.epfl.ch/ ) with a moderately sized instance.
>
> So far we are very happy with the usability of the system. However, there is

Thanks, glad to hear that.

> some information related to network usage that I do not see documented and
> that I would like the dev team to confirm if possible.
>
> 1 - What are the protocols and port ranges used by all the BeeGFS services
> (mgmt, meta, storage, client)?
>
> From my brief investigation, it requires entire port ranges to be open for
> both UDP and TCP on the server side, and also on the client side.
> Could this be documented formally somewhere, if possible with some firewall /
> NAT configuration recommendations from the dev team?

Thanks for the hint. You can find a first attempt at documenting this here:
https://www.beegfs.io/wiki/NetworkTuning#firewall
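
Once the firewall rules are in place, a quick reachability probe can confirm that the server ports are actually open from a given client. The following is a minimal sketch, not an official tool: the port numbers are what I understand the defaults to be and should be treated as assumptions (check the conn*Port* settings in your own config files), and the helper names are hypothetical. It only tests TCP; a plain connect cannot verify the UDP ports.

```python
import socket

# Assumed default TCP ports for the BeeGFS server daemons. These are
# assumptions for illustration; check the connMgmtdPortTCP, connMetaPortTCP
# and connStoragePortTCP settings in your own config files.
ASSUMED_TCP_PORTS = {"mgmtd": 8008, "meta": 8005, "storage": 8003}


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_services(hosts, ports=ASSUMED_TCP_PORTS):
    """Probe each service; `hosts` maps a service name to its hostname."""
    return {svc: tcp_port_open(host, ports[svc]) for svc, host in hosts.items()}
```

Calling check_services({"mgmtd": "mgmt01", "meta": "meta01"}) with your own hostnames (these two are made up) returns a dict of service name to True/False.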

> 2 - Does BeeGFS support any kind of authentication? Client -> server
> authentication, but also server <-> server?
>
> Even something as primitive as pre-shared key authentication would be good
> enough for me.

Yes, see the option "connAuthFile" in the config files of the BeeGFS
mgmtd/meta/storage/client services (/etc/beegfs/beegfs-...conf).
This option defines a pre-shared secret, and requests are only accepted from
connections that can provide it.
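
As an illustration of preparing such a secret, here is a minimal sketch that writes a random secret to a file readable only by its owner. The function name and any file path you pass in are hypothetical, and it assumes a POSIX system; the assumption is that the same file would then be copied to every server and client and referenced by connAuthFile in each service's config.

```python
import os
import secrets


def write_shared_secret(path, nbytes=32):
    """Create `path` with owner-only permissions and a random hex secret."""
    secret = secrets.token_hex(nbytes)  # 2 hex characters per random byte
    # O_EXCL refuses to overwrite an existing secret file;
    # mode 0o600 means read/write for the owner only.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as f:
        f.write(secret)
    return secret
```

Refusing to overwrite an existing file is a deliberate choice here: silently replacing the secret on one host would lock that host out of the cluster.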

> 3 - Is there any way to protect the data integrity of a BeeGFS client over
> an untrusted network?
>
> Something in the style of IPsec AH, without having to enforce IPsec manually
> on every node.

There is currently no such mechanism built in. If only certain clients are
untrusted and performance for these clients is not critical, you might want to
consider re-exporting BeeGFS via NFS (by mounting a BeeGFS client somewhere in
your trusted network and re-exporting this BeeGFS client mountpoint via the
kernel's NFSv4 server) to make use of the security and authentication features
of NFS.
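
To make the re-export idea concrete, here is a small sketch that formats an /etc/exports entry for the BeeGFS client mountpoint. Everything in the example call is an assumption for illustration: the mount point, the client network and the fsid value are made up, "sec=krb5i" (Kerberos with integrity checking) presupposes a working Kerberos setup, and I believe an explicit fsid is needed because the exported filesystem is not backed by a block device.

```python
def exports_line(path, client, options):
    """Format one /etc/exports entry: '<path> <client>(<opt>,<opt>,...)'."""
    if not options:
        raise ValueError("at least one export option is required")
    return f"{path} {client}({','.join(options)})"


# Hypothetical values: mount point, client network and fsid are examples only.
line = exports_line(
    "/mnt/beegfs",
    "192.0.2.0/24",
    ["rw", "sec=krb5i", "fsid=1", "no_subtree_check"],
)
print(line)  # /mnt/beegfs 192.0.2.0/24(rw,sec=krb5i,fsid=1,no_subtree_check)
```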

> 4 - Does BeeGFS support IPv6?
>
> The BeeGFS protocols seem quite sensitive to NAT to me; it would be good if
> we could simply set up BeeGFS over IPv6 in an infrastructure that mixes
> private and public IPv4 ranges.

Currently not. While we are aware that this is becoming increasingly interesting
and we believe that adding IPv6 support will generally not be a big task, it's
something that we just have not done yet.

Best regards,
Sven

--
Sven Breuner
CEO
ThinkParQ GmbH

Adrien Devresse

Aug 28, 2017, 4:54:55 AM

to Sven Breuner, fhgfs...@googlegroups.com

Hi Sven,

Thank you very much for this detailed information.

> Thanks for the hint. You can find the first attempt to document this
> here:
> https://www.beegfs.io/wiki/NetworkTuning#firewall

Thank you very much for that, it is very useful.

>
> Yes, see option "connAuthFile" in the config files of the BeeGFS
> mgmtd/meta/storage/client services (/etc/beegfs/beegfs-...conf).
> This option defines a pre-shared secret and requests are only accepted
> from connections that can provide the pre-shared secret.

Perfect, that is enough for us.

> There is currently no such mechanism built in. If only certain clients
> are untrusted and performance for these clients is not critical, you
> might want to consider re-exporting BeeGFS via NFS (by mounting a
> BeeGFS client somewhere in your trusted network and re-exporting this
> BeeGFS client mountpoint via the kernel's NFSv4 server) to make use of
> the security and authentication features of NFS.

Ok that's fine.
NFSv4 is indeed an acceptable compromise. Access over the WAN is not
performance-critical anyway.

> Currently not. While we are aware that this is becoming increasingly
> interesting and we believe that adding IPv6 support will generally not
> be a big task, it's something that we just have not done yet.

Thank you, that would be very much appreciated.
We are (very likely) going to buy a BeeGFS cluster with commercial
support soon. I might get back to you about that.

Best Regards,
Adrien

On 27.08.17 at 21:58, Sven Breuner wrote:

Oliver Freyermuth

Aug 28, 2017, 5:07:24 AM

to fhgfs...@googlegroups.com, Adrien Devresse, Sven Breuner

Hi Sven (and others on this list),

while we're at it...

On 28.08.2017 at 10:54, Adrien Devresse wrote:
>> There is currently no such mechanism built in. If only certain clients
>> are untrusted and performance for these clients is not critical, you
>> might want to consider re-exporting BeeGFS via NFS (by mounting a
>> BeeGFS client somewhere in your trusted network and re-exporting this
>> BeeGFS client mountpoint via the kernel's NFSv4 server) to make use of
>> the security and authentication features of NFS.
>
> Ok that's fine.
> NFSv4 is indeed an acceptable compromise. Access over the WAN is not
> performance-critical anyway.
>

I asked a more generic question on this list a while ago (Exporting BeeGFS with load balancing, HA and authentication),
which sadly has gone unanswered so far.
We are now also looking at using "standard" NFSv4 to export our FS (servers are behind NAT).

However, I spent quite some time trying to get NFS Ganesha to work, since pNFS would come with some performance benefits.
Sadly, it seems that the "instability" of the path/name <=> raw kernel file handle mapping breaks the pNFS approach without special tricks (which NFS Ganesha does not use yet).

Is there any other pNFS implementation which has been tested successfully with BeeGFS? What are other people using?

All the best,
Oliver

Adrien Devresse

Aug 28, 2017, 6:05:02 AM

to fhgfs...@googlegroups.com

Hi Oliver,

> However, I spent quite some time trying to get NFS Ganesha to work, since pNFS would come with some performance benefits.
> Sadly, it seems that the "instability" of the path/name <=> raw kernel file handle mapping breaks the pNFS approach without special tricks (which NFS Ganesha does not use yet).

If I may comment on that, I would honestly avoid the pain of pNFS here.

Why not just set up multiple independent NFSv4 servers (without pNFS)
and load-balance your clients across them?

The complexity of pNFS's decoupling of metadata and data servers brings
very little advantage, since BeeGFS already takes care of that in the backend.

Regards,
Adrien

On 28.08.17 at 11:07, 'Oliver Freyermuth' via beegfs-user wrote:

Oliver Freyermuth

Aug 28, 2017, 6:15:14 AM

to fhgfs...@googlegroups.com, Adrien Devresse

Hi Adrien,

On 28.08.2017 at 12:04, Adrien Devresse wrote:
> If I may comment on that, I would honestly avoid the pain of pNFS here.
>
> Why not just set up multiple independent NFSv4 servers (without pNFS)
> and load-balance your clients across them?

this indeed appears to be the most "painless" solution currently available.

Ideally, with a good pNFS implementation, I would additionally have hoped to:
- Perform file "striping", i.e. 4 data servers with 1 Gbit/s Ethernet (on both the BeeGFS-internal and the external network side) would allow for up to 4 Gbit/s of throughput to a single NFS client.
- Let the metadata server "throw out" data servers that go offline, giving limited high availability (and, with a hot failover for the metadata server as well, full HA).
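
The striping estimate can be checked with a line of arithmetic. This sketch assumes the links are 1 Gbit/s Ethernet and ignores protocol overhead, so it gives a best-case upper bound only; the function name is hypothetical.

```python
def aggregate_throughput(n_servers, link_gbit_per_s):
    """Best-case aggregate throughput as (Gbit/s, GB/s), ignoring overhead."""
    gbit = n_servers * link_gbit_per_s
    return gbit, gbit / 8  # 8 bits per byte


print(aggregate_throughput(4, 1.0))  # (4.0, 0.5)
```

So four 1 Gbit/s links add up to at most 4 Gbit/s, which is about 0.5 GB/s.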

I guess we will go with several "classic" NFSv4 servers with DNS-round-robin, and then adjust the DNS entries if data servers fail.
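
To decide which DNS entries need adjusting when a server fails, one could periodically probe the addresses behind the round-robin name. The following is a rough sketch under stated assumptions: it assumes NFSv4 over TCP on the standard port 2049, treats a successful TCP connect as "healthy" (which says nothing about the state of the export itself), and all helper names are hypothetical.

```python
import socket

NFS_PORT = 2049  # standard NFS port


def resolve_all(hostname, port=NFS_PORT):
    """Return the sorted, de-duplicated IPv4 addresses behind a DNS name."""
    infos = socket.getaddrinfo(
        hostname, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    return sorted({info[4][0] for info in infos})


def reachable(addr, port=NFS_PORT, timeout=2.0):
    """Return True if a TCP connection to addr:port can be established."""
    try:
        with socket.create_connection((addr, port), timeout=timeout):
            return True
    except OSError:
        return False


def healthy_servers(addresses, probe=reachable):
    """Keep only the addresses whose probe succeeds, preserving order."""
    return [addr for addr in addresses if probe(addr)]
```

Addresses returned by resolve_all() but missing from healthy_servers() are the candidates for removal from the round-robin record.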

In the end, our goal is not maximum throughput per client, since people will (likely...) use the system to work on some small files.
They *should* not use it to work on big datasets in any case.

Many thanks and best regards,
Oliver
