GFS2 on Debian


Fito Coulter

Aug 3, 2024, 4:31:42 PM
to saigooleales

My immediate project is to mount a gfs2-formatted iscsi-exported device on multiple clients as a shared file system. For the moment, I am not interested in HA or fencing, although this may be important later on.

At this point, I feel like I should be able to mount the gfs2 file system on any/all of the clients, but this is where I hit the error: the mount command reports "no such file or directory", and dmesg and syslog report "gfs2: can't find protocol lock_dlm".

There are several other gfs2 guides out there, but many of them seem to be RH/CentOS specific, and for other cluster-management schemes besides corosync, like cman or pacemaker. Those aren't necessarily deal-breakers, but it's high-value to me to have this work on nearly-stock Debian Stretch.

This happens independently of whether the lockspace I am joining is "rpitest" or not. This suggests that lockspace names and cluster names are indeed the same thing, and/but that the dlm is evidently not aware of the corosync config?
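
For reference, here is roughly how the names are supposed to line up; a minimal sketch, assuming the corosync cluster is called "rpitest" (the fs name "shared1", the journal count, and the device path are just placeholders):

    # /etc/corosync/corosync.conf (excerpt) -- the cluster name becomes
    # the first half of the GFS2 lock table
    totem {
        version: 2
        cluster_name: rpitest
    }

    # format the shared device with a matching lock table
    # (-j 3 = three journals, i.e. three nodes; adjust to taste)
    mkfs.gfs2 -p lock_dlm -t rpitest:shared1 -j 3 /dev/sdX

    # the table recorded at mkfs time can also be overridden at mount time
    mount -t gfs2 -o locktable=rpitest:shared1 /dev/sdX /mnt/shared1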

So make sure you have DLM locking support enabled; otherwise file systems created with -p lock_dlm (the default lock protocol in mkfs.gfs2) won't be usable (without a mount-time override). The lock_nolock protocol works with single-node mounts only.
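
On Debian that means the dlm kernel module has to be loadable and dlm_controld has to be running before the mount. A rough sketch; the package and service names here are assumptions on my part, so check what Stretch actually ships:

    # userspace side of the DLM (package/service names assumed)
    apt-get install dlm-controld
    systemctl enable --now dlm

    # kernel side
    modprobe dlm
    modprobe gfs2
    dmesg | grep -i dlm

    # single-node escape hatch: bypass the DLM at mount time
    # (safe ONLY if no other node has the file system mounted)
    mount -t gfs2 -o lockproto=lock_nolock /dev/sdX /mnt/shared1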

In a few cases, the Linux file system API does not allow the clustered nature of GFS2 to be totally transparent; for example, programs using POSIX locks in GFS2 should avoid using the GETLK function since, in a clustered environment, the process ID may be for a different node in the cluster. In most cases however, the functionality of a GFS2 file system is identical to that of a local file system.

To get the best performance from GFS2, it is important to take into account the performance considerations which stem from the underlying design. Just like a local file system, GFS2 relies on the page cache in order to improve performance by local caching of frequently used data. In order to maintain coherency across the nodes in the cluster, cache control is provided by the glock state machine.

Make sure that your deployment of the Red Hat High Availability Add-On meets your needs and can be supported. Consult with an authorized Red Hat representative to verify your configuration prior to deployment.

You may see performance problems with GFS2 when many create and delete operations are issued from more than one node in the same directory at the same time. If this causes performance problems in your system, you should localize file creation and deletions by a node to directories specific to that node as much as possible.
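
If that pattern matters for your workload, one simple way to localize creates and deletes is a per-node working directory; for example (the path and variable name are only illustrative):

    # each node writes into its own subdirectory so that directory
    # lock traffic stays mostly local to that node
    mkdir -p /mnt/shared1/work/$(hostname -s)
    export SCRATCH=/mnt/shared1/work/$(hostname -s)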

GFS2 is based on a 64-bit architecture, which can theoretically accommodate an 8 EB file system. If your system requires larger GFS2 file systems than are currently supported, contact your Red Hat service representative.

When determining the size of your file system, you should consider your recovery needs. Running the fsck.gfs2 command on a very large file system can take a long time and consume a large amount of memory. Additionally, in the event of a disk or disk subsystem failure, recovery time is limited by the speed of your backup media. For information about the amount of memory the fsck.gfs2 command requires, see Determining required memory for running fsck.gfs2.
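
For what it's worth, fsck.gfs2 needs the file system unmounted on every node before it runs; something like the following (device path as in the earlier sketch):

    # unmount on ALL nodes first -- fsck.gfs2 needs exclusive access
    umount /mnt/shared1              # repeat on every node
    fsck.gfs2 -y /dev/sdX            # -y answers yes to repair prompts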

Although a GFS2 file system can be implemented in a standalone system or as part of a cluster configuration, Red Hat does not support the use of GFS2 as a single-node file system, with limited exceptions (such as mounting snapshots of cluster file systems for backup purposes).

Red Hat supports a number of high-performance single-node file systems that are optimized for single-node use and thus generally have lower overhead than a cluster file system. Red Hat recommends using these file systems in preference to GFS2 in cases where only a single node needs to mount the file system. For information about the file systems that Red Hat Enterprise Linux 9 supports, see Managing file systems.

When you configure a GFS2 file system as a cluster file system, you must ensure that all nodes in the cluster have access to the shared storage. Asymmetric cluster configurations in which some nodes have access to the shared storage and others do not are not supported. This does not require that all nodes actually mount the GFS2 file system itself.

Note that even though very large GFS2 file systems are possible, that does not mean they are recommended. The rule of thumb with GFS2 is that smaller is better: it is better to have ten 1TB file systems than one 10TB file system.

The mkfs.gfs2 command attempts to estimate an optimal block size based on device topology. In general, 4K blocks are the preferred block size because 4K is the default page size (memory) for Red Hat Enterprise Linux. Unlike some other file systems, GFS2 does most of its operations using 4K kernel buffers. If your block size is 4K, the kernel has to do less work to manipulate the buffers.

It is recommended that you use the default block size, which should yield the highest performance. You may need to use a different block size only if you require efficient storage of many very small files.
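
If you do need to override it, the block size is fixed at mkfs time; for example (device path and names as in the earlier sketch):

    # mkfs.gfs2 normally picks 4096 from the device topology;
    # -b overrides it, e.g. 1024 if the fs will hold many tiny files
    mkfs.gfs2 -p lock_dlm -t rpitest:shared1 -j 3 -b 4096 /dev/sdX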

When you run the mkfs.gfs2 command to create a GFS2 file system, you may specify the size of the journals. If you do not specify a size, it will default to 128MB, which should be optimal for most applications.

Some system administrators might think that 128MB is excessive and be tempted to reduce the size of the journal to the minimum of 8MB or a more conservative 32MB. While that might work, it can severely impact performance. Like many journaling file systems, every time GFS2 writes metadata, the metadata is committed to the journal before it is put into place. This ensures that if the system crashes or loses power, you will recover all of the metadata when the journal is automatically replayed at mount time. However, it does not take much file system activity to fill an 8MB journal, and when the journal is full, performance slows because GFS2 has to wait for writes to the storage.

It is generally recommended to use the default journal size of 128MB. If your file system is very small (for example, 5GB), having a 128MB journal might be impractical. If you have a larger file system and can afford the space, using 256MB journals might improve performance.
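
Both the journal size and the number of journals (one per node that will mount the file system) are mkfs-time options, and journals can be added later if the cluster grows. A sketch, with the same placeholder names as above:

    # -j = number of journals, -J = size of each journal in MB (default 128)
    mkfs.gfs2 -p lock_dlm -t rpitest:shared1 -j 3 -J 256 /dev/sdX

    # add one more journal to a mounted file system later
    gfs2_jadd -j 1 /mnt/shared1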

When a GFS2 file system is created with the mkfs.gfs2 command, it divides the storage into uniform slices known as resource groups. It attempts to estimate an optimal resource group size (ranging from 32MB to 2GB). You can override the default with the -r option of the mkfs.gfs2 command.

You should experiment with different resource group sizes to see which results in optimal performance. It is a best practice to experiment with a test cluster before deploying GFS2 into full production.

If your file system has too many resource groups, each of which is too small, block allocations can waste too much time searching tens of thousands of resource groups for a free block. The fuller your file system is, the more resource groups will be searched, and every one of them requires a cluster-wide lock. This leads to slow performance.

If, however, your file system has too few resource groups, each of which is too big, block allocations might contend more often for the same resource group lock, which also impacts performance. For example, if you have a 10GB file system that is carved up into five resource groups of 2GB, the nodes in your cluster will fight over those five resource groups more often than if the same file system were carved into 320 resource groups of 32MB. The problem is exacerbated if your file system is nearly full, because every block allocation might have to look through several resource groups before it finds one with a free block. GFS2 tries to mitigate this problem in two ways: first, when a resource group is completely full, it remembers that and avoids checking it for future allocations until a block is freed from it; second, when new blocks are added to an existing file (for example, by appending), GFS2 tries to place them in the same resource group as the rest of the file.
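
Concretely, the -r option mentioned above takes a size in megabytes; a sketch with the same placeholder names as before:

    # -r sets the resource group size in MB (roughly 32..2048); bigger
    # rgrps mean fewer cluster-wide locks but more contention per rgrp
    mkfs.gfs2 -p lock_dlm -t rpitest:shared1 -j 3 -r 512 /dev/sdX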

Deploying a cluster file system is not a "drop-in" replacement for a single-node deployment. Red Hat recommends that you allow a period of around 8-12 weeks of testing on new installations in order to exercise the system and ensure that it is working at the required performance level. During this period, any performance or functional issues can be worked out, and any queries should be directed to the Red Hat support team.

In computing, the Global File System 2 or GFS2 is a shared-disk file system for Linux computer clusters. GFS2 allows all members of a cluster to have direct concurrent access to the same shared block storage, in contrast to distributed file systems which distribute data throughout the cluster. GFS2 can also be used as a local file system on a single computer.

GFS2 has no disconnected operating mode, and no client or server roles. All nodes in a GFS2 cluster function as peers. Using GFS2 in a cluster requires hardware to allow access to the shared storage, and a lock manager to control access to the storage. The lock manager operates as a separate module: thus GFS2 can use the Distributed Lock Manager (DLM) for cluster configurations and the "nolock" lock manager for local file systems. Older versions of GFS also support GULM, a server-based lock manager which implements redundancy via failover.

Development of GFS began in 1995 and was originally developed by University of Minnesota professor Matthew O'Keefe and a group of students.[3] It was originally written for SGI's IRIX operating system, but in 1998 it was ported to Linux (2.4)[4] since the open source code provided a more convenient development platform. In late 1999/early 2000 it made its way to Sistina Software, where it lived for a time as an open-source project. In 2001, Sistina made the choice to make GFS a proprietary product.

Developers forked OpenGFS from the last public release of GFS and then further enhanced it to include updates allowing it to work with OpenDLM. But OpenGFS and OpenDLM became defunct, since Red Hat purchased Sistina in December 2003 and released GFS and many cluster-infrastructure pieces under the GPL in late June 2004.
