cannot mount beegfs

Daofeng Li

Sep 14, 2016, 3:47:55 PM
to beegfs-user
Dear group,

This is pretty strange, as this is the only server where I have this issue. None of the following approaches worked.
Does anyone have suggestions? Thanks.

$ sudo /etc/init.d/beegfs-helperd restart
Shutting down BeeGFS Client Helper Daemon:                 [  OK  ]
Starting BeeGFS Client Helper Daemon:                      [  OK  ]

$ sudo /etc/init.d/beegfs-client restart
Shutting down BeeGFS Client:
- Unmounting directories from /etc/beegfs/beegfs-mounts.conf
- Unloading modules
rmmod: ERROR: Module beegfs is in use
rmmod failed:

$ sudo rmmod -f beegfs
rmmod: ERROR: ../libkmod/libkmod-module.c:793 kmod_module_remove_module() could not remove 'beegfs': Resource temporarily unavailable
rmmod: ERROR: could not remove module beegfs: Resource temporarily unavailable

$ sudo /sbin/modprobe -r -f beegfs
modprobe: FATAL: Module beegfs is in use.

$ ps aux | grep beegfs
root      5691  0.0  0.0      0     0 ?        S<   Sep13   0:00 [beegfs-rwPgWQ]
root      5714  0.0  0.0      0     0 ?        S    Sep13   0:01 [beegfs_DGramLis]
root      5715  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/1]
root      5716  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/2]
root      5717  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/3]
root      5718  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/4]
root      5719  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/5]
root      5720  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/6]
root      5721  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/7]
root      5722  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/8]
root      5723  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/9]
root      5724  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/1]
root      5725  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/1]
root      5726  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/1]
root      5727  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/1]
root      5728  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/1]
root      5729  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/1]
root      5730  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/1]
root      5731  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/1]
root      5732  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/1]
root      5733  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/1]
root      5734  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/2]
root      5735  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/2]
root      5736  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/2]
root      5737  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/2]
root      5738  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/2]
root      5739  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/2]
root      5740  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/2]
root      5741  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/2]
root      5742  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/2]
root      5743  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/2]
root      5744  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/3]
root      5745  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/3]
root      5746  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/3]
root      5747  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/3]
root      5748  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/3]
root      5749  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/3]
root      5750  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/3]
root      5751  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/3]
root      5752  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/3]
root      5753  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/3]
root      5754  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/4]
root      5755  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/4]
root      5756  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/4]
root      5757  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/4]
root      5758  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/4]
root      5759  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/4]
root      5760  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/4]
root      5761  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/4]
root      5762  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Worker/4]
root      5763  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_RtrWrk/1]
root      5764  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_XNodeSyn]
root      5765  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_AckMgr]
root      5766  0.0  0.0      0     0 ?        S    Sep13   0:00 [beegfs_Flusher]

$ sudo kill -9  $(ps aux | grep -e beegfs | awk '{ print $2 }')

Sven Breuner

Sep 15, 2016, 4:55:26 PM
to fhgfs...@googlegroups.com, Daofeng Li
Hi Daofeng,

the Linux kernel is refusing the rmmod here, because it thinks that there are
still references to the BeeGFS module. One possible reason for this would be if
"umount -l" was used on the mountpoint and there are still processes holding a
reference to a file or directory within the previous mountpoint. Did you use
"umount -l"?
If so, identifying the processes holding the references is unfortunately a bit
difficult, because when you use the "lsof" command after a mountpoint has been
unmounted with "umount -l", the kernel will strip the path to the previous
mountpoint from the reference path.
To give one example:
If you have an application that has the file "/mnt/beegfs/mydir/myfile1" open
and use "umount -l /mnt/beegfs", then afterwards "lsof" will show the
application having "/mydir/myfile1" open (so "/mnt/beegfs" would be missing).
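
A minimal sketch of how one might search for such stripped references, under some assumptions: `find_stripped_refs` is a hypothetical helper name, `/mydir` stands for a directory you know existed at the top level of the BeeGFS mount, and the NAME column is taken to be the last field of each lsof line (which will not hold for paths containing spaces).

```shell
# find_stripped_refs PREFIX
# Reads lsof output on stdin and prints the lines whose last field
# (assumed to be the NAME column) starts with PREFIX.
find_stripped_refs() {
    awk -v prefix="$1" 'index($NF, prefix) == 1'
}

# Possible usage (needs root to see all processes):
#   sudo lsof 2>/dev/null | find_stripped_refs /mydir
```

The second column of each matching line is the PID of a process that may still be holding a reference. Unrelated files that happen to live under a real top-level directory of the same name would match too, so the result needs a manual check.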

Regarding the "kill -9": It is normal that this command is not able to kill the
threads that you listed below, because these are threads of the kernel module.
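
As a rough illustration of the point about kernel threads: on Linux, kernel threads are normally children of kthreadd (PID 2), so they can be told apart from ordinary processes by their parent PID. The helper name `kernel_threads` is hypothetical, and the sketch assumes the BeeGFS threads were created through the usual kthread interface and that command names contain no spaces.

```shell
# kernel_threads
# Reads "PID PPID COMM" lines on stdin and prints "PID COMM" for every
# entry whose parent is PID 2 (kthreadd), i.e. for kernel threads.
kernel_threads() {
    awk '$2 == 2 { print $1, $3 }'
}

# Possible usage:
#   ps -eo pid=,ppid=,comm= | kernel_threads | grep beegfs
```

Anything this prints cannot be terminated with kill -9; such threads only go away when the module that owns them is unloaded.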

Best regards,
Sven


Daofeng Li wrote on 14.09.2016 21:47:
> Dear group,
>
> This is pretty strange as this is the only 1 server I have this issue...none of
> the following ways worked out....
> Anyone have some suggestions? Thanks.
>
> $ sudo /etc/init.d/beegfs-helperd restart
>
> Shutting down BeeGFS Client Helper Daemon: [ OK ]
>
> Starting BeeGFS Client Helper Daemon: [ OK ]
>
> $ sudo /etc/init.d/beegfs-client restart
>
> Shutting down BeeGFS Client:
>
> - Unmounting directories from /etc/beegfs/beegfs-mounts.conf
>
> - Unloading modules
>
> rmmod: ERROR: Module beegfs is in use
>
> rmmod failed:
>
>
> $ sudo rmmod -f beegfs
>
> rmmod: ERROR: ../libkmod/libkmod-module.c:793 kmod_module_remove_module() could
> not remove 'beegfs': Resource temporarily unavailable
>
> rmmod: ERROR: could not remove module beegfs: Resource temporarily unavailable
>
>
>
> $ sudo /sbin/modprobe -r -f beegfs
>
> modprobe: FATAL: Module beegfs is in use.
>
>
>
> ps aux | grep beegfs
>
> root 5691 0.0 0.0 0 0 ? S< Sep13 0:00 [beegfs-rwPgWQ]
>
> root 5714 0.0 0.0 0 0 ? S Sep13 0:01 [beegfs_DGramLis]
> [...]
>
> root 5759 0.0 0.0 0 0 ? S Sep13 0:00 [beegfs_Worker/4]
>
> root 5762 0.0 0.0 0 0 ? S Sep13 0:00 [beegfs_Worker/4]
>
> root 5763 0.0 0.0 0 0 ? S Sep13 0:00 [beegfs_RtrWrk/1]
>
> root 5764 0.0 0.0 0 0 ? S Sep13 0:00 [beegfs_XNodeSyn]
>
> root 5765 0.0 0.0 0 0 ? S Sep13 0:00 [beegfs_AckMgr]
>
> root 5766 0.0 0.0 0 0 ? S Sep13 0:00 [beegfs_Flusher]

Daofeng Li

Sep 16, 2016, 10:14:19 AM
to Sven Breuner, fhgfs...@googlegroups.com
Thank you so much Sven, that explains it very clearly.
I did use 'umount -l' because I was not able to stop the client daemon, and I had made some changes on the server side and wanted to re-mount.
It seems the only way is to reboot? Thanks again.

Daofeng

Sven Breuner

Sep 16, 2016, 10:59:22 AM
to fhgfs...@googlegroups.com, Daofeng Li
Hi Daofeng,

Daofeng Li wrote on 16.09.2016 16:13:
> Thank you so much Sven, that explains it very clearly.
> I did use 'umount -l' because I was not able to stop the client daemon, and I
> had made some changes on the server side and wanted to re-mount.
> It seems the only way is to reboot? Thanks again.

if you need to avoid a reboot of this machine, you could still try to identify
the processes that have the mountpoint referenced (using "lsof") and kill them,
although that is a bit inconvenient now after "umount -l", as mentioned earlier.
However, if they are all killed, then the rmmod should work and you don't need
to reboot.
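
One way to check whether the references are gone before retrying the rmmod is to read the module's reference count from sysfs. This is a sketch under assumptions: `module_refcnt` is a hypothetical helper name, and it relies on the kernel exposing /sys/module/NAME/refcnt, which is only the case when module unloading support is enabled.

```shell
# module_refcnt NAME [SYSFS_ROOT]
# Prints the reference count of kernel module NAME.
# SYSFS_ROOT defaults to /sys and is only a parameter to ease testing.
module_refcnt() {
    name="$1"
    sysfs="${2:-/sys}"
    file="$sysfs/module/$name/refcnt"
    if [ -r "$file" ]; then
        cat "$file"
    else
        echo "module $name not loaded (or refcnt not exposed)" >&2
        return 1
    fi
}

# Possible usage:
#   module_refcnt beegfs
```

A value of 0 suggests that rmmod should succeed; the same number appears in the "Used by" column of lsmod.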

If you encounter such a case again in the future where you want to unload the
beegfs module and cannot unmount due to some processes still having a reference
to the mountpoint, you can use this command...
$ fuser -k /mnt/beegfs
...to kill the processes that have a reference on the mountpoint.
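
A hedged sketch of how this could be wrapped into a small sequence: list the processes first, kill them, then do a regular (non-lazy) unmount. The function name `release_mount` and the DRY_RUN variable are hypothetical. The sketch also adds the -m option, which according to the fuser man page selects all processes accessing files on the file system mounted at that path rather than only those using the directory itself; that is an addition to the command above, not part of it.

```shell
# release_mount MOUNTPOINT
# Lists and then kills (SIGKILL) the processes using the file system
# mounted at MOUNTPOINT, and finally unmounts it.
# With DRY_RUN set to a non-empty value, the commands are only printed.
# Caution: a shell whose working directory is inside MOUNTPOINT is
# killed as well, so run this from elsewhere and as root.
release_mount() {
    mnt="$1"
    run="${DRY_RUN:+echo}"
    $run fuser -vm "$mnt"   # show which processes hold references
    $run fuser -km "$mnt"   # kill them
    $run umount "$mnt"      # regular unmount, not "umount -l"
}

# Possible usage:
#   DRY_RUN=1 release_mount /mnt/beegfs   # print the commands only
#   release_mount /mnt/beegfs             # actually run them
```

Once the unmount has succeeded, restarting the client through its init script should be able to unload the module.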

Have a nice weekend and best regards,
Sven

Daofeng Li

Sep 16, 2016, 11:12:01 AM
to Sven Breuner, fhgfs...@googlegroups.com
Thank you so much Sven!
I am not at a computer now but will try later.
Best.


--
Daofeng
