Can't mount. beegfs-client won't start


Tobias Kühn

Mar 7, 2018, 8:06:42 AM
to beegfs-user
Hello everyone,
journalctl -xe:
Mär 07 13:35:37 management beegfs-client[4183]: Starting BeeGFS Client:
Mär 07 13:35:37 management beegfs-client[4183]: - Loading BeeGFS modules
Mär 07 13:35:37 management beegfs-client[4183]: - Mounting directories from /etc/beegfs/beegfs-mounts.conf
Mär 07 13:35:37 management kernel: beegfs: mount(4206): Mount sanity check failed. Canceling mount. (Log file may provide additional information. Check can be disabled with sysMountSanityCheckMS=0 in the config file.)
Mär 07 13:35:39 management beegfs-client[4183]: mount: mount beegfs_nodev on /mnt/beegfs failed: Die Operation wird abgebrochen
Mär 07 13:35:39 management systemd[1]: beegfs-client.service: main process exited, code=exited, status=32/n/a
Mär 07 13:35:39 management systemd[1]: Failed to start Start BeeGFS Client.

Logfile from client:
(3) Mar07 13:35:37 *beegfs_XNodeSyn(4208) [NodeConn (acquire stream)] >> Connected: beegfs-helperd@127.0.0.1:8006 (protocol: TCP)
(2) Mar07 13:35:37 *beegfs_DGramLis(4207) [Heartbeat incoming] >> New node: beegfs-mgmtd management [ID: 1];
(3) Mar07 13:35:37 *beegfs_XNodeSyn(4208) [Init] >> Management node found. Downloading node groups...
(3) Mar07 13:35:37 *beegfs_XNodeSyn(4208) [NodeConn (acquire stream)] >> Connected: beegfs-mgmtd@XX.XX.XX.221:8008 (protocol: TCP)
(2) Mar07 13:35:37 *beegfs_XNodeSyn(4208) [Sync] >> Nodes added (sync results): 1 (Type: beegfs-meta)
(2) Mar07 13:35:37 *beegfs_XNodeSyn(4208) [Sync] >> Nodes added (sync results): 1 (Type: beegfs-storage)
(3) Mar07 13:35:37 *beegfs_XNodeSyn(4208) [Init] >> Node registration...
(2) Mar07 13:35:37 *beegfs_XNodeSyn(4208) [Registration] >> Node registration successful.
(3) Mar07 13:35:37 *beegfs_XNodeSyn(4208) [Init] >> Init complete.
(3) Mar07 13:35:37 *mount(4206) [NodeConn (acquire stream)] >> Connected: beegfs-meta@XX.XX.XX.222:8005 (protocol: TCP)
(3) Mar07 13:35:37 *mount(4206) [NodeConn (acquire stream)] >> Connected: beegfs-storage@XX.XX.XX.220:8003 (protocol: TCP)
(2) Mar07 13:35:37 *mount(4206) [Remoting (stat storage targets)] >> Error target (storage): 1; Msg: Unknown storage target
(0) Mar07 13:35:37 *mount(4206) [Mount sanity check] >> Retrieval of storage server free space info failed. Are the storage servers running and registered at the management daemon? Did you remove a storage target directory on a server? (Error: Unknown storage target)
(2) Mar07 13:35:37 *mount(4206) [App (stop components)] >> Stopping components...
(2) Mar07 13:35:37 *beegfs_XNodeSyn(4208) [Deregistration] >> Node deregistration successful.
(2) Mar07 13:35:39 *mount(4206) [App (wait for component termination)] >> Still waiting for this component to stop: beegfs_AckMgr
(2) Mar07 13:35:39 *mount(4206) [App (wait for component termination)] >> Component stopped: beegfs_AckMgr
(1) Mar07 13:35:39 *mount(4206) [App (stop)] >> All components stopped.

As you can see from the client log file, I can't mount. The machine name is correct, and name resolution works as well.

I can also reach the storage server with ping (via IP and name).

My config looks like this:
/mnt/beegfs /etc/beegfs/beegfs-client.conf

Thank you.
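
With a client log this long, it can help to filter it down to the most severe entries before reading it in full. The following is a minimal Python sketch and not part of BeeGFS: the regular expression is inferred from the line layout of the log above, and the names `parse_line` and `most_severe` are hypothetical. It also assumes that level 0 marks the most severe messages, which matches the lines shown here but is not confirmed anywhere in this thread.

```python
import re

# Inferred layout: "(level) MonDD HH:MM:SS component [context] >> message"
LOG_LINE = re.compile(
    r"^\((?P<level>\d+)\)\s+"
    r"(?P<date>\w{3}\d{2})\s+(?P<time>\d{2}:\d{2}:\d{2})\s+"
    r"(?P<component>\S+)\s+"
    r"\[(?P<context>[^\]]+)\]\s+>>\s+(?P<message>.*)$"
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["level"] = int(fields["level"])
    return fields


def most_severe(lines, max_level=0):
    """Keep entries whose level is <= max_level (assumption: 0 is most severe)."""
    parsed = (parse_line(line) for line in lines)
    return [entry for entry in parsed if entry and entry["level"] <= max_level]


# Three lines copied from the client log above (wrapped lines joined).
SAMPLE = [
    "(3) Mar07 13:35:37 *mount(4206) [NodeConn (acquire stream)] >> "
    "Connected: beegfs-storage@XX.XX.XX.220:8003 (protocol: TCP)",
    "(2) Mar07 13:35:37 *mount(4206) [Remoting (stat storage targets)] >> "
    "Error target (storage): 1; Msg: Unknown storage target",
    "(0) Mar07 13:35:37 *mount(4206) [Mount sanity check] >> "
    "Retrieval of storage server free space info failed. "
    "Are the storage servers running and registered at the management daemon? "
    "Did you remove a storage target directory on a server? "
    "(Error: Unknown storage target)",
]

if __name__ == "__main__":
    for entry in most_severe(SAMPLE):
        print(entry["time"], entry["context"], "->", entry["message"])
```

Applied to the client log above, with wrapped lines joined first, this should leave only the `Mount sanity check` entry, which names the actual failure.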

Harry Mangalam

Mar 7, 2018, 5:55:47 PM
to beegfs-user
Taking a wild guess, it looks like one or more of your storage targets has wandered away:

Mar07 13:35:37 *mount(4206) [Remoting (stat storage targets)] >> Error target (storage): 1; Msg: Unknown storage target
(0) Mar07 13:35:37 *mount(4206) [Mount sanity check] >> Retrieval of storage server free space info failed. Are the storage servers running and registered at the management daemon? Did you remove a storage target directory on a server? (Error: Unknown storage target)

What does your admon UI say?
Or beegfs-net?
Or 'beegfs-ctl --listnodes --nodetype=storage --details'?
Or log into the storage nodes directly and check why they are not feeling 'beegy'. :)

hjm
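
For anyone who wants to collect the output of these checks in one go, here is a small Python sketch. It runs only the two commands quoted in this reply, with exactly the flags given there. The helper names `run_command` and `collect` are made up for illustration, and the 30-second timeout is an arbitrary assumption. As clarified later in the thread, `beegfs-net` has to be run on a client.

```python
import subprocess

# Commands quoted from the reply above; nothing here goes beyond those flags.
DIAGNOSTICS = [
    ["beegfs-net"],
    ["beegfs-ctl", "--listnodes", "--nodetype=storage", "--details"],
]


def run_command(argv, timeout=30):
    """Run one command and return (ok, output); a missing tool is reported, not raised."""
    try:
        result = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout
        )
    except FileNotFoundError:
        return False, f"{argv[0]}: not installed on this host"
    except subprocess.TimeoutExpired:
        return False, f"{argv[0]}: no answer after {timeout}s"
    output = (result.stdout + result.stderr).strip()
    return result.returncode == 0, output


def collect(commands=DIAGNOSTICS):
    """Map each command line (joined into a string) to its (ok, output) pair."""
    return {" ".join(argv): run_command(argv) for argv in commands}


if __name__ == "__main__":
    for name, (ok, output) in collect().items():
        print(f"== {name} [{'ok' if ok else 'FAILED'}]")
        print(output)
```

Keeping stdout and stderr together is deliberate: messages such as the `df` complaint seen later in this thread may arrive on stderr and would otherwise be lost.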

Tobias Kühn

Mar 8, 2018, 1:51:08 AM
to beegfs-user
beegfs-net says:
[root@management beegfs]# beegfs-net
df: keine Dateisysteme bearbeitet
No active BeeGFS mounts found.

and listnodes shows everything correctly:
root@managemen:beegfs-ctl --listnodes --nodetype=storage --details
storage [ID: 1]
   Ports: UDP: 8003; TCP: 8003
   Interfaces: eth0(TCP)

The log on the storage server shows this:
(3) Mar07 13:34:08 Main [RegDGramLis] >> Listening for UDP datagrams: Port 8003
(1) Mar07 13:34:08 Main [App] >> Waiting for beegfs-mgmtd@management:8008...
(2) Mar07 13:34:08 RegDGramLis [Heartbeat incoming] >> New node: beegfs-mgmtd management [ID: 1];
(3) Mar07 13:34:08 Main [NodeConn (acquire stream)] >> Connected: beegfs-mgmtd@XX.XX.XX.221:8008 (protocol: TCP)
(1) Mar07 13:34:08 Main [App] >> Version: 7.0-rc2
(2) Mar07 13:34:08 Main [App] >> LocalNode: beegfs-storage storage [ID: 1]
(2) Mar07 13:34:08 Main [App] >> Usable NICs: eth0(TCP)
(2) Mar07 13:34:08 Main [App] >> Storage targets: 1
(3) Mar07 13:34:08 Main [RegDGramLis] >> Listening for UDP datagrams: Port 8003
(2) Mar07 13:34:08 Main [Register node] >> Node registration successful.
(2) Mar07 13:34:08 Main [InternodeSyncer.cpp:617] >> Storage targets registration successful.
(1) Mar07 13:34:08 Main [Nodes sync] >> Root NodeID (from sync results): 1
(2) Mar07 13:34:08 Main [Sync results] >> Nodes added: 1 (Type: beegfs-meta)
(3) Mar07 13:34:08 Main [App] >> Registration and management info download complete.
(3) Mar07 13:34:08 Main [DGramLis] >> Listening for UDP datagrams: Port 8003
(3) Mar07 13:34:08 Main [ConnAccept] >> Listening for TCP connections: Port 8003
(3) Mar07 13:34:08 Main [App] >> 0 sessions restored.
(0) Mar07 13:35:15 XNodeSync [Messaging (RPC)] >> Communication error: Soft disconnect from XX.XX.XX.221:8008; Peer: beegfs-mgmtd management [ID: 1]. (Message type: GetStatesAndBuddyGroups (1053))
(2) Mar07 13:35:15 XNodeSync [MessagingTk.cpp:25] >> Retrying communication. peer: beegfs-mgmtd management [ID: 1]; message type: GetStatesAndBuddyGroups (1053)
(0) Mar07 13:35:37 Worker7 [StatStoragePathMsg (stat path)] >> Unknown targetID: 1
(0) Mar07 13:35:37 Worker10 [StatStoragePathMsg (stat path)] >> Unknown targetID: 2

I don't understand the soft disconnect. Do you have any idea what causes it?
Thanks for your support :)

I hope my system will feel beegy soon :)
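
One detail in the storage log above is worth pulling out: the server reports `Storage targets: 1` at startup, yet later rejects requests for target IDs 1 and 2 as unknown. A short Python sketch can extract both facts from a log so the mismatch is visible at a glance. The function name `summarize` and the two regular expressions are assumptions based only on the lines shown in this thread. Reading the mismatch as stale or inconsistent target registration is one possible interpretation, not a confirmed diagnosis.

```python
import re

# Patterns inferred from the storage log lines quoted in this thread.
TARGET_COUNT = re.compile(r"\[App\] >> Storage targets: (\d+)")
UNKNOWN_TARGET = re.compile(r">> Unknown targetID: (\d+)")


def summarize(log_text):
    """Return (reported target count or None, sorted list of unknown target IDs)."""
    counts = TARGET_COUNT.findall(log_text)
    # Use the last startup report if the log covers several restarts.
    reported = int(counts[-1]) if counts else None
    unknown = sorted({int(target) for target in UNKNOWN_TARGET.findall(log_text)})
    return reported, unknown


# Lines copied from the storage log above.
STORAGE_LOG = """\
(2) Mar07 13:34:08 Main [App] >> Storage targets: 1
(0) Mar07 13:35:37 Worker7 [StatStoragePathMsg (stat path)] >> Unknown targetID: 1
(0) Mar07 13:35:37 Worker10 [StatStoragePathMsg (stat path)] >> Unknown targetID: 2
"""

if __name__ == "__main__":
    reported, unknown = summarize(STORAGE_LOG)
    print(f"targets reported at startup: {reported}")
    print(f"target IDs rejected as unknown: {unknown}")
```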

harry mangalam

Mar 8, 2018, 11:33:54 AM
to fhgfs...@googlegroups.com, Tobias Kühn

On Wednesday, March 7, 2018 10:51:08 PM PST Tobias Kühn wrote:
> beegfs-net says:
> [root@management beegfs]# beegfs-net
> df: keine Dateisysteme bearbeitet
> No active BeeGFS mounts found.

Sorry - I should have been clearer - 'beegfs-net' has to be executed from a client.

> and listnodes shows everything correctly:
> root@managemen:beegfs-ctl --listnodes --nodetype=storage --details
> storage [ID: 1]
> Ports: UDP: 8003; TCP: 8003
> Interfaces: eth0(TCP)

 

This looks good (if you have a single storage node) or bad (if you have multiple ones). And I think the log file references unknown storage target IDs (1 and 2). So I think one or more of your storage targets isn't online or at least isn't communicating with the mgmt/admon/meta servers.

The beegfs-net command should reveal this.

hjm

 

 


 


--
Harry Mangalam, Research Computing, Rm 225 MSTB, UC Irvine
[m/c 2225] / 92697 Google Voice Multiplexer: (949) 478-4487
415 South Circle View Dr, Irvine, CA, 92697 [shipping]
XSEDE 'Campus Champion' - ask me about your research computing needs.

 

Tobias Kühn

Apr 3, 2018, 10:39:10 AM
to beegfs-user
I got it working now.

There must have been some issues with the last installation of BeeGFS. I had tested various situations and scenarios for installing BeeGFS,

and remnants of those attempts must have been left behind.

I reinstalled the OS and now everything is beegy :)

Thanks a lot.