New testing storage problem

Sébastien CAPS

Mar 2, 2012, 3:41:47 AM
to fhgfs-user
Hi,

I'm testing your storage solution and I'm stuck at the moment.
I'm running CentOS 6.2 with SELinux disabled on 2 test VMs.
- everything is resolvable and reverse DNS is OK
- passwordless SSH is enabled on all hosts
- everything has been created with fhgfs-admon-gui

I have changed the client config to
sysMountSanityCheckMS=0
and when I mount the filesystem and run "df", it shows an I/O error.

Also, fhgfs-df shows the storage targets as "emergency". Is this normal?

In the admon log file I have:
"MetadataTk findOwner Unable to proceed without a working root node"
How can I fix this?

I have also tried fhgfs-fsck, but I'm not sure about the options (log below).

So should I initialize something?

If I should provide any other information to help debug this, do not
hesitate to ask.

(0) Mar02 10:17:54 Worker1 [MetadataTk::referenceOwner] >> Unable to
proceed without a working root node
(0) Mar02 10:17:54 Mongoose Thread 139698978682624
[MetadataTk::referenceOwner] >> Unable to proceed without a working
root node
(0) Mar02 10:17:58 Mongoose Thread 139698978682624
[MetadataTk::referenceOwner] >> Unable to proceed without a working
root node
(0) Mar02 10:17:58 Mongoose Thread 139698978682624
[MetadataTk::referenceOwner] >> Unable to proceed without a working
root node
(0) Mar02 10:17:59 Mongoose Thread 139698978682624
[MetadataTk::referenceOwner] >> Unable to proceed without a working
root node
(0) Mar02 10:17:59 Mongoose Thread 139698978682624
[MetadataTk::referenceOwner] >> Unable to proceed without a working
root node
(0) Mar02 10:18:00 Mongoose Thread 139698978682624
[MetadataTk::referenceOwner] >> Unable to proceed without a working
root node
(0) Mar02 10:18:02 Mongoose Thread 139698978682624
[MetadataTk::referenceOwner] >> Unable to proceed without a working
root node
(0) Mar02 10:18:05 Mongoose Thread 139698978682624
[MetadataTk::referenceOwner] >> Unable to proceed without a working
root node
(0) Mar02 10:18:07 Mongoose Thread 139698978682624
[MetadataTk::findOwner] >> Unable to proceed without a working root
node
(0) Mar02 10:18:11 Mongoose Thread 139698978682624
[MetadataTk::findOwner] >> Unable to proceed without a working root
node
(0) Mar02 10:18:14 Mongoose Thread 139698978682624
[MetadataTk::findOwner] >> Unable to proceed without a working root
node
(0) Mar02 10:18:20 Worker3 [MetadataTk::findOwner] >> Unable to
proceed without a working root node
(0) Mar02 10:18:20 Mongoose Thread 139698978682624
[MetadataTk::findOwner] >> Unable to proceed without a working root
node
(0) Mar02 10:18:24 Mongoose Thread 139698978682624
[MetadataTk::referenceOwner] >> Unable to proceed without a working
root node
(0) Mar02 10:18:31 Mongoose Thread 139698978682624
[MetadataTk::referenceOwner] >> Unable to proceed without a working
root node
(0) Mar02 10:18:33 Mongoose Thread 139698978682624
[MetadataTk::referenceOwner] >> Unable to proceed without a working
root node
(0) Mar02 10:18:39 Worker1 [MetadataTk::referenceOwner] >> Unable to
proceed without a working root node
(0) Mar02 10:18:39 Mongoose Thread 139698978682624
[MetadataTk::referenceOwner] >> Unable to proceed without a working
root node


[root@ip131 fhgfs]# ls /data/fhgfs/storage/
chunks format.conf lock.pid originalNodeID targetID
[root@ip131 fhgfs]# ls /data/fhgfs/meta/
entries format.conf lock.pid originalNodeID structure
[root@ip131 fhgfs]# ls /data/fhgfs/m
meta/ mgmtd/
[root@ip131 fhgfs]# ls /data/fhgfs/mgmtd/
clients.nodes format.conf lock.pid meta.nodes originalNodeID
storage.nodes targets



[root@ip131 fhgfs]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/vda1 5.5G 1.2G 4.0G 23% /
tmpfs 980M 0 980M 0% /dev/shm
df: `/mnt/fhgfs': Remote I/O error
[root@ip131 fhgfs]# fhgfs-df
METADATA SERVERS:
[LOW]
ip131.changed.example.tld (Total: 5.4GB / Free: 4.3GB)
ip147.changed.example.tld (Total: 5.4GB / Free: 4.0GB)


STORAGE TARGETS:
[EMERGENCY]
1-4F4FA5D9-ip131.changed.example.tld (Total: 5.4GB / Free:
4.3GB)
1-4F508F99-ip147.changed.example.tld (Total: 5.4GB / Free:
4.0GB)

[root@ip131 fhgfs]# fhgfs-df --help
METADATA SERVERS:
[LOW]
ip131.changed.example.tld (Total: 5.4GB / Free: 4.3GB)
ip147.changed.example.tld (Total: 5.4GB / Free: 4.0GB)


STORAGE TARGETS:
[EMERGENCY]
1-4F4FA5D9-ip131.changed.example.tld (Total: 5.4GB / Free:
4.3GB)
1-4F508F99-ip147.changed.example.tld (Total: 5.4GB / Free:
4.0GB)

[root@ip131 fhgfs]# fhgfs-fsck mode=FULLCHECK checkPath=/data/fhgfs/storage/

--------------------------------------------------------------------
Started FhGFS fsck in forward check mode [Fri Mar 2 10:27:48 2012]
Output will be written to /var/log/fhgfs-fsck.log
--------------------------------------------------------------------

Step 1 : Checking reachability of nodes : OK!
Step 2 : Checking metadata for file and directory links :
(0) 10:27:48 Main [MetadataTk::findOwner] >> Unable to proceed without
a working root node


--------------------------------------------------------------------
Could not find owner node for path: /data/fhgfs/storage/
--------------------------------------------------------------------


Non-recoverable error occured. Please fix manually before running FS
check again.

--------------------------------------------------------------------
Started FhGFS fsck in backward check mode [Fri Mar 2 10:27:49 2012]
Output will be written to /var/log/fhgfs-fsck.log
--------------------------------------------------------------------

Step 1 : Checking reachability of nodes : OK!
Step 2 : Checking file and directory links for metadata:
\

--------------------------------------------------------------------
Backward check mode finished
--------------------------------------------------------------------

[root@ip131 fhgfs]# fhgfs-fsck mode=FULLCHECK checkPath=/

--------------------------------------------------------------------
Started FhGFS fsck in forward check mode [Fri Mar 2 10:27:55 2012]
Output will be written to /var/log/fhgfs-fsck.log
--------------------------------------------------------------------

Step 1 : Checking reachability of nodes : OK!
Step 2 : Checking metadata for file and directory links :
(0) 10:27:55 Main [MetadataTk::referenceOwner] >> Unable to proceed
without a working root node


--------------------------------------------------------------------
Could not find owner node for path: /
--------------------------------------------------------------------


Non-recoverable error occured. Please fix manually before running FS
check again.

--------------------------------------------------------------------
Started FhGFS fsck in backward check mode [Fri Mar 2 10:27:56 2012]
Output will be written to /var/log/fhgfs-fsck.log
--------------------------------------------------------------------

Step 1 : Checking reachability of nodes : OK!
Step 2 : Checking file and directory links for metadata:
\

--------------------------------------------------------------------
Backward check mode finished
--------------------------------------------------------------------

Sébastien CAPS

Mar 2, 2012, 4:27:58 AM
to fhgfs-user
Some more info: the setup log shows a warning. Is this normal?
Setup log:

Installing FhGFS Client module on host ip147.....
--------------------
Loaded plugins: fastestmirror, security
Loading mirror speeds from cached hostfile
* base: ftp.belnet.be
* epel: be.mirror.eurid.eu
* extras: ftp.belnet.be
* updates: centos.mirror.transip.nl
Setting up Install Process
Package fhgfs-client-2011.04.r14-el6.x86_64 already installed and
latest version
Nothing to do
- FhGFS module autobuild
Building fhgfs-client-opentk module
Building fhgfs client module
WARNING: could not find /opt/fhgfs/src/client/fhgfs_client_module/
build/../source/closed/components/.AckManager.o.cmd for /opt/fhgfs/src/
client/fhgfs_client_module/build/../source/closed/components/
AckManager.o

Christian Mohrbacher

Mar 2, 2012, 5:01:58 AM
to fhgfs...@googlegroups.com
Hi Sebastien,

On 03/02/2012 09:41 AM, Sébastien CAPS wrote:
>
> when and with fhgfs-df it show storage in "emergency" is this normal ?

This is normal. The emergency state just tells you that your free disk
space of 4GB on this server is lower than a specified limit. This limit
can be set with the parameter 'tuneStorageSpaceEmergencyLimit' in
/etc/fhgfs/fhgfs-mgmtd.conf.
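
To see whether that limit is set explicitly on the management host, something like the following could be used. This is only a sketch: the file path and the parameter name are the ones given above, the helper name show_emergency_limit is made up for illustration, and it assumes the config file uses the usual "key = value" layout.

```shell
# Print the line that sets tuneStorageSpaceEmergencyLimit, if any.
# The optional argument overrides the config path (default taken from the reply above).
# Returns non-zero when the parameter is not set in the file.
show_emergency_limit() {
    conf="${1:-/etc/fhgfs/fhgfs-mgmtd.conf}"
    grep '^[[:space:]]*tuneStorageSpaceEmergencyLimit[[:space:]]*=' "$conf"
}
```

Calling show_emergency_limit without arguments reads the default path. Commented-out lines are deliberately not matched, so no output means the parameter is not set explicitly in that file.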

> On the admon log file I have :
> "MetadataTk findOwner Unable to proceed without a working root node"
> how could i fix this ?

FhGFS needs a root metadata server, which manages the root of your
filesystem and is the starting point for every new client request. In your
case the information about the root metadata node seems to be wrong,
and this seems to cause all your problems. To find out which server is
set as the root metadata server, you can use the following command:

fhgfs-ctl mode=listnodes nodetype=meta print_details

In the last line of the output you can see your root metadata server. Is
this one of 'ip131.changed.example.tld' or 'ip147.changed.example.tld'?
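
If you want to script that check, a rough sketch could look like the following. The exact layout of the print_details output is not shown in this thread, so the helper (root_line_mentions, a made-up name) only does a plain substring match of the expected hostnames against the last non-empty line; treat it as an assumption-laden convenience, not as a parser for the real format.

```shell
# Reads the output of
#   fhgfs-ctl mode=listnodes nodetype=meta print_details
# on stdin and checks whether its last non-empty line mentions one of the
# hostnames passed as arguments. Prints that line and returns 0 on a match.
root_line_mentions() {
    # Drop blank lines first so a trailing empty line does not hide the real last line.
    last=$(grep -v '^[[:space:]]*$' | tail -n 1)
    for host in "$@"; do
        case "$last" in
            *"$host"*) printf '%s\n' "$last"; return 0 ;;
        esac
    done
    printf 'last line matches none of the given hosts: %s\n' "$last" >&2
    return 1
}
```

Usage would be along the lines of: fhgfs-ctl mode=listnodes nodetype=meta print_details | root_line_mentions ip131.changed.example.tld ip147.changed.example.tld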

>
> I have also try fhgfs-fsck but i'm not sure of the option (log below)
>
> So should i initialize something ?
>

You could change sysMountSanityCheckMS back to 10000, then try starting
the client again and provide us the logfile /var/log/fhgfs_client.log.
Maybe the client's sanity check gives us some more information in the log.
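
Resetting the value can be done by hand in an editor, or roughly like this. The sketch assumes the client config lives at /etc/fhgfs/fhgfs-client.conf (the thread only names the mgmtd config path explicitly) and that the setting is on a single "key=value" line; the function name set_sanity_check_ms is made up for illustration.

```shell
# Set sysMountSanityCheckMS to the given number of milliseconds.
# Second argument: config file (the default path is an assumption, see above).
# A backup copy of the file is kept next to it with a .bak suffix.
set_sanity_check_ms() {
    value="$1"
    conf="${2:-/etc/fhgfs/fhgfs-client.conf}"
    # Accept only plain non-negative integers.
    case "$value" in
        ''|*[!0-9]*) printf 'not a number: %s\n' "$value" >&2; return 1 ;;
    esac
    # Refuse to continue if the key is missing, since sed would silently do nothing.
    if ! grep -q '^[[:space:]]*sysMountSanityCheckMS[[:space:]]*=' "$conf"; then
        printf 'sysMountSanityCheckMS not found in %s\n' "$conf" >&2
        return 1
    fi
    sed -i.bak "s/^[[:space:]]*sysMountSanityCheckMS[[:space:]]*=.*/sysMountSanityCheckMS = $value/" "$conf"
}
```

After running set_sanity_check_ms 10000 and starting the client again, the log mentioned above can be inspected with, for example, tail -n 50 /var/log/fhgfs_client.log.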


> [root@ip131 fhgfs]# fhgfs-fsck mode=FULLCHECK checkPath=/
>
> --------------------------------------------------------------------
> Started FhGFS fsck in forward check mode [Fri Mar 2 10:27:55 2012]
> Output will be written to /var/log/fhgfs-fsck.log
> --------------------------------------------------------------------
>
> Step 1 : Checking reachability of nodes : OK!
> Step 2 : Checking metadata for file and directory links :
> (0) 10:27:55 Main [MetadataTk::referenceOwner]>> Unable to proceed
> without a working root node
>
>
> --------------------------------------------------------------------
> Could not find owner node for path: /
> --------------------------------------------------------------------
>
>
> Non-recoverable error occured. Please fix manually before running FS
> check again.

With fsck you are not checking a storage or metadata server
directly; you are checking FhGFS as a real filesystem (so basically
from a FhGFS client's point of view).

The option 'checkPath' takes a path relative to your FhGFS root. So
having '/' there would mean checking from the root of your FhGFS (which
means checking the whole FhGFS then), so you could also just leave that
option out.

But in your case this will not work either, because you do not have a
working root node, so the fsck can not find an entry point into the
filesystem. Besides that, you do not have a filesystem yet, so there is
nothing to check, as fsck is really just to check for errors in the
files and directories of a FhGFS, not for checking configuration.
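
To illustrate what "relative to your FhGFS root" means in practice, here is a small sketch that turns a path under the client mount point into a value usable for checkPath. The mount point /mnt/fhgfs is the one from the df output earlier in the thread; the helper name to_check_path and the example subdirectory are made up.

```shell
# Convert a path below the client mount point into a checkPath value.
#   $1: mount point of the FhGFS client (e.g. /mnt/fhgfs)
#   $2: path inside the mounted filesystem
# Prints the path relative to the FhGFS root, or fails if $2 is not below $1.
to_check_path() {
    mount_point="${1%/}"   # tolerate a trailing slash on the mount point
    path="$2"
    case "$path" in
        "$mount_point")   printf '/\n' ;;
        "$mount_point"/*) printf '%s\n' "${path#"$mount_point"}" ;;
        *) printf 'not inside %s: %s\n' "$mount_point" "$path" >&2; return 1 ;;
    esac
}
```

For example, to_check_path /mnt/fhgfs /mnt/fhgfs/some/dir yields /some/dir, which could then be passed as checkPath=/some/dir. A server-side directory such as /data/fhgfs/storage/ is rejected, because it is not a path inside the mounted FhGFS.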

Regards,
Christian

--
=====================================================
| Christian Mohrbacher |
| Competence Center for High Performance Computing |
| Institut fuer Techno- und |
| Wirtschaftsmathematik (ITWM) |
| Fraunhofer-Platz 1 |
| |
| D-67663 Kaiserslautern |
=====================================================
| Tel: (49) 631 31600 4425 |
| Fax: (49) 631 31600 1099 |
| |
| E-Mail: christian....@itwm.fraunhofer.de |
| Internet: http://www.itwm.fraunhofer.de |
=====================================================

Sven Breuner

Mar 2, 2012, 7:37:19 AM
to fhgfs...@googlegroups.com
Hi,

Sébastien CAPS wrote on 03/02/2012 10:27 AM:
> - FhGFS module autobuild
> Building fhgfs-client-opentk module
> Building fhgfs client module
> WARNING: could not find /opt/fhgfs/src/client/fhgfs_client_module/
> build/../source/closed/components/.AckManager.o.cmd for /opt/fhgfs/src/
> client/fhgfs_client_module/build/../source/closed/components/
> AckManager.o

This warning can be ignored. It's just the kernel's module build process
complaining about a missing .o.cmd file in the closed source part of fhgfs.
However, this .o.cmd file is not relevant for the build process and the
warning will be gone when we release fhgfs version 2011.04-r15 next
week, which will contain a completely open-sourced client.

Best regards,
Sven
Fraunhofer

Sébastien CAPS

Mar 2, 2012, 10:39:27 AM
to fhgfs-user
Many thanks for all the replies :)
I will try to fix the root metadata server problem, restarting from
scratch to be sure, and will keep you posted :)

Some other questions:
Is there a way to "promote" a metadata server to the role of root metadata
server with "fhgfs-ctl"?

And are there any more examples, documentation, man pages or wiki entries
for the fhgfs-ctl utility?

Thanks again.
Sebastien Caps

Frank Kautz

Mar 2, 2012, 11:03:04 AM
to fhgfs...@googlegroups.com
Hi Sebastien,

On 03/02/2012 04:39 PM, Sébastien CAPS wrote:
> Many for all the replies :)
> I will try to fix the root metadata server problem and restarting from
> scratch to be sure and keep you in touch :)
>
> Some other questions :
> is there a way to "promote" a metaserver to the role of root metadata
> server with "fhgfs-ctl" ?

fhgfs-ctl doesn't have a mode for changing the root metadata server.
However, after installation, the first metadata server to be started
automatically becomes the root metadata server.

>
> And is there any more examples/documentations/man/wiki entry for the
> fhgfs-ctl utility ?

No, sorry. We get a lot of requests for fhgfs-ctl documentation, and it
is on our to-do list. fhgfs-ctl has a built-in help with examples.
Start fhgfs-ctl without any parameters to print it. The help for a
specific mode of fhgfs-ctl is printed with a command like the
following: "fhgfs-ctl mode=MODENAME help".

>
> Thanks again.
> Sebastien Caps

Please do not hesitate to ask if you have further questions!

Greetings,
Frank


Sébastien CAPS

Mar 5, 2012, 9:02:15 AM
to fhgfs-user
OK, I have reinstalled everything from scratch and now everything works :)
Thanks everyone!
Sebastien Caps