[sheepdog-ng] I/O error and recovery


Valerio Pachera

Jul 19, 2016, 6:56:54 PM
to sheepdog-ng, Liu Yuan, Alessandro Bolgia
Hi all, I have a PRODUCTION cluster with 2 nodes:

dog node info
Id      Size    Used    Avail   Use%
 0      1.7 TB  142 GB  1.6 TB    8%
 1      3.4 TB  142 GB  3.3 TB    4%
Total   5.1 TB  284 GB  4.9 TB    5%

I noticed that a recovery has started:

node sheep01
Jul 19 22:33:22  ERROR [main] check_request_epoch(158) old node version 5, 4 (READ_PEER)
Jul 19 22:33:26   INFO [main] recover_object_main(864) object recovery progress   1%
Jul 19 22:33:29   INFO [main] recover_object_main(864) object recovery progress   2%
Jul 19 22:33:37   INFO [main] recover_object_main(864) object recovery progress   3%
Jul 19 22:33:49   INFO [main] recover_object_main(864) object recovery progress   4%
...
Jul 19 22:47:27   INFO [main] recover_object_main(864) object recovery progress  98%
Jul 19 22:47:34   INFO [main] recover_object_main(864) object recovery progress  99%


node sheep02
Jul 19 22:33:22  ERROR [io 10120] default_read_from_path(227) failed to read object 80a86f00004a6f, path=/mnt/sheep/0/0080a86f00004a6f, offset=3096576, size=102400, result=-1, Input/output error
Jul 19 22:33:22  ERROR [io 10120] err_to_sderr(79) oid=80a86f00004a6f, Input/output error
Jul 19 22:33:22   INFO [main] md_remove_disk(360) /mnt/sheep/0 from multi-disk array
Jul 19 22:33:22   INFO [main] zk_leave(1036) leaving from cluster
Jul 19 22:33:22  ERROR [main] check_request_epoch(158) old node version 5, 4 (READ_PEER)

It seems a disk was removed because of an I/O error on node sheep02.
This is the only data disk sheepdog is working on, apart from the metadata directory.
Anyway, the node is up and still shows the /mnt/sheep/0 device:

dog node md info --all
Id      Size    Used    Avail   Use%    Path
Node 0:
 0      1.7 TB  142 GB  1.6 TB    8%    /mnt/sheep/0
Node 1:
 0      3.4 TB  142 GB  3.3 TB    4%    /mnt/sheep/0

What do you think about it?

Here is some more info:

Sheepdog daemon version 0.9.0_327_gdc0496e

These are the sheep options (they are the same on both nodes, except for --myaddr):

sheep \
-n /var/lib/sheepdog,/mnt/sheep/0 \
--cluster zookeeper:192.168.6.111:2181,192.168.6.112:2181,192.168.6.80:2181 \
--myaddr 192.168.6.112 \
--ioaddr host=192.168.5.112,port=3333

dog cluster info -v
Cluster status: running, auto-recovery enabled
Cluster store: plain with 2 redundancy policy
Cluster vnode mode: disk
Cluster created at Tue Mar  8 17:46:48 2016

Epoch Time           Version
2016-07-19 22:33:22      5 [192.168.6.80:7000(1), 192.168.6.111:7000(1)]
2016-03-08 19:14:52      4 [192.168.6.80:7000(1), 192.168.6.111:7000(1), 192.168.6.112:7000(1)]
2016-03-08 19:13:55      3 [192.168.6.111:7000(1), 192.168.6.112:7000(1)]
2016-03-08 19:13:31      2 [192.168.6.80:7000(1), 192.168.6.111:7000(1), 192.168.6.112:7000(1)]
2016-03-08 17:46:48      1 [192.168.6.111:7000(1), 192.168.6.112:7000(1)]

Liu Yuan

Jul 20, 2016, 1:35:04 AM
to Valerio Pachera, sheepdog-ng, Alessandro Bolgia
This is the expected result, and the sheep will become a gateway-only node.

>
> Anyway the node is up and is still showing the /mnt/sheep/0 device:
>
> dog node md info --all
> Id Size Used Avail Use% Path
> Node 0:
> 0 1.7 TB 142 GB 1.6 TB 8% /mnt/sheep/0
> Node 1:
> 0 3.4 TB 142 GB 3.3 TB 4% /mnt/sheep/0
>
> What do you think about it?

This looks weird. If the disk was removed, we shouldn't see it in 'md info'.

Thanks,
Yuan

Valerio Pachera

Jul 20, 2016, 2:34:58 AM
to sheepdog-ng
2016-07-20 7:35 GMT+02:00 Liu Yuan <namei...@gmail.com>:

This looks weird. If the disk was removed, we shouldn't see it by 'md info'

I was thinking the same, but now, looking more carefully at 'dog cluster info', I see that the cluster had 3 nodes and that the node named sheep02 has been removed from the cluster.
The only data disk it had was removed by sheepdog.
I was expecting it to become gateway-only, but it may be better that it got removed from the cluster.

Looking at s.m.a.r.t. of the removed disk, I see a pending sector.
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       1

I think I'm going to run fsck.ext4 with the '-c' option to check for bad blocks.
Then I'll try to add the node back.
Do you recommend removing the metadata dir? Or also removing the objects?
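For anyone following along, the check described above could look roughly like this (a sketch; the device name /dev/sdb1 is an assumption for this example, and the filesystem must be unmounted before fsck runs):

```shell
# Confirm the SMART pending-sector count
# (/dev/sdb is an assumed device name)
smartctl -A /dev/sdb | grep -E 'Current_Pending_Sector|Reallocated_Sector_Ct'

# Unmount the sheepdog data disk before checking it
umount /mnt/sheep/0

# '-c' makes fsck run badblocks(8) and add any bad blocks
# it finds to the bad block inode; '-f' forces a full check
fsck.ext4 -f -c /dev/sdb1
```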




Valerio Pachera

Jul 20, 2016, 2:42:42 AM
to sheepdog-ng, Liu Yuan
2016-07-20 8:34 GMT+02:00 Valerio Pachera <sir...@gmail.com>


I think I'm going to run fsck.ext4 with '-c' options to check for badblocks.
Then I'll try to add the node back.
Do you recommend to remove the metadata dir? Or to remove also the objects?



Ok, now I see a weird thing: sheep is still running on the disconnected node (192.168.6.112)!

ip addr ls dev br6 | grep -w inet
    inet 192.168.6.112/24 brd 192.168.6.255 scope global br6

ps aux | grep sheep
root     19268  0.0  0.0  12748  2040 pts/0    S+   08:38   0:00 grep sheep
root     40707  0.0  0.3 1220364 15248 ?       Sl   mar08  47:01 sheep -n /var/lib/sheepdog /mnt/sheep/0 --cluster zookeeper:192.168.6.111:2181,192.168.6.112:2181,192.168.6.80:2181 --myaddr 192.168.6.112 --ioaddr host=192.168.5.112 port=3333
root     40708  0.0  0.0  37644  1272 ?        Ss   mar08   3:36 sheep -n /var/lib/sheepdog /mnt/sheep/0 --cluster zookeeper:192.168.6.111:2181,192.168.6.112:2181,192.168.6.80:2181 --myaddr 192.168.6.112 --ioaddr host=192.168.5.112 port=3333

dog node list
  Id   Host:Port         V-Nodes       Zone
   0   192.168.6.80:7000        436 1342613696
   1   192.168.6.111:7000       881 1862707392
 
What do you think about it?
Is it safe to kill the sheep process?

Valerio Pachera

Jul 20, 2016, 4:06:12 AM
to sheepdog-ng, Liu Yuan
2016-07-20 8:42 GMT+02:00 Valerio Pachera <sir...@gmail.com>:
2016-07-20 8:34 GMT+02:00 Valerio Pachera <sir...@gmail.com>


I think I'm going to run fsck.ext4 with '-c' options to check for badblocks.
Then I'll try to add the node back.
Do you recommend to remove the metadata dir? Or to remove also the objects?



Ok, now I see a weird thing: sheep is still running on the disconnected node (192.168.6.112)!!!

dog node list
  Id   Host:Port         V-Nodes       Zone
   0   192.168.6.80:7000        436 1342613696
   1   192.168.6.111:7000       881 1862707392


Another note:
the sheep process running on the disconnected node (192.168.6.112) accepts commands from dog.
So the process is not hung.
That confused me at first: I didn't notice right away that the node was disconnected from the cluster, because I was running dog on it and it was working.
That's really weird.

Liu Yuan

Jul 22, 2016, 5:03:41 AM
to Valerio Pachera, sheepdog-ng
I think right now the sheep is running as gateway-only, but there is no indicator in the dog output to confirm whether it is or not. It is still one of the members of the cluster; it just won't execute I/O requests on its local disk.

You can kill the node and restart it later, once you have checked the health of the disk.
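A possible sequence for that (a sketch; the restart command simply reuses the options quoted earlier in this thread, and stopping sheep with pkill rather than a service manager is an assumption about this setup):

```shell
# Stop the gateway-only sheep on the affected node
pkill -f '^sheep'

# ... check or replace the disk here ...

# Restart sheep with the same options as before
sheep -n /var/lib/sheepdog,/mnt/sheep/0 \
  --cluster zookeeper:192.168.6.111:2181,192.168.6.112:2181,192.168.6.80:2181 \
  --myaddr 192.168.6.112 \
  --ioaddr host=192.168.5.112,port=3333
```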

Thanks,
Yuan

Liu Yuan

Jul 22, 2016, 5:04:57 AM
to Valerio Pachera, sheepdog-ng
I think removing the data objects is good enough.
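In concrete terms, "removing the data objects" on the disk from this thread would amount to something like the following (a sketch; it assumes the object files sit directly under /mnt/sheep/0, as the earlier error log path /mnt/sheep/0/0080a86f00004a6f suggests, and that sheep is stopped on this node first):

```shell
# With sheep stopped on this node, clear the data objects
# while keeping the filesystem itself
rm -f /mnt/sheep/0/*

# The metadata dir from '-n /var/lib/sheepdog,...' is left
# untouched, per the advice above
ls /var/lib/sheepdog
```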

Thanks,
Yuan

Valerio Pachera

Jul 25, 2016, 9:45:43 AM
to sheepdog-ng
2016-07-22 11:05 GMT+02:00 Liu Yuan <namei...@gmail.com>:

I think removing the data objects is good enough.

In the end I replaced the drive, so I started with a clean hard disk.
I noticed that once I added the node back to the cluster, the sheep.log of another node instantly printed 'object recovery progress' from 0% to 100%, in no time.
What does that mean?
'dog node recovery' is "slowly" progressing with the recovery.


sheep.log
Jul 25 15:40:45   INFO [main] recover_object_main(864) object recovery progress  1%
...
Jul 25 15:40:45   INFO [main] recover_object_main(864) object recovery progress 100%

Valerio Pachera

Jul 25, 2016, 9:47:54 AM
to Liu Yuan, sheepdog-ng
2016-07-22 11:03 GMT+02:00 Liu Yuan <namei...@gmail.com>:
I think right now the sheep is running as gateway only but there is no indicator
of dog output to confirm it is or not.

Some days ago I did a quick test on my testing cluster, and as soon as I removed the disk with 'dog node md unplug', the node got removed from the cluster.
I'm going to test a couple more times with the latest sheepdog.
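For anyone reproducing this, the unplug/plug cycle described here looks like this (a sketch against a test cluster; the path is taken from this thread):

```shell
# Detach the data disk from the running sheep's multi-disk array
dog node md unplug /mnt/sheep/0

# Watch whether the node stays in the cluster as gateway-only
# or leaves entirely (the behaviour under discussion here)
dog node list

# Re-attach the disk once it is healthy again
dog node md plug /mnt/sheep/0
```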

Liu Yuan

Jul 27, 2016, 3:43:24 AM
to Valerio Pachera, sheepdog-ng
On Mon, Jul 25, 2016 at 03:45:42PM +0200, Valerio Pachera wrote:
> 2016-07-22 11:05 GMT+02:00 Liu Yuan <namei...@gmail.com>:
>
> >
> > I think removing the data objects is good enough.
> >
> In the end I substituted the drive so I started with a clean hard disk.
> I noticed that once I added it back to the cluster, in sheep.log of another
> node, it printed instantly 'object recovery progress' for 0% to 100% in no
> time.
> What does that mean?
> 'Dog node recovery' is "slowly" progressing with recovery.

Yes.

>
> sheep.log
> Jul 25 15:40:45 INFO [main] recover_object_main(864) object recovery
> progress 1%
> ...
> Jul 25 15:40:45 INFO [main] recover_object_main(864) object recovery
> progress 100%

Better double-check the data directory to see if the data is actually recovered.
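A quick spot check along those lines (a sketch; it assumes object files live directly under /mnt/sheep/0 on every node and that replica counts should roughly match with a 2-copy policy):

```shell
# Count object files on the freshly re-added node ...
ls /mnt/sheep/0 | wc -l

# ... and compare with a node that already held the data
ssh 192.168.6.111 'ls /mnt/sheep/0 | wc -l'

# Per-node usage should also converge as recovery completes
dog node info
```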

Thanks,
Yuan