# gnt-instance modify -t drbd -n secondary-node instance-name
Sat Jan 29 21:22:42 2011 Converting template to drbd
Sat Jan 29 21:22:42 2011 Creating additional volumes...
Sat Jan 29 21:22:44 2011 Renaming original volumes...
Sat Jan 29 21:22:44 2011 Initializing DRBD devices...
Sat Jan 29 21:22:45 2011 - INFO: Waiting for instance
secondary-node to sync disks.
Sat Jan 29 21:22:56 2011 - INFO: Instance instance-name's
disks are in sync.
Failure: command execution error:
There are some degraded disks for this instance, please cleanup manually
Does anyone have a pointer to what "cleanup manually" entails?
/proc/drbd on the primary:
version: 8.3.7 (api:88/proto:86-91)
srcversion: EE47D8BF18AC166BE219757
0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/Inconsistent C r----
ns:0 nr:0 dw:0 dr:984 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b
oos:10485760
...and on the secondary:
version: 8.3.7 (api:88/proto:86-91)
srcversion: EE47D8BF18AC166BE219757
0: cs:StandAlone ro:Secondary/Unknown ds:Inconsistent/DUnknown r----
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b
oos:10485760
Clearly the secondary needs to be told to sync with the primary, but I
don't know how to do that with no DRBD resources defined. Also, the output
from the conversion indicates that the disks are in sync (which I don't believe).
Advice welcome!
Thanks,
Keith
Basically it means "you have to make DRBD work again manually".
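Since Ganeti manages the DRBD devices itself (there are no resources in
/etc/drbd.conf to point drbdadm at), the usual first step is to let Ganeti
tear down and re-assemble the devices rather than drive DRBD by hand. A
rough sketch, using the instance name from the transcript above; whether
replace-disks is the right remedy depends on why the peers disconnected:

```shell
# Ask Ganeti to tear down and re-assemble the instance's DRBD devices;
# this re-runs the device setup on both nodes and often re-establishes
# the connection between the peers.
gnt-instance deactivate-disks instance-name
gnt-instance activate-disks instance-name

# If the disks still come up degraded, have Ganeti recreate the
# secondary's copy from the primary:
gnt-instance replace-disks -s instance-name
```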
> /proc/drbd on the primary:
>
> version: 8.3.7 (api:88/proto:86-91)
> srcversion: EE47D8BF18AC166BE219757
> 0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/Inconsistent C r----
> ns:0 nr:0 dw:0 dr:984 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b
> oos:10485760
>
> ...and on the secondary:
>
> version: 8.3.7 (api:88/proto:86-91)
> srcversion: EE47D8BF18AC166BE219757
> 0: cs:StandAlone ro:Secondary/Unknown ds:Inconsistent/DUnknown r----
> ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b
> oos:10485760
>
> Clearly the secondary needs to be told to sync with the primary, but I
> don't know how to do that with no DRBD resources defined. Also, the output
> from the conversion indicates that the disks are in sync (which I don't believe).
The message means that the sync has finished, but not necessarily
successfully.
Have you run "gnt-cluster verify", and does it finish OK?
iustin
Yes:
# gnt-cluster verify
Sun Jan 30 08:55:23 2011 * Verifying global settings
Sun Jan 30 08:55:23 2011 * Gathering data (2 nodes)
Sun Jan 30 08:55:24 2011 * Verifying node status
Sun Jan 30 08:55:24 2011 * Verifying instance status
Sun Jan 30 08:55:24 2011 * Verifying orphan volumes
Sun Jan 30 08:55:24 2011 * Verifying orphan instances
Sun Jan 30 08:55:24 2011 * Verifying N+1 Memory redundancy
Sun Jan 30 08:55:24 2011 * Other Notes
Sun Jan 30 08:55:24 2011 - NOTICE: 6 non-redundant instance(s) found.
Sun Jan 30 08:55:24 2011 * Hooks Results
I'll set up a manual DRBD resource and see if that works.
Thanks for the help.
Keith
Hmm. My guess is that the DRBD usermode helper is not correctly set, but
cluster verify should have warned about that. In any case, what is your
current usermode_helper?
You should also look at dmesg, usually it gives the reason why the peers
have disconnected.
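For checking this, the helper the running module is actually using can be
read from sysfs; it may differ from what /etc/modules requests if something
else (e.g. udev) loaded the module first. A quick sketch:

```shell
# The usermode helper the loaded drbd module is actually using:
cat /sys/module/drbd/parameters/usermode_helper

# Kernel messages usually explain why the peers disconnected:
dmesg | grep -i drbd
```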
iustin
$ grep drbd /etc/modules
drbd minor_count=128 usermode_helper=/bin/true
...BUT: thanks for the clue. I looked in dmesg and saw:
On master:
[ 842.689988] block drbd0: meta connection shut down by peer.
On secondary:
[10815.644569] block drbd0: helper command: /sbin/drbdadm
before-resync-target minor-0
[10815.645210] block drbd0: helper command: /sbin/drbdadm
before-resync-target minor-0 exit code 10 (0xa00)
...which shows the incorrect usermode_helper. Further experimentation
showed that the drbd module was being loaded by udev (or at least prior
to /etc/modules being read), so I created an /etc/modprobe.d/local file
and put the options in there.
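For anyone following along, the modprobe.d file needs only a single
"options" line; the values below are taken from the /etc/modules line
quoted earlier (Ganeti wants the helper disabled via /bin/true so DRBD
never tries to invoke drbdadm against a non-existent config):

```shell
# /etc/modprobe.d/local -- applied regardless of who loads the module
options drbd minor_count=128 usermode_helper=/bin/true
```

After adding it, the module has to be reloaded (rmmod drbd && modprobe
drbd, with no DRBD devices in use) or the node rebooted for the option
to take effect.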
That seems to have partly fixed the problem: synchronisation now
definitely starts, but it still fails with read errors. I'll look into
that and see if I can resolve it.
Thanks for your help, and hopefully the detail above will help anyone else
in a similar position.
Regards,
Keith