problem booting to multi-vdev root pool [Was: kern/150503: [zfs]

Andriy Gapon

Nov 16, 2012, 10:45:07 AM
on 13/11/2012 18:16 Guido Falsi said the following:
> My idea, but is just a speculation, i could be very wrong, is that the geom
> tasting code has some problem with multiple vdev root pools.

Guido,

you are absolutely correct. The code for reconstructing/tasting a root pool
configuration is modified upstream code, so it inherited a limitation from it:
support for only a single top-level vdev in a root pool.
I have an idea how to add the missing support, but it turned out not to be
something that I can hack together in a couple of hours.

So, instead I wrote the following patch that should fall back to using a root pool
configuration from zpool.cache (if it's present there) for a multi-vdev root pool:
http://people.freebsd.org/~avg/zfs-spa-multi_vdev_root_fallback.diff

The patch also fixes a minor (single-time) memory leak.

Guido, Bartosz,
could you please test the patch?

Apologies for the breakage.

--
Andriy Gapon
_______________________________________________
freebsd...@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-curre...@freebsd.org"

Guido Falsi

Nov 16, 2012, 11:17:05 AM
On 11/16/12 16:45, Andriy Gapon wrote:
> on 13/11/2012 18:16 Guido Falsi said the following:
>> My idea, but is just a speculation, i could be very wrong, is that the geom
>> tasting code has some problem with multiple vdev root pools.
>
> Guido,
>
> you are absolutely correct. The code for reconstructing/tasting a root pool
> configuration is modified upstream code, so it inherited a limitation from it:
> support for only a single top-level vdev in a root pool.
> I have an idea how to add the missing support, but it turned out not to be
> something that I can hack together in a couple of hours.

I can imagine, it does not look simple in any way!

>
> So, instead I wrote the following patch that should fall back to using a root pool
> configuration from zpool.cache (if it's present there) for a multi-vdev root pool:
> http://people.freebsd.org/~avg/zfs-spa-multi_vdev_root_fallback.diff
>
> The patch also fixes a minor (single-time) memory leak.
>
> Guido, Bartosz,
> could you please test the patch?

I have just compiled an r242910 kernel with this patch (and just this
one) applied.

System booted so it seems to work fine! :)

>
> Apologies for the breakage.
>

No worries, and thanks for this fix.

Also thanks for all the work on ZFS!

--
Guido Falsi <m...@madpilot.net>

Guido Falsi

Nov 16, 2012, 11:33:02 AM
On 11/16/12 17:13, Niclas Zeising wrote:
>
> Just to confirm, since I am holding back an update pending on this.
> If I have a raidz root pool, with three disks, like this:
>     NAME           STATE     READ WRITE CKSUM
>     zroot          ONLINE       0     0     0
>       raidz1-0     ONLINE       0     0     0
>         gpt/disk0  ONLINE       0     0     0
>         gpt/disk1  ONLINE       0     0     0
>         gpt/disk2  ONLINE       0     0     0
>
> Then I'm fine to update without issues. The problem only occurs if, for
> example, you have a mirror of striped disks or a stripe of mirrored
> disks, which it seems to me the original poster had.
> Am I correct, and therefore ok to update?

Yes, looks like that. The affected system pool looks like this:

    NAME           STATE     READ WRITE CKSUM
    tank           ONLINE       0     0     0
      mirror-0     ONLINE       0     0     0
        gpt/disk0  ONLINE       0     0     0
        gpt/disk1  ONLINE       0     0     0
      mirror-1     ONLINE       0     0     0
        ada2p2     ONLINE       0     0     0
        gpt/disk3  ONLINE       0     0     0


Other systems I have with simple mirror pools or single disks have shown
no problems.
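
For anyone unsure which case a pool falls into, a rough check is to count its top-level vdevs: one corresponds to the layouts reported as unaffected in this thread, more than one to the multi-vdev case. The helper below is only a sketch and not part of any patch. `count_top_level_vdevs` is a hypothetical name, and it assumes the usual `zpool status` layout (config lines start with a tab, top-level vdevs are indented by exactly two spaces) and a pool without separate log, cache, or spare devices.

```sh
#!/bin/sh
# Count top-level vdevs in `zpool status` output read from stdin.
# Assumptions (not verified against every ZFS version):
#   - lines in the config section start with a tab,
#   - the pool name has no further indentation,
#   - each top-level vdev is indented by exactly two spaces.
# Auxiliary sections (logs, cache, spares) are NOT handled and would be
# miscounted.
count_top_level_vdevs() {
    awk '
        /^[ \t]*config:/ { in_config = 1; next }   # start of the vdev tree
        /^errors:/       { in_config = 0 }         # end of the vdev tree
        in_config && /^\t  [^ ]/ { n++ }           # tab + two spaces = top level
        END { print n + 0 }
    '
}
```

It could be used as `zpool status tank | count_top_level_vdevs`; a result greater than 1 would indicate a multi-vdev pool like the one shown above.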

BTW, I don't know why the system insists on identifying the third disk as
ada2p2; it has a GPT label defined just like the others.

--
Guido Falsi <m...@madpilot.net>

Bartosz Stec

Nov 17, 2012, 7:26:34 PM
On 2012-11-16 17:17, Guido Falsi wrote:
I've just compiled and installed a fresh kernel with your patch; the system
booted without any problems, so apparently the patch works as intended.
Good job Andriy!

>
>>
>> Apologies for the breakage.
>>
>
> No worries, and thanks for this fix.
>
> Also thanks for all the work on ZFS!
>
Make it twice :)

Regards,

--
Bartosz Stec

Andriy Gapon

Nov 18, 2012, 6:48:44 AM
on 18/11/2012 02:26 Bartosz Stec said the following:
> On 2012-11-16 17:17, Guido Falsi wrote:
>> On 11/16/12 16:45, Andriy Gapon wrote:
>>> Guido, Bartosz,
>>> could you please test the patch?
>>
>> I have just compiled an r242910 kernel with this patch (and just this one)
>> applied.
>>
>> System booted so it seems to work fine! :)
> I've just compiled and installed a fresh kernel with your patch; the system
> booted without any problems, so apparently the patch works as intended.

Thank you both very much for testing!
Committed as r243213.

--
Andriy Gapon

Andriy Gapon

Nov 19, 2012, 8:00:13 AM
on 18/11/2012 13:48 Andriy Gapon said the following:
> on 18/11/2012 02:26 Bartosz Stec said the following:
>> On 2012-11-16 17:17, Guido Falsi wrote:
>>> On 11/16/12 16:45, Andriy Gapon wrote:
>>>> Guido, Bartosz,
>>>> could you please test the patch?
>>>
>>> I have just compiled an r242910 kernel with this patch (and just this one)
>>> applied.
>>>
>>> System booted so it seems to work fine! :)
>> I've just compiled and installed a fresh kernel with your patch; the system
>> booted without any problems, so apparently the patch works as intended.
>
> Thank you both very much for testing!
> Committed as r243213.
>

BTW, if you have some spare time and a desire to do some more testing, you can
try the following patch:
http://people.freebsd.org/~avg/zfs-spa-multi_vdev_root_support.diff

It adds support for multi-vdev root pool probing in kernel.
The best way to test is to remove zpool.cache before rebooting (but make sure to
keep a copy somewhere and be able to recover). I'd use a boot environment (a
root filesystem clone) for this.
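
The "keep a copy and be able to recover" step can be sketched as two small shell helpers. This is only an illustration under stated assumptions: the function names `stash_cache` and `restore_cache` and the backup location are hypothetical, and the sketch does not create the boot environment itself.

```sh
#!/bin/sh
# Move a pool cache file aside while keeping a copy for recovery.
# Both paths are arguments; in this thread the cache file is
# /boot/zfs/zpool.cache, and the backup location is up to the tester.
stash_cache() {
    cache=$1
    backup=$2
    [ -f "$cache" ] || { echo "no cache file at $cache" >&2; return 1; }
    # Remove the original only after the copy has succeeded.
    cp -p "$cache" "$backup" && rm "$cache"
}

# Put the saved copy back, e.g. from a rescue shell or another
# boot environment, if the test kernel fails to mount root.
restore_cache() {
    cache=$1
    backup=$2
    cp -p "$backup" "$cache"
}
```

For example, `stash_cache /boot/zfs/zpool.cache /root/zpool.cache.saved` before rebooting, and `restore_cache /boot/zfs/zpool.cache /root/zpool.cache.saved` to undo it.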

Thank you.

Andrei Lavreniyuk

Nov 20, 2012, 3:57:02 AM
Hi!


My system:

# uname -a
FreeBSD open.technica-03.local 10.0-CURRENT FreeBSD 10.0-CURRENT #0:
Tue Oct 30 14:13:01 EET 2012
ro...@open.technica-03.local:/usr/obj/usr/src/sys/SMP64R amd64


# zpool status -v
pool: zsolar
state: ONLINE
scan: resilvered 2,56M in 0h0m with 0 errors on Tue Nov 20 10:26:35 2012
config:

    NAME           STATE     READ WRITE CKSUM
    zsolar         ONLINE       0     0     0
      raidz2-0     ONLINE       0     0     0
        gpt/disk0  ONLINE       0     0     0
        gpt/disk2  ONLINE       0     0     0
        gpt/disk3  ONLINE       0     0     0

errors: No known data errors


Update source:

# svn info
Path: .
Working Copy Root Path: /usr/src
URL: svn://svn.freebsd.org/base/head
Repository Root: svn://svn.freebsd.org/base
Repository UUID: ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
Revision: 243278
Node Kind: directory
Schedule: normal
Last Changed Author: avg
Last Changed Rev: 243272
Last Changed Date: 2012-11-19 13:35:56 +0200


I used http://people.freebsd.org/~avg/zfs-spa-multi_vdev_root_support.diff


buildworld + kernel

rm /boot/zfs/zpool.cache


Reboot....


Mounting from zfs:zsolar failed with error 45





---
Best regards, Andrei Lavreniyuk.

Andriy Gapon

Nov 20, 2012, 4:48:50 AM
on 20/11/2012 10:57 Andrei Lavreniyuk said the following:
Are there any other unusual messages before this line?
Could you please try adding vfs.zfs.debug=1 to loader.conf and check again?

Could you also provide 'zdb -CC zsolar' output and 'zdb -l /dev/gpt/diskX' for
each of the disks. These could be uploaded somewhere as they can be quite lengthy.
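
Since the output can be lengthy, one way to gather it into files for upload is sketched below. Only the `zdb -CC` and `zdb -l` invocations come from the request above; the `collect_zdb` function name, the `ZDB` override variable, and the output file names are assumptions made for illustration.

```sh
#!/bin/sh
# Collect the requested zdb output into one directory.
# ZDB may be overridden (e.g. ZDB=echo) for a dry run; it defaults to zdb.
ZDB=${ZDB:-zdb}

# usage: collect_zdb <pool> <output-dir> <device>...
collect_zdb() {
    pool=$1
    outdir=$2
    shift 2
    mkdir -p "$outdir" || return 1
    # Pool configuration as seen by zdb.
    "$ZDB" -CC "$pool" > "$outdir/zdb-CC-$pool.txt" 2>&1
    # Vdev labels of each listed device, one file per device.
    for dev in "$@"; do
        "$ZDB" -l "$dev" > "$outdir/zdb-l-$(basename "$dev").txt" 2>&1
    done
}
```

For the pool in this report that might be `collect_zdb zsolar /tmp/zdb-out /dev/gpt/disk0 /dev/gpt/disk2 /dev/gpt/disk3`.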

--
Andriy Gapon

Andriy Gapon

Nov 20, 2012, 7:03:51 AM
on 20/11/2012 12:45 Andrei Lavreniyuk said the following:
> Hi!
>
>
>> Are there any other unusual messages before this line?
>> Could you please try adding vfs.zfs.debug=1 to loader.conf and check again?
>
>> Could you also provide 'zdb -CC zsolar' output and 'zdb -l /dev/gpt/diskX' for
>> each of the disks. These could be uploaded somewhere as they can be quite lengthy.
>
>
> Please view attached files.

Thank you.
"Can not parse the config for pool" message explains what happens but not why...

Could you please apply the following patch, "un-ifdef" the DEBUG sections of it
and try again?
http://people.freebsd.org/~avg/spa_generate_rootconf.debug.diff

Andrei Lavreniyuk

Nov 20, 2012, 7:35:00 AM

Andriy Gapon

Nov 20, 2012, 8:08:22 AM
on 20/11/2012 14:41 Andrei Lavreniyuk said the following:
> Hi!
>
>
>> "Can not parse the config for pool" message explains what happens but not why...
>>
>> Could you please apply the following patch, "un-ifdef" the DEBUG sections of it
>> and try again?
>> http://people.freebsd.org/~avg/spa_generate_rootconf.debug.diff
>
>
> I use spa_generate_rootconf.debug.diff.

What about the "un-ifdef the DEBUG sections of it" part?

> make kernel && reboot
>
> No new debug messages. Pool cannot mount.

Andrei Lavreniyuk

Nov 20, 2012, 8:34:19 AM

Andrei Lavreniyuk

Nov 21, 2012, 2:51:23 AM
2012/11/20 Andriy Gapon <a...@freebsd.org>:
> on 20/11/2012 17:06 Andriy Gapon said the following:
>> on 20/11/2012 16:59 Andrei Lavreniyuk said the following:
>>>>> Sorry to make you jump through so many hoops.
>>>>> Now that I see that the probed config is entirely correct, the problem appears to
>>>>> be quite obvious: vdev_alloc is not able to properly use spa_version in this
>>>>> context because spa_ubsync is not initialized yet.
>>>>>
>>>>> Let me think about how to fix this.
>>>>
>>>> I hope that the following simple patch should fix the problem:
>>>> http://people.freebsd.org/~avg/spa_import_rootpool.version.diff
>>>
>>>
>>> At mount the system traps and reboots.
>>>
>>
>> Unexpected. Can you catch the backtrace of the panic?
>> If you have it on the screen.
>>
>>
>
> Ah, found another bogosity in the code:
> --- a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c
> +++ b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c
> @@ -3925,8 +4117,6 @@ spa_import_rootpool(const char *name)
>  		return (error);
>  	}
>
> -	spa_history_log_version(spa, LOG_POOL_IMPORT);
> -
>  	spa_config_enter(spa, SCL_ALL, FTAG, RW_WRITER);
>  	vdev_free(rvd);
>  	spa_config_exit(spa, SCL_ALL, FTAG);
>
>
> This previously "worked" only because the pool version was zero and thus the
> action was a NOP anyway.
>


Problem solved. The raidz pool now mounts without zpool.cache.


# zpool status -v
pool: zsolar
state: ONLINE
scan: resilvered 2,56M in 0h0m with 0 errors on Tue Nov 20 10:26:35 2012
config:

    NAME           STATE     READ WRITE CKSUM
    zsolar         ONLINE       0     0     0
      raidz2-0     ONLINE       0     0     0
        gpt/disk0  ONLINE       0     0     0
        gpt/disk2  ONLINE       0     0     0
        gpt/disk3  ONLINE       0     0     0

errors: No known data errors

# uname -a
FreeBSD opensolaris.technica-03.local 10.0-CURRENT FreeBSD
10.0-CURRENT #6 r243278M: Wed Nov 21 09:28:51 EET 2012
ro...@opensolaris.technica-03.local:/usr/obj/usr/src/sys/SMP64R amd64


Thanks!

Andriy Gapon

Nov 21, 2012, 11:49:20 AM
on 21/11/2012 09:51 Andrei Lavreniyuk said the following:
> Problem solved. The raidz pool now mounts without zpool.cache.
>
>
> # zpool status -v
> pool: zsolar
> state: ONLINE
> scan: resilvered 2,56M in 0h0m with 0 errors on Tue Nov 20 10:26:35 2012
> config:
>
>     NAME           STATE     READ WRITE CKSUM
>     zsolar         ONLINE       0     0     0
>       raidz2-0     ONLINE       0     0     0
>         gpt/disk0  ONLINE       0     0     0
>         gpt/disk2  ONLINE       0     0     0
>         gpt/disk3  ONLINE       0     0     0
>
> errors: No known data errors
>
> # uname -a
> FreeBSD opensolaris.technica-03.local 10.0-CURRENT FreeBSD
> 10.0-CURRENT #6 r243278M: Wed Nov 21 09:28:51 EET 2012
> ro...@opensolaris.technica-03.local:/usr/obj/usr/src/sys/SMP64R amd64
>
>
> Thanks!

Thank you for testing!

--
Andriy Gapon