Re: ZFS l2arc broken in 10.3

Pete French

Oct 13, 2016, 6:00:32 AM
to freebsd...@freebsd.org

OK, that's a bit worrying if true - but I can confirm that l2arc works fine
under 10.3 on amd64, so what you say about cross-compiling might be true.
I am taking an interest in this as I have just deployed a lot of machines
which are going to rely on l2arc working to get reasonable performance.

-pete.

On 10/12/16 21:18, Peter wrote:
> Details:
> After upgrading 2 machines from 9.3 to 10.3-STABLE, on one of them the
> l2arc stays empty (capacity alloc = 0), although it is online and gets
> accessed. It did work well on 9.3.
>
> I did the following tests:
> * Create a zpool on a stick, with two volumes: one filesystem and one
> cache. The cache stays with alloc=0.
> Export it and move it into the other machine. The cache immediately
> fills.
> Move it back, the cache stays with alloc=0.
> -> this rules out all zpool/zfs get/set options, as they should
> walk with the pool.
> * Boot the GENERIC kernel. l2arc stays with alloc=0.
> -> this rules out all my nonstandard kernel options.
> * Boot in single user mode. l2arc stays with alloc=0.
> -> this rules out all /etc/* config files.
> * Delete the zpool.cache and reimport pools. l2arc stays with alloc=0.
> * Copy the /boot/loader.conf settings to the other machine. The l2arc
> still works there.
>
> I could not think of any remaining place where this could come from,
> except the kernel code itself.
> From there, I found these counters nicely incrementing each second:
> kstat.zfs.misc.arcstats.l2_write_buffer_list_iter: 50758
> kstat.zfs.misc.arcstats.l2_write_buffer_list_null_iter: 27121
> kstat.zfs.misc.arcstats.l2_write_buffer_bytes_scanned: 40589375488
> But also this counter incrementing:
> kstat.zfs.misc.arcstats.l2_write_full: 14604
>
> Then with some printf in the code I saw these values provided:
>     buf_sz = hdr->b_size;
>     align = (size_t)1 << dev->l2ad_vdev->vdev_ashift;
>     buf_a_sz = P2ROUNDUP(buf_sz, align);
>     if ((write_asize + buf_a_sz) > target_sz) {
>         full = B_TRUE;
>         mutex_exit(hash_lock);
>         ARCSTAT_BUMP(arcstat_l2_write_full);
>         break;
>     }
>
> buf_sz = 1536
> align = 512
> buf_a_sz = 18446744069414585856
> write_asize = 0
> target_sz = 16777216
>
> where buf_a_sz is obviously off by (2^64 - 2^32).
>
> Maybe this is an effect of crosscompiling i386 on amd64. But anyway, as
> long as i386 is still supported, it should not happen.
>
>
> Now, my real concern is: if this really obvious ... made it undetected
> until 10.3, how many other missing typecasts are still in the code??
>
> _______________________________________________
> freebsd...@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stabl...@freebsd.org"

Andriy Gapon

Oct 13, 2016, 7:24:50 AM
to Peter, freebsd...@freebsd.org
Yes, the problem is specific to 32-bit platforms where size_t is 32-bit.
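
For anyone who wants to reproduce the arithmetic without an i386 box, here is a
minimal userland sketch. It assumes P2ROUNDUP expands to the negate-and-mask
form shown below (that is how I recall the illumos-derived sysmacros.h defining
it; treat it as an assumption), and it uses uint32_t as a stand-in for a 32-bit
size_t so the result is the same on amd64. The function names are made up for
illustration and do not exist in arc.c. The second function shows one possible
way to avoid the problem by widening align, which is not necessarily the change
that gets committed.

```c
#include <stdint.h>

/* Assumed definition, as recalled from the illumos-derived sysmacros.h. */
#define P2ROUNDUP(x, align) (-(-(x) & -(align)))

/*
 * Hypothetical helper: mimics the i386 case, where align is only 32 bits
 * wide. -(align) is computed in 32 bits and then zero-extended to 64 bits,
 * so the mask loses its upper 32 one-bits before it is ANDed with -(x).
 */
uint64_t
roundup_narrow_align(uint64_t buf_sz, unsigned ashift)
{
	uint32_t align = (uint32_t)1 << ashift;	/* stand-in for 32-bit size_t */

	return (P2ROUNDUP(buf_sz, align));
}

/*
 * Hypothetical helper: same computation with align widened to 64 bits,
 * so the mask keeps all of its high bits and the round-up is correct.
 */
uint64_t
roundup_wide_align(uint64_t buf_sz, unsigned ashift)
{
	uint64_t align = (uint64_t)1 << ashift;

	return (P2ROUNDUP(buf_sz, align));
}
```

With buf_sz = 1536 and ashift = 9 (align = 512), the narrow variant returns
18446744069414585856, the same value as in the printf output above, while the
wide variant returns 1536. Since write_asize + buf_a_sz then always exceeds
target_sz, the loop breaks out on the first buffer and only bumps l2_write_full.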

> But anyway, as long as
> i386 is still supported, it should not happen.

Certainly.

> Now, my real concern is: if this really obvious ... made it undetected until
> 10.3, how many other missing typecasts are still in the code??

No need to be dramatic here. That particular piece of code is very new.
I committed it to head in April (r297848), MFC-ed even later.
Apparently no one else who uses 32-bit systems and has L2ARC configured had a
chance to run into the bug.

Thank you very much for discovering and analyzing the bug and providing a fix
for it!


--
Andriy Gapon