
[BUG] lib: zram lz4 compression/decompression still broken on big endian


Rui Salvaterra

Apr 5, 2016, 10:10:18 AM
to linux-...@vger.kernel.org, eunb...@samsung.com, gre...@linuxfoundation.org, min...@kernel.org, linu...@kvack.org
Hi,


I apologise in advance if I've cc'ed too many/the wrong people/lists.

Whenever I try to use zram with lz4, on my Power Mac G5 (tested with
kernel 4.4.0-16-powerpc64-smp from Ubuntu 16.04 LTS), I get the
following on my dmesg:

[13150.675820] zram: Added device: zram0
[13150.704133] zram0: detected capacity change from 0 to 5131976704
[13150.715960] zram: Decompression failed! err=-1, page=0
[13150.716008] zram: Decompression failed! err=-1, page=0
[13150.716027] zram: Decompression failed! err=-1, page=0
[13150.716032] Buffer I/O error on dev zram0, logical block 0, async page read

I believe Eunbong Song wrote a patch [1] to fix this (or a very
similar) bug on MIPS, but it never got merged (maybe it was
incorrect or incomplete?). Is there any hope of seeing this bug fixed?


Thanks,

Rui Salvaterra


[1] http://comments.gmane.org/gmane.linux.kernel/1752745

Greg KH

Apr 5, 2016, 11:34:49 AM
to Rui Salvaterra, linux-...@vger.kernel.org, eunb...@samsung.com, min...@kernel.org, linu...@kvack.org
For some reason it never got merged, sorry, I don't remember why.

Have you tested this patch? If so, can you resend it with your
tested-by: line added to it?

thanks,

greg k-h

Rui Salvaterra

Apr 5, 2016, 12:02:29 PM
to Greg KH, linux-...@vger.kernel.org, eunb...@samsung.com, min...@kernel.org, linu...@kvack.org
Hi, Greg


No, I haven't tested the patch at all. I want to do so, and fix it if
necessary, but I still need to learn how to (meaning, I need to watch
your "first kernel patch" presentation again). I'd love to get
involved in kernel development, and this seems to be a good
opportunity, if none of the kernel gods beat me to it (I may need a
month, but then again nobody complained about this bug in almost two
years).


Thanks,

Rui

Sergey Senozhatsky

Apr 6, 2016, 1:34:17 AM
to Rui Salvaterra, Greg KH, linux-...@vger.kernel.org, eunb...@samsung.com, min...@kernel.org, linu...@kvack.org, Sergey Senozhatsky, Sergey Senozhatsky
On (04/05/16 17:02), Rui Salvaterra wrote:
[..]
> > For some reason it never got merged, sorry, I don't remember why.
> >
> > Have you tested this patch? If so, can you resend it with your
> > tested-by: line added to it?
> >
> > thanks,
> >
> > greg k-h
>
> Hi, Greg
>
>
> No, I haven't tested the patch at all. I want to do so, and fix it if
> necessary, but I still need to learn how to (meaning, I need to watch
> your "first kernel patch" presentation again). I'd love to get
> involved in kernel development, and this seems to be a good
> opportunity, if none of the kernel gods beat me to it (I may need a
> month, but then again nobody complained about this bug in almost two
> years).

Hello Rui,

may we please ask you to test the patch first? quite possible there
is nothing to fix there; I've no access to mips h/w but the patch
seems correct to me.

LZ4_READ_LITTLEENDIAN_16 does get_unaligned_le16(), so
LZ4_WRITE_LITTLEENDIAN_16 must do put_unaligned_le16() /* not put_unaligned() */

-ss

Rui Salvaterra

Apr 6, 2016, 5:40:04 AM
to Sergey Senozhatsky, Greg KH, linux-...@vger.kernel.org, eunb...@samsung.com, min...@kernel.org, linu...@kvack.org, Sergey Senozhatsky
2016-04-06 6:33 GMT+01:00 Sergey Senozhatsky
<sergey.seno...@gmail.com>:
Hi, Sergey


Besides ppc64, I have ppc32, x86 and x86_64 hardware readily
available. The only mips (74kc, also big endian) hardware I have
access to is my router, running OpenWrt. I can try to test it there
too, but it will be more complicated. Still, after reading the
existing code [1] more thoroughly, I can't see how Eunbong Song's
patch [2] would fix the ppc case (please correct me if I'm wrong,
which is highly likely, since my C preprocessor knowledge ranges
from nonexistent to very superficial).

Now, LZ4_READ_LITTLEENDIAN_16 is unconditionally defined as:

#define LZ4_READ_LITTLEENDIAN_16(d, s, p) \
	(d = s - get_unaligned_le16(p))

As far as I can tell, and unlike ppc, mips doesn't define
HAVE_EFFICIENT_UNALIGNED_ACCESS, which means that, for the mips case,
LZ4_WRITE_LITTLEENDIAN_16 will be defined as:

#define LZ4_WRITE_LITTLEENDIAN_16(p, v) \
	do { \
		put_unaligned(v, (u16 *)(p)); \
		p += 2; \
	} while (0)

Whereas for ppc, which defines HAVE_EFFICIENT_UNALIGNED_ACCESS,
LZ4_WRITE_LITTLEENDIAN_16 will be defined as:

#define LZ4_WRITE_LITTLEENDIAN_16(p, v) \
	do { \
		A16(p) = v; \
		p += 2; \
	} while (0)

Consequently, while I believe the patch will fix the mips case, I'm
not so sure about ppc (or any other big endian architecture with
efficient unaligned accesses).
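
To illustrate the concern, here is a rough user-space sketch of a direct 16-bit store that stays little endian regardless of host byte order. It is only one possible approach under stated assumptions, not necessarily what any eventual kernel fix does. All sketch_* names are hypothetical: sketch_cpu_to_le16() imitates the kernel's cpu_to_le16(), and sketch_store_native16() imitates the effect of A16(p) = v (using memcpy() to stay within defined behaviour in user space).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Detects host byte order at run time (hypothetical helper). */
static int sketch_host_is_le(void)
{
	const uint16_t probe = 1;
	uint8_t first;

	memcpy(&first, &probe, 1);
	return first == 1;
}

/* Hypothetical stand-in for cpu_to_le16(): swaps only on big endian hosts. */
static uint16_t sketch_cpu_to_le16(uint16_t v)
{
	return sketch_host_is_le() ? v : (uint16_t)((v >> 8) | (v << 8));
}

/* Native-endian store, comparable in effect to A16(p) = v. */
static void sketch_store_native16(uint8_t *p, uint16_t v)
{
	memcpy(p, &v, sizeof(v));
}

/* Direct store that converts to little endian before writing. */
static void sketch_store_le16_fast(uint8_t *p, uint16_t v)
{
	sketch_store_native16(p, sketch_cpu_to_le16(v));
}
```

With this shape, the bytes in memory are the same on little and big endian hosts, so a reader based on get_unaligned_le16() would see the value that was written. A plain sketch_store_native16() call has that property only on little endian hosts.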


Thanks,

Rui

[1] https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/tree/lib/lz4/lz4defs.h?h=v4.4.6
[2] http://permalink.gmane.org/gmane.linux.kernel/1752745

Sergey Senozhatsky

Apr 6, 2016, 8:11:36 AM
to Rui Salvaterra, Sergey Senozhatsky, Greg KH, linux-...@vger.kernel.org, eunb...@samsung.com, min...@kernel.org, linu...@kvack.org, Sergey Senozhatsky, Chanho Min, Kyungsik Lee
Cc Chanho Min, Kyungsik Lee


Hello,

On (04/06/16 10:39), Rui Salvaterra wrote:
> > may we please ask you to test the patch first? quite possible there
> > is nothing to fix there; I've no access to mips h/w but the patch
> > seems correct to me.
> >
> > LZ4_READ_LITTLEENDIAN_16 does get_unaligned_le16(), so
> > LZ4_WRITE_LITTLEENDIAN_16 must do put_unaligned_le16() /* not put_unaligned() */
> >
[..]
> Consequently, while I believe the patch will fix the mips case, I'm
> not so sure about ppc (or any other big endian architecture with
> efficient unaligned accesses).

frankly, yes, I took a quick look today (after I sent my initial
message, tho) ... and it is fishy, I agree. was going to follow up
on my email but somehow got interrupted, sorry.

so we have, write:
((U16_S *)(p))->v = v OR put_unaligned(v, (u16 *)(p))

and only one read:
get_unaligned_le16(p)

I guess either the read part must also depend on
HAVE_EFFICIENT_UNALIGNED_ACCESS, or the write path
should stop doing so.

I ended up with two patches, NONE was tested (!!!). like at all.

1) provide CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS-dependent
LZ4_READ_LITTLEENDIAN_16

2) provide common LZ4_WRITE_LITTLEENDIAN_16 and LZ4_READ_LITTLEENDIAN_16
regardless of CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS.


assuming that a common LZ4_WRITE_LITTLEENDIAN_16 will somewhat hurt
performance, I'd probably prefer option #1.

the patch is below. would be great if you can help testing it.

---

lib/lz4/lz4defs.h | 22 +++++++++++++---------
1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/lib/lz4/lz4defs.h b/lib/lz4/lz4defs.h
index abcecdc..a23e6c2 100644
--- a/lib/lz4/lz4defs.h
+++ b/lib/lz4/lz4defs.h
@@ -36,10 +36,14 @@ typedef struct _U64_S { u64 v; } U64_S;
#define PUT4(s, d) (A32(d) = A32(s))
#define PUT8(s, d) (A64(d) = A64(s))
#define LZ4_WRITE_LITTLEENDIAN_16(p, v) \
- do { \
- A16(p) = v; \
- p += 2; \
+ do { \
+ A16(p) = v; \
+ p += 2; \
} while (0)
+
+#define LZ4_READ_LITTLEENDIAN_16(d, s, p) \
+ (d = s - A16(p))
+
#else /* CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS */

#define A64(x) get_unaligned((u64 *)&(((U16_S *)(x))->v))
@@ -52,10 +56,13 @@ typedef struct _U64_S { u64 v; } U64_S;
put_unaligned(get_unaligned((const u64 *) s), (u64 *) d)

#define LZ4_WRITE_LITTLEENDIAN_16(p, v) \
- do { \
- put_unaligned(v, (u16 *)(p)); \
- p += 2; \
+ do { \
+ put_unaligned_le16(v, (u16 *)(p)); \
+ p += 2; \
} while (0)
+
+#define LZ4_READ_LITTLEENDIAN_16(d, s, p) \
+ (d = s - get_unaligned_le16(p))
#endif

#define COPYLENGTH 8
@@ -140,9 +147,6 @@ typedef struct _U64_S { u64 v; } U64_S;

#endif

-#define LZ4_READ_LITTLEENDIAN_16(d, s, p) \
- (d = s - get_unaligned_le16(p))
-
#define LZ4_WILDCOPY(s, d, e) \
do { \
LZ4_COPYPACKET(s, d); \

Rui Salvaterra

Apr 7, 2016, 8:33:48 AM
to Sergey Senozhatsky, Sergey Senozhatsky, Greg KH, linux-...@vger.kernel.org, eunb...@samsung.com, min...@kernel.org, linu...@kvack.org, Chanho Min, Kyungsik Lee
Hi again, Sergey


Thanks for the patch, I'll test it as soon as possible. I agree with
your second option, usually one selects lz4 when (especially
decompression) speed is paramount, so it needs all the help it can
get.

Speaking of fishy, the 64-bit detection code also looks suspiciously
bogus. Some of the identifiers don't even exist anywhere in the kernel
(__ppc64__, for example, after grepping all .c and .h files).
Shouldn't we instead check for CONFIG_64BIT or BITS_PER_LONG == 64?


Thanks,

Rui

Sergey Senozhatsky

Apr 7, 2016, 9:09:25 AM
to Rui Salvaterra, Sergey Senozhatsky, Sergey Senozhatsky, Greg KH, linux-...@vger.kernel.org, eunb...@samsung.com, min...@kernel.org, linu...@kvack.org, Chanho Min, Kyungsik Lee
On (04/07/16 13:33), Rui Salvaterra wrote:
[..]
> Hi again, Sergey

Hello,

> Thanks for the patch, I'll test it as soon as possible. I agree with
> your second option, usually one selects lz4 when (especially
> decompression) speed is paramount, so it needs all the help it can
> get.

thanks!

> Speaking of fishy, the 64-bit detection code also looks suspiciously
> bogus. Some of the identifiers don't even exist anywhere in the kernel
> (__ppc64__, for example, after grepping all .c and .h files).
> Shouldn't we instead check for CONFIG_64BIT or BITS_PER_LONG == 64?

definitely a good question. personally, I'd prefer to test for
CONFIG_64BIT only, looking at this hairy

/* Detects 64 bits mode */
#if (defined(__x86_64__) || defined(__x86_64) || defined(__amd64__) \
|| defined(__ppc64__) || defined(__LP64__))

and remove/rewrite a bunch of other stuff. but the thing with cleanups
is that they don't fix anything, while they can potentially introduce bugs.
it's more risky to touch the stable code. /* well, removing those 'ghost'
identifiers is sort of OK to me */. but that's just my opinion, I'll
leave it to you and Greg.
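
For illustration only, a user-space approximation of a single word-size test replacing the list of compiler-specific identifiers. In the kernel itself the check would simply be #ifdef CONFIG_64BIT (or BITS_PER_LONG == 64), as suggested above. The SKETCH_* macros and sketch_is_64bit() are hypothetical names, and the sketch assumes long has the machine word size, as it does in the kernel.

```c
#include <assert.h>
#include <limits.h>

/* User-space approximation of the kernel's BITS_PER_LONG. */
#define SKETCH_BITS_PER_LONG ((int)(sizeof(long) * CHAR_BIT))

/* One preprocessor test instead of per-architecture identifiers. */
#if ULONG_MAX > 0xffffffffUL
#define SKETCH_ARCH64 1
#else
#define SKETCH_ARCH64 0
#endif

/* Returns 1 when built for a target whose long is wider than 32 bits. */
static int sketch_is_64bit(void)
{
	return SKETCH_ARCH64;
}
```

The point of the design is that the decision comes from one property of the target (the width of long) rather than from whichever architecture macros a given compiler happens to define.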

-ss

Rui Salvaterra

Apr 8, 2016, 10:53:39 AM
to Sergey Senozhatsky, Sergey Senozhatsky, Greg KH, linux-...@vger.kernel.org, eunb...@samsung.com, min...@kernel.org, linu...@kvack.org, Chanho Min, Kyungsik Lee
Hi again, Sergey

I finally was able to test your patch but, as I suspected, it wasn't
enough. However, based on it, I was able to write a (hopefully)
correct one, which I'll send soon (tested on ppc64, with no
regressions on x86_64).

Thanks,

Rui