[RFC] CryptoAPI & Compression

Artem B. Bityuckiy

unread,

Mar 25, 2005, 11:09:42 AM3/25/05

to her...@gondor.apana.org.au, dw...@infradead.org, linux-...@vger.kernel.org

Hello Herbert and others,

I'm working on cleaning-up the JFFS3 compression stuff. JFFS3 contains a
number of compressors which actually shouldn't be there as they are
platform-independent and generic. So we want to move them to the generic
part of the Linux kernel instead of storing them in fs/jffs2/. And we
were going to use CryptoAPI to access the compressors.

But I've hit on a problem - CryptoAPI's compression method isn't
flexible enough for us.

CryptoAPI's compress method is:

int crypto_compress(struct crypto_tfm *tfm,
const u8 *src, unsigned int slen,
u8 *dst, unsigned int *dlen);

*src - input data;
slen - input data length;
*dst - output data;
*dlen - on input - output buffer length, on output - the length of the
compressed data;

The crypto_compress() API call either compresses all the input data or
returns error.

In JFFS2 we need something more flexible. Te following is what we want:

int crypto_compress_ext(struct crypto_tfm *tfm,
const u8 *src, unsigned int *slen,
u8 *dst, unsigned int *dlen);

*src - input data;
*slen - on input - input data length, on output - the amount of data
that were actually compressed.
*dst - output data;
*dlen - on input - output buffer length, on output - the length of the
compressed data;

This would allow us (and we really need this) to provide a large input
data buffer, a small output data buffer and to ask the compressor to
compress as much input data as it can to fit (in the compressed form) to
the output buffer. To put it differently, we often have a large input,
and several smalloutput buffers, and we want to compress the input data
to them.

I offer to extend the CryptoAPI and add an "extended compress" API call
with the above mentioned capabilities. We might as well just change the
crypto_compress() and all its users.

Alternatively, we may create some kind of "Compression API" but I don't
like this idea...

Comments?

--
Best Regards,
Artem B. Bityuckiy,
St.-Petersburg, Russia.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majo...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Herbert Xu

unread,

Mar 25, 2005, 11:48:36 PM3/25/05

to Artem B. Bityuckiy, dw...@infradead.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org

Hi Artem:

On Fri, Mar 25, 2005 at 04:08:20PM +0000, Artem B. Bityuckiy wrote:
>
> I'm working on cleaning-up the JFFS3 compression stuff. JFFS3 contains a
> number of compressors which actually shouldn't be there as they are
> platform-independent and generic. So we want to move them to the generic
> part of the Linux kernel instead of storing them in fs/jffs2/. And we
> were going to use CryptoAPI to access the compressors.

Sounds good.

> In JFFS2 we need something more flexible. Te following is what we want:
>
> int crypto_compress_ext(struct crypto_tfm *tfm,
> const u8 *src, unsigned int *slen,
> u8 *dst, unsigned int *dlen);

Compressing part of the input could be significantly different from
compressing all of the input depending on the algorithm. In particular
it could be most costly to do so and/or result in worse compression.

So while I'm happy to add such an interface I think we should keep
crypto_comp_compress as well for full compression.

I've whipped up something quick and called it crypto_comp_pcompress.
How does this interface look to you?

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <her...@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

pcompress-1

Artem B. Bityuckiy

unread,

Mar 26, 2005, 6:34:14 AM3/26/05

to Herbert Xu, dw...@infradead.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org

On Sat, 2005-03-26 at 15:44 +1100, Herbert Xu wrote:
> I've whipped up something quick and called it crypto_comp_pcompress.
> How does this interface look to you?

Thanks for the patch. At the first glance it looks OK. I'll try to use
it and add the deflate method which in fact is already implemented in
JFFS2. I'll send you my results.

Artem B. Bityuckiy

unread,

Mar 28, 2005, 12:25:00 PM3/28/05

to Herbert Xu, dw...@infradead.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, jmo...@redhat.com

On Sat, 2005-03-26 at 15:44 +1100, Herbert Xu wrote:

> I've whipped up something quick and called it crypto_comp_pcompress.
> How does this interface look to you?

Hello Herbert,

I've done some work. Here are 2 patches:
1. pcompress-deflate-1.diff
2. uncompress-1.diff
(should be applied in that order). And I also imply your patch is
applied as well.

The first patch is the implementation of the deflate_pcompress()
function. It was ported from JFFS2 with some changes.
The second patch is my implementation of the deflate_decompr() function
and I'd like to hear some comments on this.

I made the changes to deflate_decompr() because the old version doesn't
work properly for me. There are 2 changes.

1. I've added the following code:

------------------------------------------------------------------------
if (slen > 2 && !(src[1] & PRESET_DICT) /* No preset dictionary */
&& ((src[0] & 0x0f) == Z_DEFLATED) /* Comp. method byte is OK */
&& !(((src[0] << 8) + src[1]) % 31)) { /* CMF*256 + FLG */
stream->next_in += 2;
stream->avail_in -= 2;
}
------------------------------------------------------------------------

This peace of code checks if there are 2 header bytes (see RFC-1950)
present in the stream to be inflated. If they are present, we should
skip them because we use a negative windowBits parameter when we
initialize inflate. When this "undocumented" feature is used, zlib
ignores the stream header and doesn't check the adler32 checksum (I've
found this roughly exploring the inflate source code). If we don't skip
the first 2 bytes, inflate will treat them as the part of the input data
and will work incorrectly.

I don't know how deflate_decompr() worked before, it shouldn't have
worked AFAICS. So, I'm in doubt and would like to see your or James'
comments.

2. I've removed the "strange" (for me) uncompress sequence:

------------------------------------------------------------------------
ret = zlib_inflate(stream, Z_SYNC_FLUSH);
/*
* Work around a bug in zlib, which sometimes wants to taste an extra
* byte when being used in the (undocumented) raw deflate mode.
* (From USAGI).
*/
if (ret == Z_OK && !stream->avail_in && stream->avail_out) {
u8 zerostuff = 0;
stream->next_in = &zerostuff;
stream->avail_in = 1;
ret = zlib_inflate(stream, Z_FINISH);
}
------------------------------------------------------------------------

and have changed it to the following:

------------------------------------------------------------------------
ret = zlib_inflate(stream, Z_FINISH);
if (ret != Z_STREAM_END)
return -EINVAL;
------------------------------------------------------------------------

I made this because the old sequence didn't uncompress data compressed
by my deflate_pcompress(). Frankly, I just don't understand the meaning
of that replaced part and changed it to what seems correct and obvious
for me. My argument is the documentation of zlib_inflate() from
include/linux/zlib.h:

------------------------------------------------------------------------
inflate() should normally be called until it returns Z_STREAM_END or an
error. However if all decompression is to be performed in a single step
(a single call of inflate), the parameter flush should be set to
Z_FINISH. In this case all pending input is processed and all pending
output is flushed; avail_out must be large enough to hold all the
uncompressed data.
------------------------------------------------------------------------

I don't know anything about the declared bug "(From USAGI)".
So I'd like to hear some comment on this as well.

Also I've simplified error handling a bit.

The stuff I'm sending works for me. I also run the tcrypt.c test module,
and the deflate/inflate tests passed.

pcompress-deflate-1.diff

uncompress-1.diff

Herbert Xu

unread,

Mar 29, 2005, 5:43:52 AM3/29/05

to Artem B. Bityuckiy, dw...@infradead.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, jmo...@redhat.com

Hi Artem:

On Mon, Mar 28, 2005 at 05:22:36PM +0000, Artem B. Bityuckiy wrote:
>
> The first patch is the implementation of the deflate_pcompress()

Thanks for the patch. I'll comment on the second patch later.

Are you sure that 12 bytes is enough for all cases? It would seem
to be safer to use the formula in deflateBound/compressBound from
later versions (> 1.2) of zlib to calculate the reserve.

> + while (stream->total_in < *slen
> + && stream->total_out < *dlen - DEFLATE_PCOMPR_RESERVE) {

We normally put the operator on the preceding line, i.e.,

while (foo &&
bar) {

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <her...@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Herbert Xu

unread,

Mar 31, 2005, 5:01:17 AM3/31/05

to Artem B. Bityuckiy, dw...@infradead.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, jmo...@redhat.com, sven...@post5.tele.dk, YOSHIFUJI Hideaki

On Mon, Mar 28, 2005 at 05:22:36PM +0000, Artem B. Bityuckiy wrote:
>

> I made the changes to deflate_decompr() because the old version doesn=

't
> work properly for me. There are 2 changes.
>
> 1. I've added the following code:
>

> ---------------------------------------------------------------------=

---
> if (slen > 2 && !(src[1] & PRESET_DICT) /* No preset dictionary */

> && ((src[0] & 0x0f) == Z_DEFLATED) /* Comp. method byte is O=

K */
> && !(((src[0] << 8) + src[1]) % 31)) { /* CMF*256 + FLG */
> stream->next_in += 2;
> stream->avail_in -= 2;
> }

> ---------------------------------------------------------------------=
---

The reason you need to add this is because the window bits that
was used to produce the compressed data is positive while the window
bits crypto/deflate is using to perform the decompression isn't.

So what we should do here is turn window bits into a configurable
parameter.

Once you supply the correct window bits information, the above is
then simply an optimisation.

Rather than keeping the above optimisation, JFFS should simply do
the compression with a negative window bits value.

Of course to maintain backwards compatibility you'll need to do this
as a new compression type.

> 2. I've removed the "strange" (for me) uncompress sequence:
>

> ---------------------------------------------------------------------=

---
> ret = zlib_inflate(stream, Z_SYNC_FLUSH);
> /*
> * Work around a bug in zlib, which sometimes wants to taste an extra
> * byte when being used in the (undocumented) raw deflate mode.
> * (From USAGI).
> */

I believe this bit of code originally came from FreeS/WAN and was
written by Svenning Sørensen. Maybe he or Yoshifuji-san can tell
us why?

Unless we're sure that zlib has been fixed we should leave it in.
It should be a no-op if zlib has been fixed. So this probably
isn't causing the breakage that you saw.

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <her...@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
-

To unsubscribe from this list: send the line "unsubscribe linux-kernel"=

Artem B. Bityuckiy

unread,

Apr 3, 2005, 7:48:39 AM4/3/05

to Herbert Xu, Artem B. Bityuckiy, dw...@infradead.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, jmo...@redhat.com, sven...@post5.tele.dk, YOSHIFUJI Hideaki

Herbert,

I also wonder, does it at all correct to use negative windowBits in
crypto API? I mean, if windowBits is negative, zlib doesn't produce the
proper zstream header, which is incorrect according to RFC-1950. It also
doesn't calculate adler32.

For example, if we work over an IP network (RFC-2384), the receiving
side may be confused by such a "stripped" zstream.

Isn't it conceptually right to produce *correct* zstream, with the
header and the proper adler32 ?

Yes, for JFFS2 we would like to have no adler32, we anyway protect our
data by CRC32. But I suppose this should be an additional feature.

Comments?

Herbert Xu wrote:
>>I made the changes to deflate_decompr() because the old version doesn't

>>work properly for me. There are 2 changes.
>>
>>1. I've added the following code:
>>

>>------------------------------------------------------------------------

>>if (slen > 2 && !(src[1] & PRESET_DICT) /* No preset dictionary */

>> && ((src[0] & 0x0f) == Z_DEFLATED) /* Comp. method byte is OK */

>> && !(((src[0] << 8) + src[1]) % 31)) { /* CMF*256 + FLG */
>> stream->next_in += 2;
>> stream->avail_in -= 2;
>>}

>>------------------------------------------------------------------------

>
> The reason you need to add this is because the window bits that
> was used to produce the compressed data is positive while the window
> bits crypto/deflate is using to perform the decompression isn't.
>
> So what we should do here is turn window bits into a configurable
> parameter.
>
> Once you supply the correct window bits information, the above is
> then simply an optimisation.
>
> Rather than keeping the above optimisation, JFFS should simply do
> the compression with a negative window bits value.
>
> Of course to maintain backwards compatibility you'll need to do this
> as a new compression type.
>

--
Best Regards,
Artem B. Bityuckiy,
St.-Petersburg, Russia.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in

Herbert Xu

unread,

Apr 3, 2005, 7:54:28 AM4/3/05

to Artem B. Bityuckiy, Artem B. Bityuckiy, dw...@infradead.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, jmo...@redhat.com, sven...@post5.tele.dk, YOSHIFUJI Hideaki

On Sun, Apr 03, 2005 at 03:41:07PM +0400, Artem B. Bityuckiy wrote:
>
> I also wonder, does it at all correct to use negative windowBits in
> crypto API? I mean, if windowBits is negative, zlib doesn't produce the

It's absolutely correct for IPComp. For other uses it may or may not
be correct.

For instance for JFFS2 it's absolutely incorrect since it breaks
compatibility. Incidentally, JFFS should create a new compression
type that doesn't include the zlib header so that we don't need the
head-skipping speed hack.

> proper zstream header, which is incorrect according to RFC-1950. It also

Can you please point me to the paragraph in RFC 1950 that says this?

> Isn't it conceptually right to produce *correct* zstream, with the
> header and the proper adler32 ?

Not really. However it should be possible if the user needs it which
is why we should make windowBits configurable.

> Yes, for JFFS2 we would like to have no adler32, we anyway protect our
> data by CRC32. But I suppose this should be an additional feature.

Yes, I'd love to see a patch that makes windowBits configurable in
crypto/deflate.c.

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <her...@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Artem B. Bityuckiy

unread,

Apr 3, 2005, 7:54:53 AM4/3/05

to Herbert Xu, Artem B. Bityuckiy, dw...@infradead.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, jmo...@redhat.com, sven...@post5.tele.dk, YOSHIFUJI Hideaki

Herbert Xu wrote:
> Can you please point me to the paragraph in RFC 1950 that says this?

Ok, if to do s/correct/compliant/, here it is:

Section 2.3, page 7

-----------------------------------------------------------------------
A compliant compressor must produce streams with correct CMF, FLG and
ADLER32, but need not support preset dictionaries.
...

A compliant decompressor must check CMF, FLG, and ADLER32, and
provide an error indication if any of these have incorrect values.
A compliant decompressor must give an error indication if CM is
not one of the values defined in this specification (only the
value 8 is permitted in this version), since another value could
indicate the presence of new features that would cause subsequent
data to be interpreted incorrectly
-----------------------------------------------------------------------

--
Best Regards,
Artem B. Bityuckiy,
St.-Petersburg, Russia.

Artem B. Bityuckiy

unread,

Apr 3, 2005, 8:03:16 AM4/3/05

to Herbert Xu, Artem B. Bityuckiy, dw...@infradead.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, jmo...@redhat.com, sven...@post5.tele.dk, YOSHIFUJI Hideaki

Herbert Xu wrote:
> On Sun, Apr 03, 2005 at 03:41:07PM +0400, Artem B. Bityuckiy wrote:
>
>>I also wonder, does it at all correct to use negative windowBits in
>>crypto API? I mean, if windowBits is negative, zlib doesn't produce the
>
>
> It's absolutely correct for IPComp. For other uses it may or may not
> be correct.

I've looked through RFC-2394 quickly, but didn't found any mention about
non-complient zstreams. I suppose they must be complient by default.
Although I'm far not an expert in the area.

> For instance for JFFS2 it's absolutely incorrect since it breaks
> compatibility. Incidentally, JFFS should create a new compression
> type that doesn't include the zlib header so that we don't need the
> head-skipping speed hack.

Anyway, in JFFS2 we may do that "hack" before calling pcompress(), so it
isn't big problem.

> Yes, I'd love to see a patch that makes windowBits configurable in
> crypto/deflate.c.

I wonder, do we really want this?

Imagine we have 100 different compressors, and each is differently
configurable. It may make cryptoAPI messy. May be it is better to stand
that user must use deflate (and the other 99 compressors) directly if he
wants something not standard/compliant?

--
Best Regards,
Artem B. Bityuckiy,
St.-Petersburg, Russia.

Herbert Xu

unread,

Apr 3, 2005, 8:03:16 AM4/3/05

to Artem B. Bityuckiy, Artem B. Bityuckiy, dw...@infradead.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, jmo...@redhat.com, sven...@post5.tele.dk, YOSHIFUJI Hideaki

On Sun, Apr 03, 2005 at 03:53:40PM +0400, Artem B. Bityuckiy wrote:
> Herbert Xu wrote:
> >Can you please point me to the paragraph in RFC 1950 that says this?
>
> Ok, if to do s/correct/compliant/, here it is:
>
> Section 2.3, page 7

Sorry, I thought you were referring to an RFC that defined IPComp/deflate.

RFC 1950 only defines what a zlib compressed data stream should
look like, it does not define what a deflate compressed data
stream is.

RFC 2394 which defines IPComp/deflate only refers to the deflate
document, and not zlib.

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <her...@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Herbert Xu

unread,

Apr 3, 2005, 8:10:53 AM4/3/05

to Artem B. Bityuckiy, Artem B. Bityuckiy, dw...@infradead.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, jmo...@redhat.com, sven...@post5.tele.dk, YOSHIFUJI Hideaki

On Sun, Apr 03, 2005 at 04:01:19PM +0400, Artem B. Bityuckiy wrote:
>
> >For instance for JFFS2 it's absolutely incorrect since it breaks
> >compatibility. Incidentally, JFFS should create a new compression
> >type that doesn't include the zlib header so that we don't need the
> >head-skipping speed hack.
>
> Anyway, in JFFS2 we may do that "hack" before calling pcompress(), so it
> isn't big problem.

It still makes sense to use a negative window bits for JFFS since it
means that you don't have to calculate the adler checksum in the
first place AND you don't have to store the zlib header/trailer on
disk.

BTW, that hack can only be applied when there is no preset dictionary.
Although the Linux implementation of JFFS probably never used a preset
dictionary, it is theoretically possible that someone out there may
have generated a JFFS image which contains a compressed stream that has
a preset dictionary.

In that case if you don't set the window bits to a postive value it
won't work at all.

> >Yes, I'd love to see a patch that makes windowBits configurable in
> >crypto/deflate.c.
>
> I wonder, do we really want this?

Yes since the the window bits determines the compression quality and
the amount of memory used. This is going to differ depending on the
application.

> Imagine we have 100 different compressors, and each is differently
> configurable. It may make cryptoAPI messy. May be it is better to stand
> that user must use deflate (and the other 99 compressors) directly if he
> wants something not standard/compliant?

Each crypto/deflate user gets their own private zlib instance.
Where is the problem?

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <her...@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Artem B. Bityuckiy

unread,

Apr 3, 2005, 8:20:18 AM4/3/05

to Herbert Xu, Artem B. Bityuckiy, dw...@infradead.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, jmo...@redhat.com, sven...@post5.tele.dk, YOSHIFUJI Hideaki

Herbert Xu wrote:
> Each crypto/deflate user gets their own private zlib instance.
> Where is the problem?

Hmm, OK. No problems, that was just RFC. :-)

--
Best Regards,
Artem B. Bityuckiy,
St.-Petersburg, Russia.