Need peer review: May have found mistake in Adler-32!

105 views
Skip to first unread message

Daryle Walker

unread,
Aug 25, 2008, 2:50:50 AM8/25/08
to zl...@gzip.org, Jean-loup Gailly, Mark Adler
[This is going out the Mr. Adler, his friends at zlib, the related
newsgroups comp.compression and sci.crypt, and the newsgroups sci.math
and sci.math.num-analysis, where I posted an Adler-32 question long
ago.]

This post relates to suspect calculations. Apologies to all involved
if all of your calculations are right and I somehow got all of mine
wrong. (Please confirm or debunk my suspicions if you can.)

Recently, I decided to try implementing the Adler-32 algorithm for
fun. The algorithm is described in the last parts of RFC 1950 and at
its Wikipedia page (http://en.wikipedia.org/wiki/Adler-32). The
calculations involve modulo addition. Computer scientists know that
such calculations can be done by doing the modulo reduction after each
addition, or save the additions up and do the reduction right before
the register would overflow. Assuming 8-bit bytes and 32-bit
registers, the RFC suggests that modulo reduction can be delayed for
5552 bytes. The Wikipedia entry, however, suggests only a 5550-byte
limit. Said web entry doesn't state its reasoning, except to mention
that the reasoning is intricate and involves detailed analysis of the
reading & reduction loop.

Question 1: can anyone give me the full reasoning the Wikipedia person
used?

Now that 64-bit integer arithmetic is common in computers, I wanted to
update the limit to the next generation, hopefully in a way that will
work no matter the register size in the future. So I tried to
calculate the general limit before reduction.

An Adler-32 checksum is made from combining two variables. The first
one is the sum of all bytes plus one. The second is the sum of all
values of the first variable. Both sums are taken modulo 65521 (the
largest prime within 16 bits). The least non-negative residue of
those variables are combined by shifting the second residue up 16 bits
and combining it with the first residue, resulting in the checksum.
Finding the iteration limit supposes the worst case scenario where
each byte takes its maximum value. Let's call the value CM, which is
255 for 8-bit bytes. The two variables proceed as:

s1(1): 1 + CM, 1 + 2 * CM, 1 + 3 * CM, ... 1 + n * CM
s2(0): 1 + CM, 2 + 3 * CM, 3 + 6 * CM, ... n + n(n+1)/2 * CM

where n is the number of steps taken. The n(n+1)/2 formula is the sum
of all whole numbers from 1 to n. (I think Euler figured that out
first.) Obviously, s2 will bust the limit first. We have to keep a
next step from going over the register's maximum value so:

n + n(n+1)/2 * CM <= MAX_REGISTER - BASE

where BASE is 65521 and MAX_REGISTER is the largest value the register
type can take. For (at the time) regular computers, the register was
32-bits long, giving a maximum value of 2^32 - 1 = 4,294,967,295. We
can reduce the equation to:

n + (n^2 + n) * CM/2 <= MAX_REGISTER - BASE

CM/2 * n^2 + (1 + CM/2) * n <= MAX_REGISTER - BASE

CM/2 * n^2 + (1 + CM/2) * n + (BASE - MAX_REGISTER) <= 0

CM * n^2 + (2 + CM) * n + 2 * (BASE - MAX_REGISTER) <= 0

plugging in our system's variables gives:

255*n^2 + 257*n - 8,589,803,548 <= 0

(Hopefully, that "4" isn't a "9;" can't read my writing.) The roots
of that equation are approximately 5,803.4 and -5,804.4. Plugging in
values around the positive root results in n <= 5,803. I tried to
send a private message to Mark Adler asking why my result differed,
but I haven't got anything back (yet). I've done other stuff for a
week, but now I'm following up again.

I tried to Google for more useful information about Adler-32 to help.
I eventually found an old post of mine sent to the sci.math
newsgroups. It had a different quadratic formula. My post said that
I found it from some supplemental material. Groping for answers, I
eventually found the material from looking at the zlib source code, in
"adler32.c":

//=============================================
#define BASE 65521UL /* largest prime smaller than 65536 */
#define NMAX 5552
/* NMAX is the largest n such that 255n(n+1)/2 + (n+1)(BASE-1) <=
2^32-1 */
//=============================================

This different quadratic equation reduces to:

255n(n+1)/2 + (n+1)(BASE-1) <= 2^32-1

255n(n+1) + 2(n+1)(BASE-1) <= 2(2^32-1)

(n+1)(255n + 2(BASE-1)) <= 2(2^32-1)

255n^2 + (255+2(BASE-1))n + 2(BASE-1) <= 2(2^32-1)

255n^2 + (255+2(BASE-1))n + 2((BASE-1) - (2^32-1)) <= 0

255 * n^2 + 131,295 * n - 8,585,803,550 <= 0

This has roots of 5,550.8 and -6,065.7. Plugging in values around the
positive root gives n <= 5,550. Yes, this is the same value from
Wikipedia.

Observation 2: I guess the effort the Wikipedia guy in correcting the
limit through analyzing the read & reduce loop is commendable.
However,...

Observation 3: The difference you saw between theory and practice
wasn't mysterious. The two values do match; it's just that the theory
guy got his answer wrong.

Observation 4: Either the first guy guesstimated the root or got his
exact calculations wrong, and then either got the confirmation wrong
or never bothered to check. I have checked mine, but through the use
of Google's built-in calculator and visiting quadratic equation web
sites. This wasn't feasible in 1995 (the earliest copyright on
"adler32.c"), and the large numbers make hand checks hard, and home
computers were less powerful. So maybe we can forgive this mistake.

Observation 5: I can't believe it took nearly 15 years for someone to
finally confirm (and actually debunk) the first guy's calculations!
And this checksum has been all over the place since then.

Now I got worried. I calculations still had a different result
(5,803) and was based on a different formula. Then I realized that if
the first guy messed up his computation, maybe he messed up the entire
formula.

Question 6: can anyone tell me how the "adler32.c" shorthand formula
was derived?

So I finally decided to manually crank out the steps and see what
happens. I have an August 2002 eMac running Mac OS X 10.4.11 on a
PowerPC G4 and wrote this code:

//=============================================
#include <climits>
#include <iomanip>
#include <ios>
#include <iostream>

int main( int argc, char * const argv[] )
{
using std::setw;

for ( unsigned long long i = 0, s1 = 1, s2 = 0 ; i < 6000 ; ++i,
s1 += UCHAR_MAX, s2 += s1 )
std::cout << std::dec << setw(4) << i << '\t' << std::hex <<
setw(8) << static_cast<unsigned long>(s1) << '\t' << setw(16)
<<
s1 << '\t' << setw(8) << static_cast<unsigned long>(s2) <<
'\t' << setw(16) << s2 << '\n';

return 0;
}
//=============================================

on an 8-bit byte system with 32-bit longs and 64-bit double-longs.
The registers are 64 bits so I can check beyond what a 32-bit type
will turnover. This prints a table with the following 5 columns: step
number, s1 within 32 bits, s1, s2 within 32 bits, and s2. I'm always
entering the maximum byte-value. This gives the following table:

//=============================================
0 1 1 0 0
1 100 100 100 100
2 1ff 1ff 2ff 2ff
3 2fe 2fe 5fd 5fd
4 3fd 3fd 9fa 9fa
5 4fc 4fc ef6 ef6
6 5fb 5fb 14f1 14f1
7 6fa 6fa 1beb 1beb
8 7f9 7f9 23e4 23e4
9 8f8 8f8 2cdc 2cdc
10 9f7 9f7 36d3 36d3
11 af6 af6 41c9 41c9
12 bf5 bf5 4dbe 4dbe
13 cf4 cf4 5ab2 5ab2
14 df3 df3 68a5 68a5
15 ef2 ef2 7797 7797
16 ff1 ff1 8788 8788
17 10f0 10f0 9878 9878
18 11ef 11ef aa67 aa67
//...
//=============================================

Let's go to the conventional limit of 5,550:

//=============================================
//...
5547 159556 159556 e9e03cb9 e9e03cb9
5548 159655 159655 e9f5d30e e9f5d30e
5549 159754 159754 ea0b6a62 ea0b6a62
5550 159853 159853 ea2102b5 ea2102b5
5551 159952 159952 ea369c07 ea369c07
5552 159a51 159a51 ea4c3658 ea4c3658
5553 159b50 159b50 ea61d1a8 ea61d1a8
//...
//=============================================

I don't see any turnover here yet. Let's go to my limit of 5,803:

//=============================================
//...
5801 169258 169258 ffc86a8c ffc86a8c
5802 169357 169357 ffdefde3 ffdefde3
5803 169456 169456 fff59239 fff59239
5804 169555 169555 c278e 1000c278e
5805 169654 169654 22bde2 10022bde2
5806 169753 169753 395535 100395535
//...
//=============================================

Well, it seems that a 32-bit register will turnover when it goes past
5,803 to the 5,804th step! I did a modified program to help confirm:

//=============================================
// I guess any applicable copyrights are mine (Daryle Walker (c)
2008).
#include <climits>
#include <iomanip>
#include <ios>
#include <iostream>

struct adler32
{
unsigned long long s1, s2;
adler32() : s1( 1 ), s2( 0 ) {}
void operator ()( unsigned char byte ) { s1 += byte; s2 += s1; }
unsigned long operator ()() const
{ return ((s2 % 65521) << 16) | (s1 % 65521); }
void operator ()( void const *b, unsigned l )
{
unsigned char const *bb = static_cast<unsigned char
const*>(b);
while ( l-- ) this->operator ()( *bb++ );
}
};

int main( int argc, char * const argv[] )
{
using std::cout;
using std::setw;

adler32 computer;

computer( "Wikipedia", 9 );
cout << computer() << '\n';
computer = adler32();

for ( int i = 0 ; i < 6000 ; ++i, computer(UCHAR_MAX) )
cout << std::dec << setw(4) << i << '\t' << std::hex <<
setw(8) << static_cast<unsigned long>(computer.s1) << '\t' <<
setw(16) << computer.s1 << '\t' << setw(8) <<
static_cast<unsigned long>(computer.s2) << '\t' << setw(16)
<<
computer.s2 << '\n';

return 0;
}
//=============================================

It's a similar program, but I moved the byte-entry code to a custom
class to isolate it and make sure I didn't copy it down wrong. I also
tested the sample Adler-32 given at Wikipedia, the name itself. I got
the article's value of 300286872. (I have an ASCII-superset system.)

Request 7: Can someone else please confirm (or debunk) these
findings.

Observation 8: If this is accurate, I can't believe that no one EVER,
ANYWHERE, tried to confirm the limit after 15 years until now!

Request 9: I would REALLY like to know how the original guy got his
quadratic formula. I would like to see the reasoning and a
proposition on why I'm wrong. I showed how I got my formula, so maybe
deriving yours will revealed where I screwed up (if anywhere).

Request 10: If everyone has to update their code, feel free to give
me credit. -_^ (Of course, any value below the limit, like 5550,
works. That's why no one noticed.) I don't do the software patent
thing, so use the code if you need to (public domain).

Observation 11: Wait a minute! How the heck did the Wikipedia guy
get the right answer to the wrong equation instead of getting my
answer, then?!

Daryle Walker

Mark Adler

unread,
Aug 25, 2008, 10:22:40 AM8/25/08
to
Daryle,

There is no mistake in the zlib implementation of Adler-32. A quick
look at the wikipedia article shows an error, in that the author
claims that "a" can be 0x1013a or 65850 after "a" has been modded by
65521. That's not possible, since anything modulo 65521 will always
be in the range 0 to 65520.

The zlib derivation is simple. Assume that sum1 and sum2 have their
maximum possible values at the start of the loop, which are 65520 and
65520. Then run 255's through the loop until just before sum2
overflows 2^32 - 1. If you run 5,552 255's through the loop, then
sum1 is 1,481,280 and sum2 is 4,294,690,200. sum2 is less than 2^32
== 4,294,967,296. So the maximum n is 5,552.

That's all there is to it.

Mark

Wolfgang Ehrhardt

unread,
Aug 25, 2008, 12:20:57 PM8/25/08
to
On Sun, 24 Aug 2008 23:50:50 -0700 (PDT), Daryle Walker
<dar...@gmail.com> wrote:

[SNIP most text]


>
>255 * n^2 + 131,295 * n - 8,585,803,550 <= 0

[..]


>Observation 8: If this is accurate, I can't believe that no one EVER,
>ANYWHERE, tried to confirm the limit after 15 years until now!

There are at least two errors in your argumentation:

1. You did only a partial worst case analysis: As Mark already said:
Because the Adler-32 update function can be used with the result of a
previous Adler-32 calculation the maximum starting values of s1 and s2
are both BASE-1 (s1=s2=BASE-1 can be produced with BASE-2 bytes of 1).
The 0 and 1 are the very first starting values (and with these s2
overflows for n=5804 if all input is 255).

2. With the starting values s1=s2=BASE-1 the formula from zlib.h is
correct
255n(n+1)/2 + (n+1)(BASE-1) <= 2^32-1 or
255/2*n^2 + n*(BASE-1+255/2) + BASE-1 - (2^32-1) <= 0
The roots are n1=5552.190793923682684283598574 (or n2=-6067.0...),
calculated with Pari/GP:
BASE = 65520
a = 255/2
b = BASE-1 + 255/2
c = BASE-1 - (2^32-1)
d = sqrt(b^2 - 4*a*c)
n1 = (-b + d)/(2*a)
n2 = (-b - d)/(2*a)

3. Therefore the value n=5552 from RFC and zlib is correct. You could
also have easily shown this by experiment with a short program, which
determines the value n for which s2 overflows:

program adlerovr;
const
BASE = 65521;
lim = int64(1) shl 32;
var
s1,s2: int64;
n: longint;
begin
s1 := BASE-1;
s2 := BASE-1;
n := 0;
repeat
s1 := s1 + 255;
s2 := s2 + s1;
inc(n);
until s2>=lim;
writeln('Adler-32 overflow for n=',n);
end.

As expected the result is: Adler-32 overflow for n=5553.

4. If you are aware that you make typos or other glitches you should
double check your calculation, your formula

255 * n^2 + 131,295 * n - 8,585,803,550 <= 0

should read

255 * n^2 + 131,295 * n - 8,589,803,550 <= 0

Wolfgang

--
In order to e-mail me a reply to this message, you will have
to remove PLEASE.REMOVE from the address shown in the header
or get it from http://home.netsurf.de/wolfgang.ehrhardt
(Free open source Crypto, AES, CRC, Hash for Pascal/Delphi)

Wolfgang Ehrhardt

unread,
Aug 25, 2008, 12:20:58 PM8/25/08
to
On Mon, 25 Aug 2008 07:22:40 -0700 (PDT), Mark Adler
<mad...@alumni.caltech.edu> wrote:

>Daryle,
>
>There is no mistake in the zlib implementation of Adler-32. A quick
>look at the wikipedia article shows an error, in that the author
>claims that "a" can be 0x1013a or 65850 after "a" has been modded by
>65521. That's not possible, since anything modulo 65521 will always
>be in the range 0 to 65520.

The "magic value 5550" paragraph is an artifact of an old revision of
the article describing a C implementation without mod operations:

(<http://en.wikipedia.org/w/index.php?title=Adler-32&oldid=179953130>
or older)

Mod was introduced by a person named Vicaya at 01:11, 3 January 2008.
either s/he does not know better or does not care (anyway: the
sentence "Because this implementation does not completely reduce a" is
now obviously wrong!) But because 5550 is safe, there is no bug in the
code, it is only slightly suboptimal.

Wolfgang Ehrhardt

unread,
Aug 25, 2008, 1:37:06 PM8/25/08
to
A small bug fix to my numbers:

>calculated with Pari/GP:
> BASE = 65520
..

The correct numbers atre of course: BASE=65521 and
n1 = 5552.187045454599584508943831

Wolfgang


Mark Adler

unread,
Aug 25, 2008, 3:36:34 PM8/25/08
to
I fixed the wikipedia page.

Daryle Walker

unread,
Aug 26, 2008, 2:46:49 PM8/26/08
to
I had an e-mail conversation with Adler, and we figured out what
happened. I'm posting the results here for all to see. At least
others will finally find useful information on Adler-32 on Google.
(Which may be me in a few years.)

On Aug 25, 2:50 am, Daryle Walker <dary...@gmail.com> wrote:
[SNIP]

Here's where I screwed up. I used the initial conditions to trace the
procession. Adler used the worse case, where both registers start at
the highest value, BASE - 1. If we do that, we get:

s1(BASE - 1): BASE - 1 + CM, BASE - 1 + 2 * CM, BASE - 1 + 3 * CM,
... BASE - 1 + n * CM
s2(BASE - 1): 2(BASE-1) + CM, 3(BASE-1) + 3 * CM, 4(BASE-1) + 6 * CM,
... (n + 1)(BASE - 1) + n(n + 1)/2 * CM

>
> n + n(n+1)/2 * CM <= MAX_REGISTER - BASE
>
> where BASE is 65521 and MAX_REGISTER is the largest value the register
> type can take.  For (at the time) regular computers, the register was
> 32-bits long, giving a maximum value of 2^32 - 1 = 4,294,967,295.  We
> can reduce the equation to:
>
> n + (n^2 + n) * CM/2 <= MAX_REGISTER - BASE
>
> CM/2 * n^2 + (1 + CM/2) * n <= MAX_REGISTER - BASE
>
> CM/2 * n^2 + (1 + CM/2) * n + (BASE - MAX_REGISTER) <= 0
>
> CM * n^2 + (2 + CM) * n + 2 * (BASE - MAX_REGISTER) <= 0
>
> plugging in our system's variables gives:
>
> 255*n^2 + 257*n - 8,589,803,548 <= 0
>
> (Hopefully, that "4" isn't a "9;" can't read my writing.)  The roots
> of that equation are approximately 5,803.4 and -5,804.4.  Plugging in
> values around the positive root results in n <= 5,803.  I tried to
> send a private message to Mark Adler asking why my result differed,
> but I haven't got anything back (yet).  I've done other stuff for a
> week, but now I'm following up again.

[TRUNCATE]

Now the limit equation is:

(n + 1)(BASE - 1) + n(n + 1)/2 * CM <= MAX_REGISTER

which reduces to:

CM * n^2 + (CM + 2(BASE - 1)) * n + 2(BASE - 1 - MAX_REGISTER) <= 0

Using our default values gives:

255 * n^2 + 131,295 * n - 8,589,803,550 <= 0

which has roots around -6,067.07 and 5,552.19. Testing around the
positive root gives Adler's result of 5,552! So,... never mind.

Everyone can keep their entry limits to 5,552 to be safe. And this
proves the Wikipedia entry, which has now been purged (not by me), to
be wrong. However, you could use my 5,803 limit to start a check-sum
calculation from scratch, but reset it to 5,552 after the first
turnover.

By the way, I tried adapting my code for 64-bit registers and
retaining 8-bit bytes. The quadratic equation computer web page
couldn't accurately go up that high because it used floating point,
but it gave me an estimate to run with manually. The equivalent of
the 5,552 worse-case limit is 380,368,439. The equivalent for my
5,803 initial limit is 380,368,696. (Each run took over 20 minutes,
opposed to a few seconds for the 32-bit runs.)

Daryle Walker

Daryle Walker

unread,
Aug 26, 2008, 3:01:53 PM8/26/08
to
On Aug 25, 12:20 pm, Wolfgang.Ehrhardt.PLEASE.REM...@munich.netsurf.de

(Wolfgang Ehrhardt) wrote:
> On Sun, 24 Aug 2008 23:50:50 -0700 (PDT), Daryle Walker
>
> <dary...@gmail.com> wrote:
>
> [SNIP most text]
>
>
>
> >255 * n^2 + 131,295 * n - 8,585,803,550 <= 0
> [..]
> >Observation 8:  If this is accurate, I can't believe that no one EVER,
> >ANYWHERE, tried to confirm the limit after 15 years until now!
>
> There are at least two errors in your argumentation:
>
> 1. You did only a partial worst case analysis: As Mark already said:
> Because the Adler-32 update function can be used with the result of a
> previous Adler-32 calculation the maximum starting values of s1 and s2
> are both BASE-1 (s1=s2=BASE-1 can be produced with BASE-2 bytes of 1).
> The 0 and 1 are the very first starting values (and with these s2
> overflows for n=5804 if all input is 255).

I got Adler's post e-mailed to me pointing out my error.

>
> 2. With the starting values s1=s2=BASE-1 the formula from zlib.h is
> correct
>   255n(n+1)/2 + (n+1)(BASE-1) <= 2^32-1   or
>   255/2*n^2 + n*(BASE-1+255/2) + BASE-1 - (2^32-1) <= 0
> The roots are n1=5552.190793923682684283598574 (or n2=-6067.0...),
> calculated with Pari/GP:
>      BASE = 65520
>      a = 255/2
>      b = BASE-1 + 255/2
>      c = BASE-1 - (2^32-1)
>      d = sqrt(b^2 - 4*a*c)
>      n1 = (-b + d)/(2*a)
>      n2 = (-b - d)/(2*a)

Yep, got that too.

>
> 3. Therefore the value n=5552 from RFC and zlib is correct. You could
> also have easily shown this by experiment with a short program, which
> determines the value n for which s2 overflows:
>
>    program adlerovr;
>    const
>      BASE = 65521;
>      lim  = int64(1) shl 32;
>    var
>      s1,s2: int64;
>      n: longint;
>    begin
>      s1 := BASE-1;
>      s2 := BASE-1;
>      n := 0;
>      repeat
>        s1 := s1 + 255;
>        s2 := s2 + s1;
>        inc(n);
>      until s2>=lim;
>      writeln('Adler-32 overflow for n=',n);
>    end.
>
> As expected the result is: Adler-32 overflow for n=5553.

Yeah, I eventually wrote a similar program too. I also used it to
find the 64-bit register limits.

>
> 4. If you are aware that you make typos or other glitches you should
> double check your calculation, your formula
>
>   255 * n^2 + 131,295 * n - 8,585,803,550 <= 0
>
> should read
>
>   255 * n^2 + 131,295 * n - 8,589,803,550 <= 0

[Looks at my 5,552 sheet and 5,550 sheet.]
Oh yeah, I just realized that my calculations from my first post and
my recent retraction post should have had the same formula, but I
messed up somewhere and got the 5,550 result for the first post.
Oops. Thanks anyway.

Daryle Walker

Wolfgang Ehrhardt

unread,
Aug 26, 2008, 5:32:35 PM8/26/08
to
On Tue, 26 Aug 2008 11:46:49 -0700 (PDT), Daryle Walker
<dar...@gmail.com> wrote:

>Everyone can keep their entry limits to 5,552 to be safe. And this
>proves the Wikipedia entry, which has now been purged (not by me), to
>be wrong.

As Mark Adler wrote in another post, he has fixed the Wikipedia entry.

>By the way, I tried adapting my code for 64-bit registers and
>retaining 8-bit bytes. The quadratic equation computer web page
>couldn't accurately go up that high because it used floating point,
>but it gave me an estimate to run with manually. The equivalent of
>the 5,552 worse-case limit is 380,368,439. The equivalent for my
>5,803 initial limit is 380,368,696. (Each run took over 20 minutes,
>opposed to a few seconds for the 32-bit runs.)
>

your numbers seem correct if you are using 64 bit registers with the
old BASE = 65521 = predprime(2^16). IMHO a more canonical 64 bit
formulation would use BASE = predprime(2^32) = 4294967291 and the
limit 5552 equivalent becomes 363898415 (the other limit is
380368696.4763 for BASE=4294967291, this is almost the same as your
value 380368696.5206 for BASE=65521).

All calculations took less than 1 ms each on Pari/GP:

BASE = precprime(2^32)


a = 255/2
b = BASE-1 + 255/2

c = BASE-1 - (2^64-1)


d = sqrt(b^2 - 4*a*c)
n1 = (-b + d)/(2*a)
n2 = (-b - d)/(2*a)

(time for script writing not included :)

Mark Adler

unread,
Aug 28, 2008, 10:38:42 AM8/28/08
to
On Aug 25, 9:20 am, Wolfgang.Ehrhardt.PLEASE.REM...@munich.netsurf.de

(Wolfgang Ehrhardt) wrote:
> (s1=s2=BASE-1 can be produced with BASE-2 bytes of 1).

Wolfgang,

Thanks for that! I figured there must be a sequence that gives the
maximum s1 and s2 values, but I never tried to find one. I'm
surprised it's that easy. In fact, for any odd BASE, s2 ends up as
BASE-1 (as well as s1 of course).

Mark

Daryle Walker

unread,
Aug 29, 2008, 3:17:42 PM8/29/08
to

Wow, you didn't know that either. I want to know how that is proven,
so I can put it in my testing file's notes. I see that the sequences
from always entering ones are:

s1(1): 2, 3, 4, 5, ... n + 1
s2(0): 2, 5, 8, 14, ... (n + 1)(n + 2) / 2 - 1

and the two sequences equal when n is either +1 or -2, or due to
modulo arithmetic, when n is congruent to those values modulo BASE.
Now using 65,520 for n gives us what we want, but did we get lucky
there? Or is there a way to find an n for any given value (less than
BASE) that we want s1 and s2 to simultaneously cover?

Daryle Walker

Mark Adler

unread,
Aug 29, 2008, 4:37:48 PM8/29/08
to
On Aug 29, 12:17 pm, Daryle Walker <dary...@gmail.com> wrote:
> On Aug 28, 10:38 am, Mark Adler <mad...@alumni.caltech.edu> wrote:
> In fact, for any odd BASE, s2 ends up as BASE-1 (as well as s1 of course).
>
> I want to know how that is proven,
...

> s2(0): 2, 5, 8, 14, ... (n + 1)(n + 2) / 2 - 1

The proof is right there in your formula, staring back at you. n is
BASE-2. So we have (BASE-1)*BASE/2 - 1. For odd BASE, (BASE-1)/2 is
an integer. So for odd BASE we have BASE*(some integer) - 1. That
quantity modulo BASE is equal to BASE-1.

Mark

Daryle Walker

unread,
Aug 29, 2008, 5:34:11 PM8/29/08
to

No, I meant how did Wolfgang figure out that the worst-case scenario
with s1 and s2 (both BASE - 1) was actually constructable (without
rigging the registers in advance) and provide an actual method. Did
he get lucky?

Daryle Walker

Mark Adler

unread,
Aug 30, 2008, 1:10:08 AM8/30/08
to
On Aug 29, 2:34 pm, Daryle Walker <dary...@gmail.com> wrote:
> Did he get lucky?

I don't know, except that he got lucky in the sense that many years
ago I decided to start s1 at 1. If I had picked something else, which
I very well could have, then that sequence wouldn't have worked.

Just for kicks, I found the shortest sequence that gives s1 == s2 ==
65520. It is 128 255's, a 239, and 128 more 255's.

There are many others of this length (257), but this is this simplest
one to describe. There can't be any shorter sequences that give s1 ==
65520, since 256 bytes can't add up to 65519.

Mark

Wolfgang Ehrhardt

unread,
Aug 30, 2008, 3:05:50 PM8/30/08
to
On Fri, 29 Aug 2008 22:10:08 -0700 (PDT), Mark Adler
<mad...@alumni.caltech.edu> wrote:

>> Did he get lucky?
>
>I don't know, except that he got lucky in the sense that many years
>ago I decided to start s1 at 1. If I had picked something else, which
>I very well could have, then that sequence wouldn't have worked.

Daryle,

yes it was kind of luck, that the most obvious choice
for producing S1=BASE-1 gives S2=BASE-1 too. But that
result was not critical for the proof that n=5552 is
correct: From the simple fact that S1=BASE-1 is reachable
you can plug the min/max starting values of S2 in

255/2*n^2 + n*(BASE-1+255/2) + S2 - (2^32-1) <= 0

With the maximum S2=BASE-1 we have the old known value
5552.187045454599584508943831, and with the minimum
S2=0 you get 5552.231272070943830898881692. Therefore
the value is independent of the starting value of S2.


Mark,

if you start with any value S1(0) you can always
construct a sequence with S1=S2=BASE-1:

1. First feed in 1 bytes until you get S1=BASE-1

2. If S2=BASE-1 your done

3. Feed in a 0 byte. This does not change S1, and decrements
S2 by 1.
4. If S2=-1 mod BASE your done, otherwise repeat 3.

Wolfgang Ehrhardt

unread,
Aug 30, 2008, 4:46:36 PM8/30/08
to
On Fri, 29 Aug 2008 22:10:08 -0700 (PDT), Mark Adler
<mad...@alumni.caltech.edu> wrote:

>On Aug 29, 2:34=A0pm, Daryle Walker <dary...@gmail.com> wrote:
>> Did he get lucky?
>
>I don't know, except that he got lucky in the sense that many years
>ago I decided to start s1 at 1. If I had picked something else, which
>I very well could have, then that sequence wouldn't have worked.
>

Here is a generalization of my other post: It is easy to construct a
sequence that for arbitrary starting values S1_Start and S2_Start
which reaches arbitrary destination values of S1_Dest, S2_Dest (all
numbers between between 0 and BASE-1):

Step 0: Initialize Adler with starting values S1_Start, S2_Start

Step 1: If S1_Dest=0 goto step 6

Step 2: If S1=S1_Dest goto step 4.

Step 3: Feed in 1 and goto step 2

Step 4: Here S1=S1_Dest with S1<>0.

Step 5: If S2=S2_Dest then terminate, otherwise feed in 0
and repeat step 5. Note that this step does not
change S1=S1_Dest and increments S2 by S1_Dest.
Because S1_Dest is nonzero and BASE is prime
this step is repeated at most BASE times.

Step 6: Here S1_Dest=0. Use steps 0 to 6 to construct a
sequence for S1_Dest=BASE-1 and the given S2_Dest.
Feed in a final 1, this gives S1=0, S2=S2_Dest.

Mark Adler

unread,
Aug 31, 2008, 11:09:39 AM8/31/08
to
On Aug 30, 1:46 pm, Wolfgang.Ehrhardt.PLEASE.REM...@munich.netsurf.de

(Wolfgang Ehrhardt) wrote:
> Here is a generalization of my other post: It is easy to construct a
> sequence that for arbitrary starting values S1_Start and S2_Start
> which reaches arbitrary destination values of S1_Dest, S2_Dest (all
> numbers between between 0 and BASE-1):

Very nice. This also proves the assertion that, like a CRC, an
Adler-32 can be easily forged and therefore is unsuitable for
signature applications.

Mark

Jeffrey Walton

unread,
Aug 31, 2008, 6:28:47 PM8/31/08
to
> A quick look at the wikipedia article shows an error, in that the
> author claims that "a" can be 0x1013a or 65850 after...
Dr. Adler beat me to the punch: don't trust Wiki. I found too many
mistakes in the past.

Jeff

David R Tribble

unread,
Aug 31, 2008, 8:18:16 PM8/31/08
to
Jeffrey Walton wrote:
>> A quick look at the wikipedia article shows an error, in that the
>> author claims that "a" can be 0x1013a or 65850 after...
>

> Dr. Adler beat me to the punch: don't trust Wiki. I found too many
> mistakes in the past.

Of course, you can always correct those errors. That's the
great thing about Wikipedia, that anyone (even experts)
can make corrections and additions to the articles.

Phil Carmody

unread,
Sep 1, 2008, 3:30:37 AM9/1/08
to

The great thing about wikipedia is that everyone can be
part of something that's flawed. Gee, you really sell
that well.

Phil
--
The fact that a believer is happier than a sceptic is no more to the
point than the fact that a drunken man is happier than a sober one.
The happiness of credulity is a cheap and dangerous quality.
-- George Bernard Shaw (1856-1950), Preface to Androcles and the Lion

Reply all
Reply to author
Forward
0 new messages