http://eprint.iacr.org/2004/199.pdf
Thanks,
Simon
Sebastian Gottschalk wrote:
> Simon Josefsson schrieb:
>
>
>>Anyone know more about the kind of attack used in a new ePrint paper?
>>http://eprint.iacr.org/2004/199.pdf
>
>
> Seems like stupidy attack. I couldn't verify any of the results.
Did you read the other thread 'Collision in SHA-0' ?
M. K. Shen
> Simon Josefsson schrieb:
>
>> Anyone know more about the kind of attack used in a new ePrint paper?
>> http://eprint.iacr.org/2004/199.pdf
>
> Seems like stupidy attack. I couldn't verify any of the results.
Did anyone else note that the paper was 2 pages long, just randomly listed
values and didn't mention anything else?
Either they were really excited to get the data out or this isn't a serious
study.
Tom
> Simon Josefsson schrieb:
>
>> Anyone know more about the kind of attack used in a new ePrint paper?
>> http://eprint.iacr.org/2004/199.pdf
>
> Seems like stupidy attack. I couldn't verify any of the results.
I was able to verify the MD4 collision, after correcting for endian
problems. I don't get the same output as they do, though. Others
have verified the MD5 and HAVAL-128 collisions too. I don't know
about the RIPEMD-128 collision, I tried a RIPEMD-160 implementation of
it until I realized they used RIPEMD-128...
If you have Emacs and a modern Gnus, you can verify the MD4 collision
by evaluating the following expressions:
(encode-hex-string (md4 (rfc2104-hexstring-to-bitstring "839c7a4d7a92cb5678a5d5b9eea5a7573c8a74deb366c3dc20a083b69f5d2a3bb3719dc69891e9f95e809fd7e8b23ba6318edd45e51fe39708bf9427e9c3e8b9") 64))
=> "4d7e6a1defa93d2dde05b45d864c429b"
(insert (encode-hex-string (md4 (rfc2104-hexstring-to-bitstring "839c7a4d7a92cbd678a5d529eea5a7573c8a74deb366c3dc20a083b69f5d2a3bb3719dc69891e9f95e809fd7e8b23ba6318edc45e51fe39708bf9427e9c3e8b9") 64)))
=> "4d7e6a1defa93d2dde05b45d864c429b"
Thanks,
Simon
> Sebastian Gottschalk schrieb:
>
>> Seems like stupidy attack. I couldn't verify any of the results.
> I'm sorry sorry sorry. Those Chinese scientists didn't check for the
> endianess. Now I can verify the results.
Still, all of their hashes had 128 bit digests, unlike SHA-0 which is
160 bit. I believe it is possible to find collisions in 128 bit
digest hashes with a large computer, by simple brute force, if you are
willing to wait a while.
They'd convince me if they published lots of collisions. Or,
preferably, the details of the attack [if I can understand it].
Thanks,
Simon
So what would explain the fact that in over a decade not a single
collision for MD5 has ever been found?
Do the math, and you'll find that even if the fastest supercomputer on
the planet were used, you'd still not find a collision for many
centuries by brute force.
--
That's News To Me!
news...@comcast.net
Read this:
http://www.scs.carleton.ca/~paulv/papers/acmccs94.pdf
> --
> That's News To Me!
> news...@comcast.net
--
Ah, yes indeed, a special-purpose machine could be built, much like
the DES-cracker that actually was built. But the original poster was
claiming that finding collisions would be possible "with a large
computer, by simple brute force", which I interpreted to mean a large
general-purpose computer -- which is indeed out of the range of
practicality. Maybe that's not what he meant, but the point stands
that the tone of his posting is that he dismisses this as not being
terribly difficult -- and if that were so, why hasn't it been done?
news...@comcast.net wrote:
> Jean-Luc Cooke <jlc...@engsoc.org> wrote:
>
>>Read this:
>>http://www.scs.carleton.ca/~paulv/papers/acmccs94.pdf
>
>
> Ah, yes indeed, a special-purpose machine could be built, much like
> the DES-cracker that actually was built. But the original poster was
> claiming that finding collisions would be possible "with a large
> computer, by simple brute force", which I interpreted to mean a large
> general-purpose computer -- which is indeed out of the range of
> practicality. Maybe that's not what he meant, but the point stands
> that the tone of his posting is that he dismisses this as not being
> terribly difficult -- and if that were so, why hasn't it been done?
I guess that, when the economic return isn't large enough
and there are other things of higher priority (or simply
more interesting to the people having the needed knowhow),
then lots of things wouldn't be done in practice. There may
certainly also be others reasons, I would think. (Cf. e.g.
the Kyoto protocol.)
M. K. Shen
> Now I can verify the results.
So did I for md5. The md5 hash of the 1024-bit strings with hex dump
d131dd02c5e6eec4693d9a0698aff95c2fcab58712467eab4004583eb8fb7f8955ad340609f4b30283e488832571415a085125e8f7cdc99fd91dbdf280373c5b960b1dd1dc417b9ce4d897f45a6555d535739ac7f0ebfd0c3029f166d109b18f75277f7930d55ceb22e8adba79cc155ced74cbdd5fc5d36db19b0ad835cca7e3
and
d131dd02c5e6eec4693d9a0698aff95c2fcab50712467eab4004583eb8fb7f8955ad340609f4b30283e4888325f1415a085125e8f7cdc99fd91dbd7280373c5b960b1dd1dc417b9ce4d897f45a6555d535739a47f0ebfd0c3029f166d109b18f75277f7930d55ceb22e8adba794c155ced74cbdd5fc5d36db19b0a5835cca7e3
is identical: a4c0d35c95a63a805915367dcfe6b751
Congratulation to the authors !
Francois Grieu
Wow. I just verified this. I have no idea what their technique is (I
looked at the paper, and it just gives the data -- no info on
technique), but what strikes me just looking at the two strings is
this:
1) Very few bits change between these data values. My "gut feeling"
on cryptographic hash functions has always been that changing one
or a few bits leads to a large change in the output (a kind of
avalanche property). I had always thought that any collisions
that were found would have a relation between them that was so
complex that it would look almost random.
2) Not only have only 6 bits changed out of the 1024-bit values, but
the positions of these bits are exactly the same in both 512-bit
blocks. One interpretation (probably wrong) is that changing
these bits in the first application of the of the compression
function makes some change, and changing those same bits in the
second application of the compression function undoes this
change.
Anyway, the above is just wild speculation, but wild speculation can
be fun sometimes....
Verified. Interestingly, these strings differ in only six bit
locations: characters 19, 45, 59, 83, 109, and 123 each have the high
bit flipped.
--
Erwann ABALEA <erw...@abalea.com> - RSA PGP Key ID: 0x2D0EABD5
-----
> Désolé.
Ta gueule.
-+- LC in : Guide du Neuneu Usenet - Neuneu exaspère le dino -+-
> characters 19, 45, 59, 83, 109, and 123
> each have the high bit flipped.
And 83-19 = 109-45 = 123-59 = 512/8
Putting the values of A B C D side by side, I see that most of the
differences are in the high bits, and all the differences are within
the top 7 bits out of 32.
Francois Grieu
>Do the math, and you'll find that even if the fastest supercomputer on
>the planet were used, you'd still not find a collision for many
>centuries by brute force.
Dumb luck can give you a collision on the very first try, as
improbable as that is, it is not impossible. The estimated number of
tries is not a hard limit imposed by God almighty, it's an average
which could turn out to be as low as 1 and as high as the max. The
more samples you get the closer the average of those will be to the
expected average, but that tells you absolutely nothing where any
individuual sample will lie in the range.
--
"The most merciful thing in the world, I think, is the inability
of the human mind to correlate all its contents." - H.P. Lovecraft
Stimmt.
You need to publish the method by which you got these strings from
their paper. It's not obvious. :)
--Mike Amling
> You need to publish the method by which you got these strings from
> their paper. It's not obvious. :)
For MD5, from the August 17 version of the paper:
copy the values from Acrobat reader, trim the obious garbage
to keep blocks of 8 hex digits or slightly less, add leading
zeroes to get 8 hex digits;
so as to account for md5 little-endianness, cut each 8 digit
block into 4 subblocks of 2 digits, permutate first and fourth
subblocks, permutate second and third sublock;
remove whitespace within each 256-digit block.
Reportedly, in the August 16 version, endianess problems
affected the MD5 IV, therefore the messages given could not be
used to find a collision on normal MD5.
That the authors are able to correct this in one day does
prove their computational cost to find a collision with
whatever message header is low.
Francois Grieu
Johnny Bravo wrote:
> On Tue, 17 Aug 2004 15:02:41 GMT, news...@comcast.net wrote:
>
>
>>Do the math, and you'll find that even if the fastest supercomputer on
>>the planet were used, you'd still not find a collision for many
>>centuries by brute force.
>
>
> Dumb luck can give you a collision on the very first try, as
> improbable as that is, it is not impossible. The estimated number of
> tries is not a hard limit imposed by God almighty, it's an average
> which could turn out to be as low as 1 and as high as the max. The
> more samples you get the closer the average of those will be to the
> expected average, but that tells you absolutely nothing where any
> individuual sample will lie in the range.
Yes. In order to claim a break, one has to show that one manages
to find collisions (consistently) with a higher probability
than the expected value (the value computed on the assumption
that the algorithm were perfect), in other words with less
work on the average (not just in one single particular run)
than expected. In another thread Lassi Hippeläinen repeatedly
pointed out the luck issue but he was unfortunately not
properly understood for quite a time.
M. K. Shen
> In article <3VzUc.2184$um2...@newssvr31.news.prodigy.com>,
> Michael Amling <nos...@nospam.com> wrote:
>
>
>> You need to publish the method by which you got these strings from
>>their paper. It's not obvious. :)
>
>
>
> For MD5, from the August 17 version of the paper:
> copy the values from Acrobat reader, trim the obious garbage
> to keep blocks of 8 hex digits or slightly less, add leading
> zeroes to get 8 hex digits;
> so as to account for md5 little-endianness, cut each 8 digit
> block into 4 subblocks of 2 digits, permutate first and fourth
> subblocks, permutate second and third sublock;
> remove whitespace within each 256-digit block.
>
> Reportedly, in the August 16 version, endianess problems
> affected the MD5 IV, therefore the messages given could not be
> used to find a collision on normal MD5.
That's it. I had switched the endianness, as above, but I must have
the "August 16" paper. Using "Md5R", which is a copy of Md5 with the
initialization vector's endianness reversed, I get from that data:
[~/hashcollisions] hex <MD5coll1b
000000 .881.2)...0..... DD383831 C73229FC 17B730C0 AE1BFCBA
000010 ..sf....Y....=@. D7A87366 16F4DC9D 5908D785 B03D4099
000020 ..4..`s...X....! D1AD3406 046073C0 1FBD5895 8209E121
000030 ....in.j........ 0BC994CA 696EAE6A F11BF6CB 15E6B006
000040 ....a...b....... 8BD4822E 61F1BD16 62BD10CE 9D80C6C3
000050 9Vt.......seS... 395674B6 C7060EFC 14A97365 53D7F0BE
000060 U.{S..{I.Y.Fz4z} 55877B53 E8927B49 C259F546 7A347A7D
000070 ....h...YE..7... B1D81105 68EBEB98 5945CAC9 37E010EB
[~/hashcollisions] hex <MD5coll2b
000000 .881.2)...0..... DD383831 C73229FC 17B730C0 AE1BFCBA
000010 ..s.....Y....=@. D7A873E6 16F4DC9D 5908D785 B03D4099
000020 ..4..`s...X....! D1AD3406 046073C0 1FBD5895 8289E121
000030 ....in.j...K.... 0BC994CA 696EAE6A F11BF64B 15E6B006
000040 ....a...b....... 8BD4822E 61F1BD16 62BD10CE 9D80C6C3
000050 9Vt6......seS... 39567436 C7060EFC 14A97365 53D7F0BE
000060 U.{S..{I.Y.Fz.y} 55877B53 E8927B49 C259F546 7AB4797D
000070 ....h...YE.I7... B1D81105 68EBEB98 5945CA49 37E010EB
The two messages differ at offsets 0013, 002D, 003B, 0053, 006D and 007B.
[~/hashcollisions] java Md5R <MD5coll1b
000000 .....Ee..s..q`.. 8ADA1581 C24565AD AC73A2D2 7160CA90
[~/hashcollisions] java Md5R <MD5coll2b
000000 .....Ee..s..q`.. 8ADA1581 C24565AD AC73A2D2 7160CA90
It's a collision, all right, but the hash value is off from the value
reported in the paper, which is 21f15d09 3ef611d2 f9f09bfb 86b9cadf
> That the authors are able to correct this in one day does
> prove their computational cost to find a collision with
> whatever message header is low.
--Mike Amling
> The two messages differ at offsets 0013, 002D, 003B, 0053, 006D and 007B.
>
> [~/hashcollisions] java Md5R <MD5coll1b
> 000000 .....Ee..s..q`.. 8ADA1581 C24565AD AC73A2D2 7160CA90
> [~/hashcollisions] java Md5R <MD5coll2b
> 000000 .....Ee..s..q`.. 8ADA1581 C24565AD AC73A2D2 7160CA90
>
> It's a collision, all right, but the hash value is off from the value
> reported in the paper, which is 21f15d09 3ef611d2 f9f09bfb 86b9cadf
I'd guess (and it's only a guess since I didn't try it) that the
difference is in padding. From they way they presented them, I'd
guess that they didn't do the MD5 padding, so their hash values are
simply the output of the second compression function. Yours is a
regular MD5 call, which includes padding -- but since the messages
have the same length, the padding in both cases is the same, so you
still have a collision. Just a different value than the pre-padding
hash.
http://www.md5crk.com/md5col.zip
JLC - ending MD5CRK in 48hrs, DB will be available.
> --
--
Good guess. I added a little tracing code and
[~/hashcollisions] java Md5R <MD5coll1b
Block compression function output is:
000000 SL..1...J.n.d.bx 534C84E8 31B2D2CE 4A0E6ED3 640C6278
Block compression function output is:
000000 !.].........>... 21F15D09 86B9CADF F9F09BFB 3EF611D2
Block compression function output is:
000000 .....eE...s...`q 8115DA8A AD6545C2 D2A273AC 90CA6071
000000 .....Ee..s..q`.. 8ADA1581 C24565AD AC73A2D2 7160CA90
Except for the unexplained word order, the output from the second
call to the block compression function matches the hash value reported
in the August 16 paper.
Note the similarity in the first block's output from the two messages.
[~/hashcollisions] j Md5R < MD5coll2b
Block compression function output is:
000000 .L........n...bx D34C84E8 B3B2D2CE CC0E6ED3 E60C6278
Block compression function output is:
000000 !.].........>... 21F15D09 86B9CADF F9F09BFB 3EF611D2
Block compression function output is:
000000 .....eE...s...`q 8115DA8A AD6545C2 D2A273AC 90CA6071
000000 .....Ee..s..q`.. 8ADA1581 C24565AD AC73A2D2 7160CA90
--Mike Amling
> Wow. I just verified this. I have no idea what their technique is (I
> looked at the paper, and it just gives the data -- no info on
> technique)
After listing to the Crypto Rump-session talk, I also have no idea.
There's something of a language barrier, but there's also much
interest in translating and explaining the technique, so I expect
we'll soon have expositions we can understand.
--
--Bryan
I have been playing around with the MD5 collision. It seems they have
attacked 2 round MD5. Here is the MD5 state for the two input blocks of the
first collision reported in the paper after each assignment to the 4 32-bit
registers (i.e. after every 4th statement in MD5Transform() from
http://www.faqs.org/rfcs/rfc1321.html with debugging statements added).
First file:
A=67452301 B=EFCDAB89 C=98BADCFE D=10325476 (initial state)
A=13B8CFF6 B=B72F3B38 C=02273333 D=4C9A648D
A=88400025 B=01B11540 C=03FEF820 D=027FBC41
A=FBB03F3D B=02A81F27 C=1E71CED2 D=737FD864
A=715FF408 B=2D8B8334 C=715D8A20 D=14CBF5F8
A=6F1CEF26 B=11D55A6A C=65FC5E8E D=758E4828
A=69A9E2CA B=9B5D31D2 C=7E7D7849 D=72B1B3A4
A=2F010227 B=CBB2BD1A C=8B0E92EE D=7CB6CFD6
A=58DD88C2 B=8F25D190 C=9F30104A D=539B9C53 (after 2 rounds)
A=0D6C2B84 B=133FACE4 C=62BF3F11 D=EEC06B79
A=78EEFE75 B=A6A68057 C=F8F16F22 D=94858CD3
A=69116DD7 B=806F50BD C=A9BC1267 D=BC5B73A1
A=32958A0A B=163E312A C=B8379D04 D=4BD07F3F
A=FF40F477 B=A4D917B4 C=D4DDB70A D=C84A8080
A=9D73259F B=9E00CBE7 C=BFF44C18 D=8B4EFCDE
A=C3771C7C B=5DAFFF93 C=A05AD76E D=A1B47584
A=EB137023 B=40C62C41 C=914BFF56 D=10936990 (after 4 rounds)
A=52589324 B=3093D7CA C=2A06DC54 D=20C5BE06 (actually the state)
A=52589324 B=3093D7CA C=2A06DC54 D=20C5BE06
A=C43392D7 B=BA44E410 C=BF5FAD7E D=CD933FD9
A=4B2FAE50 B=8472D7F0 C=B6C59E25 D=24A2CC56
A=FC7939D9 B=8F0817B0 C=8F2DC7D3 D=8F8BCBFC
A=3F4FFA1A B=2E10DDAE C=7D512CB7 D=40FBF2BC
A=0D589E8F B=0B8CA638 C=655D9844 D=0AB2157D
A=15F188C4 B=D66B0227 C=1FEDF438 D=01726672
A=1A62C2AF B=56E3759B C=5264A1BE D=47FB139D
A=7727CF6C B=CEB2627F C=DF4847E7 D=CBF31C8C
A=B1453DA4 B=14DF3C14 C=B0278E38 D=CBDB3111
A=2F0E0C2A B=AC32F0E1 C=F3C1A164 D=889DB784
A=7AF655EA B=AFBAF231 C=8DF463F2 D=82727135
A=B608B6CF B=8D2690A4 C=71B791DA D=B34155A9
A=632A85F1 B=219BBE99 C=30521929 D=0AA41192
A=186ADBF3 B=41F78B13 C=0871662A D=027E8DA8
A=083F5850 B=E02B50A2 C=771B6B65 D=768D66DA
A=43AA82FB B=727BC5F5 C=755F2368 D=D35A09E9
A=9603161F B=A30F9DBF C=9F65FFBC D=F41FC7EF (state after 2 blocks)
Second file:
A=67452301 B=EFCDAB89 C=98BADCFE D=10325476
A=13B8CFF6 B=B72F3B38 C=02273333 D=4C9A648D
A=883FFFE5 B=012E9541 C=FC7EF7DF D=82FFBC01
A=7BB03EFE B=82A7FEA7 C=DE71CED2 D=F37FE864
A=F25FF408 B=8D8B8334 C=F15D0A28 D=94CBF5F8
A=EF1CEF26 B=91D55A6A C=E5FE5E8E D=F58E4828
A=E9A9E2CA B=9B5D31D2 C=7E7D7849 D=F2B1B3A4
A=2F010227 B=CBB2BD1A C=8B0E92EE D=7CB6CFD6
A=58DD88C2 B=8F25D190 C=9F30104A D=539B9C53
A=0D6C2B84 B=933FACE4 C=E2BF3F11 D=EEC06B79
A=F8EEFE75 B=26A68057 C=78F16F22 D=14858CD3
A=E9116DD7 B=006F50BD C=29BC1267 D=3C5B73A1
A=B2958A0A B=963E312A C=38379D04 D=CBD07F3F
A=7F40F477 B=24D917B4 C=54DDB70A D=484A8080
A=1D73259F B=1E00CBE7 C=3FF44C18 D=0B4EFCDE
A=43771C7C B=DDAFFF93 C=205AD76E D=21B47584
A=6B137023 B=C2C62C41 C=134BFF56 D=92936990
A=D2589324 B=B293D7CA C=AC06DC54 D=A2C5BE06
A=D2589324 B=B293D7CA C=AC06DC54 D=A2C5BE06
A=463392D7 B=3C44E42E C=4160B59E D=4F933FF9
A=CB2FB191 B=03F157F0 C=2EC59DE5 D=A491CC56
A=7C793A1A B=0F07F730 C=0F2DC7D3 D=0F8BDBFC
A=C04FFA1A B=8E10DDAE C=FD51ACBF D=C0FBF2BC
A=8D589E8F B=8B8CA638 C=E55F9844 D=8AB2157D
A=95F188C4 B=D66B0227 C=1FEDF438 D=81726672
A=1A62C2AF B=56E3759B C=5264A1BE D=47FB139D
A=7727CF6C B=CEB2627F C=DF4847E7 D=CBF31C8C
A=B1453DA4 B=94DF3C14 C=30278E38 D=CBDB3111
A=AF0E0C2A B=2C32F0E1 C=73C1A164 D=089DB784
A=FAF655EA B=2FBAF231 C=0DF463F2 D=02727135
A=3608B6CF B=0D2690A4 C=F1B791DA D=334155A9
A=E32A85F1 B=A19BBE99 C=B0521929 D=8AA41192
A=986ADBF3 B=C1F78B13 C=8871662A D=827E8DA8
A=883F5850 B=602B50A2 C=F71B6B65 D=F68D66DA
A=C3AA82FB B=F07BC5F5 C=F35F2368 D=515A09E9
A=9603161F B=A30F9DBF C=9F65FFBC D=F41FC7EF
Below, a * is shown where the hex digits differ:
A=67452301 B=EFCDAB89 C=98BADCFE D=10325476
A=13B8CFF6 B=B72F3B38 C=02273333 D=4C9A648D
A=88*****5 B=01***54* C=***EF*** D=*2*FBC*1
A=*BB03*** B=*2A****7 C=*E71CED2 D=*37F*864
A=**5FF408 B=*D8B8334 C=*15D*A2* D=*4CBF5F8
A=*F1CEF26 B=*1D55A6A C=*5F*5E8E D=*58E4828
A=*9A9E2CA B=9B5D31D2 C=7E7D7849 D=*2B1B3A4
A=2F010227 B=CBB2BD1A C=8B0E92EE D=7CB6CFD6
A=58DD88C2 B=8F25D190 C=9F30104A D=539B9C53 (collision after 2 rounds)
A=0D6C2B84 B=*33FACE4 C=*2BF3F11 D=EEC06B79
A=*8EEFE75 B=*6A68057 C=*8F16F22 D=*4858CD3
A=*9116DD7 B=*06F50BD C=*9BC1267 D=*C5B73A1
A=*2958A0A B=*63E312A C=*8379D04 D=*BD07F3F
A=*F40F477 B=*4D917B4 C=*4DDB70A D=*84A8080
A=*D73259F B=*E00CBE7 C=*FF44C18 D=*B4EFCDE
A=*3771C7C B=*DAFFF93 C=*05AD76E D=*1B47584
A=*B137023 B=**C62C41 C=**4BFF56 D=**936990
A=*2589324 B=**93D7CA C=**06DC54 D=**C5BE06
A=*2589324 B=**93D7CA C=**06DC54 D=**C5BE06
A=**3392D7 B=**44E4** C=*******E D=**933F*9
A=*B2F**** B=*****7F0 C=**C59**5 D=*4**CC56
A=*C793*** B=*F0**7*0 C=*F2DC7D3 D=*F8B*BFC
A=**4FFA1A B=*E10DDAE C=*D51*CB* D=*0FBF2BC
A=*D589E8F B=*B8CA638 C=*55*9844 D=*AB2157D
A=*5F188C4 B=D66B0227 C=1FEDF438 D=*1726672
A=1A62C2AF B=56E3759B C=5264A1BE D=47FB139D
A=7727CF6C B=CEB2627F C=DF4847E7 D=CBF31C8C (another collision after 2
rounds)
A=B1453DA4 B=*4DF3C14 C=*0278E38 D=CBDB3111
A=*F0E0C2A B=*C32F0E1 C=*3C1A164 D=*89DB784
A=*AF655EA B=*FBAF231 C=*DF463F2 D=*2727135
A=*608B6CF B=*D2690A4 C=*1B791DA D=*34155A9
A=*32A85F1 B=*19BBE99 C=*0521929 D=*AA41192
A=*86ADBF3 B=*1F78B13 C=*871662A D=*27E8DA8
A=*83F5850 B=*02B50A2 C=*71B6B65 D=*68D66DA
A=*3AA82FB B=**7BC5F5 C=**5F2368 D=**5A09E9
A=9603161F B=A30F9DBF C=9F65FFBC D=F41FC7EF
Notice that the states are identical after 2 rounds for both input blocks.
Also, the states differ only by a few high-order bits of each register at
all times, except in the first round.of each block.
I didn't show the third block (containing padding) because once the internal
state collides, the rest can be anything you want (but identical in both
files) and you still have a collision. In case you were curious, the output
is:
A=5CD3C0A4 B=803AA695 C=7D361559 D=51B7E6CF
MD5 (md5_collision_1) = a4c0d35c95a63a805915367dcfe6b751
A=5CD3C0A4 B=803AA695 C=7D361559 D=51B7E6CF
MD5 (md5_collision_2) = a4c0d35c95a63a805915367dcfe6b751
Note that the hash value reported in the paper is actually the state after 2
input blocks, before hashing the padding.
-- Matt Mahoney