[PATCH] Cosmetic updates to MD5 library

Nick Glencross

May 9, 2005, 6:40:09 PM
to Perl 6 Internals
Guys,

this patch makes some small updates to the MD5 files.

* Remove some code which was retained in case changes for 64-bit
processors didn't work

* Convert some macro temps to be .locals in calling function

* Omit 'library' path in load_bytecode calls

* General cleanup

Leo has previously reported that the wrong checksum is produced on
big-endian systems, but I don't have access to one to investigate. Offers
kindly accepted.
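
The thread never diagnoses the big-endian failure. One common cause in MD5 ports, offered here only as an assumption, is decoding the input words in host byte order, whereas MD5 defines its 32-bit words as little-endian (low-order byte first). A minimal Python sketch of the difference; the helper names are hypothetical and not taken from the Parrot library:

```python
import struct
import sys

def words_le(block: bytes) -> tuple:
    """Decode a 64-byte MD5 block as sixteen little-endian 32-bit words."""
    return struct.unpack("<16I", block)

def words_native(block: bytes) -> tuple:
    """Decode in host byte order; agrees with words_le only on little-endian hosts."""
    return struct.unpack("=16I", block)

block = bytes(range(64))  # bytes 0x00, 0x01, ..., 0x3f

# Bytes 00 01 02 03 read low-order byte first give 0x03020100.
print(hex(words_le(block)[0]))

# True on little-endian hosts, False on big-endian ones.
print(sys.byteorder, words_le(block) == words_native(block))
```

Decoding with an explicit byte order, rather than relying on the host's, gives the same words on every platform.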

[I know that MD5 is best implemented in C, but this library has been
useful in shaking out a few issues in the past, and is now useful for
checking for regressions, especially in the GC and JIT. Breakage is very
easy to spot! It would also be interesting for benchmarking, but I
haven't got around to trying]

Regards,

Nick

md5.patch

Nick Glencross

May 9, 2005, 7:32:22 PM
to Perl 6 Internals
Nick Glencross wrote:

> Guys,
>
> this patch makes some small updates to the MD5 files.

> ...


> It would also be interesting for benchmarking, but I haven't got
> around to trying


As a rough comparison running the md5sum.imc located in the examples
directory (on Linux/AMD Athlon), I get:

Empty file (to measure startup and assemble):
Default Core: 0.1s
JIT Core: 0.1s

2MB file:
Default Core: 19 seconds
JIT Core: 1.9 seconds

Which pretty much matches the 10-fold JIT speedup that I suggested in
the comments. [Things were run a couple of times first to allow the
executable and files to get cached.]

Still some way off the OS md5sum, which is typically 0.15 seconds, about
12x quicker. That may sound quite a bit, but much of it can probably be
accounted for by inefficiencies in my conversion to parrot code (a
slightly awkward rol, and perhaps the manipulation of the arrays).
Contrary to what I keep being told by Java programmers, you're never
going to be as fast as optimised C code, let's face it.

Regards,

Nick

Nick Glencross

May 9, 2005, 7:45:27 PM
to jerry gay, Perl 6 Internals
jerry gay wrote:

>On 5/9/05, Nick Glencross <ni...@glencros.demon.co.uk> wrote:
>
>
>>- load_bytecode "library/Digest/MD5.imc"
>>+ load_bytecode "Digest/MD5.imc"
>>
>>
>
>the '.imc' extension has recently fallen out of favor and is being
>replaced with '.pir'. otherwise, looks good and works on
>win32--msvc-7.1--perl-5.8.6.
>
Thanks. I might have to ask a kind soul to do me a 'svn rename' as well
then. I've recently tested it on win32-cygwin, and it works there too.

Nick

Jerry Gay

May 9, 2005, 7:39:06 PM
to Nick Glencross, Perl 6 Internals
On 5/9/05, Nick Glencross <ni...@glencros.demon.co.uk> wrote:
> - load_bytecode "library/Digest/MD5.imc"
> + load_bytecode "Digest/MD5.imc"

the '.imc' extension has recently fallen out of favor and is being
replaced with '.pir'. otherwise, looks good and works on
win32--msvc-7.1--perl-5.8.6.

~jerry

Leopold Toetsch

May 10, 2005, 3:18:58 AM
to Nick Glencross, perl6-i...@perl.org
Nick Glencross <ni...@glencros.demon.co.uk> wrote:

> this patch makes some small updates to the MD5 files.

Thanks, applied.
leo

Leopold Toetsch

May 10, 2005, 3:12:41 AM
to Nick Glencross, perl6-i...@perl.org
Nick Glencross <ni...@glencros.demon.co.uk> wrote:

> Still some way off the OS md5sum, which is typically 0.15 seconds, about
> 12x quicker. That may sound quite a bit, but much of it can probably be
> accounted for by inefficiencies in my conversion to parrot code (a
> slightly awkward rol, and perhaps the manipulation of the arrays).

A new opcode C<rol> would certainly help, yes. It would replace 4
instructions (each roughly executed once per char) with one instruction.

A JITted C<rol> opcode should give a speed up of one fourth - which is a
lot.
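
For illustration, a 32-bit rotate-left built from primitive operations looks roughly like this in Python. The exact four Parrot instructions are not shown in this thread, so the shift, shift, mask, or decomposition below is an assumption, and `rol32` is a hypothetical name:

```python
MASK32 = 0xFFFFFFFF  # keep results within 32 bits

def rol32(x: int, n: int) -> int:
    """Rotate the 32-bit value x left by n bits, for 0 <= n < 32."""
    x &= MASK32
    left = (x << n) & MASK32   # bits shifted up, overflow discarded
    right = x >> (32 - n)      # bits that wrapped around to the bottom
    return left | right

print(hex(rol32(0x12345678, 8)))
```

A dedicated rotate opcode would collapse these separate steps into a single instruction, which is the saving being discussed.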

Another problem is that MD5.imc doesn't process the file blockwise.
md5sum.imc first slurps the whole file into a string, then converts it
to a word array, and only in a third step processes it. The cost of
this gets worse with bigger files.
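
The difference between slurping and blockwise processing can be sketched in Python with the standard `hashlib` module. This only illustrates the idea; it says nothing about how MD5.imc is structured internally, and the function names and the 64 KiB chunk size are arbitrary choices for the example:

```python
import hashlib
import io

def md5_slurp(stream) -> str:
    """Read the whole input into memory, then hash it in one go."""
    data = stream.read()
    return hashlib.md5(data).hexdigest()

def md5_blockwise(stream, chunk_size: int = 64 * 1024) -> str:
    """Hash the input chunk by chunk, holding at most chunk_size bytes at once."""
    digest = hashlib.md5()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
    return digest.hexdigest()

payload = b"abc" * 100_000
# Both approaches yield the same digest; only peak memory use differs.
print(md5_slurp(io.BytesIO(payload)) == md5_blockwise(io.BytesIO(payload)))
```

With the blockwise version, memory use stays bounded by the chunk size regardless of file size.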

> Contrary to what I keep being told by Java programmers, you're never
> going to be as fast as optimised C code, let's face it.

There are always some optimizations still.

> Regards,

> Nick

leo

Leopold Toetsch

May 10, 2005, 4:11:53 AM
to Dino Morelli, perl6-i...@perl.org
Dino Morelli <dmor...@reactorweb.net> wrote:

> I modified some of the .pod files in imcc/docs/ to reflect using .pir
> instead of .imc

Thanks, applied.
leo

Dino Morelli

May 10, 2005, 9:11:15 AM
to Leopold Toetsch, perl6-i...@perl.org

Thank you, leo.


Some of these .pod files are used by the website, down in
http://www.parrotcode.org/docs/imcc

Should I change anything else to make the new documents used by the
site? At the moment, they don't match.

-Dino

--
.~. Dino Morelli
/V\ email: dmor...@reactorweb.net
/( )\ weblog: http://categorically.net/d/blog/
^^-^^ preferred distro: Debian GNU/Linux http://www.debian.org

Leopold Toetsch

May 10, 2005, 11:06:24 AM
to Nick Glencross, perl6-i...@perl.org
Leopold Toetsch wrote:
> Nick Glencross <ni...@glencros.demon.co.uk> wrote:
>
>
>>Still some way off the OS md5sum, which is typically 0.15 seconds, about
>>12x quicker. That may sound quite a bit, but much of it can probably be
>>accounted for by inefficiencies in my conversion to parrot code (a
>>slightly awkward rol, and perhaps the manipulation of the arrays).
>
>
> A new opcode C<rol> would certainly help, yes. It would replace 4
> instructions (each roughly executed once per char) with one instruction.
>
> A JITted C<rol> opcode should give a speed up of one fourth - which is a
> lot.

I was a bit too optimistic with my assumptions here. I have now
implemented a C<rot> opcode, and implemented the one signature that MD5
uses as a JIT opcode for x86. But the speedup is much smaller: around 5%.

md5sum of perl-5.8.0.tar.gz size=11023084

md5sum         0.11 user  0.20 real
parrot -j      2.63 user  2.68 real
parrot -j rot  2.57 user  2.75 real

The problem with md5 code and Parrot JIT seems to be related to the
register allocator. md5 code is one big basic block of integer code. As
we don't do any register renaming, the CPU-register usage especially on
x86 is suboptimal.

leo

Nick Glencross

May 11, 2005, 6:05:38 PM
to Perl 6 Internals, Leopold Toetsch
Leopold Toetsch wrote:

> I have now implemented a C<rot> opcode, and implemented the one signature
> that MD5 uses as a JIT opcode for x86. But the speedup is much smaller:
> around 5%.

Thanks!

> The problem with md5 code and Parrot JIT seems to be related to the
> register allocator. md5 code is one big basic block of integer code.
> As we don't do any register renaming, the CPU-register usage
> especially on x86 is suboptimal.

I must admit that I thought that one basic block of integer code would
run pretty well (that's why I inlined it all). It touches lots of
registers because its job is to do lots of mashing of bits!

Comparing the performance with the OS md5sum was more just a curiosity;
it's not really a fair comparison.

Cheers Leo,

Nick

Leopold Toetsch

May 12, 2005, 7:10:55 AM
to Nick Glencross, Perl 6 Internals
Nick Glencross wrote:

> Having looked into it a little further, it actually looks like the 'ord'
> operation is a significant part of the (JIT) run, perhaps as much as
> 50%, which seems disproportionately high... (you can just comment out
> the ord to see this)

Yeah, string_ord() isn't really fast, but ...

I wrote:
| There are always some optimizations still.

... so I just fully inlined the fast path for fixed8 encoding in the JIT.

And compiling an --optimize'd Parrot helps a lot too. Here are fresh
numbers:

md5sum of perl-5.8.0.tar.gz size=11023084

parrot -j                 2.63 user   2.68 real  # unoptimized build
parrot -j rot             2.57 user   2.75 real  # unopt. build, rot opcode

parrot -j ord             1.10 user   1.26 real  # opt. build rev 8072

md5sum                    0.11 user   0.20 real  # C
Digest::Perl::MD5                    29.00 real  # perl
Digest::MD5                           0.50 real  # XS
Tools/scripts/md5sum.py               0.44 real  # C

> Nick

leo

Leopold Toetsch

May 12, 2005, 12:01:34 PM
to Nick Glencross, Perl 6 Internals
Leopold Toetsch wrote:

> parrot -j ord 1.10 user 1.26 real # opt. build rev 8072

and another 20% by inlining the fast path of the array get (set_i_p_ki)

0.91 user 1.09 real # opt build rev 8076

Now it's your turn to process files blockwise ;-)

leo
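
The user times quoted in this thread can be put on a common scale by dividing each by the md5sum figure. A small Python sketch using only the numbers posted above; the dictionary labels are made up for the example, and the last entry is the unlabelled rev 8076 timing:

```python
# User-mode seconds for perl-5.8.0.tar.gz, copied from the tables in this thread.
user_seconds = {
    "parrot -j (unoptimized build)": 2.63,
    "parrot -j rot (unoptimized build)": 2.57,
    "parrot -j ord (optimized, rev 8072)": 1.10,
    "parrot -j array get (optimized, rev 8076)": 0.91,
}
MD5SUM_USER = 0.11  # C md5sum on the same file

# How many times slower each run is than the C md5sum.
slowdown = {name: t / MD5SUM_USER for name, t in user_seconds.items()}

for name, factor in slowdown.items():
    print(f"{name}: {factor:.1f}x md5sum")
```

These ratios are only as good as the single-run timings they are derived from.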
