Making md5 checksum of files faster

Alexandru

Mar 4, 2017, 1:58:01 PM
Hi,

Currently I'm using the md5 package to compute the MD5 checksum of file contents.

For very large files, the time it needs to compute the checksum becomes noticeable.

Would a pure C implementation be significantly faster than the Tcl package?

Thanks.
Alexandru

Rolf Ade

Mar 4, 2017, 7:19:47 PM
Hmm... you're aware that the tcllib md5 package comes with a C
implementation that is used if possible (that is, if critical is
available)? Btw: you don't need critical on every client system. It's
enough to have it on your development system. Just deploy the small
binary extension that critical creates for you.

Could you provide some numbers: Size of the file, time to calculate the
md5 sum?

Alexandru

Mar 5, 2017, 3:06:55 AM
On Sunday, March 5, 2017 at 01:19:47 UTC+1, Rolf Ade wrote:
No, I wasn't aware. Do you mean Critcl instead of critical?

Here are the timings:

file size:119654985
checksum compute time:2731849 microseconds per iteration
file size:120518142
checksum compute time:2887563 microseconds per iteration
file size:147957277
checksum compute time:3509820 microseconds per iteration
file size:6415
checksum compute time:942 microseconds per iteration

Christian Gollwitzer

Mar 5, 2017, 3:29:08 AM
On 05.03.17 at 09:06, Alexandru wrote:
> Here are the timings:
>
> file size:119654985
> checksum compute time:2731849 microseconds per iteration

120 Mbyte in 0.27s ? That is definitely running in C. For comparison,
you can use the md5sum command line tool, which is standard on Linux (or
md5 on OSX). On my machine,

Apfelkiste:Sources chris$ ls -lh VolViewer.dmg
-rw-r--r--@ 1 chris staff 174M 28 Feb 13:25 VolViewer.dmg
Apfelkiste:Sources chris$ time md5 VolViewer.dmg
MD5 (VolViewer.dmg) = 89da2625c578d69627622302bde180d4

real 0m0.456s
user 0m0.435s
sys 0m0.075s

so your numbers are even a bit faster, probably you have a faster
machine (this is on a MacBook).

How often do you need to recompute MD5? Alternatively, you can check
"file mtime". If that hasn't changed, it is unlikely that the file
content is different.
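For a C extension, the same "skip the hash if nothing changed" idea could be sketched with stat(). This is only an illustration under assumptions: the file_stamp struct and both function names are hypothetical, and comparing size plus modification time is a heuristic, not proof that the content is unchanged.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical cache record: what we remembered about a file last time. */
typedef struct {
    long long size;   /* size in bytes */
    long long mtime;  /* modification time, seconds since the epoch */
} file_stamp;

/* Fill *out from the file system; returns 0 on success, -1 on error. */
int stamp_file(const char *path, file_stamp *out) {
    struct stat st;
    assert(out != NULL);
    if (stat(path, &st) != 0) return -1;
    out->size = (long long)st.st_size;
    out->mtime = (long long)st.st_mtime;
    return 0;
}

/* Returns 1 if the checksum should be recomputed, 0 if a cached
   checksum can be reused because size and mtime are unchanged. */
int needs_rehash(const char *path, const file_stamp *cached) {
    file_stamp now;
    if (stamp_file(path, &now) != 0) return 1;  /* unreadable: be conservative */
    return now.size != cached->size || now.mtime != cached->mtime;
}
```

A caller would store the file_stamp next to the cached checksum and only run MD5 again when needs_rehash() returns 1.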

Christian

Alexandru

Mar 5, 2017, 4:07:40 AM
Hi Christian,

Unless I'm totally mistaken, the timing is 2.7 seconds...

Alexandru

Mar 5, 2017, 5:54:36 AM
On Sunday, March 5, 2017 at 01:19:47 UTC+1, Rolf Ade wrote:
> Hmm... - you're aware, that the tcllib md5 package comes with a C
> implementation, that is used if possible (that means: if critical is
> available)? Btw: you don't need critical on every client system. It's
> enough, to have it on your development system. Just deploy the mini
> binary extention, that critical creates for you.

I have critcl and I still see this slow performance. Do I have to call ::md5::md5c instead of ::md5::md5? If I do, I get an empty string.

Rich

Mar 5, 2017, 11:28:43 AM
If critcl was installed when tcllib was "built" and installed (and
tcllib's build scripts located critcl), then the build script
automatically compiles the C extensions in tcllib and sets up to
automatically load the C extensions instead of the pure Tcl variants.

But if the tcllib build script did not find critcl, you only get the
pure Tcl variants.

But you don't have to mess with doing anything different from Tcl
scripts, the package scripts take care of all of that.

So if you don't have the C extensions compiled, it is probably because
critcl was not found by tcllib's build scripts at the time tcllib was
built.

So you might just be able to rebuild and reinstall tcllib and get the C
extensions (assuming tcllib's builder finds critcl installed).

Alexandru

Mar 5, 2017, 12:16:10 PM
On Sunday, March 5, 2017 at 17:28:43 UTC+1, Rich wrote:
That might be it: I use Paul Obermeier's BAWT framework to compile Tcl/Tk. I'll see what I can do to get it working.

Alternatively: could I simply copy the C code from md5c.tcl into my own C extension? It might be even faster that way.

Paul Obermeier

Mar 5, 2017, 1:59:53 PM
BAWT's tcllib is critcl enabled.
I just compared md5::md5 against the Windows program md5sums (
http://www.pc-tools.net/win32/md5sums/ ) and didn't notice a speed
difference.

Paul

Alexandru

Mar 6, 2017, 1:48:56 AM
Perhaps I'm missing something here. I read the content of a 200 MB file using [read]. Then I pass the file content to the md5 command. It takes almost 5 seconds to compute. Am I doing something wrong that could cause the bad performance? How can I check whether md5 is using the C function or not?

Rich

Mar 6, 2017, 6:18:50 AM
For tcllib, after package require of md5, see if you have a command
named ::md5::md5c or see if the contents of ::md5::accell(critcl) is
true.

Alexandru

Mar 6, 2017, 6:27:14 AM
On Monday, March 6, 2017 at 12:18:50 UTC+1, Rich wrote:
::md5::md5c is available but ::md5::accell(critcl) is not:

puts $::md5::accell(critcl)
can't read "::md5::accell(critcl)": no such variable

Rich

Mar 6, 2017, 11:59:57 AM
Sorry, typo, it is "accel" (one "l").

You know, you can read the source of the md5 module in tcllib and find
all of these details out yourself.

Alexandru

Mar 6, 2017, 12:05:24 PM
On Monday, March 6, 2017 at 17:59:57 UTC+1, Rich wrote:
> Sorry, typo, it is "accel" (one "l").

The variable is available and has the value 1.

> You know, you can read the source of the md5 module in tcllib and find
> all of these details out yourself.

Yes, I know that. I don't want to put you to work for me. As soon as I have the time, I'll debug the source code myself. I just wanted to know whether the behavior is abnormal. Thank you for your help. I now know it's abnormal. As soon as I know more about the cause of the problem, I'll be able to ask more targeted questions.

Rich

Mar 6, 2017, 12:13:48 PM
Alexandru <alexandr...@meshparts.de> wrote:
> Variable is available and has value 1.

That says you should be using the C extension then.

So your performance should be something reasonably close to what
'md5sum' produces from the command line (assuming you are performing
reasonably large buffered reads vs. reading one byte at a time).

Alexandru

Mar 6, 2017, 12:18:40 PM
Yes, I know this now thanks to your help. Or at least I know it should be fast. I'm not so sure that it uses the C extension. If it did, it should be much faster.

As I already wrote, I use [read] to read the whole file content at once. I don't know whether this is good or bad for performance. At least reading should be faster with a single [read] than reading line by line.

Alexandru

Mar 6, 2017, 1:02:27 PM
I'm on a faster machine now.
I measured the time needed to read the file content into a string and the time to actually compute the checksum:

Reading: 1333826 microseconds per iteration
Checksum: 2020911 microseconds per iteration

So reading does take a significant amount of time, but according to your timings the checksum itself should be significantly faster. I'll dig further.

Christian Gollwitzer

Mar 6, 2017, 1:23:10 PM
On 06.03.17 at 19:02, Alexandru wrote:
> I'm on a faster machine now.
> I measure the time needed to read the file content into a string an to actually compute the checksum:
>
> Reading: 1333826 microseconds per iteration
> Checksum: 2020911 microseconds per iteration
>
> So reading does take an important amount of time,

Reading does encoding translation unless you read the file in binary mode.
If you are not doing that yet, then fconfigure $chan -translation binary -encoding binary
(with $chan being your open channel) should provide a speedup.


> but checksum should be according to your timing significantly faster. I'll dig further.

If the raw checksumming is slower than md5sum, it could be that the code
in tcllib is suboptimal, or that it is doing encoding conversion again.

Christian


Alexandru

Mar 6, 2017, 1:26:34 PM
Say, should I include the md5 source code when compiling Tcl, or is it enough to add the package later, with the C extension being created at "package require"?

Alexandru

Mar 6, 2017, 1:29:23 PM
Could you run this code on a 200 MB file:

package require md5
proc FileContentChecksum {path} {
    puts FileRead:[time {
        set code [catch {set fid [open $path r]} err]
        if {$code} {
            return ""
        }
        set content [read $fid]
        close $fid
    } 1]
    # return [StringChecksum $content]
    puts Checksum:[time {::md5::md5 -hex $content} 1]
    puts FileSize:[file size $path]
}
FileContentChecksum $MyLargeFilePath

Christian Gollwitzer

Mar 6, 2017, 1:59:21 PM
On 06.03.17 at 19:29, Alexandru wrote:
> Could you run this code for a 200MB large file:
> package require md5
> proc FileContentChecksum {path} {
> puts FileRead:[time {
> set code [catch {set fid [open $path r]} err]

why "r"? Try "rb" or the fconfigure that I suggested.


> if {$code} {
> return ""
> }
> set content [read $fid]
> close $fid
> } 1]
> # return [StringChecksum $content]
> puts Checksum:[time {::md5::md5 -hex $content} 1]
> puts FileSize:[file size $path]
> }
> FileContentChecksum $MyLargeFilePath
>

I don't have acceleration ready here, so it'll make no sense. Without
acceleration, md5 is 1000 (sic!) times slower than md5 on the command
line; it took ~0.02s for a file, whereas Tcl md5 takes 47s

Christian

Christian Gollwitzer

Mar 6, 2017, 2:03:38 PM
On 06.03.17 at 19:59, Christian Gollwitzer wrote:
> On 06.03.17 at 19:29, Alexandru wrote:
>> Could you run this code for a 200MB large file:
>> package require md5
>> proc FileContentChecksum {path} {
>> puts FileRead:[time {
>> set code [catch {set fid [open $path r]} err]
>
> why "r"? Try "rb" or the fconfigure that I suggested.
>

BTW: Are you aware that md5 has an option -file to read a file and
compute the checksum? It also does the encoding binary thing correctly.

Christian

Paul Obermeier

Mar 6, 2017, 3:13:27 PM
That's what I wanted to propose, too.

> md5sums -e -b -s GeographicLibData.7z

GeographicLibData.7z
e51212894970c03873359d6c339090ed

133340832 bytes, 499 ms = 254.84 MB/sec


% time "md5::md5 -file GeographicLibData.7z -hex"
551656 microseconds per iteration

% puts [string tolower [md5::md5 -file GeographicLibData.7z -hex]]
e51212894970c03873359d6c339090ed


Paul

Alexandru

Mar 6, 2017, 3:30:00 PM
Nice. With the -file option it's significantly faster: 1.7 seconds instead of 3.3 seconds before.

It's still not as fast as it should be, though... I'm still not sure it uses the C extension.

Paul Obermeier

Mar 6, 2017, 4:05:28 PM
If disabling critcl when building tcllib (set buildTcllibC false in
tcllib.bawt), I get the following timing:
% time "md5::md5 -file GeographicLibData.7z -hex"
133288874 microseconds per iteration

That's more than 2 minutes in pure Tcl vs. 0.5 seconds in critcl mode.
So your tcllib implementation surely uses the C extension!

Paul

Alexandru

Mar 6, 2017, 7:24:19 PM
I went through the code and couldn't find the cause of the bad performance.
One thing though: the critcl cache directory ~/.critcl is empty, but it should contain the C files, right?

Rolf Ade

Mar 6, 2017, 7:44:37 PM

Alexandru <alexandr...@meshparts.de> writes:
> Nice. With -file option it's significantly faster: 1.7 seconds instead of 3.3 seconds before.
>
> It's still not so fast as it should be though... I'm still not sure it uses the C extension.

I doubt that it is possible to get the md5 sum of a 200 MB file in
1.7 seconds with just the Tcl (script level) implementation. I don't see
how the algorithm could be spread over several cores. Without any
disrespect, I think you don't have such a box. You are using a C
implementation.

Compare the timing of the C implementation in tcllib with that of a
native md5sum (with the data in the OS cache). Is there a significant
difference?

Alexandru

Mar 7, 2017, 4:13:05 AM
On Tuesday, March 7, 2017 at 01:44:37 UTC+1, Rolf Ade wrote:
Yes indeed, if I deactivate the critcl acceleration, md5 takes forever to compute the checksum. But why is the C extension so slow?

Alexandru

Mar 7, 2017, 6:55:45 AM
Actually, thinking more about it, it's not much slower than what Paul gets:
Paul: 133MB, 0.5s
Me: 210MB, 1.7s

I guess, the relationship between file size and time is not linear, so the 1.7s could be normal for this file size...

Rich

Mar 7, 2017, 9:34:33 AM
You are overlooking two significant determinants of performance here.
CPU performance and disk I/O bandwidth.

If Paul has a CPU that is more powerful, and an I/O system that is
faster, he will see a much faster time to compute the hash.

You should compare your Tcl time against the time on the same machine
for the md5sum command line command, that way you will be doing an
apples to apples comparison (same CPU, same I/O subsystem).

Case in point, I have an old Pentium4 system that is still running, I
get the following with the GNU md5sum tool (pure C code):

$ ls -sh 200megfile
209M 200megfile
$# time md5sum 200megfile
a4d6da974c3d8451eb976c45d38c8b46 200megfile

real 0m7.155s
user 0m1.170s
sys 0m0.654s

That is 4.1 times slower (in pure C) than your Tcl version.

Alexandru

Mar 7, 2017, 9:42:05 AM
On Tuesday, March 7, 2017 at 15:34:33 UTC+1, Rich wrote:
Thanks for the point. Actually I thought about this, but I just assumed that Paul and I both use up-to-date computers and that the differences should not be that large.

Rich

Mar 7, 2017, 10:30:06 AM
Unless you and Paul have identical systems (CPUs, chipsets, disk
drives, memory types), there are too many possible variables there.

Compare your machine with Tcl to your machine with the md5sum utility.
Then you remove all the architectural differences and have left only
"Tcl w/ C extension" and "pure C".

Alexandru

Mar 7, 2017, 10:46:15 AM
I measured the time with openssl (time {exec openssl dgst -md5 $myfile} 1): 0.57s instead of 1.76s with Tcl and C.
Now it's clear that the C implementation in Tcl is far from optimal. Could it be because it's using Critcl, or is it the C code itself?

Rich

Mar 7, 2017, 10:56:09 AM
If you go read the code, you'll see that the critcl C code runs via the
Tcl I/O event loop reading 4k blocks at a time and calling into the C
code of the extension.

The openssl version is pure C, so no overhead from transitioning
back/forth in/out of the Tcl runtime.

As well, the openssl MD5 C code might just be higher performance C code
than the code in the Tcl critcl extension.

Alexandru

Mar 7, 2017, 11:01:44 AM
Either way, it would be cool if the Tcl/C implementation were faster...

Christian Gollwitzer

Mar 9, 2017, 2:44:23 AM
On 07.03.17 at 17:01, Alexandru wrote:
>> As well, the openssl MD5 C code might just be higher performance C code
>> than the code in the Tcl critcl extension.
>
> Either way, I would be cool if the Tcl/C implementation would be faster...

Steps to make it faster:

1) Find the source code of a high performance C md5sum utility. For a
start, try this one:

http://www.netlib.org/crc/md5sum.c

2) Compile with -O3 -march=native (or equivalent) to get the maximum
performance

3) Benchmark. If it is faster than your Tcl solution, turn it into an
extension, which should be trivial.

Disadvantage: It won't read files from inside a Tclkit. If you must go
through Tcl I/O, this will slow things down. If you are only reading
"native" files, this will be as fast as calling the command line tool.

It might well be that the OpenSSL implementation is especially fast
because it uses manually tuned assembly (see
https://github.com/openssl/openssl/tree/master/crypto/md5 ). If it turns
out to be necessary, the easiest way would be to link against
openssl/libressl. Maybe the TLS package even does that already (or you
can persuade Roy Keene to expose the MD5 function to Tcl).

Christian

Alexandru

Mar 14, 2017, 5:39:12 PM
Thanks, Christian. I invested a few hours in this over the past few days. The link to the C function was helpful. I had to make some modifications so that the sum function can return a hex-formatted checksum. For some inexplicable reason, the function works only if the file has a single line. As soon as I add a new line to the file, the result is wrong. I know this is a Tcl forum, but I want to make a C extension for Tcl. This is the code:

void
sum(FILE *fd, char *name, char *checksum)
{
    byte *buf;
    byte digest[16];
    char pr64[25];
    char tmp[3];    /* two hex digits plus the NUL that sprintf appends */
    int i, n;
    MD5state *s;

    s = nil;
    n = 0;
    buf = calloc(256,64);
    for(;;){
        i = fread(buf+n, 1, 128*64-n, fd);
        if(i <= 0)
            break;
        n += i;
        if(n & 0x3f)
            continue;
        s = md5(buf, n, 0, s);
        n = 0;
    }
    md5(buf, n, digest, s);
    for(i=0;i<16;i++) {
        sprintf(tmp,"%.2X", digest[i]);
        checksum[2*i] = tmp[0];
        checksum[2*i+1] = tmp[1];
    }
    free(buf);
}

It works only when the file has one line. Any ideas what is wrong?
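A side note on the hex step of such a function: sprintf with "%.2X" writes two digits plus a terminating NUL, so any scratch buffer needs at least three bytes, and the caller's output buffer needs 33 bytes if the 16-byte digest is to become a NUL-terminated string. A minimal sketch that avoids sprintf altogether (the name digest_to_hex is made up for illustration and is not part of md5sum.c or tcllib):

```c
#include <assert.h>
#include <stddef.h>

/* Write 2*len uppercase hex digits plus a terminating NUL into out.
   out must have room for 2*len + 1 bytes (33 for an MD5 digest). */
void digest_to_hex(const unsigned char *digest, size_t len, char *out) {
    static const char hex[] = "0123456789ABCDEF";
    assert(digest != NULL && out != NULL);
    for (size_t i = 0; i < len; i++) {
        out[2 * i]     = hex[digest[i] >> 4];    /* high nibble */
        out[2 * i + 1] = hex[digest[i] & 0x0F];  /* low nibble */
    }
    out[2 * len] = '\0';
}
```

The uppercase alphabet matches what "%.2X" produces; swap in lowercase letters if the result should compare equal to md5sum or openssl output.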

Christian Gollwitzer

Mar 14, 2017, 6:10:49 PM
On 14.03.17 at 22:39, Alexandru wrote:
> Thanks, Christian. I invested a few hours in this in the past days.
> The link to the C function was helpful. I had to do some modifications
> so that the sum function can return a hex formatted checksum. For some
> unexplainable reason, the function works only if the file has one single
> line. As soon I add a new line to the file, the result is wrong. I know
> this is a Tcl forum but I want to make a C extension for Tcl. This is
> the code:

> void
> sum(FILE *fd, char *name, char *checksum)

> It works only when the file has one line. Any ideas what is wrong?

I have not digested your code (no pun intended), but I suspect that it
is the same issue you had in your original Tcl version, namely reading
the file as a text file instead of binary. This is then outside of the
function you posted. Are you opening the file with the flag "r"? On
Windows, you need to use "rb" to ensure binary reading.
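One way to see the effect is to count the bytes fread delivers under each mode. The helper below is a hypothetical sketch, not code from the thread. On Windows, a file opened with "r" has its CR LF pairs translated to a single LF, so fewer bytes arrive than the file holds and the checksum changes; on POSIX systems both modes behave the same.

```c
#include <assert.h>
#include <stdio.h>

/* Count the bytes fread delivers when path is opened with the given
   mode ("r" or "rb"). Returns -1 if the file cannot be opened. */
long count_bytes(const char *path, const char *mode) {
    unsigned char buf[4096];
    long total = 0;
    size_t n;
    FILE *f;

    assert(path != NULL && mode != NULL);
    f = fopen(path, mode);
    if (f == NULL) return -1;
    while ((n = fread(buf, 1, sizeof buf, f)) > 0)
        total += (long)n;
    fclose(f);
    return total;
}
```

Comparing count_bytes(path, "r") with count_bytes(path, "rb") on a file that contains line breaks shows whether the platform translates line endings in text mode.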

Christian



Alexandru

Mar 14, 2017, 6:15:39 PM
Yes! That was it! So simple... Thanks!
I'll check the timings again and post the results.

Alexandru

Mar 14, 2017, 6:41:14 PM
Hm... the pure C implementation is 2x *slower* than the Tcl md5 package, which also uses C. This holds for large files such as 200 MB. If the file is small (1 KB), then the pure C implementation is faster. Could it be that I must read bigger file chunks at once?

Christian Gollwitzer

Mar 15, 2017, 2:49:40 AM
On 14.03.17 at 23:41, Alexandru wrote:
Could well be - why not just try it? The code apparently reads chunks of
at most 8 kB (128*64 bytes); I guess these days it would be faster to
read 1 MB at once.

The buffer logic in that code is a bit impenetrable to me, so if you
can't figure it out correctly, the safest way would be to rewrite the
loop in a more sane way. It seems to take care that you only feed
multiples of 64 bytes into the md5 call up until the last one, which
produces the final digest value.

Christian

Alexandru

Mar 15, 2017, 2:53:18 AM
I'll try it. I just wanted to give you this feedback before I move further.
I'm glad to hear that the code is impenetrable even to you :) I also have my difficulties reading it...

Christian Gollwitzer

Mar 15, 2017, 3:18:55 AM
On 15.03.17 at 07:53, Alexandru wrote:
Haha. Well, I guess I have understood it correctly: it keeps reading if
the number of bytes is smaller than requested, until the number of bytes
read is really <= 0. For a file you can safely assume that you have
reached the end if fread returns fewer bytes than the buffer size; maybe
they do it this way because it could be reading from a pipe or socket.

Here is approximately how I'd do it:

const int bufsize = 1024*1024; // 1M, must be divisible by 64
char *buffer = malloc(bufsize);
while (true) {
    int bytes_read = fread(buffer, 1, bufsize, fd);
    if (bytes_read < bufsize || feof(fd)) break;
    s = md5(buffer, bufsize, 0, s);
}
md5(buffer, bytes_read, digest, s);
free(buffer);

This code is untested, so don't blame me if it doesn't work.
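For reference, here is a compilable sketch of the same loop shape. It makes two assumptions that should be read as such: the MD5 calls are replaced by a stand-in (feed, which only counts bytes and calls), and the names feed_stats and feed_file are invented for this example. The counter holding the last fread result is declared once, outside the loop, so it is still in scope for the final call.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for a block-wise hash: it only counts bytes and calls.
   A real MD5 update/final pair would go here instead. */
typedef struct {
    unsigned long long bytes;
    unsigned long calls;
} feed_stats;

static void feed(feed_stats *st, const unsigned char *data, size_t len) {
    (void)data;
    st->bytes += len;
    st->calls += 1;
}

/* Read fd to the end in bufsize-byte chunks. Every chunk except the
   last is full-sized, so it is a multiple of 64 bytes when bufsize is.
   Returns 0 on success, -1 on allocation or read error. */
int feed_file(FILE *fd, size_t bufsize, feed_stats *st) {
    unsigned char *buffer;
    size_t bytes_read;                    /* declared once, outside the loop */

    assert(bufsize > 0 && bufsize % 64 == 0);
    buffer = malloc(bufsize);
    if (buffer == NULL) return -1;

    for (;;) {
        bytes_read = fread(buffer, 1, bufsize, fd);
        if (bytes_read < bufsize) break;  /* short read: EOF or error */
        feed(st, buffer, bufsize);        /* full intermediate block */
    }
    feed(st, buffer, bytes_read);         /* final, possibly empty, block */

    free(buffer);
    return ferror(fd) ? -1 : 0;
}
```

With a 150-byte file and bufsize 64, feed is called three times (64, 64 and 22 bytes). Trying different bufsize values with the real hash in place of feed is a cheap way to check whether chunk size matters on a given machine.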

Christian

Alexandru

Mar 15, 2017, 3:34:28 AM
I tested your code (and had to add some declarations) and now it runs but:
1. I get the wrong checksum
2. It's still 2x slower...

Alexandru

Mar 15, 2017, 3:35:23 AM
This is the code:

s = nil;
const int bufsize = 1024*1024; // 1M, must be divisible by 64
int bytes_read;
char *buffer = malloc(bufsize);
while (1) {
    int bytes_read = fread(buffer, 1, bufsize, fd);
    if (bytes_read < bufsize || feof(fd)) break;
    s = md5(buffer, bufsize, 0, s);
}
md5(buffer, bytes_read, digest, s);
for(i=0;i<16;i++) {
    sprintf(tmp,"%.2X", digest[i]);
    checksum[2*i] = tmp[0];
    checksum[2*i+1] = tmp[1];
}
free(buffer);

Christian Gollwitzer

Mar 15, 2017, 4:37:30 AM
On 15.03.17 at 08:35, Alexandru wrote:
>> I tested your code (and had to add some declarations) and now it runs but:
>> 1. I get the wrong checksum


>> 2. It's still 2x slower...

are you sure you compile with full optimizations? Maybe also the Tcllib
implementation is not that bad (it uses the RSA code). Have you
benchmarked against another standalone program that *IS* faster?


> This is the code:
> s = nil;
> const int bufsize = 1024*1024; // 1M, must be divisible by 64
> int bytes_read;
> char *buffer = malloc(bufsize);
> while (1) {
> int bytes_read = fread(buffer, 1, bufsize, fd);

delete the int. This is an error, now you have two variables bytes_read.

> if (bytes_read < bufsize || feof(fd)) break;
> s = md5(buffer, bufsize, 0, s);
> }
> md5(buffer, bytes_read, digest, s);
> for(i=0;i<16;i++) {
> sprintf(tmp,"%.2X", digest[i]);
> checksum[2*i] = tmp[0];
> checksum[2*i+1] = tmp[1];
> }
> free(buffer);
>

Christian

Alexandru

Mar 15, 2017, 5:42:09 AM
On Wednesday, March 15, 2017 at 09:37:30 UTC+1, Christian Gollwitzer wrote:
> On 15.03.17 at 08:35, Alexandru wrote:
> >> I tested your code (and had to add some declarations) and now it runs but:
> >> 1. I get the wrong checksum
>
>
> >> 2. It's still 2x slower...
>
> are you sure you compile with full optimizations? Maybe also the Tcllib
> implementation is not that bad (it uses the RSA code). Have you
> benchmarked against another standalone program that *IS* faster?
>
In one of my previous answers in this thread I posted the result for openssl, which is 3x faster than the Tcllib implementation. The funny thing is that I could call openssl with the Tcl command "exec" and the problem would be solved the easy way. But then I would have to make sure the openssl executable is available somewhere on the user's hard drive. Not optimal...

Until now I used this command to compile:
gcc C:\md5sum.c -shared -o C:\md5sum.dll -DUSE_TCL_STUBS -IC:/Tcl/include -LC:/Tcl/lib -ltclstub86

After adding "-O3 -march=native" it's 50% faster than the Tcllib implementation! Thanks for the tip! I have no idea yet what -O3 -march=native does, but are there any other magical keywords that I can use to make it as fast as openssl?

>
> > This is the code:
> > s = nil;
> > const int bufsize = 1024*1024; // 1M, must be divisible by 64
> > int bytes_read;
> > char *buffer = malloc(bufsize);
> > while (1) {
> > int bytes_read = fread(buffer, 1, bufsize, fd);
>
> delete the int. This is an error, now you have two variables bytes_read.

The compiler said nothing about it. Weird... I removed the second int declaration and now I get the correct result. But still 2x slower.

Alexandru

Mar 15, 2017, 6:18:56 AM
On Wednesday, March 15, 2017 at 10:42:09 UTC+1, Alexandru wrote:
> The compile said nothing about it. Weird... I removed the second int declaration and now I get the correct result. But still 2x slower.

This was before I added the new compile options. Ignore this sentence.

Rich

Mar 15, 2017, 11:08:03 AM
Alexandru <alexandr...@meshparts.de> wrote:
> On Wednesday, March 15, 2017 at 09:37:30 UTC+1, Christian Gollwitzer wrote:
>> On 15.03.17 at 08:35, Alexandru wrote:
>> >> I tested your code (and had to add some declarations) and now it runs but:
>> >> 1. I get the wrong checksum
>>
>>
>> >> 2. It's still 2x slower...
>>
>> are you sure you compile with full optimizations?
>
> After adding "-O3 -march=native" it's 50% faster! than the Tcllib
> implementation. Thanks for the tip! No idea yet what -O3
> -march=native does, but are there any other magical keywords that I
> can use to make it as fast as the openssl?

-O3 tells the compiler to apply full optimizations to its output (which
often results in substantial speedups).

-march=native means the output object file is specific to the
particular CPU generation you compiled it on. It may fail to run on
other CPUs that lack the particular mix of instructions the
compiler chose for the CPU you compiled it on. This is fine
as long as you know an end user will have a CPU that is compatible with
the code. But if you plan a more 'general' distribution you'll want to
avoid "native" and instead pick a target that almost everyone will
have (you can get the targets your compiler supports by
reading the GCC info page for the compiler).

>
>>
>> > This is the code:
>> > s = nil;
>> > const int bufsize = 1024*1024; // 1M, must be divisible by 64
>> > int bytes_read;
>> > char *buffer = malloc(bufsize);
>> > while (1) {
>> > int bytes_read = fread(buffer, 1, bufsize, fd);
>>
>> delete the int. This is an error, now you have two variables
>> bytes_read.
>
> The compile said nothing about it. Weird... I removed the second int
> declaration and now I get the correct result. But still 2x slower.

Because it was not an 'error' the compiler can detect. It is perfectly
valid, standards-compliant C (well, C99 or later at least).

The reason it is also a bug is you have a "bytes_read" variable
declared outside the block that is the while loop body, and you have a
second "bytes_read" variable declared inside the block that is the
while loop body.

This is perfectly valid C, but incorrect for your use, because the
"bytes_read" inside the block only exists inside the block (the
technical term is it "shadows" the "bytes_read" variable declared
outside the block). What you have is two separate, independent,
"bytes_read" variables.

As soon as you exit the while loop, the variable that was declared
inside the loop disappears, and the variable that is used is the one
declared outside the loop. So your final md5 call wasn't using the
result of the last fread, but the value the outer variable had before
you read any data at all. And as you didn't initialize the outer
bytes_read, the 'error' comes from using an uninitialized variable
outside the loop.
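The effect can be reproduced in a few lines. This is a reduced sketch with made-up function names; unlike the code in the thread, the outer variable is initialized to -1 so that the demonstration has well-defined behavior.

```c
#include <assert.h>
#include <stddef.h>

/* Buggy pattern: the inner declaration creates a second variable that
   shadows the outer one, so the outer "last" never changes. */
int last_element_shadowed(const int *values, size_t count) {
    int last = -1;                  /* initialized to keep the demo well-defined */
    for (size_t i = 0; i < count; i++) {
        int last = values[i];       /* new variable, visible only in this block */
        (void)last;
    }
    return last;                    /* still -1 */
}

/* Fixed: assign to the outer variable instead of redeclaring it. */
int last_element(const int *values, size_t count) {
    int last = -1;
    assert(values != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        last = values[i];
    }
    return last;
}
```

GCC and Clang stay silent about this by default; compiling with -Wshadow makes them report the inner declaration.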

Alexandru

Mar 15, 2017, 11:15:30 AM
On Wednesday, March 15, 2017 at 16:08:03 UTC+1, Rich wrote:
Thanks for the explanation. It makes sense now.

Alexandru

Mar 31, 2017, 12:34:00 PM
On Thursday, March 9, 2017 at 08:44:23 UTC+1, Christian Gollwitzer wrote:
Thanks to all for the valuable help. Now I have a package that runs significantly faster than the Tcllib implementation. It's still slower than the openssl implementation but I can live with that for now.
You can download the package and source code from this link: www.meshparts.de/download/checksum.zip

Andreas Kupries

May 6, 2017, 2:38:06 AM
Alexandru <alexandr...@meshparts.de> writes:


>> > Nice. With -file option it's significantly faster: 1.7 seconds
>> > instead of 3.3 seconds before.

About 1.7 seconds for about 200 MB ?

That is about 200/1.7 MB/seconds, i.e. 117 MB/s

That is (IMNSHO) _not_ slow.

>> > It's still not so fast as it should be though...

How fast do you expect it to be ?

Note that Paul mentioned a separate md5sum.exe.
How fast is that (pure C) on your machine ?

> I'm still not sure it uses the C extension.
>> >
>> If disabling critcl when building tcllib (set buildTcllibC false in
>> tcllib.bawt), I get the following timing:
>> % time "md5::md5 -file GeographicLibData.7z -hex"
>> 133288874 microseconds per iteration
>>
>> That's more than 2 minutes in pure Tcl vs. 0.5 seconds in critcl mode.
>> So your tcllib implementation surely uses the C extension!

When you use

puts [join [info loaded] \n]

you should see a list of loaded shared libraries, and tcllibc.so in
that list.

>>
>> Paul
>
> I went through the code and couldn't find the cause for the bad performance.
> One thing though: The cache directory ~/.critcl for critcl is empty,
> but it should contain the c files, right?

No. The .critcl directory is only used when you __dynamically__ use
critcl-based code, where the code is compiled on the fly when
needed.

The tcllibc.so is __precompiled__ with critcl. At that point it is a
regular binary shared library which gets loaded with [load], and
critcl is not involved at all anymore.


--
See you,
Andreas Kupries <akup...@shaw.ca>
<http://core.tcl.tk/akupries/>
Developer @ SUSE (MicroFocus Canada LLC)
<andreas...@suse.com>

Tcl'2017, Oct 16-20, Houston, TX, USA. http://www.tcl.tk/community/tcl2017/
EuroTcl 2017, Jul 8-9, Berlin/DE, http://www.eurotcl.tcl3d.org/
-------------------------------------------------------------------------------

Andreas Kupries

May 6, 2017, 2:38:06 AM

Alexandru writes:
>>> > Actually, thinking more about it, it's not much slower than what Paul
>>> > gets:
>>> > Paul: 133MB, 0.5s
>>> > Me: 210MB, 1.7s
>>> >
>>> > I guess, the relationship between file size and time is not
>>> > linear,

Run your test with files of size 1, 10, 100 MB.
Compute the speed in MB/s for each one.

Linear means that the speed is roughly the same for all cases.

Alternatively you can compare the ratio of times for the successive
cases I gave. For linear this ratio is constant (within measurement
error), here about 10.

Intuitively, as the size of the file goes up by a factor X, for linear
the time spent on the calculation goes up by the same factor X.

In my example the file sizes go up by 10, so does the time. This means
you can divide the time for 100 MB by the time for 10 MB and it should
be about 10. Etc.

For quadratic, cubic, etc., and exponential, etc., this ratio between
successive cases is not constant, but goes up instead, significantly.
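
The check described above could be sketched roughly like this. The function names and the tolerance parameter are assumptions for illustration; the timings have to come from your own measurements on one machine:

```c
#include <stddef.h>

/* Throughput in MB/s for a single measurement. */
static double mb_per_second(double size_mb, double seconds) {
    return size_mb / seconds;
}

/* Returns 1 if the throughput of every measurement stays within
   `tolerance` (a fraction, e.g. 0.2 for 20%) of the first one,
   which is what linear behaviour looks like; returns 0 otherwise. */
static int looks_linear(const double *size_mb, const double *seconds,
                        size_t n, double tolerance) {
    size_t i;
    double base;
    if (n == 0) {
        return 1;                     /* nothing to compare */
    }
    base = mb_per_second(size_mb[0], seconds[0]);
    for (i = 1; i < n; i++) {
        double speed = mb_per_second(size_mb[i], seconds[i]);
        double deviation = (speed - base) / base;
        if (deviation < 0) {
            deviation = -deviation;   /* absolute relative deviation */
        }
        if (deviation > tolerance) {
            return 0;                 /* speed drifted too far from the first case */
        }
    }
    return 1;
}
```

With sizes of 1, 10 and 100 MB, times that grow by about 10 per step keep the speed constant and pass the check; times that grow by 100 per step (quadratic) make the speed drop by a factor of 10 each time and fail it.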


Alexandru writes:
>> Thanks for the point. Actually I thought about this but I was just
>> assuming that Paul and I use an up to date computer and the
>> differences should not be that much.

Rich <ri...@example.invalid> writes:
> Unless you and Paul have identical systems (CPU's, chipsets, disk
> drives, memory types) then there's too many possible variables there.

Fully agreeing with Rich here. You can compare only times for the same
configuration, basically one machine, for different implementations of
the same algorithm.

Same implementation of an algorithm on different machines compares the
machines, not the implementation.

Different implementations on different machines is an apples vs nails
comparison.

> Compare your machine with Tcl to your machine with the md5sum utility.
> Then you remove all the architectural differences and have left only
> "Tcl w/ C extension" and "pure C".
>

Andreas Kupries

May 6, 2017, 2:38:07 AM
Alexandru <alexandr...@meshparts.de> writes:

>> > This is the code:
>> > s = nil;
>> > const int bufsize = 1024*1024; // 1M, must be divisible by 64
>> > int bytes_read;
>> > char *buffer = malloc(bufsize);
>> > while (1) {
>> > int bytes_read = fread(buffer, 1, bufsize, fd);
>>
>> delete the int. This is an error, now you have two variables bytes_read.
>
> The compiler said nothing about it.

It should not. This is completely legitimate C code. With the 'while
... {' you opened an inner scope/block and you can override (shadow) all
variables from the outer scope/block as you see fit.

At best you might be able to convince the cc to give you a warning. I
have no idea however what options might be needed for that.
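
For reference, GCC and Clang both have a -Wshadow option that warns when a local variable shadows another one. A sketch of the read loop with bytes_read declared only once is below. The helper name is made up, it only counts bytes, and the spot where the hash update would go is marked with a comment:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads fp to the end in 1 MB chunks and returns
   the number of bytes seen, or -1 on allocation or read failure. */
static long long count_bytes_chunked(FILE *fp) {
    const size_t bufsize = 1024 * 1024;  /* 1M, divisible by 64 */
    size_t bytes_read;                   /* declared once, outside the loop */
    long long total = 0;
    char *buffer = malloc(bufsize);
    if (buffer == NULL) {
        return -1;
    }
    while ((bytes_read = fread(buffer, 1, bufsize, fp)) > 0) {
        /* a hash update over buffer[0 .. bytes_read-1] would go here */
        total += (long long)bytes_read;
    }
    /* here bytes_read holds the result of the last fread (0 at end of file) */
    if (ferror(fp)) {
        total = -1;
    }
    free(buffer);
    return total;
}
```

Because there is only one bytes_read, any test after the loop sees the result of the last fread, not an unrelated outer variable.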

Alexandru

May 6, 2017, 4:45:29 AM
On Saturday, May 6, 2017 at 08:38:06 UTC+2, Andreas Kupries wrote:
> Alexandru writes:
>
>
> >> > Nice. With -file option it's significantly faster: 1.7 seconds
> >> > instead of 3.3 seconds before.
>
> About 1.7 seconds for about 200 MB ?
>
> That is about 200/1.7 MB/seconds, i.e. 117 MB/s
>
> That is (IMNSHO) _not_ slow.
>
> >> > It's still not so fast as it should be though...
>
> How fast do you expect it to be ?

As fast as the openssl.exe, which is 2x faster than my own C implementation and 4x faster than the Tcllib implementation.

>
> Note that Paul mentioned a separate md5sum.exe.
> How fast is that (pure C) on your machine ?
>
> > I'm still not sure it uses the C extension.
> >> >
> >> If disabling critcl when building tcllib (set buildTcllibC false in
> >> tcllib.bawt), I get the following timing:
> >> % time "md5::md5 -file GeographicLibData.7z -hex"
> >> 133288874 microseconds per iteration
> >>
> >> That's more than 2 minutes in pure Tcl vs. 0.5 seconds in critcl mode.
> >> So your tcllib implementation surely uses the C extension!
>
> When you use
>
> puts [join [info loaded] \n]
>
> you should see a list of loaded shared libraries, and tcllibc.so in
> that list.

No, I don't see that.

Alexandru

May 6, 2017, 4:57:04 AM
On Saturday, May 6, 2017 at 08:38:06 UTC+2, Andreas Kupries wrote:
> Alexandru writes:
> >>> > Actually, thinking more about it, it's not much slower than what Paul
> >>> > gets:
> >>> > Paul: 133MB, 0.5s
> >>> > Me: 210MB, 1.7s
> >>> >
> >>> > I guess, the relationship between file size and time is not
> >>> > linear,
>
> Run your test with files of size 1, 10, 100 MB.
> compute the speed of MB/s for each one.
>
> Linear means that the speed is roughly the same for all cases.
Thanks, I think I got this ;)
Yes,

Rich

May 6, 2017, 10:44:52 AM
Alexandru <alexandr...@meshparts.de> wrote:
> On Saturday, May 6, 2017 at 08:38:06 UTC+2, Andreas Kupries wrote:
>> Alexandru writes:
>>
>>
>> >> > Nice. With -file option it's significantly faster: 1.7 seconds
>> >> > instead of 3.3 seconds before.
>>
>> About 1.7 seconds for about 200 MB ?
>>
>> That is about 200/1.7 MB/seconds, i.e. 117 MB/s
>>
>> That is (IMNSHO) _not_ slow.
>>
>> >> > It's still not so fast as it should be though...
>>
>> How fast do you expect it to be ?
>
> As fast as the openssl.exe, which is 2x faster than my own C
> implementation and 4x faster than the Tcllib implementation.

In which case you may have to extract the openssl source and wrap
it into a Tcl C module wrapper. It (openssl) obviously has some
additional optimizations (or a different, more efficient, algorithm)
that are not captured by either your C or the Tcllib version.

Alexandru

May 7, 2017, 2:25:46 AM
On Saturday, May 6, 2017 at 16:44:52 UTC+2, Rich wrote:
Yes, I will probably do that. For the moment I'm happy with the performance of my own implementation.

Regards
Alexandru