reverse complement benchmark

Stuart Halloway

May 28, 2012, 6:00:12 PM
to cloju...@googlegroups.com
I have added a stab at reverse-complement [1] to the test.benchmark repo. I believe it will do better than the one currently on Alioth [2].

I would be thrilled if anybody wanted to look at this and do any of: test correctness, verify perf, publish to alioth.

Cheers,
Stu

[1] https://github.com/clojure/test.benchmark/commit/8d222d4160fd4aa63988e6bb197c562e607798f0
[2] http://shootout.alioth.debian.org/u32q/performance.php?test=revcomp

Andy Fingerhut

May 29, 2012, 12:17:56 PM
to cloju...@googlegroups.com
Stuart:

When I AOT compile your code with Clojure 1.4.0 -- after adding (:gen-class) to the ns declaration to fit in with my clojure-benchmarks environment, in case that makes a difference -- I get these reflection warnings:

Reflection warning, revcomp.clj:50 - call to invokePrim can't be resolved.
Reflection warning, revcomp.clj:54 - call to invokePrim can't be resolved.

Those lines are the two calls to the function revcomp inside of function with-each-line-rc. As a consequence of the reflection, performance is pretty bad -- much slower than the current Alioth web site code. I give some measurements below. There are more reflection warnings if I AOT-compile with Clojure 1.3.0.

If I remove the two ^long type hints for the arguments of function revcomp, the reflection warnings go away, but the output of the program is incorrect. The goal of a reverse complement program is not to reverse each line of a DNA sequence, but each entire DNA sequence, where a DNA sequence includes all lines found after a line beginning with >, up to but not including the next such line, or the end of file.
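
To illustrate the difference between reversing each line and reversing each whole sequence, here is a minimal sketch. It is not Stuart's code or my edited version: the names complement-of, reverse-complement and revcomp-records are made up for this example, it handles only the four plain bases (not the ambiguity codes the benchmark input also contains), it does not re-wrap the output into fixed-width lines, and it assumes every > header line is followed by at least one sequence line. It is written for clarity, not speed.

```clojure
(ns example.revcomp-sketch
  (:require [clojure.string :as str]))

;; Hypothetical complement table: plain bases only. Characters that are
;; not in the table are passed through unchanged.
(def complement-of {\A \T, \C \G, \G \C, \T \A})

(defn reverse-complement
  "Reverse-complements ONE whole sequence, given as a seq of its lines.
  The lines are joined first, so the reversal spans line boundaries."
  [lines]
  (->> (apply str lines)
       str/upper-case
       reverse
       (map #(get complement-of % %))
       (apply str)))

(defn revcomp-records
  "Takes the lines of a FASTA-style input and returns [header rc] pairs,
  one per sequence. Assumes the input starts with a > line and that no
  two > lines are adjacent."
  [lines]
  (->> lines
       (partition-by #(.startsWith ^String % ">"))
       (partition 2)
       (map (fn [[[header] body]]
              [header (reverse-complement body)]))))

(revcomp-records [">ONE" "AAGT" "CC" ">TWO" "ACGG"])
;; => ([">ONE" "GGACTT"] [">TWO" "CCGT"])
```

Reversing line by line would instead turn the two lines of ONE into "ACTT" and "GG", which is the kind of incorrect output described above.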

I made some small changes to correct that (link below -- feel free to commit the changes if you like). Below are some measurements for several versions of the reverse complement program. All times are elapsed seconds, not total CPU time, and give only the fastest time of 3 different runs.

Stuart's program with reflection warnings left in the code: 31.3 sec
Stuart's program with reflection warnings eliminated, but output is incorrect: 13.9 sec
Stuart's program with reflection warnings eliminated and Andy's modifications to give correct output: 13.4 sec

revcomp.clojure-4.clojure is the current fastest Clojure program on the Alioth web site (link below)
revcomp.clojure-4.clojure: 4.8 sec (Alioth 64-bit quad-core elapsed time: 4.25 sec)

revcomp.java-3.java is the current fastest Java program on the Alioth web site, at least for the quad-core benchmark machines, and I am running on a quad-core machine, too (link below)
revcomp.java-3.java: 1.5 sec (Alioth 64-bit quad-core elapsed time: 1.31 sec)


More details about the performance numbers, if you are curious:

AOT compilation was done in all cases, and the compilation time is not included in what is reported.

All runs were with the same input file used by the Alioth web site for reporting performance of reverse complement programs. It uses N=25,000,000. The file is approximately 250 megabytes in size.

All numbers are using Oracle JDK 1.7.0_04 for Linux, and Clojure 1.4.0, which is what the Alioth web site is currently using. My machine is similar but not identical to the Alioth 64-bit quad-core benchmark machine, so comparing my absolute numbers with theirs gives similar but not identical results. It is best to compare the numbers from my machine to each other, since they use the same machine/OS/JDK/Clojure-version and differ only in the program.

Link to my edited version of Stuart's program: https://github.com/jafingerhut/clojure-benchmarks/blob/master/revcomp/revcomp.clj-14-fixes.clj

Link to revcomp.clojure-4.clojure 64-bit quad-core results and source code: http://shootout.alioth.debian.org/u64q/program.php?test=revcomp&lang=clojure&id=4

Link to revcomp.java-3.java 64-bit quad-core results and source code: http://shootout.alioth.debian.org/u64q/program.php?test=revcomp&lang=java&id=3

Andy

Stuart Halloway

May 29, 2012, 6:15:36 PM
to cloju...@googlegroups.com
Hi Andy,

Heading out and don't have time to grok this fully now, but quickly: Is the addition of gen-class introducing these reflection warnings? I didn't have any reflection warnings running locally.

Thanks for all your work on this! Feel free to commit your fixes, or better yet, commit your better benchmark over the top of mine if, as I gather, it is winning substantially.

Stu

Andy Fingerhut

May 30, 2012, 1:21:38 PM
to cloju...@googlegroups.com
The reflection warnings are due to my changing the namespace to "revcomp", the same name as the function giving the reflection warnings, plus using :gen-class and AOT compilation. Short answer: I should just use a multi-segment namespace like sane people do.

I ran a few tests, and found that the reflection warnings only appeared when I had this combination:

(1) ns declaration with namespace name "revcomp", which is the same as the function revcomp in the program
(2) (:gen-class) in the ns declaration
(3) AOT compilation

This reminds me of CLJ-446. The only difference there was that instead of (1), I had a namespace and a deftype-declared type with the same name.
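
For reference, a minimal sketch of the multi-segment layout that avoids combination (1). The namespace name benchmarks.revcomp and the function body are placeholders invented for this example, not the real program, and I have not rerun the AOT experiment with exactly this file, so treat it as an illustration of the layout rather than a verified fix.

```clojure
;; Problematic layout, shown only as a comment: a single-segment
;; namespace with the same name as a function in it, plus :gen-class.
;;   (ns revcomp (:gen-class))
;;   (defn revcomp [...] ...)

;; Multi-segment namespace (hypothetical name). The namespace no longer
;; has exactly the same name as the function it contains.
(ns benchmarks.revcomp
  (:gen-class))

;; Print a warning wherever the compiler falls back to reflection.
(set! *warn-on-reflection* true)

;; Placeholder function with primitive ^long hints on its arguments and
;; return value; the body is NOT the benchmark's logic.
(defn revcomp ^long [^long start ^long end]
  (- end start))

(defn -main [& args]
  (println (revcomp 2 5)))
```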

Andy