100x startup for Clojure using GraalVM


Alan Thompson

Nov 8, 2019, 3:43:42 PM
to clojure
Some people I know have been interested in switching from Clojure to Go in order to get faster startup times and statically linked executables for microservices on AWS Lambda, etc.  Having recently reviewed "Go: the Good, the Bad, and the Ugly", as well as stumbling through a Go bootcamp, I was looking for a good counter-argument.  While the command-line usage via the Clojure CLI:

> clj -e '(println "Hello World")'      # 0.98 sec

takes only about a second, and 

> java -jar hello-standalone.jar        # 1.30 sec

takes only about 1.3 seconds, I needed something faster.  Joker has a lot of potential, but it includes only basic Clojure namespaces.  Having tracked news about GraalVM over the past couple of years, I thought it was time for a quick demo project.

TL;DR:

> target/hello-world                    # 0.009 sec

Yes, you read that right!  The statically linked executable is over 130x faster than running via the uberjar.
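
For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of a timing harness. The helper names `time_command` and `speedup` are made up for illustration, and the command being timed is a stand-in (the Python interpreter itself) so the script runs anywhere; substitute the uberjar or the native executable on your own machine.

```python
import statistics
import subprocess
import sys
import time


def time_command(argv, runs=5):
    """Return wall-clock durations (seconds) for `runs` executions of argv."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises if the command fails, so a broken run is not timed silently.
        subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


def speedup(slow_seconds, fast_seconds):
    """How many times faster the fast command is than the slow one."""
    return slow_seconds / fast_seconds


if __name__ == "__main__":
    # Stand-in command (assumption): replace with e.g.
    # ["java", "-jar", "hello-standalone.jar"] or ["target/hello-world"].
    samples = time_command([sys.executable, "-c", "print('Hello World')"])
    print(f"median startup: {statistics.median(samples):.3f} s")
    # Timings quoted in the post: 1.30 s for the uberjar, 0.009 s for the executable.
    print(f"speedup: {speedup(1.30, 0.009):.0f}x")
```

Taking the median of several runs smooths out the first, cold-cache run, which tends to be slower than the rest.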

I summarized all the steps to install and run using GraalVM in this demo project:


Just peruse the README and you'll be off to the races.

See also:


Kim Kinnear

Nov 9, 2019, 5:04:21 PM
to Clojure
I have found the same improvements with GraalVM binaries!  It is amazing!

I was just testing zprint before releasing it, and here are the numbers for a moderately sized program to start on a 2012 MacBook Air:

>java -jar zprint-filter-0.5.3 <helloworld.clj       2.483s

>same as above using appcds                          1.160s

>zprintm-0.5.3 <helloworld.clj                       0.019s

If you care about startup, you should definitely give GraalVM a try.  Not to say that it doesn't have its quirks, and I have spent a good bit of time working around some of the issues in the older versions (that may be fixed now), but it has certainly been worth it!


Gerard Klijs

Nov 10, 2019, 1:51:16 AM
to Clojure
You're probably aware of this, but in case you aren't: running a JVM application as a native image does start up faster, but throughput is lower and latency is higher than running on the JVM.

Michiel Borkent

Nov 10, 2019, 10:54:09 AM
to Clojure
Might be worth mentioning that lread and I are collecting information about GraalVM here:


Alan Thompson

Nov 12, 2019, 1:42:36 PM
to clojure
In my initial post, I failed to mention the huge memory savings achieved by the standalone executable (in addition to the startup time savings).  

Note that using time at the command line resolves to a shell built-in command. We can get more information from the standard Unix version of time:
# JVM+UberJar
> /usr/bin/time -l  java -jar target/hello-world-0.1.0-SNAPSHOT-standalone.jar
Hello, World!
Goodbye...
        1.20 real         2.47 user         0.24 sys
       409  maximum resident set size (MB)
    100469  page reclaims
      3569  involuntary context switches

# Static Executable
> /usr/bin/time -l  target/hello-world
Hello, World!
Goodbye...
        0.00 real         0.00 user         0.00 sys
         2  maximum resident set size (MB)
       657  page reclaims
         4  involuntary context switches
So we see that the maximum RSS memory requirement was reduced from 409 MB to 2 MB. Yes, an improvement of over 200x!  Note also that involuntary context switches have been reduced by about 900x, and page reclaims by about 150x.
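
The ratios can be recomputed directly from the two `/usr/bin/time -l` outputs. This is just a worked-arithmetic sketch; the dictionary keys and the `reduction_factors` helper are names chosen for illustration, and the numbers are copied from the output shown in this post.

```python
# Figures copied from the /usr/bin/time -l output in this post.
jvm = {"max_rss_mb": 409, "page_reclaims": 100469, "ctx_switches": 3569}
native = {"max_rss_mb": 2, "page_reclaims": 657, "ctx_switches": 4}


def reduction_factors(before, after):
    """Ratio before/after for every metric present in `before`."""
    return {key: before[key] / after[key] for key in before}


factors = reduction_factors(jvm, native)
for name, factor in factors.items():
    print(f"{name}: about {factor:.0f}x lower")
```

The factors come out at roughly 200x for resident memory, 150x for page reclaims, and 900x for involuntary context switches.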

So, it is the combination of reduced startup time and vastly reduced memory requirements that makes standalone executables ideal for short-lived tasks, especially in constrained environments such as serverless/lambda.




--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clo...@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+u...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
---
You received this message because you are subscribed to the Google Groups "Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email to clojure+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/clojure/8dbd1a23-61b4-42dc-8815-3d8422956901%40googlegroups.com.

Colin Yates

Nov 12, 2019, 1:46:11 PM
to clo...@googlegroups.com
Do we have any idea how that memory saving scales? 

I know a bunch of metadata isn’t needed as it is HotSpot-specific, but are there any other memory savings?


Nate Sutton

Nov 12, 2019, 3:58:14 PM
to clo...@googlegroups.com
Another way to achieve fast startup is to compile ClojureScript to a Node.js target and then use the Node.js library called pkg to bundle the Node.js binary with the script. I haven't timed it but it's an interesting alternative.

Alan Thompson

Nov 12, 2019, 6:54:28 PM
to clojure
A quick comparison with python:

> time python -c 'print("Hello world!")'
Hello world!
0.03s user 0.01s system 80% cpu 0.048 total

> /usr/bin/time -l  python -c 'print("Hello world!")'
Hello world!
        0.04 real         0.02 user         0.01 sys
         6  maximum resident set size (MB)
      2110  page reclaims
        24  involuntary context switches


So the Python version takes about 5x longer than the native executable, and uses 3x more memory.


Gerard Klijs

Nov 12, 2019, 11:45:48 PM
to Clojure
Hello world is fun, but doesn't say much. I would like to see benchmarks on an actual application. Ideally they would cover several JVMs, so also Graal and J9, and also the commercial version for making a native image. Assess how much memory is needed when run on the JVM and limit that, since otherwise it might take up to a quarter of available memory. Measure the time from startup to the first successfully handled request, the max throughput after being warmed up properly, the 99th percentile latency after being warmed up, and the memory use.
Only if you have those numbers can you decide whether a native image is worth it. For example, if one native-image instance could always handle the current load, it might save a lot, because you can scale to 0, where the first request takes just seconds in total. But if at peak moments you need 30 instances with a native image, while on the JVM you need only 3, it might not.
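
As a rough sketch of how those four numbers could be collected into one comparable record per runtime: the names `percentile`, `RunSummary` and `summarize` are hypothetical, the nearest-rank percentile is one common definition among several, and the inputs are assumed to come from whatever load-testing tool you already use.

```python
import math
from dataclasses import dataclass


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples <= it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


@dataclass
class RunSummary:
    first_response_s: float   # startup until the first successfully handled request
    throughput_rps: float     # requests per second after warm-up
    p99_latency_ms: float     # 99th percentile latency after warm-up
    peak_rss_mb: float        # peak resident memory


def summarize(first_response_s, warm_latencies_ms, warm_window_s, peak_rss_mb):
    """Condense one benchmark run (JVM or native image) into a RunSummary."""
    return RunSummary(
        first_response_s=first_response_s,
        throughput_rps=len(warm_latencies_ms) / warm_window_s,
        p99_latency_ms=percentile(warm_latencies_ms, 99),
        peak_rss_mb=peak_rss_mb,
    )
```

Producing one `RunSummary` per runtime under the same load makes the trade-off explicit: a native image would be expected to win on `first_response_s` and `peak_rss_mb`, while the warmed-up JVM may win on the other two.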

Andy Fingerhut

Nov 13, 2019, 1:06:41 AM
to clo...@googlegroups.com
I believe at least some of the people working on this, and interested in these results, would like to use Clojure for command line utilities and such, which tend to have quite short run times when implemented in C/C++/Python/etc.  They are probably much less interested in using these methods for long-running server processes.

Andy


Daniel Compton

Nov 28, 2019, 5:14:10 PM
to Clojure List
> They are probably much less interested in using these methods for long-running server processes.

At work, we were quite interested in making a native-image of our API server recently. Having a fast boot would have opened a lot more possibilities around where we could run it. Serverless environments like AWS Lambda, Google Cloud Run, Google App Engine Standard, and other places that boot in response to a request were not even close to workable with our moderately sized Clojure API server. We did some experiments with making a native-image, but there were too many libraries that needed to be updated to do it now. We're going to chip away at making our dependencies native-image compatible and then have another crack later.

Bruno Bonacci has a collection of sample apps at https://github.com/BrunoBonacci/graalvm-clojure showing how to build various libraries and highlighting where there are issues currently.

david hoyt

Nov 29, 2019, 1:32:52 PM
to Clojure
This kind of thing is really only interesting for shell piping in bash. It won’t help numerical, tensor, neural, simulation, nor business codes. Good FORTRAN environments can do hard numerical work faster. To a lesser extent, so can C. In supercomputing, that advantage is reduced. I/O bandwidth becomes more important. In business, I/O is the only thing that’s important. The only thing that reliably makes a difference is the quality of the developers.

Java, Microsoft’s library framework, etc. have slower startup times because no one has bothered to optimize them for startup time. The time (or cost) to get the entire task completed is the only thing that is important.

Alan Thompson

Dec 3, 2019, 6:15:52 PM
to clojure
When you have a short-running task that completes in 0.01 sec, a startup delay of 1.3 seconds (or more) is effectively the total execution time.
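
To put a number on that, here is a small worked example. `startup_share` is a name invented for this sketch; the 1.3 s and 0.009 s startup figures are the ones measured earlier in this thread, and the 0.01 s of useful work is the figure assumed in this post.

```python
def startup_share(startup_s, work_s):
    """Fraction of total wall-clock time spent on startup."""
    return startup_s / (startup_s + work_s)


# Uberjar: 1.3 s startup in front of 0.01 s of useful work.
print(f"{startup_share(1.3, 0.01):.1%}")    # 99.2%
# Native executable: 0.009 s startup in front of the same work.
print(f"{startup_share(0.009, 0.01):.1%}")  # 47.4%
```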
