[ANN] Vertigo: fast, idiomatic C-style structs


Zach Tellman

Jul 9, 2013, 11:56:03 PM
to clo...@googlegroups.com
Last year, I gave a talk at the Conj on my attempt to write an AI for the board game Go.  Two things I discovered were that it was hard to get predictable performance, and that even once I made sure I had all the right type hints, there was still a lot of room at the bottom for performance improvements.  Towards the end [1], I mentioned a few ideas for improvements, one of which was simply using ByteBuffers rather than objects to host the data.  This would remove all the levels of indirection, giving much better cache coherency, and also allow for fast unsynchronized mutability when the situation called for it.
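
To make the ByteBuffer idea concrete, here is a minimal sketch using plain java.nio interop rather than Vertigo itself. The record layout (a 4-byte int followed by a 4-byte float) and the names `record-size`, `write-record!`, `read-x`, and `read-y` are assumptions made up for illustration; they are not part of any library.

```clojure
(import 'java.nio.ByteBuffer)

;; Hypothetical fixed layout: int32 x at offset 0, float32 y at offset 4.
(def record-size 8)

(defn write-record!
  "Writes record number idx with absolute puts, so the buffer position is untouched."
  [^ByteBuffer buf idx x y]
  (let [base (* (long idx) record-size)]
    (.putInt buf (int base) (int x))
    (.putFloat buf (int (+ base 4)) (float y))
    buf))

(defn read-x
  "Reads only the int field of record idx; no object is allocated per record."
  [^ByteBuffer buf idx]
  (.getInt buf (int (* (long idx) record-size))))

(defn read-y
  "Reads only the float field of record idx."
  [^ByteBuffer buf idx]
  (.getFloat buf (int (+ 4 (* (long idx) record-size)))))

;; Three records packed back to back in one contiguous buffer.
(def ^ByteBuffer records (ByteBuffer/allocate (int (* 3 record-size))))

(doseq [i (range 3)]
  (write-record! records i (* 10 i) (/ i 2.0)))

(read-x records 2) ;; => 20
(read-y records 1) ;; => 0.5
```

All three records live in one contiguous block of memory, and a field is located by arithmetic on the record index rather than by following object references.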

So, ten months and several supporting libraries [2] [3] later, here it is: https://github.com/ztellman/vertigo

At a high level, this library is useful whenever your datatype has a fixed layout and is used more than once.  Depending on your type, it will give you moderate to large memory savings, and if you're willing to forgo some of the core library in favor of Vertigo's operators, you can get significant performance gains on batch operations.  And, in the cases where performance doesn't matter, it will behave exactly like any other Clojure data structure.

I want to point out that something like this would be more or less impossible in Java; reading from an offset in a ByteBuffer without the compile-time inference and validation provided by this library would be pointlessly risky.  There aren't a lot of low-level Clojure libraries, but a growing number of people are using Clojure in production for performance-sensitive work.  I'm looking forward to seeing what people do with Vertigo and libraries like it.

Zach

kovas boguta

Jul 15, 2013, 12:16:29 AM
to clo...@googlegroups.com
This is pretty neat. 

Anyone try using this in conjunction with mmap? 

It would be nice to have some way to deal with strings & other variable-length data. 

I'm also curious if it's possible to make the analog of this for fressian, basically to avoid unpacking objects that are not necessary for the computation at hand.






--
--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clo...@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+u...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Karsten Schmidt

Jul 15, 2013, 11:12:24 AM
to clo...@googlegroups.com
Hi Zach, this looks very interesting indeed and a great next step
after Gloss. For my SimpleCL[1] project I needed something very similar,
but went a slightly different way to represent clj data as byte
buffers and compile them from specs parsed from C (header) files[2]. I
will give Vertigo a try ASAP and see how (much better) performance is
with that... great stuff & thanks for sharing!

[1] http://hg.postspectacular/simplecl
[2] http://hg.postspectacular/structgen

Best, K.
--
Karsten Schmidt
http://postspectacular.com | http://toxiclibs.org | http://toxi.co.uk

Zach Tellman

Jul 15, 2013, 4:40:55 PM
to clo...@googlegroups.com
If you (vertigo.core/wrap "a-file-name"), that will use mmap under the covers, so if no one's tried it, it's easy enough to start.  
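
For anyone curious what memory-mapping a file looks like at the JVM level, here is a rough sketch using only java.nio interop. It is not Vertigo's implementation, and the helper name `mmap-file` is made up for illustration. Note that a single mapping made this way is limited to 2 GB, since `FileChannel.map` rejects sizes above `Integer/MAX_VALUE`.

```clojure
(import '(java.io RandomAccessFile)
        '(java.nio.channels FileChannel$MapMode))

(defn mmap-file
  "Maps the whole file at path read-only and returns a MappedByteBuffer.
  The mapping stays valid after the channel is closed."
  [^String path]
  (with-open [raf (RandomAccessFile. path "r")]
    (let [ch (.getChannel raf)]
      (.map ch FileChannel$MapMode/READ_ONLY 0 (.size ch)))))
```

The returned buffer can then be read with the usual absolute getters, for example `(.getInt buf 0)`, and the OS pages the file contents in on demand.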

With respect to non-fixed data layouts, that could be supported by a library which parses the framing information, and then layers Vertigo atop the actual data.  In effect, that's what Gloss [1] is going to become, so keep watching the skies.

Zach




kovas boguta

Jul 16, 2013, 12:48:35 AM
to clo...@googlegroups.com
Interesting. This seems like a pretty promising direction for the bottom of the big-data stack.

A use case on my mind is sorting a big list of data structures by key (or some set of keys/paths).

Once the data gets big, you need to do an external sort, which means tons of serialization round trips if implemented naively. Being able to just pluck out the values you need really helps in that case. Besides saving on the serialization overhead, it also cuts down on memory which means you can sort much bigger segments at a time, and complete the overall sort in fewer passes.
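
As a rough illustration of plucking out just the sort key, here is a sketch over a plain ByteBuffer of fixed-size records. The layout (16-byte records with an int64 key at offset 8) and the names `rec-bytes`, `key-offset`, `record-key`, and `sorted-indices` are assumptions for the example only.

```clojure
(import 'java.nio.ByteBuffer)

;; Hypothetical layout: 16-byte records, with an int64 sort key at byte offset 8.
(def rec-bytes 16)
(def key-offset 8)

(defn record-key
  "Reads only the key of record idx, leaving the rest of the record untouched."
  [^ByteBuffer buf idx]
  (.getLong buf (int (+ key-offset (* (long idx) rec-bytes)))))

(defn sorted-indices
  "Returns record indices ordered by key, without deserializing any record."
  [^ByteBuffer buf]
  (let [n (quot (.capacity buf) rec-bytes)]
    (sort-by #(record-key buf %) (range n))))

;; Three records whose keys are 30, 10, and 20.
(def ^ByteBuffer segment (ByteBuffer/allocate (int (* 3 rec-bytes))))

(doseq [[i k] (map-indexed vector [30 10 20])]
  (.putLong segment (int (+ key-offset (* i rec-bytes))) (long k)))

(sorted-indices segment) ;; => (1 2 0)
```

Only the keys and indices are materialized as objects, so the in-memory footprint during the sort is one small value per record rather than one deserialized structure per record.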





Zach Tellman

Jul 16, 2013, 7:27:34 PM
to clo...@googlegroups.com
Yeah, one nice property of this is that all the underlying objects play nicely with byte-streams [1], so it's trivial to do something like:

---

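;; `type` stands in for a Vertigo type and `sorted-file` for the output file
(def s (vertigo.core/wrap type "some-huge-file"))

;; sort record indices by a nested field, then append each record in sorted order
(let [indices (sort-by #(vertigo.core/get-in s [% :foo :bar]) (range (count s)))]
  (doseq [idx indices]
    (byte-streams/transfer (nth s idx) sorted-file {:append true})))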

---

If there's framing information we won't be able to do this quite as efficiently, but it shouldn't be too much more expensive.

Daniel

Jul 17, 2013, 2:49:17 PM
to clo...@googlegroups.com
How did this affect performance in your Go AI?

Zach Tellman

Jul 17, 2013, 11:35:00 PM
to clo...@googlegroups.com
I actually haven't applied it yet.  I'll post results once I have.


Patrick Wright

Jul 19, 2013, 3:13:04 AM
to clo...@googlegroups.com
Zach,  

it might be interesting to keep an eye on the newly-announced ObjectLayout project by Gil Tene and Martin Thompson. Discussion/overview is on this mail thread

and the project is here

Regards
Patrick

Ezra Lee

Jul 31, 2013, 1:10:54 PM
to clo...@googlegroups.com
Hi,
I'm trying out vertigo and hoping you can help me figure out what I am missing. I get an error when I use get-in:

; nREPL 0.1.8-preview
user> (use 'vertigo.structs)
nil
user> (def-typed-struct ints-and-floats :ints (array uint32 10) :floats (array float32 10))
#'user/ints-and-floats
user> (def x {:ints (range 10) :floats (map float (range 10))})
#'user/x
user> (require '[vertigo.core :as v])
nil
user> (def ^:ints-and-floats s (v/marshal-seq ints-and-floats [x]))
#'user/s
user> (v/get-in s [:floats 4])
IllegalArgumentException Invalid field '4' for type ints-and-floats  vertigo.core/validate-lookup (core.clj:177)
user> (v/get-in s [4 :floats])
IllegalArgumentException   java.nio.Buffer.position (Buffer.java:216)

Thanks,
Ezra

Zach Tellman

Jul 31, 2013, 1:17:17 PM
to clo...@googlegroups.com
Hi Ezra,

This is admittedly a little confusing, but you're hinting 's' with the type of the *element*.  Here you've created a sequence containing a single 'ints-and-floats' struct, so you'd want to do this:

user> (v/get-in s [0 :floats 4])
4.0
user> (v/get-in s [0])
{:ints (0 1 2 3 4 5 6 7 8 9), :floats (0.0 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0)}

I'll try to make this clearer in the documentation.  Let me know if you have any other questions.

Zach



Zach Tellman

Jul 31, 2013, 1:21:06 PM
to clo...@googlegroups.com
Actually, looking at the readme, I can see the code you were trying to use.  Sorry, I'm not sure how I didn't catch that before, but I've fixed it.

Zach