Transient support for gvec


Bryce Nyeggen

Aug 25, 2013, 1:43:44 AM
to cloju...@googlegroups.com
I'm interested in implementing transient support for gvec / Vec / vector-of, hewing closely to the implementation in PersistentVector.  Would such a patch be welcomed?

Michał Marczyk

Aug 25, 2013, 4:31:19 AM
to cloju...@googlegroups.com
Actually, I have an implementation of transients for gvec lying around. I put a sketch together after implementing transients for ClojureScript vectors, then did the actual coding for core.rrb-vector, and lately ported the code to gvec. I haven't got around to offering a patch yet, primarily because I still need to put in some extra tests, and also because there are some rough edges around gvec that I'd like to treat this as an opportunity to polish (off the top of my head: caching of hash and hasheq values; both are pretty much copy & paste from core.rrb-vector at this point).

If there's interest in this, I guess I'd better make the patch public -- here it is in a WIP commit on top of current master:


I'll do the polishing outlined above in the next couple of days. (And try to be more open about almost-done improvements in the future, I suppose?)

Cheers,
Michał



Bryce Nyeggen

Aug 25, 2013, 11:33:51 AM
to cloju...@googlegroups.com
That would be very useful - I'm writing some data structures on top of gvec, and they get much faster if you can "mutate" the backing data as a transient when you need to make a bunch of edits.  If there are parts of the change that could use additional help I'd be happy to jump in.
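
As an illustration of the batch-edit pattern described above, here is a minimal sketch using the built-in persistent vector, which already supports transients. The function name `square-all` is hypothetical. With the transient patch discussed in this thread, the same pattern would presumably apply to a vector created with `vector-of`, but that is an assumption and is not shown here.

```clojure
;; Sketch only: batch edits through a transient, using the built-in
;; persistent vector. `square-all` is a hypothetical helper name.
(defn square-all
  "Returns a vector with every element of v squared."
  [v]
  (loop [i 0
         t (transient v)]          ; O(1) conversion to an editable vector
    (if (< i (count t))
      ;; assoc! may edit in place; always keep using its return value
      (recur (inc i) (assoc! t i (* (nth t i) (nth t i))))
      (persistent! t))))           ; O(1) conversion back to a persistent vector

(square-all [1 2 3 4])
;; => [1 4 9 16]
```

The caller still sees an ordinary function from one immutable vector to another; the mutation stays local to the loop.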

Michał Marczyk

May 6, 2014, 8:09:01 PM
to cloju...@googlegroups.com
Bumping this to ask if there's interest in getting transient gvec in
1.7. If so, I'll bring the patch up to date with master and create a
ticket to track it.

Bryce Nyeggen

May 6, 2014, 8:44:56 PM
to cloju...@googlegroups.com
I remain interested in it, at least :).  Relatively transparent support for primitive-backed data structures, without having to explicitly specialize for them, is one of the big appeals of Clojure; transient support helps this quite a bit if you're trying to optimize for both speed and compactness.
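
To make the "without having to explicitly specialize" point concrete, here is a hedged sketch of generic code that takes the transient path only when a collection offers one. The name `bulk-conj` is hypothetical; `clojure.core/into` performs a similar check internally. The sketch does not assume that gvec supports transients: if it does not, the second branch is used and the result is the same, only slower to build.

```clojure
;; Sketch only: `bulk-conj` is a hypothetical helper, not part of any patch.
(defn bulk-conj
  "Adds all of xs to coll, going through a transient when coll supports one."
  [coll xs]
  (if (instance? clojure.lang.IEditableCollection coll)
    (persistent! (reduce conj! (transient coll) xs))  ; fast path
    (reduce conj coll xs)))                           ; persistent fallback

(bulk-conj [] (range 5))                 ; boxed vector
(bulk-conj (vector-of :long) (range 5))  ; primitive-backed gvec
```

Both calls produce a vector holding 0 through 4; callers do not need to know which branch ran.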

al...@puredanger.com

May 6, 2014, 9:20:34 PM
to cloju...@googlegroups.com, cloju...@googlegroups.com
Yes, would be interested. I think there is a ticket for this already but can't search right now.

Michał Marczyk

May 6, 2014, 10:14:52 PM
to cloju...@googlegroups.com
Great! I couldn't find an existing ticket, so here's a new one:

http://dev.clojure.org/jira/browse/CLJ-1416

Note that core.rrb-vector's vectors of primitives have had transient
support for a while. As of 0.0.11, fv/vec, when presented with an
existing vector, simply rewraps its internal tree in an RRB wrapper.
There's also fv/vector-of, which uses an RRB wrapper from the get-go.
See below for some benchmarks using Clojure 1.6.0 and core.rrb-vector
0.0.11 -- constructing a gvec of 1 << 20 entries takes 4.6 seconds vs.
165 ms for "RRB gvec" on my box, with comparable lookup times.

Also note that there are some performance caveats that will not be
apparent from these; in particular, mixing gvec-based and PV-based RRB
vectors in a single JVM process leads to weird slowdowns. I hope to
address this problem in the not-too-distant future, although it's
difficult to say when exactly, since I think it'll be as part of a
major revision. For the time being, applications that have no need to
perform RRB slicing/splicing but would benefit from fast gvec
construction times could use core.rrb-vector just for that.

Cheers,
Michał


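;; Aliases assumed: c = criterium.core, fv = clojure.core.rrb-vector
user> (let [cnt (bit-shift-left 1 20)
            idx 16384
            gv  (apply vector-of :long (range cnt))
            rv  (apply fv/vector-of :long (range cnt))]
        (c/quick-bench (nth gv idx))                            ; lookup, gvec
        (c/quick-bench (nth rv idx))                            ; lookup, RRB gvec
        (c/quick-bench (apply vector-of :long (range cnt)))     ; construction, gvec
        (c/quick-bench (apply fv/vector-of :long (range cnt)))) ; construction, RRB gvec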
WARNING: Final GC required 45.92110074953044 % of runtime
Evaluation count : 21241356 in 6 samples of 3540226 calls.
Execution time mean : 26.905769 ns
Execution time std-deviation : 1.098666 ns
Execution time lower quantile : 25.860689 ns ( 2.5%)
Execution time upper quantile : 28.387844 ns (97.5%)
Overhead used : 2.264025 ns
WARNING: Final GC required 36.04944591232015 % of runtime
Evaluation count : 21511626 in 6 samples of 3585271 calls.
Execution time mean : 29.052520 ns
Execution time std-deviation : 1.400857 ns
Execution time lower quantile : 27.320555 ns ( 2.5%)
Execution time upper quantile : 30.367833 ns (97.5%)
Overhead used : 2.264025 ns
WARNING: Final GC required 1.9373098392272299 % of runtime
Evaluation count : 6 in 6 samples of 1 calls.
Execution time mean : 4.601333 sec
Execution time std-deviation : 23.939245 ms
Execution time lower quantile : 4.560345 sec ( 2.5%)
Execution time upper quantile : 4.621618 sec (97.5%)
Overhead used : 2.264025 ns

Found 1 outliers in 6 samples (16.6667 %)
low-severe 1 (16.6667 %)
Variance from outliers : 13.8889 % Variance is moderately inflated by outliers
WARNING: Final GC required 33.30312357504195 % of runtime
Evaluation count : 6 in 6 samples of 1 calls.
Execution time mean : 164.981093 ms
Execution time std-deviation : 16.188718 ms
Execution time lower quantile : 143.712449 ms ( 2.5%)
Execution time upper quantile : 180.581614 ms (97.5%)
Overhead used : 2.264025 ns

Alex Miller

May 6, 2014, 10:29:30 PM
to cloju...@googlegroups.com
Yeah, I looked and couldn't find one either. There are several other gvec-related and transient-related tickets but not this one. 

Michał Marczyk

May 13, 2014, 6:34:08 AM
to cloju...@googlegroups.com
Just attached the patch to the ticket.

Changes:

1. transient support for gvec
2. hash{eq,Code} caching for gvec, gvec seqs
3. equals, toString, hashCode implementations for gvec seqs

Cheers,
Michał

Michał Marczyk

May 13, 2014, 6:36:49 AM
to cloju...@googlegroups.com