A value can have more than one form or implementation. For example, a String and a char array may both represent the same value. Most objects, for example, can be serialized. Changing the form of a value then is not a change of value. This is the idea behind lazy deserialization / reserialization.
We can represent a data structure as a tree, where the values in the tree can be either the immutable value of the object or a read-only ByteBuffer holding the serialized form of the value, or both. Both forms can be held by nodes using atoms, so the form of the values can be changed without having to create a new tree. The value remains unchanged and the tree can be safely shared between threads.
Lazy deserialization comes after loading a map or vector from a ByteBuffer. Initially only the root exists, which holds only the ByteBuffer. As you access and alter the contents of the map or vector, the portions of the tree which are accessed are deserialized.
Incremental reserialization then occurs when writing a map or vector back to a ByteBuffer. Unaltered portions of the tree need not be serialized, as a reference to the ByteBuffer that the structure was loaded from is retained.
Currently only the top-level vector or map is lazy. The contents of the structure is serialized/deserialized using prn-str/read-string. Read-string options, if needed, can be passed in an opts map (i.e. resources map) which is the last argument of the lazy-load method.
The benchmark for map illustrates this:
(ns aatree.map-updates
(:require [aatree.core :refer :all])
(:import (java.nio ByteBuffer)))
(def map-size 1000000)
(def updates 1000)
(defn bld [m i]
(conj m [i i]))
(println)
(def t0 (System/currentTimeMillis))
(def lazy-map (reduce bld emptyLazyAAMap (range map-size)))
(def t1 (System/currentTimeMillis))
(def micr-0 (* 1000. (- t1 t0)))
(println "Time to build a map of size" map-size "=" micr-0 "microseconds")
(println "Time per entry:" (/ micr-0 map-size) "microseconds")
(defn upd [m i]
(let [m1 (assoc m i (- i))
bb (ByteBuffer/allocate (lazy-byte-length m1))]
(lazy-write m1 bb)
(.flip bb)
(load-aamap bb)))
(println)
(def t0 (System/currentTimeMillis))
(def lazy-m (reduce upd lazy-map (range updates)))
(def t1 (System/currentTimeMillis))
(def micr-0 (* 1000. (- t1 t0)))
(println "Time to deserialize/update/reserialize " updates "times =" micr-0 "microseconds")
(println "Time per complete update:" (/ micr-0 updates) "microseconds")
(println)
A value can have more than one form or implementation. For example, a String and a char array may both represent the same value. Most objects, for example, can be serialized. Changing the form of a value then is not a change of value. This is the idea behind lazy deserialization / reserialization.
We can represent a data structure as a tree, where the values in the tree can be either the immutable value of the object or a read-only ByteBuffer holding the serialized form of the value, or both. Both forms can be held by nodes using atoms, so the form of the values can be changed without having to create a new tree. The value remains unchanged and the tree can be safely shared between threads.
Lazy deserialization comes after loading a map or vector from a ByteBuffer. Initially only the root exists, which holds only the ByteBuffer. As you access and alter the contents of the map or vector, the portions of the tree which are accessed are deserialized.
(defn upd [m i]
(let [m1 (assoc m i (- i))
bb (ByteBuffer/allocate (lazy-byte-length m1))]
(lazy-write m1 bb)
(.flip bb)
(load-aamap bb)))
And its invocation...
(def lazy-m (reduce upd lazy-map (range updates)))
Here we see that upd is called repeatedly by the reduce, its first argument being the result of the previous call to upd.
m1 (assoc m i (- i)
The above is in the upd function. It uses assoc to create an updated copy of the map.
bb (ByteBuffer/allocate (lazy-byte-length m1)
bb is a byte buffer with a capacity equal to the length of the serialized contents of m1.
(lazy-write m1 bb)
(.flip bb)
(load-aamap bb))
Now we write the updated map to the byte buffer, flip it, and then create a new map from the byte buffer. This new map is then passed back to the reduce method which calls upd again with the new/updated map.
This usage, being a benchmark, is a bit atypical as we are serializing the contents of the map with each update and then loading the result back into a new map. But. You can see that the map works like the standard clojure map, and only it works additionally with the lazy-byte-length, lazy-write and aamap-load functions. The underlying tree is not normally accessed by the application developer, just as the red/black tree used to implement clojure sorted maps are not normally accessed by the application developer.
So aamap is used just like sorted-map, except for the additional capability of being able to quickly load it from and write it to a ByteBuffer. And, like a sorted-map, you can also specify a comparator for ordering the keys, though the comparator would be passed in the optional opts map on the aamap-load function or in the create-lazy-aamap function.
The full API is given here: https://github.com/laforge49/aatree/blob/master/src/aatree/core.clj (Still need to add doc strings though.)