I have hacked the Clojure core to add durability to refs. The syntax
to create these is (dref <val> <key> <path>), where <key> and <path>
are strings. Then you use them just like refs. Creating a dref
creates a global identity, such that subsequent dref calls to the same
key and path will get the same dref. On subsequent dref calls, the
<val> will be ignored and the persisted value used. This also holds
across subsequent VM instances.
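To make the intended semantics concrete, here is a minimal Java sketch of the behaviour described above. It is not the code from the patch: DurableCell, STORE, CELLS and simulateRestart are hypothetical names, and an in-memory map stands in for the BDB JE database. It only models "same key and path gives the same identity" and "the persisted value wins over <val>".

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model of the dref semantics; null values are not supported here.
final class DurableCell {
    // Stand-in for the on-disk store (the patch uses BDB JE); survives simulateRestart().
    private static final Map<List<String>, Object> STORE = new ConcurrentHashMap<>();
    // In-memory identity table: at most one cell per (path, key).
    private static final Map<List<String>, DurableCell> CELLS = new ConcurrentHashMap<>();

    private final List<String> id;
    private volatile Object value;

    private DurableCell(List<String> id, Object value) {
        this.id = id;
        this.value = value;
    }

    // Mirrors (dref <val> <key> <path>): the initial value is used only if nothing is stored yet.
    static DurableCell dref(Object initial, String key, String path) {
        List<String> id = List.of(path, key);
        return CELLS.computeIfAbsent(id,
            k -> new DurableCell(k, STORE.computeIfAbsent(k, unused -> initial)));
    }

    Object deref() {
        return value;
    }

    // Writes the store first, then the in-memory value.
    synchronized void set(Object newValue) {
        STORE.put(id, newValue);
        value = newValue;
    }

    // Drops the in-memory identities but keeps the store, imitating a fresh VM.
    static void simulateRestart() {
        CELLS.clear();
    }
}
```

Calling dref twice with the same key and path returns the same object, and after simulateRestart() a new dref call sees the last value passed to set rather than its own <val>.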
Get it here: git://github.com/kwanalyssa/clojure.git
1. <path> refers to BDB JE databases which get created in the "data"
directory of your project.
2. BDB JE is used. I'm ignorant of IP and licensing issues. Making
BDB JE core to Clojure is probably an issue.
3. Currently only a subset of Clojure primitive types is supported.
No BigDecimal or Ratio yet. See the comprehensive list (and
serialization mappings) at the bottom of src/jvm/clojure/lang/
DRef.java. BDB JE TupleBindings are used. Submissions welcome,
especially for the persistent data structures.
4. How do we approach the problem of storing objects with lexical
environments?
5. Unit tests welcome! I didn't do TDD since the work is in Java and
there are no Java-level tests in the project. Please add your own in
test/clojure/test_clojure/drefs.clj. I'm new to concurrency, so tests
along those lines would be awesome!
6. The ACID part is not really guaranteed!!! STM is currently one-
phase-commit. I inserted two-phase-commits to the data stores in the
middle of the one-phase-commit. There's the remote possibility that
STM in-memory changes fail AFTER writing to disk. It's REALLY remote,
but it is possible. STM would have to be made 2PC to make this
airtight. That's way beyond my current grasp of both concurrency and
Clojure implementation.
7. I was aiming for an API where <path> is optional. However, I
didn't want to stray from the ref API, which has variable arity.
Suggestions on how to reconcile the two are welcome!
8. To maintain global identity, I use a static cache, which requires
non-hard references to avoid OOM issues. This is my first time doing
this, so please check my code to make sure that I'm doing it right.
I'm using SoftReferences, though WeakReferences may be better for real-
life usage patterns. Let me know!
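For point 8, the following is a minimal sketch of one common way to build such a cache; it is not taken from DRef.java. SoftIdentityCache, Entry, intern and purge are hypothetical names. Each value is held through a SoftReference registered with a ReferenceQueue, so entries whose referent the collector has cleared can be dropped from the map.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical interning cache: one live value per key, held softly.
final class SoftIdentityCache<K, V> {
    // A soft reference that remembers its key so the map entry can be removed later.
    private static final class Entry<K, V> extends SoftReference<V> {
        final K key;

        Entry(K key, V value, ReferenceQueue<V> queue) {
            super(value, queue);
            this.key = key;
        }
    }

    private final ConcurrentHashMap<K, Entry<K, V>> map = new ConcurrentHashMap<>();
    private final ReferenceQueue<V> queue = new ReferenceQueue<>();

    // Returns the cached value for key, creating it with factory if absent or cleared.
    // Under contention the factory may run more than once; only one result is kept.
    V intern(K key, Function<K, V> factory) {
        purge();
        while (true) {
            Entry<K, V> existing = map.get(key);
            V value = (existing == null) ? null : existing.get();
            if (value != null) {
                return value;
            }
            V created = factory.apply(key);
            Entry<K, V> fresh = new Entry<>(key, created, queue);
            boolean installed = (existing == null)
                ? map.putIfAbsent(key, fresh) == null
                : map.replace(key, existing, fresh);
            if (installed) {
                return created;
            }
            // Another thread installed an entry first; loop and use theirs.
        }
    }

    // Removes map entries whose referents have been garbage collected.
    private void purge() {
        Reference<? extends V> ref;
        while ((ref = queue.poll()) != null) {
            Entry<?, ?> stale = (Entry<?, ?>) ref;
            map.remove(stale.key, stale);
        }
    }
}
```

On the Soft versus Weak question: a WeakReference version would look the same with the superclass swapped. Weakly held entries can be collected as soon as no caller holds the dref, while softly held ones tend to stay until memory gets tight. Either way identity is preserved for as long as some caller still holds the object.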
I'm getting back to Clojure after an extended absence. Just today I was pondering the design of a solution to a similar problem, though I suspect our requirements diverge on several points. My tentative conclusion was that it could be done entirely in Clojure and without modifying existing code. Maybe you can poke holes in my fledgling plan since you've obviously been thinking about this sort of problem longer than me:
There's a new pref reference type. It consists of a key and an atom containing nil for unloaded objects and an STM reference for loaded objects.
When a pref is dereferenced, it checks its atom. If nil, it first loads the object from disk into a fresh STM reference (which has a metadata field pointing back to the pref) and mutates the atom so it points to it. In either case it finishes by dereferencing the STM reference.
When a pref is mutated, it first goes through the same motions as for dereferencing. Then it simply forwards the mutation to the underlying STM reference.
Watchers are installed on STM references backed by prefs. Thus we are notified when something is mutated.
There is a pref-specific transaction boundary form called atomic, analogous to dosync. The watchers are used to determine which prefs were mutated during the transaction so as to flag them dirty for write-back or write-through caching; this is why we need the pref back reference in the metadata.
Anyway, even assuming this all works, it will obviously be less computationally efficient than extending LockingTransaction.java with special support.
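The pref plan above can be sketched roughly as follows. This is a simplified model under stated assumptions, not a working Clojure implementation: it is written in Java, a plain Cell object stands in for the STM reference and its metadata back-pointer, set marks the pref dirty directly instead of going through a watcher, and DISK is an in-memory map standing in for the disk. All names (Pref, Cell, DISK, DIRTY, atomic) are hypothetical.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical lazily loaded persistent reference; null values are not supported.
final class Pref {
    static final Map<String, Object> DISK = new ConcurrentHashMap<>(); // stand-in for disk
    static final Set<Pref> DIRTY = ConcurrentHashMap.newKeySet();      // prefs mutated since last flush

    // Stand-in for the STM reference; owner plays the role of the metadata back-pointer.
    private static final class Cell {
        final Pref owner;
        volatile Object value;

        Cell(Pref owner, Object value) {
            this.owner = owner;
            this.value = value;
        }
    }

    private final String key;
    // Holds null while unloaded, the loaded cell afterwards.
    private final AtomicReference<Cell> loaded = new AtomicReference<>();

    Pref(String key) {
        this.key = key;
    }

    boolean isLoaded() {
        return loaded.get() != null;
    }

    // Loads from disk on first use; if two threads race, the loser adopts the winner's cell.
    private Cell cell() {
        Cell current = loaded.get();
        if (current == null) {
            Cell fresh = new Cell(this, DISK.get(key));
            current = loaded.compareAndSet(null, fresh) ? fresh : loaded.get();
        }
        return current;
    }

    Object deref() {
        return cell().value;
    }

    // Same motions as deref, then forward the mutation and flag the pref dirty.
    void set(Object value) {
        Cell c = cell();
        c.value = value;
        DIRTY.add(c.owner);
    }

    // Write-through version of the proposed atomic form: run the body, then flush dirty prefs.
    // There is no isolation or rollback here; a real version would wrap dosync.
    static void atomic(Runnable body) {
        body.run();
        for (Iterator<Pref> it = DIRTY.iterator(); it.hasNext();) {
            Pref p = it.next();
            it.remove();
            DISK.put(p.key, p.deref());
        }
    }
}
```

A write-back variant would leave the prefs in DIRTY at the end of atomic and flush them later, for example on cache eviction.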
> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clojure@googlegroups.com
> Note that posts from new members are moderated - please be patient with your first post.
> To unsubscribe from this group, send email to
> clojure+unsubscribe@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en
It's probably possible to do it completely in Clojure, but you have to
subclass Atom. There's no need for any transaction boundary; you just
have to make sure that compareAndSet does a durable swap.
My plan was to get durable refs done and then extend the other mutable
identities, including atom. I'd love to work with you on it!
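As a rough illustration of "compareAndSet does a durable swap", here is a self-contained Java sketch. It does not subclass clojure.lang.Atom; DurableAtom and its store parameter are hypothetical, and a plain Map stands in for the durable store. The point is only that the durable write and the in-memory swap happen under one lock, so each successful compareAndSet acts as its own small transaction.

```java
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical durable atom; null values are not supported.
final class DurableAtom<T> {
    private final String key;
    private final Map<String, Object> store; // stand-in for the durable store
    private T value;

    // Uses the persisted value if one exists; otherwise persists the initial value.
    @SuppressWarnings("unchecked")
    DurableAtom(String key, T initial, Map<String, Object> store) {
        this.key = key;
        this.store = store;
        Object persisted = store.putIfAbsent(key, initial);
        this.value = (persisted == null) ? initial : (T) persisted;
    }

    synchronized T deref() {
        return value;
    }

    // Compares by identity, like AtomicReference. The store is written first, so if
    // the write throws, the in-memory value is left unchanged.
    synchronized boolean compareAndSet(T expected, T next) {
        if (value != expected) {
            return false;
        }
        store.put(key, next);
        value = next;
        return true;
    }

    // Retry loop in the style of swap!: recompute until the compareAndSet succeeds.
    T swap(UnaryOperator<T> f) {
        while (true) {
            T old = deref();
            T next = f.apply(old);
            if (compareAndSet(old, next)) {
                return next;
            }
        }
    }
}
```

Holding a lock across the store write is the simplest way to keep the two in step in a sketch; it gives up the lock-free property of a real atom.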
On Sep 23, 8:27 am, Per Vognsen <per.vogn...@gmail.com> wrote:
On Thu, Sep 23, 2010 at 8:16 PM, Alyssa Kwan <alyssa.c.k...@gmail.com> wrote:
> There's no need for any transaction boundary; you just
> have to make sure that compareAndSet does a durable swap.
I had the chance to read your code today. You have a transaction boundary in DRef.set() which is called by LockingTransaction.run() at commit time. My point was that if you weren't intrusively modifying LockingTransaction.java you would need to take care of that somewhere else, the most obvious place being a dosync wrapper form. All you would need is a seq of 'vals' returned on a committed run(). That would also be useful for application-side transaction logging, etc.
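A toy illustration of the "seq of vals returned on a committed run()" idea, in Java and without Clojure's STM: LoggingTx, Cell and COMMIT_LOCK are hypothetical names, and there is no conflict detection or retry. The transaction buffers its writes, applies them at commit, and returns what it committed so that a wrapper can persist or log the result.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical toy transaction: buffers writes, applies them at commit, reports them.
final class LoggingTx {
    // A named mutable cell standing in for a ref.
    static final class Cell {
        final String name;
        Object value;

        Cell(String name, Object value) {
            this.name = name;
            this.value = value;
        }
    }

    private static final Object COMMIT_LOCK = new Object();
    private final Map<Cell, Object> writes = new LinkedHashMap<>();

    private LoggingTx() {
    }

    // Reads see this transaction's own buffered writes first.
    Object read(Cell cell) {
        return writes.containsKey(cell) ? writes.get(cell) : cell.value;
    }

    void write(Cell cell, Object value) {
        writes.put(cell, value);
    }

    // Runs the body, commits its writes, and returns the committed values by cell name.
    static Map<String, Object> run(Consumer<LoggingTx> body) {
        LoggingTx tx = new LoggingTx();
        body.accept(tx);
        Map<String, Object> committed = new LinkedHashMap<>();
        synchronized (COMMIT_LOCK) {
            for (Map.Entry<Cell, Object> w : tx.writes.entrySet()) {
                w.getKey().value = w.getValue();
                committed.put(w.getKey().name, w.getValue());
            }
        }
        return committed;
    }
}
```

A dosync-style wrapper would take the returned map and hand it to the persistence layer or a transaction log after the commit has succeeded.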
Ah. I thought we were discussing prefs, or datoms (durable atoms), as
I would call them. Because datoms are only synchronous but not
coordinated, there's no transaction boundary. (More accurately, the
swap! is the transaction boundary, much like auto-commit.) dosync has
no effect on datoms.
drefs, being coordinated, do require a transaction boundary. However,
I don't think it's possible without modifying LockingTransaction.
It's bad enough that the current implementation has 2PC against ACID
resources wrapped inside of a 1PC STM transaction. To place the
durable write outside of the 1PC would be much less safe. dosync
enforces a global transaction order. If writes were outside
LockingTransaction.run(), the order could (and probably would) be
different between in-memory resources and durable resources. For
ultimate safety, we need to be even more intrusive and add a prepare
phase to the STM.
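For readers unfamiliar with the prepare phase mentioned above, here is a generic two-phase commit sketch in Java. It is not a change to LockingTransaction: Participant, TwoPhaseCommit and RecordingParticipant are hypothetical names. In the setting discussed here the in-memory STM and the durable store would each be a participant.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical participant in a two-phase commit.
interface Participant {
    boolean prepare(); // vote; must not make changes visible yet
    void commit();     // must not fail once prepare has returned true
    void rollback();   // undo whatever prepare reserved
}

final class TwoPhaseCommit {
    // Commits only if every participant votes yes; otherwise rolls back those already prepared.
    static boolean run(List<Participant> participants) {
        List<Participant> prepared = new ArrayList<>();
        for (Participant p : participants) {
            boolean vote;
            try {
                vote = p.prepare();
            } catch (RuntimeException e) {
                vote = false;
            }
            if (!vote) {
                // A participant that votes no is assumed to have cleaned up after itself.
                for (Participant q : prepared) {
                    q.rollback();
                }
                return false;
            }
            prepared.add(p);
        }
        for (Participant p : participants) {
            p.commit();
        }
        return true;
    }
}

// Test helper that records the order of calls.
final class RecordingParticipant implements Participant {
    private final String name;
    private final boolean vote;
    private final List<String> log;

    RecordingParticipant(String name, boolean vote, List<String> log) {
        this.name = name;
        this.vote = vote;
        this.log = log;
    }

    public boolean prepare() {
        log.add(name + ":prepare");
        return vote;
    }

    public void commit() {
        log.add(name + ":commit");
    }

    public void rollback() {
        log.add(name + ":rollback");
    }
}
```

The guarantee rests on the contract that commit cannot fail after a successful prepare. That contract is what the current one-phase STM commit does not offer.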
On Sep 23, 11:05 pm, Per Vognsen <per.vogn...@gmail.com> wrote:
This probably comes back to divergent requirements. Strict durability is much too expensive for what I need to do. For me the more important thing is that whatever authoritative data lives on disk is consistent with the application transaction boundaries. This means that I need to tag persistent refs as dirty or increment a version number when an STM transaction commits, so that when they are evicted from cache or a consistent snapshot is written to disk, I know what to write out. Another simplifying requirement is that I don't have to worry about different database domains for persistent refs. The application is organized around a single database that serves as a persistent store for application domain data.
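The version-number approach described above might look something like this sketch, written in Java. VersionedCache, Slot, commit and snapshot are hypothetical names and a Map stands in for the single on-disk database. Each committed transaction bumps the version of the slots it touched, and a snapshot writes out only the slots whose version has moved since the last flush.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical write-back cache keyed by version numbers.
final class VersionedCache {
    private static final class Slot {
        Object value;
        long version;        // bumped on every committed change
        long flushedVersion; // version last written to disk
    }

    private final Map<String, Slot> slots = new HashMap<>();
    private final Map<String, Object> disk; // stand-in for the single database

    VersionedCache(Map<String, Object> disk) {
        this.disk = disk;
    }

    // Applies one transaction's changes as a unit; nothing is written to disk here.
    synchronized void commit(Map<String, ?> changes) {
        for (Map.Entry<String, ?> change : changes.entrySet()) {
            Slot slot = slots.computeIfAbsent(change.getKey(), k -> new Slot());
            slot.value = change.getValue();
            slot.version++;
        }
    }

    // Writes every slot changed since its last flush and returns how many were written.
    synchronized int snapshot() {
        int written = 0;
        for (Map.Entry<String, Slot> e : slots.entrySet()) {
            Slot slot = e.getValue();
            if (slot.version > slot.flushedVersion) {
                disk.put(e.getKey(), slot.value);
                slot.flushedVersion = slot.version;
                written++;
            }
        }
        return written;
    }
}
```

Because commit and snapshot share a lock, a snapshot never sees half of a transaction, which is the consistency property asked for here. The trade-off is that changes committed after the last snapshot are lost on a crash.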
On Fri, Sep 24, 2010 at 10:40 AM, Alyssa Kwan <alyssa.c.k...@gmail.com> wrote: