Account Options

  1. Sign in
The old Google Groups will be going away soon, but your browser is incompatible with the new version.
Google Groups Home
« Groups Home
The unshared part of two mostly-shared structures
There are currently too many topics in this group that display first. To make this topic appear first, remove this option from another topic.
There was an error processing your request. Please try again.
flag
  6 messages - Collapse all  -  Translate all to Translated (View all originals)
The group you are posting to is a Usenet group. Messages posted to this group will make your email address visible to anyone on the Internet.
Your reply message has not been sent.
Your post was successful
 
From:
To:
Cc:
Followup To:
Add Cc | Add Followup-to | Edit Subject
Subject:
Validation:
For verification purposes please type the characters you see in the picture below or the numbers you hear by clicking the accessibility icon. Listen and type the numbers you hear
 
mikel  
View profile  
 More options Mar 21 2009, 12:23 pm
From: mikel <mev...@mac.com>
Date: Sat, 21 Mar 2009 09:23:13 -0700 (PDT)
Local: Sat, Mar 21 2009 12:23 pm
Subject: The unshared part of two mostly-shared structures
Clojure efficiently shares structure of composite objects. For
example, given a Map with 10,000 entries, you can inexpensively
created a new Map with one changed entry, because Clojure reuses the
other 9999 entries.

Is there a convenient and efficient API that can return just the
changed entry?

If there is, it would be very convenient indeed for one of my
application's needs: it would enable me to efficiently copy deltas
from processes on one machine to processes on another. As a process
updates application state, I could accrue deltas and send just the
changed portion of the state across the network to the remote process,
which, being in Map form, would be very easy to merge with the remote
state.

I haven't looked at Clojure at all with this in mind, but I know I'm
soon going to need to implement this updating mechanism. If you know
how to efficiently obtain just the diffrences between two otherwise
shared Maps, I'd be interested to learn how to do it.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Chouser  
View profile  
 More options Mar 21 2009, 5:01 pm
From: Chouser <chou...@gmail.com>
Date: Sat, 21 Mar 2009 17:01:09 -0400
Local: Sat, Mar 21 2009 5:01 pm
Subject: Re: The unshared part of two mostly-shared structures

On Sat, Mar 21, 2009 at 12:23 PM, mikel <mev...@mac.com> wrote:

> I haven't looked at Clojure at all with this in mind, but I know I'm
> soon going to need to implement this updating mechanism. If you know
> how to efficiently obtain just the diffrences between two otherwise
> shared Maps, I'd be interested to learn how to do it.

This is interesting to me.  I've thought for a while that I wanted to
be notified of changes to a collection, such as in a Ref's watcher.
But I couldn't think of a good way to go about it, since the new value
of a Ref may have no relationship at all with the old value.

But your question suggests a different approach.  Efficiently
obtaining a "diff" that describes the differences between two
collections would suffice for most of my use cases.o

The collections don't support this now, as far as I know, but off the
top of my head it seems like it would be possible for any
collection instance to walk itself at the same time as another
instance of the exact same type, following only nodes that are not
identical references.  This wouldn't work to compare an array-map with
a hash-map, for example, but in an ideal case of two hash-maps with a
single changed entry, it seems like it could be very efficient.
Perhaps O(n) where n is the number of changed leaf nodes?

I'm not sure what the resulting "diff" would look like.  It would be
nice if it were simple another hash-map: (diff a b) would return a map
c such that (merge a c) would produce a value equal to b again.  But
for course this can't be as 'merge' never removes an entry.  An option
that would work (but doesn't strike me as terribly elegant) would be
for (diff a b) to return [c d], such that:

  (= b (merge (apply dissoc a c) d))

--Chouser


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
rapido  
View profile  
 More options Mar 21 2009, 6:01 pm
From: rapido <robbert.van.da...@gmail.com>
Date: Sat, 21 Mar 2009 15:01:33 -0700 (PDT)
Local: Sat, Mar 21 2009 6:01 pm
Subject: Re: The unshared part of two mostly-shared structures
i have done some homework on this problem :)

(warning: not clojure specific)

my claim is that diffing two similar sets (or maps) *can* be made
efficient only *if* you can add arbitrary items to sets efficiently
(i.e. O(log(n)).

remember that immutable sets are usually laid out in a tree structure.
this also means that similar sets share a lot of tree structure.

if pointer equality is allowed, the differencing of two sets can be
made O(n) where n is the number of differences.
this can be achieved by comparing (on equality) left and right (sub)
trees of two sets.

more strongly, if the (sorted) differences between two sets are
consecutive, it can be made O(log n).
and more generally, if there are (m) blocks of consecutive (n)
differences, you can get it O(m*log(n)))
(google search 'treaps')

pointer equality doesn't always cut it however:
when two disparate processes build two similar sets (or trees), their
pointers will be different and pointer equality fails.

this can be solved by replacing pointers with hashes (of recursive
tree content), effectively replacing pointer equality by hash
equality.

my programming language enchilada (www.enchiladacode.nl) has
hash=pointer equality build in.
i believe it shouldn't be to difficult to introduce some of
enchilada's internals to clojure.

- robbert.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Stuart Sierra  
View profile  
 More options Mar 22 2009, 12:26 pm
From: Stuart Sierra <the.stuart.sie...@gmail.com>
Date: Sun, 22 Mar 2009 09:26:15 -0700 (PDT)
Local: Sun, Mar 22 2009 12:26 pm
Subject: Re: The unshared part of two mostly-shared structures
On Mar 21, 6:01 pm, rapido <robbert.van.da...@gmail.com> wrote:

> my programming language enchilada (www.enchiladacode.nl) has
> hash=pointer equality build in.
> i believe it shouldn't be to difficult to introduce some of
> enchilada's internals to clojure.

Hi Robbert,

I imagine that hash-pointer equality would be difficult to impossible
in Clojure, since it relies on Java's hash/pointer/equality semantics.

But Enchilada looks interesting, and a hash-based data structure in
Clojure would also be interesting.

-Stuart Sierra


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
mikel  
View profile  
 More options Mar 22 2009, 7:25 pm
From: mikel <mev...@mac.com>
Date: Sun, 22 Mar 2009 16:25:12 -0700 (PDT)
Local: Sun, Mar 22 2009 7:25 pm
Subject: Re: The unshared part of two mostly-shared structures
Thanks for the comments, folks. I may be able to raw some ideas from
them. In my specific case, restrictive rules about the types of the
objects may be quite workable. For example, I don't think I'd suffer
if the state-preserving objects were all required to be hash-maps.

I'll think some more about it.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
e  
View profile  
 More options Mar 23 2009, 12:06 am
From: e <evier...@gmail.com>
Date: Mon, 23 Mar 2009 00:06:27 -0400
Local: Mon, Mar 23 2009 12:06 am
Subject: Re: The unshared part of two mostly-shared structures

> my programming language enchilada (www.enchiladacode.nl)

 interesting.  I was thinking today about how STM reminds me of version
control.

 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
End of messages
« Back to Discussions « Newer topic     Older topic »