Get difference between two lists with java objects of same class


Ryan

Mar 11, 2013, 2:15:31 PM
to clo...@googlegroups.com
Hello,

I have two lists which contain Java objects of the same class. I am trying to find out if there is a Clojure function which I can use to compare those lists based on a key and return the objects from list A that do not exist in list B.

Is there such a function in Clojure? If not, what would be the most elegant way to achieve it?

Thank you for your time

Ryan

Jim - FooBar();

Mar 11, 2013, 2:35:58 PM
to clo...@googlegroups.com
Well, java.util.List specifies a retainAll(Collection c) method which is
basically the intersection between the 2 collections (the Collection
this is called on and the argument). You are actually looking for the
'difference' but if you have the intersection and the total it's pretty
trivial to find the difference.

alternatively, you can pour the lists into 2 clojure sets and take their
proper difference (but this will remove duplicates as well)...
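
A minimal sketch of the set-based route, assuming the elements compare sensibly with = (the sample lists are made up for illustration):

```clojure
(require '[clojure.set :as set])

;; Hypothetical stand-ins for the two lists; `set` accepts any java.util.List.
(def list-a (java.util.ArrayList. [1 2 2 3 4]))
(def list-b (java.util.ArrayList. [3 4 5]))

;; Elements of list-a that are absent from list-b.
;; Duplicates and the original order are lost along the way.
(def only-in-a (set/difference (set list-a) (set list-b)))
;; only-in-a => #{1 2}
```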

I'm not sure what you mean 'compare those lists based on a key' though...


Jim

Jim - FooBar();

Mar 11, 2013, 2:49:45 PM
to clo...@googlegroups.com
On 11/03/13 18:35, Jim - FooBar(); wrote:
> Well, java.util.List specifies a retainAll(Collection c) method which
> is basically the intersection between the 2 collections (the
> Collection this is called on and the argument). You are actually
> looking for the 'difference' but if you have the intersection and the
> total it's pretty trivial to find the difference.

actually there is a removeAll(Collection c) which will (destructively)
give you exactly what you want...I guess that would be the fastest way
if you don't care about what happens to list A...
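
For illustration, a small interop sketch of removeAll (sample data made up; the defensive copy is optional and only there in case list A should survive):

```clojure
(def list-a (java.util.ArrayList. ["a" "b" "c" "d"]))
(def list-b (java.util.ArrayList. ["b" "d"]))

;; removeAll mutates its receiver, so work on a copy if list-a must stay intact.
(def remaining (java.util.ArrayList. list-a))
(.removeAll remaining list-b) ; returns true if anything was removed

(vec remaining) ;; => ["a" "c"]
(vec list-a)    ;; => ["a" "b" "c" "d"], untouched thanks to the copy
```

Membership here is decided by the elements' own .equals().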

hope that helps,

Jim

Ryan

Mar 11, 2013, 3:35:57 PM
to clo...@googlegroups.com
Hey Jim,

Thanks for your replies for starters.

Indeed, I do not care what happens to the original lists; I only care to find out which objects from list A do not exist in list B. Ignore the key part.
I was aware of the functionality provided by java.util.List, but I was hoping for a more clojurish way instead of calling Java methods.

Or is it pointless and I should just use removeAll()?

Ryan

Andy Fingerhut

Mar 11, 2013, 3:51:12 PM
to clo...@googlegroups.com
There is clojure.data/diff, but whether it would work for you would depend on whether Clojure's = would compare your Java objects for equality in the way that you wanted.  You could try it out on some test case to see.
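
A quick test case along those lines might look like this (plain numbers in sets, purely as an illustration; with your own objects the outcome depends on how = treats them):

```clojure
(require '[clojure.data :as data])

;; diff returns a vector [only-in-a only-in-b in-both].
;; Sets are compared by membership; sequential collections are compared
;; position by position, so pour lists into sets if order should not matter.
(def result (data/diff #{1 2 3} #{2 3 4}))
;; result => [#{1} #{4} #{2 3}]
```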


Andy

Jim - FooBar();

Mar 11, 2013, 4:05:44 PM
to clo...@googlegroups.com
Clojure itself is being pragmatic about many many things... If you want removeAll() then use it...what can be better than a single method call?

mind you though, what Andy said applies here. It depends what you want to compare for...you want to do a 'deep' comparison (per =) or fall back to java's broken equality semantics? If you use removeAll() you automatically lose the ability to make such decisions for yourself... :-)


Jim

Marko Topolnik

Mar 11, 2013, 5:01:09 PM
to clo...@googlegroups.com

> I have two lists which contain Java objects of the same class. I am trying to find out if there is a Clojure function which I can use to compare those lists based on a key and return the objects from list A that do not exist in list B.

Lists as in Clojure lists, or as in java.util.Lists?

Anyway, since you apparently need your own equality function (can't rely on Object#equals), you'll have to build more structures. Make a function that extracts the key from the entire object, say key-fn, and then do

(apply dissoc (into {} (map #(-> [(key-fn %) %]) list-a)) (map key-fn list-b))
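
A runnable sketch of this map-based idea, with made-up sample data. Here (juxt key-fn identity) builds the [key object] pairs and vals pulls the surviving objects back out; both are my additions for the example:

```clojure
;; Hypothetical sample data: maps stand in for the objects, :id is the key.
(def list-a [{:id 1 :name "a"} {:id 2 :name "b"} {:id 3 :name "c"}])
(def list-b [{:id 2 :name "x"}])
(def key-fn :id)

;; Index list-a by key, drop every key present in list-b,
;; then keep the objects that remain.
(def only-in-a
  (vals (apply dissoc
               (into {} (map (juxt key-fn identity) list-a))
               (map key-fn list-b))))
;; only-in-a holds {:id 1 :name "a"} and {:id 3 :name "c"};
;; a map does not promise to keep list-a's order.
```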

Marko Topolnik

Mar 11, 2013, 5:05:47 PM
to clo...@googlegroups.com
Another approach, preserving the order of list-a, would be

(remove (comp (into #{} (map key-fn list-b)) key-fn) list-a)
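
To make that concrete with real Java objects, here is a sketch that uses java.io.File and treats the file name as the key (the paths are hypothetical and need not exist):

```clojure
(import 'java.io.File)

;; Hypothetical data: File objects keyed by their file name.
(def list-a [(File. "a/one.txt") (File. "a/two.txt") (File. "a/three.txt")])
(def list-b [(File. "b/two.txt")])
(def key-fn #(.getName ^File %))

;; Keep the members of list-a whose key is not among the keys of list-b.
(def only-in-a
  (remove (comp (into #{} (map key-fn list-b)) key-fn) list-a))

(map key-fn only-in-a) ;; => ("one.txt" "three.txt")
```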

Ryan

Mar 11, 2013, 5:55:12 PM
to clo...@googlegroups.com
Thank you all for your replies.

@Marko, well, at first my question was intended to be more generic, but in reality I am dealing with two java.util.Lists

I probably didn't describe very well what I wanted in the first place. All the objects in those lists contain an "id" property which has a unique value. I am trying to find out which objects from the first list, based on that "id" property, are not included in list B.

I assume this rules out the use of clojure.data/diff and I will need my own function like Marko suggested to make those property comparisons.

Ryan

Marko Topolnik

Mar 11, 2013, 6:03:47 PM
to clo...@googlegroups.com
On Monday, March 11, 2013 10:55:12 PM UTC+1, Ryan wrote:
> Thank you all for your replies.
>
> @Marko, well, at first my question was intended to be more generic, but in reality I am dealing with two java.util.Lists

I only asked because if they aren't java.util.Lists to begin with, you'd definitely not want to convert into one just to use removeAll.

> I probably didn't describe very well what I wanted in the first place. All the objects in those lists contain an "id" property which has a unique value. I am trying to find out which objects from the first list, based on that "id" property, are not included in list B.
>
> I assume this rules out the use of clojure.data/diff and I will need my own function like Marko suggested to make those property comparisons.

Yes, from data/diff's perspective you'd have all distinct objects.

Ryan

Mar 11, 2013, 6:09:31 PM
to clo...@googlegroups.com
> I only asked because if they aren't java.util.Lists to begin with, you'd definitely not want to convert into one just to use removeAll.

What if I had two Clojure lists of hash-maps which have the same keys and, based on a specific key, I wanted to find the items from list-a which do not exist in list-b? Would I go with the two functions you suggested, or is there something else I could use?

Ryan

Jim - FooBar();

Mar 11, 2013, 6:30:11 PM
to clo...@googlegroups.com
On 11/03/13 21:55, Ryan wrote:
> I probably didn't describe very well what I wanted in the first
> place. All the objects in those lists contain an "id" property which
> has a unique value. I am trying to find out, which objects from the
> first list, based on that "id" property, are not included in list b.
>

If those objects override .equals() properly then I presume it would
work just fine calling removeAll() and relying on the Collections
framework which will eventually call .equals(). Do you own those
objects? Is the 'id' field declared final and is it involved in the
corresponding .equals() methods? Are those objects inheriting stuff?
there are so many things that can go wrong when you start to wonder
about object equality...

if performance is not critical to your problem you could create
intermediate records/maps , do your logic safe and concise, get the ids
you want and go find the original objects these ids belong to...in fact
you can attach those objects as meta-data to save yourself the hassle!


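;; not tested but seems reasonable!
(require 'clojure.set)

;; A record whose only field is the id; record equality ignores metadata.
(defrecord TEMP [id])

;; The 3-argument constructor takes the field, a metadata map and an extension map.
(def step1
  (clojure.set/difference
    (set (map #(TEMP. (.getId %) {:ob %} nil) java-util-list-with-objects1))
    (set (map #(TEMP. (.getId %) {:ob %} nil) java-util-list-with-objects2)))) ;; half way there

;; Recover the original objects from the metadata.
(def step2
  (for [t step1]
    (-> t meta :ob)))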

hope that helps...

Jim

ps: now that I look at it maybe a map (with 2 keys :id :ob) seems a
simpler choice...
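
A sketch of that simpler map variant, under my own assumptions: the original object still travels as metadata rather than under an :ob key, because an :ob entry would take part in equality and make every wrapper distinct. java.io.File and .getName stand in for the real class and its id getter, and wrap is a hypothetical helper name:

```clojure
(require '[clojure.set :as set])
(import 'java.io.File)

;; Hypothetical data; .getName plays the role of the id getter.
(def objects-1 [(File. "a/one.txt") (File. "a/two.txt")])
(def objects-2 [(File. "b/two.txt")])

;; Wrap each object in a small map that compares by id only.
;; Metadata does not take part in equality, so the object rides along there.
(defn wrap [^File o]
  (with-meta {:id (.getName o)} {:ob o}))

(def only-in-1
  (map (comp :ob meta)
       (set/difference (set (map wrap objects-1))
                       (set (map wrap objects-2)))))
;; only-in-1 holds the File for a/one.txt
```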


Michael Gardner

Mar 12, 2013, 4:04:44 AM
to clo...@googlegroups.com
On Mar 11, 2013, at 17:09 , Ryan <areka...@gmail.com> wrote:

> What if, i had two clojure lists, with hash-maps which have the same keys and based on a specific key, i wanted to find the items from list-a which do not exist in list-b. Would i go with the two functions you suggested or is there something else I could use?

Assuming :id is the key you care about:

(filter (comp (complement (set (map :id list-b))) :id)
list-a)

user=> (defn rand-seq [n] (repeatedly #(rand-int n)))
#'user/rand-seq
user=> (def list-a (map (partial hash-map :id) (take 5 (rand-seq 10))))
#'user/list-a
user=> list-a
({:id 8} {:id 4} {:id 5} {:id 1} {:id 6})
user=> (def list-b (map (partial hash-map :id) (take 5 (rand-seq 10))))
#'user/list-b
user=> list-b
({:id 9} {:id 6} {:id 3} {:id 6} {:id 3})
user=> (filter (comp (complement (set (map :id list-b))) :id) list-a)
({:id 8} {:id 4} {:id 5} {:id 1})

Marko Topolnik

Mar 12, 2013, 5:11:46 AM
to clo...@googlegroups.com
> Assuming :id is the key you care about:
> (filter (comp (complement (set (map :id list-b))) :id)
>     list-a)

This is almost exactly the same as the one from an earlier post here:

(remove (comp (into #{} (map key-fn list-b)) key-fn) list-a) 

I'd prefer remove to filter + complement, though.

-Marko

Marko Topolnik

Mar 12, 2013, 5:14:17 AM
to clo...@googlegroups.com
On Monday, March 11, 2013 11:09:31 PM UTC+1, Ryan wrote:
> What if I had two Clojure lists of hash-maps which have the same keys and, based on a specific key, I wanted to find the items from list-a which do not exist in list-b? Would I go with the two functions you suggested, or is there something else I could use?

The only difference would be in the definition of the key-fn. With Java classes it would be #(.getId %) and with Clojure maps it would be just :id --- keywords are already functions!
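
As a sketch, the key function can simply be a parameter. The name difference-by is hypothetical, not a core function, and String's .length stands in for a getter such as .getId:

```clojure
(defn difference-by
  "Returns the items of list-a whose (key-fn item) does not occur
  among the keys of list-b."
  [key-fn list-a list-b]
  (remove (comp (set (map key-fn list-b)) key-fn) list-a))

;; Clojure maps: the keyword itself is the key function.
(difference-by :id [{:id 1} {:id 2} {:id 3}] [{:id 2}])
;; => ({:id 1} {:id 3})

;; Java objects: wrap the getter in a function.
(difference-by #(.length ^String %) ["a" "bb" "ccc"] ["xx"])
;; => ("a" "ccc")
```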

-Marko

Ryan

Mar 12, 2013, 5:22:07 AM
to clo...@googlegroups.com
Thanks guys for your replies. I will re-read everything carefully and decide what to do :)

Ryan

Michael Gardner

Mar 12, 2013, 5:26:02 AM
to clo...@googlegroups.com
On Mar 12, 2013, at 04:11 , Marko Topolnik <marko.t...@gmail.com> wrote:

> This is almost exactly the same as the one from an earlier post here:
>
> (remove (comp (into #{} (map key-fn list-b)) key-fn) list-a)
>
> I'd prefer remove to filter + complement, though.

Ah, I should have read the rest of the thread more carefully. I keep forgetting that 'remove exists.

I do prefer 'set and friends to 'into, though.

Ryan

Mar 21, 2013, 11:14:46 AM
to clo...@googlegroups.com
Marko,

Can you please do me a favor and break down the function you suggested to me? I partially understand how it works, but I am having trouble fully getting it.

Thank you for your time.

On Monday, March 11, 2013 11:05:47 PM UTC+2, Marko Topolnik wrote:

Marko Topolnik

Mar 21, 2013, 11:37:33 AM
to clo...@googlegroups.com
First we build a set of all the keys in list-b

(into #{} (map key-fn list-b))

Let's call that set keyset-b. Then we use keyset-b as a function which returns truthy (non-nil) for any key that is contained in it, and compose it with our key-fn:

(comp keyset-b key-fn)

This results in a function that first applies key-fn, then keyset-b. So it's like #(keyset-b (key-fn %)). Let's call this function predicate.

Finally, we use predicate to remove any member of list-a for which it is truthy:

(remove predicate list-a)
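
The same steps, spelled out at the REPL with made-up data (maps with an :id key, so key-fn is just :id):

```clojure
(def list-a [{:id 8} {:id 4} {:id 6}])
(def list-b [{:id 9} {:id 6} {:id 3}])
(def key-fn :id)

(def keyset-b (into #{} (map key-fn list-b))) ; the set of keys 9, 6 and 3
(def predicate (comp keyset-b key-fn))        ; item -> key -> set lookup

(predicate {:id 6}) ;; => 6   (truthy, so the item is removed)
(predicate {:id 8}) ;; => nil (falsey, so the item is kept)

(remove predicate list-a) ;; => ({:id 8} {:id 4})
```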

-marko

On Thursday, March 21, 2013 4:14:46 PM UTC+1, Ryan wrote:
> Marko,
>
> Can you please do me a favor and break down the function you suggested to me? I partially understand how it works, but I am having trouble fully getting it.

Ryan

Mar 21, 2013, 11:58:08 AM
to clo...@googlegroups.com
Thanks a lot Marko. Much better now :)

I also wanted to ask why you mentioned in a previous post that you prefer remove to filter + complement. Is there a reason for this, or is it just a personal preference?

Ryan

Marko Topolnik

Mar 21, 2013, 12:09:42 PM
to clo...@googlegroups.com
Personal preference. It causes less mental load because it more obviously spells out what you are doing.

Ryan

Mar 21, 2013, 12:21:53 PM
to clo...@googlegroups.com
Thanks Marko. I do have a couple more questions for you, just to ensure I got everything right:

> (comp keyset-b key-fn)
> This results in a function that first applies key-fn, then keyset-b. So it's like #(keyset-b (key-fn %)). Let's call this function predicate.

1. What exactly happens when an item is passed to #(keyset-b (key-fn %))? Does keyset-b look itself up (because collections are functions) for the item which contains :id X and return true/false?
2. Isn't it more idiomatic to write #((key-fn %) keyset-b)?
3. Does remove loop over list-a internally and apply the predicate to each item? (If the answer is no, my head will definitely explode.)

Thanks again for your patience :)

Ryan

Marko Topolnik

Mar 21, 2013, 2:48:51 PM
to clo...@googlegroups.com
On Thursday, March 21, 2013 5:21:53 PM UTC+1, Ryan wrote:
> Thanks Marko. I do have a couple more questions for you, just to ensure I got everything right:
>
>> (comp keyset-b key-fn)
>> This results in a function that first applies key-fn, then keyset-b. So it's like #(keyset-b (key-fn %)). Let's call this function predicate.

> 1. What exactly happens when an item is passed to #(keyset-b (key-fn %))? Does keyset-b look itself up (because collections are functions) for the item which contains :id X and return true/false?

It returns the argument if it contains it, and otherwise nil.

> 2. Isn't it more idiomatic to write #((key-fn %) keyset-b)?

No, because it doesn't work :) An arbitrary object cannot be applied as a function.
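
A small sketch of the difference, with made-up values. The set is callable, while the value returned by key-fn (a Long here) is not, so the reversed call throws:

```clojure
(def keyset-b #{3 6 9})
(def key-fn :id)
(def item {:id 3})

(keyset-b (key-fn item))           ;; => 3, the set looks its argument up in itself
(keyset-b 4)                       ;; => nil
(contains? keyset-b (key-fn item)) ;; => true, if a strict boolean is wanted

;; Reversed order: (key-fn item) is 3, and a Long does not implement IFn.
(def reversed-call-result
  (try
    ((key-fn item) keyset-b)
    (catch ClassCastException _ :class-cast-exception)))
;; reversed-call-result => :class-cast-exception
```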
 
> 3. Does remove loop over list-a internally and apply the predicate to each item? (If the answer is no, my head will definitely explode.)

remove is just like filter, only with reversed logic. Its implementation in fact is literally
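
For a predicate and a collection, remove gives the same result as filtering with the complemented predicate. A small sketch, where my-remove is a hypothetical name used only for illustration:

```clojure
;; Illustrative re-implementation, not the core source itself.
(defn my-remove [pred coll]
  (filter (complement pred) coll))

(my-remove even? [1 2 3 4 5]) ;; => (1 3 5)
(remove even? [1 2 3 4 5])    ;; => (1 3 5)
```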

Ryan

Mar 22, 2013, 6:21:17 AM
to clo...@googlegroups.com
Thanks once again Marko. The only thing that I am having trouble understanding is this:

>> 1. What exactly happens when an item is passed to #(keyset-b (key-fn %))? Does keyset-b look itself up (because collections are functions) for the item which contains :id X and return true/false?
>
> It returns the argument if it contains it, and otherwise nil.

Let's assume that key-fn is defined as #(.getID %) so we have:

#(keyset-b (.getID %))

And now let's assume that item-object is passed to it. So, (.getID %) returns, let's say, the number 3 (which is the value of the id). How exactly is that number looked up in keyset-b? How does keyset-b know we are looking for an item with key id and value 3?

Apparently I am missing something here hence my confusion.

Cheers

Marko Topolnik

Mar 22, 2013, 6:44:06 AM
to clo...@googlegroups.com
> Let's assume that key-fn is defined as #(.getID %) so we have:
>
> #(keyset-b (.getID %))
>
> And now let's assume that item-object is passed to it. So, (.getID %) returns, let's say, the number 3 (which is the value of the id). How exactly is that number looked up in keyset-b? How does keyset-b know we are looking for an item with key id and value 3?

Well, as its name already indicates, the keyset contains the keys. So it will literally contain that 3 as its member.

-marko 

Ryan

Mar 22, 2013, 12:05:45 PM
to clo...@googlegroups.com
Thanks a lot Marko :)