Booleans don't seem to be serialized properly

sritchie

Sep 10, 2011, 5:55:02 PM
to cascading-user
Hey all,

I'm using Cascalog here, and I think this bug surfaced due to
Clojure's take on truthiness.

I'm trying to pass false, which in Clojure is the same object as
Boolean.FALSE, as a parameter into a filter op. After serialization I
seem to be getting a new Boolean instance, which Clojure treats as
true but which still behaves as false in Java.

Here's an example of this issue:

user=> (or (Boolean. false) true)
false
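
One way such a non-canonical instance can arise is standard Java serialization. The following is a minimal sketch, assuming the parameter is round-tripped with ObjectOutputStream and ObjectInputStream (the thread does not confirm which mechanism is actually involved). The class name BooleanRoundTrip and the helper roundTrip are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

public class BooleanRoundTrip {
    // Hypothetical helper: serialize a value with standard Java
    // serialization and read it back from the resulting bytes.
    static Object roundTrip(Object value) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Boolean copy = (Boolean) roundTrip(Boolean.FALSE);
        System.out.println(copy.equals(Boolean.FALSE)); // true: same value
        System.out.println(copy == Boolean.FALSE);      // false: different object
    }
}
```

java.lang.Boolean does not define readResolve, so the deserialized object is equal to Boolean.FALSE but is not the same reference. Code that tests identity against Boolean.FALSE therefore sees it as "not false".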


And here are the relevant cascalog queries: https://gist.github.com/1206888

It seems like Cascading should make sure that booleans always reach
filtering operations as Boolean.FALSE or Boolean.TRUE, rather than
serialized versions. (Note that this issue only shows up on filters.)
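
The normalization asked for here could look roughly like the sketch below. It is only an illustration: BooleanCanonicalizer and canonicalize are hypothetical names and are not part of Cascading or Cascalog.

```java
public class BooleanCanonicalizer {
    // Hypothetical helper: map any Boolean instance onto the canonical
    // Boolean.TRUE / Boolean.FALSE objects and pass everything else through.
    static Object canonicalize(Object value) {
        if (value instanceof Boolean) {
            return Boolean.valueOf(((Boolean) value).booleanValue());
        }
        return value;
    }

    public static void main(String[] args) {
        // A canonical Boolean stays canonical, other values are untouched.
        System.out.println(canonicalize(Boolean.FALSE) == Boolean.FALSE); // true
        System.out.println(canonicalize("text"));                         // text
    }
}
```

Applying such a step to each argument before it reaches the filter would make identity checks against Boolean.FALSE behave the same as value checks.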

Thanks a lot, I hope this is clear!
Sam

nathanmarz

Sep 10, 2011, 6:02:28 PM
to cascading-user
Note that the javadoc for the Boolean constructor says the following
(http://download.oracle.com/javase/1.4.2/docs/api/java/lang/Boolean.html#Boolean(boolean)):

"Note: It is rarely appropriate to use this constructor. Unless a new
instance is required, the static factory valueOf(boolean) is generally
a better choice. It is likely to yield significantly better space and
time performance."

For performance reasons, Clojure's conditionals do not check the value
of Boolean instances created this way; only nil and Boolean.FALSE itself
are treated as false.

-Nathan

Chris K Wensel

Sep 11, 2011, 12:56:00 PM
to cascadi...@googlegroups.com
A quick inspection of the Cascading deserialization code and a text search of new Boolean shows no place where a Boolean Object is constructed, even during coercion. Only primitives are passed around.

If anything, it's an issue with autoboxing. Tuple stores everything as an Object, but calling Tuple#getBoolean returns a primitive.

That said, we could wrap DataInputStream#readBoolean with Boolean#valueOf(). But I've no idea how autoboxing works in this case; I would have assumed it did the "right" thing.
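
A minimal sketch of the wrapping suggested above, assuming the value is written with DataOutputStream#writeBoolean and read back with DataInputStream#readBoolean. The class CanonicalBooleanRead and its methods are hypothetical and do not reproduce the actual TupleInputStream code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class CanonicalBooleanRead {
    // Hypothetical helper: read one primitive boolean and box it explicitly,
    // so the caller always receives Boolean.TRUE or Boolean.FALSE.
    static Object readBooleanElement(DataInputStream in) throws IOException {
        return Boolean.valueOf(in.readBoolean());
    }

    // Write a primitive boolean to a byte buffer, then read it back boxed.
    static Object roundTrip(boolean value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeBoolean(value);
        }
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return readBooleanElement(in);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip(false) == Boolean.FALSE); // true
        System.out.println(roundTrip(true) == Boolean.TRUE);   // true
    }
}
```

On the autoboxing question: javac compiles boxing of a boolean to a Boolean.valueOf call, and the Java Language Specification guarantees that boxing the same boolean value twice yields identical references. A path that only passes primitives through autoboxing should therefore already produce the canonical instances, which suggests a non-canonical Boolean would have to originate somewhere else.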

If you want to make the change and let me know whether it resolves your issue, I'll make the same change in wip-1.2 and push a wip release.

https://github.com/cwensel/cascading/blob/master/src/core/cascading/tuple/TupleInputStream.java#L147

chris

--
Chris K Wensel
ch...@concurrentinc.com
http://www.concurrentinc.com

-- Concurrent, Inc. offers mentoring, support for Cascading

Ted Dunning

Sep 11, 2011, 4:02:17 PM
to cascadi...@googlegroups.com
It sounds to me like the problem is really with Clojure doing an eq test instead of equals on Booleans.

Surely Clojure doesn't do the same for Integer.  How is it justifiable to do so for Boolean?  Just because it often works?  That seems like a weak guarantee to me.