neatly handling immutable builder types

29 views
Skip to first unread message

Adrian Cole

unread,
Jul 20, 2020, 12:40:23 AM7/20/20
to DataStax Java Driver for Apache Cassandra User Mailing List
I'm trying to transition zipkin from the 3.x driver to the 4.x one. I'll have several questions, but I'll start with one about the immutable builders.

One thing that's caused a bit of distraction porting, is that things that build something else, like prepared statement or binding.. these are now immutable.

Ex. we have things like this where we'd tried to keep CQL smaller or not error when things are null.

// now set the nullable fields
if (null != input.trace_id_high()) bound = bound.setString("trace_id_high", input.trace_id_high());

When porting we have to remember to do the 'switcheroo' here (bound = bound.setX), else changes are lost

I'm sure this was a design intent, but it has impact including having to hunt around to see where any of this happens.. It is also the case that people already think java churns too much and I can't imagine doing things like this makes anything better (allocating on each return).

Can you recommend an efficient and not error prone way to bind statements where we won't know which fields are null until request time? I'd like to go ahead and move to that if possible. If above is right, I'd appreciate the ack that it is.

Best,
-A

Adrian Cole

unread,
Jul 20, 2020, 12:43:05 AM7/20/20
to DataStax Java Driver for Apache Cassandra User Mailing List
I should clarify that we have fields we set which are long (uint64) and we guard setting them on 0L as that's not a valid value. Similarly some booleans are never set (shouldn't initialize to false). As zipkin allows patching, almost all fields can be empty values and shouldn't spam the CQL if we know that client side I think..

Adrian Cole

unread,
Jul 20, 2020, 1:43:56 AM7/20/20
to DataStax Java Driver for Apache Cassandra User Mailing List
Noticed boundStatementBuilder() solves this one.. I just didn't notice it. switcheroo undone now when setting.

That preparing a statement allocates on each return isn't a perf issue, more a "oops" when porting, for similar reasons of switcheroo

insertQuery = insertQuery.value("l_service", QueryBuilder.bindMarker("l_service"));

Might warn in the upgrade guide about these things as it can save some time and frustration knowing!

-A

Olivier Michallat

unread,
Jul 20, 2020, 4:48:17 PM7/20/20
to DataStax Java Driver for Apache Cassandra User Mailing List
Hi,

I'm sure this was a design intent

Yes, making statement classes immutable was a deliberate decision. The advantage is that they are automatically thread-safe, and also safe to share around your application: for example you can store a SimpleStatement in a public constant, and you'll never have to worry about some other part of your code silently changing some attribute (page size, consistency level...) on it.
The drawback is indeed the "gotcha" of forgetting to reassign the variable. I'll admit that we underestimated a bit how annoying that would be :(
There are a couple of things you can do to mitigate the issue:
1) All the methods are annotated with @CheckReturnValue. You can configure ErrorProne in your build to detect bad usage patterns.
2) As you noticed, every statement class has an equivalent builder:
SimpleStatement.builder()
BatchStatement.builder()
PreparedStatement.boundStatementBuilder()

Builders are mutable, so you can use them like driver 3 statements, you just need to call build() at the end (and we could even avoid that with Session.execute aliases that take a builder, I'll add that). Note that they are not thread-safe.


Might warn in the upgrade guide about these things as it can save some time and frustration knowing!

Duly noted, I'll see how we can emphasize that. And this topic probably deserves a dedicated page as well, with everything I explained above.


Can you recommend an efficient and not error prone way to bind statements where we won't know which fields are null until request time?

So there are two ways to deal with nulls in Cassandra: insert a CQL NULL (this will create a tombstone), or don't set the column. From your code examples I think you mean the latter.

If you use bind(Object...),  the only way to leave values unset is if they appear at the end of the list. So if you have a query containing VALUES (:a,:b,:c) and call bind(1,2), then c will not be set.
If the value to leave unset is in the middle, like b, then the only option is to set the values one by one, and not invoke the setter for b. There is also an unset() method if it was set previously and you want to remove it.

(If you do want to insert a CQL NULL, you can pass a null element to bind(Object...), call the regular setter with null, or use setToNull() for primitive types)


I should clarify that we have fields we set which are long (uint64) and we guard setting them on 0L as that's not a valid value

You could also leave them unset, or set them to null with setToNull() as explained above.
When reading back the data, you'll have to call isNull() before getLong() (which converts CQL NULL to 0). Or you can call get(i, Long.class) if you don't mind boxing every time.
 
Hope this helps,

Olivier Michallat



--
You received this message because you are subscribed to the Google Groups "DataStax Java Driver for Apache Cassandra User Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to java-driver-us...@lists.datastax.com.
To view this discussion on the web visit https://groups.google.com/a/lists.datastax.com/d/msgid/java-driver-user/127f06d9-c68d-443a-95b9-e8ce97c4b3f6o%40lists.datastax.com.

Adrian Cole

unread,
Jul 21, 2020, 12:43:50 AM7/21/20
to java-dri...@lists.datastax.com
These tips are great, Oliver. I learned a couple things I can use in
the same codebase to tighten it up. Cheers!

-A
> You received this message because you are subscribed to a topic in the Google Groups "DataStax Java Driver for Apache Cassandra User Mailing List" group.
> To unsubscribe from this topic, visit https://groups.google.com/a/lists.datastax.com/d/topic/java-driver-user/YbXjSoNV6Ns/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to java-driver-us...@lists.datastax.com.
> To view this discussion on the web visit https://groups.google.com/a/lists.datastax.com/d/msgid/java-driver-user/CANAMFDkVYk2Nu0YvRfDMbUVO2-yHpsjFmn8wZDizEBB6YTrdxw%40mail.gmail.com.

Adrian Cole

unread,
Jul 21, 2020, 9:27:57 AM7/21/20
to java-dri...@lists.datastax.com
incidentally, you do warn about the check return on statements

https://docs.datastax.com/en/developer/java-driver/4.7/upgrade_guide/#statements

Even if error-prone isn't hooked up, running Intellij analysis "Result
of method call ignored" can help with this (and be advise equally
ignored)

another neat thing is to suggest using Intellij "Null value for
Optional type" analysis as porting from null -> Optional also ends up
with this problem.

Adrian Cole

unread,
Jul 21, 2020, 9:31:39 AM7/21/20
to DataStax Java Driver for Apache Cassandra User Mailing List
one more switcheroo to mention is UDTs

Ex I'm not sure how to get a mutable copy of this..

    UdtValue result = getCqlType().newValue();

If there is one, let me know!

-A

Alexandre Dutra

unread,
Jul 21, 2020, 10:42:47 AM7/21/20
to DataStax Java Driver for Apache Cassandra User Mailing List
Hi Adrian,

I'm glad to see that you've started porting Zipkin to driver 4! Is there an issue or pull request that we could watch? I found this GitHub issue, but I'm not sure this is the one to follow. Don't hesitate to request reviews from us.

Now back to your question: as stated in the javadocs of UdtValue and TupleValue, built-in implementations for these two interfaces are mutable, contrary to most other built-in implementations.

Hope that helps and let us know if you have more questions (we'll answer the other ones you posted shortly)!

Alex


--
You received this message because you are subscribed to the Google Groups "DataStax Java Driver for Apache Cassandra User Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to java-driver-us...@lists.datastax.com.


--
Alexandre Dutra

Adrian Cole

unread,
Jul 21, 2020, 11:33:34 AM7/21/20
to java-dri...@lists.datastax.com
> I'm glad to see that you've started porting Zipkin to driver 4! Is there an issue or pull request that we could watch? I found this GitHub issue, but I'm not sure this is the one to follow. Don't hesitate to request reviews from us.

Thanks, I'm trying to get to a point where things pass in basic form
prior to a pull request as it would be distracting otherwise to the
committers who are less familiar with the nuance. At first I thought
about having both drivers in until it could swap out the other, but
that turned into more trouble. In a way I suppose I'm moving some of
that distraction here. Appreciate the help and patience.

>
> Now back to your question: as stated in the javadocs of UdtValue and TupleValue, built-in implementations for these two interfaces are mutable, contrary to most other built-in implementations.

I see. I think what confused me a bit was the @CheckReturnValue on the
methods. This seems to be due to the SettableByName interface in use.
It would be neat, if possible to have a non checked result when using
mutable things, as IntelliJ etc find them, which can lead you to do
irrational things, like swap the refs thinking maybe that's why things
didn't pass.

> Hope that helps and let us know if you have more questions (we'll answer the other ones you posted shortly)!

Yep very helpful so far. Thanks and will get something up soon for
review (just plodding along until then). I think I have a couple more
commented out things and will raise those as email, solving them ready
a PR.

Alexandre Dutra

unread,
Jul 24, 2020, 12:06:40 PM7/24/20
to DataStax Java Driver for Apache Cassandra User Mailing List
I think what confused me a bit was the @CheckReturnValue on the methods. This seems to be due to the SettableByName interface in use.

Indeed, and we are on the same boat here. We tried to use ErrorProne's @CanIgnoreReturnType, but it's not widely supported, and moreover, since it doesn't apply to inherited methods, it doesn't really solve the problem. The only solution, to our best knowledge, would be to override each and every method and annotate with @CanIgnoreReturnType where appropriate, but that's a line that we decided not to cross. With the current situation, at least we have to deal with false positives, not false negatives.

Good luck!
Cheers,

Alex

--
You received this message because you are subscribed to the Google Groups "DataStax Java Driver for Apache Cassandra User Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to java-driver-us...@lists.datastax.com.
Reply all
Reply to author
Forward
0 new messages