How do PreparedStatements work in Cassandra?

Drew Kutcharian

unread,

Jan 8, 2014, 9:45:16 PM1/8/14

to java-dri...@lists.datastax.com

Looking at the code seems like that every time I prepare a statement, the driver connects to Cassandra. So there is a blocking network round-trip each time I prepare a statement. Am I correct?

This brings me to the main point. I'm building a fully async/non-blocking app, and since there is no prepareAsync method, it's pretty much pointless to use prepared statements since they always just wait for the prepared statement to come back (see Uninterruptibles.getUninterruptibly(future) in Session.toPreparedStatement).

As a workaround I was thinking of preparing all the statements once at boot time, but I remember seeing on the mailing list that the prepared statements are stored on the server side in an LRU map and also don't survive server restarts. So not sure if it would be possible to cache the prepared statements.

So it seems like I'm stuck with using regular statements.

Is my thinking correct here or am I missing something?

BTW, If I can't use prepared statements, what's the best way to bind variables to a statement to prevent CQL injection?

Thanks,

Drew

Steven A Robenalt

unread,

Jan 9, 2014, 2:39:57 AM1/9/14

to java-dri...@lists.datastax.com

Hi Drew,

You only need to prepare the statement once for many uses, so there is no need for a prepareAsync. Instead, you prepare the statement, then create a BoundStatement from your PreparedStatement and your bind variables, and use session.executeAsync() on the BoundStatement for each invocation.

Hope that helps,

Steve

To unsubscribe from this group and stop receiving emails from it, send an email to java-driver-us...@lists.datastax.com.

--

Steve Robenalt

Software Architect

HighWire | Stanford University

425 Broadway St, Redwood City, CA 94063

srob...@stanford.edu

http://highwire.stanford.edu

Drew Kutcharian

unread,

Jan 9, 2014, 7:56:07 PM1/9/14

to java-dri...@lists.datastax.com

Hi Steve,

I see. Say I have a method called getUser(id: UUID) that gets called many times during the life of the application, when should I prepare the statement? At application boot time? And then, I have to cache that prepared statement somehow to use it again in consecutive calls. I know that prepared statements are not persisted during server restarts of replicated to new nodes. What happens when a server restarts of a new node gets added? Can I still use that prepared statement that was created at the application boot time?

- Drew

Steven A Robenalt

unread,

Jan 9, 2014, 8:24:20 PM1/9/14

to java-dri...@lists.datastax.com

Hi Drew,

I do my statement prepares at first use of a service and cache the prepared statement(s) for the life of the service. Redeploying the service app or restarting the app server thus forces the statements to be prepared again. The Java Driver handles all of the interaction with the Cassandra cluster, and my prepared statements are handled by any available node, even through additions, removals, restarts, repairs, etc of nodes.

I do also use a health check service that validates connectivity from the app server to the Cassandra cluster and takes the app server offline with an alarm if connectivity fails, though such outages have mostly been transient and self-repairing to date.

Steve

Drew Kutcharian

unread,

Jan 9, 2014, 10:03:31 PM1/9/14

to java-dri...@lists.datastax.com

Thanks Steve, this really cleared things up. BTW, I was digging thru the drive code and came across the javadoc for the SimpleStatement. Seems like it’s a hybrid between a prepared statement and good ol’ text statement. The things I like about it (according to the javadoc) are:

1. It doesn’t convert the parameters to string even though it allows parameter binding

2. Does one round-trip to Cassandra node

3. Does not leave any prepared statement stored in memory (neither client or server side) once it has been executed.

Have you given this a try? I wonder how much difference there is between SimpleStatement and PreparedStatement performance-wise and if the PreparedStatements are worth the headache.

- Drew

Steven A Robenalt

unread,

Jan 9, 2014, 10:26:33 PM1/9/14

to java-dri...@lists.datastax.com

I suppose I should test it to verify, but it seems to me that SimpleStatement is intended for the case when a statement is to be executed only once, or at least infrequently. Parsing a statement is usually a nontrivial part of the performance, so better to parse only once and cache it than to parse every time it's executed.

Steve

Alex Popescu

unread,

Jan 9, 2014, 10:47:10 PM1/9/14

to java-dri...@lists.datastax.com

Some tests run but Netflix comparing statements with prepared statements:

http://techblog.netflix.com/2013/12/astyanax-update.html

:- a)

Alex Popescu

Sen. Product Manager @ DataStax

@al3xandru

Drew Kutcharian

unread,

Jan 9, 2014, 10:50:04 PM1/9/14

to java-dri...@lists.datastax.com

I also did a bit more digging and found out that the driver is already keeping a cache of all the prepared statements (see https://github.com/datastax/java-driver/blob/2.0/driver-core/src/main/java/com/datastax/driver/core/Cluster.java#L851) but strangely when you call session.prepare (https://github.com/datastax/java-driver/blob/2.0/driver-core/src/main/java/com/datastax/driver/core/SessionManager.java#L95) it doesn’t use that cache. I wonder why is that.

It’d be great if Sylvan can weigh in here. ;)

- Drew

Drew Kutcharian

unread,

Jan 9, 2014, 11:04:18 PM1/9/14

to java-dri...@lists.datastax.com

Thanks Alex. This is a great read. Based on this prepared statements are the way to go. Let’s hope we hear from Sylvain on the caching semantics of the driver and why session.prepare doesn’t use the statement cache in the Cluster class.

Alex Popescu

unread,

Jan 9, 2014, 11:09:39 PM1/9/14

to java-dri...@lists.datastax.com

On Thu, Jan 9, 2014 at 8:04 PM, Drew Kutcharian <dr...@venarc.com> wrote:

Thanks Alex. This is a great read. Based on this prepared statements are the way to go. Let’s hope we hear from Sylvain on the caching semantics of the driver and why session.prepare doesn’t use the statement cache in the Cluster class.

Check the last comment on this issue which provides an answer: https://datastax-oss.atlassian.net/browse/JAVA-236

Reply all

Reply to author

Forward