Trouble with many PreparedStatements in 1.x driver?


Keith Freeman

Jan 8, 2014, 5:05:37 PM
to java-dri...@lists.datastax.com
Hello,

We recently upgraded from driver v1.0.4 to v1.0.5, and started getting these warnings in our logs:

14-01-08 20:52:54.879 [pool-11-thread-10] WARN  com.datastax.driver.core.Cluster - Re-preparing already prepared query BEGIN BATCH INSERT INTO...

We discovered that indeed we were sometimes preparing the same statements multiple times, and started fixing our code to prevent that.  These statements are the 1.x prepared batches -- we have a bunch of "messages" to save in our "messages" table, so we prepare them as a (sometimes VERY long) batch statement to improve write performance.

But since the design of our app is multi-threaded and generally non-blocking, the size of the bunch is not always the same when it comes time to save.  We have a configurable number of threads that split up the incoming bunch of messages and save them, so all those threads have identical save sizes for each pass, but the next pass might have a different size.  So each time we create the right-sized PreparedStatement and use it in all the threads.  We started implementing a cache-map of PreparedStatements as well to avoid re-creating sizes used before, then realized that obviously the DataStax Cluster is maintaining the same cache!
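For anyone reading along, here is a rough sketch of the kind of variable-size prepared batch we build; the two-column "messages" schema and the variable names are simplified stand-ins for our real table and code:

    // Build the CQL for an n-insert prepared batch (simplified schema).
    static String batchCql(int n) {
        StringBuilder sb = new StringBuilder("BEGIN BATCH ");
        for (int i = 0; i < n; i++) {
            sb.append("INSERT INTO messages (id, body) VALUES (?, ?); ");
        }
        return sb.append("APPLY BATCH;").toString();
    }

    // Prepared once per distinct size, then bound and executed by the worker
    // threads for that pass:
    PreparedStatement ps = session.prepare(batchCql(bunchSize));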

So we looked at the code for Cluster and indeed it has a map of PreparedStatements that appears to contain ALL of them that it's ever seen.  This leads me to a couple of questions:

Q1: it looks like this map will grow forever with a use case like ours, right?
Q2: would you consider exposing the map somehow so that we could use it rather than duplicating it in our code?
Q3: any chance that PreparedStatement could be made Serializable?  (In our larger batch statements (1000s of inserts), it can take 7+ seconds just to prepare a single statement, so it would be much faster to load serialized copies at restart instead of building them through the API.)

Thanks.

Sylvain Lebresne

Jan 9, 2014, 5:25:10 AM
to java-dri...@lists.datastax.com
Keith,

> Q1: it looks like this map will grow forever with a use case like ours, right?

Not unless your application is misbehaving in the first place. Since 1.0.5 the
map holds weak references, so only PreparedStatements that are still referenced
by the application are kept. If your application continuously creates new
PreparedStatements without ever unreferencing the previously created ones, then
yes, the map will grow forever, but you'd still be screwed even if Cluster was
not caching PreparedStatements at all :).

> Q2: would you consider exposing the map somehow so that we could use it rather than duplicating it in our code?

The behavior above means it wouldn't work for the use case you've described: a
PreparedStatement is automatically removed from the cache as soon as
(technically "when the GC feels like it", but you get the idea) your
application no longer holds a strong reference to it.

So really, it's meant to be the job of your application to cache the
PreparedStatements that it plans on reusing, and to implement whatever kind of
size management you want for that cache. The internal cache of Cluster will
simply follow.
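For illustration only, something along these lines on the application side
would do (the names, the size-keyed map and the buildBatchCql() helper are all
made up for the example):

    // Uses java.util.concurrent.ConcurrentHashMap / ConcurrentMap.
    // One PreparedStatement per batch size, held by a strong reference so that
    // the driver's internal weak-reference map keeps it too.
    private final ConcurrentMap<Integer, PreparedStatement> bySize =
            new ConcurrentHashMap<Integer, PreparedStatement>();

    PreparedStatement forSize(Session session, int size) {
        PreparedStatement ps = bySize.get(size);
        if (ps == null) {
            ps = session.prepare(buildBatchCql(size)); // your own CQL-building code
            PreparedStatement previous = bySize.putIfAbsent(size, ps);
            if (previous != null) ps = previous;       // another thread got there first
        }
        return ps;
    }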

> Q3: any chance that PreparedStatement could be made Serializable?  (In our
> larger batch statements (1000s of inserts), it can take 7+ seconds just to
> prepare a single statement, so it would be much faster to load serialized
> copies at restart instead of building them through the API.)

To be honest, the idea doesn't fill me with joy a priori. Though I understand
the annoying aspect of what you describe, serializing PreparedStatements feels
more like a hack than something that really makes sense. Besides, making it
serializable is not enough: we'd also have to add a way to attach newly
deserialized PreparedStatements to a Cluster instance. That's a bit too much
new API for a use case that, frankly, I'm not sure I'd advise in general. Also,
while I completely understand that you're just trying to work around the lack
of PreparedStatement batching in the 1.0 driver, that problem is fixed in the
2.0 version.

Also, even in the 1.0 version, I suspect the need for such large batch
statements can be somewhat limited in practice. For instance, at least for the
cases where batches are used for performance's sake (which is probably the most
common case in my experience), one thing I'd try is this: instead of preparing
very large batches of different sizes, prepare one batch of, say, 50-100
inserts and use it repeatedly (of course, you can end up with a "remainder",
but you can use tricks like repeating the last record multiple times, since the
inserts are idempotent). I suspect that in practice this might not be much
slower than preparing a huge batch.
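To make that concrete, a rough sketch only (the Message class, the
two-bind-markers-per-insert layout and the 50-insert size are just assumptions
for the example):

    // Assumes batchPs was prepared once from a "BEGIN BATCH ... APPLY BATCH"
    // string containing BATCH_SIZE inserts with 2 bind markers each (id, body).
    // Uses java.util.List; Message is a stand-in for your own record class.
    static final int BATCH_SIZE = 50;

    void saveAll(Session session, PreparedStatement batchPs, List<Message> msgs) {
        for (int start = 0; start < msgs.size(); start += BATCH_SIZE) {
            Object[] values = new Object[BATCH_SIZE * 2];
            for (int i = 0; i < BATCH_SIZE; i++) {
                // Past the end of the list, repeat the last record: re-inserting
                // the same row is harmless because the insert is idempotent.
                Message m = msgs.get(Math.min(start + i, msgs.size() - 1));
                values[2 * i] = m.id;
                values[2 * i + 1] = m.body;
            }
            session.execute(batchPs.bind(values));
        }
    }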

Anyway, I don't pretend to know your exact use case, because I don't, and I'm
sure you've considered many possibilities and do things the way you do for good
reasons. My point is merely that the problem you are describing is rather
specific, which is why I'm not convinced it justifies making PreparedStatement
Serializable, especially given that serialization in Java is notoriously a pain
to maintain.

--
Sylvain

Keith Freeman

Jan 9, 2014, 9:57:08 AM
to java-dri...@lists.datastax.com
Thanks for the responses.  I like the idea of preparing a couple of batch sizes in advance and reusing them with the "remainder" trick you suggest.

One followup question: do you know why it takes so long to prepare these statements?  In one case we have a batch of 250 identical inserts into a table that has 14 fields, and it takes 9-10 seconds just to prepare that one statement on a 3GHz CPU.  That seems like a crazy amount of time.

Sylvain Lebresne

Jan 9, 2014, 10:17:46 AM
to java-dri...@lists.datastax.com

> One followup question: do you know why it takes so long to prepare these statements?  In one case we have a batch of 250 identical inserts into a table that has 14 fields, and it takes 9-10 seconds just to prepare that one statement on a 3GHz CPU.  That seems like a crazy amount of time.

I believe it's due to https://issues.apache.org/jira/browse/CASSANDRA-6107 (unless you're using something older than 1.2.11). Basically, Cassandra has a cache of prepared statements, and that cache is limited in size to protect the server against misbehaving clients (or long-lived clients that prepare new statements continuously over time). To enforce that limit, we compute the actual memory size used by the prepared statement object, which involves walking the whole object graph, and that's not particularly fast. 9-10 seconds does feel *really* slow though, I grant you that. If you happen to be on 1.2.11 exactly, part of the problem might be that you ran into https://issues.apache.org/jira/browse/CASSANDRA-6369 (which was doing tons of useless work), but if you're on 1.2.12+ that issue has been fixed.

If you are on C* 1.2.12+, do feel free to open an "improvement" Cassandra ticket with a simple example of the crazy timing (one we can easily reproduce). I can't promise anyone will make it a priority, but there's no harm in at least looking into whether there is an easy way to make it faster.

--
Sylvain

 

Keith Freeman

Jan 10, 2014, 7:47:56 PM
to java-dri...@lists.datastax.com
We're on v1.2.12.  

And for anyone visiting this thread: I discovered that building large prepared batch statements should definitely be considered harmful.  In my example above (where preparing a 250-insert batch took 9+ seconds), preparing a 10-insert batch for the same table only takes about 100 milliseconds, and binding + executing 25 10-insert PSs is about 30% FASTER than binding + executing a single 250-insert PS.  My guess is that there might be some network packet-size/fragmentation effect, as well as some faster-than-linear growth in the Cassandra API overhead.  In any case I'm now happily inserting well over 5000 records/second (24 threads) by using my 10-insert prepared statement repeatedly.

I can't really break out a simple example of this right now, but I can say that the slowness in building the large PSs appears to be outside of the DataStax API/classes and in the underlying Cassandra API.