Trouble with many PreparedStatements in 1.x driver?


Keith Freeman

Jan 8, 2014, 5:05:37 PM
to java-dri...@lists.datastax.com
Hello,

We recently upgraded from driver v1.0.4 to v1.0.5, and started getting these warnings in our logs:

14-01-08 20:52:54.879 [pool-11-thread-10] WARN  com.datastax.driver.core.Cluster - Re-preparing already prepared query BEGIN BATCH INSERT INTO...

We discovered that indeed we were sometimes preparing the same statements multiple times, and started fixing our code to prevent that.  These statements are the 1.x prepared batches -- we have a bunch of "messages" to save in our "messages" table, so we prepare them as a (sometimes VERY long) batch statement to improve write performance.

But since the design of our app is multi-threaded and generally non-blocking, the size of the bunch is not always the same when it comes time to save.  We have a configurable number of threads that split up the incoming bunch of messages and save them, so all those threads have identical save sizes for each pass, but the next pass might have a different size.  So each time we create the right-sized PreparedStatement and use it in all the threads.  We started implementing a cache-map of PreparedStatements as well to avoid re-creating sizes used before, then realized that obviously the DataStax Cluster is maintaining the same cache!
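For anyone reading along, here is a rough sketch of the kind of variable-size prepared batch we build; the two-column "messages" schema and the variable names are simplified stand-ins for our real table and code:

    // Build the CQL for an n-insert prepared batch (simplified schema).
    static String batchCql(int n) {
        StringBuilder sb = new StringBuilder("BEGIN BATCH ");
        for (int i = 0; i < n; i++) {
            sb.append("INSERT INTO messages (id, body) VALUES (?, ?); ");
        }
        return sb.append("APPLY BATCH;").toString();
    }

    // Prepared once per distinct size, then bound and executed by the worker
    // threads for that pass:
    PreparedStatement ps = session.prepare(batchCql(bunchSize));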

So we looked at the code for Cluster and indeed it has a map of PreparedStatements that appears to contain ALL of them that it's ever seen.  This leads me to a couple of questions:

Q1: it looks like this map will grow forever with a use case like ours, right?
Q2: would you consider exposing the map somehow so that we could use it rather than duplicating it in our code?
Q3: any chance that PreparedStatement could be made Serializable?  (In our larger batch statements (1000s of inserts), it can take 7+ seconds just to prepare a single statement, so it would be much faster to load serialized copies at restart instead of building them through the API.)

Thanks.

Sylvain Lebresne

Jan 9, 2014, 5:25:10 AM
to java-dri...@lists.datastax.com
Keith,

> Q1: it looks like this map will grow forever with a use case like ours, right?

Not unless your application is misbehaving in the first place. Since 1.0.5 the
map holds weak references, so only PreparedStatements that are still referenced
by the application are kept. If your application continuously creates new
PreparedStatements without ever unreferencing the previously created ones, then
yes, the map will grow forever, but you'd still be screwed even if Cluster was
not caching PreparedStatements at all :).

> Q2: would you consider exposing the map somehow so that we could use it rather than duplicating it in our code?

The behavior above means it wouldn't work for the use case you've described: a
PreparedStatement is automatically removed from the cache as soon as
(technically "when the GC feels like it", but you get the idea) your
application no longer holds a strong reference to it.

So really, it's meant to be the job of your application to cache the
PreparedStatements that it plans on reusing, and to implement whatever kind of
size management you want for that cache. The internal cache of Cluster will
simply follow.
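For illustration only, something along these lines on the application side
would do (the names, the size-keyed map and the buildBatchCql() helper are all
made up for the example):

    // Uses java.util.concurrent.ConcurrentHashMap / ConcurrentMap.
    // One PreparedStatement per batch size, held by a strong reference so that
    // the driver's internal weak-reference map keeps it too.
    private final ConcurrentMap<Integer, PreparedStatement> bySize =
            new ConcurrentHashMap<Integer, PreparedStatement>();

    PreparedStatement forSize(Session session, int size) {
        PreparedStatement ps = bySize.get(size);
        if (ps == null) {
            ps = session.prepare(buildBatchCql(size)); // your own CQL-building code
            PreparedStatement previous = bySize.putIfAbsent(size, ps);
            if (previous != null) ps = previous;       // another thread got there first
        }
        return ps;
    }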

> Q3: any chance that PreparedStatement could be made Serializable?  (In our
> larger batch statements (1000s of inserts), it can take 7+ seconds just to
> prepare a single statement, so it would be much faster to load serialized
> copies at restart instead of building them through the API.)

To be honest, the idea doesn't fill me with joy a priori. Though I understand
the annoying aspect of what you describe, serializing PreparedStatements feels
more like a hack than something that really makes sense. Besides, making it
serializable is not enough: we'd also have to add a way to attach newly
deserialized PreparedStatements to a Cluster instance. That's a bit too much
new API for a use case that, frankly, I'm not sure I'd advise in general. Also,
while I completely understand that you're just trying to work around the lack
of PreparedStatement batching in the 1.0 driver, that problem is fixed in the
2.0 version.

Also, even in the 1.0 version, I suspect the need for such large batch
statements can be somewhat limited in practice. For instance, at least for the
cases where batches are used for performance's sake (which is probably the most
common case in my experience), one thing I'd try is this: instead of preparing
very large batches of different sizes, prepare one batch of, say, 50-100
inserts and use it repeatedly (of course, you can end up with a "remainder",
but you can use tricks like repeating the last record multiple times, since the
inserts are idempotent). I suspect that in practice this might not be much
slower than preparing a huge batch.
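To make that concrete, a rough sketch only (the Message class, the
two-bind-markers-per-insert layout and the 50-insert size are just assumptions
for the example):

    // Assumes batchPs was prepared once from a "BEGIN BATCH ... APPLY BATCH"
    // string containing BATCH_SIZE inserts with 2 bind markers each (id, body).
    // Uses java.util.List; Message is a stand-in for your own record class.
    static final int BATCH_SIZE = 50;

    void saveAll(Session session, PreparedStatement batchPs, List<Message> msgs) {
        for (int start = 0; start < msgs.size(); start += BATCH_SIZE) {
            Object[] values = new Object[BATCH_SIZE * 2];
            for (int i = 0; i < BATCH_SIZE; i++) {
                // Past the end of the list, repeat the last record: re-inserting
                // the same row is harmless because the insert is idempotent.
                Message m = msgs.get(Math.min(start + i, msgs.size() - 1));
                values[2 * i] = m.id;
                values[2 * i + 1] = m.body;
            }
            session.execute(batchPs.bind(values));
        }
    }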

Anyway, I don't pretend to know your exact use case, because I don't, and I'm
sure you've considered many possibilities and do things the way you do for good
reasons. My point is merely that the problem you are describing is rather
specific, which is why I'm not convinced it justifies making PreparedStatement
Serializable, especially given that serialization in Java is notoriously a pain
to maintain.

--
Sylvain

Keith Freeman

Jan 9, 2014, 9:57:08 AM
to java-dri...@lists.datastax.com
Thanks for the responses.  I like the idea of preparing a couple of batch sizes in advance and reusing them with the "remainder" trick you suggest.

One followup question: do you know why it takes so long to prepare these statements?  In one case we have a batch of 250 identical inserts into a table that has 14 fields, and it takes 9-10 seconds just to prepare that one statement on a 3GHz CPU.  That seems like a crazy amount of time.

Sylvain Lebresne

Jan 9, 2014, 10:17:46 AM
to java-dri...@lists.datastax.com

> One followup question: do you know why it takes so long to prepare these statements?  In one case we have a batch of 250 identical inserts into a table that has 14 fields, and it takes 9-10 seconds just to prepare that one statement on a 3GHz CPU.  That seems like a crazy amount of time.

I believe it's due to https://issues.apache.org/jira/browse/CASSANDRA-6107 (unless you're using something older than 1.2.11). Basically, Cassandra has a cache of prepared statements, and that cache is limited in size to protect the server against misbehaving clients (or long-lived clients that prepare new statements continuously over time). To enforce that limit, we compute the actual memory size used by the prepared statement object, which involves walking the whole object graph, and that's not particularly fast. 9-10 seconds does feel *really* slow though, I grant you that. If you happen to be on 1.2.11 exactly, part of the problem might be that you ran into https://issues.apache.org/jira/browse/CASSANDRA-6369 (which was doing tons of useless work), but if you're on 1.2.12+ that issue has been fixed.

If you are on C* 1.2.12+, do feel free to open an "improvement" Cassandra ticket with a simple example of the crazy timing (one we can easily reproduce). I can't promise anyone will make it a priority, but there's no harm in at least looking into whether there is an easy way to make it faster.

--
Sylvain

 

Keith Freeman

Jan 10, 2014, 7:47:56 PM
to java-dri...@lists.datastax.com
We're on v1.2.12.  

And for anyone visiting this thread: I discovered that building large prepared batch statements should definitely be considered harmful.  In my example above (where preparing a 250-insert batch took 9+ seconds), preparing a 10-insert batch for the same table only takes about 100 milliseconds, and binding + executing 25 10-insert PSs is about 30% FASTER than binding + executing a single 250-insert PS.  My guess is that there might be some network packet-size/fragmentation effect, as well as some faster-than-linear growth in the Cassandra API overhead.  In any case I'm now happily inserting well over 5000 records/second (24 threads) by using my 10-insert prepared statement repeatedly.

I can't really break out a simple example of this right now, but I can say that the slowness in building the large PSs appears to be outside of the DataStax API/classes and in the underlying Cassandra API.