executeAsync failing silently when inserting records

Hale Sostock

Jun 19, 2015, 1:02:48 PM
to java-dri...@lists.datastax.com

I am attempting to use the Cassandra Java driver to do a bulk insert of a large number of records into two different tables on a cluster. I'm starting with a test data set of about 600 records (resulting in 600 records in each table), but will eventually be inserting about 2.5 million. I am using a prepared statement, binding each record to it to produce a BoundStatement.

When I perform the inserts synchronously (with session.execute), everything behaves as expected, with all records being inserted and showing up when I do a SELECT COUNT(*) from cqlsh. However, if I use session.executeAsync, only about 90% of the records are returned in each table when I query from cqlsh. After the inserts, I am waiting on all of the futures using ResultSetFuture.getUninterruptibly, but this has not made a difference, even if I attempt to batch the inserts in groups (e.g. call session.executeAsync 50 times, then call ResultSetFuture.getUninterruptibly on those 50 before continuing).
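
For illustration, a minimal sketch of the kind of loop described above, using the 2.x Java driver. AccountRecord, the table name, and the column list are placeholders rather than the actual code from this project:

import java.util.ArrayList;
import java.util.Date;
import java.util.List;

import com.datastax.driver.core.BoundStatement;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.ResultSetFuture;
import com.datastax.driver.core.Session;

public class BulkInserter {

    // Placeholder record type, for illustration only.
    public static class AccountRecord {
        final String id;
        final String name;
        final Date created;
        public AccountRecord(String id, String name, Date created) {
            this.id = id;
            this.name = name;
            this.created = created;
        }
    }

    private static final int GROUP_SIZE = 50;

    public void insertAll(Session session, List<AccountRecord> records) {
        // Prepare once, bind per record.
        PreparedStatement insertStmt = session.prepare(
                "INSERT INTO account.account (id, name, created) VALUES (?, ?, ?)");

        List<ResultSetFuture> inFlight = new ArrayList<ResultSetFuture>();
        for (AccountRecord record : records) {
            BoundStatement bound = insertStmt.bind(record.id, record.name, record.created);
            inFlight.add(session.executeAsync(bound));

            // Wait for the current group before issuing more writes.
            if (inFlight.size() >= GROUP_SIZE) {
                for (ResultSetFuture f : inFlight) {
                    f.getUninterruptibly(); // throws if this write failed
                }
                inFlight.clear();
            }
        }
        // Drain the last, partial group.
        for (ResultSetFuture f : inFlight) {
            f.getUninterruptibly();
        }
    }
}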

I do not see any exceptions, and resultSet.wasApplied() always returns true. I am using Cassandra 2.1.4, and have tried this both on a 3-node CCM cluster on localhost and on a deployed 3-node cluster on AWS.

I tried processing the larger dataset of 2.5 million records just to see what might happen, and after about 30,000 records, I start seeing the following message in console:

[cluster1-timeouter-0] DEBUG c.d.driver.core.RequestHandler - onTimeout triggered but the response was completed by another thread, cancelling (retryCount = 0, queryState = QueryState(count=0, inProgress=false, cancelled=false), queryStateRef = QueryState(count=0, inProgress=false, cancelled=false))

Additionally, with the larger dataset, the longer the inserts continue, the larger the gap between processed records and successful inserts. E.g. after about 600,000 records processed, I only see about 440,000 records in cqlsh. I also tried adding a callback to each future to log any failures to console:

ResultSetFuture accountFuture = session.executeAsync(insertAccount(insertStmt, record));
Futures.addCallback(accountFuture, new FutureCallback<ResultSet>() {
    @Override
    public void onSuccess(@Nullable com.datastax.driver.core.ResultSet resultSet) {
        // do nothing
    }

    @Override
    public void onFailure(Throwable throwable) {
        System.out.printf("Failed with: %s\n", throwable);
    }
});

However, I do not see any failures in console.

Georg Köster

Jun 20, 2015, 5:58:37 AM
to java-dri...@lists.datastax.com
Hi Hale,

What consistency level are you using - in cqlsh and for the writers?

The message you are seeing is logged at DEBUG level because it is not a problem: the timeout runnable was queued, but the response-processing runnable was queued (and executed) before it, so everything had already been cleaned up correctly by the time the timeout runnable checked the situation.
Furthermore, it looks to me as if no writes were lost (assuming you didn't kill a server while writing at CL ONE).

What makes you think you lost writes? If the inserts are still going on while you check the count in cqlsh, the servers may not yet be consistent, depending on the CL; replication is affected by the high load.

If you don't use a CL of THREE on the writer side or on the reader (or QUORUM on both), you will see row counts that differ from the writer's own tally until C* replication has caught up and consistency has been re-established. This is how replication with eventual consistency works.
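
For reference, a minimal sketch of raising the consistency level per statement with the 2.x Java driver; the single-column bind and the choice of QUORUM are only illustrative. The same level can also be set cluster-wide via QueryOptions when building the Cluster.

import com.datastax.driver.core.BoundStatement;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.ResultSetFuture;
import com.datastax.driver.core.Session;

public class QuorumWrites {
    // Write at QUORUM so that a later QUORUM read (CONSISTENCY QUORUM in cqlsh)
    // is guaranteed to see the row once the future has completed.
    public static ResultSetFuture insertAtQuorum(Session session,
                                                 PreparedStatement insertStmt,
                                                 String id) {
        BoundStatement bound = insertStmt.bind(id);
        bound.setConsistencyLevel(ConsistencyLevel.QUORUM);
        return session.executeAsync(bound);
    }
}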

Cheers
Georg

Hale Sostock

Jun 22, 2015, 12:43:00 PM
to java-dri...@lists.datastax.com
Hi Georg, thanks for the reply.

As a test, I tried writing to a keyspace with the following configuration (as a local 3-node cluster set up with CCM):

CREATE KEYSPACE account WITH replication = {
  'class': 'SimpleStrategy',
  'replication_factor': '2'
};

I tried to insert 293,400 records, using consistency level ALL. I then tried to query via cqlsh using consistency level ALL, both immediately after the writes completed, and several minutes later. In both cases (and in any subsequent queries), the count reported was 208,764.

I tried a second, smaller test with 584 records, using the same configuration and consistency as above. For this test, after the inserts completed, I saw a count of 575 records.

I also tried inserting the 584 records, using a replication factor of three and consistency levels ALL and THREE in separate tests, and in both cases I received similar results (577 records in one case, 569 in another).

Georg Köster

Jun 22, 2015, 5:15:29 PM
to java-dri...@lists.datastax.com
Hi Hale,

Sorry that I couldn't be of more help. Are you using queries like this to count the results:

select * from TABLE; (no WHERE clause, or a WHERE clause with a missing or incomplete partition key)

Maybe this has changed, but AFAIK iterating over all 'rows' in a table wasn't reliable. I never use this in my Cassandra applications; instead I use some kind of index (usually of my own design) and query by row key/partition key.

If you could post your data model and its requirements, I could propose a design change to accommodate them.

Can you change your checking query to query rows by partition key?

What do the others say? Is iteration of all rows now working?
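
To illustrate that kind of check, a sketch that looks up each inserted key individually instead of scanning the whole table; the account.account table and its id column are borrowed from earlier in this thread, and the helper name is made up:

import java.util.List;

import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

public class InsertVerifier {
    // Look up every key that was supposedly written and count the ones
    // that come back empty, instead of relying on a full-table COUNT(*).
    public static int countMissing(Session session, List<String> insertedIds) {
        PreparedStatement selectById = session.prepare(
                "SELECT id FROM account.account WHERE id = ?");
        int missing = 0;
        for (String id : insertedIds) {
            if (session.execute(selectById.bind(id)).one() == null) {
                missing++;
            }
        }
        return missing;
    }
}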

Best,
Georg

Steve Robenalt

Jun 22, 2015, 5:51:49 PM
to Java Driver User mailing list
Hi Hale,

Do you have all nodes in your cluster properly time-synced via ntp? I've seen similar issues when timestamps were mismatched as a consequence of Cassandra's "last write wins" philosophy. Another thing that might be worth trying is grouping updates using BatchStatements. If you try that, you can experiment with various batch sizes and see if they make any difference in the consistency.
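
A minimal sketch of that kind of grouping with the driver's BatchStatement; the batch size, table, and single bound column are placeholders to experiment with:

import java.util.List;

import com.datastax.driver.core.BatchStatement;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

public class BatchedInserts {
    // Group inserts into unlogged batches of a configurable size and execute
    // each batch synchronously; vary batchSize to see how it affects the counts.
    public static void insertInBatches(Session session,
                                       PreparedStatement insertStmt,
                                       List<String> ids,
                                       int batchSize) {
        BatchStatement batch = new BatchStatement(BatchStatement.Type.UNLOGGED);
        for (String id : ids) {
            batch.add(insertStmt.bind(id));
            if (batch.size() >= batchSize) {
                session.execute(batch);
                batch = new BatchStatement(BatchStatement.Type.UNLOGGED);
            }
        }
        if (batch.size() > 0) {
            session.execute(batch);
        }
    }
}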

Hope that helps,
Steve




--
Steve Robenalt 
Software Architect
(office/cell): 916-505-1785

HighWire Press, Inc.
425 Broadway St, Redwood City, CA 94063

Technology for Scholarly Communication

Hale Sostock

Jun 22, 2015, 6:18:48 PM
to java-dri...@lists.datastax.com
To verify record count, I am just doing a simple SELECT COUNT(*) FROM account.account LIMIT {value greater than # of records};

What concerns me is that if I perform all of the inserts synchronously using session.execute(statement), all the records always appear immediately after the inserts finish, with no apparent delay. It is only when I asynchronously insert them using session.executeAsync(statement) that there is any discrepancy.

My data model isn't anything particularly fancy, just a list of string/date fields with a single string field as the primary/partition key. I have tested on multiple tables, and I always see the same behavior when using session.executeAsync.

Hale Sostock

Jun 22, 2015, 6:22:19 PM
to java-dri...@lists.datastax.com
Steve, I have not looked into that at all, but with a local cluster created with CCM, this should not be an issue, correct? Also, as I mentioned in my reply to Georg, this behavior only occurs when using asynchronous queries. If I use synchronous queries with session.execute(statement), no inserts are dropped, and the count via cqlsh is immediately accurate.

I did try using batch statements, but I still saw the same behavior when using session.executeAsync with dropped inserts. Additionally, I am trying to avoid using batches, per https://lostechies.com/ryansvihla/2014/08/28/cassandra-batch-loading-without-the-batch-keyword/ (and my inserts will be distributed pretty evenly across partitions).

Steve Robenalt

Jun 22, 2015, 6:32:46 PM
to Java Driver User mailing list
Hi Hale,

I have seen records dropped in a local cluster due to unsynchronized clocks. Not sure if CCM makes a difference. Based on what I know about how the records are dropped, it seems that using async mode would aggravate such an issue by allowing records to be created more quickly, thus introducing more chances for the timestamps to mismatch relative to the node handling each request.

Regarding batches, they worked well in conjunction with our data model and application design. Our primary concern was one of consistency, not an attempt at performance optimization, and we did see a big improvement in consistency, particularly with counters.

Steve

Hale Sostock

Jun 23, 2015, 5:04:46 PM
to java-dri...@lists.datastax.com
Hi Steve,
here is the output of clockdiff on the deployed servers (which are all running ntpd):

cassandra01$ clockdiff cassandra02
..................................................
host=cassandra02 rtt=1(0)ms/1ms delta=4ms/4ms Tue Jun 23 20:26:15 2015
cassandra01$ clockdiff cassandra03
.
host=cassandra03 rtt=750(187)ms/0ms delta=4ms/4ms Tue Jun 23 20:26:21 2015
cassandra01$ clockdiff cassandra04
..
host=cassandra04 rtt=562(280)ms/0ms delta=4ms/4ms Tue Jun 23 20:26:28 2015

Since my local cluster (with CCM) runs entirely on my machine, I would assume all of its nodes refer to the same clock?
Also, since all of the records being inserted have unique partition keys, would they even be affected by unsynchronized clocks (since I am only performing inserts, and not updates)?

Giampaolo Trapasso

Feb 4, 2016, 5:52:05 AM
to DataStax Java Driver for Apache Cassandra User Mailing List, hsos...@cyngn.com
Hi Hale (and all the group),

I think I'm having the same problem. Any update on this? Did you solve it or did you try another approach?

Giampaolo

Avi Levi

Nov 28, 2016, 4:35:32 AM
to DataStax Java Driver for Apache Cassandra User Mailing List, hsos...@cyngn.com
Having the same issue on my local machine. Is there a solution for that?

Arun Pandian

Nov 28, 2016, 5:32:35 AM
to DataStax Java Driver for Apache Cassandra User Mailing List, hsos...@cyngn.com
Hi all,
I faced a similar issue recently; it turned out to be due to a race condition in my code. The BoundStatement class is not thread-safe, so make sure you are not sharing a BoundStatement across different threads. In the code posted by Hale, if "insertStmt" is a BoundStatement, then the lost writes are likely due to that race condition in the application.
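
A minimal sketch of the safe pattern under that assumption: share the PreparedStatement, but bind a fresh BoundStatement inside each task. The executor size and the id-only bind are illustrative only:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import com.datastax.driver.core.BoundStatement;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

public class PerTaskBinding {
    // PreparedStatement is safe to share across threads; BoundStatement is not,
    // so each task binds its own instance instead of mutating a shared one.
    public static void insertConcurrently(final Session session,
                                          final PreparedStatement insertStmt,
                                          Iterable<String> ids) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (final String id : ids) {
            pool.submit(new Runnable() {
                @Override
                public void run() {
                    BoundStatement bound = insertStmt.bind(id); // new instance per task
                    session.executeAsync(bound);
                }
            });
        }
        pool.shutdown();
    }
}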

Thanks,
Arun.
