Writing into table that was created using CASS_CONSISTENCY_ALL fails with 'unconfigured columnfamily' periodically


Robin Mahony

May 1, 2015, 6:30:45 PM
to cpp-dri...@lists.datastax.com
Hi there,

I am creating a table through the driver synchronously at CASS_CONSISTENCY_ALL. I would expect that if the table creation succeeds, all nodes now have the new table. However, when I try to insert into the table immediately after creating it, the insert periodically fails with:
'unconfigured columnfamily sometable'.

Can someone explain this to me? It suggests that the write consistency level isn't actually being applied to table creation.

Cheers,

Robin

Michael Penick

May 4, 2015, 5:04:49 PM
to cpp-dri...@lists.datastax.com
Consistency levels do not apply to schema changes, because schema changes propagate through a different mechanism than queries. By the time the driver returns, the schema should have propagated to the cluster, unless the driver's wait for schema agreement timed out (the maximum schema agreement time is 10 seconds). If it timed out, you'll see this in the driver logs:

"No schema agreement on live nodes after XXXX ms. Schema may not be up-to-date on some nodes."

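To make "schema agreement" concrete, here is a rough, self-contained sketch of the kind of check involved. It assumes agreement means every node reports the same schema version (in a real cluster, the schema_version column of system.local and system.peers). The names SchemaVersions, schemas_agree and wait_for_schema_agreement are made up for illustration and are not the driver's API, and the driver's actual logic may differ.

```cpp
#include <chrono>
#include <functional>
#include <optional>
#include <string>
#include <thread>
#include <vector>

// Schema versions as seen through one node. In a real cluster these would be
// read from system.local and system.peers; here they are plain strings so the
// sketch runs without a cluster. (Hypothetical type, not part of the driver.)
struct SchemaVersions {
  std::optional<std::string> local;  // empty if no local row was returned
  std::vector<std::string> peers;    // one entry per live peer
};

// True only when a local version is known and every peer reports the same one.
// Treating a missing local version as "no agreement" is a choice made for this
// sketch, not a statement about what the driver does.
bool schemas_agree(const SchemaVersions& v) {
  if (!v.local) return false;
  for (const auto& peer : v.peers) {
    if (peer != *v.local) return false;
  }
  return true;
}

// Polls fetch() until the versions agree or max_wait elapses.
// Returns true on agreement, false on timeout. The 10 second default mirrors
// the maximum mentioned above; the poll interval is an arbitrary assumption.
bool wait_for_schema_agreement(
    const std::function<SchemaVersions()>& fetch,
    std::chrono::milliseconds max_wait = std::chrono::seconds(10),
    std::chrono::milliseconds poll_interval = std::chrono::milliseconds(200)) {
  const auto deadline = std::chrono::steady_clock::now() + max_wait;
  for (;;) {
    if (schemas_agree(fetch())) return true;
    if (std::chrono::steady_clock::now() + poll_interval > deadline) return false;
    std::this_thread::sleep_for(poll_interval);
  }
}
```

A false return here corresponds to the "No schema agreement" case: the caller gets control back even though some nodes may not have the new table yet.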

Mike


Karl Lehenbauer

May 5, 2015, 7:29:52 AM
to cpp-dri...@lists.datastax.com
On Monday, May 4, 2015 at 4:04:49 PM UTC-5, Michael Penick wrote:
> Consistency levels do not apply to schema changes, because schema changes propagate through a different mechanism than queries. By the time the driver returns, the schema should have propagated to the cluster, unless the driver's wait for schema agreement timed out (the maximum schema agreement time is 10 seconds). If it timed out, you'll see this in the driver logs:
>
> "No schema agreement on live nodes after XXXX ms. Schema may not be up-to-date on some nodes."

Hi Mike,

We are seeing the same problem Robin reported: shortly after a schema change has been performed, we get "unconfigured columnfamily" when attempting to insert into the new table. So I don't think the schema has been fully propagated to the cluster by the time the driver returns.
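
One application-side mitigation, independent of the root cause, is to retry the first write when it fails with this specific error. The following is only a sketch under assumptions: AttemptResult, is_unconfigured_table_error and retry_until_table_visible are hypothetical names, the attempt callback stands in for whatever actually executes the insert, and the attempt count and delay are arbitrary.

```cpp
#include <chrono>
#include <functional>
#include <string>
#include <thread>

// Outcome of one attempt: success flag plus the server's error text, if any.
// (Hypothetical type for this sketch, not part of the driver.)
struct AttemptResult {
  bool ok;
  std::string error;
};

// True if the error text looks like the missing-table error from this thread.
bool is_unconfigured_table_error(const std::string& error) {
  return error.find("unconfigured columnfamily") != std::string::npos;
}

// Runs attempt() up to max_attempts times, sleeping between tries. Only the
// "unconfigured columnfamily" error is retried; success or any other error
// is returned immediately.
AttemptResult retry_until_table_visible(
    const std::function<AttemptResult()>& attempt,
    int max_attempts = 5,
    std::chrono::milliseconds delay = std::chrono::milliseconds(500)) {
  AttemptResult result{false, "no attempts made"};
  for (int i = 0; i < max_attempts; ++i) {
    result = attempt();
    if (result.ok || !is_unconfigured_table_error(result.error)) return result;
    if (i + 1 < max_attempts) std::this_thread::sleep_for(delay);
  }
  return result;
}
```

This only papers over the symptom; it does not explain why the table is missing on some nodes after the create call returns.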

At one point there was an error message along the lines of "schema hasn't propagated yet", but I didn't capture it and can't find it right now.

While looking for that message, I also found quite a few null pointer exceptions in the logs. I don't know whether they are a problem or not. Here's one:

ERROR [CompactionExecutor:7159] 2015-05-04 17:14:04,571 CassandraDaemon.java:167 - Exception in thread Thread[CompactionExecutor:7159,1,main]
java.lang.NullPointerException: null
    at org.apache.cassandra.service.CacheService$KeyCacheSerializer.serialize(CacheService.java:475) ~[apache-cassandra-2.1.3.jar:2.1.3]
    at org.apache.cassandra.service.CacheService$KeyCacheSerializer.serialize(CacheService.java:463) ~[apache-cassandra-2.1.3.jar:2.1.3]
    at org.apache.cassandra.cache.AutoSavingCache$Writer.saveCache(AutoSavingCache.java:274) ~[apache-cassandra-2.1.3.jar:2.1.3]
    at org.apache.cassandra.db.compaction.CompactionManager$11.run(CompactionManager.java:1152) ~[apache-cassandra-2.1.3.jar:2.1.3]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_75]
    at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_75]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_75]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_75]
    at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]

Joe Mistachkin

May 5, 2015, 4:19:40 PM
to cpp-dri...@lists.datastax.com
I'm also seeing this behavior. Creating a new table and then using it
results in errors about the table or "columnfamily" not being found.

I enabled full tracing in our test suite and I think I've uncovered some
important clues as to why:

If you search the log for "Schema change ", you'll find 105 matches. Only the
second and third matches, on lines 1063 and 1125, include table names. The
second match is of type "CREATED" and the third is of type "DROPPED". None of
the other matches include a table name, even when other new tables are created
and dropped. Note also that test cass-6.2, which the second and third matches
belong to, always passes, yet it is almost identical to another test,
cass-6.4, which fails.

I'm not sure where the missing table schema changes are going; however, after
looking at the cpp-driver code, I don't think they are being dropped there.

I can provide the full test log upon request. It does not contain anything
sensitive; however, it is around 5MB uncompressed or 184K zip compressed.

--
Joe Mistachkin

Karl Lehenbauer

May 5, 2015, 4:36:35 PM
to cpp-dri...@lists.datastax.com
Joe asked that we check the system clocks on all the cluster nodes, and sure enough, one of the four did not have NTP synchronization established. After correcting that, ALL of our tests are now passing.

Also, the nodes had no reverse DNS entries, which we fixed as well. I don't know whether that had anything to do with it, but fixing it certainly made ssh connect more quickly.

Michael Penick

May 5, 2015, 4:49:06 PM
to cpp-dri...@lists.datastax.com
Is anyone seeing failure to reach schema agreement in the driver logs? Or does anyone have debug-level driver logs they could share when this was happening?

I'd like to track down whether this is a driver issue. I think I see one possible issue that occurs if the 'system.local' and 'system.peers' tables are empty.

Mike
