Memory issue caused by driver-side metadata refresh

Jing Meng

Mar 9, 2018, 2:36:14 AM
to DataStax Java Driver for Apache Cassandra User Mailing List
Hi, we are currently experiencing an issue related to the cassandra-driver's metadata. The root cause (specific to our cluster environment) seems to be the different replication strategies that exist in our cluster, especially for system_auth.

cassandra-driver: 3.1.4
cassandra: 2.1.18 (maybe not related)

The problem:
    Every time we join multiple nodes to the cluster (or change the replication factors of multiple keyspaces) within a short time window, client-side applications using the Cassandra java-driver trigger Young/Full GCs.

Analysis:
    We figured out that adding a new node or changing keyspace replication triggers a TOPOLOGY_CHANGE/SCHEMA_CHANGE event, which the driver-side application listens for, and that the extra memory pressure is caused by TokenMap.build.
    
    Per https://datastax-oss.atlassian.net/browse/JAVA-664, mappings are cached and shared between keyspaces that have the same replication factors. Our cluster has 5 different per-datacenter replication configurations (including the non-replicated system keyspace), 3 of which are basically the same. The system_auth keyspace, however, has a replication factor of about 20 in each of 2 major datacenters (sum of RF: 50).
   
    We packaged a modified version of cassandra-driver-core that only calculates the mapping for keyspaces specified by the application, and ran some simple tests. It seems that calculating the mapping for system_auth alone takes around 100 MB of memory...
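
To illustrate why a single high-RF keyspace can dominate, here is a rough sketch in plain Java. It assumes the driver keeps one replica set per ring token for each distinct replication configuration (an assumption about the internals, not something confirmed in this thread), so the number of host references grows as ring tokens times the RF sum. The node count and vnode count are hypothetical; only the RF sum of 50 comes from the numbers above.

```java
/** Hypothetical estimate, not driver code. */
final class TokenMapEstimate {
    /** Host references held by one token-to-replicas map: one replica set per ring token. */
    static long replicaReferences(int nodes, int vnodesPerNode, int totalReplicationFactor) {
        long ringTokens = (long) nodes * vnodesPerNode;
        return ringTokens * totalReplicationFactor;
    }

    public static void main(String[] args) {
        // 200 nodes and 256 vnodes per node are made-up inputs for illustration.
        System.out.println("system_auth-like (RF sum 50): " + replicaReferences(200, 256, 50));
        System.out.println("typical keyspace (RF sum 3):  " + replicaReferences(200, 256, 3));
    }
}
```

This counts references only; it says nothing about the actual bytes per entry, which depend on the collection types the driver uses.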

Question:
    1. We read the cassandra-driver and Cassandra source code concerning system_auth. Is it true that if the application never uses system_auth, the driver itself will never send queries to it, and that it is only queried when a Cassandra node authenticates the user?
    2. If we keep the metadataEnabled configuration set to true while computing the token map only for a restricted set of keyspaces (e.g. system and the keyspaces that the application uses), will we lose something like token awareness? Furthermore, what exactly would happen if we disabled metadata on the driver side?
    3. Is there a more elegant solution to this in newer versions of the (Java) cassandra-driver?

Thanks for reading, and sorry for my poor expression :p

Andy Tolbert

Mar 9, 2018, 10:41:21 AM
to DataStax Java Driver for Apache Cassandra User Mailing List
Hi Jing,

Unfortunately this is an issue that has come up a few times, especially with larger clusters that use vnodes (I suspect you are using vnodes, is that correct?). As you indicated, we've done some work to limit the computation and memory footprint of the token map, but a keyspace with a particularly high RF is still a problem. One idea I had was to 'memoize' the TokenMap (JAVA-857), that is, only load a keyspace's token map when it's requested. I think that would work great in cases where you have complex keyspaces that aren't used by the application (such as system_auth), but the implementation could be complex (especially in a concurrent application). A way to filter keyspaces would probably be the best approach.
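
As a minimal sketch of the memoization idea (this is not the JAVA-857 design or any driver class, just plain Java with a hypothetical `LazyTokenMaps` helper), a concurrent map can build a keyspace's token map on first request only:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical lazy cache, not a driver class: builds a keyspace's token map on first request. */
final class LazyTokenMaps<M> {
    private final ConcurrentMap<String, M> byKeyspace = new ConcurrentHashMap<>();
    private final Function<String, M> builder;

    LazyTokenMaps(Function<String, M> builder) {
        this.builder = builder;
    }

    /** Returns the cached map, building it once if this keyspace was never requested. */
    M get(String keyspace) {
        // ConcurrentHashMap.computeIfAbsent applies the function at most once per key.
        return byKeyspace.computeIfAbsent(keyspace, builder);
    }

    /** Drops every cached map, e.g. after a topology or schema change event. */
    void invalidateAll() {
        byKeyspace.clear();
    }

    public static void main(String[] args) {
        AtomicInteger builds = new AtomicInteger();
        LazyTokenMaps<String> maps = new LazyTokenMaps<>(ks -> {
            builds.incrementAndGet();      // stands in for the expensive computation
            return "token-map-for-" + ks;
        });
        maps.get("app_ks");
        maps.get("app_ks");                // served from the cache
        // system_auth is never requested, so its map is never built
        System.out.println("builds: " + builds.get());
    }
}
```

The hard part is what this sketch glosses over: a build that is still running when `invalidateAll()` is called can store a map computed from the old topology, which is one example of the concurrency complexity mentioned above.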

To answer your questions:

    1. We read the cassandra-driver and Cassandra source code concerning system_auth. Is it true that if the application never uses system_auth, the driver itself will never send queries to it, and that it is only queried when a Cassandra node authenticates the user?

That is true. The driver itself will never query the system_auth keyspace, unless you use the driver to query it yourself.

    2. If we keep the metadataEnabled configuration set to true while computing the token map only for a restricted set of keyspaces (e.g. system and the keyspaces that the application uses), will we lose something like token awareness? Furthermore, what exactly would happen if we disabled metadata on the driver side?

If you make a change that restricts the token map to certain keyspaces, I would anticipate that all token-oriented features, such as token-aware routing, won't work for the keyspaces you exclude. If implemented correctly, token-aware routing would still work for the keyspaces you don't restrict. I would also expect the schema API methods in Metadata to keep working.
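
The behavior described above can be sketched in plain Java. This is a hypothetical model, not driver code: hosts and routing keys are plain strings, and a keyspace without token data simply falls back to the default host order instead of putting replicas first.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch, not driver code: token-aware ordering with a per-keyspace fallback. */
final class FilteredTokenAwareSketch {
    private final List<String> allHosts;
    // keyspace -> routing key -> replica hosts; restricted keyspaces have no entry at all
    private final Map<String, Map<String, List<String>>> replicas;

    FilteredTokenAwareSketch(List<String> allHosts,
                             Map<String, Map<String, List<String>>> replicas) {
        this.allHosts = allHosts;
        this.replicas = replicas;
    }

    List<String> queryPlan(String keyspace, String routingKey) {
        List<String> known = replicas
                .getOrDefault(keyspace, Collections.emptyMap())
                .getOrDefault(routingKey, Collections.emptyList());
        if (known.isEmpty()) {
            return allHosts; // no token data: fall back to the plain host order
        }
        List<String> plan = new ArrayList<>(known); // replicas first
        for (String host : allHosts) {
            if (!plan.contains(host)) {
                plan.add(host); // then every remaining host
            }
        }
        return plan;
    }

    public static void main(String[] args) {
        FilteredTokenAwareSketch sketch = new FilteredTokenAwareSketch(
                List.of("h1", "h2", "h3"),
                Map.of("app_ks", Map.of("k1", List.of("h3"))));
        System.out.println(sketch.queryPlan("app_ks", "k1"));      // replica h3 comes first
        System.out.println(sketch.queryPlan("system_auth", "k1")); // default order, no token data
    }
}
```

Queries against the excluded keyspace still succeed in this model; they just lose the replica-first ordering and may need an extra coordinator hop.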

    3. Is there a more elegant solution to this in newer versions of the (Java) cassandra-driver?

I am glad you asked. We do recognize this as a possible issue, so in the upcoming major version of the Java driver (4.0) we offer a configuration option, refreshed-keyspaces, to specify which keyspaces you want to explicitly include.
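
As a hedged illustration, the option might be set like this in the driver's configuration file. This assumes the 4.x layout, where the option sits under `advanced.metadata.schema`; the keyspace name is a placeholder, and the exact path and semantics should be checked against the `reference.conf` of the version in use.

```
# application.conf (driver 4.x, assumed layout)
datastax-java-driver {
  advanced.metadata.schema {
    # Only the listed keyspaces are refreshed; "app_ks" is a placeholder name.
    refreshed-keyspaces = [ "app_ks" ]
  }
}
```

With a list like this, system_auth would be left out of the refreshed metadata, so token-oriented features would not be expected to cover it.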

Thanks,
Andy

--
You received this message because you are subscribed to the Google Groups "DataStax Java Driver for Apache Cassandra User Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to java-driver-user+unsubscribe@lists.datastax.com.

Jing Meng

Mar 10, 2018, 1:48:26 AM
to DataStax Java Driver for Apache Cassandra User Mailing List
Happy to know that it's already being handled. We'll check out what's new in the newer drivers; maybe we've already been left miles behind.

Yes, we are using vnodes (though we're still not quite clear on how a vnode is represented in the driver source code). In that case, the number of actual nodes seems irrelevant to the memory issue, is that right?
And just to be clear (sorry for another round):
1. As for system_traces, is it only queried when the client application explicitly uses QueryTrace or issues a literal query against it? I take it that the stored trace is written by the server nodes.
2. Regarding "token oriented features": for plain Java applications that "use Cassandra" (i.e. just run queries against user-defined keyspaces, without Spark) rather than "complement/integrate with Cassandra", it's literally just TokenAwarePolicy, am I right? I would imagine other applications could use token awareness for gathering information or managing token-related events.

Thanks for your detailed explanation.
It's great to have a reliable community to consult, as Cassandra is rarely used, and less known and understood than it deserves, among Chinese companies and developers...

On Friday, March 9, 2018 at 11:41:21 PM UTC+8, Andrew Tolbert wrote: