ControlConnection refactoring

16 views
Skip to first unread message

Dmitry

unread,
Jun 14, 2016, 5:46:13 AM6/14/16
to DataStax Java Driver for Apache Cassandra User Mailing List
Hi guys!

I've been recently looking into java driver code and it seems that there are several issues with code organization, I thought that, for example, ControlConnection may benefit from some refactoring. 

So i've started a branch to see what can be done, but is seems a lot of work and a major code change (though most of it is copy-pasting) so I decided to ask if these are indeed "issues" to your opinion.

So here are some thoughs (about usages from ControlConnection mainly):

1. Cluster acts a a God-object, everybody have access to all its fields so it is not easy to check what methods are actually needed for specific class. Adding more granular interfaces (like ClusterHosts for Metadata with getHost/getAllHosts/addIfAbsent) and passing them via dependency injection would allow better unit testing, because such dependencies are more easily mocked.

2. Replacing static methods with instance ones (same dependency injection pricinple).

3. Moving metadata update code to owner class. Currently Host and some other data like partitioner can be updated from any part of code, for most cases it may be better to have a single place for such updates - in the class holding most of this data, for better encapsulation. Not sure about implementation details though.

4. Splitting ControlConnection into several small classes each with 1-2 methods doing something very specific - connection logic, quering data, parsing result set rows, updating metadata to match queried data, etc. Purpose - more granular unit testing.

5. Dependency injection for loggers (if messages should be asserted). Static loggers are easy to use but asserting messages requires too much code. Since most objects live for a long time dependency injecting logger into them should not be a problem.

So let's discuss it. If some of these issues are really considered issues I may try to contribute some code later.

Alex Popescu

unread,
Jun 14, 2016, 4:42:43 PM6/14/16
to java-dri...@lists.datastax.com
Hi Dmitry,

Thanks for your investigation. While all your points are good & valid, if you are planning to work on some proposed changes I'd suggest a more pragmatic approach that would start with the problem being solved first (e.g. instead of more granular unit testing, the question would be "what unit tests are needed in this area that cannot be done right now", etc.). There are thing that needs to be kept in mind is maintaining the existing API and also not introducing unnecessary new APIs.

--
You received this message because you are subscribed to the Google Groups "DataStax Java Driver for Apache Cassandra User Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to java-driver-us...@lists.datastax.com.



--
Bests,

Alex Popescu | @al3xandru
Sen. Product Manager @ DataStax



» DataStax Enterprise - the database for cloud applications. «


Dmitry

unread,
Jun 16, 2016, 12:39:44 PM6/16/16
to DataStax Java Driver for Apache Cassandra User Mailing List
So I've spent some more time looking into it. Results are not that great, but still I think it is better to share some thoughts.

Initial intention was to be able to create Cluster in unit tests and have some mock objects to imitate nodes and be able to trigger connection failures.


1. I've moved ControlConnection system tables quering logic into several small separate classes. This is highly subjective, but I personally think that having 1-2 methods per class results is simplier unit tests organization - tests for class MyObject go to MyObjectClass and it is easy to find them and check if some specific test is present (than in 1000 line test class or among case-named classes like ClusterInitTest, SchemaAgreementTest, etc.)

2.  I've moved ControlConnection system tables quering logic into small separate classes. Since cluster nodes are stored in system.local, system.peers and these tables are queried with 2-3 basic queries - I've wrapped these queries into SystemTableQuery class to provide ability to test metadata parsing without real cassanda. Approach resulted in tests beeing very similar to those using scassandra, but beeing 100x times faster and not dependent on protocol version and other transport-related stuff. However, this still doesn't help in creating Cluster via pure unit test, but it is one step further in this direction.

3. Next things got complicated. ControlConnection has isInitialConnection flag. As far as I understood it was added because first connect attempt is performed when LoadBalancingPolicy is not initialized. It seems that in most cases it can be removed and invocations to LBP be blocked via wrapper class that ignores all invocations till init() is called. But I'm not sure if this approach is safe enough and there is also a single host.setDown() call in ControlConnection.reconnectInternal() that finished me off :) Reconnection and node up/down marking logic is not that clear so I failed to understand if it is possible to replace it with some other call. I think better approach to provide different behavior upon first connect would be to move that initial-connect-specific logic into Cluster.init(), but that would require some ControlConnection flow changes that are not that safe.

Olivier Michallat

unread,
Jun 22, 2016, 2:07:57 PM6/22/16
to java-dri...@lists.datastax.com
Hi Dmitry,

I agree with your remarks, we're aware that the codebase has organization issues that need to be addressed. But as you've seen yourself some parts are complex and sensitive, and what looks like simple refactorings can quickly become more involved. There's also the issue of backward compatibility, we can't do any change on the 3.x branch that would break the public API.

We're thinking of ways to reorganize the codebase, but it might take a bit before we get to it. In the meantime I would be in favor of avoiding huge refactorings, in order to minimize regressions.

Olivier

--

Olivier Michallat

Driver & tools engineer, DataStax


--
Reply all
Reply to author
Forward
0 new messages