Use of MessageDigest.getInstance() results in lock contention


Tom Leach

Feb 6, 2017, 9:20:07 PM
to DataStax Java Driver for Apache Cassandra User Mailing List
Hi folks, 

I hope this is the right place to raise this. I'm working on a loader service which is driving about 20,000 writes per second to a Cassandra cluster from a single machine, across about 48 threads. 

This service is seeing heavy lock contention from the MessageDigest.getInstance("MD5") call inside Token.java.

Here's the relevant portion of the stack trace taken from the thread dump:

"ForkJoinPool-1-worker-8" #227 daemon prio=5 os_prio=0 tid=0x00007f29b8005000 nid=0x160c waiting for monitor entry [0x00007f281022c000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at java.security.Provider.getService(Provider.java:1035)
        - waiting to lock <0x00007f2af73e7838> (a sun.security.provider.Sun)
        at sun.security.jca.ProviderList.getService(ProviderList.java:332)
        at sun.security.jca.GetInstance.getInstance(GetInstance.java:157)
        at java.security.Security.getImpl(Security.java:695)
        at java.security.MessageDigest.getInstance(MessageDigest.java:167)
        at com.datastax.driver.core.Token$RPToken$RPTokenFactory.md5(Token.java:574)
        at com.datastax.driver.core.Token$RPToken$RPTokenFactory.hash(Token.java:604)
        at com.datastax.driver.core.Token$RPToken$RPTokenFactory.hash(Token.java:564)
        at com.datastax.driver.core.Metadata.getReplicas(Metadata.java:300)
        at com.datastax.driver.core.policies.TokenAwarePolicy.newQueryPlan(TokenAwarePolicy.java:129)
        at com.netflix.aeneas.nf.policies.EurekaAwarePolicy.newQueryPlan(EurekaAwarePolicy.java:91)
        at com.datastax.driver.core.RequestHandler.<init>(RequestHandler.java:82)
        at com.datastax.driver.core.SessionManager.executeAsync(SessionManager.java:132)
        ...


It appears that the underlying java.security.Provider.getService method is synchronized, so when enough query plans are created concurrently it becomes a bottleneck.

Note that similar issues with MessageDigest.getInstance() appear to have been raised against Guava and AtlasDB, and were fixed by cloning a prototype MessageDigest instance instead.
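
For illustration, the prototype-and-clone approach could look roughly like the sketch below. This is not the driver's code or Guava's; the class name Md5Prototype and its md5 method are hypothetical, and the fallback to getInstance() assumes that some providers may not support clone().

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper for illustration; not the driver's actual implementation. */
final class Md5Prototype {
    // Looked up once at class initialization, so the synchronized
    // Provider.getService call is not hit on every hash.
    private static final MessageDigest PROTOTYPE = newDigest();

    private Md5Prototype() {}

    private static MessageDigest newDigest() {
        try {
            return MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    /** Returns the MD5 digest of the input, computed on a fresh clone of the prototype. */
    static byte[] md5(byte[] input) {
        MessageDigest digest;
        try {
            // The prototype itself is never updated, only cloned.
            digest = (MessageDigest) PROTOTYPE.clone();
        } catch (CloneNotSupportedException e) {
            // Assumption: fall back to the slower lookup if the provider cannot clone.
            digest = newDigest();
        }
        return digest.digest(input);
    }
}
```

Each caller works on its own clone, so no digest state is shared between threads; only the read-only prototype is.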

I'm happy to submit a PR for this - does my analysis seem reasonable?

Thanks, 

Tom

Alexandre Dutra

Feb 7, 2017, 8:26:41 AM
to DataStax Java Driver for Apache Cassandra User Mailing List
Hi Tom,

Thank you for reporting this issue. It does seem to me that we could reduce the contention here by adopting a strategy similar to Guava's. I created JAVA-1392 to track that.

On a side note, you are using RandomPartitioner instead of the default (Murmur3Partitioner). Any particular reason for that? I know it is hard to change the partitioner once it is set, but it is doable (with tools like sstableloader). Is this something you could envisage?

Hope that helps,

Alexandre

--
You received this message because you are subscribed to the Google Groups "DataStax Java Driver for Apache Cassandra User Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to java-driver-us...@lists.datastax.com.
--
Alexandre Dutra
Driver & Tools Engineer @ DataStax

Alexandre Dutra

Feb 7, 2017, 8:28:35 AM
to DataStax Java Driver for Apache Cassandra User Mailing List
Oh, and by all means, feel free to open a PR. Please use 3.x as your base branch and be sure to follow our guidelines.
Thanks again!

Tom Leach

Feb 7, 2017, 5:11:46 PM
to DataStax Java Driver for Apache Cassandra User Mailing List
Thanks for the quick response Alexandre. I will attempt to put together a PR. 

Re: choice of partitioner - it's mostly for historical / backward compatibility reasons. Depending on how quick the turnaround of this fix is, we may evaluate migrating to Murmur3 as a workaround. Are you able to give me a ball-park idea of how long it typically takes to get a fix merged and released?

Thanks again.

Tom Leach
Senior Software Engineer
Playback Features @ Netflix

Alexandre Dutra

Feb 7, 2017, 6:18:17 PM
to DataStax Java Driver for Apache Cassandra User Mailing List
Hi Tom,

As you can imagine, we cannot commit to a precise date right now. But I assure you that we are definitely considering the next 3.2.0 release as one of our priorities right now; so if you submit a PR in the next weeks I promise it won't be long until it gets released.

Thanks for your active involvement in the driver.

Alexandre

Tom Leach

Feb 7, 2017, 7:10:01 PM
to DataStax Java Driver for Apache Cassandra User Mailing List
That's reasonable, thanks. 

This is actually turning out to be a little more difficult to fix than I'd originally anticipated. 

The issue is that we're creating and reusing a static singleton instance of RPTokenFactory, which therefore will need to be thread-safe. We can create a MessageDigest prototype on construction of the RPTokenFactory instance, but I see no evidence that MessageDigest.clone() is thread-safe. 

This might be OK given the prototype should never get mutated, only cloned, but it's hard to say for sure. 
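
One way to gain some empirical confidence is a small stress check along the lines of the sketch below. It is only an illustration: the class name CloneStressCheck and its run method are hypothetical, it uses Java 8 lambdas, and a passing run is evidence rather than proof that concurrent clone() calls on an unmodified prototype are safe.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical stress check; a passing run does not prove thread safety. */
final class CloneStressCheck {
    private CloneStressCheck() {}

    /** Returns true if every digest computed on a concurrent clone matched the expected one. */
    static boolean run(int threads, final int iterations) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Shared prototype: cloned by all threads, never updated.
            final MessageDigest prototype = MessageDigest.getInstance("MD5");
            final byte[] input = "some-partition-key".getBytes(StandardCharsets.UTF_8);
            final byte[] expected = ((MessageDigest) prototype.clone()).digest(input);

            Callable<Boolean> task = () -> {
                for (int i = 0; i < iterations; i++) {
                    MessageDigest copy = (MessageDigest) prototype.clone();
                    if (!Arrays.equals(expected, copy.digest(input))) {
                        return false;
                    }
                }
                return true;
            };

            List<Future<Boolean>> futures = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                futures.add(pool.submit(task));
            }
            for (Future<Boolean> future : futures) {
                if (!future.get()) {
                    return false;
                }
            }
            return true;
        } catch (Exception e) {
            throw new IllegalStateException("stress check failed to run", e);
        } finally {
            pool.shutdown();
        }
    }
}
```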

Opinions gratefully received.

Alexandre Dutra

Feb 8, 2017, 5:40:40 AM
to DataStax Java Driver for Apache Cassandra User Mailing List
I think this is a safe assumption, given that Guava also stores the prototype in a singleton instance, see here.

Alexandre Dutra

Feb 8, 2017, 8:11:20 AM
to DataStax Java Driver for Apache Cassandra User Mailing List
Hi Tom,

While investigating potential thread safety issues with MessageDigest.clone(), I ended up building a fully functional fix for JAVA-1392, so I went ahead and created the pull request myself:

My apologies if you already had something working on your side too. In any case, feel free to add your remarks or feedback to the PR.

Thanks again,

Alexandre

Tom Leach

Feb 8, 2017, 1:36:47 PM
to DataStax Java Driver for Apache Cassandra User Mailing List
Great! It looks very similar to what I had in progress. I'll add comments.

Stéphane

Feb 9, 2017, 8:30:05 AM
to DataStax Java Driver for Apache Cassandra User Mailing List
Why not store the MessageDigests in a ThreadLocal and recycle them?
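
A rough sketch of what that could look like, for illustration only; the class name Md5ThreadLocal and its md5 method are hypothetical and not taken from the driver:

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper: one recycled MessageDigest per thread. */
final class Md5ThreadLocal {
    private static final ThreadLocal<MessageDigest> DIGEST = new ThreadLocal<MessageDigest>() {
        @Override
        protected MessageDigest initialValue() {
            try {
                // Runs once per thread, on that thread's first call to get().
                return MessageDigest.getInstance("MD5");
            } catch (NoSuchAlgorithmException e) {
                throw new IllegalStateException("MD5 is not available", e);
            }
        }
    };

    private Md5ThreadLocal() {}

    static byte[] md5(byte[] input) {
        MessageDigest digest = DIGEST.get();
        // Discard any partial state left behind by an earlier interrupted use.
        digest.reset();
        // digest(byte[]) also resets the instance once it completes.
        return digest.digest(input);
    }
}
```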

Olivier Michallat

Feb 10, 2017, 6:13:13 PM
to java-dri...@lists.datastax.com
ThreadLocals have their own issues, they can introduce class loader leaks when you deploy the driver in a managed container.
See JAVA-647 for example.

--

Olivier Michallat

Driver & tools engineer, DataStax



Stéphane LANDELLE

Feb 11, 2017, 2:46:47 AM
to java-dri...@lists.datastax.com
Can such ClassLoader leaks happen for classes that are defined in the JDK, such as MessageDigest? Wouldn't the MessageDigest class always be loaded by the top parent ClassLoader?

Stéphane Landelle
GatlingCorp CEO



Olivier Michallat

Feb 21, 2017, 2:16:56 PM
to java-dri...@lists.datastax.com
The leak happens when you use this pattern:

new ThreadLocal<MessageDigest>() {
    @Override protected MessageDigest initialValue() { ... }
};

This creates an anonymous inner class that keeps a reference to the webapp's classloader.
There are ways around it, but after benchmarking the code we decided it was not worth it and went with the simpler solution.

--

Olivier Michallat

Driver & tools engineer, DataStax



Stéphane LANDELLE

Feb 21, 2017, 3:02:42 PM
to java-dri...@lists.datastax.com
Thanks!

Stéphane Landelle
GatlingCorp CEO


