org.reflections.Reflections scan takes too long for Kafka Connect keeping it non-functional


Alexander Jipa

Aug 8, 2016, 5:54:56 PM
to Confluent Platform
Hello,
I've spent quite some time trying to figure out why my Control Center instance is not showing the available Connect source configs.
I initially suspected a wrong replication factor for the control-center topics (3 by default), but setting it to 1 did not help.

Only when I gave Kafka Connect considerable time to run before checking for configs did I find the following:

[2016-08-08 17:39:11,764] INFO Kafka Connect started (org.apache.kafka.connect.runtime.Connect:58)
[2016-08-08 17:43:36,234] INFO Reflections took 265266 ms to scan 271 urls, producing 13409 keys and 85120 values  (org.reflections.Reflections:229)

And only after that it returned the config list...

A 5-minute scan might be acceptable, but I believe there should be some indication that Connect is not fully loaded yet (it's not very honest to log that it has started).

Are you guys considering an alternative to the Reflections library, or another approach to identifying connector configs? E.g. the built-in Java SPI.
It gets more tedious the more jars one puts on the classpath, especially for custom connectors.

---
Cheers,
Alexander

Roger Hoover

Aug 8, 2016, 6:15:15 PM
to confluent...@googlegroups.com
Hi Alexander,

One reason the scan may take so long is if Kafka Connect is launched from the root directory.  Doing something like this should reduce the scan to a few seconds.

cd /etc/kafka && /usr/bin/connect-distributed /tmp/connect-distributed.properties

Cheers,

Roger


--
You received this message because you are subscribed to the Google Groups "Confluent Platform" group.
To unsubscribe from this group and stop receiving emails from it, send an email to confluent-platform+unsubscribe@googlegroups.com.
To post to this group, send email to confluent-platform@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/confluent-platform/9f4298dd-e18a-4117-83e2-ff52c8ee02fd%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Alexander Jipa

Aug 8, 2016, 11:06:56 PM
to Confluent Platform
Hello,
My Kafka Connect is not launched from the root directory. In fact, it's launched from a dedicated folder that contains nothing but the Confluent files.
Reflections is only traversing the jar files I have on the classpath. This, again, becomes more of an issue the more libraries one uses (with all their transitive dependencies).
Even several seconds sounds like overkill for such a simple task, don't you agree?
Why would we need to scan the whole classpath just to find the few classes we are interested in?
There are so many ways to tackle this problem and they all seem like a better alternative:
- slf4j static binding: StaticLoggerBinder (with class loader isolation for individual connectors)
- Java SPI
- Manifest
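
As a rough illustration of the Java SPI option listed above, the sketch below uses java.util.ServiceLoader to find implementations by reading a small registration file instead of scanning every class on the classpath. ConnectorPlugin and FileSourcePlugin are hypothetical names, not Kafka Connect types, and the registration file is written to a temporary directory only so the example runs on its own. A real connector jar would ship that file under META-INF/services.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

final class SpiDiscoverySketch {

    /** Hypothetical plugin interface; it only stands in for a connector type. */
    public interface ConnectorPlugin {
        String name();
    }

    /** Hypothetical implementation that a connector jar would provide. */
    public static class FileSourcePlugin implements ConnectorPlugin {
        @Override
        public String name() {
            return "file-source";
        }
    }

    /** Lists the plugins registered under META-INF/services and visible to the loader. */
    static List<String> discover(ClassLoader loader) {
        List<String> names = new ArrayList<>();
        for (ConnectorPlugin plugin : ServiceLoader.load(ConnectorPlugin.class, loader)) {
            names.add(plugin.name());
        }
        return names;
    }

    /**
     * Assumption for the demo only: writes the registration file into a temp
     * directory and exposes that directory through a URLClassLoader.
     */
    static ClassLoader loaderWithRegistration() {
        try {
            Path root = Files.createTempDirectory("spi-sketch");
            Path services = Files.createDirectories(root.resolve("META-INF").resolve("services"));
            // The file is named after the service type and lists one provider class per line.
            Files.write(services.resolve(ConnectorPlugin.class.getName()),
                    FileSourcePlugin.class.getName().getBytes(StandardCharsets.UTF_8));
            return new URLClassLoader(new URL[] {root.toUri().toURL()},
                    SpiDiscoverySketch.class.getClassLoader());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(discover(loaderWithRegistration()));
    }
}
```

With this approach the lookup cost depends on the number of registration files rather than on the number of classes in every jar, which is the point being argued here.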

---
Cheers,
Alexander

Roger Hoover

Aug 9, 2016, 12:38:24 AM
to confluent...@googlegroups.com
Yeah, I agree that the design should be improved for loading connectors.  I'm not up to date on the latest plans to fix it though.

Ewen Cheslack-Postava

Aug 9, 2016, 11:31:31 PM
to Confluent Platform
Yes, we plan to improve this. Classpath isolation for Connectors is planned and should help improve this.

-Ewen

elizabet...@stitchfix.com

Apr 11, 2017, 3:12:55 AM
to Confluent Platform
I encountered this behavior when my CLASSPATH variable ended with a : character. I had set CLASSPATH=/path/to/connectors:$CLASSPATH while CLASSPATH was empty, which left a trailing colon. The trailing colon caused the connector to scan the root directory.
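
A small sketch of how one might guard against that trailing colon, assuming (as the Java launcher generally does) that an empty classpath entry is resolved as the current working directory. ClasspathCheck, prepend, and emptyEntryPositions are hypothetical helper names for illustration, not part of Kafka Connect.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class ClasspathCheck {

    /** Prepends dir to an existing classpath without leaving a dangling separator. */
    static String prepend(String dir, String existing, String separator) {
        if (existing == null || existing.isEmpty()) {
            return dir;
        }
        return dir + separator + existing;
    }

    /** Returns the positions of empty entries, e.g. the one left by a trailing separator. */
    static List<Integer> emptyEntryPositions(String classpath, String separator) {
        List<Integer> positions = new ArrayList<>();
        if (classpath == null || classpath.isEmpty()) {
            return positions;
        }
        // A limit of -1 keeps trailing empty strings, which split() would otherwise drop.
        String[] entries = classpath.split(Pattern.quote(separator), -1);
        for (int i = 0; i < entries.length; i++) {
            if (entries[i].isEmpty()) {
                positions.add(i);
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        String risky = "/path/to/connectors" + ":" + "";
        System.out.println(emptyEntryPositions(risky, ":"));
        System.out.println(prepend("/path/to/connectors", "", ":"));
    }
}
```

The same idea applies in a launch script: only add the separator when the existing CLASSPATH value is non-empty.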