I’m facing an issue with a Kafka-based lookup in Druid and would appreciate your guidance.
I have a lookup backed by a Kafka topic containing approximately 60 million rows. The lookup is configured as one-to-one, using Kafka as the source.
This is the lookup configuration (JSON):
{
  "type": "kafka",
  "kafkaProperties": {
    "bootstrap.servers": "kafka.kafka.svc.cluster.local:9092",
    "security.protocol": "PLAINTEXT"
  },
  "connectTimeout": 100000,
  "kafkaTopic": "lookup-device",
  "isOneToOne": true
}
From the Coordinator endpoint
/druid/coordinator/v1/lookups/status?detailed=true,
I can see that:
loaded = true
pendingNodes is empty
So from the Coordinator perspective, the lookup appears to be fully loaded on all nodes.
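For reference, this is roughly how I read that status; a minimal sketch, assuming the response is a JSON object keyed by tier and then by lookup name, with the `loaded` and `pendingNodes` fields shown above. The Coordinator URL, the lookup name `device_lookup`, and the helper names are placeholders, not values from my cluster.

```python
import json
from urllib.request import urlopen


def fetch_status(coordinator_url):
    """GET the detailed lookup status from the Coordinator (URL is a placeholder)."""
    url = coordinator_url.rstrip("/") + "/druid/coordinator/v1/lookups/status?detailed=true"
    with urlopen(url, timeout=30) as resp:
        return json.load(resp)


def pending_lookups(status):
    """Return (tier, lookup, pending_nodes) for every lookup not fully loaded.

    Assumes the shape {tier: {lookup: {"loaded": bool, "pendingNodes": [...]}}}.
    """
    result = []
    for tier, lookups in status.items():
        for name, state in lookups.items():
            pending = list(state.get("pendingNodes", []))
            if not state.get("loaded", False) or pending:
                result.append((tier, name, pending))
    return result


# Hypothetical response matching what I see: loaded, nothing pending.
sample = {"__default": {"device_lookup": {"loaded": True, "pendingNodes": []}}}
not_ready = pending_lookups(sample)
```

With the status I get back, `pending_lookups` returns an empty list, which is why I would expect the lookup to be queryable right away.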
However, when I run queries that depend on this lookup, the data is not available immediately. The lookup only starts returning data after about 2 hours, with no configuration changes and no manually triggered reloads.
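To put the delay in perspective, here is a back-of-the-envelope calculation. It assumes (I have not verified this) that the whole 2 hours is spent populating the lookup from the topic, and it uses my approximate figures of 60 million entries and 2 hours.

```python
# Approximate figures from my setup; both are rough estimates.
entries = 60_000_000
delay_seconds = 2 * 60 * 60  # about 2 hours

# Implied average population rate if the full delay is load time (assumption).
rate = entries / delay_seconds
print(f"{rate:,.0f} entries/second")  # prints "8,333 entries/second"
```

If that assumption holds, the effective rate is only about 8,333 entries per second.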
This is my setup:
druid.lookup.namespace.cache.type=offHeap
druid.lookup.namespace.numExtractionThreads=4
druid.lookup.namespace.numBufferedEntries=1000
druid.lookup.snapshotWorkingDir=/tmp
druid.lookup.enableLookupSyncOnStartup=true
druid.lookup.numLookupLoadingThreads=20
druid.lookup.coordinatorFetchRetries=6
druid.lookup.lookupStartRetries=6
These are the loaded extensions:
druid.extensions.loadList=["druid-kafka-indexing-service", "postgresql-metadata-storage", "druid-s3-extensions", "druid-datasketches", "druid-stats", "druid-histogram", "prometheus-emitter", "druid-multi-stage-query", "druid-kafka-extraction-namespace", "druid-lookups-cached-global"]
This behavior makes it difficult to rely on the lookup for near-real-time queries.
My questions are:
1. Is this delay expected for large Kafka-based lookups (~60M entries)?
2. Which parameters or components should be tuned to reduce the lookup load time and make it available faster?
3. Are there best practices or limits recommended for lookup size when using Kafka as the source?
Any recommendations on memory settings, lookup loading strategy, or alternative approaches would be very helpful.
Thanks in advance for your support.
Best regards,