curl http://localhost:8081/druid/coordinator/v1/lookups/config/historicalLookupTier/advertiser_lookup
curl "http://localhost:8083/druid/v1/namespaces/advertiser_lookup?namespaceclass=com.yahoo.maha.maha_druid_lookups.query.lookup.namespace.JDBCExtractionNamespace"
curl "http://localhost:8083/druid/v1/namespaces/advertiser_lookup?namespaceclass=com.yahoo.maha.maha_druid_lookups.query.lookup.namespace.JDBCExtractionNamespace&key=1&valueColumn=status&debug=true"
2020-12-03T19:22:44,457 WARN [qtp372245-71] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.schema.flatbuffer.FlatBufferSchemaFactoryProvider - Implementation of FlatBufferSchemaFactory class name is black in the MahaNamespaceExtractionConfig, considering default implementation
2020-12-03T19:22:44,465 WARN [qtp372245-71] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.schema.flatbuffer.FlatBufferSchemaFactoryProvider - Implementation of FlatBufferSchemaFactory class name is black in the MahaNamespaceExtractionConfig, considering default implementation
2020-12-03T19:22:44,470 INFO [LookupExtractorFactoryContainerProvider-MainThread] com.yahoo.maha.maha_druid_lookups.query.lookup.MahaLookupExtractorFactory - Received request [RocksDBExtractionNamespace{rocksDbInstanceHDFSPath='/', lookupAuditingHDFSPath='/rocksdb_audit', namespace='customer', pollPeriod=PT30S, kafkaTopic='null', cacheEnabled=false, lookupAuditingEnabled=false, lookupName='customer_lookup', tsColumn='null', missingLookupConfig=null, lastUpdatedTime=-1, cacheActionRunner=CacheActionRunner{}, overrideLookupServiceHosts=[], randomLocalPathSuffixEnabled=false}]
2020-12-03T19:22:44,470 INFO [LookupExtractorFactoryContainerProvider-MainThread] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.cache.MahaNamespaceExtractionCacheManager - [customer_lookup] is new
2020-12-03T19:22:44,470 INFO [LookupExtractorFactoryContainerProvider-MainThread] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.cache.MahaNamespaceExtractionCacheManager - Trying to update namespace [customer_lookup]
2020-12-03T19:22:44,471 INFO [LookupExtractorFactoryContainerProvider-MainThread] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.cache.MahaNamespaceExtractionCacheManager - Passed through namespace: RocksDBExtractionNamespace{rocksDbInstanceHDFSPath='/', lookupAuditingHDFSPath='/rocksdb_audit', namespace='customer', pollPeriod=PT30S, kafkaTopic='null', cacheEnabled=false, lookupAuditingEnabled=false, lookupName='customer_lookup', tsColumn='null', missingLookupConfig=null, lastUpdatedTime=-1, cacheActionRunner=CacheActionRunner{}, overrideLookupServiceHosts=[], randomLocalPathSuffixEnabled=false}
with concrete className: com.yahoo.maha.maha_druid_lookups.query.lookup.namespace.RocksDBExtractionNamespace
2020-12-03T19:22:44,486 INFO [MahaNamespaceExtractionCacheManager-0] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.cache.MahaNamespaceExtractionCacheManager - Namespace [customer_lookup] successfully updated. preVersion [null], newVersion [0]
I tried to create the advertiser lookup with the configuration you provided in your previous email, and found the following errors in the historical.log file:
2020-12-04T04:09:12,894 INFO [MahaNamespaceExtractionCacheManager-27] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.RocksDBManager - successMarkerPath [/data/druid/lookups/snapshots/advertiser/load_time=202012030000/_SUCCESS], lastUpdate [0]
2020-12-04T04:09:12,895 ERROR [MahaNamespaceExtractionCacheManager-27] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.RocksDBManager - RocksDB instance not present for namespace [advertiser] loadTime [202012030000], will check for previous loadTime
2020-12-04T04:09:12,895 ERROR [MahaNamespaceExtractionCacheManager-27] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.RocksDBManager - RocksDB instance not present for previous loadTime [202012020000] too for namespace [advertiser]
The JSON configuration file:
{
  "historicalLookupTier": {
    "advertiser_lookup": {
      "version": "v0",
      "lookupExtractorFactory": {
        "type": "cachedNamespace",
        "extractionNamespace": {
          "type": "maharocksdb",
          "namespace": "advertiser",
          "rocksDbInstanceHDFSPath": "/data/druid/lookups/snapshots/advertiser",
          "lookupAuditingHDFSPath": "/data/druid/lookups/audits/advertiser",
          "pollPeriod": "PT30S",
          "cacheEnabled": true,
          "lookupName": "advertiser_lookup"
        }
      }
    }
  }
}
The command to create the Lookup: curl -XPOST -H'Content-Type: application/json' -d '@maharocksdb_config_advertiser.json' http://localhost:8081/druid/coordinator/v1/lookups/config
List the files in HDFS:
╭─jchome@jcmbp-2 /usr/local/Cellar/hadoop/hdfs
╰─$ hdfs dfs -ls /data/druid/lookups/audits/
2020-12-03 23:14:00,588 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 1 items
drwxr-xr-x - jchome supergroup 0 2020-12-03 23:07 /data/druid/lookups/audits/advertiser
╭─jchome@jcmbp-2 /usr/local/Cellar/hadoop/hdfs
╰─$ hdfs dfs -ls /data/druid/lookups/snapshots
2020-12-03 23:14:20,947 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 1 items
drwxr-xr-x - jchome supergroup 0 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser
╭─jchome@jcmbp-2 /usr/local/Cellar/hadoop/hdfs
╰─$ hdfs dfs -ls /data/druid/lookups/snapshots/advertiser
2020-12-03 23:14:28,646 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 23 items
-rw-r--r-- 1 jchome supergroup 781 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/000004.sst
-rw-r--r-- 1 jchome supergroup 795 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/000007.sst
-rw-r--r-- 1 jchome supergroup 26 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/000009.log
-rw-r--r-- 1 jchome supergroup 16 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/CURRENT
-rw-r--r-- 1 jchome supergroup 33 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/IDENTITY
-rw-r--r-- 1 jchome supergroup 0 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/LOCK
-rw-r--r-- 1 jchome supergroup 17523 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/LOG
-rw-r--r-- 1 jchome supergroup 8093 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/LOG.old.1606922775407795
-rw-r--r-- 1 jchome supergroup 21529 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/LOG.old.1606922814108690
-rw-r--r-- 1 jchome supergroup 17499 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/LOG.old.1606922818488882
-rw-r--r-- 1 jchome supergroup 17499 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/LOG.old.1606922929043833
-rw-r--r-- 1 jchome supergroup 17499 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/LOG.old.1606931552747952
-rw-r--r-- 1 jchome supergroup 17499 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/LOG.old.1606936265590609
-rw-r--r-- 1 jchome supergroup 17499 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/LOG.old.1606936718495964
-rw-r--r-- 1 jchome supergroup 17499 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/LOG.old.1607038910772162
-rw-r--r-- 1 jchome supergroup 17499 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/LOG.old.1607039454741716
-rw-r--r-- 1 jchome supergroup 17499 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/LOG.old.1607039510515607
-rw-r--r-- 1 jchome supergroup 24061 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/LOG.old.1607039520978662
-rw-r--r-- 1 jchome supergroup 17511 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/LOG.old.1607039544330285
-rw-r--r-- 1 jchome supergroup 24073 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/LOG.old.1607039548828711
-rw-r--r-- 1 jchome supergroup 161 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/MANIFEST-000008
-rw-r--r-- 1 jchome supergroup 5320 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/OPTIONS-000008
-rw-r--r-- 1 jchome supergroup 5320 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/OPTIONS-000011
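Note that the successMarkerPath logged by RocksDBManager in the historical.log excerpt sits under a load_time=... subdirectory of rocksDbInstanceHDFSPath, while this listing shows the RocksDB files directly under /data/druid/lookups/snapshots/advertiser. A minimal Java sketch of the marker paths those log lines imply, for a given day and the day before, is below. The yyyyMMdd0000 load-time format and the class and method names are assumptions inferred from the two log lines, not taken from the Maha source.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper: reproduces the path layout seen in the RocksDBManager log lines.
final class LoadTimePaths {
    // Assumption: a load time is a calendar day followed by "0000", e.g. 202012030000.
    private static final DateTimeFormatter DAY = DateTimeFormatter.ofPattern("yyyyMMdd");

    static String loadTime(LocalDate day) {
        return day.format(DAY) + "0000";
    }

    // Builds <basePath>/load_time=<loadTime>/_SUCCESS, as in the logged successMarkerPath.
    static String successMarker(String basePath, LocalDate day) {
        return basePath + "/load_time=" + loadTime(day) + "/_SUCCESS";
    }

    public static void main(String[] args) {
        String base = "/data/druid/lookups/snapshots/advertiser";
        LocalDate day = LocalDate.of(2020, 12, 3);
        // Marker for the current load time, then for the previous day's load time.
        System.out.println(successMarker(base, day));
        System.out.println(successMarker(base, day.minusDays(1)));
    }
}
```

If that layout holds, comparing these paths with the output of hdfs dfs -ls would show whether the directory the historical is polling actually exists.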
I would really appreciate any insight into how Maha Lookups accesses the files (the RocksDB instance files in HDFS). If you can point me to the code, that would be great. If you can describe how you set up HDFS, that would also be very helpful.
Regards,
Jason
Jason (Jianbin) Chen Senior Data Developer p: +1 2066608351 | e: jason...@shopify.com a: 234 Laurier Ave W Ottawa, ON K1N 5X8
On Dec 3, 2020, 1:46 PM -0500, pb <prana...@gmail.com>, wrote:
I saw from the exception stack in the coordinator logs that the JSON is missing the "namespace": "advertiser" property; in your case it might be 'customer'.
{
"type": "maharocksdb",
"namespace": "advertiser",
"rocksDbInstanceHDFSPath": "/data/druid/lookups/snapshots/advertiser",
"lookupAuditingHDFSPath": "/data/druid/lookups/audits/advertiser",
"pollPeriod": "PT30S",
"cacheEnabled": true,
"lookupName": "advertiser_lookup"
}
{class=org.apache.druid.server.lookup.cache.LookupCoordinatorManager, exceptionType=class org.apache.druid.java.util.common.IOE, exceptionMessage=Bad update request to [http://localhost:8083/druid/listen/v1/lookups/updates] : [400] : [Bad Request] Response: [:)
��error�Missing required creator property 'namespace' (index 0)
at [Source: (org.eclipse.jetty.server.HttpInputOverHTTP); line: -1, column: 252] (through reference chain: org.apache.druid.query.lookup.LookupsState["toLoad"]->java.util.LinkedHashMap["customer_lookup"]->org.apache.druid.query.lookup.LookupExtractorFactoryContainer["lookupExtractorFactory"]->com.yahoo.maha.maha_druid_lookups.query.lookup.MahaLookupExtractorFactory["extractionNamespace"]->com.yahoo.maha.maha_druid_lookups.query.lookup.namespace.RocksDBExtractionNamespace["namespace"])��]}
org.apache.druid.java.util.common.IOE: Bad update request to [http://localhost:8083/druid/listen/v1/lookups/updates] : [400] : [Bad Request] Response: [:)
��error�Missing required creator property 'namespace' (index 0)
at [Source: (org.eclipse.jetty.server.HttpInputOverHTTP); line: -1, column: 252] (through reference chain: org.apache.druid.query.lookup.LookupsState["toLoad"]->java.util.LinkedHashMap["customer_lookup"]->org.apache.druid.query.lookup.LookupExtractorFactoryContainer["lookupExtractorFactory"]->com.yahoo.maha.maha_druid_lookups.query.lookup.MahaLookupExtractorFactory["extractionNamespace"]->com.yahoo.maha.maha_druid_lookups.query.lookup.namespace.RocksDBExtractionNamespace["namespace"])��]
at org.apache.druid.server.lookup.cache.LookupCoordinatorManager$LookupsCommunicator.updateNode(LookupCoordinatorManager.java:834) ~[druid-server-0.20.0.jar:0.20.0]
at org.apache.druid.server.lookup.cache.LookupCoordinatorManager.doLookupManagementOnNode(LookupCoordinatorManager.java:663) ~[druid-server-0.20.0.jar:0.20.0]
at org.apache.druid.server.lookup.cache.LookupCoordinatorManager.lambda$lookupManagementLoop$2(LookupCoordinatorManager.java:590) ~[druid-server-0.20.0.jar:0.20.0]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_272]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_272]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_272]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_272]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [?:1.8.0_272]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_272]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_272]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_272]
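The "Missing required creator property 'namespace'" failure above could be caught before posting the config to the coordinator. The following is a rough Java sketch of such a pre-flight check, assuming the extractionNamespace block has already been parsed into a Map. The class name, method name, and key list are hypothetical: the keys are copied from the sample config above, and only namespace is confirmed as required by the error message.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical pre-flight check for an extractionNamespace config held as a Map.
final class NamespaceConfigCheck {
    // Keys taken from the sample config; only "namespace" is confirmed as
    // required by the coordinator error, the rest are an assumption.
    static final List<String> EXPECTED_KEYS = Arrays.asList(
            "type", "namespace", "rocksDbInstanceHDFSPath",
            "lookupAuditingHDFSPath", "pollPeriod", "lookupName");

    // Returns the expected keys that are absent or null in the config.
    static List<String> missingKeys(Map<String, Object> extractionNamespace) {
        List<String> missing = new ArrayList<>();
        for (String key : EXPECTED_KEYS) {
            if (extractionNamespace.get(key) == null) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // A config like the failing customer_lookup one, without "namespace".
        Map<String, Object> config = new HashMap<>();
        config.put("type", "maharocksdb");
        config.put("rocksDbInstanceHDFSPath", "/");
        config.put("lookupAuditingHDFSPath", "/rocksdb_audit");
        config.put("pollPeriod", "PT30S");
        config.put("lookupName", "customer_lookup");
        System.out.println("Missing keys: " + missingKeys(config));
    }
}
```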
On Thu, Dec 3, 2020 at 9:45 AM Jason Chen <jason...@shopify.com> wrote:
Hey, sure. I attached the logs. Let me know if you need more information for diagnosis.
Thank you very much
Regards,
Jason
On Dec 3, 2020, 12:33 PM -0500, pb <prana...@gmail.com>, wrote:
Can you send me the full exception stack?
On Thu, Dec 3, 2020 at 7:48 AM Jason Chen <jason...@shopify.com> wrote:
Hey,
Thank you very much for your reply.
I generated the Customers Proto classes with the following command:
```
protoc --proto_path=/Users/jchome/Downloads --java_out=src/main/java/com/shopify/data/rocksdb customers.proto
```
These are my runtime properties.
I still got the following errors:
Unknown exception / org.apache.calcite.runtime.CalciteContextException: From line 1, column 22 to line 1, column 43: Object 'customer_lookup' not found within 'lookup' / org.apache.calcite.tools.ValidationException
Could this error be caused by a wrong RocksDB path in HDFS? I uploaded the RocksDB instance to HDFS with the command “hdfs dfs -put /tmp/rocksdb_directory /”.
Regards,
Jason
On Dec 2, 2020, 4:53 PM -0500, pb , wrote:
Hello Jason,
Thanks for reaching out to us, and welcome to the Maha community. I think you missed the step of providing the schema factory class in the Druid extension config:
druid.lookup.maha.namespace.schemaFactory=com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.schema.protobuf.NoopProtobufSchemaFactory
Druid lookups need to know the customer_lookup -> Protobuf mapping. Also put the jar containing this class in the same location as the maha-druid-lookups jar.
We should update the onboarding documentation for maharocksdb; I will do that. Let me know if that helps.
thank you
On Wed, Dec 2, 2020 at 1:18 PM Jason Chen <jason...@shopify.com> wrote:
Hey Pranav,
I am an engineer from Shopify in Canada. I am working on a project that does dimension federation using Druid Lookups. We are looking at solutions for storing dimension data in third-party storage, e.g. RocksDB. I tried Maha Druid Lookups following the steps in (https://github.com/yahoo/maha/tree/master/druid-lookups#registering-druid-lookups).
However, the Lookup object cannot be found. The following is the error:
Unknown exception / org.apache.calcite.runtime.CalciteContextException: From line 1, column 22 to line 1, column 43: Object 'customer_lookup' not found within 'lookup' / org.apache.calcite.tools.ValidationException
So what I did was:
- Installed a Druid cluster on my Mac (OS X), built Maha, and copied the Maha jar to the extension folder
- Installed RocksDB using Homebrew, and PUT a key-value pair: (k1, v1)
- Installed Hadoop through Homebrew, and put the RocksDB instance into HDFS using the command: “hdfs dfs -put /tmp/rocksdb_directory /”
- Ran “rm -rf var/* && rm -rf log && ./bin/start-micro-quickstart” to start Druid
I attached my “@maharocksdb_lookup_config_for_historical.json” file to this email. Could you give me some insight into the error?
Thank you
Regards,
Jason
--
Cell Phone No: 408-507-4773.
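import com.google.common.collect.ImmutableMap;
import com.google.protobuf.GeneratedMessageV3;

// Maps the key "ad_lookup" to the default instance of its generated Protobuf message.
public class TestProtobufSchemaFactory extends DefaultProtobufSchemaFactory {
    public TestProtobufSchemaFactory() {
        super(ImmutableMap.<String, GeneratedMessageV3>of("ad_lookup", AdProtos.Ad.getDefaultInstance()));
    }
}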
String id = String.valueOf(i);
Message message = CustomerProtos.Customer.newBuilder()
        .setId(id)
        .setAddress("address_" + id)
        .setName("name_" + id)
        .setLastUpdated("" + System.currentTimeMillis())
        .setStatus(status[i % 2])
        .build();
rocksDB.put(id.getBytes(), message.toByteArray());
druid.lookup.lookupTier
MahaRegisteredLookupExtractionFn