Re: Have some questions regarding Maha Druid Lookups


pb

Dec 4, 2020, 2:20:44 AM
to Jason Chen, maha-users
About the RocksDB manager code: it expects the success marker and a load_time partition in HDFS, which are used for lookup versioning. It picks up the latest available zip that has a success marker:
PATH/load_time=YYYYMMDD/rocksdb.zip
PATH/load_time=YYYYMMDD/_SUCCESS
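For illustration, here is a minimal sketch of publishing a snapshot in that layout via the Hadoop FileSystem API (class name, paths, and the load_time value are all illustrative, not code from the Maha repo):

```
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class PublishRocksDBSnapshot {
    public static void main(String[] args) throws Exception {
        // Partition for this snapshot (illustrative load_time value).
        Path partition = new Path("/rocksdb_directory/load_time=202012040000");
        // Reads fs.defaultFS from core-site.xml on the classpath.
        FileSystem fs = FileSystem.get(new Configuration());
        fs.mkdirs(partition);
        // Upload the zipped RocksDB instance first...
        fs.copyFromLocalFile(new Path("/tmp/rocksdb.zip"), new Path(partition, "rocksdb.zip"));
        // ...then write the marker, since only snapshots with _SUCCESS are picked up.
        fs.create(new Path(partition, "_SUCCESS")).close();
        fs.close();
    }
}
```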

We do not have proper documentation for the RocksDB extraction lookup. I am planning to document it and also create a YouTube video on how to get started with a fresh Druid install (maybe over the weekend).


I see that customer_lookup is instantiated on historicals. Can you try querying it via the REST API on a historical? You can also query by submitting a Druid JSON query with an extraction lookup. In the RocksDB case, the namespace class should be https://github.com/yahoo/maha/blob/master/druid-lookups/src/main/java/com/yahoo/maha/maha_druid_lookups/query/lookup/namespace/RocksDBExtractionNamespace.java
 



2020-12-03T19:22:44,457 WARN [qtp372245-71] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.schema.flatbuffer.FlatBufferSchemaFactoryProvider - Implementation of FlatBufferSchemaFactory class name is black in the MahaNamespaceExtractionConfig, considering default implementation
2020-12-03T19:22:44,465 WARN [qtp372245-71] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.schema.flatbuffer.FlatBufferSchemaFactoryProvider - Implementation of FlatBufferSchemaFactory class name is black in the MahaNamespaceExtractionConfig, considering default implementation
2020-12-03T19:22:44,470 INFO [LookupExtractorFactoryContainerProvider-MainThread] com.yahoo.maha.maha_druid_lookups.query.lookup.MahaLookupExtractorFactory -  Received request [RocksDBExtractionNamespace{rocksDbInstanceHDFSPath='/', lookupAuditingHDFSPath='/rocksdb_audit', namespace='customer', pollPeriod=PT30S, kafkaTopic='null', cacheEnabled=false, lookupAuditingEnabled=false, lookupName='customer_lookup', tsColumn='null', missingLookupConfig=null, lastUpdatedTime=-1, cacheActionRunner=CacheActionRunner{}, overrideLookupServiceHosts=[], randomLocalPathSuffixEnabled=false}]
2020-12-03T19:22:44,470 INFO [LookupExtractorFactoryContainerProvider-MainThread] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.cache.MahaNamespaceExtractionCacheManager - [customer_lookup] is new
2020-12-03T19:22:44,470 INFO [LookupExtractorFactoryContainerProvider-MainThread] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.cache.MahaNamespaceExtractionCacheManager - Trying to update namespace [customer_lookup]
2020-12-03T19:22:44,471 INFO [LookupExtractorFactoryContainerProvider-MainThread] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.cache.MahaNamespaceExtractionCacheManager - Passed through namespace: RocksDBExtractionNamespace{rocksDbInstanceHDFSPath='/', lookupAuditingHDFSPath='/rocksdb_audit', namespace='customer', pollPeriod=PT30S, kafkaTopic='null', cacheEnabled=false, lookupAuditingEnabled=false, lookupName='customer_lookup', tsColumn='null', missingLookupConfig=null, lastUpdatedTime=-1, cacheActionRunner=CacheActionRunner{}, overrideLookupServiceHosts=[], randomLocalPathSuffixEnabled=false}
with concrete className: com.yahoo.maha.maha_druid_lookups.query.lookup.namespace.RocksDBExtractionNamespace
2020-12-03T19:22:44,486 INFO [MahaNamespaceExtractionCacheManager-0] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.cache.MahaNamespaceExtractionCacheManager - Namespace [customer_lookup] successfully updated. preVersion [null], newVersion [0]

On Thu, Dec 3, 2020 at 8:17 PM Jason Chen <jason...@shopify.com> wrote:
I tried to create the advertiser lookup with the configuration you provided in the previous email, and found the following errors in the historical.log file:


2020-12-04T04:09:12,894 INFO [MahaNamespaceExtractionCacheManager-27] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.RocksDBManager - successMarkerPath [/data/druid/lookups/snapshots/advertiser/load_time=202012030000/_SUCCESS], lastUpdate [0]
2020-12-04T04:09:12,895 ERROR [MahaNamespaceExtractionCacheManager-27] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.RocksDBManager - RocksDB instance not present for namespace [advertiser] loadTime [202012030000], will check for previous loadTime
2020-12-04T04:09:12,895 ERROR [MahaNamespaceExtractionCacheManager-27] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.RocksDBManager - RocksDB instance not present for previous loadTime [202012020000] too for namespace [advertiser]

The JSON configuration file:


  "historicalLookupTier": {
    "advertiser_lookup": {
      "version": "v0",
      "lookupExtractorFactory": {
        "type": "cachedNamespace",
        "extractionNamespace": {
          "type": "maharocksdb",
          "namespace": "advertiser",
          "rocksDbInstanceHDFSPath": "/data/druid/lookups/snapshots/advertiser",
          "lookupAuditingHDFSPath": "/data/druid/lookups/audits/advertiser",
          "pollPeriod": "PT30S",
          "cacheEnabled": true,
          "lookupName": "advertiser_lookup"
        }
      }
    }
  }
}

The command to create the Lookup: curl -XPOST -H'Content-Type: application/json' -d '@maharocksdb_config_advertiser.json' http://localhost:8081/druid/coordinator/v1/lookups/config

List the files in HDFS:

╭─jchome@jcmbp-2 /usr/local/Cellar/hadoop/hdfs
╰─$ hdfs dfs -ls /data/druid/lookups/audits/
2020-12-03 23:14:00,588 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 1 items
drwxr-xr-x   - jchome supergroup          0 2020-12-03 23:07 /data/druid/lookups/audits/advertiser
╭─jchome@jcmbp-2 /usr/local/Cellar/hadoop/hdfs
╰─$ hdfs dfs -ls /data/druid/lookups/snapshots
2020-12-03 23:14:20,947 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 1 items
drwxr-xr-x   - jchome supergroup          0 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser
╭─jchome@jcmbp-2 /usr/local/Cellar/hadoop/hdfs
╰─$ hdfs dfs -ls /data/druid/lookups/snapshots/advertiser
2020-12-03 23:14:28,646 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 23 items
-rw-r--r--   1 jchome supergroup        781 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/000004.sst
-rw-r--r--   1 jchome supergroup        795 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/000007.sst
-rw-r--r--   1 jchome supergroup         26 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/000009.log
-rw-r--r--   1 jchome supergroup         16 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/CURRENT
-rw-r--r--   1 jchome supergroup         33 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/IDENTITY
-rw-r--r--   1 jchome supergroup          0 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/LOCK
-rw-r--r--   1 jchome supergroup      17523 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/LOG
-rw-r--r--   1 jchome supergroup       8093 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/LOG.old.1606922775407795
-rw-r--r--   1 jchome supergroup      21529 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/LOG.old.1606922814108690
-rw-r--r--   1 jchome supergroup      17499 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/LOG.old.1606922818488882
-rw-r--r--   1 jchome supergroup      17499 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/LOG.old.1606922929043833
-rw-r--r--   1 jchome supergroup      17499 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/LOG.old.1606931552747952
-rw-r--r--   1 jchome supergroup      17499 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/LOG.old.1606936265590609
-rw-r--r--   1 jchome supergroup      17499 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/LOG.old.1606936718495964
-rw-r--r--   1 jchome supergroup      17499 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/LOG.old.1607038910772162
-rw-r--r--   1 jchome supergroup      17499 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/LOG.old.1607039454741716
-rw-r--r--   1 jchome supergroup      17499 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/LOG.old.1607039510515607
-rw-r--r--   1 jchome supergroup      24061 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/LOG.old.1607039520978662
-rw-r--r--   1 jchome supergroup      17511 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/LOG.old.1607039544330285
-rw-r--r--   1 jchome supergroup      24073 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/LOG.old.1607039548828711
-rw-r--r--   1 jchome supergroup        161 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/MANIFEST-000008
-rw-r--r--   1 jchome supergroup       5320 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/OPTIONS-000008
-rw-r--r--   1 jchome supergroup       5320 2020-12-03 23:06 /data/druid/lookups/snapshots/advertiser/OPTIONS-000011


I would really appreciate it if you could provide some insight into how Maha lookups access the files (the RocksDB instance files in HDFS). If you can point me to the code, that would be great. If you can describe how you set up HDFS, that would also be very helpful.


Regards,
Jason








Jason (Jianbin) Chen
Senior Data Developer
p: +1 2066608351 | e: jason...@shopify.com
a: 234 Laurier Ave W Ottawa, ON K1N 5X8
On Dec 3, 2020, 1:46 PM -0500, pb <prana...@gmail.com>, wrote:
I saw from the exception stack in the coordinator logs that the JSON is missing the "namespace" property ("advertiser" below; in your case it might be "customer").

{
  "type": "maharocksdb",
  "namespace": "advertiser",
  "rocksDbInstanceHDFSPath": "/data/druid/lookups/snapshots/advertiser",
  "lookupAuditingHDFSPath": "/data/druid/lookups/audits/advertiser",
  "pollPeriod": "PT30S",
  "cacheEnabled": true,
  "lookupName": "advertiser_lookup"
}

{class=org.apache.druid.server.lookup.cache.LookupCoordinatorManager, exceptionType=class org.apache.druid.java.util.common.IOE, exceptionMessage=Bad update request to [http://localhost:8083/druid/listen/v1/lookups/updates] : [400] : [Bad Request]  Response: [:)
��error�Missing required creator property 'namespace' (index 0)
 at [Source: (org.eclipse.jetty.server.HttpInputOverHTTP); line: -1, column: 252] (through reference chain: org.apache.druid.query.lookup.LookupsState["toLoad"]->java.util.LinkedHashMap["customer_lookup"]->org.apache.druid.query.lookup.LookupExtractorFactoryContainer["lookupExtractorFactory"]->com.yahoo.maha.maha_druid_lookups.query.lookup.MahaLookupExtractorFactory["extractionNamespace"]->com.yahoo.maha.maha_druid_lookups.query.lookup.namespace.RocksDBExtractionNamespace["namespace"])��]}
org.apache.druid.java.util.common.IOE: Bad update request to [http://localhost:8083/druid/listen/v1/lookups/updates] : [400] : [Bad Request]  Response: [:)
��error�Missing required creator property 'namespace' (index 0)
 at [Source: (org.eclipse.jetty.server.HttpInputOverHTTP); line: -1, column: 252] (through reference chain: org.apache.druid.query.lookup.LookupsState["toLoad"]->java.util.LinkedHashMap["customer_lookup"]->org.apache.druid.query.lookup.LookupExtractorFactoryContainer["lookupExtractorFactory"]->com.yahoo.maha.maha_druid_lookups.query.lookup.MahaLookupExtractorFactory["extractionNamespace"]->com.yahoo.maha.maha_druid_lookups.query.lookup.namespace.RocksDBExtractionNamespace["namespace"])��]
at org.apache.druid.server.lookup.cache.LookupCoordinatorManager$LookupsCommunicator.updateNode(LookupCoordinatorManager.java:834) ~[druid-server-0.20.0.jar:0.20.0]
at org.apache.druid.server.lookup.cache.LookupCoordinatorManager.doLookupManagementOnNode(LookupCoordinatorManager.java:663) ~[druid-server-0.20.0.jar:0.20.0]
at org.apache.druid.server.lookup.cache.LookupCoordinatorManager.lambda$lookupManagementLoop$2(LookupCoordinatorManager.java:590) ~[druid-server-0.20.0.jar:0.20.0]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_272]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_272]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_272]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_272]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [?:1.8.0_272]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_272]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_272]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_272]

On Thu, Dec 3, 2020 at 9:45 AM Jason Chen <jason...@shopify.com> wrote:
Hey, sure. I attached the logs. Let me know if you need more information for diagnosis.

Thank you very much

Regards,
Jason


Jason (Jianbin) Chen
Senior Data Developer
p: +1 2066608351 | e: jason...@shopify.com
a: 234 Laurier Ave W Ottawa, ON K1N 5X8
On Dec 3, 2020, 12:33 PM -0500, pb <prana...@gmail.com>, wrote:
Can you send me the full exception stack?

On Thu, Dec 3, 2020 at 7:48 AM Jason Chen <jason...@shopify.com> wrote:
Hey,

Thank you very much for your reply.

I generated the Customers proto with the following command:

```
protoc --proto_path=/Users/jchome/Downloads --java_out=src/main/java/com/shopify/data/rocksdb customers.proto
``` 

These are my runtime properties.

I still got the following errors:

Unknown exception / org.apache.calcite.runtime.CalciteContextException: From line 1, column 22 to line 1, column 43: Object 'customer_lookup' not found within 'lookup' / org.apache.calcite.tools.ValidationException


Could this error be caused by a wrong RocksDB path in HDFS? I uploaded the RocksDB instance to HDFS with the command "hdfs dfs -put /tmp/rocksdb_directory /".

Regards,
Jason


Jason (Jianbin) Chen
Senior Data Developer
p: +1 2066608351 | e: jason...@shopify.com
a: 234 Laurier Ave W Ottawa, ON K1N 5X8
On Dec 2, 2020, 4:53 PM -0500, pb , wrote:
Hello Jason, 
  Thanks for reaching out to us; welcome to the Maha community. I think you missed the step of providing the schema factory class in the Druid extension config:
druid.lookup.maha.namespace.schemaFactory=com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.schema.protobuf.NoopProtobufSchemaFactory

Druid lookups need to know the customer_lookup -> protobuf mapping. Also, put the jar containing this class in the same location as the maha-druid-lookups jar.

We should update the onboarding documentation for maharocksdb; I will do that. Let me know if that helps.

thank you


On Wed, Dec 2, 2020 at 1:18 PM Jason Chen <jason...@shopify.com> wrote:
Hey Pranav,

I am an engineer from Shopify in Canada. I am working on a project that does dimension federation using Druid Lookups. We are looking at a solution that stores dimension data in third-party storage, e.g. RocksDB. I tried the Maha Druid Lookups following the steps in (https://github.com/yahoo/maha/tree/master/druid-lookups#registering-druid-lookups).

However, the Lookup object cannot be found. The following is the error:

Unknown exception / org.apache.calcite.runtime.CalciteContextException: From line 1, column 22 to line 1, column 43: Object 'customer_lookup' not found within 'lookup' / org.apache.calcite.tools.ValidationException

So what I did was:
  1. Install a Druid cluster on my Mac OS X, build Maha, and copy the Maha jar to the extension folder
  2. Install RocksDB using Homebrew, and put a key-value pair: (k1, v1)
  3. Install Hadoop through Homebrew, and put the RocksDB instance into HDFS using the command: "hdfs dfs -put /tmp/rocksdb_directory /"
  4. Run “rm -rf var/* && rm -rf log && ./bin/start-micro-quickstart” to start Druid

I attached my "maharocksdb_lookup_config_for_historical.json" file to this email. Could you give me some insights about the error?

Thank you

Regards,
Jason


Jason (Jianbin) Chen
Senior Data Developer
p: +1 2066608351 | e: jason...@shopify.com
a: 234 Laurier Ave W Ottawa, ON K1N 5X8


--
Pranav Bhole 
Software Development Engineer at Yahoo!
Email: prana...@gmail.com 
Cell Phone No: 408-507-4773.

Jason Chen

Dec 4, 2020, 9:50:00 AM
to maha-users, pb
Hey,

Thanks for your reply.

I created a time partition in the HDFS directory:
╰─$ hdfs dfs -ls /rocksdb_directory/20201204
2020-12-04 09:42:55,221 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 2 items
-rw-r--r--   1 jchome supergroup          0 2020-12-04 09:22 /rocksdb_directory/20201204/_SUCCESS
-rw-r--r--   1 jchome supergroup      33873 2020-12-04 09:21 /rocksdb_directory/20201204/rocksdb.zip

And then I saw in broker.log that the namespace was successfully updated:
2020-12-04T14:25:48,668 INFO [qtp372245-86] com.yahoo.maha.maha_druid_lookups.query.lookup.namespace.RocksDBExtractionNamespace - no input from overrideLookupServiceHosts
2020-12-04T14:25:48,672 WARN [qtp372245-86] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.schema.flatbuffer.FlatBufferSchemaFactoryProvider - Implementation of FlatBufferSchemaFactory class name is black in the MahaNamespaceExtractionConfig, considering default implementation
2020-12-04T14:25:48,682 WARN [qtp372245-86] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.schema.flatbuffer.FlatBufferSchemaFactoryProvider - Implementation of FlatBufferSchemaFactory class name is black in the MahaNamespaceExtractionConfig, considering default implementation
2020-12-04T14:25:48,687 INFO [LookupExtractorFactoryContainerProvider-MainThread] com.yahoo.maha.maha_druid_lookups.query.lookup.MahaLookupExtractorFactory -  Received request [RocksDBExtractionNamespace{rocksDbInstanceHDFSPath='/rocksdb_directory', lookupAuditingHDFSPath='/rocksdb_audit', namespace='customer', pollPeriod=PT30S, kafkaTopic='null', cacheEnabled=false, lookupAuditingEnabled=false, lookupName='customer_lookup', tsColumn='null', missingLookupConfig=null, lastUpdatedTime=-1, cacheActionRunner=CacheActionRunner{}, overrideLookupServiceHosts=[], randomLocalPathSuffixEnabled=false}]
2020-12-04T14:25:48,687 INFO [LookupExtractorFactoryContainerProvider-MainThread] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.cache.MahaNamespaceExtractionCacheManager - [customer_lookup] is new
2020-12-04T14:25:48,687 INFO [LookupExtractorFactoryContainerProvider-MainThread] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.cache.MahaNamespaceExtractionCacheManager - Trying to update namespace [customer_lookup]
2020-12-04T14:25:48,688 INFO [LookupExtractorFactoryContainerProvider-MainThread] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.cache.MahaNamespaceExtractionCacheManager - Passed through namespace: RocksDBExtractionNamespace{rocksDbInstanceHDFSPath='/rocksdb_directory', lookupAuditingHDFSPath='/rocksdb_audit', namespace='customer', pollPeriod=PT30S, kafkaTopic='null', cacheEnabled=false, lookupAuditingEnabled=false, lookupName='customer_lookup', tsColumn='null', missingLookupConfig=null, lastUpdatedTime=-1, cacheActionRunner=CacheActionRunner{}, overrideLookupServiceHosts=[], randomLocalPathSuffixEnabled=false}
with concrete className: com.yahoo.maha.maha_druid_lookups.query.lookup.namespace.RocksDBExtractionNamespace
2020-12-04T14:25:48,707 INFO [MahaNamespaceExtractionCacheManager-0] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.cache.MahaNamespaceExtractionCacheManager - Namespace [customer_lookup] successfully updated. preVersion [null], newVersion [0]

But when I try to query the lookup table:
╭─jchome@jcmbp-2 ~/src/github.com/Shopify/druid-poc/druid-bin/apache-druid-0.20.0 ‹druid-local*›
╰─$ curl http://localhost:8081/druid/coordinator/v1/lookups/config//historicalLookupTier/customer_lookup

╭─jchome@jcmbp-2 ~/src/github.com/Shopify/druid-poc/druid-bin/apache-druid-0.20.0 ‹druid-local*›
╰─$ curl "http://localhost:8083/druid/v1/namespaces/customer_lookup?namespaceclass=com.yahoo.maha.maha_druid_lookups.query.lookup.namespace.RocksDBExtractionNamespace"
0%
╭─jchome@jcmbp-2 ~/src/github.com/Shopify/druid-poc/druid-bin/apache-druid-0.20.0 ‹druid-local*›
╰─$ curl "http://localhost:8083/druid/v1/namespaces/customer_lookup?namespaceclass=com.yahoo.maha.maha_druid_lookups.query.lookup.namespace.RocksDBExtractionNamespace&id=k1&valueColumn=value&debug=true"
2%


I got nothing when querying the lookup config, got 0% when querying the HashMap size, and got 2% when querying any key (I tried k1, k2, k3).



I got the following error in historical log:
2020-12-04T14:25:46,697 WARN [sql[9aed8ee1-5151-46d8-ab42-ebd93e1b2475]] org.apache.druid.sql.http.SqlResource - Failed to handle query: SqlQuery{query='SELECT "k", "v" FROM lookup.customer_lookup LIMIT 5000', resultFormat=OBJECT, header=false, context={}, parameters=[]}
org.apache.calcite.tools.ValidationException: org.apache.calcite.runtime.CalciteContextException: From line 1, column 22 to line 1, column 43: Object 'customer_lookup' not found within 'lookup'
 at org.apache.calcite.prepare.PlannerImpl.validate(PlannerImpl.java:217) ~[calcite-core-1.21.0.jar:1.21.0]
 at org.apache.druid.sql.calcite.planner.DruidPlanner.plan(DruidPlanner.java:133) ~[druid-sql-0.20.0.jar:0.20.0]
 at org.apache.druid.sql.SqlLifecycle.plan(SqlLifecycle.java:168) ~[druid-sql-0.20.0.jar:0.20.0]
 at org.apache.druid.sql.SqlLifecycle.plan(SqlLifecycle.java:179) ~[druid-sql-0.20.0.jar:0.20.0]
 at org.apache.druid.sql.SqlLifecycle.planAndAuthorize(SqlLifecycle.java:240) ~[druid-sql-0.20.0.jar:0.20.0]
 at org.apache.druid.sql.http.SqlResource.doPost(SqlResource.java:95) ~[druid-sql-0.20.0.jar:0.20.0]
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_272]
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_272]
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_272]
 at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_272]
 at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60) ~[jersey-server-1.19.3.jar:1.19.3]
 at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205) ~[jersey-server-1.19.3.jar:1.19.3]
 at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75) ~[jersey-server-1.19.3.jar:1.19.3]
 at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:302) ~[jersey-server-1.19.3.jar:1.19.3]
 at com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108) ~[jersey-server-1.19.3.jar:1.19.3]
 at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147) ~[jersey-server-1.19.3.jar:1.19.3]
 at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84) ~[jersey-server-1.19.3.jar:1.19.3]
 at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1542) ~[jersey-server-1.19.3.jar:1.19.3]
 at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1473) ~[jersey-server-1.19.3.jar:1.19.3]
 at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1419) ~[jersey-server-1.19.3.jar:1.19.3]
 at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1409) ~[jersey-server-1.19.3.jar:1.19.3]
 at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:409) ~[jersey-servlet-1.19.3.jar:1.19.3]
 at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:558) ~[jersey-servlet-1.19.3.jar:1.19.3]
 at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:733) ~[jersey-servlet-1.19.3.jar:1.19.3]
 at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) ~[javax.servlet-api-3.1.0.jar:3.1.0]
 at com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:286) ~[guice-servlet-4.1.0.jar:?]
 at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:276) ~[guice-servlet-4.1.0.jar:?]
 at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:181) ~[guice-servlet-4.1.0.jar:?]
 at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91) ~[guice-servlet-4.1.0.jar:?]
 at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:85) ~[guice-servlet-4.1.0.jar:?]
 at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:120) ~[guice-servlet-4.1.0.jar:?]
 at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:135) ~[guice-servlet-4.1.0.jar:?]
 at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1618) ~[jetty-servlet-9.4.30.v20200611.jar:9.4.30.v20200611]
 at org.apache.druid.server.security.PreResponseAuthorizationCheckFilter.doFilter(PreResponseAuthorizationCheckFilter.java:82) ~[druid-server-0.20.0.jar:0.20.0]
 at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1618) ~[jetty-servlet-9.4.30.v20200611.jar:9.4.30.v20200611]
 at org.apache.druid.server.security.AllowHttpMethodsResourceFilter.doFilter(AllowHttpMethodsResourceFilter.java:78) ~[druid-server-0.20.0.jar:0.20.0]
 at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1618) ~[jetty-servlet-9.4.30.v20200611.jar:9.4.30.v20200611]
 at org.apache.druid.server.security.AllowOptionsResourceFilter.doFilter(AllowOptionsResourceFilter.java:75) ~[druid-server-0.20.0.jar:0.20.0]
 at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1618) ~[jetty-servlet-9.4.30.v20200611.jar:9.4.30.v20200611]
 at org.apache.druid.server.security.AllowAllAuthenticator$1.doFilter(AllowAllAuthenticator.java:84) ~[druid-server-0.20.0.jar:0.20.0]
 at org.apache.druid.server.security.AuthenticationWrappingFilter.doFilter(AuthenticationWrappingFilter.java:59) ~[druid-server-0.20.0.jar:0.20.0]
 at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1618) ~[jetty-servlet-9.4.30.v20200611.jar:9.4.30.v20200611]
 at org.apache.druid.server.security.SecuritySanityCheckFilter.doFilter(SecuritySanityCheckFilter.java:86) ~[druid-server-0.20.0.jar:0.20.0]
 at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1618) ~[jetty-servlet-9.4.30.v20200611.jar:9.4.30.v20200611]
 at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:549) ~[jetty-servlet-9.4.30.v20200611.jar:9.4.30.v20200611]
 at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233) ~[jetty-server-9.4.30.v20200611.jar:9.4.30.v20200611]
 at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1610) ~[jetty-server-9.4.30.v20200611.jar:9.4.30.v20200611]
 at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233) ~[jetty-server-9.4.30.v20200611.jar:9.4.30.v20200611]
 at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1369) ~[jetty-server-9.4.30.v20200611.jar:9.4.30.v20200611]
 at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188) ~[jetty-server-9.4.30.v20200611.jar:9.4.30.v20200611]
 at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:489) ~[jetty-servlet-9.4.30.v20200611.jar:9.4.30.v20200611]
 at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1580) ~[jetty-server-9.4.30.v20200611.jar:9.4.30.v20200611]


Regards,
Jason


Jason (Jianbin) Chen
Senior Data Developer
p: +1 2066608351 | e: jason...@shopify.com
a: 234 Laurier Ave W Ottawa, ON K1N 5X8

pb

Dec 4, 2020, 1:29:14 PM
to Jason Chen, maha-users
One more thing that I missed in the last email: 'load_time=yyyyMMdd0000' (yyyyMMdd0000 is local time) should be part of the HDFS path, so that daily snapshots are known and versions can be tracked.

The path should be /rocksdb_directory/load_time=202012040000/
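A quick sketch of computing that partition value in Java (assuming the JVM default time zone is the intended local time):

```
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

public class LoadTimePartition {
    public static void main(String[] args) {
        // Local date in yyyyMMdd format plus a 0000 suffix, e.g. 202012040000.
        String loadTime = LocalDate.now().format(DateTimeFormatter.ofPattern("yyyyMMdd")) + "0000";
        System.out.println("/rocksdb_directory/load_time=" + loadTime);
    }
}
```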

Jason Chen

Dec 4, 2020, 2:32:33 PM
to pb, maha-users
HDFS Directory:
╭─jchome@jcmbp-2 ~/src/github.com/Shopify/druid-poc/druid-bin/apache-druid-0.20.0 ‹druid-local*›
╰─$ hdfs dfs -ls /rocksdb_directory/load_time=202012040000                                                                                                                                                                              
-rw-r--r--   1 jchome supergroup          0 2020-12-04 13:42 /rocksdb_directory/load_time=202012040000/_SUCCESS
-rw-r--r--   1 jchome supergroup      33873 2020-12-04 13:41 /rocksdb_directory/load_time=202012040000/rocksdb.zip


I attached the JSON configuration, and the command to initialize the Lookups:
╭─jchome@jcmbp-2 ~/src/github.com/Shopify/druid-poc/druid-bin/apache-druid-0.20.0 ‹druid-local*›
╰─$ curl -XPOST -H'Content-Type: application/json' -d '{}' http://localhost:8081/druid/coordinator/v1/lookups/config
{}%
╭─jchome@jcmbp-2 ~/src/github.com/Shopify/druid-poc/druid-bin/apache-druid-0.20.0 ‹druid-local*›
╰─$ curl -XPOST -H'Content-Type: application/json' -d '@quickstart/tutorial/maharocksdb_lookup_config_for_historical.json' http://localhost:8081/druid/coordinator/v1/lookups/config
{"historicalLookupTier":{"customer_lookup":{"version":"v0","lookupExtractorFactory":{"type":"cachedNamespace","extractionNamespace":{"type":"maharocksdb","namespace":"customer","lookupName":"customer_lookup","rocksDbInstanceHDFSPath":"/rocksdb_directory","lookupAuditingHDFSPath":"/rocksdb_audit","pollPeriod":"PT30S","cacheEnabled":false}}}}}%

No errors in any log file, except the “customer_lookup object not found” error in broker log.

In the UI, I tried to open the lookup, and there is still a Not Found error.

Really appreciate that you are willing to spend time with me to debug the issue. It’s super helpful for me.

Regards,
Jason


Jason (Jianbin) Chen
Senior Data Developer
p: +1 2066608351 | e: jason...@shopify.com
a: 234 Laurier Ave W Ottawa, ON K1N 5X8
maharocksdb_lookup_config_for_historical.json
sv.zip

Jason Chen

Dec 5, 2020, 2:11:36 AM
to pb, maha-users
Hey,

I finally got the RocksDB instance loaded. I missed a few things:
  1. "cacheEnabled": true - this option should be true, otherwise it will not load the RocksDB files from HDFS
  2. In RocksDBManager, I had to set config.set("fs.defaultFS", "hdfs://localhost:8020");, which is my local HDFS URI.
With the above settings, I can successfully load the RocksDB instance from HDFS. The following logs indicate it is successful:

2020-12-05T06:58:22,789 INFO [qtp1777112002-71] com.yahoo.maha.maha_druid_lookups.query.lookup.namespace.RocksDBExtractionNamespace - no input from overrideLookupServiceHosts
2020-12-05T06:58:22,791 WARN [qtp1777112002-71] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.schema.flatbuffer.FlatBufferSchemaFactoryProvider - Implementation of FlatBufferSchemaFactory class name is black in the MahaNamespa
ceExtractionConfig, considering default implementation
2020-12-05T06:58:22,800 WARN [qtp1777112002-71] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.schema.flatbuffer.FlatBufferSchemaFactoryProvider - Implementation of FlatBufferSchemaFactory class name is black in the MahaNamespa
ceExtractionConfig, considering default implementation
2020-12-05T06:58:22,804 INFO [LookupExtractorFactoryContainerProvider-MainThread] com.yahoo.maha.maha_druid_lookups.query.lookup.MahaLookupExtractorFactory -  Received request [RocksDBExtractionNamespace{rocksDbInstanceHDFSPath='/rocksdb_directory', lookupAuditingHDFSPath='/rocksdb_audit', namespace='customer', pollPeriod=PT30S, kafkaTopic='null', cacheEnabled=true, lookupAuditingEnabled=false, lookupName='customer_lookup', tsColumn='null', missingLookupConfig=null, lastUpdatedTime=-1, cacheActionRunner=CacheActionRunner{}, overrideLookupServiceHosts=[], randomLocalPathSuffixEnabled=false}]
2020-12-05T06:58:22,804 INFO [LookupExtractorFactoryContainerProvider-MainThread] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.cache.MahaNamespaceExtractionCacheManager - [customer_lookup] is new
2020-12-05T06:58:22,804 INFO [LookupExtractorFactoryContainerProvider-MainThread] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.cache.MahaNamespaceExtractionCacheManager - Trying to update namespace [customer_lookup]
2020-12-05T06:58:22,805 INFO [LookupExtractorFactoryContainerProvider-MainThread] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.cache.MahaNamespaceExtractionCacheManager - Passed through namespace: RocksDBExtractionNamespace{rocksDbInstanceHDFSPath='/rocksdb_directory', lookupAuditingHDFSPath='/rocksdb_audit', namespace='customer', pollPeriod=PT30S, kafkaTopic='null', cacheEnabled=true, lookupAuditingEnabled=false, lookupName='customer_lookup', tsColumn='null', missingLookupConfig=null, lastUpdatedTime=-1, cacheActionRunner=CacheActionRunner{}, overrideLookupServiceHosts=[], randomLocalPathSuffixEnabled=false}
with concrete className: com.yahoo.maha.maha_druid_lookups.query.lookup.namespace.RocksDBExtractionNamespace
2020-12-05T06:58:22,821 INFO [MahaNamespaceExtractionCacheManager-0] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.RocksDBManager - successMarkerPath [/rocksdb_directory/load_time=202012040000/_SUCCESS], lastUpdate [0]
2020-12-05T06:58:22,901 ERROR [MahaNamespaceExtractionCacheManager-0] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.RocksDBManager - hdfsPath [/rocksdb_directory/load_time=202012040000/rocksdb.zip]
2020-12-05T06:58:22,903 ERROR [MahaNamespaceExtractionCacheManager-0] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.RocksDBManager - localZippedFileNameWithPath [/tmp/customer/202012040000rocksdb_.zip]
2020-12-05T06:58:22,903 INFO [MahaNamespaceExtractionCacheManager-0] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.RocksDBManager - starting new instance for namespace[customer]...
2020-12-05T06:58:22,903 INFO [MahaNamespaceExtractionCacheManager-0] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.RocksDBManager - Downloading RocksDB instance from [/rocksdb_directory/load_time=202012040000/rocksdb.zip] to [/tmp/customer/202012040000rocksdb_.zip]
2020-12-05T06:58:23,017 INFO [MahaNamespaceExtractionCacheManager-0] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.RocksDBManager - Downloaded RocksDB instance from [/rocksdb_directory/load_time=202012040000/rocksdb.zip] to [/tmp/customer/202012040000rocksdb_.zip]
2020-12-05T06:58:23,017 INFO [MahaNamespaceExtractionCacheManager-0] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.RocksDBManager - Unzipping RocksDB instance [/tmp/customer/202012040000rocksdb_.zip]
2020-12-05T06:58:23,037 INFO [MahaNamespaceExtractionCacheManager-0] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.RocksDBManager - Unzipped RocksDB instance [/tmp/customer/202012040000rocksdb_.zip]
2020-12-05T06:58:23,037 INFO [MahaNamespaceExtractionCacheManager-0] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.RocksDBManager - Cleaned up [/tmp/customer/202012040000rocksdb_.zip]
2020-12-05T06:58:23,054 INFO [MahaNamespaceExtractionCacheManager-0] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.RocksDBManager -
** Compaction Stats [default] **
Level    Files   Size     Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  L0      3/0    2.52 KB   0.8      0.0     0.0      0.0       0.0      0.0       0.0   1.0      0.0      1.5      0.00              0.00         1    0.001       0      0
 Sum      3/0    2.52 KB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   1.0      0.0      1.5      0.00              0.00         1    0.001       0      0
 Int      0/0    0.00 KB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   0.0      0.0      0.0      0.00              0.00         0    0.000       0      0

** Compaction Stats [default] **
Priority    Files   Size     Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
User      0/0    0.00 KB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   0.0      0.0      1.5      0.00              0.00         1    0.001       0      0
Uptime(secs): 0.0 total, 0.0 interval
Flush(GB): cumulative 0.000, interval 0.000
AddFile(GB): cumulative 0.000, interval 0.000
AddFile(Total Files): cumulative 0, interval 0
AddFile(L0 Files): cumulative 0, interval 0
AddFile(Keys): cumulative 0, interval 0
Cumulative compaction: 0.00 GB write, 0.16 MB/s write, 0.00 GB read, 0.00 MB/s read, 0.0 seconds
Interval compaction: 0.00 GB write, 0.00 MB/s write, 0.00 GB read, 0.00 MB/s read, 0.0 seconds
Stalls(count): 0 level0_slowdown, 0 level0_slowdown_with_compaction, 0 level0_numfiles, 0 level0_numfiles_with_compaction, 0 stop for pending_compaction_bytes, 0 slowdown for pending_compaction_bytes, 0 memtable_compaction, 0 memtable_slowdown, interval 0 total count

** File Read Latency Histogram By Level [default] **

** DB Stats **
Uptime(secs): 0.0 total, 0.0 interval
Cumulative writes: 0 writes, 0 keys, 0 commit groups, 0.0 writes per commit group, ingest: 0.00 GB, 0.00 MB/s
Cumulative WAL: 0 writes, 0 syncs, 0.00 writes per sync, written: 0.00 GB, 0.00 MB/s
Cumulative stall: 00:00:0.000 H:M:S, 0.0 percent
Interval writes: 0 writes, 0 keys, 0 commit groups, 0.0 writes per commit group, ingest: 0.00 MB, 0.00 MB/s
Interval WAL: 0 writes, 0 syncs, 0.00 writes per sync, written: 0.00 MB, 0.00 MB/s
Interval stall: 00:00:0.000 H:M:S, 0.0 percent

2020-12-05T06:58:23,054 INFO [MahaNamespaceExtractionCacheManager-0] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.cache.MahaNamespaceExtractionCacheManager - Namespace [customer_lookup] successfully updated. preVersion [null], newVersion [202012040000]


However, when I tried to query the lookup, I still got the following error:

2020-12-05T07:03:18,226 WARN [qtp1777112002-75] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.MahaNamespacesCacheResource - Key is not passed hence returning the size of the cache
2020-12-05T07:03:31,933 WARN [qtp1777112002-86] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.MahaNamespacesCacheResource - Key is not passed hence returning the size of the cache
2020-12-05T07:03:53,992 INFO [qtp1777112002-88] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.MahaNamespacesCacheResource - Fetching cache value for key [k1] and valueColumn [value]
2020-12-05T07:03:53,993 ERROR [qtp1777112002-88] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.entity.CacheActionRunner - Caught exception while getting cache value
java.lang.IllegalArgumentException: unknown messageType customer
        at com.google.common.base.Preconditions.checkArgument(Preconditions.java:125) ~[guava-16.0.1.jar:?]
        at com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.schema.protobuf.DefaultProtobufSchemaFactory.getProtobufParser(DefaultProtobufSchemaFactory.java:48) ~[?:?]
        at com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.entity.CacheActionRunner.getCacheValue(CacheActionRunner.java:38) ~[?:?]
        at com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.RocksDBExtractionNamespaceCacheFactory.getCacheValue(RocksDBExtractionNamespaceCacheFactory.java:97) ~[?:?]
        at com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.RocksDBExtractionNamespaceCacheFactory.getCacheValue(RocksDBExtractionNamespaceCacheFactory.java:29) ~[?:?]
        at com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.MahaNamespacesCacheResource.getCacheValue(MahaNamespacesCacheResource.java:97) ~[?:?]

        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_272]
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_272]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_272]
        at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_272]
        at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60) ~[jersey-server-1.19.3.jar:1.19.3]
        at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205) ~[jersey-server-1.19.3.jar:1.19.3]
        at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75) ~[jersey-server-1.19.3.jar:1.19.3]
        at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:302) ~[jersey-server-1.19.3.jar:1.19.3]
The error seems to be related to the CacheActionRunner and the proto I created. I will dig into it tomorrow. If you have any insights, please share them with us. Thank you!


Jason (Jianbin) Chen
Senior Data Developer
p: +1 2066608351 | e: jason...@shopify.com
a: 234 Laurier Ave W Ottawa, ON K1N 5X8

pb

Dec 5, 2020, 2:19:11 AM
to Jason Chen, maha-users
Happy to hear that it is working for you, I will update the documentation.
public class TestProtobufSchemaFactory extends DefaultProtobufSchemaFactory {
    public TestProtobufSchemaFactory() {
        super(ImmutableMap.<String, GeneratedMessageV3>of("ad_lookup", AdProtos.Ad.getDefaultInstance()));
    }
}
The above class has the namespace -> proto mapping. I think you specified the namespace as 'customer', so in your case the messageType is 'customer'; you need to change that in the SchemaFactory.
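Applied to the names in this thread, the mapping keyed by namespace would look something like this (a sketch; it assumes the CustomersProtos class Jason generated):

```
import com.google.common.collect.ImmutableMap;
import com.shopify.data.rocksdb.entity.CustomersProtos;
import com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.schema.protobuf.DefaultProtobufSchemaFactory;

public class CustomerProtobufSchemaFactory extends DefaultProtobufSchemaFactory {
    public CustomerProtobufSchemaFactory() {
        // The key is the extraction "namespace" ("customer"),
        // not the lookupName ("customer_lookup").
        super(ImmutableMap.of("customer", CustomersProtos.customers.getDefaultInstance()));
    }
}
```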

Thanks

Jason Chen

Dec 5, 2020, 2:32:40 AM
to pb, maha-users
This is my CustomerProtobufSchemaFactory:

package com.shopify.data.rocksdb;

import com.google.common.collect.ImmutableMap;
import com.shopify.data.rocksdb.entity.CustomersProtos;
import com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.schema.protobuf.DefaultProtobufSchemaFactory;

public class CustomerProtobufSchemaFactory extends DefaultProtobufSchemaFactory {
    public CustomerProtobufSchemaFactory() {
        super(ImmutableMap.of("customer_lookup", CustomersProtos.customers.getDefaultInstance()));
    }
}

Should the key of the map be "customer" rather than "customer_lookup"?


Jason (Jianbin) Chen
Senior Data Developer
p: +1 2066608351 | e: jason...@shopify.com
a: 234 Laurier Ave W Ottawa, ON K1N 5X8

Jason Chen

Dec 5, 2020, 2:33:29 AM
to pb, maha-users
This is my proto:

package com.shopify.data.rocksdb.entity;

option java_outer_classname = "CustomersProtos";

message customers {
  optional string id = 1;
  optional string value = 2;
}
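With this proto, the value stored in RocksDB must be the serialized message bytes rather than a raw string; a raw value is one way to end up with the InvalidWireTypeException that appears later in this thread. A sketch of the write path, using the names above (the helper itself is hypothetical):

```
import com.shopify.data.rocksdb.entity.CustomersProtos;
import org.rocksdb.RocksDB;

public class WriteCustomerExample {
    // Stores one key/value pair; the value is the protobuf-encoded
    // customers message, not the plain string.
    static void put(RocksDB rocksDB, String key, String value) throws Exception {
        CustomersProtos.customers message = CustomersProtos.customers.newBuilder()
                .setId(key)
                .setValue(value)
                .build();
        rocksDB.put(key.getBytes(), message.toByteArray());
    }
}
```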


Jason (Jianbin) Chen
Senior Data Developer
p: +1 2066608351 | e: jason...@shopify.com
a: 234 Laurier Ave W Ottawa, ON K1N 5X8

Jason Chen

Dec 5, 2020, 2:34:39 AM
to pb, maha-users
Yes. The namespace is “customer” in my JSON configuration file:


"extractionNamespace": {
          "type": "maharocksdb",
          "namespace": "customer",
          "lookupName": "customer_lookup",
          "rocksDbInstanceHDFSPath": "/rocksdb_directory",
          "lookupAuditingHDFSPath": "/rocksdb_audit",
          "pollPeriod": "PT30S",
          "cacheEnabled": true
        }


Jason (Jianbin) Chen
Senior Data Developer
p: +1 2066608351 | e: jason...@shopify.com
a: 234 Laurier Ave W Ottawa, ON K1N 5X8

Jason Chen

Dec 5, 2020, 2:48:57 AM
to pb, maha-users
Using 'customer' as the key of the map works. But I got another exception regarding the protobuf; I will dig into it tomorrow. Thank you for your help.

2020-12-05T07:46:17,040 INFO [qtp1777112002-88] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.MahaNamespacesCacheResource - Fetching cache value for key [k1] and valueColumn [value]
2020-12-05T07:46:17,042 ERROR [qtp1777112002-88] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.entity.CacheActionRunner - Caught exception while getting cache value
com.google.protobuf.InvalidProtocolBufferException$InvalidWireTypeException: Protocol message tag had invalid wire type.
        at com.google.protobuf.InvalidProtocolBufferException.invalidWireType(InvalidProtocolBufferException.java:111) ~[protobuf-java-3.11.0.jar:?]
        at com.google.protobuf.UnknownFieldSet$Builder.mergeFieldFrom(UnknownFieldSet.java:557) ~[protobuf-java-3.11.0.jar:?]
        at com.google.protobuf.GeneratedMessageV3.parseUnknownField(GeneratedMessageV3.java:320) ~[protobuf-java-3.11.0.jar:?]
        at com.shopify.data.rocksdb.entity.CustomersProtos$customers.<init>(CustomersProtos.java:116) ~[?:?]
        at com.shopify.data.rocksdb.entity.CustomersProtos$customers.<init>(CustomersProtos.java:58) ~[?:?]
        at com.shopify.data.rocksdb.entity.CustomersProtos$customers$1.parsePartialFrom(CustomersProtos.java:785) ~[?:?]
        at com.shopify.data.rocksdb.entity.CustomersProtos$customers$1.parsePartialFrom(CustomersProtos.java:779) ~[?:?]
        at com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:158) ~[protobuf-java-3.11.0.jar:?]
        at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:191) ~[protobuf-java-3.11.0.jar:?]
        at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:203) ~[protobuf-java-3.11.0.jar:?]
        at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:208) ~[protobuf-java-3.11.0.jar:?]
        at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:48) ~[protobuf-java-3.11.0.jar:?]
        at com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.entity.CacheActionRunner.getCacheValue(CacheActionRunner.java:43) ~[?:?]

        at com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.RocksDBExtractionNamespaceCacheFactory.getCacheValue(RocksDBExtractionNamespaceCacheFactory.java:97) ~[?:?]
        at com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.RocksDBExtractionNamespaceCacheFactory.getCacheValue(RocksDBExtractionNamespaceCacheFactory.java:29) ~[?:?]
        at com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.MahaNamespacesCacheResource.getCacheValue(MahaNamespacesCacheResource.java:97) ~[?:?]

Jason (Jianbin) Chen
Senior Data Developer
p: +1 2066608351 | e: jason...@shopify.com
a: 234 Laurier Ave W Ottawa, ON K1N 5X8

Jason Chen

Dec 6, 2020, 12:54:58 AM
to pb, maha-users
I still cannot figure out the root cause of InvalidWireTypeException:

2020-12-06T05:24:47,794 ERROR [qtp591352568-77] com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.entity.CacheActionRunner - Caught exception while getting cache value

com.google.protobuf.InvalidProtocolBufferException$InvalidWireTypeException: Protocol message tag had invalid wire type.
        at com.google.protobuf.InvalidProtocolBufferException.invalidWireType(InvalidProtocolBufferException.java:111) ~[protobuf-java-3.11.0.jar:?]
        at com.google.protobuf.UnknownFieldSet$Builder.mergeFieldFrom(UnknownFieldSet.java:557) ~[protobuf-java-3.11.0.jar:?]
        at com.google.protobuf.GeneratedMessageV3.parseUnknownField(GeneratedMessageV3.java:320) ~[protobuf-java-3.11.0.jar:?]
        at com.shopify.data.rocksdb.entity.CustomerProtos$Customer.<init>(CustomerProtos.java:116) ~[?:?]
        at com.shopify.data.rocksdb.entity.CustomerProtos$Customer.<init>(CustomerProtos.java:58) ~[?:?]
        at com.shopify.data.rocksdb.entity.CustomerProtos$Customer$1.parsePartialFrom(CustomerProtos.java:785) ~[?:?]
        at com.shopify.data.rocksdb.entity.CustomerProtos$Customer$1.parsePartialFrom(CustomerProtos.java:779) ~[?:?]

        at com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:158) ~[protobuf-java-3.11.0.jar:?]
        at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:191) ~[protobuf-java-3.11.0.jar:?]
        at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:203) ~[protobuf-java-3.11.0.jar:?]
        at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:208) ~[protobuf-java-3.11.0.jar:?]
        at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:48) ~[protobuf-java-3.11.0.jar:?]
        at com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.entity.CacheActionRunner.getCacheValue(CacheActionRunner.java:43) ~[?:?]
        at com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.RocksDBExtractionNamespaceCacheFactory.getCacheValue(RocksDBExtractionNamespaceCacheFactory.java:97) ~[?:?]
        at com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.RocksDBExtractionNamespaceCacheFactory.getCacheValue(RocksDBExtractionNamespaceCacheFactory.java:29) ~[?:?]
        at com.yahoo.maha.maha_druid_lookups.server.lookup.namespace.MahaNamespacesCacheResource.getCacheValue(MahaNamespacesCacheResource.java:97) ~[?:?]

I attached my proto file, generated Java code, and the factory class. I used the following command to generate the Java code:

protoc --proto_path=/Users/jchome/src/github.com/drinkbeer/rocksdb-poc/src/main/java/com/shopify/data/rocksdb/entity/ --java_out=/Users/jchome/src/github.com/drinkbeer/rocksdb-poc/src/main/java/ Customer.proto

If possible, could you please share the steps you used to create the proto and the schema factory?

Regards,
Jason
Customer.proto
CustomerProtos.java
CustomerProtobufSchemaFactory.java

pb

Dec 6, 2020, 4:09:28 AM
to Jason Chen, maha-users, tazan007, pavan ab
Hello Jason,
   I have spent some time today installing the RocksDB maha-druid-lookups on a fresh Druid 0.17, and uploaded the guide (answers/explanations for all the difficulties that you faced) to the https://github.com/pranavbhole/maha-druid-lookups-example repo. I picked Customer.proto as the example, with 3-4 fields. The guide explains how to test rocksdb.zip on the local file system instead of Hadoop. You don't have to change any code in
  • RocksDBManager (config.set("fs.defaultFS", "hdfs://localhost:8020")); all you need to do is place core-site.xml in the _common config dir. It picks up the config from there before instantiating.

com.google.protobuf.InvalidProtocolBufferException$InvalidWireTypeException: Protocol message tag had invalid wire type.
it means there is something wrong with the serialization of values while creating the RocksDB instance. For reference, here is how we write values:
```
// Inside a loop over i: rocksDB is an open RocksDB handle and
// status is a String[] of sample status values.
String id = String.valueOf(i);
Message message = CustomerProtos.Customer.newBuilder()
    .setId(id)
    .setAddress("address_" + id)
    .setName("name_" + id)
    .setLastUpdated("" + System.currentTimeMillis())
    .setStatus(status[i % 2])
    .build();
// Key: UTF-8 bytes of the id; value: the serialized protobuf message.
rocksDB.put(id.getBytes(), message.toByteArray());
```

We put the bytes of the string key and the message bytes as the value in RocksDB. I have added the CreateExampleRocksDBInstance class, which creates the RocksDB instance.
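For context, a minimal standalone sketch of creating such an instance locally (hypothetical; see the actual CreateExampleRocksDBInstance class in the example repo):

```
import org.rocksdb.Options;
import org.rocksdb.RocksDB;

public class CreateRocksDBSketch {
    public static void main(String[] args) throws Exception {
        RocksDB.loadLibrary();
        try (Options options = new Options().setCreateIfMissing(true);
             RocksDB db = RocksDB.open(options, "/tmp/rocksdb_customer")) {
            // In the real example the value is a serialized protobuf message.
            db.put("1".getBytes(), "placeholder".getBytes());
        }
        // Zip /tmp/rocksdb_customer into rocksdb.zip and upload it to the
        // load_time partition in HDFS along with the _SUCCESS marker.
    }
}
```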

You can follow the guide and let me know if you face any issues. I wrote it today; I can also explain it to you over a Google Hangouts meeting and record a video of it.

Apart from protobuf, we also support flatbuffers for de/serialization; we have found that it works pretty well.

Thank you !!



Jason Chen

Dec 6, 2020, 5:41:24 PM
to pb, maha-users, tazan007, pavan ab
Hey, Pranav

Thank you very much! I can confirm that the steps in the maha-druid-lookups-example repository work fine for me. I am excited about it.

By the way, I found that the lookup doesn't work in the Druid console. I tried to query the lookup and got the following error.

The query: “SELECT "k", "v" FROM lookup.customer_lookup LIMIT 5000"

The error: 

Error: Unknown exception

org.apache.calcite.runtime.CalciteContextException: From line 1, column 22 to line 1, column 43: Object 'customer_lookup' not found within 'lookup'
org.apache.calcite.tools.ValidationException


I added the following line to /druid/conf/druid/single-server/micro-quickstart/historical/runtime.properties:

druid.lookup.lookupTier=historicalLookupTier

And the following line to /druid/conf/druid/single-server/micro-quickstart/broker/runtime.properties:

druid.lookup.lookupTier=brokerLookupTier


Just wondering, does the Maha lookup work in the console?


Jason (Jianbin) Chen
Senior Data Developer
p: +1 2066608351 | e: jason...@shopify.com
a: 234 Laurier Ave W Ottawa, ON K1N 5X8

pb

Dec 6, 2020, 7:39:32 PM
to Jason Chen, maha-users, tazan007, pavan ab
We never used the Druid console for querying lookups; we always use the API or a Druid query with an extraction fn to query a lookup. SQL support was added in later versions; I think we need to implement some more interfaces to add SQL support in the console. Feel free to contribute to maha-lookups.

druid.lookup.lookupTier is "The tier for lookups for this process. This is independent of other tiers".

thanks !

Jason Chen

Dec 7, 2020, 5:55:01 PM
to pb, maha-users, tazan007, pavan ab
Cool. Yes, we can try to contribute to Maha Druid Lookups by adding SQL support in the console, but we may need a bit of guidance on contributing. Do you have a CONTRIBUTING.md in the Maha repository?

I am doing a join of the facts table with the Customer dimension.

So this is the schema of facts:

Id: int | orders_customer_id: string | count: long

The schema of Customer dimension:

Id: string | name: string | status: string | lastUpdate: string | address: string

I tried to run the following query:

{
  "queryType": "groupBy",
  "dataSource": "select_type1_1k_output",
  "granularity": "day",
  "dimensions": [
    {
      "type": "default",
      "dimension": "id",
      "outputName": "Customer ID",
      "outputType": "STRING"
    },
    {
      "type": "extraction",
      "dimension": "id",
      "outputName": "status",
      "outputType": "STRING",
      "extractionFn": {
        "type": "mahaRegisteredLookup",
        "lookup": "customer_lookup",
        "retainMissingValue": false,
        "replaceMissingValueWith": "null",
        "injective": false,
        "optimize": true,
        "valueColumn": "status",
        "decode": null,
        "dimensionOverrideMap": {},
        "useQueryLevelCache": false
      }
    }
  ],
  "intervals": [ "2010-09-12T00:00:00.000/2020-09-13T00:00:00.000" ]
}

But I got the following 500 error:

<title>Error 500 javax.servlet.ServletException: java.lang.NoSuchMethodError: org.apache.druid.query.lookup.LookupReferencesManager.get(Ljava/lang/String;)Lorg/apache/druid/query/lookup/LookupExtractorFactoryContainer;</title>
</head>
<body><h2>HTTP ERROR 500 javax.servlet.ServletException: java.lang.NoSuchMethodError: org.apache.druid.query.lookup.LookupReferencesManager.get(Ljava/lang/String;)Lorg/apache/druid/query/lookup/LookupExtractorFactoryContainer;</h2>
<table>
<tr><th>URI:</th><td>/druid/v2/</td></tr>
<tr><th>STATUS:</th><td>500</td></tr>
<tr><th>MESSAGE:</th><td>javax.servlet.ServletException: java.lang.NoSuchMethodError: org.apache.druid.query.lookup.LookupReferencesManager.get(Ljava/lang/String;)Lorg/apache/druid/query/lookup/LookupExtractorFactoryContainer;</td></tr>
<tr><th>SERVLET:</th><td>org.eclipse.jetty.servlet.DefaultServlet-7d61468c</td></tr>
<tr><th>CAUSED BY:</th><td>javax.servlet.ServletException: java.lang.NoSuchMethodError: org.apache.druid.query.lookup.LookupReferencesManager.get(Ljava/lang/String;)Lorg/apache/druid/query/lookup/LookupExtractorFactoryContainer;</td></tr>
<tr><th>CAUSED BY:</th><td>java.lang.NoSuchMethodError: org.apache.druid.query.lookup.LookupReferencesManager.get(Ljava/lang/String;)Lorg/apache/druid/query/lookup/LookupExtractorFactoryContainer;</td></tr>
</table>
<h3>Caused by:</h3><pre>javax.servlet.ServletException: java.lang.NoSuchMethodError: org.apache.druid.query.lookup.LookupReferencesManager.get(Ljava/lang/String;)Lorg/apache/druid/query/lookup/LookupExtractorFactoryContainer;
at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:420)
at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:558)
at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:733)


I also attached the stack trace in this email. Do you have any insights into the root cause of the exceptions? Thank you very much!

Regards,
Jianbin Chen
error.query.druid.txt

pb

Dec 7, 2020, 6:06:12 PM
to Jason Chen, maha-users, tazan007, pavan ab
This looks like a Druid processing lib mismatch in your API containers.
You can also explore the maha api-example and use Maha for querying. In Maha you define the schema for your Druid data source and lookups and configure maha-service-config; maha-api-jersey provides a library that exposes JSON APIs, so you can query Druid using JSON. Take a look at Maha's getting started guide.


Jason Chen

Dec 7, 2020, 6:16:46 PM
to pb, maha-users, tazan007, pavan ab
Hey.

Thanks for your reply. I am definitely willing to explore the API approach later. 

My Druid version is 0.20.0, while Maha is built against Druid 0.17.1.

Should I download 0.17.1 and retry the query through HTTP?

Regards,
Jianbin Chen

pb

Dec 7, 2020, 6:19:32 PM
to Jason Chen, maha-users, tazan007, pavan ab
Querying through Maha is backward compatible; you can still query 0.20 Druid using the latest Maha lib.

Jason Chen

Dec 7, 2020, 6:21:56 PM
to pb, maha-users, tazan007, pavan ab
Okay. I found that the `LookupReferencesManager.get(String)` API changed its return type.

In 0.17.1:

    @Nullable
    public LookupExtractorFactoryContainer get(String lookupName) {
        Preconditions.checkState(this.lifecycleLock.awaitStarted(1L, TimeUnit.MILLISECONDS));
        return (LookupExtractorFactoryContainer)((LookupReferencesManager.LookupUpdateState)this.stateRef.get()).lookupMap.get(lookupName);
    }

In 0.20.0:

  @Override
  public Optional<LookupExtractorFactoryContainer> get(String lookupName)
  {
    Preconditions.checkState(lifecycleLock.awaitStarted(1, TimeUnit.MILLISECONDS));
    return Optional.ofNullable(stateRef.get().lookupMap.get(lookupName));
  }

Could this be the root cause of the query error I have seen?

Regards,
Jianbin Chen

pb

Dec 7, 2020, 6:31:37 PM
to Jason Chen, maha-users, tazan007, pavan ab
Ya, we have the 0.17.1 version in the maha lookup extension and you are using Druid 0.20. I created a new branch https://github.com/yahoo/maha/tree/druid-0.20; you can update the Druid version in the lookup extension, fix the compilation errors, and submit the changes there. You can contribute by creating a maha fork.
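For illustration, a minimal sketch of the kind of call-site change this return-type difference forces on the extension side, based on the two signatures quoted above (variable names here are hypothetical):

    // Against Druid 0.17.1 the container comes back directly (possibly null):
    LookupExtractorFactoryContainer container = manager.get(lookupName);
    Preconditions.checkArgument(container != null, "Lookup [%s] not found", lookupName);

    // Against Druid 0.20.0 the same call returns an Optional, so unwrap explicitly:
    Optional<LookupExtractorFactoryContainer> maybe = manager.get(lookupName);
    Preconditions.checkArgument(maybe.isPresent(), "Lookup [%s] not found", lookupName);
    LookupExtractorFactoryContainer container = maybe.get();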

Thanks.

Jason Chen

Dec 7, 2020, 6:47:16 PM
to pb, maha-users, tazan007, pavan ab
Sounds good to me. Yes. I can try to upgrade the Druid version in Maha.

Regards,
Jianbin Chen

Jason Chen

Dec 7, 2020, 7:58:32 PM
to pb, maha-users, tazan007, pavan ab
Hey, Pranav,

I created a PR that bumps the version of Druid and Maha. Please check: https://github.com/yahoo/maha/pull/748

Regards,
Jianbin Chen

Jason Chen

Dec 7, 2020, 8:37:51 PM
to pb, maha-users, tazan007, pavan ab
I bumped the version to 0.20.0, unzipped the built Maha zip into the Druid extensions folder, and restarted the Druid cluster. Then I got the following errors:

╰─$ curl -L -H 'Content-Type:application/json' -XPOST --data-binary '@quickstart/tutorial/join_facts_dimensions_query.json' 'http://localhost:8082/druid/v2/?pretty'
{
  "error" : "Unknown exception",
  "errorMessage" : "Lookup [customer_lookup] not found",
  "errorClass" : "java.lang.IllegalArgumentException",
  "host" : null
}%
╭─jchome@jcmbp-2 ~/src/github.com/Shopify/druid-poc/druid-bin/apache-druid-0.20.0 ‹druid-local*›
╰─$ cat quickstart/tutorial/join_facts_dimensions_query.json
}%
╭─jchome@jcmbp-2 ~/src/github.com/Shopify/druid-poc/druid-bin/apache-druid-0.20.0 ‹druid-local*›
╰─$ curl "http://localhost:8083/druid/v1/namespaces/customer_lookup?namespaceclass=com.yahoo.maha.maha_druid_lookups.query.lookup.namespace.RocksDBExtractionNamespace&key=113440282&valueColumn=address&debug=true"
address_113440282%

The errors in the broker logs:

2020-12-08T01:31:44,034 INFO [DruidSchema-Cache-0] org.apache.druid.sql.calcite.schema.DruidSchema - dataSource [select_type1_1k_output] has new signature: {__time:LONG, attributed_sessions_id:LONG, attributed_sessions_marketing_event_id:LONG, attributed_sessions_order_id:LONG, attributed_sessions_updated_at:LONG, count:LONG, line_items_collection_id:LONG, line_items_id:LONG, line_items_order_id:LONG, line_items_product_id:LONG, line_items_product_type:STRING, line_items_title:STRING, line_items_updated_at:LONG, line_items_variant_id:LONG, line_items_variant_title:STRING, line_items_vendor:STRING, marketing_events_id:LONG, marketing_events_marketing_campaign_id:LONG, marketing_events_started_at:LONG, marketing_events_updated_at:LONG, orders_api_client_id:LONG, orders_billing_address_id:LONG, orders_cancelled_at:LONG, orders_created_at:LONG, orders_customer_id:STRING, orders_deleted_at:LONG, orders_financial_status:STRING, orders_fulfillment_status:STRING, orders_id:LONG, orders_is_deleted:STRING, orders_location_id:LONG, orders_name:STRING, orders_presentment_currency:STRING, orders_shipping_address_id:LONG, orders_shop_id:LONG, orders_test:STRING, orders_updated_at:LONG, orders_user_id:LONG, retail_sale_attributions_id:LONG, retail_sale_attributions_sale_id:LONG, retail_sale_attributions_updated_at:LONG, retail_sale_attributions_user_id:LONG, sale_unit_costs_created_at:LONG, sale_unit_costs_id:LONG, sale_unit_costs_sale_id:LONG, sale_unit_costs_shop_id:LONG, sale_unit_costs_updated_at:LONG, sales_id:LONG, sales_kind:STRING, sales_line_item_id:LONG, sales_line_type:STRING, sales_order_id:LONG, sales_shop_id:LONG, sales_updated_at:LONG, sum_total_line_items_price:LONG, sum_total_sales_allocated_order_discount_amount_before_taxes:LONG, sum_total_sales_amount_after_discounts_after_taxes:LONG, sum_total_sales_amount_after_discounts_before_taxes:LONG, sum_total_sales_amount_after_line_discount_before_order_discount_before_taxes:LONG, sum_total_sales_amount_before_discounts_before_taxes:LONG, sum_total_sales_count:LONG, sum_total_sales_quantity:LONG, sum_total_sales_tax_amount:LONG}.
2020-12-08T01:32:13,782 WARN [qtp23568923-123[groupBy_[select_type1_1k_output]_1859a370-975b-4457-b0ab-362148419f5a]] org.apache.druid.server.QueryLifecycle - Exception while processing queryId [1859a370-975b-4457-b0ab-362148419f5a] (java.lang.IllegalArgumentException: Lookup [customer_lookup] not found)
2020-12-08T01:32:13,806 ERROR [qtp23568923-123[groupBy_[select_type1_1k_output]_1859a370-975b-4457-b0ab-362148419f5a]] org.apache.druid.server.QueryResource - Exception handling request: {class=org.apache.druid.server.QueryResource, exceptionType=class java.lang.IllegalArgumentException, exceptionMessage=Lookup [customer_lookup] not found, query={"queryType":"groupBy","dataSource":{"type":"table","name":"select_type1_1k_output"},"intervals":{"type":"LegacySegmentSpec","intervals":["2010-09-12T00:00:00.000Z/2020-09-13T00:00:00.000Z"]},"virtualColumns":[],"filter":null,"granularity":"DAY","dimensions":[{"type":"default","dimension":"id","outputName":"Customer ID","outputType":"STRING"},{"type":"extraction","dimension":"id","outputName":"status","outputType":"STRING","extractionFn":{"type":"mahaRegisteredLookup","lookup":"customer_lookup","retainMissingValue":false,"replaceMissingValueWith":"null","injective":false,"optimize":true,"valueColumn":"status","decode":null,"dimensionOverrideMap":{},"useQueryLevelCache":false}}],"aggregations":[],"postAggregations":[],"having":null,"limitSpec":{"type":"NoopLimitSpec"},"context":{"queryId":"1859a370-975b-4457-b0ab-362148419f5a"},"descending":false}, peer=0:0:0:0:0:0:0:1} (java.lang.IllegalArgumentException: Lookup [customer_lookup] not found)

Meanwhile, the lookup itself works when I query only the lookup, specifying the key and valueColumn. Do you have any insights into these errors?

Regards,
Jianbin Chen

pb

Dec 8, 2020, 12:22:58 AM
to Jason Chen, maha-users
Did you try deleting the lookup and creating it again?
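For reference, a sketch of deleting a registered lookup through the coordinator's lookup config API, reusing the tier, lookup name, and host that appear elsewhere in this thread:

    curl -XDELETE http://localhost:8081/druid/coordinator/v1/lookups/config/historicalLookupTier/customer_lookup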

Jason Chen

Dec 8, 2020, 12:26:29 AM
to pb, maha-users
Hey, yes. I deleted the lookup and tried again.

I found the error logs come from the druid-server QueryLifecycle class, which prints no stack traces. So now I am building a new druid-server 0.20.0 jar that uses log.error() so I can see the stack trace. The following code snippet is where the errors are emitted:

      if (e != null) {
        statsMap.put("exception", e.toString());
        log.error(e, "Exception while processing queryId [%s]", baseQuery.getId());
        if (e instanceof QueryInterruptedException) {
          // Mimic behavior from QueryResource, where this code was originally taken from.
          statsMap.put("interrupted", true);
          statsMap.put("reason", e.toString());
        }
      }

Regards,
Jianbin Chen

Jason Chen

Dec 8, 2020, 12:39:02 AM
to pb, maha-users
I got the error stack trace:

2020-12-08T05:36:41,793 ERROR [qtp23568923-127[groupBy_[select_type1_1k_output]_9b5892b8-ecaa-44de-b557-7c1e9b623b73]] org.apache.druid.server.QueryLifecycle - Exception while processing queryId [9b5892b8-ecaa-44de-b557-7c1e9b623b73]
java.lang.IllegalArgumentException: Lookup [customer_lookup] not found
        at com.google.common.base.Preconditions.checkArgument(Preconditions.java:148) ~[guava-16.0.1.jar:?]
        at com.yahoo.maha.maha_druid_lookups.query.lookup.MahaRegisteredLookupExtractionFn.ensureDelegate(MahaRegisteredLookupExtractionFn.java:225) ~[?:?]
        at com.yahoo.maha.maha_druid_lookups.query.lookup.MahaRegisteredLookupExtractionFn.getExtractionType(MahaRegisteredLookupExtractionFn.java:202) ~[?:?]
        at org.apache.druid.query.groupby.GroupByQueryQueryToolChest.extractionsToRewrite(GroupByQueryQueryToolChest.java:715) ~[druid-processing-0.20.0.jar:0.20.0]
        at org.apache.druid.query.groupby.GroupByQueryQueryToolChest.makePostComputeManipulatorFn(GroupByQueryQueryToolChest.java:362) ~[druid-processing-0.20.0.jar:0.20.0]
        at org.apache.druid.query.groupby.GroupByQueryQueryToolChest.makePostComputeManipulatorFn(GroupByQueryQueryToolChest.java:81) ~[druid-processing-0.20.0.jar:0.20.0]
        at org.apache.druid.query.FinalizeResultsQueryRunner.run(FinalizeResultsQueryRunner.java:105) ~[druid-processing-0.20.0.jar:0.20.0]
        at org.apache.druid.query.CPUTimeMetricQueryRunner.run(CPUTimeMetricQueryRunner.java:65) ~[druid-processing-0.20.0.jar:0.20.0]
        at org.apache.druid.query.ResultLevelCachingQueryRunner.run(ResultLevelCachingQueryRunner.java:158) ~[druid-server-0.20.0.jar:0.20.0]
        at org.apache.druid.query.FluentQueryRunnerBuilder$FluentQueryRunner.run(FluentQueryRunnerBuilder.java:55) ~[druid-processing-0.20.0.jar:0.20.0]
        at org.apache.druid.server.ClientQuerySegmentWalker$QuerySwappingQueryRunner.run(ClientQuerySegmentWalker.java:508) ~[druid-server-0.20.0.jar:0.20.0]
        at org.apache.druid.query.QueryPlus.run(QueryPlus.java:149) ~[druid-processing-0.20.0.jar:0.20.0]
        at org.apache.druid.server.QueryLifecycle.execute(QueryLifecycle.java:268) ~[druid-server-0.20.0.jar:0.20.0]
        at org.apache.druid.server.QueryResource.doPost(QueryResource.java:210) ~[druid-server-0.20.0.jar:0.20.0]

        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_272]
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_272]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_272]
        at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_272]
        at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60) ~[jersey-server-1.19.3.jar:1.19.3]
        at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205) ~[jersey-server-1.19.3.jar:1.19.3]
        at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75) ~[jersey-server-1.19.3.jar:1.19.3]
        at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:302) ~[jersey-server-1.19.3.jar:1.19.3]
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:489) ~[jetty-servlet-9.4.30.v20200611.jar:9.4.30.v20200611]
        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1580) ~[jetty-server-9.4.30.v20200611.jar:9.4.30.v20200611]
        at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186) ~[jetty-server-9.4.30.v20200611.jar:9.4.30.v20200611]
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1284) ~[jetty-server-9.4.30.v20200611.jar:9.4.30.v20200611]
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) ~[jetty-server-9.4.30.v20200611.jar:9.4.30.v20200611]
        at org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:767) ~[jetty-server-9.4.30.v20200611.jar:9.4.30.v20200611]
        at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:59) ~[jetty-server-9.4.30.v20200611.jar:9.4.30.v20200611]
        at org.eclipse.jetty.server.handler.StatisticsHandler.handle(StatisticsHandler.java:173) ~[jetty-server-9.4.30.v20200611.jar:9.4.30.v20200611]
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) ~[jetty-server-9.4.30.v20200611.jar:9.4.30.v20200611]
        at org.eclipse.jetty.server.Server.handle(Server.java:501) ~[jetty-server-9.4.30.v20200611.jar:9.4.30.v20200611]
        at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:383) ~[jetty-server-9.4.30.v20200611.jar:9.4.30.v20200611]
        at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:556) [jetty-server-9.4.30.v20200611.jar:9.4.30.v20200611]
        at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:375) [jetty-server-9.4.30.v20200611.jar:9.4.30.v20200611]
        at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:272) [jetty-server-9.4.30.v20200611.jar:9.4.30.v20200611]
        at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311) [jetty-io-9.4.30.v20200611.jar:9.4.30.v20200611]
        at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103) [jetty-io-9.4.30.v20200611.jar:9.4.30.v20200611]
        at org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104) [jetty-io-9.4.30.v20200611.jar:9.4.30.v20200611]
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:806) [jetty-util-9.4.30.v20200611.jar:9.4.30.v20200611]
        at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:938) [jetty-util-9.4.30.v20200611.jar:9.4.30.v20200611]

        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_272]
2020-12-08T05:36:41,834 ERROR [qtp23568923-127[groupBy_[select_type1_1k_output]_9b5892b8-ecaa-44de-b557-7c1e9b623b73]] org.apache.druid.server.QueryResource - Exception handling request: {class=org.apache.druid.server.QueryResource, exceptionType=class java.lang.IllegalArgumentException, exceptionMessage=Lookup [customer_lookup] not found, query={"queryType":"groupBy","dataSource":{"type":"table","name":"select_type1_1k_output"},"intervals":{"type":"LegacySegmentSpec","intervals":["2010-09-12T00:00:00.000Z/2020-09-13T00:00:00.000Z"]},"virtualColumns":[],"filter":null,"granularity":"DAY","dimensions":[{"type":"default","dimension":"id","outputName":"Customer ID","outputType":"STRING"},{"type":"extraction","dimension":"id","outputName":"status","outputType":"STRING","extractionFn":{"type":"mahaRegisteredLookup","lookup":"customer_lookup","retainMissingValue":false,"replaceMissingValueWith":"null","injective":false,"optimize":true,"valueColumn":"status","decode":null,"dimensionOverrideMap":{},"useQueryLevelCache":false}}],"aggregations":[],"postAggregations":[],"having":null,"limitSpec":{"type":"NoopLimitSpec"},"context":{"queryId":"9b5892b8-ecaa-44de-b557-7c1e9b623b73"},"descending":false}, peer=0:0:0:0:0:0:0:1} (java.lang.IllegalArgumentException: Lookup [customer_lookup] not found)


I will dig into it tomorrow.

Regards,
Jianbin Chen

Jason Chen

Dec 8, 2020, 12:41:40 AM
to pb, maha-users
The error is caused by a Precondition check failure:

    private MahaLookupExtractionFn ensureDelegate() {
        if (null == delegate) {
            // http://www.javamex.com/tutorials/double_checked_locking.shtml
            synchronized (delegateLock) {
                if (null == delegate) {
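                    // This is the check that throws "Lookup [...] not found" (see the
                    // stack trace above) when the lookup is absent from the manager's map.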
                    Preconditions.checkArgument(manager.get(getLookup()).isPresent(), "Lookup [%s] not found", getLookup());
                    delegate = new MahaLookupExtractionFn(
                            manager.get(getLookup()).get().getLookupExtractorFactory().get(),
                            isRetainMissingValue(),
                            getReplaceMissingValueWith(),
                            isInjective(),
                            isOptimize(),
                            valueColumn,
                            decodeConfig,
                            dimensionOverrideMap
                    );
                }
            }
        }
        return delegate;
    }

Regards,
Jianbin Chen

Jason Chen

Dec 9, 2020, 4:24:04 PM
to pb, maha-users
Hey Pranav,

Still the same problem as in the previous email. When I try to query the facts and look up the customer dimension using the attached JSON, I find that the lookup is not registered in the lookupMap in the LookupReferencesManager class. Do you have any idea why the lookup "customer_lookup" is not registered? Do you have a query example?

Regards,
Jianbin Chen
join_facts_dimensions_query.json

Jason Chen

Dec 9, 2020, 4:27:09 PM
to pb, maha-users
I forgot to mention that these are the commands I used to create the lookup and query it:

╭─jchome@jcmbp-2 ~/src/github.com/Shopify/druid-poc/druid-bin/apache-druid-0.20.0 ‹druid-local*›

╰─$ curl -XPOST -H'Content-Type: application/json' -d '{}' http://localhost:8081/druid/coordinator/v1/lookups/config
{}%
╭─jchome@jcmbp-2 ~/src/github.com/Shopify/druid-poc/druid-bin/apache-druid-0.20.0 ‹druid-local*›
╰─$ curl -XPOST -H'Content-Type: application/json' -d '@quickstart/tutorial/historical_lookup.json' http://localhost:8081/druid/coordinator/v1/lookups/config
{"historicalLookupTier":{"customer_lookup":{"version":"v0","lookupExtractorFactory":{"type":"cachedNamespace","extractionNamespace":{"type":"maharocksdb","namespace":"customer_lookup","lookupName":"customer_lookup","rocksDbInstanceHDFSPath":"/Users/jchome/src/github.com/drinkbeer/maha-druid-lookups-example/target","lookupAuditingHDFSPath":"/tmp","pollPeriod":"PT30S","cacheEnabled":true}}}}}%
╰─$ curl -L -H 'Content-Type:application/json' -XPOST --data-binary '@quickstart/tutorial/join_facts_dimensions_query.json' 'http://localhost:8082/druid/v2/?pretty'
{
  "error" : "Unknown exception",
  "errorMessage" : "Lookup [customer_lookup] not found",
  "errorClass" : "java.lang.IllegalArgumentException",
  "host" : null
}%

Regards,
Jianbin Chen

pb

Dec 9, 2020, 4:45:36 PM
to Jason Chen, maha-users
https://github.com/apache/druid/releases — you have updated the extension to support Druid 0.20; we need to revisit the release notes again in case we missed something. The extension works perfectly on 0.17.

pb

Dec 9, 2020, 7:53:24 PM
to Jason Chen, maha-users
Let's update the druid-0.20 branch if we find something. I will try to port it to druid-0.20 whenever I get time.
Thanks

Jason Chen

Dec 10, 2020, 2:44:18 AM
to pb, maha-users
Sure, we can debug Druid 0.20.0 some time in the future, but I think the most important thing is to validate that the query works in Druid 0.17.1.

I switched to Druid 0.17.1 and found the same error:

2020-12-10T07:31:55,589 ERROR [qtp1612539426-140[groupBy_[select_type1_1k_output]_e1268d6a-cb46-4df8-859d-bd61a41ec2e2]] org.apache.druid.server.QueryResource - Exception handling request: {class=org.apache.druid.server.QueryResource, exceptionType=class java.lang.NullPointerException, exceptionMessage=Lookup [customer_lookup] not found, query={"queryType":"groupBy","dataSource":{"type":"table","name":"select_type1_1k_output"},"intervals":{"type":"LegacySegmentSpec","intervals":["2010-09-12T00:00:00.000Z/2020-09-13T00:00:00.000Z"]},"virtualColumns":[],"filter":null,"granularity":"DAY","dimensions":[{"type":"default","dimension":"id","outputName":"Customer ID","outputType":"STRING"},{"type":"extraction","dimension":"id","outputName":"status","outputType":"STRING","extractionFn":{"type":"mahaRegisteredLookup","lookup":"customer_lookup","retainMissingValue":false,"replaceMissingValueWith":"null","injective":false,"optimize":true,"valueColumn":"status","decode":null,"dimensionOverrideMap":{},"useQueryLevelCache":false}}],"aggregations":[],"postAggregations":[],"having":null,"limitSpec":{"type":"NoopLimitSpec"},"context":{"queryId":"e1268d6a-cb46-4df8-859d-bd61a41ec2e2"},"descending":false}, peer=0:0:0:0:0:0:0:1} (java.lang.NullPointerException: Lookup [customer_lookup] not found)

╭─jchome@jcmbp-2 ~/src/github.com/Shopify/druid-poc/druid-bin/apache-druid-0.17.1 ‹druid-local*›

╰─$ curl -L -H 'Content-Type:application/json' -XPOST --data-binary '@quickstart/tutorial/join_facts_dimensions_query.json' 'http://localhost:8082/druid/v2/?pretty'
{
  "error" : "Unknown exception",
  "errorMessage" : "Lookup [customer_lookup] not found",
  "errorClass" : "java.lang.NullPointerException",
  "host" : null
}%

I guess I missed some steps in creating lookups. What I can find is that the lookupMap is empty in LookupReferencesManager. I want to understand:
  1. How is "LookupReferencesManager" injected into "MahaRegisteredLookupExtractionFn"? I didn't find it in "MahaNamespaceExtractionModule".
  2. Could you point me to the code where "LookupExtractorFactoryContainer" is created and put into the lookupMap?
Thank you very much!

Regards,
Jianbin Chen

Jason Chen

Dec 10, 2020, 1:11:04 PM
to pb, maha-users
I am continuing to dig into the lookup-is-null issue. I found that the following method in LookupReferencesManager is causing the issue:

  /**
   * Returns a list of lookups from the snapshot if the lookupsnapshottaker is configured. If it's not available,
   * returns null.
   *
   * @return list of LookupBean objects, or null
   */
  @Nullable
  private List<LookupBean> getLookupListFromSnapshot()
  {
    if (lookupSnapshotTaker != null) {
      LOG.info("LogBreakPoint 15 - getLookupListFromSnapshot(), lookupSnapshotTaker is not null, tier: " + lookupListeningAnnouncerConfig.getLookupTier());
      return lookupSnapshotTaker.pullExistingSnapshot(lookupListeningAnnouncerConfig.getLookupTier());
    }
    LOG.info("LogBreakPoint 16 - getLookupListFromSnapshot(), lookupSnapshotTaker is null, tier: " + lookupListeningAnnouncerConfig.getLookupTier());
    return null;
  }
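For context, lookupSnapshotTaker is null unless a snapshot working directory is configured; a sketch of the property that enables it (the path here is hypothetical):

    # runtime.properties — leaving this unset disables the snapshot/bootstrap utility
    druid.lookup.snapshotWorkingDir=/tmp/druid/lookup-snapshots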

Broker Logs:
2020-12-10T18:02:20,448 INFO [main] org.apache.druid.query.lookup.LookupReferencesManager - LogBreakPoint 16 - getLookupListFromSnapshot(), lookupSnapshotTaker is null, tier: brokerLookupTier

Config in broker runtime.properties:
druid.lookup.lookupTier=brokerLookupTier

Config in historical runtime.properties:
druid.lookup.lookupTier=historicalLookupTier

I found that this method is called in LookupReferencesManager, and it reads from the snapshot for brokerLookupTier. However, we store our lookup in the historical tier on the historical nodes. Is this the root cause of LookupReferencesManager not being able to read [customer_lookup]? Looking forward to hearing your opinion.

Thank you

Regards,
Jianbin Chen

Jason Chen

Dec 10, 2020, 1:35:22 PM
to pb, maha-users
Hey. After setting the broker config to the correct tier, it works.

Config in broker runtime.properties:
# Maha
druid.lookup.lookupTier=historicalLookupTier

Query result:

╰─$ curl -L -H 'Content-Type:application/json' -XPOST --data-binary '@quickstart/tutorial/join_facts_dimensions_query.json' 'http://localhost:8082/druid/v2/?pretty'
[ {
  "version" : "v1",
  "timestamp" : "2013-12-03T00:00:00.000Z",
  "event" : { }
}, {
  "version" : "v1",
  "timestamp" : "2014-03-31T00:00:00.000Z",
  "event" : { }
}, {
  "version" : "v1",
  "timestamp" : "2014-11-27T00:00:00.000Z",
  "event" : { }
}, {
  "version" : "v1",
  "timestamp" : "2014-11-28T00:00:00.000Z",
  "event" : { }
}, {
  "version" : "v1",
  "timestamp" : "2014-11-29T00:00:00.000Z",
  "event" : { }
}, {
  "version" : "v1",
  "timestamp" : "2014-12-09T00:00:00.000Z",
  "event" : { }
}, {
  "version" : "v1",
  "timestamp" : "2014-12-11T00:00:00.000Z",
  "event" : { }
}, {
  "version" : "v1",
  "timestamp" : "2014-12-21T00:00:00.000Z",
  "event" : { }
}, {
  "version" : "v1",
  "timestamp" : "2014-12-22T00:00:00.000Z",
  "event" : { }
}, {
  "version" : "v1",
  "timestamp" : "2014-12-24T00:00:00.000Z",
  "event" : { }
}, {
  "version" : "v1",
  "timestamp" : "2014-12-27T00:00:00.000Z",
  "event" : { }
}, {
  "version" : "v1",
  "timestamp" : "2015-01-01T00:00:00.000Z",
  "event" : { }
}, {
  "version" : "v1",
  "timestamp" : "2015-01-02T00:00:00.000Z",
  "event" : { }
}, {
  "version" : "v1",
  "timestamp" : "2015-01-07T00:00:00.000Z",
  "event" : { }
}, {
  "version" : "v1",
  "timestamp" : "2015-01-08T00:00:00.000Z",
  "event" : { }
}, {
  "version" : "v1",
  "timestamp" : "2015-01-10T00:00:00.000Z",
  "event" : { }
}, {
  "version" : "v1",
  "timestamp" : "2015-02-12T00:00:00.000Z",
  "event" : { }
}, {
  "version" : "v1",
  "timestamp" : "2015-02-13T00:00:00.000Z",
  "event" : { }
}, {
  "version" : "v1",
  "timestamp" : "2015-02-14T00:00:00.000Z",
  "event" : { }
}, {
  "version" : "v1",
  "timestamp" : "2015-02-18T00:00:00.000Z",
  "event" : { }
}, {
  "version" : "v1",
  "timestamp" : "2015-02-19T00:00:00.000Z",
  "event" : { }
}, {
  "version" : "v1",
  "timestamp" : "2015-02-20T00:00:00.000Z",
  "event" : { }
}, {
  "version" : "v1",
  "timestamp" : "2015-02-28T00:00:00.000Z",
  "event" : { }
}, {
  "version" : "v1",
  "timestamp" : "2015-03-01T00:00:00.000Z",
  "event" : { }
}, {
  "version" : "v1",
  "timestamp" : "2015-03-02T00:00:00.000Z",
  "event" : { }
}, {
  "version" : "v1",
  "timestamp" : "2015-03-03T00:00:00.000Z",
  "event" : { }
}, {
  "version" : "v1",
  "timestamp" : "2015-03-06T00:00:00.000Z",
  "event" : { }
}, {
  "version" : "v1",
  "timestamp" : "2015-03-11T00:00:00.000Z",
  "event" : { }
}, {
  "version" : "v1",
  "timestamp" : "2015-03-13T00:00:00.000Z",
  "event" : { }
}, {
  "version" : "v1",
  "timestamp" : "2015-03-23T00:00:00.000Z",
  "event" : { }
}, {
  "version" : "v1",
  "timestamp" : "2015-03-27T00:00:00.000Z",
  "event" : { }
}, {
  "version" : "v1",
  "timestamp" : "2015-04-26T00:00:00.000Z",
  "event" : { }
}, {
  "version" : "v1",
  "timestamp" : "2015-04-27T00:00:00.000Z",
  "event" : { }
}, {
  "version" : "v1",
  "timestamp" : "2015-04-28T00:00:00.000Z",
  "event" : { }
}, {
  "version" : "v1",
  "timestamp" : "2015-04-29T00:00:00.000Z",
  "event" : { }
}


Do you have some guidance on different types of queries? Could I directly use SQL in the query?

Regards,
Jianbin Chen

pb

Dec 10, 2020, 2:10:54 PM
to Jason Chen, maha-users
Nice, glad to know that it worked. Yup, the broker config is needed.
https://druid.apache.org/docs/latest/querying/lookups.html — once a lookup is registered, you can use SQL as well via the LOOKUP() function. More usage is in the Druid docs.
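For illustration, a minimal sketch of the SQL form, assuming the registered lookup behaves as a standard key/value lookup and reusing the fact table and column names from earlier in this thread:

    SELECT
      orders_customer_id,
      LOOKUP(orders_customer_id, 'customer_lookup') AS customer_value,
      SUM("count") AS total_count
    FROM select_type1_1k_output
    GROUP BY 1, 2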

Thanks

Jason Chen

Dec 16, 2020, 3:59:35 PM
to pb, maha-users
Hey Pranav,

I created 'lookup.customer_lookup', which holds the dimensions in Maha Druid Lookups (backed by a RocksDB cache). I can run basic queries against the right-hand side, like this:

SELECT *
FROM lookup.customer_lookup

However, I found that a join of the table datasource with the Maha lookup table crashes the Druid broker node. Here is an example:

SELECT *
FROM select_type1_1k_output AS FACT
INNER JOIN lookup.customer_lookup AS DIM ON FACT.orders_customer_id = DIM.k

'select_type1_1k_output' is the table datasource of facts.

I didn't find any error logs on the broker nodes. In the console, or via the curl command, I only get a "Bad Gateway" message. Do you have any idea why the join of the datasource and the Druid lookup table failed?

Thank you if you can share some ideas. If you can share some of the join queries you used with Maha Druid Lookups, it would be really appreciated.
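In case it helps isolate whether the SQL layer or the join itself is at fault, a sketch of the equivalent native join query, assuming Druid's join datasource (available since 0.18) and the default k/v columns of the registered lookup; the scan columns are illustrative:

    {
      "queryType": "scan",
      "dataSource": {
        "type": "join",
        "left": "select_type1_1k_output",
        "right": { "type": "lookup", "lookup": "customer_lookup" },
        "rightPrefix": "dim.",
        "condition": "orders_customer_id == \"dim.k\"",
        "joinType": "INNER"
      },
      "intervals": [ "2010-09-12/2020-09-13" ],
      "columns": [ "orders_customer_id", "dim.v" ],
      "resultFormat": "compactedList"
    }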

Regards,
Jianbin Chen

trine breiting

May 12, 2021, 6:24:25 AM
to maha-users
Did you find out?