Create new datasource out of an existing one

Lax T

Oct 11, 2021, 1:00:12 AM
to Druid User
Hi,
I just started using Druid.
I currently have the following datasource, which is created using Kafka ingestion. I want to create a second datasource from it by extracting a subset of data points from the original, but the task is failing without any details.

Please help me figure out what could be wrong. The total data is less than 200 MB. Thanks

Here is the original datasource:
{
    "spec": {
        "dataSchema": {
            "dataSource": "OrigDataSrc",
            "dimensionsSpec": {
                "dimensions": [
                    "org_id",
                    "user_id",
                    "login_id"
                ]
            },
            "granularitySpec": {
                "queryGranularity": "NONE",
                "rollup": false,
                "segmentGranularity": "HOUR",
                "type": "uniform"
            },
            "timestampSpec": {
                "column": "timestamp",
                "format": "iso"
            }
        },
        "ioConfig": {
            "consumerProperties": {
                "bootstrap.servers": "kafka:9092"
            },
            "inputFormat": {
                "type": "json"
            },
            "topic": "DemDataXfrm.ProbeData",
            "type": "kafka",
            "useEarliestOffset": true
        },
        "tuningConfig": {
            "type": "kafka"
        }
    },
    "type": "kafka"
}




Here is the task JSON:
{
  "type": "index_parallel",
  "spec": {
    "ioConfig": {
      "type": "index_parallel",
      "inputSource": {
        "type": "druid",
        "dataSource": "OrigDataSrc",
        "interval": "2021-10-07/P2D"
      }
    },
    "dataSchema": {
      "dataSource": "DerivedDataSrc1",
      "granularitySpec": {
        "type": "uniform",
        "queryGranularity": "NONE",
        "segmentGranularity": "HOUR",
        "rollup": false
      },
      "timestampSpec": {
        "column": "__time",
        "format": "auto"
      },
      "dimensionsSpec": {
        "dimensions": [
          "org_id",
          "user_id",
          "login_id"
        ]
      }
    }
  }
}

Ben Krug

Oct 11, 2021, 5:58:44 PM
to druid...@googlegroups.com
Without an error message it's hard to say, but maybe it isn't actually failing. I'm not sure about this part:

   "interval": "2021-10-07/P2D"

Can you try

   "interval": "2021-10-07/2021-10-09"

And see if data gets loaded?
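
In case it helps to double-check the dates, here is a rough Python sketch of how a start/period interval can be expanded into explicit start and end dates. The helper name `expand_interval` is made up for illustration and is not part of Druid. It assumes a date-only start and a whole-day period (`PnD`), and it says nothing about whether the period form is what is tripping up the task.

```python
import re
from datetime import date, timedelta


def expand_interval(interval: str) -> str:
    """Rewrite 'START/PnD' as 'START/END' using explicit dates.

    Hypothetical helper: only whole-day periods (PnD) are expanded.
    Any other interval is returned unchanged. A start that is not a
    plain YYYY-MM-DD date raises ValueError.
    """
    start_text, _, end_text = interval.partition("/")
    match = re.fullmatch(r"P(\d+)D", end_text)
    if match is None:
        # Not a whole-day period, so leave the interval as it is.
        return interval
    start = date.fromisoformat(start_text)
    end = start + timedelta(days=int(match.group(1)))
    return f"{start.isoformat()}/{end.isoformat()}"


print(expand_interval("2021-10-07/P2D"))  # prints 2021-10-07/2021-10-09
```

Both forms should describe the same two days, so if the explicit version loads data and the period version does not, that narrows things down.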
