timestampSpec format within spec file


Lunen de Lange

Jun 18, 2014, 8:18:47 AM
to druid-de...@googlegroups.com
Hi guys, last question(s), then I'll be finished with my setup! Sorry for spamming the forum. I have two scenarios.

Scenario 1. The timestamp field comes into the system in a random format, and I need to make sure it ends up in a standard format. I have the following, but it does not seem to format the datetime to a Joda format.
"firehose": {
    "type": "kafka-0.8",
    "consumerProps": {
      ...
      ...
    },
    "feed": "testdata",
    "parser": {
        "timestampSpec": {
            "column": "timestamp",
            "format": "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'"
        },

Scenario 2. No timestamp field will be given to us. How can this be set up to use the current datetime as a default/auto value? I was thinking of doing something like:
"firehose": {
    "type": "kafka-0.8",
    "consumerProps": {
      ...
      ...
    },
    "feed": "testdata",
    "parser": {
        "timestampSpec": {
            "column": "auto",
            "format": "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'"
        },

Thanks guys!

Fangjin Yang

Jun 18, 2014, 7:16:02 PM
to druid-de...@googlegroups.com
Hi Lunen, see inline.


On Wednesday, June 18, 2014 5:18:47 AM UTC-7, Lunen de Lange wrote:
Hi guys, last question(s), then I'll be finished with my setup! Sorry for spamming the forum. I have two scenarios.

Scenario 1. The timestamp field comes into the system in a random format, and I need to make sure it ends up in a standard format. I have the following, but it does not seem to format the datetime to a Joda format.
"firehose": {
    "type": "kafka-0.8",
    "consumerProps": {
      ...
      ...
    },
    "feed": "testdata",
    "parser": {
        "timestampSpec": {
            "column": "timestamp",
            "format": "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'"
        },

If your timestamp is in standard ISO-8601 format, you can set the format to "auto" or "iso" here and Druid will figure out how to parse the timestamp. If the timestamp is in some other format, Druid will try to parse it with Joda's DateTimeFormatter (http://joda-time.sourceforge.net/apidocs/org/joda/time/format/DateTimeFormatter.html), using the pattern you supply. What exception do you see with your format?
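If the incoming format really does vary from event to event, one option is to normalize it upstream so that the "iso" format applies. Here is a rough sketch of that idea in Python; it is not part of Druid. The function name to_iso8601, the CANDIDATE_FORMATS list, and the assumption that inputs are already in UTC are all hypothetical and would need to match your actual feed.

```python
from datetime import datetime, timezone

# Hypothetical list of input formats; replace with the ones your feed produces.
CANDIDATE_FORMATS = [
    "%Y-%m-%dT%H:%M:%S.%fZ",   # already ISO-8601 with fractional seconds
    "%Y-%m-%d %H:%M:%S",       # space-separated date and time
    "%d/%m/%Y %H:%M:%S",       # day/month/year
]


def to_iso8601(raw):
    """Return raw as an ISO-8601 UTC string with millisecond precision."""
    for fmt in CANDIDATE_FORMATS:
        try:
            parsed = datetime.strptime(raw, fmt)
        except ValueError:
            continue  # this format did not match, try the next one
        # Assumption: inputs carry no offset and are already in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
        millis = parsed.microsecond // 1000
        return parsed.strftime("%Y-%m-%dT%H:%M:%S.") + f"{millis:03d}Z"
    raise ValueError(f"unrecognized timestamp format: {raw!r}")
```

Raising on an unrecognized format, rather than guessing, keeps badly formed events from being ingested with a wrong time.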

Scenario 2. No timestamp field will be given to us. How can this be set up to use the current datetime as a default/auto value? I was thinking of doing something like:
"firehose": {
    "type": "kafka-0.8",
    "consumerProps": {
      ...
      ...
    },
    "feed": "testdata",
    "parser": {
        "timestampSpec": {
            "column": "auto",
            "format": "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'"
        },

Druid ingestion requires a timestamp for every event. I believe exceptions will be thrown if no timestamp is provided, so you may have to add code that appends a timestamp in that case. Alternatively, you can add a timestamp during ETL.
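For the ETL route, a minimal sketch in Python might look like the following. It assumes each event is a dict that will later be serialized to JSON for the feed; the function name ensure_timestamp and the default column name "timestamp" are hypothetical. Note that this stamps events with processing time, which may differ from when the event actually occurred.

```python
from datetime import datetime, timezone


def ensure_timestamp(event, column="timestamp", now=None):
    """Return a copy of event with an ISO-8601 UTC timestamp if one is missing."""
    stamped = dict(event)
    if stamped.get(column):
        return stamped  # keep the timestamp the producer supplied
    # 'now' can be injected for testing; default to the current UTC time.
    now = now or datetime.now(timezone.utc)
    millis = now.microsecond // 1000
    stamped[column] = now.strftime("%Y-%m-%dT%H:%M:%S.") + f"{millis:03d}Z"
    return stamped
```

The output matches the yyyy-MM-dd'T'HH:mm:ss.SSS'Z' pattern from the spec above, so the existing timestampSpec could stay as it is.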

Let me know if that helps.

- FJ
 
Thanks guys!

Lunen de Lange

Jun 19, 2014, 5:59:45 AM
to druid-de...@googlegroups.com
Thank you. I'll try to get it sent from ETL in a standard format.