What format should I use when my timestamp data has this format "2018-01-26T15:15:54.000-05:00"?


Christine Li

Nov 6, 2018, 4:03:31 PM
to Druid User
I read the source, and it looks like org/joda/time/format/DateTimeFormat is used for this setting. But the Javadoc gives no detail beyond saying it is similar to java/text/SimpleDateFormat. I tried both of the following formats and neither worked. Can someone help?

"timestampSpec": {
    "column": "TXN_TIMESTAMP",
    "format": "yyyy-MM-dd'T'HH:mm:ss.SSSZ"
}

and

"timestampSpec": {
    "column": "TXN_TIMESTAMP",
    "format": "yyyy-MM-dd'T'HH:mm:ss.SSSZZ"
}

io.druid.java.util.common.parsers.ParseException: Unparseable timestamp found!
	at io.druid.data.input.impl.MapInputRowParser.parseBatch(MapInputRowParser.java:75) ~[druid-api-0.12.3.jar:0.12.3]
	at io.druid.data.input.impl.StringInputRowParser.parseMap(StringInputRowParser.java:165) ~[druid-api-0.12.3.jar:0.12.3]

Surekha Saharan

Nov 6, 2018, 4:18:11 PM
to druid...@googlegroups.com
Hi Christine,

I have not tested this, but maybe try this format: "yyyy-MM-dd'T'HH:mm:ssX"

Thanks,
Surekha

--
You received this message because you are subscribed to the Google Groups "Druid User" group.
To unsubscribe from this group and stop receiving emails from it, send an email to druid-user+...@googlegroups.com.
To post to this group, send email to druid...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/druid-user/2fe3f12a-38c9-465a-983e-5e0a04048b3a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Christine Li

Nov 6, 2018, 10:46:56 PM
to Druid User
Hmm, to get it working, I changed the timestamp field to "TXN_TIMESTAMP": "2018-02-21T12:54:08.000Z" and set "format": "yyyy-MM-dd'T'HH:mm:ss.SSSZ".

It seems to work to some degree, since I can see this in the log:

2018-11-07T03:41:53,932 INFO [task-runner-0-priority-0] io.druid.segment.realtime.appenderator.BaseAppenderatorDriver - New segment[timhortons_2017-07-14T00:00:00.000Z_2017-07-15T00:00:00.000Z_2018-11-07T03:41:49.272Z]......

But the task still failed with the following error. Can anyone help? Thanks.


2018-11-07T03:41:54,020 INFO [appenderator_persist_0] io.druid.segment.realtime.appenderator.AppenderatorImpl - Removing sink for segment[timhortons_2017-12-07T00:00:00.000Z_2017-12-08T00:00:00.000Z_2018-11-07T03:41:49.272Z].
2018-11-07T03:41:54,020 INFO [appenderator_persist_0] io.druid.segment.realtime.appenderator.AppenderatorImpl - Removing sink for segment[timhortons_2017-06-27T00:00:00.000Z_2017-06-28T00:00:00.000Z_2018-11-07T03:41:49.272Z].
2018-11-07T03:41:54,023 INFO [appenderator_persist_0] io.druid.segment.realtime.appenderator.AppenderatorImpl - Removing sink for segment[timhortons_2017-08-04T00:00:00.000Z_2017-08-05T00:00:00.000Z_2018-11-07T03:41:49.272Z].
2018-11-07T03:41:54,025 ERROR [task-runner-0-priority-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[AbstractTask{id='index_timhortons_2018-11-07T03:41:49.269Z', groupId='index_timhortons_2018-11-07T03:41:49.269Z', taskResource=TaskResource{availabilityGroup='index_timhortons_2018-11-07T03:41:49.269Z', requiredCapacity=1}, dataSource='timhortons', context={}}]
io.druid.java.util.common.parsers.ParseException: Unparseable timestamp found!
	at io.druid.data.input.impl.MapInputRowParser.parseBatch(MapInputRowParser.java:75) ~[druid-api-0.12.3.jar:0.12.3]
	at io.druid.data.input.impl.StringInputRowParser.parseMap(StringInputRowParser.java:165) ~[druid-api-0.12.3.jar:0.12.3]
	at io.druid.data.input.impl.StringInputRowParser.parse(StringInputRowParser.java:148) ~[druid-api-0.12.3.jar:0.12.3]
	at io.druid.segment.transform.TransformingStringInputRowParser.parse(TransformingStringInputRowParser.java:57) ~[druid-processing-0.12.3.jar:0.12.3]
	at io.druid.data.input.impl.FileIteratingFirehose.nextRow(FileIteratingFirehose.java:81) ~[druid-api-0.12.3.jar:0.12.3]
	at io.druid.indexing.common.task.IndexTask.generateAndPublishSegments(IndexTask.java:661) ~[druid-indexing-service-0.12.3.jar:0.12.3]
	at io.druid.indexing.common.task.IndexTask.run(IndexTask.java:264) ~[druid-indexing-service-0.12.3.jar:0.12.3]
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:444) [druid-indexing-service-0.12.3.jar:0.12.3]
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:416) [druid-indexing-service-0.12.3.jar:0.12.3]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_171]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_171]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_171]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_171]
Caused by: java.lang.IllegalArgumentException: Invalid format: "TXN_TIMESTAMP"
	at org.joda.time.format.DateTimeFormatter.parseDateTime(DateTimeFormatter.java:945) ~[joda-time-2.9.9.jar:2.9.9]
	at io.druid.java.util.common.DateTimes$UtcFormatter.parse(DateTimes.java:57) ~[java-util-0.12.3.jar:0.12.3]
	at io.druid.java.util.common.parsers.TimestampParser.lambda$createTimestampParser$4(TimestampParser.java:93) ~[java-util-0.12.3.jar:0.12.3]
	at io.druid.java.util.common.parsers.TimestampParser.lambda$createObjectTimestampParser$8(TimestampParser.java:129) ~[java-util-0.12.3.jar:0.12.3]
	at io.druid.data.input.impl.TimestampSpec.parseDateTime(TimestampSpec.java:106) ~[druid-api-0.12.3.jar:0.12.3]
	at io.druid.data.input.impl.TimestampSpec.extractTimestamp(TimestampSpec.java:94) ~[druid-api-0.12.3.jar:0.12.3]
	at io.druid.data.input.impl.MapInputRowParser.parseBatch(MapInputRowParser.java:63) ~[druid-api-0.12.3.jar:0.12.3]
	... 12 more
2018-11-07T03:41:54,032 INFO [task-runner-0-priority-0] io.druid.indexing.overlord.TaskRunnerUtils - Task [index_timhortons_2018-11-07T03:41:49.269Z] status changed to [FAILED].
2018-11-07T03:41:54,036 INFO [task-runner-0-priority-0] io.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
  "id" : "index_timhortons_2018-11-07T03:41:49.269Z",
  "status" : "FAILED",
  "duration" : 481
}

Eyal Yurman

Nov 7, 2018, 10:32:53 AM
to druid...@googlegroups.com
Hi,

The documentation states that the format can be:
iso, millis, posix, auto or any Joda time format.

I would try "iso" as the timestamp format, since your data follows the ISO 8601 standard.

If you specify an explicit pattern instead, it is interpreted using Joda time's own pattern syntax rather than the ISO 8601 standard.
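
As a rough illustration of that suggestion, the spec might look like the fragment below. The column name is taken from the original post, and this has not been tested against the data in question:

```json
"timestampSpec": {
    "column": "TXN_TIMESTAMP",
    "format": "iso"
}
```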





--
Best regards,

Eyal Yurman
+972-54-3056315

Christine Li

Nov 7, 2018, 10:47:03 PM
to Druid User
Thanks, all, for the help. Apparently there was a bad row in my data that caused the parsing error.
Eyal, thanks for pointing out the valid formats. It works now!
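
For anyone hitting the same error: the "Caused by" line in the trace shows the parser was handed the literal string "TXN_TIMESTAMP", so a pre-check of the timestamp values can locate such rows before ingestion. The sketch below is only an assumption about how one might do that. It uses the JDK's java.time rather than Joda time, so it is not the parser Druid uses, and the class and method names (TimestampCheck, isIsoTimestamp, badRows) are hypothetical:

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Druid: flags values that are not
// ISO 8601 date-times with an offset (e.g. -05:00 or Z).
class TimestampCheck {

    // Returns true if the value parses as an ISO 8601 offset date-time.
    static boolean isIsoTimestamp(String value) {
        if (value == null) {
            return false;
        }
        try {
            OffsetDateTime.parse(value); // uses ISO_OFFSET_DATE_TIME by default
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    // Returns the zero-based indexes of values that fail the check.
    static List<Integer> badRows(List<String> values) {
        List<Integer> bad = new ArrayList<>();
        for (int i = 0; i < values.size(); i++) {
            if (!isIsoTimestamp(values.get(i))) {
                bad.add(i);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        // Sample values from this thread, plus the literal that appeared in the stack trace.
        List<String> values = Arrays.asList(
                "2018-01-26T15:15:54.000-05:00",
                "TXN_TIMESTAMP",
                "2018-02-21T12:54:08.000Z");
        System.out.println("Unparseable rows: " + badRows(values));
    }
}
```

A value such as a repeated header line would show up in the returned indexes, which is one way a row like the one in the trace could be found.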