TimestampSpec for fractional (epoch + milliseconds) time


William Cox

Feb 17, 2017, 12:23:53 PM
to Druid User
Hello,

I am experimenting with Druid and I have files where the timestamp is expressed like `seconds.fractional_seconds`. Example: 1487249458.633

How do I specify my TimestampSpec in the Ingestion Spec to deal with this? `auto` yields `Caused by: com.metamx.common.parsers.ParseException: Unparseable timestamp found!` errors in the logs.

Thanks.
-William


Slim Bouguerra

Feb 17, 2017, 2:27:57 PM
to druid...@googlegroups.com
Hi, Druid supports only
`iso`, `millis`, `posix`, or any Joda-Time format.
`auto` means it will try each of the accepted formats listed above.
It seems your timestamp is not in a format Druid can read, so you need to convert it beforehand or plug in your own parser spec (ParseSpec.java).
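For reference, a timestampSpec using one of the built-in formats listed above looks roughly like this (a sketch; `timestamp` is a placeholder column name, not something from William's data):

```json
"timestampSpec": {
  "column": "timestamp",
  "format": "posix"
}
```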

 
-- 

B-Slim
_______/\/\/\_______/\/\/\_______/\/\/\_______/\/\/\_______/\/\/\_______


William Cox

Feb 17, 2017, 2:42:26 PM
to Druid User
I'm trying to understand what `posix` actually does. Looking at GitHub, it seems this should be:
return new DateTime(input.longValue() * 1000);
https://github.com/metamx/java-util/blob/master/src/main/java/com/metamx/common/parsers/TimestampParser.java#L111

which in my case would be DateTime(1487249458.633 * 1000). That seems like it should work, but I still get the following error:

Caused by: com.metamx.common.parsers.ParseException: Unparseable timestamp found!
	at io.druid.data.input.impl.MapInputRowParser.parse(MapInputRowParser.java:72) ~[druid-api-0.9.2.jar:0.9.2]
	at io.druid.data.input.impl.StringInputRowParser.parseMap(StringInputRowParser.java:136) ~[druid-api-0.9.2.jar:0.9.2]
	at io.druid.data.input.impl.StringInputRowParser.parse(StringInputRowParser.java:131) ~[druid-api-0.9.2.jar:0.9.2]
	at io.druid.indexer.HadoopyStringInputRowParser.parse(HadoopyStringInputRowParser.java:48) ~[druid-indexing-hadoop-0.9.2.jar:0.9.2]
	at io.druid.indexer.HadoopDruidIndexerMapper.parseInputRow(HadoopDruidIndexerMapper.java:105) ~[druid-indexing-hadoop-0.9.2.jar:0.9.2]
	at io.druid.indexer.HadoopDruidIndexerMapper.map(HadoopDruidIndexerMapper.java:72) ~[druid-indexing-hadoop-0.9.2.jar:0.9.2]
	at io.druid.indexer.DetermineHashedPartitionsJob$DetermineCardinalityMapper.run(DetermineHashedPartitionsJob.java:285) ~[druid-indexing-hadoop-0.9.2.jar:0.9.2]
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) ~[hadoop-mapreduce-client-core-2.3.0.jar:?]
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340) ~[hadoop-mapreduce-client-core-2.3.0.jar:?]
	at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243) ~[hadoop-mapreduce-client-common-2.3.0.jar:?]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_73]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_73]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_73]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_73]
	at java.lang.Thread.run(Thread.java:745) ~[?:1.8.0_73]
Caused by: java.lang.NumberFormatException: For input string: "1487249458.633"
	at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) ~[?:1.8.0_73]
	at java.lang.Long.parseLong(Long.java:589) ~[?:1.8.0_73]
	at java.lang.Long.parseLong(Long.java:631) ~[?:1.8.0_73]
	at com.metamx.common.parsers.TimestampParser$3.apply(TimestampParser.java:73) ~[java-util-0.27.10.jar:?]
	at com.metamx.common.parsers.TimestampParser$3.apply(TimestampParser.java:68) ~[java-util-0.27.10.jar:?]
	at com.metamx.common.parsers.TimestampParser$9.apply(TimestampParser.java:159) ~[java-util-0.27.10.jar:?]
	at com.metamx.common.parsers.TimestampParser$9.apply(TimestampParser.java:150) ~[java-util-0.27.10.jar:?]
	at io.druid.data.input.impl.TimestampSpec.extractTimestamp(TimestampSpec.java:97) ~[druid-api-0.9.2.jar:0.9.2]
	at io.druid.data.input.impl.MapInputRowParser.parse(MapInputRowParser.java:60) ~[druid-api-0.9.2.jar:0.9.2]
	at io.druid.data.input.impl.StringInputRowParser.parseMap(StringInputRowParser.java:136) ~[druid-api-0.9.2.jar:0.9.2]
	at io.druid.data.input.impl.StringInputRowParser.parse(StringInputRowParser.java:131) ~[druid-api-0.9.2.jar:0.9.2]
	at io.druid.indexer.HadoopyStringInputRowParser.parse(HadoopyStringInputRowParser.java:48) ~[druid-indexing-hadoop-0.9.2.jar:0.9.2]
	at io.druid.indexer.HadoopDruidIndexerMapper.parseInputRow(HadoopDruidIndexerMapper.java:105) ~[druid-indexing-hadoop-0.9.2.jar:0.9.2]
	at io.druid.indexer.HadoopDruidIndexerMapper.map(HadoopDruidIndexerMapper.java:72) ~[druid-indexing-hadoop-0.9.2.jar:0.9.2]
	at io.druid.indexer.DetermineHashedPartitionsJob$DetermineCardinalityMapper.run(DetermineHashedPartitionsJob.java:285) ~[druid-indexing-hadoop-0.9.2.jar:0.9.2]
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) ~[hadoop-mapreduce-client-core-2.3.0.jar:?]
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340) ~[hadoop-mapreduce-client-core-2.3.0.jar:?]
	at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243) ~[hadoop-mapreduce-client-common-2.3.0.jar:?]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_73]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_73]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_73]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_73]
	at java.lang.Thread.run(Thread.java:745) ~[?:1.8.0_73]

Why is posix the wrong choice in this case?
Thanks.
-William

Slim Bouguerra

Feb 17, 2017, 3:48:04 PM
to druid...@googlegroups.com
As you can see, it is trying to parse a long, which fails due to the “.”:
Caused by: java.lang.NumberFormatException: For input string: "1487249458.633"
	at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) ~[?:1.8.0_73]
	at java.lang.Long.parseLong(Long.java:589) ~[?:1.8.0_73]
	at java.lang.Long.parseLong(Long.java:631) ~[?:1.8.0_73]
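The failure can be reproduced outside Druid. A minimal sketch (plain Java, not Druid code) of what the `posix` path effectively does, alongside a double-based workaround for the fractional part:

```java
public class FractionalTimestampDemo {
    public static void main(String[] args) {
        String ts = "1487249458.633"; // seconds.fractional_seconds

        // The posix parser calls Long.parseLong, which rejects the "."
        try {
            Long.parseLong(ts);
        } catch (NumberFormatException e) {
            System.out.println("posix-style parse fails: " + e.getMessage());
        }

        // Parsing as a double first handles the fractional seconds
        long millis = Math.round(Double.parseDouble(ts) * 1000);
        System.out.println(millis); // 1487249458633
    }
}
```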

I thought Druid could read this as a number; I'm not sure why it is considered a string.
Can you please file an issue? We will look at it ASAP.
Also, please upload the job spec and a sample of the data so that we can reproduce/unit-test it.
Thanks



William Cox

Feb 17, 2017, 4:36:54 PM
to Druid User
Thanks B-Slim. Issue submitted here: https://github.com/druid-io/druid/issues/3952

Let me know if I left out any important details.
-William

William Cox

Feb 20, 2017, 9:30:38 AM
to Druid User
It's been a long time since I've done any Java development. 

Is there a tutorial for creating a custom parser spec?
Thanks.
-William

Slim Bouguerra

Feb 20, 2017, 8:51:19 PM
to druid...@googlegroups.com
Maybe the Avro reader: http://druid.io/docs/latest/development/extensions-core/avro.html. It does more than what you need, but it is a good example.


丁凯剑

Feb 20, 2017, 9:34:04 PM
to Druid User
I think you should use `ruby`: https://github.com/metamx/java-util/blob/master/src/main/java/com/metamx/common/parsers/TimestampParser.java#L84
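Assuming `ruby` behaves as the linked parser suggests (seconds with a fractional part), the timestampSpec would look like this sketch (`timestamp` is a placeholder column name):

```json
"timestampSpec": {
  "column": "timestamp",
  "format": "ruby"
}
```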

On Saturday, February 18, 2017, at 1:23:53 AM UTC+8, William Cox wrote:

William Cox

Feb 23, 2017, 3:43:31 PM
to druid...@googlegroups.com
Hello kaijian,

It appears that `ruby` does work as a timestampSpec! That's awesome and will save me a lot of time.
Thanks.
-William
