Null timestamp in input but the timestamp is there


Elysia

Mar 29, 2018, 6:37:25 PM
to Druid User
Hi all!
I'm getting a ParseException while trying to ingest Kafka events that look like:

 {"timestamp":"2018-03-29T23:31:26.077Z","idRequest":3753192267374775741,"codAggregator":"DM","result":"OK","stackTrace":"LVS-5345"}

The exception clearly shows that the timestamp field has not been sent to Tranquility Kafka, but why? The timestamp is well-formed JSON and has already been processed successfully by Kafka.
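As a sanity check, the timestamp in the sample event above does parse as valid ISO 8601 (a quick Python sketch, not part of the ingestion pipeline; the event literal is copied from the message):

```python
import json
from datetime import datetime

event = ('{"timestamp":"2018-03-29T23:31:26.077Z","idRequest":3753192267374775741,'
         '"codAggregator":"DM","result":"OK","stackTrace":"LVS-5345"}')
row = json.loads(event)

# fromisoformat() only accepts a trailing "Z" on Python 3.11+,
# so normalize it to an explicit UTC offset first.
ts = datetime.fromisoformat(row["timestamp"].replace("Z", "+00:00"))
print(ts.isoformat())
```

So the sample row itself is fine; the failing row in the stack trace below is a different event.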

I'm using Druid 0.12.0 and Tranquility 0.8.0, the exception is:

com.metamx.common.parsers.ParseException: Unparseable timestamp found!
    at io.druid.data.input.impl.MapInputRowParser.parse(MapInputRowParser.java:72)
    at io.druid.data.input.impl.StringInputRowParser.parseMap(StringInputRowParser.java:136)
    at io.druid.data.input.impl.StringInputRowParser.parse(StringInputRowParser.java:74)
    at io.druid.data.input.impl.StringInputRowParser.parse(StringInputRowParser.java:37)
    at com.metamx.tranquility.druid.DruidBeams$$anonfun$1$$anonfun$7.apply(DruidBeams.scala:177)
    at com.metamx.tranquility.druid.DruidBeams$$anonfun$1$$anonfun$7.apply(DruidBeams.scala:177)
    at com.metamx.tranquility.druid.DruidBeams$$anonfun$1$$anonfun$apply$1.apply(DruidBeams.scala:195)
    at com.metamx.tranquility.druid.DruidBeams$$anonfun$1$$anonfun$apply$1.apply(DruidBeams.scala:195)
    at com.metamx.tranquility.beam.TransformingBeam$$anonfun$sendAll$2$$anonfun$2.apply(TransformingBeam.scala:36)
    at com.twitter.util.Try$.apply(Try.scala:13)
    at com.metamx.tranquility.beam.TransformingBeam$$anonfun$sendAll$2.apply(TransformingBeam.scala:36)
    at com.metamx.tranquility.beam.TransformingBeam$$anonfun$sendAll$2.apply(TransformingBeam.scala:35)
    at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:778)
    at scala.collection.Iterator$class.foreach(Iterator.scala:742)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1194)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:777)
    at com.metamx.tranquility.beam.TransformingBeam.sendAll(TransformingBeam.scala:35)
    at com.metamx.tranquility.tranquilizer.Tranquilizer.com$metamx$tranquility$tranquilizer$Tranquilizer$$sendBuffer(Tranquilizer.scala:301)
    at com.metamx.tranquility.tranquilizer.Tranquilizer$$anonfun$send$1.apply(Tranquilizer.scala:202)
    at com.metamx.tranquility.tranquilizer.Tranquilizer$$anonfun$send$1.apply(Tranquilizer.scala:202)
    at scala.Option.foreach(Option.scala:257)
    at com.metamx.tranquility.tranquilizer.Tranquilizer.send(Tranquilizer.scala:202)
    at com.metamx.tranquility.kafka.writer.TranquilityEventWriter.send(TranquilityEventWriter.java:76)
    at com.metamx.tranquility.kafka.KafkaConsumer$2.run(KafkaConsumer.java:231)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException: Null timestamp in input: {idRequest=8643556674707087331, codAggregator=DM, result=KO, stackTrace=Dummy...
    at io.druid.data.input.impl.MapInputRowParser.parse(MapInputRowParser.java:63)
    ... 30 more

and the .json configuration file looks like:

{
  "dataSources" : {
    "aggregator-kafka" : {
      "spec" : {
        "dataSchema" : {
          "dataSource" : "aggregator-kafka",
          "parser" : {
            "type" : "string",
            "parseSpec" : {
              "timestampSpec" : {"format" : "auto", "column" : "timestamp"},
              "dimensionsSpec" : {
                "dimensions" : ["timestamp","codAggregator", "result", "stackTrace"],
                "dimensionExclusions" : [
                  "idRequest"
                ]
              },
              "format" : "json"
            }
          },
          "granularitySpec" : {
            "type" : "uniform",
            "segmentGranularity" : "minute",
            "queryGranularity" : "none"
          },
          "metricsSpec" : [
            {
              "type" : "count",
              "name" : "count"
            }
          ]
        },
        "ioConfig" : {
          "type" : "realtime"
        },
        "tuningConfig" : {
          "type" : "realtime",
          "maxRowsInMemory" : "100000",
          "intermediatePersistPeriod" : "PT10M",
          "windowPeriod" : "PT10M"
        }
      },
      "properties" : {
        "task.partitions" : "1",
        "task.replicants" : "1",
        "topicPattern" : "aggregator-out"
      }
    }
  },
  "properties" : {
    "zookeeper.connect" : "localhost",
    "druid.discovery.curator.path" : "/druid/discovery",
    "druid.selectors.indexing.serviceName" : "druid/overlord",
    "commit.periodMillis" : "15000",
    "consumer.numThreads" : "2",
    "kafka.zookeeper.connect" : "localhost",
    "kafka.group.id" : "tranquility-kafka"
  }
}


Thanks in advance!

E

Slim Bouguerra

Mar 29, 2018, 7:04:44 PM
to druid...@googlegroups.com
can you make sure that the Druid JVM is running with UTC timezone ? -Duser.timezone=UTC
-- 

B-Slim
_______/\/\/\_______/\/\/\_______/\/\/\_______/\/\/\_______/\/\/\_______


Elysia

Mar 30, 2018, 4:23:50 AM
to Druid User
I'm using the default /conf-quickstart configuration shipped with the distribution, and in the arguments of every JVM I can see:

-Duser.timezone=UTC


Elysia

Apr 5, 2018, 5:11:51 PM
to Druid User
To narrow down the problem, I first tried injecting events directly through:

https://github.com/acesinc/json-data-generator

my configuration file for json-data-generator looks like:

      "type": "tranquility",
      "zookeeper.host": "localhost",
      "zookeeper.port": 2181,
      "overlord.name":"overlord",
      "firehose.pattern":"druid:firehose:%s",
      "discovery.path":"/druid/discovery",
      "datasource.name":"aggregatest",
      "timestamp.name":"eventTimestamp",
      "sync": true       

while overlord starts with:

java `cat conf-quickstart/druid/overlord/jvm.config | xargs` -cp "conf-quickstart/druid/_common:conf-quickstart/druid/overlord:lib/*" io.druid.cli.Main server overlord

But I'm not able to inject anything, as I'm getting:

2018-04-05 22:49:36,312 ERROR n.a.d.j.g.l.TranquilityLogger [Thread-1] Error sending event to Druid
java.lang.IllegalStateException: Failed to save new beam for identifier[overlord/aggregatest] timestamp[2018-04-05T22:00:00.000+02:00]
    at com.metamx.tranquility.beam.ClusteredBeam$$anonfun$2.applyOrElse(ClusteredBeam.scala:264) ~[tranquility_2.10-0.4.2.jar:0.4.2]
    at com.metamx.tranquility.beam.ClusteredBeam$$anonfun$2.applyOrElse(ClusteredBeam.scala:261) ~[tranquility_2.10-0.4.2.jar:0.4.2]
    at com.twitter.util.Future$$anonfun$rescue$1.apply(Future.scala:843) ~[util-core_2.10-6.23.0.jar:6.23.0]
    at com.twitter.util.Future$$anonfun$rescue$1.apply(Future.scala:841) ~[util-core_2.10-6.23.0.jar:6.23.0]
    at com.twitter.util.Promise$Transformer.liftedTree1$1(Promise.scala:100) ~[util-core_2.10-6.23.0.jar:6.23.0]
    at com.twitter.util.Promise$Transformer.k(Promise.scala:100) ~[util-core_2.10-6.23.0.jar:6.23.0]
    at com.twitter.util.Promise$Transformer.apply(Promise.scala:110) ~[util-core_2.10-6.23.0.jar:6.23.0]
    at com.twitter.util.Promise$Transformer.apply(Promise.scala:91) ~[util-core_2.10-6.23.0.jar:6.23.0]
    at com.twitter.util.Promise$$anon$2.run(Promise.scala:345) ~[util-core_2.10-6.23.0.jar:6.23.0]
    at com.twitter.concurrent.LocalScheduler$Activation.run(Scheduler.scala:186) ~[util-core_2.10-6.23.0.jar:6.23.0]
    at com.twitter.concurrent.LocalScheduler$Activation.submit(Scheduler.scala:157) ~[util-core_2.10-6.23.0.jar:6.23.0]
    at com.twitter.concurrent.LocalScheduler.submit(Scheduler.scala:212) ~[util-core_2.10-6.23.0.jar:6.23.0]
    at com.twitter.concurrent.Scheduler$.submit(Scheduler.scala:86) ~[util-core_2.10-6.23.0.jar:6.23.0]
    at com.twitter.util.Promise.runq(Promise.scala:331) ~[util-core_2.10-6.23.0.jar:6.23.0]
    at com.twitter.util.Promise.updateIfEmpty(Promise.scala:642) ~[util-core_2.10-6.23.0.jar:6.23.0]
    at com.twitter.util.ExecutorServiceFuturePool$$anon$2.run(FuturePool.scala:112) ~[util-core_2.10-6.23.0.jar:6.23.0]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_20]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_20]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_20]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_20]
    at java.lang.Thread.run(Thread.java:745) [?:1.8.0_20]
Caused by: com.twitter.finagle.NoBrokersAvailableException: No hosts are available for overlord
    at com.twitter.finagle.NoStacktrace(Unknown Source) ~[?:?]

Thanks in advance...

Gaurav Shah

Apr 5, 2018, 9:25:11 PM
to Druid User
You should probably check the logs during Druid startup; one of the services is failing to start.

Elysia

Apr 11, 2018, 3:50:00 PM
to Druid User
I have found that all the jvm.config files under the conf-quickstart/druid/ folders (broker, coordinator, historical, middleManager, overlord) had a wrong -Duser.timezone. I have now set all of them to the proper value. To make a simpler test, I avoided feeding Druid via Kafka and just used json-data-generator-1.3.0, whose config.json looks like:

...
"producers": [
    {

      "type": "tranquility",
      "zookeeper.host": "localhost",
      "zookeeper.port": 2181,
      "overlord.name":"overlord",
      "firehose.pattern":"druid:firehose:%s",
      "discovery.path":"/druid/discovery",
      "datasource.name":"aggregatest",
      "timestamp.name":"eventTimestamp",
      "sync": true       
    },

I started the JSON generator to feed Druid, but I still can't see anything in the datasources of the Druid console, nor any evident exceptions in the logs.

Where should I start to understand where the problem is and why the ingestion of events doesn't work?

Thanks in advance...



Jonathan Wei

Apr 11, 2018, 4:22:13 PM
to druid...@googlegroups.com
I would recommend double-checking your input data; it's possible that some rows have missing or malformed timestamps.

e.g.,
`Caused by: java.lang.NullPointerException: Null timestamp in input: {idRequest=8643556674707087331, codAggregator=DM, result=KO, stackTrace=Dummy...`

I would look for the specific row in your input that has those dimension values.
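That search can be scripted; here is a sketch, assuming the topic has been dumped to newline-delimited JSON (the helper name and sample rows are hypothetical):

```python
import json

def find_bad_rows(lines):
    """Flag rows whose "timestamp" field is missing, null, or empty."""
    bad = []
    for n, line in enumerate(lines, 1):
        try:
            row = json.loads(line)
        except json.JSONDecodeError:
            bad.append((n, "invalid JSON"))
            continue
        if not row.get("timestamp"):
            bad.append((n, "missing/null/empty timestamp"))
    return bad

sample = [
    '{"timestamp":"2018-03-29T23:31:26.077Z","result":"OK"}',
    '{"idRequest":8643556674707087331,"result":"KO"}',  # no timestamp field at all
]
print(find_bad_rows(sample))  # -> [(2, 'missing/null/empty timestamp')]
```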

-----------------------------

If you want to validate your input data or ingestion schema, it may be simpler/more easily repeatable to use a batch ingestion task on a fixed input set. 


Be sure to set `reportParseExceptions` to true in the `tuningConfig`.
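In the tuningConfig from the spec posted earlier, that would look something like this (a sketch; the other settings are kept as posted):

```json
"tuningConfig" : {
  "type" : "realtime",
  "maxRowsInMemory" : "100000",
  "intermediatePersistPeriod" : "PT10M",
  "windowPeriod" : "PT10M",
  "reportParseExceptions" : true
}
```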



Jonathan Wei

Apr 11, 2018, 4:38:45 PM
to druid...@googlegroups.com
> I started the JSON generator to feed Druid, but I still can't see anything in the datasources of the Druid console, nor any evident exceptions in the logs. Where should I start to understand where the problem is and why the ingestion of events doesn't work?

I would start by checking the Overlord console in your browser (port 8090 by default) to verify that a task was submitted, ran, and successfully completed.

Elysia

Apr 22, 2018, 4:37:20 PM
to Druid User
Hi all!

I have checked the logs and I don't find any such exceptions:

e.g.,
`Caused by: java.lang.NullPointerException: Null timestamp in input: {idRequest=8643556674707087331, codAggregator=DM, result=KO, stackTrace=Dummy...`

Instead I have found that in the Druid coordinator console (http://localhost:8090/console.html) there is an entry even from before I started sending events with the json-data-generator. There is a single row that looks like:

worker scheme         => http
worker host           => myHostName:8091
worker ip             => myHostName
worker capacity       => 3
worker version        => 0
currCapacityUsed      => 0
availabilityGroups    => []
runningTasks          => []
lastCompletedTaskTime => 2018-04-22T20:10:50.414Z
blacklistedUntil      => null

The task time, however, suggests that the coordinator has not read the jvm.config file under druid-0.12.0/conf-quickstart/druid/coordinator, where it is set as:

-Duser.timezone=Europe/Rome

Any ideas?

TIA!




Aditya Patil

May 30, 2018, 10:37:48 AM
to Druid User
I am having the exact same issue. Did anyone fix or resolve this error?

Sagar Navgire

Oct 25, 2018, 6:25:01 PM
to Druid User

Check that your dimension names and timestamp column name match the fields in your input exactly.

Sairam Asapu

Oct 29, 2018, 8:34:04 AM
to Druid User
I faced a similar issue while ingesting a CSV file; the timestamp problem can be caused by a BOM (byte order mark) at the start of the file.

In my case the timestamp was the first column, so the BOM got prepended to the timestamp field.

Try removing the BOM from your file: I opened the CSV in Notepad++ and changed the encoding from UTF-8-BOM to UTF-8, and the issue got resolved.

Hope this helps!
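If you'd rather fix it programmatically, one way (a sketch; file I/O omitted) is to decode with Python's "utf-8-sig" codec, which silently drops a leading BOM when present:

```python
def strip_bom(data: bytes) -> bytes:
    # "utf-8-sig" decodes UTF-8 and removes a leading BOM if there is one,
    # so re-encoding yields plain UTF-8 without the BOM.
    return data.decode("utf-8-sig").encode("utf-8")

with_bom = b"\xef\xbb\xbftimestamp,value\n2018-03-29T23:31:26.077Z,1\n"
cleaned = strip_bom(with_bom)
print(cleaned[:9])  # -> b'timestamp'
```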
 