realtime ingestion tsv error

Rui Wang

Nov 18, 2013, 8:12:06 PM
to druid-de...@googlegroups.com
Hi,

After going through the realtime example on the wiki, I'm trying to get realtime ingestion working on TSV data. However, it keeps failing on startup.

I guess my syntax is wrong? Has anyone seen this before?

Here is my realtime.spec:

[{
  "schema" : { "dataSource":"real-1",
               "aggregators":[ {"type":"count", "name":"impressions"} ],
               "indexGranularity":"hour",
               "shardSpec" : { "type": "none" } },
  "config" : { "maxRowsInMemory" : 500000,
               "intermediatePersistPeriod" : "PT10m" },
  "firehose" : { "type" : "kafka-0.7.2",
                 "consumerProps" : { "zk.connect" : "caaa-l13:2181",
                                     "zk.connectiontimeout.ms" : "15000",
                                     "zk.sessiontimeout.ms" : "15000",
                                     "zk.synctime.ms" : "5000",
                                     "groupid" : "topic-pixel-local",
                                     "fetch.size" : "1048586",
                                     "autooffset.reset" : "largest",
                                     "autocommit.enable" : "false" },
                 "feed" : "real-1",
                 "parser" : { "timestampSpec" : { "column" : "ts", "format" : "iso" },
                              "data" : { "format" : "tsv" },
                              "columns": ["ts", "adtype", "mkt_op", "xpi", "pai", "psi", "padu", "api", "acc", "aline", "icrid", "size", "ctype" ],
                              "dimensions": [ "adtype", "mkt_op", "xpi", "pai", "psi", "padu", "api", "acc", "aline", "icrid", "size", "ctype" ]
                             },
  "plumber" : { "type" : "realtime",
                "windowPeriod" : "PT10m",
                "segmentGranularity":"hour",
                "basePersistDirectory" : "/var/druid/realtime/basePersist",
                "rejectionPolicy": {"type": "messageTime"} }
}]

The error message when I start the realtime daemon is:

2013-11-19 01:08:26,746 INFO [main] com.metamx.druid.realtime.RealtimeMain - Throwable caught at startup, committing seppuku
java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
...  
Caused by: java.lang.RuntimeException: com.fasterxml.jackson.databind.JsonMappingException: Instantiation of [simple type, class com.metamx.druid.indexer.data.DelimitedDataSpec] value failed: null
        at com.google.common.base.Throwables.propagate(Throwables.java:160)
        at com.metamx.druid.realtime.RealtimeNode.initializeFireDepartments(RealtimeNode.java:219)
...
Caused by: java.lang.NullPointerException
        at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:191)
        at com.metamx.druid.indexer.data.DelimitedDataSpec.<init>(DelimitedDataSpec.java:49)

I checked the code, and it seems a delimiter needs to be defined? However, I don't see this anywhere in the documentation. Could someone help? :-)

Thanks!
Rui

Fangjin Yang

Nov 18, 2013, 9:40:13 PM
to druid-de...@googlegroups.com
Hi Rui,

Your schema is malformed. The JSON blob is not valid (you are missing an ending '}'), and also your data spec is not correct. You should have:
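
As a side note on the first point only (the missing '}'): a spec file can be checked for JSON validity before the daemon is started. The following is a minimal sketch using Python's standard `json` module, not the corrected spec itself. `BROKEN` and `FIXED` are hypothetical, heavily abbreviated reductions of the spec above, and passing this check says nothing about whether Druid accepts the data spec.

```python
import json


def json_error(text):
    """Return None if text is valid JSON, else a short description of the first error."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return None


# Hypothetical reduction of the spec: the "firehose" object is never closed,
# so "plumber" ends up nested inside it and the outer object has no closing '}'.
BROKEN = """[{
  "firehose" : { "type" : "kafka-0.7.2",
                 "parser" : { "data" : { "format" : "tsv" } },
  "plumber" : { "type" : "realtime" }
}]"""

# Same reduction with one extra '}' after the parser block to close "firehose".
FIXED = """[{
  "firehose" : { "type" : "kafka-0.7.2",
                 "parser" : { "data" : { "format" : "tsv" } } },
  "plumber" : { "type" : "realtime" }
}]"""

if __name__ == "__main__":
    print("broken:", json_error(BROKEN))
    print("fixed: ", json_error(FIXED))
```

The parser reports the error where it first notices the imbalance (near the final `]`), not at the line where the brace is actually missing, so the reported position is only a starting point for the search.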
