Index Service for local CSV file

sc...@switzer.org

May 20, 2015, 5:18:00 PM
to druid...@googlegroups.com
I am trying to load data from a CSV file on my local filesystem so that I can test different data schema formats. I am having trouble configuring Druid to accept this file as a data source. I get errors that make me think there is something wrong with my spec file, which is here:

{
  "type" : "index",
  "spec" : {
    "dataSchema" : {
      "dataSource" : "authenticated",
      "parser" : {
        "type" : "string",
        "parseSpec" : {
          "format" : "csv",
          "timestampSpec" : {
            "column" : "minute",
            "format" : "auto"
          },
          "columns" : ["minute","cli_id","domain_est","xurl_domain"],
          "dimensionsSpec" : {
            "dimensions" : ["cli_id","domain_est","xurl_domain"],
            "dimensionExclusions" : [],
            "spatialDimensions" : []
          }
        }
      },
      "metricsSpec" : [
        {
          "type" : "count",
          "name" : "count"
        },
        {
          "type" : "doubleSum",
          "name" : "measured",
          "fieldName" : "measured"
        },
        {
          "type" : "doubleSum",
          "name" : "matched",
          "fieldName" : "matched"
        }
      ],
      "granularitySpec" : {
        "type" : "uniform",
        "segmentGranularity" : "DAY",
        "queryGranularity" : "NONE",
        "intervals" : [ "2013-08-31/2015-09-01" ]
      }
    },
    "ioConfig" : {
      "type" : "index",
      "firehose" : {
        "type" : "local",
        "baseDir" : "examples/authenticated/",
        "filter" : "ads_data.csv"
      }
    },
    "tuningConfig" : {
      "type" : "index",
      "targetPartitionSize" : 0,
      "rowFlushBoundary" : 0
    }
  }
}

The errors that I see:

Can not deserialize instance of java.util.ArrayList out of START_OBJECT token


When I encapsulate the JSON object in an array, then I get the following error:


1) Error injecting constructor, java.lang.RuntimeException: com.fasterxml.jackson.databind.JsonMappingException: Instantiation of [simple type, class io.druid.segment.realtime.FireDepartment] value failed: dataSchema (through reference chain: java.util.ArrayList[0])

  at io.druid.guice.FireDepartmentsProvider.<init>(FireDepartmentsProvider.java:41)

  while locating io.druid.guice.FireDepartmentsProvider

  at io.druid.guice.RealtimeModule.configure(RealtimeModule.java:79)

  while locating java.util.List<io.druid.segment.realtime.FireDepartment>

    for parameter 0 at io.druid.segment.realtime.RealtimeManager.<init>(RealtimeManager.java:78)

  while locating io.druid.segment.realtime.RealtimeManager

  at io.druid.guice.RealtimeModule.configure(RealtimeModule.java:83)

  while locating io.druid.query.QuerySegmentWalker

    for parameter 3 at io.druid.server.QueryResource.<init>(QueryResource.java:89)

  while locating io.druid.server.QueryResource


Any advice on how to get a working spec file is appreciated.


Himanshu

May 20, 2015, 5:36:56 PM
to sc...@switzer.org, druid...@googlegroups.com
What is the full stack trace (in the overlord log) that you get in the first case? You don't need to put your top-level JSON object into an array.

-- Himanshu

--
You received this message because you are subscribed to the Google Groups "Druid User" group.
To unsubscribe from this group and stop receiving emails from it, send an email to druid-user+...@googlegroups.com.
To post to this group, send email to druid...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/druid-user/d34d6e3d-a038-49b7-86d8-765f68e2ea1b%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Himanshu

May 20, 2015, 5:47:02 PM
to Scott Switzer, druid...@googlegroups.com
OK, I see now. You are trying to set up a standalone realtime node, which uses a different JSON format. See http://druid.io/docs/0.7.1.1/Realtime-ingestion.html#realtime-node-ingestion and correct the spec accordingly.

The JSON you've used is in the format of an index task ("type" : "index"), which is submitted to the overlord rather than loaded by a realtime node.

-- Himanshu
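
A note on why both attempts fail, going by the traces in this thread: the realtime node binds the spec file to java.util.List<io.druid.segment.realtime.FireDepartment>, so a top-level JSON object is rejected outright, and wrapping the index task in an array then appears to fail because each array element is expected to carry "dataSchema" at its own top level rather than nested under "spec". The Python sketch below only illustrates that shape check. `classify_spec` and the labels it returns are hypothetical names, not part of Druid, and the check looks at the top-level JSON shape only.

```python
import json


def classify_spec(text):
    """Guess which Druid consumer a spec is shaped for (top-level shape only)."""
    doc = json.loads(text)
    if isinstance(doc, list):
        # Realtime node spec file: a JSON array whose elements hold
        # "dataSchema" directly (alongside "ioConfig" and "tuningConfig").
        bad = [i for i, item in enumerate(doc)
               if not (isinstance(item, dict) and "dataSchema" in item)]
        return "realtime-node-spec" if not bad else "array-of-wrong-shape"
    if isinstance(doc, dict) and "type" in doc and "spec" in doc:
        # Indexing service task: a single object with "type" and "spec".
        return "indexing-service-task"
    return "unknown"
```

Run against the spec posted at the top of the thread, this would report an indexing service task; the same spec wrapped in [ ] would be reported as an array of the wrong shape, which matches the two errors seen.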

On Wed, May 20, 2015 at 4:42 PM, Scott Switzer <sc...@switzer.org> wrote:
Here is the full output after the error:

2015-05-20T21:41:20,905 INFO [main] io.druid.guice.JsonConfigurator - Loaded class[interface io.druid.server.log.RequestLoggerProvider] from props[druid.request.logging.] as [io.druid.server.log.NoopRequestLoggerProvider@10823d72]

2015-05-20T21:41:20,931 ERROR [main] io.druid.cli.CliBroker - Error when starting up.  Failing.

com.google.inject.ProvisionException: Guice provision errors:


1) Error injecting constructor, java.lang.RuntimeException: com.fasterxml.jackson.databind.JsonMappingException: Can not deserialize instance of java.util.ArrayList out of START_OBJECT token

 at [Source: /Users/scott/druid/druid-0.8.0-SNAPSHOT/examples/authenticated/authenticated_realtime.spec; line: 1, column: 1]

  at io.druid.guice.FireDepartmentsProvider.<init>(FireDepartmentsProvider.java:41)

  while locating io.druid.guice.FireDepartmentsProvider

  at io.druid.guice.RealtimeModule.configure(RealtimeModule.java:79)

  while locating java.util.List<io.druid.segment.realtime.FireDepartment>

    for parameter 0 at io.druid.segment.realtime.RealtimeManager.<init>(RealtimeManager.java:78)

  while locating io.druid.segment.realtime.RealtimeManager

  at io.druid.guice.RealtimeModule.configure(RealtimeModule.java:83)

  while locating io.druid.query.QuerySegmentWalker

    for parameter 3 at io.druid.server.QueryResource.<init>(QueryResource.java:89)

  while locating io.druid.server.QueryResource


1 error

at com.google.inject.internal.InjectorImpl$3.get(InjectorImpl.java:1014) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.InjectorImpl.getInstance(InjectorImpl.java:1036) ~[guice-4.0-beta.jar:?]

at io.druid.guice.LifecycleModule$2.start(LifecycleModule.java:134) ~[druid-api-0.3.8.jar:0.8.0-SNAPSHOT]

at io.druid.cli.GuiceRunnable.initLifecycle(GuiceRunnable.java:71) [druid-services-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]

at io.druid.cli.ServerRunnable.run(ServerRunnable.java:38) [druid-services-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]

at io.druid.cli.Main.main(Main.java:88) [druid-services-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]

Caused by: java.lang.RuntimeException: com.fasterxml.jackson.databind.JsonMappingException: Can not deserialize instance of java.util.ArrayList out of START_OBJECT token

 at [Source: /Users/scott/druid/druid-0.8.0-SNAPSHOT/examples/authenticated/authenticated_realtime.spec; line: 1, column: 1]

at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.1.jar:?]

at io.druid.guice.FireDepartmentsProvider.<init>(FireDepartmentsProvider.java:52) ~[druid-server-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]

at io.druid.guice.FireDepartmentsProvider$$FastClassByGuice$$229da177.newInstance(<generated>) ~[guice-4.0-beta.jar:0.8.0-SNAPSHOT]

at com.google.inject.internal.cglib.reflect.$FastConstructor.newInstance(FastConstructor.java:40) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.DefaultConstructionProxyFactory$1.newInstance(DefaultConstructionProxyFactory.java:61) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.ConstructorInjector.provision(ConstructorInjector.java:108) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:88) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:269) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.BoundProviderFactory.get(BoundProviderFactory.java:62) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.ProviderToInternalFactoryAdapter$1.call(ProviderToInternalFactoryAdapter.java:46) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.InjectorImpl.callInContext(InjectorImpl.java:1058) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:40) ~[guice-4.0-beta.jar:?]

at com.google.inject.Scopes$1$1.get(Scopes.java:65) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:41) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.SingleParameterInjector.inject(SingleParameterInjector.java:38) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.SingleParameterInjector.getAll(SingleParameterInjector.java:62) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.ConstructorInjector.provision(ConstructorInjector.java:107) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:88) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:269) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.FactoryProxy.get(FactoryProxy.java:56) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.ProviderToInternalFactoryAdapter$1.call(ProviderToInternalFactoryAdapter.java:46) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.InjectorImpl.callInContext(InjectorImpl.java:1058) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:40) ~[guice-4.0-beta.jar:?]

at io.druid.guice.LifecycleScope$1.get(LifecycleScope.java:49) ~[druid-api-0.3.8.jar:0.8.0-SNAPSHOT]

at com.google.inject.internal.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:41) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.SingleParameterInjector.inject(SingleParameterInjector.java:38) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.SingleParameterInjector.getAll(SingleParameterInjector.java:62) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.ConstructorInjector.provision(ConstructorInjector.java:107) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:88) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:269) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.InjectorImpl$3$1.call(InjectorImpl.java:1005) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.InjectorImpl.callInContext(InjectorImpl.java:1051) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.InjectorImpl$3.get(InjectorImpl.java:1001) ~[guice-4.0-beta.jar:?]

... 5 more

Caused by: com.fasterxml.jackson.databind.JsonMappingException: Can not deserialize instance of java.util.ArrayList out of START_OBJECT token

 at [Source: /Users/scott/druid/druid-0.8.0-SNAPSHOT/examples/authenticated/authenticated_realtime.spec; line: 1, column: 1]

at com.fasterxml.jackson.databind.JsonMappingException.from(JsonMappingException.java:148) ~[jackson-databind-2.4.4.jar:2.4.4]

at com.fasterxml.jackson.databind.DeserializationContext.mappingException(DeserializationContext.java:762) ~[jackson-databind-2.4.4.jar:2.4.4]

at com.fasterxml.jackson.databind.DeserializationContext.mappingException(DeserializationContext.java:758) ~[jackson-databind-2.4.4.jar:2.4.4]

at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.handleNonArray(CollectionDeserializer.java:275) ~[jackson-databind-2.4.4.jar:2.4.4]

at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.deserialize(CollectionDeserializer.java:216) ~[jackson-databind-2.4.4.jar:2.4.4]

at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.deserialize(CollectionDeserializer.java:206) ~[jackson-databind-2.4.4.jar:2.4.4]

at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.deserialize(CollectionDeserializer.java:25) ~[jackson-databind-2.4.4.jar:2.4.4]

at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:3066) ~[jackson-databind-2.4.4.jar:2.4.4]

at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2122) ~[jackson-databind-2.4.4.jar:2.4.4]

at io.druid.guice.FireDepartmentsProvider.<init>(FireDepartmentsProvider.java:44) ~[druid-server-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]

at io.druid.guice.FireDepartmentsProvider$$FastClassByGuice$$229da177.newInstance(<generated>) ~[guice-4.0-beta.jar:0.8.0-SNAPSHOT]

at com.google.inject.internal.cglib.reflect.$FastConstructor.newInstance(FastConstructor.java:40) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.DefaultConstructionProxyFactory$1.newInstance(DefaultConstructionProxyFactory.java:61) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.ConstructorInjector.provision(ConstructorInjector.java:108) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:88) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:269) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.BoundProviderFactory.get(BoundProviderFactory.java:62) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.ProviderToInternalFactoryAdapter$1.call(ProviderToInternalFactoryAdapter.java:46) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.InjectorImpl.callInContext(InjectorImpl.java:1058) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:40) ~[guice-4.0-beta.jar:?]

at com.google.inject.Scopes$1$1.get(Scopes.java:65) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:41) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.SingleParameterInjector.inject(SingleParameterInjector.java:38) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.SingleParameterInjector.getAll(SingleParameterInjector.java:62) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.ConstructorInjector.provision(ConstructorInjector.java:107) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:88) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:269) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.FactoryProxy.get(FactoryProxy.java:56) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.ProviderToInternalFactoryAdapter$1.call(ProviderToInternalFactoryAdapter.java:46) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.InjectorImpl.callInContext(InjectorImpl.java:1058) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:40) ~[guice-4.0-beta.jar:?]

at io.druid.guice.LifecycleScope$1.get(LifecycleScope.java:49) ~[druid-api-0.3.8.jar:0.8.0-SNAPSHOT]

at com.google.inject.internal.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:41) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.SingleParameterInjector.inject(SingleParameterInjector.java:38) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.SingleParameterInjector.getAll(SingleParameterInjector.java:62) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.ConstructorInjector.provision(ConstructorInjector.java:107) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:88) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:269) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.InjectorImpl$3$1.call(InjectorImpl.java:1005) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.InjectorImpl.callInContext(InjectorImpl.java:1051) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.InjectorImpl$3.get(InjectorImpl.java:1001) ~[guice-4.0-beta.jar:?]

... 5 more


Scott Switzer

May 20, 2015, 6:46:20 PM
to Himanshu, druid...@googlegroups.com
Thanks.  This helped me get the server started.  I am running into another issue which I will ask in a new thread.

Manvendra singh tomar

Jun 5, 2015, 11:45:32 AM
to druid...@googlegroups.com
I am trying to do the same thing, loading data from a local CSV file, and I have stumbled upon the same error. Please let me know how you got it working; I would highly appreciate it.

Error Log :

1) Error injecting constructor, java.lang.RuntimeException: com.fasterxml.jackson.databind.JsonMappingException: Instantiation of [simple type, class io.druid.segment.realtime.FireDepartment] value failed: dataSchema (through reference chain: java.util.ArrayList[0])

  at io.druid.guice.FireDepartmentsProvider.<init>(FireDepartmentsProvider.java:41)

  while locating io.druid.guice.FireDepartmentsProvider

  at io.druid.guice.RealtimeModule.configure(RealtimeModule.java:79)

  while locating java.util.List<io.druid.segment.realtime.FireDepartment>

    for parameter 0 at io.druid.segment.realtime.RealtimeManager.<init>(RealtimeManager.java:85)

  while locating io.druid.segment.realtime.RealtimeManager

  at io.druid.guice.RealtimeModule.configure(RealtimeModule.java:83)

  while locating io.druid.query.QuerySegmentWalker

    for parameter 3 at io.druid.server.QueryResource.<init>(QueryResource.java:89)

  while locating io.druid.server.QueryResource


1 error

at com.google.inject.internal.InjectorImpl$3.get(InjectorImpl.java:1014) ~[guice-4.0-beta.jar:?]

at com.google.inject.internal.InjectorImpl.getInstance(InjectorImpl.java:1036) ~[guice-4.0-beta.jar:?]

at io.druid.guice.LifecycleModule$2.start(LifecycleModule.java:134) ~[druid-api-0.3.8.jar:0.7.3]

at io.druid.cli.GuiceRunnable.initLifecycle(GuiceRunnable.java:71) [druid-services-0.7.3.jar:0.7.3]

at io.druid.cli.ServerRunnable.run(ServerRunnable.java:38) [druid-services-0.7.3.jar:0.7.3]

at io.druid.cli.Main.main(Main.java:88) [druid-services-0.7.3.jar:0.7.3]

Nishant Bangarwa

Jun 5, 2015, 1:25:02 PM
to Manvendra singh tomar, druid...@googlegroups.com
Hi Manvendra,

As Himanshu pointed out earlier in the thread, you are trying to set up a realtime node with a spec in the IndexTask format.
You can either run the indexing service and submit an index task to it, or set up a realtime node with a spec in the proper format.
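
A sketch of the first option, submitting the index task to the indexing service over HTTP. The overlord accepts tasks as a POST to /druid/indexer/v1/task; the host and port used below (localhost:8090) and the function names are assumptions for illustration, so adjust them to your own deployment.

```python
import json
import urllib.request


def build_task_request(task, overlord="http://localhost:8090"):
    """Build (but do not send) the POST that submits a task to the overlord.

    The default overlord address is an assumption; use your own host and port.
    """
    body = json.dumps(task).encode("utf-8")
    return urllib.request.Request(
        url=overlord.rstrip("/") + "/druid/indexer/v1/task",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit_task(task, overlord="http://localhost:8090"):
    """Send the task and return the overlord's parsed JSON response."""
    with urllib.request.urlopen(build_task_request(task, overlord)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Here `task` would be the single top-level JSON object from the original post (loaded with `json.load`), not wrapped in an array.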
