I have been working my way through the Introduction to CDAP tutorial, first with the 3.5.1 standalone (JVM) and now with the 4.0.0-SNAPSHOT standalone (JVM), both on a Windows PC. In both cases I ran into an error at the following step.
cdap (http://USORH-PQUIRK1:11015/namespace:default)> load stream logEventStream examples/resources/accesslog.txt
Successfully loaded file to stream 'logEventStream'
cdap (http://USORH-PQUIRK1:11015/namespace:default)> list dataset instances
+==============================================================================================+
| name | type | description |
+==============================================================================================+
| _auditLog | co.cask.tracker.entity.AuditLogTable | |
| _kafkaOffset | co.cask.cdap.api.dataset.lib.KeyValueTable | |
| _auditMetrics | co.cask.tracker.entity.AuditMetricsCube | |
| _auditTagsTable | co.cask.tracker.entity.AuditTagsTable | |
| _timeSinceTable | co.cask.tracker.entity.LatestEntityTable | |
| logEventStream_converted | co.cask.cdap.api.dataset.lib.TimePartitionedFileSet | |
+==============================================================================================+
cdap (http://USORH-PQUIRK1:11015/namespace:default)> execute 'describe dataset_logEventStream_converted'
Error: co.cask.cdap.explore.service.ExploreException: Cannot execute query. Reason: Response code: 400, message: 'Bad Request', body: '[SQLState 42S02] Error while compiling statement: FAILED: SemanticException [Error 10001]: Table not found dataset_logEventStream_converted'
cdap (http://USORH-PQUIRK1:11015/namespace:default)>
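To see which tables Explore has actually registered (in case the table exists under a different name), a plain HiveQL 'show tables' can be run from the same CLI session; this is just a suggestion for narrowing things down, and the output below is illustrative, not from my machine:

```
cdap (http://USORH-PQUIRK1:11015/namespace:default)> execute 'show tables'
```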
If I inspect the pipeline logs in Hydrator, I see several audit-publishing errors caused by the absence of a Kafka producer, along with the error "Input record does not contain the date field".
2016-09-29 09:25:01,615 - INFO [NettyHttpService STARTING:c.c.h.NettyHttpService@257] - Starting WorkflowDriver http service on address /127.0.0.1:0...
2016-09-29 09:25:01,618 - INFO [NettyHttpService STARTING:c.c.h.NettyHttpService@262] - Started WorkflowDriver http service on address /127.0.0.1:60823
2016-09-29 09:25:01,663 - INFO [WorkflowDriver:c.c.c.i.a.r.w.WorkflowProgramController@78] - Workflow service workflow.default.logEventStreamConverter.DataPipelineWorkflow.1f3039fa-8648-11e6-9fd2-000000341358 announced at /127.0.0.1:60823
2016-09-29 09:25:01,663 - INFO [WorkflowDriver:c.c.c.i.a.r.w.WorkflowDriver@554] - Start workflow execution for DataPipelineWorkflow
2016-09-29 09:25:01,664 - DEBUG [WorkflowDriver:c.c.c.i.a.r.w.WorkflowDriver@555] - Workflow specification is WorkflowSpecification{className='co.cask.cdap.datapipeline.SmartWorkflow', name='DataPipelineWorkflow', description='Data Pipeline Workflow', properties={pipeline.spec={"endingActions":[],"driverResources":{"virtualCores":1,"memoryMB":512},"stages":[{"name":"Projection","plugin":{"type":"transform","name":"Projection","properties":{"drop":"headers"},"artifact":{"name":"core-plugins","version":{"version":"1.5.0-SNAPSHOT","major":1,"minor":5,"fix":0,"suffix":"SNAPSHOT"},"scope":"SYSTEM"}},"inputSchemas":{},"inputs":["Stream"],"outputs":["TPFSAvro"]},{"name":"TPFSAvro","plugin":{"type":"batchsink","name":"TPFSAvro","properties":{"schema":"{\r\n \"type\":\"record\",\r\n \"name\":\"etlSchemaBody\",\r\n \"fields\":[\r\n {\"name\":\"ts\",\"type\":\"long\"},\r\n {\"name\":\"remote_host\",\"type\":[\"string\",\"null\"]},\r\n {\"name\":\"remote_login\",\"type\":[\"string\",\"null\"]},\r\n {\"name\":\"auth_user\",\"type\":[\"string\",\"null\"]},\r\n {\"name\":\"date\",\"type\":[\"string\",\"null\"]},\r\n {\"name\":\"request\",\"type\":[\"string\",\"null\"]},\r\n {\"name\":\"status\",\"type\":[\"int\",\"null\"]},\r\n {\"name\":\"content_length\",\"type\":[\"int\",\"null\"]},\r\n {\"name\":\"referrer\",\"type\":[\"string\",\"null\"]},\r\n 
{\"name\":\"user_agent\",\"type\":[\"string\",\"null\"]}]}","name":"logEventStream_converted","basePath":"logEventStream_converted"},"artifact":{"name":"core-plugins","version":{"version":"1.5.0-SNAPSHOT","major":1,"minor":5,"fix":0,"suffix":"SNAPSHOT"},"scope":"SYSTEM"}},"inputSchemas":{},"inputs":["Projection"],"outputs":[]},{"name":"Stream","plugin":{"type":"batchsource","name":"Stream","properties":{"format":"clf","name":"logEventStream","duration":"5m"},"artifact":{"name":"core-plugins","version":{"version":"1.5.0-SNAPSHOT","major":1,"minor":5,"fix":0,"suffix":"SNAPSHOT"},"scope":"SYSTEM"}},"inputSchemas":{},"inputs":[],"outputs":["Projection"]}],"connections":[{"from":"Stream","to":"Projection"},{"from":"Projection","to":"TPFSAvro"}],"resources":{"virtualCores":1,"memoryMB":512},"stageLoggingEnabled":true}}, nodes=[WorkflowActionNode{nodeId=phase-1program=ScheduleProgramInfo{programName='phase-1', programType=MAPREDUCE}, actionSpecification=null, customActionSpecification=null}], localDatasetSpecs={}}
2016-09-29 09:25:01,824 - INFO [action-phase-1-0:c.c.c.i.w.ProgramWorkflowAction@62] - Starting Program for workflow action: phase-1
2016-09-29 09:25:01,852 - INFO [action-phase-1-0:c.c.c.i.a.r.b.MapReduceProgramRunner@198] - Starting MapReduce Job: name=phase-1, jobId=null, namespaceId=default, applicationId=logEventStreamConverter, program=phase-1, runid=202918eb-8648-11e6-975a-0000001e3ac4
2016-09-29 09:25:07,150 - INFO [MapReduceRunner-phase-1:c.c.h.p.b.s.StreamBatchSource@124] - Setting input to Stream : logEventStream
2016-09-29 09:25:07,201 - INFO [MapReduceRunner-phase-1:c.c.h.p.b.s.TimePartitionedFileSetSink@87] - Writing to output partition of time 1475155500000.
2016-09-29 09:25:07,207 - ERROR [MapReduceRunner-phase-1:c.c.c.d.a.KafkaAuditPublisher@74] - Got exception publishing audit message AuditMessage{version=1, time=1475155507206, entityId=dataset:default.logEventStream_converted, user='', type=ACCESS, payload=AccessPayload{accessType=WRITE, accessor=program_run:default.logEventStreamConverter.-SNAPSHOT.mapreduce.phase-1.202918eb-8648-11e6-975a-0000001e3ac4} AuditPayload{}}. Exception:
java.util.concurrent.ExecutionException: java.lang.IllegalStateException: No kafka producer available.
at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:294) ~[com.google.guava.guava-13.0.1.jar:na]
at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:281) ~[com.google.guava.guava-13.0.1.jar:na]
at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116) ~[com.google.guava.guava-13.0.1.jar:na]
at co.cask.cdap.data2.audit.KafkaAuditPublisher.publish(KafkaAuditPublisher.java:72) ~[co.cask.cdap.cdap-data-fabric-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.data2.audit.AuditPublishers.publishAccess(AuditPublishers.java:77) [co.cask.cdap.cdap-data-fabric-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.data2.metadata.writer.LineageWriterDatasetFramework.publishAudit(LineageWriterDatasetFramework.java:215) [co.cask.cdap.cdap-data-fabric-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.data2.metadata.writer.LineageWriterDatasetFramework.writeLineage(LineageWriterDatasetFramework.java:190) [co.cask.cdap.cdap-data-fabric-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.data2.dataset2.ForwardingDatasetFramework.writeLineage(ForwardingDatasetFramework.java:169) [co.cask.cdap.cdap-data-fabric-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.internal.app.runtime.workflow.NameMappedDatasetFramework.writeLineage(NameMappedDatasetFramework.java:154) [co.cask.cdap.cdap-app-fabric-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.data.dataset.SystemDatasetInstantiator.writeLineage(SystemDatasetInstantiator.java:111) [co.cask.cdap.cdap-data-fabric-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.data2.dataset2.SingleThreadDatasetCache$LineageRecordingDatasetCache.get(SingleThreadDatasetCache.java:144) [co.cask.cdap.cdap-data-fabric-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.data2.dataset2.SingleThreadDatasetCache$LineageRecordingDatasetCache.get(SingleThreadDatasetCache.java:128) [co.cask.cdap.cdap-data-fabric-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.data2.dataset2.SingleThreadDatasetCache.getDataset(SingleThreadDatasetCache.java:171) [co.cask.cdap.cdap-data-fabric-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.data2.dataset2.DynamicDatasetCache.getDataset(DynamicDatasetCache.java:202) [co.cask.cdap.cdap-data-fabric-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.data2.dataset2.DynamicDatasetCache.getDataset(DynamicDatasetCache.java:161) [co.cask.cdap.cdap-data-fabric-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.internal.app.runtime.AbstractContext.getDataset(AbstractContext.java:238) [co.cask.cdap.cdap-app-fabric-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.internal.app.runtime.batch.BasicMapReduceContext.addOutput(BasicMapReduceContext.java:296) [co.cask.cdap.cdap-app-fabric-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.etl.batch.mapreduce.MapReduceSinkContext$3.call(MapReduceSinkContext.java:85) [cdap-etl-batch-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.etl.batch.mapreduce.MapReduceSinkContext$3.call(MapReduceSinkContext.java:81) [cdap-etl-batch-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.etl.log.LogContext.runUnchecked(LogContext.java:145) [cdap-etl-core-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.etl.log.LogContext.runWithoutLoggingUnchecked(LogContext.java:139) [cdap-etl-core-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.etl.batch.mapreduce.MapReduceSinkContext.addOutput(MapReduceSinkContext.java:81) [cdap-etl-batch-4.0.0-SNAPSHOT.jar:na]
at co.cask.hydrator.plugin.batch.sink.TimePartitionedFileSetSink.prepareRun(TimePartitionedFileSetSink.java:93) [1475155501857-0/:na]
at co.cask.hydrator.plugin.batch.sink.TimePartitionedFileSetSink.prepareRun(TimePartitionedFileSetSink.java:42) [1475155501857-0/:na]
at co.cask.cdap.etl.batch.LoggedBatchConfigurable$1.call(LoggedBatchConfigurable.java:44) [cdap-etl-batch-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.etl.batch.LoggedBatchConfigurable$1.call(LoggedBatchConfigurable.java:41) [cdap-etl-batch-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.etl.log.LogContext.run(LogContext.java:59) [cdap-etl-core-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.etl.batch.LoggedBatchConfigurable.prepareRun(LoggedBatchConfigurable.java:41) [cdap-etl-batch-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.etl.batch.mapreduce.ETLMapReduce.initialize(ETLMapReduce.java:206) [cdap-etl-batch-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.api.mapreduce.AbstractMapReduce.initialize(AbstractMapReduce.java:171) [co.cask.cdap.cdap-api-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.api.mapreduce.AbstractMapReduce.initialize(AbstractMapReduce.java:33) [co.cask.cdap.cdap-api-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.internal.app.runtime.batch.MapReduceRuntimeService$2.call(MapReduceRuntimeService.java:506) [co.cask.cdap.cdap-app-fabric-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.internal.app.runtime.batch.MapReduceRuntimeService$2.call(MapReduceRuntimeService.java:499) [co.cask.cdap.cdap-app-fabric-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.data2.transaction.Transactions.execute(Transactions.java:165) [co.cask.cdap.cdap-data-fabric-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.internal.app.runtime.batch.MapReduceRuntimeService.beforeSubmit(MapReduceRuntimeService.java:499) [co.cask.cdap.cdap-app-fabric-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.internal.app.runtime.batch.MapReduceRuntimeService.startUp(MapReduceRuntimeService.java:217) [co.cask.cdap.cdap-app-fabric-4.0.0-SNAPSHOT.jar:na]
at com.google.common.util.concurrent.AbstractExecutionThreadService$1$1.run(AbstractExecutionThreadService.java:47) [com.google.guava.guava-13.0.1.jar:na]
at co.cask.cdap.internal.app.runtime.batch.MapReduceRuntimeService$1$1.run(MapReduceRuntimeService.java:417) [co.cask.cdap.cdap-app-fabric-4.0.0-SNAPSHOT.jar:na]
at java.lang.Thread.run(Unknown Source) [na:1.8.0_101]
Caused by: java.lang.IllegalStateException: No kafka producer available.
at org.apache.twill.internal.kafka.client.SimpleKafkaPublisher$SimplePreparer.send(SimpleKafkaPublisher.java:120) ~[org.apache.twill.twill-core-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
... 36 common frames omitted
2016-09-29 09:25:07,212 - DEBUG [MapReduceRunner-phase-1:c.c.c.d.m.w.BasicLineageWriter@63] - Writing access for run program_run:default.logEventStreamConverter.-SNAPSHOT.mapreduce.phase-1.202918eb-8648-11e6-975a-0000001e3ac4, dataset dataset:default.logEventStream_converted, accessType WRITE, component null, accessTime = 1475155507212
2016-09-29 09:25:07,284 - DEBUG [MapReduceRunner-phase-1:c.c.c.d.m.w.BasicLineageWriter@81] - Writing access for run program_run:default.logEventStreamConverter.-SNAPSHOT.mapreduce.phase-1.202918eb-8648-11e6-975a-0000001e3ac4, stream stream:default.logEventStream, accessType READ, component null, accessTime = 1475155507284
2016-09-29 09:25:07,299 - ERROR [MapReduceRunner-phase-1:c.c.c.d.a.KafkaAuditPublisher@74] - Got exception publishing audit message AuditMessage{version=1, time=1475155507299, entityId=stream:default.logEventStream, user='', type=ACCESS, payload=AccessPayload{accessType=READ, accessor=program_run:default.logEventStreamConverter.-SNAPSHOT.mapreduce.phase-1.202918eb-8648-11e6-975a-0000001e3ac4} AuditPayload{}}. Exception:
java.util.concurrent.ExecutionException: java.lang.IllegalStateException: No kafka producer available.
at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:294) ~[com.google.guava.guava-13.0.1.jar:na]
at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:281) ~[com.google.guava.guava-13.0.1.jar:na]
at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116) ~[com.google.guava.guava-13.0.1.jar:na]
at co.cask.cdap.data2.audit.KafkaAuditPublisher.publish(KafkaAuditPublisher.java:72) ~[co.cask.cdap.cdap-data-fabric-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.data2.audit.AuditPublishers.publishAccess(AuditPublishers.java:72) [co.cask.cdap.cdap-data-fabric-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.data2.transaction.stream.FileStreamAdmin.addAccess(FileStreamAdmin.java:593) [co.cask.cdap.cdap-data-fabric-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.internal.app.runtime.batch.MapReduceRuntimeService.setDecoderForStream(MapReduceRuntimeService.java:701) [co.cask.cdap.cdap-app-fabric-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.internal.app.runtime.batch.MapReduceRuntimeService.setInputsIfNeeded(MapReduceRuntimeService.java:664) [co.cask.cdap.cdap-app-fabric-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.internal.app.runtime.batch.MapReduceRuntimeService.access$200(MapReduceRuntimeService.java:135) [co.cask.cdap.cdap-app-fabric-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.internal.app.runtime.batch.MapReduceRuntimeService$2.call(MapReduceRuntimeService.java:514) [co.cask.cdap.cdap-app-fabric-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.internal.app.runtime.batch.MapReduceRuntimeService$2.call(MapReduceRuntimeService.java:499) [co.cask.cdap.cdap-app-fabric-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.data2.transaction.Transactions.execute(Transactions.java:165) [co.cask.cdap.cdap-data-fabric-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.internal.app.runtime.batch.MapReduceRuntimeService.beforeSubmit(MapReduceRuntimeService.java:499) [co.cask.cdap.cdap-app-fabric-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.internal.app.runtime.batch.MapReduceRuntimeService.startUp(MapReduceRuntimeService.java:217) [co.cask.cdap.cdap-app-fabric-4.0.0-SNAPSHOT.jar:na]
at com.google.common.util.concurrent.AbstractExecutionThreadService$1$1.run(AbstractExecutionThreadService.java:47) [com.google.guava.guava-13.0.1.jar:na]
at co.cask.cdap.internal.app.runtime.batch.MapReduceRuntimeService$1$1.run(MapReduceRuntimeService.java:417) [co.cask.cdap.cdap-app-fabric-4.0.0-SNAPSHOT.jar:na]
at java.lang.Thread.run(Unknown Source) [na:1.8.0_101]
Caused by: java.lang.IllegalStateException: No kafka producer available.
at org.apache.twill.internal.kafka.client.SimpleKafkaPublisher$SimplePreparer.send(SimpleKafkaPublisher.java:120) ~[org.apache.twill.twill-core-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
... 14 common frames omitted
2016-09-29 09:25:07,306 - DEBUG [MapReduceRunner-phase-1:c.c.c.i.a.r.b.MapReduceRuntimeService@713] - Using as output for MapReduce Job: [logEventStream_converted-5c456bf5-ff8d-4ab2-bf8b-b2ff1126ad73]
2016-09-29 09:25:07,306 - DEBUG [MapReduceRunner-phase-1:c.c.c.i.a.r.b.MapReduceRuntimeService@949] - Set output value class to class java.lang.Object
2016-09-29 09:25:07,310 - DEBUG [MapReduceRunner-phase-1:c.c.c.i.a.r.b.MapReduceRuntimeService@987] - Set map output key class to class java.lang.Object
2016-09-29 09:25:07,310 - DEBUG [MapReduceRunner-phase-1:c.c.c.i.a.r.b.MapReduceRuntimeService@992] - Set map output value class to class java.lang.Object
2016-09-29 09:25:07,311 - ERROR [MapReduceRunner-phase-1:c.c.c.d.a.KafkaAuditPublisher@74] - Got exception publishing audit message AuditMessage{version=1, time=1475155507311, entityId=dataset:default.logEventStream_converted, user='', type=ACCESS, payload=AccessPayload{accessType=READ, accessor=program_run:default.logEventStreamConverter.-SNAPSHOT.mapreduce.phase-1.202918eb-8648-11e6-975a-0000001e3ac4} AuditPayload{}}. Exception:
java.util.concurrent.ExecutionException: java.lang.IllegalStateException: No kafka producer available.
at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:294) ~[com.google.guava.guava-13.0.1.jar:na]
at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:281) ~[com.google.guava.guava-13.0.1.jar:na]
at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116) ~[com.google.guava.guava-13.0.1.jar:na]
at co.cask.cdap.data2.audit.KafkaAuditPublisher.publish(KafkaAuditPublisher.java:72) ~[co.cask.cdap.cdap-data-fabric-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.data2.audit.AuditPublishers.publishAccess(AuditPublishers.java:72) [co.cask.cdap.cdap-data-fabric-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.data2.metadata.writer.LineageWriterDatasetFramework.publishAudit(LineageWriterDatasetFramework.java:215) [co.cask.cdap.cdap-data-fabric-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.data2.metadata.writer.LineageWriterDatasetFramework.writeLineage(LineageWriterDatasetFramework.java:190) [co.cask.cdap.cdap-data-fabric-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.data2.metadata.writer.LineageWriterDatasetFramework$BasicDatasetAccessRecorder.recordLineage(LineageWriterDatasetFramework.java:267) [co.cask.cdap.cdap-data-fabric-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.data2.dataset2.DefaultDatasetRuntimeContext.recordAccess(DefaultDatasetRuntimeContext.java:178) [co.cask.cdap.cdap-data-fabric-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.data2.dataset2.DefaultDatasetRuntimeContext.onMethodEntry(DefaultDatasetRuntimeContext.java:166) [co.cask.cdap.cdap-data-fabric-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.data2.dataset2.lib.file.FileSetDataset.exists(FileSetDataset.java) [na:na]
at co.cask.cdap.data2.dataset2.lib.file.FileSetDataset.access$300(FileSetDataset.java:61) [na:na]
at co.cask.cdap.data2.dataset2.lib.file.FileSetDataset$FileSetLocation.exists(FileSetDataset.java:405) [co.cask.cdap.cdap-data-fabric-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.data2.dataset2.lib.partitioned.PartitionedFileSetDataset.postTxCommit(PartitionedFileSetDataset.java:155) [na:na]
at org.apache.tephra.TransactionContext.postCommit(TransactionContext.java:311) [org.apache.tephra.tephra-core-0.8.0-incubating.jar:0.8.0-incubating]
at org.apache.tephra.TransactionContext.finish(TransactionContext.java:119) [org.apache.tephra.tephra-core-0.8.0-incubating.jar:0.8.0-incubating]
at co.cask.cdap.data2.dataset2.SingleThreadDatasetCache$DelayedDiscardingTransactionContext.finish(SingleThreadDatasetCache.java:410) [co.cask.cdap.cdap-data-fabric-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.data2.transaction.Transactions.execute(Transactions.java:172) [co.cask.cdap.cdap-data-fabric-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.internal.app.runtime.batch.MapReduceRuntimeService.beforeSubmit(MapReduceRuntimeService.java:499) [co.cask.cdap.cdap-app-fabric-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.internal.app.runtime.batch.MapReduceRuntimeService.startUp(MapReduceRuntimeService.java:217) [co.cask.cdap.cdap-app-fabric-4.0.0-SNAPSHOT.jar:na]
at com.google.common.util.concurrent.AbstractExecutionThreadService$1$1.run(AbstractExecutionThreadService.java:47) [com.google.guava.guava-13.0.1.jar:na]
at co.cask.cdap.internal.app.runtime.batch.MapReduceRuntimeService$1$1.run(MapReduceRuntimeService.java:417) [co.cask.cdap.cdap-app-fabric-4.0.0-SNAPSHOT.jar:na]
at java.lang.Thread.run(Unknown Source) [na:1.8.0_101]
Caused by: java.lang.IllegalStateException: No kafka producer available.
at org.apache.twill.internal.kafka.client.SimpleKafkaPublisher$SimplePreparer.send(SimpleKafkaPublisher.java:120) ~[org.apache.twill.twill-core-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
... 20 common frames omitted
2016-09-29 09:25:07,316 - DEBUG [MapReduceRunner-phase-1:c.c.c.d.m.w.BasicLineageWriter@63] - Writing access for run program_run:default.logEventStreamConverter.-SNAPSHOT.mapreduce.phase-1.202918eb-8648-11e6-975a-0000001e3ac4, dataset dataset:default.logEventStream_converted, accessType READ, component null, accessTime = 1475155507316
2016-09-29 09:25:07,333 - DEBUG [MapReduceRunner-phase-1:c.c.c.i.a.r.b.MapReduceRuntimeService@806] - Creating Job jar: C:\Users\pquirk\Documents\cdap-sdk-4.0-preview\data\tmp\runner\mapreduce.default.logEventStreamConverter.phase-1.202918eb-8648-11e6-975a-0000001e3ac4\job.jar
2016-09-29 09:25:07,339 - INFO [MapReduceRunner-phase-1:c.c.c.i.a.r.b.i.LocalClientProtocolProvider@42] - Using framework: local
2016-09-29 09:25:07,339 - INFO [MapReduceRunner-phase-1:c.c.c.i.a.r.b.i.LocalClientProtocolProvider@50] - Using tracker: clocal
2016-09-29 09:25:07,346 - WARN [MapReduceRunner-phase-1:o.a.h.m.JobSubmitter@150] - Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
2016-09-29 09:25:07,435 - WARN [MapReduceRunner-phase-1:o.a.h.c.Configuration@2345] - file:/C:/Users/pquirk/Documents/cdap-sdk-4.0-preview/data/tmp/runner/mapreduce.default.logEventStreamConverter.phase-1.202918eb-8648-11e6-975a-0000001e3ac4/mapreduce/staging/pquirk1814824884/.staging/job_local1814824884_0188/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
2016-09-29 09:25:07,436 - WARN [MapReduceRunner-phase-1:o.a.h.c.Configuration@2345] - file:/C:/Users/pquirk/Documents/cdap-sdk-4.0-preview/data/tmp/runner/mapreduce.default.logEventStreamConverter.phase-1.202918eb-8648-11e6-975a-0000001e3ac4/mapreduce/staging/pquirk1814824884/.staging/job_local1814824884_0188/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
2016-09-29 09:25:07,508 - WARN [MapReduceRunner-phase-1:o.a.h.c.Configuration@2345] - file:/C:/Users/pquirk/Documents/cdap-sdk-4.0-preview/data/tmp/runner/mapreduce.default.logEventStreamConverter.phase-1.202918eb-8648-11e6-975a-0000001e3ac4/mapreduce/local/localRunner/pquirk/job_local1814824884_0188/job_local1814824884_0188.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
2016-09-29 09:25:07,510 - WARN [MapReduceRunner-phase-1:o.a.h.c.Configuration@2345] - file:/C:/Users/pquirk/Documents/cdap-sdk-4.0-preview/data/tmp/runner/mapreduce.default.logEventStreamConverter.phase-1.202918eb-8648-11e6-975a-0000001e3ac4/mapreduce/local/localRunner/pquirk/job_local1814824884_0188/job_local1814824884_0188.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
2016-09-29 09:25:07,510 - INFO [MapReduceRunner-phase-1:c.c.c.i.a.r.b.MapReduceRuntimeService@316] - Submitted MapReduce Job: name=phase-1, jobId=job_local1814824884_0188, namespaceId=default, applicationId=logEventStreamConverter, program=phase-1, runid=202918eb-8648-11e6-975a-0000001e3ac4.
2016-09-29 09:25:08,337 - WARN [Thread-877:o.a.h.m.LocalJobRunnerWithFix@562] - Error cleaning up job: job_local1814824884_0188
java.lang.Exception: java.lang.IllegalArgumentException: Input record does not contain the date field.
at org.apache.hadoop.mapred.LocalJobRunnerWithFix$Job.runTasks(LocalJobRunnerWithFix.java:465) ~[co.cask.cdap.cdap-app-fabric-4.0.0-SNAPSHOT.jar:na]
at org.apache.hadoop.mapred.LocalJobRunnerWithFix$Job.run(LocalJobRunnerWithFix.java:524) ~[co.cask.cdap.cdap-app-fabric-4.0.0-SNAPSHOT.jar:na]
java.lang.IllegalArgumentException: Input record does not contain the date field.
at co.cask.hydrator.plugin.common.StructuredToAvroTransformer.transform(StructuredToAvroTransformer.java:67) ~[1475155501857-0/:na]
at co.cask.hydrator.plugin.common.StructuredToAvroTransformer.transform(StructuredToAvroTransformer.java:51) ~[1475155501857-0/:na]
at co.cask.hydrator.plugin.batch.sink.TimePartitionedFileSetDatasetAvroSink.transform(TimePartitionedFileSetDatasetAvroSink.java:71) ~[1475155501857-0/:na]
at co.cask.hydrator.plugin.batch.sink.TimePartitionedFileSetDatasetAvroSink.transform(TimePartitionedFileSetDatasetAvroSink.java:42) ~[1475155501857-0/:na]
at co.cask.cdap.etl.batch.KVTransformations$KVSinkTransformation.transform(KVTransformations.java:123) ~[cdap-etl-batch-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.etl.batch.KVTransformations$KVSinkTransformation.transform(KVTransformations.java:111) ~[cdap-etl-batch-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.etl.common.TrackedTransform.transform(TrackedTransform.java:58) ~[cdap-etl-core-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.etl.common.TransformExecutor.executeTransformation(TransformExecutor.java:88) ~[cdap-etl-core-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.etl.common.TransformExecutor.executeTransformation(TransformExecutor.java:93) ~[cdap-etl-core-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.etl.common.TransformExecutor.executeTransformation(TransformExecutor.java:93) ~[cdap-etl-core-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.etl.common.TransformExecutor.runOneIteration(TransformExecutor.java:50) ~[cdap-etl-core-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.etl.batch.mapreduce.TransformRunner.transform(TransformRunner.java:165) ~[cdap-etl-batch-4.0.0-SNAPSHOT.jar:na]
at co.cask.cdap.etl.batch.mapreduce.ETLMapReduce$ETLMapper.map(ETLMapReduce.java:356) ~[cdap-etl-batch-4.0.0-SNAPSHOT.jar:na]
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145) ~[org.apache.hadoop.hadoop-mapreduce-client-core-2.3.0.jar:na]
at co.cask.cdap.internal.app.runtime.batch.MapperWrapper.run(MapperWrapper.java:117) ~[co.cask.cdap.cdap-app-fabric-4.0.0-SNAPSHOT.jar:na]
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) ~[org.apache.hadoop.hadoop-mapreduce-client-core-2.3.0.jar:na]
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340) ~[org.apache.hadoop.hadoop-mapreduce-client-core-2.3.0.jar:na]
at org.apache.hadoop.mapred.LocalJobRunnerWithFix$Job$MapTaskRunnable.run(LocalJobRunnerWithFix.java:243) ~[co.cask.cdap.cdap-app-fabric-4.0.0-SNAPSHOT.jar:na]
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) ~[na:1.8.0_101]
at java.util.concurrent.FutureTask.run(Unknown Source) ~[na:1.8.0_101]
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[na:1.8.0_101]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[na:1.8.0_101]
at java.lang.Thread.run(Unknown Source) ~[na:1.8.0_101]
2016-09-29 09:25:08,519 - INFO [MapReduceRunner-phase-1:c.c.c.i.a.r.b.MapReduceRuntimeService@348] - MapReduce Job is complete, status: false, job: name=phase-1, jobId=job_local1814824884_0188, namespaceId=default, applicationId=logEventStreamConverter, program=phase-1, runid=202918eb-8648-11e6-975a-0000001e3ac4
2016-09-29 09:25:10,523 - INFO [MapReduceRunner-phase-1:o.a.t.TransactionManager@1013] - Tx invalid list: added tx 1475155507335000000 because of invalidate
2016-09-29 09:25:10,527 - INFO [MapReduceRunner-phase-1:c.c.c.e.b.m.ETLMapReduce@330] - Batch Run finished : status = ProgramState{status=FAILED, failureInfo='NA'}
2016-09-29 09:25:11,103 - ERROR [MapReduceRunner-phase-1:c.c.c.i.a.r.ProgramControllerServiceAdapter@81] - Program terminated with exception
java.lang.IllegalStateException: MapReduce execution failure: job-id : job_local1814824884_0188uber-mode : falsemap-progress : 1.0reduce-progress : 0.0cleanup-progress : 1.0setup-progress : 1.0runstate : FAILEDstart-time : 0user-name : pquirkpriority : NORMALscheduling-info : NAnum-used-slots0num-reserved-slots0used-mem0reserved-mem0needed-mem0
at com.google.common.base.Preconditions.checkState(Preconditions.java:176) ~[com.google.guava.guava-13.0.1.jar:na]
at co.cask.cdap.internal.app.runtime.batch.MapReduceRuntimeService.run(MapReduceRuntimeService.java:358) ~[co.cask.cdap.cdap-app-fabric-4.0.0-SNAPSHOT.jar:na]
at com.google.common.util.concurrent.AbstractExecutionThreadService$1$1.run(AbstractExecutionThreadService.java:52) ~[com.google.guava.guava-13.0.1.jar:na]
at co.cask.cdap.internal.app.runtime.batch.MapReduceRuntimeService$1$1.run(MapReduceRuntimeService.java:417) [co.cask.cdap.cdap-app-fabric-4.0.0-SNAPSHOT.jar:na]
at java.lang.Thread.run(Unknown Source) [na:1.8.0_101]
2016-09-29 09:25:11,107 - INFO [action-phase-1-0:c.c.c.i.w.ProgramWorkflowAction@70] - Failed to execute MAPREDUCE Program phase-1 in workflow
java.lang.IllegalStateException: MapReduce execution failure: job-id : job_local1814824884_0188uber-mode : falsemap-progress : 1.0reduce-progress : 0.0cleanup-progress : 1.0setup-progress : 1.0runstate : FAILEDstart-time : 0user-name : pquirkpriority : NORMALscheduling-info : NAnum-used-slots0num-reserved-slots0used-mem0reserved-mem0needed-mem0
at com.google.common.base.Preconditions.checkState(Preconditions.java:176) ~[com.google.guava.guava-13.0.1.jar:na]
at co.cask.cdap.internal.app.runtime.batch.MapReduceRuntimeService.run(MapReduceRuntimeService.java:358) ~[co.cask.cdap.cdap-app-fabric-4.0.0-SNAPSHOT.jar:na]
at com.google.common.util.concurrent.AbstractExecutionThreadService$1$1.run(AbstractExecutionThreadService.java:52) ~[com.google.guava.guava-13.0.1.jar:na]
at co.cask.cdap.internal.app.runtime.batch.MapReduceRuntimeService$1$1.run(MapReduceRuntimeService.java:417) ~[co.cask.cdap.cdap-app-fabric-4.0.0-SNAPSHOT.jar:na]
at java.lang.Thread.run(Unknown Source) [na:1.8.0_101]
2016-09-29 09:25:11,108 - INFO [NettyHttpService STOPPING:c.c.h.NettyHttpService@274] - Stopping WorkflowDriver http service on address /127.0.0.1:60823...
2016-09-29 09:25:11,109 - INFO [NettyHttpService STOPPING:c.c.h.NettyHttpService@288] - Stopped WorkflowDriver http service on address /127.0.0.1:60823
2016-09-29 09:25:11,123 - INFO [WorkflowDriver:c.c.c.i.a.r.w.WorkflowProgramController@98] - Workflow service failed from RUNNING. Un-registering service workflow.default.logEventStreamConverter.DataPipelineWorkflow.1f3039fa-8648-11e6-9fd2-000000341358.
java.lang.IllegalStateException: MapReduce execution failure: job-id : job_local1814824884_0188uber-mode : falsemap-progress : 1.0reduce-progress : 0.0cleanup-progress : 1.0setup-progress : 1.0runstate : FAILEDstart-time : 0user-name : pquirkpriority : NORMALscheduling-info : NAnum-used-slots0num-reserved-slots0used-mem0reserved-mem0needed-mem0
at com.google.common.base.Preconditions.checkState(Preconditions.java:176) ~[com.google.guava.guava-13.0.1.jar:na]
at co.cask.cdap.internal.app.runtime.batch.MapReduceRuntimeService.run(MapReduceRuntimeService.java:358) ~[co.cask.cdap.cdap-app-fabric-4.0.0-SNAPSHOT.jar:na]
at com.google.common.util.concurrent.AbstractExecutionThreadService$1$1.run(AbstractExecutionThreadService.java:52) ~[com.google.guava.guava-13.0.1.jar:na]
at co.cask.cdap.internal.app.runtime.batch.MapReduceRuntimeService$1$1.run(MapReduceRuntimeService.java:417) ~[co.cask.cdap.cdap-app-fabric-4.0.0-SNAPSHOT.jar:na]
at java.lang.Thread.run(Unknown Source) [na:1.8.0_101]
2016-09-29 09:25:11,123 - INFO [WorkflowDriver:c.c.c.i.a.r.w.WorkflowProgramController@104] - Service workflow.default.logEventStreamConverter.DataPipelineWorkflow.1f3039fa-8648-11e6-9fd2-000000341358 unregistered.
2016-09-29 09:25:11,123 - INFO [pcontroller-program:default.logEventStreamConverter.-SNAPSHOT.workflow.DataPipelineWorkflow-1f3039fa-8648-11e6-9fd2-000000341358:c.c.c.i.a.s.ProgramLifecycleService@338] - Program stopped with error program:default.logEventStreamConverter.-SNAPSHOT.workflow.DataPipelineWorkflow, 1f3039fa-8648-11e6-9fd2-000000341358
java.lang.IllegalStateException: MapReduce execution failure: job-id : job_local1814824884_0188uber-mode : falsemap-progress : 1.0reduce-progress : 0.0cleanup-progress : 1.0setup-progress : 1.0runstate : FAILEDstart-time : 0user-name : pquirkpriority : NORMALscheduling-info : NAnum-used-slots0num-reserved-slots0used-mem0reserved-mem0needed-mem0
at com.google.common.base.Preconditions.checkState(Preconditions.java:176) ~[com.google.guava.guava-13.0.1.jar:na]
at co.cask.cdap.internal.app.runtime.batch.MapReduceRuntimeService.run(MapReduceRuntimeService.java:358) ~[co.cask.cdap.cdap-app-fabric-4.0.0-SNAPSHOT.jar:na]
at com.google.common.util.concurrent.AbstractExecutionThreadService$1$1.run(AbstractExecutionThreadService.java:52) ~[com.google.guava.guava-13.0.1.jar:na]
at co.cask.cdap.internal.app.runtime.batch.MapReduceRuntimeService$1$1.run(MapReduceRuntimeService.java:417) ~[co.cask.cdap.cdap-app-fabric-4.0.0-SNAPSHOT.jar:na]
at java.lang.Thread.run(Unknown Source) [na:1.8.0_101]
2016-09-29 09:25:11,149 - DEBUG [pcontroller-program:default.logEventStreamConverter.-SNAPSHOT.workflow.DataPipelineWorkflow-1f3039fa-8648-11e6-9fd2-000000341358:c.c.c.a.r.AbstractProgramRuntimeService@436] - Removing RuntimeInfo: Workflow DataPipelineWorkflow 1f3039fa-8648-11e6-9fd2-000000341358
2016-09-29 09:25:11,150 - DEBUG [pcontroller-program:default.logEventStreamConverter.-SNAPSHOT.workflow.DataPipelineWorkflow-1f3039fa-8648-11e6-9fd2-000000341358:c.c.c.a.r.AbstractProgramRuntimeService@439] - RuntimeInfo removed: RuntimeInfo{programId=program:default.logEventStreamConverter.-SNAPSHOT.workflow.DataPipelineWorkflow, twillRunId=null}
Sure enough, inspecting the pipeline in Hydrator shows that the input stream, which is in clf format, produces a 'request_time' field, while the output sink's schema expects a field named 'date'.
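One way to reconcile the two, assuming the Projection transform's rename property works as documented in this version of core-plugins (a sketch, not something I have verified against 4.0.0-SNAPSHOT), would be to rename the field in the Projection stage so it matches the sink schema:

```json
{
  "name": "Projection",
  "plugin": {
    "type": "transform",
    "name": "Projection",
    "properties": {
      "drop": "headers",
      "rename": "request_time:date"
    }
  }
}
```

Alternatively, the 'date' field in the TPFSAvro sink's output schema could simply be renamed to 'request_time' so it matches the stream's clf fields directly, with no rename in the Projection stage.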