Adding new fields to the output file

henrique...@ifactory.com.br

Jul 9, 2014, 4:13:13 AM
to scaldi...@googlegroups.com
I am new to Scalding and I have a simple question.

I have an input file X. After processing it, the output file should include two new attributes that are not part of the input file, such as projectId and processedDate.

I tried this sample, which I got from this forum, but it fails when I run the job.
  def logsAddTodayDateColumn: Pipe = self
    .map(() -> 'day) { _: Unit => "the item I want to add to the pipe here" }


  // Facebook Friend File
  val GENDER_FACEBOOK_FRIEND = List('day, 'gender, 'total)


  val facebookFriends = Csv(args("input"), "," , FACEBOOK_FRIEND, true).read
    .logsAddTodayDateColumn
    .logsByField("gender")
    .write(Tsv(args("output")+"/gender_facebook_out.tsv", GENDER_FACEBOOK_FRIEND))

I got the following error:

Caused by: cascading.flow.planner.PlannerException: could not build flow from assembly: [[TextDelimited[['day', ...][com.twitter.scalding.DelimitedScheme$class.localScheme(FileSource.scala:274)] unable to resolve scheme sink selector: [{1}:'day'], with incoming: [{2}:'gender', 'total']]
at cascading.flow.planner.FlowPlanner.handleExceptionDuringPlanning(FlowPlanner.java:576)
at cascading.flow.local.planner.LocalPlanner.buildFlow(LocalPlanner.java:108)
at cascading.flow.local.planner.LocalPlanner.buildFlow(LocalPlanner.java:40)
at cascading.flow.FlowConnector.connect(FlowConnector.java:459)
at com.twitter.scalding.Job.buildFlow(Job.scala:226)
at com.twitter.scalding.Job.run(Job.scala:275)
at com.twitter.scalding.Tool.start$1(Tool.scala:108)
at com.twitter.scalding.Tool.run(Tool.scala:125)
at com.twitter.scalding.Tool.run(Tool.scala:71)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at com.twitter.scalding.Tool$.main(Tool.scala:133)
... 1 more
Caused by: cascading.tap.TapException: [TextDelimited[['day', ...][com.twitter.scalding.DelimitedScheme$class.localScheme(FileSource.scala:274)] unable to resolve scheme sink selector: [{1}:'day'], with incoming: [{2}:'gender', 'total']
at cascading.tap.Tap.outgoingScopeFor(Tap.java:329)
at cascading.flow.planner.ElementGraph.resolveFields(ElementGraph.java:628)

Thanks,
Henrique

henrique...@ifactory.com.br

Jul 22, 2014, 3:47:18 PM
to scaldi...@googlegroups.com
Fixed. Thanks.
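
The thread does not record what the fix was. Going only by the error text, the sink asks for 'day while the pipe reaching it carries just 'gender and 'total, so the added field seems to be dropped somewhere before the write, most likely by the aggregation step. One way to avoid that is to add the constant fields after aggregating. The sketch below is an assumption-laden illustration, not the poster's actual solution: the job name GenderCountsJob, the input columns 'name and 'gender, the --projectId argument, and the groupBy/size standing in for logsByField are all hypothetical.

```scala
import com.twitter.scalding._

// Hypothetical job: the class name, input schema and --projectId argument
// are assumptions made for this sketch, not taken from the original post.
class GenderCountsJob(args: Args) extends Job(args) {

  // Assumed input columns; the real FACEBOOK_FRIEND field list is not shown in the post.
  val inputFields = List('name, 'gender)

  // Constant values to attach to every output row.
  val projectId = args("projectId")
  val processedDate =
    new java.text.SimpleDateFormat("yyyy-MM-dd").format(new java.util.Date())

  Csv(args("input"), ",", inputFields, skipHeader = true).read
    // Aggregate first: after groupBy only 'gender and 'total remain in the pipe.
    .groupBy('gender) { _.size('total) }
    // Add the constants afterwards, using the same map idiom as the post,
    // so no later step can drop them.
    .map(() -> 'projectId) { _: Unit => projectId }
    .map(() -> 'processedDate) { _: Unit => processedDate }
    // Every field named by the sink now exists in the incoming pipe.
    .write(Tsv(args("output") + "/gender_facebook_out.tsv",
      ('projectId, 'processedDate, 'gender, 'total)))
}
```

An alternative under the same assumptions is to keep the map before the aggregation and include the new field in the grouping key, for example groupBy('gender, 'day), so that it survives the aggregation.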