.write(WritableSequenceFile[Text, TypedProtobufWritable[ReportRow.Row]](path, ('key, 'value)))
override def config(implicit mode: Mode): Map[AnyRef, AnyRef] = {
  super.config ++ Map(
    "mapreduce.output.fileoutputformat.compress" -> "true",
    "mapreduce.output.fileoutputformat.compress.codec" -> "org.apache.hadoop.io.compress.SnappyCodec",
    "mapreduce.output.fileoutputformat.compress.type" -> "BLOCK"
  )
}
--
You received this message because you are subscribed to the Google Groups "cascading-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cascading-use...@googlegroups.com.
To post to this group, send email to cascadi...@googlegroups.com.
Visit this group at http://groups.google.com/group/cascading-user.
To view this discussion on the web visit https://groups.google.com/d/msgid/cascading-user/8113e1a9-b8d9-4377-87be-0f56af2addb0%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
What is your Hadoop version?
On Thursday, April 16, 2015, Asher <ash...@gmail.com> wrote:
Using Scalding 0.8.6, and writing out records to a WritableSequenceFile like this:

  .write(WritableSequenceFile[Text, TypedProtobufWritable[ReportRow.Row]](path, ('key, 'value)))

I would like to use Snappy compression. One would assume the way to go about it would be to override the config method:

  override def config(implicit mode: Mode): Map[AnyRef, AnyRef] = {
    super.config ++ Map(
      "mapreduce.output.fileoutputformat.compress" -> "true",
      "mapreduce.output.fileoutputformat.compress.codec" -> "org.apache.hadoop.io.compress.SnappyCodec",
      "mapreduce.output.fileoutputformat.compress.type" -> "BLOCK"
    )
  }

These values are set in the job configuration, but it appears the output format ignores them.

Any suggestions? This is in production, so upgrading the Scalding version is not an option at this time.

--
Asher
override def updateConf(c: Configuration): Unit = {
  super.updateConf(c)
  c.setBoolean("mapreduce.output.fileoutputformat.compress", true)
  c.setBoolean("mapred.output.fileoutputformat.compress", true)
  c.set("mapreduce.output.fileoutputformat.compress.codec", "org.apache.hadoop.io.compress.SnappyCodec")
  c.set("mapred.output.fileoutputformat.compress.codec", "org.apache.hadoop.io.compress.SnappyCodec")
  c.set("mapreduce.output.fileoutputformat.compress.type", "BLOCK")
  c.set("mapred.output.fileoutputformat.compress.type", "BLOCK")
}
mapred.output.fileoutputformat.compress       | true
mapred.output.fileoutputformat.compress.type  | BLOCK
mapred.output.fileoutputformat.compress.codec | org.apache.hadoop.io.compress.SnappyCodec
mapred.map.output.compression.codec           | org.apache.hadoop.io.compress.SnappyCodec
mapred.output.compression.type                | BLOCK
mapred.output.compression.codec               | org.apache.hadoop.io.compress.DefaultCodec
override def config(implicit mode: Mode): Map[AnyRef, AnyRef] = {
  super.config ++ Map(
    // JOB OUTPUT
    "mapred.output.fileoutputformat.compress" -> "true",
    "mapred.output.fileoutputformat.compress.codec" -> "org.apache.hadoop.io.compress.SnappyCodec",
    "mapred.output.fileoutputformat.compress.type" -> "BLOCK",
    "mapred.output.compression.type" -> "BLOCK",
    "mapred.output.compress" -> "true",
    "mapred.output.compression.codec" -> "org.apache.hadoop.io.compress.SnappyCodec",
    // MAP OUTPUT
    "mapred.map.output.compress" -> "true",
    "mapred.map.output.compress.codec" -> "org.apache.hadoop.io.compress.SnappyCodec"
  )
}
override def updateConf(c: Configuration): Unit = {
  super.updateConf(c)
  c.setBoolean("mapred.output.fileoutputformat.compress", true)
  c.setBoolean("mapred.output.compress", true)
  c.setBoolean("mapred.map.output.compress", true)
  c.set("mapred.output.fileoutputformat.compress.codec", "org.apache.hadoop.io.compress.SnappyCodec")
  c.set("mapred.output.compression.codec", "org.apache.hadoop.io.compress.SnappyCodec")
  c.set("mapred.map.output.compress.codec", "org.apache.hadoop.io.compress.SnappyCodec")
  c.set("mapred.output.fileoutputformat.compress.type", "BLOCK")
  c.set("mapred.output.compression.type", "BLOCK")
}
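
The snippets above mix the mapred.* and mapreduce.* prefixes, and some of the combined names (for example mapred.output.fileoutputformat.compress) do not appear in Hadoop's deprecated-properties table as far as I can tell. Below is a minimal sketch, in plain Scala with no Scalding or Hadoop dependency, of the legacy/current key pairs as I understand that table. The object name SnappyOutputSettings and its members are hypothetical, and none of this has been tested against Scalding 0.8.6, so please check the names against the Hadoop version on your cluster.

```scala
// Sketch only: key names follow Hadoop's deprecated-properties table as I
// understand it; confirm them against the Hadoop version you actually run.
// SnappyOutputSettings is a hypothetical helper, not part of Scalding or Hadoop.
object SnappyOutputSettings {
  private val Snappy = "org.apache.hadoop.io.compress.SnappyCodec"

  // Each entry: (legacy Hadoop 1 name, current Hadoop 2 name, value).
  val keyPairs: Seq[(String, String, String)] = Seq(
    // Job output
    ("mapred.output.compress", "mapreduce.output.fileoutputformat.compress", "true"),
    ("mapred.output.compression.codec", "mapreduce.output.fileoutputformat.compress.codec", Snappy),
    ("mapred.output.compression.type", "mapreduce.output.fileoutputformat.compress.type", "BLOCK"),
    // Map output
    ("mapred.compress.map.output", "mapreduce.map.output.compress", "true"),
    ("mapred.map.output.compression.codec", "mapreduce.map.output.compress.codec", Snappy)
  )

  // Emit every value under both names so that whichever generation of the
  // API reads the job configuration finds the setting.
  val settings: Map[String, String] =
    keyPairs.flatMap { case (legacy, current, value) =>
      Seq(legacy -> value, current -> value)
    }.toMap
}
```

Inside the job this could be merged the same way as in the earlier snippets, i.e. super.config ++ SnappyOutputSettings.settings. Setting both names is a belt-and-braces choice: it avoids having to know which name the output format on a given Hadoop version actually reads.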