How to change the names of part-r-00000 and part-m-00000


ksr

Feb 22, 2015, 5:26:22 AM
to chenn...@googlegroups.com
Hi All,

Does anybody know how to change the name of the "part-r-00000" output file to "testresult-1"?
Please let me know as soon as possible.

Sb Gowtham

Feb 22, 2015, 7:24:18 AM
to chenn...@googlegroups.com
Hi

This is all you need to do in the Driver class to change the base name of the output file:

job.getConfiguration().set("mapreduce.output.basename", "text");

This will result in your files being named "text-r-00000".
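As a rough sketch of where that line goes (the class name, input/output paths, and the rest of the job setup here are placeholders, not from the thread), the driver might look like this:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class BasenameDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "basename-example");
        job.setJarByClass(BasenameDriver.class);
        // mapper/reducer/output key-value classes would be set here as usual

        // change the "part" prefix of the output files to "text",
        // so reduce output appears as text-r-00000, text-r-00001, ...
        job.getConfiguration().set("mapreduce.output.basename", "text");

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Note that this only replaces the "part" prefix; the "-r-00000" task suffix is still appended by the framework.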

--
You received this message because you are subscribed to the Google Groups "Hadoop Users Group (HUG) Chennai" group.



--
Thanks & Regards

sudhakar kurakula

Feb 22, 2015, 7:50:56 AM
to chenn...@googlegroups.com
Hi Gowtham,

Thank you for the reply.

But I want to change it completely from "part-r-00000" to "testresult1", not to something like "text-r-00000".
Can you tell me how to do that?

Kathir Suresh

Feb 23, 2015, 6:20:42 AM
to chenn...@googlegroups.com
Hi,

         In the setup method, you need to override the getDefaultWorkFile method of the TextOutputFormat class.

Thanks,
Kathir

sudhakar kurakula

Feb 23, 2015, 1:00:39 PM
to chenn...@googlegroups.com
Hi Kathir,

If you don't mind, would you please provide the code?

Kathir Suresh

Feb 24, 2015, 12:47:18 AM
to chenn...@googlegroups.com
Hi,

Example code is below.

package com.test.FileNameChangeTest;

import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.RecordWriter;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class FileNameOutputsMapper extends
        Mapper<LongWritable, Text, NullWritable, NullWritable> {
    protected String filename;
    private RecordWriter<Text, Text> recordWriter;
    // initialize here to avoid a NullPointerException in map()
    private Text outputKey = new Text();
    private Text outputValue = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        //TODO map logic 

        // write to output to file using Writer
        recordWriter.write(outputKey, outputValue);
    }

    @Override
    protected void setup(Context context) throws IOException,
            InterruptedException {
        InputSplit split = context.getInputSplit();
        Path path = ((FileSplit) split).getPath();

      
        // use the input file's parent directory and name as the output file name
        filename = path.getParent().getName() + "/" + path.getName();

        final Path baseOutputPath = FileOutputFormat.getOutputPath(context);
      
        final Path outputFilePath = new Path(baseOutputPath, filename);

        //override the getDefaultWorkFile
        TextOutputFormat<Text, Text> testOutPutFormat = new TextOutputFormat<Text, Text>() {
            @Override
            public Path getDefaultWorkFile(TaskAttemptContext context,
                    String extension) throws IOException {
                return outputFilePath;
            }
        };

        // this creates the output file (and any missing parent directories)
        recordWriter = testOutPutFormat.getRecordWriter(context);
    }

    @Override
    protected void cleanup(Context context) throws IOException,
            InterruptedException {
        recordWriter.close(context);
    }
}
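If overriding getDefaultWorkFile feels heavyweight, another approach worth knowing (a sketch, not from the thread; the "testresult-" prefix and method name are my own) is to let the job write "part-r-00000" as usual and rename the files from the driver after the job finishes, using the Hadoop FileSystem API:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RenameOutput {
    // call this after job.waitForCompletion(true) returns true
    public static void renameParts(Configuration conf, Path outputDir) throws Exception {
        FileSystem fs = FileSystem.get(conf);
        int i = 1;
        // rename every part-r-* file to testresult-1, testresult-2, ...
        for (FileStatus status : fs.listStatus(outputDir)) {
            String name = status.getPath().getName();
            if (name.startsWith("part-r-")) {
                fs.rename(status.getPath(), new Path(outputDir, "testresult-" + i));
                i++;
            }
        }
    }
}
```

This gives you full control over the final names, at the cost of an extra pass over the output directory once the job completes.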