Read from HDFS


José Manuel Abuín Mosquera

Jan 21, 2015, 11:11:45 AM
to mp...@googlegroups.com
Hello,

Is there any code example or documentation that shows how to read a block from HDFS and process that block in an O task?

Thank you very much.

MPI-D

Jan 23, 2015, 1:27:36 PM
to mp...@googlegroups.com
You can use the standard HDFS APIs to read files; how an application reads its data is entirely up to the application. DataMPI also provides some utility functions for reading data. Here is an example that may help with your application:

    // O communicator: rank of this O task and total number of O tasks
    int rank = MPI_D.Comm_rank(MPI_D.COMM_BIPARTITE_O);
    int size = MPI_D.Comm_size(MPI_D.COMM_BIPARTITE_O);
    if (rank == 0) {
        DataMPIUtil.printArgs(args);
    }
    System.out.println("The O task " + rank + " of " + size
            + " is working...");
    // input paths under inDir assigned to this O task
    Path[] inputs = DataMPIUtil.HDFSDataLocalLocator.getTaskInputs(
            MPI_D.COMM_BIPARTITE_O, jobConf, inDir, rank, size);

    for (int i = 0; i < inputs.length; i++) {
        Path inPath = inputs[i];
        FileSystem fs = inPath.getFileSystem(jobConf);
        if (fs.exists(inPath) && fs.isFile(inPath)) {
            FileStatus status = fs.getFileStatus(inPath);
            // one split covering the whole file, from offset 0 to its length
            FileSplit fsplit = new FileSplit(inPath, 0,
                    status.getLen(), jobConf);
            KeyValueLineRecordReader kvrr;
            kvrr = new KeyValueLineRecordReader(jobConf, fsplit);
            Text khead = kvrr.createKey();
            Text vhead = kvrr.createValue();
            while (kvrr.next(khead, vhead)) {
                // send key-value
                MPI_D.Send(khead, vhead);
                ...
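
The example above reads each input file in full (offset 0 up to status.getLen()). To process a single block instead, the usual idea is to read only a byte range and align it to line boundaries. The sketch below illustrates that convention with plain JDK classes. It is not DataMPI or Hadoop code: BlockLineReader and readBlock are hypothetical names, and a local RandomAccessFile stands in for an HDFS input stream, on the assumption that records are newline-terminated lines.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of DataMPI or Hadoop.
final class BlockLineReader {

    // Returns the lines that belong to the byte range [start, start + length).
    // Convention: a range that does not start at offset 0 skips its first
    // (possibly partial) line, and every range keeps reading while its
    // position is <= end, so the line that straddles the boundary is read
    // exactly once, by the range in which it starts.
    static List<String> readBlock(String path, long start, long length)
            throws IOException {
        List<String> lines = new ArrayList<>();
        long end = start + length;
        try (RandomAccessFile in = new RandomAccessFile(path, "r")) {
            in.seek(start);
            if (start != 0) {
                readLine(in); // owned by the previous range
            }
            while (in.getFilePointer() <= end) {
                String line = readLine(in);
                if (line == null) {
                    break; // end of file
                }
                lines.add(line);
            }
        }
        return lines;
    }

    // Reads bytes up to and including the next '\n'; returns null at EOF.
    // Byte-at-a-time reads keep the sketch short; real code would buffer.
    private static String readLine(RandomAccessFile in) throws IOException {
        int b = in.read();
        if (b == -1) {
            return null;
        }
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        while (b != -1 && b != '\n') {
            buf.write(b);
            b = in.read();
        }
        return new String(buf.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Demo on a temporary local file split into two byte ranges.
        Path tmp = Files.createTempFile("block-demo", ".txt");
        Files.write(tmp, "k1\tv1\nk2\tv2\nk3\tv3\n".getBytes(StandardCharsets.UTF_8));
        long size = Files.size(tmp);
        long half = size / 2;
        System.out.println(readBlock(tmp.toString(), 0, half));
        System.out.println(readBlock(tmp.toString(), half, size - half));
        Files.delete(tmp);
    }
}
```

On HDFS the same idea should carry over by passing a block's offset and length to FileSplit in place of 0 and status.getLen(); block offsets can be obtained from FileSystem.getFileBlockLocations. Hadoop's line-based record readers handle the boundary alignment themselves, so a hand-written loop like the one above would only be needed for custom record formats.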

Thanks,
DataMPI Team

José Manuel Abuín Mosquera

Jan 27, 2015, 4:19:53 AM
to mp...@googlegroups.com
Thank you very much for your time and your example :)

MPI-D

Jan 27, 2015, 11:42:54 PM
to José Manuel Abuín Mosquera, mp...@googlegroups.com
No problem. Please let us know if you have any further questions.

Thanks,
DataMPI Team

