Writing data to FTP server


Dhanashri Desai

Jul 27, 2015, 8:19:14 AM
to cascading-user
I am trying to write data to an FTP server using the normal credentials syntax: ftp://user:password@host/outputFilePath

Code:
package com.ftp.readwrite;
import java.io.IOException;
import cascading.flow.FlowDef;
import cascading.flow.hadoop.HadoopFlowConnector;
import cascading.operation.aggregator.Count;
import cascading.operation.regex.RegexSplitGenerator;
import cascading.pipe.Each;
import cascading.pipe.Every;
import cascading.pipe.GroupBy;
import cascading.pipe.Pipe;
import cascading.scheme.hadoop.TextLine;
import cascading.tap.SinkMode;
import cascading.tap.Tap;
import cascading.tap.hadoop.Hfs;
import cascading.tuple.Fields;


public class MainClass {

    @SuppressWarnings("deprecation")
    public static void main(String[] args) throws IOException {

        // Source: a local text file, read line by line.
        Tap source = new Hfs(new TextLine(new Fields("line")), "C:\\Users\\DhanashriD\\Desktop\\test.txt");
        // Sink: the FTP location; REPLACE overwrites any existing output.
        Tap sink = new Hfs(new TextLine(new Fields("line1")), "ftp://test:te...@10.30.125.63/output", SinkMode.REPLACE);

        // Split each line on whitespace, group identical tokens, and count them.
        Pipe line = new Pipe("Lines");
        line = new Each(line, new RegexSplitGenerator("\\s+"));
        Pipe group = new GroupBy(line);
        Pipe counts = new Every(group, new Count());

        FlowDef flowdef = FlowDef.flowDef().addSource(line, source).addTailSink(counts, sink);

        new HadoopFlowConnector().connect(flowdef).complete();
    }

}

I am getting the following error:

java.io.IOException: Cannot rename parent(source): ftp://test:te...@10.30.125.63/op/_temporary/_attempt_local_0001_r_000000_0, parent(destination):  ftp://test:te...@10.30.125.63/output
    at org.apache.hadoop.fs.ftp.FTPFileSystem.rename(FTPFileSystem.java:547)
    at org.apache.hadoop.fs.ftp.FTPFileSystem.rename(FTPFileSystem.java:512)
    at org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:154)
    at org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:172)
    at org.apache.hadoop.mapred.FileOutputCommitter.commitTask(FileOutputCommitter.java:132)
    at org.apache.hadoop.mapred.OutputCommitter.commitTask(OutputCommitter.java:221)
    at org.apache.hadoop.mapred.Task.commit(Task.java:1001)
    at org.apache.hadoop.mapred.Task.done(Task.java:871)

Please help me resolve this error.


Thanks
Dhanashri

Gera Shegalov

Jul 27, 2015, 1:59:56 PM
to cascading-user
In the later versions of Hadoop the error message is more obvious: 

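Only renames within the same parent dir are supported by FTPFileSystem. The default FileOutputCommitter commits a task by renaming its output from the _temporary attempt directory into the final output directory, which is a rename across directories, so it won't work.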
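
To illustrate the restriction, here is a minimal sketch in plain Java with no Hadoop dependency. It is not Hadoop's actual code. The class and method names are made up for illustration, and the file name part-00000 is only an assumed example of a task output file. It compares the parent directories of the source and destination and rejects the rename when they differ, which is the situation the committer runs into.

```java
import java.io.IOException;

// Hypothetical illustration of a "same parent directory only" rename rule.
final class SameParentRenameCheck {

    // Returns everything before the last '/' of an absolute path string.
    static String parentOf(String path) {
        int slash = path.lastIndexOf('/');
        return slash <= 0 ? "/" : path.substring(0, slash);
    }

    // True when both paths live in the same directory.
    static boolean sameParent(String src, String dst) {
        return parentOf(src).equals(parentOf(dst));
    }

    // Rejects renames that would move a file into a different directory.
    static void checkRename(String src, String dst) throws IOException {
        if (!sameParent(src, dst)) {
            throw new IOException("Cannot rename parent(source): " + parentOf(src)
                    + ", parent(destination): " + parentOf(dst));
        }
    }

    public static void main(String[] args) {
        // Rename inside one directory: allowed.
        System.out.println(sameParent("/output/part-00000.tmp", "/output/part-00000"));
        // Move from a task attempt directory into the output directory: rejected.
        try {
            checkRename("/output/_temporary/_attempt_local_0001_r_000000_0/part-00000",
                        "/output/part-00000");
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The second call in main mirrors the shape of the paths in the stack trace: the source sits under a _temporary attempt directory while the destination sits directly in the output directory.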

--
@gerashegalov

Dhanashri Desai

Jul 28, 2015, 2:14:52 AM
to cascading-user, ge...@twitter.com
Okay... So is there any way to get this code working?

Gera Shegalov

Jul 28, 2015, 8:55:28 PM
to Dhanashri Desai, cascading-user
Cascading uses the so-called old MapReduce API (org.apache.hadoop.mapred). In the old API you can set a committer directly in the conf via "mapred.output.committer.class". The class you name there has to implement the OutputCommitter API, so you'll have to write your own version of FileOutputCommitter that avoids cross-directory renames. A naive way would be replacing hierarchical paths a/b/c with a flat path a_b_c, so that temporary and final files share the same parent directory. Since the FTP spec supports atomicity via ABOR, you may be able to utilize it. You may want to file a Hadoop MapReduce JIRA because it could be useful for others.
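
A rough sketch of the two pieces, in plain Java so it runs without Hadoop on the classpath. Everything named here apart from the "mapred.output.committer.class" key is an assumption: FlatFtpOutputCommitter is a hypothetical class you would still have to write, and flatten is only one naive way to do the a/b/c to a_b_c mapping.

```java
import java.util.Properties;

// Hypothetical helper for a flat-path committer; not part of Hadoop or Cascading.
final class FlatFtpCommitterSketch {

    // Old-API configuration key named in the message above.
    static final String COMMITTER_KEY = "mapred.output.committer.class";

    // Assumed name of a custom OutputCommitter implementation (does not exist yet).
    static final String COMMITTER_CLASS = "com.ftp.readwrite.FlatFtpOutputCommitter";

    // Naive flattening: drop leading/trailing slashes, then turn a/b/c into a_b_c.
    // Caveat: "a_b/c" and "a/b_c" collide, so a real committer needs a safer encoding.
    static String flatten(String relativePath) {
        String trimmed = relativePath.replaceAll("^/+|/+$", "");
        return trimmed.replace('/', '_');
    }

    // Properties that select the custom committer. With Cascading these would be
    // handed to the flow connector, e.g. new HadoopFlowConnector(properties).
    static Properties committerProperties() {
        Properties properties = new Properties();
        properties.setProperty(COMMITTER_KEY, COMMITTER_CLASS);
        return properties;
    }

    public static void main(String[] args) {
        System.out.println(flatten("a/b/c"));
        System.out.println(committerProperties().getProperty(COMMITTER_KEY));
    }
}
```

With flat names, a task could write to a temporary name next to the final one in the output directory and commit with a rename that stays inside that directory, which is the only kind FTPFileSystem accepts.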
--
@gerashegalov

Dhanashri Desai

Jul 29, 2015, 1:47:11 AM
to cascading-user, ge...@twitter.com
Okay... Thanks a lot