FTP with compression

Hikmah

Apr 14, 2011, 10:52:30 PM

to Python FTP server library - Discussion group

Hi,

I want to add an LZMA compression module to this FTP server.
I've found the method in ftplib (Python's standard FTP library) where
LZMA compression can be added on the client side.
Then I read through ftpserver.py and saw that ftp_RETR sends the
data to the client.
Is ftp_RETR the right place to add LZMA compression on the server side?

Thanks in advance..

Hikmah

Giampaolo Rodolà

Apr 15, 2011, 7:54:38 AM

to pyft...@googlegroups.com

Hi,

the right way to do this is by overriding the ftpserver.FileProducer.more method.

FileProducer is instantiated in ftp_RETR(), and the instance is then passed to the DTPHandler class, which calls its more() method again and again until EOF is reached, at which point more() returns an empty string.

For the data channel (DTPHandler) this means that the entire file has been sent and the connection must be closed.

Some pseudo code:

class LzmaFileProducer(ftpserver.FileProducer):

    def more(self):
        chunk = ftpserver.FileProducer.more(self)
        if not chunk:
            return ""  # EOF
        else:
            return compress_data(chunk)

ftpserver.FileProducer = LzmaFileProducer  # the monkey patch
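
The producer contract described above (more() is called repeatedly until it returns an empty string) can be illustrated without pyftpdlib at all. The following is only a minimal sketch written for Python 3; ChunkProducer and drain are hypothetical names standing in for FileProducer and the data channel, not pyftpdlib code.

```python
import io


class ChunkProducer:
    """Toy stand-in for a file producer: more() returns b"" at EOF."""

    def __init__(self, fileobj, buffer_size=4):
        self.file = fileobj
        self.buffer_size = buffer_size

    def more(self):
        # read() returns an empty bytes object once the file is exhausted.
        return self.file.read(self.buffer_size)


def drain(producer):
    """Call more() until it returns an empty chunk, as a data channel would."""
    chunks = []
    while True:
        chunk = producer.more()
        if not chunk:
            break  # EOF: the real data channel would close the connection here
        chunks.append(chunk)
    return chunks


chunks = drain(ChunkProducer(io.BytesIO(b"hello world")))
```

Any subclass that transforms the chunk must still return an empty value at EOF, otherwise the loop never ends.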

Also be aware of the FTP TYPE currently in use, which can be either Binary or ASCII.
I don't know how lzma compression works but perhaps you might not want to apply the compression in case the current type is ASCII.

Hope this helps,

--- Giampaolo

Hikmah

Apr 15, 2011, 11:50:49 AM

to Python FTP server library - Discussion group

Thank you, I'll try that =)

Hikmah

May 3, 2011, 8:10:32 AM

to Python FTP server library - Discussion group

Hi Giampaolo,

I have written a new class and overridden the more(self) method.
Where should I put the monkey patch line,
ftpserver.FileProducer = LzmaFileProducer?

Thank you

Giampaolo Rodolà

May 3, 2011, 8:17:49 AM

to pyft...@googlegroups.com

You should put that in the module you use to start your FTP server, which imports pyftpdlib.

Note that in order to make the monkey patch work you have to import ftpserver directly.

Some code (not tested):

from pyftpdlib import ftpserver

class LzmaFileProducer(ftpserver.FileProducer):

    def more(self):
        chunk = ftpserver.FileProducer.more(self)
        if not chunk:
            return ""  # EOF
        else:
            return compress_data(chunk)

ftpserver.FileProducer = LzmaFileProducer  # the monkey patch

def main():
    # start the server here
    authorizer.add_user('user', password="12345", homedir='.', perm='elradfmw')
    authorizer.add_anonymous(homedir='.')
    handler = ftpserver.FTPHandler
    handler.authorizer = authorizer
    server = ftpserver.FTPServer(("", 21), handler)
    server.serve_forever()

if __name__ == '__main__':
    main()
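
The note above about importing ftpserver directly comes down to how Python resolves names: code inside a module looks up FileProducer in that module's own namespace, so only rebinding the attribute on the module object is visible to it. The sketch below (Python 3) uses a throwaway module called fake_ftpserver, which is purely hypothetical and only stands in for the real one.

```python
import types

# Hypothetical stand-in for the real ftpserver module.
fake_ftpserver = types.ModuleType("fake_ftpserver")
exec(
    "class FileProducer(object):\n"
    "    name = 'original'\n"
    "\n"
    "def make_producer():\n"
    "    # FileProducer is looked up in this module's globals at call time.\n"
    "    return FileProducer()\n",
    fake_ftpserver.__dict__,
)

# Equivalent of "from fake_ftpserver import FileProducer": a local name only.
FileProducer = fake_ftpserver.FileProducer


class PatchedProducer(fake_ftpserver.FileProducer):
    name = 'patched'


# Rebinding the local name does not affect code inside the module...
FileProducer = PatchedProducer
before = fake_ftpserver.make_producer().name

# ...but rebinding the attribute on the module object does.
fake_ftpserver.FileProducer = PatchedProducer
after = fake_ftpserver.make_producer().name
```

Here before is still 'original' and after is 'patched', which is why the patch has to be written as an assignment to ftpserver.FileProducer.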

Hikmah

May 6, 2011, 3:37:17 AM

to Python FTP server library - Discussion group

Hi Giampaolo,

I've tried to override the FileProducer class and its more() method:

==================================

from pyftpdlib import ftpserver
import gzip

class CompressFileProducer(ftpserver.FileProducer):

    def more(self):
        chunk = ftpserver.FileProducer.more(self)
        if not chunk:
            return ""  # EOF
        else:
            return self.compress_data(chunk)

    def compress_data(self, data_req):
        compress = gzip.GzipFile(mode='wb', fileobj=data_req)
        try:
            compress.write(data_req)
        finally:
            compress.close()
        return compress

ftpserver.FileProducer = CompressFileProducer

def main():
    authorizer = ftpserver.DummyAuthorizer()
    authorizer.add_user('newuser', password="123",
                        homedir='D:\= KULIAH =\Folder FTP', perm='elrmafdw')
    authorizer.add_anonymous(homedir='C:\Users\Hikmah')

    handler = ftpserver.FTPHandler
    handler.authorizer = authorizer

    address = ('127.0.0.1', 21)
    server = ftpserver.FTPServer(address, handler)

    server.serve_forever()

if __name__ == '__main__':
    main()

==========================
But I got an error, which says:

127.0.0.1:49916 ==> 227 Entering passive mode (127,0,0,1,195,0).
127.0.0.1:49916 <== RETR judul.txt
127.0.0.1:49916 ==> 125 Data connection already open. Transfer starting.
Traceback (most recent call last):
  File "C:\Python26\pyftpdlib\ftpserver.py", line 2211, in push_dtp_data
    self.data_channel.push_with_producer(data)
  File "C:\Python26\lib\asynchat.py", line 190, in push_with_producer
    self.initiate_send()
  File "C:\Python26\lib\asynchat.py", line 226, in initiate_send
    data = first.more()
  File "D:\= KULIAH =\Semester 8\workspace\BelajarTA\src\server.py", line 12, in more
    chunk = ftpserver.FileProducer.more(self)
  File "D:\= KULIAH =\Semester 8\workspace\BelajarTA\src\server.py", line 12, in more
    chunk = ftpserver.FileProducer.more(self)
RuntimeError: maximum recursion depth exceeded while calling a Python object

127.0.0.1:49916 ==> 426 Internal error; transfer aborted.
--

What should I do about this error?
Thanks,

Hikmah

Hikmah

May 11, 2011, 7:03:06 AM

to Python FTP server library - Discussion group

Hi,

Andrew Scheller

May 11, 2011, 9:30:22 AM

to pyft...@googlegroups.com

Hello Hikmah,

> class CompressFileProducer(ftpserver.FileProducer):
>     def more(self):
>         chunk = ftpserver.FileProducer.more(self)

[snip]
> ftpserver.FileProducer = CompressFileProducer

Doing this creates infinite recursion: after the monkey patch,
ftpserver.FileProducer refers to CompressFileProducer itself, so more()
ends up calling itself. The easiest way to fix it is to store away a
reference to the old class before monkey-patching it.

e.g. change:
    ftpserver.FileProducer = CompressFileProducer
to:
    CompressFileProducer._baseProducer = ftpserver.FileProducer
    ftpserver.FileProducer = CompressFileProducer

and then change:
    chunk = ftpserver.FileProducer.more(self)
to:
    chunk = CompressFileProducer._baseProducer.more(self)
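
The save-then-patch pattern described above can be tried in isolation. This is a minimal Python 3 sketch under stated assumptions: the ftpserver object here is a hypothetical stand-in namespace rather than the real pyftpdlib module, and upper-casing the chunk is just a placeholder for compression.

```python
import types

# Hypothetical stand-in for pyftpdlib's ftpserver module.
ftpserver = types.SimpleNamespace()


class _OriginalProducer:
    def __init__(self, chunks):
        self._chunks = list(chunks)

    def more(self):
        return self._chunks.pop(0) if self._chunks else b""


ftpserver.FileProducer = _OriginalProducer


class UpperFileProducer(ftpserver.FileProducer):
    def more(self):
        # Call the saved original, not whatever ftpserver.FileProducer is now.
        chunk = UpperFileProducer._baseProducer.more(self)
        return chunk.upper() if chunk else b""


UpperFileProducer._baseProducer = ftpserver.FileProducer  # save first
ftpserver.FileProducer = UpperFileProducer                # then patch

producer = ftpserver.FileProducer([b"ab", b"cd"])
result = [producer.more(), producer.more(), producer.more()]
```

With the reference saved before the patch, the third call returns an empty value instead of recursing.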

However, doing that then throws up other errors, because you're trying
to use GzipFile in a streaming fashion (pyftpdlib reads and transmits
files chunk by chunk, instead of reading the whole file at once), and
Python's gzip module unfortunately doesn't support streaming:
http://www.google.co.uk/search?q=python+stream+gzip

However, the gzip module is in turn based on the zlib module, and the
zlib module *does* support streaming, so it should be possible to poke
around in the gzip.py source code and pull out enough functionality to
get streaming gzip compression working.
Or you might be able to get
https://fedorahosted.org/spacewalk/browser/projects/python-gzipstream
working (but there's no documentation, so I don't know whether it supports
streaming compression or only streaming decompression).
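
As a rough illustration of the zlib route mentioned above, the sketch below compresses data chunk by chunk into gzip format using zlib.compressobj. It is written for Python 3 rather than the Python 2.6 used elsewhere in this thread, and gzip_chunks is a hypothetical helper name, not part of pyftpdlib.

```python
import gzip
import zlib


def gzip_chunks(chunks, level=6):
    """Yield gzip-format output for an iterable of byte chunks."""
    # wbits=31 (16 + 15) asks zlib to write a gzip header and trailer.
    compressor = zlib.compressobj(level, zlib.DEFLATED, 31)
    for chunk in chunks:
        out = compressor.compress(chunk)
        if out:  # the compressor may buffer input and return nothing yet
            yield out
    yield compressor.flush()  # remaining data plus the gzip trailer


original = [b"first chunk, ", b"second chunk, ", b"third chunk"]
compressed = b"".join(gzip_chunks(original))
```

Because the compressor object keeps its state between calls, the concatenated output is one valid gzip stream rather than one stream per chunk.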

Andrew

Andrew Scheller

May 11, 2011, 12:50:18 PM

to pyft...@googlegroups.com

> python's gzip module unfortunately doesn't support streaming:
> http://www.google.co.uk/search?q=python+stream+gzip

[snip]

OMG! I tried playing around with the code from
http://stackoverflow.com/questions/2192529/python-creating-a-streaming-gzipd-file-like
and I actually got it working! :-)
So then I spent a while tidying it up and making it neater. Here's the
new version:

from pyftpdlib import ftpserver
import gzip
import os

class CompressedFileProducer(object):
    """Producer wrapper for transparently gzip-compressed file[-like]
    objects."""

    read_buffer_size = send_buffer_size = 65536

    class StreamBuffer(object):
        """A file-like object for streaming writes."""
        def __init__(self, chunksize=-1):
            self.buffer = ''
            self.chunksize = chunksize
        def isfull(self):
            if self.chunksize < 0:
                return len(self.buffer) > 0
            else:
                return len(self.buffer) >= self.chunksize
        def isempty(self):
            return len(self.buffer) == 0
        def readchunk(self):
            if self.chunksize < 0:
                ret, self.buffer = self.buffer, ''
            else:
                ret, self.buffer = (self.buffer[:self.chunksize],
                                    self.buffer[self.chunksize:])
            return ret
        def write(self, data):
            self.buffer += data
        def flush(self):
            pass
        def close(self):
            pass

    def __init__(self, file, type):
        """Initialize the producer and check the TYPE.

        - (file) file: the file[-like] object.
        - (str) type: the current TYPE, 'a' (ASCII) or 'i' (binary).
        """
        self.done = False
        self.file = file
        self.stream = CompressedFileProducer.StreamBuffer(self.send_buffer_size)
        if type == 'i':
            # GzipFile writes its compressed output into the stream buffer
            self.zipper = gzip.GzipFile(
                filename=os.path.basename(file.name), mode='wb',
                fileobj=self.stream, mtime=os.fstat(file.fileno()).st_mtime)
        else:
            raise TypeError("unsupported type")

    def more(self):
        """Return the next chunk of compressed data, at most
        self.send_buffer_size bytes; an empty string means EOF."""
        if self.stream.isfull():
            return self.stream.readchunk()
        if self.done:
            if not self.stream.isempty():
                return self.stream.readchunk()
            else:
                return ''
        # Keep feeding the compressor until a full chunk is buffered or EOF
        while not self.done:
            try:
                data = self.file.read(self.read_buffer_size)
            except OSError as err:
                # the exception class lives in the ftpserver module
                raise ftpserver._FileReadWriteError(err)
            if data:
                self.zipper.write(data)
                self.zipper.flush()
                if self.stream.isfull():
                    break
            else:
                self.done = True
                if not self.file.closed:
                    self.file.close()
                self.zipper.close()
        return self.stream.readchunk()

def main():
    ftpserver.FileProducer = CompressedFileProducer
    authorizer = ftpserver.DummyAuthorizer()
    authorizer.add_user('user', password="123", homedir='/home/user',
                        perm='elrmafdw')
    authorizer.add_anonymous(homedir='/tmp')

    handler = ftpserver.FTPHandler
    handler.authorizer = authorizer
    address = ('127.0.0.1', 21)
    server = ftpserver.FTPServer(address, handler)
    server.serve_forever()

if __name__ == '__main__':
    main()

Seems to work well for me, but it would be nice if other people could test it.

Giampaolo: Is this code "good enough" to be added to the
pyftpdlib/contrib/ directory? I'm quite pleased with how it worked
out, but any constructive criticism would be welcome.

Andrew

Giampaolo Rodolà

May 11, 2011, 1:03:24 PM

to pyft...@googlegroups.com

Glad it worked out for you.

I don't think this fits well in the /contrib directory as it's a very specialized change which (I guess) also requires a customized client on the other end which is able to decompress the stream.

Am I wrong?

Andrew Scheller

May 11, 2011, 1:39:27 PM

to pyft...@googlegroups.com

> I don't think this fits well in the /contrib directory as it's a very
> specialized change

Fair enough. I just thought it might be a nice example of a custom
FileProducer. But of course it's your call.

> which (I guess) also requires a customized client on the
> other end which is able to decompress the stream.

Nah, you can download the files with any regular client, and then
expand them with any gzip program. Although you may need to use the -c
option to stop gzip complaining about "unknown suffix". And I'm
definitely not gonna add on-the-fly filename-changing to pyftpdlib ;)
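
For anyone who prefers to expand the data in Python rather than with a gzip program, the sketch below decompresses a gzip stream incrementally as blocks arrive. It assumes Python 3, and GunzipSink is a hypothetical helper; the download is simulated here instead of using a real server.

```python
import gzip
import zlib


class GunzipSink:
    """Collects data passed to write() and gunzips it incrementally."""

    def __init__(self):
        # wbits=31 tells zlib to expect a gzip header and trailer.
        self._decompressor = zlib.decompressobj(31)
        self.parts = []

    def write(self, data):
        self.parts.append(self._decompressor.decompress(data))

    def result(self):
        self.parts.append(self._decompressor.flush())
        return b"".join(self.parts)


# Simulate a download arriving in small blocks.
payload = gzip.compress(b"contents of judul.txt")
sink = GunzipSink()
for i in range(0, len(payload), 8):
    sink.write(payload[i:i + 8])
text = sink.result()
```

With a regular client library such as ftplib, sink.write could be passed as the callback to FTP.retrbinary("RETR judul.txt", sink.write), since the suffix of the remote file name plays no role here.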

Andrew

Giampaolo Rodolà

May 17, 2011, 1:31:43 PM

to pyft...@googlegroups.com

After some research it seems a semi-standard method to do this "cleanly" exists:

http://www.g6ftpserver.com/en/modez

http://tools.ietf.org/html/draft-preston-ftpext-deflate-03

MODE Z is also supported by proftpd:

http://castaglia.org/proftpd/modules/mod_deflate.html#Usage

It might be good to add support in pyftpdlib as well.

I'll file a ticket on the bug tracker.

Andrew Scheller

May 18, 2011, 8:23:12 AM

to pyft...@googlegroups.com

> After some research it seems a semi-standard method to do this "cleanly"
> exists:

> http://tools.ietf.org/html/draft-preston-ftpext-deflate-03

Just a minor comment - there's actually a newer version of that draft RFC
http://tools.ietf.org/html/draft-preston-ftpext-deflate-04

Andrew

Hikmah

Jul 5, 2011, 3:32:08 AM

to pyft...@googlegroups.com

Hi,

I have implemented the server with LZMA compression. Here is the code:

[code]

# Imports assumed from the names used below
from StringIO import StringIO

import pylzma
from pyftpdlib import ftpserver

class CompressFileProducer(ftpserver.FileProducer):

    def more(self):
        chunk = CompressFileProducer._baseProducer.more(self)
        #print chunk
        if not chunk:
            return ""  # EOF
        else:
            return self.compress_lzma(chunk)

    def compress_lzma(self, data_req):
        # Each chunk is compressed on its own, with an end-of-stream marker
        buf = StringIO(data_req)
        c_buf = pylzma.compressfile(buf, eos=1)
        compressed_lz = ''
        count = 0
        while True:
            b = c_buf.read(10)
            if not b:
                break
            print count, ":", b
            compressed_lz += b
            #print compressed
            count = count + 1
        return compressed_lz

# Save the original class before monkey-patching it
CompressFileProducer._baseProducer = ftpserver.FileProducer
ftpserver.FileProducer = CompressFileProducer

def main():
    authorizer = ftpserver.DummyAuthorizer()

    # Define a new user having full r/w permissions
    authorizer.add_user('user', password="123", homedir='/home/hikmah', perm='elrmafdw')

    # Use the FTP handler class
    handler = ftpserver.FTPHandler
    handler.authorizer = authorizer

    # Instantiate FTP server class and listen on 127.0.0.1:21
    address = ('127.0.0.1', 21)
    server = ftpserver.FTPServer(address, handler)
    server.max_cons = 256
    server.max_cons_per_ip = 5
    server.serve_forever()

if __name__ == '__main__':
    main()

[/code]

Now, on the client side, I want to let the client choose whether or not to use LZMA compression.

With compression, the "get" instruction would look like this:

get file.txt lzma

And without compression, the instruction would look like this:

get file.txt none

In this case, which method should I change on the server side so that I can handle these two choices from the client?

Regards, Hikmah

Hikmah

Jul 6, 2011, 4:20:38 AM

to pyft...@googlegroups.com

Andrew Scheller

Jul 6, 2011, 5:15:35 AM

to pyft...@googlegroups.com

> Now.. from the client side.. I wanted to make the client choose between
> using the lzma compression or not..
> The "get" instruction when we use the compression would be like this :
> get file.txt lzma
> And, if we didnt use the compression, the instruction would be like this..
> get file.txt none

I'm afraid that's not the way the FTP protocol works... when you type
"get file.txt" in your FTP client, your FTP client actually sets up a
data connection to the server, and then sends a "RETR file.txt"
command. So "get file.txt lzma" doesn't make any sense.
I think the way to do what you're trying to do, would be to set up a
custom OPTS command which allows you to turn lzma-compression on or
off, and then just use regular get requests.
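
The idea of a switch that is set once and then applies to ordinary get requests might look roughly like the sketch below. Everything in it is an assumption made for illustration: the "OPTS LZMA ON|OFF" syntax and the TransferOptions class are hypothetical, it is not wired into pyftpdlib, and it uses the lzma module from the Python 3 standard library rather than pylzma.

```python
import io
import lzma


class TransferOptions:
    """Per-session switch, set by a hypothetical 'OPTS LZMA ON|OFF' command."""

    def __init__(self):
        self.use_lzma = False

    def handle_opts(self, argument):
        """Parse the text after 'OPTS' and return an FTP-style reply line."""
        parts = argument.upper().split()
        if len(parts) == 2 and parts[0] == "LZMA" and parts[1] in ("ON", "OFF"):
            self.use_lzma = parts[1] == "ON"
            return "200 LZMA compression %s." % parts[1].lower()
        return "501 Invalid argument."

    def chunks_for(self, fileobj, blocksize=65536):
        """Yield the bytes to send for a plain RETR, honouring the switch."""
        compressor = lzma.LZMACompressor() if self.use_lzma else None
        while True:
            block = fileobj.read(blocksize)
            if not block:
                break
            out = compressor.compress(block) if compressor else block
            if out:
                yield out
        if compressor:
            yield compressor.flush()  # finish the compressed stream


data = b"example file contents " * 100
options = TransferOptions()
plain = b"".join(options.chunks_for(io.BytesIO(data)))
reply = options.handle_opts("lzma on")
packed = b"".join(options.chunks_for(io.BytesIO(data)))
```

The client then sends the option once and keeps using "get file.txt" as usual; the server decides per session how the data channel is filled.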

If you haven't already done so, I recommend reading about the MODE Z
that Giampaolo already mentioned
http://tools.ietf.org/html/draft-preston-ftpext-deflate-04
to get an idea of how this sort of thing works.

Andrew
