How to receive a data file of unknown length using a python socket?


twgray

Jul 18, 2009, 5:33:48 PM
I am attempting to send a jpeg image file created on an embedded
device over a wifi socket to a Python client running on a Linux pc
(Ubuntu). All works well, except I don't know, on the pc client side,
what the file size is? The following is a snippet:

[code]
recvd = 0
f = open("frame.jpg", mode='wb')
while True:
    data = self.s.recv(MAXPACKETLEN)
    if len(data) == 0:
        # recv() returns an empty string only once the sender has closed the connection
        break
    recvd += len(data)
    f.write(data)
f.close()
[end]

It appears to be locking up in 'data=self.s.recv(MAXPACKETLEN)' on
the final packet, which will always be less than MAXPACKETLEN.

I guess my question is, how do I detect end of data on the client side?

Irmen de Jong

Jul 18, 2009, 5:43:47 PM
twgray wrote:
> I am attempting to send a jpeg image file created on an embedded
> device over a wifi socket to a Python client running on a Linux pc
> (Ubuntu). All works well, except I don't know, on the pc client side,
> what the file size is?

You don't. Sockets are just endless streams of bytes. You will have to design some form
of 'wire protocol' that includes the length of the message that is to be read.
For instance a minimalistic protocol could be the following:
Send 4 bytes that contain the length (an int) then the data itself. The client reads 4
bytes, decodes it into the integer that tells it the length, and then reads the correct
amount of bytes from the socket.
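
A minimal sketch of that length-prefixed protocol, written for Python 3 and assuming the length is sent as an unsigned 4-byte integer in network (big-endian) byte order. The helper names recv_exact, recv_message and send_message are made up for illustration; the embedded device would have to send the same 4-byte header.

```python
import socket
import struct


def recv_exact(sock, n):
    """Read exactly n bytes from sock, or raise if the peer closes early."""
    chunks = []
    remaining = n
    while remaining > 0:
        # recv() may return fewer bytes than requested, so keep looping.
        chunk = sock.recv(min(remaining, 4096))
        if not chunk:
            raise ConnectionError("socket closed with %d bytes still expected" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_message(sock):
    """Read a 4-byte big-endian length, then that many bytes of payload."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)


def send_message(sock, payload):
    """Send the payload preceded by its length as a 4-byte big-endian integer."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)
```

With this in place the receiver never waits for bytes that are not coming, and the connection can stay open for the next image.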


--irmen

Tycho Andersen

Jul 18, 2009, 5:52:41 PM
to pytho...@python.org

Exactly, sending the length first is the only way to know ahead of
time. Alternatively, if you know what the end of the data looks like,
you can look for that 'flag' as well, and stop trying to recv() after
that.

Some things that may be useful, though, are socket.settimeout() and
socket.setblocking(). More information is available in the docs:
http://docs.python.org/library/socket.html.

You need to be careful with these, since network latency may cause a
timeout to fire before all the data has arrived. Using them will,
however, keep your program from sitting in recv() forever.
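
A rough sketch of the timeout approach, in Python 3. The function name recv_until_quiet and the 2-second default are assumptions chosen for illustration. As noted above, a slow link can make the timeout fire early, so this only guesses at the end of the data.

```python
import socket


def recv_until_quiet(sock, idle_seconds=2.0, bufsize=4096):
    """Collect data until the peer closes or nothing arrives for idle_seconds."""
    sock.settimeout(idle_seconds)
    received = bytearray()
    while True:
        try:
            chunk = sock.recv(bufsize)
        except socket.timeout:
            break  # nothing arrived in time; treated as end of data (may be premature)
        if not chunk:
            break  # the peer closed the connection
        received.extend(chunk)
    return bytes(received)
```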

--
http://tycho.ws

twgray

Jul 18, 2009, 6:04:41 PM

Thanks for the reply. But, now I have a newbie Python question. If I
send a 4 byte address from the embedded device, how do I convert that,
in Python, to a 4 byte, or long, number?

MRAB

Jul 18, 2009, 7:05:48 PM
to pytho...@python.org
twgray wrote:
> On Jul 18, 4:43 pm, Irmen de Jong <irmen.NOS...@xs4all.nl> wrote:
>> twgray wrote:
>>> I am attempting to send a jpeg image file created on an embedded
>>> device over a wifi socket to a Python client running on a Linux pc
>>> (Ubuntu). All works well, except I don't know, on the pc client side,
>>> what the file size is?
>> You don't. Sockets are just endless streams of bytes. You will have to design some form
>> of 'wire protocol' that includes the length of the message that is to be read.
>> For instance a minimalistic protocol could be the following:
>> Send 4 bytes that contain the length (an int) then the data itself. The client reads 4
>> bytes, decodes it into the integer that tells it the length, and then reads the correct
>> amount of bytes from the socket.
>>
> Thanks for the reply. But, now I have a newbie Python question. If I
> send a 4 byte address from the embedded device, how do I convert that,
> in Python, to a 4 byte, or long, number?

If you send the length as 4 bytes then you'll have to decide whether
it's big-endian or little-endian. An alternative is to send the length
as characters, terminated by, say, '\n' or chr(0).
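
A sketch of the text-length alternative in Python 3, assuming the sender writes the length as ASCII decimal digits followed by '\n' and then the raw data. The names recv_exact and recv_with_text_length, and the 10-digit limit, are hypothetical.

```python
import socket


def recv_exact(sock, n):
    """Read exactly n bytes from sock, or raise if the peer closes early."""
    data = bytearray()
    while len(data) < n:
        chunk = sock.recv(n - len(data))
        if not chunk:
            raise ConnectionError("socket closed before all data arrived")
        data.extend(chunk)
    return bytes(data)


def recv_with_text_length(sock, max_digits=10):
    """Read ASCII digits up to a newline, then that many bytes of payload."""
    digits = bytearray()
    while True:
        byte = sock.recv(1)  # one byte at a time so no payload is consumed
        if not byte:
            raise ConnectionError("socket closed while reading the length")
        if byte == b"\n":
            break
        digits.extend(byte)
        if len(digits) > max_digits:
            raise ValueError("length field too long")
    if not digits.isdigit():
        raise ValueError("length field is not a decimal number")
    return recv_exact(sock, int(digits.decode("ascii")))
```

Endianness does not arise here, and the header is readable in a packet capture, at the cost of a variable-width length field.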

John Machin

Jul 18, 2009, 7:26:21 PM
On Jul 19, 8:04 am, twgray <twgray2...@gmail.com> wrote:

> send a 4 byte address from the embedded device, how do I convert that,
> in Python, to a 4 byte, or long, number?

struct.unpack() is your friend. Presuming the embedded device is
little-endian, you do:

the_int = struct.unpack('<I', four_bytes)[0]

See http://docs.python.org/library/struct.html
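
A short illustration of why the byte order matters, in Python 3. The four sample bytes are made up; the same bytes decode to very different numbers depending on the format character.

```python
import struct

# Four hypothetical bytes as they might arrive from the device.
four_bytes = b"\x10\x27\x00\x00"

# '<I' reads an unsigned 32-bit integer, least significant byte first.
little = struct.unpack("<I", four_bytes)[0]

# '>I' reads the same bytes most significant byte first.
big = struct.unpack(">I", four_bytes)[0]

print(little)    # 10000
print(hex(big))  # 0x10270000
```

If the decoded length looks absurdly large, the wrong byte order is a likely cause.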

Nobody

Jul 18, 2009, 7:52:30 PM
On Sat, 18 Jul 2009 14:33:48 -0700, twgray wrote:

> It appears to be locking up in 'data=self.s.recv(MAXPACKETLEN)' on
> the final packet, which will always be less than MAXPACKETLEN.
>
> I guess my question is, how do I detect end of data on the client side?

recv() should return an empty string (zero bytes) when the sender closes
its end of the connection.

Is the sender actually closing its end? If you are unsure, use a packet
sniffer such as tcpdump to look for a packet with the FIN flag.

If you need to keep the connection open for further transfers, you need to
incorporate some mechanism for identifying the end of the data into the
protocol. As others have suggested, prefixing the data by its length is
one option. Another is to use an end-of-data marker, but then you need a
mechanism to "escape" the marker if it occurs in the data. A length prefix
is probably simpler to implement, but has the disadvantage that you can't
start sending the data until you know how long it is going to be.

John Machin

Jul 18, 2009, 8:12:32 PM
On Jul 19, 7:43 am, Irmen de Jong <irmen.NOS...@xs4all.nl> wrote:
> twgray wrote:
> > I am attempting to send a jpeg image file created on an embedded
> > device over a wifi socket to a Python client running on a Linux pc
> > (Ubuntu).  All works well, except I don't know, on the pc client side,
> > what the file size is?  
>
> You don't. Sockets are just endless streams of bytes. You will have to design some form
> of 'wire protocol' that includes the length of the message that is to be read.

Apologies in advance for my ignorance -- the last time I dipped my toe
in that kind of water, protocols like zmodem and Kermit were all the
rage -- but I would have thought there would have been an off-the-
shelf library for peer-to-peer file transfer over a socket
interface ... not so?

MRAB

Jul 18, 2009, 8:33:32 PM
to pytho...@python.org
You could send it in chunks, ending with a chunk length of zero.
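
A sketch of that chunked scheme in Python 3, assuming each chunk is preceded by a 2-byte big-endian length (so chunks are at most 65535 bytes) and a length of zero ends the transfer. The names send_chunked, recv_chunked and recv_exact are hypothetical.

```python
import socket
import struct


def recv_exact(sock, n):
    """Read exactly n bytes from sock, or raise if the peer closes early."""
    data = bytearray()
    while len(data) < n:
        chunk = sock.recv(n - len(data))
        if not chunk:
            raise ConnectionError("socket closed before all data arrived")
        data.extend(chunk)
    return bytes(data)


def send_chunked(sock, chunks):
    """Send each non-empty chunk with a 2-byte length, then a zero length."""
    for chunk in chunks:
        if chunk:  # an empty chunk would be mistaken for the terminator
            sock.sendall(struct.pack("!H", len(chunk)) + chunk)
    sock.sendall(struct.pack("!H", 0))


def recv_chunked(sock):
    """Read length-prefixed chunks until a zero-length chunk arrives."""
    parts = []
    while True:
        (size,) = struct.unpack("!H", recv_exact(sock, 2))
        if size == 0:
            return b"".join(parts)
        parts.append(recv_exact(sock, size))
```

The sender can start transmitting before it knows the total size, which a single length prefix does not allow.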

twgray

Jul 18, 2009, 10:02:05 PM

Thanks for the help!

Aahz

Jul 18, 2009, 10:46:01 PM
In article <mailman.3382.1247958...@python.org>,

MRAB <pyt...@mrabarnett.plus.com> wrote:
>
>If you send the length as 4 bytes then you'll have to decide whether
>it's big-endian or little-endian. An alternative is to send the length
>as characters, terminated by, say, '\n' or chr(0).

Alternatively, make it a fixed-length string of bytes, zero-padded in
front.
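
A sketch of the fixed-width variant in Python 3. The width of 8 digits and the names LENGTH_FIELD_WIDTH, encode_length and decode_length are assumptions for illustration; the receiver reads exactly that many bytes before the data.

```python
LENGTH_FIELD_WIDTH = 8  # hypothetical width; allows lengths up to 99999999


def encode_length(n):
    """Return n as ASCII decimal digits, zero-padded in front to a fixed width."""
    if n < 0:
        raise ValueError("length must not be negative")
    text = str(n).zfill(LENGTH_FIELD_WIDTH)
    if len(text) > LENGTH_FIELD_WIDTH:
        raise ValueError("length does not fit in the header")
    return text.encode("ascii")


def decode_length(field):
    """Parse a header produced by encode_length back into an integer."""
    if len(field) != LENGTH_FIELD_WIDTH or not field.isdigit():
        raise ValueError("malformed length header")
    return int(field.decode("ascii"))


print(encode_length(1234))  # b'00001234'
```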
--
Aahz (aa...@pythoncraft.com) <*> http://www.pythoncraft.com/

"The volume of a pizza of thickness 'a' and radius 'z' is
given by pi*z*z*a"

Piet van Oostrum

Jul 19, 2009, 2:55:20 AM
>>>>> John Machin <sjma...@lexicon.net> (JM) wrote:

>JM> On Jul 19, 7:43 am, Irmen de Jong <irmen.NOS...@xs4all.nl> wrote:
>>> twgray wrote:
>>> > I am attempting to send a jpeg image file created on an embedded
>>> > device over a wifi socket to a Python client running on a Linux pc
>>> > (Ubuntu). All works well, except I don't know, on the pc client side,
>>> > what the file size is?
>>>
>>> You don't. Sockets are just endless streams of bytes. You will have to design some form
>>> of 'wire protocol' that includes the length of the message that is to be read.

>JM> Apologies in advance for my ignorance -- the last time I dipped my toe
>JM> in that kind of water, protocols like zmodem and Kermit were all the
>JM> rage -- but I would have thought there would have been an off-the-
>JM> shelf library for peer-to-peer file transfer over a socket
>JM> interface ... not so?

Yes, many of them, for example HTTP or FTP. But I suppose they are
overkill in this situation. There are also remote procedure call
protocols which can do much more, like XMLRPC.

By the way, if the image file is the only thing you send, the sender
should close the socket after sending. The receiver will then see end
of file, which your `if len(data) == 0:' test already detects.
--
Piet van Oostrum <pi...@cs.uu.nl>
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: pi...@vanoostrum.org

Hendrik van Rooyen

Jul 19, 2009, 5:47:07 AM
to pytho...@python.org
On Sunday 19 July 2009 02:12:32 John Machin wrote:
>
> Apologies in advance for my ignorance -- the last time I dipped my toe
> in that kind of water, protocols like zmodem and Kermit were all the
> rage -- but I would have thought there would have been an off-the-
> shelf library for peer-to-peer file transfer over a socket
> interface ... not so?

*Grins at the references to Kermit and zmodem,
and remembers Laplink and PC Anywhere*

If there is such a transfer beast in Python, I have
not found it.
(There is an FTP module but that is not quite
the same thing)

I think it is because the network stuff is
all done in the OS or NFS and SAMBA
now - with drag and drop support and
other nice goodies.

I have ended up writing a netstring thingy,
that addresses the string transfer problem
by having a start sentinel, a four byte ASCII
length (so you can see it with a packet
sniffer/displayer) and the rest of the
data escaped to take out the start
sentinel and the escape character.

It works, but the four byte ASCII limits the size
of what can be sent and received.

It guarantees to deliver either the whole
string, or fail, or timeout.

If anybody is interested I will attach the
code here. It is not a big module.

This question seems to come up periodically
in different guises.

To the OP:

There are really very few valid ways of
solving the string transfer problem,
given a featureless stream of bytes
like a socket.

The first thing that must be addressed
is to sync up - you have to somehow
find the start of the thing as it comes
past.

And the second is to find the end of the
slug of data that you are transferring.

So the simplest way is to designate a byte
as a start and end sentinel, and to make
sure that such a byte does not occur in
the data stream, other than as a start
and end marker. This process is called
escaping, and the reverse is called
unescaping. (SDLC/HDLC does this at a bit
pattern level)
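
A sketch of byte-level escaping in Python 3. The byte values are an assumption borrowed from PPP-style byte stuffing (0x7e as sentinel, 0x7d as escape, escaped bytes XORed with 0x20), and the function names are hypothetical; any pair of reserved bytes would work the same way.

```python
SENTINEL = b"\x7e"  # assumed start/end marker
ESCAPE = b"\x7d"    # assumed escape byte


def escape(payload):
    """Replace reserved bytes so the sentinel never appears inside the data."""
    # The escape byte must be handled first, or the escapes added for the
    # sentinel would themselves be escaped again.
    stuffed = payload.replace(ESCAPE, ESCAPE + b"\x5d")
    return stuffed.replace(SENTINEL, ESCAPE + b"\x5e")


def unescape(stuffed):
    """Reverse escape(): drop each escape byte and restore the byte after it."""
    out = bytearray()
    it = iter(stuffed)
    for byte in it:
        if byte == ESCAPE[0]:
            follower = next(it, None)
            if follower is None:
                raise ValueError("dangling escape byte")
            out.append(follower ^ 0x20)
        else:
            out.append(byte)
    return bytes(out)


def frame(payload):
    """Wrap the escaped payload in start and end sentinels."""
    return SENTINEL + escape(payload) + SENTINEL
```

A receiver that joins the stream late can discard bytes until it sees the sentinel, which is the syncing step described above.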

Another way is to use time, namely to
rely on there being some minimum
time between slugs of data. This
does not work well on TCP/IP sockets,
as retries at the lower protocol levels
can give you false breaks in the stream.
It works well on direct connections like
RS-232 or RS-485/422 lines.

Classic netstrings send length, then data.
They rely on the lower level protocols and
the length sent for demarcation of
the slug, and work well if you connect,
send a slug or two, and disconnect. They
are not so hot for long running processes,
where processors can drop out while
sending - there is no reliable way for a
stable receiver to sync up again if it is
waiting for a slug that will not finish.
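
For reference, a sketch of the classic netstring encoding in Python 3, assuming the usual layout of decimal length, a colon, the data, and a trailing comma. The function names are hypothetical, and this is not the attached netstring.py. The decoder works on a buffer of bytes already received and reports when more data is needed.

```python
def encode_netstring(payload):
    """Encode bytes as <decimal length>:<payload>,"""
    return str(len(payload)).encode("ascii") + b":" + payload + b","


def decode_netstring(buffer):
    """Return (payload, rest) for the first netstring, or None if incomplete."""
    head, sep, tail = buffer.partition(b":")
    if not sep:
        return None  # the length field has not fully arrived yet
    if not head.isdigit():
        raise ValueError("malformed netstring length")
    length = int(head.decode("ascii"))
    if len(tail) < length + 1:
        return None  # payload or trailing comma still missing
    if tail[length:length + 1] != b",":
        raise ValueError("netstring is missing its trailing comma")
    return tail[:length], tail[length + 1:]


print(encode_netstring(b"hello world!"))  # b'12:hello world!,'
```

Returning None for an incomplete buffer is exactly the state in which a receiver can wait forever if the sender drops out, which is where a timeout helps.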

Adapting the netstring by adding a sync
character and time out is a compromise
that I have found works well in practice.

- Hendrik

pyt...@bdurham.com

Jul 19, 2009, 9:18:21 AM
to Hendrik van Rooyen, pytho...@python.org
Hi Hendrik,

> I have ended up writing a netstring thingy, that addresses the string transfer problem by having a start sentinel, a four byte ASCII length (so you can see it with a packet sniffer/displayer) and the rest of the data escaped to take out the start sentinel and the escape character. It works, but the four byte ASCII limits the size of what can be sent and received. It guarantees to deliver either the whole string, or fail, or timeout.

> If anybody is interested I will attach the code here. It is not a big module.

I am interested in seeing your code and would be grateful if you shared
it with this list.

Thank you,
Malcolm

Hendrik van Rooyen

Jul 19, 2009, 12:07:54 PM
to pyt...@bdurham.com, pytho...@python.org, Hendrik van Rooyen
On Sunday 19 July 2009 15:18:21 pyt...@bdurham.com wrote:
> Hi Hendrik,

> > If anybody is interested I will attach the code here. It is not a big
> > module.
>
> I am interested in seeing your code and would be grateful if you shared
> it with this list.

All right here it is.

Hope it helps

- Hendrik

netstring.py

pyt...@bdurham.com

Jul 19, 2009, 12:09:12 PM
to Hendrik van Rooyen, pytho...@python.org, Hendrik van Rooyen
>> I am interested in seeing your code and would be grateful if you shared it with this list.

> All right here it is. Hope it helps.

Hendrik,

Thank you very much!! (I'm not the OP, but found this thread
interesting)

Best regards,
Malcolm

sandye...@gmail.com

Apr 22, 2015, 12:26:23 PM
One way would be to send the file size at the beginning. For example, make the first s.send(size) contain only the size, converted to a string and padded to a fixed width equal to BufferSize; the string method .rjust() may be useful for this. The receiving end can then use that value to work out how many times to call s.recv(BufferSize). Hope this helps.