Data loss with USB-serial transfer

John Foster

May 12, 2007, 6:31:25 PM
I'm reading data from a USB serial device, and the data comes in blocks
of 65536 bytes. The baud rate is 230400, so the stuff arrives quite
fast. The channel is set to non-blocking, with a fileevent handler to
pick up the data as it arrives.

Every so often, a transfer will fail because fewer than 65536 bytes are
received. The amount actually received varies. When this does happen,
there will have been one or more triggers of the fileevent handler in
which exactly 3968 bytes were received. In almost all other cases, the
fileevent handler will have seen smaller amounts of data in each
invocation, but in one transfer that I have seen so far, there was a
case where 3978 bytes were found in one fileevent trigger, and the
correct number of bytes was still transferred.

The buffer size set by fconfigure was the default 4096 bytes. I tried
increasing it to 8192 but it made no difference. The transfer mode is
binary.
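
For scale, here is a rough back-of-the-envelope calculation of the timing involved. It is a Python sketch, not part of the actual setup, and it assumes 8N1 framing (10 bits on the wire per byte), which is not stated above:

```python
# Rough timing for the transfer described above.
# Assumption (not stated in the post): 8N1 framing, so each byte costs 10 bits.
BAUD = 230400
BITS_PER_BYTE = 10
BLOCK = 65536        # bytes per transfer
TCL_BUFFER = 4096    # default fconfigure -buffersize

bytes_per_second = BAUD / BITS_PER_BYTE
block_seconds = BLOCK / bytes_per_second
buffer_fill_seconds = TCL_BUFFER / bytes_per_second

print(f"{bytes_per_second:.0f} bytes/s")
print(f"{block_seconds:.2f} s per {BLOCK}-byte block")
print(f"{buffer_fill_seconds * 1000:.0f} ms to accumulate {TCL_BUFFER} bytes")
```

Under that assumption the line delivers 23040 bytes/s, a full block takes about 2.84 s, and 4096 bytes accumulate in roughly 178 ms.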

Googling hasn't helped, so I'm wondering whether anyone here can shed
any light on what may be happening, and what to do about it. I'd prefer
to find a solution that didn't involve reducing the baud rate, because
of the size of the data transfer.

I'm running Tcl 8.4.13 on a Linux i686 box with os version
2.6.19-1.2911.fc6, and I'm ready to apologize if the answer is that I
should upgrade Tcl.

TIA

John Foster
--
jsjf dot demon dot co dot uk, username john

Alexandre Ferrieux

May 12, 2007, 6:53:46 PM
On May 13, 12:31 am, John Foster <nos...@example.com> wrote:
> I'm reading data from a USB serial device, and the data comes in blocks
> of 65536 bytes. The baud rate is 230400, so the stuff arrives quite
> fast. The channel is set to non-blocking, with a fileevent handler to
> pick up the data as it arrives.
>
> Every so often, a transfer will fail because fewer than 65536 bytes are
> received....

>
> I'm running Tcl 8.4.13 on a Linux i686 box with os version

You're very lucky to be doing this under Unix !
Use the strace, Luke...

Indeed, there is a suspicion of data overrun, since the serial link
(true or emulated) has no flow control.
The [fconfigure -buffersize] may be large, but it will do no good if the
kernel buffer has already overflowed.
Admittedly, if [vwait] has a chance to get back to listening quickly
enough, this shouldn't happen.
However, I don't know the size of this kernel buffer, and maybe a
small latency is enough.
Use strace with the fine-grained timestamp option, and write its output
to a file (otherwise scrolling the terminal will slow everything down).
And try to see how much time is spent outside the poll() (i.e. the
[vwait]).
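
As a sketch of that measurement: if the trace is captured with something like `strace -ttt -T -o trace.txt tclsh script.tcl` (`-ttt` for epoch timestamps with microseconds, `-T` for the time spent in each call; the script and file names are placeholders), a short Python helper can report how long the process spends between wait calls. The sample lines below are invented for illustration, and the helper accepts either select() or poll(), since which one appears depends on how the event loop is built.

```python
import re

# Matches lines from `strace -ttt -T`: "<epoch> <syscall>(...) = ... <duration>"
# Assumption: the leading timestamp marks the start of the call.
LINE = re.compile(r"^(\d+\.\d+)\s+(\w+)\(.*<(\d+\.\d+)>\s*$")
WAIT_CALLS = {"select", "poll"}


def gaps_outside_wait(lines):
    """Return the time between the end of each wait call and the start of the next."""
    gaps = []
    previous_end = None
    for line in lines:
        match = LINE.match(line)
        if not match or match.group(2) not in WAIT_CALLS:
            continue
        start = float(match.group(1))
        duration = float(match.group(3))
        if previous_end is not None:
            gaps.append(start - previous_end)
        previous_end = start + duration
    return gaps


# Invented sample trace, for illustration only.
sample = [
    '100.000000 select(4, [3], [], [], NULL) = 1 (in [3]) <0.010000>',
    '100.010100 read(3, "..."..., 4096) = 512 <0.000020>',
    '100.010500 select(4, [3], [], [], NULL) = 1 (in [3]) <0.020000>',
    '100.030600 read(3, "..."..., 4096) = 3968 <0.000030>',
    '100.045000 select(4, [3], [], [], NULL) = 1 (in [3]) <0.001000>',
]
worst = max(gaps_outside_wait(sample))
print(f"longest time outside the wait call: {worst * 1000:.1f} ms")
```

A long gap just before a large read would point at latency in the script; uniformly short gaps would point elsewhere.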

You can also do the same experiment with 'cat' reading the same serial
device and writing to a file.
See if the saved file is complete. If it's not, you're in trouble
because it means the kernel buffer is tiny.
If it is, then cat beats Tcl (or at least your script) for now. Try to
reduce the latency from your various handlers.
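
The same check can also be scripted. Below is a minimal Python sketch (Unix only) that reads a file descriptor until a full block has arrived or the line goes quiet, so that a short transfer shows up as a byte count. The function name, the idle timeout, and the pipe used in the demonstration are illustrative assumptions; a real run would open the serial device instead.

```python
import os
import select


def capture_block(fd, expected=65536, idle_timeout=1.0, chunk=4096):
    """Read from fd until `expected` bytes arrive, EOF, or the line goes idle."""
    data = bytearray()
    while len(data) < expected:
        ready, _, _ = select.select([fd], [], [], idle_timeout)
        if not ready:
            break  # nothing arrived within idle_timeout: give up
        piece = os.read(fd, min(chunk, expected - len(data)))
        if not piece:
            break  # end of file
        data += piece
    return bytes(data)


# Demonstration on a pipe standing in for the device (illustrative only).
read_end, write_end = os.pipe()
os.write(write_end, b"\x55" * 3968)
os.close(write_end)
block = capture_block(read_end, expected=4096, idle_timeout=0.2)
os.close(read_end)
print(f"received {len(block)} of 4096 bytes")
```

If a loop this simple also comes up short, the loss is happening below the script.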

And *please* report back ;-)

-Alex

John Foster

May 12, 2007, 8:56:17 PM
Alexandre Ferrieux wrote:

Alex, thank you very much for such a quick response!

> On May 13, 12:31 am, John Foster <nos...@example.com> wrote:
>> I'm reading data from a USB serial device, and the data comes in
>> blocks of 65536 bytes. The baud rate is 230400, so the stuff arrives
>> quite fast. The channel is set to non-blocking, with a fileevent
>> handler to pick up the data as it arrives.
>>
>> Every so often, a transfer will fail because fewer than 65536 bytes
>> are received....
>>
>> I'm running Tcl 8.4.13 on a Linux i686 box with os version
>
> You're very lucky to be doing this under Unix !
> Use the strace, Luke...
>
> Indeed, there is a suspicion of data overrun, since the serial link
> (true or emulated) has no flow control.

Yes, that's what bothered me. It does look like a flow control problem,
yet I can't see any way of fixing it.

> The [fconfigure -buffersize] may be large, but it will do no good if the
> kernel buffer has already overflowed.
> Admittedly, if [vwait] has a chance to get back to listening quickly
> enough, this shouldn't happen.
> However, I don't know the size of this kernel buffer, and maybe a
> small latency is enough.
> Use strace with the fine-grained timestamp option, and write its output
> to a file (otherwise scrolling the terminal will slow everything down).
> And try to see how much time is spent outside the poll() (i.e. the
> [vwait]).

Right now, I don't quite understand what you're saying here. But I've
never used strace, so I'll start that research now...

>
> You can also do the same experiment with 'cat' reading the same serial
> device and writing to a file.
> See if the saved file is complete. If it's not, you're in trouble
> because it means the kernel buffer is tiny.
> If it is, then cat beats Tcl (or at least your script) for now. Try to
> reduce the latency from your various handlers.

Tricky to use 'cat', because I have to send commands to the device in
order to trigger the output. And because this thing doesn't happen
every time, to capture it I really do need a script.

Reducing the latency from the handlers would be a possibility if there
were any running, other than the fileevent handler. But there aren't
any. It seems to be the operating system latency that's causing the
problem. But if that's the case, how come? The USB standard doesn't
seem to say that flow control is ever necessary. I'm baffled.

>
> And *please* report back ;-)
>
> -Alex

--

Uwe Klein

May 13, 2007, 3:24:02 AM
John Foster wrote:
> I'm reading data from a USB serial device, and the data comes in blocks
> of 65536 bytes. The baud rate is 230400, so the stuff arrives quite
> fast. The channel is set to non-blocking, with a fileevent handler to
> pick up the data as it arrives.

is your device a _real_ serial device that is connected via
one of various RS232 <> USB adapters
or a device that behaves like a class serial device?

I have used these
067b:2303 Prolific Technology, Inc. PL2303 Serial Port
to connect to embedded systems and _downloaded_ i.e.
outbound large amounts of data @115kB. never any data loss.

can your device lose blocks of data due to not enough
local memory?

G!
uwe

John Foster

May 13, 2007, 8:04:55 AM
Uwe Klein wrote:

> John Foster wrote:
>> I'm reading data from a USB serial device, and the data comes in
>> blocks of 65536 bytes. The baud rate is 230400, so the stuff arrives
>> quite fast. The channel is set to non-blocking, with a fileevent
>> handler to pick up the data as it arrives.
>
> is your device a _real_ serial device that is connected via
> one of various RS232 <> USB adapters
> or a device that behaves like a class serial device?

It's one that behaves like a serial device.

>
> I have used these
> 067b:2303 Prolific Technology, Inc. PL2303 Serial Port
> to connect to embedded systems and _downloaded_ i.e.
> outbound large amounts of data @115kB. never any data loss.
>
> can your device lose blocks of data due to not enough
> local memory?

I don't believe so - it collects the data in dedicated memory and then
transmits it, using the USB output from an Atmel micro.

Uwe Klein

May 13, 2007, 8:35:19 AM
John Foster wrote:

> I don't believe so - it collects the data in dedicated memory and then
> transmits it, using the USB output from an Atmel micro.
>

I have only worked with Cypress EZ-USB FX2 using the limited
buffer space provided by the USB engine ( ~1MB/s continuous in Highspeed).

I had issues until I used triple buffering.
Transfers _can_ be in spurts and bouts.

Do you have any cpu hogs and drainers running like KDE/Gnome/Beagle ...?

My first debug step was to have an overrun flag in the upstream dataformat ;-)

uwe

John Foster

May 13, 2007, 10:19:32 AM
Uwe Klein wrote:

> John Foster wrote:
>
>> I don't believe so - it collects the data in dedicated memory and
>> then transmits it, using the USB output from an Atmel micro.
>>
> I have only worked with Cypress EZ-USB FX2 using the limited
> buffer space provided by the USB engine ( ~1MB/s continuous in
> Highspeed).
>
> I had issues until I used triple buffering.
> Transfers _can_ be in spurts and bouts.
>
> Do you have any cpu hogs and drainers running like KDE/Gnome/Beagle
> ...?

I'm certainly running Gnome, though 'top' says things are pretty quiet.
On the other hand, kernel or scheduling latency could certainly be a
factor.

>
> My first debug step was to have an overrun flag in the upstream
> dataformat ;-)

I don't have that much control over the device, unfortunately. But is
the message here that USB *can* silently lose data? My understanding is
very limited, but I found these words through Google:

"Isochronous data transfer offers prenegotiated bandwidth with possible
data loss; often used when on-time data delivery is more important than
data accuracy, such as streaming audio and video."

"Bulk data transfer delivers large data transfers with no loss of data;
often used for applications where lots of data must be transferred with
no loss of data such as external hard drives."

I would have assumed that serial transfer would be of the bulk variety,
in which case I simply don't understand how data loss can be happening.

Hmm. This seems not to be a Tcl issue - maybe I should take it
elsewhere.

Uwe Klein

May 13, 2007, 12:23:14 PM
John Foster wrote:

> I don't have that much control over the device, unfortunately. But is
> the message here that USB *can* silently lose data? My understanding is

You don't lose data in the USB transfer doing bulk transfers.

Your device may lose data before it ever gets on the bus!

Can you offload the upstream/in data transfer to a separate process
while still doing downstream/out commanding via tcl?

Apropos: do you get any usb-errors in /var/log/messages? ( afair there is
a debug flag you can set for the usb subsystem either at runtime or
while building a new kernel.)

There are still issues with certain hardware combinations. For example,
some cameras I use don't work behind certain hub types.
In another case data transfers get stuck. (I haven't been able to
resolve these things yet.)


uwe

John Foster

May 13, 2007, 1:29:14 PM
Uwe Klein wrote:

> John Foster wrote:
>
>> I don't have that much control over the device, unfortunately. But is
>> the message here that USB *can* silently lose data? My understanding
>> is
>
> You don't lose data in the USB transfer doing bulk transfers.
>
> Your device may lose data before it ever gets on the bus!

I have wondered about that, but I'm going to have a job proving it! And
I'm still at the point where I'm more ready to believe that it's
something in my setup.

>
> Can you offload the upstream/in data transfer to a separate process
> while still doing downstream/out commanding via tcl?

It'd be very difficult - in effect, I'd have to write a new application
just for that purpose. I'm not sure it would help - after sending the
request for the data, the application does nothing else except to
receive and store it via the fileevent handler, until the transfer
ends. So I'm as sure as I can be that latency in the application itself
is not a factor.

>
> Apropos: do you get any usb-errors in /var/log/messages? ( afair there
> is a debug flag you can set for the usb subsystem either at runtime or
> while building a new kernel.)

No, no error messages in /var/log/messages. (Thanks for the reminder -
I'd forgotten to look there.) Presumably a kernel buffer overrun would
be expected to show up there? As for the debug flag, I'll have a look
around, but I don't have much of a clue as to where to start looking.

I'm busy experimenting with strace at the moment, as suggested by
Alexandre, but so far it's just confirming what I've already reported.

Cheers, and thanks for the help

John

Uwe Klein

May 13, 2007, 1:39:10 PM
Been there? subscribed? (was useful for me):
https://lists.sourceforge.net/lists/listinfo/linux-usb-users

Linux USB FAQ
http://www.linux-usb.org/FAQ.html

Q: How do I see the "USB Verbose Debug Messages" that I enabled in the kernel config?
http://www.linux-usb.org/FAQ.html#ts7

uwe

John Foster

May 13, 2007, 2:10:38 PM
Uwe Klein wrote:

Ah! Now that looks like a useful set of references, and I hadn't found
them. I'll go and check them out.

Cheers and thanks again

John Foster

May 13, 2007, 2:55:30 PM
Uwe Klein wrote:

> Been there? subscribed? (was useful for me):
> https://lists.sourceforge.net/lists/listinfo/linux-usb-users

And what do I find there but these two recent posts (warning: long urls
which I've wrapped):

http://sourceforge.net/mailarchive/forum.php
?thread_name=5486cca80704140747u74f92c8bj564618b1cdc09e43
%40mail.gmail.com&forum_name=linux-usb-users

and

http://sourceforge.net/mailarchive/forum.php
?thread_name=5486cca80705040138r6ac16e9bp77e4f6217720ea8
%40mail.gmail.com&forum_name=linux-usb-users

They're discussing an FTDI driver problem involving data loss - and the
device I'm having trouble with does, I now realize, use an FTDI chip.

Looks as though my best plan is to lower the baud rate for now, and
check for driver updates. It certainly doesn't look as though there's
much point in further investigation on my part.

Cheers, and many thanks for the pointers.

Uwe Klein

May 13, 2007, 3:26:28 PM
John Foster wrote:
> Uwe Klein wrote:
>
>
>>Been there? subscribed? (was useful for me):
>>https://lists.sourceforge.net/lists/listinfo/linux-usb-users
>
>
> And what do I find there but these two recent posts (warning: long urls
> which I've wrapped):
>
> http://sourceforge.net/mailarchive/forum.php
> ?thread_name=5486cca80704140747u74f92c8bj564618b1cdc09e43
> %40mail.gmail.com&forum_name=linux-usb-users
>
> and
>
> http://sourceforge.net/mailarchive/forum.php
> ?thread_name=5486cca80705040138r6ac16e9bp77e4f6217720ea8
> %40mail.gmail.com&forum_name=linux-usb-users
>
> They're discussing an FTDI driver problem involving data loss - and the
> device I'm having trouble with does, I now realize, use an FTDI chip.

you may want to ask on linux-usb-users or contact Oliver Neukum directly;
he has been looking for testers with various usb devices recently.

uwe

John Foster

May 13, 2007, 3:48:38 PM
Uwe Klein wrote:

> you may want to ask on linux-usb-users or contact Oliver Neukum
> directly; he has been looking for testers with various usb devices
> recently.

Good idea - I'll try it.

Cheers
