USB EHCI problems

tenfoot

May 10, 2009, 3:58:09 PM

to Beagle Board

Hi,

I've seen a number of people having problems with the USB host port (EHCI) on
the Rev C beagleboards. I seem to be having the same problem, and after a few
days of random lockups I found I can easily reproduce it by reading data from
a USB disk with dd. After a few seconds of I/O I get the following:

beagleboard login: root
root@beagleboard:~# dd if=/dev/sda bs=1M > /dev/null
hub 1-0:1.0: port 2 disabled by hub (EMI?), re-enabling...
usb 1-2: USB disconnect, address 2
sd 0:0:0:0: [sda] Result: hostbyte=0x07 driverbyte=0x00
end_request: I/O error, dev sda, sector 647152
__ratelimit: 3 callbacks suppressed
Buffer I/O error on device sda, logical block 80894
Buffer I/O error on device sda, logical block 80895
sd 0:0:0:0: [sda] Result: hostbyte=0x01 driverbyte=0x00
end_request: I/O error, dev sda, sector 647168
Buffer I/O error on device sda, logical block 80896
Buffer I/O error on device sda, logical block 80897
Buffer I/O error on device sda, logical block 80898
Buffer I/O error on device sda, logical block 80899
Buffer I/O error on device sda, logical block 80900
Buffer I/O error on device sda, logical block 80901
Buffer I/O error on device sda, logical block 80902
Buffer I/O error on device sda, logical block 80903
sd 0:0:0:0: [sda] Result: hostbyte=0x01 driverbyte=0x00
end_request: I/O error, dev sda, sector 647408
dd: /dev/sda: Input/output error
root@beagleboard:~#

And the USB devices are dead from then on: unplugging and re-plugging aren't
noticed, and lsusb locks up or just reports the two (EHCI & MUSB) root hubs.
This happens with two USB disks: a 250 GB hard disk and a 4 GB flash drive.
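
A side note on reading the log above: the sector and logical block numbers line up if the block layer counts 512-byte sectors and the buffer cache uses 4096-byte blocks (647152 / 8 = 80894). Both sizes are assumptions inferred from the numbers, not something the log states. A small Python sketch of that conversion, with hypothetical helper names:

```python
SECTOR_SIZE = 512    # assumed: bytes per sector as reported by end_request
BLOCK_SIZE = 4096    # assumed: bytes per buffer-cache logical block


def sector_to_block(sector: int) -> int:
    """Map a 512-byte sector number to the logical block that contains it."""
    return (sector * SECTOR_SIZE) // BLOCK_SIZE


def sector_to_mib(sector: int) -> float:
    """Byte offset of a sector, expressed in MiB."""
    return sector * SECTOR_SIZE / (1024 * 1024)


if __name__ == "__main__":
    # Sector numbers taken from the three end_request lines in the log.
    for sector in (647152, 647168, 647408):
        print(sector, sector_to_block(sector), round(sector_to_mib(sector), 1))
```

Under these assumptions the first failing sector sits roughly 316 MiB into the disk, so the read did not fail right at the start.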

The same happens with every kernel I've tried: the Angstrom demos, the Rev C
validation images, the ones from http://www.rcn-ee.com/deb/kernel/ and ones
I've built myself from the OpenEmbedded patches (2.6.28 and 2.6.29).

I've also updated u-boot from 2009.01-dirty (as supplied) to 2009.03.

I've tried with a couple of power supplies, USB power, and also with several
hubs (and no hub).

Is there anything else I can try to fix this?

Thanks

Rob

Robert Nelson

May 10, 2009, 4:06:31 PM

to beagl...@googlegroups.com

Hi Rob,

Koen just pushed another EHCI defconfig change to the Angstrom source tree.

http://cgit.openembedded.net/cgit.cgi?url=openembedded/commit/&id=f1ce646bf9e343b68bc602964cb462ed64f4a3dc

I've just uploaded a build based on that here:

http://www.rcn-ee.com/deb/kernel/CC-beagle-v2.6.29-58cf2f1-oer32

I haven't tested it yet to see if it solves the problem I was seeing.

Regards,
--
Robert Nelson
http://www.rcn-ee.com/

tenfoot

May 10, 2009, 5:36:12 PM

to Beagle Board

Hi,

I've just given this a try and I still get the same error :(

Regards

Rob

azaparov

May 18, 2009, 7:14:40 PM

to Beagle Board

I get the same error. After a while I'm getting:

root@beagleboard:~# usb 1-2.1: USB disconnect, address 3

hub 1-0:1.0: port 2 disabled by hub (EMI?), re-enabling...
usb 1-2: USB disconnect, address 2

usb 1-2.3: USB disconnect, address 4
eth0: unregister 'asix' usb-ehci-omap.0-2.3, ASIX AX88772 USB 2.0 Ethernet
usb 1-2.4: USB disconnect, address 5

I tried a couple of kernels with the same result.
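
For comparing captures like the ones posted so far, a minimal Python sketch that pulls the "disabled by hub (EMI?)" and "USB disconnect" events out of saved console or dmesg text. The regular expressions are modelled only on the messages quoted in this thread, and the function name is hypothetical:

```python
import re

# Patterns modelled on the kernel messages quoted in this thread.
PORT_DISABLED = re.compile(
    r"hub (?P<hub>\S+): port (?P<port>\d+) disabled by hub \(EMI\?\)"
)
DISCONNECT = re.compile(
    r"usb (?P<dev>\S+): USB disconnect, address (?P<addr>\d+)"
)


def summarize_usb_failures(log_text: str) -> dict:
    """Collect hub port-disable events and device disconnects from log text."""
    disabled = [(m["hub"], int(m["port"])) for m in PORT_DISABLED.finditer(log_text)]
    disconnects = [(m["dev"], int(m["addr"])) for m in DISCONNECT.finditer(log_text)]
    return {"port_disabled": disabled, "disconnects": disconnects}
```

Feeding it the text of a serial console capture gives a list of which root-hub port was disabled and which device addresses dropped off, which makes it easier to compare reports from different boards.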

Joep Schroen

May 19, 2009, 10:22:12 AM

to discu...@beagleboard.org

Hello,

I'm getting similar issues with USB, almost only when creating heavy traffic such as file transfers. The (powered) USB hub turns off all its LEDs (which indicate connected devices), and I have to reset the board to get it running again.

Connected to the USB hub are a Wi-Fi adapter and a USB memory stick.

I also have a not fully related follow-up question: how do I build a newer kernel revision with BitBake/OpenEmbedded? When I build the kernel, some version with some patches gets built; where do I define that the latest one should be built?

Wkr,

Joep


eelcor

May 19, 2009, 10:58:29 AM

to Beagle Board

Hi,

There has been quite a thread about the disconnect problem of the USB.
I've had similar problems and I decided to return the board.
Thankfully I just (well half an hour ago) received a new board and it
works! Somehow I have the feeling that the tolerances on the board are
slightly off and some people have problems with their USB.

I am not sure whether the reported problems are solved by replacing
the board or that some kind of software trick could work as well. I am
very happy at this moment as I can show my company tomorrow what the
potential future of computing could be.

I would recommend reading the thread I started (something with USB
and disconnect), because some people have given tips and tricks that
could help...

Kind regards, Eelco

Gerald Coley

May 19, 2009, 11:27:27 AM

to beagl...@googlegroups.com

For what it is worth, I tested the board that Eelco returned. After a week of trying to make it fail, I could not. So, I had them send a new one, which is the one he is referring to.

Gerald

alexey

May 19, 2009, 12:02:09 PM

to Beagle Board

Does that mean I should return the board to DigiKey and request a new one?

Gerald Coley

May 19, 2009, 12:06:03 PM

to beagl...@googlegroups.com

DigiKey does not handle the repairs. RMAs are through beagleboard.org.

We have no boards to replace anyone's boards at the moment. Feel free to send them back, and in another 3-4 weeks we will have replacements to send you.

As I already said, the board that was returned did not fail. I tested it myself. So, it is unclear whether simply getting a new board will fix anyone's problems. It could be related to a noisy power supply, where some boards are more tolerant of this noise than others.

Gerald

alexey

May 19, 2009, 1:04:51 PM

to Beagle Board

I'll try it with the best power supply I've got. I power radio receivers
from it.

eelcor

May 19, 2009, 1:11:20 PM

to Beagle Board

Gerald,

Thank you for sending a new board, as somehow all problems have been
solved. I have no idea what the main difference is, but now I am able
to copy a 350MB file from USB stick to MMC (I've tried several things
and it didn't work at all with my old board), and when running the
chameleon man demo the USB keeps on working. And the board is now
perfectly stable with one of my crappiest PSUs. Quite strange...

Do you think tolerances could be the explanation? As people who have
read my thread might know, I've gone to quite some lengths to ensure
that it was a HW failure. Maybe there is some sensitive point in the
current design?

Regards, Eelco

eelcor

May 19, 2009, 1:16:46 PM

to Beagle Board

I agree that it could have something to do with tolerances and slight
board variations. I couldn't solve the problems myself with 5
different kernel versions, 3 different MMC cards, 3 different PSUs, 3
different keyboard/mouse combos, 3 different USB keys and a single
USB-to-Ethernet interface. At a certain point I really didn't have any
options left, so I was happy with Gerald's suggestion to return the
board, and thankfully it has solved my problems. Currently I am really
happy as I am able to show my company that the cutting edge of mobile
processors is able to replace simple desktop machines.

Regards, Eelco

Gerald Coley

May 19, 2009, 2:17:11 PM

to beagl...@googlegroups.com

I think we may have a random noise issue that may be locking up something. Unfortunately, until I can get something to fail, I can't do an analysis of what is going on. I certainly can't afford to transfer large files multiple times in production, or we will be shipping 50 boards a week. We are chasing a similar issue on another board inside TI at the moment. It is a lot worse than Beagle. I am hoping that we can get some information from there. They are looking at a noisy 1.8V rail to see if that may be the issue. I hope to get some feedback soon on their progress.

Gerald

John Beetem

May 19, 2009, 2:35:23 PM

to Beagle Board

eelcor,

I'm glad the new hardware was able to fix the problem. Boy, I hate
problems that are hard to reproduce. One possibility in this case is
a timing problem somewhere in the logic where a signal ALMOST has
sufficient set-up time for a clock. Most of the time it works, but a
voltage dip or temperature increase causes the signal to be missed, or
worse, become metastable. (I never metastable I liked.) Sometimes
this can be fixed in software -- perhaps there is a signal that could
be sampled on the opposite edge of the clock, providing a stable
signal at all times. Sometimes software can perform extra reads on a
metastable register: the first to see if it has changed at all, and a
second read to see which value it really changed to.
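
The extra-read idea in the last sentence can be sketched as follows. `read_register` is a hypothetical callable standing in for whatever performs the actual hardware read (nothing in this thread names a specific register), and Python is used only to illustrate the logic:

```python
from typing import Callable


def read_settled(read_register: Callable[[], int], previous: int) -> int:
    """Return the register value, re-reading once if it appears to have changed.

    The first read only answers "did it change at all?". If it did, the value
    may have been sampled mid-transition, so a second read is taken as the
    value it actually changed to.
    """
    first = read_register()
    if first == previous:
        return first          # nothing changed, a single read is enough
    return read_register()    # changed: trust the second read, not the first
```

This only helps if the register is stable again by the time of the second read; it is a software workaround for sampling a changing value, not a fix for a signal that violates set-up time in the hardware path itself.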

One reason the new board could work while the old one failed is that a
component on the new board is just enough faster or slower so that the
timing issue does not occur -- or is so rare that nobody observes it.
For example, a different voltage regulator or passives can generate a
voltage that is a percent higher or lower... just enough to expose or
hide the timing issue.

I think I mentioned in the other stream a problem I had with a board
that would fail after a couple hours, and only at high temperature.
That was nasty to find. Turned out to have an easy software solution,
so happy ending!

Gerald Coley

May 19, 2009, 2:46:05 PM

to beagl...@googlegroups.com

These issues are tough to track down. Funny thing is that there are only two components in the circuit, OMAP and the SMSC PHY. This is a similar issue we had on Port1 of the OMAP3530. When we moved to port2 it appeared to be resolved. So, it may be an issue inside the OMAP or the PHY.

One thing we noticed on Port1 was that if we slowed down the processor, the issue went away, even though the interface was still running at 60 MHz. Could someone with a questionable board slow the processor speed down and see if that affects the results?

Gerald

alexey

May 19, 2009, 2:50:01 PM

to Beagle Board

What is the correct way to slow the CPU? I can try it.

tenfoot

May 20, 2009, 4:02:30 PM

to Beagle Board

We've just got a beagleboard at work (also rev C2) so I tried the same
test (dumping the contents of a USB disk) on both my board (which I
was using in my original email) and work's board. Mine always fails
after a minute or so, but the work's one doesn't show any error and I
was able to read the disk several times over (a total of about 40 GB of
data) without a problem. Everything was the same apart from the board
(i.e. same power supply, SD card, kernel, u-boot, NAND contents - both
in factory state) - so it definitely looks like a hardware issue.

How would I go about changing the clock speed to see if that fixes
anything?
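
One possible way to try this, offered only as a sketch: on kernels built with cpufreq support, the standard Linux sysfs files `scaling_available_frequencies` and `scaling_max_freq` can be used to cap the CPU clock. Whether the BeagleBoard kernels discussed here have cpufreq enabled is an assumption that nobody in this thread has confirmed. The helper names are hypothetical, and writing the file needs root.

```python
from pathlib import Path

# Standard Linux cpufreq sysfs directory; assumed to exist on the running kernel.
CPUFREQ_DIR = Path("/sys/devices/system/cpu/cpu0/cpufreq")


def lowest_available_khz(cpufreq_dir: Path = CPUFREQ_DIR) -> int:
    """Return the lowest frequency (in kHz) that the cpufreq driver lists."""
    text = (cpufreq_dir / "scaling_available_frequencies").read_text()
    return min(int(field) for field in text.split())


def cap_cpu_frequency(khz: int, cpufreq_dir: Path = CPUFREQ_DIR) -> None:
    """Limit the CPU to at most `khz` by writing scaling_max_freq (needs root)."""
    (cpufreq_dir / "scaling_max_freq").write_text(f"{khz}\n")
```

Calling `cap_cpu_frequency(lowest_available_khz())` as root and then re-running the dd test would show whether a slower processor clock changes the time to failure. If the cpufreq directory does not exist on the board, this approach does not apply and the clock would have to be changed some other way.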

Regards

Rob

Frans Meulenbroeks

May 20, 2009, 4:52:03 PM

to beagl...@googlegroups.com

2009/5/19 Gerald Coley <ger...@beagleboard.org>:

> These issues are tough to track down. Funny thing is that there are only two
> components in the circuit, OMAP and the SMSC PHY. This is a similar issue we
> had on Port1 of the OMAP3530. When we moved to port2 it appeared to be
> resolved. So, it may be an issue inside the OMAP or the PHY.
>
> One thing we noticed on Port1 was that if we slowed down the processor, the
> issue went away, even though the interface was still 60MHZ. Could someone
> with a questionable board slow the processor speed down and see if that
> affects the results?
>

This might be interesting for other things as well.
I noticed that my Kingston 4 GB SDHC card does not work with the latest
kernels but used to work on .27 or so.
Also I noticed some USB issues with EHCI: a USB 1.1 webcam on a hub does not
work, and a Hauppauge PVR works when connected directly to EHCI but no
longer works when connected through the hub; apparently the reset after
loading the firmware does not get through. This happens with several
brands of hub, but of course it could be that they have the same chip
inside. The odd thing is that the very same device on the very same hub
works like a charm under openSUSE 11.1 (and also worked on 2.6.11 or
so on an NSLU2, which is much slower).
Of course I have no idea whether it is the EHCI hardware or the EHCI driver
that is causing the issue (and no idea how to diagnose this further).

Frans

David Hagood

Jun 4, 2009, 11:12:24 AM

to Beagle Board

I have a RevC board that very repeatably fails with a "hub 1-0:1.0:
port 2 disabled by hub (EMI?), re-enabling... " message. The board is
powered by a 5V 2A switching power supply. It is connected to a USB
2.0 hub which is itself powered by a 2A power supply.

I have tried an experiment where I put the board into a cold chamber,
took it to 0C, and ran the same test - to the same result. I don't see
any significant variance in the time to failure. This would tend to
exclude any marginal timing issues that are exacerbated by low
temperature.

I could get a heat-gun and run the other side of the test if anybody
thinks that would be useful.

Michael Evans

Jun 4, 2009, 5:17:22 PM

to beagl...@googlegroups.com

I assume you've tried different peripherals (USB device and USB hub) to rule out it being the USB device itself and/or the hub...? What about the other ports on the hub...? Do they drop out too...?

David Hagood

Jun 4, 2009, 5:47:51 PM

to Beagle Board

On Jun 4, 4:17 pm, Michael Evans <horse_d...@hotmail.com> wrote:
> I assume you've tried different peripherals (USB device and USB hub) to rule out it being the USB device itself and/or the hub...? What about the other ports on the hub...? Do they drop out too...?
>
>

The USB port on the Beagleboard dies as do all the ports on the hub -
after that, unplugging and plugging back in the hub does nothing: only
a reboot will restore the port.

I've already tried a couple of hubs, and I could get the failures on
both heavy access to a USB memory stick and to a USB to Ethernet
adapter.

Right now we are performing experiments with my Beagleboard in one of
our environmental chambers: we are currently running it at 0C to
reproduce my tests, but in moving the board to the chamber we
unavoidably had to change the configuration, so we are sequencing
through:

Different hubs.
Different power supplies.
Different devices on the bus in addition to the memory stick (mouse,
Ethernet device), etc.
Presence/absence of a device on the HDMI port.

Right now, things aren't failing - which is puzzling because I had a
100% reproducibility before. However, that might be a GOOD thing if I
can work out what variable caused the change.

If I can't get it to start failing, I'll try to get to exactly the
configuration I had in my office, then start simplifying. Failing
that, I'll take the chamber back to 20C, and then up to 40C.

Hopefully we can characterize this enough to at least make it
reproducible.

Gerald Coley

Jun 4, 2009, 6:18:05 PM

to beagl...@googlegroups.com

Request an RMA immediately!

Gerald

Message has been deleted

Gerald Coley

Jun 5, 2009, 11:12:38 AM

to beagl...@googlegroups.com

The board will be replaced and your board will be evaluated along with the other two boards we have. On the other two boards, we could not get them to fail, but in both those cases, the replacement boards worked fine. So, this is not something that is in all boards.

Gerald

On Fri, Jun 5, 2009 at 10:08 AM, David Hagood <david....@aeroflex.com> wrote:

On Jun 4, 5:18 pm, Gerald Coley <ger...@beagleboard.org> wrote:
> Request an RMA immediately!
>
> Gerald

Well, before I do that, I'd like to characterize, as best I can, what
is going on.

I have some more data:

My board, held in the chamber at 0C, configured as I had it in my
office, did fail eventually, but it was MUCH more reliable than it was
at ambient. It took many hundreds of runs of my test (dd if=/dev/zero
of=/media/disk/zeros bs=1024 count=100000) before it failed with the
disabled message.

When we brought the chamber up to 25C it died on the second run of the
test.
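
For reference, the dd command above writes 1024 x 100000 = 102,400,000 bytes (about 98 MiB) per run. A rough Python equivalent that repeats the write and counts completed runs before the first I/O error is sketched below. The function names are hypothetical, and unlike plain dd it calls fsync so that each run is actually pushed out to the device, which is an added assumption about what the test should exercise:

```python
import os


def write_zeros(path: str, block_size: int = 1024, count: int = 100000) -> int:
    """Write `count` zero-filled blocks to `path`; return the bytes written."""
    block = bytes(block_size)
    written = 0
    with open(path, "wb") as out:
        for _ in range(count):
            written += out.write(block)
        out.flush()
        os.fsync(out.fileno())  # force the data out to the device
    return written


def runs_until_failure(path: str, max_runs: int = 1000, **kwargs) -> int:
    """Repeat the write test; return how many runs completed before an I/O error."""
    for completed in range(max_runs):
        try:
            write_zeros(path, **kwargs)
        except OSError:
            return completed
    return max_runs
```

Something like `runs_until_failure("/media/disk/zeros")` at each chamber temperature would turn "many hundreds of runs" versus "died on the second run" into numbers that can be compared between boards.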

We are currently running the same test with a different Beagleboard,
but otherwise the same configuration and at 25C.

For reference, my configuration is:

Beagleboard on a 5V, 2A supply purchased from DigiKey.

RS-232 on the board connected to an external serial terminal.

8G Class 6 SDHC with Ubuntu on it.

USB host port driving a 4 port, self-powered hub with the USB memory
stick, keyboard, mouse, and a second 4 port hub on it.
Second 4 port self powered hub driving a Wii USB to Ethernet device
(no network connected).

HDMI port connected to a flat-panel interface to a 12" flat panel.

I'm going to let the second board run in the chamber at 24C for
another hour, then I will put my board back on the bench and run my
tests with everything but the serial port, SDHC, and USB memory on it.

After I do that, I will post my results.

Gerald - are you just going to RMA it, or do you actually want to examine
my board?

David Hagood

Jun 5, 2009, 11:17:43 AM

to Beagle Board

On Jun 4, 5:18 pm, Gerald Coley <ger...@beagleboard.org> wrote:
> Request an RMA immediately!
>
> Gerald

(NOTE: I pulled my previous message, as I had a couple of errors in it
that I wanted to correct...)

David Hagood

Jun 5, 2009, 11:20:01 AM

to Beagle Board

On Jun 5, 10:12 am, Gerald Coley <ger...@beagleboard.org> wrote:
> The board will be replaced and your board will be evaluated along with the
> other two boards we have. On the other two boards, we could not get them to
> fail, but in both those cases, the replacement boards worked fine. So, this
> is not something that is in all boards,
>
> Gerald

Yes, that checks with the boards we have - mine dies, a couple of
others don't. I'd like to try to get as much information as possible
to allow others to have a better shot at reproducing the problems.

Is there anything I should check on the two boards I have (e.g. lot
numbers on the boards and/or parts) that would be of use in
troubleshooting this?

Gerald Coley

Jun 5, 2009, 11:29:59 AM

to beagl...@googlegroups.com

Well, it sounds like you are adventurous. If you download the Allegro files, you can find a series of test points on the board that are the signals coming from the OMAP3530 to the SMSC PHY. You can scrape off the solder mask and then probe these signals. You may need to be able to probe these points and detect differences between the good and the bad boards.

Another idea is to focus on the PHY, cool it and heat it to see if there are any changes in behavior.

Gerald

David Hagood

Jun 5, 2009, 12:07:14 PM

to Beagle Board

On Jun 5, 10:29 am, Gerald Coley <ger...@beagleboard.org> wrote:
> Another idea is to focus on the PHY, cool it and heat it to see if there are
> any changes in behavior.

I can do that pretty easily.

The second board ran fine at 25C, so there are some board-to-board
variations.

OK, what I propose to do is:
1) Set my board back up on the bench here in my office.
2) Try cooling the PHY down with freeze-spray and run my tests.
3) Try cooling the OMAP down, holding the PHY at ambient.
4) Try my tests at ambient after disconnecting various pieces of
hardware, to try to simplify the test case down.

I don't think I'll go so far as to probe the signal lines - while the
OMAP and the USB support is important to work, I have a lot of other
items that are also important, and that other folks AREN'T working on.

However, I will post my results, and can include them on paper when I
return the board under RMA.

Hopefully that will be enough to help you guys to reproduce the issue.

Gerald Coley

Jun 5, 2009, 12:08:39 PM

to beagl...@googlegroups.com

Thank you!

Gerald

Marcus Bauer

Jun 5, 2009, 12:28:59 PM

to beagl...@googlegroups.com

Similar/same problem here. The USB stops working after some time,
usually between half an hour and a day.

This also happens with no activity on the USB, i.e. with only a keyboard and
a mouse connected; a reboot is then needed to bring USB back. However,
I am using Debian and there is a hint on the elinux wiki that USB is
"flaky" on the Rev C boards, so maybe it is a kernel problem?

FWIW, uname -a :

Linux beagle 2.6.28-oer17 #1 Wed Mar 25 06:26:12 UTC 2009 armv7l
GNU/Linux

I could still run a test with the Angstrom images.

Marcus

John Beetem

Jun 5, 2009, 12:32:29 PM

to Beagle Board

Just speculating here...

I wonder if it's possibly a bad solder joint? One of the nasty
problems with BGAs is that your connections between boards and ICs are
made with solder which fractures instead of flexing when there are
thermal mismatches. This is a serious issue in desert and space
applications where you may have equipment that goes through 50-100C
temperature swings at least once a day. You would normally not see
this with BGAs as small as the OMAP and the SMSC PHY, particularly
over only a 25C swing. However, if one of those tiny solder balls
wasn't soldered properly the first time, its conductivity could vary
over temperature due to thermal mismatch of the IC and the board.
This is very hard to diagnose, though JTAG can help if it's
implemented.

One failure mode is for a ball to switch from being a conductor to a
capacitor, so AC signals can get through provided the load is
extremely low impedance. When you try to diagnose it with a 'scope
probe, the probe's capacitance changes the behavior resulting in a
"Heisenbug".

As I said, just a speculation for readers' entertainment. I'm still
voting for a timing issue. If it were a solder problem, we'd probably
be seeing defects all over the place instead of just the EHCI USB
port.

John

Gerald Coley

Jun 5, 2009, 12:36:44 PM

to beagl...@googlegroups.com

You may have a point there. This is a nasty part in that it is small and .4mm pitch. It is a bear to work with. My problem is that I can't get these boards to fail, so if I can find one that fails, I will have it reflowed to see if it solves the problem.

Gerald

David Hagood

Jun 5, 2009, 1:39:56 PM

to Beagle Board

OK, I have enough results to, I hope, enable you to work out the
issues:

Here are my tests:

all parts at ambient:
beagle -> hub (mem) -> hub (key, mouse, ethernet) : passed
In other words, the Beagle was driving a self-powered hub with the USB
memory on it, and that hub was in turn driving a second (also powered)
hub with keyboard, mouse, and Ethernet on it.

It passed my 100MB write test more than 10 times without error.

all at ambient:
beagle -> hub (mem,key,mouse) -> hub (ethernet) : failed immediately!
This makes me wonder if there is something about having multiple
devices with interrupt endpoints on the same device.

PHY cooled:
beagle -> hub (mem,key,mouse) -> hub (ethernet) : failed immediately!

OMAP cooled:
beagle -> hub (mem,key,mouse) -> hub (ethernet) : failed in 2 runs
This is likely the same result as with the PHY cooled - the difference
between one run or 2 runs is pretty marginal.

all at ambient:
beagle -> hub (mem,key,mouse,ethernet): failed immediately.
This would tend to remove the second hub as an issue.

all at ambient:
beagle -> mem : passed
I wonder if this is more like the test cases the folks at Beagleboard
are running - perhaps they aren't using a hub?

all at ambient:
beagle -> hub2 (mem,key,mouse,ethernet):
test 1: ( BUG: soft lockup - CPU#0 stuck for 61s! [aplay:2366])
I don't have an explanation for this.
Test 2: failed before booting complete. By this I mean the system had
the "port disabled (EMI)" message before I even started my tests.

all at ambient:
beagle -> hub2 (mem) : Failed.

all at ambient:
beagle -> hub (mem) : failed after a longer time.
These were 2 tests to check if there needed to be other devices on the
bus.

PHY cooled:
beagle -> hub (mem) : passed (more than 10 reps)

PHY cooled:
beagle -> hub (mem,key,mouse,ethernet): failed

My hypothesis is that there needs to be more than one device on the
bus, preferably many devices with interrupt endpoints.
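
To make the pattern in the list above easier to see, the results can be tallied by how many devices share the first hub. The grouping is one interpretation of the list, not something stated in it: "hub" and "hub2" are treated alike, and the hub-to-hub chains are counted by the devices plugged directly into the first hub. The names are hypothetical.

```python
from collections import Counter

# (condition, devices on the first hub or None for no hub, outcome),
# transcribed from the test list above.
RESULTS = [
    ("ambient", ("mem",), "pass"),                             # hub -> second hub
    ("ambient", ("mem", "key", "mouse"), "fail"),              # hub -> second hub
    ("PHY cooled", ("mem", "key", "mouse"), "fail"),
    ("OMAP cooled", ("mem", "key", "mouse"), "fail"),
    ("ambient", ("mem", "key", "mouse", "ethernet"), "fail"),
    ("ambient", None, "pass"),                                 # beagle -> mem
    ("ambient", ("mem", "key", "mouse", "ethernet"), "fail"),  # hub2
    ("ambient", ("mem",), "fail"),                             # hub2
    ("ambient", ("mem",), "fail"),                             # after a longer time
    ("PHY cooled", ("mem",), "pass"),
    ("PHY cooled", ("mem", "key", "mouse", "ethernet"), "fail"),
]


def tally_by_hub_load(results):
    """Count pass/fail outcomes grouped by how many devices share the first hub."""
    counts = Counter()
    for _condition, devices, outcome in results:
        if devices is None:
            group = "no hub"
        elif len(devices) == 1:
            group = "hub, 1 device"
        else:
            group = "hub, several devices"
        counts[(group, outcome)] += 1
    return counts
```

Under this grouping, every configuration with the keyboard and mouse sharing the first hub failed (6 of 6), the memory stick alone behind a hub was split 2 to 2, and the single run without a hub passed.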

OK, so: what do I do for the RMA?

Gerald Coley

Jun 5, 2009, 2:34:49 PM

to beagl...@googlegroups.com

You are correct. The test that is done at the factory is the memory thumbdrive only.

As for an RMA, go to http://beagleboard.org/support/rma

Gerald

David Hagood

Jun 5, 2009, 5:09:35 PM

to Beagle Board

On Jun 5, 1:34 pm, Gerald Coley <ger...@beagleboard.org> wrote:
> You are correct. The test that is done at the factory is the memory
> thumbdrive only.

You may want to add a powered hub, keyboard, and mouse. Also, do you
really try to beat upon the stick as I have been (copying a 100MB
file), or are you just doing a small write?

>
> As for an RMA, go to http://beagleboard.org/support/rma
Sent, awaiting reply. I'll see if I can get it out this weekend. I'm
assuming putting it in the box and putting the box in a padded
shipping envelope should be enough?

Gerald Coley

Jun 5, 2009, 8:16:12 PM

to beagl...@googlegroups.com

That will be fine!

Gerald

Frans Meulenbroeks

Jun 6, 2009, 8:22:07 AM

to beagl...@googlegroups.com

I also have problems with the keyboard, and I get corrupted data with
my USB 1.1 pwc webcam (connected through a hub of course).

I was under the impression this was a SW issue.

Frans

Duckyduck

Jun 6, 2009, 2:13:47 PM

to Beagle Board

Hi Gerald,

I have a board with exactly this problem: heavy I/O traffic through the
USB host port generates the "disabled by hub (EMI?), re-enabling"
error. Using dd to move chunks of 100MB to /dev/null makes the USB
crash after 200MB.
There's no difference between connections through a USB hub or
directly connected.

What do you suggest, send it for RMA or wait a few weeks till more is
known? Because I'm living in the Netherlands it'll cost quite a bit to
send it back to the USA :(

Wkr,
Joep

Gad Krumholz

Jun 4, 2009, 4:03:04 PM

to beagl...@googlegroups.com

EHCI is known to have issues... try booting the Angstrom Demo from the Angstrom guys and see if you lose EHCI randomly still (they have a newer u-boot in their SD image that fixes some EHCI issues).

http://www.angstrom-distribution.org/demo/beagleboard/

If your problems go away... replace your u-boot with the one from the Angstrom SD image.

Rob Walker

Jun 6, 2009, 7:17:23 PM

to beagl...@googlegroups.com

My beagleboard has similar issues. I'm in the UK, so if I RMA it, will I have
to pay import duty + VAT again for the replacement?

Rob

Gerald Coley

Jun 6, 2009, 8:28:11 PM

to beagl...@googlegroups.com

You should not have to. It is marked as an RMA repair. It is unclear at this time if this is a SW or HW issue. We only have two boards in house that have reported the issue, and we can't get them to fail. So, there may be something a little off that shows up more often on some boards than others. We have replaced some boards and they seem to work.

It is your call as to whether or not you want to send it in for replacement or to wait and see what happens in the SW realm. You can move to the OTG port for host functions if you like in the meantime.

Gerald

tenfoot

Jun 9, 2009, 6:59:25 PM

to Beagle Board

What about shipping costs? I guess I'm paying to ship it to you, but
what about the return direction?

Regards

Rob

Gerald Coley

unread,

Jun 9, 2009, 8:34:46 PM6/9/09

to beagl...@googlegroups.com

We always pay for the return shipping.

Gerald

Rob Walker

unread,

Jun 9, 2009, 3:36:25 PM6/9/09

to beagl...@googlegroups.com

What about the shipping costs? Who pays for each direction?

Rob

Gerald Coley

unread,

Jun 9, 2009, 10:30:43 PM6/9/09

to beagl...@googlegroups.com

We will pay for the return shipment.

Gerald

David Hagood

unread,

Jun 10, 2009, 9:43:18 AM6/10/09

to Beagle Board

I dropped my board in the mail last night - I would expect it to
arrive there in a couple of days.

I enclosed a summary of the tests I've run and a reference to this
board.

Gerald Coley

unread,

Jun 10, 2009, 9:46:36 AM6/10/09

to beagl...@googlegroups.com

Sounds good!

Thank you!

Gerald

Joep Schroen

unread,

Jun 11, 2009, 3:45:46 AM6/11/09

to beagl...@googlegroups.com

Because of the USB HOST problems I've switched to the USB OTG port, with mixed success:

- USB OTG -> HUB -> HD + RT73 based WIFI dongle

- DD 10GB from HD to /dev/null works as expected (USB HOST -> EMI crash after 100MB)

- FTP from WIFI to HD -> after a few 100MB I get the error below:

beagleboard login: usb 2-1.1: reset high speed USB device using musb_hdrc and address 3
------------[ cut here ]------------
WARNING: at drivers/usb/musb/musb_host.c:128 musb_h_tx_flush_fifo+0x94/0xd4()
Could not flush host TX3 fifo: csr: 0003
Modules linked in: ipv6 rt73
[<c039e304>] (dump_stack+0x0/0x14) from [<c00647b0>] (warn_slowpath+0x5c/0x78)
[<c0064754>] (warn_slowpath+0x0/0x78) from [<c0270c4c>] (musb_h_tx_flush_fifo+0x94/0xd4)
r3:00000003 r2:c0484cfb
r6:ffffffff r5:00000003 r4:00000003
[<c0270bb8>] (musb_h_tx_flush_fifo+0x0/0xd4) from [<c0271998>] (musb_cleanup_urb+0xd4/0x120)
[<c02718c4>] (musb_cleanup_urb+0x0/0x120) from [<c0271ff8>] (musb_urb_dequeue+0x140/0x174)
[<c0271eb8>] (musb_urb_dequeue+0x0/0x174) from [<c0252898>] (unlink1+0xb8/0xc4)
[<c02527e0>] (unlink1+0x0/0xc4) from [<c0253088>] (usb_hcd_unlink_urb+0x58/0x74)
r8:cf89dfa0 r7:ffffff98 r6:cf9a2560 r5:a0000093 r4:00000000
[<c0253030>] (usb_hcd_unlink_urb+0x0/0x74) from [<c0253b6c>] (usb_unlink_urb+0x40/0x44)
r7:cf89df94 r6:00000500 r5:cf99a000 r4:cf99a35c
[<c0253b2c>] (usb_unlink_urb+0x0/0x44) from [<c0268b70>] (usb_stor_stop_transport+0x3c/0x68)
[<c0268b34>] (usb_stor_stop_transport+0x0/0x68) from [<c0268060>] (command_abort+0x7c/0x90)
r4:cf99a35c
[<c0267fe4>] (command_abort+0x0/0x90) from [<c02208bc>] (scsi_send_eh_cmnd+0x128/0x21c)
r4:40000013
[<c0220794>] (scsi_send_eh_cmnd+0x0/0x21c) from [<c02209dc>] (scsi_eh_tur+0x2c/0x88)
[<c02209b0>] (scsi_eh_tur+0x0/0x88) from [<c0221360>] (scsi_error_handler+0x184/0x364)
r5:cfab48ac r4:cfab48a0
[<c02211dc>] (scsi_error_handler+0x0/0x364) from [<c0078190>] (kthread+0x5c/0x94)
[<c0078134>] (kthread+0x0/0x94) from [<c00679ac>] (do_exit+0x0/0x6a8)
r6:00000000 r5:00000000 r4:00000000
---[ end trace fc1b8fac7574014e ]---
usb 2-1.1: reset high speed USB device using musb_hdrc and address 3

And again:

root@beagleboard:~# usb 2-1.1: reset high speed USB device using musb_hdrc and address 3
------------[ cut here ]------------
WARNING: at drivers/usb/musb/musb_host.c:128 musb_h_tx_flush_fifo+0x94/0xd4()
Could not flush host TX3 fifo: csr: 0003
Modules linked in: ipv6 rt73
[<c039e304>] (dump_stack+0x0/0x14) from [<c00647b0>] (warn_slowpath+0x5c/0x78)
[<c0064754>] (warn_slowpath+0x0/0x78) from [<c0270c4c>] (musb_h_tx_flush_fifo+0x94/0xd4)
r3:00000003 r2:c0484cfb
r6:ffffffff r5:00000003 r4:00000003
[<c0270bb8>] (musb_h_tx_flush_fifo+0x0/0xd4) from [<c0271998>] (musb_cleanup_urb+0xd4/0x120)
[<c02718c4>] (musb_cleanup_urb+0x0/0x120) from [<c0271ff8>] (musb_urb_dequeue+0x140/0x174)
[<c0271eb8>] (musb_urb_dequeue+0x0/0x174) from [<c0252898>] (unlink1+0xb8/0xc4)
[<c02527e0>] (unlink1+0x0/0xc4) from [<c0253088>] (usb_hcd_unlink_urb+0x58/0x74)
r8:cf949fa0 r7:ffffff98 r6:cf89c560 r5:a0000093 r4:00000000
[<c0253030>] (usb_hcd_unlink_urb+0x0/0x74) from [<c0253b6c>] (usb_unlink_urb+0x40/0x44)
r7:cf949f94 r6:00000500 r5:cf99c000 r4:cf99c35c
[<c0253b2c>] (usb_unlink_urb+0x0/0x44) from [<c0268b70>] (usb_stor_stop_transport+0x3c/0x68)
[<c0268b34>] (usb_stor_stop_transport+0x0/0x68) from [<c0268060>] (command_abort+0x7c/0x90)
r4:cf99c35c
[<c0267fe4>] (command_abort+0x0/0x90) from [<c02208bc>] (scsi_send_eh_cmnd+0x128/0x21c)
r4:40000013
[<c0220794>] (scsi_send_eh_cmnd+0x0/0x21c) from [<c02209dc>] (scsi_eh_tur+0x2c/0x88)
[<c02209b0>] (scsi_eh_tur+0x0/0x88) from [<c0221360>] (scsi_error_handler+0x184/0x364)
r5:cfab48ac r4:cfab48a0
[<c02211dc>] (scsi_error_handler+0x0/0x364) from [<c0078190>] (kthread+0x5c/0x94)
[<c0078134>] (kthread+0x0/0x94) from [<c00679ac>] (do_exit+0x0/0x6a8)
r6:00000000 r5:00000000 r4:00000000
---[ end trace ace7fd02eed9425e ]---
usb 2-1.1: reset high speed USB device using musb_hdrc and address 3

Does anybody know what's happening here?

Wkr,

Joep

2009/6/10 Gerald Coley <ger...@beagleboard.org>

David Hagood

unread,

Jun 12, 2009, 10:05:04 AM6/12/09

to Beagle Board

Well, my board is now in the hospital, so hopefully the Doctors there
will be able to pop some Vicodin, make some sexist comments, and come
up with a diagnosis as to what the problem is. I hope my reports on
how to reproduce the issue will help.

Gerald Coley

unread,

Jun 12, 2009, 10:27:37 AM6/12/09

to beagl...@googlegroups.com

Yes, the head surgeon has it on his desk and yes it does fail. It definitely has some sort of rash. We are thinking it may be allergic to Linux.

Gerald

Marcus Bauer

unread,

Jun 12, 2009, 10:55:38 AM6/12/09

to beagl...@googlegroups.com

I figured out that mine has problems with the d-link dwa-110. The USB
seems only to die when the corresponding module is loaded. This does
not depend on data transfer, the USB can even die when the wifi is not
associated to an AP thus no i/o happening.

HTH
Marcus

David Hagood

unread,

Jun 12, 2009, 11:08:06 AM6/12/09

to Beagle Board

On Jun 12, 9:27 am, Gerald Coley <ger...@beagleboard.org> wrote:
> Yes, the head surgeon has it on his desk and yes it does fail. It definitely
> has some sort of rash. We are thinking it may be allergic to Linux.
>
> Gerald
>

Good - in that the problem is reproducible and therefore (hopefully)
identifiable and fixable.

Are you seeing the problem using the tests you'd normally use, or only
using the tests I've recommended (adding one or more interrupt-
endpoint devices)?

I'm really very curious because of your past comments that you've
previously not been able to reproduce the problems, and I'm hoping
that my board is just a very good exemplar of the problem, and not
some other problem unrelated to what others have seen.

Also: I've heard rumblings that the suspicion is on the USB host core
TI licensed, and given that I am looking at a production design where
that core would be useful I'd like to know if that really is the case
or not.

Gerald Coley

unread,

Jun 12, 2009, 11:14:57 AM6/12/09

to beagl...@googlegroups.com

Our normal tests pass, but we have a new stress test that seems to cause the issue to show up on a few boards, including yours. So, we are looking to find the issue as we speak. We do not know if it is HW or SW related, but it does not show up on all boards.

Gerald

Frans Meulenbroeks

unread,

Jun 12, 2009, 1:14:40 PM6/12/09

to beagl...@googlegroups.com

2009/6/12 Gerald Coley <ger...@beagleboard.org>:

> Our normal tests pass, but we have a new stress test that seems to cause the
> issue to show up on a few boards, including yours. So, we are looking to
> find the issue as we speak. We do not know if it is HW or SW related, but it
> does not show up on all boards.
>
> Gerald

Can you share this stress test?
I'm seeing issues with mine and would like to verify myself if I have
the problem or not.

Frans

Gerald Coley

unread,

Jun 12, 2009, 2:11:46 PM6/12/09

to beagl...@googlegroups.com

dd if=/dev/sda of=/dev/null bs=1M

You may have to run it 2-3 times before it fails.
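
Since the failure may only show up on the second or third pass, the test can be wrapped in a small loop. This is only a sketch: `stress_read` is a hypothetical helper name, the default of 3 passes is an assumption taken from the "2-3 times" hint above, and the device path has to be adjusted to wherever the USB disk appears.

```shell
# Hypothetical helper: read a device (or file) end to end several times,
# stopping at the first pass where dd reports an error.
stress_read() {
    dev=$1          # e.g. /dev/sda for the USB disk
    runs=${2:-3}    # number of passes; 3 is an assumed default
    i=1
    while [ "$i" -le "$runs" ]; do
        if ! dd if="$dev" of=/dev/null bs=1M 2>/dev/null; then
            echo "run $i: read failed" >&2
            return 1
        fi
        echo "run $i: ok"
        i=$((i + 1))
    done
}
```

On the board this would be called as `stress_read /dev/sda 3`; a non-zero exit status means one of the passes hit an I/O error.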

Now, just because you have a board that fails the test, I am not in a position to have a bunch of RMAs coming in to get them "fixed". Again, we do not know at this point what exactly the issue is and why it just happens on certain boards.

Gerald

David Hagood

unread,

Jun 12, 2009, 5:43:07 PM6/12/09

to Beagle Board

Forgive me for asking, but I've found that sometimes it's the
questions you didn't ask that get you in trouble:

Did you run the stress test against the replacement board you sent out
to me? Did it pass?

Gerald Coley

unread,

Jun 12, 2009, 5:59:45 PM6/12/09

to beagl...@googlegroups.com

I am not sure if I should answer that, considering the fact that you felt that you had to ask it.

Gerald

Gerald Coley

unread,

Jun 12, 2009, 6:01:34 PM6/12/09

to beagl...@googlegroups.com

BTW, the board you sent back was one of the first Rev C2 boards that went out. It seems to me to have a lot of problems besides just EHCI. It is very unstable.

Gerald

On Fri, Jun 12, 2009 at 4:43 PM, David Hagood <david....@aeroflex.com> wrote:

Marcus Bauer

unread,

Jun 13, 2009, 2:04:28 AM6/13/09

to beagl...@googlegroups.com

On Fri, 12 Jun 2009 13:11:46 -0500
Gerald Coley <ger...@beagleboard.org> wrote:

> dd if=/dev/sda of=/dev/null bs=1M
>
> You may have to run it 2-3 times before it fails.

My board fails within 4-5 seconds and the LCD starts immediately to
flicker then goes off.

The last messages on the serial console are:

omap-dss DISPC error: dispc irq error status 00e6
omap-dss DISPC error: dispc irq error status 0040
omap-dss DISPC error: dispc irq error status 00c0
omap-dss DISPC error: dispc irq error status 00e2
omap-dss DISPC error: dispc irq error status 0040
omap-dss DISPC error: dispc irq error status 00c0
omap-dss DISPC error: dispc irq error status 00e2
omap-dss DISPC error: dispc irq error status 00e2
omap-dss DISPC error: dispc irq error status 00e2
omap-dss DISPC error: dispc irq error status 0040
omap-dss DISPC error: Excessive DISPC errors Turning off lcd and digit
omap-dss DISPC error: Excessive DISPC errors Turning off lcd and digit

Marcus

karminet

unread,

Jul 23, 2009, 3:14:11 AM7/23/09

to Beagle Board

Dear Gerald

We saw this post just now, after a night of work trying to understand what we
were doing wrong, because the USB was experiencing strange "USB
disconnect" errors under heavy net traffic (we are using a USB-LAN
adapter).
Is it possible to know if the issue has been identified and a
workaround is possible?

Kind regards,
Dario

Nuno

unread,

Jul 23, 2009, 3:37:52 AM7/23/09

to Beagle Board

> We saw this post right now after a night of work to understand what we
> were doing wrong 'cause the USB was experiencing strange "USB
> disconnect" errors on heavy net traffic (we are using an USB-LAN
> adapter).

I have 3 Rev C boards and I'm using a USB-LAN adapter (ASIX based). My
solution to this was to put the BeagleBoard inside an antistatic
bag... it looks stupid, I know, but it works...
BTW, have you tried to compile something on the board? All of my
boards give me segfaults and internal errors in gcc... I suspect
RAM issues...

Gerald Coley

unread,

Jul 23, 2009, 7:43:04 AM7/23/09

to beagl...@googlegroups.com

The issue is being identified and a workaround is being worked on. It is not a Beagle issue per se, but an OMAP issue. Some boards are worse than others and some boards do not have the issue. The root cause has not been totally nailed down yet and TI is working the issue aggressively.

Gerald

Kiam Peng Wee

unread,

Jul 23, 2009, 7:51:20 AM7/23/09

to beagl...@googlegroups.com

On Thu, Jul 23, 2009 at 7:43 PM, Gerald Coley<ger...@beagleboard.org> wrote:
> The issue is being identified and a work around is being worked on. It is
> not a Beagle issue per se, but an OMAP issue. Some boards are worse than
> others and some boards do not have the issue. The root cause has not been
> totally nailed down yet and TI is working the issue aggressively.
>
> Gerald
>
>

I have 2 rev C3 beagleboards. One of them has this issue, the other does not.
Do keep us posted if they manage to find a workaround.

KP

Diego Almeida

unread,

Jul 23, 2009, 7:57:19 AM7/23/09

to beagl...@googlegroups.com

Then this may be my problem...
I have a BB Rev C3; the video works but is slow, and it cannot play WMV video.

---------------------------------------------
Diego de Almeida
Comsat - Tecnologia
Tel.: 55 11 - 3078 2824
55 11 - 3078 2816
www.comsattecnologia.com.br

Gerald Coley

unread,

Jul 23, 2009, 7:57:59 AM7/23/09

to beagl...@googlegroups.com

Will do. What we have found is that if you lower the ARM clock to 250 MHz the issue goes away. We are looking into the clock tree settings inside the OMAP. We are also working on a way to recover from the condition.

Gerald

Gerald Coley

unread,

Jul 23, 2009, 7:59:47 AM7/23/09

to beagl...@googlegroups.com

This is an EHCI issue that will force a disconnect of the USB PHY requiring you to power cycle the board. Is this what you are seeing?

Gerald

Diego Almeida

unread,

Jul 23, 2009, 8:02:38 AM7/23/09

to beagl...@googlegroups.com

My problem is that I am not able to play WMV HD video on a BB C3.

---------------------------------------------
Diego de Almeida
Comsat - Tecnologia
Tel.: 55 11 - 3078 2824
55 11 - 3078 2816
www.comsattecnologia.com.br

Gerald Coley

unread,

Jul 23, 2009, 8:23:29 AM7/23/09

to beagl...@googlegroups.com

Then I don't think this is an EHCI issue but some other issue you are having.

Gerald

Diego Almeida

unread,

Jul 23, 2009, 9:33:58 AM7/23/09

to beagl...@googlegroups.com

The Ethernet configuration does not work in the 2009 Angstrom.

---------------------------------------------
Diego de Almeida
Comsat - Tecnologia
Tel.: 55 11 - 3078 2824
55 11 - 3078 2816
www.comsattecnologia.com.br

Thu, 23 Jul 2009 06:59:47 -0500, Gerald Coley wrote:

Gerald Coley

unread,

Jul 23, 2009, 9:48:06 AM7/23/09

to beagl...@googlegroups.com

Then I suggest that you post a separate email thread to the group concerning this to get some assistance.

Gerald

karminet

unread,

Jul 23, 2009, 6:06:59 PM7/23/09

to Beagle Board

I found how to lower the ARM clock to 250 MHz.

The u-boot source file that has to be modified is
include/asm-arm/arch-omap3/clocks_omap3.h.

Here I modified the MPU_M_13_ES2 value.

For more information, go to http://git.mansr.com/?p=u-boot;a=commitdiff;h=045149ea1076575f773079677a3d1b01ff71757c
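
For anyone wondering what changing that multiplier does, here is a rough sketch of the arithmetic. It assumes the MPU DPLL follows the usual PLL relation, output = reference * M / (N + 1) / M2. The numbers used (13 MHz reference, M = 500, N = 12, M2 = 1) are illustrative assumptions and are not taken from the commit linked above; `dpll_mhz` is a hypothetical helper name. Check the header and the OMAP3 TRM before relying on any of this.

```shell
# dpll_mhz REF_MHZ M N M2
# Prints the PLL output in MHz using integer arithmetic:
#   out = REF * M / (N + 1) / M2
dpll_mhz() {
    echo $(( $1 * $2 / ($3 + 1) / $4 ))
}

dpll_mhz 13 500 12 1   # assumed full-speed values -> 500
dpll_mhz 13 250 12 1   # halving M halves the clock -> 250
```

Under these assumptions, halving the multiplier alone is enough to go from 500 MHz to 250 MHz, which matches the one-value change described above.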

Dario

Nuno Felicio

unread,

Jul 24, 2009, 5:12:32 PM7/24/09

to Beagle Board

> I found how to lower the ARM clock to 250MHZ.

Does this solve the instability with EHCI?

Nuno Felicio

unread,

Jul 24, 2009, 5:13:52 PM7/24/09

to Beagle Board

Dario,

> I found how to lower the ARM clock to 250MHZ.

Has your problem gone away?

> For more information go to http://git.mansr.com/?p=u-boot;a=commitdiff;h=045149ea1076575f7730796...
>
> Dario

Gerald Coley

unread,

Jul 24, 2009, 7:43:38 PM7/24/09

to beagl...@googlegroups.com

This is VERY PRELIMINARY information. It appears that lowering the VDD2 voltage solves the EHCI issue that some people may be having. What we don't know is what impact, if any, lowering the voltage will have in other areas of the system. We also don't know what the exact voltage setting needs to be. We think that a setting of between 1V and 1.05V will work, but that is not definite. We are doing more testing to determine where it needs to be set. The problem is that the issue is only seen in about 40% of the units. 60% of the units work just fine at the current default settings.

If anyone that is having this issue wants to play around with the VDD2 voltage settings feel free to do so and please let us know your results. We hope to have an official solution in a couple of weeks at the latest after more data is collected from our testing.

Gerald

karminet

unread,

Jul 25, 2009, 6:08:06 PM7/25/09

to Beagle Board

Yes. It finally resolves the EHCI problem.

The BB becomes really stable and no more unstable behaviour
appears.

I have found one strange phenomenon, though: when you boot the BB, sometimes
(not predictably) the USB adapter is not detected correctly and eth0
will not come up. To avoid this I compiled the kernel with
the patch described here:

http://article.gmane.org/gmane.linux.usb.general/19647

Best regards

Dario

Frantisek Dufka

unread,

Jul 27, 2009, 8:21:25 AM7/27/09

to beagl...@googlegroups.com, ger...@beagleboard.org

Gerald Coley wrote:
> If anyone that is having this issue wants to play around with the VDD2
> voltage settings feel free to do so and please let us know your
> results.

Is VDD2 software controlled? If I ask such a question, does it
automatically mean I am not qualified to play around with this? The best
would be a prebuilt kernel with tunable VDD2 voltage via some knob in
/sys. Or maybe it can be changed via some u-boot command before booting
the kernel?

I think I do have this issue and I would be willing to play with it. It
is rev C2 board and no matter what power supply I attach or what usb hub
I attach the hub is disconnected when I attach usb keyboard and the EHCI
port doesn't work (re-plugging hub, trying other usb 2.0 device) until I
boot the board again. I tried 2 hubs, 3 power supplies (5V/1A and two
5V/2.5A), 2 keyboards.

Also usb harddisk connected directly to EHCI port mostly works but fails
later when reading more data (using dd, testing longer movie in mplayer).

All this happens when using images from
http://code.google.com/p/beagleboard/wiki/BeagleboardRevCValidation

> We hope to have an official solution in a couple of weeks at
> the latest after more data is collected from our testing.

Great. I hope this can be fixed only in software. I noticed this issue
only recently and it is more than 90 days from buying the board so
according to http://beagleboard.org/support I cannot do RMA (and anyway
I am in Europe and bought it via friend in US so the digi-key invoice
has different name and address).

Frantisek

Nuno Felicio

unread,

Jul 27, 2009, 5:41:03 PM7/27/09

to Beagle Board

Gerald, I too have this problem. At the moment I'm running a test
with the OMAP3 processor at 250 MHz on two BeagleBoards that have
this problem :(.
So far it is stable. If your project is viable with the board
at lower performance, I can send you the u-boot image with that little
mod in it... good luck ;)

Nuno

On Jul 27, 1:21 pm, Frantisek Dufka <duf...@gmail.com> wrote:
> Gerald Coley wrote:
> > If anyone that is having this issue wants to play around with the VDD2
> > voltage settings feel free to do so and please let us know your
> > results.
>
> Is VDD2 software controlled? If I ask such question does it
> automatically mean I am not qualified to play around with this? The best
> would be prebuild kernel with tunable VDD2 voltage via some knob in
> /sys. Or maybe it can be changed via some uboot command before booting
> kernel?
>
> I think I do have this issue and I would be willing to play with it. It
> is rev C2 board and no matter what power supply I attach or what usb hub
> I attach the hub is disconnected when I attach usb keyboard and the EHCI
> port doesn't work (re-plugging hub, trying other usb 2.0 device) until I
> boot the board again. I tried 2 hubs, 3 power supplies (5V/1A and two
> 5V/2.5A), 2 keyboards.
>
> Also usb harddisk connected directly to EHCI port mostly works but fails
> later when reading more data (using dd, testing longer movie in mplayer).
>

> All this happens when using images from http://code.google.com/p/beagleboard/wiki/BeagleboardRevCValidation

>
> > We hope to have an official solution in a couple of weeks at
> > the latest after more data is collected from our testing.
>
> Great. I hope this can be fixed only in software. I noticed this issue
> only recently and it is more than 90 days from buying the board so

> according to http://beagleboard.org/support I cannot do RMA (and anyway

Gerald Coley

unread,

Jul 27, 2009, 7:31:10 PM7/27/09

to beagl...@googlegroups.com

There is another fix that we are working on. We hope to have more information later this week on that. It does not require slowing down the processor.

Gerald

Nuno Felicio

unread,

Jul 29, 2009, 4:22:48 PM7/29/09

to Beagle Board

After 2 days, one of the two BeagleBoards I was running at half speed
(250 MHz) to try to solve the EHCI issue has dropped the EHCI
port :(. I'm sorry, but I must give up for now on my attempts to
work around these problems... I will RMA the board that failed and
use for the moment another OMAP3 board that has an Ethernet port on
board...

> There is another fix that we are working on. We hope to have more
> information later this week on that. It does not require slowing down the
> processor.

I hope this solves the issue... the BeagleBoard has a lot of
potential, but it needs stable connectivity...

Good Luck!!! ;)

On Jul 28, 12:31 am, Gerald Coley <ger...@beagleboard.org> wrote:
> There is another fix that we are working on. We hope to have more
> information later this week on that. It does not require slowing down the
> processor.
>
> Gerald
>

> > > according to http://beagleboard.org/support I cannot do RMA (and anyway

Gerald Coley

unread,

Jul 29, 2009, 4:27:01 PM7/29/09

to beagl...@googlegroups.com

The RMA for the EHCI issue will not be accepted. The solution is based on a SW fix that we are working on which involves adjusting the VDD2 voltage level. It is nothing that can be "repaired" via an RMA. You can attempt to do this yourself or you can wait until the SW fix has been released.

Gerald

Nuno Felicio

unread,

Jul 29, 2009, 6:16:55 PM7/29/09

to Beagle Board

Gerald, thank you for the reply

> The RMA for the EHCI issue will not be accepted.

I have 3 Rev C boards and all of them have this problem. I need to
complete a project for my company; it's a critical project, and for that
I bought another board so that I have hardware that works well.

> The solution is based on a
> SW fix that we are working on which involves adjusting the VDD2 voltage
> level.

Yes, you have said that already, but HOW? In u-boot? In the kernel? Using
APM? In what range? Up or down? Much or little? Maybe an
experimental patch... something...

>You can attempt to
> do this yourself

I know a little about embedded work using Linux and the ARM architecture, but... for
example, to discover how to slow down the OMAP I had to search in
u-boot and use some tips from other users; nothing is officially
explained about how to deal with this issue.

> you can wait until the SW fix has been released.

No, I can't. I'm sorry, but my project is not a hobby project.

I understand the complexity of the problem, but you must also understand
that some of the buyers of the BeagleBoard are doing serious
work, and if there are problems with the EHCI port (a critical
component for me and, I'm sure, for many others) I think that a word of
warning to potential buyers should be given... Many people
have been waiting for a working USB host port before buying a
BeagleBoard... and now this...

thanks again and good luck

Nuno

On Jul 29, 9:27 pm, Gerald Coley <ger...@beagleboard.org> wrote:
> The RMA for the EHCI issue will not be accepted. The solution is based on a
> SW fix that we are working on which involves adjusting the VDD2 voltage
> level. It is nothing that can be "repaired" via an RMA. You can attempt to
> do this yourself or you can wait until the SW fix has been released.
>
> Gerald
>

> > > > > according to http://beagleboard.org/support I cannot do RMA (and anyway

Gerald Coley

unread,

Jul 29, 2009, 6:51:47 PM7/29/09

to beagl...@googlegroups.com

It is set in the kernel. The "how" is the piece that we are working on along with the "what". We need to make sure there are no other side effects in making this change.

Gerald

Antti P Miettinen

unread,

Aug 4, 2009, 6:23:44 AM8/4/09

to beagl...@googlegroups.com

Gerald Coley <ger...@beagleboard.org> writes:
> This is VERY PRELIMINARY information. It appears that lowering the VDD2
> voltage solves the EHCI issue that some people may be having.

Any experimental kernels/bootloaders to try? I'm running my
beagleboard from a USB disk and it usually runs without problems for
quite long times, but eventually I get something like:

hub 2-2:1.0: cannot reset port 2 (err = -71)
hub 2-2:1.0: cannot reset port 2 (err = -71)
hub 2-2:1.0: cannot reset port 2 (err = -71)
hub 2-2:1.0: cannot reset port 2 (err = -71)
hub 2-2:1.0: cannot reset port 2 (err = -71)
hub 2-2:1.0: Cannot enable port 2. Maybe the USB cable is bad?

etc.

I'd be happy to test any patches or do some debugging if someone
guides me into what kind of information would be useful :-)
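
As a starting point for collecting comparable data, the kernel log can be scanned for the failure messages quoted in this thread (err = -71 is EPROTO on Linux). This is only a sketch: `count_usb_errors` is a hypothetical helper, and the pattern list covers just the messages that appear in these posts, not every way the port can fail.

```shell
# Count kernel-log lines (read from stdin) that match the USB failure
# messages quoted in this thread. The pattern list is an assumption
# based only on those posts.
count_usb_errors() {
    grep -c -E 'disabled by hub \(EMI\?\)|USB disconnect|cannot reset port|Cannot enable port'
}

# Typical use on the board:
#   dmesg | count_usb_errors
```

Running it before and after a stress run, and posting both counts along with the board revision, would make reports easier to compare.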

--
http://www.iki.fi/~ananaza/

Gerald Coley

unread,

Aug 4, 2009, 12:17:59 PM8/4/09

to beagl...@googlegroups.com

I will keep you in mind and let you know when we have something. Right now we are still trying to figure out the best approach and get it tested.

Gerald

Joep Schroen

unread,

Aug 13, 2009, 8:51:51 AM8/13/09

to beagl...@googlegroups.com

Hi Gerald,

Maybe you can give us a status update?

Wkr,

Joep

2009/8/4 Gerald Coley <ger...@beagleboard.org>

Gerald Coley

unread,

Aug 13, 2009, 9:06:23 AM8/13/09

to beagl...@googlegroups.com

You need to enable Smart Reflex on VDD2 in the Kernel. I can't seem to get anyone around here to help me formalize exactly how to do this in the Kernel. So if you can figure it out, I would go ahead and do it. If you could please then post it to the community, that would be great.

Gerald

Frans Meulenbroeks

unread,

Aug 14, 2009, 5:07:09 PM8/14/09

to beagl...@googlegroups.com

2009/8/13 Gerald Coley <ger...@beagleboard.org>:

> You need to enable Smart Reflex on VDD2 in the Kernel. I can't seem to get
> anyone around here to help me formalize exactly how to do this in the
> Kernel. So if you can figure it out, I would go ahead and do it. If you
> could please then post it to the community, that would be great.
>
> Gerald

Did a quick google and got these two links from the TI forum:
http://community.ti.com/forums/p/4289/16084.aspx#16084
http://e2e.ti.com/forums/p/4319/15852.aspx

Haven't tried it myself (and it is too late to do now), but maybe it
helps someone else get started.

Frans

Frantisek Dufka

unread,

Aug 14, 2009, 6:40:16 PM8/14/09

to beagl...@googlegroups.com

Gerald Coley wrote:
> You need to enable Smart Reflex on VDD2 in the Kernel. I can't seem to
> get anyone around here to help me formalize exactly how to do this in
> the Kernel. So if you can figure it out, I would go ahead and do it. If
> you could please then post it to the community, that would be great.
>
> Gerald

http://www.webos-internals.org/wiki/Patch_webOS_CPU_Frequency_or_Voltage_Scaling#Enable_SmartReflex

the file is there but after running
echo -n 1 > /sys/power/sr_vdd2_autocomp
there is still zero in the file and in kernel log I see
OPP3 doesn't support SmartReflex
SR2: VDD autocomp not activated

Maybe different kernel would do.
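
The write-then-read-back check described above can be scripted so the result is unambiguous. A minimal sketch, assuming the knob behaves like an ordinary sysfs file; `enable_knob` is a hypothetical helper name and takes the path as an argument.

```shell
# Hypothetical helper: write 1 to a sysfs-style knob and confirm it stuck.
# On the kernel discussed here the path would be
# /sys/power/sr_vdd2_autocomp.
enable_knob() {
    knob=$1
    if [ ! -w "$knob" ]; then
        echo "$knob: missing or not writable" >&2
        return 2
    fi
    printf 1 > "$knob"
    val=$(cat "$knob")
    if [ "$val" = "1" ]; then
        echo "$knob: enabled"
    else
        # The kernel accepted the write but kept the old value.
        echo "$knob: still '$val' (check dmesg)" >&2
        return 1
    fi
}
```

A return status of 1 corresponds to the case reported above, where the file still reads zero and the kernel log explains why.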

Siarhei Siamashka

unread,

Aug 15, 2009, 4:04:14 PM8/15/09

to beagl...@googlegroups.com, linux...@vger.kernel.org

Would it make sense to ask for help in the linux-omap mailing list?
I guess Gerald has a reply from TI or a new errata list or something else
which says that "enabling Smart Reflex on VDD2" can workaround EHCI problem.

Now somebody knowledgeable about kernel internals, OMAP3, Smart Reflex
and power management code in general could probably provide some hints
or even a patch to enable this bugfix in the kernel. Maybe the fix is even
sitting in somebody's branch already and we just don't know. Or maybe a
solution for EHCI problem has been explained in the list, but just was
lost in the pile of other messages in linux-omap (it has quite a heavy
traffic).

It may make sense to look at the 'pm' branch:
http://elinux.org/OMAP_Power_Management

But this wiki page has comment:
"Enables SmartReflex autocompensation on VDD2 (Note: This feature can only be
tested on a ES3.1 silicon):
# echo 1 > /sys/power/sr_vdd2_autocomp"

And AFAIK beagleboard revision C has ES3.0 silicon. So no luck here. Or can
something still be done?

The whole beagleboard EHCI story starts here:
http://groups.google.com/group/beagleboard/browse_thread/thread/5b8385f0bb1f63da/d46625fe49783a8a

--
Best regards,
Siarhei Siamashka

Koen Kooi

unread,

Aug 15, 2009, 4:06:24 PM8/15/09

to beagl...@googlegroups.com

On 15 Aug 2009, at 22:04, Siarhei Siamashka wrote:

>
> On Saturday 15 August 2009, Frantisek Dufka wrote:
>> Gerald Coley wrote:
>>> You need to enable Smart Reflex on VDD2 in the Kernel. I can't
>>> seem to
>>> get anyone around here to help me formalize exactly how to do this
>>> in
>>> the Kernel. So if you can figure it out, I would go ahead and do
>>> it. If
>>> you could please then post it to the community, that would be great.
>>>
>>> Gerald
>>
>> http://www.webos-internals.org/wiki/Patch_webOS_CPU_Frequency_or_Voltage_Scaling#Enable_SmartReflex
>>
>> the file is there but after running
>> echo -n 1 > /sys/power/sr_vdd2_autocomp
>> there is still zero in the file and in kernel log I see
>> OPP3 doesn't support SmartReflex
>> SR2: VDD autocomp not activated
>>
>> Maybe different kernel would do.
>
> Would it make sense to ask for help in the linux-omap mailing list?

You mean like this: http://thread.gmane.org/gmane.linux.ports.arm.omap/22131
:)

regards,

Koen

Siarhei Siamashka

unread,

Aug 15, 2009, 5:14:17 PM8/15/09

to beagl...@googlegroups.com

On Saturday 15 August 2009, Koen Kooi wrote:
> On 15 Aug 2009, at 22:04, Siarhei Siamashka wrote:
> > On Saturday 15 August 2009, Frantisek Dufka wrote:
> >> Gerald Coley wrote:
> >>> You need to enable Smart Reflex on VDD2 in the Kernel. I can't
> >>> seem to
> >>> get anyone around here to help me formalize exactly how to do this
> >>> in
> >>> the Kernel. So if you can figure it out, I would go ahead and do
> >>> it. If
> >>> you could please then post it to the community, that would be great.
> >>>
> >>> Gerald
> >>
> >> http://www.webos-internals.org/wiki/Patch_webOS_CPU_Frequency_or_Voltage_Scaling#Enable_SmartReflex
> >>
> >> the file is there but after running
> >> echo -n 1 > /sys/power/sr_vdd2_autocomp
> >> there is still zero in the file and in kernel log I see
> >> OPP3 doesn't support SmartReflex
> >> SR2: VDD autocomp not activated
> >>
> >> Maybe different kernel would do.
> >
> > Would it make sense to ask for help in the linux-omap mailing list?
>
> You mean like this:
> http://thread.gmane.org/gmane.linux.ports.arm.omap/22131
>
> :)

OK, thanks. It's good that somebody is actively looking for a solution.

This EHCI issue suddenly started to be a problem for me too just a few days
ago when I tried to hook an external USB HDD to beagle. I did not pay much
attention to these discussion threads earlier so don't have a full picture
yet. And I'm not sure if I have enough spare time to catch up with this stuff,
probably I will just put this HDD on a shelf and wait a few weeks/months :)

Koen Kooi

unread,

Aug 15, 2009, 5:44:39 PM8/15/09

to beagl...@googlegroups.com

On 15 Aug 2009, at 23:14, Siarhei Siamashka <siarhei....@gmail.com

I'm kind of in the same situation with getting a decent fix out :(

I hope to get something ready tomorrow.

regards,

koen

Gregoire Gentil

unread,

Aug 15, 2009, 6:02:18 PM8/15/09

to Siarhei Siamashka, beagl...@googlegroups.com, linux...@vger.kernel.org

On the Touch Book (which has the same transceiver as the BeagleBoard), we
experience a similar problem with high speed USB. I'm playing with those
various options and patches but it's not working so far. So I'm very
interested in getting any help on this problem too. More documentation about
Smart Reflex would help.

Grégoire

Paul Walmsley

Aug 15, 2009, 8:19:23 PM

to Siarhei Siamashka, beagl...@googlegroups.com, linux...@vger.kernel.org

Hi

On Sat, 15 Aug 2009, Siarhei Siamashka wrote:

> But this wiki page has comment:
> "Enables SmartReflex autocompensation on VDD2 (Note: This feature can only be
> tested on a ES3.1 silicon):
> # echo 1 > /sys/power/sr_vdd2_autocomp"
>
> And AFAIK beagleboard revision C has ES3.0 silicon. So no luck here. Or can
> something still be done?

It should be possible to enable SmartReflex on <ES3.1, but the SR register
values need to be programmed at runtime. (ES3.1 has the SR register
values blown in to eFUSE OTPROM)

As far as I know, only TI has this information, so it's up to them to
release it. Although one of the device vendor kernel patches/tarballs
might have something useful here - I haven't looked recently.

- Paul

Woodruff, Richard

Aug 15, 2009, 10:53:51 PM

to Siarhei Siamashka, beagl...@googlegroups.com, linux...@vger.kernel.org

> From: linux-om...@vger.kernel.org [mailto:linux-omap-
> ow...@vger.kernel.org] On Behalf Of Siarhei Siamashka
> Sent: Saturday, August 15, 2009 3:04 PM

> On Saturday 15 August 2009, Frantisek Dufka wrote:
> > Gerald Coley wrote:
> > > You need to enable Smart Reflex on VDD2 in the Kernel. I can't seem to
> > > get anyone around here to help me formalize exactly how to do this in
> > > the Kernel. So if you can figure it out, I would go ahead and do it. If
> > > you could please then post it to the community, that would be great.
> > >
> > > Gerald
> >
> > > http://www.webos-internals.org/wiki/Patch_webOS_CPU_Frequency_or_Voltage_Scaling#Enable_SmartReflex

It seems hacking moves along fast; there is some basic info and some misinformation on the Pre. The Pre kernel as shipped needs a few patches, and it is capable of running SR+DVFS at the same time.

> > the file is there but after running
> > echo -n 1 > /sys/power/sr_vdd2_autocomp
> > there is still zero in the file and in kernel log I see
> > OPP3 doesn't support SmartReflex
> > SR2: VDD autocomp not activated

In older code you need to enable it both at compile time and at run time to get it to take. Just turning it on won't be very safe until some bug fixes make their way out.

> It may make sense to look at the 'pm' branch:
> http://elinux.org/OMAP_Power_Management

The pm branch needs a few changes to make it viable for SR.

> But this wiki page has comment:
> "Enables SmartReflex autocompensation on VDD2 (Note: This feature can only be
> tested on a ES3.1 silicon):
> # echo 1 > /sys/power/sr_vdd2_autocomp"
>
> And AFAIK beagleboard revision C has ES3.0 silicon. So no luck here. Or can
> something still be done?

You will only find characterized parameters for 3.1 and 3.1.1. These are production parts.

I had heard some talk about running at a lower voltage helping EHCI out. That seems backwards, even if tests show it working. I wonder if it is an artifact of a timing change which happens at lower voltage and because of some new internal chip activity.

Regards,
Richard W.
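
The point above, that older code needs SmartReflex enabled both at compile time and at run time, can be checked on the board before touching sysfs. This is only a minimal sketch under assumptions: the helper name `config_enabled` is hypothetical, the option name `CONFIG_OMAP_SMARTREFLEX` is not given anywhere in this thread and should be verified against your kernel tree, and `/proc/config.gz` only exists if the kernel was built with `CONFIG_IKCONFIG_PROC`.

```shell
#!/bin/sh
# Sketch: report whether a kernel config file has a given option set to "y".
# $1 = config file (plain text or .gz), $2 = option name.
# The option name used with this helper is an assumption; check your kernel tree.
config_enabled() {
    cfg="$1"
    opt="$2"
    # Decompress on the fly for /proc/config.gz style files.
    case "$cfg" in
        *.gz) gzip -dc "$cfg" ;;
        *)    cat "$cfg" ;;
    esac | grep "^${opt}=y\$" > /dev/null
}
```

On a board this might be called as `config_enabled /proc/config.gz CONFIG_OMAP_SMARTREFLEX && echo "built in"`. A failed check only covers the compile-time half; the run-time write to `/sys/power/sr_vdd2_autocomp` is still needed.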

Koen Kooi

Aug 16, 2009, 6:08:03 AM

to beagl...@googlegroups.com, Gerald Coley

On 15 Aug 2009, at 23:14, Siarhei Siamashka wrote:

>
> On Saturday 15 August 2009, Koen Kooi wrote:
>> On 15 Aug 2009, at 22:04, Siarhei Siamashka wrote:
>>> On Saturday 15 August 2009, Frantisek Dufka wrote:
>>>> Gerald Coley wrote:
>>>>> You need to enable Smart Reflex on VDD2 in the Kernel. I can't
>>>>> seem to
>>>>> get anyone around here to help me formalize exactly how to do this
>>>>> in
>>>>> the Kernel. So if you can figure it out, I would go ahead and do
>>>>> it. If
>>>>> you could please then post it to the community, that would be
>>>>> great.
>>>>>
>>>>> Gerald
>>>>
>>>> http://www.webos-internals.org/wiki/Patch_webOS_CPU_Frequency_or_Voltage_Scaling#Enable_SmartReflex
>>>>
>>>> the file is there but after running
>>>> echo -n 1 > /sys/power/sr_vdd2_autocomp
>>>> there is still zero in the file and in kernel log I see
>>>> OPP3 doesn't support SmartReflex
>>>> SR2: VDD autocomp not activated
>>>>
>>>> Maybe different kernel would do.

Try the kernel (and modules) from http://dominion.thruhere.net/koen/OE/vbb/ ,
which is built from Kevin's pm-2.6.29 branch with DSS2, isp, iommu,
resizer, vfp, alsa, led, syscalls and zippy daughterboard patches
applied. It's basically the angstrom kernel + pm - extra musb patches. As
a bonus you will be able to use cpufreq to enable 600MHz :)

regards,

Koen
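
For readers who want to try the cpufreq part, here is a minimal sketch of pinning the CPU to one frequency through the standard cpufreq sysfs files. It assumes the kernel exposes `scaling_available_frequencies` and has the `userspace` governor built in, neither of which is confirmed in this thread; the function name `set_cpu_khz` is hypothetical. Frequencies are in kHz, so 500 MHz is `500000`.

```shell
#!/bin/sh
# Sketch: pin the CPU to one frequency via the cpufreq sysfs interface.
# $1 = cpufreq directory, $2 = target frequency in kHz (500000 = 500 MHz).
set_cpu_khz() {
    dir="$1"
    khz="$2"
    # Only accept a frequency that the driver itself advertises.
    case " $(cat "$dir/scaling_available_frequencies") " in
        *" $khz "*) ;;
        *) echo "unsupported"; return 1 ;;
    esac
    # scaling_setspeed is only honoured while the userspace governor is active.
    echo userspace > "$dir/scaling_governor"
    echo "$khz" > "$dir/scaling_setspeed"
    echo "set"
}
```

On a board this would typically be run as root, for example `set_cpu_khz /sys/devices/system/cpu/cpu0/cpufreq 500000`. Taking the directory as a parameter keeps the function easy to try against a scratch directory first.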


Joep Schroen

Aug 17, 2009, 2:05:38 AM

to beagl...@googlegroups.com

Hi Richard,

> But this wiki page has comment:
> "Enables SmartReflex autocompensation on VDD2 (Note: This feature can only be
> tested on a ES3.1 silicon):
> # echo 1 > /sys/power/sr_vdd2_autocomp"
>
> And AFAIK beagleboard revision C has ES3.0 silicon. So no luck here. Or can
> something still be done?

>>You will only find characterized parameters for 3.1 and 3.1.1. These are production parts.

Do you mean that the Beagleboards are shipped with Engineering Samples that have no characterized SR parameters in their fuses "block"? And that means we need those from TI to be programmed "manually" during runtime?

In that case we're really f@#$ ed without help from TI?

Wkr,

Joep

Frantisek Dufka

Aug 17, 2009, 3:13:46 AM

to beagl...@googlegroups.com, Gerald Coley

Koen Kooi wrote:

>>> wrote:
>>>> On Saturday 15 August 2009, Frantisek Dufka wrote:
>>>>> the file is there but after running
>>>>> echo -n 1 > /sys/power/sr_vdd2_autocomp
>>>>> there is still zero in the file and in kernel log I see
>>>>> OPP3 doesn't support SmartReflex
>>>>> SR2: VDD autocomp not activated
>>>>>
>>>>> Maybe different kernel would do.
>
> Try the kernel (and modules) from
> http://dominion.thruhere.net/koen/OE/vbb/ , which is built from Kevin's
> pm-2.6.29 branch with DSS2, isp, iommu, resizer, vfp, alsa, led,
> syscalls and zippy daughterboard patches applied. It's basically the
> angstrom kernel + pm - extra musb patches. As a bonus you will be able
> to use cpufreq to enable 600MHz :)

Thanks Koen,

echo -n 1 > /sys/power/sr_vdd2_autocomp now keeps the value stored, and
it may even be doing something. I've successfully read two external USB
disks (60GB, 320GB) via dd with no errors. Both were connected to a USB hub.
I will try without the hub too.
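
The write-then-read-back check described above can be sketched as a small helper. This is an illustration only: the function name `sr_enable` and its status strings are hypothetical, and it relies on the behaviour reported in this thread, namely that the file still reads 0 when the kernel refuses the request.

```shell
#!/bin/sh
# Sketch: write 1 to a SmartReflex autocomp node and check whether it stuck.
# $1 = sysfs node, e.g. /sys/power/sr_vdd2_autocomp (path taken from this thread).
sr_enable() {
    node="$1"
    if [ ! -w "$node" ]; then
        echo "missing"
        return 2
    fi
    # Same write as "echo -n 1 > node" in the thread.
    printf '1' > "$node" 2>/dev/null
    # Per the reports in this thread, the file keeps reading 0 on refusal.
    value=$(cat "$node" 2>/dev/null)
    if [ "$value" = "1" ]; then
        echo "enabled"
        return 0
    fi
    echo "refused"
    return 1
}
```

Usage would be `sr_enable /sys/power/sr_vdd2_autocomp`; on "refused", the kernel log should contain the reason, such as the "SR2: VDD autocomp not activated" message quoted earlier.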

This kernel has two issues though. There are some periodic power
management related stack traces in the kernel log (something to do with PRCM and
cpu idle), and when switched to 600MHz (which is the default) my Nokia
branded mmcmobile 1GB card consistently gives me I/O errors on a specific
block when read via dd. At 500MHz it is OK. Two other SD cards and one
2GB Kingston mmcmobile are fine at both 600 and 500 MHz.

Also, when I enabled sleep while idle, it hung after a while. I could still see
the screen output, but the cursor stopped blinking and typing on the USB keyboard
did nothing.

All those tests were done with only a display and a USB keyboard, so no
kernel output was saved. Next time I'll try a serial console and save the log.

Frantisek
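
Once a kernel log has been saved, the failures reported at the start of this thread can be counted mechanically. The following is a rough sketch, not a definitive test: the function name `count_usb_errors` is hypothetical, and the two patterns are simply the "I/O error" and "disabled by hub" messages from the original report, so other failure modes would be missed.

```shell
#!/bin/sh
# Sketch: count kernel log lines that look like the EHCI failure in this thread.
# $1 = saved kernel log, e.g. produced with: dmesg > /tmp/ehci.log
count_usb_errors() {
    # grep -c prints 0 but exits non-zero when nothing matches, so ignore the status.
    grep -c -e 'I/O error' -e 'disabled by hub' "$1" || true
}
```

A typical run would save the log right after the dd read, for example `dmesg > /tmp/ehci.log`, and then call `count_usb_errors /tmp/ehci.log`; a count of 0 after reading a whole disk matches the "no errors" result described above.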

Joep Schroen

Aug 17, 2009, 5:46:58 AM

to beagl...@googlegroups.com

That's great news!

Koen, could you write down somewhere which patches to apply to which kernel release? (Maybe a howto on using OE/BitBake to build recent kernels with all those nice patches applied.)

Then we can finally start building a kernel that does work :-)

I think it's a good idea to just run the OMAP at 500MHz, given all the stability problems that seem to be related to thermal/timing issues with these E.S. chips.

Wkr,

Joep

2009/8/17 Frantisek Dufka <duf...@gmail.com>

Kiam Peng Wee

Aug 17, 2009, 7:47:22 AM

to beagl...@googlegroups.com

>
> I think it's a good idea to just run the OMAP at 500MHz, given all the
> stability problems that seem to be related to thermal/timing issues with
> these E.S. chips.

Are there thermal issues?

KP