Hi, sorry for the lengthy post.
Martin Guy wrote:
> According to this thread on the linux-cirrus mailing list:
>
> http://www.freelists.org/post/linux-cirrus/edb93xx-RedBoot-PLL1-settings-out-of-spec
> we are running PLL1 out of spec. Quote:
>
> "According to page 131 of the EP93xx user manual the PLL1_X1 output
> frequency range is > 294 MHz and < 368 MHz"
>
> Is clock instability maybe the single cause of the all the serial, USB
> and MMC problems? The posts in that suggest some possible alternative
> settings.
The "recommended" PLL setting:
-------------------------------
I have just verified and tried the exact value from that post in our 2.6.24.7
kernel. Unfortunately, USB disconnects happen as easily as before during
apt-get. In fact, the second restart (cold boot) had a USB disconnect within 5
seconds. :/
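For reference, the quoted PLL1_X1 limit can be turned into a quick check of
which first-stage multipliers are in spec. This is only a sketch: it assumes
the X1 stage multiplies the 14.7456 MHz crystal by an integer factor (which I
take to be X1FBD + 1, a 5-bit field, in the EP93xx user's guide), and the
function name is made up for illustration.

```python
XTAL_HZ = 14_745_600      # main crystal, 14.7456 MHz
X1_MIN_HZ = 294_000_000   # quoted limit: PLL1_X1 output > 294 MHz
X1_MAX_HZ = 368_000_000   # quoted limit: PLL1_X1 output < 368 MHz


def x1_multipliers_in_spec(xtal_hz=XTAL_HZ):
    """Return the integer multipliers n with X1_MIN_HZ < xtal_hz * n < X1_MAX_HZ.

    Assumption: n corresponds to (X1FBD + 1) with a 5-bit X1FBD, so n is 1..32.
    """
    return [n for n in range(1, 33) if X1_MIN_HZ < xtal_hz * n < X1_MAX_HZ]


for n in x1_multipliers_in_spec():
    print(f"n = {n}: PLL1_X1 = {XTAL_HZ * n / 1e6:.4f} MHz")
```

Under these assumptions only n = 20..24 qualify, that is 294.912 MHz to
353.8944 MHz; n = 25 already gives 368.64 MHz, just above the upper limit.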
In my experiments with the PLL1/PLL2/FCLK/HCLK/PCLK frequencies, I have stayed
in spec for all the limits mentioned in the datasheet in some tests, and in
others I deliberately went *out* of spec on some of them. What adds to the
complexity is that there are two PLLs and about 17-18 different clocks derived
from the main 14.7456 MHz crystal and the PLLs. Add the AHB and APB buses
and a bunch of FIQs and IRQs and there's a mess even without mutexes. My tests
have been of the random trial-and-error type, just to get a feeling for what
works and what doesn't.
My latest findings are these:
-------------------------------------------
* By lowering the peripheral clock, PCLK, I seem to be able to induce USB
disconnects. With 99% probability I get a spontaneous USB disconnect within
5-120 seconds of the USB connect during boot. That is without stressing the USB
with apt-get or dd.
* When the peripheral clock is overclocked and runs at the same frequency as the
AHB bus clock, HCLK, I cannot induce a USB disconnect. It seems pretty stable
under apt-get and dd. However, there seem to be some pretty bad disturbances of
the CS4202 chip, and it is probably not a good thing to do for the system as a
whole. These tests have never lasted longer than 4 hours. I'm setting up the CPU
right now to do an overnight run.
* There is a "feature" in the DMA: when a DMA channel is enabled, you have to
wait some clock cycles before you can access the DMA channel. This waiting time
depends on the AHB and peripheral clocks. This must be investigated further.
* When serious garbage appears on the serial console, waiting 20-40 min without
touching the Sim.One restores the serial communication to a good state. Now, how
weird is that?! At least this has worked four times for me; it would be
interesting to know whether it does for you too.
* There can be bus errors on the AHB bus. These, in combination with some long
IRQs, might screw things up for the USB unit. Remember, setting an extremely
long timeout period allowed fsck to finish its job even when multiple USB resets
were issued.
* This is one thing I have not tested, but I strongly believe it can be part of
the solution: we're running the Sim.One in Fast Bus mode. What I'd like to test
is running in Async mode. When running in Fast Bus mode, the CPU is clocked from
HCLK, which is only half of FCLK. So I think the CPU actually only runs
at 100 MHz when you think it runs at 200 MHz. I don't know if the bogoMIPS can
be of guidance here. An AVR32 running at 140 MHz reports 140 bogoMIPS; the
Sim.One reports 99 bogoMIPS at 200 MHz. The AVR32 has single-cycle instruction
execution. I'm new to the 920T, so I can't be certain. Do you know?
* I tried to run the system with the I-cache and/or D-cache disabled (3
combinations), but as the system was sooo slow none of the tests completed. No
conclusion could be drawn, except that the caches are necessary for speed.
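On the Fast Bus versus Async question above: as far as I can tell from the
ARM920T TRM, the clocking mode is selected by the iA (bit 31) and nF (bit 30)
bits of the CP15 control register (c1). A small decoder for a register value
read out over JTAG might look like the sketch below. The function name and the
example values are made up for illustration, and the bit positions should be
checked against the TRM before relying on this.

```python
def arm920t_clock_mode(cp15_c1):
    """Decode the clocking mode from a CP15 c1 (control register) value."""
    i_a = (cp15_c1 >> 31) & 1   # iA: asynchronous clock select
    n_f = (cp15_c1 >> 30) & 1   # nF: notFastBus select
    modes = {
        (0, 0): "FastBus",
        (0, 1): "Synchronous",
        (1, 0): "Reserved",
        (1, 1): "Asynchronous",
    }
    return modes[(i_a, n_f)]


# Hypothetical register values, only the top two bits matter here.
print(arm920t_clock_mode(0x0000107D))  # both bits clear
print(arm920t_clock_mode(0xC000107D))  # both bits set
```

Reading the register this way would at least confirm which mode the bootloader
leaves the core in before any attempt to change it.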
Ideas/conclusions/questions/thoughts:
-------------------------------------------
* The kernel version, 2.6.2x or 2.6.3x, doesn't seem to matter for the USB
disconnect/serial garbage problem.
* The garbage and USB disconnect problems might be related to time (garbage,
wait 30 min, no garbage). What do the RTC and other time functions do in the
software? Do they write any registers that could affect USB transmission? (How
can the baud rate (garbage) be affected by this???)
* The peripheral clock might disturb the USB clock (resonance, noise, whatever,
I don't know).
* There might be a lock in a peripheral IRQ or a mutex which blocks the USB
code from doing its magic. Do bus errors help to prolong the waiting time?
* Need to find out whether Linux heeds the need to wait before accessing DMA
channels that have just been enabled.
* Running the system's various clocks at less than 100 MHz, and as equal as
possible, has worked best for me so far.
The need for U-boot source:
----------------------------
To get into Async mode, the CPU needs to be in a privileged mode. Using OpenOCD
and JTAG to switch into this mode has been unsuccessful for me.
I would need sources and patches for any version of U-Boot.
I have tried, but failed, to get U-Boot to compile, as I do not fully understand
the U-Boot build system. Hacking the binary dump of U-Boot is an
option, but gives less flexibility. When I gave this a shot, I noted that the
patches in this post:
http://groups.google.com/group/sim1/browse_thread/thread/c62c88ecfdf4e043#
have not been applied intact (or at all).
About USB:
------------
There are actually two instances of the USB problem:
Disconnect and Reset. I get both. The difference is the length of the signal.
* A USB disconnect for low- and full-speed USB devices occurs when both D+ and
D- are logic low for more than 2.5 us. This condition, with both lines low at
the same time, is called the single-ended zero state (SE0). SE0 is used for
entering the End-of-Packet, Disconnect and Reset states.
* A Reset state occurs when the SE0 state has lasted at least 10 ms.
(The End-of-Packet state is entered when the SE0 state lasts at least 1 bit
time, followed by the Data J state (D+/D- hi/lo levels depending on
low-/full-speed device) for at least 1 bit time.)
(Source of info: USB complete 3ed [Axelson, ISBN10 1-931448-02-7])
Does our end-of-packet end up being interpreted as a disconnect due to bus
errors (AHB bus) and/or long IRQ service times?
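To put numbers on the thresholds above, here is a rough sketch that classifies
an SE0 pulse by its length, using only the 2.5 us and 10 ms figures quoted above
and the nominal bit rates (1.5 Mbit/s low speed, 12 Mbit/s full speed). The
names are made up, the 2-bit-time End-of-Packet is an assumption, and the sketch
ignores who is observing the bus (host/hub versus device), so it is an
illustration rather than a model of the real controller.

```python
DISCONNECT_S = 2.5e-6   # SE0 longer than this: disconnect
RESET_S = 10e-3         # SE0 at least this long: reset
BIT_RATE = {"low": 1.5e6, "full": 12e6}  # nominal bit rates in bit/s


def classify_se0(duration_s):
    """Classify an SE0 pulse purely by its duration in seconds."""
    if duration_s >= RESET_S:
        return "reset"
    if duration_s > DISCONNECT_S:
        return "disconnect"
    return "short SE0 (e.g. end-of-packet)"


def se0_bit_times(duration_s, speed):
    """Express an SE0 duration in bit times for the given bus speed."""
    return duration_s * BIT_RATE[speed]


for speed in ("low", "full"):
    eop_s = 2 / BIT_RATE[speed]  # assumed End-of-Packet: 2 bit times of SE0
    print(speed, classify_se0(eop_s),
          f"EOP SE0 = {eop_s * 1e6:.2f} us,",
          f"disconnect threshold = {se0_bit_times(DISCONNECT_S, speed):.2f} bit times")
```

With these assumptions a low-speed End-of-Packet SE0 is about 1.33 us, less than
a factor of two below the 2.5 us disconnect threshold, while at full speed it is
about 0.17 us. So an SE0 would have to be stretched far more at full speed than
at low speed before it could be mistaken for a disconnect.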
Reduction of complexity:
-------------------------
To reduce the complexity of the Linux kernel, I'd like to have a minimal OS
which does USB and serial communication. Are there any options available?
Newer versions of U-Boot can do USB communication.
Should I just make a minimal Linux kernel with most things excluded, like I did
with 2.6.31.6?
If there are silicon bugs, like in the Maverick, it will be very difficult to
sort this out while running a full system.
The crystal:
--------------
sergio.sorrenti wrote:
> Frequency:14.7456MHz
> Tolerance:+/-50ppm(standard)
Great. So *if* the error is simply multiplied in the PLL without being smoothed
out, worst case you'd get about +/-20 kHz at 400 MHz from the 14.7456 MHz
crystal, which is still +/-50 ppm (0.005%) in relative terms. However, PLLs are
pretty complex, as they themselves produce noise and jitter beyond our control,
so I think a simple linear multiplication does not give the real error.
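A quick sanity check of the numbers, assuming an ideal PLL that just multiplies
the crystal frequency (so it ignores the PLL's own jitter and noise): the
relative error stays the same and only the absolute error scales with the
frequency. The helper name is made up.

```python
def abs_error_hz(freq_hz, tol_ppm):
    """Absolute frequency error in Hz for a relative tolerance given in ppm."""
    return freq_hz * tol_ppm / 1e6


XTAL_HZ = 14_745_600   # 14.7456 MHz crystal
TOL_PPM = 50           # +/-50 ppm standard tolerance

print(abs_error_hz(XTAL_HZ, TOL_PPM))       # error at the crystal, in Hz
print(abs_error_hz(400_000_000, TOL_PPM))   # error at 400 MHz, in Hz
```

That is roughly +/-737 Hz at the crystal and +/-20 kHz at 400 MHz, which is
still +/-50 ppm (0.005%) of the output frequency.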
I could not exactly read the highest allowed assembly temperature for the
crystal from the datasheet. Maybe I'm blind. But the assembly temperature plot
at least shows that the peak temperature of about 220-225 C is only held for
5-10 seconds, which seems "normal" and hopefully is not too long or hot for this
specific crystal.
Only one thought: wasn't the software for the pick-n-place machine Y2K
compatible? The dates... ;) I'm assuming this is the correct plot.
> We are also checking the capacity of the tranceiver,
> as we changed from Maxim to Texas equivalent,
Is this the correct datasheet?
http://focus.ti.com/general/docs/lit/getliterature.tsp?genericPartNumber=max3243&fileType=pdf
Regards,
Marcus