On 4/19/2013 8:25 AM, Allan Herriman wrote:
> On Thu, 18 Apr 2013 17:36:35 -0400, rickman wrote:
>
>> On 4/18/2013 8:29 AM, Allan Herriman wrote:
>>>
>>> I've had a block RAM in an FPGA "forget" some bits after a clock
>>> glitch, even though the write enable was permanently tied inactive.
>>> Does that count as a hang?
>>
>> All synchronous logic can screw up with clock glitches. That includes
>> parts where all the logic is internal and you don't see it such as reset
>> controllers. A power glitch and the controller malfunctions...
>
> Yes, all synchronous logic can screw up with clock glitches, etc., but
> mostly it will come good again after a reset. The ram problem affects
> rams used as ROMs - fixed lookup tables that can get corrupted. Reset
> doesn't fix that; the FPGA needs to be reloaded.
>
> Intuitively, one expects that a ROM doesn't change its values regardless
> of how it's used. Finding out that preloaded rams in FPGAs used as ROMs
> can change their contents is rather disturbing.
Yes, I get that. But I can see how a clock glitch can screw it up since
it is *not* a ROM, just a RAM with the write disabled. Actually, it is
easy to prevent this if I read the note properly. They talk about a
precondition that the block RAM enable is asserted. Of course, they are
talking about address setup/hold violations, not clock glitches, so your
issue may be slightly different. Did you find that deasserting the
overall enable would prevent the problem?
>>> It's actually a well-known feature of certain types of self-timed static
>>> ram and is not limited to FPGAs. Xilinx and Altera block rams in all
>>> FPGA families have this problem.
>>>
>>> Here's the relevant Xilinx Answer Record:
>>>
>>> http://www.xilinx.com/support/answers/21870.htm
>>>
>>> (The answer record says
>>> it happens on an address setup or hold time violation, but I know from
>>> experience that it can happen if the clock has a glitch, in my case
>>> caused by being connected to a DCM that generated an out of spec clock
>>> while it was locking. Apart from that, my design was fully synchronous
>>> and passed STA.)
>>>
>>> One of the Altera families (I forget which one) could hang its RAM so
>>> badly under similar conditions that the part had to be reconfigured for
>>> it to recover. They mentioned it in the datasheet, which I thought was
>>> considerate of them.
>>
>> How do you prevent this problem when designing with PLDs? I do it by
>> designing clocks that don't glitch. I also avoid leaving screwdrivers
>> lay (or is it lie?) on my circuits...
>
>
> There are a few ways to deal with it:
>
> - don't use RAMs as ROMs.
That's not a very good one... :(
> - keep the CE input inactive whenever the clock can glitch
Ok, that makes good sense!
> - use a BUFGMUX / BUFGCE (etc) on the output of a DCM to mute potential
> clock glitches
Yes, I anticipated that.
> - detect corruption and somehow restore the contents, perhaps by
> reloading the FPGA.
Not fun, but possible. I remember a satellite application where to deal
with SEU, they would periodically reconfigure the FPGA. So any glitch
was short lived.
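
To make the detect-and-restore option a bit more concrete, here is a
rough sketch in Python (a behavioral model, not HDL). It assumes the
checker has the golden ROM contents available, for example from the same
init data used to build the bitstream. The names crc_of and
find_corrupt_roms and the table contents are made up for illustration;
none of this comes from the Answer Record.

```python
import zlib


def crc_of(words, width_bytes=2):
    """CRC-32 over a table of unsigned integer words (big-endian packing)."""
    data = b"".join(w.to_bytes(width_bytes, "big") for w in words)
    return zlib.crc32(data)


def find_corrupt_roms(golden, readback, width_bytes=2):
    """Return the names of ROMs whose read-back CRC differs from the golden CRC."""
    return [
        name
        for name, words in golden.items()
        if crc_of(readback[name], width_bytes) != crc_of(words, width_bytes)
    ]


# Hypothetical example: two small lookup tables, one with a flipped bit.
golden = {"sine_lut": [0, 201, 402, 603], "gain_lut": [1, 2, 4, 8]}
readback = {"sine_lut": [0, 201, 402, 603], "gain_lut": [1, 2, 5, 8]}

corrupt = find_corrupt_roms(golden, readback)
if corrupt:
    # In a real system this is where a reconfiguration would be requested.
    print("corrupt ROMs:", corrupt)
```

In hardware the same check could be a small state machine that walks
each ROM and compares a running CRC against a stored constant, but that
is my assumption about how one might build it, not something described
above.
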
> In all the tests I did, I only had a problem on the initial DCM lock. If
> it survived the initial lock, everything was fine.
> I only noticed this problem at all because my application had hundreds of
> RAMs used as ROMs, and they were tested after the FPGA was loaded as part of
> the product self test.
>
> On the worst boards I could get a few failures per minute during testing.
> On the best boards I couldn't get it to fail at all.
This is one of those little-known failure modes, I expect. I would not
have thought of it. I *would* expect there to be a note about this in
the DCM information, not just an Answer Record that could be found
*after* you are bitten by the bug.
>> Humor aside, I think you said the Xilinx DCM could generate a glitch
>> that would screw up the rest of the chip while the DCM was locking. Is
>> that right?
>
> Yes, exactly right. My particular problem was with Virtex2-Pro in a
> design from about a decade back. I haven't noticed the problem on newer
> devices, but I've been very careful to avoid designs that can fail in
> this way. Hopefully the newer PLL-based clock managers (MMCM) are better
> than the old DLLs.
I remember hearing warnings about the output of the DCMs being "invalid"
until locked. I never heard about such serious downstream impacts.
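
The usual defence, as I understand it, is to hold off the clock enable
(or the RAM enable) until the lock signal has been stable for a while.
Here is a minimal Python model of that idea, just to show the logic. The
LockGate class and the hold_cycles value are my own invention, and it
assumes the counter is clocked by something that is valid before lock,
such as the DCM input clock.

```python
class LockGate:
    """Hold the enable low until `locked` has been high for `hold_cycles` samples in a row."""

    def __init__(self, hold_cycles=4):
        self.hold_cycles = hold_cycles
        self.count = 0

    def step(self, locked):
        """Sample `locked` once; return True when the enable may be asserted."""
        if locked:
            # Saturate so the counter never overflows.
            self.count = min(self.count + 1, self.hold_cycles)
        else:
            # Any loss of lock restarts the hold-off period.
            self.count = 0
        return self.count >= self.hold_cycles


gate = LockGate(hold_cycles=3)
locked_trace = [0, 1, 1, 0, 1, 1, 1, 1]  # lock drops once while acquiring
enable_trace = [gate.step(bool(s)) for s in locked_trace]
print(enable_trace)
```

With this trace the enable only goes true on the seventh sample, after
lock has been high three times in a row. The output of such a gate would
drive the CE of a BUFGCE or the block RAM enable mentioned earlier.
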
>> Does Xilinx expect you to gate the clock output until it is
>> locked? Obviously this is a major issue in any design that uses those
>> two components.
>
> IP published by Xilinx that uses RAMs as ROMs mostly ignores this problem.
>
> IP *I* write is good, and can be verified by design review. What really
> worries me is the IP that is closed source, and silently gets added to my
> design by the tools. (If you've used a tool like XPS you'll know what I
> mean.)
My typical design is not so large that I can't craft all my own code. I
also have not used a Xilinx part in close to a decade. I have been
using Lattice parts a lot lately. I will be working on an ICE40 design
in the near future. *very* low power... pretty cool parts, so to speak...
--
Rick