On 6/17/2013 8:14 PM, Rob Gaddi wrote:
> On Mon, 17 Jun 2013 20:00:01 -0400
> rickman <gnu...@gmail.com> wrote:
>
>> So I finally got around to adding some debug signals which I would
>> monitor on an analyzer and guess what, the bug is gone! I *hate* when
>> that happens. I can change the code so the debug signals only appear
>> when a control register is set to enable them, but still, I don't like
>> this. I want to know what is causing this DURN THING!
>>
>> Anyone see this happen to them before?
>>
>> Oh yeah, someone in another thread (that I can't find, likely because I
>> don't recall the group I posted it in) suggested I add synchronizing FFs
>> to the serial data in. Sure enough I had forgotten to do that. Maybe
>> that was the fix... of course! It wasn't metastability, I bet it was
>> feeding multiple bits of the state machine! Durn, I never make that
>> sort of error. Thanks to whoever it was that suggested the obvious that
>> I had forgotten.
>>
>> --
>>
>> Rick
>
> Not metastability, a race condition. Asynchronous external input
> headed to multiple clocked elements, each of which it reaches via a
> different path with a different delay.
>
> When you added debugging signals you changed the netlist, which changed
> the place and route, making unpredictable changes to those delays.

No, when I changed the debug output I also added the synchronization FFs,
and that is what fixed the problem.

My point was that when the other poster suggested I sync the input to
the clock, I took that to be about metastability, forgetting that the
input went to multiple sections of logic. So I actually made the same
mistake twice... lol

> In
> this case, it happened to push it into a place where _as far as you
> tested_, it seems happy. But it's still unsafe, because as you change
> other parts of the design, the P&R of that section will still change
> anyhow, and you start getting my favorite situation, the problem that
> comes and goes based on entirely unrelated factors.
>
> The fix you fixed fixes it. When you resynchronized it on the same
> clock as you're running around the rest of the logic, you forced that
> path to become timing constrained. As such, the P&R takes it upon
> itself to make sure that the timing of that route is irrelevant with
> respect to the clock period, and your problem goes away for good.

Just to make sure of what was what (it has been two years since I last
worked with this design), I pulled the FFs out and added back just one.
Sure enough, the bug reappears with no FFs but goes away with just one.
The added debug info let me see exactly what goes wrong: when a start
bit comes in, there is a chance that the two counters are not properly
set, and the error shows up in the center of the bit, where the current
contents of the shift register are moved into the holding register as a
new char.
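
For anyone who wants to see the race in isolation, here is a rough sketch in Python. It is an idealized model, not the actual UART code: an asynchronous rising edge is sampled by two flip-flops through different routing delays, first directly and then through a single synchronizing FF. The function names, the 10 ns clock period, and the delay values are all made up for illustration, and the model ignores metastability entirely.

```python
def async_input(t, t_rise):
    """Level of the external input at time t; it rises 0 -> 1 at t_rise."""
    return 1 if t >= t_rise else 0


def sample_direct(t_rise, delays, period=10.0, n_edges=5):
    """Each destination FF samples the raw input through its own delay.

    At a clock edge at time `edge`, an FF with routing delay d sees the
    value the input had at time (edge - d).
    """
    results = []
    for k in range(1, n_edges + 1):
        edge = k * period
        results.append(tuple(async_input(edge - d, t_rise) for d in delays))
    return results


def sample_synchronized(t_rise, delays, period=10.0, n_edges=5):
    """One sync FF samples the input; destination FFs sample the sync FF.

    Assumes every delay is shorter than the clock period (timing is met),
    so each destination FF captures the value the sync FF held before
    the current edge, whatever its individual delay is.
    """
    assert all(d < period for d in delays)
    sync_q = 0  # output of the synchronizing FF
    results = []
    for k in range(1, n_edges + 1):
        edge = k * period
        results.append(tuple(sync_q for _ in delays))
        sync_q = async_input(edge, t_rise)  # sync FF updates on this edge
    return results


def disagreements(samples):
    """Clock edges on which the destination FFs captured different values."""
    return [s for s in samples if len(set(s)) > 1]


if __name__ == "__main__":
    delays = (0.5, 2.0)  # hypothetical routing delays to the two FFs
    t_rise = 19.0        # input rises 1.0 before the clock edge at t=20
    print("direct:      ", disagreements(sample_direct(t_rise, delays)))
    print("synchronized:", disagreements(sample_synchronized(t_rise, delays)))
```

With these numbers the direct case has one clock edge where the first FF already sees the new level and the second still sees the old one, which is the situation where two counters get set inconsistently. The synchronized case has none, at the cost of one clock of latency.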

I guess what most likely happened is that when I wrote the UART code I
assumed the sync FFs would be external, and when I wrote the wrapper code
I assumed the FFs were inside the UART. In other words, I didn't have a
proper spec and never gave this problem proper consideration.

I will revisit this design and look at the other inputs. No reason to
assume I didn't make the same mistake elsewhere.
--
Rick