Hi Gergő,
On Mon, 15 Feb 2021, Gergő Érdi wrote:
> However, I still can't quite wrap my head around how memory access
> would work.
These dual-port blockRAMs in the FPGA are very special beasts. I never
looked further into how they are constructed.
For the FIFO synchronizer, we use a dual-port blockRAM with separate
read and write clocks. It has one read port synchronised to one clock,
and one write port synchronised to another clock. Once the timing of the
blockRAM tells us that a write has landed, a subsequent read of that
address will return the correct data. The write was clocked by the write
clock, but the subsequent read by the read clock. There is no known
relation between the two clocks; when they don't come from the same
source, they will drift relative to each other. This is no problem for
this RAM.
The behaviour of this dual-port blockRAM cannot, I think, be expressed
in relation to a single clock. It doesn't matter which clock is slow,
which is fast, and what the phase relation between the two is (the
latter will drift with independent clocks).
At its core, an SRAM is asynchronous. So maybe multiple clock domains
are not quite as magic as they seem at first glance. The blockRAM is
registered, and those registers are clocked. But every register in
itself belongs to a single clock domain, nothing special there. The
actual reads and writes to the SRAM, however, are asynchronous; there is
no clock involved anymore. I think most of the magic actually comes from
it being
dual-ported. These blockRAMs are quite special beasts. They have two
ports, and if you want, both of those ports can be used for reads as
well as writes. In this application, we don't use that feature.
Where the scientific paper comes into the equation is for the
synchronising of the read and write pointers to the other domain. I
think that's where the intellectual effort lies: in proving that, no
matter the glitches, the reader will never read past the data that has
been written, and the writer will never overwrite data in the circular
buffer that has not been read yet. And of course we
only update the write pointer once it is certain that the data has
landed in the SRAM.
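To illustrate why moving a pointer into the other clock domain is
delicate, here is a small Python sketch. It assumes the pointers are
Gray-coded, which is a common technique for this kind of synchroniser
but is not named in the paragraph above, so treat it as an assumption.
`DEPTH`, `to_gray` and `bits_changed` are names made up for the example.
With a plain binary counter several bits can change in one increment, so
a sample taken mid-change in the other domain could be any mix of old
and new bits; with a Gray-coded counter only one bit changes, so the
sample is either the old or the new value.

```python
def to_gray(n):
    """Binary-reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)


def bits_changed(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


DEPTH = 16  # hypothetical buffer depth; must be a power of two here

# Bits that flip on each pointer increment, including the wrap-around.
binary_steps = [bits_changed(i, (i + 1) % DEPTH) for i in range(DEPTH)]
gray_steps = [
    bits_changed(to_gray(i), to_gray((i + 1) % DEPTH)) for i in range(DEPTH)
]

print(max(binary_steps))  # 4, e.g. going from 7 (0111) to 8 (1000)
print(max(gray_steps))    # 1
```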
When two ports of a blockRAM which are in separate clock domains do an
operation on the same address, there is by definition no knowing what
the result will be for the data at that address (either read or, in the
case of two conflicting writes, written). You'll find in the datasheet
that the behaviour on write conflicts, for instance, is defined or
configurable if the two operations are from the same clock domain.
However, if they are from different clock domains, the datasheets will
just say "undefined result". Because it really cannot be constrained.
One clock might rise a femtosecond before the other on one cycle, and
only after the other on another cycle. Who's to say which was first? Of
course, it's possible to construct clocks that always have a fixed phase
relation with each other. And then you could build simpler
synchronisers. This is unexplored territory for Clash. We assume two
clocks have no fixed relation, and then you need "proper" synchronisers.
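The "who was first?" point can be put into numbers with a few lines of
Python. The periods and the offset below are invented for the example:
clock A has a period of 10 ns, clock B is one femtosecond slower per
cycle and starts 3 fs ahead. All times are in femtoseconds.

```python
PERIOD_A_FS = 10_000_000  # hypothetical clock A: 10 ns period
PERIOD_B_FS = 10_000_001  # hypothetical clock B: 1 fs longer period
OFFSET_B_FS = -3          # B's first edge comes 3 fs before A's


def first_riser(cycle):
    """Which clock has its rising edge first in the given cycle."""
    edge_a = cycle * PERIOD_A_FS
    edge_b = OFFSET_B_FS + cycle * PERIOD_B_FS
    if edge_b < edge_a:
        return "B"
    if edge_a < edge_b:
        return "A"
    return "tie"


for cycle in range(6):
    print(cycle, first_riser(cycle))
```

In this sketch B leads for cycles 0 to 2, the edges coincide in cycle 3,
and from cycle 4 on A leads. With truly independent clocks the order
keeps changing like this, which is why the conflict case cannot be
pinned down.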
> Suppose I have two addressing signals `Signal fast (Maybe (addr, Maybe
> dat))` and `Signal slow (Maybe (addr, Maybe dat))`, and I want to
> connect them to a shared synchronous block RAM (with some static
> arbitration, i.e. either the fast one always takes precedence, or the
> slow one always takes precedence).
Where this train of thought derails is on "synchronous blockRAM". For
understanding, I think you need to split that into "asynchronous SRAM"
and "registers before, and possibly after, this asynchronous SRAM". And
then view the registers separately. Some are in one clock domain, others
are in another. But every single register is in a single clock domain.
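As a rough illustration of that split, here is a toy Python model (not
Clash, and not a claim about how any vendor builds the RAM). The class
`SplitBlockRam`, the function `simulate` and the clock periods are all
made up for the example. The cell array has no clock; the write
register is only touched on write-clock edges and the read address
register only on read-clock edges. The script itself decides when each
read happens, namely on the first read-clock edge after the write has
landed; in the real FIFO, knowing that is the job of the synchronised
pointers.

```python
class SplitBlockRam:
    """Toy model: an unclocked cell array with clocked registers around it."""

    def __init__(self, depth):
        self.cells = [None] * depth  # asynchronous SRAM core, no clock
        self.write_reg = None        # (addr, data), write clock domain
        self.read_addr_reg = None    # address, read clock domain

    def write_clock_edge(self, addr, data):
        # The register in the write clock domain captures the request.
        self.write_reg = (addr, data)
        # The core stores it; modelled as instantaneous in this sketch.
        self.cells[addr] = data

    def read_clock_edge(self, addr):
        # The register in the read clock domain captures the address.
        self.read_addr_reg = addr

    def read_data(self):
        # Unclocked lookup from the registered read address.
        return self.cells[self.read_addr_reg]


def simulate(write_period, read_period, n):
    """Write 100+i to address i, read it back on a later read-clock edge."""
    ram = SplitBlockRam(depth=n)
    events = []
    t_read = 0
    for i in range(n):
        t_write = write_period * (i + 1)
        events.append((t_write, "write", i))
        # Next read edge that is strictly after this write edge.
        t_read += read_period
        while t_read <= t_write:
            t_read += read_period
        events.append((t_read, "read", i))
    results = []
    for _, kind, addr in sorted(events):  # replay edges in time order
        if kind == "write":
            ram.write_clock_edge(addr, 100 + addr)
        else:
            ram.read_clock_edge(addr)
            results.append(ram.read_data())
    return results


print(simulate(write_period=10, read_period=3, n=4))  # slow writer
print(simulate(write_period=3, read_period=10, n=4))  # fast writer
```

Both calls print `[100, 101, 102, 103]`: it does not matter which clock
is the fast one, as long as each read edge comes after the write it
depends on.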
I hope this all makes some sense. Also note that I might be wrong in
parts, this is purely what I made of it after thinking about it for some
time. Because I was quite intrigued by this strange RAM that has
multiple ports and clock domains, yet I never made the time to try to
find an authoritative source to explain it to me (well, other than the
datasheet for the FPGA). I have also never been formally educated on
multiple clock domains in one circuit. I think I'm right in what I write
here, but that could just be the Dunning-Kruger effect :-). I would
definitely ask an expert for guidance before I design a multi-clock
circuit myself that needs to be correct. There are bound to be
intricacies that I'm not aware of.
> On Sun, Feb 14, 2021 at 11:37 PM Peter Lebbing
> <pe...@qbaylogic.com> wrote:
> > It generates VHDL with Clash 1.2.5, but I glanced through the
> > generated files and I strongly suspect we set INCLK0_INPUT_FREQUENCY
> > wrongly in the PLL qsys files. I'm filing bugs.
There was no problem with the generated Qsys file; it was a
misunderstanding on my part. Apparently Quartus expresses frequencies in
picoseconds. Think about that for a second (heh). The mind boggles.
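For anyone checking such a value by hand, the conversion is just the
clock period written in picoseconds. A minimal Python sketch, assuming
that is indeed what INCLK0_INPUT_FREQUENCY holds; the helper name
`period_ps` and the 50 MHz example clock are mine, not taken from the
generated files:

```python
def period_ps(freq_hz):
    """Clock period in picoseconds for a frequency given in hertz."""
    return round(10**12 / freq_hz)


print(period_ps(50_000_000))   # 20000, i.e. 50 MHz is a 20 ns period
print(period_ps(100_000_000))  # 10000
```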
HTH,
Peter.