Yeah, I think we're also going to have to implement PFC pause
frames for this to work correctly. That's definitely something
I'm going to have to take a look at.
Looks like PFC is Annex 31D in IEEE 802.3, and pause frames are Annex 31B.
It looks like implementing PFC should be relatively straightforward, although I will have to modify the MACs to handle PFC frames, since they need to bypass the queues for things to work correctly. That part of the protocol mainly involves exchanging pause times over the link, measured in pause quanta (1 pause quantum is 512 bit times); the transmit side must stop sending the specified traffic class until the pause time expires.
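To make the frame exchange concrete, here is a minimal Python sketch of what a PFC frame carries, assuming the usual MAC Control layout (EtherType 0x8808, opcode 0x0101, a class-enable vector, then eight 16-bit pause times). The function names `build_pfc_frame` and `quanta_to_ns` are made up for illustration and are not part of Corundum.

```python
import struct

PFC_DST_MAC = bytes.fromhex("0180c2000001")  # reserved MAC Control multicast address
MAC_CONTROL_ETHERTYPE = 0x8808
PFC_OPCODE = 0x0101   # classic PAUSE (Annex 31B) uses opcode 0x0001
QUANTUM_BITS = 512    # one pause quantum is 512 bit times


def build_pfc_frame(src_mac: bytes, pause_quanta: dict) -> bytes:
    """Build a PFC frame (without FCS) that pauses the given priorities.

    pause_quanta maps a priority (0-7) to a pause time in quanta (0-65535).
    A time of 0 with the enable bit set resumes that priority.
    """
    if len(src_mac) != 6:
        raise ValueError("src_mac must be 6 bytes")
    enable_vector = 0
    times = [0] * 8
    for prio, quanta in pause_quanta.items():
        if not 0 <= prio <= 7 or not 0 <= quanta <= 0xFFFF:
            raise ValueError("priority or pause time out of range")
        enable_vector |= 1 << prio
        times[prio] = quanta
    header = PFC_DST_MAC + src_mac + struct.pack("!HH", MAC_CONTROL_ETHERTYPE, PFC_OPCODE)
    payload = struct.pack("!H8H", enable_vector, *times)
    # Pad with zeros to the 60-byte minimum frame size (64 bytes with FCS).
    return (header + payload).ljust(60, b"\x00")


def quanta_to_ns(quanta: int, link_bps: float) -> float:
    """Convert a pause time in quanta to nanoseconds at the given line rate."""
    return quanta * QUANTUM_BITS / link_bps * 1e9
```

For example, `build_pfc_frame(bytes(6), {3: 100})` pauses priority 3 for 100 quanta, and `quanta_to_ns(100, 10e9)` gives the corresponding duration at 10 Gbps.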
I think the trickier part is going to be implementing support
for traffic classes in the rest of Corundum; right now it's not
totally clear exactly how that's going to work, aside from having
an output FIFO for each traffic class, internal flow control to
prevent the output FIFOs from filling up, and logic at the input
of the TX MAC after the FIFO to merge the data coming out of the
separate FIFOs. I'm planning on implementing some form of
internal flow control along with the shared interface datapath so
that there isn't head-of-line blocking when using multiple ports
per interface, so it makes sense to figure out how PFC would fit
in alongside that since it requires similar functionality.
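As a rough behavioral sketch of the per-class FIFOs plus merge logic described above (a toy Python model, not RTL and not how Corundum actually does it), each traffic class gets its own queue and pause timer, and a round-robin merge skips any class that is currently paused. The class name `PfcTxMerge` and its methods are hypothetical.

```python
from collections import deque


class PfcTxMerge:
    """Toy model: one queue per traffic class, merged round-robin,
    skipping classes whose PFC pause timer has not expired."""

    def __init__(self, num_classes=8):
        self.queues = [deque() for _ in range(num_classes)]
        self.pause_quanta = [0] * num_classes  # remaining pause time per class
        self._next = 0  # round-robin pointer

    def enqueue(self, tc, frame):
        self.queues[tc].append(frame)

    def on_pfc(self, enable_vector, times):
        """Apply a received PFC frame: only enabled classes are updated."""
        for tc in range(len(self.queues)):
            if enable_vector & (1 << tc):
                self.pause_quanta[tc] = times[tc]

    def advance(self, quanta):
        """Let the given number of pause quanta elapse."""
        self.pause_quanta = [max(0, t - quanta) for t in self.pause_quanta]

    def next_frame(self):
        """Return (tc, frame) for the next eligible class, or None."""
        n = len(self.queues)
        for i in range(n):
            tc = (self._next + i) % n
            if self.queues[tc] and self.pause_quanta[tc] == 0:
                self._next = (tc + 1) % n
                return tc, self.queues[tc].popleft()
        return None
```

The point of the model is that a paused class never blocks the others: with class 0 paused and frames queued in classes 0 and 1, `next_frame()` still returns the class 1 frame.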
Alex Forencich
Hello, please keep in mind that RoCE v1/v2 isn't stable at all (and never was). The technology depends strongly on back-pressure/pause-frame handling, and this is only stable and available in a handful of switches today (Mellanox, some Arista and Cisco). The rest will fail...
mp
So, looking at the PFC spec in 802.3 Annex 31D, it doesn't say
anything about response time. There are some numbers in 31B.3.7
for pause frames, but TBH pause frames are pretty trivial to
implement with extremely low delay - just deassert tready on the
TX side of the MAC between frames whenever the pause timer is
non-zero. For PFC, this back-pressure needs to be asserted at a
higher level, before the different traffic classes are merged to
avoid head-of-line blocking. What I'm trying to determine is if I
can keep an async FIFO after the merge (i.e. can I do the merge in
the 250 MHz PCIe clock domain, then hand the packet off to the MAC
which runs at 322 MHz through an async FIFO, or do I have to do
the merge in the 322 MHz MAC clock domain which would involve
crossing a lot more stuff into the MAC clock domain). I can't
find anything in 802.3 for PFC, but there is some information in
802.1Qbb, and interestingly that seems to be much more restrictive
than the numbers for pause frames (a 614.4 ns budget for pausing
the queue regardless of rate, which is 120 pause quanta at 100
Gbps vs 394 for pause frames). Does anyone have any concrete information on this, or
other relevant sources? Does properly implementing PFC mean that
the requirements in 802.1Qbb need to be followed in addition to
802.3, or does 802.3 supersede 802.1Qbb?
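The comparison between the two budgets is just unit conversion, sketched below in Python. The 614.4 ns and 394 quanta figures are the ones quoted above; the helper names are made up for illustration.

```python
QUANTUM_BITS = 512  # one pause quantum is 512 bit times


def ns_to_quanta(ns, link_bps):
    """Express a time budget in pause quanta at the given line rate."""
    return ns * 1e-9 * link_bps / QUANTUM_BITS


def quanta_to_ns(quanta, link_bps):
    """Express a budget in pause quanta as nanoseconds at the given line rate."""
    return quanta * QUANTUM_BITS / link_bps * 1e9


if __name__ == "__main__":
    # 614.4 ns budget (802.1Qbb figure from the text) at several line rates.
    for rate in (10e9, 25e9, 100e9):
        print(f"{rate / 1e9:.0f} Gbps: 614.4 ns = {ns_to_quanta(614.4, rate):.1f} quanta")
    # 394 quanta (pause frame figure from the text) at 100 Gbps.
    print(f"394 quanta at 100 Gbps = {quanta_to_ns(394, 100e9):.2f} ns")
```

At 100 Gbps, 614.4 ns is 61,440 bit times, or 120 quanta, while 394 quanta is a little over 2 us, so the PFC budget is roughly a third of the pause frame one.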
Ideally, I would like to do everything in the PCIe clock domain and then have a single async frame FIFO to the MAC. However, that FIFO would have to be large enough to store jumbo frames, and it won't be possible to stop the transmission of frames that have already been handed off to it. The delay budget in 802.1Qbb (614.4 ns) is about half the time it takes to send a 1500 byte frame at 10 Gbps (1.2 us) and less than the time for a 9K jumbo frame at 100 Gbps (720 ns). Perhaps what I need to do is implement something more akin to an elastic buffer for TX that attempts to maintain a minimum fill level. That way I don't need a frame-oriented FIFO, so the FIFO delay will be much less than one MTU-sized frame and should satisfy the response time requirements in 802.1Qbb. For MTU frames on a 512-bit AXI stream interface at 100 Gbps, the clock speed only needs to be around 200 MHz, so I think this will work OK with a frame-oriented FIFO at 250 MHz followed by an async FIFO that enforces a minimum occupancy before releasing each start-of-frame, to ensure the MAC gets a contiguous frame without any gaps.
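The numbers in this paragraph can be checked with a few lines of Python. The `prefill_words` function is only a first-order estimate of the minimum occupancy: it assumes the write side delivers one word per cycle at 250 MHz, the MAC side consumes one word per cycle at 322 MHz once a frame starts, and it ignores synchronizer latency in the async FIFO. All function names are hypothetical.

```python
import math


def frame_time_ns(frame_bytes, link_bps):
    """Serialization time of a frame on the wire (no preamble or IFG)."""
    return frame_bytes * 8 / link_bps * 1e9


def min_clock_mhz(link_bps, bus_bits):
    """Clock needed to carry link_bps on a bus_bits-wide stream with no idle cycles."""
    return link_bps / bus_bits / 1e6


def prefill_words(frame_bytes, bus_bits, wr_mhz, rd_mhz):
    """Words to buffer before releasing start-of-frame so the reader never
    catches up with the writer (assumption: both sides move one word per
    cycle; clock-domain-crossing latency is not modeled)."""
    n = math.ceil(frame_bytes * 8 / bus_bits)
    if rd_mhz <= wr_mhz:
        return 1  # the reader can never overtake the writer
    return min(n, math.ceil(n - (n - 1) * wr_mhz / rd_mhz))


if __name__ == "__main__":
    print(frame_time_ns(1500, 10e9))           # 1500 byte frame at 10 Gbps
    print(frame_time_ns(9000, 100e9))          # 9K jumbo frame at 100 Gbps
    print(min_clock_mhz(100e9, 512))           # 512-bit bus at 100 Gbps
    print(prefill_words(9000, 512, 250, 322))  # 9K frame, 250 MHz in, 322 MHz out
```

Under these assumptions the required pre-fill is a fraction of a jumbo frame rather than the whole frame, which is what keeps the added delay well below one MTU.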
Alex Forencich