I'm trying to design an 8b/10b encoder/decoder. I've been told that it can be
implemented by means of a lookup table, but that it's much better and smaller
to do it using algorithms. Does anyone know where I could get info on these
algorithms, or tips, etc.?
Thanks,
Paul.
Paul Noone wrote:
If you are thinking of the 8B/10B coding used in e.g. Ethernet, I
suggest you get hold of the IBM patent (US Patent 4,486,739).
--Kai
It's more likely, I think, that the guy is after the 8B/10B encoding used in
gigabit Ethernet, Fibre Channel, etc. It consists of a 5B/6B block in parallel
with a 3B/4B block.
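
To make the 5B/6B + 3B/4B split concrete, here is a minimal Python sketch of how an input byte is divided between the two blocks. The function names are made up for illustration; the only thing assumed about the code itself is the usual convention that the low five bits (EDCBA) feed the 5B/6B block and the high three bits (HGF) feed the 3B/4B block, which also gives the familiar Dx.y names.

```python
def split_byte(byte):
    """Split an 8-bit value into the fields fed to the two sub-encoders."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected an 8-bit value")
    x = byte & 0x1F         # low five bits (EDCBA) -> 5B/6B block
    y = (byte >> 5) & 0x07  # high three bits (HGF) -> 3B/4B block
    return x, y


def data_name(byte):
    """Return the conventional Dx.y name of a data byte."""
    x, y = split_byte(byte)
    return f"D{x}.{y}"


print(data_name(0x4A))  # low five bits = 10, high three bits = 2
```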
> > I'm trying to design an 8b/10b encoder/decoder. I've been told that it can be
> > implemented by means of a lookup table, but that it's much better and smaller
> > to do it using algorithms. Does anyone know where I could get info on these
> > algorithms, or tips, etc.?
The IBM patent suggested does have a bunch of circuits in it, but I can
almost guarantee you won't get that much out of it unless you're already
familiar with typical block encoding schemes. You'd be better off reading
the couple of pages in something like Rick Seifert's gigabit Ethernet book, a
fibre channel book, or even... darn, I can't remember the name right now...
Pouton's? engineering book.
In a CPLD architecture, I did 8B/10B encoding using a two-stage pipelined
approach. The first stage did a 5B/6B lookup and a 3B/4B lookup (in
parallel), both assuming positive disparity (and not "cross connecting" the
LUTs to take the output of the block you do first -- 5B/6B, as I recall --
into account for the 3B/4B lookup). The second stage just figured out
whether or not disparity was really supposed to have been negative for
either block, and fixed the output if so ("fixing the output" simply inverts
the output bits). This worked quite well, and chewed up twenty-some-odd
macrocells in a Cypress Ultra37K part (and generated all outputs in "one
pass through the array," of course -- this was working at the 106.25MHz
fibre channel speeds, so you don't have time for multiple passes).
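
A rough Python model of the second "fix-up" stage described above, as a sketch only and not the actual CPLD code. It assumes the first-stage tables hold each sub-block as it would be sent for negative running disparity (the post used the positive column; the idea is the same), and the function name and argument layout are made up for illustration. It does not handle the D.x.7 alternate-code exceptions.

```python
def fix_subblock(code, width, rd):
    """Second-stage fix-up for one sub-block.

    code  : sub-block bits as looked up for negative running disparity
    width : 6 for the 5B/6B block, 4 for the 3B/4B block
    rd    : running disparity before this sub-block, -1 or +1
    Returns (bits to transmit, running disparity afterwards).
    """
    mask = (1 << width) - 1
    ones = bin(code).count("1")
    disparity = ones - (width - ones)
    # 111000 and 1100 are balanced but still have an inverted twin.
    special = (width == 6 and code == 0b111000) or (width == 4 and code == 0b1100)

    if disparity == 0 and not special:
        return code, rd          # ordinary balanced code: nothing to fix
    if rd > 0:
        code ^= mask             # wrong column assumed: invert every bit
    if special:
        return code, rd          # balanced, so running disparity is unchanged
    return code, -rd             # unbalanced code flips the running disparity


# D0.0 starting from negative running disparity: 100111 then 0100.
six, rd = fix_subblock(0b100111, 6, -1)
four, rd = fix_subblock(0b1011, 4, rd)
print(f"{six:06b} {four:04b} rd={rd:+d}")
```

In hardware both lookups run in parallel and only the two "invert?" decisions depend on the running disparity, which is what keeps the logic shallow.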
We later took this exact same chunk of code and tossed it into a Virtex
FPGA. It worked fine there as well (and has some ridiculous upper speed
limit in the 150+ MHz ballpark without any effort whatsoever... but the nice
thing about these block codes is that they're highly amenable to further
pipelining, if you ever need more speed).
The decoder is a straight table lookup (6B/5B and 4B/3B in parallel),
although it takes a little bit of a mess of logic to figure out when you
encounter a disparity error. The other fun part is figuring out how to
match clock domains (if you need to -- the CPLD design simply ran half of
itself off of the SerDes's receive clock, but in the FPGA version this
wasn't practical)... if your system clock is running slightly slower than
the incoming data stream rate, clearly you have no choice but to somehow
compress the data. We ended up doing this by outputting 9 bits instead of 8
for every incoming character, and using the 9th bit as a "comma detect" bit.
Of course, you also need an asynchronous FIFO tossed in there, and if you're
using one of these SerDes's that puts out a half speed clock, you need to
either regenerate the full speed clock (we did this on the CPLD design) or
do things in parallel for a while until you can compress the data (we did
this on the FPGA design -- we had a 16 deep 20 bit+2 bit [comma detection]
asynchronous FIFO that ate commas as it went, the output side then just read
out one side of the FIFO or the other at full speed, did the decode, and off
it went...)
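
For the disparity-error part of the decoder, here is a simplified Python sketch of the kind of check involved. It is an illustration under stated assumptions, not the logic from either design: the function names are hypothetical, the 10-bit code group is assumed to be packed with the 6-bit sub-block in the upper bits, and it only checks sub-block disparity against the running disparity. It does not detect code groups that are simply absent from the tables, and it ignores the special balanced sub-blocks (000111/111000, 0011/1100).

```python
def subblock_check(bits, width, rd):
    """Check one received sub-block; return (ok, running disparity after)."""
    ones = bin(bits).count("1")
    disparity = ones - (width - ones)
    if abs(disparity) > 2:
        return False, rd              # no valid sub-block is this lopsided
    if disparity == 0:
        return True, rd               # balanced: running disparity unchanged
    ok = (disparity > 0) == (rd < 0)  # surplus of ones is only legal after RD-
    return ok, (1 if disparity > 0 else -1)


def check_code_group(code10, rd):
    """Check a 10-bit code group (6-bit sub-block in the upper bits)."""
    ok6, rd = subblock_check(code10 >> 4, 6, rd)
    ok4, rd = subblock_check(code10 & 0xF, 4, rd)
    return ok6 and ok4, rd


print(check_code_group(0b1001110100, -1))  # D0.0 as sent from RD-
print(check_code_group(0b1001110100, +1))  # same bits, wrong running disparity
```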
---Joel Kolstad
best regards,
Chris Dunlap
A good synthesis tool should be able to do the necessary logic
minimization automatically, given a behavioral description (e.g., a
table). 8b/10b was designed for simple hardware implementation.
> In a CPLD architecture, I did 8B/10B encoding using a two stage pipelined
> approach. The first stage did a 5B/6B lookup and a 3B/4B lookup (in
> parallel), both assuming positive disparity (and not "cross connecting" the
> LUTs to take the output of the block you do first -- 5B/6B, as I recall --
Correct.
> into account for the 3B/4B lookup). The second stage just figured out
> whether or not disparity was really supposed to have been negative for
> either block, and fixed the output if so ("fixing the output" simply inverts
> the output bits).
Just remember to also fix the six (3 of each disparity) code groups
where you need to change the 3B/4B code. This is needed to keep the
maximum number of running ones/zeros down to five. In the IBM patent,
look for the Dx.A7 ("alternative") codes.
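
A small Python sketch of that exception, for anyone who wants to check it. The selection rule (alternate code for D17, D18, D20 when running disparity is negative, and D11, D13, D14 when it is positive) is the standard one as I understand it; the helper names are made up, and the bit strings are written in transmission order. The run-length check shows what goes wrong if the primary x.7 code is used in those cases.

```python
A7_WHEN_RD_NEG = {17, 18, 20}
A7_WHEN_RD_POS = {11, 13, 14}


def use_alternate_7(x, y, rd):
    """True if data code group Dx.y needs the alternate 3B/4B code."""
    if y != 7:
        return False
    return x in (A7_WHEN_RD_NEG if rd < 0 else A7_WHEN_RD_POS)


def longest_run(bits):
    """Length of the longest run of identical characters in a bit string."""
    best = run = 0
    prev = None
    for ch in bits:
        run = run + 1 if ch == prev else 1
        prev = ch
        best = max(best, run)
    return best


# D17.7 from negative running disparity: 5B/6B part is 100011.
print(longest_run("100011" + "1110"))  # primary x.7 code
print(longest_run("100011" + "0111"))  # alternate x.7 code
```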
Yeah, there was a bit or two for "oh, this is an exception, funny things
have to happen." That was slightly annoying, but oh well.
We've sent many billions of bits down this link without an error, so I think
we got it correct. :-)
---Joel Kolstad
Yeah, but I don't think it was designed to just be implemented as a big (256
input) LUT that would then be product term minimized. I'm curious if the
gate count would be the same as the IBM suggested implementation, however.
I would be very surprised if the gate delays were as fast as a two stage
approach, and it does always seem that 8B/10B comes up in the same breath
as, "...of course it has to be able to ship 10Gb over a single wire, too."
---Joel Kolstad
I know of 8b/10b coding in two contexts, both of which are channel
coders. One is for DAT and DDS recording, the other is for FDDI.
I've been involved (though at a distance) in both.
You need to decide whether your channel needs to be DC free. If it
does, then you need an approach like the DAT/DDS one. There was a
paper in the 1980s, IIRC, by some guys who worked for Sony. Sorry
I no longer have access to the paper, but that may give you enough
references to find it. It is specified as a lookup table. There
aren't 256 DC free codes 10 bits long, and I think there are some
issues of concatenation of codes. So you need to input not only
your expected 8 bits of data, but also the DSV (digital sum variation)
of the code so far. The output is a 10 bit code which either
corrects or maintains the DSV; it also outputs a new DSV. It
amazed me at the time when we implemented one, that it really was
done by letting the synthesis tool synthesise and then minimise
the logic. These were 1980s tools: Ella and Locam.
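
The "there aren't 256 DC free codes" point is easy to check by brute force. This Python sketch only counts 10-bit words with equal numbers of ones and zeros; it ignores whatever run-length limits and NRZI precoding the real DAT/DDS code applies, so treat it as a back-of-the-envelope check rather than a description of that code. The helper name is made up.

```python
from math import comb


def word_disparity(word, width=10):
    """Ones minus zeros in a code word (0 means the word is DC free)."""
    ones = bin(word).count("1")
    return ones - (width - ones)


balanced = [w for w in range(1 << 10) if word_disparity(w) == 0]
print(len(balanced), comb(10, 5))  # balanced 10-bit words, two ways
print(len(balanced) < 256)         # not enough for every 8-bit input
```

Since some inputs must map to unbalanced words, the encoder has to track the accumulated sum and pick the word that pulls it back towards zero, which is why the DSV becomes an extra input and output.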
If you don't need to be DC free, then you can use the FDDI code
book, which is actually two identical 4b/5b encoders. There's
no feedback, so it is economical to implement as a ROM; I don't
know if there's anything to be gained by synthesising logic to
do it. In something like a Spartan FPGA, you'll need 5 halves
of a CLB working as 16 x 1 ROMs, which is about as cheap as it
gets. The cheapest reference for this is the AMD TAXI chip
data - I think they give you the whole code book.
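
A Python sketch of the "five 16 x 1 ROMs" idea. The table is the 4B/5B data-symbol mapping as I recall it from FDDI and 100BASE-TX; check it against the data sheet before relying on it. The names and the way the ROM contents are packed into 16-bit integers are assumptions for illustration only.

```python
# 4B/5B data symbols, indexed by nibble value 0x0..0xF.
FOUR_B_FIVE_B = [
    0b11110, 0b01001, 0b10100, 0b10101,
    0b01010, 0b01011, 0b01110, 0b01111,
    0b10010, 0b10011, 0b10110, 0b10111,
    0b11010, 0b11011, 0b11100, 0b11101,
]

# One 16 x 1 ROM per output bit: bit n of ROM_INIT[b] is output bit b
# of the code for nibble n.
ROM_INIT = [
    sum(((FOUR_B_FIVE_B[n] >> b) & 1) << n for n in range(16))
    for b in range(5)
]


def encode_nibble(nibble):
    """Encode one nibble by reading the five 16 x 1 ROMs."""
    return sum(((ROM_INIT[b] >> nibble) & 1) << b for b in range(5))


def encode_byte(byte):
    """Return the (high nibble code, low nibble code) pair for one byte."""
    return encode_nibble(byte >> 4), encode_nibble(byte & 0xF)


hi, lo = encode_byte(0x5A)
print(f"{hi:05b} {lo:05b}")
```

There is no state carried from one symbol to the next, so the two nibble encoders are independent and identical.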
Dave