I guess experience varies, and the algorithm you find simplest will
depend on that experience. The "best" algorithm will also depend on
whether your data is coming in as a bitstream, or in bytes, or larger
lumps, along with the size and speed requirements.

An 8-bit CRC done by table lookup involves a single 256-entry by 8-bit
wide table - for each incoming byte, you xor it with the current CRC
register, then use the result as the table index to get the new CRC
value. A 32-bit CRC done a byte at a time needs just one 256-entry by
32-bit table. IIRC, handling a full 32 bits per step will involve 4
tables, each 256 entries of 32 bits, and you'll have 4 xors, with the
four lookups done in parallel. I think that to get the same throughput
from a bit-wise LFSR you'd need 32 single-bit steps chained in a
pipeline. (Correct me if that's wrong.)

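
To make the table-versus-LFSR comparison concrete, here is a rough Python sketch. The 8-bit polynomial 0x07, the zero initial value, and all function and table names are my own assumptions for illustration, not anything fixed by the discussion above. The 32-bit part uses the common reflected CRC-32 polynomial with the 4-table (often called "slicing-by-4") layout, which is one way to get the 4-lookup scheme described above.

```python
POLY8 = 0x07  # assumed example polynomial x^8 + x^2 + x + 1, MSB-first


def crc8_bitwise(data: bytes, crc: int = 0) -> int:
    """Bit-serial LFSR model: one shift and conditional xor per input bit."""
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ POLY8) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


# 256-entry by 8-bit table: entry i is the LFSR state after clocking in byte i.
TABLE8 = [crc8_bitwise(bytes([i])) for i in range(256)]


def crc8_table(data: bytes, crc: int = 0) -> int:
    """Table-driven version: one xor and one lookup per input byte."""
    for byte in data:
        crc = TABLE8[crc ^ byte]
    return crc


POLY32_REFLECTED = 0xEDB88320  # CRC-32 polynomial in LSB-first form


def make_tables32():
    """Build four 256-entry by 32-bit tables for 32-bits-per-step processing."""
    t0 = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ POLY32_REFLECTED if crc & 1 else crc >> 1
        t0.append(crc)
    tables = [t0]
    for _ in range(3):
        # Each further table advances the previous one by one zero byte.
        tables.append([(v >> 8) ^ t0[v & 0xFF] for v in tables[-1]])
    return tables


T0, T1, T2, T3 = make_tables32()


def crc32_slice4(data: bytes) -> int:
    """CRC-32 consuming 4 bytes per step: 1 xor in, 4 lookups, 3 xors out."""
    crc = 0xFFFFFFFF
    whole = len(data) - len(data) % 4
    for i in range(0, whole, 4):
        crc ^= int.from_bytes(data[i:i + 4], "little")
        crc = (T3[crc & 0xFF] ^ T2[(crc >> 8) & 0xFF]
               ^ T1[(crc >> 16) & 0xFF] ^ T0[crc >> 24])
    for byte in data[whole:]:  # leftover bytes, one table lookup each
        crc = T0[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF
```

The four lookups in crc32_slice4 are independent of each other, which is what lets hardware (or a superscalar CPU) do them in parallel. The bit-wise function would need 32 passes through its inner loop to cover the same input.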
Still, whichever way you implement the CRC, it's often best if you can
replicate the same algorithm at both ends - whether it be
software-friendly lookup tables in an FPGA, or FPGA-friendly LFSRs in
software. Then you don't need to worry about the different naming for
all the little details of bit ordering, inversion, etc.
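
As an illustration of how much those details matter, here is a small bit-wise sketch in Python where the polynomial stays the same and only the bit ordering and inversion settings change. The parameter names (init, refin, refout, xorout) are just one common naming convention that I'm assuming here; other documents use different names for the same knobs.

```python
def reflect(value: int, width: int) -> int:
    """Reverse the order of the low `width` bits of value."""
    out = 0
    for _ in range(width):
        out = (out << 1) | (value & 1)
        value >>= 1
    return out


def crc_generic(data: bytes, width: int, poly: int, init: int,
                refin: bool, refout: bool, xorout: int) -> int:
    """Bit-wise CRC with explicit bit-order and inversion options.

    Assumes width >= 8. Written for clarity, not speed.
    """
    top = 1 << (width - 1)
    mask = (1 << width) - 1
    crc = init
    for byte in data:
        if refin:
            byte = reflect(byte, 8)      # feed each byte LSB-first
        crc ^= byte << (width - 8)
        for _ in range(8):
            crc = ((crc << 1) ^ poly) if crc & top else (crc << 1)
            crc &= mask
    if refout:
        crc = reflect(crc, width)        # bit-reverse the final register
    return crc ^ xorout                  # optional final inversion


POLY32 = 0x04C11DB7  # the usual CRC-32 polynomial, MSB-first form
msg = b"123456789"

# Same polynomial, three different sets of "little details":
zlib_style = crc_generic(msg, 32, POLY32, 0xFFFFFFFF, True, True, 0xFFFFFFFF)
bzip2_style = crc_generic(msg, 32, POLY32, 0xFFFFFFFF, False, False, 0xFFFFFFFF)
mpeg2_style = crc_generic(msg, 32, POLY32, 0xFFFFFFFF, False, False, 0x00000000)

for name, value in [("reflected, inverted", zlib_style),
                    ("unreflected, inverted", bzip2_style),
                    ("unreflected, not inverted", mpeg2_style)]:
    print(f"{name:26s} 0x{value:08X}")
```

All three results differ even though the polynomial is identical. The first setting is the one that should agree with zlib.crc32 in Python, which is a handy cross-check if you are matching a software implementation against an FPGA one.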