It's a great story, you could build a blog entry with it.
Minor detail : Megadrive is the french name for Sega's console. In the rest of the world, it's more likely known as Genesis.
I'm curious to know which use case requires you to decompress sprites at each and every frame.
In most designs I'm aware of, these sprites are cached after decompression, and only dropped from memory when they are no longer needed.
Technicality : you probably noticed, LZ4 expresses distances and length in bytes, but tries to use "register size" access (either 32 or 64 bits depending on target) as much as possible.
It makes the code slightly more complex to handle remainings, but that part is manageable.
More critically, this design works well with CPU supporting unaligned access, but those which don't (and I guess 68000 is among them) have to downgrade to byte access.
As a fix, LZ4W design enforces every match to start and end exactly at word boundaries.
Some classes of matches become impossible to recover with this limitation (say, an identical sequence, just shifted by one byte),
but it's an acceptable trade off to get more speed.
In theory, it's possible to apply this restriction to LZ4 too : on the compression side, it would only search on even bytes, and all lengths would be even.
It would keep compatibility with LZ4, but is less efficient than a custom scheme, since one bit of every field becomes useless.
That being said, for the usage you have in mind, compatibility is an unlikely objective.