I used to do YUV422 to RGB32 conversion with huffyuv (it used to live at
http://www.math.berkeley.edu/~benrg/huffyuv.html; I didn't google for a new location). That code was basically 32-bit assembly, which obviously doesn't work well on x86_64 architectures.
I've now replaced that code with equivalent code from PixFC-SSE. Interestingly, when I replace a call to mmx_UYVYtoRGB32 with a PixFC-SSE conversion from PixFcUYVY to PixFcBGRA, my output image is flipped vertically, so it could require extra code to get it "right". In my case it actually doesn't: the buffer I will eventually copy it to has a different stride, which forces me to copy it row by row anyway, so it doesn't really matter. Still, I'm curious which routine is doing the "right" thing.
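
To show what I mean by the row-by-row copy, here is a rough sketch in C. The helper name copy_rows and its parameters are made up for illustration; nothing here comes from huffyuv or PixFC-SSE. It assumes both buffers are plain byte arrays of packed rows, with strides given in bytes. When flip is non-zero it simply walks the source from the last row upward, so undoing a vertical flip needs no extra pass.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of huffyuv or PixFC-SSE).
 * Copies `rows` rows of `row_bytes` payload bytes each from src to dst.
 * The two buffers may have different strides (bytes from one row start
 * to the next). If `flip` is non-zero, the source is read bottom-up,
 * so the destination ends up vertically mirrored. */
static void copy_rows(uint8_t *dst, ptrdiff_t dst_stride,
                      const uint8_t *src, ptrdiff_t src_stride,
                      size_t row_bytes, size_t rows, int flip)
{
    assert(dst != NULL && src != NULL);
    if (rows == 0)
        return;

    if (flip) {
        /* Start at the last source row and step backwards. */
        src += (ptrdiff_t)(rows - 1) * src_stride;
        src_stride = -src_stride;
    }

    for (size_t y = 0; y < rows; ++y) {
        memcpy(dst, src, row_bytes);  /* copy only the pixel payload */
        dst += dst_stride;
        src += src_stride;
    }
}
```

For a 32-bit output image, row_bytes would be width * 4, and flip would be set for whichever of the two converters produces the orientation the destination does not expect.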