Hi, everyone.
Recently I've been working on optimizing the WebP encoder, and with a small change I've made it 4% faster than the current implementation. In brief, ITransform_SSE2() can be made more efficient in the case where do_two is 0.
If you are interested, I'll post the details and my patch. (Do I need to sign the CLA before posting?)
I would be glad to contribute this upstream!