Hello all,

I've pushed the current code for my OpenCL VP8 decoder to a sandbox.
I have implemented initial subpixel prediction (sixtap and bilinear),
IDCT/dequantization, and loop filtering. OpenCL compiler and device
detection are in place, and the presence of an OpenCL library is
detected at run time through dlopen(). If the system is deemed unable
to use OpenCL for decoding, the CPU paths are used as a fallback.
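
For anyone curious what the run-time detection looks like in
principle, here is a minimal sketch of the dlopen() approach. It is
not the code from the sandbox: the function names, the enum, and the
idea of passing in a list of candidate library names are all
assumptions made for illustration. The only real symbols used are
dlopen(), dlsym(), dlclose(), and the OpenCL entry point
clGetPlatformIDs, whose signature is mirrored locally so the sketch
builds without OpenCL headers.

```c
#include <dlfcn.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; none are taken from the sandbox code. */
typedef struct opaque_platform *platform_id;

/* Mirrors the shape of clGetPlatformIDs(cl_uint, cl_platform_id *, cl_uint *)
 * so the probe compiles without the OpenCL headers installed. */
typedef int32_t (*get_platform_ids_fn)(uint32_t, platform_id *, uint32_t *);

enum decode_path { DECODE_PATH_CPU = 0, DECODE_PATH_OPENCL = 1 };

/* Try each candidate library name. Return an open handle for the first one
 * that loads and reports at least one platform, or NULL if none does. */
void *probe_opencl(const char *const *names, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        void *handle = dlopen(names[i], RTLD_NOW | RTLD_LOCAL);
        if (handle == NULL)
            continue; /* library not found under this name */

        get_platform_ids_fn get_ids = NULL;
        /* POSIX-suggested idiom for storing a dlsym() result in a
         * function pointer without an object-to-function cast. */
        *(void **)(&get_ids) = dlsym(handle, "clGetPlatformIDs");

        uint32_t platforms = 0;
        if (get_ids != NULL && get_ids(0, NULL, &platforms) == 0 &&
            platforms > 0)
            return handle; /* caller keeps it open for later dlsym() calls */

        dlclose(handle); /* loaded, but unusable: release and keep looking */
    }
    return NULL;
}

/* Pick the decode path; *handle_out is NULL when the CPU path is chosen. */
enum decode_path choose_decode_path(const char *const *names, size_t count,
                                    void **handle_out)
{
    *handle_out = probe_opencl(names, count);
    return *handle_out != NULL ? DECODE_PATH_OPENCL : DECODE_PATH_CPU;
}
```

A caller would pass names such as "libOpenCL.so.1" on Linux or
"/System/Library/Frameworks/OpenCL.framework/OpenCL" on Mac OS X
(again, assumed values, not necessarily what the sandbox uses) and
take the CPU path whenever DECODE_PATH_CPU comes back. Some systems
need -ldl at link time.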

While subpixel prediction, IDCT, and loop filtering are implemented,
they are most definitely not optimized. My next step is performance
optimization, starting with the loop filter and then refactoring the
macroblock decoding to increase the thread count.

If anyone wants to work on getting the Windows Cygwin or Visual
Studio configuration working, feel free. I've mainly been developing
on Linux and MacOS.

I've been doing most of my work in
github.com/awatry/libvpx.opencl.git, and I'll probably continue to use
that as my primary development repository for now, but if anyone wants
to take my current code and run with it, go for it (or let me know
about collaborating).

Anyway, let me know if you have any comments/questions. I can go into
more detail about the implementation for anyone who wants it.

--Aaron Watry