j2k on the GPU


Aaron Boxer

Jun 3, 2014, 7:37:52 PM
to open...@googlegroups.com
For anyone who has a need for speed, check out my OpenCL JPEG 2000
codec. I am currently able to decode RGB lossless images using the GPU:

https://github.com/OpenCodec/ThousandthChicken

Windows only at the moment.

Cheers,
Aaron


Antonin Descampe

Jun 4, 2014, 5:00:52 AM
to open...@googlegroups.com

Hi Aaron,

 

Great job!

 

Following this thread (https://groups.google.com/d/msg/openjpeg/AtVLSEiVD-Q/UrSjHhC2HHAJ), did you have the opportunity to benchmark the performance of your OpenCL codec?

 

Antonin

 

From: open...@googlegroups.com [mailto:open...@googlegroups.com] On behalf of Aaron Boxer
Sent: 04 June 2014 01:38
To: open...@googlegroups.com
Subject: [OpenJPEG] j2k on the GPU

--
You are subscribed to the mailing-list of the OpenJPEG project (www.openjpeg.org)
To post: email to open...@googlegroups.com
To unsubscribe: email to openjpeg+u...@googlegroups.com
For more options: visit http://groups.google.com/group/openjpeg
OpenJPEG is mainly supported by :
* UCL Image and Signal Processing Group (http://sites.uclouvain.be/ispgroup)
* IntoPIX (www.intopix.com)

Aaron Boxer

Jun 4, 2014, 7:53:09 AM
to open...@googlegroups.com
Thanks, Antonin. I plan on benchmarking this at some point, but I have lots of bugs to fix right now :)

One issue with GPU libraries is that one has to tune the code to a specific platform. Also, the traditional
bottleneck with these devices is data throughput over the PCIe bus. I am really interested in the new AMD APU,
which integrates GPU and CPU and supports the hUMA memory model; this dramatically reduces the memory
bottleneck to the GPU, allowing the device access to all available system memory.

Cheers,
Aaron
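
To put the PCIe point above in rough numbers, here is a back-of-the-envelope sketch in Python. The frame geometry (2048x1080, three components, 16-bit samples) and the 8 GB/s bandwidth figure are illustrative assumptions, not measurements from this thread, and the function names are hypothetical. Real transfers also pay latency and driver overhead that this ignores.

```python
def frame_bytes(width, height, components=3, bytes_per_sample=2):
    """Size of one uncompressed frame in bytes."""
    return width * height * components * bytes_per_sample


def transfer_ms(num_bytes, bandwidth_gb_per_s):
    """Idealized time to move num_bytes over a bus, in milliseconds.

    Ignores latency, driver overhead, and pinned-vs-pageable memory effects.
    """
    return num_bytes / (bandwidth_gb_per_s * 1e9) * 1e3


# Hypothetical inputs: a 2K frame and an assumed 8 GB/s of effective bandwidth.
size = frame_bytes(2048, 1080)
one_way = transfer_ms(size, bandwidth_gb_per_s=8.0)
print(f"{size / 1e6:.1f} MB per frame, {one_way:.2f} ms one way")
```

On a shared-memory design such as the APU mentioned above, this copy can in principle be avoided altogether, which is the appeal being described.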

Bob Friesenhahn

Jun 5, 2014, 9:35:48 AM
to open...@googlegroups.com
I don't see any usage licence or copyright notice associated with the code.
Please note that in most of the civilized world, this means that the
code is not OK to copy or use, since it is automatically copyrighted by
its author and may not be copied by default.

What sort of speed-up have you observed from OpenCL vs a modern CPU?

Bob
--
Bob Friesenhahn
bfri...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/

Bob Friesenhahn

Jun 5, 2014, 9:35:48 AM
to open...@googlegroups.com
On Wed, 4 Jun 2014, Aaron Boxer wrote:

> Thanks, Antonin. I plan on benchmarking this at some point, but I have lots of bugs to fix right now :)
> One issue with GPU libraries is that one has to tune the code to a specific platform. Also, the traditional
> bottleneck with these devices is data throughput over PCIe bus.  I am really interested in the new AMD APU
> integrated GPU and CPU, which supports the hUMA memory model; this dramatically reduces the memory
> bottleneck to the GPU, allowing the device access to all available system memory. 

This is a great architecture for systems with a dedicated purpose/role
that need to be sold at a low price point and consume limited
power. It is not clear whether these chips will make it into HPC
server-type hardware, which would use them only for processing.

A major selling point of GPUs has been that they can be used to
arbitrarily enhance the graphics performance of an existing system via
plug-in cards. Classic GPUs do have a large impedance mismatch
between the GPU and CPU, particularly when I/O is to files. Classic
GPUs are also very expensive compared with the cost of the rest of the
system.

We are interested in hearing about results. :-)

Mathieu Malaterre

Jun 5, 2014, 9:49:35 AM
to openjpeg
On Wed, Jun 4, 2014 at 3:43 AM, Bob Friesenhahn
<bfri...@simple.dallas.tx.us> wrote:
> On Tue, 3 Jun 2014, Aaron Boxer wrote:
>
>> For anyone who has a need for speed, check out my OpenCL jpeg 2000
>> codec. I am currently able to decode RGB lossless images using the GPU:
>>
>> https://github.com/OpenCodec/ThousandthChicken
>> Windows only at the moment.
>
>
> I don't see any usage licence or copyright associated with the code. Please
> note that in most of the civilized world, this means that the code is not ok
> to copy and use since it is automatically copyright by the author and not ok
> to copy by default.

Indeed, looking at:

https://github.com/OpenCodec/ThousandthChicken/blob/master/ThousandthChicken/codestream_tag_tree_encode.c

It seems to be missing the openjpeg copyright statement. Clearly the
code looks like a copy/paste of tgt.c (esp. the magic 999 value +
comments)...

Please follow Bob's advice and update the copyright+license of your code.

Thanks.

Aaron Boxer

Jun 5, 2014, 10:11:57 PM
to open...@googlegroups.com
Gentlemen,

Thanks for your feedback: I have added appropriate licensing information to all source files.

Regarding performance, a quick test shows that my new library, run on the CPU under Windows, is twice as *slow*
as OpenJPEG. But I am sure that with a little care I will be able to at least match OpenJPEG performance
on the CPU. On the GPU it should perform much better, but of course the proof is in the pudding.
Also, since OpenCL supports heterogeneous computing, routines can run on both the GPU and CPU in parallel,
which should improve the situation as well.

When OpenCL 2.0 support becomes available (should be next year for the AMD APU), there will be even more opportunities for speed-ups:

1) read/write OpenCL images (GPU texture memory) will provide a significant speed-up over OpenCL buffers
2) dynamic parallelism (launch secondary GPU kernels from primary kernels) will avoid costly host API calls 


Regarding the AMD APU, the A10-7850 CPU/GPU is available right now; we just need to wait for the software tools to mature.


Cheers,
Aaron

Aaron Boxer

Jun 6, 2014, 8:48:58 AM
to open...@googlegroups.com
By the way, the A10-7850 is promising for a low-cost real-time DCP encoder. I will be starting
this project after I get the decoder working properly, probably in about 6 months.


Aaron Boxer

Jun 11, 2014, 11:18:40 AM
to open...@googlegroups.com
Update: the decoder is now running at about twice the speed of OpenJPEG, and I am only getting started on my optimization plans :)

Environment: Windows 7, SSD, Core i7 3770, latest Intel OpenCL SDK.

I have a fairly fast AMD card, but haven't used it yet.

Cheers,
Aaron

  

Vlad Craciun

Feb 29, 2016, 3:55:21 PM
to OpenJPEG
hello aaron,

any update on this? 

interested in a j2k codec to encode video using AMD GPUs. thanks!

Aaron Boxer

Feb 29, 2016, 9:13:05 PM
to open...@googlegroups.com

Hi Vlad

I'm afraid this code will not be released as open source. 

Cheers,
Aaron

Vlad Craciun

Mar 1, 2016, 1:36:22 PM
to open...@googlegroups.com

Aaron,

That may be OK. What's your status on this?

Aaron Boxer

Mar 20, 2016, 3:31:18 PM
to OpenJPEG
Hi Vlad,

My apologies for the delay. Project status: I have developed a GPU-accelerated encoder
that gets around 22 FPS for 2K encoding using a 4-year-old entry-level HD 7700 card
+ i7 3770. Test data is an FFmpeg capture of this YouTube clip:

https://www.youtube.com/watch?v=ZSc7-IbEq38

Extrapolating to a recent card such as the 390X, this should easily reach over 100 FPS
encoding.

My encoder will be released this year as a commercial product.

Cheers,
Aaron
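
The extrapolation above can be sanity-checked with simple arithmetic. The sketch below computes the speed-up a newer card would need over the reported 22 FPS to reach a target rate. The linear-scaling model, the `efficiency` discount, and the function names are assumptions for illustration only; encoder throughput rarely scales perfectly with raw GPU capability, and no hardware ratio is given in the thread.

```python
def required_speedup(measured_fps, target_fps):
    """Factor by which throughput must grow to reach target_fps."""
    if measured_fps <= 0:
        raise ValueError("measured_fps must be positive")
    return target_fps / measured_fps


def projected_fps(measured_fps, hardware_ratio, efficiency=1.0):
    """Naive linear projection: scale measured FPS by a hardware ratio.

    hardware_ratio is an assumed new/old capability ratio (hypothetical input);
    efficiency, between 0 and 1, discounts for imperfect scaling.
    """
    return measured_fps * hardware_ratio * efficiency


# 22 FPS is the figure reported in the thread; 100 FPS is the stated target.
print(f"needed speed-up: {required_speedup(22, 100):.2f}x")
```

Under this model, the newer card would need to deliver a bit more than four and a half times the effective throughput of the older one for the 100 FPS figure to hold.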

Vlad Craciun

Mar 20, 2016, 3:57:50 PM
to open...@googlegroups.com

Aaron,

Sounds great, is there any way I can contribute? Maybe with some hardware?

Aaron Boxer

Mar 20, 2016, 4:11:09 PM
to open...@googlegroups.com
On Sun, Mar 20, 2016 at 3:57 PM, Vlad Craciun <vlad...@gmail.com> wrote:

> Aaron,
>
> Sounds great, is there any way I can contribute? Maybe with some hardware?


What sort of applications are you interested in? DCI encoding? Geospatial? Medical?


Vlad Craciun

Mar 20, 2016, 4:29:18 PM
to open...@googlegroups.com

DCI encoding.
