Image resize optimisation


Ryan Slade

Sep 30, 2013, 8:32:20 AM
to golan...@googlegroups.com
Hi All

I'm writing an image resizing service and I'm currently using a pure Go image resizing library in order to have zero dependencies.

Simple example code here:

I'm using:

I found it to be a lot faster than https://github.com/nfnt/resize which seems to be the other popular image resize library.

However, it's still far slower than using ImageMagick.

I've been using this image, resizing it down to 512x512:

On my machine:

Go:
real 0m0.610s
user 0m0.588s
sys 0m0.052s

ImageMagick:
time convert input.jpg -filter box -resize 512x512 test.jpg
real 0m0.112s
user 0m0.136s
sys 0m0.020s

I profiled the code and didn't see any obvious ways of speeding it up, although I don't have much experience with these types of optimisations.

Any help would be greatly appreciated. It would be amazing if we could get some decent speedups.

Ryan

Benoît Amiaux

Sep 30, 2013, 9:31:58 AM
to Ryan Slade, golang-nuts
Even in C, a resizer implemented with SIMD instructions (MMX, SSE2, AVX) can easily be 8x faster than a plain C version. The same holds for Go.
If you don't want to mess with that, the first thing to check is whether resizing the two dimensions separately is faster than doing both at the same time.
If the filter is separable, the result will be the same, but the two-pass version will be much faster.
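
To make the two-pass idea concrete, here is a minimal sketch in Go. It assumes a hypothetical single-channel `Gray` type and a plain box filter (neither comes from the libraries mentioned in this thread), and it only demonstrates the structure: resize the rows, transpose, resize the rows again, transpose back.

```go
package main

import "fmt"

// Gray is a hypothetical single-channel image used only for this sketch.
type Gray struct {
	W, H int
	Pix  []float64 // row-major, len == W*H
}

// resizeH box-averages every row of src down to newW columns.
func resizeH(src Gray, newW int) Gray {
	dst := Gray{W: newW, H: src.H, Pix: make([]float64, newW*src.H)}
	for y := 0; y < src.H; y++ {
		for x := 0; x < newW; x++ {
			// Source columns [x0, x1) contribute to destination column x.
			x0 := x * src.W / newW
			x1 := (x + 1) * src.W / newW
			if x1 <= x0 {
				x1 = x0 + 1 // guard for the upscaling case
			}
			sum := 0.0
			for sx := x0; sx < x1; sx++ {
				sum += src.Pix[y*src.W+sx]
			}
			dst.Pix[y*newW+x] = sum / float64(x1-x0)
		}
	}
	return dst
}

// transpose swaps rows and columns so the vertical pass can reuse resizeH.
func transpose(src Gray) Gray {
	dst := Gray{W: src.H, H: src.W, Pix: make([]float64, len(src.Pix))}
	for y := 0; y < src.H; y++ {
		for x := 0; x < src.W; x++ {
			dst.Pix[x*src.H+y] = src.Pix[y*src.W+x]
		}
	}
	return dst
}

// resizeSeparable applies the 1-D filter horizontally, then vertically.
func resizeSeparable(src Gray, newW, newH int) Gray {
	h := resizeH(src, newW)
	v := resizeH(transpose(h), newH)
	return transpose(v)
}

func main() {
	src := Gray{W: 4, H: 4, Pix: make([]float64, 16)}
	for i := range src.Pix {
		src.Pix[i] = float64(i)
	}
	fmt.Println(resizeSeparable(src, 2, 2).Pix)
}
```

With a box filter whose windows do not overlap, the two passes mainly show that the result matches a direct 2-D average. The saving grows with wider kernels (bicubic, Lanczos), where a k-by-k window per output pixel is replaced by two windows of length k.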



James Bardin

Sep 30, 2013, 10:26:53 AM
to golan...@googlegroups.com, Ryan Slade
It looks like the disintegration/imaging library already resizes each dimension independently. The code is pretty minimal and clean, so I'd guess that in Go the only major serial speedup would come from some serious inlining (maybe gccgo could do a bit better here?). GraphicsMagick, and I think ImageMagick too, use OpenMP to parallelize these operations, whereas the Go implementations are single-threaded.
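
As a rough Go counterpart to the OpenMP approach, the rows of an image can be split into bands and handed to goroutines. The sketch below is an assumption about how one might structure it, not code from any of the libraries discussed; `parallelRows` and `halveWidth` are hypothetical names.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// parallelRows splits [0, height) into contiguous bands and runs
// fn(y0, y1) on each band in its own goroutine, then waits for all of them.
func parallelRows(height int, fn func(y0, y1 int)) {
	workers := runtime.NumCPU()
	if workers > height {
		workers = height
	}
	if workers < 1 {
		return // nothing to do for an empty image
	}
	band := (height + workers - 1) / workers
	var wg sync.WaitGroup
	for y0 := 0; y0 < height; y0 += band {
		y1 := y0 + band
		if y1 > height {
			y1 = height
		}
		wg.Add(1)
		go func(y0, y1 int) {
			defer wg.Done()
			fn(y0, y1)
		}(y0, y1)
	}
	wg.Wait()
}

// halveWidth averages horizontal pixel pairs of a w x h grayscale buffer.
// Each goroutine writes a disjoint set of rows, so no locking is needed.
func halveWidth(src []uint8, w, h int) []uint8 {
	dw := w / 2
	dst := make([]uint8, dw*h)
	parallelRows(h, func(y0, y1 int) {
		for y := y0; y < y1; y++ {
			for x := 0; x < dw; x++ {
				a := uint16(src[y*w+2*x])
				b := uint16(src[y*w+2*x+1])
				dst[y*dw+x] = uint8((a + b) / 2)
			}
		}
	})
	return dst
}

func main() {
	src := []uint8{0, 2, 4, 6, 10, 20, 30, 40}
	fmt.Println(halveWidth(src, 4, 2))
}
```

On Go releases before 1.5, GOMAXPROCS defaults to 1, so it has to be raised (for example with `runtime.GOMAXPROCS(runtime.NumCPU())`) before the goroutines actually run on several cores.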

-jim

James Bardin

Sep 30, 2013, 10:44:12 AM
to golan...@googlegroups.com, Ryan Slade
Oh, and the SIMD-level optimizations apply even more to the encoding/decoding steps, since those are serial code paths. I think it's likely that the majority of the time is spent reading and writing the JPEG, not resizing.

I've been using GraphicsMagick directly from Go, since it's going to be really hard to come close to the native libjpeg/libjpeg-turbo implementations.
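
One way to check where the time goes is to time the decode, resize, and encode stages separately. The sketch below uses only the standard library. The synthetic gradient input and the nearest-neighbour `nearest` function are stand-ins chosen for this example (the original test image and the resize library call are not reproduced here), so the numbers only indicate the relative cost of the JPEG codec on a given machine.

```go
package main

import (
	"bytes"
	"fmt"
	"image"
	"image/color"
	"image/jpeg"
	"time"
)

// syntheticJPEG builds a w x h gradient and returns it JPEG-encoded.
// It stands in for a real input file in this sketch.
func syntheticJPEG(w, h int) ([]byte, error) {
	img := image.NewRGBA(image.Rect(0, 0, w, h))
	for y := 0; y < h; y++ {
		for x := 0; x < w; x++ {
			img.Set(x, y, color.RGBA{R: uint8(x), G: uint8(y), B: uint8(x ^ y), A: 255})
		}
	}
	var buf bytes.Buffer
	if err := jpeg.Encode(&buf, img, nil); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// nearest is a placeholder nearest-neighbour resizer; swap in the real
// resize call when measuring an actual library.
func nearest(src image.Image, w, h int) *image.RGBA {
	b := src.Bounds()
	dst := image.NewRGBA(image.Rect(0, 0, w, h))
	for y := 0; y < h; y++ {
		for x := 0; x < w; x++ {
			dst.Set(x, y, src.At(b.Min.X+x*b.Dx()/w, b.Min.Y+y*b.Dy()/h))
		}
	}
	return dst
}

// measure times the three stages separately for one JPEG input.
func measure(data []byte, w, h int) (decode, resize, encode time.Duration, err error) {
	start := time.Now()
	img, err := jpeg.Decode(bytes.NewReader(data))
	if err != nil {
		return 0, 0, 0, err
	}
	decode = time.Since(start)

	start = time.Now()
	small := nearest(img, w, h)
	resize = time.Since(start)

	start = time.Now()
	var out bytes.Buffer
	err = jpeg.Encode(&out, small, nil)
	encode = time.Since(start)
	return decode, resize, encode, err
}

func main() {
	data, err := syntheticJPEG(2048, 2048)
	if err != nil {
		panic(err)
	}
	d, r, e, err := measure(data, 512, 512)
	if err != nil {
		panic(err)
	}
	fmt.Printf("decode %v, resize %v, encode %v\n", d, r, e)
}
```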

Ryan Slade

Sep 30, 2013, 10:49:10 AM
to golan...@googlegroups.com, Ryan Slade
James, do you have an example of using GraphicsMagick directly?

Thanks
Ryan

James Bardin

Sep 30, 2013, 11:38:36 AM
to golan...@googlegroups.com, Ryan Slade


On Monday, September 30, 2013 10:49:10 AM UTC-4, Ryan Slade wrote:
James, do you have an example of using GraphicsMagick directly?


Unfortunately I don't have any code I can release, but there is a fairly comprehensive set of bindings at https://github.com/gographics/imagick (disclaimer: I've not tested these myself). It has a lot of example code, and may help you along.

If it suits your workflow, you can also opt to shell out and call the GraphicsMagick or ImageMagick binaries directly.
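
A minimal sketch of the shell-out route with `os/exec` follows. It reuses the `convert` flags from the timing run earlier in the thread; the helper names `convertArgs` and `resizeWithConvert` are made up for this example, and it assumes an ImageMagick `convert` binary is on the PATH.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// convertArgs builds the arguments for:
//   convert <in> -filter box -resize WxH <out>
func convertArgs(in, out string, w, h int) []string {
	return []string{in, "-filter", "box", "-resize", fmt.Sprintf("%dx%d", w, h), out}
}

// resizeWithConvert runs the external convert binary and returns any
// output it printed along with the error if the command fails.
func resizeWithConvert(in, out string, w, h int) error {
	cmd := exec.Command("convert", convertArgs(in, out, w, h)...)
	if msg, err := cmd.CombinedOutput(); err != nil {
		return fmt.Errorf("convert failed: %v: %s", err, msg)
	}
	return nil
}

func main() {
	if len(os.Args) != 3 {
		fmt.Fprintln(os.Stderr, "usage: resize input.jpg output.jpg")
		return
	}
	if err := resizeWithConvert(os.Args[1], os.Args[2], 512, 512); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

For GraphicsMagick the equivalent command is `gm convert`, so the program name would be `gm` with `convert` prepended to the argument list. Each call pays for a process start, which matters for a service handling many small images.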

Rajiv Kurian

Sep 30, 2013, 12:31:58 PM
to golan...@googlegroups.com
This doesn't answer your question directly, but like others have said, it's going to be tough to beat libjpeg-turbo when it comes to decoding an image. GraphicsMagick does use libjpeg-turbo when available. AFAIK they depend only on OpenMP for parallelization and do not use SIMD instructions in any of their resizing implementations.

As long as you are using C/C++ libraries, it might be worth checking out OpenCV's performance. They do use SIMD instructions to speed up some of their algorithms, though I don't know if image resizing (bilinear, bicubic, Lanczos, etc.) is one of them. OpenCV also has a GPU module with its own resize function. The GPU implementation cannot use Lanczos resampling yet and can only do bilinear and bicubic. I don't know whether the GPU implementation, with its copy overhead, beats the CPU one either.

Halide (http://halide-lang.org) is also an interesting project for image processing. It lets you specify image processing algorithms in a terse way. You can target either the CPU (x86/SSE, ARM/NEON) or the GPU (OpenCL, CUDA) with the same code. It compiles down to C files which you can use from your Go code. They don't have readily available implementations of image resizing, though, so you'd have to write your own kernels.