Faster image resizing - utilize all CPU cores

disintegration

Jan 11, 2014, 8:19:29 PM
to golan...@googlegroups.com
Hi!

Today, as an experiment, I started a new image processing library (for now, only resizing with various filters is supported). The goal is to use all CPU cores to speed things up.

http://github.com/disintegration/imglib

I divide the image into N parts (where N is runtime.NumCPU()) and process each part in a separate goroutine. As a result, I got a ~2-2.5x speedup on a laptop with 4 CPU cores and a ~1.5x speedup with 2 CPU cores.

The only thing the library user has to do is allow Go to use all CPU cores:
 
runtime.GOMAXPROCS(runtime.NumCPU())
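
The parallel part looks roughly like this (a simplified sketch of the idea, not the exact library code; resizeStrip here stands in for the actual per-pixel resampling):

import (
    "image"
    "runtime"
)

// Split the destination image into N horizontal strips (N = number of
// CPUs), fill each strip in its own goroutine, then wait for all of
// them on a done channel.
func resizeParallel(src, dst *image.NRGBA, resizeStrip func(src, dst *image.NRGBA, yMin, yMax int)) {
    n := runtime.NumCPU()
    h := dst.Bounds().Dy()
    if n > h {
        n = h
    }
    done := make(chan bool, n)
    for i := 0; i < n; i++ {
        yMin, yMax := h*i/n, h*(i+1)/n
        go func(yMin, yMax int) {
            resizeStrip(src, dst, yMin, yMax)
            done <- true
        }(yMin, yMax)
    }
    for i := 0; i < n; i++ {
        <-done
    }
}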

But now I wonder, what is considered the right approach for a package? To use goroutines for faster calculations, or to leave all parallelization to the end user?

(I am sorry, English is not my native language.)

Michael Jones

Jan 11, 2014, 9:20:13 PM
to disintegration, golang-nuts
If your CPU has "Simultaneous Multithreading" (e.g., a recent Intel CPU), then you may want twice the number of physical cores. (If it is another design, from IBM or Sun etc., then you may want even more than that.)


--
Michael T. Jones | Chief Technology Advocate  | m...@google.com |  +1 650-335-5765

Lars Seipel

Jan 12, 2014, 8:52:08 AM
to disintegration, golan...@googlegroups.com
On Sat, Jan 11, 2014 at 05:19:29PM -0800, disintegration wrote:
> But now I wonder, what is considered the right approach for a package? To
> use goroutines for faster calculations, or to leave all parallelization to
> the end user?

When a C library starts threads behind the user's back, it makes the
program susceptible to a slew of ugly issues. This problem does not
apply to Go. Using multiple goroutines therefore becomes purely a
question of library design.

In the case of your image library, I'd say it depends on the API your
package exports to its users. Is it a high-level interface where you
just pass in a whole image and the library handles everything else?
Then parallelizing the work done inside your package might be entirely
appropriate. If the user already has enough control to parallelize it
herself, I'd tend to keep it out of your package so as not to surprise
your users.

Dobrosław Żybort

Jan 12, 2014, 9:40:20 AM
to golan...@googlegroups.com
You might consider adding your library to https://github.com/fawick/speedtest-resize

Christoph Hack

Jan 12, 2014, 9:57:52 AM
to golan...@googlegroups.com
The problem is naturally parallel, so I don't think your 2-2.5x improvement on 4 cores is that good. You might want to use a sync.WaitGroup instead of the done channel in order to simplify your code and reduce the communication overhead a bit. In addition, you should probably split the image into more parts and let each worker dynamically fetch one part after another (for example, by just fetching a partId with atomic.AddUint32). I haven't tested your program yet, but maybe some workers get scheduled more often than others and finish their work earlier, resulting in under-utilization of the cores. Also, try to minimize the amount of work that is done before and after the parallel transform operation.
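
Roughly something like this (an untested sketch; partCount and resizePart are placeholders for however you split and process the image):

import (
    "runtime"
    "sync"
    "sync/atomic"
)

// Split the image into many more parts than there are CPUs and let
// every worker pull the next part id from an atomic counter, so
// faster workers simply pick up more parts.
func resizeDynamic(partCount int, resizePart func(part int)) {
    var next uint32
    var wg sync.WaitGroup
    for i := 0; i < runtime.NumCPU(); i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for {
                part := int(atomic.AddUint32(&next, 1)) - 1
                if part >= partCount {
                    return
                }
                resizePart(part)
            }
        }()
    }
    wg.Wait()
}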

disintegration

Jan 12, 2014, 10:44:45 AM
to golan...@googlegroups.com
Thanks for the suggestions, I'll look more closely at how effectively the goroutines work. From the start to the end of resizing in my test script, CPU usage is now about 100%. I tried starting more goroutines (dividing the image into more parts), but the resulting time is the same.

On Sunday, January 12, 2014, 18:57:52 UTC+4, Christoph Hack wrote:

disintegration

Jan 12, 2014, 10:55:57 AM
to golan...@googlegroups.com, disintegration
The library design is pretty high-level. One can resize an image like this:

img, _ := imglib.Open("test.jpg")
img = img.Resize(800, 0, imglib.ResampleLanczos)
img.Save("result.jpg")

The only way for the end user to parallelize in this case is to process several images simultaneously. But that means more boilerplate code (setting up workers, sending images via channels).
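
Something like this on the user's side (a rough sketch; resizeFile stands in for the Open/Resize/Save sequence above, error handling skipped):

import (
    "runtime"
    "sync"
)

// Resize a batch of files concurrently: the paths go through a
// channel and a fixed number of workers drain it.
func resizeAll(paths []string, resizeFile func(path string)) {
    jobs := make(chan string)
    var wg sync.WaitGroup
    for i := 0; i < runtime.NumCPU(); i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for p := range jobs {
                resizeFile(p)
            }
        }()
    }
    for _, p := range paths {
        jobs <- p
    }
    close(jobs)
    wg.Wait()
}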

On Sunday, January 12, 2014, 17:52:08 UTC+4, Lars Seipel wrote:

Niklas Schnelle

Jan 12, 2014, 11:26:55 AM
to golan...@googlegroups.com, disintegration
Also consider that Go is really good at handling a lot of goroutines. One really easy design for image processing that I found to give a great speedup is to process each line of pixels in a separate goroutine.
This leads to better load balancing and reduces the time one thread spends waiting for others to finish their work. It also works for any number of CPUs and requires no splitting of the image at weird bounds. For a simple
optical flow analysis using the Horn-Schunck method I get a speedup of about 3.75 on a quad core with HT and GOMAXPROCS=8, and 3.21 with GOMAXPROCS=8.
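
For example (a simplified sketch, assuming a processRow function that handles one scanline):

import (
    "image"
    "sync"
)

// Launch one goroutine per scanline and let the Go scheduler balance
// them across the GOMAXPROCS threads. processRow stands in for the
// actual per-row computation.
func processRows(img *image.NRGBA, processRow func(img *image.NRGBA, y int)) {
    b := img.Bounds()
    var wg sync.WaitGroup
    for y := b.Min.Y; y < b.Max.Y; y++ {
        wg.Add(1)
        go func(y int) {
            defer wg.Done()
            processRow(img, y)
        }(y)
    }
    wg.Wait()
}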

disintegration

Jan 12, 2014, 1:03:48 PM
to golan...@googlegroups.com, disintegration
Thanks! Good idea, I'll try this approach too.

On Sunday, January 12, 2014, 20:26:55 UTC+4, Niklas Schnelle wrote:

Konstantin Khomoutov

Jan 13, 2014, 8:43:57 AM
to disintegration, golan...@googlegroups.com
On Sun, 12 Jan 2014 07:55:57 -0800 (PST)
disintegration <disintegr...@gmail.com> wrote:

> The library design is pretty high-level. One can resize an image like
> this:
>
> img, _ := imglib.Open("test.jpg")
> img = img.Resize(800, 0, imglib.ResampleLanczos)
> img.Save("result.jpg")

I'd say embedding file operations into a package that really has
nothing to do with files is bad practice in a language with as
powerful a concept of interfaces as Go.

You would be better off providing
imglib.ReadFrom(io.Reader) or imglib.ReadFrom(io.ReadSeeker),
depending on whether your library is okay with reading the image data
sequentially or requires seeking (as it would when reading, say,
TIFF-formatted data). This would allow the user to hand your library
anything they see fit, starting with a simple *os.File as returned by
a call to os.Open().

Writing would similarly be done using something like
imglib.SaveTo(io.Writer) or imglib.SaveTo(io.WriteSeeker).

In other words, the library interface should be as minimal as
possible. An image resizing library really has nothing to do with
files and filesystems.
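
Just to sketch the shape of such an API (hypothetical signatures, not
real imglib code; the names and return types are only suggestions):

package imglib

import (
    "image"
    "io"
)

// ReadFrom decodes an image from any io.Reader: an *os.File, a
// network connection, a bytes.Reader and so on.
func ReadFrom(r io.Reader) (image.Image, error) {
    // decode from r ...
    return nil, nil
}

// SaveTo encodes the image to any io.Writer.
func SaveTo(w io.Writer, img image.Image) error {
    // encode to w ...
    return nil
}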

disintegration

Jan 13, 2014, 11:07:00 AM
to golan...@googlegroups.com, disintegration
Open and Save are just useful shortcuts; every image library has them. You can use the standard "image" package to decode the image data from an io.Reader and to write to an io.Writer, while using imglib only for resizing, like this (error handling skipped):

src, _, _ := image.Decode(reader)
img := imglib.Convert(src)
img = img.Resize(800, 0, imglib.ResampleLanczos)
png.Encode(writer, img.GetImage())

The library is at an early stage and the API is not stable yet. Maybe I'll add more appropriate functions for working with io.Reader and io.Writer later.

On Monday, January 13, 2014, 17:43:57 UTC+4, Konstantin Khomoutov wrote:

Nigel Tao

Jan 13, 2014, 6:34:18 PM
to disintegration, golang-nuts
On Tue, Jan 14, 2014 at 3:07 AM, disintegration
<disintegr...@gmail.com> wrote:
> Open and Save are just useful shortcuts; every image library has them.

Go's standard image library doesn't.