New to Haxe: Building Client-Side Parallel Image Processing Library


Murli Rajagopalan

Jun 2, 2015, 9:32:01 AM
to haxe...@googlegroups.com
We currently have a parallel implementation of an image processing library written purely in C++ and Cilk, running on the server as a service that uses ALL CPU cores of the server. It can be accessed via HTTP from a web client. It takes in an image and other inputs, runs the algo in parallel, and returns the output image.

This is working pretty well ... however, scaling is a problem, since the algorithm is extremely CPU intensive. Even on an 8-core machine, the algo takes about 2 seconds to process a 1000x1000 pixel image.

So, we want to convert this into a client-side library that's directly accessed by the web JS client.  It should download when the page is requested (like ActiveX control).

Questions are:
1.  Is such an implementation possible in Haxe?
2.  Can we then target browsers on any platform (Windows, Linux, Mac, Android, and iOS) seamlessly?
3.  If so, what are the next steps I should take?

Please note that the image processing library accesses memory liberally (for performance) and directly needs to access the raw bytes of the image.
Also, it should be able to use all CPU cores on the client machine.

Regards,

Marc Weber

Jun 2, 2015, 9:38:12 AM
to haxelang
C/C++ can be compiled to JS via LLVM (see asmjs.org; asm.js is a subset of
JS that uses typed arrays to simulate raw memory, or such - it's said to
be at most about 2 times slower than native C)
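
To illustrate the "arrays simulating raw memory" idea, here is a minimal hand-written sketch in TypeScript. It is not asm.js output and not what the compiler actually emits; the names `WIDTH`, `HEIGHT`, `heap`, `bytes`, `words` and `setPixel` are made up for the example. It only shows how one flat buffer with typed-array views can stand in for a C heap holding raw RGBA image bytes.

```typescript
// Sketch only: one flat buffer plays the role of the C heap.
const WIDTH = 4;  // hypothetical image width in pixels
const HEIGHT = 2; // hypothetical image height in pixels

// 4 bytes per pixel (R, G, B, A), stored row by row.
const heap = new ArrayBuffer(WIDTH * HEIGHT * 4);
const bytes = new Uint8Array(heap);  // byte view, similar to unsigned char*
const words = new Uint32Array(heap); // 32-bit view over the same memory

// Write one RGBA pixel by computing its byte offset, the way C code would
// index into a raw image buffer.
function setPixel(
  x: number, y: number,
  r: number, g: number, b: number, a: number
): void {
  const offset = (y * WIDTH + x) * 4;
  bytes[offset] = r;
  bytes[offset + 1] = g;
  bytes[offset + 2] = b;
  bytes[offset + 3] = a;
}

// Read the red channel of a pixel back through the byte view.
function getRed(x: number, y: number): number {
  return bytes[(y * WIDTH + x) * 4];
}
```

Both views alias the same `ArrayBuffer`, so a write through `bytes` is visible through `words`, which is roughly how pointer casts in the original C/C++ code can be mimicked.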

This way you may have a chance to reuse existing code - maybe there are
better / alternative ways.

You may want to have a look at your clients first - maybe many of them
are on mobile phones.

Marc Weber

Marc Weber

Jun 2, 2015, 9:40:05 AM
to haxelang
It might make sense to talk about the algo ... maybe there are ways to
outsource the work to dedicated graphics hardware (server side) or to
find a faster implementation.

Marc Weber

Murli Rajagopalan

Jun 2, 2015, 10:18:29 AM
to haxe...@googlegroups.com
Thanks for your prompt reply.

Here are my comments/reply to your suggestions.

First, the algo is quite parallelizable (fortunately, except for some parts), so we'd like to take advantage of that. Do you know if asm.js-compiled code takes advantage of the CPU cores on the client machine (for example, using the latest JavaScript Worker Threads)? If so, this might be a very good starting point. The only thing is that JS is exposed and viewable by anyone.

Second, talking about the algo, it uses sparse matrix math and transforms an image (along with some other information) into an output image. It sets up a lot of vectors and matrices for each pixel of the image and finally solves the equation using the conjugate gradient method. As you can see, it is CPU intensive. Besides, it's what we have developed for producing high-quality output. This is our USP. We don't plan on replacing that with any other algo in the near future.
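
For readers unfamiliar with the method, here is a textbook conjugate gradient solver over a sparse matrix, sketched in TypeScript with typed arrays. This is not our actual implementation: the CSR storage layout, the names `CsrMatrix`, `multiply`, `dot` and `conjugateGradient`, and the `tol` / `maxIter` defaults are all assumptions made for the example. It assumes the matrix is symmetric positive definite.

```typescript
// Sparse matrix in compressed sparse row (CSR) form. Hypothetical layout.
interface CsrMatrix {
  n: number;            // number of rows (and columns)
  rowPtr: Int32Array;   // length n + 1; row i spans rowPtr[i]..rowPtr[i+1]-1
  colIdx: Int32Array;   // column index of each stored value
  values: Float64Array; // the non-zero values
}

// out = A * x
function multiply(a: CsrMatrix, x: Float64Array, out: Float64Array): void {
  for (let i = 0; i < a.n; i++) {
    let sum = 0;
    for (let k = a.rowPtr[i]; k < a.rowPtr[i + 1]; k++) {
      sum += a.values[k] * x[a.colIdx[k]];
    }
    out[i] = sum;
  }
}

function dot(u: Float64Array, v: Float64Array): number {
  let sum = 0;
  for (let i = 0; i < u.length; i++) sum += u[i] * v[i];
  return sum;
}

// Solve A x = b for symmetric positive definite A, starting from x = 0.
function conjugateGradient(
  a: CsrMatrix, b: Float64Array, tol = 1e-10, maxIter = 1000
): Float64Array {
  const x = new Float64Array(a.n); // initial guess: all zeros
  const r = new Float64Array(b);   // residual r = b - A*0 = b (copy of b)
  const p = new Float64Array(b);   // first search direction is the residual
  const ap = new Float64Array(a.n);
  let rr = dot(r, r);
  for (let iter = 0; iter < maxIter && Math.sqrt(rr) > tol; iter++) {
    multiply(a, p, ap);
    const alpha = rr / dot(p, ap); // step length along p
    for (let i = 0; i < a.n; i++) {
      x[i] += alpha * p[i];
      r[i] -= alpha * ap[i];
    }
    const rrNext = dot(r, r);
    const beta = rrNext / rr;      // makes the next direction A-conjugate
    for (let i = 0; i < a.n; i++) p[i] = r[i] + beta * p[i];
    rr = rrNext;
  }
  return x;
}
```

The cost per iteration is dominated by the sparse matrix-vector product and the dot products, which is where most of the CPU time goes for large images.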

We did look at GPU architectures, and it is definitely a route to take. Our first take was parallelizing using CPU cores. Next, we can do GPGPU parallelization. Again, here there's the same scalability problem ... the cost of a GPU cloud ... all the complexity of handling tens of thousands of users (we hope!!!) trying to use the algo all at the same time.

Hence the interest in leveraging the CPU/GPU of the client directly.

Marc Weber

Jun 2, 2015, 10:44:22 AM
to haxelang
> First, the algo is quite parallelizable (fortunately, except for some
> parts), so we'd like to take advantage that. *Do you know if the asm.js
> compile takes advantage of the CPU cores on the client machine (for
> example, using the latest Javascript Worker Threads)*?
http://www.w3schools.com/html/html5_webworkers.asp

JS on its own does not support threading except passing serializable
content to something like worker threads - something similar does exist
for node.
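
Since workers share nothing and only receive copied (or transferred) data, the image has to be cut into pieces up front. A rough sketch of that step in TypeScript, assuming RGBA bytes and a split by horizontal strips; `StripJob` and `makeStripJobs` are hypothetical names, and the sketch ignores any coupling between strips that the real algo might have:

```typescript
// One unit of work for one worker. Hypothetical shape.
interface StripJob {
  startRow: number;   // first image row covered by this strip
  rowCount: number;   // number of rows in the strip
  width: number;      // image width in pixels
  pixels: Uint8Array; // private RGBA copy of the strip's bytes
}

// Split an RGBA image into roughly equal horizontal strips, one per worker.
function makeStripJobs(
  image: Uint8Array, width: number, height: number, workerCount: number
): StripJob[] {
  const jobs: StripJob[] = [];
  // Never create more strips than rows, and always create at least one.
  const count = Math.max(1, Math.min(workerCount, height));
  const base = Math.floor(height / count);
  const extra = height % count; // the first `extra` strips get one more row
  let startRow = 0;
  for (let i = 0; i < count; i++) {
    const rowCount = base + (i < extra ? 1 : 0);
    const begin = startRow * width * 4;
    const end = (startRow + rowCount) * width * 4;
    // slice() copies, so each job owns its own buffer.
    jobs.push({ startRow, rowCount, width, pixels: image.slice(begin, end) });
    startRow += rowCount;
  }
  return jobs;
}

// In a browser, each job could then be handed to a worker, for example:
//   worker.postMessage(job, [job.pixels.buffer]);
// which transfers the buffer instead of copying it a second time.
// navigator.hardwareConcurrency can suggest a workerCount where available.
```

Each strip gets its own buffer, so it can be moved to a worker without the main thread holding on to it.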

I don't think that scaling is an issue (e.g. using Amazon) - but costs
could be.

Using java applets could be an option as well. Java supports threads - I
never tried using those in an applet though - however I don't expect
that many people to have Java installed ..

GPU cloud computing does exist:
http://www.nvidia.com/object/gpu-cloud-computing-services.html
If your algo gets much faster it could be an option.

FPGA cloud? - does it even exist?
http://www.electronicsweekly.com/news/components/programmable-logic-and-asic/big-fpga-design-moves-to-the-cloud-2013-06/

Marc Weber

Murli Rajagopalan

Jun 2, 2015, 10:53:28 AM
to haxe...@googlegroups.com
Thanks for your suggestions.

Regards.