This may be a stupid question, but how about distributing the workload
to multiple node processes?
This should sidestep the per-process memory limit and also happens to leverage
all cores. IPC is pretty straightforward with node
(http://nodejs.org/api/child_process.html#child_process_child_send_message_sendhandle).
And should the amount of data ever grow to surpass the capabilities of
one machine, you can swap out the IPC for some network protocol and
run the stuff on multiple machines.
Without knowing more about the problem, I would suggest aiming for a
map-reduce-ish information flow between the distinct processes. Maybe
this is a good starting point:
http://www.mapjs.org/
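
Roughly, the flow I have in mind could look like the sketch below. It has nothing to do with how mapjs does it; all function names are invented for the example, and the "transport" is just a local stand-in where IPC or a network call would plug in.

```javascript
// Split the input into at most `parts` chunks of roughly equal size.
function splitIntoChunks(items, parts) {
  const size = Math.ceil(items.length / parts) || 1;
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Map step (hypothetical workload): the part a worker process would run.
function sumOfSquares(chunk) {
  return chunk.reduce((total, n) => total + n * n, 0);
}

// Reduce step: combine the partial results coming back from the workers.
function addPartials(partials) {
  return partials.reduce((total, p) => total + p, 0);
}

// Orchestration: `sendToWorker` takes a chunk and returns a Promise for its
// partial result. Swapping IPC for a network protocol only changes this function.
function mapReduce(items, parts, sendToWorker) {
  const chunks = splitIntoChunks(items, parts);
  return Promise.all(chunks.map((chunk) => sendToWorker(chunk))).then(addPartials);
}

// Stand-in transport that runs the map step in the current process.
const localTransport = (chunk) => Promise.resolve(sumOfSquares(chunk));

mapReduce([1, 2, 3, 4, 5, 6], 3, localTransport)
  .then((total) => console.log('total:', total));
```

The orchestration never needs to know where a chunk gets processed, which is what makes the later move from one machine to several cheap.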
Regards,
Juraj