Thanks for the response. Let me shed more light on the issue we are facing.
We have a 3-machine beta Hadoop cluster where we use RHadoop. The issue, which we see on both the beta cluster and a larger 50-machine Hadoop cluster running an R streaming job, is that the R code, which is compute-intensive and memory-intensive, seems to spend much of its time garbage collecting. These are 32-core machines, and we find that system CPU% goes to 70% whereas user CPU% stays low. When I profiled with "perf", I noticed a lot of time spent in spinlocks within the kernel, which drives system CPU% up massively, thereby making the machine less effective for other jobs.
On further debugging with "gdb", I noticed that the R program was the cause of the issue: it seems to be garbage collecting quite frequently. See the stack trace of one such instance below. We do know that the code does a lot of rbind (without proper data type mapping) and a lot of memory allocation in a tight loop, but before attempting to fix the code, we wanted to spawn the R program with "--min-vsize=512M --max-vsize=1G --min-nsize=500k --max-nsize=1M" and see whether that reduces system CPU%. Is there currently a way to set these options? Let me know if you need more input.
[Thread 0x7f18f05f7700 (LWP 25405) exited]
Breakpoint 1, RunGenCollect (size_needed=353916) at memory.c:1510
1510 PROCESS_NODES();
(gdb) where
#0 RunGenCollect (size_needed=353916) at memory.c:1510
#1 R_gc_internal (size_needed=353916) at memory.c:2579
#2 0x000000000042248b in Rf_allocVector (type=14, length=353916) at memory.c:2353
#3 0x00007f18f6f67c63 in extract_col (x=0x7f18d3532010, j=0x70bfe60, drop=0x4795668, first_=<optimized out>, last_=0x78d9c98) at extract_col.c:31
#4 0x000000000053edfd in do_dotcall (call=0x4694f40, op=<optimized out>, args=<optimized out>, env=<optimized out>) at dotcode.c:549
#5 0x0000000000573cac in Rf_eval (e=0x4694f40, rho=0x78ca848) at eval.c:492
#6 0x00000000005768d4 in do_return (call=0x4694ed0, op=<optimized out>, args=<optimized out>, rho=0x78ca848) at eval.c:1430
#7 0x0000000000573b29 in Rf_eval (e=0x4694ed0, rho=0x78ca848) at eval.c:466
#8 0x0000000000573b29 in Rf_eval (e=0x4694db8, rho=0x78ca848) at eval.c:466
#9 0x0000000000576960 in do_begin (call=0x46a58a0, op=0x1d53940, args=0x4694d80, rho=0x78ca848) at eval.c:1413
#10 0x0000000000573b29 in Rf_eval (e=0x46a58a0, rho=0x78ca848) at eval.c:466
#11 0x0000000000577b7f in Rf_applyClosure (call=0x78cad18, op=0x46a6290, arglist=<optimized out>, rho=0x78cadc0, suppliedenv=0x78cad88) at eval.c:859
#12 0x000000000042dd42 in Rf_usemethod (generic=0x696699 "[", obj=0x7fff87117a30, call=<optimized out>, args=<optimized out>, rho=0x78cadc0, callrho=0x78caf48, defrho=0x1d7aa98, ans=0x7fff87117e78) at objects.c:363
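
To illustrate the rbind pattern mentioned above, here is a minimal sketch. It is not our actual job code: the function names, column names, and row count are made up for illustration. The first function grows a data frame one row at a time, so each rbind() call copies everything accumulated so far, which is the kind of allocation churn that would be expected to trigger frequent collections. The second allocates typed columns once and fills them in place.

```r
# Hypothetical illustration only; names and sizes are assumptions,
# not taken from the real streaming job.
n <- 1000L

# Pattern resembling the problem: grow a data frame row by row.
# Every rbind() allocates a new data frame and copies all earlier rows.
grow_with_rbind <- function(n) {
  out <- data.frame(id = integer(0), value = numeric(0))
  for (i in seq_len(n)) {
    out <- rbind(out, data.frame(id = i, value = i * 0.5))
  }
  out
}

# Alternative: allocate columns of the right type up front,
# fill them in the loop, and build the data frame once at the end.
fill_preallocated <- function(n) {
  id <- integer(n)
  value <- numeric(n)
  for (i in seq_len(n)) {
    id[i] <- i
    value[i] <- i * 0.5
  }
  data.frame(id = id, value = value)
}

a <- grow_with_rbind(n)
b <- fill_preallocated(n)
```

Both functions produce the same columns. Wrapping each call in system.time(), or turning on gcinfo(TRUE) beforehand, should show how much collection work each variant causes on a given machine.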