Hi All,
I'm seeing a difference in run time for the R script below between our production and development environments. Are there any memory settings that apply to an R script running inside a YARN container, and which areas should I focus on when tuning an rmr job? Please share any references on this.
Distribution: HDP 2.1
Please find attached the rmr job logs from both environments.
RMR Benchmark script:
Sys.setenv(HADOOP_CMD="/usr/bin/hadoop",
           HADOOP_STREAMING="/usr/lib/hadoop-mapreduce/hadoop-streaming.jar")
library(rmr2)
library(forecast)
small.ints <- to.dfs(1:100)
rmr.options(backend.parameters=list())
out <- mapreduce(
  small.ints,
  map=function(k,v) keyval(v,v),
  reduce=function(k,vv) {
    ets(USAccDeaths)
    keyval(k,vv)
  },
  backend.parameters=list(hadoop=list(D="mapreduce.job.reduces=36")))
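To make the memory question concrete: I assume container memory would be passed the same way as the reducer count, as extra D entries in backend.parameters. A minimal sketch, assuming the standard Hadoop 2 property names mapreduce.map.memory.mb and mapreduce.reduce.memory.mb; the values are placeholders only, not what either cluster currently uses:

```r
# Sketch only: the memory values below are hypothetical placeholders,
# not settings taken from either environment.
mem.params <- list(hadoop=list(
  D="mapreduce.job.reduces=36",          # reducer count from the script above
  D="mapreduce.map.memory.mb=2048",      # assumed: container size for map tasks
  D="mapreduce.reduce.memory.mb=4096"))  # assumed: container size for reduce tasks

# This list would replace the backend.parameters argument in the mapreduce() call.
str(mem.params)
```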
Please let me know if you need more information.
Thanks
Arun