memory usage of plink1.9 with --R

Damian Gola

Sep 11, 2018, 9:10:02 AM
to plink2-users
Dear all,

I am trying to run an R script that computes the Fst, Fit, and Fis population statistics within plink1.9 beta 6.2 using the --R interface. I know there is the --fst flag, but it reports Fst values only, and I am also interested in Fit and Fis.

Since we use Slurm for resource management on our compute cluster, I tried to limit plink's memory usage by also specifying --memory. Unfortunately, the memory usage of plink keeps rising while it writes the output. I'm wondering whether there might be a memory leak in plink because, as I understand it, plink should read the data in chunks no larger than specified by --memory, pass each chunk to the Rserve instance, get the results back, write them to the output file, and start over. Thus, plink's memory usage shouldn't increase over time.

For example, a plink process currently uses 15.0G of resident memory according to top, although it was started with --memory 7800. It started at ~2.0G and has increased continuously over the last 2 hours. The bed file size is 11G.
This makes plink hard to use with a resource management system, because the job will be killed sooner or later, as others already have been.
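
The comparison between resident memory and the --memory value can also be scripted instead of read off top. The following is only a minimal sketch, assuming a Linux system with /proc; the function names rss_mib and within_limit are hypothetical helpers, not part of plink or Slurm:

```shell
#!/bin/sh
# Hypothetical helpers (not part of plink or Slurm). Linux only: reads /proc.

# Print the resident set size (VmRSS) of process $1 in MiB, rounded down.
rss_mib() {
    awk '/^VmRSS:/ { printf "%d\n", $2 / 1024 }' "/proc/$1/status"
}

# Return 0 if process $1 uses at most $2 MiB of resident memory,
# 1 if it uses more, and 2 if the process could not be inspected.
within_limit() {
    rss=$(rss_mib "$1") || return 2
    [ "$rss" -le "$2" ]
}
```

With the plink process ID in place of <pid>, `within_limit <pid> 7800` would return a non-zero status for the run described above, since --memory takes its value in MiB and 15.0G of resident memory is well above 7800 MiB.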

The following can reproduce the issue:

R> library(Rserve)
R> Rserve(port = 1025)



$ cat test.R

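Rplink <- function(PHENO,GENO,CLUSTER,COVAR)
{
 # Per-SNP statistic: mean genotype value divided by 2 (missing values dropped).
 f1 <- function(x)
 {
    r <- mean(x, na.rm=TRUE) / 2
    # Rplink return convention: the number of values, followed by the values.
    c( length(r) , r )
 }

 # Apply f1 to every column (SNP) of GENO and flatten into one numeric vector.
 as.numeric( apply(GENO, 2 , f1) )
}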

$ plink1.9b6.2 --dummy 1200 250000 --make-bed --out foo
$ plink1.9b6.2 --bfile foo --R test.R --out bar --memory 64 --R-port 1025 --allow-no-sex

The attached screenshots document plink's memory usage at the beginning (16.1m), in the middle (395.6m, above the 64m given by --memory), and at the end (1.131g, also above the 64m given by --memory).
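
As an alternative to screenshots, the growth could be recorded by sampling the process while it runs. This is a rough sketch under the same assumption of a Linux /proc filesystem; peak_rss_kib is a hypothetical helper name, and the one-second sampling interval is an arbitrary choice, so short spikes between samples would be missed:

```shell
#!/bin/sh
# Hypothetical helper (not part of plink). Linux only: reads /proc.
# Runs the given command in the background, samples its resident memory
# (VmRSS, in KiB) once per second, and prints the largest value seen.
peak_rss_kib() {
    "$@" >/dev/null 2>&1 &
    pid=$!
    peak=0
    # Poll for as long as the process still exists.
    while kill -0 "$pid" 2>/dev/null; do
        cur=$(awk '/^VmRSS:/ { print $2 }' "/proc/$pid/status" 2>/dev/null) || cur=""
        if [ -n "$cur" ] && [ "$cur" -gt "$peak" ]; then
            peak=$cur
        fi
        sleep 1
    done
    wait "$pid" 2>/dev/null
    echo "$peak"
}
```

Called as `peak_rss_kib plink1.9b6.2 --bfile foo --R test.R --out bar --memory 64 --R-port 1025 --allow-no-sex`, it would print one number in KiB that can be compared directly with the 64 MiB (65536 KiB) limit. Note that the command's own output is discarded by this sketch.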

I hope I have explained the problem clearly. If any questions remain, I'll try to answer them as soon as possible.

Greetings,
Damian
start.png
mid.png
end.png

Christopher Chang

Sep 11, 2018, 11:44:57 AM
to plink2-users
--memory controls the size of plink's main workspace, and ~99% of the time that does what you expect... unfortunately, --R is not part of that 99%, since it depends on third-party Rserve code which uses a totally different memory management strategy.  I'll see if there's something straightforward I can do, though.

Christopher Chang

Sep 11, 2018, 12:25:37 PM
to plink2-users
Today's development build has an attempted fix, but I haven't tested it under a heavy workload yet; can you check if it solves the memory leak?



Damian Gola

Sep 13, 2018, 9:54:16 AM
to plink2-users
Thanks for your quick reply and fix. It looks like it solves the memory leak; I haven't noticed any rise in memory usage so far.