Is there a sample size limit for GlfMultiples/GlfFlex when doing joint calling?


Fred Zhou

Jun 3, 2016, 5:28:14 AM
to GotCloud
Hi All,

I have been modifying the GotCloud pipeline on our cluster and successfully tested it with 1000 Genomes data (n ~ 900). (It's really amazing, I have to say.)

However, after including more samples (n ~ 1100), I got an error from glfMultiples/glfFlex when doing the joint calling:

Failed to open genotype likelihood file [.XXX.glf]

I checked the corresponding file [.XXX.glf] by running single-sample calling on it; the stats came out normally, which suggests the file itself is OK.
I also tried subsets of ~500 and ~800 files that included [.XXX.glf], and the software ran smoothly.

So I'm curious whether there is a limit on the number of file names GlfMultiples/GlfFlex can take as arguments.
Alternatively, can GlfMultiples/GlfFlex read the locations of the GLF files from a text file?


Best,
Fred

Hyun Min Kang

Jun 3, 2016, 8:08:54 AM
to Fred Zhou, GotCloud
You need to increase the open-file limit (`ulimit -n`) on your system. It is probably set to 1024.

Hyun.
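
For anyone checking this on their own system, here is a minimal Python sketch (not part of GotCloud) that reads the per-process open-file limit, the value `ulimit -n` reports, and estimates whether a given number of GLF inputs fits under it. It assumes the caller keeps all inputs open at once, which is consistent with the failure described above but is not confirmed from the source code. `fits_under_limit` and the `reserved` allowance are hypothetical, chosen for illustration only.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) open-file limits for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def fits_under_limit(n_inputs, soft_limit, reserved=8):
    """Estimate whether n_inputs files can be open at the same time.

    reserved is an assumed allowance for stdin, stdout, stderr and any
    output or log files; the real overhead depends on the program.
    """
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return n_inputs + reserved <= soft_limit


if __name__ == "__main__":
    soft, hard = open_file_limits()
    print(f"soft limit: {soft}, hard limit: {hard}")
    for n in (900, 1100):
        print(n, "inputs fit:", fits_under_limit(n, soft))
```

With a soft limit of 1024, 900 inputs fit and 1100 do not, which matches the behaviour reported in this thread.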

--
You received this message because you are subscribed to the Google Groups "GotCloud" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gotcloud+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Fred

Jun 3, 2016, 9:01:30 AM
to GotCloud, fredz...@gmail.com
Hi Hyun,

Yes, you are right. Thank you very much!

Best,
Fred

wayne

Nov 16, 2017, 10:24:49 PM
to GotCloud
Dear Hyun,

I encountered another problem when running glfFlex.

Error message: 5580 File size limit exceeded (core dumped) glfFlex -p 0.9 --minMapQuality 0 --minDepth 1 --maxDepth 100000 --uniformTsTv --smartFilter -b *.glf
We have around 1700 samples.

We had already set `ulimit -n 2000`.

Would you please advise whether there are other ways to get around this error?

Thank you

Regards,
Wayne



pj...@umich.edu

Nov 18, 2017, 6:04:34 PM
to GotCloud
Have you tried `ulimit -n 10000`?
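
One detail that may be worth checking: "File size limit exceeded" is the message a shell prints when a process is killed by SIGXFSZ, which is tied to the maximum file size limit (`ulimit -f`, RLIMIT_FSIZE) rather than to the open-file limit (`ulimit -n`). Whether that was the cause in this case is not established in the thread. The sketch below is Python, not part of GotCloud, and simply reports both limits so they can be compared; `limit_report` and `format_limit` are hypothetical helper names.

```python
import resource


def format_limit(value):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def limit_report():
    """Return (label, soft, hard) rows for the open-file and file-size limits."""
    checks = (
        ("open files (ulimit -n)", resource.RLIMIT_NOFILE),
        # getrlimit reports bytes; the shell's `ulimit -f` reports blocks.
        ("max file size in bytes (ulimit -f)", resource.RLIMIT_FSIZE),
    )
    report = []
    for label, res in checks:
        soft, hard = resource.getrlimit(res)
        report.append((label, format_limit(soft), format_limit(hard)))
    return report


if __name__ == "__main__":
    for label, soft, hard in limit_report():
        print(f"{label}: soft={soft} hard={hard}")
```

If the file-size row shows anything other than "unlimited", a large output file could hit that limit regardless of the `ulimit -n` setting.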

wayne

Nov 21, 2017, 10:11:53 PM
to GotCloud
It works now. We found that there were some problems with our OS.

Thank you

O E-G

Jul 23, 2019, 5:13:14 PM
to GotCloud
Hello Hyun,

I am experiencing the same issue when using GotCloud with more than 1000 samples. However, my system administrator will not relax this security-related control by increasing `ulimit -n`. He thinks the issue stems from a potential bug in glfFlex, which may not properly close() each file after it has been used. See, for example, line 37 of pFile.h.

Is there another workaround for this?

I'd appreciate any insight you may have.

Thanks,
Osvaldo
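
One option that does not need administrator involvement: an unprivileged process can raise its own soft open-file limit up to the hard limit, and child processes inherit the new value. This only helps if the hard limit (`ulimit -Hn`) is already above the number of input files. A minimal Python sketch, not part of GotCloud; `raise_soft_nofile` and `run_with_raised_limit` are hypothetical helper names:

```python
import resource
import subprocess


def raise_soft_nofile(wanted):
    """Raise the soft open-file limit toward `wanted`, capped by the hard limit.

    Never lowers the current soft limit. Returns the soft limit in effect.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return soft  # already unlimited, nothing to do
    target = wanted if hard == resource.RLIM_INFINITY else min(wanted, hard)
    if target > soft:
        try:
            resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
        except (ValueError, OSError):
            pass  # the OS refused the new value; keep the current limit
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


def run_with_raised_limit(command, wanted):
    """Raise the soft limit, then run `command` (a list of arguments).

    The child process inherits the limits of this process.
    """
    raise_soft_nofile(wanted)
    return subprocess.run(command, check=False).returncode
```

Here `command` would be the glfFlex command line already in use, passed as a list of arguments. If the hard limit itself is 1024, this approach cannot help and the inputs have to be processed in smaller groups.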

Hyun Min Kang

Jul 23, 2019, 5:30:57 PM
to O E-G, GotCloud
I suggest using https://github.com/statgen/topmed_variant_calling. It has an option for hierarchical calling by batch, as well as support for indels.

Hyun.
-----------------------------------------------------
Hyun Min Kang, Ph.D.
Associate Professor of Biostatistics
University of Michigan, Ann Arbor
Email : hmk...@umich.edu
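
To illustrate the general idea of calling by batch: the input list can be split so that no single run needs more files open than the limit allows. The sketch below is Python and shows only the splitting step. It is not the topmed_variant_calling implementation, and how per-batch results are merged is left to that pipeline. `split_into_batches`, the batch size of 500, and the sample file names are all hypothetical.

```python
def split_into_batches(paths, batch_size):
    """Split a list of input paths into consecutive batches.

    Every batch has at most batch_size entries; order is preserved.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [paths[i:i + batch_size] for i in range(0, len(paths), batch_size)]


if __name__ == "__main__":
    # Hypothetical file names standing in for ~1700 per-sample GLF files.
    glf_files = [f"sample{i:04d}.glf" for i in range(1700)]
    batches = split_into_batches(glf_files, 500)
    print(len(batches), [len(b) for b in batches])
```

Choosing a batch size comfortably below the soft open-file limit leaves room for the output and log files each run opens.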


To view this discussion on the web visit https://groups.google.com/d/msgid/gotcloud/7217c9d3-a028-40c6-97f3-773de2a60659%40googlegroups.com.

O E-G

Jul 23, 2019, 6:20:09 PM
to GotCloud
Thanks for the prompt response, much appreciated.

Best,

