Expected Runtime for 550k SNPs, 500k Respondents and 100 PCs

Tobias Wolfram

Jun 25, 2021, 4:33:20 AM
to flashpca-users
Hi, I'm running FlashPCA2 from the command line on an Ubuntu workstation, on a dataset of roughly 550k SNPs and 500k respondents, and I need the first 100 PCs. My command is:

/data/flashpca/flashpca --bfile filename -v -d 100

I started the run on Monday (so 5 days ago) and, according to the verbose output, it is still reading blocks. Is this runtime within expectations?

Thanks for your help!
Tobias 

Gad Abraham

Jun 29, 2021, 4:13:06 AM
to Tobias Wolfram, flashpca-users
Hi Tobias,

Thanks for your message.

It's hard to say without the details of your workstation, but there are
two things I can think of:

1. Do you really need 550k SNPs? I usually LD-prune the SNPs with PLINK
beforehand, down to something much smaller, say 50k-100k. That's also
better for capturing population structure etc., since it reduces LD.

2. Do you really need 100 PCs? That's quite a lot...
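
For reference, here is a rough sketch of how the two points above could look
on the command line. It is only an illustration under several assumptions:
PLINK 1.9 is installed as "plink" on the PATH, "filename" is the fileset
prefix from the original command, and the output prefix "filename_pruned",
the pruning thresholds (1000 50 0.05) and the 20 PCs are placeholder choices
that would likely need tuning to land in the 50k-100k SNP range. The small
"run" helper is hypothetical too: it only prints each command unless
DRY_RUN=0 is set, so the plan can be checked before starting a long job.

```shell
#!/bin/sh
set -eu

# Assumed names: "filename" comes from the original command; the pruned
# prefix, thresholds and number of PCs are illustrative placeholders.
IN_PREFIX="filename"
PRUNED_PREFIX="${IN_PREFIX}_pruned"
N_PCS=20

# Hypothetical helper: print each command, execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# 1. Compute an LD-pruned SNP list (window of 1000 SNPs, step of 50 SNPs,
#    r^2 threshold 0.05). PLINK writes the SNPs to keep to
#    ${PRUNED_PREFIX}.prune.in.
run plink --bfile "$IN_PREFIX" --indep-pairwise 1000 50 0.05 --out "$PRUNED_PREFIX"

# 2. Write a new fileset that contains only the retained SNPs.
run plink --bfile "$IN_PREFIX" --extract "${PRUNED_PREFIX}.prune.in" --make-bed --out "$PRUNED_PREFIX"

# 3. Run FlashPCA2 on the pruned fileset, asking for fewer PCs.
run /data/flashpca/flashpca --bfile "$PRUNED_PREFIX" -v -d "$N_PCS"
```

Running the script as-is only echoes the three commands; running it with
DRY_RUN=0 in the environment executes them in order.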

Regards,
Gad