Can you allocate resources for a Snakemake pipeline?


Stephan Sie

Mar 18, 2021, 10:40:30 AM
to GCP Life Sciences Discuss
Hi,

I'm looking into running my Snakemake pipeline on GCP, and I saw in the Snakemake documentation that it can be run with the Google Life Sciences executor. But I can't seem to find out how to allocate resources for my pipeline. Will resources be assigned automatically, or based on e.g. the number of threads I assign in my pipeline?

Thank you in advance,
Stephan Sie


Wendy Wong

Mar 18, 2021, 10:52:37 AM
to Stephan Sie, GCP Life Sciences Discuss
Hi Stephan,

I am also developing a Snakemake pipeline with the --google-lifesciences option. You can specify a default instance type and then override it for individual rules like:

resources:
    machine_type="n1-standard-8"
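For example, a minimal Snakefile sketch. The rule name, command, and resource numbers here are made up, and it assumes the executor honors a machine_type resource (underscore, since resource names are keyword arguments):

# Hypothetical rule: aligns one FASTQ per job on an 8-core machine
rule align:
    input:
        "data/{sample}.fastq"
    output:
        "results/{sample}.bam"
    threads: 8
    resources:
        machine_type="n1-standard-8",  # instance type for this rule's jobs
        mem_mb=16000,                  # memory request in MB
        disk_mb=100000                 # disk request in MB
    shell:
        "my_aligner --threads {threads} {input} > {output}"

If I remember correctly, you can also set defaults for all rules from the command line with --default-resources.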

Hope that helps,
Wendy 


Stephan Sie

Mar 18, 2021, 11:44:17 AM
to GCP Life Sciences Discuss
Hi Wendy,

Thank you, that does indeed help!

Would you also happen to know whether the --google-lifesciences option autoscales, or if there is an option for that? For example, if I have 16 files that have to be processed by the pipeline and I specify an n1-standard-8 machine for my rule, which, correct me if I'm wrong, means 8 cores, will another instance be created, or will it first process 8 files and then the other 8?
 
Thank you in advance,
Stephan

On Thursday, March 18, 2021 at 15:52:37 UTC+1, wendy...@gmail.com wrote:

Paul Grosu

Mar 18, 2021, 2:14:22 PM
to GCP Life Sciences Discuss
Hi Stephan,

According to the following link, it looks like you'll have to use Kubernetes via "gcloud beta container clusters create --enable-autoscaling", but it doesn't feel straightforward:


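For what it's worth, a rough sketch of that kind of autoscaling cluster command; the cluster name, zone, and node limits below are placeholders I made up:

# Create a GKE cluster that autoscales between 1 and 16 nodes
gcloud beta container clusters create my-snakemake-cluster \
    --zone us-central1-a \
    --enable-autoscaling \
    --min-nodes 1 \
    --max-nodes 16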
Regarding parallelization, it depends on how the files are being processed. Snakemake has the -j option, which tells it how many cores to use, but the input is usually concatenated.
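For example, an invocation along these lines (the bucket name and job count are just placeholders):

# Run the workflow on Google Life Sciences, up to 16 jobs at a time
snakemake --google-lifesciences \
    --default-remote-prefix my-bucket \
    --jobs 16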

Hope it helps,
Paul

Stephan Sie

Mar 19, 2021, 7:51:20 AM
to Paul Grosu, GCP Life Sciences Discuss
Hi Paul,

Thank you for your response!

I'll have a look into using Kubernetes, since that does seem to allow me to do what I want.

Once again thank you,
Stephan



On Thu, Mar 18, 2021 at 19:14, Paul Grosu <pgr...@gmail.com> wrote:

Paul Grosu

Mar 19, 2021, 9:14:33 AM
to GCP Life Sciences Discuss
Hi Stephan,

Gladly, anytime.  Happy to hear it helped.

Paul

Mick Watson

May 11, 2021, 7:53:09 AM
to GCP Life Sciences Discuss
You control how many jobs get submitted (and therefore how many VMs are created) using the --jobs flag:

snakemake blah blah --jobs 100

This will submit the first 100 jobs to the Google Cloud executor, each with the VM type you specify.