"Scaling out" using SFS and cloud infrastructure


OS

Oct 30, 2020, 1:06:50 AM
to mlxtend
Hey guys,
I would love to use the SFS method in a distributed way.
I saw that it can run in parallel on a single machine, using the available cores (n_jobs=-1, as in scikit-learn).
But is there a way or a framework to run mlxtend SFS across a number of different machines?
As part of my thesis I need to run SFS on a data set of about 10K genes and about 100K samples, and I am looking for a way to do it with many low-cost machines on AWS.

Thanks in advance for your help and thoughts.

(A way to do it with PySpark and UDFs would also be fine, as I can scale out using AWS EMR.)
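
To make the question concrete: within one round of forward selection, every candidate subset is scored independently, so a round is the natural unit to spread over workers. Below is a minimal sketch of that idea in plain Python. It is not mlxtend's implementation; `forward_select`, `toy_score`, `WEIGHTS`, and the `map_fn` parameter are hypothetical names for illustration. Here a local thread pool stands in for the workers; replacing `pool.map` with a cluster-backed map is an assumption and is not tested here.

```python
from concurrent.futures import ThreadPoolExecutor


def forward_select(n_features, k, score_fn, map_fn=map):
    """Greedy forward selection of k features (illustrative sketch).

    map_fn evaluates all candidate subsets of one round; any function
    with the signature of the built-in map() can be plugged in.
    """
    selected = []
    remaining = list(range(n_features))
    while remaining and len(selected) < k:
        # One candidate per remaining feature: current subset plus that feature.
        candidates = [tuple(selected + [f]) for f in remaining]
        # Candidates are independent, so they can be scored in parallel.
        scores = list(map_fn(score_fn, candidates))
        best_idx = max(range(len(candidates)), key=scores.__getitem__)
        selected.append(remaining.pop(best_idx))
    return selected


# Hypothetical stand-in for a cross-validated model score.
WEIGHTS = [0.1, 0.9, 0.3, 0.7]


def toy_score(subset):
    return sum(WEIGHTS[i] for i in subset)


with ThreadPoolExecutor(max_workers=4) as pool:
    result = forward_select(len(WEIGHTS), 2, toy_score, map_fn=pool.map)

print(result)
```

The rounds themselves stay sequential, because each round depends on the feature chosen in the previous one; only the scoring inside a round is distributed.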