Hello Farhad,
The expensive part of almost all dadi analyses is optimization, and you typically need multiple optimization runs to ensure convergence. So the easy way to parallelize is to run those optimizations independently and then compare the results.
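Concretely, this is a multi-start pattern: launch independent runs from perturbed starting points and keep the best fit. Here is a minimal sketch of that pattern using scipy with a toy objective standing in for dadi's likelihood surface (the function names and the perturbation scheme here are illustrative, not dadi's API); on a cluster, each run would typically be its own job.

```python
import numpy as np
from scipy.optimize import minimize

def neg_loglik(params):
    """Toy stand-in for a dadi negative log-likelihood surface."""
    x, y = params
    return (x - 3.0) ** 2 + (y + 1.0) ** 2

def one_run(seed, p0=(1.0, 1.0), fold=1.0):
    """One optimization from a randomly perturbed start.

    In a cluster setting, each call would be a separate, independent job,
    and only (fun, x) would need to be collected afterward.
    """
    rng = np.random.default_rng(seed)
    # Multiplicative perturbation of the initial guess.
    start = np.asarray(p0) * np.exp(rng.uniform(-fold, fold, size=len(p0)))
    res = minimize(neg_loglik, start, method="Nelder-Mead")
    return res.fun, res.x

# Run several independent optimizations and keep the best result.
results = [one_run(seed) for seed in range(8)]
best_fun, best_x = min(results, key=lambda r: r[0])
print(best_fun, best_x)
```

Comparing the best few runs (rather than just taking the single minimum) is also a useful convergence check: if the top results agree on parameter values, the optimization has likely converged.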
We’re in the process of developing a framework for making this easier, including using cloud computing resources. But if you’re cluster-savvy, it’s not too hard to roll your own.
The key CUDA function dadi relies on is gtsv2StridedBatch (from cuSPARSE). I’m not sure which version of the toolkit it was added in. I know it works in version 10, but I’m not sure how far back it goes.
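One quick way to check whether your installed toolkit provides it is to grep the cuSPARSE header for the symbol (the header path below assumes a default Linux install and may differ on your system):

```shell
# Assumed default CUDA install path; adjust for your cluster's module setup.
HDR=/usr/local/cuda/include/cusparse.h
if [ -f "$HDR" ] && grep -q gtsv2StridedBatch "$HDR"; then
  msg="gtsv2StridedBatch found in $HDR"
else
  msg="cusparse.h not found or symbol missing; check your toolkit version (nvcc --version)"
fi
echo "$msg"
```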
Best,
Ryan