Hi,
First of all, thank you for such a great module.
I'm using Luigi to parallelize my CellProfiler pipelines for analyzing High Throughput Screening images. My pipeline is quite simple and consists of these tasks:
step 1)
create_metadata
calculate_illumination_pattern
step 2)
calculate_well_features
step 3)
calculate_profile
The problem is that my image dataset has 1340 wells, so my calculate_profile task requires 1340 calculate_well_features tasks to finish first. This StackOverflow answer suggests creating a single task that uses multiprocessing.Pool when there are more than 1k jobs to schedule with Luigi.
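To make the question concrete, here is a minimal sketch of how I understand that suggestion: one function does all the per-well work through multiprocessing.Pool instead of 1340 separate Luigi tasks. The function bodies are placeholders, not my real CellProfiler calls, and the names N_WELLS and run_all_wells, the integer well IDs, and the "n_objects" feature are made up for illustration.

```python
from multiprocessing import Pool

N_WELLS = 1340  # number of wells in my dataset


def calculate_well_features(well_id):
    """Placeholder for the per-well step (the real one would run CellProfiler)."""
    # Hypothetical feature: a fake object count derived from the well ID.
    return well_id, {"n_objects": well_id % 7}


def calculate_profile(per_well):
    """Placeholder for the final step: average one feature over all wells."""
    return sum(f["n_objects"] for f in per_well.values()) / len(per_well)


def run_all_wells(processes=4):
    """Process every well in a worker pool and collect results by well ID."""
    with Pool(processes=processes) as pool:
        # chunksize batches wells so each worker gets several at a time.
        results = pool.map(calculate_well_features, range(N_WELLS), chunksize=50)
    return dict(results)


if __name__ == "__main__":
    features = run_all_wells()
    print(len(features), calculate_profile(features))
```

In this version Luigi would only see one task wrapping run_all_wells, so I would lose per-well completion tracking and would have to handle re-running failed wells myself.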
I have two questions. First, does Luigi really have a problem scheduling more than 1k tasks? Second, what is the best practice for structuring a workflow like mine in Luigi?
This is my first post on Google Groups, so I'm looking forward to your feedback.
Best,
Nahal