Dear Luis,
Previously I was using ngless (v 1.1.1) and the collect function was working fine. Now that I am using a new cluster system, I had to install the new version of ngless (v 1.2).
Using the same script, I get the following error message:
[Mon 22-11-2021 08:13:41]: The collect() call at line 43 could not be executed as there are partial results missing.
When you use the parallel module and the collect() function,
you typically need to run ngless *multiple times* (once per sample)!
And here's the script I used:
#!/usr/bin/env ngless
ngless "1.2"
import "parallel" version "1.0"
import "mocat" version "1.0"
import "motus" version "0.1"
import "igc" version "1.0"
samples = readlines('/scratch/e1376a01/healthy_data/04_list')
sample = lock1(samples)
input = load_mocat_sample(sample)
pd_mapped = map(input, reference='igc', mode_all=True)
pd_mapped_post = select(pd_mapped) using |mr|:
    mr = mr.filter(min_match_size=45, min_identity_pc=95, action={drop})
    if not mr.flag({mapped}):
        discard
counts_raw_KOs = count(pd_mapped_post, features=['KEGG_KOs'], normalization={raw})
collect(counts_raw_KOs, current=sample, allneeded=samples, ofile='pd_igc.KO_profiles.raw.tsv')
Could this be an issue with the new version?
By the way, I recently installed ngless using conda; however, it installed version 1.2 instead of 1.3.
Thank you.
Jane