GMGC split pipeline

ullo...@googlemail.com

Mar 16, 2021, 6:33:19 AM
to NGLess

Dear all,
I tried to split my GMGC pipeline into separate map and count steps to save memory and runtime, but I am running into several errors.
First, I preprocessed all samples, mapped them to GMGC, and stored the results as a BAM file:

```
gmgc_mapped = map (non_human_reads, reference='gmgc:no-rare',mode_all=True,block_size_megabases=370000)
gmgc_mapped_post = select(gmgc_mapped) using |mr|:
    mr = mr.filter(min_match_size=45, min_identity_pc=95, action={drop})
    if not mr.flag({mapped}):
        discard
write(gmgc_mapped_post,ofile=RESULTS</>'gmgc_prelim.bam')
```

Next, I read in only the BAM files and started separate annotation runs:

```
gmgc_mapped_post = samfile(current</>'gmgc_prelim.bam')
gmgc_counts_new = count(gmgc_mapped_post,
                    features=['seqname'],
                    normalization={raw},
                    multiple={all1})
collect(gmgc_counts_new,
        current=current,
        allneeded=samples,
        ofile='gmgc_geneabundance.all1.raw.txt')

```

Here the first error occurs:

```
Exiting after fatal error while loading and running script
Data Error (the input data did not conform to NGLess' expectations)
SAM file does not contain the right number of tokens (line: 29361628)
```
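One way to look at the reported line is sketched below, as a rough check rather than a reproduction of what NGLess does internally. It relies on the SAM format rule that every alignment line carries at least 11 tab-separated mandatory fields, while header lines start with `@`. The function name `find_short_sam_lines` and the sample records are hypothetical, and the sketch assumes the BAM file has first been converted to text SAM (for example with `samtools view -h`).

```python
# Minimal sketch: report SAM alignment lines that have fewer than the
# 11 mandatory tab-separated fields required by the SAM format.
# Assumption: input is text SAM (not binary BAM), one record per line.

SAM_MANDATORY_FIELDS = 11  # QNAME FLAG RNAME POS MAPQ CIGAR RNEXT PNEXT TLEN SEQ QUAL


def find_short_sam_lines(lines, limit=10):
    """Return up to `limit` (line_number, field_count) pairs for short lines."""
    bad = []
    for lineno, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if line.startswith("@"):
            continue  # header lines are not alignment records
        n_fields = len(line.split("\t"))
        if n_fields < SAM_MANDATORY_FIELDS:
            bad.append((lineno, n_fields))
            if len(bad) >= limit:
                break
    return bad


# Hypothetical in-memory example: one header, one valid record, one truncated record.
sample = [
    "@HD\tVN:1.6\n",
    "r1\t0\tgeneA\t1\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
    "r2\t0\tgeneA\t1\t60\n",
]
print(find_short_sam_lines(sample))
```

On a real file, the same function can be applied to an open file handle, since it only iterates over lines. A truncated record near the end of the file would suggest that the BAM was not written completely.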

And for each of the BRITE, KEGG, and eggNOG OG annotations:

```
Exiting after fatal error while loading and running script
Script Error (there is likely an error in your script)
For counting, you must do one of
1. use seqname mode
2. pass in a GFF file using the argument 'gff_file'
3. pass in a gene map using the argument 'functional_map'

```

Do I need to pass the maps even though I use gmgc (`local import "gmgc" version "1.0"`)?

Thanks in advance!
Ulrike