GMGC split pipeline


ullo...@googlemail.com

Mar 16, 2021, 6:33:19 AM
to NGLess

Dear all,
I tried to split the map and count steps of my GMGC pipeline to save memory and runtime, but I ran into two different errors.
First I preprocessed all samples, mapped them to GMGC, and stored the results as BAM files:

```
gmgc_mapped = map (non_human_reads, reference='gmgc:no-rare',mode_all=True,block_size_megabases=370000)
gmgc_mapped_post = select(gmgc_mapped) using |mr|:
    mr = mr.filter(min_match_size=45, min_identity_pc=95, action={drop})
    if not mr.flag({mapped}):
        discard
write(gmgc_mapped_post,ofile=RESULTS</>'gmgc_prelim.bam')
```

Next, I read in only the BAM files and started the separate annotations:

```
gmgc_mapped_post = samfile(current</>'gmgc_prelim.bam')
gmgc_counts_new = count(gmgc_mapped_post,
                    features=['seqname'],
                    normalization={raw},
                    multiple={all1})
collect(gmgc_counts_new,
        current=current,
        allneeded=samples,
        ofile='gmgc_geneabundance.all1.raw.txt')

```

Here the first error occurs:

```
Exiting after fatal error while loading and running script
Data Error (the input data did not conform to NGLess' expectations)
SAM file does not contain the right number of tokens (line: 29361628)
```

And for each of BRITE, KEGG, and EGGnogOGs, I get:

```
Exiting after fatal error while loading and running script
Script Error (there is likely an error in your script)
For counting, you must do one of
1. use seqname mode
2. pass in a GFF file using the argument 'gff_file'
3. pass in a gene map using the argument 'functional_map'

```

Do I need to pass the maps even though I use gmgc (`local import "gmgc" version "1.0"`)?

Thanks in advance!
Ulrike

ullo...@googlemail.com

May 13, 2022, 6:38:12 PM
to NGLess
If anyone else ever runs into this SAM/BAM issue, it seems to be related to https://github.com/samtools/samtools/issues/661 and https://github.com/merenlab/anvio/issues/1479
samtools needs to be fixed in the NGLess environment. I do not know why samtools suddenly broke in my environment, but this might be helpful. The warning I got was:
Line 22: Message from samtools (stderr): [W::sam_hdr_sanitise] Missing trailing newline on SAM header. Possibly truncated
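
In case it helps with diagnosing the "right number of tokens" error: below is a minimal Python sketch (not part of NGLess or samtools) that scans SAM text for alignment lines with fewer than the 11 mandatory tab-separated fields. The function name `find_malformed_sam_lines` and the `max_report` parameter are made up for illustration, and the line numbers it reports may not match the one NGLess prints if the two count header lines differently.

```python
# Number of mandatory tab-separated fields in a SAM alignment line:
# QNAME, FLAG, RNAME, POS, MAPQ, CIGAR, RNEXT, PNEXT, TLEN, SEQ, QUAL
SAM_MANDATORY_FIELDS = 11


def find_malformed_sam_lines(handle, max_report=10):
    """Return (line_number, field_count) for suspicious alignment lines.

    `handle` is any iterable of text lines (an open file, sys.stdin, a list).
    Header lines (starting with '@') are skipped. Blank lines are reported
    with a field count of 0. Scanning stops after `max_report` hits.
    """
    problems = []
    for line_number, line in enumerate(handle, start=1):
        line = line.rstrip("\r\n")
        if line.startswith("@"):
            continue  # header line, not an alignment record
        field_count = len(line.split("\t")) if line else 0
        if field_count < SAM_MANDATORY_FIELDS:
            problems.append((line_number, field_count))
            if len(problems) >= max_report:
                break
    return problems
```

One way to use it would be to pipe `samtools view -h gmgc_prelim.bam` into a small script that calls `find_malformed_sam_lines(sys.stdin)` and prints the result. If samtools itself is the broken part, as it seemed to be in my case, the output of that command is exactly what should be checked.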