Losing many contigs not reported in the contigs.reduced.fa.hetero.tsv file.

Joseph Sevigny

Dec 19, 2017, 11:20:33 AM

to Redundans

Hello,

I am very excited about redundans and I hope to use it consistently for all my projects ... if I can figure this out!

My unexpected results occur during the reduction phase of the pipeline. It seems that I am losing a bunch of content and losing all my smaller contigs.

Original Assembly QUAST results (contigs.fa)

# contigs (>= 0 bp)       46087
# contigs (>= 1000 bp)    23529
# contigs (>= 5000 bp)    11560
# contigs (>= 10000 bp)   6759
# contigs (>= 25000 bp)   1685
# contigs (>= 50000 bp)   221

After reduction (contigs.reduced.fa)

# contigs (>= 0 bp)       3907
# contigs (>= 1000 bp)    3907
# contigs (>= 5000 bp)    3907
# contigs (>= 10000 bp)   3907
# contigs (>= 25000 bp)   1685
# contigs (>= 50000 bp)   221

So it is basically removing all my contigs shorter than 10,000 bp, regardless of whether they are duplicated. Only one contig (~15,000 bp) shows up in the contigs.reduced.fa.hetero.tsv file.
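One way to see exactly which contigs were dropped, and how long they were, is to compare the two FASTA files directly. The sketch below is a minimal example and not part of Redundans: it assumes that contig IDs (the first word of each FASTA header) are unchanged between contigs.fa and contigs.reduced.fa, and the function names are made up for illustration. It counts the dropped contigs at the same cumulative length thresholds that QUAST reports.

```python
import sys


def read_fasta_lengths(handle):
    """Return {contig_id: sequence_length} for an open FASTA file."""
    lengths = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the contig ID.
            name = line[1:].split()[0]
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def dropped_by_threshold(original, reduced,
                         thresholds=(0, 1000, 5000, 10000, 25000, 50000)):
    """Count contigs in `original` but not in `reduced` with length >= each threshold."""
    # Assumption: a contig kept by the reduction step keeps its original ID.
    dropped = [n for name, n in original.items() if name not in reduced]
    return {t: sum(1 for n in dropped if n >= t) for t in thresholds}


if __name__ == "__main__" and len(sys.argv) == 3:
    # Expected arguments: original FASTA, then reduced FASTA.
    with open(sys.argv[1]) as fh:
        original = read_fasta_lengths(fh)
    with open(sys.argv[2]) as fh:
        reduced = read_fasta_lengths(fh)
    for threshold, count in dropped_by_threshold(original, reduced).items():
        print(f"# dropped contigs (>= {threshold} bp)\t{count}")
```

Saved under a hypothetical name such as dropped_contigs.py, it would be run as `python dropped_contigs.py contigs.fa contigs.reduced.fa`. Comparing the counts with the entries in contigs.reduced.fa.hetero.tsv shows how many removals are not accounted for there.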

Here are some BUSCO results showing the loss of content.

BUSCO original:

554  Complete BUSCOs (C)
504  Complete and single-copy BUSCOs (S)
50   Complete and duplicated BUSCOs (D)
118  Fragmented BUSCOs (F)
306  Missing BUSCOs (M)
978  Total BUSCO groups searched

BUSCO contigs.reduced.fa

389  Complete BUSCOs (C)
369  Complete and single-copy BUSCOs (S)
20   Complete and duplicated BUSCOs (D)
92   Fragmented BUSCOs (F)
497  Missing BUSCOs (M)
978  Total BUSCO groups searched

Changing the options does not change this observation. Here is my typical run:

redundans.py --identity 0.6 -i $forward $reverse -f $contigs -o redundans_out_60 -t 24

If it matters, I am currently working with a nematode that has a ~150 Mb genome, using Illumina 250 bp reads. I am working with redundans-0.13c.

I appreciate any feedback.

Happy Holidays,

Joseph

l.p.p...@gmail.com

Dec 19, 2017, 11:35:15 AM

to Joseph Sevigny, Redundans

Hi Joseph,

This is peculiar. Can you send me the Redundans log, the output of the reduction (contigs.reduced.fa.hetero.tsv), and the identity histogram (contigs.reduced.fa.hist.png)?

Bests,

L.

--
You received this message because you are subscribed to the Google Groups "Redundans" group.
To unsubscribe from this group and stop receiving emails from it, send an email to redundans+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/redundans/e5be1f04-2895-400a-8284-95553dcead3e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Joseph Sevigny

Dec 19, 2017, 12:03:04 PM

to Redundans

Thank you for the quick reply. Attached are the files.

Thank you.

- Joseph

contigs.reduced.fa.hetero.tsv

contigs.reduced.fa.hist.png

log.txt

Joseph Sevigny

Dec 19, 2017, 2:21:25 PM

to Redundans

Also, I can send you the raw contig file if you want to run it through on your end to rule out any dependency or hardware issues. Just let me know the best secure way to send it.

- Joseph

On Tuesday, December 19, 2017 at 11:35:15 AM UTC-5, lpryszcz wrote:

Joseph Sevigny

Dec 22, 2017, 8:25:33 AM

to Redundans

Just wanted to give a quick update: while I still haven't found the cause, this behavior seems to be isolated to this particular assembly. The rest of my projects run as expected.
