Error in abyss-map


Bret Boyd

Nov 19, 2014, 5:18:22 PM
to abyss...@googlegroups.com

Hello

I was hoping you could help me with a problem I am encountering in abyss-map. I keep getting errors with the .dist and .hist files. The relevant log information is pasted below.

 

I have paired-end Illumina HiSeq libraries. Some libraries have 100 bp reads and others have 160 bp reads. I am trying to assemble a few genes of interest from the libraries. I used BLAST to find /1 reads that are similar to my genes of interest, used a script to pull the corresponding /2 reads, and tried to assemble these reads in ABySS. For some genes ABySS works just fine and I get an assembly. For other genes I get one of the two errors below. Also, for some genes where I hit an error when assembling reads from one library, I do not when assembling reads from another library.

 

Could this have something to do with input file format or read type?  Or something else?

 

Usage: abyss-pe v=-v k=32 name=myname lib='mype' mype='read1.fasta read2.fasta'

 

Copied from the log, two different runs with ABySS:

 

abyss-map -v -j1 -l32    read1.fasta read2.fasta myname-3.fa \

                        |abyss-fixmate -v -l32  -h mype-3.hist \

                        |sort -snk3 -k4 \

                        |DistanceEst -v -j1 -k32 -l32 -s50 -n1   -o mype-3.dist mype-3.hist

Reading from standard input...

Reading `myname-3.fa'...

warning: the seed-length should be at least twice k: k=32, s=50

Using 442 kB of memory and 1.11e+05 B/sequence.

Reading `myname-3.fa'...

Building the suffix array...

Building the Burrows-Wheeler transform...

Building the character occurrence table...

Read 737 B in 4 contigs.

Using 442 kB of memory and 600 B/bp.

Mapped 312 of 428 reads (72.9%)

Mapped 312 of 428 reads uniquely (72.9%)

Read 428 alignments

Mateless     0

Unaligned    0

Singleton  116  54.2%

FR           0

RF           0

FF           0

Different   98  45.8%

Total      214

error: the histogram `mype-3.hist' is empty

make: *** [mype-3.dist] Error 1

make: *** Deleting file `mype-3.dist'

 

 

 

abyss-map -v -j2 -l20    read1.fasta read2.fasta myname-3.fa \

                        |abyss-fixmate -v -l20  -h mype-3.hist \

                        |sort -snk3 -k4 \

                        |DistanceEst -v -j2 -k20 -l20 -s300 -n10   -o mype-3.dist mype-3.hist

Reading from standard input...

Reading `myname-3.fa'...

Using 442 kB of memory and 2.21e+05 B/sequence.

Reading `myname-3.fa'...

Building the suffix array...

Building the Burrows-Wheeler transform...

Building the character occurrence table...

Read 918 B in 2 contigs.

Using 442 kB of memory and 482 B/bp.

Mapped 412 of 462 reads (89.2%)

Mapped 412 of 462 reads uniquely (89.2%)

Read 462 alignments

Mateless     0

Unaligned    0

Singleton   50  21.6%

FR         181  78.4%

RF           0

FF           0

Different    0

Total      231

FR Stats mean: 362.8 median: 358 sd: 32.86 n: 169 min: 305 max: 473 ignored: 12

_▁_ ▅▂▂_ ▂▂▅▃▃▁█▁__▁▁▁_▁ ▁ ▁▂▂▁ _

Mate orientation FR: 181 (100%) RF: 0 (0%)

The library mype-3.hist is oriented forward-reverse (FR).

Stats mean: 362.8 median: 358 sd: 32.86 n: 169 min: 305 max: 473

_▁_ ▅▂▂_ ▂▂▅▃▃▁█▁__▁▁▁_▁ ▁ ▁▂▂▁ _

Minimum and maximum distance are set to -19 and 473 bp.

DistanceEst: DistanceEst.cpp:537: int main(int, char**): Assertion `in' failed.

/bin/bash: line 3: 42729 Done                    abyss-map -v -j2 -l20 file1.fasta file2.fasta bob.2.1-3.fa

     42730                       | abyss-fixmate -v -l20 -h mype-3.hist

     42731                       | sort -snk3 -k4

     42732 Aborted                 (core dumped) | DistanceEst -v -j2 -k20 -l20 -s300 -n10 -o mype-3.dist mype-3.hist

make: *** [mype-3.dist] Error 134

make: *** Deleting file `mype-3.dist'

 

Here are the first two records from each of my read files.

file1:

 

>DBRHHJN1_0173:3:1:16451:104950#ACAGTG/1

TTAACGTGAGTTTTCTTTTTAGCTATTCTTTGATTATTACATAATAAATACATTTTCACGTAGGGATCTGAAAAAATTTATTATTCTATTATCGAGTACG

>DBRHHJN1_0173:3:21:15017:110900#ACAGTG/1

AAGTCACGGCAAATATGCCTACGAAAAATGCCAATGCTGCCAAACAAATGCCAACTAACGCCGGAGTTGATACTGGAAAAAATATAAACAGATGTTTACT

file2: 

>DBRHHJN1_0173:3:1:16451:104950#ACAGTG/2

CAGCAAACAGATTAACTGTCGTCATACTCAAAGCTAGAAATTTGCCAAAAATGGACGTTACCGGTCTTGCAGGTACAAAT

>DBRHHJN1_0173:3:21:15017:110900#ACAGTG/2

ATTTTTCCCTTCGTAAGAAAAAAAAAAAAATGAAAGAAATAATAAAACGTCGAAAAAAATTTTTTTTTTTTTACATGTAGTTAACAAAATACATGTAAAT

Anthony Raymond

Nov 19, 2014, 5:47:50 PM
to Bret Boyd, abyss...@googlegroups.com
Hi Bret,

I don't think this is a file format issue. Judging by the logs, abyss-fixmate was able to pair up all of the alignments it was given.

The two runs fail for different reasons. The first one fails because no read pairs map to the same contig. ABySS needs such pairs because it estimates the fragment size from their alignments. With no read pairs mapping to the same contig, as in this case, there is no histogram from which to estimate the fragment size, and the assembly stops.

The second run does generate a histogram, based on the 181 fragments that map to the same contig in FR orientation. But since no fragments map to different contigs, there is no new information for the assembly to work with, so it stops as well.
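Both situations can be read off the abyss-fixmate summary in the log. The sketch below assumes the summary layout shown in the logs above (a label, a count, and an optional percentage per line) and that the log has been saved to a file. `check_fixmate_summary` is a hypothetical helper name, not part of ABySS.

```shell
# Hypothetical helper (not part of ABySS): classify an abyss-fixmate
# summary read from stdin, assuming "Label  count  [percent]" lines.
check_fixmate_summary() {
    awk '
        # Count pairs whose mates landed on the same contig (any orientation).
        ($1 == "FR" || $1 == "RF" || $1 == "FF") && $2 ~ /^[0-9]+$/ { same += $2 }
        # Count pairs whose mates landed on different contigs.
        $1 == "Different" && $2 ~ /^[0-9]+$/ { diff += $2 }
        END {
            if (same + 0 == 0)
                print "no pairs on the same contig: histogram will be empty"
            else if (diff + 0 == 0)
                print "no pairs on different contigs: no linking information"
            else
                print "ok: " same " same-contig pairs, " diff " different-contig pairs"
        }
    '
}
```

With the log saved as, say, `abyss.log` (an assumed file name), `check_fixmate_summary < abyss.log` would report the first case for the first run above and the second case for the second run.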

Even though the assemblies fail to produce a contigs.fa or scaffolds.fa, you could still use the sequences in the unitigs.fa. That said, I think you'll have better luck if you pull in more reads for some of the genes you are trying to assemble. Maybe you could select both /1 and /2 reads and then extract their mates? You could also select reads from the flanking sequences.
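One way to pull in both mates is sketched below. It assumes the BLAST hit IDs (from searches against both the /1 and the /2 reads) have been collected one per line in a file, and that read names end in /1 or /2 as in the records shown earlier. `extract_pairs` and the file names in the comments are hypothetical.

```shell
# Hypothetical helper: print every FASTA record whose base read name
# (header minus the leading ">" and a trailing /1 or /2) is in an ID list.
# usage: extract_pairs ids.txt reads.fasta
extract_pairs() {
    awk '
        function base(s) { sub(/^>/, "", s); sub(/\/[12]$/, "", s); return s }
        # First file: remember the base name of every hit ID.
        FILENAME == ARGV[1] { if ($1 != "") want[base($1)] = 1; next }
        # Second file: decide at each header whether to keep the record.
        /^>/ { keep = (base($1) in want) }
        keep
    ' "$1" "$2"
}

# Example calls (file names are placeholders):
#   extract_pairs hit_ids.txt read1.fasta > sel1.fasta
#   extract_pairs hit_ids.txt read2.fasta > sel2.fasta
```

Because the same ID list is applied to both files and records are printed in input order, the two outputs stay in matching order as long as the input files do.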

Cheers,
Tony


Bret Boyd

Nov 20, 2014, 5:24:46 PM
to abyss...@googlegroups.com, bmbo...@gmail.com
Hi Tony

Thank you for the prompt and informative reply! Based on your reply, I reran ABySS with the same reads as single-end reads and got assemblies for my genes. This seems to be working just fine.
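The exact command for the single-end rerun is not shown here. A minimal sketch, assuming the reads are passed through the `se` parameter of abyss-pe instead of a paired library, might look like the following; `assemble_single_end` is a hypothetical wrapper and the file names are the placeholders used earlier in the thread.

```shell
# Hypothetical wrapper around abyss-pe for a single-end run.
# $1 = k-mer size, $2 = output name, remaining arguments = read files.
assemble_single_end() {
    k=$1
    name=$2
    shift 2
    # se= takes a space-separated list of single-end read files,
    # so no fragment-size histogram from read pairs is needed.
    abyss-pe k="$k" name="$name" se="$*"
}

# Example call (placeholders from the original usage line):
#   assemble_single_end 32 myname read1.fasta read2.fasta
```

Treating the reads as single-end gives up the pairing information, which is acceptable here because the goal is short gene assemblies rather than scaffolds.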

Thanks again for your help,
Bret