Shuiquan
May 15, 2012, 4:41:42 PM
to ABySS
I keep getting the same error with ABySS 1.3.3:
error: the histogram `pe1-3.hist' is empty
make: *** [pe1-3.dist] Error 1
make: *** Deleting file `pe1-3.dist'
The program works fine on other sequencing data, so I think the problem
is in the raw data. Maybe ABySS does not recognize the read identifiers?
They look OK to me, though: they end with :1 and :2. Moreover,
read-mapping tests confirm that the raw data are definitely paired.
The following test was run on a small subset of the original data.
The command: abyss-pe k=41 n=10 v=-v name=CF50_minor lib='pe1' pe1='Pair.fasta' &>abyss.log
The reads look like this:
>110110_ccCPB772:3:78:3213:15827:0:1
TATAATCATTAAAAGTTCCTGTAAATAAATACTAGGAACTTTTAGTATGTTATATTGACCTTAATCTAAATATAGT
>110110_ccCPB772:3:78:3213:15827:0:2
AAATTTCTAAAGCTTGTTCTCCGGTATCTGGTTGAGATACTATTAGATTATCTATATCTACACCTAATGCTTTAGC
>110110_ccCPB772:3:4:19035:7244:0:1
TCTTATATTGAATAAACTTCATATATGCATATAAAAAAGCAACCATTTGAGATATAACTGTAGCCACAGCTGCTCC
>110110_ccCPB772:3:4:19035:7244:0:2
AAATGGAAACGGGCAATTTTTAACAGAAAACATATGGAAGCTATTATTAAGATTTTCTATACCAGCCATACTTTCG
>110110_ccCPB772:3:98:17808:9960:0:1
TGGCTGTATACCTATACCTCTTAATTCTTTAACTGAATGTTGAGTAGGCTTGGTTTTCAATTCTCCTGATTTTTTT
>110110_ccCPB772:3:98:17808:9960:0:2
TTTATGAATACAAAATATATATTTGTAACAGGGGGAGTAGTATCTTCATTAGGAAAGGGAATAACAGCTGCTTCAT
.....
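If the identifiers are the problem, one way to test it would be to rewrite the trailing :1 and :2 as /1 and /2 and rerun the small subset. This is only a minimal sketch. It assumes that ABySS pairs reads by /1 and /2 suffixes, which is my reading of the README and not something I have confirmed for 1.3.3. The function names slash_suffix and convert are made up for illustration.

```python
import re

# Assumption: mates are marked by a trailing ":1" or ":2" on the header line,
# as in the reads shown above.
_MATE_SUFFIX = re.compile(r":([12])$")


def slash_suffix(header: str) -> str:
    """Rewrite a trailing ':1' / ':2' on a FASTA header to '/1' / '/2'."""
    return _MATE_SUFFIX.sub(r"/\1", header.rstrip("\n"))


def convert(lines):
    """Yield FASTA lines with converted headers; sequence lines pass through."""
    for line in lines:
        line = line.rstrip("\n")
        yield slash_suffix(line) if line.startswith(">") else line
```

For example, calling convert on an open handle for the input file and writing each yielded line to a new file gives a copy whose headers end in /1 and /2. If the Mateless count in the abyss-fixmate summary then drops below 100%, the identifier format was the cause.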
The log:
which: no mpirun in (/usr/local/amos-3.0.0/bin:/usr/local/MUMmer3.22:/usr/local/qt/bin:/usr/lib64/qt-3.3/bin:/usr/NX/bin:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/bin:/home/shuiquan/bin)
ABYSS -k41 -q3 -v --coverage-hist=coverage.hist -s CF50_minor-bubbles.fa -o CF50_minor-1.fa Pairs.fasta
ABySS 1.3.3
ABYSS -k41 -q3 -v --coverage-hist=coverage.hist -s CF50_minor-bubbles.fa -o CF50_minor-1.fa Pairs.fasta
Reading `Pairs.fasta'...
Read 100000 reads. Hash load: 3155366 / 4355707 = 0.724 using 191 MB
Read 200000 reads. Hash load: 6016173 / 8844859 = 0.68 using 367 MB
Read 300000 reads. Hash load: 8643483 / 8844859 = 0.977 using 496 MB
Read 388401 reads. Hash load: 10801889 / 17961079 = 0.601 using 674 MB
`Pairs.fasta': discarded 11599 reads shorter than 41 bases
Loaded 10801889 k-mer
Hash load: 10801889 / 11200489 = 0.964 using 620 MB
Minimum k-mer coverage is 23
Coverage: 23 Reconstruction: 194
Coverage: 9.59 Reconstruction: 5189
Coverage: 3.32 Reconstruction: 392229
Coverage: 1.73 Reconstruction: 1896724
Coverage: 1.41 Reconstruction: 10801889
Coverage: 1 Reconstruction: 10801889
Using a coverage threshold of 1...
The median k-mer coverage is 1
The reconstruction is 10801889
The k-mer coverage threshold is 1
Setting parameter e (erode) to 2
Setting parameter E (erodeStrand) to 0
Setting parameter c (coverage) to 2
Generating adjacency
Finding adjacent k-mer: 1000000
Finding adjacent k-mer: 2000000
Finding adjacent k-mer: 3000000
Finding adjacent k-mer: 4000000
Finding adjacent k-mer: 5000000
Finding adjacent k-mer: 6000000
Finding adjacent k-mer: 7000000
Finding adjacent k-mer: 8000000
Finding adjacent k-mer: 9000000
Finding adjacent k-mer: 10000000
Added 21098990 edges.
Eroding tips
Eroded 8686217 tips.
Eroded 0 tips.
Hash load: 2115672 / 2144977 = 0.986 using 548 MB
Pruning tips shorter than 1 bp...
Removed 1211 marked k-mer.
Pruned 1211 k-mer in 1211 tips.
Pruning tips shorter than 2 bp...
Removed 2634 marked k-mer.
Pruned 2634 k-mer in 2500 tips.
Pruning tips shorter than 4 bp...
Removed 8497 marked k-mer.
Pruned 8497 k-mer in 4701 tips.
Pruning tips shorter than 8 bp...
Removed 32375 marked k-mer.
Pruned 32375 k-mer in 9796 tips.
Pruning tips shorter than 16 bp...
Removed 137242 marked k-mer.
Pruned 137242 k-mer in 21569 tips.
Pruning tips shorter than 32 bp...
Removed 543936 marked k-mer.
Pruned 543936 k-mer in 44304 tips.
Pruning tips shorter than 41 bp...
Removed 584952 marked k-mer.
Pruned 584952 k-mer in 32717 tips.
Pruning tips shorter than 41 bp...
Removed 129 marked k-mer.
Pruned 129 k-mer in 15 tips.
Pruning tips shorter than 41 bp...
Pruned 116813 tips in 8 rounds.
Hash load: 804696 / 834181 = 0.965 using 555 MB
Marked 1057 edges of 499 ambiguous vertices.
Removing low-coverage contigs (mean k-mer coverage < 2)
Found 804440 k-mer in 10801 contigs before removing low-coverage contigs.
Removed 314308 k-mer in 4104 low-coverage contigs.
Split 577 ambigiuous branches.
Hash load: 490388 / 520241 = 0.943 using 555 MB
Eroding tips
Eroded 60 tips.
Eroded 0 tips.
Hash load: 490328 / 520241 = 0.943 using 555 MB
Pruning tips shorter than 1 bp...
Removed 47 marked k-mer.
Pruned 47 k-mer in 47 tips.
Pruning tips shorter than 2 bp...
Removed 67 marked k-mer.
Pruned 67 k-mer in 55 tips.
Pruning tips shorter than 4 bp...
Removed 128 marked k-mer.
Pruned 128 k-mer in 60 tips.
Pruning tips shorter than 8 bp...
Removed 199 marked k-mer.
Pruned 199 k-mer in 63 tips.
Pruning tips shorter than 16 bp...
Removed 255 marked k-mer.
Pruned 255 k-mer in 44 tips.
Pruning tips shorter than 32 bp...
Removed 215 marked k-mer.
Pruned 215 k-mer in 17 tips.
Pruning tips shorter than 41 bp...
Removed 72 marked k-mer.
Pruned 72 k-mer in 3 tips.
Pruning tips shorter than 41 bp...
Pruned 289 tips in 7 rounds.
Hash load: 489345 / 520241 = 0.941 using 555 MB
Popping bubbles
Removed 10 bubbles.
Removed 10 bubbles
Marked 229 edges of 108 ambiguous vertices.
Left 256 unassembled k-mer in circular contigs.
Assembled 488620 k-mer in 6366 contigs.
Removed 10312544 k-mer.
The signal-to-noise ratio (SNR) is -13.2 dB.
AdjList -v -k41 -m30 CF50_minor-1.fa >CF50_minor-1.adj
Reading `CF50_minor-1.fa'...
Finding overlaps of exactly k-1 bp...
V=12732 E=335 E/V=0.0263
Degree: ?
01234
0: 98% 1: 0.98% 2-4: 0.7% 5+: 0% max: 4
Finding overlaps of fewer than k-1 bp...
V=12732 E=379 E/V=0.0298
Degree: ?
01234
0: 98% 1: 1.3% 2-4: 0.7% 5+: 0% max: 4
abyss-filtergraph -v -k41 -g CF50_minor-2.adj CF50_minor-1.adj >CF50_minor-1.path
Loading graph from file: CF50_minor-1.adj
Graph stats before:
V=12732 E=379 E/V=0.0298
Degree: ?
01234
0: 98% 1: 1.3% 2-4: 0.7% 5+: 0% max: 4
Removing shim contigs from the graph...
Pass 1: Checking 50 contigs.
Pass 2: Checking 8 contigs.
Shim removal stats:
Removed: 25 Too Complex: 31 Tails: 6308 Too Long: 2 Self Adjacent: 0 Parallel Edges: 0
Graph stats after:
V=12682 E=329 E/V=0.0259
Degree: ?
01234
0: 98% 1: 1.1% 2-4: 0.48% 5+: 0.032% max: 17
PopBubbles -v -j2 -k41 -p0.9 -g CF50_minor-3.adj CF50_minor-1.fa CF50_minor-2.adj >CF50_minor-2.path
Reading `CF50_minor-2.adj'...
V=12682 E=329 E/V=0.0259
Degree: ?
01234
0: 98% 1: 1.1% 2-4: 0.48% 5+: 0.032% max: 17
Reading `CF50_minor-1.fa'...
Bubbles: 3 Popped: 2 Scaffolds: 0 Complex: 0 Too long: 0 Too many: 0 Dissimilar: 1
V=12628 E=271 E/V=0.0215
Degree: ?
01234
0: 99% 1: 0.74% 2-4: 0.45% 5+: 0.032% max: 17
MergeContigs -v -k41 -o CF50_minor-3.fa CF50_minor-1.fa CF50_minor-2.adj CF50_minor-2.path
Reading `CF50_minor-2.adj'...
Read 12682 vertices. Using 1.77 MB of memory.
Reading `CF50_minor-1.fa'...
Read 6341 sequences. Using 3.31 MB of memory.
Reading `CF50_minor-2.path'...
Read 25 paths. Using 3.31 MB of memory.
The minimum coverage of single-end contigs is 2.
The minimum coverage of merged contigs is 2.05455.
Consider increasing the coverage threshold parameter, c, to 2.05455.
n n:200 n:N50 min N80 N50 N20 max sum
6314 237 49 200 223 331 1573 10022 89422 CF50_minor-3.fa
awk '!/^>/ {x[">" $1]=1; next} {getline s} $1 in x {print $0 "\n" s}' \
CF50_minor-2.path CF50_minor-1.fa >CF50_minor-indel.fa
ln -sf CF50_minor-3.fa CF50_minor-unitigs.fa
abyss-map -v -j2 -l41 Pairs.fasta CF50_minor-3.fa \
|abyss-fixmate -v -h pe1-3.hist \
|sort -snk3 -k4 \
|DistanceEst -v -j2 -k41 -l41 -s200 -n10 -o pe1-3.dist pe1-3.hist
Reading from standard input...
Reading `CF50_minor-3.fa'...
Reading `CF50_minor-3.fa'...
Building the suffix array...
Building the Burrows-Wheeler transform...
Building the character occurrence table...
Read 832 kB in 6314 contigs.
Using 8.22 MB of memory and 9.88 B/bp.
Read 3 alignments. Hash load: 3 / 5 = 0.6 using 135 kB.
Read 6 alignments. Hash load: 6 / 11 = 0.545455 using 135 kB.
Read 12 alignments. Hash load: 12 / 23 = 0.521739 using 135 kB.
Read 24 alignments. Hash load: 24 / 47 = 0.510638 using 135 kB.
Read 48 alignments. Hash load: 48 / 97 = 0.494845 using 135 kB.
Read 98 alignments. Hash load: 98 / 199 = 0.492462 using 135 kB.
Read 200 alignments. Hash load: 200 / 409 = 0.488998 using 135 kB.
Read 410 alignments. Hash load: 410 / 823 = 0.498177 using 135 kB.
Read 824 alignments. Hash load: 824 / 1741 = 0.473291 using 270 kB.
Read 1742 alignments. Hash load: 1742 / 3739 = 0.4659 using 569 kB.
Read 3740 alignments. Hash load: 3740 / 7517 = 0.497539 using 975 kB.
Read 7518 alignments. Hash load: 7518 / 15173 = 0.495485 using 2 MB.
Read 15174 alignments. Hash load: 15174 / 30727 = 0.493833 using 3.62 MB.
Read 30728 alignments. Hash load: 30728 / 62233 = 0.493757 using 7 MB.
Read 62234 alignments. Hash load: 62234 / 126271 = 0.492861 using 14 MB.
Read 126272 alignments. Hash load: 126272 / 256279 = 0.492713 using 28.4 MB.
Read 256280 alignments. Hash load: 256280 / 520241 = 0.492618 using 57.4 MB.
Mapped 46076 of 400000 reads (11.5%)
Mapped 46076 of 400000 reads uniquely (11.5%)
Read 400000 alignments
Mateless 400000 100%
Unaligned 0
Singleton 0
FR 0
RF 0
FF 0
Different 0
Total 400000
error: the histogram `pe1-3.hist' is empty
make: *** [pe1-3.dist] Error 1
make: *** Deleting file `pe1-3.dist'