RSEM error: alignment starts at a negative position


MP

Mar 8, 2016, 5:09:19 AM
to RSEM Users
This is the command I use:

mkdir -p ALIGNrsem/342 && /illumina/software/PROG/rsem-1.2.21/rsem-calculate-expression -p 5 --bowtie2 --paired-end \
    /illumina/runs/FASTQ/Analisi_febraio2016/ANALISIghr74/trim/342/342.trim.pair1.fastq.gz \
    /illumina/runs/FASTQ/Analisi_febraio2016/ANALISIghr74/trim/342/342.trim.pair2.fastq.gz \
    /illumina/software/database/Trasc_GH37_74/rsem_74/Homo_sapiens.GRCh37.74 \
    ALIGNrsem/342

This is the error I get:

rsem-run-em /illumina/software/database/Trasc_GH37_74/rsem_74/Homo_sapiens.GRCh37.74 3 ALIGNrsem/342 ALIGNrsem/342.temp/342 ALIGNrsem/342.stat/342 -p 5 -b b ALIGNrsem/342.temp/342.bam 0
Refs.loadRefs finished!
DAT 3000000 reads left
Thread 0 : N = 701628, NHit = 8501479
Thread 1 : N = 706114, NHit = 8501372
DAT 2000000 reads left
Thread 2 : N = 707876, NHit = 8501392
DAT 1000000 reads left
Thread 3 : N = 711553, NHit = 8501379
DAT 0 reads left
Thread 4 : N = 710478, NHit = 8501152
EM_init finished!
1000000 READS PROCESSED
2000000 READS PROCESSED
3000000 READS PROCESSED
4000000 READS PROCESSED
5000000 READS PROCESSED
6000000 READS PROCESSED
7000000 READS PROCESSED
8000000 READS PROCESSED
9000000 READS PROCESSED
10000000 READS PROCESSED
11000000 READS PROCESSED
12000000 READS PROCESSED
13000000 READS PROCESSED
14000000 READS PROCESSED
15000000 READS PROCESSED
16000000 READS PROCESSED
17000000 READS PROCESSED
18000000 READS PROCESSED
19000000 READS PROCESSED
20000000 READS PROCESSED
21000000 READS PROCESSED
22000000 READS PROCESSED
23000000 READS PROCESSED
24000000 READS PROCESSED
25000000 READS PROCESSED
26000000 READS PROCESSED
27000000 READS PROCESSED
28000000 READS PROCESSED
29000000 READS PROCESSED
30000000 READS PROCESSED
31000000 READS PROCESSED
32000000 READS PROCESSED
33000000 READS PROCESSED
34000000 READS PROCESSED
35000000 READS PROCESSED
36000000 READS PROCESSED
37000000 READS PROCESSED
estimateFromReads, N0 finished.
1000000 READS PROCESSED
2000000 READS PROCESSED
3000000 READS PROCESSED
estimateFromReads, N1 finished.
The alignment of fragment B0P8DQ1:71:HJM7YBCXX:1:2115:12203:51515 to transcript 196317 starts at -212584 from the forward direction, which should be a non-negative number! It is possible that the aligner you use gave different read lengths for a same read in SAM file.
Found unknown sequence letter # at function get_rbase_id!
"rsem-run-em /illumina/software/database/Trasc_GH37_74/rsem_74/Homo_sapiens.GRCh37.74 3 ALIGNrsem/342 ALIGNrsem/342.temp/342 ALIGNrsem/342.stat/342 -p 5 -b b ALIGNrsem/342.temp/342.bam 0" failed! Plase check if you provide correct parameters/options for the pipeline!
step 2 ERROR



What can I do?

Bo Li

Mar 8, 2016, 3:30:40 PM
to rsem-...@googlegroups.com
Hi MP,

Can you prepare a minimum data set that can trigger this error? Then I
can look into it.

Best,
Bo
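
One possible starting point for such a minimal data set is to pull the fragment named in the error message out of both FASTQ files. The sketch below is only an illustration: the function name `extract_fastq_record` and the output file names are hypothetical, it assumes standard 4-line FASTQ records, and a single fragment may or may not be enough to reproduce the failure.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the 4-line FASTQ record whose read name matches $1.
# Reads FASTQ text from stdin; assumes every record is exactly 4 lines.
extract_fastq_record() {
    local read_name=$1
    awk -v name="$read_name" '
        NR % 4 == 1 {
            id = substr($1, 2)        # first token of the header, without the leading "@"
            sub(/\/[12]$/, "", id)    # drop an optional /1 or /2 mate suffix
            keep = (id == name)
        }
        keep { print }                # print all four lines of a matching record
    '
}

# Example use (output file names are placeholders):
#   gzip -dc 342.trim.pair1.fastq.gz | extract_fastq_record "B0P8DQ1:71:HJM7YBCXX:1:2115:12203:51515" > mini_1.fastq
#   gzip -dc 342.trim.pair2.fastq.gz | extract_fastq_record "B0P8DQ1:71:HJM7YBCXX:1:2115:12203:51515" > mini_2.fastq
```

If the single pair does not trigger the error, the same filter can be applied to a larger list of read names.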

Bo Li

Mar 10, 2016, 2:47:05 AM
to rsem-...@googlegroups.com
Hi MP,

RSEM aligns reads to a set of transcript sequences instead of the
genome. However, the reference name shown seems like a genome reference.
Can you check it for us?

Thanks,
Bo


MP

Jun 17, 2016, 3:54:29 AM
to RSEM Users
Thanks so much! Sorry for my late reply. I do use the transcript data.
Now I have found the same error with other samples. Please tell me what I can send you to resolve this.


Job start: Fri Jun 17 03:05:10 CEST 2016
step rsem_align.1._363 start: Fri Jun 17 03:05:10 CEST 2016
bowtie2 -q --phred33 --sensitive --dpad 0 --gbar 99999999 --mp 1,1 --np 1 --score-min L,0,-0.1 -I 1 -X 1000 --no-mixed --no-discordant -p 5 -k 200 -x /illumina/software/database/Trasc_GH37_74/rsem_74/Homo_sapiens.GRCh37.74 -1 trim/363/363.trim.pair1.fastq.gz -2 trim/363/363.trim.pair2.fastq.gz | samtools view -S -b -o ALIGNrsem/363.temp/363.bam -
[samopen] SAM header is present: 196317 sequences.
39435223 reads; of these:
  39435223 (100.00%) were paired; of these:
    31543381 (79.99%) aligned concordantly 0 times
    2652356 (6.73%) aligned concordantly exactly 1 time
    5239486 (13.29%) aligned concordantly >1 times
20.01% overall alignment rate

rsem-parse-alignments /illumina/software/database/Trasc_GH37_74/rsem_74/Homo_sapiens.GRCh37.74 ALIGNrsem/363.temp/363 ALIGNrsem/363.stat/363 b ALIGNrsem/363.temp/363.bam -t 3 -tag XM
Warning: The SAM/BAM file declares less reference sequences (196317) than RSEM knows (196483)!
Parsed 1000000 entries
Parsed 2000000 entries
Parsed 3000000 entries
Parsed 4000000 entries
Parsed 5000000 entries
Parsed 6000000 entries
Parsed 7000000 entries
Parsed 8000000 entries
Parsed 9000000 entries
Parsed 10000000 entries
Parsed 11000000 entries
Parsed 12000000 entries
Parsed 13000000 entries
Parsed 14000000 entries
Parsed 15000000 entries
Parsed 16000000 entries
Parsed 17000000 entries
Parsed 18000000 entries
Parsed 19000000 entries
Parsed 20000000 entries
Parsed 21000000 entries
Parsed 22000000 entries
Parsed 23000000 entries
Parsed 24000000 entries
Parsed 25000000 entries
Parsed 26000000 entries
Parsed 27000000 entries
Parsed 28000000 entries
Parsed 29000000 entries
Parsed 30000000 entries
Parsed 31000000 entries
Parsed 32000000 entries
Parsed 33000000 entries
Parsed 34000000 entries
Parsed 35000000 entries
Parsed 36000000 entries
Parsed 37000000 entries
Parsed 38000000 entries
Parsed 39000000 entries
Parsed 40000000 entries
Parsed 41000000 entries
Parsed 42000000 entries
Parsed 43000000 entries
Parsed 44000000 entries
Parsed 45000000 entries
Parsed 46000000 entries
Parsed 47000000 entries
Parsed 48000000 entries
Parsed 49000000 entries
Parsed 50000000 entries
Parsed 51000000 entries
Parsed 52000000 entries
Parsed 53000000 entries
Parsed 54000000 entries
Parsed 55000000 entries
Parsed 56000000 entries
Parsed 57000000 entries
Parsed 58000000 entries
Parsed 59000000 entries
Parsed 60000000 entries
Parsed 61000000 entries
Parsed 62000000 entries
Parsed 63000000 entries
Parsed 64000000 entries
Parsed 65000000 entries
Parsed 66000000 entries
Parsed 67000000 entries
Parsed 68000000 entries
Parsed 69000000 entries
Parsed 70000000 entries
Parsed 71000000 entries
Parsed 72000000 entries
Parsed 73000000 entries
Parsed 74000000 entries
Parsed 75000000 entries
Parsed 76000000 entries
Parsed 77000000 entries
Done!

rsem-build-read-index 32 1 0 ALIGNrsem/363.temp/363_alignable_1.fq ALIGNrsem/363.temp/363_alignable_2.fq
FIN 1000000
FIN 2000000
FIN 3000000
FIN 4000000
FIN 5000000
FIN 6000000
FIN 7000000
Build Index ALIGNrsem/363.temp/363_alignable_1.fq is Done!
FIN 1000000
FIN 2000000
FIN 3000000
FIN 4000000
FIN 5000000
FIN 6000000
FIN 7000000
Build Index ALIGNrsem/363.temp/363_alignable_2.fq is Done!

rsem-run-em /illumina/software/database/Trasc_GH37_74/rsem_74/Homo_sapiens.GRCh37.74 3 ALIGNrsem/363 ALIGNrsem/363.temp/363 ALIGNrsem/363.stat/363 -p 5 -b b ALIGNrsem/363.temp/363.bam 0
Refs.loadRefs finished!
DAT 7000000 reads left
Thread 0 : N = 1582291, NHit = 9161641
DAT 6000000 reads left
DAT 5000000 reads left
Thread 1 : N = 1584451, NHit = 9161715
DAT 4000000 reads left
Thread 2 : N = 1577005, NHit = 9161642
DAT 3000000 reads left
DAT 2000000 reads left
Thread 3 : N = 1576527, NHit = 9161642
DAT 1000000 reads left
DAT 0 reads left
Thread 4 : N = 1571568, NHit = 9161568
READS PROCESSED
estimateFromReads, N0 finished.

1000000 READS PROCESSED
2000000 READS PROCESSED
3000000 READS PROCESSED
4000000 READS PROCESSED
5000000 READS PROCESSED
6000000 READS PROCESSED
7000000 READS PROCESSED
estimateFromReads, N1 finished.
The alignment of fragment B0P8DQ1:73:HTKGNBCXX:1:1207:15100:69692 to transcript 196317 starts at -212563 from the forward direction, which should be a non-negative number! It is possible that the aligner you use gave different read lengths for a same read in SAM file.
Found unknown sequence letter  at function get_rbase_id!
"rsem-run-em  /illumina/software/database/Trasc_GH37_74/rsem_74/Homo_sapiens.GRCh37.74 3 ALIGNrsem/363 ALIGNrsem/363.temp/363 ALIGNrsem/363.stat/363 -p 5 -b b ALIGNrsem/363.temp/363.bam 0" failed! Plase check if you provide correct parameters/options for the pipeline!
step rsem_align.1._363 ERROR
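
The rsem-parse-alignments warning in the log above reports that the BAM header declares 196317 reference sequences while RSEM knows 196483. A rough way to check such a mismatch independently is to count the `@SQ` lines in the BAM header and the records in the reference FASTA. This is only a sketch: the function names are hypothetical, and the `.transcripts.fa` suffix in the usage comment is an assumption about which file rsem-prepare-reference wrote for this reference.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for comparing reference sequence counts.

# Count FASTA records (lines starting with ">") in the file given as $1.
count_fasta_records() {
    awk '/^>/ { n++ } END { print n + 0 }' "$1"
}

# Count @SQ lines in SAM header text read from stdin.
count_sq_lines() {
    awk '/^@SQ/ { n++ } END { print n + 0 }'
}

# Report whether the two counts agree; returns non-zero on a mismatch.
compare_counts() {
    local n_fasta=$1 n_sq=$2
    if [ "$n_fasta" -eq "$n_sq" ]; then
        echo "OK: $n_fasta sequences in both"
    else
        echo "MISMATCH: FASTA has $n_fasta, SAM header has $n_sq"
        return 1
    fi
}

# Example use (the .transcripts.fa file name is an assumption):
#   REF=/illumina/software/database/Trasc_GH37_74/rsem_74/Homo_sapiens.GRCh37.74
#   n_sq=$(samtools view -H ALIGNrsem/363.temp/363.bam | count_sq_lines)
#   n_fa=$(count_fasta_records "$REF.transcripts.fa")
#   compare_counts "$n_fa" "$n_sq"
```

A mismatch would suggest that the Bowtie 2 index and the RSEM reference files were not built in the same run, in which case rebuilding the reference is probably the first thing to try.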



Bo Li

Jun 17, 2016, 5:19:34 AM
to rsem-...@googlegroups.com
Hi MP,

Can you tell me which version of RSEM you are using?

Best,
Bo

MP

Jun 20, 2016, 4:25:13 AM
to RSEM Users
RSEM version 1.2.21

Bo Li

Jun 20, 2016, 4:39:19 AM
to rsem-...@googlegroups.com
Hi MP,

Can you update to the latest version, RSEM v1.2.31, and try again? I
remember encountering and fixing similar issues before. If
it does not work, just let me know.

Best,
Bo


David DeTomaso

Jun 22, 2016, 3:28:48 PM
to RSEM Users
I had this same issue show up this week, and I found it went away when I used a non-masked reference.  Originally, I had built my RSEM reference from a hard-masked fasta sequence.  I was getting this mysterious negative position as well as a very low alignment rate.  It might be good to have some sort of check for a hard-masked reference when running rsem-prepare-reference.  I'm unsure whether or not using a soft-masked reference would result in the same issue.
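
A quick way to see whether a FASTA file is hard-masked is to count how many of its bases are N. The sketch below is an illustration only: the function name `count_masked_bases` and the example file name are hypothetical, and what fraction of N counts as "too masked" is left to the reader.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "<N bases> <total bases>" for the FASTA file in $1.
# Header lines (starting with ">") are skipped; both "N" and "n" are counted.
count_masked_bases() {
    awk '
        /^>/ { next }                  # skip FASTA header lines
        {
            total += length($0)        # measure the line before gsub modifies it
            n += gsub(/[Nn]/, "")      # gsub returns the number of substitutions made
        }
        END { print n + 0, total + 0 }
    ' "$1"
}

# Example use (file name is a placeholder):
#   count_masked_bases my_reference.fa
```

A large share of N bases in the input sequences would be consistent with a hard-masked source. Soft masking uses lowercase letters instead and would not show up in this count.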

MP

Jul 15, 2016, 9:10:09 AM
to RSEM Users
The new version resolved that error. Thanks so much!