BSeQC_RPKM

Xueqiu Lin

Jun 21, 2014, 11:52:25 PM
to rseqc-...@googlegroups.com
Hi,

I am trying to run RPKM_saturation.py with the following command:

RPKM_saturation.py -i accepted_hits.bam -o WT_rep1_accepted_hits_rpkm_saturation -r RSeQC/mm9_NCBI37_Refseq.bed

The BAM file is paired-end and comes from TopHat mapping.

I get the following error:
Traceback (most recent call last):
  File "/home/xueqiul/anaconda/bin/RPKM_saturation.py", line 5, in <module>
    pkg_resources.run_script('RSeQC==2.3.9', 'RPKM_saturation.py')
  File "/home/xueqiul/anaconda/lib/python2.7/site-packages/distribute-0.6.10-py2.7.egg/pkg_resources.py", line 461, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/home/xueqiul/anaconda/lib/python2.7/site-packages/distribute-0.6.10-py2.7.egg/pkg_resources.py", line 1194, in run_script
    execfile(script_filename, namespace, namespace)
  File "/home/xueqiul/anaconda/lib/python2.7/site-packages/RSeQC-2.3.9-py2.7-linux-x86_64.egg/EGG-INFO/scripts/RPKM_saturation.py", line 172, in <module>
    main()
  File "/home/xueqiul/anaconda/lib/python2.7/site-packages/RSeQC-2.3.9-py2.7-linux-x86_64.egg/EGG-INFO/scripts/RPKM_saturation.py", line 159, in main
    show_saturation(infile=options.output_prefix + ".eRPKM.xls", outfile=options.output_prefix + ".saturation.r",rpkm_cut = options.rpkm_cutoff)
  File "/home/xueqiul/anaconda/lib/python2.7/site-packages/RSeQC-2.3.9-py2.7-linux-x86_64.egg/EGG-INFO/scripts/RPKM_saturation.py", line 118, in show_saturation
    norm_RPKM[head[i]].append(str(j))
IndexError: list index out of range

Thanks,
Xueqiu

Liguo Wang

Jun 23, 2014, 10:53:07 AM
to rseqc-...@googlegroups.com
In your WT_rep1_accepted_hits_rpkm_saturation.eRPKM.xls file, do you see a header line starting with '#'? Thanks.

Liguo


--
You received this message because you are subscribed to the Google Groups "rseqc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rseqc-discus...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

XueqiuLin

Jun 23, 2014, 11:01:44 AM
to rseqc-...@googlegroups.com
I see just one header line:
#chr start end name score strand 5% 10% 15% 20% 25% 30% 35% 40% 45% 50% 55% 60% 65% 70% 75% 80% 85% 90% 95% 100%

Thanks,
Xueqiu


Liguo Wang

Jun 23, 2014, 11:21:57 AM
to rseqc-...@googlegroups.com
That really confuses me. The file may be damaged; could you check whether every row in "WT_rep1_accepted_hits_rpkm_saturation.eRPKM.xls" has the same number of columns? Note that your results have already been written; the error you hit comes from the visualization step.

Or you could simply rerun it.
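
One way to run the column check suggested above is a short Python script. This is only a sketch: it assumes the .eRPKM.xls file is tab-delimited text, and the helper name `column_count_report` is made up for illustration and is not part of RSeQC.

```python
from collections import Counter


def column_count_report(lines, delimiter="\t"):
    """Count columns per data row and list rows that deviate.

    Returns (counts, odd_rows): counts maps a column count to the number
    of rows with that count; odd_rows holds (line_number, column_count)
    for every row that differs from the most common column count.
    Blank lines and header lines starting with '#' are skipped.
    """
    counts = Counter()
    widths = []
    for line_no, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        n_cols = len(line.split(delimiter))
        counts[n_cols] += 1
        widths.append((line_no, n_cols))
    if not counts:
        return counts, []
    expected = counts.most_common(1)[0][0]
    odd_rows = [(ln, n) for ln, n in widths if n != expected]
    return counts, odd_rows


# Tiny in-memory demo with made-up rows (assumption: tab-delimited).
demo_rows = [
    "#chr\tstart\tend\tname\tscore\tstrand\t50%\t100%\n",
    "chr1\t100\t200\tgeneA\t0\t+\t1.0\t1.1\n",
    "chr1\t300\t400\tgeneB\t0\t-\t2.0\t2.1\n",
    "chr1\t500\t600\tgeneC\t0\t+\t3.0\t3.1\t3.0\t3.1\n",
]
counts, odd_rows = column_count_report(demo_rows)
print(counts)
print(odd_rows)
```

To use it on the real output, pass an open file handle instead of the demo list, for example `column_count_report(open("WT_rep1_accepted_hits_rpkm_saturation.eRPKM.xls"))`. The line numbers in `odd_rows` then point at the rows worth inspecting.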

Thanks

Liguo

XueqiuLin

Jun 23, 2014, 12:15:28 PM
to rseqc-...@googlegroups.com
Three rows have 46 columns, while all the others have 26.
I got the same output when I reran it.

Thanks,
Xueqiu

Liguo Wang

Jun 23, 2014, 6:26:58 PM
to rseqc-...@googlegroups.com
If concatenating the first 6 columns (chrom, start, end, name, score, strand) of a BED entry does NOT make it unique, you will encounter this error. The temporary workaround is to remove such duplicate entries from your BED file.
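
For anyone applying this workaround, here is a minimal sketch of the de-duplication in Python. It assumes a whitespace-delimited BED file and keeps the first entry seen for each (chrom, start, end, name, score, strand) key. The function name `dedupe_bed` and the demo records are hypothetical and not part of RSeQC. As a side note, the header reported earlier has 6 BED columns plus 20 percentage columns, i.e. 26, so the 46-column rows are consistent with a second set of 20 values being appended for a repeated key (46 = 26 + 20), though that reading is an inference rather than something confirmed in this thread.

```python
def dedupe_bed(lines):
    """Split BED lines into (kept, dropped) by their first six columns.

    The first line seen for each (chrom, start, end, name, score, strand)
    key is kept; later lines with the same key are dropped. Blank lines and
    '#', 'track' and 'browser' lines are passed through unchanged.
    """
    seen = set()
    kept, dropped = [], []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith(("#", "track", "browser")):
            kept.append(line)
            continue
        # Assumption: fields are separated by tabs or spaces.
        key = tuple(stripped.split()[:6])
        if key in seen:
            dropped.append(line)
        else:
            seen.add(key)
            kept.append(line)
    return kept, dropped


# Made-up BED12 records: the first two share the same first six columns.
demo_bed = [
    "chr1\t100\t200\tNM_0001\t0\t+\t100\t200\t0\t1\t100,\t0,\n",
    "chr1\t100\t200\tNM_0001\t0\t+\t120\t180\t0\t1\t100,\t0,\n",
    "chr1\t100\t200\tNM_0002\t0\t+\t100\t200\t0\t1\t100,\t0,\n",
]
kept, dropped = dedupe_bed(demo_bed)
print(len(kept), "kept;", len(dropped), "dropped")
```

On a real annotation, read the lines from the BED file, write `kept` to a new file, and pass that new file to `-r`. Reviewing `dropped` first is worthwhile, since which of the duplicate entries should survive is a judgment call this sketch does not make.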

We will fix this bug.

Thanks

Liguo