Information needed in .bed file for geneBody_coverage.py

magnus jespersen

Jul 31, 2020, 12:02:07 AM
to rseqc-discuss
Hello,

I'm trying to produce a gene body coverage plot using geneBody_coverage.py. I work with a bacterium, and thus have to produce the gene model BED file myself (example attached). However, I cannot get the script to produce a gene body coverage plot.
I looked into the Python code and saw that the strand and gene name are used, so I added those to the .bed file. Am I missing something else in the .bed file?
I'm running RSeQC version 3.0.1.

Here is an example of what is printed to stdout when running:

geneBody_coverage.py -i 5448_1_HL7J5DRXX_AGCTCGCT-GCAGAATC_to_5448.sorted.bam.bai -o 5448 -r ../../References/gene_body_genes/5448_con_exp_genes.bed 

@ 2020-07-31 13:53:16: Read BED file (reference gene model) ...
@ 2020-07-31 13:53:16: Total 13 transcripts loaded
@ 2020-07-31 13:53:16: Get BAM file(s) ...

Sample Skewness
@ 2020-07-31 13:53:16: Running R script ...
null device 
          1 

5448_con_exp_genes.bed

magnus jespersen

Jul 31, 2020, 12:14:10 AM
to rseqc-discuss
I have also checked that the chromosome ID is the same in the .bed and .bam files.
Example line from the .bam file: A00121:248:HL7J5DRXX:2:1254:3595:12070 16 NZ_CP008776.1 62753 60 100M * 0 0 GAATTCCGTGGAAAA

Liguo Wang

Jul 31, 2020, 12:36:36 PM
to rseqc-discuss
Hello,
Your BED file is not correct. Please strictly follow these instructions to make a BED file.


Thanks

Liguo

magnus jespersen

Aug 2, 2020, 6:46:56 PM
to rseqc-discuss
Thank you for the reply. I have revised my BED file but cannot spot what you suggest is wrong. Can you point me more specifically to what you have seen? I have the 12 required fields. I have the chromosome, start, end and strand, which seem to be the information used by geneBody_coverage.py. So what specifically am I missing, or what is wrong with the formatting?

Thanks,
Magnus

Liguo Wang

Aug 3, 2020, 9:26:44 PM
to rseqc-...@googlegroups.com
A standard BED file has 12 fields. The first 6 fields define the transcribed region. The last 6 fields define the start/end positions of exons and introns, which are very important for geneBody_coverage.py.
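
For anyone writing a 12-column BED file by hand for a bacterial genome, the following is a minimal sketch of what the last 6 fields might look like for a gene with a single block (no introns), together with a basic consistency check. It assumes the usual UCSC BED12 layout (thickStart, thickEnd, itemRgb, blockCount, blockSizes, blockStarts). The function names, the gene name and the coordinates are hypothetical, and this is not RSeQC's own parser, so a line that passes here is not guaranteed to be accepted by geneBody_coverage.py.

```python
# Hypothetical helpers; the field layout follows the UCSC BED12 convention.

def make_single_exon_bed12(chrom, start, end, name, strand, score=0):
    """Build one BED12 line for a gene with a single block (no introns).

    start is 0-based and end is exclusive, as in the BED convention.
    """
    size = end - start
    fields = [
        chrom, start, end, name, score, strand,
        start, end,    # thickStart, thickEnd: here the whole gene
        0,             # itemRgb (display colour)
        1,             # blockCount: one block for an intron-less gene
        f"{size},",    # blockSizes: length of that block
        "0,",          # blockStarts: offset relative to chromStart
    ]
    return "\t".join(str(f) for f in fields)


def check_bed12_line(line):
    """Return a list of problems found in one BED12 line (empty if none)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 12:
        return [f"expected 12 tab-separated fields, found {len(fields)}"]
    try:
        start, end = int(fields[1]), int(fields[2])
        block_count = int(fields[9])
        sizes = [int(x) for x in fields[10].split(",") if x]
        starts = [int(x) for x in fields[11].split(",") if x]
    except ValueError:
        return ["non-integer value in a numeric field"]
    problems = []
    if start >= end:
        problems.append("chromStart must be smaller than chromEnd")
    # Assumption of this sketch: only '+' and '-' are accepted as strand.
    if fields[5] not in ("+", "-"):
        problems.append("strand must be '+' or '-'")
    if block_count < 1 or len(sizes) != block_count or len(starts) != block_count:
        problems.append("blockCount does not match blockSizes/blockStarts")
    else:
        if starts[0] != 0:
            problems.append("first blockStart must be 0")
        if starts[-1] + sizes[-1] != end - start:
            problems.append("last block must end at chromEnd")
    return problems


# Example with a made-up gene on the chromosome ID quoted in this thread.
example = make_single_exon_bed12("NZ_CP008776.1", 100, 400, "geneA", "+")
print(example)
print(check_bed12_line(example))
```

Running the check over every line of a hand-made file is a quick way to catch a space-separated line, a missing trailing field, or block sizes that do not add up to the gene length.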

