how to collapse sam/bam to bed with read count


Joseph Dhahbi, PhD

Dec 1, 2011, 7:51:10 PM
to bedtools...@googlegroups.com
Hi,
Is there a way to combine tools to collapse a BAM or SAM file into a BED
file with the following columns:
#chr start end id read_count strand

Regards,
Joseph

Joseph M. Dhahbi, PhD
Children's Hospital Oakland Research Institute
5700 Martin Luther King Jr. Way
Oakland, CA 94609
USA
Ph.(510)428-3885 EXT.5743
Cell.(702)335-0795
Fax (510)450-7910
jdh...@chori.org

Aaron Quinlan

Dec 7, 2011, 12:03:00 PM
to bedtools...@googlegroups.com
Hi Joseph,

In this case, what would the ID represent?

Assuming it is meaningless, you could do something like (untested, just whipped together, so caveat emptor):

# step 1. create a bedgraph of the forward strand alignments.
genomeCoverageBed -i <(bamToBed -i aln.bam | awk '$6=="+"') -g chrom.sizes -bg > plus.bedg

# step 2. create a bedgraph of the reverse strand alignments.
genomeCoverageBed -i <(bamToBed -i aln.bam | awk '$6=="-"') -g chrom.sizes -bg > neg.bedg

# step 3. combine the two bedgraphs, adding a strand and a fake ID
# 3a. use awk to add a strand column
# 3b. sort by chrom and start
# 3c. add some fake ID which is the line number
(awk '{print $0"\t+"}' plus.bedg; awk '{print $0"\t-"}' neg.bedg) | \
sort -k1,1 -k2,2n | \
awk 'BEGIN {OFS="\t"} {print $1,$2,$3,NR,$4,$5}' \
> joseph.out
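
Note that the bedgraph approach above reports depth of coverage per interval. If read_count is instead meant to be the number of reads sharing exactly the same chrom, start, end and strand (that is my assumption about the question, not something stated in it), a rough sketch could look like the following. The function name collapse_bed and the output file collapsed.bed are made up for illustration, and the ID is again just the output line number.

```shell
# Hypothetical helper: collapse tab-delimited BED6 records read from stdin
# into: chr, start, end, id, read_count, strand.
collapse_bed() {
    # Count how many records share each (chrom, start, end, strand) key.
    awk 'BEGIN { FS = OFS = "\t" }
         { count[$1 FS $2 FS $3 FS $6]++ }
         END { for (k in count) print k, count[k] }' |
    # Sort by chrom, then start, end and strand (C locale for stable ordering).
    LC_ALL=C sort -t "$(printf '\t')" -k1,1 -k2,2n -k3,3n -k4,4 |
    # Reorder the columns and use the line number as a placeholder ID.
    awk 'BEGIN { FS = OFS = "\t" } { print $1, $2, $3, NR, $5, $4 }'
}

# Usage, with aln.bam as in the steps above:
# bamToBed -i aln.bam | collapse_bed > collapsed.bed
```

This counts whole alignments rather than per-base coverage, so two reads that merely overlap stay on separate lines.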

I hope this helps.
Aaron
