Junctions bed to bigbed issue


Philip Badzuh

Feb 23, 2022, 1:20:56 PM
to gen...@soe.ucsc.edu
Hello,

I have a junctions file generated by TopHat that I would like to add to a track hub I am creating. I am trying to convert the file to the bigBed format using the latest version (2.8) of bedToBigBed, as follows:

./bedToBigBed SRR1957124_Adrenal_gland.bed http://igbquickload.org/rnaseq/H_sapiens_Dec_2013/genome.txt SRR1957124_Adrenal_gland.bb


However, I am seeing the error below:

pass1 - making usageList (298 chroms): 140 millis

Error line 3157 of SRR1957124_Adrenal_gland.bed: score (2334) must be between 0 and 1000


After looking at the UCSC BED format specification, the error makes sense. Could anyone recommend an approach for converting junction files to a UCSC-compliant BED format, or directly to the bigBed format?

Any help would be greatly appreciated,
Philip Badzuh
UNC Charlotte
Loraine Lab

Jairo Navarro Gonzalez

Mar 1, 2022, 6:53:51 PM
to Philip Badzuh, UCSC Genome Browser Discussion List

Hello,

Thank you for using the UCSC Genome Browser and sending your inquiry.

One solution is to scale the score values in column 5 so that the maximum value is 1000. You
can run the following commands to scale your scores:

# find the maximum score in column 5 of the bed file:
cat SRR1957124_Adrenal_gland.bed | gawk '{print $5}' | sort -n | tail -1
71533

cat SRR1957124_Adrenal_gland.bed | gawk '{print $1"\t"$2"\t"$3"\t"$4"\t"int(0.5+(($5*1000)/71533))"\t"$6"\t"$7"\t"$8"\t"$9"\t"$10"\t"$11"\t"$12}' > SRR1957124_Adrenal_gland.scaled.bed

curl -O http://igbquickload.org/rnaseq/H_sapiens_Dec_2013/genome.txt

bedToBigBed SRR1957124_Adrenal_gland.scaled.bed genome.txt SRR1957124_Adrenal_gland.bb

Please note that the awk int() function truncates the fractional part to produce an integer, so we
add 0.5 first to round to the nearest integer instead. Some small values still round down to 0; if
that causes a problem, we can provide another formula that forces the lowest value to 1 instead of 0.
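For illustration, one way such a formula could look is sketched below. This is only a sketch under assumptions: the three-line sample file and the temporary file names are made up for the example, the maximum is computed from the file rather than hard-coded, and plain awk is used in place of gawk.

```shell
# Hypothetical sample input standing in for the TopHat junctions file
# (only the score in column 5 matters for this example).
tmpdir=$(mktemp -d)
printf 'chr1\t100\t200\tJUNC1\t3\t+\nchr1\t300\t400\tJUNC2\t2334\t+\nchr1\t500\t600\tJUNC3\t71533\t-\n' > "$tmpdir/junctions.bed"

# Pass 1: find the maximum score in column 5.
max=$(awk 'BEGIN { m = 0 } $5 > m { m = $5 } END { print m }' "$tmpdir/junctions.bed")

# Pass 2: scale to 0-1000, round to the nearest integer,
# and raise any result below 1 up to 1.
awk -v max="$max" 'BEGIN { FS = OFS = "\t" }
{
    s = int(0.5 + ($5 * 1000) / max)
    if (s < 1) s = 1
    $5 = s
    print
}' "$tmpdir/junctions.bed" > "$tmpdir/junctions.scaled.bed"

cat "$tmpdir/junctions.scaled.bed"
```

With this sample, the scores 3, 2334 and 71533 become 1, 33 and 1000. Note that clamping makes a junction with very little support indistinguishable from one that scales to exactly 1, so the raw counts are worth keeping elsewhere if they matter.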

Another option is to convert your junction file into the BED detail format (https://genome.ucsc.edu/FAQ/FAQformat.html#format1.7).
The simplest solution would be to use BED4+8: the 4 standard BED columns plus 8 additional fields.
With the BED4+8 format, you will not have to modify the file; however, you will lose the RGB color
option. A more sophisticated solution is to move the junction score value out of the 5th column,
replace it there with an arbitrary value like 0, and place the original value into an additional field (BED12+1).
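As a rough sketch of the BED12+1 rearrangement, the awk command below copies the raw score into a new 13th column and sets column 5 to 0. The one-line sample record and the file names are hypothetical and only serve to make the example runnable.

```shell
tmpdir=$(mktemp -d)

# Hypothetical single BED12 record with an out-of-range score (2334).
printf 'chr1\t100\t500\tJUNC1\t2334\t+\t100\t500\t255,0,0\t2\t50,60\t0,340\n' > "$tmpdir/junctions.bed"

# Keep columns 1-4 and 6-12 unchanged, write 0 into column 5,
# and append the original score as column 13.
awk 'BEGIN { FS = OFS = "\t" }
{ print $1, $2, $3, $4, 0, $6, $7, $8, $9, $10, $11, $12, $5 }' \
    "$tmpdir/junctions.bed" > "$tmpdir/junctions.bed12plus1.bed"

cat "$tmpdir/junctions.bed12plus1.bed"
```

The itemRgb value in column 9 is preserved, which is the advantage of this layout over BED4+8.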

Once you have the BED detail file, you can create the bigBed using the bedToBigBed utility. You
can learn more about creating the bigBed file with extra columns from the following help page,
https://genome.ucsc.edu/goldenPath/help/bigBed.html#Ex3. Please note that you will have to
create an AutoSql file to define the additional field. Here is an example file that you can modify
to fit your needs:
https://genome.ucsc.edu/goldenPath/help/examples/bedExample2.as
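For a BED12+1 file, an AutoSql definition might look roughly like the sketch below. The table name (junctions), the extra field name (rawScore), and all file names other than genome.txt are assumptions chosen for illustration; the first twelve fields follow the standard BED12 layout. The script only writes the .as file and prints the commands rather than running them, since the input files are placeholders.

```shell
tmpdir=$(mktemp -d)

# Write a hypothetical AutoSql file: 12 standard BED fields plus one extra.
cat > "$tmpdir/junctions.as" <<'EOF'
table junctions
"Splice junctions with the raw score kept in an extra field"
    (
    string chrom;       "Reference sequence chromosome or scaffold"
    uint   chromStart;  "Start position in chromosome"
    uint   chromEnd;    "End position in chromosome"
    string name;        "Name of item"
    uint   score;       "Score from 0-1000 (set to 0 here)"
    char[1] strand;     "+ or -"
    uint   thickStart;  "Start of where display should be thick"
    uint   thickEnd;    "End of where display should be thick"
    uint   reserved;    "Used as itemRgb"
    int    blockCount;  "Number of blocks"
    int[blockCount] blockSizes;  "Comma separated list of block sizes"
    int[blockCount] chromStarts; "Start positions relative to chromStart"
    uint   rawScore;    "Original junction score before it was moved out of column 5"
    )
EOF

# bedToBigBed expects input sorted by chromosome and start position.
# The commands are printed, not executed, because the file names are placeholders.
echo "sort -k1,1 -k2,2n junctions.bed12plus1.bed > junctions.sorted.bed"
echo "bedToBigBed -type=bed12+1 -as=junctions.as junctions.sorted.bed genome.txt junctions.bb"
```

The -type and -as options are the ones described on the bigBed help page linked above.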

I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu.
All messages sent to that address are archived on a publicly accessible Google Groups forum.
If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Jairo Navarro
UCSC Genome Browser


