A question about "Table Browser"


jeryoung

Sep 15, 2021, 12:38:25 PM
to genome, JYL_Lab HsinLun Li
Dear UCSC Genome Browser bioinformatician,

I use the "Table Browser" to export data from the Genome Browser annotation track database.
When I choose "track: RefSeq", I get the output file below, whose first column is "#bin".

For "bin", the website says it is an "Indexing field to speed chromosome range queries".
I do not quite understand what this means.

I work on plants and, to run a program, I need to generate an input file in the UCSC Table Browser format with a bin as the first field.
Could you please let me know what this "bin" means and how to generate the "bin" number?

Thank you very much for your time.

Best,
Jer-Young


#bin name chrom strand txStart txEnd cdsStart cdsEnd exonCount exonStarts exonEnds score name2 cdsStartStat cdsEndStat exonFrames
0 NM_207014 chr1 - 67075873 67163127 67075923 67163102 10 67075873,67078739,67085754,67100417,67109640,67113051,67129424,67131499,67143471,67162932, 67076067,67078942,67085949,67100573,67109780,67113208,67129537,67131684,67143646,67163127, 0 DNAI4 cmpl cmpl 0,1,1,1,2,1,2,0,2,0,
0 NM_024763 chr1 - 67051159 67163127 67052400 67163102 17 67051159,67060631,67065090,67066082,67071855,67072261,67073896,67075980,67078739,67085754,67100417,67109640,67113051,67129424,67131499,67143471,67162932, 67052451,67060788,67065317,67066181,67071977,67072419,67074048,67076067,67078942,67085949,67100573,67109780,67113208,67129537,67131684,67143646,67163127, 0 DNAI4 cmpl cmpl 0,2,0,0,1,2,0,0,1,1,1,2,1,2,0,2,0,

Jairo Navarro Gonzalez

Sep 17, 2021, 12:31:29 PM
to jeryoung, genome, JYL_Lab HsinLun Li

Hello,

Thank you for using the UCSC Genome Browser and sending your inquiry.

The "bin" field optimizes database access for genomic annotations that overlap a particular region
of the genome. Most users can ignore this field: it exists only to speed up our database searches
and is not part of the file format. You can remove the bin field from the Table Browser output using
the following command:

cut -f 2- file_with_bin.txt > file_without_bin.txt

Could you explain why you need to add a bin field to your input file? If your program requires a bin
field to satisfy a particular format but does not use it for processing, you could add a
value of 0 to each line. The following awk command adds a tab-separated zero column to each line:

awk 'BEGIN { OFS = "\t" } { print 0, $0 }' refSeqOutput.txt > file_with_bin.txt
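
If real bin values are wanted rather than a placeholder zero, they can be computed from the start and end coordinates. The sketch below follows the standard UCSC binning scheme as I understand it: five levels of bins, the finest covering 128 kb and each coarser level covering 8 times more, with a feature assigned to the smallest bin that contains it entirely. The function name ucsc_bin is made up for this example, and the sketch assumes 0-based start and exclusive end coordinates (as in txStart/txEnd) and ends no larger than 512 Mb. Larger chromosomes need the extended scheme, which is not covered here.

```shell
# Sketch of the standard UCSC binning scheme (hypothetical helper name).
# $1 = 0-based start, $2 = exclusive end; valid for ends up to 2^29 (512 Mb).
ucsc_bin() {
    start_bin=$(( $1 >> 17 ))            # finest bins span 2^17 = 128 kb
    end_bin=$(( ($2 - 1) >> 17 ))
    for offset in 585 73 9 1 0; do       # first bin number at each level
        if [ "$start_bin" -eq "$end_bin" ]; then
            echo $(( offset + start_bin ))
            return 0
        fi
        start_bin=$(( start_bin >> 3 ))  # next level: bins 8 times larger
        end_bin=$(( end_bin >> 3 ))
    done
    echo "range $1-$2 does not fit the standard scheme" >&2
    return 1
}

ucsc_bin 67075873 67163127   # first record in the question; prints 0
```

The two records quoted in the question span a boundary at every finer level, so they fall in the single top-level bin, which is why their bin value is 0.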

I look forward to your reply. If you have any further questions, please reply to gen...@soe.ucsc.edu.
All messages sent to that address are archived on a publicly accessible Google Groups forum.
If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Jairo Navarro
UCSC Genome Browser



jeryoung

Sep 20, 2021, 2:00:25 PM
to Jairo Navarro Gonzalez, genome, JYL_Lab HsinLun Li
Hi Jairo,

Thank you very much for your detailed information.
We are using this genome table as input to run ChromHMM, which needs bin as the first field.

Thank you also for your suggestions.
We did a test run following them and it went well, which indicates that the bin value itself does not matter, but the column still needs to be present in the input file.

We really appreciate your great help.

Sincerely,
Jer-Young