How do I get hg38 centromeres?

Jeltje van Baren

Jan 26, 2017, 10:29:53 AM
to gen...@soe.ucsc.edu
Hello Browsers,

I wrote myself a handy little howto on getting centromeres using the Table Browser. This works on hg19, but on hg38 it returns "# No results returned from query".

Here's my howto:

  http://genome.ucsc.edu/cgi-bin/hgTables
  Using the following selections:
  - group: Mapping and Sequencing
  - track: gap
  - filter: opens a new page; look for 'type does match', type centromere, and submit
  - output format: bed
  Submit, then on the next page just press Get Bed


What changed?

Thanks!

-Jeltje

Jeltje van Baren

Jan 26, 2017, 12:51:33 PM
to gen...@soe.ucsc.edu
Follow up:

A little bird told me about the centromeres track - I'm embarrassed to say I didn't notice it!

Unfortunately, this is not a great replacement for the original: it lists multiple 'centromeres' per chromosome, which is not something most centromere-aware software will be happy to deal with. While I understand that these are modeling results, and most predictions overlap or are close to each other, it would be very helpful if a merged version were available.

Thanks!

-Jeltje

Christopher Lee

Jan 27, 2017, 6:14:37 PM
to Jeltje van Baren, UCSC Genome Browser Discussion List
Hi Jeltje,

Thank you for your question about obtaining centromere coordinates for
hg38. You can get centromere coordinates that differ slightly from
those in the centromeres track via the cytoBandIdeo table. As before,
head to the Table Browser and choose the Mapping and Sequencing group,
but instead of selecting the gap track, select the "Chromosome
Band (Ideogram)" track. Then select filter, and enter "*cen*" in the
gieStain field. This leads to output like the following:

#filter: cytoBandIdeo.gieStain like '%cen%'
#chrom chromStart chromEnd name gieStain
chr1 121700000 123400000 p11.1 acen
chr1 123400000 125100000 q11 acen
chr2 91800000 93900000 p11.1 acen
chr2 93900000 96000000 q11.1 acen


Here each chromosome will have two entries, one for the p arm and one
for the q arm. The two are adjacent (the first ends where the second
starts), so you can merge them into a single entry. Unfortunately,
these coordinates will differ from those in the centromeres track,
even if you were to create single, per-chromosome entries out of the
centromeres track items. You may find this previously answered mailing
list question helpful in deciding which data set to choose:
https://groups.google.com/a/soe.ucsc.edu/d/msg/genome/SaR2y4UNrWg/XsGdMI3AazgJ
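
The merge described above could be sketched roughly as follows. This
is a minimal sketch, not an official UCSC method: it assumes the Table
Browser output is available as whitespace-separated rows with chrom,
chromStart and chromEnd in the first three columns, and the function
name merge_centromere_rows is made up for illustration. It keeps the
smallest start and largest end seen for each chromosome, so it also
spans any gap between rows on the same chromosome.

```python
def merge_centromere_rows(bed_lines):
    """Collapse all rows of each chromosome into one (chrom, start, end) span.

    Hypothetical helper: assumes chrom, chromStart, chromEnd are the
    first three whitespace-separated fields of every data row.
    """
    spans = {}  # chrom -> (start, end); dicts keep insertion order
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines and Table Browser comment/header lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        if chrom in spans:
            old_start, old_end = spans[chrom]
            spans[chrom] = (min(old_start, start), max(old_end, end))
        else:
            spans[chrom] = (start, end)
    return [(chrom, start, end) for chrom, (start, end) in spans.items()]


if __name__ == "__main__":
    # The four example rows shown in the output above.
    rows = [
        "#chrom chromStart chromEnd name gieStain",
        "chr1 121700000 123400000 p11.1 acen",
        "chr1 123400000 125100000 q11 acen",
        "chr2 91800000 93900000 p11.1 acen",
        "chr2 93900000 96000000 q11.1 acen",
    ]
    for chrom, start, end in merge_centromere_rows(rows):
        print(chrom, start, end, sep="\t")
```

For the four rows above this prints one line per chromosome:
chr1 121700000 125100000 and chr2 91800000 96000000 (tab-separated).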

Thank you again for your inquiry and using the UCSC Genome Browser. If
you have any further questions, please reply to gen...@soe.ucsc.edu.
All messages sent to that address are archived on a
publicly-accessible forum. If your question includes sensitive data,
you may send it instead to genom...@soe.ucsc.edu.

Christopher Lee
UCSC Genomics Institute
