Information on hg38 repeat masker file


Mirko Celii

Oct 27, 2021, 11:50:38 AM
to gen...@soe.ucsc.edu
Hello,
I hope this email finds you well.
I would like to know which FASTA library was used to generate the rmsk.txt.gz file in http://hgdownload.soe.ucsc.edu/goldenPath/hg38/database/, but I could not find this information.
Could you please help me?

best regards,
Mirko

--

Mirko Celii, PhD
KAUST Environmental Epigenetics Program
Department of Bioscience and Engineering
King Abdullah University of Science and Technology




Jairo Navarro Gonzalez

Nov 1, 2021, 8:06:47 PM
to Mirko Celii, UCSC Genome Browser Discussion List

Hello,

Thank you for using the UCSC Genome Browser and sending your inquiry.

In the following downloads directory, http://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/,
the README states the version of RepeatMasker that was used for the initial release of hg38. The
initial release sequences (annotated with the 20130422 library version) are listed in
https://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/initial/hg38.chrom.sizes.

June 20 2013 (open-4-0-3) version of RepeatMasker
RepBase library: RELEASE 20130422

We are using libraries from the Genetic Information Research Institute that we're not allowed to share.
You can register an account at https://www.girinst.org/ to access RepBase and you should be able to
find the library.

If you are interested in the version of RepeatMasker used on the recent patch work for hg38
(https://genome-blog.soe.ucsc.edu/blog/patches/), the information is below. It's a little more complicated
for patch 12 and patch 13, because the chrom.sizes files for those releases also include the initial release
sequences (and the patch 13 chrom.sizes includes the patch 12 sequences). To find the sequences that a given
patch added, you need to subtract the sequence names of the earlier release from its chrom.sizes list.

Patch 12:

#    February 01 2017 (open-4-0-7) 1.331 version of RepeatMasker

    Dfam_Consensus RELEASE 20170127;                            *
    RepBase RELEASE 20170127;                                   *

https://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/p12/hg38.p12.chrom.sizes

Patch 13:

#    February 01 2017 (open-4-0-8) 1.332 version of RepeatMasker

    Dfam_Consensus RELEASE 20181026;                            *
    RepBase RELEASE 20181026;                                   *

https://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/p13/hg38.p13.chrom.sizes
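
The subtraction described above can be sketched in a few lines of Python. This is only an illustration, assuming the contents of the three chrom.sizes files have already been read into strings; the helper names parse_chrom_sizes and new_sequences are hypothetical, and the sample sequence names and sizes are made up rather than taken from hg38.

```python
from typing import Dict, List


def parse_chrom_sizes(text: str) -> Dict[str, int]:
    """Parse chrom.sizes content: one 'name<TAB>size' pair per line."""
    sizes: Dict[str, int] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        name, size = line.split()[:2]
        sizes[name] = int(size)
    return sizes


def new_sequences(older: Dict[str, int], newer: Dict[str, int]) -> List[str]:
    """Return names present in the newer release but absent from the older one."""
    return [name for name in newer if name not in older]


# Made-up sample data standing in for the initial, p12 and p13 chrom.sizes files.
initial = parse_chrom_sizes("chr1\t1000\nchr2\t900\n")
p12 = parse_chrom_sizes("chr1\t1000\nchr2\t900\nchrA_fix\t50\n")
p13 = parse_chrom_sizes("chr1\t1000\nchr2\t900\nchrA_fix\t50\nchrB_alt\t70\n")

p12_only = new_sequences(initial, p12)  # sequences added in patch 12
p13_only = new_sequences(p12, p13)      # sequences added in patch 13

print(p12_only)
print(p13_only)
```

Under this reading, sequences found in the initial list were masked with the 20130422 library, those added in patch 12 with the 20170127 release, and those added in patch 13 with the 20181026 release.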

I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu.
All messages sent to that address are archived on a publicly accessible Google Groups forum.
If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Jairo Navarro
UCSC Genome Browser


