hg38.fa.gz - "Soft-masked"


h.mogh...@znu.ac.ir

Jul 11, 2016, 12:35:48 PM
to gen...@soe.ucsc.edu
Dear UCSC,

I am using the hg38.fa.gz ("Soft-masked") file from
http://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/ to obtain the whole human genome sequence.
In the file, sequences are labeled with names such as chr1, chr1_GL383518v1_alt, and so on.

Does the sequence labeled chr1, for example, contain the whole sequence of chromosome 1?
What are the sequences with names such as chr1_GL383518v1_alt and chr3_KI270780v1_alt?


Kind Regards,


  Hanieh Moghaddasi
  Ph.D. Student of Physics
  Department of Physics, University of Zanjan

Matthew Speir

Jul 12, 2016, 1:07:05 PM
to h.mogh...@znu.ac.ir, gen...@soe.ucsc.edu
Hi Hanieh,

Thank you for your question about the hg38.fa.gz file provided on our download server.

Yes, the sequence labeled simply "chr1" in the file is the entire sequence of chr1 as displayed in the UCSC Genome Browser. The "alt" sequences are the sequences of alternative loci that are part of the hg38 genome and are provided by the Genome Reference Consortium (GRC). You can read more about these sequences on the GRC website, https://www.ncbi.nlm.nih.gov/projects/genome/assembly/grc/info/definitions.shtml, under the definition for "Alternate locus".
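
If it helps to separate the two kinds of sequence in practice, here is a minimal Python sketch (an illustration only, not a UCSC tool) that reads the header lines of the gzipped FASTA file and sorts the names into groups. The function names are hypothetical, and the rules are assumptions based only on the names quoted in this thread: a name ending in "_alt" is treated as an alternate locus, a name with no underscore (such as chr1) as a whole chromosome, and anything else is left as "other".

```python
import gzip


def classify_sequence_name(name):
    """Classify a sequence name by its form alone (assumed convention)."""
    if name.endswith("_alt"):
        return "alt"        # e.g. chr1_GL383518v1_alt
    if "_" not in name:
        return "primary"    # e.g. chr1
    return "other"          # any name this sketch does not recognize


def names_from_lines(lines):
    """Yield the sequence name from each FASTA header line ('>name ...')."""
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                yield fields[0]


def fasta_names(path):
    """Yield sequence names from a gzipped FASTA file such as hg38.fa.gz."""
    with gzip.open(path, "rt") as handle:
        yield from names_from_lines(handle)


def group_names(names):
    """Group sequence names into a dict keyed by their classification."""
    groups = {"primary": [], "alt": [], "other": []}
    for name in names:
        groups[classify_sequence_name(name)].append(name)
    return groups
```

For example, group_names(fasta_names("hg38.fa.gz")) would return the names in each group, assuming the file has been downloaded to the current directory. Only header lines are inspected, so the sequence data itself is never held in memory.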

I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible Google Groups forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Matthew Speir
UCSC Genome Bioinformatics Group