Regarding genomic sequence of K562 and GM12878


Roshan Fatima. Begum

Aug 26, 2015, 12:00:08 PM
to gen...@soe.ucsc.edu
Dear Sir/Madam,

I would like to know whether the genomes of the GM12878 lymphoblastoid cell line and the K562 CML cell line have been sequenced and, if so, whether the sequences are available in the UCSC Genome Browser.

When we use 'get data', we do get a DNA sequence, but I am not sure whether it is specific to a particular cell line or is the general sequence of the region, not that of any cell line in particular.

I tried searching the 1000 Genomes Project site, but I could not retrieve the data.

Could you please help me in retrieving the data?

Thanking you,

sincerely
Dr. Roshan Fatima
CBL-MBGU
JNCASR
Bangalore 560064
India.

Jonathan Casper

Aug 27, 2015, 7:50:40 PM
to Roshan Fatima. Begum, gen...@soe.ucsc.edu

Hello Roshan,

Thank you for your question about obtaining sequence for the GM12878 and K562 cell lines. While we do display annotation from the ENCODE project for these cell lines on the UCSC Genome Browser, we do not have sequence data for those cell lines. The DNA sequence results from 'get data' are from regions of the assembly being displayed on the browser (e.g., the GRCh37/hg19 human genome assembly). We suggest that you contact the 1000 Genomes Project directly for help retrieving data from their site. You may also have some success searching for results at NCBI's GenBank and BioProject pages (e.g., http://www.ncbi.nlm.nih.gov/bioproject/293939).

We provide downloads of the annotation track data for these cell lines, and some of those downloads do include sequence data. For example, we provide BAM files that contain alignments of fragments from these cells to the human genome assembly. Please note, however, that the sequence data in these files will only represent the fragments used for annotation. They will not contain the full sequence of the cell lines. You can access these files by going to our ENCODE project portal at http://genome.ucsc.edu/ENCODE/ and clicking the "downloads" link in the left menu, or directly from our File Search CGI at http://genome.ucsc.edu/cgi-bin/hgFileSearch. You can use the filters on the File Search tool to limit results to the K562 and GM12878 cell lines, and then look for files with file types such as "bam" and "fasta" that match your interests.
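
To illustrate why these alignment files are not a substitute for a full cell-line genome, here is a minimal Python sketch that counts how many bases of a region are touched by at least one aligned fragment. It assumes the alignments are available as SAM-formatted text lines (the text form of BAM records). The function names `aligned_interval` and `covered_bases` and the example records are made up for illustration and are not part of any UCSC tool or ENCODE file.

```python
import re

# CIGAR operations that consume reference bases, per the SAM specification.
REF_CONSUMING = set("MDN=X")
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def aligned_interval(sam_line):
    """Return (chrom, start, end) as a 0-based half-open interval, or None if unmapped."""
    fields = sam_line.rstrip("\n").split("\t")
    flag, chrom, pos, cigar = int(fields[1]), fields[2], int(fields[3]), fields[5]
    if flag & 0x4 or cigar == "*":  # 0x4 is the "segment unmapped" flag bit
        return None
    ref_len = sum(int(n) for n, op in CIGAR_RE.findall(cigar) if op in REF_CONSUMING)
    start = pos - 1  # SAM POS is 1-based
    return chrom, start, start + ref_len


def covered_bases(sam_lines, chrom, region_start, region_end):
    """Count bases of [region_start, region_end) on chrom covered by at least one read."""
    intervals = []
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        hit = aligned_interval(line)
        if hit is None or hit[0] != chrom:
            continue
        start, end = max(hit[1], region_start), min(hit[2], region_end)
        if start < end:
            intervals.append((start, end))
    # Sweep the sorted intervals so that overlapping reads are counted once.
    total, current_end = 0, region_start
    for start, end in sorted(intervals):
        start = max(start, current_end)
        if start < end:
            total += end - start
            current_end = end
    return total


# Hypothetical records: two overlapping mapped reads and one unmapped read.
EXAMPLE_SAM = [
    "@SQ\tSN:chr1\tLN:1000",
    "read1\t0\tchr1\t101\t60\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
    "read2\t16\tchr1\t106\t60\t4M2D6M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
    "read3\t4\t*\t0\t0\t*\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
]

if __name__ == "__main__":
    covered = covered_bases(EXAMPLE_SAM, "chr1", 100, 200)
    print(f"{covered} of 100 bases in the region are covered by fragments")
```

With a real file, the input lines could come from the text output of a tool such as `samtools view` run on a downloaded BAM file. For annotation experiments the covered fraction is typically well below the whole genome, which is why these files cannot stand in for a complete cell-line sequence.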

You may also be interested in the answers to the following mailing list questions:
https://groups.google.com/a/soe.ucsc.edu/d/topic/genome/qrkteFREYm0/discussion
https://groups.google.com/a/soe.ucsc.edu/d/topic/genome/k3WbAV74n8Q/discussion

I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu or genome...@soe.ucsc.edu. Questions sent to those addresses will be archived in publicly-accessible forums for the benefit of other users. If your question contains sensitive data, you may send it instead to genom...@soe.ucsc.edu.

--
Jonathan Casper
UCSC Genome Bioinformatics Group


