[Genome] How could I get knownToEnsembl.txt and ensGene.txt files for Ensembl release 60?

Emilie Chautard

Dec 22, 2011, 11:33:18 AM
to gen...@soe.ucsc.edu
Hi,

I need the knownToEnsembl.txt and ensGene.txt files for the Ensembl
release 60 gene annotations.
The files at
http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/ appear to
correspond to a more recent release.
Can you suggest a tool that would help me generate these files,
or are the release 60 versions still available somewhere?
Thanks a lot in advance,
Best regards,

Emilie

Hiram Clawson

Dec 29, 2011, 4:49:16 PM
to Emilie Chautard, gen...@soe.ucsc.edu
Good Afternoon Emilie:

I don't have an archived copy of the knownToEnsembl.txt file. However,
I do have a copy of the genePred file for the ensGene track, version 60:

http://genome-test.cse.ucsc.edu/~hiram/ensGene.hg19.v60/hg19.ensGene.gp.gz

The simple awk script included below can convert this genePred file
to a bed file.

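--Hiram

#!/usr/bin/awk -f

#
# Convert genePred file to a bed file (on stdout)
#
BEGIN {
    FS="\t";
    OFS="\t";
}
{
    # genePred columns: name, chrom, strand, txStart, txEnd,
    # cdsStart, cdsEnd, exonCount, exonStarts, exonEnds
    name=$1
    chrom=$2
    strand=$3
    start=$4
    end=$5
    cdsStart=$6
    cdsEnd=$7
    blkCnt=$8

    # exonStarts and exonEnds are comma-separated lists
    delete starts
    split($9, starts, ",");
    delete ends
    split($10, ends, ",");

    # bed blocks: sizes, and starts relative to the transcript start
    blkStarts=""
    blkSizes=""
    for (i = 1; i <= blkCnt; i++) {
        blkSizes = blkSizes (ends[i]-starts[i]) ",";
        blkStarts = blkStarts (starts[i]-start) ",";
    }

    # 12-column bed line: score is fixed at 1000 and itemRgb at 0
    print chrom, start, end, name, 1000, strand, cdsStart, cdsEnd, 0, blkCnt, blkSizes, blkStarts
}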
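
For anyone who would rather not use awk, the same conversion can be sketched in Python. This is only an illustration, assuming the input has the ten genePred columns in the order the script above reads them (no leading bin column). The function name genepred_to_bed12 and the sample record are made up for the example and are not taken from the real file.

```python
def genepred_to_bed12(line):
    """Convert one tab-separated genePred record to a 12-column bed line.

    Assumes the column order used by the awk script: name, chrom, strand,
    txStart, txEnd, cdsStart, cdsEnd, exonCount, exonStarts, exonEnds.
    """
    fields = line.rstrip("\n").split("\t")
    name, chrom, strand = fields[0], fields[1], fields[2]
    tx_start, tx_end = int(fields[3]), int(fields[4])
    cds_start, cds_end = fields[5], fields[6]
    exon_count = int(fields[7])

    # exonStarts/exonEnds are comma-separated, usually with a trailing comma
    starts = [int(x) for x in fields[8].split(",") if x][:exon_count]
    ends = [int(x) for x in fields[9].split(",") if x][:exon_count]

    # bed block sizes, and block starts relative to the transcript start
    blk_sizes = "".join(f"{e - s}," for s, e in zip(starts, ends))
    blk_starts = "".join(f"{s - tx_start}," for s in starts)

    # score fixed at 1000 and itemRgb at 0, as in the awk script
    return "\t".join([
        chrom, str(tx_start), str(tx_end), name, "1000", strand,
        cds_start, cds_end, "0", str(exon_count), blk_sizes, blk_starts,
    ])


# Made-up record with two exons, for illustration only
example = "ENST_EXAMPLE\tchr1\t+\t1000\t2000\t1100\t1900\t2\t1000,1500,\t1200,2000,"
print(genepred_to_bed12(example))
```

Note that a table dump such as ensGene.txt normally carries an extra leading bin column, so the field positions above would have to be shifted by one before applying either version to it.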

Emilie Chautard

Jan 6, 2012, 5:32:33 PM
to Hiram Clawson, gen...@soe.ucsc.edu
Hi Hiram,

Thank you for responding to my question so quickly. It's exactly what I
needed.
Best regards,

Emilie
