I have been using the RefSeq and Ensembl transcripts that I downloaded from the UCSC Genome Browser Table Browser on June 23, 2011.
The transcript IDs in these datasets do not include version numbers (for example, in NM_134564.2 the ".2" after the period is the version number).
However, I now need the version number of each transcript. I tried to download them using the instructions here, but there is no way to select the RefSeq transcripts as of June 23, 2011:
https://lists.soe.ucsc.edu/pipermail/genome/2011-September/027099.html
Would it be possible for you to send me, from your archives, the RefSeq and Ensembl transcripts for June 23, 2011, including the version number of each transcript?
Or, if there is a way I could access this data myself, I would very much appreciate it if you could let me know.
Thank you,
Laura
Thank you very much for your reply. Based on your suggestion, I decided to download the newest RefSeq and Ensembl transcripts from the UCSC Genome Browser with all of the gbStatus subfields.
I have tried to download these files with the gbStatus fields, but I keep getting an error from the UCSC Genome Browser website. I am following the directions listed here:
https://lists.soe.ucsc.edu/pipermail/genome/2011-September/027099.html
Is there something I am doing wrong? Please see the two attached screenshots of the error messages from the UCSC Genome Browser.
If not, since I am not able to download these files, would it be possible for you to send me, or provide a link to, the latest RefSeq and Ensembl transcripts with all of the gbStatus subfields?
Or, if you could let me know how I can download them with the gbStatus fields, I would very much appreciate it.
Thank you,
Laura
________________________________
From: Steve Heitner <st...@soe.ucsc.edu>
To: 'Laura Smith' <lsmith...@yahoo.com>; gen...@soe.ucsc.edu
Sent: Wednesday, May 30, 2012 10:47 AM
Subject: RE: [Genome] Downloading old refseq and ensemble transcripts with the "version numbers" in the accession IDs.
Thank you very much for your very informative email.
I followed your instructions: I downloaded the RefSeq and Ensembl transcripts from Galaxy exactly as you described, downloaded gbStatus as well, and did a join on the transcript name.
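For reference, the join described above can be sketched roughly as follows. This is only an illustration, not the Galaxy tool itself: it assumes the gbStatus rows are tab-separated with the accession in the first column and the version in the second (adjust the column indices if the real file differs). The helper names and the sample IDs NM_000001 and ENST00000000001 are made up.

```python
import csv
import io


def load_versions(gbstatus_lines, acc_col=0, version_col=1):
    """Map accession -> version from tab-separated gbStatus-style rows.

    The column positions are assumptions; check them against the real file.
    """
    versions = {}
    for row in csv.reader(gbstatus_lines, delimiter="\t"):
        if len(row) > max(acc_col, version_col):
            versions[row[acc_col]] = row[version_col]
    return versions


def add_versions(transcript_ids, versions):
    """Return (versioned IDs, IDs that had no match in gbStatus)."""
    versioned, missing = [], []
    for tid in transcript_ids:
        if tid in versions:
            versioned.append(f"{tid}.{versions[tid]}")
        else:
            missing.append(tid)
    return versioned, missing


# Made-up sample rows standing in for the downloaded gbStatus table.
sample = io.StringIO("NM_134564\t2\nNM_000001\t5\n")
versions = load_versions(sample)
versioned, missing = add_versions(["NM_134564", "ENST00000000001"], versions)
print(versioned)
print(missing)
```

Keeping the unmatched IDs in a separate list makes it easy to see which transcripts have no version information in the table.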
Now I need to know which versions of Ensembl and RefSeq I downloaded. Could you please let me know how and where I can find this information?
To summarize: which versions of the Ensembl and RefSeq transcripts are currently on the UCSC website? How often are they updated there? Is there a page online where this information is provided?
Another question: is Galaxy guaranteed to be in sync with all updates to the UCSC website? This may be a question for the Galaxy team, but I wanted to ask in case you know. When users access "UCSC Main" from Galaxy, are they connected to the live UCSC website or to an in-house copy of the UCSC browser within Galaxy?
Once again, thank you very much for your help.
Laura
________________________________
From: Brooke Rhead <rh...@soe.ucsc.edu>
To: Laura Smith <lsmith...@yahoo.com>
Cc: "gen...@soe.ucsc.edu" <gen...@soe.ucsc.edu>
Sent: Monday, June 4, 2012 5:15 PM
Subject: Re: [Genome] Downloading old refseq and ensemble transcripts with the "version numbers" in the accession IDs.
Thank you again for your email.
I have a question about gbStatus. I downloaded the gbStatus.txt file from the link you sent me:
http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/gbStatus.txt.gz
However, this file contains only RefSeq transcripts; it does not contain Ensembl transcripts.
I would also like to get the gbStatus information for Ensembl transcripts.
Does the UCSC Genome Browser provide it?
To be clearer:
For example, on the UCSC Genome Browser website, when I click on a random gene's RefSeq transcripts, they are shown as ACCESSION.VERSION, as in NM_123.4, where ".4" is the version.
However, the Ensembl transcripts have no version numbers; the UCSC website shows only accession numbers such as ENS_123. Ensembl transcripts should also have version numbers, as listed on the Ensembl website.
So, do you know why this information is not included in the UCSC Genome Browser?
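To make the ACCESSION.VERSION convention above concrete, here is a small sketch that splits an ID into its accession and version. The function name split_accession is made up for illustration, and the code assumes only that the version, when present, is the all-digit part after the last period.

```python
def split_accession(identifier):
    """Split 'ACCESSION.VERSION' into (accession, version).

    Returns (identifier, None) when there is no numeric version suffix.
    """
    head, sep, tail = identifier.rpartition(".")
    if sep and head and tail.isdigit():
        return head, int(tail)
    return identifier, None


print(split_accession("NM_134564.2"))
print(split_accession("NM_134564"))
```

An ID without a version comes back unchanged with None, so versioned and unversioned downloads can be handled by the same code.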
Another thing I tried: when I get gbStatus from the Galaxy website and do a join between gbStatus and the Ensembl transcripts, an empty set is returned. I am guessing this is because the original gbStatus.txt file contains no Ensembl transcripts, so there is nothing to join on accession ID.
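One way to check that guess would be to count what kinds of accessions the gbStatus rows contain. The sketch below is illustrative only: it assumes tab-separated rows with the accession in the first column, treats two capital letters followed by an underscore and digits as "RefSeq-style", and treats anything starting with "ENS" as "Ensembl-style". The sample rows are made up; for the downloaded file, the lines could instead come from gzip.open("gbStatus.txt.gz", "rt").

```python
import re
from collections import Counter

# Assumed pattern for RefSeq-style accessions such as NM_134564.
REFSEQ_STYLE = re.compile(r"^[A-Z]{2}_\d+$")


def classify(accession):
    """Label an accession by its naming style (a rough heuristic)."""
    if REFSEQ_STYLE.match(accession):
        return "RefSeq-style"
    if accession.startswith("ENS"):
        return "Ensembl-style"
    return "other"


def count_styles(lines):
    """Count accession styles in tab-separated rows (accession in column 1)."""
    counts = Counter()
    for line in lines:
        accession = line.split("\t", 1)[0].strip()
        if accession:
            counts[classify(accession)] += 1
    return counts


# Made-up rows standing in for the real table.
sample_rows = ["NM_134564\t2\n", "NR_000002\t1\n", "AB000003\t1\n"]
counts = count_styles(sample_rows)
print(counts["RefSeq-style"], counts["Ensembl-style"], counts["other"])
```

If the Ensembl-style count is zero for the real file, a join against Ensembl transcript IDs would be expected to return nothing.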
Is there any plan to include Ensembl versions in the gbStatus.txt file in the near future? If not, is there another way for me to retrieve them from the UCSC Genome Browser?
If you could give me any recommendation, I would greatly appreciate it.
Thank you,
Laura
________________________________
I have downloaded the placental phyloP scores from the UCSC Genome Browser website from here:
http://hgdownload.cse.ucsc.edu/goldenpath/hg19/phyloP46way/placentalMammals/
http://hgdownload.cse.ucsc.edu/goldenpath/hg19/phyloP46way/
Could you please confirm which version of chromosome M was used when calculating the phyloP scores? Was it the new rCRS chrM or the old chrM?
Thank you,
Laura