Question about protein functional domains


Dhanasekaran, Mohan

Aug 20, 2015, 12:07:41 PM
to gen...@soe.ucsc.edu, Vats, Pankaj

Hello,

We work at the University of Michigan and have a question about protein domains. We would like to know the genomic coordinates (start and end positions) of the functional domains (Pfam domains) present in each gene. Can you tell us whether this information is available as a table in the browser? It would be a great help if you could give us any pointers on how to obtain this information, either from UCSC or from any other resource that you may be aware of.

Thank you for your help.

Best

-Mohan


Steve Heitner

Aug 20, 2015, 3:54:50 PM
to Dhanasekaran, Mohan, gen...@soe.ucsc.edu, Vats, Pankaj

Hello, Mohan.

We have a Pfam Domains track on our hg38 Browser (http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg38&g=ucscGenePfam).  You can download the table in its entirety by downloading ucscGenePfam.txt.gz from http://hgdownload.cse.ucsc.edu/goldenPath/hg38/database/.
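
Once downloaded, the file can be read with a few lines of Python. The sketch below is only an illustration, not an official UCSC tool. It assumes the dump is tab-separated with a leading bin column followed by the standard BED fields (chrom, chromStart, chromEnd, name, score, strand, ...); please confirm the column order against ucscGenePfam.sql in the same directory. The names PfamDomain, parse_pfam_line and read_pfam are made up for this example.

```python
import gzip
from typing import Iterator, NamedTuple


class PfamDomain(NamedTuple):
    chrom: str
    start: int   # 0-based start, as in BED
    end: int     # end is exclusive, as in BED
    name: str    # Pfam domain name
    strand: str


def parse_pfam_line(line: str) -> PfamDomain:
    """Parse one row, ASSUMING the column order is bin + BED fields."""
    fields = line.rstrip("\n").split("\t")
    # fields[0] is the assumed 'bin' index column; BED fields start at 1.
    return PfamDomain(
        chrom=fields[1],
        start=int(fields[2]),
        end=int(fields[3]),
        name=fields[4],
        strand=fields[6],
    )


def read_pfam(path: str) -> Iterator[PfamDomain]:
    """Yield one PfamDomain per non-empty line of a gzipped table dump."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.strip():
                yield parse_pfam_line(line)
```

Looping over read_pfam("ucscGenePfam.txt.gz") and printing chrom, start, end and name for each record would then list the genomic span of every domain. Keep in mind that coordinates in UCSC tables are 0-based with an exclusive end.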

If you want to download only a part of the table or if you want to cross-reference with gene symbols, you can use our Data Integrator.  If you are unfamiliar with the Data Integrator, you can also refer to http://genome.ucsc.edu/goldenPath/help/hgIntegratorHelp.html.

Perform the following steps:

1. Navigate to http://genome.ucsc.edu/cgi-bin/hgIntegrator

2. In the “Select Genome Assembly and Region” section, select the following options:
Group: Mammal
Genome: Human
Assembly: Dec. 2013 (GRCh38/hg38)
Region to annotate: Select “position or search term” to select a single locus.  Select “genome” for the entire genome.  Select “defined regions” to select multiple loci.

3. In the “Configure Data Sources” section, select the following options:
Track group: Genes and Gene Predictions
Track: Pfam in UCSC Gene (ucscGenePfam)

4. Click the “Add” button

5. If you want to add another data source, repeat steps 3 & 4.  For example, if you want to cross-reference gene symbols, you will want to add a gene track like GENCODE or RefSeq Genes.

6. Note that there is a double-sided vertical arrow to the left of the track names in your data sources.  If you have multiple data sources, you can click and drag this vertical arrow to re-arrange the order of data sources.  The results will be organized by the top-most data source.  You can also click the “X” to the right to remove any data source.

7. In the “Output Options” section, click the “Choose fields” button to select which fields you would like to appear in your output.  The default is to include all fields.

8. Click the “Get output” button
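
If you would rather work from downloaded tables than from the Data Integrator, the cross-referencing described in step 5 can be approximated offline. The sketch below is a plain interval-overlap join and is not how the Data Integrator itself is implemented. The function name annotate_domains and the (chrom, start, end, name) tuple layout are assumptions for illustration, with 0-based, end-exclusive coordinates.

```python
from collections import defaultdict


def annotate_domains(domains, genes):
    """Attach overlapping gene names to each domain.

    domains, genes: iterables of (chrom, start, end, name) tuples with
    0-based, end-exclusive coordinates (an assumed layout).
    Returns a list of (chrom, start, end, domain_name, [gene_names]).
    """
    genes_by_chrom = defaultdict(list)
    for chrom, start, end, name in genes:
        genes_by_chrom[chrom].append((start, end, name))

    annotated = []
    for chrom, start, end, name in domains:
        # Two half-open intervals overlap when each starts before the other ends.
        hits = sorted(
            {g_name
             for g_start, g_end, g_name in genes_by_chrom[chrom]
             if g_start < end and start < g_end}
        )
        annotated.append((chrom, start, end, name, hits))
    return annotated
```

This compares every domain against every gene on the same chromosome, which is simple but slow for genome-wide tables; a sorted sweep or an interval tree would scale better.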

Please contact us again at gen...@soe.ucsc.edu if you have any further questions. 
Questions sent to that address will be archived in a publicly-accessible forum for the benefit of other users.  If your question contains sensitive data, you may send it instead to genom...@soe.ucsc.edu.

---
Steve Heitner
UCSC Genome Bioinformatics Group

Vats, Pankaj

Sep 9, 2015, 1:54:08 PM
to st...@soe.ucsc.edu, Dhanasekaran, Mohan, gen...@soe.ucsc.edu

Hi Steve,

I tried to download the Pfam in UCSC Gene (ucscGenePfam) track for the hg19 build along with RefSeq Genes, but the output ended in the error below.

Here are the steps I followed:

1. In the “Select Genome Assembly and Region” section, I selected the following options:
Group: Mammal
Genome: Human
Assembly: Feb. 2009 (GRCh37/hg19)
Region to annotate: “genome” for the entire genome

2. In the “Configure Data Sources” section, I selected the following options:
Track group: Genes and Gene Predictions
Track: Pfam in UCSC Gene (ucscGenePfam)

3. I repeated step 2 for the RefSeq Genes track.

4. I sent the output to a file. The file contains the error below at chr10:

“annoGrator refGene: Unsorted input from primary source (chr10, 3109860 < 100374755)”

Can you please help us out?

Thanks,

Pankaj

Jonathan Casper

Sep 9, 2015, 7:32:29 PM
to Vats, Pankaj, Dhanasekaran, Mohan, gen...@soe.ucsc.edu

Hello Pankaj,

Thank you for your report of an error while using the UCSC Data Integrator. This sounds like an issue that we are already aware of, and one of our engineers is currently working to resolve it. We believe the bug is now fixed on our development server at http://genome-test.soe.ucsc.edu. Please try running your queries there instead and let us know if you have any further problems.
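
For anyone working from a downloaded copy of a table in the meantime, the sketch below shows one way to check whether rows are in position order. It takes the error message at face value, assuming it means that a row starting at 3109860 on chr10 arrived after a row starting at 100374755; that reading is an inference from the message text, not a description of the annoGrator code. The function name first_unsorted and the (chrom, start) record layout are hypothetical.

```python
def first_unsorted(records):
    """Return the first out-of-order row, or None if none is found.

    records: iterable of (chrom, start) tuples in file order.
    A row counts as out of order when it is on the same chromosome as the
    previous row but has a smaller start position.
    Returns (chrom, start, previous_start) for the offending row.
    """
    prev_chrom, prev_start = None, None
    for chrom, start in records:
        if chrom == prev_chrom and start < prev_start:
            return (chrom, start, prev_start)
        prev_chrom, prev_start = chrom, start
    return None
```

If the check reports a row, sorting the records by chromosome and then by numeric start position (for example with Python's sorted on the tuples) before further processing should make it pass.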

I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu or genome...@soe.ucsc.edu. Questions sent to those addresses will be archived in publicly-accessible forums for the benefit of other users. If your question contains sensitive data, you may send it instead to genom...@soe.ucsc.edu.

--
Jonathan Casper
UCSC Genome Bioinformatics Group

