Dear UCSC Genome Browser support team,
I wish to bulk-extract the names of the source (parent) genes for the entire set of retroposed genes, including pseudogenes, available in either hg19 or hg38.
For example, in the retroposed genes track of the Genome Browser, clicking on the retrogene NM_012097.3-2 opens the information table “Retroposed Genes V9, Including Pseudogenes (NM_012097.3-2)”, which shows that ARL5A is its source gene (see attached screenshot).
However, I could not find the table that contains the source (parent) gene name using either the Table Browser or the Data Integrator.
Here is the path I took in the Table Browser:
Clade: Mammal
Genome: human
Assembly: hg19 or hg38
Group: Genes and Gene Predictions
Track: retroposed genes
Tables: ucscRetroAli5, ucscRetroCds5, ucscRetroExpressed5, ucscretroInfo5, ucscretroOrtho5, ucscretroSeq5
Output format: selected fields from primary and related tables.
Best regards,
Enrique
Hi Enrique,
Thank you for your question about bulk extracting source gene names
for the Retro Genes track. Unfortunately, the query used to extract this
information is unsupported in the Table Browser, and so in order to
bulk extract these data, you will need to query our public MySQL server,
which requires access to a command line and having MySQL installed on
your machine. Once you have this set up, the following command will
extract the source gene information for the hg38 RetroGenes 9 track:
mysql --host=genome-mysql.soe.ucsc.edu --user=genome -A -Ne "select r.name as id, gene.name from hg38.ucscRetroInfo9 r, hgFixed.geneName gene, hgFixed.gbCdnaInfo g, hgFixed.description d where (substring(r.name,1,locate('.', r.name)-1))=g.acc and g.geneName=gene.id and g.description=d.id"
This query results in output like the following:
A22930.1-6	n/a
A22938.1-11	n/a
A22938.1-14	n/a
A22938.1-21	n/a
AB002312.2-5	KIAA0314
AB004304.1-75	n/a
AB004304.1-84	n/a
AB009619.1-2	n/a
AB011119.1-179	KIAA0547
AB011539.2-17	MEGF6
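If it helps with post-processing, here is a minimal Python sketch for loading that output into a lookup table. It assumes the query result was saved as tab-separated text (the usual format when mysql runs with -e), with the retrogene ID in the first column and the gene name in the second. The function names parse_retro_parents and source_accession are hypothetical helpers made up for this illustration; they are not part of any UCSC tool.

```python
from typing import Dict, Optional


def parse_retro_parents(text: str) -> Dict[str, Optional[str]]:
    """Map each retrogene ID to its source gene name (None where the name is 'n/a').

    Assumes one record per line with two tab-separated columns.
    """
    parents: Dict[str, Optional[str]] = {}
    for line in text.splitlines():
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        retro_id, gene = fields[0], fields[1]
        parents[retro_id] = None if gene == "n/a" else gene
    return parents


def source_accession(retro_id: str) -> str:
    """Return the part of the ID before the first '.', as the query's
    substring(r.name, 1, locate('.', r.name) - 1) does for IDs containing a dot."""
    return retro_id.split(".", 1)[0]


# Three rows taken from the sample output above.
sample = "A22930.1-6\tn/a\nAB002312.2-5\tKIAA0314\nAB011539.2-17\tMEGF6\n"
parents = parse_retro_parents(sample)

# Keep only retrogenes that have a named source gene.
named = {rid: gene for rid, gene in parents.items() if gene is not None}
for rid, gene in named.items():
    print(source_accession(rid), rid, gene, sep="\t")
```

For a full run, the same function could be applied to the contents of a file holding the redirected query output; the file name would be whatever you chose when saving it.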
For more information about using our public MySQL server, please see the following page:
http://genome.ucsc.edu/goldenPath/help/mysql
Thank you again for your inquiry and for using the UCSC Genome Browser. If
you have any further questions, please reply to gen...@soe.ucsc.edu.
All messages sent to that address are archived on a
publicly-accessible forum. If your question includes sensitive data,
you may send it instead to genom...@soe.ucsc.edu.
Christopher Lee
UCSC Genomics Institute