Search for a homologous protein in the genome


Alexander Tsygankov

Nov 1, 2019, 4:09:28 PM
to gen...@soe.ucsc.edu
Hello, colleagues-

Sorry to bug you. I looked through the FAQs and descriptions but failed to find the answer I need.

I need to find out whether the lamprey genome (Pmar_germline 1.0 of Dec 2017) contains one OR two members of a gene/protein family (UBASH3, which is a two-member family in other vertebrates). Using either name as a search term, I get both family members as RefSeq genes, but these hits are in fact in the same locus, and the predicted sequences there correspond to one of the family members. (Note that the expected gene coding sequence is divided between two predicted sequences, but their combination looks as expected for one family member.)

It is possible that the second family member (if it exists at all in the lamprey genome) is not well predicted and that its coding sequence is also distributed between several predicted genes. I would like to BLAST the sequence of this second family member from different species against all predicted gene sequences of this genome, but I could not find out how to do this. I did try BLAT, but BLAT is sensitive to substitutions; the sequences I am going to use as queries are likely to have way less than 80% identity to the lamprey sequence.

Any suggestions are highly appreciated!

Best regards,
Alex Tsygankov

Jairo Navarro Gonzalez

Nov 6, 2019, 7:49:00 PM
to Alexander Tsygankov, UCSC Genome Browser Mailing List

Hello Alex,

Thank you for using the UCSC Genome Browser and for sending your inquiry.

Unfortunately, this mailing list is not intended for scientific advice, so I cannot give you a reason why the second family member may or may not be in the genome. There are forums like Biostars, https://www.biostars.org/, where scientists may be able to provide you with the scientific direction you need, or other agencies devoted to resolving such questions. Please note that when genome assemblies are not of very high quality, genes may appear to be missing simply because the assembly is incomplete.

A protein BLAT search can find sequences of 20 amino acids or more with 80% or greater similarity. Searching with the protein sequence can help you determine whether the second family member is in the lamprey genome. To find which gene annotations overlap with your BLAT results, you can follow these instructions:

1. Perform the protein BLAT search and create a custom track
2. Using the Table Browser, intersect the BLAT custom track with the gene track of your choice.

If you are unfamiliar with creating a Table Browser intersection, a tutorial can be found on the help page, https://genome.ucsc.edu/goldenPath/help/hgTablesHelp.html#Intersection.
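
If you would rather check the overlap offline, the same idea can be sketched in a few lines of Python. This is only a rough illustration of what an intersection does, not the Table Browser's own implementation. It assumes you have exported both the BLAT custom track and the gene track as BED text; the function names (parse_bed, overlapping_genes) and the scaffold, hit, and gene names in the usage note are made up for the example.

```python
from collections import defaultdict


def parse_bed(lines):
    """Parse BED lines into (chrom, start, end, name) tuples.

    Blank lines and track/browser/comment lines are skipped.
    """
    records = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("track", "browser", "#")):
            continue
        fields = line.split()
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        name = fields[3] if len(fields) > 3 else "."
        records.append((chrom, start, end, name))
    return records


def overlapping_genes(hits, genes):
    """Map each hit name to the names of genes whose interval overlaps it."""
    genes_by_chrom = defaultdict(list)
    for chrom, start, end, name in genes:
        genes_by_chrom[chrom].append((start, end, name))

    result = {}
    for chrom, h_start, h_end, h_name in hits:
        # BED intervals are half-open, [start, end), so two intervals
        # overlap only if each one starts before the other ends.
        matches = [
            g_name
            for g_start, g_end, g_name in genes_by_chrom[chrom]
            if g_start < h_end and h_start < g_end
        ]
        result.setdefault(h_name, []).extend(matches)
    return result
```

For example, with hypothetical hits ["scaf_1 100 200 hitA"] and genes ["scaf_1 150 300 geneX", "scaf_1 200 400 geneY"], overlapping_genes reports geneX for hitA but not geneY, because geneY starts exactly where the hit ends. A simple scan like this is fine for a handful of BLAT hits; for large files a dedicated interval tool would be a better fit.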

I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu.
All messages sent to that address are archived on a publicly-accessible Google Groups forum.
If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Jairo Navarro
UCSC Genome Browser

Want to share the Browser with colleagues?
Host a workshop: https://bit.ly/ucscTraining

