Pseudogenes


Meh...@mskcc.org

Aug 23, 2022, 3:34:43 PM
to gen...@soe.ucsc.edu

To Whom It May Concern,

 

I was hoping to get some help identifying whether particular genes have a pseudogene and, if so, what that pseudogene’s name/transcript is.  I saw in the archives that the appropriate track to use is the GENCODE track, and I see that I have to set it to show and full to view it in the tracks.  However, I’m having trouble 1) getting data to show up in the track even with display turned on and 2) understanding any data that is in there.

 

For example, I know that PMS2’s pseudogene is called PMS2CL.  However, when I search for PMS2, turn on the GENCODE track, and select the sub-filter of “pseudo”, nothing is displayed for PMS2.

When I search for PMS2CL, then I see transcripts under the pseudogene track.

 

The same thing happens when I search for CHEK2 versus CHEK2P2.

 

Is there a way to use UCSC Genome Browser to identify a gene’s pseudogene or pseudogenes?  Is there also a way to use the UCSC Genome Browser, especially since GENCODE gives transcripts, to determine where the pseudogene overlaps with the real gene?

 

Thank you in advance for any help you can provide!


Sincerely,

Nikita M.

 

 

Nikita Mehta, MS, CGC

Genetic Analysis Specialist, Sr

Diagnostic Molecular Genetics Laboratory, Department of Pathology

 

Memorial Sloan Kettering Cancer Center

1250 First Ave., New York, NY 10065

Schwartz Building

Meh...@mskcc.org

 


Daniel Schmelter

Aug 24, 2022, 6:29:22 PM
to Meh...@mskcc.org, gen...@soe.ucsc.edu

Hello Nikita,

Thank you for contacting the Genome Browser support team with your question about pseudogenes.

You are right to look in the GENCODE Pseudogenes track, but there appears to be no data there for the PMS2 region. The Pseudogenes track reports only the positions of the pseudogene annotations themselves, not a list of all regions similar to the current range. For an alignment of the current region to the entire genome, which should reveal pseudogenes and includes a base-by-base alignment, you can turn on the Self-Chain track. You could also use BLAT to search for similar sequences and use those results to find pseudogenes.

A slightly broader method of finding pseudogenes using UCSC tools is a Table Browser query on the name2 field of the Pseudogenes dataset using a wildcard search, such as PMS*. You can do this by selecting the Pseudogenes dataset and then clicking "Create" next to Filter. A test of this method returned dozens of hits to many PMS transcripts, though they may need to be looked through individually.


I hope this was helpful! If you have any more questions, please reply-all to our public support email at gen...@soe.ucsc.edu. For private communication, please reply-all to genom...@soe.ucsc.edu.
All the best,


Meh...@mskcc.org

Aug 25, 2022, 5:41:28 PM
to dsch...@ucsc.edu, gen...@soe.ucsc.edu

Hi Daniel,

 

Thank you for this response.  I think the Table Browser search will be the easiest way for me to start.  I had initially tried the filter search there, but didn’t know how to use it correctly.

 

Following your directions, I set the track to GENCODE V41lift37, the table to Pseudogenes, and then made the filter using the * after the gene.  My follow-up questions are 1) why does this gene name have to go into the name2 field of the filter instead of name (I tested it and it doesn’t work after hitting get output) and 2) can I search for multiple genes at once, perhaps somehow using the free-form query?  I tested the latter two, but couldn’t get it to work, but I’m not sure if my syntax is just wrong for that section.

 

Thanks again for your help!

Nikita

 

PS - I’ll definitely keep Self-Chain and BLAT searches in mind for actual sequence alignment.  I think I’m finding several pseudogenes for some genes as you said, and to determine if there is any interference from those in a clinical assay, the sequence search probably will be more useful.

 


 

Gerardo Perez

Sep 2, 2022, 8:44:29 PM
to Meh...@mskcc.org, gen...@soe.ucsc.edu

Hello, Nikita.

Thank you for your follow-up questions.

1) why does this gene name have to go into the name2 field of the filter instead of name (I tested it and it doesn’t work after hitting get output)

The name field in the GENCODE V41lift37 track consists of transcript identifiers that start with ENST. The name2 field consists of gene symbols such as PMS2CL and CHEK2P2. You can check the fields of a track by clicking the table schema button next to the "table:" option:

table_schema.png
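
To illustrate why the same wildcard pattern behaves differently on the two fields, here is a minimal Python sketch. It is not how the Table Browser is implemented; it only mimics a *-style match on a few rows. The rows are a hypothetical sample: the gene symbols appear in this thread, but the ENST identifiers are made-up placeholders.

```python
from fnmatch import fnmatchcase

# Hypothetical sample rows as (name, name2). The ENST IDs are placeholders,
# not real transcript identifiers.
ROWS = [
    ("ENST_PLACEHOLDER_1", "PMS2CL"),
    ("ENST_PLACEHOLDER_2", "PMS2P1"),
    ("ENST_PLACEHOLDER_3", "CHEK2P2"),
]


def filter_rows(rows, field, pattern):
    """Return rows whose chosen field matches a *-style wildcard pattern."""
    index = {"name": 0, "name2": 1}[field]
    return [row for row in rows if fnmatchcase(row[index], pattern)]


# The gene symbol lives in name2, so the pattern only matches there.
matches_name2 = filter_rows(ROWS, "name2", "PMS2*")
matches_name = filter_rows(ROWS, "name", "PMS2*")
print([row[1] for row in matches_name2])
print(matches_name)
```

Because every name value starts with ENST, a gene-symbol pattern applied to name cannot match anything, which is consistent with the empty output described in the question.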

2) can I search for multiple genes at once, perhaps somehow using the free-form query?

Yes, you can use the free-form query option to search for an additional gene. For example, on the Filter on Fields page, you can do the following:

name2 does match PMS2* AND

OR Free-form query: name2 like "CHEK2*"

The free-form query takes in SQL syntax, such as:

name2 like "CHEK2%" 
name2 like "CHEK2*" 
name2 like 'CHEK2%'
name2 like 'CHEK2*'
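
As a rough illustration of combining two gene patterns with OR, the sketch below runs an equivalent query against an in-memory SQLite table. This is an assumption-laden stand-in, not the Table Browser's database: the table name `pseudo`, the placeholder ENST IDs, and the `OTHERP1` row are all hypothetical. In standard SQL, `%` is the wildcard for LIKE, so that is the form used here.

```python
import sqlite3

# In-memory stand-in table; "pseudo" is a hypothetical name for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pseudo (name TEXT, name2 TEXT)")
conn.executemany(
    "INSERT INTO pseudo VALUES (?, ?)",
    [
        ("ENST_PLACEHOLDER_1", "PMS2CL"),
        ("ENST_PLACEHOLDER_2", "CHEK2P2"),
        ("ENST_PLACEHOLDER_3", "OTHERP1"),  # made-up symbol that should not match
    ],
)


def genes_matching(patterns):
    """Return sorted gene symbols whose name2 matches any of the LIKE patterns."""
    where = " OR ".join("name2 LIKE ?" for _ in patterns)
    query = f"SELECT DISTINCT name2 FROM pseudo WHERE {where} ORDER BY name2"
    return [row[0] for row in conn.execute(query, patterns)]


print(genes_matching(["PMS2%", "CHEK2%"]))
```

More genes can be added by extending the pattern list, which simply appends further OR clauses.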

Also, we are working on improving the display under the main gene, where you will see the locations of all pseudogenes for the current gene in different colors.

I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible Google Groups forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Gerardo Perez
UCSC Genomics Institute


Mehta, Nikita

May 2, 2024, 3:54:22 PM
to Gerardo Perez, gen...@soe.ucsc.edu

Hello Gerardo,

 

I’m looking at pseudogenes again, and I was wondering if UCSC was able to implement a track that “improv[es] the display under the main gene, where you will see the locations of all pseudogenes for the current gene in different colors” as you previously mentioned.

 

Thanks!

Nikita

 


Gerardo Perez

May 13, 2024, 8:25:28 PM
to Mehta, Nikita, gen...@soe.ucsc.edu

Hello, Nikita.

There has been some progress. We recently got the data from the Gerstein group but still need some additional data fields. If you are interested, we would be happy to share an early version and you can provide us some feedback. For example, would you be interested to see the actual base pair alignments, or would the exon annotation itself be good enough? 

I hope this is helpful. Please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible Google Groups forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Gerardo Perez
UCSC Genomics Institute

Mehta, Nikita

Jul 8, 2024, 6:00:23 PM
to Gerardo Perez, gen...@soe.ucsc.edu

Hi Gerardo,

 

I’m so sorry for never responding and letting this sit for 2 months.  If it would help to have someone test things, I’d be happy to see an early version (if it’s still early and not already in production) of this pseudogene track.


I think from my perspective as a variant curator, the exon structure is sufficient.  However, I know my colleagues would probably appreciate base pair alignments (for panel design, primer design, etc.).  For example, I’ve seen them use Clustal to align sequences manually.

 

Please let me know what I can do!

 

Thanks,

Nikita

 


Jairo Navarro Gonzalez

Jul 11, 2024, 6:30:31 PM
to Mehta, Nikita, Gerardo Perez, gen...@soe.ucsc.edu

Hello,

Thank you for using and helping improve the UCSC Genome Browser.

I have added a note to contact you once the track has developed enough so you can review the track. Unfortunately, I cannot give an estimated date for when it will be available for review.

If you have any further questions, please reply to gen...@soe.ucsc.edu.

All messages sent to that address are archived on a publicly accessible Google Groups forum.


If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Jairo Navarro
UCSC Genome Browser


Gerardo Perez

Mar 31, 2025, 7:53:56 PM
to Mehta, Nikita, gen...@soe.ucsc.edu
Hello, Nikita.

We wanted to follow up to let you know that we have released the Pseudogenes track for hg38. Here is the link to the track announcement: https://genome.ucsc.edu/goldenPath/newsarch.html#033125.

I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible Google Groups forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Gerardo Perez
UCSC Genomics Institute

Mehta, Nikita

Apr 1, 2025, 12:13:34 PM
to Gerardo Perez, gen...@soe.ucsc.edu

Oh wow!  Thank you so much!  This will make tracking complications due to pseudogenes much easier.  I’m sure others will find this track useful too.

 

Thanks again!

Nikita

 


Mehta, Nikita

Apr 1, 2025, 12:13:51 PM
to Gerardo Perez, gen...@soe.ucsc.edu

Hi Gerardo,

 

I’m trying to play around with the track a little bit, and I’m not sure if I’m not doing something correctly or I just need to understand the information presented better.

 

Using PMS2 as an example, I loaded that gene into GRCh38 and the track shows that it is a parent gene (purple) with pseudogene (gray lines below).  However, when I click on any of those gray lines, representing the pseudogenes, it’s hard to tell which one it is.  Let’s say I’m looking for PMS2CL, the PGOHUMTID is hard to correlate (as I’m unfamiliar with this ID).  I think “PMS2CL” is a HUGO ID, so could that be given as well in the hover-over and detailed views?

 

On the other hand, when I look up PMS2CL in GRCh38, it does not come up at all in the pseudogene subtrack and PMS2 is listed there as opposed to the parent gene subtrack.  PMS2 is also color-coded in blue for being processed, which I guess means that PMS2CL is a processed pseudogene, but then when you click on PMS2, it says unprocessed_pseudogene (which I assume is also meant to refer to PMS2CL even though the gene in that track is labeled PMS2).  Either I’m not using this track properly or perhaps these are errors?

 

Finally, I think that when you search for a specific pseudogene, the view does show which exons overlap with the parent gene.  However, when you search for the parent gene, this exon-level detail is not present, and I think it would also be useful for determining when a variant or CNV call could be complicated by a pseudogene.  Furthermore, since the pseudogenes aren’t easily identifiable, I thought perhaps I could get the PGOHUMTID by searching for a specific pseudogene, so that I could go back to the PMS2 search and identify the gray line that corresponds to PMS2CL, but I do not see the ID for PMS2CL when I do that search.

 

Could you let me know if I’m not using the track in the ideal way and if I’m missing features?

 

Thanks,

Nikita

 


 

Gerardo Perez

Apr 4, 2025, 11:13:15 PM
to Mehta, Nikita, gen...@soe.ucsc.edu

Hello, Nikita.

Thank you for following up with us and for sharing your feedback on the Pseudogenes track.

There was an error in how we labeled the color-coded pseudogene types on both the track description page and the news announcement. These have now been corrected to reflect that unprocessed pseudogenes are shown in blue and processed pseudogenes in olive green. We appreciate you bringing this to our attention.

We have created an internal ticket to consider incorporating your suggestions, such as adding HUGO IDs to the item details page and the mouse hover text for the gray pseudogene indicators. We are also looking into why the track displays PMS2 but not PMS2CL.

It would be helpful if you could take a look at the track once the updates are in place and share any feedback you may have.

Regarding your note that "the view does show which exons overlap with the parent gene," could you clarify what you meant? The track currently shows the full pseudogene as annotated by Yale but does not indicate which specific exons overlap with the parent gene.

I hope this is helpful. Please include gen...@soe.ucsc.edu in any replies to ensure visibility by the team. All messages sent to that address are archived on a publicly-accessible Google Groups forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Gerardo Perez
UCSC Genomics Institute

Mehta, Nikita

Apr 16, 2025, 1:52:57 PM
to Gerardo Perez, gen...@soe.ucsc.edu

Hello,

 

Sorry for the delay, but thanks for the updates!  I was looking at this again today and noticed the following after I searched for PMS2 in the search bar.

 

Coloring

  1. I do see that the parent gene, PMS2, is labeled in purple.
  2. However, all the pseudogenes are only in gray below it (seems consistent with the blog page).  They do not show up in the Yale Pseudogenes subtrack with the orange/blue/green coloring.

  3. When I do click on a PGOHUMT ID, the Yale Pseudogenes subtrack populates with the correct coloring (example screenshot below), but presumably only for that one pseudogene, and it’s strangely labeled with PMS2 instead of PMS2CL.  The parent track is also now blank.

 

Regarding requests and suggestions (adding HUGO IDs to the item details page and the mouse hover text for the gray pseudogene indicators), appreciate that!

 

Thank you for looking into why the search for a pseudogene doesn’t seem to work properly.

  1. Coloring issue mentioned above: the parent gene shows up in the Yale Pseudogenes subtrack, colored according to the pseudogene colors.
  2. The Yale Pseudogene Parents subtrack is empty when you search for PMS2CL (and I assume other pseudogenes).  This is probably a small thing, but I think would still be helpful to have displayed and linked.

 

Regarding exon view, please see my initial question with screenshots, which will hopefully clarify.

  1. If you search for a specific pseudogene (PMS2CL), the view does show you exon structure.  I thought the track that shows you this is the pseudogene track, but it’s actually the GENCODE track (pink).  I would have thought that the pseudogene track should be PMS2CL too and that the exon structure represents that of the pseudogene, because it is labeled in blue to indicate that PMS2CL is a pseudogene.  However, I’m confused because it is labeled as PMS2 in the Yale Pseudogenes subtrack (blue) and doesn’t seem to align with the GENCODE track (pink).

  2. What I was hoping to see is which exons for each of the pseudogenes overlap with the actual PMS2 gene.  I made the following up, but something like this:

  3. Disregard this: “Furthermore, since the pseudogenes aren’t easily identifiable, I thought perhaps I could get the PGOHUMTID by search for a specific pseudogene, but I do not see the one for PMS2CL when I do that search so that I can go back to the PMS2 search and identify the gray line that corresponds to PMS2CL.”  I do see the PGOHUMTID for the pseudogene, but again the PMS2 label in the Yale Pseudogenes track is what threw me off, because I think this is supposed to say PMS2CL.

 

I hope that clarifies things!  Please let me know if and when there are any more things to try out.  I’m more than happy to as I really appreciate this effort!!

 

Thanks again,

Nikita

 


Jairo Navarro Gonzalez

Apr 17, 2025, 3:46:29 PM
to Mehta, Nikita, Gerardo Perez, gen...@soe.ucsc.edu

Hello Nikita,

Thank you for sharing your feedback on the Pseudogenes track. We have relayed your message to the engineer working on the track development. Please let us know if you have any other suggestions or ways we can improve the track to better serve our users.

If you have any further questions, please reply to gen...@soe.ucsc.edu.
All messages sent to that address are archived on a publicly accessible Google Groups forum.
If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Jairo Navarro
UCSC Genome Browser

Gerardo Perez

May 7, 2025, 5:03:50 PM
to Mehta, Nikita, gen...@soe.ucsc.edu

Hello, Nikita.

Thank you again for your helpful feedback on the Pseudogenes track.

We have made several updates based on your comments:

  • HUGO IDs: We added HUGO gene symbols to the item details page and the mouseover text for the gray indicators in the Pseudogene Parents track. You should now see “PMS2CL” on both the item details page and mouseover.
  • Improved search: Search functionality has been updated to support PseudoPipe IDs, Ensembl IDs, and HUGO IDs. For example, searching for “PMS2CL” should now return a result under the Yale Pseudogenes subtrack.
  • Limitations on exon overlap: Unfortunately, we are unable to display which specific exons of the pseudogenes overlap with the parent gene. The Yale data does not indicate which specific exons overlap with the parent gene.
  • Gray indicators: Regarding pseudogenes that appear only as gray bars below the gene and do not show up in the Yale Pseudogenes subtrack, these gray indicators are not intended to represent the actual locations of pseudogenes. Instead, they indicate the number of pseudogenes linked to the gene. The mouseover text and item details page for each gray bar include a link that navigates to the corresponding pseudogene location. We have also updated the track description page to clarify this:
    https://genome-test.gi.ucsc.edu/cgi-bin/hgTrackUi?db=hg38&g=pseudogenes&position=default#TRACK_HTML

Could you take a look at the track on our development site and let us know if you have any feedback? Here is a session that shows PMS2CL on our development site: https://genome-test.gi.ucsc.edu/cgi-bin/hgTracks?hgS_doOtherUser=submit&hgS_otherUserName=Gerardo&hgS_otherUserSessionName=PSM2

Please include genom...@soe.ucsc.edu in any replies to ensure visibility by the team.

Gerardo Perez
UCSC Genomics Institute

Mehta, Nikita

May 16, 2025, 5:38:25 PM
to Gerardo Perez, gen...@soe.ucsc.edu, genom...@soe.ucsc.edu

Hello Gerardo and Team,

 

Thank you so much for taking my feedback into consideration!  I had some time to look at your most recent updates today.  My comments are below.

  • Subtracks and Gray indicators

The updated description page is helpful, so first, I want to make sure I understand the purpose of the Pseudogene Parent track and Pseudogene track properly.  Is it correct that the Pseudogene Parent track lists all the pseudogenes that exist for the gene that you search for?  That is in part what is said by this sentence: “These indicators do not show pseudogene locations directly but instead indicate how many pseudogenes are associated with each gene and link to their genomic regions in the Pseudogenes track.”  The Pseudogenes track is meant to show you the type of pseudogene and its structure, but only if you search for that specific pseudogene (e.g., PMS2 search will only give you information in the parent track whereas PMS2CL search will only give you information in the pseudogene track).  Both subtracks allow you to link between the gene and pseudogenes (essentially allowing you to toggle between using the parent and pseudogene tracks).

  • HUGO IDs: We added HUGO gene symbols to the item details page and the mouseover text for the gray indicators in the Pseudogene Parents track. You should now see “PMS2CL” on both the item details page and mouseover.

The HUGO IDs are immensely helpful.  I like that they are present on the side on the parent track and on mouseover.  I did notice that the parent gene is at the top (as expected) but also listed again amongst the pseudogenes; I don’t necessarily mind that, but when I hover over the top one, only PMS2CL is listed as the pseudogene position and I think when I hover over the bottom one, the rest of the pseudogenes are listed.  I’m also not clear as to why the structure looks different for PMS2 between the top row and the bottom row (looks shifted).

  • Improved search: Search functionality has been updated to support PseudoPipe IDs, Ensembl IDs, and HUGO IDs. For example, searching for “PMS2CL” should now return a result under the Yale Pseudogenes subtrack.

Super helpful, especially if there is a pseudogene that one has not heard of and then you want to look at it.  Or even if you just want structure of a pseudogene that you do know about and don’t want to go through a multistep process to get to it.

  • Limitations on exon overlap: Unfortunately, we are unable to display which specific exons of the pseudogenes overlap with the parent gene. The Yale data does not indicate which specific exons overlap with the parent gene.

Oh ok, I understand that this is a limitation based on the data you have.  I wanted to see if I could manually figure this out.  The following steps seem to work, but I’m no programming expert.  Do you think there might be some way to use an algorithm to do all of this and somehow get an exon overlap track to work?

  • Use the parent gene track to jump to the pseudogene of interest.
  • Right-click on exon to zoom in to the exon of interest.  Copy the genomic coordinates from the field to the Get DNA tool (https://genome.ucsc.edu/cgi-bin/hgc?hgsid=2331139134_ydzcaj8JwEYBvEIbXqOjCdOXcGSg&o=136130562&g=getDna&i=mixed&c=chr9&l=136130562&r=136150630&db=hg19).
  • Use BLAT Search to find all regions that map to that sequence.
  • The only issue that I find in the results is that I can't easily locate the parent gene in the list.  It might be nice if HUGO IDs could be available on this result page too.  Otherwise, I could narrow down the list based on the boundaries of the parent gene.
  • Pick the right result to open the parent gene exon that matches and hover over for the exon position and/or highlight the exon to then zoom out and see it in the context of the entire parent gene transcript.
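
The narrowing step in the list above (using the parent gene's boundaries to find its hit among the BLAT results) can be sketched in Python. This is only an illustration under stated assumptions: the `Hit` type, the chromosome names, and all coordinates and scores are made up, and real BLAT output would first need to be parsed into this shape.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """One alignment hit, using half-open coordinates (start inclusive, end exclusive)."""
    chrom: str
    start: int
    end: int
    score: int


def overlaps(hit, chrom, start, end):
    """True if the hit shares at least one base with chrom:start-end."""
    return hit.chrom == chrom and hit.start < end and start < hit.end


def split_by_parent(hits, chrom, start, end):
    """Separate hits inside the parent gene from hits elsewhere in the genome."""
    inside = [h for h in hits if overlaps(h, chrom, start, end)]
    outside = [h for h in hits if not overlaps(h, chrom, start, end)]
    return inside, outside


# Made-up values for illustration only; not real PMS2 or BLAT coordinates.
PARENT = ("chrA", 1000, 5000)
HITS = [
    Hit("chrA", 1200, 1350, 150),    # falls inside the parent gene
    Hit("chrA", 90000, 90150, 140),  # same chromosome, outside the gene
    Hit("chrB", 300, 450, 120),      # different chromosome
]

inside, outside = split_by_parent(HITS, *PARENT)
print("parent gene hits:", inside)
print("candidate pseudogene hits:", outside)
```

Hits in `outside` are only candidates; each still needs to be checked against the annotation tracks before being treated as a pseudogene match.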

 

Thanks again for all of your time!  I’m going to test this out some more as I do some pseudogene investigations for my lab.

 

Happy Weekend!

Nikita

 


Gerardo Perez

Oct 20, 2025, 9:17:52 PM
to Mehta, Nikita, gen...@soe.ucsc.edu

Hello, Nikita.

We apologize for the delay in our response. We will address your questions below:

Is it correct that the Pseudogene Parent track lists all the pseudogenes that exist for the gene that you search for?

The Pseudogenes track is meant to show you the type of pseudogene and its structure, but only if you search for that specific pseudogene (e.g., PMS2 search will only give you information in the parent track whereas PMS2CL search will only give you information in the pseudogene track).

When you search for a gene, the search results will include the pseudogenes available from the Yale Pseudogenes track (not from the Yale Pseudogene Parents track). For example, searching for PMS2 shows pseudogene results listed under Yale Pseudogenes on the search results page. The following screenshot shows the PMS2 pseudogene results:

image.png


A search for PMS2CL lists the PMS2CL pseudogene under the Yale Pseudogenes track, as shown in the following screenshot:

image.png

Both subtracks allow you to link between the gene and pseudogenes (essentially allowing you to toggle between using the parent and pseudogene tracks).

Yes, both subtracks allow you to link between the gene and its pseudogenes.


I did notice that the parent gene is at the top (as expected) but also listed again amongst the pseudogenes; I don’t necessarily mind that, but when I hover over the top one, only PMS2CL is listed as the pseudogene position and I think when I hover over the bottom one, the rest of the pseudogenes are listed. I’m also not clear as to why the structure looks different for PMS2 between the top row and the bottom row (looks shifted).

The two parent gene entries appear because PMS2 has two transcripts. Each transcript is associated with different pseudogenes, which may appear above or below the transcript display. The ENST00000441476.6 PMS2 transcript is associated with the PMS2CL pseudogene, while the ENST00000643595.1 PMS2 transcript is associated with PMS2P1–PMS2P12, AC004980.8, and CH17-264B6.3.

image.png
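
The transcript-to-pseudogene associations described above can be written out as a small lookup table, which may make it clearer why PMS2 appears twice. This is a sketch of the relationship only, not the track's actual data format, and expanding "PMS2P1–PMS2P12" into twelve consecutive symbols is an assumption made for illustration.

```python
# Associations as described in the reply above. Expanding the range
# PMS2P1..PMS2P12 into twelve consecutive symbols is an assumption.
PARENT_TO_PSEUDOGENES = {
    "ENST00000441476.6": ["PMS2CL"],
    "ENST00000643595.1": [f"PMS2P{i}" for i in range(1, 13)]
    + ["AC004980.8", "CH17-264B6.3"],
}


def parents_of(pseudogene):
    """Return the parent transcripts that list the given pseudogene."""
    return [
        transcript
        for transcript, pseudogenes in PARENT_TO_PSEUDOGENES.items()
        if pseudogene in pseudogenes
    ]


# One row per parent transcript, each with its own set of linked pseudogenes.
for transcript, pseudogenes in PARENT_TO_PSEUDOGENES.items():
    print(transcript, len(pseudogenes))
print(parents_of("PMS2CL"))
```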

Do you think there might be some way to use an algorithm to do all of this and somehow get an exon overlap track to work?

Our engineer shared that your approach of taking a parent exon and using BLAT to find where it maps in the genome does precisely what you want, namely, it finds the pseudogenes that contain it. This approach will also find orthologs and gene family members, which is why it has not been applied on a genome-wide scale. However, if you perform this manually and then check the pseudogene track for hits, or at least check that no protein-coding genes are annotated in the same region, it should work fine for your purpose.
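
The manual check suggested above (seeing whether each BLAT hit lands on an annotated pseudogene or on a protein-coding gene) could be sketched as follows. This is a hedged illustration, not a UCSC tool: the tuple layouts, the annotation type labels, and all coordinates are hypothetical, and the annotations would have to be exported from the relevant tracks first.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """Half-open interval overlap test."""
    return a_start < b_end and b_start < a_end


def label_hit(hit, annotations):
    """Return the sorted annotation types overlapping a (chrom, start, end) hit."""
    chrom, start, end = hit
    kinds = {
        kind
        for a_chrom, a_start, a_end, kind in annotations
        if a_chrom == chrom and overlaps(start, end, a_start, a_end)
    }
    return sorted(kinds) or ["unannotated"]


# Made-up intervals for illustration only.
ANNOTATIONS = [
    ("chrA", 90000, 95000, "pseudogene"),
    ("chrB", 100, 2000, "protein_coding"),
]
HITS = [("chrA", 90100, 90250), ("chrB", 300, 450), ("chrB", 7000, 7100)]

labels = {hit: label_hit(hit, ANNOTATIONS) for hit in HITS}
for hit, kinds in labels.items():
    print(hit, kinds)
```

A hit labeled `protein_coding` would point to a paralog or gene family member rather than a pseudogene, which is the case the reply above warns about.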

I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible Google Groups forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Gerardo Perez
UCSC Genomics Institute
