Custom genome, gene/transcript ID is shown rather than name

roberto....@gmail.com

Dec 27, 2014, 3:36:08 PM
to igv-...@googlegroups.com
Hello,

I added the mouse GRCm38.p3 assembly from Gencode M4 (http://www.gencodegenes.org/mouse_releases/4.html) to IGV as a custom genome, using the genome FASTA sequence and, as the gene model, the GTF with gene annotation on the reference chromosomes (ftp://ftp.sanger.ac.uk/pub/gencode/Gencode_mouse/release_M4/gencode.vM4.annotation.gtf.gz). The GTF provides Ensembl gene_id, transcript_id and gene_name. In the collapsed and expanded views, IGV displays the gene_id or transcript_id rather than the more natural gene_name. I can tweak the GTF file, but would it make sense to prioritize the gene name field by default for display? Also, is there any plan to provide minor genome revisions from the GRC/Gencode on the standard IGV servers?

Thank you,

Roberto
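
The "tweak the GTF file" workaround mentioned above could look roughly like the following minimal sketch. It copies the value of gene_name into an extra attribute at the front of the attribute column. The function name add_display_name and the choice of "Name" as the target attribute are illustrative assumptions: the thread does not say which attribute IGV reads for labels, so that should be checked against the IGV version in use.

```python
import re

# Matches one quoted GTF attribute, e.g.: gene_name "Xkr4";
_ATTR = re.compile(r'(\S+) "([^"]*)";')


def add_display_name(line, source_key="gene_name", display_key="Name"):
    """Prepend display_key "<value of source_key>"; to a GTF record's attributes.

    display_key="Name" is an assumption about which attribute the viewer
    uses for labels. Comment lines, malformed lines, and records without
    source_key are returned unchanged.
    """
    if line.startswith("#"):
        return line
    end = "\n" if line.endswith("\n") else ""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:  # a GTF record has exactly nine tab-separated columns
        return line
    attrs = dict(_ATTR.findall(fields[8]))
    if source_key not in attrs or display_key in attrs:
        return line
    fields[8] = f'{display_key} "{attrs[source_key]}"; {fields[8]}'
    return "\t".join(fields) + end
```

Applied line by line to the uncompressed annotation, this leaves gene_id and transcript_id untouched, so any grouping of exons into transcripts that relies on those IDs is not affected.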

Jim Robinson

Dec 27, 2014, 4:47:53 PM
to igv-...@googlegroups.com
Hi,

I'll make a note to consider making the fields used for the printed name customizable, but from experience I can say that there is no natural default for all communities that use GTF/GFF. There is no plan to keep the servers up to date, but if specific files are requested I try to add them, provided they don't require further processing or head-scratching. A link helps speed that up.

Jim

roberto....@gmail.com

Dec 27, 2014, 5:09:09 PM
to igv-...@googlegroups.com
Thank you Jim.

The genome I was referring to is the mouse GRCm38.p3 (download FASTA) with its related gene model (download GTF). Both come from Gencode (http://www.gencodegenes.org/mouse_releases/4.html). It is easy to make a custom genome out of those two files, but it takes a long time to load due to the number and depth of annotated features, and it displays IDs rather than names. A sleeker annotation file derived from the provided one would be great. Given the popularity of the mouse model, support for the very latest version of the genome would attract great interest.

Thanks again,

Roberto

Jim Robinson

Dec 27, 2014, 8:45:44 PM
to igv-...@googlegroups.com
That assembly is already supported; it's "mm10" from UCSC. To speed up loading of the GTF file I suggest you index it, using either "tabix" or igvtools. I will try to do so and put it under the "load from server" menu for mm10.
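
Position-based indexers generally expect records ordered by chromosome and start coordinate, so a sort step may be needed before indexing. The sketch below shows one way to do that in plain Python, assuming the uncompressed file fits in memory; sort_gtf_lines is a hypothetical helper name, and the file name in the comments is a placeholder.

```python
def sort_gtf_lines(lines):
    """Return GTF lines with '#' header lines first, then records sorted
    by (sequence name, start coordinate). Blank lines are dropped."""
    header = [ln for ln in lines if ln.startswith("#")]
    records = [ln for ln in lines if ln.strip() and not ln.startswith("#")]

    def sort_key(ln):
        cols = ln.split("\t")
        return (cols[0], int(cols[3]))  # column 1 = seqname, column 4 = start

    return header + sorted(records, key=sort_key)


# After writing the sorted lines to e.g. annotation.sorted.gtf, either of
# these command-line routes builds an index (run outside Python):
#   igvtools index annotation.sorted.gtf
#   bgzip annotation.sorted.gtf && tabix -p gff annotation.sorted.gtf.gz
```

The sort is stable, so records that share a chromosome and start position keep their original relative order.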


roberto....@gmail.com

Dec 28, 2014, 11:22:52 PM
to igv-...@googlegroups.com
Thanks, Jim, for the advice on the GTF. About the assembly: mm10 = GRCm38 (major release), but this is GRCm38.p3 (patch 3, a minor release), which I understand introduces a few changes.

roberto....@gmail.com

Dec 29, 2014, 2:09:07 AM
to igv-...@googlegroups.com
Also, I have indexed the GTF file with igvtools. It loads very fast when I open it manually from IGV. However, if I bundle the index with the GTF file in a .genome (zip-compressed) file, IGV does not appear to take advantage of the index, as the GTF file is still slow to load.

Jim Robinson

Jan 5, 2015, 7:47:07 AM
to igv-...@googlegroups.com
Hi Roberto,

That's correct: you can't package an indexed GTF in a .genome file. It's gzipped, and the index would consequently be invalid. You do not need to use a .genome file; just load the indexed FASTA directly to define the reference, then load annotation files separately.

Jim
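
For repeated sessions, the "load the FASTA, then the annotations" sequence can be written once as an IGV batch script. The sketch below only builds the script text, using IGV's `genome` and `load` batch commands; whether `genome` accepts a FASTA path directly depends on the IGV version, so treat that as an assumption. The helper name igv_batch_script and the file names in the usage comment are hypothetical.

```python
def igv_batch_script(fasta_path, annotation_paths):
    """Build IGV batch-script text that sets the reference from an indexed
    FASTA and then loads each annotation file as a separate track."""
    lines = [f"genome {fasta_path}"]
    lines += [f"load {path}" for path in annotation_paths]
    return "\n".join(lines) + "\n"


# Example with placeholder file names:
#   text = igv_batch_script("GRCm38.p3.genome.fa", ["annotation.sorted.gtf"])
#   with open("load_mouse.batch", "w") as handle:
#       handle.write(text)
```

Keeping the annotation outside the .genome bundle means its index file sits next to it on disk, which is the situation in which the fast loading was observed earlier in the thread.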

roberto....@gmail.com

Jan 5, 2015, 11:35:18 PM
to igv-...@googlegroups.com
Thank you Jim!