Index gff files for IGV

Alvaro Martinez Barrio

Nov 22, 2011, 2:29:24 PM
to igv-...@googlegroups.com
Dear Jim and IGVers,

I have the following problem:

I have a very big gff file that I need to visualize, and preferably I
would like to keep all the attributes as they are in the gff so that I
can see them when I hover my cursor over a feature. The number of
features makes the Java process run out of heap memory while loading
it, and I cannot exceed the memory restrictions on the interactive node
I am running on, so I would like to make this an indexed file or
something like tdf.

But unluckily, igvtools doesn't seem to like gff.

$ igvtools tile <gff> <genome>
"Command tile not supported for files of type: .gff"

So I could do bed, but I would lose the rich information I have
collected in the tags. Any clever solution out of the box?

Jim, I vaguely remember we had a discussion about tdf: whether there
was an API outside IGV so that I could convert my file to binary and
retrieve parts of it or query it from external programs without having
to keep the plain text (which takes space). But apart from IGV and
igvtools I have seen no use of tdf, and the internal format is not
described anywhere (as opposed to BAM/SAM). Have you thought about
alternatives like supporting indexed files and tabix, provided by Heng
Li?

http://samtools.sourceforge.net/tabix.shtml

You can compress your files and get a B-tree index which you can
query for specific loci. Could it be an alternative to tdf? Then, it
would be compatible with all my scripts :)

Best regards,
álvaro

--
Alvaro Martinez Barrio
Department of Medical Biochemistry and Microbiology
Biomedicinska Centrum (BMC), Husargatan 3, C10:322b
Box 582, SE 751 23 Uppsala SWEDEN

Tel: +46 18 471 4502
Fax: +46 18 471 4673

Jim Robinson

Nov 22, 2011, 4:13:38 PM
to igv-...@googlegroups.com, Alvaro Martinez Barrio
Hi alvaro,

There is a version of igvtools that can index gff files; it is available
here:

http://www.broadinstitute.org/igv/projects/downloads/igvtools_test.zip

You can also index them with "tabix"; IGV can read tabix files. I am
planning to publish the TDF spec at some point, but with tabix, BigWig,
and BigBed available the urgency to do that has fallen. TDF does more
than index and compress: it keeps precalculated density data at various
resolution scales for "zoomed out" views. In this sense it is more like
BigWig/BigBed, but with support for more formats, than it is like tabix
or BAM.

There is also a trick you can employ with bed files. If the following
line is placed at the top of the file, IGV will treat the "name" column
as a GFF3 style attribute list (column 9). The parent-child relations
in the GFF will be ignored, but the attributes will display.

#gffTags
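
To illustrate the trick above, here is a minimal sketch of a GFF-to-BED conversion that copies GFF column 9 into the BED "name" column and puts the #gffTags line first. The function name gff_to_gfftags_bed is hypothetical and not part of IGV or igvtools. The sketch assumes tab-delimited GFF input, writes a constant score of 0, percent-encodes spaces as a precaution in case the attribute text is split on whitespace, and sorts by chromosome and start so the result could later be indexed.

```python
def gff_to_gfftags_bed(gff_lines):
    """Return BED lines (first line '#gffTags') built from GFF lines.

    Hypothetical helper for illustration; not part of IGV or igvtools.
    """
    records = []
    for line in gff_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        cols = line.split("\t")
        if len(cols) < 9:
            continue  # skip lines without an attribute column
        chrom, start, end, strand, attrs = (
            cols[0], cols[3], cols[4], cols[6], cols[8]
        )
        # GFF coordinates are 1-based inclusive; BED is 0-based half-open.
        bed_start = int(start) - 1
        bed_end = int(end)
        # Assumption: encode spaces so the name stays a single field.
        name = attrs.replace(" ", "%20")
        if strand not in ("+", "-"):
            strand = "."
        records.append((chrom, bed_start, bed_end, name, strand))

    # Sort by chromosome, then start, as position-based indexers expect.
    records.sort(key=lambda r: (r[0], r[1]))

    out = ["#gffTags"]
    for chrom, bed_start, bed_end, name, strand in records:
        # Score is fixed at 0 in this sketch; GFF scores are not carried over.
        out.append(f"{chrom}\t{bed_start}\t{bed_end}\t{name}\t0\t{strand}")
    return out
```

The returned lines can be written to a .bed file, one per line. If tabix is the indexer, the usual follow-up would be to compress the file with bgzip and index it with "tabix -p bed". As noted above, parent-child relations from the GFF are not preserved by this approach.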


best,

Jim


Alvaro Martinez Barrio

unread,
Nov 22, 2011, 5:20:55 PM11/22/11
to Jim Robinson, igv-...@googlegroups.com
Hi Jim,

As always, great help... :)

The tabix trick worked smoothly. The issue now is that it loses all
the attributes that were important to me, leaving just the name and
ref:start-stop when I hover the cursor over a feature.

But I created the bed file with the tag you mentioned, indexed it with
tabix, and now it displays perfectly. Great feature! These things should
be in the documentation... or I should read it :)

IGV is just great for these small things. Thanks for doing it, Jim.

Best,
álvaro
