feature request: allow space character to occur in gff column 9

Malcolm Cook

Nov 27, 2015, 8:17:55 PM
to igv-help
Hi,

Depending on who you ask, attribute values in column 9 of GFF are allowed to contain spaces, and they should NOT be URL-encoded.

For example, the IGV team has advised:

If the attributes field contains spaces it must be surrounded by quotes. Personally I would recommend against using spaces, replacing with an underscore might be better, but surrounding the attribute field with quotes should work.

whereas the GFF3 specification says:

Column 9: "attributes"

A list of feature attributes in the format tag=value. Multiple tag=value pairs are separated by semicolons. URL escaping rules are used for tags or values containing the following characters: ",=;". Spaces are allowed in this field, but tabs must be replaced with the %09 URL escape. Attribute values do not need to be and should not be quoted.
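
As a rough illustration of the attribute rules quoted above, here is a minimal Python sketch of a column 9 parser that keeps literal spaces and decodes URL escapes. The function name `parse_gff3_attributes` is hypothetical, and splitting values on commas reflects the GFF3 convention for multi-valued tags rather than anything in the excerpt; this is not how IGV or any other tool actually does it.

```python
from urllib.parse import unquote


def parse_gff3_attributes(column9):
    """Parse a GFF3 column 9 string into a dict of tag -> list of values.

    Hypothetical helper for illustration only.
    """
    attributes = {}
    for pair in column9.strip().split(";"):
        if not pair:
            continue  # tolerate trailing or doubled semicolons
        tag, sep, value = pair.partition("=")
        if not sep:
            continue  # skip malformed entries that have no '='
        # Split on literal commas BEFORE decoding, so an escaped comma (%2C)
        # stays inside its value. Spaces are left exactly as written.
        attributes[unquote(tag)] = [unquote(v) for v in value.split(",")]
    return attributes


attrs = parse_gff3_attributes("ID=gene1;Name=my favorite gene;Note=a%3Bb,c%2Cd")
print(attrs["Name"])  # the space inside the value is preserved
```

Note that `unquote` (unlike `unquote_plus`) leaves a literal `+` alone, which matches the rule that only percent escapes are special here.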

I am increasingly working with the same GFF in multiple tools. If they all adhered to the same (albeit loose and evolving) standard in this regard, life would be better.

Finally, I find that the (old) suggested workaround, "surrounding the attribute field with quotes should work", does not in fact allow spaces to appear in the value of an attribute.

Worth considering?

Cheers,

Malcolm

Jim Robinson

Nov 27, 2015, 8:59:41 PM
to igv-...@googlegroups.com
Could you send me a gff3 file with spaces in column 9 that is causing a problem?

GFF is a bloated mess. There, I said it. I will, however, try to get it to work as best I can.

Jim

--

---
You received this message because you are subscribed to the Google Groups "igv-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to igv-help+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/igv-help/ce91ce18-a352-4f7a-ade1-e534b30bdecd%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Malcolm Cook

Nov 27, 2015, 10:41:42 PM
to igv-...@googlegroups.com
Jim,

Ah.... yes.... mea culpa, my report was in error - IGV seems to be near perfect in this regard, to wit:

Un-encoded spaces are not allowed in column 9 attribute values of gff2 (.gff) files, which is probably correct, since "Column 9 has a slightly different format and is much more tightly defined in GFF3 than GFF2. Both require attention. GFF2 does not have any reserved attribute names, uses C style encoding/escaping of special characters, and has many other small differences" (per gmod spec)

I now see spaces _are_ allowed as per the gff3 spec BUT ONLY IF the file is in fact recognized as gff3.

My current tests show IGV will recognize a file as gff3 if its extension is ".gff3" or if it contains

##gff-version 3 

but NOT if it contains

##gff-version 3.2.1

which you might want to change the code to recognize, as it is allowed by the gff3 spec:

##gff-version 3.2.1
The GFF version follows the format of 3.#.# in this spec. This directive must be present, must be the topmost line of the file. The version number always begins with 3, the second and third numbers are optional and indicate a major revision and a minor revision respectively.
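
To make the suggestion concrete, here is a small Python sketch of the detection rule described above: treat a file as GFF3 if its extension is ".gff3" or if its first line is a `##gff-version` directive of the form 3, 3.# or 3.#.#. The function name `looks_like_gff3` and the regular expression are my own assumptions for illustration, not IGV's actual code.

```python
import re

# Hypothetical pattern: "##gff-version", whitespace, then 3 with an optional
# ".major" and ".minor" part, and optional trailing whitespace.
_GFF3_DIRECTIVE = re.compile(r"^##gff-version\s+3(\.\d+){0,2}\s*$")


def looks_like_gff3(filename, first_line):
    """Return True if the file name or its first line indicates GFF3."""
    if filename.lower().endswith(".gff3"):
        return True
    return _GFF3_DIRECTIVE.match(first_line) is not None


print(looks_like_gff3("genes.gff", "##gff-version 3.2.1"))
```

This is stricter than accepting anything that merely starts with "3": a line such as `##gff-version 30` would be rejected.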


Jim Robinson

Nov 27, 2015, 10:44:19 PM
to igv-...@googlegroups.com
OK, thanks for the report. I will change the code to recognize anything that starts with "3" as GFF3. GFF2 explicitly does not allow unescaped spaces; I forget the reason why.