Index file format: idx files generated by igvtools


Peter Kramer

Nov 4, 2019, 5:30:01 AM
to igv-help
Hello,

I have been looking for the file format of the idx file(s) produced by igvtools 'index' command. In particular I would like to know the file structure for an idx file generated by igvtools for a vcf file.

I intend to write a simple utility program that takes a batch of SNP ID numbers as input and writes out the data extracted from the vcf file for each SNP specified. That is, for each SNP ID entered, the program retrieves the relevant line from the vcf file and outputs it.

It's simple enough to do without random access: I do a data-scrape of chromosome number and position for each SNP at the NCBI SNP web site, then sort the SNPs by position in the vcf file so that my program looks up each SNP from first to last in the vcf file. But that is a kludgy way to do it, and I would like to make a utility that is a bit more 'professional', using random access.

Any information would be appreciated.

Peter Kramer
Cebu, Philippines

James Robinson

Nov 4, 2019, 4:28:34 PM
to igv-help
The idx format is part of htsjdk (https://github.com/samtools/htsjdk); as far as I know there is no written description of it. I think you'll find more support for bgzipped, tabix-indexed vcfs. What language are you writing your tool in?




Peter Kramer

Nov 4, 2019, 5:59:12 PM
to igv-...@googlegroups.com
Thank you, James. BTW, I usually code in C# (making simple tools with Windows Forms, having a few buttons and maybe a text box or two).

Peter

James Robinson

Nov 4, 2019, 10:56:27 PM
to igv-help
I'm not aware of any existing libraries for C#. You should be able to port this JavaScript reader: https://github.com/igvteam/igv.js/blob/master/js/feature/tribble.js. It loads a "tribble" index and will return file blocks (start position, size) corresponding to a genomic range. The file is binary; "BinaryParser" reads it and turns the bytes into ints, floats, strings, etc. Strings are null (0) terminated. I think there are libraries to do this in C/C#, so you shouldn't have to port BinaryParser, but it should be easy enough to do if you need to.
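
For illustration, the kind of binary reading described above can be sketched in a few lines of Python. This is not the igv.js BinaryParser or any htsjdk code: the class and method names are made up for this example, and the little-endian default is an assumption that should be checked against the actual reader. It only shows how ints, floats, and null-terminated strings can be pulled out of a byte buffer with a moving cursor; it does not describe the layout of the idx file itself.

```python
import struct


class SimpleBinaryParser:
    """Hypothetical cursor-based reader over a bytes buffer (illustration only)."""

    def __init__(self, data, little_endian=True):
        # Assumption: little-endian byte order by default; verify against the real index.
        self.data = data
        self.pos = 0
        self.prefix = "<" if little_endian else ">"

    def _unpack(self, fmt):
        # Decode one fixed-size value at the cursor and advance past it.
        full_fmt = self.prefix + fmt
        (value,) = struct.unpack_from(full_fmt, self.data, self.pos)
        self.pos += struct.calcsize(full_fmt)
        return value

    def get_int(self):
        return self._unpack("i")  # 4-byte signed integer

    def get_long(self):
        return self._unpack("q")  # 8-byte signed integer

    def get_float(self):
        return self._unpack("f")  # 4-byte float

    def get_string(self):
        # Strings are null (0) terminated: read up to the next zero byte.
        end = self.data.index(b"\x00", self.pos)
        text = self.data[self.pos:end].decode("utf-8")
        self.pos = end + 1  # skip the terminator
        return text
```

The same pattern maps directly onto C#'s BinaryReader plus a small loop for the null-terminated strings.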

Given a file block, you need to seek to the start position and read "size" bytes. Blocks will contain many features, so you still have to scan through them looking for the SNP of interest.
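
The seek-read-scan step might look roughly like the Python sketch below. The function name and arguments are hypothetical, and it assumes an uncompressed, tab-delimited vcf in which the block starts on a line boundary and the SNP ID is in the third column (possibly several IDs separated by semicolons). The start and size values would come from the index; here they are simply passed in.

```python
def find_snps_in_block(vcf_path, start, size, wanted_ids):
    """Return {snp_id: vcf_line} for wanted IDs found within one file block.

    Hypothetical helper: `start` and `size` are the byte offset and length
    of a block as reported by the index.
    """
    wanted = set(wanted_ids)
    found = {}

    # Random access: jump straight to the block and read only its bytes.
    with open(vcf_path, "rb") as fh:
        fh.seek(start)
        block = fh.read(size)

    # A block holds many features, so scan each line for the IDs of interest.
    for raw in block.split(b"\n"):
        line = raw.decode("utf-8").rstrip("\r")
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # not a complete record
        for snp_id in fields[2].split(";"):
            if snp_id in wanted:
                found[snp_id] = line
    return found
```

For example, `find_snps_in_block("variants.vcf", start, size, ["rs123"])` would return the matching line for rs123 if that block contains it, where "variants.vcf" and the ID are placeholders.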


