I am building a tool to visualize genome sequences, something
like the IGV viewer.
I tried to use .tdf files as the data source, but I cannot find where
the format is specified.
All I can find is that .tdf files are built from text-based files
with the igvtools "tile" command, and that they are designed for
faster processing in the IGV viewer. I checked the open source code of
IGV and found a file named "notes.txt" in the folder named
"org.broad.igv.tdf", which I believe is the tdf converter. It says
this:
General layout
[Header]
[Tiles]
[Datasets]
[Groups]
[Master Index]
Header
------
magic number (32 bit int)
version (32 bit int)
index position (64 bit long)
index size (bytes) (32 bit int)
header size (bytes) (32 bit int -- # of bytes for rest of header).
# of window functions
[window functions]
track type (string)
track line (string)
# of tracks
[track names]
FixedTile
---------
Type
# Positions
Start location
Span
Data
Features
---------
magic number (32 bit int) <= 'F','E', 'A', 'T'
format (string) <= bed, gff, other
[chr (string)
start (int)
rest of record (string)]
FeatureIndex
------------
Dataset
-------
# attributes
[attribute key (string)
attribute value (string)]
data type
tile width
# tiles
[tile position
tile size (bytes)] <= THIS IS NOT NECESSARY
Group
------
# attributes
[attribute key (string)
attribute value (string)]
Master Index
-----------
feature index position (long) (-1 if no features)
# datasets
[dataset name (string)
position (long)
size in bytes (int)] <= THIS IS NOT NECESSARY
# groups
[group name
position (long)
size in bytes (int)] <= THIS IS NOT NECESSARY
seek(long position)
read(buffer, offset, length)
length() <== Only used for testing end of file
I think this specification is incomplete. For example, how many
bits are used to represent "# of window functions"?
Does anyone have the complete specification of the tdf file format? If
so, please let me know.
Thank you.
We have not published the "tdf" format yet. We might do so in the
future, but for the time being the only specification available is the
source code.
Best,
Jim
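For anyone who wants to start from the quoted notes, here is a minimal Python sketch of reading only the part of the header whose field widths the notes do state (magic number, version, index position, index size, header size), and then using seek/read to fetch the raw master index bytes. The byte order is not given in the notes; little-endian is assumed here and should be checked against the IGV source. The names FixedHeader, read_fixed_header and read_index_bytes are made up for this example. Everything after these five fields (window functions, strings, track names) is left unparsed because the notes do not give the widths or the string encoding.

```python
import io
import struct
from typing import BinaryIO, NamedTuple

# Fixed header fields as listed in notes.txt:
#   magic number (32 bit), version (32 bit int), index position (64 bit long),
#   index size (32 bit int), header size (32 bit int).
# ASSUMPTION: little-endian byte order ("<"); the notes do not state it.
FIXED_HEADER = struct.Struct("<4siqii")


class FixedHeader(NamedTuple):
    magic: bytes          # the 4 raw magic bytes, not interpreted here
    version: int
    index_position: int   # absolute file offset of the master index
    index_size: int       # size of the master index in bytes
    header_size: int      # number of bytes in the rest of the header


def read_fixed_header(stream: BinaryIO) -> FixedHeader:
    """Read the five fixed-width header fields from the start of the file."""
    stream.seek(0)
    raw = stream.read(FIXED_HEADER.size)
    if len(raw) != FIXED_HEADER.size:
        raise ValueError("file is too short to contain the fixed header")
    return FixedHeader(*FIXED_HEADER.unpack(raw))


def read_index_bytes(stream: BinaryIO, header: FixedHeader) -> bytes:
    """Seek to the master index and return its raw, still undecoded bytes."""
    stream.seek(header.index_position)
    data = stream.read(header.index_size)
    if len(data) != header.index_size:
        raise ValueError("master index is truncated")
    return data


if __name__ == "__main__":
    # Synthetic bytes only, to exercise the functions; this is NOT a real
    # .tdf file and the magic value below is a placeholder.
    fake = struct.pack("<4siqii", b"XXXX", 1, FIXED_HEADER.size, 4, 0) + b"\x00" * 4
    header = read_fixed_header(io.BytesIO(fake))
    print(header)
```

On a real file you would open it with open(path, "rb") and pass the file object to read_fixed_header. Decoding the master index itself (dataset names, positions, sizes) still requires reading the IGV source to learn the count widths and the string encoding.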