I've done some more work on the NITF reader. I'm now able to read more (but
not all) of the NITF 2.1/NSIF 1.0 files.
Most of yesterday was spent building a test tool that runs across a list of
files. For each file it:
- invokes gdalinfo and saves its (slightly filtered) output to a file (i.e.
GDAL as the oracle)
- tries to generate the same output using our file reader
- compares our output against the oracle.
Code at
https://github.com/bradh/nitf-metadata-comparison
That showed me how far I still have to go...
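To make the idea concrete, here's a rough sketch of that oracle-comparison step (the real tool is in the repo above; the filtering rule here, keeping only key=value metadata lines, is illustrative only):

```java
// Hypothetical sketch of comparing our reader's output against gdalinfo output.
// Class and method names are illustrative, not from the actual tool.
public class OracleCompare {
    // Keep only metadata-style key=value lines for comparison,
    // dropping driver banners and other noise.
    static String filterMetadata(String toolOutput) {
        StringBuilder kept = new StringBuilder();
        for (String line : toolOutput.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.contains("=")) {
                kept.append(trimmed).append('\n');
            }
        }
        return kept.toString();
    }

    static boolean matchesOracle(String ourOutput, String gdalinfoOutput) {
        return filterMetadata(ourOutput).equals(filterMetadata(gdalinfoOutput));
    }

    public static void main(String[] args) {
        String oracle = "Driver: NITF\n  NITF_FHDR=NITF02.10\n";
        String ours = "my reader header\n  NITF_FHDR=NITF02.10\n";
        System.out.println(matchesOracle(ours, oracle)); // true
    }
}
```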
I'm now starting to read extensions. The current design approach is to build a
parser for, and then use, the GDAL TRE data
(http://trac.osgeo.org/gdal/browser/trunk/gdal/data/nitf_spec.xml).
Still working that through.
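The spec-driven direction would look something like this sketch. Note the XML fragment below is made up to show the shape of the idea (field name + length); the real nitf_spec.xml has more attributes (types, loops, conditions) that this ignores:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical sketch: drive TRE parsing from an XML spec rather than hand-coding.
public class TreSpecParser {
    // Walk the <field> elements in order, slicing the raw TRE text at each
    // field's declared length.
    static Map<String, String> parse(String specXml, String rawTre) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(specXml.getBytes(StandardCharsets.UTF_8)));
        Map<String, String> fields = new LinkedHashMap<>();
        NodeList fieldNodes = doc.getElementsByTagName("field");
        int offset = 0;
        for (int i = 0; i < fieldNodes.getLength(); i++) {
            Element f = (Element) fieldNodes.item(i);
            int length = Integer.parseInt(f.getAttribute("length"));
            fields.put(f.getAttribute("name"),
                    rawTre.substring(offset, offset + length).trim());
            offset += length;
        }
        return fields;
    }

    public static void main(String[] args) throws Exception {
        // Made-up two-field TRE spec, purely for illustration.
        String spec = "<tre name=\"DEMO\">"
                + "<field name=\"CLOUDCVR\" length=\"3\"/>"
                + "<field name=\"SENSMODE\" length=\"9\"/></tre>";
        Map<String, String> fields = parse(spec, "010PUSHBROOM");
        System.out.println(fields);
    }
}
```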
I have some concerns about how to present that data, though. That is, what API
would you like?
I've thought of some alternatives.
One idea is a flat set of key/value pairs (Map<String,String>), where each key
is a concatenated TRE name and field name. This would be conceptually similar
to the approach used by GDAL (although I'd probably drop the NITF_ prefix).
NITF_PIAIMC_CAMSPECS=
NITF_PIAIMC_CLOUDCVR=010
NITF_PIAIMC_COMGEN=00
NITF_PIAIMC_ESD=Y
NITF_PIAIMC_GENERATION=1
NITF_PIAIMC_IDATUM=ZYX
NITF_PIAIMC_IELLIP=RF
NITF_PIAIMC_MEANGSD=00098.4
NITF_PIAIMC_PREPROC=P1
NITF_PIAIMC_SATTRACK=00000000
NITF_PIAIMC_SENSMODE=PUSHBROOM
NITF_PIAIMC_SENSNAME=PRISM N
NITF_PIAIMC_SOURCE=PROCESS:JAPAN-JAXA-EOC-ALOS-DPS 20090107065215
NITF_PIAIMC_SRP=Y
NITF_STDIDC_ACQUISITION_DATE=20080223
NITF_STDIDC_COUNTRY=
NITF_STDIDC_END_COLUMN=005
NITF_STDIDC_END_ROW=00016
NITF_STDIDC_END_SEGMENT=AA
NITF_STDIDC_LOCATION=3416N13227E
NITF_STDIDC_MISSION=ALOS
NITF_STDIDC_OP_NUM=000
NITF_STDIDC_PASS=
NITF_STDIDC_REPLAY_REGEN=000
NITF_STDIDC_REPRO_NUM=00
NITF_STDIDC_START_COLUMN=001
NITF_STDIDC_START_ROW=00001
NITF_STDIDC_START_SEGMENT=AA
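As a minimal sketch of that flat-map option (class and method names are hypothetical):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of option 1: one flat Map<String,String> where each key
// is TRE name + "_" + field name (GDAL-style, minus the NITF_ prefix).
public class FlatTreMetadata {
    static Map<String, String> exampleMetadata() {
        Map<String, String> md = new LinkedHashMap<>();
        md.put("PIAIMC_SENSMODE", "PUSHBROOM");
        md.put("PIAIMC_CLOUDCVR", "010");
        md.put("STDIDC_MISSION", "ALOS");
        return md;
    }

    public static void main(String[] args) {
        // Lookup is a single get(); the caller has to know the
        // name-concatenation convention.
        System.out.println(exampleMetadata().get("PIAIMC_SENSMODE")); // PUSHBROOM
    }
}
```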
Alternatively we could build a tree of TREs (no pun intended), where the top
level key would be the TRE name (e.g. PIAIMC), and the value would be another
map (from field name to actual value, e.g. SENSMODE=PUSHBROOM).
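A sketch of that map-of-maps shape (again, names are hypothetical):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of option 2: TRE name -> (field name -> value).
public class TreTree {
    static Map<String, Map<String, String>> exampleTres() {
        Map<String, Map<String, String>> tres = new LinkedHashMap<>();
        Map<String, String> piaimc = new LinkedHashMap<>();
        piaimc.put("SENSMODE", "PUSHBROOM");
        piaimc.put("CLOUDCVR", "010");
        tres.put("PIAIMC", piaimc);
        return tres;
    }

    public static void main(String[] args) {
        // Fields are grouped per TRE, so a caller can iterate one TRE's
        // fields without string-prefix matching.
        System.out.println(exampleTres().get("PIAIMC").get("SENSMODE"));
    }
}
```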
Both of these flatten nested/looped structures (e.g. CSEPHA is going to have a
stack of fields that look like EPHEM_0_X, EPHEM_1_X, EPHEM_2_X and so on).
Another approach would be to hand-code every known TRE (e.g. using an offline
code generator tool that emits accessors and the parsing code). That could
give us a better representation of each TRE's structure, but would probably be
harder to maintain, and would prevent adding new TRE structures at runtime.
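Generated code for one TRE might look roughly like this. The offsets and method names below are illustrative only, not taken from the PIAIMC spec:

```java
// Hypothetical output of an offline code generator for one TRE.
// Offsets/lengths would come from the spec; these are made up.
public class PiaimcTre {
    private final String raw;

    public PiaimcTre(String raw) {
        this.raw = raw;
    }

    // Generated fixed-offset accessors give typed, discoverable access,
    // at the cost of regenerating code whenever a TRE definition changes.
    public String getCloudCover() {
        return raw.substring(0, 3);
    }

    public String getSensorMode() {
        return raw.substring(3, 12).trim();
    }

    public static void main(String[] args) {
        PiaimcTre tre = new PiaimcTre("010PUSHBROOM");
        System.out.println(tre.getSensorMode()); // PUSHBROOM
    }
}
```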
The final concept was to say "it's all out of scope; here is the raw data as
a string, parse it yourself". That seems weak, though.
Brad