On Jun 19, 2016 9:30 AM, "John K. Parejko" <pare...@uw.edu> wrote:
>
> Some progress on this problem!
>
> The astropy.io bug is due to the interaction between CompImageHDU._update_header_data() and _BaseHDUMeta._update_uint_scale_keywords(). The former was not guaranteed to set TFIELDS in its required place (it must be the 8th keyword in a BinTable, immediately following GCOUNT), while the latter always puts BSCALE immediately after GCOUNT when the file is written (which is fine in ImageHDUs but breaks BinTables, which CompImageHDUs actually are!). I discovered this after setting TFIELDS with "after='GCOUNT'" and finding that BSCALE/BZERO still showed up ahead of it in the header.
>
> I think I've come up with a safe work-around, and I'll submit a pull request with it shortly.
>
> Sadly, I still haven't come up with a working minimal example (besides my already-existing large files), since I don't really understand exactly how BSCALE/BZERO are managed. Hopefully my above description can help someone who understands FITS headers better to craft an appropriate integration test for an ImageHDU->CompImageHDU conversion.
>
> This whole process has underscored how unnecessarily complicated the compressed image FITS convention is. None of this would be necessary if FITS actually supported a compressed HDU type natively. I'll just add it to my growing list of FITS problems.
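Here's a rough sketch of the kind of integration test you're describing (untested; the disable_image_compression flag and the exact keyword positions are my assumptions about how io.fits behaves today): write a uint16 image through CompImageHDU, reopen the file as a raw binary table, and check that TFIELDS is still the 8th card.

    import numpy as np
    from astropy.io import fits

    data = np.arange(100, dtype=np.uint16).reshape(10, 10)
    fits.CompImageHDU(data).writeto('compressed_uint16.fits', overwrite=True)

    # Reopen without the transparent decompression so we see the raw BINTABLE header.
    with fits.open('compressed_uint16.fits', disable_image_compression=True) as hdul:
        keywords = list(hdul[1].header.keys())
        # FITS requires XTENSION, BITPIX, NAXIS, NAXIS1, NAXIS2, PCOUNT, GCOUNT,
        # TFIELDS in exactly this order, so BSCALE/BZERO must not land before TFIELDS.
        assert keywords[7] == 'TFIELDS', keywords[:10]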
FYI, if you search the Astropy issue tracker under the io.fits label, there are a couple of issues I've written about plans for reworking the compressed image support.
There are two main reasons (both documented, I think) that the implementation is so complicated. The first is that it wraps CFITSIO for compression support. This is the *only* area where CFITSIO is used, and it's being wrapped in a way that was never really intended, as PyFITS was never designed around CFITSIO in the first place. There's no reason it needs to use CFITSIO, though. The compression algorithms can be taken on their own, and the rest of the relevant bits rewritten in some combination of Python and Cython. It's not particularly hard to implement; it's just never been a priority.
The second thing that complicates matters is a design mistake in PyFITS which originally made the CompImageHDU class a subclass of BinTableHDU. On its face that makes sense, since the compressed image convention is implemented in a binary table. However, the class takes great pains to look transparently like a normal image HDU to the user. It really should have been based on ImageHDU instead, and wrapped an instance of a BinTableHDU internally for interaction with the low-level format. That would probably make the code a fair bit simpler.
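Roughly what I have in mind, as a hedged sketch rather than anything that exists in astropy today (the class name and the _pack_into_table hook are made up for illustration):

    from astropy.io.fits import BinTableHDU, ImageHDU

    class TileCompressedImageHDU(ImageHDU):
        """Hypothetical restructuring: image-like to the user, table-backed on disk."""

        def __init__(self, data=None, header=None):
            super().__init__(data=data, header=header)
            # The binary table that actually stores the tile-compressed data on
            # disk; users only ever touch the image-style interface above it.
            self._table_hdu = BinTableHDU()

        def _pack_into_table(self):
            # Placeholder for the real work: compress self.data tile by tile and
            # populate the wrapped table's columns and the ZIMAGE/ZCMPTYPE/etc.
            # convention keywords before writing.
            raise NotImplementedError

The point is just that composition keeps the image semantics and the on-disk table format in separate objects.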
As FITS conventions go I don't think the tile compression convention is *too* terribly complicated. The biggest offender here is once again unsigned integers, and the general badness of FITS design.
As I wrote, the way CFITSIO is designed doesn't mesh well with the rest of PyFITS. Just to give one example, it manages the entire FITS file on its own, including the header, whereas PyFITS manages the header separately. It's not easy to wrap CFITSIO in such a way that it can interact with raw data in memory. That's partly why the existing interface to CFITSIO is so complicated: it really just needs to be able to compress/decompress a data array using the FITS tile compression algorithms, without the assumption that those arrays are wrapped in a FITS file. In other words, CFITSIO does not do enough to abstract the internal data structure away from FITS.
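To make that concrete, the interface I'd want is something file-agnostic along these lines (purely hypothetical names, with gzip standing in for the real codecs):

    import zlib
    import numpy as np

    def compress_tile(tile, algorithm='GZIP_1'):
        # Stand-in codec: a real implementation would dispatch on the convention's
        # algorithms (RICE_1, GZIP_1, PLIO_1, HCOMPRESS_1) rather than always gzip.
        return zlib.compress(np.ascontiguousarray(tile).tobytes())

    def decompress_tile(buf, shape, dtype, algorithm='GZIP_1'):
        return np.frombuffer(zlib.decompress(buf), dtype=dtype).reshape(shape)

    # Round-trips a single tile without any FITS file in sight.
    tile = np.arange(64, dtype=np.int32).reshape(8, 8)
    assert np.array_equal(decompress_tile(compress_tile(tile), tile.shape, tile.dtype), tile)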
At the end of the day, a reimplementation in Python/Cython would be less of a maintenance hassle. CFITSIO is also a moving target without well-defined versioning semantics; its ABI can change from version to version. That said, for most of the time I maintained PyFITS, maintaining that interface was still less hassle than rewriting it, though I would have preferred to rewrite it if I'd had the time.