I'm thinking about fixing issue 3864 ("archive/tar: cannot read GNU sparse files"), but I'm not sure how we actually want to handle sparse files, and I don't want to implement the wrong solution before bringing it up here. Should the reader present a sparse file in expanded form or in condensed form?
There are three different formats that GNU tar can use to store a sparse file in a POSIX (pax) format tar file, called sparse formats 0.0, 0.1, and 1.0. These formats are designed so that a tar implementation that isn't aware of them extracts the sparse file in condensed form as a regular file. The post-processing steps needed to recover the original file depend on which of the three sparse formats was used.
For a GNU format tar file (not a POSIX/pax one), a completely different scheme is used: the sparse map is stored in the main header, followed by so-called "extension headers" when the map doesn't fit. This one causes the most serious problem. Unlike the POSIX sparse formats, a tar reader that is not aware of this format will likely fail with an error. This is what caused the specific error mentioned in issue 3864.
Do we want to handle all these formats by expanding the "holes" into zero bytes? Or do we simply want to make sure it doesn't crash when it sees the old GNU format?