I'd say no, I don't think it should be the buildpack's responsibility to detect and handle archives. Since file extensions are a very crude (and often inaccurate) way of determining a file's type, you'd end up with every buildpack replicating the same detection, extraction, and error handling for dealing with archives. I much prefer that Cloud Foundry (either the CLI or the core runtime) handle that in a central place. In fact, I believe one of the big reasons the CLI does this in the first place is so that it can explode the archive and upload only the changed files for performance.
That being said, the current way of detecting archives for extraction (via file extensions) isn't great either. What we should be doing is examining the bytes of each file to determine whether it is an archive, using each archive type's magic header. As an example, the Apache Commons Compress ZipArchiveInputStream has the ability to 'match' a Zip file: it examines the file's bytes for a magic grouping, and that, not its extension, is what determines whether the file is actually a Zip. Since all of the Java file types are essentially specially formatted Zips, and both Tar and GZIP have magic headers as well, I suspect this would cover nearly all of the archive types we'd want to upload, and as new esoteric archive types (e.g. SARs) appear, it would require no changes to the CLI, the core runtime, or the buildpacks. If a genuinely new archive format appeared (e.g. bzip2), we could use the same magic-header detection for it as well.
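To make the idea concrete, here's a minimal sketch of what magic-header detection could look like, using only the JDK (no Commons Compress). The class name, method name, and return values are my own illustration, not an existing API; the signatures themselves are the well-known ones: ZIP files start with "PK\003\004", GZIP streams start with 0x1F 0x8B, and a POSIX tar file carries "ustar" at offset 257.

```java
import java.util.Arrays;

public class ArchiveSniffer {

    /**
     * Hypothetical helper: classifies a file by its leading bytes rather
     * than its extension. Returns "zip", "gzip", "tar", or "unknown".
     */
    public static String detect(byte[] header) {
        // ZIP local file header magic: 0x50 0x4B 0x03 0x04 ("PK\003\004").
        // This also covers JARs, WARs, and EARs, which are specially formatted Zips.
        if (header.length >= 4
                && header[0] == 0x50 && header[1] == 0x4B
                && header[2] == 0x03 && header[3] == 0x04) {
            return "zip";
        }
        // GZIP magic: 0x1F 0x8B. Mask to compare as unsigned bytes.
        if (header.length >= 2
                && (header[0] & 0xFF) == 0x1F && (header[1] & 0xFF) == 0x8B) {
            return "gzip";
        }
        // POSIX (ustar) tar: the string "ustar" at offset 257 of the first block.
        if (header.length >= 262
                && Arrays.equals(Arrays.copyOfRange(header, 257, 262),
                                 "ustar".getBytes())) {
            return "tar";
        }
        return "unknown";
    }
}
```

The point is that this kind of check lives in one place and never needs to know anything about extensions; a `.war` pushed from disk and a bare extensionless upload are classified identically.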
-Ben Hale
Cloud Foundry Java Experience