At first, I thought that the existence of a file named AndroidManifest.xml at the root level of the APK/ZIP package would be enough to identify it. Then we had a discussion with Johan and Ross (https://digipres.club/@bitsgalore/112478785077854374), who pointed me to a possible confusion with XAPK files.
"The XAPK format was introduced to package the APK file and OBB file together for a seamless delivery and installation process when downloading an Android app from a non-Google Play site."
"XAPK files are compressed using the standard ZIP file format. These can be extracted using a standard compression/decompression software such as WinZIP. Once the XAPK file is extracted to the disc, it contains the following files in folder.
- APK - Standard installation file for installing the application on Android devices
- OBB - Additional file that contains relevant resource files"
So I would tend to think that an AndroidManifest.xml file would never be found at the root level of an XAPK container, since the manifest would sit inside the APK that the XAPK wraps.
The LoC entry for APK (https://www.loc.gov/preservation/digital/formats/fdd/fdd000592.shtml) lists other folders that are found in the package but, as Johan said, we cannot be absolutely sure that they will always be present. So I would tend to limit the identification pattern to the presence of an AndroidManifest.xml file at the root level.
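To make the proposed rule concrete, here is a minimal Python sketch of the check, using only the standard zipfile module. It is an illustration under assumptions, not a tested signature: the function name classify_package and the labels it returns are hypothetical, and the "xapk-like" branch (an .apk entry at the root of the container) is only my reading of the description quoted above, not something taken from a specification.

```python
import zipfile

MANIFEST = "AndroidManifest.xml"


def classify_package(path_or_file):
    """Classify a ZIP-based container with the rule discussed above.

    Returns "apk" if AndroidManifest.xml sits at the root of the archive,
    "xapk-like" if the root holds an .apk entry instead (an assumption
    based on the quoted XAPK description), and "unknown" otherwise.
    """
    with zipfile.ZipFile(path_or_file) as zf:
        names = zf.namelist()

    # Root-level entries have no "/" in their archive name.
    root_entries = [name for name in names if "/" not in name]

    if MANIFEST in root_entries:
        return "apk"
    if any(name.lower().endswith(".apk") for name in root_entries):
        return "xapk-like"
    return "unknown"
```

Note that the match on the manifest name is exact and case-sensitive, and that a manifest found only in a subfolder does not count as an APK under this rule.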
What do you think?
Kind regards,
Bertrand