Hi Lukács,
While working on fixing some subtle aspects of RepoMappingManifestAction related to runfiles symlinks, I delved into Runfiles#fingerprint and wondered why it is the way it is.
Specifically:
This is a slam dunk. We'd be grateful for a fix, I think this was a simple oversight on our part.
2. #fingerprint and #getEmptyFilenames flatten all nested sets and don't use the fingerprint cache for nested sets provided by ActionKeyContext. It seems that this could be changed simply by having custom MapFns for SymlinkEntry and Artifact. Assuming that the EmptyFilesSupplier is a pure function of its inputs, it probably wouldn't have to be included in the digest at all.
I think for #fingerprint(), this is also a slam dunk, it's just that the code evolved in such a way that it didn't occur to anyone to use the fingerprint cache. It's a little more complicated than trivial because Artifact.getRunfilesPath() also needs to be part of the key, but that's an obstacle that's entirely surmountable.
The reason why the set of empty files needs to be in the key is that even though it only gets a list of paths, it can close over any data produced during the analysis phase, so the same set of runfiles can result in a different set of empty files. I could imagine fixing this by adding a computeKey() method to EmptyRunfilesSupplier, but ignoring it doesn't work.
I don't know what the performance impact would be.
Happy to improve any of that, but I would first like to know confirm that 1) this isn't intended for a reason I am unaware of and 2) isn't negligible anyway (e.g., since it only applies to executable targets, which only make up a small fraction of all targets).
Best,
Fabian