Hi Sally,
I have part of an answer for you.
I've examined our current archival description index mapping, and noticed in the following sections that we are currently indexing a number of fields - including the history field - not just from linked creators, but from INHERITED linked creators as well, per:
So in general, I think your theory is right - the majority of this problem comes from this indexing pattern. I managed to reproduce this behavior locally, and filed the following issue ticket for our Maintainers to consider:
A couple of important notes:
First, I tried to see if I could just comment out those lines in the mapping.yml file, restart Elasticsearch, and then reindex (to see if you could potentially do this as a workaround until this is addressed)... and unfortunately I got a reindexing error. So - there's a bit more to be changed than just removing those lines, but it's beyond my non-developer skills to figure out.
Second, while I was able to reproduce the issue in general, for me it returned ALL inherited records, not 606/800 as in your case. Without closely examining your records, I can't speak to why most but not all were returned.
I have recommended in the issue ticket that the Maintainers review some of the other indexing choices as well when addressing this issue, as there may be more cases where we are indexing too much and introducing noise into the results. Hopefully this is something that can be addressed in an upcoming release, although I think at this point that the scope is already set for the upcoming 2.7.2 release, so any fix would likely be 2.8 or beyond.
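In case it helps to picture why this pattern adds noise, here is a rough toy sketch in Python. To be clear, this is not our actual indexing code or mapping: the record names, the creator, the history text, and both functions are made up for illustration, and I'm assuming a lower-level description with no creator of its own inherits from its nearest ancestor that has one.

```python
# Toy model only: all names here are hypothetical, not the real code or mapping.
descriptions = {
    "fonds": {"parent": None, "creators": ["Jane Smith"]},
    "series": {"parent": "fonds", "creators": []},
    "file": {"parent": "series", "creators": []},
}
creator_history = {"Jane Smith": "Worked as a surveyor in Ontario."}


def effective_creators(desc_id, include_inherited):
    """Return directly linked creators, optionally falling back to the nearest ancestor's."""
    node = descriptions[desc_id]
    if node["creators"] or not include_inherited:
        return node["creators"]
    parent = node["parent"]
    return effective_creators(parent, include_inherited) if parent else []


def search(term, include_inherited):
    """Return ids of descriptions whose indexed creator history contains the term."""
    hits = []
    for desc_id in descriptions:
        histories = [
            creator_history[name]
            for name in effective_creators(desc_id, include_inherited)
        ]
        if any(term.lower() in history.lower() for history in histories):
            hits.append(desc_id)
    return hits


# Indexing history from inherited creators: every descendant matches.
print(search("surveyor", include_inherited=True))
# Indexing history from directly linked creators only: just the fonds matches.
print(search("surveyor", include_inherited=False))
```

In this toy case, a search on a word from the creator's history matches all three records when inherited creators are indexed, but only the top-level record when they are not.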
Regards,