Hi Carolyn,
Yes, we see this Elasticsearch error from time to time, and typically we need to restart Elasticsearch for search to recover (a quick example of that is just below). Have you configured a robots.txt file in the root of your site? That should at least make well-behaved bots back off, though I'm pretty sure ByteDance's crawler won't honour it.
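On a systemd-based install the restart is usually just the following (the service name and default port here are assumptions about a typical setup, so adjust to match yours):

sudo systemctl restart elasticsearch
# wait for the cluster to report a healthy status before retrying searches in AtoM
curl 'http://localhost:9200/_cluster/health?pretty'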
Coming back to robots.txt: ours is as follows, which "allows" those listed and "disallows" any others:
User-agent: Googlebot
Disallow:
User-agent: DuckDuckBot
Disallow:
User-agent: BingBot
Disallow:
User-agent: Msnbot
Disallow:
User-agent: Yahoo-slurp
Disallow:
User-agent: *
Disallow: /
Just to reiterate: bots don't have to honour these rules; robots.txt is advisory only, with no guarantees. But I think it should help, so it's definitely worth doing.
In terms of aggressively disallowing bots in Nginx, there is the Nginx Ultimate Bad Bot Blocker, though I haven't tried it. Because it's such a comprehensive solution, I'd guess it could generate false positives, potentially blocking some real users, or have other unforeseen effects on AtoM. Another (perhaps safer) option would be to create some simple rules to catch the persistent offenders hitting your site, using something like this guide; a rough sketch of that approach is below.
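As a very rough sketch of that simpler approach (the user-agent strings below are only examples, so check your access logs to see which bots are actually hammering the site; as far as I know ByteDance's crawler identifies itself as Bytespider), you can map offending user agents to a flag and return 403 for them:

# in the http {} block of your Nginx config
map $http_user_agent $bad_bot {
    default          0;
    ~*Bytespider     1;
    ~*PetalBot       1;
    ~*SemrushBot     1;
}

# inside the server {} block serving AtoM
if ($bad_bot) {
    return 403;
}

The advantage over the all-in-one blocklists is that you only ever block agents you've actually seen misbehaving, so the risk of catching real users is much lower.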
I really don't think Google Analytics would cause any problems like you're experiencing. The GA JavaScript runs in visitors' browsers, so it adds little, if any, extra load on the AtoM web server itself. Most (all?) bots won't run JavaScript at all; the problem is simply the sheer volume of their requests, which can consume resources to the point of causing outages.
I hope that helps a bit.
Thanks, Jim