Dear all,
We’re excited to share a new GLAM-E Lab report that captures how bots building datasets for AI model training are affecting online cultural collections.
Are AI Bots Knocking Cultural Heritage Offline? https://www.glamelab.org/products/are-ai-bots-knocking-cultural-heritage-offline/
In late 2024, isolated reports began to appear from individual online cultural heritage collections that described servers and collections straining – and sometimes breaking – under the load of swarming bots. These bots were reportedly scraping data and images from collections websites to build datasets to train AI models.
Highlighting the results of a short survey circulated in April 2025, this report, written by GLAM-E Lab Co-Director Michael Weinberg, indicates that online cultural heritage collections are struggling with new traffic spikes. Those isolated early reports are looking more and more like an early warning. Not every online collection is being affected, but many are.
What’s happening? Bots scraping online collections to build AI model training data are swarming sites, overwhelming their infrastructure, and knocking them offline. Collections don’t have unlimited funds to keep building out new infrastructure. The bots often operate like a roving, relatively brief DDoS attack.
Ultimately, this is a problem that will probably need to be solved by an updated set of collectively determined norms. Spy vs. Spy-style technical measures and countermeasures are not sustainable.
Read more in the report: https://www.glamelab.org/products/are-ai-bots-knocking-cultural-heritage-offline/
Best wishes,
Francesca