Forward thinking, Em!
Several organizations are already working on preserving government environmental data, including DataRefuge (a little slow right now), the Internet Archive's End of Term project, and the Environmental Data & Governance Initiative (EDGI). They've developed robust data preservation strategies that go beyond simple web scraping, including capturing complete datasets, APIs, and associated documentation.
Beyond the EPA, we'll want to look at NOAA, USGS, NASA Earth Science Data, DOE's Office of Scientific and Technical Information, and NREL. Each has valuable environmental and energy datasets that should be preserved. The NIH and CDC also maintain relevant environmental health databases that often get overlooked in preservation efforts.
Let's start by connecting with existing preservation projects rather than building something from scratch. The academic library system, particularly through organizations like the Data Curation Network, has established protocols for this kind of work. They can help with both the technical aspects of data preservation and navigating the legal considerations around federal data archiving.
https://envirodatagov.org/
https://www.rd-alliance.org/
https://eotarchive.org/
https://datacurationnetwork.org/
https://www.datarefuge.org/
C