This is a bit belated, and I suspect many of you saw our Twitter thread, but after more than a year we finally cut a new v0.4.0 release of the PUDL software and data. We integrated both older and newer data for the EIA 860/923, and brought in 2019-2020 data for the EPA CEMS hourly emissions and 2019 data for the FERC Form 1. We also added experimental integrations of the EIA 861 (2001-2019) and the hourly electricity demand reported in FERC Form 714 (2006-2019).
There are also some new derived data products, including hourly historical electricity demand estimates by state, based on the EIA 861 and FERC 714, along with the collections of counties that make up our best guesses for the historical utility and balancing authority territories.
You can see more details on the new and pretty up-to-date release notes page of our documentation. If you want to follow along and see what's happening as it hits our main branch, that's a good resource to watch.
We're trying to get a v0.5.0 out by the end of October that includes all the 2020 data for FERC 1 and EIA 860/923, a crosswalk table robustly linking the EIA data to the EPA CEMS (based on the one recently published by EPA), and, on the back end, a much simpler ETL process that outputs directly to SQLite and Parquet, plus a better system for managing all of the metadata and database schemas using Pydantic.