Summary:
Focus: understanding the behavior of the world's ecosystems.
Increasing global populations have put a strain on ecosystems and natural resources: land use, water use, energy use
To sustain ecosystems we need to understand them: carbon, water, nutrients
Strongly coupled/interdependent with
Global climate: rainfall, temperature, variability
Human management: fertilization, irrigation, land use
Measuring ecosystems is challenging
Limited measurement data
Challenges:
Ecosystem heterogeneity: soil, vegetation, management
Complex biogeochemical, physical processes
Opportunity: AI for ecosystem understanding
Data: in-situ sensor networks, remote sensing, meteorological data, biogeochemical, geospatial & survey, synthetic
Applications: greenhouse gases, carbon sequestration, production, water, air quality, soils, biodiversity, natural hazards
Modeling can infer unknown information from measurements
Process-based models incorporate known dynamics but are limited by process representation
Machine learning models (black box) adapt to data but have poor explainability and generalizability
Hybrid AI models can leverage the best of both techniques (e.g. differentiable simulations, SciML)
Knowledge-Guided ML: Training based on domain-specific prior knowledge: invariants, useful examples
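Sketch (illustrative only, not from the talk): one simple form of hybrid modeling is residual correction, where a data-driven term is fitted to whatever the process-based model fails to explain. The toy `process_model`, the quadratic correction, and all numbers below are assumptions made up for illustration.

```python
def process_model(x):
    """Toy stand-in for a process-based model: it knows only the linear response."""
    return 2.0 * x


def fit_residual_coefficient(xs, ys):
    """Least-squares fit of c in: (observation - process_model) ~ c * x**2."""
    numerator = sum((y - process_model(x)) * x**2 for x, y in zip(xs, ys))
    denominator = sum(x**4 for x in xs)
    return numerator / denominator


def hybrid_model(x, c):
    """Process-based prediction plus the learned residual correction."""
    return process_model(x) + c * x**2


def mse(preds, targets):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)


# Synthetic "measurements" with a quadratic effect the process model lacks.
xs = [0.5, 1.0, 1.5, 2.0, 2.5]
ys = [2.0 * x + 0.5 * x**2 for x in xs]

c = fit_residual_coefficient(xs, ys)
process_error = mse([process_model(x) for x in xs], ys)
hybrid_error = mse([hybrid_model(x, c) for x in xs], ys)
print(f"c={c:.3f}, process MSE={process_error:.3f}, hybrid MSE={hybrid_error:.3g}")
```

The process model keeps the known dynamics; the fitted term only has to learn the missing part, which is typically easier than learning the full response from limited data.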
Example: advancing agroeconomics in US corn belt
Challenge: simulating N2O emissions from use of fertilizer in corn farming
High spatial/temporal variability
KGML:
Train ML model using runs of the ecosys model https://github.com/jinyun1tang/ECOSYS
KGML model is able to accurately predict N2O emissions
Supports data inversion across US midwest cornbelt: infer emissions hotspots from sparse regional measurements
Supports interpretability via causal diagrams, which can guide additional improvements in process-based model (using PCMCI: https://jakobrunge.github.io/tigramite)
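Sketch (illustrative only): the pattern of training on process-model runs and then adapting to sparse measurements can be shown with a tiny linear model. `toy_simulator` is a hypothetical stand-in for simulator output, not ecosys itself, and the data, learning rate, and step counts are assumptions.

```python
def train(w, b, data, lr, steps):
    """Plain gradient descent on mean squared error for y ~ w * x + b."""
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def mse(w, b, data):
    """Mean squared error of the linear model on (x, y) pairs."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def toy_simulator(x):
    """Hypothetical process-model output: roughly right, but biased."""
    return 1.0 * x


# Step 1: pretrain on plentiful synthetic runs from the simulator.
synthetic = [(i / 10, toy_simulator(i / 10)) for i in range(50)]
w_pre, b_pre = train(0.0, 0.0, synthetic, lr=0.02, steps=2000)

# Step 2: fine-tune on a handful of (made-up) field observations.
observed = [(1.0, 1.3), (2.0, 2.5), (4.0, 4.9)]
w_fit, b_fit = train(w_pre, b_pre, observed, lr=0.02, steps=200)

error_before = mse(w_pre, b_pre, observed)
error_after = mse(w_fit, b_fit, observed)
print(f"MSE on observations: {error_before:.3f} before, {error_after:.3f} after fine-tuning")
```

Pretraining gives the model a physically reasonable starting point, so the few real observations only need to correct the simulator's bias rather than define the whole response.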
Example: understanding carbon budget in agriculture
Agriculture is both a source and a sink for carbon
Modeling emissions from agricultural activities
Train ML model based on traces of ecosys model
Used a knowledge-guided loss function: mass balance, threshold control, response control
Knowledge-guided extrapolation by assimilating remote-sensed data
Hybrid model
Outperforms both pure-ML and ecosys alone in accuracy
More efficient than ecosys
Can use emissions model to estimate carbon credit risk
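Sketch (illustrative only): a knowledge-guided loss adds a penalty for violating a known constraint to the usual data-fit term. Only a mass-balance term is shown; the balance NEE = Reco - GPP, the variable names, the weight, and the numbers are assumptions, and the threshold-control and response-control terms mentioned above are not covered.

```python
def mse(preds, targets):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)


def mass_balance_penalty(gpp, reco, nee):
    """Mean squared violation of the assumed balance NEE = Reco - GPP."""
    return sum((n - (r - g)) ** 2 for g, r, n in zip(gpp, reco, nee)) / len(nee)


def knowledge_guided_loss(pred, obs_nee, weight=1.0):
    """Data-fit term on NEE plus a weighted mass-balance penalty."""
    data_term = mse(pred["nee"], obs_nee)
    physics_term = mass_balance_penalty(pred["gpp"], pred["reco"], pred["nee"])
    return data_term + weight * physics_term


obs_nee = [-3.0, -3.0]  # made-up observations

# Two hypothetical model outputs with the same data error on NEE.
consistent = {"gpp": [10.0, 12.0], "reco": [7.0, 8.0], "nee": [-3.0, -4.0]}
inconsistent = {"gpp": [10.0, 12.0], "reco": [7.0, 8.0], "nee": [-3.0, -2.0]}

loss_consistent = knowledge_guided_loss(consistent, obs_nee)
loss_inconsistent = knowledge_guided_loss(inconsistent, obs_nee)
print(loss_consistent, loss_inconsistent)
```

Both outputs fit the observations equally well, but the one whose fluxes do not balance receives a higher loss, which steers training toward physically consistent predictions.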
Example: Advancing natural ecosystem understanding with hybrid AI
Methane is a major contributor to climate change
Data on emissions is sparse and diverse
Pure AI models are good in-sample but don’t generalize
Most data are limited to high latitudes in the Northern Hemisphere
Poor accuracy in the tropics and Southern Hemisphere
Integrating scientific knowledge can improve accuracy
Knowledge-guided initialization using TEM-MDM: https://www.eaps.purdue.edu/ebdl/resources/ecosystems-biochemical.html
Training ML model on its traces improves model accuracy
AI model can help plan new locations for placing measurement sites across the world to maximize predictive accuracy
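Sketch (illustrative only, not the method from the talk): a simple heuristic for site planning is to pick the candidate whose conditions are least represented by existing sites, as a rough proxy for where the model is most uncertain. The features (latitude, mean annual temperature), site names, and values are assumptions; in practice the features would be normalized and the model's own uncertainty estimates would drive the choice.

```python
def distance(a, b):
    """Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def pick_next_site(existing, candidates):
    """Return the name of the candidate farthest from its nearest existing site."""
    return max(
        candidates,
        key=lambda name: min(distance(candidates[name], e) for e in existing),
    )


# Existing sites clustered at high northern latitudes: (latitude, mean temp in C).
existing = [(65.0, -2.0), (60.0, 1.0), (68.0, -8.0)]

# Hypothetical candidate locations.
candidates = {
    "boreal": (62.0, 0.0),
    "temperate": (45.0, 10.0),
    "tropical": (-3.0, 26.0),
}

best = pick_next_site(existing, candidates)
print(best)
```

With existing sites concentrated at high latitudes, the heuristic favors the tropical candidate, matching the observation that the tropics are where accuracy is poorest.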
AI for Natural Methane Working Group:
Ongoing work:
Water quantity & natural concentration
Precision agriculture
Global carbon cycle