Planetary-Scale Inference from Earth Observation and Machine Learning | 9am Tues, January 13, 2026

Grigory Bronevetsky

Jan 11, 2026, 11:38:52 PM
to ta...@modelingtalks.org
Modeling Talks

Planetary-Scale Inference from Earth Observation and Machine Learning 

Tues, Jan 13, 2026 | 9am PT

Meet | Youtube Stream


Hi all,


The presentation will be via Meet and all questions will be addressed there. If you cannot attend live, the event will be recorded and can be found afterward at

https://sites.google.com/modelingtalks.org/entry/planetary-scale-inference-from-earth-observation-and-machine-learning


More information on previous and future talks: https://sites.google.com/modelingtalks.org/entry/home


Abstract:

Recent advances in Earth observation and machine learning enable inference about the Earth system at planetary scale. However, real-world applications are constrained by sparse ground truth, heterogeneous sensing conditions, and domain shift across regions. Addressing these challenges requires learning representations that generalize across geography and time, as well as enabling statistically valid inference from machine learning-derived Earth observation products. I will present two case studies, one on learning invariant features for crop type mapping and one on using machine learning-derived Earth observation maps to enable statistically valid downstream inference. Together, these examples demonstrate how combining machine learning with principled use of Earth observation modalities can yield scalable, reliable insights about human and environmental systems.

 

Bio:

Sherrie Wang is an Assistant Professor at MIT in the Department of Mechanical Engineering and the Institute of Data, Systems, and Society. Her research spans Earth observation data, machine learning, and statistical inference, with the goal of enabling reliable understanding of land and atmospheric systems at scale. Her work spans developing and evaluating geospatial data products, designing machine learning algorithms that generalize under data scarcity and domain shift, and performing downstream inference with principled uncertainty quantification. A central theme of her work is understanding how different sensing modalities, such as satellite imagery and LiDAR, interact with learning algorithms to produce representations that transfer across geographic and temporal scales. Her research supports applications in agriculture, greenhouse gas monitoring, and localized weather inference, particularly in settings where ground-based measurements are limited.

Grigory Bronevetsky

Jan 27, 2026, 2:37:42 PM
to Talks, Grigory Bronevetsky
Video Recording: https://youtube.com/live/i7jlCeNqh7c

Summary:

  • Availability and spatiotemporal resolution of satellite observations, both public and commercial, have grown significantly over the past few decades

  • Can see major global dynamics

    • Economic activity (from nighttime lights)

    • Natural disasters

    • Agriculture

    • Forests

  • ML for remote sensing has exploded over the past decade

  • Earth Intelligence Lab @ MIT

    • Algorithms: label-scarce settings, benchmarks, algorithms tailored to unique properties of remote sensing data

    • Data Products: global scale, uncertainty quantification, accurate novel data sources

    • Causal Inference and Forecasting: assess impacts, scenario-sensitive

  • Representation Learning for Satellite Imagery

    • Tile2Vec: https://arxiv.org/abs/1805.02855

    • Ensure embeddings of nearby tiles in satellite images are closer to each other than embeddings of more distant tiles

    • Resulting embeddings outperform dedicated end-to-end supervised learning algorithms with explicit labels provided

  • LLMs for geospatial tasks

    • Good at captioning, bad at counting: Benchmarking GPT-4V on Earth observation data: https://arxiv.org/abs/2401.17600 (2024)

    • LLMs are good at creating captions for images and recognizing landmarks

    • Fail to count items in images or create bounding boxes around specific regions within images

  • Localized off-grid weather forecasting

    • Global models don’t forecast local conditions such as wind patterns well because they miss local effects, e.g. surface friction due to buildings

    • High-resolution satellite data captures fine-grained local detail

    • Idea: combine gridded forecasts with satellite images

    • Model: train transformer on images + HRRR weather forecast https://rapidrefresh.noaa.gov/hrrr

    • Local Off-Grid Weather Forecasting with Multi-Modal Earth Observation Data: https://arxiv.org/abs/2410.12938

  • Global map of sugarcane

    • Mapping sugarcane globally at 10 m resolution using Global Ecosystem Dynamics Investigation (GEDI) and Sentinel-2: https://essd.copernicus.org/articles/16/4931/2024/

    • Idea: some crops (sugarcane, corn) are much taller than others, so use spaceborne LiDAR (from the GEDI instrument on the International Space Station) to map them

    • Focus here on sugarcane due to its relatively longer growing season (another available feature)

    • GEDI has spotty spatial coverage so they use it to identify some locations where sugarcane is grown

    • They then use these locations as training labels for a vision model that predicts sugarcane locations on a uniform spatial grid from Sentinel-2 optical imagery

  • What does the Clean Water Act regulate?

    • Machine learning predicts which rivers, streams, and wetlands the Clean Water Act regulates: https://www.science.org/doi/10.1126/science.adi3794

    • The water bodies covered by the act change over time

    • ML model maps the water bodies covered by various versions of the Act

  • Global crop type mapping

    • To predict crop types we typically need labeled data

    • Challenge: we don’t have labels for most of the world

    • Can we create crop type models that apply across geographies, making it possible to use labels from some regions to make predictions for other regions?

      • Temporal multi-spectral features

      • Traditional: harmonic regression to get over gaps in imagery, 1D median features to combine spectral data over time

      • Idea: 2D median features that combine spectra over time

    • CropGlobe dataset: crops in US, Argentina, France, UK, China, Australia

    • Model: CropNet

      • Lightweight 9-layer CNN + 2 downsampling stages + spatial dropout for improved robustness

      • 2M parameters

      • Incorporating time shifts, time scale and magnitude warping to transform input features to robustly capture temporal dynamics

    • Use of 2D median features improved accuracy for cross-regional predictions, and hyperspectral data provided some further lift

    • Insight: satellite-only features can be quite effective at cross-region prediction of crop type, suggesting a lot of potential predictive capability for these models

    • Observation: prediction for each crop type depends on the same features across the world, though the timing changes

    • Invariant Features for Global Crop Type Classification: https://arxiv.org/abs/2509.03497
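
For readers unfamiliar with the "time shifts, time scale and magnitude warping" mentioned under CropNet above, here is a minimal sketch of what such augmentations can look like for a single band's time series. It is not the authors' implementation: the function names, the parameter names (max_shift, max_stretch, max_gain), their default values, and the NDVI-like example series are all hypothetical, and the magnitude warp is simplified to a linear gain ramp.

```python
import random


def interp(series, t):
    """Linearly interpolate a 1-D series at fractional index t, clamping at the edges."""
    n = len(series)
    t = min(max(t, 0.0), n - 1.0)
    lo = int(t)
    hi = min(lo + 1, n - 1)
    frac = t - lo
    return (1.0 - frac) * series[lo] + frac * series[hi]


def augment(series, rng, max_shift=2.0, max_stretch=0.1, max_gain=0.1):
    """Return a randomly time-shifted, time-scaled, magnitude-warped copy of a series.

    All parameters are illustrative assumptions, not values from the talk or paper.
    """
    n = len(series)
    center = (n - 1) / 2.0
    # Time shift (in time steps): mimics earlier or later planting dates.
    shift = rng.uniform(-max_shift, max_shift)
    # Time scale: mimics shorter or longer growing seasons.
    stretch = 1.0 + rng.uniform(-max_stretch, max_stretch)
    # Magnitude warp: a gain that ramps linearly from g0 to g1 over the series.
    g0 = 1.0 + rng.uniform(-max_gain, max_gain)
    g1 = 1.0 + rng.uniform(-max_gain, max_gain)
    out = []
    for i in range(n):
        # Index in the original series that output step i samples from.
        src = center + (i - center) * stretch + shift
        gain = g0 + (g1 - g0) * (i / (n - 1)) if n > 1 else g0
        out.append(gain * interp(series, src))
    return out


# Hypothetical vegetation-index curve over one growing season.
ndvi = [0.2, 0.3, 0.5, 0.8, 0.7, 0.4, 0.2]
print(augment(ndvi, random.Random(0)))
```

The intent of augmentations like these is that a model trained on one region sees the same crop curve with different timing and amplitude, which fits the observation above that the same features matter worldwide while the timing changes.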
