Three graduation projects focused on machine learning for EO data


Mitra Baratchi

Sep 24, 2021, 8:35:02 AM
to LIACS thesis projects
Dear students,

Please find below descriptions of three graduation projects, all based on spatio-temporal modelling of Earth observation data. We are working closely with ESA (European Space Agency) and the CML institute.

Regards,
Mitra
________

Project #1: Estimating canopy structure from GEDI lidar data in tropical forest using Convolutional Neural Networks

During deforestation and forest fires, the carbon stored in the Earth’s forests is released into the atmosphere, contributing greatly to climate change. Hence, accurately measuring and mapping forest structure globally using satellite remote sensing data is of great importance. One example of such a sensor is the Global Ecosystem Dynamics Investigation (GEDI): a laser ranging (lidar) instrument mounted on the International Space Station (ISS) (Dubayah et al., 2020). This instrument is currently collecting information on the three-dimensional canopy structure of all temperate and tropical forests. Traditionally, the collected data requires several computationally intensive processing steps before information on the forest can be obtained. A machine learning approach may be faster, more accurate, and require fewer complex preprocessing steps. In this study, you will evaluate the use of advanced machine learning algorithms such as convolutional neural networks to estimate various canopy structure metrics (ground height, canopy height, foliage density along the vertical forest profile) from the laser waveform signals (Dubayah et al., 2021b). Cloud cover and atmospheric conditions lead to inconsistencies in the data, and the instrument’s sampling method results in varying data density across the world. This influences the reliability and resolution of the information that can be extracted from the data. It is expected that smart machine learning methods may overcome some of these limitations and produce more reliable, higher-resolution datasets! You will evaluate whether using neural networks is (a) more accurate and (b) faster than the conventional methods. You will use data over a tropical forest-savanna study site in Gabon, Africa.
You will use lidar data collected during a field campaign in 2016 (Blair and Hofton, 2018a; Fatoyinbo et al., 2021; Marselis et al., 2019), which is known to provide very accurate canopy structure information (Marselis et al., 2018), to train your machine learning model for estimating the canopy metrics, following a method similar to that of Fayad et al. (2021). You will assess whether the results of your model are more accurate than those of the standard GEDI data products (Dubayah et al., 2021a, 2021c) and cross-validate your results over another study site in Gabon (Blair and Hofton, 2018b; Marselis et al., 2019).
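As a rough illustration of the accuracy comparison described above, the sketch below computes RMSE and bias for two sets of canopy height estimates against a reference. The function names and all height values are invented for illustration; they are not taken from GEDI or the field campaign data, and the choice of RMSE and bias as metrics is an assumption.

```python
import math


def rmse(estimates, reference):
    """Root-mean-square error between paired estimates and reference values."""
    if len(estimates) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(e - r) ** 2 for e, r in zip(estimates, reference)]
    return math.sqrt(sum(squared) / len(squared))


def bias(estimates, reference):
    """Mean signed error; a positive value means overestimation."""
    if len(estimates) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(e - r for e, r in zip(estimates, reference)) / len(reference)


# Hypothetical canopy heights in metres, made up for this example only.
reference_height = [30.0, 42.0, 18.0, 25.0]  # stand-in for the reference lidar
model_height = [29.0, 43.0, 19.0, 24.0]      # stand-in for a trained model
product_height = [27.0, 45.0, 21.0, 22.0]    # stand-in for a standard product

model_rmse = rmse(model_height, reference_height)
product_rmse = rmse(product_height, reference_height)
model_bias = bias(model_height, reference_height)

print(f"model RMSE:   {model_rmse:.2f} m")
print(f"product RMSE: {product_rmse:.2f} m")
print(f"model bias:   {model_bias:.2f} m")
```

In a real study the same comparison would be run per metric (ground height, canopy height, and so on) and on the held-out study site.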

We are looking for candidates that match the following profile:

·   Interest in working with Earth observation data

·   Willing to adopt an inter-disciplinary perspective

·   Strong programming skills in Python 

Skills: Python programming, machine learning, neural networks

For information about this project, you can contact Mitra Baratchi, Assistant Professor, email: m.bar...@liacs.leidenuniv.nl

Supervisors: Suzanne Marselis (CML), Laurens Arp (LIACS), Mitra Baratchi (LIACS)




References

Blair, J.B., Hofton, M., 2018a. AfriSAR LVIS L1B Geolocated Return Energy Waveforms, Version 1 [Data set]. https://doi.org/10.5067/ED5IYGVTB50Z

Blair, J.B., Hofton, M., 2018b. AfriSAR LVIS L2 Geolocated Surface Elevation Product, Version 1. https://doi.org/10.5067/A0PMUXXVUYNH

Dubayah, R., Blair, J.B., Goetz, S., Fatoyinbo, L., Hansen, M., Healey, S., Hofton, M., Hurtt, G., Kellner, J., Luthcke, S., others, 2020. The Global Ecosystem Dynamics Investigation: High-resolution laser ranging of the Earth’s forests and topography. Sci. Remote Sens. 1, 100002.

Dubayah, R., Hofton, M., Blair, J., Armston, J., Tang, H., Luthcke, S., 2021a. GEDI L2A Elevation and Height Metrics Data Global Footprint Level V002 [Data set]. https://doi.org/10.5067/GEDI/GEDI02_A.002

Dubayah, R., Luthcke, S., Blair, J., Hofton, M., Armston, J., Tang, H., 2021b. GEDI L1B Geolocated Waveform Data Global Footprint Level V002 [Data set]. https://doi.org/10.5067/GEDI/GEDI01_B.002

Dubayah, R., Tang, H., Armston, J., Luthcke, S., Hofton, M., Blair, J., 2021c. GEDI L2B Canopy Cover and Vertical Profile Metrics Data Global Footprint Level V001 [Data set]. https://doi.org/10.5067/GEDI/GEDI02_B.001

Fatoyinbo, T., Armston, J., Simard, M., Saatchi, S., Denbina, M., Lavalle, M., Hofton, M., Tang, H., Marselis, S., Pinto, N., others, 2021. The NASA AfriSAR campaign: Airborne SAR and lidar measurements of tropical forest structure and biomass in support of current and future space missions. Remote Sens. Environ. 264, 112533.

Fayad, I., Ienco, D., Baghdadi, N., Gaetano, R., Clayton, A.A., Stape, J.L., Ferraco Scolforo, H., Le Maire, G., 2021. A CNN-based approach for the estimation of canopy heights and wood volume from GEDI waveforms over Eucalyptus plantations. Remote Sens. Environ. In Press.

Marselis, S.M., Tang, H., Armston, J., Abernethy, K., Alonso, A., Barbier, N., Bissiengou, P., Jeffery, K., Kenfack, D., Labrière, N., others, 2019. Exploring the relation between remotely sensed vertical canopy structure and tree species diversity in Gabon. Environ. Res. Lett. 14, 94013.

Marselis, S.M., Tang, H., Armston, J.D., Calders, K., Labrière, N., Dubayah, R., 2018. Distinguishing vegetation types with airborne waveform lidar data in a tropical forest-savanna mosaic: A case study in Lopé National Park, Gabon. Remote Sens. Environ. 216, 626–634.

______________

Project #2: Data-guided explainable super-resolution


This project is part of a larger project carried out in collaboration with the European Space Agency.

Remote sensing data, such as imagery captured by satellites, is of tremendous importance to tasks like vegetation monitoring [1], climate analysis [2] and the prediction of extreme events like floods and fires [3]. However, such data is rarely consistent in terms of resolution, making it difficult to apply prediction methods to multiple data sources (data fusion). Moreover, some sensors may simply provide data that is too coarse to draw meaningful conclusions from. As a result, effective super-resolution methods, aimed at providing a homogeneous set of high-resolution images for a given area from various sources, could have considerable impact in various downstream application areas. Importantly, to ensure data reliability and accountability, such methods should ideally also be explainable, rather than black-box approaches. Unlike natural images, remote sensing imagery can use auxiliary high-resolution data sources to guide this process, motivating the use of specific methods that exploit this additional resource. In this project, you will use Python to explore the use of an explainable data-driven interpolation method [4], or other methods of choice (especially machine learning/deep learning), to generate high-resolution imagery from low-resolution sources. Your work could have a high impact on communities that depend on this data, particularly those relating to the Digital Twin of Earth (DTE) [5].
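To make the idea of guiding super-resolution with an auxiliary high-resolution source more concrete, here is a minimal sketch of a simple proportional disaggregation baseline. It is not the value propagation method of [4]; the function name `guided_upsample`, the non-negative guide assumption, and the toy grids are all hypothetical. Each output value can be traced back to one coarse cell and one guide value, which is the kind of transparency the project asks for.

```python
def guided_upsample(coarse, guide, factor):
    """Upsample `coarse` by `factor` using a non-negative high-resolution `guide`.

    Each coarse cell's value is spread over its factor x factor block of fine
    cells in proportion to the guide, so that the mean of the block equals the
    original coarse value.
    """
    rows, cols = len(coarse), len(coarse[0])
    if len(guide) != rows * factor or any(len(r) != cols * factor for r in guide):
        raise ValueError("guide must be `factor` times larger than coarse")

    fine = [[0.0] * (cols * factor) for _ in range(rows * factor)]
    for i in range(rows):
        for j in range(cols):
            block = [
                (i * factor + di, j * factor + dj)
                for di in range(factor)
                for dj in range(factor)
            ]
            weights = [guide[r][c] for r, c in block]
            mean_weight = sum(weights) / len(weights)
            for (r, c), w in zip(block, weights):
                if mean_weight > 0:
                    fine[r][c] = coarse[i][j] * (w / mean_weight)
                else:
                    # The guide carries no signal here: keep the block flat.
                    fine[r][c] = coarse[i][j]
    return fine


# Toy example: a 1 x 2 coarse image and a 2 x 4 guide (values are invented).
coarse = [[10.0, 20.0]]
guide = [
    [1.0, 3.0, 2.0, 2.0],
    [1.0, 3.0, 2.0, 2.0],
]
fine = guided_upsample(coarse, guide, factor=2)
for row in fine:
    print(row)
```

A baseline like this is mainly useful as a reference point when judging whether a learned or more sophisticated interpolation method adds value.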

We are looking for candidates that match the following profile:

·   Interest in working with Earth observation data

·   Strong programming skills in Python 

Skills: Python programming, machine learning, neural networks

For information about this project, you can contact Mitra Baratchi, Assistant Professor, email: m.bar...@liacs.leidenuniv.nl

Supervisors: Laurens Arp (LIACS), Mitra Baratchi (LIACS), Suzanne Marselis (CML)


References:

[1] Jochem Verrelst et al., 2019. Quantifying vegetation biophysical variables from imaging spectroscopy data: a review on retrieval methods. Surveys in Geophysics 40.3 (2019): 589-629.

[2] Qiangqiang Yuan et al., 2020. Deep learning in environmental remote sensing: Achievements and challenges. Remote Sensing of Environment 241 (2020): 111716.

[3] Joshua Lizundia-Loiola et al., 2020. A spatio-temporal active-fire clustering approach for global burned area mapping at 250 m from MODIS data. Remote Sensing of Environment 236 (2020): 111493.

[4] Laurens Arp, Mitra Baratchi, Holger Hoos, 2021. Value propagation-based spatio-temporal interpolation inspired by Markov reward processes. Submitted to the ECML-PKDD Journal Track in March 2021, currently under review.

[5] Sveinung Loekken, Bertrand Le Saux, Sara Aparicio, 2020. The contours of a trillion-pixel Digital Twin Earth. A presentation to EarthVision 2020 Seattle (and the aether).

________________

Project #3: Causal inference: interventions


This project is part of a larger project carried out in collaboration with the European Space Agency. 

Many of the big problems humanity faces today, like global warming and its effects, are highly complex and require effective interventions. The great success of machine learning and deep learning techniques in recent years motivates their use in Earth system applications, supporting policy makers and allowing them to make informed decisions. However, unlike traditional physics-based methods, machine learning tends to be purely data-driven and exploits correlation, rather than an understanding of causation. This has led to the call for causal inference to be included in machine learning approaches and Earth system sciences [1]. In particular, when the methods are used to decide on interventions to change the system (e.g., mitigating the consequences of global warming), the interventions themselves may alter the distribution of the data the machine learning models were trained on, thus reducing their effectiveness and reliability as tools supporting policy making. As a result, an understanding of the causal link between interventions and their effects is needed to make counterfactual predictions [2,3]. In this project, you will work on counterfactual machine learning predictions in the context of Earth system interventions. Counterfactuals in an Earth observation context will come with particular challenges, such as a potential delay prior to observing the outcome of changes [4]. You will work with machine learning models in Python and learn about incorporating causal relations into these models. The work would fit in the context of a greater project, carried out in collaboration with the European Space Agency, aimed at creating a physics-aware Digital Twin of Earth [5]. 
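The gap between correlation and the effect of an intervention can be illustrated with a toy simulation. The sketch below assumes a made-up linear system with a hidden confounder `z`; the coefficients, variable names, and sample size are chosen only for illustration and are not drawn from any Earth system model. A regression fitted on observational data recovers a slope that mixes the causal effect with the confounder's influence, while data generated under an intervention on `x` recovers the causal effect alone.

```python
import random


def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


def simulate(n, intervene, rng):
    """Draw n samples from a toy system where y = 2*x + 3*z + noise.

    Observationally, x depends on the confounder z. Under intervention,
    x is set independently of z.
    """
    xs, ys = [], []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)  # hidden confounder
        if intervene:
            x = rng.gauss(0.0, 1.0)  # x is set from outside the system
        else:
            x = z + rng.gauss(0.0, 1.0)  # x is driven by z
        y = 2.0 * x + 3.0 * z + rng.gauss(0.0, 1.0)
        xs.append(x)
        ys.append(y)
    return xs, ys


rng = random.Random(0)
obs_slope = slope(*simulate(20000, intervene=False, rng=rng))
int_slope = slope(*simulate(20000, intervene=True, rng=rng))

# In expectation the observational slope is 2 + 3 * cov(x, z) / var(x) = 3.5,
# whereas the interventional slope equals the causal coefficient 2.
print(f"observational slope:  {obs_slope:.2f}")
print(f"interventional slope: {int_slope:.2f}")
```

A model trained only on the observational data would therefore overstate what changing `x` achieves, which is the kind of failure counterfactual prediction methods aim to avoid.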

We are looking for candidates that match the following profile:

·   Interest in working with Earth observation data

·   Strong programming skills in Python 

Skills: Python programming, machine learning, neural networks

For information about this project, you can contact Mitra Baratchi, Assistant Professor, email: m.bar...@liacs.leidenuniv.nl

Supervisors: Laurens Arp (LIACS), Mitra Baratchi (LIACS), Suzanne Marselis (CML)


References:

[1] Jakob Runge et al., 2019. Inferring causation from time series in Earth system sciences. Nature Communications 10.1 (2019): 1-13.

[2] Jason Hartford et al. 2017. Deep IV: A flexible approach for counterfactual prediction. International Conference on Machine Learning.

[3] Mattia Prosperi et al. 2020. Causal inference and counterfactual prediction in machine learning for actionable healthcare. Nature Machine Intelligence 2.7 (2020): 369-375.

[4] Philip Naumann and Eirini Ntoutsi. 2021. Consequence-aware Sequential Counterfactual Generation. Joint European Conference on Machine Learning and Knowledge Discovery in Databases (2021).

[5] Sveinung Loekken, Bertrand Le Saux, Sara Aparicio, 2020. The contours of a trillion-pixel Digital Twin Earth. A presentation to EarthVision 2020 Seattle (and the aether).

