Multiple master's thesis projects at the ADA research group (AutoML/ML for spatio-temporal datasets)

Mitra Baratchi

Sep 28, 2022, 4:09:57 AM

to LIACS thesis projects

AutoML for EO

In a previous master's thesis project, a student developed AutoSR4EO, an automated machine learning (AutoML) approach to single-image super-resolution. The framework automatically constructs a neural network tailored to the training dataset. Single-image super-resolution increases the resolution of an image by mapping a single low-resolution image to a higher-resolution one. It is a low-level task that can be applied to images before a downstream task such as classification or detection. AutoSR4EO achieved state-of-the-art results, and it could be improved or extended in different ways. For instance, multi-image super-resolution, which maps multiple low-resolution images to a single high-resolution one, is gaining popularity over 1-to-1 mappings from low to high resolution. However, designing these methods by hand is time-consuming, which creates the need for AutoML approaches to the problem.

Another research angle is that super-resolution methods are usually trained in a supervised manner against a ground-truth high-resolution image only. A task-driven approach that takes downstream performance into account (e.g., with an additional loss term) could improve the performance of the complete analysis pipeline.
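To make the task-driven idea concrete, here is a minimal sketch of a combined objective: a pixel-wise reconstruction loss plus a weighted loss from a downstream classifier. The function names, the use of mean squared error and cross-entropy, and the `task_weight` value are illustrative assumptions, not part of AutoSR4EO or the cited papers.

```python
import math


def pixel_mse(sr, hr):
    """Mean squared error between super-resolved and ground-truth pixels (flat lists)."""
    return sum((s - h) ** 2 for s, h in zip(sr, hr)) / len(hr)


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class under the downstream classifier."""
    return -math.log(max(probs[label], 1e-12))


def task_driven_loss(sr, hr, task_probs, task_label, task_weight=0.1):
    """Reconstruction loss plus a weighted downstream-task loss.

    task_weight is a hypothetical hyperparameter that trades image fidelity
    against downstream performance.
    """
    return pixel_mse(sr, hr) + task_weight * cross_entropy(task_probs, task_label)
```

With `task_weight=0` this reduces to purely supervised training against the high-resolution image. In a deep learning framework the same sum would be computed on tensors so that gradients from the downstream model reach the super-resolution network.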

[1]Razzak, M., Mateo-Garcia, G., Gómez-Chova, L., Gal, Y., and Kalaitzis, F., “Multi-Spectral Multi-Image Super-Resolution of Sentinel-2 with Radiometric Consistency Losses and Its Effect on Building Delineation”, 2021.

[2] Märtens M., Izzo D., Krzic A. and Cox D. "Super-resolution of PROBA-V images using convolutional neural networks." Astrodynamics 3.4 (2019): 387-402. (arxiv version)

[3] Jacob Shermeyer, Adam Van Etten; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2019, pp. 0-0

[4] Zou, F., Xiao, W., Ji, W. et al. Arbitrary-oriented object detection via dense feature fusion and attention model for remote sensing super-resolution image. Neural Comput & Applic 32, 14549–14562 (2020). https://doi.org/10.1007/s00521-020-04893-9

[5] Haris, M., Shakhnarovich, G., Ukita, N. (2021). Task-Driven Super Resolution: Object Detection in Low-Resolution Images. In: Mantoro, T., Lee, M., Ayu, M.A., Wong, K.W., Hidayanto, A.N. (eds) Neural Information Processing. ICONIP 2021. Communications in Computer and Information Science, vol 1516. Springer, Cham. https://doi.org/10.1007/978-3-030-92307-5_45

Skills: AutoML, deep learning, Keras/PyTorch, image processing

_____________________

Instance features for Earth observation data

Earth observation data is remote sensing data about the planet, typically (but not always) collected by satellites. Nowadays, Earth observation data is frequently combined with machine learning and deep learning models for common applications including land cover classification, flood mapping and crop monitoring. However, because this type of data spans most of the planet, it is inevitably diverse, noisy and inconsistent. Evaluating and configuring (calibrating/parameterising) models is known to be a non-trivial problem for traditional physical models [1], and the performance of machine learning models can be similarly erratic, especially due to data scarcity [2]. As a result, it is often unclear which model, with which configuration, a user should choose for a specific problem (e.g., plant matter retrieval in sub-Saharan Africa), even if experimental results for the same problem and models in a different region (e.g., a forest in Sweden) are available.

In this project, you will research possible instance features for multi-spectral Earth observation data. You can explore physically relevant features (such as vegetation indices) or structural features (such as contrast), but also work with latent representations or other approaches, although explainability will be a desirable property. The representations should help explain experimental results (e.g., model X performs well on instance Y because of properties Z) and improve algorithm selection on new instances. You will work on this MSc thesis project as part of the ADA research group, supervised by Mitra Baratchi.
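As an illustration of what such instance features could look like, the sketch below summarises one image by a vegetation index (NDVI) and a simple contrast measure. The choice of NDVI, the red and near-infrared band inputs, and the feature names are assumptions made for this example; the project itself leaves the feature set open.

```python
from statistics import mean, pstdev


def ndvi(nir, red, eps=1e-9):
    """Per-pixel NDVI = (NIR - Red) / (NIR + Red) for two equally sized bands."""
    return [(n - r) / (n + r + eps) for n, r in zip(nir, red)]


def instance_features(nir, red):
    """Summarise one image (flat band lists) as a small, interpretable feature dict."""
    index = ndvi(nir, red)
    return {
        "ndvi_mean": mean(index),          # overall vegetation signal
        "ndvi_std": pstdev(index),         # heterogeneity of vegetation cover
        "red_rms_contrast": pstdev(red),   # RMS contrast = std. dev. of intensities
    }
```

Feature vectors of this kind could then be fed to an algorithm selection model, and because each entry has a physical or structural meaning, they also support statements such as "model X performs well on instances with high `ndvi_mean`".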

Skills: Machine learning, deep learning, image processing, AutoML

[1] Beck, Hylke et al. 2016. Global-scale regionalization of hydrologic model parameters. Water Resources Research 52-5.

[2] Vali, Ava et al. 2020. Deep Learning for Land Use and Land Cover Classification Based on Hyperspectral and Multispectral Earth Observation Data: A Review. Remote Sensing 12-15.

__________

Robust counterfactual suggestions for Earth observation data

When combined with machine learning models, Earth observation data can be used to derive important information about, for example, the health of ecosystems, the temperature of the planet, and the yield of crop fields. Moreover, temporal forecasting models can be used to estimate the future status of the planet, given some prior data (such as the past and current states, or actions taken). Although these models can be of tremendous value for monitoring purposes, they only model factual predictions; that is, they provide a prediction given some real input. However, in many cases one may also be interested in counterfactual predictions: which inputs should be changed in order for some other, generally more desirable, outcome (for example, a greater crop yield or a lower global temperature) to be predicted [1]? These suggested changes to the inputs could form the basis of policies. However, machine learning models (and deep learning models in particular) can be vulnerable to “gaming the system”, where, for example, gaps in the training data are simply exploited, leading to unreliable counterfactual suggestions that do not reflect the true outcomes in the real world.
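For intuition, the simplest case has a closed form: for a linear model, the smallest (Euclidean) input change that reaches a target prediction moves the input along the weight vector. The sketch below assumes such a linear model with hypothetical weights; it is not the method of any of the cited papers, and it deliberately ignores the robustness issue described above.

```python
def predict(weights, bias, x):
    """Prediction of a linear model: bias + w . x"""
    return bias + sum(w * xi for w, xi in zip(weights, x))


def linear_counterfactual(weights, bias, x, target):
    """Smallest (L2) input change that moves a linear model's prediction to `target`."""
    gap = target - predict(weights, bias, x)
    norm_sq = sum(w * w for w in weights)
    if norm_sq == 0:
        raise ValueError("model ignores all inputs; no counterfactual exists")
    # Move along the weight vector, scaled so the prediction gap is closed exactly.
    return [xi + gap * w / norm_sq for w, xi in zip(weights, x)]
```

Nothing here checks whether the suggested input lies in a region covered by the training data or is physically achievable, which is exactly where a non-robust model can be "gamed".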

In this project, you will work on robust machine learning models for Earth observation data, applied to counterfactual suggestions. You can explore causal inference mechanisms [2], and counterfactual actions based thereon, as a possible approach to address this problem, but you are free to pursue other methods as well. You will work on this MSc thesis project as part of the ADA research group, supervised by Mitra Baratchi.

Skills: Machine learning, deep learning, spatio-temporal data, counterfactuals

[1] Naumann, Philip et al. 2021. Consequence-Aware Sequential Counterfactual Generation. Proceedings of Machine Learning and Knowledge Discovery in Databases.

[2] Runge, Jakob et al. 2019. Inferring causation from time series in Earth system sciences. Nature Communications.

[3] Herlands, William, Daniel B. Neill, Hannes Nickisch, and Andrew Gordon Wilson. Change Surfaces for Expressive Multidimensional Changepoints and Counterfactual Prediction. J. Mach. Learn. Res. 20 (2019): 99-1.

[4] Hartford, J., Lewis, G., Leyton-Brown, K. and Taddy, M., 2017, July. Deep IV: A flexible approach for counterfactual prediction. In International Conference on Machine Learning (pp. 1414-1423). PMLR.

___________________

Next generation Nitrogen monitoring system

Atmospheric imagers, such as TROPOMI, collect information on the Earth's atmosphere from space. With such imagers it is possible to collect information on the composition of gases in the atmosphere. However, they have limited spatial resolution. In other words: the images have rather ‘coarse pixels’. This makes it difficult to detect changes in individual sources (points at which specific gases are emitted). Airplane observations can resolve the spatial variability, but lack the temporal resolution and spatial coverage required to track the emissions of different sources over time.

In this project you will tackle two main challenges. Your first focus will be the automated detection of nitrogen plumes from point sources. Such plumes are relatively easy to detect by the human eye but more difficult for a computer to distinguish. Your second challenge will be to improve the spatial resolution of reactive nitrogen monitoring through a combination of satellite and airplane observations. You will use the higher-resolution airplane observations as ground truth to train machine learning algorithms on the lower-resolution satellite observations. The project will be supervised by Mitra Baratchi (LIACS), Suzanne Marselis (CML) and Enrico Dammers (CML & TNO).
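One basic building block when pairing the two data sources is aggregating a fine airplane grid to the coarse satellite pixel size, for example to check that both agree at the coarse scale. The sketch below assumes co-registered regular grids and an integer resolution ratio (`factor`); real airplane and satellite footprints are irregular, so this is a simplification for illustration only.

```python
def block_average(fine, factor):
    """Average non-overlapping factor x factor blocks of a 2D grid (list of rows)."""
    rows, cols = len(fine), len(fine[0])
    if rows % factor or cols % factor:
        raise ValueError("grid size must be a multiple of the factor")
    coarse = []
    for r in range(0, rows, factor):
        coarse_row = []
        for c in range(0, cols, factor):
            # Collect all fine pixels that fall inside one coarse pixel.
            block = [fine[r + i][c + j] for i in range(factor) for j in range(factor)]
            coarse_row.append(sum(block) / len(block))
        coarse.append(coarse_row)
    return coarse
```

A training pair would then consist of a coarse satellite patch as input and the matching fine airplane patch as target, with `block_average` of the model output offering a consistency check against the original satellite values.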

Skills: Machine learning, spatio-temporal data, sparse data

[1] Tack et al. 2021 https://doi.org/10.5194/amt-14-615-2021

[2] Kuhlmann et al. 2019 https://doi.org/10.5194/amt-12-6695-2019

______________________

Detecting and Resolving Striping in Satellite measurements with sparse data

Striping is a common problem for satellite sensors, such as TROPOMI. Striping occurs because of systematic errors/biases in individual detector pixels. Such issues limit the usefulness of the data as striping artefacts obscure real information and introduce bias (patterns) into the data. Identifying striping and developing methods to resolve the striping is essential to increase the usability of the data.

In this project you will focus on two major challenges. First, you will develop a method that can automatically identify striping in satellite data. Second, you will analyze the detected striping for relations to any secondary information in the satellite product, and develop and test several machine learning methods to replace the striping with more useful data. You will compare your results to the state-of-the-art methods that are currently available. In this project we will specifically look at atmospheric data from TROPOMI, but the aim is also to develop a method that is applicable across a broader range of satellite and aircraft data. The project will be supervised by Mitra Baratchi (LIACS), Suzanne Marselis (CML) and Enrico Dammers (CML & TNO).
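A simple statistical baseline for the first challenge is sketched below: flag image columns whose median deviates strongly from the other columns. It assumes that each column corresponds to one detector pixel and each row to one along-track position, and the threshold of 3 is an arbitrary illustrative choice. It is not the approach of the related work listed below, and it will confuse real across-track gradients with striping.

```python
from statistics import median


def detect_stripes(image, threshold=3.0):
    """Return indices of columns whose median deviates strongly from the other columns."""
    col_medians = [median(col) for col in zip(*image)]
    center = median(col_medians)
    # Median absolute deviation (MAD): a spread estimate that outlier columns barely affect.
    mad = median(abs(m - center) for m in col_medians)
    scale = 1.4826 * mad or 1e-12  # MAD scaled to match a standard deviation
    return [i for i, m in enumerate(col_medians) if abs(m - center) / scale > threshold]
```

Subtracting each flagged column's offset from `center` would give a crude correction against which learned methods could be compared.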

Skills: Machine learning, spatio-temporal data, artefact removal, sparse data.

Related work:

[1] Borsdorff et al. 2019 https://doi.org/10.5194/amt-12-5443-2019

[2] Lloyd et al. 2020 https://doi.org/10.1117/12.2574449

Image (from Borsdorff et al. 2019) to demonstrate the concept:
