Virtual seminar on Reproducibility in Signal Processing Research (22/09/2021 morning and 20/10/2021 afternoon, Paris time)

Mathieu Lagrange

Sep 6, 2021, 6:00:06 AM

to Community Announcements

-- Apologies for any cross posting --

Dear all,

We are organizing a virtual seminar over two half-days (22/09/2021 morning and 20/10/2021 afternoon, Paris time) on "Reproducibility in Signal Processing" with the following speakers:

- Cynthia Liem, Associate professor, TU Delft, Netherlands, https://www.cynthialiem.com

- Alexandre Gramfort, Research Director, INRIA, France, https://alexandre.gramfort.net

- Brian McFee, Assistant professor, New York University, United States, https://brianmcfee.net

- Annamaria Mesaros, Assistant professor, Tampere University, Finland, https://homepages.tuni.fi/annamaria.mesaros

This event is organized as part of the "Groupe de recherche" ISIS of the CNRS. Registration is free and mandatory: http://www.gdr-isis.fr/index.php/reunion/459

We have a standing **call for contributions**. If you are a PhD student or a postdoc willing to share your experience with reproducibility in your own research, please email me (mathieu<dot>lagrange<at>ls2n<dot>fr). You will be invited to present to the whole audience for 5 minutes, and virtual rooms will be opened afterwards for more informal discussion.

Organisation

Half-Day 1: September 22, 2021

Zoom link: https://ec-nantes.zoom.us/j/96169555852 (passcode: Pw@JAhH1)

9:00 Cynthia Liem: "Validation and validity in music processing pipelines"

10:00 Alexandre Gramfort: "Reproducible Machine Learning: software challenges, anecdotes and some engineering solutions"

11:00 PhD presentations

11:30 Virtual room discussion on PhD presentations

12:30 Closing

Half-Day 2: October 20, 2021

Zoom link: https://ec-nantes.zoom.us/j/99175543775 (passcode: qPrT%65N)

14:00 Brian McFee: "Reproducibility, open source, and open data in Music Information Retrieval research"

15:00 Annamaria Mesaros: "Reproducibility in system evaluation"

16:00 "Tour de table" with all speakers

17:30 Closing

Organizers:

Mathieu Lagrange, CNRS researcher (chargé de recherche), LS2N, UMR 6004

Vincent Lostanlen, CNRS researcher (chargé de recherche), LS2N, UMR 6004

Slim Essid, Professor, LTCI, Télécom ParisTech

Context:

The process of scientific experimentation is increasingly based on information science. In particular, signal and image processing (SIP) tools have played an essential role in many recent discoveries in physics: the detection of gravitational waves and the observation of black holes, for example. In addition, recent advances in certain digital technologies, such as functional neuroimaging and sound classification, are based on increasingly sophisticated software codebases. However, none of these SIP applications is the result of a single algorithm; each is the joint work of a specialized research sub-community. Whether in astrophysics or acoustics, the innovation process remains essentially the same: first, the community develops massive databases, performance metrics, and a common software environment. Then, individual research groups compete to improve the state of the art. For example, the renewed growth of deep neural networks during the decade 2010-2020 was made possible by new databases (e.g., ImageNet, AudioSet), official "challenges" (e.g., ILSVRC, DCASE), and numerical libraries (e.g., TensorFlow, PyTorch).

In this context, the reproducibility of experiments is of crucial importance. First, when addressing a new problem, it is useful to begin with a simple approach whose theoretical properties are well understood. Such a baseline should be made freely accessible. Second, students gain hands-on experience by inspecting and re-implementing well-established methods in signal processing and, more generally, in information science. Finally, developing software in open-source communities rather than in vertical organizations (silos) has advantages per se: quicker bug reporting and troubleshooting, up-to-date documentation, and scheduling of feature requests.

However, the need for research reproducibility goes beyond a simple list of good practices such as version control or the use of unit tests. In her work on "trustworthy information systems" (TIS), Cynthia Liem has shown that state-of-the-art deep neural networks for music classification are far from having a "musical ear": rather, these models tend to exaggerate some imperceptible aspects of music while lacking sensitivity to musically meaningful transformations.

The high cost of data acquisition, in the field of neuroscience for example, jeopardizes the reproducibility of numerical experiments. Therefore, in order to boost the adoption of open data, it is necessary to integrate software routines for loading and formatting data alongside transformation and statistical learning tools. This is what Alexandre Gramfort proposed with the scikit-learn library as well as with the Rapid Analytics and Model Prototyping (RAMP) project.

In addition, signal processing research often operates on highly structured data: such is the case, for example, of a musical score or a chord progression. To guarantee the reproducibility of music information retrieval systems, this rich structure should be preserved in machine predictions and remain interpretable by humans. The work of Brian McFee on the JSON-Annotated Music Specification (JAMS) format reflects this concern for structuring and software interoperability.

Lastly, the definition of relevant evaluation metrics requires special attention. Indeed, it is on the basis of these metrics that the scientific community concerned decides on its future directions and assesses the relevance of its proposals. Annamaria Mesaros, who has been organizing the Detection and Classification of Acoustic Scenes and Events (DCASE) challenge since 2016, has long experience with these questions of evaluation and editorialization of applied research. In particular, she maintains the sed_eval software library, which is now the de facto standard for evaluating the performance of a sound event detector.
