Hello Marjolaine,
Thank you very much for your response. It is very helpful to understand how you are currently handling this issue.
I fully agree that keeping even "bad data", or at least its metadata, catalogued can be important for traceability, especially when the information comes from the user or from the experimental context. I can imagine this also providing useful metadata on operational efficiency, such as recurring causes of remeasurement or beamline-specific bottlenecks.
At Sirius, we are working on a solution along similar lines, but with a specific focus on the lifecycle of data in the period between acquisition and long-term cataloguing, which we are calling transient datasets. We are considering developing a data management layer for this purpose, as our Bluesky-based control system currently has no features to consistently manage this kind of information throughout a long sequence of measurements.
The idea would be to capture, during the experiment, information such as whether a measurement is valid, should be remeasured, should be preserved, or could eventually be discarded. Users already make these decisions during experiments, and capturing this feedback as it happens could make the curation process more efficient, especially when dealing with individual scans or subsets of scans within a larger dataset, which I fully agree is probably the most challenging case. If this information is collected only later, users may no longer remember the decisions they made, or the experimental context that motivated them.
Regarding the interface for collecting user feedback in practice, I agree that the Data Portal could be the best place to do this, since users are already familiar with it and this would avoid asking them to use yet another system. We are still discussing this project at the conceptual level, and we expect to start developing a proof of concept soon for a component that will interact with transient data. We are also considering presenting or discussing this idea at NOBUGS.
I would be very happy to share our thoughts and discuss how we could contribute to the ICAT community, especially by thinking beyond our local case at Sirius with Bluesky-based control systems and considering how such a mechanism could support other acquisition environments. Once I have material showing the main idea and architectural design of the solution we are devising, I would be glad to schedule a presentation for anyone interested in this topic.
Best regards,
Allan