I can definitely see why FID has been chosen, and it's got great merit.
But I wonder if there's room for a more extended metric that captures the experience on longer-lived pages, where multiple inputs might occur?
Example use case: a web app loads a largish data set when a button is clicked. The FID is very low here because the main thread is quiet at that point.
The data is displayed in a table, perhaps with a column calculated from other fields. When the user clicks to sort by that column, the delay on that input could be much longer than the very low FID the page recorded.
I think some visibility of that might show developers that it's worth moving the work off the main thread with a worker, doing the calculations server side, or investigating other more performant options.
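
To make that concrete, here is a rough sketch of how the slowest input on a page could be picked out, not just the first one. It assumes the entries come from the Event Timing API, where each entry has a `startTime` and a `processingStart` and the difference between them is the input delay. The names `TimedInput`, `inputDelay` and `worstInput` are made up for illustration, and the sample numbers are invented.

```typescript
// Minimal shape of the fields read from an Event Timing entry.
// (Hypothetical type; real entries are PerformanceEventTiming objects.)
interface TimedInput {
  name: string;            // event type, e.g. "click"
  startTime: number;       // when the input happened (ms)
  processingStart: number; // when its handlers started running (ms)
}

// Input delay: how long the input waited before handlers could start.
function inputDelay(entry: TimedInput): number {
  return entry.processingStart - entry.startTime;
}

// Hypothetical helper: the input with the longest delay, or null if none.
function worstInput(entries: TimedInput[]): TimedInput | null {
  let worst: TimedInput | null = null;
  for (const entry of entries) {
    if (worst === null || inputDelay(entry) > inputDelay(worst)) {
      worst = entry;
    }
  }
  return worst;
}

// Invented sample: a quick first click, then a slow click on "sort".
const sample: TimedInput[] = [
  { name: "click", startTime: 100, processingStart: 104 },
  { name: "click", startTime: 5000, processingStart: 5380 },
];
const slowest = worstInput(sample);

// In a browser, entries could be collected roughly like this
// (assumes Event Timing support; untested wiring, shown as a comment):
//   new PerformanceObserver((list) => {
//     for (const e of list.getEntries()) { /* push startTime, processingStart */ }
//   }).observe({ type: "event", buffered: true });
```

In the sample, FID would only reflect the first click, while `slowest` points at the later sort click.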
Perhaps something akin to the new maximum session window approach CLS is taking?
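
Purely as a thought experiment, this is what borrowing the CLS windowing might look like for input delay. It is not an existing metric: it assumes the same limits CLS uses (a window ends after a 1 second gap or once it spans 5 seconds), and summing the delays within a window is my own guess at an aggregation. `DelaySample` and `worstWindowDelay` are hypothetical names.

```typescript
// One observed input: when it happened and how long it was delayed (ms).
interface DelaySample {
  time: number;
  delay: number;
}

// Groups inputs into session windows and returns the largest summed delay.
// maxGap and maxWindow default to the CLS limits (assumption, see above).
function worstWindowDelay(
  samples: DelaySample[],
  maxGap = 1000,
  maxWindow = 5000,
): number {
  const sorted = [...samples].sort((a, b) => a.time - b.time);
  let worst = 0;
  let current = 0;
  let windowStart = 0;
  let previous = 0;

  sorted.forEach((sample, index) => {
    const startsNewWindow =
      index === 0 ||
      sample.time - previous >= maxGap ||
      sample.time - windowStart >= maxWindow;
    if (startsNewWindow) {
      current = 0;
      windowStart = sample.time;
    }
    current += sample.delay;
    previous = sample.time;
    worst = Math.max(worst, current);
  });

  return worst;
}
```

A burst of slow inputs close together would score worse than the same delays spread across a long session, which may or may not be the behaviour we'd want.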
Is this something that's too expensive to track? Perhaps too noisy?