Dear Rafael,
Great that you got interested in OpenML :)
To give you a little 101: OpenML works with the concept of tasks (rather than datasets).
In short, a task represents a dataset together with an estimation procedure (e.g., cross-validation) and a target feature.
For more information, please see this blog post:
So, the correct question would be: "How can I get the best results/metrics (like MSE or F1) per task on OpenML?" :)
This can be done through the various APIs that we provide (REST/Python/R/Java),
although it takes a bit of a detour: we don't support grouping operations per task yet,
but it seems like a useful feature, so feel free to open an issue.
Through the REST API, I would suggest something like this:
Iterate over all tasks; for each one, the following call will give you all results for that task:
Then it's a matter of picking the best value (the highest for measures like F1, the lowest for MSE).
All these API calls are natively supported by the packages for the aforementioned programming languages.
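
As a rough sketch of that last "pick the best per task" step in plain Python: this is not an OpenML API call, just the grouping done by hand once the results have been fetched. The record fields (`task_id`, `run_id`, `value`) and the numbers are made up for illustration; the real responses will be shaped differently, so adapt the keys accordingly.

```python
def best_per_task(evaluations, higher_is_better=True):
    """Group evaluation records by task and keep the best one per task.

    `evaluations` is an iterable of dicts with the (hypothetical) keys
    "task_id", "run_id" and "value".
    """
    # Use max for measures like F1, min for error measures like MSE.
    pick = max if higher_is_better else min

    # Collect all records belonging to the same task.
    grouped = {}
    for ev in evaluations:
        grouped.setdefault(ev["task_id"], []).append(ev)

    # Keep only the best record of each group.
    return {
        task_id: pick(records, key=lambda ev: ev["value"])
        for task_id, records in grouped.items()
    }


# Dummy records, standing in for results collected task by task.
example = [
    {"task_id": 1, "run_id": 10, "value": 0.81},
    {"task_id": 1, "run_id": 11, "value": 0.93},
    {"task_id": 2, "run_id": 20, "value": 0.55},
    {"task_id": 2, "run_id": 21, "value": 0.42},
]

for task_id, ev in best_per_task(example).items():
    print(task_id, ev["run_id"], ev["value"])
```

For an error measure such as MSE, call it with `higher_is_better=False` so that the lowest value wins.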
Best,
Jan