How to get results/metrics per dataset?


Rafael Novello

Jun 4, 2018, 3:26:21 PM
to OpenML
Hi guys!

My name is Rafael. I'm a "newbie" machine learning engineer from Brazil, and I have become very interested in OpenML!

I searched the docs and the web, but I could not find how to get the best results/metrics (like MSE or F1) per dataset on OpenML. Is there a way to do that?

I'm working with AutoML and I need to compare the results I get from AutoML libraries with the results obtained by traditional methods.

Thanks a lot, and sorry for my poor English.

Jan van Rijn

Jun 4, 2018, 4:43:56 PM
to Rafael Novello, OpenML
Dear Rafael,

Great that you got interested in OpenML :)

To give you a little 101: OpenML works with the concept of tasks (rather than datasets).
In short, a task represents a dataset together with an estimation procedure (e.g., cross-validation) and a target feature.
For more information, please see this blog post:

So, the correct question would be: "how can I get the best results/metrics (like MSE or F1) per task at OpenML" :)

This can be done through the various APIs that we provide (REST/Python/R/Java),
although it takes a bit of a detour: we don't support grouping operations per task yet,
but it seems like a useful feature, so feel free to open an issue.

Through the REST API, I would suggest something like this:
https://www.openml.org/api/v1/task/list/type/1 (will give all supervised classification tasks)

By iterating over all tasks, the following call will give you all results for a given task:

Then it's a matter of picking the highest value.
All these API calls are natively supported by the packages in the aforementioned programming languages.
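
A minimal Python sketch of the "list the evaluations, then take the highest" step over the REST API. The `/json` root, the `evaluation/list/function/.../task/...` path, and the `evaluations` / `evaluation` / `value` keys in the response are assumptions made for illustration and should be checked against the API documentation or a live response. The helper names are hypothetical.

```python
import json
from urllib.request import urlopen

# Assumed JSON variant of the REST root; verify against the API docs.
BASE = "https://www.openml.org/api/v1/json"


def evaluation_url(task_id, measure):
    # Assumed path layout for the evaluation listing call.
    return f"{BASE}/evaluation/list/function/{measure}/task/{task_id}"


def best_from_payload(payload):
    """Return the evaluation record with the highest value, or None if empty."""
    # Assumed response shape: {"evaluations": {"evaluation": [ {...}, ... ]}}
    records = payload.get("evaluations", {}).get("evaluation", [])
    # Values may arrive as strings in JSON, so convert before comparing.
    return max(records, key=lambda r: float(r["value"]), default=None)


def best_for_task(task_id, measure="predictive_accuracy"):
    # Performs a network request; not exercised by the sample below.
    with urlopen(evaluation_url(task_id, measure)) as response:
        return best_from_payload(json.load(response))


# Hand-written sample payload (not real OpenML results).
sample = {"evaluations": {"evaluation": [
    {"run_id": 1, "flow_name": "flow_a", "value": "0.71"},
    {"run_id": 2, "flow_name": "flow_b", "value": "0.93"},
]}}
print(best_from_payload(sample)["flow_name"])  # flow_b
```

Calling `best_for_task` once per task ID from the task listing would give the per-task best, at the cost of one request per task.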

Best,
Jan



--
You received this message because you are subscribed to the Google Groups "OpenML" group.
To unsubscribe from this group and stop receiving emails from it, send an email to openml+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Joaquin Vanschoren

Jun 4, 2018, 4:58:40 PM
to Jan van Rijn, Rafael Novello, OpenML
Hi Rafael, 

Just to add, this is the call you want in Python:

So you get all evaluations for a specific task and evaluation measure, then check which flow has the best score. You'd have to do that task by task.
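
A hedged sketch of that per-task lookup with the Python client. It assumes `openml.evaluations.list_evaluations` accepts the keywords `function`, `tasks` and `size` and returns a dict of evaluation objects carrying a numeric `.value`. Keyword names and the return type have varied between client versions, so check the installed version. The helper names and the sample objects are hypothetical.

```python
from types import SimpleNamespace


def fetch_evaluations(task_id, measure="predictive_accuracy", size=1000):
    # Third-party dependency (pip install openml), imported lazily so the
    # pure helper below can be used without it. Performs a network request.
    import openml
    # Keyword names are an assumption; older clients used different ones.
    return openml.evaluations.list_evaluations(
        function=measure, tasks=[task_id], size=size
    )


def best_evaluation(evaluations):
    """Return the evaluation with the highest value from a run-id -> evaluation dict."""
    return max(evaluations.values(), key=lambda e: e.value, default=None)


# Stand-in objects mimicking the assumed attributes (not real OpenML results).
fake = {
    101: SimpleNamespace(run_id=101, flow_name="flow_a", value=0.82),
    102: SimpleNamespace(run_id=102, flow_name="flow_b", value=0.88),
}
print(best_evaluation(fake).flow_name)  # flow_b
```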

It would be easier if the evaluation listing calls had a sort field. @janvanrijn: how hard would it be to add that to the REST API?

Best,
Joaquin



Rafael Novello

Jun 5, 2018, 11:41:39 AM
to Joaquin Vanschoren, Jan van Rijn, OpenML
Hi guys!

Thank you so much Jan and Joaquin for the great help!

I have read the material and made some tests with the Python client and I would like to confirm some assumptions:

1 - There is more than one task per dataset. I requested a list of 200 tasks and, looking at task names or source data, I got only 107 unique values. Snippet here: https://git.io/vhB4r

2 - For each task there are many flows, and each flow has its own metrics. So, to get the best result (e.g., precision) for a task, I need to get all of its flows and look for the best value.

Are my assumptions right?

Thanks a lot for the help!

Kind regards,
Rafael J. R. Novello

Skype: rafael.novello


Jan van Rijn

Jun 5, 2018, 11:50:58 AM
to Rafael Novello, Joaquin Vanschoren, OpenML
Dear Rafael,

2018-06-05 11:41 GMT-04:00 Rafael Novello <rafa.rei...@gmail.com>:
> 1 - There is more than one task per dataset. I requested a list of 200 tasks and, looking at task names or source data, I got only 107 unique values. Snippet here: https://git.io/vhB4r

Correct. You can do multiple things with a dataset, e.g., cross-validation, a holdout set, 10 times cross-validation, etc.
This is captured by the notion of tasks, which is why there are multiple tasks defined per dataset.
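
To illustrate why a task listing contains fewer unique datasets than tasks, here is a small sketch that groups task records by their source dataset. The record keys `tid` (task ID) and `did` (dataset ID) are assumed field names, and the sample rows are made up for illustration.

```python
from collections import defaultdict


def tasks_per_dataset(tasks):
    """Group task IDs by the ID of the dataset each task is defined on."""
    grouped = defaultdict(list)
    for task in tasks:
        grouped[task["did"]].append(task["tid"])
    return dict(grouped)


# Hand-written sample records (not real OpenML tasks).
sample_tasks = [
    {"tid": 1, "did": 10, "estimation_procedure": "10-fold cross-validation"},
    {"tid": 2, "did": 10, "estimation_procedure": "holdout"},
    {"tid": 3, "did": 11, "estimation_procedure": "10-fold cross-validation"},
]
grouped = tasks_per_dataset(sample_tasks)
print(len(sample_tasks), "tasks over", len(grouped), "datasets")  # 3 tasks over 2 datasets
```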

If you are looking for a consistent dataset/task combination, the easiest thing to do is to use the OpenML100, see this paper including Python code:
 
> 2 - For each task there are many flows

Correct. A flow is basically a classifier.

> and each flow has its own metrics. So, to get the best result (e.g., precision) for a task, I need to get all of its flows and look for the best value.


For each flow that is run on a dataset (this is called a run), OpenML computes a wide range of metrics online, including precision, accuracy, recall, AUROC, etc. So all these measures are available for each run.
Using the evaluation listing call you can specify one of these and then pick the highest. Using the setup objects you can even get the parameter settings that were used to obtain this performance.
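
A small sketch of the selection step, assuming each evaluation record is a dict with `run_id`, `setup_id` and a numeric `value` (an assumed layout). "Highest" fits measures such as accuracy or precision. For error measures such as the MSE mentioned in the original question, the lowest value is the best one, so the sketch switches direction. The set of error-measure names and the helper name are illustrative assumptions.

```python
# Measures for which a lower value is better (illustrative, not exhaustive).
LOWER_IS_BETTER = {"root_mean_squared_error", "mean_absolute_error"}


def best_run(records, measure):
    """Return the best record for a measure, respecting its direction."""
    pick = min if measure in LOWER_IS_BETTER else max
    return pick(records, key=lambda r: r["value"], default=None)


# Hand-written sample records (not real OpenML results).
records = [
    {"run_id": 1, "setup_id": 7, "value": 0.40},
    {"run_id": 2, "setup_id": 9, "value": 0.25},
]
print(best_run(records, "precision")["setup_id"])            # 7
print(best_run(records, "mean_absolute_error")["setup_id"])  # 9
```

The `setup_id` of the winning record is what would then be looked up to recover the parameter settings, for example with the Python client's setup functions.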

Cheers,
Jan

Rafael Novello

Jun 11, 2018, 1:30:56 PM
to Jan van Rijn, Joaquin Vanschoren, OpenML
Hi Jan!

Sorry for the delay!

Thanks a lot for the great help! Now OpenML is much clearer to me!

When I finish my tests I'll share my results with you all!

Best regards!

Kind regards,
Rafael J. R. Novello

Skype: rafael.novello
