Minutes of the meeting:
We discussed the current status: the Dashboard/API is working, OpenMPI with a simple example is working; only the minibatch-SGD example and metrics gathering for the example still need to be done.
Discussed how we should measure performance:
The clock should be stopped while calculating accuracy so as not to falsify results; evaluation should not be included in the benchmarking metric.
Maybe just save checkpoints at regular intervals during training and calculate accuracy at the end of the run.
Training can either run until a target accuracy is reached, with time taken as the metric, or run for a fixed time, with final accuracy as the metric (see the timing sketch after this point).
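As a minimal sketch of the "stop the clock during evaluation" idea, assuming a hypothetical timer class and placeholder training/checkpoint functions (none of this is decided API):

```python
import time

class BenchmarkTimer:
    """Wall-clock timer that can be paused so that checkpointing and
    accuracy evaluation are excluded from the reported benchmark time."""

    def __init__(self):
        self.elapsed = 0.0
        self._start = None

    def resume(self):
        self._start = time.monotonic()

    def pause(self):
        self.elapsed += time.monotonic() - self._start
        self._start = None

def train_one_epoch():
    time.sleep(0.1)  # placeholder for the actual training work

def save_checkpoint(epoch):
    pass  # placeholder: persist model state; accuracy is computed after the run

timer = BenchmarkTimer()
for epoch in range(3):
    timer.resume()
    train_one_epoch()       # counted toward the benchmark metric
    timer.pause()
    save_checkpoint(epoch)  # not counted; evaluated offline at the end

print(f"benchmark time: {timer.elapsed:.2f}s")
```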
Discussed data distribution:
Data should already be present on the workers when benchmarking starts (possibly already in memory).
Possibility for algorithms to load data themselves in the open division.
We need to decide on some datasets that are fixed for the closed division; no other datasets are allowed.
Preprocessing needs to be fixed in the closed division as well.
We could create 2-3 default dataloaders to cover the most common use cases; those would be prescribed for all benchmarking tasks (all data on all workers, even split of data among workers, ?). A sketch of the two strategies follows below.
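A rough sketch of the two partitioning strategies mentioned above; the function names and list-based API are assumptions, not a decided interface:

```python
def full_replication(dataset, rank, world_size):
    """Strategy 1: every worker gets the complete dataset."""
    return list(dataset)

def even_split(dataset, rank, world_size):
    """Strategy 2: disjoint, near-even shards, one per worker."""
    return [x for i, x in enumerate(dataset) if i % world_size == rank]

# Toy example: 10 samples split among 4 workers.
data = list(range(10))
for rank in range(4):
    print(rank, even_split(data, rank, 4))
```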
Decided to create a new file on GitHub detailing the open and closed divisions.
Dimensions to compare are (for now): hardware (Google Cloud vs. AWS, GPUs, etc.), scaling (1, 2, 4, 8, ... nodes), and network bandwidth.
In the beginning, only synchronous training will be implemented.
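For reference, a minimal sketch of what one synchronous minibatch-SGD step over OpenMPI could look like, using mpi4py with a dummy gradient; this is an illustration under assumed names, not our implementation:

```python
# Run with e.g.: mpirun -n 4 python sync_sgd.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

rng = np.random.default_rng(seed=rank)
w = np.zeros(3)   # model weights, identical on every worker
lr = 0.1          # learning rate

for step in range(5):
    # Each worker computes a gradient on its local minibatch (dummy here).
    local_grad = rng.normal(size=3)
    # Synchronous step: sum gradients across all workers, then average, so
    # every worker applies the identical update and stays in lockstep.
    grad_sum = np.empty_like(local_grad)
    comm.Allreduce(local_grad, grad_sum, op=MPI.SUM)
    w -= lr * grad_sum / size

if rank == 0:
    print("final weights:", w)
```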
Fair comparison is the most important criterion for our implementation.
Please let me know if I missed something.