About having more repositories for mlbench

lie.he

Aug 21, 2018, 8:22:55 AM

to mlbench

Hello everyone,

Currently, there are two repositories in the GitHub organization `mlbench`: `mlbench.github.io` and `mlbench`.

While the `mlbench.github.io` repository only hosts a blog for mlbench, `mlbench` has many responsibilities, including:

dashboard for mlbench
helm charts for setting up a cluster
reference implementations on workers

Since they are only weakly coupled (workers post metrics to the master via a RESTful service), it might be better to have a separate repository for each of them.
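To illustrate how thin that coupling is, here is a minimal sketch of a worker preparing a metric for the master. The URL, the `/metrics` path, and the payload fields are all assumptions made for illustration, not the actual mlbench API.

```python
import json
import urllib.request


def build_metric_request(master_url, worker_id, name, value):
    """Build (but do not send) a POST request carrying one metric.

    The endpoint path and payload fields are hypothetical.
    """
    payload = {"worker": worker_id, "name": name, "value": value}
    return urllib.request.Request(
        url=master_url.rstrip("/") + "/metrics",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_metric_request(
    "http://master.example:8000/api", "worker-0", "train_loss", 0.42
)
# urllib.request.urlopen(req) would send it; omitted so the sketch runs offline.
print(req.get_method(), req.full_url)
```

If the HTTP call really is the only contact point, the worker code needs nothing from the dashboard code base beyond the agreed URL and payload shape.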

Besides, we may also need a task-oriented repository to customize:

preprocessing, model, optimizer, etc
configurations
dataset
environment (docker images, etc)

This code and these configurations could be applied to mlbench by, say, mounting the directory into containers, adding the directory to the Docker image, uploading to the dashboard, or cloning from GitHub.

Any thoughts on it?

Best regards,

Lie He

Martin Jaggi

Aug 21, 2018, 10:28:15 AM

to lie.he, mlbench

thanks, these are very good comments!

in particular, it would be very nice to keep the reference algorithm implementations as separate and self-contained as possible (so that they can ideally also be run easily in a regular context, without the whole benchmark context)

let's discuss once we have the CIFAR10 running, then we can discuss the interfaces a bit better

--
You received this message because you are subscribed to the Google Groups "mlbench" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mlbench+u...@googlegroups.com.
To post to this group, send email to mlb...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/mlbench/1bb9fba9-e192-4a1c-9793-ebfaf3ac39a4%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Ralf G.

Aug 21, 2018, 10:57:36 AM

to mlbench

I think master/helm in separate repositories makes complete sense. Especially since the Helm public repository guys expect the chart to be a separate repo.

As for the worker, having separate base-images for different frameworks makes sense (Pytorch, Tensorflow).

And the dataloading/preprocessing/model code is specific to those frameworks, so it needs to be "duplicated" for each one.

But Dataloading/Preprocessing/Models should be interchangeable within one framework. I.e. Cifar10+Resnet, Cifar10+Alexnet, Imagenet+Resnet, Imagenet+Alexnet should not be 4 independent implementations, but rather 1 implementation each for Cifar10 Loading, Imagenet Loading, Resnet, and Alexnet, which can then be combined.

This could be spread across multiple repositories, but I fear that makes it a pain to maintain and to keep all versions compatible.
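The arithmetic behind this can be sketched quickly. The snippet below only counts components versus combinations; the third dataset and model are hypothetical placeholders added to show how the gap grows.

```python
from itertools import product

datasets = ["cifar10", "imagenet"]
models = ["resnet", "alexnet"]


def count(datasets, models):
    """Return (components to implement, combinations available)."""
    return len(datasets) + len(models), len(list(product(datasets, models)))


print(count(datasets, models))  # (4, 4)
# Adding one hypothetical dataset and one hypothetical model:
print(count(datasets + ["hypothetical_dataset"], models + ["hypothetical_model"]))  # (6, 9)
```

With interchangeable parts the work grows with the sum of the dimensions, while independent implementations grow with their product.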

I think it makes more sense to have all pytorch code in one repo, all tensorflow code in another, with the Dataloading and Models neatly separated and interchangeable.

We could do individual Dockerfiles for composing different Dataloading and Models (e.g. 4 separate Dockerfiles for the example above), but I fear the number of Dockerfiles would quickly explode (especially if there are additional dimensions to consider, like the optimizer etc.).

Instead I'd rather have some entrypoint script that instantiates the right implementations according to supplied parameters, with all pytorch code in a single repo.
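As a rough sketch of such an entrypoint: the registries, flag names, and placeholder factories below are hypothetical and only stand in for real PyTorch loaders and models.

```python
import argparse

# Hypothetical registries; real entries would construct data loaders and models.
DATASETS = {
    "cifar10": lambda: "Cifar10 loader",
    "imagenet": lambda: "Imagenet loader",
}
MODELS = {
    "resnet": lambda: "Resnet model",
    "alexnet": lambda: "Alexnet model",
}


def parse_args(argv=None):
    """Parse the parameters that select which implementations to use."""
    parser = argparse.ArgumentParser(description="Reference implementation entrypoint (sketch)")
    parser.add_argument("--dataset", choices=sorted(DATASETS), required=True)
    parser.add_argument("--model", choices=sorted(MODELS), required=True)
    return parser.parse_args(argv)


def build(argv=None):
    """Instantiate the dataset loader and model named in the arguments."""
    args = parse_args(argv)
    return DATASETS[args.dataset](), MODELS[args.model]()


# Arguments are passed explicitly here; a container would pass them on its command line.
loader, model = build(["--dataset", "cifar10", "--model", "resnet"])
print(loader, "+", model)
```

A single image could then cover every combination, with the choice made by the arguments given to the container rather than by a dedicated Dockerfile.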

What we could still do is separate models/dataloading etc. by Topic, e.g. Computer Vision, NLP, and so on, as there isn't much overlap between different Dataloaders and Models in that area. So those could be different repositories.

I'd leave the pytorch base image in a completely separate repo, containing only the pytorch installation and possibly some reporting code. This allows users to extend it without any of our implementations.

I'd also add a dummy implementation for reporting that just logs to stdout, so it can be tested without the whole cluster.
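A minimal sketch of what such a dummy reporter might look like, assuming a hypothetical `report(name, value, step)` interface that a real cluster reporter would share:

```python
import json
import sys


class StdoutReporter:
    """Stand-in for a reporter that would post metrics to the master.

    The class name and method signature are hypothetical.
    """

    def __init__(self, stream=None):
        # Default to stdout; a different stream can be injected for testing.
        self.stream = stream if stream is not None else sys.stdout

    def report(self, name, value, step):
        """Write one metric as a JSON line and return that line."""
        line = json.dumps({"metric": name, "value": value, "step": step})
        self.stream.write(line + "\n")
        return line


reporter = StdoutReporter()
reporter.report("train_loss", 0.42, step=1)
```

Training code written against this interface could run on a laptop, with the cluster reporter swapped in only when the benchmark is deployed.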

So it could look like:

Master Repo
Helm Repo
Pytorch Base Repo
Pytorch Image Recognition Ref Impls Repo
Pytorch Language Generation Ref Impls Repo
Tensorflow Base Repo
Tensorflow Image Recognition Ref Impls Repo
Tensorflow Language Generation Ref Impls Repo
etc.

