Hello,
I am the lead dev for GoldenCheetah, a desktop application used by many thousands of cyclists and triathletes.
To support ML and more general research, I am adding features to the desktop app to allow users to post their workout data for public use. I plan to publish this data quarterly.
It is likely to be a high-volume dataset: each athlete is likely to post 500-5000 workouts, each with around 3600-50000 rows of data.
So with a very conservative estimate of 100 athletes' data over the course of a year, we are talking about 100 x 500 x 3600 = 180 million rows of data, probably stored as one file per workout, so 50,000 files.
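For reference, here is a rough sketch of that estimate in Python. The function name is just for illustration, and it assumes one file per workout and takes the 100 athletes as a guess rather than a measured figure. The second call uses the upper ends of the ranges above to show how far the volume could grow.

```python
def estimate_volume(athletes, workouts_per_athlete, rows_per_workout):
    """Return (files, rows), assuming each workout is stored as one file."""
    files = athletes * workouts_per_athlete
    rows = files * rows_per_workout
    return files, rows


# Lower bound: 100 athletes, 500 workouts each, 3600 rows per workout.
low_files, low_rows = estimate_volume(100, 500, 3600)
print(f"low:  {low_files:,} files, {low_rows:,} rows")

# Upper bound for the same 100 athletes: 5000 workouts, 50000 rows each.
high_files, high_rows = estimate_volume(100, 5000, 50000)
print(f"high: {high_files:,} files, {high_rows:,} rows")
```

So the same 100 athletes could produce anywhere from 180 million to 25 billion rows, depending on how much each one posts.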
Is OpenML the right place to post this?
If so, do you have any advice on how to manage this, bearing in mind that athletes will be contributing data over time?
Regards,
Mark