Storage on the Public Instance

Shihao Shen

Feb 26, 2022, 11:51:51 PM
to codalab-competitions
Hello,

I have two questions about storage on the public CodaLab instance.
 
1. Since we cannot release our test set to the participants (it could easily be hand-labeled and used to cheat), we plan to have participants submit their training script along with their trained models. The script is only there so their models can be reproduced; it will not be run during ingestion or scoring. Instead, we want to load the submitted models directly and evaluate them to produce predictions. However, because we are running a continual learning challenge on a dataset with natural distribution shift spanning a decade, every submission would include 10 PyTorch models (roughly as sketched below). We are therefore concerned about storage on the CodaLab instance. I understand we can add extra workers to the queue to provide compute for ingestion/scoring, but how can we provide extra storage if the public CodaLab instance cannot hold such large submissions? We would also prefer to avoid running our own instance or deploying CodaLab from scratch.
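
For context, the per-submission workload on the scoring side would look roughly like the sketch below. This is only an illustration of what we have in mind: the model class, file names, year range, and data loaders are placeholders, not part of our actual bundle.

# Rough sketch of our intended scoring step per submission.
# ParticipantModel, the checkpoint file names, and test_loaders are
# placeholders, not the real competition code.
import torch
from torch import nn

class ParticipantModel(nn.Module):
    """Stand-in for the architecture participants declare in their script."""
    def __init__(self):
        super().__init__()
        self.net = nn.Linear(128, 10)

    def forward(self, x):
        return self.net(x)

def score_submission(submission_dir, test_loaders, device="cpu"):
    """Load one checkpoint per year (10 total) and produce predictions."""
    predictions = {}
    for year in range(2010, 2020):  # a decade of distribution shift
        model = ParticipantModel().to(device)
        state = torch.load(f"{submission_dir}/model_{year}.pt", map_location=device)
        model.load_state_dict(state)
        model.eval()
        with torch.no_grad():
            predictions[year] = [model(x).argmax(dim=1) for x, _ in test_loaders[year]]
    return predictions
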

2. Additionally, I tried to upload our public dataset to CodaLab but failed because it was too large. What is the maximum dataset size allowed on the public CodaLab instance? We also need to upload our private test data (smaller than the public dataset, but still not small) in order to evaluate code submissions on the instance. Is there a workaround for this? One option we have considered is sketched below.
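
The workaround we have been considering, assuming the scoring workers have outbound network access (which we have not verified), is to keep the private test set on storage we control and have the scoring program download it at run time, roughly as follows. The URL, checksum, and paths are placeholders. Would something like this be acceptable on the public instance?

# Hypothetical workaround: fetch the private test set from storage we control
# instead of uploading it as a CodaLab dataset. URL, checksum, and paths are
# placeholders; this assumes the scoring workers can reach external hosts.
import hashlib
import urllib.request
from pathlib import Path

TEST_DATA_URL = "https://example.org/our-challenge/private_test.tar.gz"  # placeholder
EXPECTED_SHA256 = "0" * 64  # placeholder checksum

def fetch_private_test_set(cache_dir="/tmp/private_test"):
    """Download the held-out test archive once and verify its checksum."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    archive = cache_dir / "private_test.tar.gz"
    if not archive.exists():
        urllib.request.urlretrieve(TEST_DATA_URL, archive)
    digest = hashlib.sha256(archive.read_bytes()).hexdigest()
    if digest != EXPECTED_SHA256:
        raise RuntimeError("Private test archive failed checksum verification")
    return archive
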

Thank you in advance.

Best,
Shihao