Hi,
This looks cool. I find that most ML work is highly structured, though, so it might be helpful to have some predefined block combinations, or a start block that loads the data, performs the train/test split, etc.
I copy-pasted my replication of your code into codelab and it errored out. The issue seems to be randomForest_ML being an int: for some reason the generated code inserts randomForest_ML=0 after the training.
That would be another suggestion: try to make a "train model" block that creates a variable for the trained model, and a "test model" function that generates your output.
Basically, avoid letting people use variables as much as possible. In our tool we only use variables as an alternative to magic numbers, which I think is more intuitive for users.
That's a lot of nitpicks, but in general I really like what you're doing! I think there is a big use case for this.
Deep learning would be super cool. I would recommend building some form of function system for defining the DL model. (I think you could "technically" build a multilayer perceptron using the basic Blockly features. My secret dream is that someone actually tries that one day.) So again, nice helper blocks will be key here!
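For what it's worth, here is a minimal sketch of why I think the perceptron is "technically" possible: a forward pass needs nothing beyond lists, loops, and arithmetic. It is written in plain Python rather than blocks, the function names are hypothetical, and the weights are hand-picked (not trained) so that the tiny network computes XOR.

```python
# Hypothetical sketch: a two-layer perceptron forward pass using only
# lists, loops, and arithmetic. The weights are hand-picked, not learned.

def relu(value):
    return value if value > 0 else 0.0


def identity(value):
    return value


def dense_layer(inputs, weights, biases, activation):
    """Computes activation(w . x + b) for each neuron in the layer."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        total = bias
        for x, w in zip(inputs, neuron_weights):
            total += x * w
        outputs.append(activation(total))
    return outputs


def mlp_forward(inputs):
    # Hidden layer: two ReLU neurons; output layer: one linear neuron.
    hidden = dense_layer(inputs, [[1.0, 1.0], [1.0, 1.0]], [0.0, -1.0], relu)
    return dense_layer(hidden, [[1.0, -2.0]], [0.0], identity)[0]


xor_outputs = [
    mlp_forward([a, b])
    for a, b in [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
]
```

A helper block along the lines of dense_layer is what I mean by a function system: the user stacks layers instead of wiring up the loops themselves.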
Best,
Jonty