Dear TadPole organizers,
I am interested in studying the performance of different machine learning methods
on the diagnosis problem. I have downloaded the code from TadpoleShare and have
tried to reproduce the metrics reported in
for the Last Visit and SVM benchmark methods.
For the D2 on D4 experiment with Last Visit, I am getting
mAUC 0.741
BCA 0.760
while the table states that the metrics should be
mAUC 0.774
BCA 0.792.
For the D2 on D4 experiment with the SVM benchmark, I am getting
mAUC 0.796
BCA 0.767
while the table states that the metrics should be
mAUC 0.836
BCA 0.764.
For the SVM benchmark, the mAUC difference (0.796 versus 0.836) seems large to me.
Do you have any idea what could be responsible for these differences?
Maybe the Tadpole dataset is dynamic and we are dealing with different
patients than when the table was generated?
Maybe the implementation of the algorithms has evolved and is now different?
Have you experienced similar changes in the metrics?
Best regards.