StreamingGradientBoostedTrees returns identical results regardless of seed value

Moura

Apr 12, 2024, 7:48:08 PM

to MOA users

Hey there,

I conducted some experiments using the latest version of MOA (compiled from the GitHub repository) to assess the performance of StreamingGradientBoostedTrees on the airlines.arff dataset.

Interestingly, although I used a different random seed for each run, I obtained identical Kappa and Accuracy values every time. The seeds were 5, 9, 13, 17, 19, 23, 29, 31, 37, and 121, the same ones used in the paper "Gradient boosted trees for evolving data streams".

Could you please review the command line below and confirm whether it is correct?

seed=121
java -javaagent:sizeofag-1.1.0.jar -cp moa.jar moa.DoTask EvaluateInterleavedTestThenTrain \
-l "(meta.StreamingGradientBoostedTrees -l (trees.FIMTDD \
-s VarianceReductionSplitCriterion \
-g 25 \
-c 0.05 \
-e \
-p ))" \
-s "(ArffFileStream -f airlines.arff)" \
-i 1000000 \
-f 1000000 \
-r $seed \
-d sgbt_airlines_$seed.csv

I also ran the same experiment with StreamingRandomPatches, using the command line below. Those results were consistent with the ones reported in the paper.

seed=121
java -javaagent:sizeofag-1.1.0.jar -cp moa.jar moa.DoTask EvaluateInterleavedTestThenTrain \
-l "(meta.StreamingRandomPatches \
-l (trees.HoeffdingTree \
-g 50 \
-c 0.01) \
-s 100 \
-o (Percentage (M * (m / 100))) \
-m 60 \
-a 6)" \
-s "(ArffFileStream -f airlines.arff)" \
-i 1000000 \
-f 1000000 \
-r $seed \
-d srp_airlines_$seed.csv

Thank you for your attention.

PS:

#classifications correct (percent)

#Results for seeds: 5, 9, 13, 17, 19, 23, 29, 31, 37, and 121
#SRP results
[68.51402435746027,
68.555553289592,
68.47657415973435,
68.58021109304521,
68.52700214875145,
68.59634063364993,
68.59800920681593,
68.45377032646562,
68.54368788041151,
68.53738438178438]

#SGBT results
[67.95338377368215,
67.95338377368215,
67.95338377368215,
67.95338377368215,
67.95338377368215,
67.95338377368215,
67.95338377368215,
67.95338377368215,
67.95338377368215,
67.95338377368215]
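
A quick way to check whether the seed has any effect is to count the distinct final accuracy values across the dump files. The sketch below assumes each -d file is a comma-separated file with a header row containing a column named exactly "classifications correct (percent)", with no quoting and Unix line endings; the function names are hypothetical helpers, not part of MOA.

```shell
# Print the accuracy value from the last data row of one dump file.
# Assumption: the first line is a header and the column name matches exactly.
final_accuracy() {
  awk -F',' -v col='classifications correct (percent)' '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == col) c = i; next }
    c       { v = $c }
    END     { if (c) print v }
  ' "$1"
}

# Count how many distinct final accuracy values appear across the given files.
distinct_accuracies() {
  for f in "$@"; do
    final_accuracy "$f"
  done | sort -u | wc -l | tr -d ' '
}
```

Running `distinct_accuracies sgbt_airlines_*.csv` over the ten SGBT files would print 1 if the seed is being ignored, and a larger number once the seed reaches the learner.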

Nuwan Gunasekara

Apr 14, 2024, 8:15:52 PM

to MOA users

Hi Moura,

This does look like a bug in the StreamingGradientBoostedTrees implementation: it does not pick up the random seed set by the evaluator.

To get around this, please use the random seed option (-r) of StreamingGradientBoostedTrees itself:

EvaluateInterleavedTestThenTrain -l (meta.StreamingGradientBoostedTrees -r 121) -s (ArffFileStream -f airlines.arff) -i 1000000 -f 1000000
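
Applied to the original SGBT command, the workaround might look like the sketch below, which loops over the ten seeds from the first message. It assumes that -r can be combined with the -l (trees.FIMTDD ...) base learner options inside the StreamingGradientBoostedTrees option string; sgbt_task, run, and DRY_RUN are hypothetical helper names, not part of MOA. By default the script only prints the commands.

```shell
#!/usr/bin/env bash
# Build the MOA task string with the seed passed to the learner itself
# (-r inside meta.StreamingGradientBoostedTrees) instead of the evaluator.
sgbt_task() {
  local seed="$1"
  printf '%s' "EvaluateInterleavedTestThenTrain \
-l (meta.StreamingGradientBoostedTrees -r ${seed} \
-l (trees.FIMTDD -s VarianceReductionSplitCriterion -g 25 -c 0.05 -e -p)) \
-s (ArffFileStream -f airlines.arff) \
-i 1000000 -f 1000000 \
-d sgbt_airlines_${seed}.csv"
}

# Print the command when DRY_RUN=1 (the default); execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

for seed in 5 9 13 17 19 23 29 31 37 121; do
  run java -javaagent:sizeofag-1.1.0.jar -cp moa.jar moa.DoTask "$(sgbt_task "$seed")"
done
```

Setting DRY_RUN=0 launches the runs for real. The whole task is passed to moa.DoTask as a single quoted argument so the shell does not interpret the parentheses.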

Please feel free to raise a bug report for the original issue so that I can look into fixing it.

Thanks and regards,

Nuwan

Nuwan Gunasekara

Apr 15, 2024, 8:03:16 AM

to MOA users

A fix is available in PR https://github.com/Waikato/moa/pull/297

Nuwan Gunasekara

Apr 15, 2024, 7:10:10 PM

to MOA users

The changes are in the main branch now.

Moura

Apr 15, 2024, 7:31:32 PM

to MOA users

Thank you very much, Nuwan!
