I just ran XGBoost on my Linux machine and I can see a difference in
CPU usage and speed between using 1 thread and more than 1. However,
on the dataset that I tested, the speedup in building/evaluating the
models does not scale linearly with the number of threads.
The following numbers are taken from a single run of 10-fold
cross-validation in the Weka Investigator:
- 1 thread: 6.7s
- 2 threads: 4.9s
- 8 threads: 4.2s
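To put those timings into perspective, they can be converted into
speedup and parallel-efficiency figures. A minimal Java sketch follows;
the class and method names are made up for illustration and are not
part of ADAMS, Weka or XGBoost:

```java
import java.util.Locale;

// Illustrative helper only: derives speedup and parallel efficiency
// from the wall-clock timings quoted above.
class ThreadScaling {

    // Speedup relative to the single-threaded run.
    public static double speedup(double singleThreadSecs, double multiThreadSecs) {
        return singleThreadSecs / multiThreadSecs;
    }

    // Fraction of the ideal (linear) speedup that was actually achieved.
    public static double efficiency(double singleThreadSecs, double multiThreadSecs, int threads) {
        return speedup(singleThreadSecs, multiThreadSecs) / threads;
    }

    public static void main(String[] args) {
        int[] threads = {1, 2, 8};
        double[] seconds = {6.7, 4.9, 4.2};  // timings from the single run above
        for (int i = 0; i < threads.length; i++) {
            System.out.printf(Locale.ROOT,
                "%d thread(s): speedup %.2fx, efficiency %.0f%%%n",
                threads[i],
                speedup(seconds[0], seconds[i]),
                100.0 * efficiency(seconds[0], seconds[i], threads[i]));
        }
    }
}
```

That works out to a speedup of roughly 1.37x with 2 threads and 1.60x
with 8 threads, i.e. a parallel efficiency of about 68% and 20%
respectively. Since these come from a single run, they should be
treated as rough indications only.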
The dataset that I used had only numeric attributes:
# Attributes: 1193
# Instances: 10000
The XGBoost classifier in ADAMS is just a wrapper around the following
library:
https://github.com/dmlc/xgboost
Specifically, it uses version 2.1.0 of xgboost4j_2.12.
Documentation for the parameters is available here:
https://xgboost.readthedocs.io/en/release_2.1.0/parameter.html
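As far as I know, xgboost4j takes these parameters as a map of
name/value pairs (the params argument of XGBoost.train). A minimal
Java sketch of setting the documented "nthread" parameter from a
thread-count option is shown below. The class and method names are
hypothetical, and this is not the actual ADAMS source:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, NOT the actual ADAMS wrapper code: shows how a
// thread-count option could end up in an XGBoost parameter map.
class ThreadParam {

    // Builds a parameter map containing only the "nthread" setting.
    public static Map<String, Object> withThreads(int numThreads) {
        Map<String, Object> params = new HashMap<>();
        params.put("nthread", numThreads);  // parameter name as listed in the XGBoost docs
        return params;
    }

    public static void main(String[] args) {
        // A map like this would then be handed to xgboost4j for training.
        System.out.println(withThreads(8));
    }
}
```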
The "numThreads" option in the ADAMS wrapper gets translated to the
"nthread" parameter for the underlying xgboost algorithm.
Cheers, Peter