I may be doing something wrong, but to start with, I ran a speed test.
First, locally on a single machine:
> time mpiexec -n 16 ./plda/mpi_lda --num_topics 50 --alpha 0.1 --beta 0.01 --training_data_file plda/testdata/test_data.txt --model_file /tmp/lda_model.txt --total_iterations 50
.....
......
Iteration 49 ...
real 0m35.081s
user 4m17.280s
sys 0m20.813s
Then, with one other machine added, specified in a machinefile:
> time mpiexec -f machinefile -n 16 ./plda/mpi_lda --num_topics 50 --alpha 0.1 --beta 0.01 --training_data_file plda/testdata/test_data.txt --model_file /tmp/lda_model.txt --total_iterations 50
.....
.....
Iteration 49 ...
real 1m5.667s
user 8m1.574s
sys 0m34.070s
So the run took nearly twice as long on two machines (about 66 s
of wall-clock time versus 35 s), even though processes were correctly
running on the other machine. Am I being stupid? I thought I should
see a speedup.
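To put numbers on the comparison, here is a small Python sketch that converts the two `time` outputs above to seconds and prints the two-machine/one-machine ratio for each field. The helper name `to_seconds` and the dictionary layout are just my own choices for illustration; the timings are copied from the runs above.

```python
import re

def to_seconds(t: str) -> float:
    """Convert a shell `time` value such as '1m5.667s' to seconds."""
    m = re.fullmatch(r"(\d+)m([\d.]+)s", t)
    if m is None:
        raise ValueError(f"unrecognized time format: {t!r}")
    return int(m.group(1)) * 60 + float(m.group(2))

# Timings copied from the two runs above.
one_machine = {"real": "0m35.081s", "user": "4m17.280s", "sys": "0m20.813s"}
two_machines = {"real": "1m5.667s", "user": "8m1.574s", "sys": "0m34.070s"}

# Ratio > 1 means the two-machine run was slower for that field.
ratios = {
    key: to_seconds(two_machines[key]) / to_seconds(one_machine[key])
    for key in one_machine
}

for key, ratio in ratios.items():
    print(f"{key}: {ratio:.2f}x")
```

Both the wall-clock (`real`) and CPU (`user`) ratios come out at roughly 1.87, so the total CPU time spent grew by about the same factor as the elapsed time.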
-m