Hi Justin, thank you for the
clarification. I don't think one-node evaluation should be
dismissed. It is actually something that should
always be accessible, even as you add features or contingencies.
I don't think you want to recreate the tournament conditions,
right? This is more about analysis. The one node is basically
the valuation part of the RL training, if I understood correctly.
Now suppose the common stem (before the heads) were sufficiently
expressive to transform the input space in the best way
possible, so that the WDL (and other classification) associations
would all be grouped into connected sets with very smooth
probability contour lines. (I am still trying various angles of
explanation for the loss function optimization that happens there,
and for the idea of a CNN input space transformation. Some people
call the set of weights after training an embedding of the
representation that best allows for the classification task, when
those weights come from a classification problem, for example.)
Sorry, I got lost in parentheses. I meant to say that if that
mostly transformative stem were as expressive as the chess space
under the current legal rules allows, then we might not
need more than one node: the WDL valuation would be enough to
discern the best move purely from the position, without
needing to clarify the picture by playing it out across many
nodes.
That last part may be wrong, since I still don't sufficiently
understand what the policy head training really does, besides
steering self-play so that it uses the value head to keep playing
better. Once we are happy with the tail of self-play games (self-play
convergence, or loss functions plateauing to satisfaction), why do
we need many nodes to clarify the one-node valuation of the
candidate positions for the next move?
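To make the question concrete, here is a minimal Python sketch of what choosing a move from valuations alone, with no tree, could look like. It is only an illustration of the idea as I describe it, not Leela's actual code: the names `expected_score`, `pick_move_without_search`, `apply_move` and `evaluate_wdl` are all hypothetical, and the WDL numbers in the toy example are made up.

```python
from typing import Callable, Hashable, Sequence, Tuple

# (win, draw, loss) probabilities from the point of view of the side to move.
WDL = Tuple[float, float, float]


def expected_score(wdl: WDL) -> float:
    """Collapse WDL into one number: win counts 1, draw 0.5, loss 0."""
    win, draw, _loss = wdl
    return win + 0.5 * draw


def pick_move_without_search(
    position: Hashable,
    legal_moves: Sequence[Hashable],
    apply_move: Callable[[Hashable, Hashable], Hashable],
    evaluate_wdl: Callable[[Hashable], WDL],
) -> Hashable:
    """Evaluate each candidate position once and pick the best move.

    Assumption: evaluate_wdl stands in for the value head and reports
    WDL for whoever is to move in the position it is given. After our
    move that is the opponent, so our score is 1 minus theirs.
    """

    def score_for_mover(move: Hashable) -> float:
        child = apply_move(position, move)
        return 1.0 - expected_score(evaluate_wdl(child))

    return max(legal_moves, key=score_for_mover)


# Toy example with invented numbers: two candidate moves, "a" and "b".
toy_wdl = {
    "after_a": (0.2, 0.5, 0.3),  # opponent scores 0.45 after move a
    "after_b": (0.6, 0.3, 0.1),  # opponent scores 0.75 after move b
}
best = pick_move_without_search(
    "root",
    ["a", "b"],
    lambda pos, move: f"after_{move}",
    toy_wdl.__getitem__,
)
print(best)  # a
```

If the valuation were perfectly discerning, this single pass over the candidates would be all that is needed. Adding nodes only matters to the extent that the valuation is not.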
So, if you were to keep adding nodes, but still display the
evaluation with fewer nodes (and with different sizes of the CNN
common stem), we could learn from such a table about various aspects
of Leela training and performance. That could teach us how to use
the engines, or engine instances, corresponding to those
parameters in particular situations.
I was also asking because I did not know whether it was a single
node. Is there somewhere we can read about the parameters behind the
displayed results?
I have not looked at the site recently. But I will.
D.