Completely agree - everything I've learned about evolving neural networks has buttressed the idea that "more varied landscape = faster, deeper, and more flexible learning".
By [crude] analogy: if you were going to train an autonomous truck, with the sole requirement that it successfully transport goods between two fixed points over the same route, would it be better to train it exclusively on that route, or to train it on a wide variety of roads? With the "restricted training" approach, the truck is going to struggle mightily to learn simple fundamentals. For instance, if there's only one stoplight on the route, imagine how challenging it will be for the truck to pick up on the meaning of red/yellow/green. There are also going to be false-positive artifacts: perhaps on the route in question, 9 of the 10 bridges it goes under are followed by a curve in the road... and the truck gains a deeply ingrained association which leads to erratic slowdowns whenever it goes under the 10th bridge. Even if tested only on the main route, I have no doubt the varied-routes-trained truck would greatly outperform the main-route-trained truck in virtually every way.
I suspect that the same kinds of things are hindering LC0's learning by restricting its training to standard chess openings. There are probably simple fundamentals that it struggles to pick up on, simply because the occasions to learn them are less common (and the contexts in which they occur more homogeneous) in standard chess. Perhaps the current weakness of LC0's tactical play stems from such limitations. There are probably also false/misleading patterns that LC0 picked up early and must now work extensively to unlearn (and may never be "completely/cleanly unlearned", from a NN perspective). It seems to me that the only way LC0 could improve in this landscape is by gaining a deep, non-deviating sense of the one goal: trap and kill the opponent's king. Any strategy that works by "luck" (e.g. trap-sacrifices that give an advantage merely as an artifact of a particular starting position) would be utterly useless in the chess960 landscape.
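For concreteness, the chess960 shuffle itself is easy to state: the two bishops must land on opposite-colored squares, and the king must sit somewhere between the two rooks (everything else is unconstrained), which yields the 960 distinct starting ranks. A minimal sketch of such a generator (function name is my own):

```python
import random

def chess960_back_rank(rng=None):
    """Generate one legal Chess960 back rank: bishops on opposite
    colors, king somewhere between the two rooks."""
    rng = rng or random.Random()
    rank = [None] * 8
    # Bishops: one light square (odd index), one dark square (even index)
    rank[rng.choice([0, 2, 4, 6])] = 'B'
    rank[rng.choice([1, 3, 5, 7])] = 'B'
    free = [s for s in range(8) if rank[s] is None]
    # Queen and both knights go on any remaining squares
    for piece in 'QNN':
        sq = rng.choice(free)
        rank[sq] = piece
        free.remove(sq)
    # The last three empty squares, left to right, are rook, king, rook,
    # which automatically places the king between the rooks
    for sq, piece in zip(sorted(free), 'RKR'):
        rank[sq] = piece
    return ''.join(rank)
```

Restricting the shuffle this way is what keeps castling rules well-defined in chess960, which is relevant to the "retain certain fundamentals" point below.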
I kind of wonder if it might be worth taking this idea a little further... If it's plausible that the chess960 shuffle could improve training, might it help to also throw in other random variations? Such as:
* Continue to use the 6 "standard pieces", but add a handful of other "make believe" pieces. Standard knights move in a 1x2 pattern: let's add a couple other knight-like pieces which move in 2x2 and 1x3 patterns. Perhaps add a weakened rook-like piece restricted to only move 2 squares at a time, or a super-powerful queen-like piece (that can move like normal queens but also move like knights).
* Different board dimensions... in addition to the 8x8 grid, randomly start games with randomly-selected alternatives: 7x8, 7x9, 9x9, and 10x6 grids.
* Other minor deviations in rules?
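The "make believe" pieces above could all be parameterized the same way, as sets of move offsets, which is roughly how a variant-training harness might define them. A minimal sketch (all names are my own, for illustration):

```python
def leaper_offsets(a, b):
    """All distinct single-jump (dx, dy) offsets for an (a, b)-leaper."""
    return {(sx * x, sy * y)
            for x, y in ((a, b), (b, a))
            for sx in (1, -1) for sy in (1, -1)}

KNIGHT   = leaper_offsets(1, 2)   # standard 1x2 knight: 8 jumps
LEAPER22 = leaper_offsets(2, 2)   # 2x2 variant: 4 diagonal jumps
LEAPER13 = leaper_offsets(1, 3)   # 1x3 variant: 8 longer jumps

# Weakened rook: slides along rank/file, but at most 2 squares
SHORT_ROOK = {(0, 1), (0, 2), (0, -1), (0, -2),
              (1, 0), (2, 0), (-1, 0), (-2, 0)}

# Super-queen: the 8 queen sliding directions plus knight jumps
QUEEN_DIRS  = {(dx, dy) for dx in (-1, 0, 1)
               for dy in (-1, 0, 1)} - {(0, 0)}
SUPER_QUEEN = QUEEN_DIRS | KNIGHT  # directions to slide, plus jumps
```

A harness could then sample a handful of these definitions per training game, the same way chess960 samples a back rank.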
I would expect these variations to really help LC0 determine a piece's value not only in terms of what it can do, but also in terms of its surrounding context.
Don't get me wrong: there's obviously a line to be drawn between "more randomness" and utter chaos. If the random variations deviate too far from the standard game, I suspect some deeper strategies of standard chess may prove too elusive for the generalized training to learn. For instance, if the king can start off-center, will the random-variation LC0 be able to fully appreciate the "deep pattern" value of controlling the center early on in standard chess? So I definitely feel it'd be best to retain certain fundamentals: things like "king starts near the center of the back row"; castling rules; major/minor pieces on the back row behind a row of pawns; and so on.
Anyway - if this line of thinking gains traction, I think it'd make more sense to research this "more random" approach than to just test chess960 (following on from the name "chess960", perhaps it could be called "Chess1million").