Yes. Well spotted. I quickly confirmed it here on my setup too.
I've been waiting for a meta-discussion on the history planes ever since this project began, back in AZ time last year. But, unfortunately, nothing doing.
I did post way back, on several different forums, that I thought history was absolutely key to this zero approach, and I tried tackling it through the "time" angle, as follows. Board position alone gives us two dimensions for evaluation (and policy) to play with: basically mass (the pieces) and distance (their xy coordinates). If we draw a parallel with physics, there's not a lot of science we can do with just "mass" and "distance", but once we introduce time we can start playing with many concepts (velocity, trajectory, acceleration, energy, momentum and so on, along with many of our fundamental equations). So I think, at meta-level, this is what history is doing for us: it introduces the time dimension and allows the neural net to play (how, we have no idea of course) with more concepts (of which we also have no idea). In chess, at high level, we often hear phrases like the energy of a position, momentum and so on, so I think there is something in this connection of pieces-position-history (mass-distance-time).
Okay, so what you seem to have picked up here, with your experiment, is some kind of "trajectory" concept. There's a change, over time, in the power exerted on particular squares, and you're suggesting (good suggestion imo) that the Policy is picking up this trajectory and getting interested in it. Extending the idea, that would mean that when the search is, say, 10 moves deep, the Policy (and Eval) are picking up a trajectory theme (if the tree sequence is good, so to speak), which is almost a "plan of action": the net can feel a kind of heat-map flow from Pos(0) to Pos(10). I guess a smart net would be able to develop a concept of "progress" or "doing something sensible" over the entire history move sequence. Likewise an "I am going nowhere" concept, and a "my position is falling apart" concept.
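To make the "trajectory" idea a bit more concrete, here is a toy sketch in Python. Everything in it is my own assumption for illustration: the per-square "pressure" numbers, the 8x8 list layout and the function names are all hypothetical, and this is not how any real net encodes or processes its history planes. It only shows that a stack of snapshots lets you compute velocity-like and acceleration-like quantities per square, which a single position cannot give you.

```python
# Hypothetical illustration only: these names and the "pressure" encoding
# are assumptions, not taken from any engine or network implementation.
from typing import List

Plane = List[List[int]]  # 8x8 grid, one number per square


def empty_plane() -> Plane:
    return [[0] * 8 for _ in range(8)]


def diff(later: Plane, earlier: Plane) -> Plane:
    """Per-square change between two snapshots."""
    return [[later[r][c] - earlier[r][c] for c in range(8)] for r in range(8)]


def velocities(history: List[Plane]) -> List[Plane]:
    """First differences along the time axis (oldest snapshot first)."""
    return [diff(history[t + 1], history[t]) for t in range(len(history) - 1)]


def accelerations(history: List[Plane]) -> List[Plane]:
    """Second differences: how the rate of change is itself changing."""
    return velocities(velocities(history))


# Toy data: pressure on one square (row 3, col 4) grows 0 -> 1 -> 3
# over three successive positions.
history = []
for value in (0, 1, 3):
    plane = empty_plane()
    plane[3][4] = value
    history.append(plane)

vel = velocities(history)
acc = accelerations(history)
print([v[3][4] for v in vel])  # [1, 2]
print([a[3][4] for a in acc])  # [1]
```

With only the last snapshot you would see "pressure 3" on that square and nothing else; with three snapshots you can also see that the pressure is rising and that the rise is speeding up. Whether a trained net actually forms anything like these differences internally is pure speculation on my part.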
Sorry if I am leaping forward in giant steps here, but it tends to be the way my neurology works. Also, I've been thinking about this for a long time without any sparring partner on the same wavelength!
Leaping ahead even more: if the nets have time-history trajectory concepts of "going places, not going places, or staying the same", they can also begin to develop ideas about fortress positions. In fact it was the fortress positions in the AZ Stockfish games that launched me down this thinking path in the first place. How did AZ understand fortresses, when not understanding fortresses is a big, fatal flaw in hand-crafted evals? That was the question. I'm sure "history", and thus the time dimension, is the answer. Which in turn would mean that, if time history is so important, we can think about presenting the inputs in different ways to try to help the net exploit this stuff faster.
I'd be really interested in your thoughts, with apologies for jumping off into the far distance ....