That's to be expected, because a site with completely undetermined positions has the same tip probability vector at every tip (that's the main difference between parsimony trait mapping and ML). Under ML, the tip probability vector p for N is p(1,1,1,1); the substitution models we typically use, and optimise on the fly within the pre-defined model space, include information about the overall frequency of As, Cs, Gs and Ts in the input data and their probability of being substituted by each other, as modelled over the tree and taking the input branch lengths into account. In parsimony, if the tips are N, all interior nodes (the hypothetical MRCAs) will be N, the same state as all tips, irrespective of any root-tip distance along the tree. Under ML, the probability of maintaining an N may undergo subtle changes because we optimise the model parameters, and it may differ from clade to clade.
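To make the parsimony half of that contrast concrete, here is a minimal sketch of the bottom-up Fitch pass on a made-up four-taxon tree. The tree, the tip names and the `fitch` helper are all hypothetical and only there for illustration; the point is that branch lengths never enter the calculation, so all-N tips give N at every interior node.

```python
# Fitch parsimony (bottom-up pass) on a toy tree, using state sets.
# Only the codes needed for this illustration are listed.
STATE_SETS = {
    "A": {"A"}, "C": {"C"}, "G": {"G"}, "T": {"T"},
    "N": {"A", "C", "G", "T"},
}

def fitch(node, tip_states):
    """Return the Fitch state set for `node`.

    A node is either a tip name (str) or a (left, right) tuple.
    Branch lengths are not part of the data structure at all.
    """
    if isinstance(node, str):
        return set(STATE_SETS[tip_states[node]])
    left = fitch(node[0], tip_states)
    right = fitch(node[1], tip_states)
    common = left & right
    # Intersection if the children agree on something, union otherwise.
    return common if common else left | right

# Hypothetical tree ((t1,t2),(t3,t4)) and a site that is N in every tip.
tree = (("t1", "t2"), ("t3", "t4"))
all_n = {"t1": "N", "t2": "N", "t3": "N", "t4": "N"}

root_set = fitch(tree, all_n)          # full set, i.e. N
inner_set = fitch(tree[0], all_n)      # also the full set
```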
If I remember correctly, the ML character mapping implemented in IQ-TREE and RAxML-NG applies asymmetric substitution models, i.e. we may model a higher probability that e.g. Cs change into As, Gs and Ts than vice versa. Within the framework of the substitution model, the probability that the ancestral state is C is then naturally smaller than the probability that it is A, G or T: p(C) << 0.25, while p(A), p(G) and p(T) are each around 0.33. Hence a (deep) D ("not C"), even when all tips have p(1,1,1,1).
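A small numerical sketch of how a model alone can produce such a vector. This is not what either program does internally: I am assuming the F81 model (the simplest one with unequal base frequencies, picked because its transition probabilities have a closed form), made-up frequencies with C being rare, and a hypothetical four-taxon tree with arbitrary branch lengths. In this simplified setting the marginal posterior at the root of an all-N site comes out as exactly the assumed base frequencies, whatever the branch lengths are.

```python
import math

BASES = "ACGT"
# ASSUMPTION: made-up base frequencies (A, C, G, T) with C rare.
PI = [0.31, 0.07, 0.31, 0.31]

def f81_p(t, pi=PI):
    """F81 transition matrix P(t); rows = from-state, columns = to-state."""
    # beta scales the process to one expected substitution per site per unit t.
    beta = 1.0 / (1.0 - sum(p * p for p in pi))
    decay = math.exp(-beta * t)
    return [[decay * (i == j) + (1.0 - decay) * pi[j] for j in range(4)]
            for i in range(4)]

def partial(node):
    """Felsenstein pruning: conditional likelihood vector at `node`.

    A node is ("tip", vector) or ("node", [(child, branch_length), ...]).
    """
    if node[0] == "tip":
        return node[1]
    out = [1.0] * 4
    for child, t in node[1]:
        p = f81_p(t)
        below = partial(child)
        for i in range(4):
            out[i] *= sum(p[i][j] * below[j] for j in range(4))
    return out

# Hypothetical tree; every tip is N, i.e. likelihood vector (1,1,1,1).
n_tip = ("tip", [1.0, 1.0, 1.0, 1.0])
left = ("node", [(n_tip, 0.05), (n_tip, 0.20)])
right = ("node", [(n_tip, 0.80), (n_tip, 0.01)])
root = ("node", [(left, 0.10), (right, 0.30)])

cond = partial(root)
weighted = [PI[i] * cond[i] for i in range(4)]
posterior = [w / sum(weighted) for w in weighted]  # marginal posterior at root
```

Because each row of P(t) sums to 1, the conditional likelihoods stay at 1 all the way up the tree, and the posterior is just the prior given by the base frequencies: C ends up far below 0.25 without a single observed base.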
That we have some D, some N, maybe even another not-X ambiguity code is part of the magic of the Gamma distribution (if included in the model): it lets the substitution rate vary from site to site instead of fixing one rate for all sites.
Alexey and Alexi may have a link showing at which point, i.e. for which node probability vector, the output chooses an ambiguity code over a defined base, or one ambiguity code over another (I couldn't find it in the GitHub advanced tutorial on a very quick browse).
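Just to illustrate what such a decision rule could look like, here is a purely hypothetical one; I do not know the rule the programs actually apply, and the 0.9 threshold is my own invention. It adds bases in order of decreasing probability until their cumulative probability reaches the threshold, then reports the IUPAC code for that set.

```python
# IUPAC nucleotide codes, keyed by the set of bases each one stands for.
IUPAC_CODE = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}

def call_state(probs, threshold=0.9):
    """HYPOTHETICAL rule, not taken from IQ-TREE or RAxML-NG.

    `probs` is a node probability vector in the order A, C, G, T.
    """
    ranked = sorted(zip(probs, "ACGT"), reverse=True)
    chosen, total = set(), 0.0
    for p, base in ranked:
        chosen.add(base)
        total += p
        if total >= threshold:
            break
    return IUPAC_CODE[frozenset(chosen)]

deep_node = call_state([0.31, 0.07, 0.31, 0.31])   # C is rare
flat_node = call_state([0.25, 0.25, 0.25, 0.25])   # nothing preferred
sure_node = call_state([0.97, 0.01, 0.01, 0.01])   # well-supported A
```

Under this made-up rule the first vector is reported as D, the flat one as N and the last one as A; a different threshold would shift those boundaries.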
Also when using symmetric models, we may observe something like this but the effect should be smaller.
For your interpretation of the results this is, however, irrelevant. To come back to Alexi's "think twice, if you do": the second thought in this case is that the D is a purely model-triggered result, not one based on your actual input data. So we would not report any ancestral base for the all-N or mostly-N sites as a result of our data and analysis (e.g. by including them in tabulations or graphical representations).
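In practice that could be done by flagging such columns before tabulating anything. A minimal sketch, assuming the alignment is held as a plain dict of equal-length sequences; the helper names, the toy data and the 0.5 cutoff are all hypothetical, and treating gaps and "?" like N is my assumption about how you want to count missing data.

```python
def undetermined_fraction(column):
    """Fraction of tips in an alignment column that are N, ? or a gap."""
    return sum(ch.upper() in "N?-" for ch in column) / len(column)

def sites_to_report(alignment, max_undetermined=0.5):
    """Return 0-based column indices that are determined well enough.

    ASSUMPTION: the cutoff is arbitrary; choose one that suits your data.
    """
    columns = zip(*alignment.values())
    return [i for i, col in enumerate(columns)
            if undetermined_fraction(col) <= max_undetermined]

# Toy alignment: column 2 is all-N, column 3 is mostly N.
alignment = {
    "t1": "ACNN",
    "t2": "ACNN",
    "t3": "ATNG",
    "t4": "ANNN",
}

keep = sites_to_report(alignment)   # columns whose ancestral states we report
```

Ancestral states for the columns that are not in `keep` would then simply be left out of the tables and figures.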
/G.