Hi Luca,
Low ESSs for the pop size parameters, rootHeight and clock rate frequently indicate weak temporal signal. This may also explain why mixing behaves differently under different coalescent models. The alignment seems rather short, so assuming you have dated tips, it may be useful to investigate the temporal signal using TempEst. More informative priors (e.g. on the rate or rootHeight) should help if the sequence data cannot inform the dated-tip model very well.
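To make the temporal signal check concrete, here is a minimal sketch of the kind of root-to-tip regression that TempEst displays: root-to-tip distance regressed on sampling date. It assumes you have already exported the tip dates and root-to-tip distances from a rooted, non-clock tree; the function name and the numbers are made up for illustration and are not taken from your data. A positive slope gives a rough rate, and a low R^2 suggests weak temporal signal.

```python
def root_to_tip_regression(dates, distances):
    """Least-squares fit of root-to-tip distance against sampling date.

    Returns (slope, intercept, r_squared, x_intercept). The slope is a rough
    rate estimate and the x-intercept a rough root date. This is only an
    exploratory check, not a substitute for the BEAST analysis itself.
    """
    n = len(dates)
    if n != len(distances) or n < 3:
        raise ValueError("need at least 3 tips, with one distance per date")

    mean_t = sum(dates) / n
    mean_d = sum(distances) / n
    sxx = sum((t - mean_t) ** 2 for t in dates)
    syy = sum((d - mean_d) ** 2 for d in distances)
    sxy = sum((t - mean_t) * (d - mean_d) for t, d in zip(dates, distances))
    if sxx == 0:
        raise ValueError("all tips share one date: no temporal information")

    slope = sxy / sxx
    intercept = mean_d - slope * mean_t
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    x_intercept = -intercept / slope if slope != 0 else None
    return slope, intercept, r_squared, x_intercept


# Hypothetical values: sampling years and distances in substitutions/site.
example_dates = [2000.0, 2005.0, 2010.0, 2015.0]
example_distances = [0.010, 0.015, 0.020, 0.025]

slope, intercept, r2, root_date = root_to_tip_regression(
    example_dates, example_distances
)
print(f"slope={slope:.5f} R^2={r2:.3f} root date={root_date:.1f}")
```

A slope near zero or negative, or points scattered widely around the line, would point towards the weak-signal explanation above.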
As the skygrid cut-off value is not a parameter that is estimated, but something that needs to be fixed a priori, it will always have an ESS of 1, so that is not a problem. Concerning the value that one should choose for this cut-off, a preliminary run using a simple coalescent model should give a reasonable indication of the rootHeight. According to your Tracer summary, something around 1500 could be a reasonable choice. There is no problem with very high ESSs for your substitution parameters, but you could reduce the weight on their operators, or increase the weight on the operators for the parameters for which mixing is far more challenging (pop size parameters, rootHeight and clock rate).
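To illustrate the effect of reweighting, here is a small sketch based on the fact that operators are picked in proportion to their weights, so the share of proposals each one receives is its weight divided by the total. The operator names and weights below are hypothetical placeholders, not the ones in your XML.

```python
def proposal_fractions(weights):
    """Expected share of MCMC proposals per operator, given its weight."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("operator weights must sum to a positive number")
    return {name: w / total for name, w in weights.items()}


# Hypothetical starting weights (illustrative only).
before = {
    "substitution parameters": 10.0,
    "clock rate": 3.0,
    "rootHeight": 3.0,
    "pop size parameters": 4.0,
}

# Shift effort away from the well-mixing substitution parameters
# towards the parameters with low ESS.
after = dict(before)
after["substitution parameters"] = 2.0
after["clock rate"] = 9.0
after["rootHeight"] = 9.0

for name in before:
    b = proposal_fractions(before)[name]
    a = proposal_fractions(after)[name]
    print(f"{name}: {b:.0%} -> {a:.0%} of proposals")
```

Only the relative weights matter, so lowering the weights of the substitution operators or raising the others has the same kind of effect.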
Best,
Philippe