From https://www.cog-genomics.org/plink/2.0/filter#hwe: "On chrX, p-values are now computed using the method described in Graffelman J, Weir BS (2016) Testing for Hardy-Weinberg equilibrium at biallelic genetic markers on the X chromosome."
This paper observed that on chrX, if the female genotype counts appear to be in Hardy-Weinberg equilibrium but the male and female allele frequencies differ, the female genotype counts will be pushed out of Hardy-Weinberg equilibrium in the next generation. As a practical matter, the joint genotype-count/allele-frequency test introduced by the paper has caught obvious QC issues in data I've worked with that the old female-genotypes-only method on chrX missed.
So, no, it is not correct to perform a female-only test on chrX. Leave the males in. PLINK 2 will account for them appropriately, as long as sex information is provided.
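To make the paper's observation concrete, here is a minimal Python sketch of what happens to the daughters' genotype frequencies after one generation of random mating when male and female allele frequencies differ. It is only a population-frequency illustration under textbook assumptions (biallelic marker, random mating, no sampling noise). It is not the Graffelman-Weir test statistic and not what PLINK 2 computes internally, and the function names are made up for this example.

```python
def daughter_genotype_freqs(p_female, p_male):
    """Genotype frequencies (AA, AB, BB) among daughters at a biallelic chrX marker.

    Assumes random mating: each daughter inherits one X from her mother
    (allele A with probability p_female) and her father's single X
    (allele A with probability p_male).
    """
    aa = p_female * p_male
    ab = p_female * (1 - p_male) + (1 - p_female) * p_male
    bb = (1 - p_female) * (1 - p_male)
    return aa, ab, bb


def het_excess(p_female, p_male):
    """Observed minus Hardy-Weinberg-expected heterozygote frequency in daughters."""
    aa, ab, bb = daughter_genotype_freqs(p_female, p_male)
    p_daughters = aa + ab / 2          # allele A frequency among daughters
    expected_ab = 2 * p_daughters * (1 - p_daughters)
    return ab - expected_ab


# Equal allele frequencies in both sexes: daughters are in HWE (excess is 0).
print(het_excess(0.5, 0.5))
# Unequal allele frequencies: daughters show a heterozygote excess.
print(het_excess(0.6, 0.4))
```

Working through the algebra, the excess comes out to (p_male - p_female)^2 / 2, so the second call gives roughly 0.02. Any sex difference in allele frequency therefore produces a heterozygote excess in the next generation's females, which is why a female-only test can look fine while the data are not actually at equilibrium.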