I am not sure it is fully valid to use DESeq2 at higher taxonomic levels (for example, it is not clear how well the negative binomial model it uses fits the pooled reads of several bacteria). However, if you have a small sample size and treat the results as preliminary indicators of taxonomic groups that may be differentially abundant and worth further study (as opposed to proof that these taxonomic groups definitely change), I think it's a legitimate option.
Another option is to use permutation-based tests on whatever statistic you want to calculate on these groups (e.g. log ratio, difference in means, presence/absence). Calculate the statistic for each taxonomic group between the conditions and compare it to the values obtained under random permutations of the sample labels. Note that you will still need FDR correction for the p-values, since you are testing many taxonomic groups.
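To make the permutation idea concrete, here is a minimal sketch using only the Python standard library. It assumes the difference in means as the statistic and Benjamini-Hochberg as the FDR method; both are just illustrative choices. The function names, the group names and the abundance values are all made up for the example, not taken from your data.

```python
import random
from statistics import mean

def permutation_pvalue(a, b, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the difference in means."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of the samples
        stat = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if stat >= observed:
            hits += 1
    # +1 counts the observed labelling, so p is never exactly 0
    return (hits + 1) / (n_perm + 1)

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from largest to smallest p
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical relative abundances: (condition A samples, condition B samples)
abundance = {
    "Firmicutes": ([0.10, 0.12, 0.11, 0.13, 0.09], [0.30, 0.28, 0.33, 0.29, 0.31]),
    "Bacteroidetes": ([0.40, 0.35, 0.42, 0.38, 0.41], [0.39, 0.41, 0.36, 0.40, 0.37]),
}
names = list(abundance)
raw = [permutation_pvalue(*abundance[n]) for n in names]
adj = benjamini_hochberg(raw)
for n, p, q in zip(names, raw, adj):
    print(f"{n}: p={p:.4f}, q={q:.4f}")
```

Swapping in a different statistic (log ratio, presence/absence) only means changing the two lines that compute `observed` and `stat`.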
Another option, if you do not have enough samples, is to maybe merge groups (e.g. compare samples from lesion groups 0 and 1 pooled together against groups 3 and 4 pooled together).
How many samples do you have in each group? Note that for the Mann-Whitney U test, the >20 samples requirement exists only so that the distribution of the U scores can be approximated by a normal distribution. The test can also be calculated for <20 samples (e.g. from the exact distribution of U or using permutations).
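As a rough illustration of the small-sample case, the sketch below computes an exact two-sided Mann-Whitney p-value by enumerating every possible relabelling of the samples instead of using the normal approximation. The function names and the numbers are hypothetical. Full enumeration is only practical for small groups, since the number of relabellings grows quickly; for larger groups you would sample random permutations instead.

```python
from itertools import combinations

def u_statistic(a, b):
    """U for sample a: count of (x, y) pairs with x > y; ties count 0.5."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)

def exact_mann_whitney(a, b):
    """Two-sided exact p-value by enumerating every relabelling."""
    pooled = list(a) + list(b)
    n_a, n_b = len(a), len(b)
    centre = n_a * n_b / 2  # expected value of U under the null
    observed = abs(u_statistic(a, b) - centre)
    total = extreme = 0
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        group_a = [pooled[i] for i in idx]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if abs(u_statistic(group_a, group_b) - centre) >= observed:
            extreme += 1
    return extreme / total

# Made-up values for two small groups (4 and 5 samples)
p = exact_mann_whitney([1.2, 0.8, 1.5, 0.9], [2.1, 2.6, 1.9, 3.0, 2.4])
print(f"exact two-sided p = {p:.4f}")
```

If you use SciPy, I believe recent versions of `scipy.stats.mannwhitneyu` accept `method="exact"` for the same purpose, but check the documentation for your installed version.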
Does this make sense?