The ground truth was based on the gene content of the genomes in the community, weighted by their abundance. For example, in a community of species A and species B where each species has a homolog of gene family X, and A is twice as abundant as B, the expected output would look something like...
Abundance here is in units of coverage or RPK, to compensate for differences in genome/gene length. Notably, this procedure does not account for non-uniform read sampling/sequencing along the length of a genome, which (in our experiments) explained ~0.1 units of Bray-Curtis distance between the expected and observed gene abundance profiles. Adding this read-level resolution to the gold standard is a lot more complicated, since it requires tracing individual reads back to their genes of origin.
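
To make the idea concrete, here is a minimal sketch of how such an expected profile could be built and compared against an observed one. It is only an illustration, not the code we actually used: the function names (`expected_profile`, `bray_curtis`), the coverage values (20x for A, 10x for B), and the single-copy assumption for gene family X are all made up for the example.

```python
from collections import defaultdict


def expected_profile(gene_copies, coverage):
    """Build an expected gene-family profile from genome content and abundance.

    gene_copies: {species: {gene_family: copy_number}}  (hypothetical input layout)
    coverage:    {species: genome coverage}             (hypothetical input layout)
    Returns (community totals per family, per-species contributions).
    """
    totals = defaultdict(float)
    stratified = {}
    for species, families in gene_copies.items():
        for family, copies in families.items():
            # Each gene copy is expected at the coverage of its genome.
            value = copies * coverage[species]
            stratified[(family, species)] = value
            totals[family] += value
    return dict(totals), stratified


def bray_curtis(p, q):
    """Bray-Curtis distance between two profiles given as {feature: abundance}."""
    keys = set(p) | set(q)
    num = sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)
    den = sum(p.get(k, 0.0) + q.get(k, 0.0) for k in keys)
    return num / den if den else 0.0


# Illustrative values only: A is twice as abundant as B, one copy of X in each.
gene_copies = {"A": {"X": 1}, "B": {"X": 1}}
coverage = {"A": 20.0, "B": 10.0}

totals, stratified = expected_profile(gene_copies, coverage)
for family, value in totals.items():
    print(family, value)
for (family, species), value in sorted(stratified.items()):
    print(f"{family}|{species}", value)
```

Because every gene in a genome gets exactly that genome's coverage, this kind of calculation has the same blind spot described above: it ignores uneven read sampling along the genome, so an observed profile will not match it perfectly even when the profiler makes no mistakes.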