I'd be inclined to clean them separately and then combine. The main source of confusion is the percent missing data. I find it a useful diagnostic for sample quality, but it's misleading for samples typed on the MegaMUGA when you look at the combined MM and GM markers, since most of those markers were never attempted for those samples.
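To make the missing-data point concrete, here is a minimal sketch in base R. The genotype matrix, sample IDs, marker names, and array labels are all made up for illustration, and it assumes the GigaMUGA samples were attempted at every marker in the combined set.

```r
# Toy genotype matrix: rows are samples, columns are markers; NA = no call.
# Every name and value here is hypothetical, for illustration only.
geno <- matrix(c(1, 2, NA, NA,    # mm_sample: typed on MegaMUGA only
                 1, NA, 2, 1),    # gm_sample: typed on GigaMUGA
               nrow = 2, byrow = TRUE,
               dimnames = list(c("mm_sample", "gm_sample"),
                               c("m1", "m2", "m3", "m4")))

mm_markers   <- c("m1", "m2")                        # assumed: markers on the MegaMUGA
sample_array <- c(mm_sample = "MM", gm_sample = "GM") # assumed: array used per sample

# Naive percent missing over the combined marker set:
# MegaMUGA samples look bad only because m3 and m4 were never attempted.
pct_missing_all <- rowMeans(is.na(geno)) * 100

# Percent missing over only the markers that each sample's array attempted.
pct_missing_attempted <- sapply(rownames(geno), function(id) {
  cols <- if (sample_array[[id]] == "MM") mm_markers else colnames(geno)
  mean(is.na(geno[id, cols])) * 100
})

print(pct_missing_all)
print(pct_missing_attempted)
```

With a real cross2 object, I believe n_missing(cross, "ind", "prop") in R/qtl2 gives the per-sample proportion missing, and pull_markers() could restrict a cross to the MegaMUGA markers first, but treat that as a suggestion to check rather than a tested recipe.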
I personally don't put much effort into identifying problem markers, and focus mainly on problem samples. For MegaMUGA and GigaMUGA data, it seems like we can just smooth over any genotyping errors when calculating genotype probabilities, say by using error_prob=0.002 in calc_genoprob().
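As a rough sketch of that call: the example below uses the small "iron" intercross that ships with R/qtl2 purely as a stand-in (it is not MUGA data), and passes the genetic map stored in the cross. The only setting taken from the discussion above is error_prob=0.002; the rest is an assumption for illustration.

```r
library(qtl2)

# Stand-in data: the iron intercross bundled with R/qtl2 (not MUGA data).
iron <- read_cross2(system.file("extdata", "iron.zip", package = "qtl2"))

# Genotype probabilities with a larger assumed genotyping error rate,
# so that isolated genotyping errors have less influence on the result.
pr <- calc_genoprob(iron, iron$gmap, error_prob = 0.002)

# One array of probabilities per chromosome.
print(names(pr))
```

For your own data you would substitute your cross2 object and map for iron and iron$gmap.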
There's a function c.cross2() for combining cross2 objects, for the case of different sets of samples with the exact same set of markers.
karl