One of my labmates has used MCMCglmm quite a bit. It mostly worked for him, but he had trouble getting the Markov chains to mix well in certain cases, and I think he has since switched to Stan. Stan is generally more flexible and faster, but it's probably harder to learn from scratch.
Depending on what assumptions you're willing to make, it's also possible to trick lme4 into modeling this. The trick is to duplicate each row of your data so that it appears twice, once for the surface measurement and once for the shallow measurement, with the measured value in a single response column and a new column specifying which of the two the row describes. You can then add correlated random effects and interaction terms involving that column as you see fit, so that the model "understands" how you expect the relationship between surface measurements and shallow measurements within a location to work.
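To make the stacking step concrete, here is a minimal sketch in Python (chosen only because it runs anywhere; in R you would do the equivalent reshape before calling lme4). The column names `location`, `date`, `surface`, `shallow`, `depth`, `value` and `obs`, and all the numbers, are made up for illustration and are not taken from your data.

```python
# Hypothetical "wide" data: one row per sampling event, holding both measurements.
# All names and values here are invented for illustration.
wide = [
    {"location": "A", "date": "2020-01", "surface": 4.1, "shallow": 3.7},
    {"location": "A", "date": "2020-02", "surface": 4.4, "shallow": 3.9},
    {"location": "B", "date": "2020-01", "surface": 5.0, "shallow": 4.8},
]


def stack_measurements(rows, measure_cols=("surface", "shallow")):
    """Turn each wide row into one long row per measurement type."""
    long_rows = []
    for obs_id, row in enumerate(rows):
        # Columns shared by both copies of the row (location, date, ...).
        shared = {k: v for k, v in row.items() if k not in measure_cols}
        for depth in measure_cols:
            long_rows.append(
                {
                    **shared,
                    "obs": obs_id,        # links the two copies of one original row
                    "depth": depth,       # which measurement this row describes
                    "value": row[depth],  # single response column
                }
            )
    return long_rows


long = stack_measurements(wide)
for r in long:
    print(r)
```

With the data in this shape, I believe an lme4 formula along the lines of `value ~ 0 + depth + (0 + depth | location)` would give each location a pair of correlated random intercepts, one per depth. That assumes each location was sampled more than once; with a single sampling event per location, the random effects can't be separated from the residual noise.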
Hope this makes sense.
Dave