Hi there,
I've been training my own image segmenter model that operates on 512x512 input, similar to the HairSegmenter model. The intention is to have this model run on the GPU using the MediaPipe GPU backend.
However, I'm running into an issue that has me stumped. In MediaPipe Studio, my model works fine on CPU, but with the GPU delegate its outputs are garbage.
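To try to confirm the float16 suspicion before changing anything in training, I've been thinking of probing the intermediate activations offline with something like the sketch below. The names are placeholders ("segmenter.keras" and the random input stand in for my actual checkpoint and a real preprocessed 512x512 frame), and it assumes a functional Keras model with single-output layers, so it may not map exactly onto what the GPU delegate actually does:

```python
import numpy as np
import tensorflow as tf

# Placeholders: swap in the real checkpoint and a real preprocessed 512x512 frame.
model = tf.keras.models.load_model("segmenter.keras")
sample = np.random.rand(1, 512, 512, 3).astype(np.float32)

# Expose every intermediate activation so I can look for values outside the
# float16 range (roughly +/-65504) -- my best guess for why fp32 CPU looks
# fine while the fp16 GPU path produces garbage.
probe = tf.keras.Model(inputs=model.inputs,
                       outputs=[layer.output for layer in model.layers[1:]])

fp16_max = float(np.finfo(np.float16).max)
for layer, act in zip(model.layers[1:], probe.predict(sample)):
    peak = float(np.max(np.abs(act)))
    if peak > fp16_max:
        print(f"{layer.name}: peak |activation| {peak:.3e} overflows float16")
```

Is that a sensible way to narrow it down, or is there a better-supported way to diagnose this?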
With that in mind, I'd love to get some pointers on:
1. Any best practices for training the model to account for what I suspect are numerical instabilities from float16 vs. float32 inference? (See the toy sketch after this list for the kind of change I have in mind.)
2. Any best practices for training custom model architectures from scratch specifically for use with MediaPipe?
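To make question 1 concrete, here's the kind of change I've been considering. This is a toy block, not my actual architecture, and I'm genuinely unsure whether bounding activations like this is the right lever or just a band-aid:

```python
import tensorflow as tf

def conv_block(x, filters):
    """Toy block: a bounded ReLU6 instead of an unbounded ReLU, on the theory
    that it keeps activations comfortably inside the float16 range."""
    x = tf.keras.layers.Conv2D(filters, 3, padding="same", use_bias=False)(x)
    x = tf.keras.layers.BatchNormalization()(x)
    return tf.keras.layers.ReLU(max_value=6.0)(x)

inputs = tf.keras.Input(shape=(512, 512, 3))
x = conv_block(inputs, 32)
x = conv_block(x, 64)
outputs = tf.keras.layers.Conv2D(1, 1, activation="sigmoid")(x)  # single-channel mask head
toy = tf.keras.Model(inputs, outputs)
```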
Thanks so much! I'd appreciate any guidance here!