Hi,
Quick background: I'm trying to build a texture classifier using transfer learning with MobileNet as a feature network. Basically, for a given image of dimensions, say, 512 x 512, I want to take a series of 224 x 224 crops at highly overlapping intervals to produce a tensor of shape [N 224 224 3], where my batch is of size N, each crop is 224 pixels in width and height, and there are 3 channels for RGB. The crops would be taken at image origins (x, y) = [0, 0], [1, 0], [2, 0], ..., [1, 1], [2, 1], [2, 2], ... and so on, effectively sampling a field of view densely over the original image.
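To make the sampling pattern concrete, the set of origins I have in mind looks roughly like this (plain Python, illustrative only; the 1-pixel step and in-bounds limits are just my reading of "highly overlapping"):

```python
# Illustrative only: dense crop origins over a 512 x 512 image with 224 x 224 crops.
image_size, crop_size = 512, 224
origins = [(x, y)
           for y in range(image_size - crop_size + 1)   # every valid row offset
           for x in range(image_size - crop_size + 1)]  # every valid column offset
# origins[:4] -> [(0, 0), (1, 0), (2, 0), (3, 0)]
```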
Letting N be 100: if I create 100 crops and load them as tensors of shape [1 224 224 3], I can tf.concat them to reach my desired batch size, but this operation alone takes > 2000 ms on the workstation I'm using. For comparison, MobileNet inference on the same batch takes about 50 ms, so it doesn't appear to be a matter of the machine simply being slow. Is there a more efficient way to do this?
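For reference, the slow path is structured roughly like this (a minimal sketch assuming Python/TensorFlow, with a random stand-in image and simplified crop origins; only the per-crop tensors plus tf.concat structure matches what I'm actually doing):

```python
import time
import tensorflow as tf

image = tf.random.uniform([512, 512, 3])  # stand-in for the real 512 x 512 RGB image
N = 100

start = time.time()
crops = []
for i in range(N):
    x, y = i, 0  # simplified origins; the real ones follow the dense pattern above
    crop = image[y:y + 224, x:x + 224, :]          # [224, 224, 3]
    crops.append(tf.expand_dims(crop, axis=0))     # [1, 224, 224, 3]
batch = tf.concat(crops, axis=0)                   # [N, 224, 224, 3]
print(batch.shape, f"concat path took {(time.time() - start) * 1000:.0f} ms")
```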
Cheers,
Larry