Hi,
I have a trained AlexNet on which I have done the "net surgery" to replace the fully connected layers with convolutional ones.
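For reference, the surgery follows the standard Caffe net-surgery recipe; a minimal sketch of what I did, with the prototxt and model file names as placeholders for my actual paths:

import caffe

# Original net with fully connected layers, and the new all-convolutional net
# (file names are placeholders)
net = caffe.Net('alexnet-deploy.prototxt', 'alexnet.caffemodel', caffe.TEST)
net_conv = caffe.Net('alexnet-conv-deploy.prototxt', 'alexnet.caffemodel', caffe.TEST)

fc_layers = ['fc6', 'fc7', 'fc8']
conv_layers = ['fc6-conv', 'fc7-conv', 'fc8-conv']

for fc, conv in zip(fc_layers, conv_layers):
    # Copy weights: the FC weight matrix is flattened onto the conv filter bank
    net_conv.params[conv][0].data.flat = net.params[fc][0].data.flat
    # Copy biases unchanged
    net_conv.params[conv][1].data[...] = net.params[fc][1].data

net_conv.save('alexnet-conv.caffemodel')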
My input is a 2048x2048 image, and the fc8-conv layer is my output. It gives me roughly what I expect: dimensions [1, 5, 61, 61] when using 5 classes, i.e. a 61x61 "heat map" for each class. Why I get 61x61 rather than 64x64 is a mystery to me, but fair enough.
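This is roughly how I run the forward pass and read the output shape (preprocessing is omitted here, and the input blob name 'data' comes from my deploy prototxt):

import numpy as np
import caffe

net_conv = caffe.Net('alexnet-conv-deploy.prototxt', 'alexnet-conv.caffemodel', caffe.TEST)

# Placeholder for a real preprocessed image; my deploy prototxt declares
# the input as 1 x 3 x 2048 x 2048
image = np.random.rand(1, 3, 2048, 2048).astype(np.float32)
net_conv.blobs['data'].data[...] = image

out = net_conv.forward()
print(out['fc8-conv'].shape)  # -> (1, 5, 61, 61) with my 5-class model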
When looking into the fc7-conv layer, which has 4096 outputs, I would have expected dimensions of [1, 4096, 64, 64] or similar, but instead Python shows me [4096, 4096, 1, 1]. I have tried reshaping this blob into arrays containing a 4096-dimensional "feature vector" for each point in a 64x64 grid, but I don't see how the data is organized.
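In case I am simply reading the wrong thing: pycaffe reports shapes in two places, the activation blobs (net.blobs) and the layer parameters (net.params). A sketch of both, plus one of the reshapes I tried, assuming the net is loaded as above:

# Activation blob: the data the layer produced during forward()
print(net_conv.blobs['fc7-conv'].data.shape)

# Parameter blobs: the layer's learned weights and biases
print(net_conv.params['fc7-conv'][0].data.shape)  # weights
print(net_conv.params['fc7-conv'][1].data.shape)  # biases

# One reshape I tried on the activation: one 4096-dim vector per spatial point
feat = net_conv.blobs['fc7-conv'].data[0]
vectors = feat.reshape(feat.shape[0], -1).T  # (H*W, 4096)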
Any clues would be helpful.
When I run images through the standard AlexNet, I can see that the 4096-dimensional "feature vector" in fc7 makes sense, and similar objects get similar vectors there. But no matter how I permute the data in fc7-conv, the numbers just look random to me.
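This is roughly how I compare the two nets' fc7 features, assuming fc7-conv comes out as (1, 4096, H, W); the probe location (y, x) is a placeholder:

import numpy as np

def cosine(a, b):
    # Cosine similarity between two 1-D feature vectors
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# fc7 from the standard AlexNet on a single 227x227 crop
net.forward()
v_fc = net.blobs['fc7'].data[0].ravel()  # shape (4096,)

# fc7-conv at one spatial position of the fully convolutional net
net_conv.forward()
y, x = 30, 30  # placeholder probe location
v_conv = net_conv.blobs['fc7-conv'].data[0, :, y, x]  # shape (4096,)

print(cosine(v_fc, v_conv))  # I would expect this to be high for matching content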