See link. http://fsr.utias.utoronto.ca/submissions/FSR_2015_submission_57.pdf
The code was simply sklearn's MiniBatchKMeans applied to features extracted from the fc6 layer of the reference bvlc_reference_rcnn model.
K-means is simple and works reasonably well even with the high dimensionality of the features.
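For anyone who wants to try something similar, here is a rough sketch of the clustering step only. It is not the original code: the random array stands in for real fc6 activations, and the 4096-dimensional feature size, cluster count, and batch size are assumptions chosen for illustration. The helper name cluster_features is made up for this example.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans


def cluster_features(features, n_clusters=10, batch_size=256, seed=0):
    """Cluster row-wise feature vectors with mini-batch k-means.

    features: array of shape (n_samples, n_dims), e.g. one fc6 vector per image.
    Returns the fitted model and the cluster index assigned to each sample.
    """
    km = MiniBatchKMeans(
        n_clusters=n_clusters,
        batch_size=batch_size,
        n_init=3,
        random_state=seed,
    )
    labels = km.fit_predict(features)
    return km, labels


# Placeholder data: 500 random vectors standing in for extracted fc6 features.
# The 4096 dimension is an assumption about the layer's output size.
rng = np.random.default_rng(0)
features = rng.standard_normal((500, 4096)).astype(np.float32)

km, labels = cluster_features(features, n_clusters=5)
```

With real data you would replace the random array with the stacked fc6 outputs, one row per image, and pick n_clusters to suit the dataset.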
I'd also be interested to hear whether anyone has performed unsupervised learning with Caffe directly.