Hi,
I want to build a 3D convolutional network on my own frame data (images within a sequence), but I don't know how to do this in video-caffe. Can anyone tell me where to start?
What is the difference between caffe and video-caffe? I know that caffe is for 2D images and video-caffe is for videos or frame sequences, but is building a CNN in video-caffe the same as building one in caffe?
In video-caffe, what does hdf5_classification mean? Can I use it as a template? How do I generate my training data for it? (I can already export 2D images as HDF5 data, but what about this case? My data is a sequence of images, like video frames.)
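To make the last question concrete, here is a minimal sketch of what I imagine the data preparation would look like. It is only my guess, not something taken from the video-caffe docs. I am assuming that the 3D layers expect 5D blobs shaped (N, C, L, H, W), where L is the number of frames per clip, and that the HDF5 dataset names ("data" and "label" here) have to match the top blob names of the HDF5 data layer in the prototxt. The function names, clip length, and stride are all made up for illustration.

```python
import numpy as np


def frames_to_clips(frames, clip_length, stride):
    """Group a frame sequence (T, H, W, C) into clips shaped (N, C, L, H, W).

    The (N, C, L, H, W) layout is an assumption about what the 3D layers expect.
    """
    frames = np.asarray(frames, dtype=np.float32)
    if frames.ndim != 4:
        raise ValueError("expected frames with shape (T, H, W, C)")

    # Start index of every full clip; frames left over at the end are dropped.
    starts = range(0, frames.shape[0] - clip_length + 1, stride)
    clips = [frames[s:s + clip_length] for s in starts]
    if not clips:
        raise ValueError("sequence is shorter than clip_length")

    stacked = np.stack(clips)                # (N, L, H, W, C)
    return stacked.transpose(0, 4, 1, 2, 3)  # (N, C, L, H, W)


def write_hdf5(path, clips, labels):
    """Write clips and one label per clip to an HDF5 file.

    The dataset names "data" and "label" are assumed to match the layer's tops.
    """
    import h5py  # imported here so frames_to_clips works without h5py

    with h5py.File(path, "w") as f:
        f.create_dataset("data", data=clips)
        f.create_dataset("label", data=np.asarray(labels, dtype=np.float32))
```

For example, frames_to_clips(frames, clip_length=16, stride=8) on a (T, H, W, 3) array would give overlapping 16-frame clips, which I would then pass to write_hdf5 together with one label per clip. As far as I understand, the HDF5 data layer in caffe takes a text file listing the .h5 file paths as its source. Is this the right direction for video-caffe?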
Best