I'm planning to implement a system like this:

From: [1] Yao, Ting, Tao Mei, and Yong Rui. "Highlight Detection with Pairwise Deep Ranking for First-Person Video Summarization."
I need to load two separate AlexNet models, one for the highlight data and one for the non-highlight data. It is possible to load two models in Caffe, but loading the same model twice is a bit complex: I need a second AlexNet model with the same parameters and weights but different layer names.
So is there a way to rename the layers of a pretrained model like AlexNet without changing anything else?
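To make the renaming concrete, here is a rough sketch of what I mean on the network-definition side only. It assumes a standard Caffe prototxt with one field per line; the function name `rename_layers` and the `_p` suffix are my own placeholders, not anything from Caffe. It appends the suffix to each layer's `name`, `top`, and `bottom`, and leaves the net-level name and any nested `param { name: ... }` entries alone.

```python
import re

# Matches a name/top/bottom field that starts a line, e.g.   name: "conv1"
FIELD = re.compile(r'^(\s*(?:name|top|bottom)\s*:\s*")([^"]+)(")')


def rename_layers(prototxt_text, suffix="_p"):
    """Append `suffix` to layer names and blob names in prototxt text.

    Assumption: one field per line, as in the usual Caffe model files.
    Only fields directly inside a top-level block (brace depth 1) are
    touched, so the net-level name and nested param names are kept.
    """
    out = []
    depth = 0  # current brace nesting level before processing the line
    for line in prototxt_text.splitlines():
        if depth == 1:
            line = FIELD.sub(
                lambda m: m.group(1) + m.group(2) + suffix + m.group(3), line
            )
        depth += line.count("{") - line.count("}")
        out.append(line)
    return "\n".join(out)


sample = '''name: "AlexNet"
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    name: "conv1_w"
  }
}'''

renamed = rename_layers(sample)
print(renamed)
```

As far as I understand, Caffe matches pretrained weights to layers by layer name, so renaming the prototxt alone would leave the renamed layers without the AlexNet weights. That is exactly the part I do not know how to handle.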
Do you have any other suggestions for implementing a system like the one in the image above?