What has happened:
- Python layer setup worked without error on my "original" version of Caffe (commit: d8e2f0526d5748e1a262ba0f80d795c24d5ddfa1).
- I then manually added some CRF-as-RNN code on top of that commit (several .hpp and .cpp layer definitions and utility functions, plus a Python wrapper for segmentation).
- Compilation (caffe and pycaffe) and testing (make test && make runtest) completed without errors.
- However, caffe::Net::Init() now fails with a segmentation fault _only for Python layers_ (setting up other layer types works).
Here is a minimal example that reproduces the error:
============================
$ ./caffe/build/tools/caffe test -model 'tmp/pythonlayer.prototxt' -weights 'tmp/pythonlayer.caffemodel' -gpu 1
I0811 11:48:06.189405 4122 caffe.cpp:237] Use GPU with device ID 1
I0811 11:48:11.878108 4122 caffe.cpp:241] GPU device name: Tesla K40m
I0811 11:48:12.306141 4122 net.cpp:49] Initializing net from parameters:
state {
phase: TEST
}
layer {
name: "one"
type: "Python"
top: "one"
python_param {
module: "util_python_layers"
layer: "ONE"
param_str: "{\'batch_size\': 1}"
}
}
I0811 11:48:12.306279 4122 layer_factory.hpp:77] Creating layer one
/BS/joon_projects/work/caffe-confusion/python/caffe/pycaffe.py:13: RuntimeWarning: to-Python converter for boost::shared_ptr<caffe::Net<float> > already registered; second conversion method ignored.
from ._caffe import Net, SGDSolver, NesterovSolver, AdaGradSolver, \
/BS/joon_projects/work/caffe-confusion/python/caffe/pycaffe.py:13: RuntimeWarning: to-Python converter for boost::shared_ptr<caffe::Blob<float> > already registered; second conversion method ignored.
from ._caffe import Net, SGDSolver, NesterovSolver, AdaGradSolver, \
/BS/joon_projects/work/caffe-confusion/python/caffe/pycaffe.py:13: RuntimeWarning: to-Python converter for boost::shared_ptr<caffe::Solver<float> > already registered; second conversion method ignored.
from ._caffe import Net, SGDSolver, NesterovSolver, AdaGradSolver, \
I0811 11:48:13.443912 4122 net.cpp:91] Creating Layer one
I0811 11:48:13.443972 4122 net.cpp:399] one -> one
*** Aborted at 1470908893 (unix time) try "date -d @1470908893" if you are using GNU date ***
PC: @ 0x7fdb072e0154 (unknown)
*** SIGSEGV (@0x100000049) received by PID 4122 (TID 0x7fdb130c1a40) from PID 73; stack trace: ***
@ 0x7fdb0629d8d0 (unknown)
@ 0x7fdb072e0154 (unknown)
@ 0x7fdb0773fa9f (unknown)
@ 0x7fdab0033d2f caffe::PythonLayer<>::LayerSetUp()
@ 0x7fdb0f42f960 caffe::Net<>::Init()
@ 0x7fdb0f432370 caffe::Net<>::Net()
@ 0x4079e1 test()
@ 0x405db3 main
@ 0x7fdb05f04b45 (unknown)
@ 0x4065f8 (unknown)
Segmentation fault
=======================================
The Python layer "one" here takes no input and produces the constant 1 as output -- just a simple example.
Does anyone have any idea what the problem could be, or which part of the Caffe code I should look at?
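For readers who want to reproduce this, a layer of this kind might look roughly like the sketch below. The module name util_python_layers, the class name ONE, and the batch_size parameter come from the prototxt above. The body is an assumption, not the exact file used here: in particular, the output shape (batch_size, 1) and reading the parameter from self.param_str (older pycaffe versions expose it as self.param_str_) are guesses. The fallback base class exists only so the sketch can be exercised without pycaffe installed.

```python
# util_python_layers.py -- hypothetical reconstruction of the "ONE" layer
import ast

try:
    import caffe
    LayerBase = caffe.Layer
except ImportError:
    # Assumption: fall back to a plain object so the sketch runs without pycaffe.
    LayerBase = object


class ONE(LayerBase):
    """Takes no bottom blobs and fills its single top blob with 1."""

    def setup(self, bottom, top):
        # param_str arrives as "{'batch_size': 1}"; parse it as a Python literal.
        params = ast.literal_eval(self.param_str)
        self.batch_size = int(params['batch_size'])

    def reshape(self, bottom, top):
        # Assumed output shape: one value per item in the batch.
        top[0].reshape(self.batch_size, 1)

    def forward(self, bottom, top):
        # Write the constant 1 into every element of the output.
        top[0].data[...] = 1

    def backward(self, top, propagate_down, bottom):
        # No inputs, so there is nothing to backpropagate.
        pass
```

Note that the trace above shows the crash inside caffe::PythonLayer<>::LayerSetUp(), so the fault is reached while this layer is being set up, before any forward pass.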
Thanks,
Joon