I0218 00:38:05.609284 1490 solver.cpp:224] Learning Rate Policy: step
I0218 00:38:05.609299 1490 solver.cpp:267] Iteration 0, Testing net (#0)
I0218 00:55:57.192952 1490 solver.cpp:318] Test net output #0: loss = 299034 (* 1 = 299034 loss)
F0218 00:55:57.284987 1490 im2col.cu:59] Check failed: error == cudaSuccess (77 vs. 0) an illegal memory access was encountered
*** Check failure stack trace: ***
@ 0x7f2fbf6c3b8d google::LogMessage::Fail()
@ 0x7f2fbf6c5c8f google::LogMessage::SendToLog()
@ 0x7f2fbf6c377c google::LogMessage::Flush()
@ 0x7f2fbf6c652d google::LogMessageFatal::~LogMessageFatal()
@ 0x56b4c9 caffe::im2col_gpu<>()
@ 0x563bb9 caffe::ConvolutionLayer<>::Forward_gpu()
@ 0x52188f caffe::Net<>::ForwardFromTo()
@ 0x521c1f caffe::Net<>::ForwardPrefilled()
@ 0x53c400 caffe::Solver<>::Step()
@ 0x53cea7 caffe::Solver<>::Solve()
@ 0x4172d8 train()
@ 0x41175b main
@ 0x7f2fbc95bec5 (unknown)
@ 0x415a47 (unknown)
Aborted (core dumped)
F0226 14:05:57.918788 17788 base_data_layer.cu:25] Check failed: error == cudaSuccess (77 vs. 0) an illegal memory access was encountered
*** Check failure stack trace: ***
@ 0x7ffff692edaa (unknown)
@ 0x7ffff692ece4 (unknown)
@ 0x7ffff692e6e6 (unknown)
@ 0x7ffff6931687 (unknown)
@ 0x7ffff7221be7 caffe::BasePrefetchingDataLayer<>::Forward_gpu()
@ 0x419b28 caffe::Layer<>::Forward()
@ 0x7ffff7118102 caffe::Net<>::ForwardFromTo()
@ 0x7ffff7117ea1 caffe::Net<>::ForwardPrefilled()
@ 0x7ffff71182a4 caffe::Net<>::Forward()
@ 0x7ffff70baeb3 caffe::Net<>::ForwardBackward()
@ 0x7ffff70a8825 caffe::Solver<>::Step()
@ 0x7ffff70a821b caffe::Solver<>::Solve()
@ 0x414e36 train()
@ 0x416da6 main
@ 0x7ffff5e40ec5 (unknown)
@ 0x413c09 (unknown)
@ (nil) (unknown)
Program received signal SIGABRT, Aborted.
0x00007ffff5e55cc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 0x00007ffff5e55cc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007ffff5e590d8 in __GI_abort () at abort.c:89
#2 0x00007ffff6936ec3 in ?? () from /usr/lib/x86_64-linux-gnu/libglog.so.0
#3 0x00007ffff692edaa in google::LogMessage::Fail() () from /usr/lib/x86_64-linux-gnu/libglog.so.0
#4 0x00007ffff692ece4 in google::LogMessage::SendToLog() () from /usr/lib/x86_64-linux-gnu/libglog.so.0
#5 0x00007ffff692e6e6 in google::LogMessage::Flush() () from /usr/lib/x86_64-linux-gnu/libglog.so.0
#6 0x00007ffff6931687 in google::LogMessageFatal::~LogMessageFatal() () from /usr/lib/x86_64-linux-gnu/libglog.so.0
#7 0x00007ffff7221be7 in caffe::BasePrefetchingDataLayer<float>::Forward_gpu (this=0x4d1a700, bottom=std::vector of length 0, capacity 0,
top=std::vector of length 1, capacity 1 = {...}) at src/caffe/layers/base_data_layer.cu:25
#8 0x0000000000419b28 in caffe::Layer<float>::Forward (this=0x4d1a700, bottom=std::vector of length 0, capacity 0, top=std::vector of length 1, capacity 1 = {...})
at ./include/caffe/layer.hpp:486
#9 0x00007ffff7118102 in caffe::Net<float>::ForwardFromTo (this=0x49a1af0, start=0, end=69) at src/caffe/net.cpp:600
#10 0x00007ffff7117ea1 in caffe::Net<float>::ForwardPrefilled (this=0x49a1af0, loss=0x7fffffffdc9c) at src/caffe/net.cpp:620
#11 0x00007ffff71182a4 in caffe::Net<float>::Forward (this=0x49a1af0, bottom=std::vector of length 0, capacity 0, loss=0x7fffffffdc9c) at src/caffe/net.cpp:634
#12 0x00007ffff70baeb3 in caffe::Net<float>::ForwardBackward (this=0x49a1af0, bottom=std::vector of length 0, capacity 0) at ./include/caffe/net.hpp:87
#13 0x00007ffff70a8825 in caffe::Solver<float>::Step (this=0x7100b0, iters=100000000) at src/caffe/solver.cpp:228
#14 0x00007ffff70a821b in caffe::Solver<float>::Solve (this=0x7100b0, resume_file=0x0) at src/caffe/solver.cpp:306
#15 0x0000000000414e36 in train () at tools/caffe.cpp:212
#16 0x0000000000416da6 in main (argc=2, argv=0x7fffffffe528) at tools/caffe.cpp:394
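Hi, I got the same error after 640 iterations. In the end I figured out that the problem was the number of outputs of the last layer of the network. I was using PASCAL-Context with 33 labels, but because of the background class the number of outputs should be 34.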
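For anyone hitting the same thing, here is a minimal sketch of the check I would run on the label data before training. It only assumes that every label value must lie in [0, num_output), so that 33 classes plus background need num_output = 34. The helper name `find_out_of_range_labels` and the sample label list are made up for illustration and are not part of Caffe. A label outside that range may make the GPU loss layer index past the end of its buffer, which would be consistent with the "illegal memory access" above, though I have not verified that this is the only cause.

```python
def find_out_of_range_labels(label_values, num_output, ignore_label=None):
    """Return the sorted distinct label values that fall outside [0, num_output).

    label_values: any iterable of integer labels (e.g. a flattened label map).
    num_output:   number of outputs of the last layer of the network.
    ignore_label: optional value to skip (assumption: you have configured the
                  loss layer to ignore this value).
    """
    bad = set()
    for value in label_values:
        if ignore_label is not None and value == ignore_label:
            continue
        if value < 0 or value >= num_output:
            bad.add(value)
    return sorted(bad)


# Hypothetical flattened label map: 33 object classes plus background,
# i.e. label values 0..33 inclusive (assuming background is one extra label).
labels = list(range(34)) * 10

print(find_out_of_range_labels(labels, num_output=33))  # [33]
print(find_out_of_range_labels(labels, num_output=34))  # []
```

If the first call returns anything, either raise num_output on the last layer or remap the labels before building the dataset.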
On Tuesday, May 5, 2015 at 7:41:58 PM UTC+10, Yoann wrote:
Me too. Do you know how to solve this problem?
F0505 11:34:24.536881 2253 math_functions.cpp:91] Check failed: error == cudaSuccess (77 vs. 0) an illegal memory access was encountered
*** Check failure stack trace: ***
@ 0x7f5d50d40b7d google::LogMessage::Fail()
@ 0x7f5d50d42c7f google::LogMessage::SendToLog()
@ 0x7f5d50d4076c google::LogMessage::Flush()
@ 0x7f5d50d4351d google::LogMessageFatal::~LogMessageFatal()
@ 0x494cc8 caffe::caffe_copy<>()
@ 0x4e98ee caffe::BasePrefetchingDataLayer<>::Forward_gpu()
@ 0x47d4df caffe::Net<>::ForwardFromTo()
@ 0x47d81f caffe::Net<>::ForwardPrefilled()
@ 0x4709ba caffe::Solver<>::Solve()
@ 0x424ed9 train()
@ 0x41ebdb main
@ 0x7f5d4cd6476d (unknown)
@ 0x4225d9 (unknown)
On Tuesday, May 5, 2015 at 11:21:09 AM UTC+2, Ting Lee wrote:
Same here!
On Monday, October 13, 2014 at 3:53:51 PM UTC+2, Nick Carlevaris-Bianco wrote:
I am getting a strange error that occurs somewhat randomly during training. The network will usually train successfully for many iterations, but then I get the following error:
F1013 09:33:06.971670 4890 math_functions.cpp:91] Check failed: error == cudaSuccess (77 vs. 0) an illegal memory access was encountered
*** Check failure stack trace: ***
@ 0x7ffff2577b9d google::LogMessage::Fail()
@ 0x7ffff2579c9f google::LogMessage::SendToLog()
@ 0x7ffff257778c google::LogMessage::Flush()
@ 0x7ffff257a53d google::LogMessageFatal::~LogMessageFatal()
@ 0x4f3b98 caffe::caffe_copy<>()
@ 0x548a4e caffe::BasePrefetchingDataLayer<>::Forward_gpu()
@ 0x537c3f caffe::Net<>::ForwardFromTo()
@ 0x537f7f caffe::Net<>::ForwardPrefilled()
@ 0x5144de caffe::Solver<>::Solve()
@ 0x4292e9 train()
@ 0x422ddb main
@ 0x7fffee26f76d (unknown)
@ 0x4269ed (unknown)
The model is small and uses only a few hundred MB on a 4GB card, so I don't think the card is running out of memory. Has anyone encountered a similar error?