About an error message

MAX

Dec 29, 2020, 2:23:41 AM
to Neural Network Console Users (JP)
When I run training, the following error message appears and training stops.
How should I deal with this? (Environment: the Windows version of NNC, latest version.)
2020-12-29 15:51:40,852 Training process is started.
python "C:\Users\max\Desktop\neural_network_console\libs\Python\Lib\site-packages\nnabla\utils\cli\cli.py" train -c "C:\Users\max\Desktop\neural_network_console\samples\sample_project\image_recognition\ILSVRC2012\residual networks\resnet-50.files\20201229_155140\net.nntxt" -o "C:\Users\max\Desktop\neural_network_console\samples\sample_project\image_recognition\ILSVRC2012\residual networks\resnet-50.files\20201229_155140"
2020-12-29 15:51:48,811 [nnabla]: Train with contexts ['cpu', 'cuda', 'cudnn']
2020-12-29 15:51:48,827 [nnabla]: Training epoch 1 of 120 begin
Failed to allocate. Freeing memory cache and retrying.
Failed to allocate again.
2020-12-29 15:51:49,467 [nnabla]: An error occurred while executing backward of function Convolution_21_RepeatStart_3[2] (nn.ConvolutionCudaCudnn) in network Training
2020-12-29 15:51:49,467 [nnabla]: Network traceback:
2020-12-29 15:51:49,467 [nnabla]: BatchNormalization_22_RepeatStart_3[2]
2020-12-29 15:51:49,467 [nnabla]: Convolution_22_RepeatStart_3[2]
2020-12-29 15:51:49,467 [nnabla]: ReLU_18_RepeatStart_3[2]
2020-12-29 15:51:49,467 [nnabla]: BatchNormalization_21_RepeatStart_3[2]
2020-12-29 15:51:49,467 [nnabla]: ->Convolution_21_RepeatStart_3[2]
NNabla command line interface (Version:1.15.0.dev1, Build:201211124504)
Traceback (most recent call last):
  File "C:\Users\max\Desktop\neural_network_console\libs\Python\Lib\site-packages\nnabla\utils\cli\cli.py", line 141, in cli_main
    return_value = args.func(args)
  File "C:\Users\max\Desktop\neural_network_console\libs\Python\lib\site-packages\nnabla\utils\cli\train.py", line 649, in train_command
    result, restart = _train(args, config)
  File "C:\Users\max\Desktop\neural_network_console\libs\Python\lib\site-packages\nnabla\utils\cli\train.py", line 465, in _train
    cost = _update(iteration, config, cost)
  File "C:\Users\max\Desktop\neural_network_console\libs\Python\lib\site-packages\nnabla\utils\cli\train.py", line 210, in _update
    o.update_interval == 0)
  File "C:\Users\max\Desktop\neural_network_console\libs\Python\lib\site-packages\nnabla\utils\network.py", line 177, in backward
    self.backward_function(seq)
  File "C:\Users\max\Desktop\neural_network_console\libs\Python\lib\site-packages\nnabla\utils\network.py", line 187, in backward_function
    seq.func.variable_inputs, seq.func.variable_outputs, seq.accum_grad)
  File "function.pyx", line 214, in nnabla.function.Function.backward
RuntimeError: memory error in nbla::Memory::alloc
C:\a\_w\sDeepConsolePrototype\sDeepConsolePrototype\nnabla\src\nbla\memory\memory.cpp:38
Failed `this->alloc_impl()`: class nbla::CudaMemory allocation failed.

Kazuya Goto

Dec 29, 2020, 8:16:47 PM
to MAX, Neural Network Console Users (JP)
Judging from the error, it looks like the video memory (VRAM) is running out.
Are you running on the GPU?

If training proceeds when you try it with CPU execution, then the installed video card most likely does not have enough memory.
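
As a complementary check, it can help to see how much video memory the card has and how much is already in use. The following is only a rough sketch, assuming an NVIDIA card with the `nvidia-smi` tool on the PATH; the helper names `parse_gpu_memory` and `query_gpu_memory` are made up for this example and are not part of NNC or nnabla.

```python
import subprocess


def parse_gpu_memory(csv_text):
    """Parse lines of 'used, total' (in MiB), one line per GPU."""
    gpus = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field) for field in line.split(","))
        gpus.append({"used_mib": used, "total_mib": total})
    return gpus


def query_gpu_memory():
    """Ask nvidia-smi (assumed to be installed) for per-GPU memory usage."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_memory(result.stdout)
```

Calling `query_gpu_memory()` just before starting training shows how much memory is free on each GPU. If the used figure is already close to the total, other applications may be holding part of the video memory.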

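If you want to run it on that video card, you will probably need to lower the computational cost, for example by simplifying the layer configuration or reducing the number of output channels.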
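
To give a feel for why the number of output channels matters, here is a back-of-the-envelope estimate of the memory needed to hold a single convolution layer's output. It is a sketch under stated assumptions: float32 values (4 bytes each), and illustrative shapes (batch size 64, 56x56 feature map) that were not read from the resnet-50 project in this thread. The function names are hypothetical.

```python
def activation_bytes(batch, channels, height, width, bytes_per_value=4):
    """Bytes needed to store one layer's output tensor (float32 by default)."""
    return batch * channels * height * width * bytes_per_value


def to_mib(num_bytes):
    """Convert bytes to MiB."""
    return num_bytes / (1024 ** 2)


# Illustrative shapes only; these are assumptions, not values from the log.
full = activation_bytes(batch=64, channels=256, height=56, width=56)
half = activation_bytes(batch=64, channels=128, height=56, width=56)

print(f"256 output channels: {to_mib(full):.0f} MiB")  # 196 MiB
print(f"128 output channels: {to_mib(half):.0f} MiB")  # 98 MiB
```

Halving the output channels halves this figure. The real footprint during training is larger than a single layer's output, since outputs of many layers and their gradients are kept at the same time, so this only shows the trend, not the actual requirement.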

