I have a network that predicts segmentation labels. It takes a 256x256 RGB input and outputs a segmentation map of size 256x256x16, where each of the 16 channels corresponds to one class label. I am using cudnn.SpatialCrossEntropyCriterion with a batch size of 4. The network's forward pass works, but the criterion's forward pass fails with the following:
/var/scratch/pdas/torch/extra/cunn/lib/THCUNN/SpatialClassNLLCriterion.cu:38: void cunn_SpatialClassNLLCriterion_updateOutput_kernel(T *, T *, T *, long *, T *, int, int, int, int, int) [with T = float, AccumT = float]: block: [3,0,0], thread: [823,0,0] Assertion `t >= 0 && t < n_classes` failed.
/var/scratch/pdas/torch/extra/cunn/lib/THCUNN/SpatialClassNLLCriterion.cu:38: void cunn_SpatialClassNLLCriterion_updateOutput_kernel(T *, T *, T *, long *, T *, int, int, int, int, int) [with T = float, AccumT = float]: block: [3,0,0], thread: [824,0,0] Assertion `t >= 0 && t < n_classes` failed.
/var/scratch/pdas/torch/extra/cunn/lib/THCUNN/SpatialClassNLLCriterion.cu:38: void cunn_SpatialClassNLLCriterion_updateOutput_kernel(T *, T *, T *, long *, T *, int, int, int, int, int) [with T = float, AccumT = float]: block: [3,0,0], thread: [825,0,0] Assertion `t >= 0 && t < n_classes` failed.
/var/scratch/pdas/torch/extra/cunn/lib/THCUNN/SpatialClassNLLCriterion.cu:38: void cunn_SpatialClassNLLCriterion_updateOutput_kernel(T *, T *, T *, long *, T *, int, int, int, int, int) [with T = float, AccumT = float]: block: [3,0,0], thread: [826,0,0] Assertion `t >= 0 && t < n_classes` failed.
/var/scratch/pdas/torch/extra/cunn/lib/THCUNN/SpatialClassNLLCriterion.cu:38: void cunn_SpatialClassNLLCriterion_updateOutput_kernel(T *, T *, T *, long *, T *, int, int, int, int, int) [with T = float, AccumT = float]: block: [3,0,0], thread: [827,0,0] Assertion `t >= 0 && t < n_classes` failed.
/var/scratch/pdas/torch/extra/cunn/lib/THCUNN/SpatialClassNLLCriterion.cu:38: void cunn_SpatialClassNLLCriterion_updateOutput_kernel(T *, T *, T *, long *, T *, int, int, int, int, int) [with T = float, AccumT = float]: block: [3,0,0], thread: [828,0,0] Assertion `t >= 0 && t < n_classes` failed.
/var/scratch/pdas/torch/extra/cunn/lib/THCUNN/SpatialClassNLLCriterion.cu:38: void cunn_SpatialClassNLLCriterion_updateOutput_kernel(T *, T *, T *, long *, T *, int, int, int, int, int) [with T = float, AccumT = float]: block: [3,0,0], thread: [829,0,0] Assertion `t >= 0 && t < n_classes` failed.
/var/scratch/pdas/torch/extra/cunn/lib/THCUNN/SpatialClassNLLCriterion.cu:38: void cunn_SpatialClassNLLCriterion_updateOutput_kernel(T *, T *, T *, long *, T *, int, int, int, int, int) [with T = float, AccumT = float]: block: [3,0,0], thread: [830,0,0] Assertion `t >= 0 && t < n_classes` failed.
/var/scratch/pdas/torch/extra/cunn/lib/THCUNN/SpatialClassNLLCriterion.cu:38: void cunn_SpatialClassNLLCriterion_updateOutput_kernel(T *, T *, T *, long *, T *, int, int, int, int, int) [with T = float, AccumT = float]: block: [3,0,0], thread: [831,0,0] Assertion `t >= 0 && t < n_classes` failed.
THCudaCheck FAIL file=/var/scratch/pdas/torch/extra/cutorch/lib/THC/generic/THCStorage.c line=32 error=59 : device-side assert triggered
/var/scratch/pdas/torch/install/bin/luajit: cuda runtime error (59) : device-side assert triggered at /var/scratch/pdas/torch/extra/cutorch/lib/THC/generic/THCStorage.c:32
stack traceback:
[C]: at 0x2aaab6564c20
[C]: in function '__index'
.../.luarocks/share/lua/5.1/nn/SpatialClassNLLCriterion.lua:51: in function 'updateOutput'
/home/pdas/.luarocks/share/lua/5.1/nn/MultiCriterion.lua:21: in function 'forward'
./segCriterion.lua:156: in function 'forward'
Train.lua:245: in main chunk
[C]: in function 'dofile'
...pdas/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00406540
The input to the criterion is 4x16x256x256 and the target is 4x256x256, where 4 is the batch size. The spatial criterion isn't documented either, so I can't find any information about the error. Does anyone know how to fix this?
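One thing that can be checked from the shapes above is whether every target label lies in the range the criterion accepts. As far as I understand, Torch's ClassNLL-style criteria expect 1-based class indices (1 to 16 here), and the assertion `t >= 0 && t < n_classes` is evaluated after shifting to 0-based, so a label of 0 or anything above 16 would trigger it. Below is a minimal sketch of that check in plain Lua. `count_out_of_range` is a hypothetical helper, not part of torch or nn, and the small `labels` table is made-up data standing in for the flattened 4x256x256 target.

```lua
-- Hypothetical helper (not part of torch/nn): scans a flat list of labels and
-- returns how many fall outside [1, n_classes], plus the min and max seen.
function count_out_of_range(labels, n_classes)
  local bad, lo, hi = 0, math.huge, -math.huge
  for _, t in ipairs(labels) do
    if t < lo then lo = t end
    if t > hi then hi = t end
    -- Assumption: class indices are 1-based, so valid labels are 1..n_classes.
    if t < 1 or t > n_classes then bad = bad + 1 end
  end
  return bad, lo, hi
end

-- Made-up labels standing in for the flattened target tensor.
local labels = {1, 16, 0, 17, 8}
local bad, lo, hi = count_out_of_range(labels, 16)
print(bad, lo, hi)
```

On the real tensor, the same information should be available by printing `target:min()` and `target:max()` just before the criterion's forward call. If the minimum is 0, the labels are probably 0-based and would need to be shifted by one.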
Thank you.