RuntimeError: cuda runtime error (59) : device-side assert triggered at /opt/conda/conda-bld/pytorch_1535491974311/work/aten/src/THC/generated/../THCReduceAll.cuh:317 terminate called after throwing an instance of 'at::Error' #50
Comments
Which dataset and model did you use?
I chose several images from Cityscapes to check whether training would start successfully.
How did you fix this? I am facing the same problem. @YijianLiu
@meanmee Are you using a custom dataloader?
Yes. I have now fixed this by modifying some code in root/lib/data/cityscape.py.
I had the same problem. In my case, the masks contained values bigger than the number of classes.
@tomaszkaliciak Same for me.
I guess some class labels are larger than (num_classes-1). You can print the offending labels with print(label[label > (num_classes-1)]).
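For anyone who wants to run that check on a whole mask, here is a minimal sketch of it in NumPy (the same boolean indexing also works on a torch tensor). The helper name `find_invalid_labels` and the ignore value 255 are assumptions for illustration, not code from this repository. It also looks for negative values, since the failing assertion is `t >= 0 && t < n_classes`.

```python
import numpy as np


def find_invalid_labels(label, num_classes, ignore_label=255):
    """Return the sorted unique values outside [0, num_classes - 1].

    `ignore_label` is left out of the report, on the assumption that the
    loss is configured to skip it (255 is an assumed value; pass None to
    report every out-of-range value).
    """
    label = np.asarray(label)
    out_of_range = (label < 0) | (label > num_classes - 1)
    if ignore_label is not None:
        out_of_range &= label != ignore_label
    return np.unique(label[out_of_range])


# Toy 2x3 mask for a 3-class problem: 7 and -1 are invalid, 255 is ignored.
mask = np.array([[0, 1, 2], [7, 255, -1]])
bad = find_invalid_labels(mask, num_classes=3)
print(bad)
```

Running this over every mask the dataloader produces, after any label conversion, should show whether an unexpected value reaches the loss.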
Could you please share what was changed?
You need to change some code in root/lib/data/cityscape.py, including self.label_mapping and self.class_weights.
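A rough sketch of what such a change could look like for a custom dataset. This is not the repository's actual code: the mapping, the three-class setup, the ignore value 255, and the function name `convert_label` are all hypothetical. The idea is that every raw ID either maps to a training ID in 0..num_classes-1 or to the ignore value, and that the class weights have one entry per class.

```python
import numpy as np

IGNORE_LABEL = 255  # assumed ignore value, must match the loss's ignore index

# Hypothetical mapping from raw dataset IDs to contiguous training IDs.
LABEL_MAPPING = {0: IGNORE_LABEL, 7: 0, 8: 1, 11: 2}
NUM_CLASSES = 3
CLASS_WEIGHTS = [1.0, 1.0, 1.0]  # one weight per training class


def convert_label(label, mapping, ignore_label=IGNORE_LABEL):
    """Map raw IDs to training IDs; any unmapped ID becomes ignore_label."""
    label = np.asarray(label)
    converted = np.full_like(label, ignore_label)
    for raw_id, train_id in mapping.items():
        converted[label == raw_id] = train_id
    return converted


raw = np.array([[7, 8, 11], [0, 33, 7]])  # 33 is not in the mapping
train = convert_label(raw, LABEL_MAPPING)

# Sanity checks: valid pixels lie in [0, NUM_CLASSES - 1], weights match.
valid = train[train != IGNORE_LABEL]
assert valid.min() >= 0 and valid.max() < NUM_CLASSES
assert len(CLASS_WEIGHTS) == NUM_CLASSES
```

Starting from an array filled with the ignore value means an ID that was forgotten in the mapping is skipped by the loss instead of triggering the assert.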
I hit an error and I really don't know how to solve it. Help! Someone said, "Maybe your labels fall outside 0 to n-1." But my labels go from 0 to n-1! I need your help. Thanks!
/opt/conda/conda-bld/pytorch_1535491974311/work/aten/src/THCUNN/SpatialClassNLLCriterion.cu:99: void cunn_SpatialClassNLLCriterion_updateOutput_kernel(T *, T *, T *, long *, T *, int, int, int, int, int, long) [with T = float, AccumT = float]: block: [3,0,0], thread: [574,0,0] Assertion `t >= 0 && t < n_classes` failed.
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1535491974311/work/aten/src/THC/generated/../THCReduceAll.cuh line=317 error=59 : device-side assert triggered
Traceback (most recent call last):
  File "/home/cartur/HRNet-Semantic-Segmentation/tools/train.py", line 251, in <module>
    main()
  File "/home/cartur/HRNet-Semantic-Segmentation/tools/train.py", line 220, in main
    trainloader, optimizer, model, writer_dict)
  File "/home/cartur/HRNet-Semantic-Segmentation/tools/../lib/core/function.py", line 46, in train
    loss = losses.mean()
RuntimeError: cuda runtime error (59) : device-side assert triggered at /opt/conda/conda-bld/pytorch_1535491974311/work/aten/src/THC/generated/../THCReduceAll.cuh:317
terminate called after throwing an instance of 'at::Error'
what(): CUDA error: invalid device pointer (CudaCachingDeleter at /opt/conda/conda-bld/pytorch_1535491974311/work/aten/src/THC/THCCachingAllocator.cpp:498)
frame #0: THStorage_free + 0x44 (0x7fd7638cf314 in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/lib/libcaffe2.so)
frame #1: THTensor_free + 0x2f (0x7fd76396ea1f in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/lib/libcaffe2.so)
frame #2: at::CUDAFloatTensor::~CUDAFloatTensor() + 0x9 (0x7fd7404d2a59 in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/lib/libcaffe2_gpu.so)
frame #3: torch::autograd::generated::CudnnConvolutionBackward::~CudnnConvolutionBackward() + 0x5d (0x7fd7656d1e7d in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #4: torch::autograd::deleteFunction(torch::autograd::Function*) + 0x47 (0x7fd7654c35d7 in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #5: std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() + 0x45 (0x7fd7650f0225 in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #6: torch::autograd::Function::~Function() + 0xfe (0x7fd7651be2ce in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #7: + 0x7674a2 (0x7fd7654d44a2 in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #8: + 0x19aa5e (0x55e733ac1a5e in /home/cartur/.conda/envs/CenterNet_last/bin/python)
frame #9: std::_Sp_counted_deleter<torch::autograd::PyFunction*, Decref, std::allocator, (__gnu_cxx::_Lock_policy)2>::_M_dispose() + 0x2e (0x7fd7654d64fe in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #10: std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() + 0x45 (0x7fd7650f0225 in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #11: torch::autograd::Function::~Function() + 0xfe (0x7fd7651be2ce in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #12: torch::autograd::generated::ThresholdBackward0::~ThresholdBackward0() + 0x62 (0x7fd7656d0ed2 in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #13: torch::autograd::deleteFunction(torch::autograd::Function*) + 0x47 (0x7fd7654c35d7 in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #14: std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() + 0x45 (0x7fd7650f0225 in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #15: torch::autograd::Function::~Function() + 0xfe (0x7fd7651be2ce in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #16: torch::autograd::generated::CudnnConvolutionBackward::~CudnnConvolutionBackward() + 0x73 (0x7fd7656d1e93 in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #17: torch::autograd::deleteFunction(torch::autograd::Function*) + 0x47 (0x7fd7654c35d7 in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #18: std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() + 0x45 (0x7fd7650f0225 in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #19: torch::autograd::Function::~Function() + 0xfe (0x7fd7651be2ce in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #20: + 0x7674a2 (0x7fd7654d44a2 in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #21: + 0x19aa5e (0x55e733ac1a5e in /home/cartur/.conda/envs/CenterNet_last/bin/python)
frame #22: std::_Sp_counted_deleter<torch::autograd::PyFunction*, Decref, std::allocator, (__gnu_cxx::_Lock_policy)2>::_M_dispose() + 0x2e (0x7fd7654d64fe in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #23: std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() + 0x45 (0x7fd7650f0225 in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #24: torch::autograd::Function::~Function() + 0xfe (0x7fd7651be2ce in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #25: torch::autograd::generated::ThresholdBackward0::~ThresholdBackward0() + 0x62 (0x7fd7656d0ed2 in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #26: torch::autograd::deleteFunction(torch::autograd::Function*) + 0x47 (0x7fd7654c35d7 in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #27: std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() + 0x45 (0x7fd7650f0225 in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #28: torch::autograd::Function::~Function() + 0xfe (0x7fd7651be2ce in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #29: torch::autograd::generated::CudnnConvolutionBackward::~CudnnConvolutionBackward() + 0x73 (0x7fd7656d1e93 in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #30: torch::autograd::deleteFunction(torch::autograd::Function*) + 0x47 (0x7fd7654c35d7 in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #31: std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() + 0x45 (0x7fd7650f0225 in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #32: torch::autograd::Function::~Function() + 0xfe (0x7fd7651be2ce in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #33: + 0x7674a2 (0x7fd7654d44a2 in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #34: + 0x19aa5e (0x55e733ac1a5e in /home/cartur/.conda/envs/CenterNet_last/bin/python)
frame #35: std::_Sp_counted_deleter<torch::autograd::PyFunction*, Decref, std::allocator, (__gnu_cxx::_Lock_policy)2>::_M_dispose() + 0x2e (0x7fd7654d64fe in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #36: std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() + 0x45 (0x7fd7650f0225 in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #37: torch::autograd::Function::~Function() + 0xfe (0x7fd7651be2ce in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #38: torch::autograd::generated::ThAddBackward::~ThAddBackward() + 0x3d (0x7fd7656ce8bd in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #39: torch::autograd::deleteFunction(torch::autograd::Function*) + 0x47 (0x7fd7654c35d7 in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #40: std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() + 0x45 (0x7fd7650f0225 in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #41: torch::autograd::Function::~Function() + 0xfe (0x7fd7651be2ce in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #42: torch::autograd::generated::ThresholdBackward0::~ThresholdBackward0() + 0x62 (0x7fd7656d0ed2 in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #43: torch::autograd::deleteFunction(torch::autograd::Function*) + 0x47 (0x7fd7654c35d7 in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #44: std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() + 0x45 (0x7fd7650f0225 in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #45: torch::autograd::Function::~Function() + 0xfe (0x7fd7651be2ce in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #46: torch::autograd::generated::ThAddBackward::~ThAddBackward() + 0x3d (0x7fd7656ce8bd in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #47: torch::autograd::deleteFunction(torch::autograd::Function*) + 0x47 (0x7fd7654c35d7 in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #48: std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() + 0x45 (0x7fd7650f0225 in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #49: torch::autograd::Function::~Function() + 0xfe (0x7fd7651be2ce in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #50: torch::autograd::generated::ThresholdBackward0::~ThresholdBackward0() + 0x62 (0x7fd7656d0ed2 in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #51: torch::autograd::deleteFunction(torch::autograd::Function*) + 0x47 (0x7fd7654c35d7 in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #52: std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() + 0x45 (0x7fd7650f0225 in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #53: torch::autograd::Function::~Function() + 0xfe (0x7fd7651be2ce in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #54: torch::autograd::generated::ThAddBackward::~ThAddBackward() + 0x3d (0x7fd7656ce8bd in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #55: torch::autograd::deleteFunction(torch::autograd::Function*) + 0x47 (0x7fd7654c35d7 in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #56: std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() + 0x45 (0x7fd7650f0225 in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #57: torch::autograd::Function::~Function() + 0xfe (0x7fd7651be2ce in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #58: torch::autograd::generated::ThresholdBackward0::~ThresholdBackward0() + 0x62 (0x7fd7656d0ed2 in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #59: torch::autograd::deleteFunction(torch::autograd::Function*) + 0x47 (0x7fd7654c35d7 in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #60: std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() + 0x45 (0x7fd7650f0225 in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #61: torch::autograd::Function::~Function() + 0xfe (0x7fd7651be2ce in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #62: torch::autograd::generated::ThAddBackward::~ThAddBackward() + 0x3d (0x7fd7656ce8bd in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #63: torch::autograd::deleteFunction(torch::autograd::Function*) + 0x47 (0x7fd7654c35d7 in /home/cartur/.conda/envs/CenterNet_lyj/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)