loss: nan, iter:1/455(1, 1.076s) #8

Open
thuxugang opened this issue Nov 15, 2016 · 4 comments

Comments

@thuxugang

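Hello, why is the CTC loss NaN from the very first iteration when I run your program? I would appreciate your guidance. Thank you very much!
Here is the output: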
Using gpu device 0: GeForce GTX 980 Ti (CNMeM is disabled, cuDNN not available)
C:\Anaconda\lib\site-packages\theano\tensor\signal\downsample.py:6: UserWarning: downsample module has been moved to the theano.tensor.signal.pool module.
"downsample module has been moved to the theano.tensor.signal.pool module.")
loaded 29143 samples from D:\xugang\OCR\cnn-lstm-ctc-master\dataset\english_sentence\train_img_list.txt
loaded 2914 samples from D:\xugang\OCR\cnn-lstm-ctc-master\dataset\english_sentence\val_img_list.txt
building symbolic tensors(0.0799999237061)
setting parameters(0.0799999237061)
('n_classes: ', 95)
('multi-step: ', set([79625, 68250, 45500]))
building the model(0.0799999237061)
computing updates and function(0.240000009537)
using normal sgd and learning_rate:0.00999999977648
('bw_lstm_b', <class 'theano.sandbox.cuda.var.CudaNdarraySharedVariable'>)
('fw_lstm_W', <class 'theano.sandbox.cuda.var.CudaNdarraySharedVariable'>)
('fw_lstm_U', <class 'theano.sandbox.cuda.var.CudaNdarraySharedVariable'>)
('fw_lstm_b', <class 'theano.sandbox.cuda.var.CudaNdarraySharedVariable'>)
('bw_lstm_W', <class 'theano.sandbox.cuda.var.CudaNdarraySharedVariable'>)
('bw_lstm_U', <class 'theano.sandbox.cuda.var.CudaNdarraySharedVariable'>)
('hidden_b', <class 'theano.sandbox.cuda.var.CudaNdarraySharedVariable'>)
('hidden_W', <class 'theano.sandbox.cuda.var.CudaNdarraySharedVariable'>)
building training function(1.78999996185)
building validating function(29.6099998951)
begin to train(32.8609998226)
.epoch 1/200 begin(32.861)
[prefetch]height: 28, x_max_step:141.0, y_max_width:50
D:\xugang\OCR\cnn-lstm-ctc-master - 1.0\layers\utee.py:137: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
x = np.zeros((batch_size, 1, height, x_max_len)). astype(config.floatX)
D:\xugang\OCR\cnn-lstm-ctc-master - 1.0\layers\utee.py:138: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
x_mask = np.zeros((batch_size, x_max_len)).astype(config.floatX)
..loss: nan, iter:1/455(1, 1.076s)
..detect nan
..loss: nan, iter:1/455(1.076)
Traceback (most recent call last):
File "D:\xugang\OCR\cnn-lstm-ctc-master - 1.0\train.py", line 150, in
sys.exit()
SystemExit
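
The two VisibleDeprecationWarning lines in the log look like a separate matter from the NaN: a shape argument reaches np.zeros as a float, most likely x_max_len, since the prefetch line prints x_max_step:141.0. A minimal sketch of one way to avoid the warning, assuming the value is always a whole number. `padded_shapes` is a hypothetical helper and not part of the repository, and the batch size of 4 is made up for the example:

```python
def padded_shapes(batch_size, height, x_max_len):
    """Return integer shapes for the padded image batch and its mask."""
    as_int = int(x_max_len)
    # Refuse values like 141.5 rather than silently truncating them.
    if as_int != x_max_len:
        raise ValueError("x_max_len must be a whole number, got %r" % (x_max_len,))
    return (batch_size, 1, height, as_int), (batch_size, as_int)


# Values taken from the log above, except batch_size, which is assumed.
x_shape, mask_shape = padded_shapes(batch_size=4, height=28, x_max_len=141.0)
print(x_shape, mask_shape)
```

The resulting tuples could then be passed to the np.zeros(...) calls at utee.py lines 137 and 138 in place of the float-valued shapes.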

@aaron-xichen
Owner

Maybe you can try this
[global]
device=gpu0
floatX=float32
Btw, please remember to back up your original .theanorc
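
One way to try these two settings without editing the existing .theanorc is the THEANO_FLAGS environment variable, which Theano reads when it is imported and which takes precedence over .theanorc. A minimal sketch, assuming it runs before the first `import theano`. The helper name `theano_flags` is made up for illustration:

```python
import os


def theano_flags(**settings):
    """Build a THEANO_FLAGS string such as 'device=gpu0,floatX=float32'."""
    # THEANO_FLAGS is a comma-separated list of key=value pairs.
    return ",".join("%s=%s" % (key, value) for key, value in sorted(settings.items()))


flags = theano_flags(device="gpu0", floatX="float32")
# Must be set before theano is imported, otherwise it has no effect.
os.environ["THEANO_FLAGS"] = flags
print(flags)
```

Because the variable only lives in the current process, the original .theanorc stays untouched and no backup is needed for a quick test.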

@thuxugang
Author

Thanks for your guidance. I just tried it and it still does not work. Does it work on your side?
My configuration is:
[global]
openmp = False
device = gpu
floatX = float32
allow_input_downcast=True
[blas]
ldflags =
[gcc]
cxxflags = -IC:\Anaconda\MinGW
[nvcc]
flags = -LC:\Anaconda\libs
compiler_bindir = C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin
fastmath = True

@thuxugang
Author

By the way, I am using OpenCV 2.4.11. Could that have any effect? Thanks.

@aaron-xichen
Owner

Sorry for the late reply. Please set fastmath = False in the [nvcc] section and try again.
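
For context, Theano's nvcc.fastmath option passes -use_fast_math to nvcc, which trades floating-point precision for speed. That may be why it matters for a CTC loss, though the thread does not confirm the cause. The change can be made by hand in .theanorc. A minimal sketch of doing it programmatically, assuming Python 3's configparser run as a standalone helper separate from the training script. The config text is a trimmed copy of the one posted above, and `disable_fastmath` is a hypothetical name:

```python
import configparser
import io

# Trimmed copy of the .theanorc posted above (illustration only).
THEANORC = """\
[global]
device = gpu
floatX = float32

[nvcc]
fastmath = True
"""


def disable_fastmath(text):
    """Return the config text with [nvcc] fastmath set to False."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names such as floatX case-sensitive
    parser.read_string(text)
    if not parser.has_section("nvcc"):
        parser.add_section("nvcc")
    parser.set("nvcc", "fastmath", "False")
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


updated = disable_fastmath(THEANORC)
print(updated)
```

Writing `updated` back to disk is left out on purpose, so the original file can be backed up first.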
