Unable to Use Padded Sequences in RNNs with ROCm and TensorFlow #2796

Open
MrYoavon opened this issue Dec 31, 2024 · 2 comments
@MrYoavon
Issue type

Bug

Have you reproduced the bug with TensorFlow Nightly?

Yes

Source

binary

TensorFlow version

2.17

Custom code

Yes

OS platform and distribution

Linux Ubuntu 24.04

Mobile device

No response

Python version

3.10

Bazel version

No response

GCC/compiler version

No response

CUDA/cuDNN version

No response

GPU model and memory

No response

Current behavior?

The code below is a minimal reproduction of the problem in my real code, which is much more complex. I can't use any type of RNN (I'm currently trying LSTM layers) with my data. The data consists of video sequences, each slightly shorter than 160 frames. I pad everything to 160 frames so the model receives a consistently shaped input, but the LSTM layers won't accept data that is padded and masked.
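To make "padded and masked" concrete, here is a minimal NumPy sketch of right-padding variable-length sequences and reducing the resulting mask to one length per sequence. The helper names (`pad_and_mask`, `lengths_from_mask`, `is_right_padded`) are hypothetical and are not part of TensorFlow, Keras, or ROCm. It uses integer token IDs like the reproduction script rather than video frames, and it only illustrates the data layout, not how any kernel consumes it.

```python
import numpy as np


def pad_and_mask(sequences, max_len, pad_value=0):
    """Right-pad integer sequences to max_len and return (batch, mask)."""
    batch = np.full((len(sequences), max_len), pad_value, dtype=np.int64)
    mask = np.zeros((len(sequences), max_len), dtype=bool)
    for i, seq in enumerate(sequences):
        n = min(len(seq), max_len)  # truncate anything longer than max_len
        batch[i, :n] = seq[:n]
        mask[i, :n] = True  # True marks real (non-padding) timesteps
    return batch, mask


def lengths_from_mask(mask):
    """Count the real timesteps in each row of a boolean mask."""
    return mask.sum(axis=1)


def is_right_padded(mask):
    """Return True if every row is real timesteps followed only by padding."""
    lengths = mask.sum(axis=1)
    positions = np.arange(mask.shape[1])
    expected = positions[None, :] < lengths[:, None]
    return bool(np.array_equal(mask, expected))


sequences = [[5, 3, 8], [2, 7], [4, 1, 9, 6]]
batch, mask = pad_and_mask(sequences, max_len=5)
print(batch.tolist())
print(lengths_from_mask(mask).tolist())  # [3, 2, 4]
print(is_right_padded(mask))  # True
```

In the reproduction script every sequence already has length 20, so `pad_sequences` leaves the data unchanged there; the sketch above uses sequences of different lengths on purpose.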

Standalone code to reproduce the issue

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Bidirectional, LSTM, Dense
from tensorflow.keras.preprocessing.sequence import pad_sequences
import numpy as np

# Generate sample data
num_samples = 1000
max_sequence_length = 20
vocab_size = 50  # Example vocabulary size

# Random data generation
data = np.random.randint(1, vocab_size, size=(num_samples, max_sequence_length))
labels = np.random.randint(0, 2, size=(num_samples, 1))  # Binary labels (0 or 1)

# Pad sequences
padded_data = pad_sequences(data, maxlen=max_sequence_length, padding='post', truncating='post')

# Create model
model = Sequential([
    Embedding(input_dim=vocab_size, output_dim=50),
    Bidirectional(LSTM(64, return_sequences=True)),
    Bidirectional(LSTM(32)),
    Dense(1, activation='sigmoid')
])

# Compile model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Train model
history = model.fit(padded_data, labels, epochs=10, batch_size=32)

# Evaluate the model
loss, accuracy = model.evaluate(padded_data, labels)
print(f"Loss: {loss}, Accuracy: {accuracy}")

Relevant log output

2024-12-31 18:16:33.367977: W tensorflow/core/framework/op_kernel.cc:1840] OP_REQUIRES failed at cudnn_rnn_ops.cc:1769 : INVALID_ARGUMENT: ROCm MIOpen only supports packed input output.
2024-12-31 18:16:33.367997: I tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: INVALID_ARGUMENT: ROCm MIOpen only supports packed input output.
	 [[{{function_node __inference_one_step_on_data_5177}}{{node sequential_1/bidirectional_1/forward_lstm_1/CudnnRNNV3}}]]
Traceback (most recent call last):
  File "/home/yoav/PycharmProjects/Lip-C/test.py", line 31, in <module>
    history = model.fit(padded_data, labels, epochs=10, batch_size=32)
  File "/home/yoav/PycharmProjects/Lip-C/venv/lib/python3.10/site-packages/keras/src/utils/traceback_utils.py", line 122, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/yoav/PycharmProjects/Lip-C/venv/lib/python3.10/site-packages/tensorflow/python/eager/execute.py", line 53, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.InvalidArgumentError: Graph execution error:

Detected at node sequential_1/bidirectional_1/forward_lstm_1/CudnnRNNV3 defined at (most recent call last):
  File "/home/yoav/PycharmProjects/Lip-C/test.py", line 31, in <module>

  File "/home/yoav/PycharmProjects/Lip-C/venv/lib/python3.10/site-packages/keras/src/utils/traceback_utils.py", line 117, in error_handler

  File "/home/yoav/PycharmProjects/Lip-C/venv/lib/python3.10/site-packages/keras/src/backend/tensorflow/trainer.py", line 368, in fit

  File "/home/yoav/PycharmProjects/Lip-C/venv/lib/python3.10/site-packages/keras/src/backend/tensorflow/trainer.py", line 216, in function

  File "/home/yoav/PycharmProjects/Lip-C/venv/lib/python3.10/site-packages/keras/src/backend/tensorflow/trainer.py", line 129, in multi_step_on_iterator

  File "/home/yoav/PycharmProjects/Lip-C/venv/lib/python3.10/site-packages/keras/src/backend/tensorflow/trainer.py", line 110, in one_step_on_data

  File "/home/yoav/PycharmProjects/Lip-C/venv/lib/python3.10/site-packages/keras/src/backend/tensorflow/trainer.py", line 56, in train_step

  File "/home/yoav/PycharmProjects/Lip-C/venv/lib/python3.10/site-packages/keras/src/utils/traceback_utils.py", line 117, in error_handler

  File "/home/yoav/PycharmProjects/Lip-C/venv/lib/python3.10/site-packages/keras/src/layers/layer.py", line 899, in __call__

  File "/home/yoav/PycharmProjects/Lip-C/venv/lib/python3.10/site-packages/keras/src/utils/traceback_utils.py", line 117, in error_handler

  File "/home/yoav/PycharmProjects/Lip-C/venv/lib/python3.10/site-packages/keras/src/ops/operation.py", line 46, in __call__

  File "/home/yoav/PycharmProjects/Lip-C/venv/lib/python3.10/site-packages/keras/src/utils/traceback_utils.py", line 156, in error_handler

  File "/home/yoav/PycharmProjects/Lip-C/venv/lib/python3.10/site-packages/keras/src/models/sequential.py", line 213, in call

  File "/home/yoav/PycharmProjects/Lip-C/venv/lib/python3.10/site-packages/keras/src/models/functional.py", line 182, in call

  File "/home/yoav/PycharmProjects/Lip-C/venv/lib/python3.10/site-packages/keras/src/ops/function.py", line 171, in _run_through_graph

  File "/home/yoav/PycharmProjects/Lip-C/venv/lib/python3.10/site-packages/keras/src/models/functional.py", line 632, in call

  File "/home/yoav/PycharmProjects/Lip-C/venv/lib/python3.10/site-packages/keras/src/utils/traceback_utils.py", line 117, in error_handler

  File "/home/yoav/PycharmProjects/Lip-C/venv/lib/python3.10/site-packages/keras/src/layers/layer.py", line 899, in __call__

  File "/home/yoav/PycharmProjects/Lip-C/venv/lib/python3.10/site-packages/keras/src/utils/traceback_utils.py", line 117, in error_handler

  File "/home/yoav/PycharmProjects/Lip-C/venv/lib/python3.10/site-packages/keras/src/ops/operation.py", line 46, in __call__

  File "/home/yoav/PycharmProjects/Lip-C/venv/lib/python3.10/site-packages/keras/src/utils/traceback_utils.py", line 156, in error_handler

  File "/home/yoav/PycharmProjects/Lip-C/venv/lib/python3.10/site-packages/keras/src/layers/rnn/bidirectional.py", line 218, in call

  File "/home/yoav/PycharmProjects/Lip-C/venv/lib/python3.10/site-packages/keras/src/utils/traceback_utils.py", line 117, in error_handler

  File "/home/yoav/PycharmProjects/Lip-C/venv/lib/python3.10/site-packages/keras/src/layers/layer.py", line 899, in __call__

  File "/home/yoav/PycharmProjects/Lip-C/venv/lib/python3.10/site-packages/keras/src/utils/traceback_utils.py", line 117, in error_handler

  File "/home/yoav/PycharmProjects/Lip-C/venv/lib/python3.10/site-packages/keras/src/ops/operation.py", line 46, in __call__

  File "/home/yoav/PycharmProjects/Lip-C/venv/lib/python3.10/site-packages/keras/src/utils/traceback_utils.py", line 156, in error_handler

  File "/home/yoav/PycharmProjects/Lip-C/venv/lib/python3.10/site-packages/keras/src/layers/rnn/lstm.py", line 570, in call

  File "/home/yoav/PycharmProjects/Lip-C/venv/lib/python3.10/site-packages/keras/src/layers/rnn/rnn.py", line 402, in call

  File "/home/yoav/PycharmProjects/Lip-C/venv/lib/python3.10/site-packages/keras/src/layers/rnn/lstm.py", line 537, in inner_loop

  File "/home/yoav/PycharmProjects/Lip-C/venv/lib/python3.10/site-packages/keras/src/backend/tensorflow/rnn.py", line 841, in lstm

  File "/home/yoav/PycharmProjects/Lip-C/venv/lib/python3.10/site-packages/keras/src/backend/tensorflow/rnn.py", line 933, in _cudnn_lstm

ROCm MIOpen only supports packed input output.
	 [[{{node sequential_1/bidirectional_1/forward_lstm_1/CudnnRNNV3}}]] [Op:__inference_multi_step_on_iterator_5284]
@MrYoavon
Author

I apologize in advance for commenting before anyone has answered, but is there a beta version of ROCm that fixes this problem?

@MrYoavon
Author

For anyone looking for a solution: as of the time of writing, ROCm does not have a fix for this issue. However, I have found a workaround. Instead of using ROCm's implementation of the LSTM (and possibly of other RNN layers; I haven't tested others), you can use TensorFlow's implementation. TensorFlow's implementation isn't as optimized as ROCm's, but it performs well and supports a wider range of memory layouts.

Achieving this isn't as straightforward as it might seem. There may be other ways to get the same result, depending on your specific requirements, but in my case this configuration worked:

LSTM(
    units=128,
    return_sequences=True,
    activation='tanh',
    recurrent_activation='sigmoid',
    recurrent_dropout=0.2,
    use_bias=True
)

Of course, this isn't a perfect solution, and I still hope AMD addresses this issue soon.
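A likely reason the configuration above works: the Keras LSTM documentation lists conditions under which the layer selects the fused GPU kernel (the `CudnnRNNV3` node in the log), and a non-zero `recurrent_dropout` is one of the settings that rules it out. The sketch below restates a subset of those conditions in plain Python. The function `fused_kernel_blockers` is hypothetical and is not Keras code; the documentation lists further conditions (for example around masking) that are omitted here.

```python
def fused_kernel_blockers(activation="tanh", recurrent_activation="sigmoid",
                          recurrent_dropout=0.0, unroll=False, use_bias=True):
    """Return the arguments that would rule out the fused GPU kernel.

    Hypothetical helper: a paraphrase of some of the documented Keras LSTM
    requirements, not the check Keras itself performs.
    """
    blockers = []
    if activation != "tanh":
        blockers.append("activation")
    if recurrent_activation != "sigmoid":
        blockers.append("recurrent_activation")
    if recurrent_dropout != 0:
        blockers.append("recurrent_dropout")
    if unroll:
        blockers.append("unroll")
    if not use_bias:
        blockers.append("use_bias")
    return blockers


# The settings from the workaround above.
workaround = dict(activation="tanh", recurrent_activation="sigmoid",
                  recurrent_dropout=0.2, use_bias=True)
print(fused_kernel_blockers(**workaround))  # ['recurrent_dropout']

# Default settings leave the fused kernel eligible.
print(fused_kernel_blockers())  # []
```

Note that `recurrent_dropout=0.2` also changes training behavior, since it applies dropout to the recurrent state. Keras 3 additionally documents a `use_cudnn` argument on `LSTM`; whether passing `use_cudnn=False` avoids this error on ROCm has not been tested here.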
