Very large dataset #387
Unanswered
Mathieu-Istas asked this question in Q&A
Replies: 2 comments, 12 replies
-
Hi @Mathieu-Istas, thanks for your interest in our code! This is... strange. Are you low on storage on your machine's temporary directory? Is your tempdir on a network filesystem? Can you try to run with the environment variable …
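The name of the environment variable is cut off above, so the following is only a generic sketch, not the variable the maintainer meant. If dataset processing goes through Python's default temporary directory, that directory honors TMPDIR, and it is easy to check whether it is small or network-mounted:

```python
# Hypothetical check, not part of nequip: show where Python's temporary files
# go and how much space is free there. Setting TMPDIR before launching the run
# redirects the default temporary directory to another location.
import shutil
import tempfile

tmp = tempfile.gettempdir()          # honors TMPDIR if it is set
usage = shutil.disk_usage(tmp)
print(f"tempdir: {tmp}")
print(f"free:    {usage.free / 1e9:.1f} GB of {usage.total / 1e9:.1f} GB")
```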
-
I'll also note that there's a contributed HDF5 dataset option that may avoid this issue: https://github.com/mir-group/nequip/blob/develop/nequip/data/_dataset/_hdf5_dataset.py
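A rough sketch of packing frames into HDF5 is below. The group and dataset names ("pos", "forces", "energy", "atomic_numbers") and the filenames are illustrative placeholders only; the linked _hdf5_dataset.py defines the layout nequip actually reads, so check it before using anything like this.

```python
# Rough sketch, assuming an extxyz file that already stores energies and
# forces (so ASE attaches them to each Atoms object on read). All key names
# below are placeholders, not the layout required by nequip's HDF5 reader.
import h5py
from ase.io import read

frames = read("full_dataset.extxyz", index=":")  # hypothetical filename

with h5py.File("dataset.h5", "w") as f:
    for i, atoms in enumerate(frames):
        g = f.create_group(f"config_{i}")
        g.create_dataset("pos", data=atoms.get_positions())
        g.create_dataset("forces", data=atoms.get_forces())
        g.create_dataset("energy", data=atoms.get_potential_energy())
        g.create_dataset("atomic_numbers", data=atoms.get_atomic_numbers())
```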
-
Dear Nequip team,
I have a large dataset (240 000 configurations of 200 atoms). I probably don't need that many configurations to get a good model, but if I can use the full dataset, I might as well do it.
When calling nequip-train, I get an error while processing the dataset. However, it is not an obvious OOM like in this discussion, and I can't understand the error (I only copied the part of the traceback that seemed relevant). The traceback finishes with:
This error disappears if I point to a smaller file in dataset_file_name (for instance one with 50 000 configurations), but lowering n_train or n_val while keeping the original file with all configurations (around 5 GB) results in the same error. Do you know if there is something I could do to train on the full dataset?
Regards,
Mathieu
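For reference, a smaller test file like the 50 000-configuration one mentioned above can be produced by slicing the full dataset with ASE (filenames here are hypothetical):

```python
# Minimal sketch: write the first 50 000 frames of an extxyz file to a new file.
from ase.io import read, write

frames = read("full_dataset.extxyz", index=":50000")  # first 50 000 frames
write("subset_50k.extxyz", frames)
```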