Handle large ingests #17

jonseddon · 2022-07-20T08:18:58Z

There's an ERA5 dataset with over 500,000 files in it. This can't be ingested within the length of a standard batch system job. There are several ways to speed this up:

use parallelism to read and checksum multiple files at once
allow ingestions to resume
allow ingestions to be split into smaller chunks
do all variables need to go into the same dataset?
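
As a rough illustration of how the first three ideas could fit together, here is a minimal sketch, not the project's actual ingest code. The function names (`sha256_of`, `chunked`, `ingest`), the JSON state file, and the choice of SHA-256 are all assumptions made for the example. It checksums files in a thread pool, works through the file list in fixed-size chunks, and records progress after each chunk so that a later job can pick up where the previous one stopped.

```python
import hashlib
import json
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
from pathlib import Path


def sha256_of(path, block_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in fixed-size blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def ingest(paths, state_file, chunk_size=1000, workers=8):
    """Checksum `paths` in parallel, chunk by chunk, skipping files already done.

    `state_file` is a hypothetical JSON file mapping path -> checksum.
    """
    state_file = Path(state_file)
    done = json.loads(state_file.read_text()) if state_file.exists() else {}
    # Resume: only files missing from the saved state still need work.
    pending = [str(p) for p in paths if str(p) not in done]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for chunk in chunked(pending, chunk_size):
            # pool.map preserves input order, so zip pairs each path with its digest.
            done.update(zip(chunk, pool.map(sha256_of, chunk)))
            # Save progress after every chunk, via a temporary file so that an
            # interrupted write does not leave a truncated state file behind.
            tmp = state_file.with_suffix(".tmp")
            tmp.write_text(json.dumps(done))
            tmp.replace(state_file)
    return done
```

Threads are used here because file reads and `hashlib` updates on large buffers release the GIL; a process pool would be an alternative if that turned out to be a bottleneck. Rewriting the whole state file after each chunk is the simplest option but may not scale well to 500,000 entries, in which case a database table or an append-only log would be a reasonable substitute.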
