A tool for politely caching bulk requests from STAC APIs

developmentseed/stac-cache

STAC Cache

A tool for politely making bulk requests for STAC metadata and storing it as geoparquet.

Background

Many STAC-based workflows send large numbers of requests to STAC APIs. That is a challenge both for the analyst, who depends on the API staying responsive, and for the API maintainer, who has to handle the traffic. At least one STAC client application (stacrs) can search a STAC geoparquet archive as if it were an API, so having more geoparquet archives of STAC metadata available could reduce the burden on live APIs. You can also use duckdb to run queries directly on a STAC geoparquet file.

The goal of this project is to make a STAC API -> STAC geoparquet pipeline, inspired by the Cloud Native Geo blog post.
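As a rough sketch of the duckdb route mentioned above, the snippet below builds a query over STAC geoparquet files and runs it through the duckdb Python package. It assumes the stac-geoparquet layout, in which `id`, `collection`, and `datetime` are top-level columns. The function names and the file glob are hypothetical, and the exact files written by this tool may differ.

```python
from datetime import date


def build_item_query(parquet_glob: str, collection: str, start: date, end: date) -> str:
    """Build a SQL query listing items from STAC geoparquet files.

    Assumes `id`, `collection`, and `datetime` are top-level columns.
    Values are interpolated directly, so only use trusted inputs here.
    """
    return (
        "SELECT id, collection, datetime "
        f"FROM read_parquet('{parquet_glob}') "
        f"WHERE collection = '{collection}' "
        f"AND datetime >= TIMESTAMP '{start.isoformat()}' "
        f"AND datetime < TIMESTAMP '{end.isoformat()}' "
        "ORDER BY datetime"
    )


def run_query(sql: str):
    """Run the query with duckdb and return all rows as tuples."""
    # Imported lazily so the query builder works without duckdb installed.
    import duckdb

    return duckdb.sql(sql).fetchall()


# Hypothetical path: point this at wherever your geoparquet files live.
sql = build_item_query("data/*.parquet", "HLSS30_2.0", date(2023, 7, 1), date(2023, 7, 5))
print(sql)
```

Calling `run_query(sql)` requires `duckdb` to be installed and the glob to match real files.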

Installation

Install uv, then run:

uv sync

Usage

Example: cache HLS records for the circumpolar boreal region for a few days in July 2023

time uv run stac_cache.py \
  --stac-api=https://cmr.earthdata.nasa.gov/stac/LPCLOUD \
  --collections HLSS30_2.0 HLSL30_2.0 \
  --bbox=-180,30,180,80 \
  --start-date=2023-07-01 \
  --end-date=2023-07-05 \
  --limit=2000 \
  --output=data/hls_boreal_20230701-20230705 \
  --max-workers=2 \
  --x-chunk-size=120 \
  --y-chunk-size=10
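The chunk-size flags suggest that the bounding box is split into smaller tiles that are requested separately. The sketch below shows one way such tiling could work, assuming the chunk sizes are in degrees of longitude and latitude. It is an illustration of the idea, not the script's actual implementation, and `tile_bbox` is a hypothetical helper.

```python
from typing import Iterator, Tuple

BBox = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def tile_bbox(bbox: BBox, x_chunk: float, y_chunk: float) -> Iterator[BBox]:
    """Yield tiles of at most x_chunk by y_chunk degrees that cover bbox."""
    if x_chunk <= 0 or y_chunk <= 0:
        raise ValueError("chunk sizes must be positive")
    xmin, ymin, xmax, ymax = bbox
    x = xmin
    while x < xmax:
        y = ymin
        while y < ymax:
            # Clip the last column/row so tiles never extend past the bbox.
            yield (x, y, min(x + x_chunk, xmax), min(y + y_chunk, ymax))
            y += y_chunk
        x += x_chunk


# Same bbox and chunk sizes as the HLS example above.
tiles = list(tile_bbox((-180, 30, 180, 80), x_chunk=120, y_chunk=10))
print(len(tiles))  # 15 tiles: 3 columns x 5 rows
```

Under this assumption, smaller chunks mean more requests that each return fewer items, which interacts with `--limit` and `--max-workers`.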

Example: cache sentinel-2-l2a records from Microsoft Planetary Computer

time uv run stac_cache.py \
  --stac-api=https://planetarycomputer.microsoft.com/api/stac/v1 \
  --collections sentinel-2-l2a \
  --bbox=-180,30,0,80 \
  --start-date=2023-07-01 \
  --end-date=2023-07-05 \
  --limit=200 \
  --output=output/sentinel-2 \
  --max-workers=4 \
  --x-chunk-size=60 \
  --y-chunk-size=10
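The `--max-workers` flag presumably caps how many requests are in flight at once, which is what keeps bulk caching polite. The script's internals are not shown here, so the following is only a sketch of that pattern using a bounded thread pool. `fetch_all`, `fake_fetch`, and `pause_seconds` are hypothetical names, and the fetch function is a stand-in that makes no network calls.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def fetch_all(
    tasks: Iterable[T],
    fetch: Callable[[T], R],
    max_workers: int = 2,
    pause_seconds: float = 0.0,
) -> List[R]:
    """Run fetch over tasks with at most max_workers concurrent calls."""

    def polite_fetch(task: T) -> R:
        result = fetch(task)
        # Optional pause so each worker leaves a gap between its requests.
        time.sleep(pause_seconds)
        return result

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input tasks.
        return list(pool.map(polite_fetch, tasks))


def fake_fetch(tile):
    """Stand-in for a real STAC search; returns an empty item list."""
    return {"tile": tile, "items": []}


results = fetch_all([(0, 0, 1, 1), (1, 0, 2, 1)], fake_fetch, max_workers=2)
print(len(results))
```

A small worker count trades throughput for a lighter load on the API, which matches the low values used in the examples here.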

Example: cache two months of HLS records for 2023, 2022, and 2019 (do not run; these take a long time)

time uv run stac_cache.py \
  --stac-api=https://cmr.earthdata.nasa.gov/stac/LPCLOUD \
  --collections HLSS30_2.0 HLSL30_2.0 \
  --bbox=-180,30,180,80 \
  --start-date=2023-07-01 \
  --end-date=2023-08-31 \
  --limit=2000 \
  --output=data/hls_boreal_20230701-20230831 \
  --max-workers=2 \
  --x-chunk-size=120 \
  --y-chunk-size=10

time uv run stac_cache.py \
  --stac-api=https://cmr.earthdata.nasa.gov/stac/LPCLOUD \
  --collections HLSS30_2.0 HLSL30_2.0 \
  --bbox=-180,30,180,80 \
  --start-date=2022-07-01 \
  --end-date=2022-08-31 \
  --limit=2000 \
  --output=data/hls_boreal_20220701-20220831 \
  --max-workers=2 \
  --x-chunk-size=120 \
  --y-chunk-size=10

time uv run stac_cache.py \
  --stac-api=https://cmr.earthdata.nasa.gov/stac/LPCLOUD \
  --collections HLSS30_2.0 HLSL30_2.0 \
  --bbox=-180,30,180,80 \
  --start-date=2019-07-01 \
  --end-date=2019-08-31 \
  --limit=2000 \
  --output=data/hls_boreal_20190701-20190831 \
  --max-workers=2 \
  --x-chunk-size=120 \
  --y-chunk-size=10
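The three long-running commands above differ only in their dates and output path, so they can be generated instead of copied. The sketch below builds the argument lists and prints them without running anything. `build_command` is a hypothetical helper, and it uses the hyphenated `--x-chunk-size` spelling from the first command; adjust it if the script expects a different spelling.

```python
import shlex
from typing import List


def build_command(start: str, end: str) -> List[str]:
    """Build the stac_cache.py argument list for one date range."""
    # Output name follows the pattern above: hls_boreal_YYYYMMDD-YYYYMMDD.
    tag = f"{start.replace('-', '')}-{end.replace('-', '')}"
    return [
        "uv", "run", "stac_cache.py",
        "--stac-api=https://cmr.earthdata.nasa.gov/stac/LPCLOUD",
        "--collections", "HLSS30_2.0", "HLSL30_2.0",
        "--bbox=-180,30,180,80",
        f"--start-date={start}",
        f"--end-date={end}",
        "--limit=2000",
        f"--output=data/hls_boreal_{tag}",
        "--max-workers=2",
        "--x-chunk-size=120",
        "--y-chunk-size=10",
    ]


commands = [build_command(f"{year}-07-01", f"{year}-08-31") for year in (2023, 2022, 2019)]
for cmd in commands:
    # Print only; to run for real (slow), use subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```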
