Add first version of tile retriever. #11
Download the data.
"""
stack = stackstac.stack(
    items, resolution=RESOLUTION, assets=BANDS, dtype="uint16", fill_value=NODATA
Sure about setting the default nodata value to 0? I know uint16 only allows 0-65535, but 0 can also mean black.
I have always used 0 as nodata. A real reflectance of 0 is probably wrong anyway, since there is always some light scattered back. Although with the new processing baseline things have become a bit more intricate, see
https://forum.step.esa.int/t/info-introduction-of-additional-radiometric-offset-in-pb04-00-products/35431/8
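To make the nodata point concrete, here is a minimal sketch of masking 0-valued pixels before converting digital numbers to reflectance. The offset of -1000 and the quantification value of 10000 are assumptions about products from processing baseline 04.00 onward, and the function name is hypothetical; the actual values should be read from the product metadata.

```python
import numpy as np

NODATA = 0  # fill value, as passed to stackstac.stack in this PR

# Assumed constants for processing baseline 04.00 and later;
# read them from the product metadata in real code.
BOA_ADD_OFFSET = -1000
QUANTIFICATION_VALUE = 10000


def to_reflectance(dn: np.ndarray) -> np.ma.MaskedArray:
    """Convert uint16 digital numbers to reflectance, masking nodata pixels."""
    # Build the mask on the raw values, before any offset is applied.
    mask = dn == NODATA
    # Cast to float first so the negative offset cannot wrap around in uint16.
    reflectance = (dn.astype("float32") + BOA_ADD_OFFSET) / QUANTIFICATION_VALUE
    return np.ma.masked_array(reflectance, mask=mask)


dn = np.array([[0, 1000], [1500, 11000]], dtype="uint16")
refl = to_reflectance(dn)
```

Under these assumptions a digital number of 1000 maps to a reflectance of 0, so 0 stays free to mean "no data" without colliding with a valid dark pixel.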
print(f"Storing {len(tiles)} tiles")
# TODO: Make this an upload to S3.
numpy.savez_compressed(
    f"/datadisk/clay/{stack.id.to_numpy()[0]}.npz",
Can we have the option to set the folder path? And also create the folder if it doesn't already exist.
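A rough sketch of what that could look like, assuming a hypothetical `save_tiles` helper with an `out_dir` parameter (neither the helper nor the `tiles` keyword is taken from the PR):

```python
from pathlib import Path

import numpy as np


def save_tiles(tiles: np.ndarray, scene_id: str, out_dir="tiles") -> Path:
    """Write tiles to <out_dir>/<scene_id>.npz, creating out_dir if needed."""
    out_path = Path(out_dir)
    # parents=True creates intermediate folders; exist_ok=True makes reruns safe.
    out_path.mkdir(parents=True, exist_ok=True)
    target = out_path / f"{scene_id}.npz"
    # "tiles" is a hypothetical key name for the stored array.
    np.savez_compressed(target, tiles=tiles)
    return target
```

Calling `save_tiles(tiles, scene_id, out_dir="/datadisk/clay")` would reproduce the current hard-coded location, while any other folder could be passed in, for example from a command line argument.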
stack = stackstac.stack(
    items, resolution=RESOLUTION, assets=BANDS, dtype="uint16", fill_value=NODATA
)
Discussion on optimizing reads at different resolutions: Instead of resampling to a certain resolution, it will be faster (tens of milliseconds instead of ~1 min) to read from the overviews (see gjoseph92/stackstac#196 (comment)), but this will result in resolutions that are not round integer numbers.
Any thoughts on sticking to fixed resolutions like 10, 20, 60 versus overview-level defined resolutions? I know we discussed offline about aligning the 512x512 chips to what the Cloud-Optimized GeoTIFF is using internally for faster reads.
Side note: The default resampling with stackstac is Nearest Neighbour (see https://stackstac.readthedocs.io/en/v0.5.0/api/main/stackstac.stack.html#stackstac.stack.params.resampling), which is ok for optical images. It might be good to explicitly set the resampling algorithm in the code to be clearer (also in case anyone copies this code for another dataset such as DEMs which should use another interpolation scheme).
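To illustrate why overview reads can give non-round resolutions, here is a small arithmetic sketch. It assumes a 10 m band that is 10980 pixels wide, and overviews that halve the width at each level and round up; both are assumptions for illustration, and the actual overview layout of the files should be checked.

```python
import math


def overview_resolutions(width_px: int, base_resolution: float, levels: int) -> list[float]:
    """Effective pixel size per overview level.

    Assumes each level halves the pixel count of the previous one and rounds
    up, while the ground extent covered by the raster stays the same.
    """
    extent = width_px * base_resolution  # ground extent in metres
    resolutions = []
    width = width_px
    for _ in range(levels):
        width = math.ceil(width / 2)
        resolutions.append(extent / width)
    return resolutions


# Assumed 10980-pixel-wide band at 10 m resolution.
resolutions = overview_resolutions(10980, 10.0, 4)
```

With these numbers the first two levels come out at exactly 20 m and 40 m, but the third has 1373 pixels across the same extent, so its pixel size is slightly below 80 m.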
I think we are going for 10 m, so no overviews are necessary, as we stick to the highest resolution available. If we decide to use 20 m, then yes, we should try to optimize for the 20 m overview if it exists.
Closing in favor of #27
A first working version of a tiler that will create tiles over a scene based on location and dates. The tiles are in proper numpy format and each has metadata like bounds, centroid, and resolution.
This is a basis for discussion, but I am quite happy about the tiling overall and the approach. This should be very scalable, and can be switched to a different STAC provider very easily.
Refs #10