Python bindings: add syntaxic sugar to get/set algorithm arguments + doc #11822
rouault commented on Feb 7, 2025
- Python bindings: add syntaxic sugar to get/set algorithm arguments
- Doc: add a few examples of how to use GDAL algorithms from Python
import json

gdal.UseExceptions()
alg = gdal.GetGlobalAlgorithmRegistry()["raster"]["info"]
Accessing an "algorithm registry" still feels a bit low-level to me. What do you think about something like:
with gdal.run("raster info", {"input": "byte.tif"}) as result:
    info = json.loads(result.output)
An algorithm could have several output arguments (for example, generating a main raster and a mask). So result.output would only be allowed if the algorithm declares a single output argument?
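To make the single-output concern concrete, here is a minimal sketch of a result object whose `output` shortcut is only valid when exactly one output argument is declared. `AlgorithmResult` and its `outputs` mapping are hypothetical names used for illustration; they are not part of the GDAL Python bindings, and this is only one way such a rule could be enforced.

```python
class AlgorithmResult:
    """Hypothetical result holder; not part of the GDAL Python bindings."""

    def __init__(self, outputs):
        # Map of output argument name -> value produced by the algorithm.
        self.outputs = dict(outputs)

    @property
    def output(self):
        # The shortcut is unambiguous only with exactly one output argument.
        if len(self.outputs) != 1:
            raise ValueError(
                f"'output' is ambiguous: {len(self.outputs)} output arguments "
                f"declared ({sorted(self.outputs)}); use result.outputs[name]"
            )
        return next(iter(self.outputs.values()))


# One output argument: the shortcut works.
single = AlgorithmResult({"output-string": '{"bands": 1}'})

# Two output arguments: callers have to pick one by name.
multi = AlgorithmResult({"output": "main.tif", "mask": "mask.tif"})
```

With this shape, `single.output` returns the only value, while `multi.output` raises and `multi.outputs["mask"]` remains available.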
Is this needed at all? A Python user can already do subprocess.run(["raster", "info"]). I guess a built-in approach would prevent executable path issues? Definitely copy the subprocess API for this feature if you keep it. No need for a context manager.
Is this needed at all?
At least to work with in-memory datasets, and also for efficiency, since forking a process has a measurable cost (e.g. scenarios where you create tons of 256x256 tiles by clipping and warping a single source dataset).
Is this needed at all? A Python user can already do subprocess.run(["raster", "info"])
Maybe they can, though if they're like me and they've only been using Python since 2.1, they still have to look up the documentation for subprocess every single time. How do I capture stdout again?
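For reference, a short sketch of capturing stdout with `subprocess.run`. The command here is a stand-in (the current Python interpreter printing a small JSON string) so that the snippet runs without GDAL installed; with GDAL available, the command list would name the GDAL executable and its arguments instead.

```python
import json
import subprocess
import sys

# Stand-in command: the current Python interpreter printing a JSON string.
# This is an assumption for illustration, not a real GDAL invocation.
cmd = [sys.executable, "-c", "print('{\"driver\": \"GTiff\"}')"]

completed = subprocess.run(
    cmd,
    capture_output=True,  # collect stdout and stderr instead of inheriting them
    text=True,            # decode the captured bytes to str
    check=True,           # raise CalledProcessError on a non-zero exit status
)

# The captured text is available on the CompletedProcess object.
info = json.loads(completed.stdout)
```

Each such call starts a new process, which is the cost mentioned earlier in the thread for workloads that produce many small tiles.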
No need for a context manager.
Not in this example - the idea was to avoid having to call a Finalize() method in subsequent examples.
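A minimal sketch of that idea, assuming only that the algorithm object exposes `Run()` and `Finalize()` as discussed in this thread. `FakeAlgorithm` and the `running()` helper are hypothetical stand-ins so the snippet runs without GDAL; they are not the bindings' API.

```python
from contextlib import contextmanager


class FakeAlgorithm:
    """Hypothetical stand-in exposing Run() and Finalize()."""

    def __init__(self):
        self.ran = False
        self.finalized = False

    def Run(self):
        self.ran = True
        return True

    def Finalize(self):
        self.finalized = True
        return True


@contextmanager
def running(alg):
    """Run `alg`, yield it, and always call Finalize() on exit."""
    try:
        if not alg.Run():
            raise RuntimeError("algorithm failed")
        yield alg
    finally:
        # Executed whether the body succeeds or raises.
        alg.Finalize()


alg = FakeAlgorithm()
with running(alg) as result:
    # Inside the block the algorithm has run but is not yet finalized.
    inside = (result.ran, result.finalized)
```

The caller never writes `Finalize()` explicitly; leaving the `with` block takes care of it, including when the body raises.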
from osgeo import gdal
gdal.UseExceptions()
alg = gdal.GetGlobalAlgorithmRegistry()["raster"]["convert"]
or
with gdal.run("raster convert", {"input": "in.nc", "output": "out.tif", "output-format": "COG", "overwrite": True}) as result:
    ds = result.output
# ds is closed on __exit__

ds = gdal.run("raster convert", {"input": "in.nc", "output": "out.tif", "output-format": "COG", "overwrite": True}).result
# ds must be manually closed
Object lifetime may be tricky: e.g. "gdal raster edit" takes a "dataset" argument that is both an input and an output (edit in place).
Could it be considered to have no output?
Could it be considered to have no output?
I guess the context manager could detect that situation: the value of the argument is set to a dataset before Run() and is still set to that dataset after it. It could then make the argument disown the dataset before Finalize(), to prevent Finalize() from calling Close() on it.
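The detection described above could be sketched roughly as follows. Everything here is a hypothetical stand-in (`FakeDataset`, `FakeAlgorithm`, its `args` dict, and the `running()` helper), and the sketch assumes that Finalize() closes every dataset the algorithm still holds; the real bindings may behave differently.

```python
from contextlib import contextmanager


class FakeDataset:
    """Hypothetical dataset stand-in that records whether Close() was called."""

    def __init__(self, name):
        self.name = name
        self.closed = False

    def Close(self):
        self.closed = True


class FakeAlgorithm:
    """Hypothetical algorithm stand-in holding its arguments in a dict."""

    def __init__(self, args):
        self.args = dict(args)

    def Run(self):
        # Pretend that a string "output" argument makes Run() create a dataset.
        if isinstance(self.args.get("output"), str):
            self.args["output"] = FakeDataset(self.args["output"])
        return True

    def Finalize(self):
        # Assumption: Finalize() closes every dataset still held by an argument.
        for value in self.args.values():
            if isinstance(value, FakeDataset):
                value.Close()
        return True


@contextmanager
def running(alg):
    # Remember which arguments already held a dataset before Run().
    caller_owned = {
        name: value
        for name, value in alg.args.items()
        if isinstance(value, FakeDataset)
    }
    try:
        if not alg.Run():
            raise RuntimeError("algorithm failed")
        yield alg
    finally:
        # Disown datasets that were set before Run() and are still the same
        # object afterwards, so Finalize() does not close them.
        for name, value in caller_owned.items():
            if alg.args.get(name) is value:
                del alg.args[name]
        alg.Finalize()


# In-place edit: the caller's dataset must survive the with block.
source = FakeDataset("in_place.tif")
with running(FakeAlgorithm({"dataset": source})):
    pass

# Conversion: the dataset created by Run() is closed on exit.
with running(FakeAlgorithm({"input": "in.nc", "output": "out.tif"})) as result:
    created = result.args["output"]
```

After both blocks, `source` is still open because the caller owns it, while `created` has been closed by the finalization step.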
Before I invest time in that, we should get some agreement on the type of API we want. I have a slight preference for Dan's idea of an algorithm path plus a dict of input arguments. Or maybe we defer that to later and take this PR as a preliminary improvement (at least for the sake of writing our tests!). I imagine that any further higher-level API will build on top of the improvements in this PR anyway.
I've created ticket #11832 to track that we can potentially do better. I'm merging this for now.