Python bindings: add syntaxic sugar to get/set algorithm arguments + doc #11822

Merged: 2 commits, Feb 10, 2025

Conversation

Member

@rouault rouault commented Feb 7, 2025

  • Python bindings: add syntaxic sugar to get/set algorithm arguments
  • Doc: add a few examples of how to use GDAL algorithms from Python
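
The "sugar" in the first commit amounts to letting an algorithm object be indexed by argument name. A minimal pure-Python sketch of that idea follows; `SketchArg` and `SketchAlgorithm` are hypothetical stand-ins written for illustration, not the classes in the GDAL bindings.

```python
class SketchArg:
    """Hypothetical stand-in for one algorithm argument."""

    def __init__(self, name, value=None):
        self.name = name
        self.value = value


class SketchAlgorithm:
    """Hypothetical stand-in for an algorithm holding named arguments."""

    def __init__(self, arg_names):
        self._args = {name: SketchArg(name) for name in arg_names}

    def get_arg(self, name):
        # Explicit, more verbose access path.
        try:
            return self._args[name]
        except KeyError:
            raise KeyError(f"unknown argument: {name!r}") from None

    # The sugar: alg["input"] reads a value, alg["input"] = x writes one.
    def __getitem__(self, name):
        return self.get_arg(name).value

    def __setitem__(self, name, value):
        self.get_arg(name).value = value


alg = SketchAlgorithm(["input", "output-format"])
alg["input"] = "byte.tif"
```

Indexing with an undeclared name raises `KeyError` in this sketch, which keeps typos in argument names from passing silently.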

@rouault added the "funded through GSP" (work funded through the GDAL Sponsorship Program) and "gdal_cli" (anything related to the new 3.11 "gdal" CLI frontend) labels Feb 7, 2025
import json

gdal.UseExceptions()
alg = gdal.GetGlobalAlgorithmRegistry()["raster"]["info"]
Member

Accessing an "algorithm registry" still feels a bit low-level to me. What do you think about something like:

with gdal.run("raster info", {"input": "byte.tif"}) as result:
    info = json.loads(result.output)
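
One way to read this proposal is as a thin wrapper that resolves a space-separated path against the registry and then applies a dict of arguments. The sketch below assumes a registry that behaves like nested mappings (as the `["raster"]["info"]` lookup above suggests) and uses plain dicts plus hypothetical `resolve` and `run_sketch` helpers in place of GDAL objects.

```python
def resolve(registry, path):
    """Walk a nested mapping using a space-separated path such as "raster info"."""
    node = registry
    for part in path.split():
        try:
            node = node[part]
        except KeyError:
            raise KeyError(f"no algorithm at {path!r} (failed at {part!r})") from None
    return node


# Hypothetical registry: each leaf is a callable that takes the argument dict.
registry = {
    "raster": {
        "info": lambda args: {"described": args["input"]},
    }
}


def run_sketch(path, args):
    """Look up the algorithm by path, then call it with the argument dict."""
    return resolve(registry, path)(args)


result = run_sketch("raster info", {"input": "byte.tif"})
```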

Member Author

An algorithm could have several output arguments (like generating a main raster and a mask). So result.output would only be allowed if the algorithm declares a single output argument?
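
The restriction raised here could be sketched as a result object whose `output` shortcut works only when exactly one output argument is declared. `SketchResult` and its `outputs` mapping are hypothetical names used for illustration; nothing here is an agreed API.

```python
class SketchResult:
    """Hypothetical result holder mapping output argument names to values."""

    def __init__(self, outputs):
        self.outputs = dict(outputs)

    @property
    def output(self):
        # The shortcut is only unambiguous with exactly one output argument.
        if len(self.outputs) != 1:
            names = ", ".join(sorted(self.outputs)) or "none"
            raise ValueError(
                f"'output' needs exactly one output argument, got: {names}"
            )
        return next(iter(self.outputs.values()))


single = SketchResult({"output": "out.tif"})
multi = SketchResult({"output": "out.tif", "mask": "mask.tif"})
```

With several outputs, callers of this sketch would read `multi.outputs["mask"]` by name instead of using the shortcut.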

Contributor

@sgillies sgillies Feb 7, 2025

Is this needed at all? A Python user can already do subprocess.run(["raster", "info"]). I guess a built-in approach would prevent executable path issues? Definitely copy the subprocess API for this feature if you keep it. No need for a context manager.

Member Author

> Is this needed at all?

At least to work with in-memory datasets, and also for efficiency, since forking a process has a measurable cost (e.g. scenarios where you create tons of 256x256 tiles by clipping and warping a single source dataset).

Member

> Is this needed at all? A Python user can already do subprocess.run(["raster", "info"])

Maybe they can, though if they're like me and they've only been using Python since 2.1, they still have to look up the documentation for subprocess every single time. How do I capture stdout again?

> No need for a context manager.

Not in this example - the idea was to avoid having to call a Finalize() method in subsequent examples.
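
For reference, capturing standard output with `subprocess.run` takes `capture_output=True` and `text=True`. The sketch below uses a small Python child process as a stand-in for the command-line tool so that it runs without GDAL installed; the JSON it prints is made up for illustration.

```python
import json
import subprocess
import sys

# Stand-in child process that prints JSON, in place of the real command.
child = [sys.executable, "-c", 'import json; print(json.dumps({"size": [20, 20]}))']

completed = subprocess.run(
    child,
    capture_output=True,  # collect stdout and stderr instead of inheriting them
    text=True,            # decode the captured bytes to str
    check=True,           # raise CalledProcessError on a non-zero exit status
)
info = json.loads(completed.stdout)
```

Each call like this starts a new process, which is the cost mentioned earlier in the thread.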


from osgeo import gdal
gdal.UseExceptions()
alg = gdal.GetGlobalAlgorithmRegistry()["raster"]["convert"]
Member

or

with gdal.run("raster convert", {"input": "in.nc", "output": "out.tif", "output-format": "COG", "overwrite": True}) as result:
    ds = result.output

# ds is closed on __exit__

ds = gdal.run("raster convert", {"input": "in.nc", "output": "out.tif", "output-format": "COG", "overwrite": True}).result
# ds must be manually closed
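
The two variants differ only in who is responsible for cleanup. Below is a sketch of the context-manager form, with a hypothetical `SketchAlgorithm` whose `finalize()` plays the role of `Finalize()`; the class and the `run_sketch` helper are illustrative stand-ins, not part of the bindings.

```python
from contextlib import contextmanager


class SketchAlgorithm:
    """Hypothetical stand-in that records whether cleanup happened."""

    def __init__(self):
        self.finalized = False
        self.output = None

    def run(self, args):
        self.output = f"dataset for {args['output']}"

    def finalize(self):
        self.finalized = True


@contextmanager
def run_sketch(args):
    alg = SketchAlgorithm()
    try:
        alg.run(args)
        yield alg
    finally:
        # Runs on normal exit and also when the body raises.
        alg.finalize()


with run_sketch({"input": "in.nc", "output": "out.tif"}) as result:
    inside = (result.output, result.finalized)
```

Inside the `with` block the output is usable and not yet finalized; after the block, `result.finalized` is true without any explicit call from the user.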

Member Author

Object lifetime may be tricky... e.g. "gdal raster edit" takes a "dataset" argument that is both an input and an output (edit in place).

Member

Could it be considered to have no output?

Member Author

> Could it be considered to have no output?

I guess the context manager could detect that situation: the value of the argument is set to a dataset before Run() and is still set after it. It could then make the arg disown the dataset before Finalize(), to prevent Finalize() from calling Close() on it.
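
A sketch of the detection described here, using hypothetical stand-in classes (`SketchDataset`, `SketchArg`, `SketchAlgorithm`); the real ownership rules in the bindings may differ. The wrapper remembers which arguments already held a dataset before the run and releases those from the algorithm's ownership before finalizing.

```python
class SketchDataset:
    def __init__(self, name):
        self.name = name
        self.closed = False

    def close(self):
        self.closed = True


class SketchArg:
    def __init__(self):
        self.value = None
        self.owned = True  # finalize() closes datasets held by owning args

    def disown(self):
        self.owned = False


class SketchAlgorithm:
    def __init__(self, arg_names):
        self.args = {name: SketchArg() for name in arg_names}

    def run(self):
        # Pretend the algorithm creates an "output" dataset when none was given.
        out = self.args.get("output")
        if out is not None and out.value is None:
            out.value = SketchDataset("created by run")

    def finalize(self):
        for arg in self.args.values():
            if arg.owned and isinstance(arg.value, SketchDataset):
                arg.value.close()


def run_and_finalize(alg):
    # Snapshot the datasets supplied by the caller before the run.
    before = {name: arg.value for name, arg in alg.args.items()
              if isinstance(arg.value, SketchDataset)}
    alg.run()
    for name, ds in before.items():
        # Same dataset still set after the run: it belongs to the caller.
        if alg.args[name].value is ds:
            alg.args[name].disown()
    alg.finalize()


user_ds = SketchDataset("opened by caller")
alg = SketchAlgorithm(["dataset", "output"])
alg.args["dataset"].value = user_ds
run_and_finalize(alg)
```

In this sketch the caller's dataset stays open after finalization, while the dataset created during the run is closed.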

Before I invest time in that, we should get some agreement on the type of API we want... I'd have a slight preference for Dan's idea with the algorithm path and a dict of input arguments. Or maybe defer that to later, and take this PR as a preliminary improvement (at least for the sake of writing our tests!). I imagine that any further higher-level API will build on top of the improvements in this PR anyway.

Member Author

rouault commented Feb 10, 2025

I've created ticket #11832 to track that we can potentially do better. I'm merging this for now.

@rouault rouault merged commit ebf0679 into OSGeo:master Feb 10, 2025
40 checks passed
3 participants