Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

first implementation of local udf #308

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

VincentVerelst
Copy link
Contributor

No description provided.

@clausmichele
Copy link
Member

@VincentVerelst there seem to be conflicts. Could you fix them before proceeding?

@VincentVerelst
Copy link
Contributor Author

@clausmichele , merge conflicts have been solved

Copy link
Member

@clausmichele clausmichele left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some things to fix:

  1. There is a missing __init__.py file. It has to be inside the openeo_processes_dask/process_implementations/udf folder, with the following content:
from .udf import *
  1. The attributes are not maintained and with them the projection/CRS is also gone. It's necessary to keep it. You can test it with this MWE:
import openeo
from openeo.local import LocalConnection
local_conn = LocalConnection("./")

url = "https://earth-search.aws.element84.com/v1/collections/sentinel-2-l2a"
spatial_extent =  {"east": 11.40, "north": 46.52, "south": 46.46, "west": 11.25}
temporal_extent = ["2022-06-01", "2022-06-10"]
bands = ["red", "nir"]
properties = {"eo:cloud_cover": dict(lt=80)}
s2_datacube = local_conn.load_stac(
    url=url,
    spatial_extent=spatial_extent,
    temporal_extent=temporal_extent,
    bands=bands,
    properties=properties,
)

b04 = s2_datacube.band("red")
b08 = s2_datacube.band("nir")
ndvi = (b08 - b04) / (b08 + b04)
ndvi_median = ndvi.reduce_dimension(dimension="time", reducer="median")

result_ndvi_xr = ndvi_median.execute()

# Build a UDF object from an inline string with Python source code.
udf = openeo.UDF("""
from openeo.udf import XarrayDataCube

def apply_datacube(cube: XarrayDataCube, context: dict) -> XarrayDataCube:
    array = cube.get_array()
    print(array.shape)
    array.values = 0.0001 * array.values
    return cube
""")

# Apply the UDF to a cube.
rescaled_cube = ndvi_median.apply(process=udf)

rescaled_cube_xr = rescaled_cube.execute()

print(result_ndvi_xr.rio.crs)
print(rescaled_cube_xr.rio.crs)

>> EPSG:32632
>> None

openeo_processes_dask/process_implementations/udf/udf.py Outdated Show resolved Hide resolved
tests/test_udf.py Outdated Show resolved Hide resolved
@VincentVerelst VincentVerelst self-assigned this Jan 22, 2025
@VincentVerelst
Copy link
Contributor Author

The attributes are not maintained and with them the projection/CRS is also gone. It's necessary to keep it. You can test it with this MWE:

This seems to be a broader problem in the apply process. Seems to be solved by adding keep_attrs=True in apply_ufunc.

@clausmichele
Copy link
Member

The attributes are not maintained and with them the projection/CRS is also gone. It's necessary to keep it. You can test it with this MWE:

This seems to be a broader problem in the apply process. Seems to be solved by adding keep_attrs=True in apply_ufunc.

@ValentinaHutter do you know if the attributes were not passed due to some technical reason? Maybe some edge case that would not work if we keep the attributes?

@ValentinaHutter
Copy link
Collaborator

The attributes are not maintained and with them the projection/CRS is also gone. It's necessary to keep it. You can test it with this MWE:

This seems to be a broader problem in the apply process. Seems to be solved by adding keep_attrs=True in apply_ufunc.

@ValentinaHutter do you know if the attributes were not passed due to some technical reason? Maybe some edge case that would not work if we keep the attributes?

It should be fine to set keep_attrs=True, there is no edge case, I know of... I think it was just not added before because we did not depend on the attributes and did not experience related issues.

@VincentVerelst
Copy link
Contributor Author

@clausmichele , two unit tests are failing because dask_geopandas 0.4.3 was released. GeoDataFrame is now part of dask_geopandas.expr instead of dask_geopandas.core. So we either change the dask_geopadas dependency to <0.4.3, or we change this line: https://github.com/Open-EO/openeo-processes-dask/blob/main/openeo_processes_dask/process_implementations/ml/random_forest.py#L98

Let me know what you prefer.

@clausmichele
Copy link
Member

@clausmichele , two unit tests are failing because dask_geopandas 0.4.3 was released. GeoDataFrame is now part of dask_geopandas.expr instead of dask_geopandas.core. So we either change the dask_geopadas dependency to <0.4.3, or we change this line: https://github.com/Open-EO/openeo-processes-dask/blob/main/openeo_processes_dask/process_implementations/ml/random_forest.py#L98

Let me know what you prefer.

I'm not the author of that function and we are not using it at the moment. EODC developed it and probably @ValentinaHutter knows if the change could impact something on their side.

@ValentinaHutter
Copy link
Collaborator

thanks for adding the change to 0.4.3! from EODC's side it's fine to switch to this version.

@VincentVerelst
Copy link
Contributor Author

@ValentinaHutter , @clausmichele , @jdries , how do we proceed further?

@ValentinaHutter
Copy link
Collaborator

Looks like tests are failing because of a change in either the test data or a dependency in load_stac. I now created a PR to independently test the ndvi process #316 but for load_stac, the issue still needs to be resolved.

@HansVRP HansVRP assigned HansVRP and unassigned VincentVerelst Feb 4, 2025
@HansVRP HansVRP assigned HansVRP and clausmichele and unassigned HansVRP Feb 4, 2025
@clausmichele
Copy link
Member

I will investigate! Something changed but I don't understand where. In fact, even the example in the docs now is failing: https://open-eo.github.io/openeo-python-client/cookbook/localprocessing.html

@clausmichele
Copy link
Member

clausmichele commented Feb 4, 2025

So, the issue occurs with the latest version of pystac. With pystac==1.9.0 it doesn't occur. @ValentinaHutter could you fix pystac==1.9.0 in the project? Edit: better pystac<1.12, see here: stac-utils/xpystac#39 and gjoseph92/stackstac#262

@ValentinaHutter
Copy link
Collaborator

Did you mean pystac or pystac-client?

@clausmichele
Copy link
Member

pystac, currently it is a dependency of stackstac (and also odc-stac if we will use it in the near future) and we don't have it in the requirements.

@ValentinaHutter
Copy link
Collaborator

Thanks for the clarification, I now merged the dependency into main, so you can pull from there 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants