-
-
Notifications
You must be signed in to change notification settings - Fork 327
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
grass.script: Allow init to modify only specified environment #3438
Conversation
The function create_location now works without a full session. Internally, it sets up a runtime environment to execute tools (modules) without the connection to a location (project) which is sufficient for executing g.proj. The new test covers the new functionality, but to actually test it, the test needs to run by itself because grass.script.setup.init (still) creates a global session.
…modified and when it is not, copy is up to the user which makes it clear when that actually happens which is longer but more readable, env is keyword-only, gisbase path handling is documented
This makes the message, verbose and other functions consistent with other wrappers around run_command family calls. While not needed for multiple-mapset situations and parallelization, it is necessary when calls with messages are used without global environment being set and only custom environment available in grass.script.setup. The PR aims at providing the interface, not updating all use cases (except the anticipated changes in grass.script.setup). This does not have any test since the API for that is missing. This will be tested indirectly in the future.
I also have concerns about passing around an object that has all env vars populated. I know in CI, we need to be careful to avoid leaking them to software that doesn't require them, and to not show them in logs, as we wouldn't want to see the tokens and secrets. Is the CI-security class that sent a PR yesterday would be interested in that question, or have an answer for this? |
One thing is that the object is readily available already in Python: import os
print(os.environ) Another thing is that we are passing these objects around already to achieve various tasks such as data management and parallelization: env = create_environment(...)
run_command(...env=env) I have a WIP PR #2923 which aims at hiding some of that, but the env would be simply hidden in another object changing tyhe above to: with gs.setup.init(...) as session:
tools = Tools(session=session)
tools.r_surf_gauss(output="surface")
I don't know anything else to prevent them besides don't print them. The original variables for the process are always available on Linux with
Diving into this might too much of a rabbit trail, but I'll discuss it with them. |
From what I'm aware of, we don't have a feature-freeze policy here, and we didn't decide to start the RC process yet. If you're committed to including it and think you'd still have the bandwidth to tune up in time for release, I don't see why it couldn't be there. Otherwise, if these changes are considered too breaking for this release (I personally think that full-code wide changes of location to project is way more impactful than this), then part of the usage of mocking env vars in tests could be integrated in some way. The only test fixture was to set no env vars though, not setting them/replacing them (that would make pytest tear them down afterwards) |
This makes the _message_, _verbose_ and other functions consistent with other wrappers around _run_command_ family calls. While not needed for multiple-mapset situations and parallelization, it is necessary when calls with messages are used without global environment being set and only a custom (local) environment is available which is the case in grass.script.setup (with #3438). The PR aims at providing the interface, not updating all use cases (it will be applied for grass.script.setup in #3438). This does not have any test since the current API does not allow for writing these test. This will be tested indirectly in the future (e.g. by #3438). Additionally, it fixes documentation for couple other function where the parameter documentation was misleading or wrong.
….message API" This reverts commit 4a6fd68.
This touches a lot of files, but that's just the update to use the better API in tests. The changes are in |
The CI fails with GISBASE being required in certain context. The test code accounts for that already (or at least it should). Locally, I'm not able to reproduce the issue (just as expected for how the test code is written). |
I think #3683 is the less controversial of the two, let's start with it. |
This is using a mechanism what the Python subprocess package is using. We use subprocesses and we need to customize the environment to pass different GRASS-related settings. That's how the whole grass.script works. While Bandit warns about usage of subprocess package (dangerous in certain context) and specific usages of the subprocess package, I'm not aware of a Bandit warning related to usage of the env parameter.
All the subprocesses in general get all the variables from parent process. Having env actually allows users to sanitize the environment and let the GRASS subprocesses know only about what they want them to know.
Printing env parameter or printing all parameters of run_command functions (which would include env) would be probably a bad idea. |
With #3689 merged, this PR should be rebased to remove the duplicated changes (that has conflicts) |
Co-authored-by: Ondrej Pesek <[email protected]>
…3438) The _grass.script.setup.init_ function modifies os.environ. While os.environ is fit for many, if not most, cases, usage in tests and parallel processing is limited and in all cases, os.environ stays partially modified even after session is finished (we don't destroy the runtime environment, i.e., variables such as GISBASE). With this change, _init_ takes _env_ parameter specifying the environment to modify and it modifies that environment. When no _env_ is provided, it still modifies os.environ, so the default behavior is not changed. This required only few changes to the initialization code, but more changes were needed for the cleanup code. A lot of tests can now take advantage of this functionality and this PR updates some of them. I plan to update all where it is applicable, but will leave as is some others, namely those which use ctypes (situation is more complex there) and _grass.jupyter_ (_env_ parameter is not implemented for most of them). There is plenty of tests covering it, but the cleanup part which needed most changes does not have any coverage. I don't think there is much concern in terms of API because _env_ parameters is what we now have in many functions. The only uncertainty with that is whether it should create its own copy or modify the provided environment. I go with modify because that makes the copy explicit in the call which is more clear (caller or reader is sure a copy is created) and it is consistent with how os.environ is modified.
…3438) The _grass.script.setup.init_ function modifies os.environ. While os.environ is fit for many, if not most, cases, usage in tests and parallel processing is limited and in all cases, os.environ stays partially modified even after session is finished (we don't destroy the runtime environment, i.e., variables such as GISBASE). With this change, _init_ takes _env_ parameter specifying the environment to modify and it modifies that environment. When no _env_ is provided, it still modifies os.environ, so the default behavior is not changed. This required only few changes to the initialization code, but more changes were needed for the cleanup code. A lot of tests can now take advantage of this functionality and this PR updates some of them. I plan to update all where it is applicable, but will leave as is some others, namely those which use ctypes (situation is more complex there) and _grass.jupyter_ (_env_ parameter is not implemented for most of them). There is plenty of tests covering it, but the cleanup part which needed most changes does not have any coverage. I don't think there is much concern in terms of API because _env_ parameters is what we now have in many functions. The only uncertainty with that is whether it should create its own copy or modify the provided environment. I go with modify because that makes the copy explicit in the call which is more clear (caller or reader is sure a copy is created) and it is consistent with how os.environ is modified.
The grass.script.setup.init function modifies os.environ. While os.environ is fit for many, if not most, cases, usage in tests and parallel processing is limited and in all cases, os.environ stays partially modified even after session is finished (we don't destroy the runtime environment, i.e., variables such as GISBASE). With this change, init takes env parameter specifying the environment to modify and it modifies that environment. When no env is provided, it still modifies os.environ, so the default behavior is not changed.
This required only few changes to the initialization code, but more changes were needed for the cleanup code. A lot of tests can now take advantage of this functionality and this PR updates some of them. I plan to update all where it is applicable, but will leave as is some others, namely those which use ctypes (situation is more complex there) and grass.jupyter (env parameter is not implemented for most of them).
I'm not decided whether this should go to 8.4 or wait for 8.5. There is plenty of tests covering it, but the cleanup part which needed most changes does not have any coverage. However, if someone gives it a careful read, it might be possible to have this for 8.4. I don't think there is much concern in terms of API because env parameters is what we now have in many functions. The only uncertainty with that is whether it should create its own copy or modify the provided environment.
This includes changes to g.message interface (
python/grass/script/core.py
) which I intent to submit in a separate PR (#3439). Also,python/grass/script/tests/conftest.py
needs a doc fix (done).Special thanks goes to @echoix for the advice about pytest's monkeypatch which made the test code possible (and simpler in other places).