Use qvm.anon-whonix, deprecate securedrop-handle-upgrade for sys-whonix #1227

Open
wants to merge 1 commit into main
Conversation

@rocodes rocodes commented Jan 2, 2025

Status

Ready for Review

Description of Changes

Refs #1060
Refs #1225
Refs (possible fix): freedomofpress/securedrop-workstation-ci#68

Changes proposed in this pull request:

  • Use the upstream qvm Whonix-related Salt states to download the Whonix templates and configure the system Whonix VMs. The upstream states are extended (via extend) to enforce the template upgrade rather than relying on securedrop-handle-upgrade. Salt states are also used instead of securedrop-handle-upgrade to shut down the system Whonix VMs so that the changes take effect. (A sketch of the extend pattern is shown below.)
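
A minimal sketch of the extend pattern, assuming illustrative state IDs (the real IDs live in the upstream qvm.* formulas shipped with Qubes and may differ):

    # Illustrative only: the upstream state ID and template name below are
    # assumptions; check the qvm.anon-whonix / qvm.sys-whonix formulas for
    # the real IDs before copying this.
    include:
      - qvm.anon-whonix

    extend:
      sys-whonix:                          # hypothetical upstream state ID
        qvm.vm:
          - prefs:
            - template: whonix-gateway-17  # enforce the expected template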

Testing

  • Visual review
  • CI
  • Manual upgrade testing (see below)

Manual upgrade testing (existing SDW install)

  • Start with Qubes 4.2 SDW.
  • On your system, change the template of sys-whonix and anon-whonix to any other template, or make changes to the existing template, such as disabling AppArmor. (Don't run/use these VMs while you are making a mess, of course. This is basically to simulate a template upgrade/template change; see the example commands below.)
  • Build the rpm from the tip of this branch and run apply.
  • Provisioning is successful.
  • sys-whonix and anon-whonix use the correct Whonix 17 gateway and workstation templates, respectively, AppArmor is configured, and sys-whonix has the anon-gateway tag.
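
For example, the "make a mess" step above could look like this in dom0 (the alternate template name is just an example of another installed template):

    # In dom0: point the system Whonix VMs at some other installed template
    # to simulate a stale/incorrect template before provisioning.
    qvm-shutdown --wait sys-whonix anon-whonix
    qvm-prefs sys-whonix template debian-12-minimal   # any other installed template
    qvm-prefs anon-whonix template debian-12-minimal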

Deployment

Any special considerations for deployment? Consider both:

  1. Upgrading existing pilot instances
  2. New installs

n/a.

Upgrading: TODO. On existing installs, sd-whonix will be missing the tags it should have inherited from whonix-gateway-17. Since sd-whonix is based on the same template as sys-whonix, and not cloned, it should have the anon-gateway tag after a successful Salt run. (A quick check is shown below.)
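
A quick check in dom0 after provisioning (the tag name comes from the expected configuration described above):

    # List tags on sd-whonix and confirm anon-gateway is among them
    qvm-tags sd-whonix list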

Checklist

If you have made changes to the provisioning logic

  • All tests (make test) pass in dom0

If you have added or removed files

  • I have updated MANIFEST.in and rpm-build/SPECS/securedrop-workstation-dom0-config.spec

If documentation is required

  • I have opened a PR in the docs repo for these changes, or will do so later
  • I would appreciate help with the documentation

@rocodes rocodes force-pushed the 1060-qvm-whonix-state branch 2 times, most recently from b428544 to a53d4f3 on January 3, 2025 14:38
@rocodes rocodes force-pushed the 1060-qvm-whonix-state branch from a53d4f3 to 0def809 on January 10, 2025 16:36
@rocodes rocodes marked this pull request as ready for review January 10, 2025 16:48
@rocodes rocodes requested a review from a team January 10, 2025 16:58
@rocodes rocodes force-pushed the 1060-qvm-whonix-state branch from 0def809 to 6d4149d on January 10, 2025 18:22
@deeplow deeplow self-assigned this Jan 14, 2025
@deeplow deeplow left a comment

Thanks for the PR, however, I am not sure this is the right approach. I agree with the principle of simplifying logic and avoiding duplicated references (which I'm all for). However, if we do this, we have to go all the way in, for fear that we'll have out-of-sync versions. Currently there are two "whonix version" variables (ours and Qubes'), which could lead to breakage in our logic when a new Whonix version is rolled out (something like this). I'm happy to go into a bit more detail if it sounds unclear.

I'm planning to write something up in #1225 describing my thoughts on system template upgrades.

@rocodes rocodes commented Jan 15, 2025

I'm wondering what you think is bad about applying qvm.anon-whonix? The reason I am suggesting it is that it is part of Whonix's intended/expected configuration, and not running it goes against their expectations for VM configuration. For example, the Salt states do things like apply tags (which then govern some RPC policies, including, iirc, policies around qubes.GetDate), which I think could be why we are having so many clock issues that cause CI and/or end-user updates to stall out, per freedomofpress/securedrop-workstation-ci#68.
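
For illustration only (not the verbatim upstream policy; the exact rules shipped by Qubes/Whonix may differ): a Qubes 4.2-style policy line of the kind such tags govern might look like this, with the tag deciding which qubes fetch their date from sys-whonix:

    # Illustrative qubes.GetDate rule in the Qubes 4.2 policy format
    qubes.GetDate  *  @tag:anon-vm  @default  allow target=sys-whonix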

If we don't apply Whonix's Salt states, we are saying that we will be responsible for all Whonix VM configuration beyond just the template download, including RPC policies, and I think there's a risk that our config and theirs will (continue to) diverge. I would say that as long as we are using Whonix VMs, my preference would be to use them as closely as possible to how they are tested upstream, or at least to build on top of an accepted upstream configuration.

(Edit: a last thing I should note is that we were relying on the qvm.anon-whonix state up until Qubes 4.1 support. There isn't a lot of information in that PR on why it was dropped, but my guess is it was to be confident about supporting one whonix version across different Qubes OS versions. However, I'm not sure enough attention was paid to the side-effects of losing those salt states.)

Re template upgrades/version number: it's not ideal that we also maintain a hardcoded version number corresponding to our intended Whonix version, because it is duplicative. But what it means is that as soon as a new Whonix version is supported, users will have it downloaded onto their machines. Then, once we have tested the latest Whonix templates, we bump our version to apply them both to our components (sd-whonix) and to sys- and anon-whonix. The changes are not applied to our VMs, or to preexisting sys- and anon-whonix, until we explicitly bump the supported Whonix version.

Maybe I'm missing something, I'm certainly open to hearing that, but I'm not sure that I see the downsides to this approach? (It does mean that we would have to be timely about whonix version bumps, but we have to do so anyway, because iirc whonix templates go EOL fairly quickly after a new version is available.)

@rocodes rocodes force-pushed the 1060-qvm-whonix-state branch from 6d4149d to 1fa7a86 on January 16, 2025 14:37
@rocodes rocodes commented Jan 16, 2025

(rebased + force-pushed to try to get the CI runner to do its thing; looks like it was stalled out for a while)

@deeplow deeplow commented Jan 17, 2025

> But what it means is that as soon as a new whonix version is supported, users will have it downloaded onto their machines.
> Then, when we have tested the latest whonix templates, we set our version to apply them to both our components (sd-whonix) and also to sys- and anon-whonix, and the changes are not applied to our VMs or to preexisting sys- and anon-whonix until we explicitly bump the supported whonix version.

I see your point. Even if qvm.anon-whonix fetches these templates and sets them for anon-whonix and other Whonix qubes, we'll revert it right away. So, as long as we're not deploying on a system which only has the new Whonix version (and SDW hasn't updated yet), we're golden. And that scenario is not likely to happen, since we advise on the Qubes version to install.

I was originally more concerned about the original Whonix formulas not accounting for in-place changes and being more aimed at running on install (and we do know that there are quite a few challenges with the mix of Salt and state). But upon review, the qvm.anon-whonix formulas are quite simple and change very rarely. So I think it's fine to implement this change.

I still have some questions about the specific implementation of using Salt instead of the template-upgrade-script, especially given the unnecessary overhead of shutting down templates in formulas regardless of the need to switch templates. But doing it in Salt is not a regression, so I think it's fine to have it and get a final answer on that later.

@rocodes rocodes commented Jan 20, 2025

> Even if qvm.anon-whonix fetches these templates and sets them for anon-whonix and other whonix qubes, we'll revert it right away.

It's an even better story than that - we don't change and then revert. The reason that we include these lines is that, even when we were relying on qvm.anon-whonix to provision Whonix VMs, that state uses present and not prefs for the template version. So if the user already has Whonix VMs configured, their templates are not automatically upgraded to the next Whonix version. (So, tl;dr, we are pretty safe IMO in using the upstream-provided Salt states.)
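
A simplified sketch of the present-vs-prefs distinction being described (not the verbatim upstream formula; names are illustrative):

    # 'present' arguments only apply when the qube is first created, while
    # 'prefs' arguments are re-applied on every run and would change
    # settings in place.
    anon-whonix:
      qvm.vm:
        - present:
          - template: whonix-workstation-17   # used only at creation time
          - label: red
        - prefs:
          - netvm: sys-whonix                 # enforced on every salt run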

@rocodes rocodes commented Jan 20, 2025

> I still have some questions about the specific implementation of using Salt instead of the template-upgrade-script, especially given the unnecessary overhead of shutting down templates in formulas regardless of the need to switch templates. But doing it in Salt is not a regression, so I think it's fine to have it and get a final answer on that later.

I think this is worth discussing for sure, and I'm happy to hear your thoughts on #1225. In the case of sys-whonix in particular, all that script does is kill the sys-whonix VM if its base template is wrong, and it's still being invoked via Salt, so I don't know that it's any better to do Salt cmd.run: qvm-kill instead of Salt qvm.shutdown - but I would be very happy to discuss the pros/cons in #1225.
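
For comparison, the two variants being weighed look roughly like this (state IDs are illustrative):

    # Native qvm state, as used in this PR (illustrative ID)
    shutdown-sys-whonix:
      qvm.shutdown:
        - name: sys-whonix

    # Shelling out from Salt, as the script path effectively does (illustrative ID)
    kill-sys-whonix:
      cmd.run:
        - name: qvm-kill sys-whonix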

@deeplow deeplow commented Jan 20, 2025

>> Even if qvm.anon-whonix fetches these templates and sets them for anon-whonix and other whonix qubes, we'll revert it right away.

> It's an even better story than that - we don't change and then revert. The reason that we include these lines is that, even when we were relying on qvm.anon-whonix to provision whonix vms, that state uses present and not prefs for template version. So if the user already has whonix vms configured, their templates are not automatically upgraded to the next whonix version. (so tldr, we are pretty safe IMO in using the upstream-provided salt states.)

I agree that this looks OK, but if you take a look at how the backporting of Whonix 17 to Qubes 4.1 was done, it was done via prefs. Otherwise, when running Salt, it would not actually upgrade existing systems. So we'll have to rely on the "revert" mechanic I was talking about, I believe.

@deeplow deeplow commented Jan 20, 2025

I was testing out a scenario starting from Whonix 16, before following the test plan. But I am getting a recursive requisite:

          ID: dom0-enabled-apparmor-on-whonix-gw-template
    Function: qvm.vm
        Name: whonix-gateway-17
      Result: False
     Comment: Recursive requisite found
     Changes: 

No need to investigate. I may be doing something wrong. Just wanted to post a progress update. I'll follow up tomorrow.

@rocodes rocodes commented Jan 21, 2025

> I agree that this looks OK, but if you take a look at how the backporting of Whonix 17 to Qubes 4.1 was done, it was done via prefs.

Ah, interesting. I see that was ~2 days before the anticipated Whonix 16 EOL. We can track upgrades to make sure that we are paying attention, but my thinking is that Whonix version bumps don't affect us much, and we should be able to keep up and release a version bump without much fuss to coincide with upstream?

I don't know if I fully understand what you mean about testing out the scenario from whonix 16 - can you explain what steps you took or what configuration you were using when you hit the recursive requisite? (Edit: and/or also post more of the console output?)

@deeplow deeplow commented Jan 21, 2025

> I don't know if I fully understand what you mean about testing out the scenario from whonix 16 - can you explain what steps you took or what configuration you were using when you hit the recursive requisite?

Sorry, I should have clarified. I have hit the recursive requisite when upgrading from main to this branch. Basically following your scenario.

@deeplow deeplow commented Jan 21, 2025

After failing on make dev, I enabled debug mode in salt and ran sudo qubesctl state.sls securedrop_salt so I wouldn't have to run the full thing again. Here's the log (see the failure at the bottom -- maybe ctrl+f on dom0-enabled-apparmor-on-whonix-gw-template).

securedrop_salt.sd-sys-whonx-vms.sls-run.log

@rocodes rocodes force-pushed the 1060-qvm-whonix-state branch from 1fa7a86 to ef87138 on January 21, 2025 18:45
@zenmonkeykstop zenmonkeykstop added this to the 1.1.0 milestone Jan 21, 2025
@rocodes rocodes marked this pull request as draft January 22, 2025 23:42
@rocodes rocodes commented Jan 22, 2025

I'm pulling this back into draft for a bit - it's an interesting/hilarious/unfortunate issue that as far as I can tell comes down to a naming "feature" in Salt where states can be referenced by either ID or by name, so when we require sls: qvm.anon-whonix (upstream) as a prereq, and then try to perform an operation on the anon-whonix or sys-whonix vm some time afterwards, Salt thinks we are creating a recursive requisite.

There are workarounds (for example, require the qvm.whonix-workstation-{template,gateway} states instead of the final anon-whonix state), but I want to see what the options are; I'm not out of ideas yet :)
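
A sketch of that workaround, assuming hypothetical upstream SLS names (check the shipped qvm formulas for the real file names): require the template-level state rather than sls: qvm.anon-whonix, so that a later operation on the anon-whonix or sys-whonix qubes is not seen as a self-reference:

    # Illustrative only: 'qvm.whonix-gateway' is a placeholder for whatever
    # upstream SLS provisions the gateway template.
    include:
      - qvm.whonix-gateway

    dom0-enabled-apparmor-on-whonix-gw-template:
      qvm.vm:
        - name: whonix-gateway-17
        # ... prefs/features as in the real state ...
        - require:
          - sls: qvm.whonix-gateway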

@rocodes rocodes force-pushed the 1060-qvm-whonix-state branch 2 times, most recently from 8301aec to 4528473 on January 28, 2025 15:33
@rocodes rocodes commented Jan 28, 2025

CI failure is on an unrelated section (sd-log shutdown) - will rebase/push hopefully for a successful CI run.

@rocodes rocodes marked this pull request as ready for review January 28, 2025 16:55
…enforce updated template status. Use salt to power off sys whonix VMs instead of securedrop-handle-upgrade script.
@rocodes rocodes force-pushed the 1060-qvm-whonix-state branch from 4528473 to 518ea6b on January 28, 2025 18:22
@rocodes rocodes commented Jan 28, 2025

As currently written, this PR would deprecate sd-whonix-version and rely on the version set in the qvm.whonix pillar, meaning that as soon as the version is bumped upstream, sdw users would be upgraded.

After talking to @zenmonkeykstop, this approach seems preferable for simplicity and also because it avoids a case where there is time between an upstream version bump and our version bump, during which users performing a clean install would be downgraded to a lower whonix version.

This does mean that we should test the sd-whonix config package against Whonix testing, which we can do on dev machines using the built-in qvm.whonix-testing pillar, and/or on CI.
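
A hypothetical sketch of relying on the upstream-pinned version instead of a hardcoded sd-whonix-version (the pillar key name is an assumption; check the qvm.whonix pillar layout in dom0 for the real one):

    {# Hypothetical: 'qvm:whonix:version' may not be the actual pillar key #}
    {% set whonix_version = salt['pillar.get']('qvm:whonix:version', '17') %}

    sd-whonix:
      qvm.vm:
        - prefs:
          - template: whonix-gateway-{{ whonix_version }}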
