Enable manual noble upgrade #7427
base: develop
9918778 to e217766
This looks good on initial code review, @legoktm. I've left a few questions inline for discussion.
I got set up to test this but got stuck on a surprising error (see below) from `UBUNTU_VERSION=noble make build-debs` that I didn't hit while testing #7406, notably before #7437. I'll pick this back up next week, probably via cfm/terraform-metal-securedrop-staging@27bcac7.
UBUNTU_VERSION=noble make build-debs
dpkg-shlibdeps: error: cannot find library libssl.so.1.1 needed by debian/securedrop-app-code/opt/venvs/securedrop-app-code/lib/python3.12/site-packages/redwood/redwood.cpython-38-x86_64-linux-gnu.so (ELF format: 'elf64-x86-64' abi: 'ELF:64:l:amd64:0'; RPATH: '')
dpkg-shlibdeps: error: cannot find library libcrypto.so.1.1 needed by debian/securedrop-app-code/opt/venvs/securedrop-app-code/lib/python3.12/site-packages/redwood/redwood.cpython-38-x86_64-linux-gnu.so (ELF format: 'elf64-x86-64' abi: 'ELF:64:l:amd64:0'; RPATH: '')
dpkg-shlibdeps: warning: package could avoid a useless dependency if debian/securedrop-app-code/opt/venvs/securedrop-app-code/lib/python3.12/site-packages/redwood/redwood.cpython-38-x86_64-linux-gnu.so was not linked against libdl.so.2 (it uses none of the library's symbols)
dpkg-shlibdeps: warning: package could avoid a useless dependency if debian/securedrop-app-code/opt/venvs/securedrop-app-code/lib/python3.12/site-packages/redwood/redwood.cpython-38-x86_64-linux-gnu.so debian/securedrop-app-code/opt/venvs/securedrop-app-code/lib/python3.12/site-packages/redwood/redwood.cpython-312-x86_64-linux-gnu.so were not linked against libm.so.6 (they use none of the library's symbols)
dpkg-shlibdeps: warning: package could avoid a useless dependency if debian/securedrop-app-code/opt/venvs/securedrop-app-code/lib/python3.12/site-packages/redwood/redwood.cpython-38-x86_64-linux-gnu.so was not linked against libpthread.so.0 (it uses none of the library's symbols)
dpkg-shlibdeps: error: cannot continue due to the errors listed above
Note: libraries are not searched in other binary packages that do not have any shlibs or symbols file.
To help dpkg-shlibdeps find private libraries, you might need to use -l.
dh_shlibdeps: error: dpkg-shlibdeps -Tdebian/securedrop-app-code.substvars debian/securedrop-app-code/opt/venvs/securedrop-app-code/lib/python3.12/site-packages/mod_wsgi/server/mod_wsgi-py312.cpython-312-x86_64-linux-gnu.so debian/securedrop-app-code/opt/venvs/securedrop-app-code/lib/python3.12/site-packages/psutil/_psutil_linux.cpython-312-x86_64-linux-gnu.so debian/securedrop-app-code/opt/venvs/securedrop-app-code/lib/python3.12/site-packages/psutil/_psutil_posix.cpython-312-x86_64-linux-gnu.so debian/securedrop-app-code/opt/venvs/securedrop-app-code/lib/python3.12/site-packages/redwood/redwood.cpython-38-x86_64-linux-gnu.so debian/securedrop-app-code/opt/venvs/securedrop-app-code/lib/python3.12/site-packages/redwood/redwood.cpython-312-x86_64-linux-gnu.so debian/securedrop-app-code/opt/venvs/securedrop-app-code/lib/python3.12/site-packages/cryptography/hazmat/bindings/_rust.abi3.so debian/securedrop-app-code/opt/venvs/securedrop-app-code/lib/python3.12/site-packages/argon2/_ffi.abi3.so debian/securedrop-app-code/opt/venvs/securedrop-app-code/lib/python3.12/site-packages/sqlalchemy/cresultproxy.cpython-312-x86_64-linux-gnu.so debian/securedrop-app-code/opt/venvs/securedrop-app-code/lib/python3.12/site-packages/sqlalchemy/cprocessors.cpython-312-x86_64-linux-gnu.so debian/securedrop-app-code/opt/venvs/securedrop-app-code/lib/python3.12/site-packages/sqlalchemy/cutils.cpython-312-x86_64-linux-gnu.so debian/securedrop-app-code/opt/venvs/securedrop-app-code/lib/python3.12/site-packages/markupsafe/_speedups.cpython-312-x86_64-linux-gnu.so debian/securedrop-app-code/opt/venvs/securedrop-app-code/lib/python3.12/site-packages/_cffi_backend.cpython-312-x86_64-linux-gnu.so returned exit code 2
dh_shlibdeps: error: Aborting due to earlier error
make: *** [debian/rules:8: binary] Error 2
dpkg-buildpackage: error: debian/rules binary subprocess returned exit status 2
Script done.
make: *** [Makefile:514: build-debs] Error 2
install_files/ansible-base/roles/noble-migration/tasks/main.yml
- name: Skip migration if already done
  set_fact:
    already_finished: "{{ not migration_json.failed and (migration_json.content | b64decode | from_json)['finished'] == 'Done' }}"
Could the JSON check ever succeed if `migration_json.failed == True`?
Suggested change:
- already_finished: "{{ not migration_json.failed and (migration_json.content | b64decode | from_json)['finished'] == 'Done' }}"
+ already_finished: "{{ (migration_json.content | b64decode | from_json)['finished'] == 'Done' }}"
Noting for my own understanding: We don't just `set_fact` with the value of `finished` here because later we'll use `ansible.builtin.wait_for` to wait/block until the server reaches the target state.
I believe we still need to verify the `ansible.builtin.slurp` step didn't fail, because it has `ignore_errors: yes`. Otherwise accessing `migration_json.content` will error.
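To make the point in this thread concrete, here is a minimal Python sketch of the guarded check. It assumes the registered `slurp` result behaves like a dict with a `failed` flag and, only on success, a base64-encoded `content` field. The function names and the `InProgress` value are made up for illustration and are not part of the PR.

```python
import base64
import json


def already_finished(migration_json: dict) -> bool:
    """Mirror of the Jinja expression: True only if the state file was read
    successfully and its 'finished' field equals 'Done'."""
    # A failed slurp (tolerated via ignore_errors) is assumed to carry no
    # 'content' key, so the failure check must short-circuit before the
    # content is decoded.
    if migration_json.get("failed", False):
        return False
    state = json.loads(base64.b64decode(migration_json["content"]))
    return state["finished"] == "Done"


def fake_slurp(state: dict) -> dict:
    """Build a dict shaped like a successful slurp result (illustrative helper)."""
    encoded = base64.b64encode(json.dumps(state).encode()).decode()
    return {"failed": False, "content": encoded}
```

Without the `failed` guard, the second case in a check like `already_finished({"failed": True, "msg": "..."})` would raise a `KeyError` on `content` instead of returning `False`, which is the error described above.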
This happens when you build for focal and then for noble; I never tracked down the exact cause. There's some issue with the script incorrectly reusing the `.so` (note that it's `redwood.cpython-38` in a `python3.12` path); a `git clean` will fix it, IIRC.
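The mismatch called out here (a `cpython-38` module under a `python3.12` path) can also be spotted mechanically. A rough sketch, assuming the usual `pythonX.Y/.../name.cpython-XY-<platform>.so` layout seen in the log above; the helper name is hypothetical and nothing like it exists in the repository as far as this thread shows:

```python
import re
from pathlib import PurePosixPath

# "python3.12" directory component, and the CPython tag in an extension
# filename such as "redwood.cpython-38-x86_64-linux-gnu.so".
_DIR_RE = re.compile(r"python(\d+)\.(\d+)$")
_TAG_RE = re.compile(r"\.cpython-(\d+)-")


def is_stale_extension(path: str) -> bool:
    """Return True if a .so's CPython tag disagrees with its pythonX.Y directory."""
    p = PurePosixPath(path)
    tag = _TAG_RE.search(p.name)
    if tag is None:
        # abi3 modules (e.g. _rust.abi3.so) carry no version-specific tag.
        return False
    for part in p.parts:
        m = _DIR_RE.match(part)
        if m:
            # "3" + "12" -> "312", compared against the tag's digits.
            return tag.group(1) != m.group(1) + m.group(2)
    return False
```

Any path this flags would be a leftover from an earlier build for a different Python version, which is consistent with a `git clean` making the error go away.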
e217766 to f9fa489
Status
Ready for review
Description of Changes
Admins can run `./securedrop-admin noble_migration` to trigger a manual noble migration.
At a high level the playbook:
* disables OSSEC notifications
* triggers the app upgrade, waiting through two reboots
* triggers the mon upgrade, again waiting through reboots
* re-enables OSSEC notifications
The most complicated part is how we wait for the reboots. We first have a `wait_for` that looks for a specific stage in the state file. Because the upgrade script writes the state file and then immediately reboots, the task should never actually succeed; it should fail because the connection is interrupted. So we set `ignore_unreachable` and `ignore_errors`, and the next block is a `wait_for_connection` that waits for the server to come back up. There is a delay before we begin checking, just in case the `wait_for` did succeed and we need to wait for the reboot to happen.
Because of this sequencing, there isn't any support for the playbook failing mid-host and being restarted. That is probably unnecessary since, once started, the upgrade will automatically finish by itself.
The script does support one host already being upgraded and the other still needing migration. So if e.g. the app migration fails, you can manually fix the host, let it finish the upgrade automatically, and then re-run the playbook to migrate mon.
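The reboot handling described above can be modelled outside Ansible. The following Python sketch is only an illustration of the sequencing, not the playbook itself: `read_stage` and `is_reachable` are hypothetical callables standing in for the `wait_for` check on the state file and the `wait_for_connection` probe, and the stage names used with it are placeholders.

```python
import time
from typing import Callable


def wait_through_reboot(
    read_stage: Callable[[], str],
    is_reachable: Callable[[], bool],
    target_stage: str,
    delay: float = 0.0,
    poll: float = 0.0,
    attempts: int = 10,
) -> bool:
    """Wait for a host to reach `target_stage`, then for it to return from reboot."""
    # Step 1: look for the stage in the state file. The host reboots right
    # after writing it, so a dropped connection is the expected outcome and
    # is swallowed (the analogue of ignore_unreachable / ignore_errors).
    # Running out of attempts is likewise ignored and we simply move on.
    for _ in range(attempts):
        try:
            if read_stage() == target_stage:
                break
        except ConnectionError:
            break
        time.sleep(poll)

    # Step 2: pause before probing, in case step 1 succeeded and the reboot
    # has not started yet; otherwise we could "reconnect" to the host
    # before it has gone down.
    time.sleep(delay)

    # Step 3: the analogue of wait_for_connection.
    for _ in range(attempts):
        if is_reachable():
            return True
        time.sleep(poll)
    return False
```

Because step 1 deliberately discards its own outcome, the function cannot tell where it left off if it is interrupted, which matches the lack of mid-host restart support noted above.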
Fixes #7416.
Testing
How should the reviewer test this PR?
./securedrop-admin --force noble_migration
Deployment
Any special considerations for deployment? n/a
Checklist
Tests (`make -C admin test`) pass in the admin development container