-
Notifications
You must be signed in to change notification settings - Fork 171
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kola: Add isolation=readonly|dynamicuser
#3060
Conversation
This comment was marked as outdated.
This comment was marked as outdated.
This is great! |
db1f6a7
to
d58af5f
Compare
exclusive: false
tests to use ProtectSystem=strict
isolation=readonly|dynamicuser
and `dynamicuser` means `DynamicUser=yes`. Setting either of these options | ||
also implies `exclusive: false`. Use these for tests that are mostly about |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really think it's best if we don't rely on exclusive: false
being implicit
anywhere. Tests can do things in their butane config that are exclusive and would/could cause other tests to fail.
If you want an isolation
and dynamicuser
flag that's OK I guess, but I really want to keep exlusive: false
explicit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, we could also enforce that such tests don't provide an ignition config.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not really opposed to also requiring an explicit exclusive: false
as long as we don't force people to copy-paste the rationale for it 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now strengthened this to prevent the test from providing an Ignition config too!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did also test the testing framework manually:
$ cat > src/config/tests/kola/content-origins/config.bu << 'EOF'
variant: fcos
version: 1.2.0
EOF
$ kola run ext.config.content-origins
Error: test ext.config.content-origins specifies isolation=dynamicuser but includes an Ignition config
2022-09-01T14:09:32Z cli: test ext.config.content-origins specifies isolation=dynamicuser but includes an Ignition config
$
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that you still will probably include a comment/rationale for isolation: readonly
. I'm not really sure what you gain here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would I add a comment/rationale for that? Would you block coreos/fedora-coreos-config#1949 on that? If so I'm sorry I just disagree, it's self-evidently obvious and doesn't need a comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the problem is that a lot of the configuration options here aren't self-evident especially to someone who didn't write the test and it's safer to just default to including a rationale for everything to make sure we catch the full context.
I agree with you that if you read the documentation on this one, and then you read the systemd documentation on that option, then it's clear: "this test is completely read-only and doesn't write to the filesystem at all", but that requires... work, we could just state it up front. It doesn't really cost us anything to help the user read/understand what's going on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you block coreos/fedora-coreos-config#1949 on that?
And.. the answer is "no", I'm not that dogmatic. I'd suggest it but if you firmly say "are you going to block this on that?" I wouldn't. Later on, though, if I ever touched the file I'd be like "hmm, wonder why these options aren't documented like all the others?" and I'd add it back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I think we can continue a debate about the comment schema after landing this.
d58af5f
to
008fd81
Compare
Part of my war against duplicative comments in our kola tests. We have a few tests that write a comment like this: ``` # - exclusive: false # - This test doesn't make meaningful changes to the system and # should be able to be combined with other tests. ``` Of course, this comment is already redundant because the meaning of the `exclusive` tag is defined canonically in coreos-assembler (here) and copy-pasting that into every test that uses it would be pointlessly verbose. But - we can do one better. Instead of having a test flag which is mainly an "I promise not to mutate the system in a way which could interfere with other tests", let's add a field that *enforces* this. Then it doesn't need to be commented; we have a variety of tests which are just "system inspection" (e.g. query rpmdb) and run just fine with `DynamicUser=yes` and hence *cannot* affect the system, and hence are inherently isolated from other concurrent tests.
008fd81
to
a21e0e4
Compare
if userdata != nil { | ||
return fmt.Errorf("test %v specifies isolation=%v but includes an Ignition config", testname, targetMeta.Isolation) | ||
} | ||
targetMeta.Exclusive = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
targetMeta.Exclusive = false |
Still need to drop this bit as discussed in https://github.com/coreos/coreos-assembler/pull/3060/files#r960684553 and https://github.com/coreos/coreos-assembler/pull/3060/files#discussion_r960795802
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no use case for exclusive: true; isolation=<something>
tests right? Hence, it would always be exclusive: false; isolation=<something>
which is redundant.
I definitely see this replacing most of our use of exclusive: false
and so I hence don't see the value in requiring it to also be specified.
In practice, our test framework is mostly something that needs to be understood by a relatively small set of people. Sure, this is a change that some people would need to learn about later, but it's really not a big change either, and we have docs. And the existing test corpus acts as an example.
I think enhancing the robustness of the test suite outweighs the very short term transition period where we some people may need to learn that isolation
implies ➡️ exclusive: false
.
Here's another way to look at this - a lot of people I'm sure start writing a new test by copy-pasting an old one. Today it's really easy to copy-paste an exclusive: false
test but not think about what that means in your new test and acidentally write a test that does mutate the system in a way likely to cause flakes.
With isolation
- we're adding some locks around the gun case that you'd have to bypass to use the footgun 😄
Your proposal would implicitly support people accidentally doing just isolation: dynamicuser
and we'd have to remember that there's no use case for that and to later manually add exclusive: false
. Which would be another footgun case, although a much less severe one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm just stating and restating and stating again that I don't want to imply exclusive: false
.
If someone added a test that was likely to cause flakes because it's run with other tests we'd catch that today because we automatically re-run failed tests on their own and we'd see that the re-run passed.
I get what you were trying to do with this, but the model we developed doesn't apply 1:1
with what you wanted it to be.
If we change anything what I would prefer is for non-exclusive tests to default to ProtectSystem=strict
and then allow adding a tag (or I guess it could be a toplevel option) to disable that (for example to those two tests that you found that don't comply). Then a user has to think about it when they choose to override that default behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I already wrote up a comment in the other thread before I saw Dusty's reply proposing essentially the same thing. Moving it here for completeness:
I think the isolation functionality is potentially useful, but if the goal is to reduce boilerplate comments, the proposed flags are counterproductive. Tests have to opt in to this functionality by writing something like "isolation": "dynamicuser"
, and if someone wants to find out what that means, they can look it up in cosa docs and learn that it means DynamicUser=yes
. Okay, now they have to find an option in systemd docs, which isn't easy unless they know about systemd.directives(7)
. And why does the test use that? What are the implications? Why would I select dynamicuser
rather than readonly
, and why can't I select both? (dynamicuser
implies readonly
, but that's more spelunking in systemd docs.)
So let's figure out what we actually want here, without recreating systemd security option spaghetti. Proposal: for tests that have exclusive: false
and do not provide an Ignition/Butane config, automatically set DynamicUser=yes
unless a second option is specified, something like restrict: false
. That way we default to maximum isolation unless the test says that it wants less; we'll easily notice that case because the test will fail. If there are known cases where we want ProtectSystem=strict
without DynamicUser=yes
, we can provide an option for that too, or make the first one multi-valued. Maybe in the Ignition + exclusive: false
case we'd want to require an explicit restrict: false
, to make it clear that protections are being disabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm just stating and restating and stating again that I don't want to imply exclusive: false.
Your seems to overlap with bgilbert's, so replying below:
and if someone wants to find out what that means, they can look it up in cosa docs and learn that it means DynamicUser=yes. Okay, now they have to find an option in systemd docs, which isn't easy unless they know about systemd.directives(7).
Right, but the developers on the team already have to read the external-tests.md documentation to understand much at all. Needing that isn't...isolated...to this change.
Proposal: for tests that have exclusive: false and do not provide an Ignition/Butane config, automatically set DynamicUser=yes unless a second option is specified, something like restrict: false.
Hmmm...definitely implicit-over-explicit going on here. I think having an ignition/butane config provided turning off isolation would be too magical, and instead we should error out and require isolation: none
or so.
Overall I'm absolutely fine with defaulting exclusive: false
tests to being isolated...but it will require some ratcheting.
A tricky thing here is that some of our read only tests actually do need to still run as root because they traverse the filesystem and we have some protected directories.
This is why I introduced both isolation: readonly
(runs as root, but read only) and isolation: dynamicuser
.
Anyways then so then we're agreeing to:
- add
isolation: none|readonly|dynamicuser
to coreos-assembler but don't change its relationship withexclusive
- Do a PR to fedora-coreos-config which adds
exclusive: false; isolation: none
to the tests that need it - Another (or same) PR to fedora-coreos-config which adds
exclusive: false; isolation: readonly
to tests that need to be root but are still readonly - Change this PR to default
exclusive: false
toisolation: dynamicuser
I need to more carefully look at the case of external tests with ignition configs and see the impact.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did see it. It does not address my concerns from the comment #3060 (comment) it was replying to, nor my restatement just now in #3060 (comment). My -1 will remain in place until those concerns are addressed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah...you're proposing s/isolation: none/restrict: false/
basically, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think restrict
is slightly better than isolation
, but the main point is avoiding dynamicuser
and readonly
. Instead of dynamicuser
, we can have something like restrict: true
(implicit because it's the default), and instead of readonly
, something like an implicit restrict: true
with an explicit run-as-root: true
. (I'm open to better naming ideas also.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Err...I was with you on the first half of the sentence - sure, we can have a boolean instead a tristate (I still don't think a tristate is hard or complicated but whatever, I was mainly on board with just having a boolean). That seems to be what Dusty was arguing for too.
Then in the second half you made up another boolean - and now we have two booleans which is (I hope you'd agree if you step back for a moment) not actually simpler.
For one thing, two booleans is four potential states. The new one of which (restrict: false, run-as-root: true) is redundant. And any future expansion via additional booleans becomes quickly grows the state space.
I think I'd much prefer just a boolean (if that's what we're going down to) alongside the ability for tests to inject arbitrary systemd unit directives, which would likely help solve other things too (e.g. ordering against other units).
Edit: of course, two booleans for this combines with wanting to retain a boolean exclusive: false
because now there are other states that seem at best useless (exclusive: true, restrict: true
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The extra boolean isn't simpler for the implementation, but we should prioritize building intuitive directives for the test author. I think distinct flags help with that, but I'm okay with a three-valued enum if we can find sensible names for the values. I see how the arbitrary directive injection could be useful, but using it to override DynamicUser
would be pretty awkward, and I don't think encouraging that helps with the goal of simplifying tests.
For completeness, I'll note a couple things. One is that an enum implies that there's only one dimension of hardening, and down the road we might find that we want to compose multiple hardening flags anyway. The other is that systemd has the sort of interlocking options you're arguing against. DynamicUser
is a separate option from ProtectSystem
, but implies it. (systemd has a gazillion options though, which I agree can be hard to work with.)
I have deadlines I need to focus on, so I'm muting this thread for now. I've tried to fully capture my constraints in #3060 (comment) (but s/flags/options/
if you want to pursue the enum). I'll check back later, which might be after OCP feature freeze, and my -1 stands until then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hold pending resolution of #3060 (comment).
@cgwalters: PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Closing this. I don't think it's a priority right now. If we want to implement the suggested changes from the discussion let's start with an issue explaining the proposal. |
EDIT: I apologize for the tone in this PR that started from me; it was absolutely unnecessary. Again, I apologize.
Part of my battle against duplicative comments in our kola tests.
We have a few tests that write a comment like this:
Of course, this comment is already redundant because the
meaning of the
exclusive
tag is defined canonically in coreos-assembler(here) and copy-pasting that into every test that uses it would
be pointlessly verbose.
But - we can do one better. Instead of having a test flag which
is mainly an "I promise not to mutate the system in a way which could
interfere with other tests", let's add a field that enforces this.
Then it doesn't need to be commented; we have a variety of
tests which are just "system inspection" (e.g. query rpmdb) and
run just fine with
DynamicUser=yes
and hence cannot affectthe system, and hence are inherently isolated from other concurrent
tests.