[red-knot] Should A ∧ !A always be false? #15839

dcreager · 2025-01-30T22:49:55Z

This mimics a simplification we have on the OR side, where we simplify A ∨ !A to true. This requires changes to how we add while statements to the semantic index, since we now need distinct VisibilityConstraints if we need to model evaluating a Constraint multiple times at different points in the execution of the program.

carljm

I don't think this holds, because we are dealing with three-valued logic (there is the Ambiguous possibility), and a value of Ambiguous truthiness can change truthiness between one time it is checked and another.

Multiple checks of the truthiness of the same condition only occur in while loops, which is why all of the test failures occur in while-loop tests. This PR models the premise that a while-loop test can never change truthiness over time, which would indeed lead to the conclusion seen in the test failures, that a while loop with no break must either be never entered at all, or must be an infinite loop once entered.

I think in order for this PR to be valid, we would need to create two separate visibility constraints for the same while-loop test expression, one for "as checked on first entry" and another for "as checked on subsequent iteration", so that we wouldn't actually see those as "the same constraint", and would thus model the possibility of entering the loop and then leaving it on a later iteration. (I think this same change in while loop visibility constraints would be necessary if we switch to BDDs for visibility constraints.)

I think technically the existing OR simplification is wrong for the same reason? But in practice it doesn't cause a problem, either because the OR simplification case doesn't arise with while loops, or because simplifying to ALWAYS_TRUE doesn't lead us to eliminate paths as unreachable, or for both reasons.

carljm · 2025-01-31T00:56:08Z

cc @sharkdp, you might find this interesting.

dcreager · 2025-01-31T01:26:20Z

a value of Ambiguous truthiness can change truthiness between one time it is checked and another.

Yep, that makes sense! The part that I wasn't sure about was what the scope of "one time it is checked and another" is. My intuition would be that a literal (in the boolean formula sense, not the Python source sense) might evaluate to Ambiguous, meaning that we don't know if it's true or false — but that within the context of a particular evaluation of the boolean formula, it would have the same true or false value each time it appears. Which would make A ∧ !A == 0 hold.

Multiple checks of the truthiness of the same condition only occur in while loops, which is why all of the test failures occur in while-loop tests.

Or put another way, while-loop tests are the only place (at least for now) where we need a visibility constraint to model the relationship between two different states in the execution trace of the program.

I think technically the existing OR simplification is wrong for the same reason?

Definitely agree — and I think that's a better wording of my question! 😄 Either both A ∧ !A == 0 and A ∨ !A == 1 should hold, or neither should.

carljm · 2025-01-31T01:34:28Z

within the context of a particular evaluation of the boolean formula, it would have the same true or false value each time it appears

The problem is that we construct boolean formulae (for visibility of paths through a while loop) which contain a single constraint that represents checks of the while-test expression's truthiness at different points in execution state (first entry vs later iteration). Because we do this, we create the scenario that "the same constraint" in the same boolean formula doesn't necessarily have a consistent truthiness value, even for a single evaluation of the formula.

If we instead created separate visibility constraints (referring to the same test expression) for the truthiness checks in these two different execution states, then the simplification would hold. (I think this change in the semantic index builder's handling of while loops would not be too difficult, but I haven't tried to make the change.)

carljm · 2025-01-31T01:43:33Z

Once we model the back-edge in control flow in a while loop, we could even evaluate the test expression under two different conditions, one without the back-edge (for first entry) and one with the back-edge (for later iteration). This could change our evaluation of its static truthiness, too. (Imagine an x = True; while x: x = get_bool().)

* main: [`flake8-pyi`] Fix several correctness issues with `custom-type-var-return-type` (`PYI019`) (#15851) [`pyupgrade`] Reuse replacement logic from `UP046` and `UP047` (`UP040`) (#15840) [`refurb`] Avoid `None | None` as well as better detection and fix (`FURB168`) (#15779) Remove non-existing `lint.extendIgnore` editor setting (#15844) [`refurb`] Mark fix as unsafe if there are comments (`FURB171`) (#15832) [`flake8-comprehensions`] Skip when `TypeError` present from too many (kw)args for `C410`,`C411`, and `C418` (#15838) [`pyflakes`] Visit forward annotations in `TypeAliasType` as types (`F401`) (#15829)

dcreager · 2025-01-31T17:36:01Z

I think in order for this PR to be valid, we would need to create two separate visibility constraints for the same while-loop test expression, one for "as checked on first entry" and another for "as checked on subsequent iteration", so that we wouldn't actually see those as "the same constraint", and would thus model the possibility of entering the loop and then leaving it on a later iteration. (I think this same change in while loop visibility constraints would be necessary if we switch to BDDs for visibility constraints.)

This is done!

carljm

Looks like this fixes the test failures, sweet!

carljm · 2025-01-31T17:48:29Z

crates/red_knot_python_semantic/src/semantic_index/builder.rs

+                // The first time the body is executed, we will have only checked the first
+                // evaluation of the condition, which must be True. For the second and later times,
+                // the first evaluation must still have been True, and a later evaluation was also
+                // True. So the full visibility constraint is (first ∨ (first ∧ later)).
+                let first_and_later = self
+                    .current_visibility_constraints_mut()
+                    .add_and_constraint(first_vis_constraint_id, later_vis_constraint_id);
+                let body_vis_constraint_id = self
+                    .current_visibility_constraints_mut()
+                    .add_or_constraint(first_vis_constraint_id, first_and_later);
+                self.record_visibility_constraint_id(body_vis_constraint_id);


Doesn't (first ∨ (first ∧ later)) simplify to just first? If first is true, it is always true; if first is false, it is always false.

Which I think is also intuitive, because for the body of the while loop to be reachable, it is both necessary and sufficient that the first evaluation of the test condition be truthy.

Yes, good catch! I had thought I needed to assert that later evaluations of the condition also evaluate to true. But this shows that that's not the right interpretation — we don't know that we did evaluate the condition a second time, and so executing the body does not imply that (there was) a later evaluation (and it) evaluated to true.

carljm · 2025-01-31T17:58:02Z

crates/red_knot_python_semantic/src/semantic_index/builder.rs

+                let first_vis_constraint_id =
+                    self.add_visibility_constraint(VisibilityConstraint::VisibleIf(
+                        VisibilityConstraintAtom::new_copy(constraint, 0),
+                    ));
+                let later_vis_constraint_id =
+                    self.add_visibility_constraint(VisibilityConstraint::VisibleIf(
+                        VisibilityConstraintAtom::new_copy(constraint, 1),
+                    ));


I wonder if this could be a bit more ergonomic, and not have to expose VisibilityConstraintAtom and the implementation detail of its internal u8 to the semantic index builder? But I see there are complexities there, and with only one call site at the moment I don't think it matters much.

Done. I removed the Atom wrapper. My thought here was to reduce the number of , 0s I'd have to add, but most of those were already hidden behind helper methods

(Making the u8 not visible at all would be more invasive, since that's how the caller up in the semantic index builder specifies that it wants distinct copies. It's up to the caller to choose the values, such that distinct u8s refer to different evaluations of the constraint. Do you have thoughts on a different API shape to express that?)

I think I was vaguely envisioning some kind of copy API that would take an existing constraint ID and bump the counter behind the scenes. But the awkwardness there is that a constraint ID might not represent a VisibleIf constraint at all, making the API error prone. So I'm not sure that would be an improvement. It's already much nicer without the extra Atom type, I think what you have is good.

sharkdp · 2025-01-31T20:40:30Z

crates/red_knot_python_semantic/src/visibility_constraints.rs

+/// The primitive atoms of the formula are [`Constraint`]s, which express some property of the
+/// runtime state of the code that we are analyzing.
+///
+/// We assume that each atom has a stable value each time that the formula is evaluated. An atom
+/// that resolves to `Ambiguous` might be true or false, and we can't tell which — but within that
+/// evaluation, we assume that the atom has the _same_ unknown value each time it appears. That
+/// allows us to perform simplifications like `A ∨ !A → true` and `A ∧ !A → false`.
+///
+/// That means that when you are constructing a formula, you might need to create distinct atoms
+/// for a particular [`Constraint`], if your formula needs to consider how a particular runtime
+/// property might be different at different points in the execution of the program.


It's late here already, but just reading this, I wonder why it applies to while conditions, but not to something like

for _ in iterable: if <cond>: # …

Is it because we merge control flow twice here (after the if, after the for) instead of just once in the while loop? Would a break point inside the if complicate the situation?

I think it's because currently a for loop generates no visibility constraints. (We don't attempt to detect statically-known-empty iterables.) If we did add that feature, we might need to consider this, but I'm still not sure it would apply unless we also tried to detect statically-known-infinite iterables! Which I doubt we would ever do.

EDIT: ignore the above, I totally missed the point of your comment, which was the nested if test. Thinking about it...

Not sure I've totally thought through all the possible concrete visibility constraint trees we might generate here, but my intuition says it's not an issue in this scenario, because there is no difference between any one iteration of the loop and any other one, in terms of the effect on which code is visible or not. Which is not true in a while loop, where the key fact is that a static True means we'll never exit the loop, a static False means we'll never enter it, and a True that later changes to False means we can enter and then exit.

It seems to me like the following two are similar control flow constructs (ignoring else branches in loops…):

while cond: # <code block>

and

for _ in itertools.repeat(None): if not cond: break # <code block>

It looks like we can handle the former, but not the latter:

import itertools cond = False w = 1 f = 1 while cond: w = 2 for _ in itertools.repeat(None): if not cond: break f = 2 print(reveal_type(w)) # Literal[1] print(reveal_type(f)) # Literal[1, 2] ?

Or am I missing something? This might not be related to this PR at all(?).

(Edit: I guess I could only expect the latter to be working with #15817)

I think it's more than #15817. The latter example is only equivalent to the former if we are able to detect that an iterator is infinite. This is not reflected in the type system, and would require special casing for certain iterables (like ‘itertools.repeat`); it's not possible in general.

If the iterator is not infinite (or not known to be infinite), then exit from the loop is always possible at every iteration, regardless of the value of cond, and the result we currently give is correct.

Detecting known-empty iterators and their effect on for loop control flow is similarly difficult to do in general.

Should A ∧ !A always be false?

93efda5

dcreager requested review from carljm, MichaReiser, AlexWaygood and sharkdp as code owners January 30, 2025 22:49

dcreager added internal An internal refactor or improvement red-knot Multi-file analysis & type inference labels Jan 30, 2025

carljm approved these changes Jan 31, 2025

View reviewed changes

carljm requested changes Jan 31, 2025

View reviewed changes

dcreager added 3 commits January 31, 2025 09:48

Add (currently unused) copy number to constraint atoms

8b47792

Fix while loop builder to adhere to the new assumption

2b81b75

carljm approved these changes Jan 31, 2025

View reviewed changes

dcreager added 3 commits January 31, 2025 13:18

Simplify body constraint

43b1d8a

Remove atom wrapper

fc00f02

Vis update not needed

479d169

carljm approved these changes Jan 31, 2025

View reviewed changes

dcreager merged commit fab86de into main Jan 31, 2025
21 checks passed

dcreager deleted the dcreager/visconand branch January 31, 2025 19:06

sharkdp reviewed Jan 31, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[red-knot] Should A ∧ !A always be false? #15839

[red-knot] Should A ∧ !A always be false? #15839

dcreager commented Jan 30, 2025 •

edited

Loading

carljm left a comment •

edited

Loading

carljm commented Jan 31, 2025

dcreager commented Jan 31, 2025 •

edited

Loading

carljm commented Jan 31, 2025 •

edited

Loading

carljm commented Jan 31, 2025

dcreager commented Jan 31, 2025

carljm left a comment

carljm Jan 31, 2025 •

edited

Loading

dcreager Jan 31, 2025

carljm Jan 31, 2025

dcreager Jan 31, 2025

dcreager Jan 31, 2025

carljm Jan 31, 2025

sharkdp Jan 31, 2025

carljm Jan 31, 2025 •

edited

Loading

carljm Jan 31, 2025

sharkdp Feb 3, 2025 •

edited

Loading

carljm Feb 3, 2025 •

edited

Loading

[red-knot] Should A ∧ !A always be false? #15839

[red-knot] Should A ∧ !A always be false? #15839

Conversation

dcreager commented Jan 30, 2025 • edited Loading

carljm left a comment • edited Loading

Choose a reason for hiding this comment

carljm commented Jan 31, 2025

dcreager commented Jan 31, 2025 • edited Loading

carljm commented Jan 31, 2025 • edited Loading

carljm commented Jan 31, 2025

dcreager commented Jan 31, 2025

carljm left a comment

Choose a reason for hiding this comment

carljm Jan 31, 2025 • edited Loading

Choose a reason for hiding this comment

dcreager Jan 31, 2025

Choose a reason for hiding this comment

carljm Jan 31, 2025

Choose a reason for hiding this comment

dcreager Jan 31, 2025

Choose a reason for hiding this comment

dcreager Jan 31, 2025

Choose a reason for hiding this comment

carljm Jan 31, 2025

Choose a reason for hiding this comment

sharkdp Jan 31, 2025

Choose a reason for hiding this comment

carljm Jan 31, 2025 • edited Loading

Choose a reason for hiding this comment

carljm Jan 31, 2025

Choose a reason for hiding this comment

sharkdp Feb 3, 2025 • edited Loading

Choose a reason for hiding this comment

carljm Feb 3, 2025 • edited Loading

Choose a reason for hiding this comment

dcreager commented Jan 30, 2025 •

edited

Loading

carljm left a comment •

edited

Loading

dcreager commented Jan 31, 2025 •

edited

Loading

carljm commented Jan 31, 2025 •

edited

Loading

carljm Jan 31, 2025 •

edited

Loading

carljm Jan 31, 2025 •

edited

Loading

sharkdp Feb 3, 2025 •

edited

Loading

carljm Feb 3, 2025 •

edited

Loading