Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix two bugs (order dependency and false positive) in the required validation #5210

Merged
merged 3 commits into from
Jan 29, 2025

Conversation

harrylewis
Copy link
Contributor

@harrylewis harrylewis commented Jan 21, 2025

The goal of this PR is to fix two bugs identified in the required validation.

Bug #1: Order dependency for matching sets

When defining a required validation, if the first option defined is a set and that option is matched, the validation will pass, even if other options in the mutually exclusive group would match.

Consider the following query.

{
  findObject(node_id: '1', object_type: 'Invoice', object_id: '1') {
    id
  }
}

The behaviour of the required validation is different depending on how the validation is defined.

field :find_object, Node, null: true do
  argument :node_id, ID, required: false
  argument :object_type, String, required: false
  argument :object_id, Integer, required: false

  # This will return a validation error.
  validates required: { one_of: [:node_id, [:object_type, :object_id]] }

  # This will NOT return a validation error.
  validates required: { one_of: [[:object_type, :object_id], :node_id] }
end

This is addressed in f2e4c0d.

Bug #2: False positive for partially matched sets

When defining a required validation, if any of the options is matched alongside a set that is partially matched, the validation will pass. This violates the mutual exclusion rule.

Consider the following query.

{
  findObject(node_id: '1', object_type: 'Invoice') {
    id
  }
}

I would expect this to fail, given the following field definition.

field :find_object, Node, null: true do
  argument :node_id, ID, required: false
  argument :object_type, String, required: false
  argument :object_id, Integer, required: false

  # This will NOT return a validation error.
  validates required: { one_of: [:node_id, [:object_type, :object_id]] }
end

This is addressed in f2e4c0d.

Note: I came across this behaviour during development, and was surprised and confused by it. I can anticipate a perspective that deems this behaviour as acceptable because it still technically satisfies the property that "exactly one" option is matched. While I can appreciate this perspective, in my opinion it leaves room for ambiguity in the behaviour of validation.

Developer impact

The first bug requires a developer to be specific when defining the order of the mutually exclusive options. This leads to a brittle implementation, where a seemingly innocuous reordering of the options can causing a breaking change in the API.

The second bug does not uphold the contract of "exactly one" and "mutually exclusive" that the required validation is meant to provide, in my opinion. If I am checking for the presence of arguments in the resolver to determine which of the mutually exclusive option has been provided, depending on the order I check the presence of the keys, if one of those options is a set that has been partially matched, I could attempt to access the other keys in the set which have not matched and cause a runtime error.

Breaking changes

I consider both of these bug fixes to be breaking changes. A client passing a certain collection of inputs which may be valid under the existing behaviour may now experience a validation error, if they are not providing strictly mutually exclusive arguments.

When using the required validator, if the first option is a set and it
is matched, the validation will always be valid, even if other options
in the mutually exclusive group would match. This is due to the
presence of a `break` statement, which will cause a premature exit if a
set option is matched.
When using the required validator, if any of the options is a set that
is partially matched alongside another match, the validation is valid,
which violates the mutual exclusion of the group.
Copy link
Contributor Author

@harrylewis harrylewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see CI is failing for certain checks. I see the same failures on other PR's, so it doesn't appear to be related to my changes. I'll take a look and see what is going wrong.

matched_conditions += 1
break
full_match = one_of_condition.all? { |k| value.key?(k) }
partial_match = !full_match && one_of_condition.any? { |k| value.key?(k) }
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I discovered that #some? is only available in Rails 🙃

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could do it in a single iteration of one_of_condition:

Suggested change
partial_match = !full_match && one_of_condition.any? { |k| value.key?(k) }
any_match = false
full_match = true
one_of_condition.each do |k|
if value.key?(k)
any_match = true
else
full_match = false
end
end
partial_match = any_match && !full_match

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this suggestion, I'll go ahead and implement that!

Is this suggestion motivated by potential performance concerns? My GraphQL performance knowledge is limited. For most cases I imagine that one loop versus two loops for a relatively small amount of options would have a negligible impact. Of course though, it is still additional CPU cycles. But this is where my expertise and use case awareness falls off - are there use cases where this could potentially cause a performance impact?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it was performance-minded. In practice, I can't think of a situation where this would be called in a tight loop and become a real bottleneck. But I have certainly been surprised before about how rubber meets the road when GraphQL-Ruby is used in real-world applications.

In micro-benchmarks, it makes a difference, for example:

# loops.rb
require "benchmark/ips"


one_of_condition = [:a, :b, :c, :d]
value = {
  a: 1,
  b: 2,
  c: 3
}
Benchmark.ips do |x|
  x.report(".all + .any") do
    full_match = one_of_condition.all? { |k| value.key?(k) }
    partial_match = !full_match && one_of_condition.any? { |k| value.key?(k) }
  end
  x.report(".each") do
    any_match = false
    full_match = true
    one_of_condition.each do |k|
      if value.key?(k)
        any_match = true
      else
        full_match = false
      end
    end

    partial_match = any_match && !full_match
  end

  x.compare!
end

Without YJIT, .all + .any is 23% slower:

$ ruby loops.rb
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [x86_64-darwin22]
Warming up --------------------------------------
         .all + .any   247.839k i/100ms
               .each   273.355k i/100ms
Calculating -------------------------------------
         .all + .any      2.269M (± 5.2%) i/s  (440.78 ns/i) -     11.401M in   5.039404s
               .each      2.797M (± 4.3%) i/s  (357.52 ns/i) -     14.214M in   5.091697s

Comparison:
               .each:  2797017.5 i/s
         .all + .any:  2268712.9 i/s - 1.23x  slower

Interestingly, with YJIT enabled, there's a much bigger difference (greater than 2x faster with .each):

$ ruby --yjit loops.rb
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +YJIT +PRISM [x86_64-darwin22]
Warming up --------------------------------------
         .all + .any   324.644k i/100ms
               .each   759.224k i/100ms
Calculating -------------------------------------
         .all + .any      3.406M (± 5.4%) i/s  (293.57 ns/i) -     17.206M in   5.066546s
               .each      8.336M (± 5.4%) i/s  (119.95 ns/i) -     41.757M in   5.025170s

Comparison:
               .each:  8336485.6 i/s
         .all + .any:  3406394.2 i/s - 2.45x  slower

Maybe YJIT has special handling for Array#each 🤷

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really appreciate you taking the time to explain this, and provide this fulsome example. There is clearly a difference here. While it is minimal in this context, it's easy to address and removes any potential for this to be an added source of slowness.

Comment on lines -74 to +81
if matched_conditions == 1
if fully_matched_conditions == 1 && partially_matched_conditions == 0
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A partial match alongside a full match violates a stricter definition of "exactly one" and "mutually exclusive".

Comment on lines +29 to +30
{ query: "{ validated: multiValidated(a: 1, b: 2) }", result: nil, error_messages: ["multiValidated must include exactly one of the following arguments: a, (b and c)."] },
{ query: "{ validated: multiValidated(a: 1, c: 3) }", result: nil, error_messages: ["multiValidated must include exactly one of the following arguments: a, (b and c)."] },
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests exercise the partial matches. Now every combination of the three arguments in this example is evaluated. This has been duplicated below for other cases.

Comment on lines +35 to +48
{
name: "Definition order independence",
config: { one_of: [[:a, :b], :c] },
cases: [
{ query: "{ validated: multiValidated(c: 1) }", result: 1, error_messages: [] },
{ query: "{ validated: multiValidated(a: 2, b: 3) }", result: 5, error_messages: [] },
{ query: "{ validated: multiValidated }", result: nil, error_messages: ["multiValidated must include exactly one of the following arguments: (a and b), c."] },
{ query: "{ validated: multiValidated(a: 1, b: 2, c: 3) }", result: nil, error_messages: ["multiValidated must include exactly one of the following arguments: (a and b), c."] },
{ query: "{ validated: multiValidated(a: 1, c: 3) }", result: nil, error_messages: ["multiValidated must include exactly one of the following arguments: (a and b), c."] },
{ query: "{ validated: multiValidated(b: 2, c: 3) }", result: nil, error_messages: ["multiValidated must include exactly one of the following arguments: (a and b), c."] },
{ query: "{ validated: multiValidated(a: 3) }", result: nil, error_messages: ["multiValidated must include exactly one of the following arguments: (a and b), c."] },
{ query: "{ validated: multiValidated(b: 2) }", result: nil, error_messages: ["multiValidated must include exactly one of the following arguments: (a and b), c."] },
]
},
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This exercises the order dependency issue.

end
when Array
if one_of_condition.all? { |k| value.key?(k) }
matched_conditions += 1
break
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This break statement is what caused the order dependency issue. I looked in the Git history to see why it was present. It existed in the original commit, without an explicit explanation. Upon further evaluation and experimentation, I discovered it led to the order dependency mentioned. Removing it outright fixes this order depedency.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll have to ask the guy who wrote it 😅 I bet I was thinking "at least one of", not "exactly one of", but exactly one-of is the right idea.

@harrylewis harrylewis marked this pull request as ready for review January 21, 2025 18:24
@harrylewis harrylewis changed the title Fix two bugs (order dependency and false positive) in the required valiation Fix two bugs (order dependency and false positive) in the required validation Jan 23, 2025
Copy link
Owner

@rmosolgo rmosolgo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, thanks for the detailed writeup and code change. I just have one suggestion on the implementation.

end
when Array
if one_of_condition.all? { |k| value.key?(k) }
matched_conditions += 1
break
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll have to ask the guy who wrote it 😅 I bet I was thinking "at least one of", not "exactly one of", but exactly one-of is the right idea.

matched_conditions += 1
break
full_match = one_of_condition.all? { |k| value.key?(k) }
partial_match = !full_match && one_of_condition.any? { |k| value.key?(k) }
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could do it in a single iteration of one_of_condition:

Suggested change
partial_match = !full_match && one_of_condition.any? { |k| value.key?(k) }
any_match = false
full_match = true
one_of_condition.each do |k|
if value.key?(k)
any_match = true
else
full_match = false
end
end
partial_match = any_match && !full_match

@rmosolgo
Copy link
Owner

Thanks again for this fix!

@rmosolgo rmosolgo added this to the 2.4.9 milestone Jan 29, 2025
@rmosolgo rmosolgo merged commit a466893 into rmosolgo:master Jan 29, 2025
13 of 15 checks passed
@harrylewis harrylewis deleted the bug-fixes-in-required-validator branch January 29, 2025 17:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants