Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: move the rollout controller to event driven model #1029

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

ryanzhang-oss
Copy link
Contributor

@ryanzhang-oss ryanzhang-oss commented Jan 31, 2025

Description of your changes

  1. move the rollout controller to event driven model
  2. explicitly handle the diffReport condition so it won't block rollout
  3. minor fixes on comments and logs

Fixes #

I have:

  • Run make reviewable to ensure this PR is ready for review.

How has this code been tested

Special notes for your reviewer

@ryanzhang-oss ryanzhang-oss marked this pull request as draft February 1, 2025 00:50
Copy link
Contributor

@michaelawyu michaelawyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Ryan! I've added some comments, PTAL 🙏

// +kubebuilder:printcolumn:JSONPath=`.status.conditions[?(@.type=="Applied")].status`,name="ResourcesApplied",type=string
// +kubebuilder:printcolumn:JSONPath=`.status.conditions[?(@.type=="Available")].status`,name="ResourceAvailable",priority=1,type=string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Ryan! This needs a re-gen of the CRD YAML files I think.

@@ -563,6 +562,16 @@ func (r *Reconciler) calculateRealTarget(crp *fleetv1beta1.ClusterResourcePlacem
// A binding with not trackable resources is considered ready if the binding's current spec has been available before
// the ready cutoff time.
func isBindingReady(binding *fleetv1beta1.ClusterResourceBinding, readyTimeCutOff time.Time) (time.Duration, bool) {
// the binding is ready if the diff report has been reported
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Ryan! Aside from the isBindingReady function we also have a HasBindingFailed function. Before the update the two functions are mutually exclusive, i.e., if isBindingReady returns true, HasBindingFailed would always return false. With the new edit this no longer seems to be stand.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which brings up a situation that, for a bound binding with the ReportDiff apply strategy, it would be considered as a ready binding (because the DiffReported condition is true) and a failed binding at the same (because it has a false Applied conditon and no Available condition).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems that this PR already addressed this exact issue?

@@ -563,6 +562,16 @@ func (r *Reconciler) calculateRealTarget(crp *fleetv1beta1.ClusterResourcePlacem
// A binding with not trackable resources is considered ready if the binding's current spec has been available before
// the ready cutoff time.
func isBindingReady(binding *fleetv1beta1.ClusterResourceBinding, readyTimeCutOff time.Time) (time.Duration, bool) {
// the binding is ready if the diff report has been reported
diffReportCondition := binding.GetCondition(string(fleetv1beta1.ResourceBindingDiffReported))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another thing that I might not have fully thought through is that if we treat DiffReported bindings to be not-failed, ready bindings it would be subject to rolling update strategy as well, which means it could also get stuck (e.g., member clusters get disconnected). Would this be a very confusing situation for end users (i.e., all the diffs reported would be stale)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that behavior is exactly what is expected? We are rolling out reportDiff just like any other change. Do you want to blast the reportDiff?

pkg/controllers/rollout/controller.go Show resolved Hide resolved
return
}
// these are the conditions we care about
conditionsToMonitor := []string{string(fleetv1beta1.ResourceBindingDiffReported), string(fleetv1beta1.ResourceBindingAvailable)}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Ryan! All the logic down below would not run if I understand it correctly; if the conditions change the resource versions must have already mismatched; on the contrary, if the resource versions stay the same there would be no status change at all I believe

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it was meant to be generation (spec) instead of resource version. The log was right, the code has a typo.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants