feat: Add VM delete protection #1199

Merged
merged 1 commit into from
Jan 15, 2025

Conversation

jcanocan
Contributor

@jcanocan jcanocan commented Jan 8, 2025

What this PR does / why we need it:
VirtualMachine objects are often managed by automation, CLI commands, third-party tools, etc. Such automation may accidentally delete VMs that should not have been deleted. These deletions can degrade a service or take it out of service entirely. Moreover, deleting a VM may cause data loss if the underlying PVC is removed by a cascading delete.

This PR adds the ability to protect VirtualMachine objects from being deleted. If the label kubevirt.io/vm-delete-protection is set to True, any attempt to delete the VM will be rejected by a ValidatingAdmissionPolicy (VAP).

This guards against unintended VM deletions, giving users more safety and confidence.

Which issue(s) this PR fixes:

Fixes # CNV-50741

Release note:

Enables delete protection for VirtualMachine objects

@kubevirt-bot kubevirt-bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. labels Jan 8, 2025
Collaborator

@akrejcir akrejcir left a comment

Why did you create separate controllers instead of adding these new resources into an operand? We usually add new resources in operands. That way, they can be configured in the SSP resource.

Can you rename this PR and the commit? This is definitely not a chore.
Maybe something like:

feat: Add VM delete protection

},
Validations: []admissionregistrationv1.Validation{
	{
		Expression: `(!has(oldObject.metadata.labels) || !(variables.label in oldObject.metadata.labels) || !oldObject.metadata.labels[variables.label].matches('^(true|True)$'))`,
Collaborator

Would it be enough to test that the label exists, instead of checking its value?

Contributor Author

I considered this clearer from the user's point of view. My idea is that by forcing the user to set the value to "True", we make sure the user really wants the protection enabled, and I think it will avoid confusion. That being said, I'm not against checking only for the label's existence if we find that works better.
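
To make the trade-off in this thread concrete, here is a minimal plain-Go sketch of what the CEL validation above appears to decide. It is not the operator's code: the function name `deletionAllowed` is hypothetical, and the label key is taken from the PR description rather than from `variables.label`, whose definition is not shown here.

```go
package main

import (
	"fmt"
	"regexp"
)

// Label key as stated in the PR description (assumed to be what
// variables.label resolves to in the policy).
const deleteProtectionLabel = "kubevirt.io/vm-delete-protection"

// Same pattern as the matches() call in the CEL expression.
var protectedValue = regexp.MustCompile(`^(true|True)$`)

// deletionAllowed mirrors the validation: a delete passes when the VM has
// no labels, lacks the protection label, or carries a value other than
// "true" or "True".
func deletionAllowed(labels map[string]string) bool {
	value, ok := labels[deleteProtectionLabel] // reading a nil map is safe
	if !ok {
		return true
	}
	return !protectedValue.MatchString(value)
}

func main() {
	cases := []map[string]string{
		nil,
		{deleteProtectionLabel: "True"},
		{deleteProtectionLabel: "true"},
		{deleteProtectionLabel: "TRUE"},
		{deleteProtectionLabel: ""},
	}
	for _, labels := range cases {
		fmt.Printf("%v -> allowed=%t\n", labels, deletionAllowed(labels))
	}
}
```

Under this reading, a label that is present but empty or spelled differently (for example "TRUE") does not protect the VM, which is the behavior the existence-only check proposed by the reviewer would change.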

@akrejcir
Collaborator

akrejcir commented Jan 8, 2025

/hold

@kubevirt-bot kubevirt-bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 8, 2025
@jcanocan
Contributor Author

jcanocan commented Jan 8, 2025

> Why did you create separate controllers instead of adding these new resources into an operand? We usually add new resources in operands. That way, they can be configured in the SSP resource.

It is my understanding that operands are better fit when you are using CRD than controllers. In this case, we are just using built-in features, VAP and VAPB. That said, I'm not particularly against adding it through an operand. In fact, I questioned this myself multiple times. Whatever we consider best.

> Can you rename this PR and the commit? This is definitely not a chore. Maybe something like:
>
> feat: Add VM delete protection

You are absolutely right. Thanks.

@jcanocan jcanocan force-pushed the vm-delete-protection branch from 57c3cc2 to e4f96d2 Compare January 8, 2025 13:51
@jcanocan jcanocan changed the title chore(vm-delete-protection): Add VM delete protection controller feat: Add VM delete protection Jan 8, 2025
@akrejcir
Collaborator

akrejcir commented Jan 8, 2025

> > Why did you create separate controllers instead of adding these new resources into an operand? We usually add new resources in operands. That way, they can be configured in the SSP resource.
>
> It is my understanding that operands are better fit when you are using CRD than controllers. In this case, we are just using built-in features, VAP and VAPB. That said, I'm not particularly against adding it through an operand. In fact, I questioned this myself multiple times. Whatever we consider best.

I'm not sure what you mean by "operands are better fit when you are using CRD than controllers". Can you explain it more?

The SSP controller deploys various resources based on what is configured in the SSP CR. VAP and VAPB are among these resources. Currently we don't configure them, but maybe we can. Operands are an abstraction used to group related resources together in the code, so that it is easier to understand. For example, we deploy multiple ClusterRole objects, and it would be harder to understand if they were all in one package.

I would say that this is exactly the use case for adding a new operand.
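
As an illustration of the grouping idea described in this comment, the following is a deliberately simplified sketch. The `Operand` interface, its method set, and the in-memory `deployed` map are hypothetical stand-ins; the real SSP operand interface works against Kubernetes requests and is not reproduced here.

```go
package main

import "fmt"

// Operand is a simplified, hypothetical stand-in for the abstraction:
// one unit that owns a group of related resources.
type Operand interface {
	Name() string
	Reconcile() ([]string, error) // returns the kinds it ensured exist
	Cleanup() error
}

// vmDeleteProtection groups the policy and its binding in one place.
type vmDeleteProtection struct {
	deployed map[string]bool // in-memory stand-in for cluster state
}

func newVMDeleteProtection() *vmDeleteProtection {
	return &vmDeleteProtection{deployed: map[string]bool{}}
}

func (o *vmDeleteProtection) Name() string { return "vm-delete-protection" }

func (o *vmDeleteProtection) Reconcile() ([]string, error) {
	resources := []string{
		"ValidatingAdmissionPolicy",
		"ValidatingAdmissionPolicyBinding",
	}
	for _, r := range resources {
		o.deployed[r] = true
	}
	return resources, nil
}

func (o *vmDeleteProtection) Cleanup() error {
	for r := range o.deployed {
		delete(o.deployed, r)
	}
	return nil
}

func main() {
	// The controller would iterate over all registered operands.
	operands := []Operand{newVMDeleteProtection()}
	for _, op := range operands {
		resources, err := op.Reconcile()
		if err != nil {
			fmt.Println("reconcile failed:", err)
			continue
		}
		fmt.Printf("%s reconciled %d resources\n", op.Name(), len(resources))
	}
}
```

The point of the sketch is only the shape: a single controller loops over operands, and each operand keeps its related resources (here the VAP and the VAPB) together, including their cleanup.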

@jcanocan
Contributor Author

jcanocan commented Jan 8, 2025

> > > Why did you create separate controllers instead of adding these new resources into an operand? We usually add new resources in operands. That way, they can be configured in the SSP resource.
> >
> > It is my understanding that operands are better fit when you are using CRD than controllers. In this case, we are just using built-in features, VAP and VAPB. That said, I'm not particularly against adding it through an operand. In fact, I questioned this myself multiple times. Whatever we consider best.
>
> I'm not sure what you mean by "operands are better fit when you are using CRD than controllers". Can you explain it more?
>
> The SSP controller deploys various resources based on what is configured in the SSP CR. VAP and VAPB are among these resources. Currently we don't configure them, but maybe we can. Operands are an abstraction used to group related resources together in the code, so that it is easier to understand. For example, we deploy multiple ClusterRole objects, and it would be harder to understand if they were all in one package.
>
> I would say that this is exactly the use case for adding a new operand.

All right! You convinced me! Thanks. Let's create an operand.

@jcanocan jcanocan force-pushed the vm-delete-protection branch from e4f96d2 to 03ff41f Compare January 9, 2025 11:29
@jcanocan
Contributor Author

jcanocan commented Jan 9, 2025

v2:

  • Dropped vap and vapb controllers.
  • Added a new operand to handle both vap and vapb resources.

@akrejcir
Collaborator

akrejcir commented Jan 9, 2025

Please also add the new operand to the tests/cleanup_test.go, so it will check that cleanup works.

@jcanocan jcanocan force-pushed the vm-delete-protection branch from 03ff41f to e4d02a8 Compare January 9, 2025 16:37
@jcanocan
Contributor Author

jcanocan commented Jan 9, 2025

> Please also add the new operand to the tests/cleanup_test.go, so it will check that cleanup works.

Added.

@jcanocan jcanocan force-pushed the vm-delete-protection branch from e4d02a8 to 2c8aa38 Compare January 10, 2025 11:54
@akrejcir
Collaborator

/unhold

@kubevirt-bot kubevirt-bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 10, 2025
@jcanocan jcanocan force-pushed the vm-delete-protection branch from 2c8aa38 to 4ff65b3 Compare January 10, 2025 13:48
@akrejcir
Collaborator

Can you also remove unneeded text from the PR's description?

@jcanocan jcanocan force-pushed the vm-delete-protection branch from 4ff65b3 to 886d79d Compare January 10, 2025 15:25
@jcanocan
Contributor Author

> Can you also remove unneeded text from the PR's description?

Adjusted.
Is it better now?

@kubevirt-bot kubevirt-bot removed the lgtm Indicates that a PR is ready to be merged. label Jan 13, 2025
@jcanocan jcanocan force-pushed the vm-delete-protection branch from e7a1d76 to c2d3f89 Compare January 13, 2025 15:04
@jcanocan jcanocan force-pushed the vm-delete-protection branch from c2d3f89 to 69bbfa5 Compare January 14, 2025 10:17
@jcanocan jcanocan force-pushed the vm-delete-protection branch from 69bbfa5 to 5c60ead Compare January 14, 2025 11:08
@jcanocan jcanocan force-pushed the vm-delete-protection branch 2 times, most recently from c9dcddc to 1c2b9e7 Compare January 14, 2025 14:50
Member

@0xFelix 0xFelix left a comment

/approve

Thanks

@kubevirt-bot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: 0xFelix

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubevirt-bot kubevirt-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 14, 2025
}).
StatusFunc(func(resource client.Object) common.ResourceStatus {
	vap := resource.(*admissionregistrationv1.ValidatingAdmissionPolicy)
	if vap.Status.TypeChecking != nil && len(vap.Status.TypeChecking.ExpressionWarnings) != 0 {
Collaborator

If vap.Status.TypeChecking == nil, it means that it is still in progress, so we should not consider it a success.
This is written as a comment in the api:

// The results of type checking for each expression.
// Presence of this field indicates the completion of the type checking.
// +optional
TypeChecking *TypeChecking `json:"typeChecking,omitempty" protobuf:"bytes,2,opt,name=typeChecking"`

Contributor Author

Really nice catch! Added. PTAL.

Comment on lines 115 to 128
It("should create a valid CEL expression", func() {
	celEnv, err := cel.NewEnv()
	Expect(err).ToNot(HaveOccurred())

	_, issues := celEnv.Parse(vmDeleteProtectionCELExpression)
	Expect(issues.Err()).ToNot(HaveOccurred())
})
Collaborator

Please change this to a call to operand.Reconcile(&request), and then check the syntax on the VAP object that was created. That way we are sure it tests the correct thing even if the private constant changes.

Contributor Author

Adjusted.


AfterEach(func() {
err := apiClient.Get(ctx, client.ObjectKeyFromObject(vm), vm)
Expect(err).To(Or(Not(HaveOccurred()), MatchError(errors.IsNotFound, "VM not found")))
Collaborator

Please use MatchError(errors.IsNotFound, "errors.IsNotFound"), so that the error message will be:

Expected ... to match error function errors.IsNotFound

Instead of

Expected ... to match error function VM not found

This is where the second parameter is used:

return format.Message(actual, fmt.Sprintf("to match error function %s", matcher.FuncErrDescription[0]))

Contributor Author

Ah, I see. Thanks. Adjusted.

vm = createVMWithDeleteProtection(labelValue)

Expect(apiClient.Delete(ctx, vm)).To(Succeed())
waitForDeletion(client.ObjectKeyFromObject(vm), &kubevirtv1.VirtualMachine{})
Collaborator

No need to waste time waiting for deletion: the Delete call above would already fail if the VAP blocked the deletion.

Contributor Author

Dropped

vm = createVMWithLabels(nil)

Expect(apiClient.Delete(ctx, vm)).To(Succeed())
waitForDeletion(client.ObjectKeyFromObject(vm), &kubevirtv1.VirtualMachine{})
Collaborator

same as above

Contributor Author

Dropped

})

AfterEach(func() {
err := apiClient.Get(ctx, client.ObjectKeyFromObject(vm), vm)
Collaborator

If vm == nil, this whole logic should be skipped.

Contributor Author

Added.

}, env.ShortTimeout(), time.Second).Should(Succeed())

Expect(apiClient.Delete(ctx, vm)).To(Succeed())
waitForDeletion(client.ObjectKeyFromObject(vm), &kubevirtv1.VirtualMachine{})
Collaborator

Set vm = nil after the VM is deleted, so that a later test cannot accidentally reuse it if that test forgets to create a new VM.

Contributor Author

Added.
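
The two cleanup suggestions above (skip the teardown when vm is nil, and reset vm to nil once it is deleted) can be sketched without Ginkgo or a real cluster. Everything named here is hypothetical: `fakeCluster` stands in for the API client, `errNotFound` for a NotFound API error, and `afterEach` for the body of the AfterEach block.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound stands in for a Kubernetes NotFound error.
var errNotFound = errors.New("not found")

// fakeCluster is an in-memory stand-in for the API client.
type fakeCluster struct{ vms map[string]bool }

func (c *fakeCluster) create(name string) { c.vms[name] = true }

func (c *fakeCluster) remove(name string) error {
	if !c.vms[name] {
		return errNotFound
	}
	delete(c.vms, name)
	return nil
}

// suite holds the shared test state, like the vm variable in the spec.
type suite struct {
	cluster *fakeCluster
	vm      *string
}

// afterEach mirrors the reviewed teardown: nothing to do when no VM was
// created, a missing VM is tolerated, and vm is always reset so the next
// test cannot reuse a stale object.
func (s *suite) afterEach() error {
	if s.vm == nil {
		return nil
	}
	err := s.cluster.remove(*s.vm)
	s.vm = nil
	if err != nil && !errors.Is(err, errNotFound) {
		return err
	}
	return nil
}

func main() {
	s := &suite{cluster: &fakeCluster{vms: map[string]bool{}}}
	fmt.Println("teardown without a VM:", s.afterEach())

	name := "test-vm"
	s.cluster.create(name)
	s.vm = &name
	fmt.Println("teardown with a VM:", s.afterEach(), "vm reset:", s.vm == nil)
}
```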

@jcanocan jcanocan force-pushed the vm-delete-protection branch from 1c2b9e7 to 7271722 Compare January 15, 2025 10:47
vap := resource.(*admissionregistrationv1.ValidatingAdmissionPolicy)
if vap.Status.TypeChecking == nil {
	return common.ResourceStatus{
		Progressing: ptr.To("Delete protection VAP type checking in progress"),
Collaborator

Also the other fields need to be set. If the object is progressing, it means that it is not yet available and it is degraded.

Contributor Author

Done.

	return common.ResourceStatus{
		Progressing: ptr.To("Delete protection VAP type checking in progress"),
	}
} else if vap.Status.TypeChecking != nil && len(vap.Status.TypeChecking.ExpressionWarnings) != 0 {
Collaborator

The above block returns, so else and vap.Status.TypeChecking != nil are not needed.

Contributor Author

Done.
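
Putting the three review comments on the status function together, the resulting decision logic might look roughly like the sketch below. The types are local stand-ins, not the Kubernetes or SSP types: `resourceStatus` only imitates the three optional messages the reviewer refers to, and which fields the merged code sets in the warnings case is an assumption.

```go
package main

import "fmt"

// Stand-ins for the relevant parts of the VAP status (hypothetical names).
type typeChecking struct{ expressionWarnings []string }
type policyStatus struct{ typeChecking *typeChecking }

// resourceStatus imitates a status with three optional messages;
// a nil field means "no problem reported".
type resourceStatus struct {
	NotAvailable *string
	Progressing  *string
	Degraded     *string
}

func ptrTo(s string) *string { return &s }

func policyResourceStatus(status policyStatus) resourceStatus {
	// A missing TypeChecking field means type checking has not completed,
	// so the resource is progressing, not yet available, and degraded.
	if status.typeChecking == nil {
		msg := ptrTo("Delete protection VAP type checking in progress")
		return resourceStatus{NotAvailable: msg, Progressing: msg, Degraded: msg}
	}
	// The block above returns, so no else and no second nil check are needed.
	if n := len(status.typeChecking.expressionWarnings); n != 0 {
		// Assumption: warnings are reported as not available and degraded.
		msg := ptrTo(fmt.Sprintf("Delete protection VAP has %d expression warning(s)", n))
		return resourceStatus{NotAvailable: msg, Degraded: msg}
	}
	return resourceStatus{} // type checking finished without warnings
}

func main() {
	pending := policyResourceStatus(policyStatus{})
	fmt.Println("pending:", *pending.Progressing)

	warned := policyResourceStatus(policyStatus{
		typeChecking: &typeChecking{expressionWarnings: []string{"undeclared reference"}},
	})
	fmt.Println("warned:", *warned.Degraded)
}
```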

It adds the ability to protect VirtualMachine objects from being
deleted. If the label `kubevirt.io/vm-delete-protection` is set to
`True`, any attempt to delete the VM will be rejected by a VAP policy.

Signed-off-by: Javier Cano Cano <[email protected]>
@jcanocan jcanocan force-pushed the vm-delete-protection branch from 7271722 to 0ab684c Compare January 15, 2025 11:02
@akrejcir
Collaborator

/lgtm

@kubevirt-bot kubevirt-bot added the lgtm Indicates that a PR is ready to be merged. label Jan 15, 2025
@jcanocan
Contributor Author

/retest-required

Member

@codingben codingben left a comment

I'm late as a reviewer due to sick leave. I think it looks good and can be merged, just a few nits about naming.

/lgtm

}
}

type VMDeleteProtection struct{}
Member

@codingben codingben Jan 15, 2025

nit: I'd prefer to name it as VirtualMachineDeleteProtection.

AppComponentTemplating           AppComponent = "templating"
AppComponentTektonPipelines      AppComponent = "tektonPipelines"
AppComponentTektonTasks          AppComponent = "tektonTasks"
AppComponentVMDeletionProtection AppComponent = "vmDeleteProtection"
Member

nit: I'd suggest to stick to AppComponentVmDeleteProtection instead of AppComponentVMDeletionProtection.

Reconcile()
}

func reconcileVAPB(request *common.Request) (common.ReconcileResult, error) {
Member

nit: I'd name it reconcileVAPBinding; the variables below should then be foundVapBinding and expectedVapBinding.

@codingben
Member

/test e2e-upgrade-functests

@kubevirt-bot
Contributor

@codingben: No presubmit jobs available for kubevirt/ssp-operator@main

In response to this:

/test e2e-upgrade-functests

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@kubevirt-bot kubevirt-bot merged commit aa0a5cc into kubevirt:main Jan 15, 2025
12 checks passed
@jcanocan
Contributor Author

@codingben Sorry, I didn't have the time to address your comments before it got merged. I will in a follow-up.

@codingben
Member

> @codingben Sorry, I didn't have the time to address your comments before it got merged. I will in a follow-up.

No need to be sorry: they were only nits, and I came late to the review cycle.

@iholder101

Hey @jcanocan,

Too bad I wasn't aware of this PR before it merged.
The design proposal for this exact approach was rejected with many concerns from multiple people: kubevirt/community#363.

I find it concerning that it was merged without addressing the concerns that were raised there and that none of those reviewers were pinged to look at this PR.

Am I missing something? Were the concerns addressed in any way?
I think we need to consider reverting this before users can get their hands on it.

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. lgtm Indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L