[BUG] Orphaned subnets which reference a non-existent VPC cause new namespaces to never get correct annotations. #5028

Open
andrewlee1089 opened this issue Feb 25, 2025 · 3 comments
Labels: bug (Something isn't working), subnet, vpc

Comments

@andrewlee1089

Kube-OVN Version

v1.13.2

Kubernetes Version

v1.28.6

Operation-system/Kernel Version

Ubuntu 24.04.1 LTS
6.8.0-44-generic

Description

I am seeing log lines of the form:

kube-ovn-controller-5875ddfbb8-bhzjv kube-ovn-controller I0225 13:54:43.998749       7 namespace.go:91] handle add/update namespace malc-tlgs-z5llx
kube-ovn-controller-5875ddfbb8-bhzjv kube-ovn-controller E0225 13:54:43.999128       7 controller.go:1315] "Unhandled Error" err="error syncing add namespace \"malc-tlgs-z5llx\": vpc.kubeovn.io \"rg-a7d345adb29d43401c2dcb9ca30a6a7d-default-sto-1\" not found, requeuing" logger="UnhandledError"
kube-ovn-controller-5875ddfbb8-bhzjv kube-ovn-controller I0225 13:54:44.004527       7 namespace.go:91] handle add/update namespace malc-tlgs-z5llx
kube-ovn-controller-5875ddfbb8-bhzjv kube-ovn-controller E0225 13:54:44.004793       7 controller.go:1315] "Unhandled Error" err="error syncing add namespace \"malc-tlgs-z5llx\": vpc.kubeovn.io \"rg-a7d345adb29d43401c2dcb9ca30a6a7d-default-sto-1\" not found, requeuing" logger="UnhandledError"

malc-tlgs-z5llx is a new namespace.
rg-a7d345adb29d43401c2dcb9ca30a6a7d-default-sto-1-a is an existing subnet that references a non-existent VPC, rg-a7d345adb29d43401c2dcb9ca30a6a7d-default-sto-1.

It seems odd to me that a non-default VPC which is in a 'bad state' can affect default VPCs. I'm OK with 'broken' VPCs having undefined behaviour but I feel uncomfortable that a broken VPC is causing new namespaces which are unrelated to be affected.

Steps To Reproduce

  1. Deploy a version of Kube-OVN from before 5baabfb, e.g. v1.12.22
  2. Deploy a VPC and a subnet which references this VPC
  3. Delete the VPC (which will be allowed because this is an older version)
  4. Upgrade to the newest Kube-OVN
  5. Create a new namespace

Current Behavior

The namespace never gets the correct annotations (e.g. LogicalSwitchAnnotation, CidrAnnotation, ExcludeIpsAnnotation) because the handleAddNamespace function lists all subnets and looks up the VPC referenced by each one. The lookup fails for the orphaned subnet, so the namespace is requeued every time, as the log above shows.

Expected Behavior

A newly created namespace should get the correct annotations (e.g. LogicalSwitchAnnotation, CidrAnnotation, ExcludeIpsAnnotation) even when an unrelated subnet references a non-existent VPC.

andrewlee1089 added the bug label Feb 25, 2025

dosubot bot commented Feb 25, 2025

Hey @andrewlee1089! I'm here to help you out with any bugs, questions, or contributions you have in mind. Let's tackle this together!

The issue you're experiencing is related to how the handleAddNamespace function processes subnets and their associated VPCs. When a subnet references a non-existent VPC, it can cause problems with namespace annotations, as you've observed.

In the Kube-OVN codebase, the handleAddNamespace function is responsible for associating namespaces with the correct subnets and VPCs, and updating their annotations accordingly. If a subnet references a VPC that no longer exists, this can lead to errors and prevent the namespace from receiving the correct annotations [1].

To address this issue, you can consider the following steps:

  1. Upgrade to a Version After the Commit: Ensure that you are using a version of Kube-OVN that includes the commit 5baabfb5f2e05e64f92d5ac405aea368d91f0166, which addresses the handling of VPCs and subnets by ensuring subnets are deleted before their associated VPCs. This change helps maintain the integrity of namespace annotations.

  2. Clean Up Orphaned Subnets: Manually identify and clean up any orphaned subnets that reference non-existent VPCs. This can prevent the handleAddNamespace function from encountering errors when processing namespaces.

  3. Review Error Handling: Consider reviewing and potentially enhancing the error handling in the handleAddNamespace function to better manage cases where referenced VPCs are missing. This could involve logging warnings and skipping problematic subnets rather than requeuing the namespace processing.

By ensuring that your environment is free of orphaned subnets and using a version of Kube-OVN that includes the necessary fixes, you should be able to avoid the issue of incorrect namespace annotations.
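The "log a warning and skip" idea from step 3 can be sketched as follows. This is an illustration under stated assumptions, not a patch against Kube-OVN: `Subnet`, `subnetsForNamespaceTolerant`, and the resource names are hypothetical, and the VPC lookup is reduced to a map check.

```go
package main

import "fmt"

// Subnet is a simplified, hypothetical stand-in for the Kube-OVN Subnet resource.
type Subnet struct {
	Name       string
	Vpc        string
	Namespaces []string
}

// subnetsForNamespaceTolerant returns the subnets bound to ns and, separately,
// the names of subnets that were skipped because their VPC does not exist.
func subnetsForNamespaceTolerant(ns string, subnets []Subnet, vpcs map[string]bool) (bound, skipped []string) {
	for _, s := range subnets {
		if !vpcs[s.Vpc] {
			// Orphaned subnet: record it and carry on instead of failing the sync.
			skipped = append(skipped, s.Name)
			continue
		}
		for _, n := range s.Namespaces {
			if n == ns {
				bound = append(bound, s.Name)
				break
			}
		}
	}
	return bound, skipped
}

func main() {
	vpcs := map[string]bool{"default-vpc": true}
	subnets := []Subnet{
		{Name: "orphan-subnet", Vpc: "deleted-vpc"},
		{Name: "default-subnet", Vpc: "default-vpc", Namespaces: []string{"new-ns"}},
	}
	bound, skipped := subnetsForNamespaceTolerant("new-ns", subnets, vpcs)
	for _, name := range skipped {
		fmt.Printf("warning: subnet %s references a missing VPC, skipping\n", name)
	}
	fmt.Println("bound subnets:", bound)
}
```

Returning the skipped names also gives a simple way to list the orphaned subnets that need manual cleanup (step 2).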


@oilbeater
Collaborator

@zbb88888 can you take a look at this?

@zbb88888
Collaborator

The PR makes sure a VPC is deleted only after all of its subnets are deleted.

In creation process:

  1. create vpc
  2. create subnet

In deletion process:

  1. delete subnet
  2. delete vpc
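The ordering above can be sketched as a guard that refuses to delete a VPC while any subnet still references it. This is only an illustration of the rule, not the code from the PR: `Subnet`, `subnetsReferencing`, `deleteVpc`, and the resource names are hypothetical.

```go
package main

import "fmt"

// Subnet is a simplified, hypothetical stand-in for the Kube-OVN Subnet resource.
type Subnet struct {
	Name string
	Vpc  string
}

// subnetsReferencing returns the names of all subnets that point at the given VPC.
func subnetsReferencing(vpc string, subnets []Subnet) []string {
	var refs []string
	for _, s := range subnets {
		if s.Vpc == vpc {
			refs = append(refs, s.Name)
		}
	}
	return refs
}

// deleteVpc removes the VPC only when no subnet references it any more.
func deleteVpc(vpc string, vpcs map[string]bool, subnets []Subnet) error {
	if refs := subnetsReferencing(vpc, subnets); len(refs) > 0 {
		return fmt.Errorf("vpc %q is still referenced by subnets %v", vpc, refs)
	}
	delete(vpcs, vpc)
	return nil
}

func main() {
	vpcs := map[string]bool{"custom-vpc": true}
	subnets := []Subnet{{Name: "custom-subnet", Vpc: "custom-vpc"}}

	// Deleting the VPC first is rejected.
	fmt.Println("delete vpc first:", deleteVpc("custom-vpc", vpcs, subnets))

	// Deleting the subnet first, then the VPC, succeeds.
	subnets = nil
	fmt.Println("delete vpc after subnet:", deleteVpc("custom-vpc", vpcs, subnets))
}
```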

Regarding step 2 of your Steps To Reproduce:

Could you please post more details about the pre-existing subnet (before the VPC)?

andrewlee1089 added a commit to andrewlee1089/kube-ovn that referenced this issue Feb 26, 2025