[bug] Serial worker reloads can potentially block dynamic endpoint changes #12797
Labels
kind/bug
needs-priority
needs-triage
What happened:
We have an ingress-nginx deployment that uses `enable-serial-reloads` to prevent too many nginx workers from spawning when many ingress changes are made in quick succession. This option does not seem to be documented in detail on the website, but it is described in the code here: ingress-nginx/internal/ingress/controller/config/config.go, lines 475 to 480 in d1dc3e8
I think the intention behind `enable-serial-reloads` is to prevent too many configuration reloads that require new nginx worker processes to be launched. However, the option also seems to unintentionally block dynamic configuration changes (changes that do not require a new nginx worker process). This looks like a bug, because a particularly long nginx reload can block all dynamic endpoint changes for the controller. If a deployment rollout happens during a reload that requires a new nginx worker process, the updated list of endpoints does not seem to be picked up by the controller until the reload is finished.
What you expected to happen:
If `enable-serial-reloads` is set to `true`, I expect reloads that require new nginx worker processes to be re-queued if another such reload is already in progress, but I expect dynamic changes to keep being applied. Endpoint updates should still happen even during a particularly long reload.
Code path:
Given a pending change that requires a configuration reload and also includes endpoint changes:
1. `syncIngresses` is called from the task queue: ingress-nginx/internal/ingress/controller/controller.go, line 175 in d1dc3e8
2. `utilingress.IsDynamicConfigurationEnough` is used to determine whether an nginx reload is needed: ingress-nginx/internal/ingress/controller/controller.go, line 195 in d1dc3e8
3. `onUpdate` is called: ingress-nginx/internal/ingress/controller/controller.go, lines 207 to 214 in d1dc3e8
4. In `onUpdate`, an error is returned because an nginx reload is already in progress: ingress-nginx/internal/ingress/controller/nginx.go, lines 682 to 685 in dc3acbd
5. Because of that error, the dynamic configuration step later in the `syncIngresses` function is never reached: ingress-nginx/internal/ingress/controller/controller.go, line 240 in d1dc3e8
NGINX Ingress controller version (exec into the pod and run `/nginx-ingress-controller --version`): v1.12.0
Kubernetes version (use `kubectl version`):
Environment:
Kernel (`uname -a`): Linux ip-10-2-52-194.us-west-2.compute.internal 5.10.223-212.873.amzn2.x86_64 #1 SMP Wed Aug 7 16:53:32 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux