Which image of the operator are you using? e.g. ghcr.io/zalando/postgres-operator:v1.12.2
Where do you run it - cloud or metal? Kubernetes or OpenShift? K8s
Are you running Postgres Operator in production? yes
Type of issue? Bug
Hi,
I deployed a PostgreSQL instance, and the pods were stuck in the Pending state. During this time, the PostgresClusterStatus was set to Creating. After some time the status changed to CreateFailed, and the following errors and warning appeared in the operator logs:
time="2025-02-14T11:07:11Z" level=error msg="failed to create cluster: pod labels error: still failing after 200 retries" cluster-name=test-pg-demo/tcl-minimal-cluster-demo pkg=cluster worker=2
time="2025-02-14T11:07:11Z" level=warning msg="cluster created failed: pod labels error: still failing after 200 retries" cluster-name=test-pg-demo/tcl-minimal-cluster-demo pkg=cluster worker=2
time="2025-02-14T11:07:11Z" level=error msg="could not create cluster: pod labels error: still failing after 200 retries" cluster-name=test-pg-demo/tcl-minimal-cluster-demo pkg=controller worker=2
Later, when the sync event runs, the operator patches the PostgresClusterStatus to Running, which is incorrect because the pods are still stuck in the Pending state.
This is misleading, because the status is the primary way for users to track the state of the PostgreSQL cluster.
Logs from the operator during this sync event:
time="2025-02-14T11:11:29Z" level=debug msg="syncing Patroni config" cluster-name=test-pg-demo/tcl-minimal-cluster-demo pkg=cluster worker=2
time="2025-02-14T11:11:29Z" level=warning msg="Patroni config updated? false - errors during config sync: could not get Postgres config from pod test-pg-demo/tcl-minimal-cluster-demo-0: could not get Postgres config from pod test-pg-demo/tcl-minimal-cluster-demo-0: is not a valid IP', 'could not get Postgres config from pod test-pg-demo/tcl-minimal-cluster-demo-1: could not get Postgres config from pod test-pg-demo/tcl-minimal-cluster-demo-1: is not a valid IP" cluster-name=test-pg-demo/tcl-minimal-cluster-demo pkg=cluster worker=2
time="2025-02-14T11:11:29Z" level=error msg="errors while restarting Postgres in pods via Patroni API: could not restart Postgres in pod test-pg-demo/tcl-minimal-cluster-demo-0: could not get member data: is not a valid IP', 'could not restart Postgres in pod test-pg-demo/tcl-minimal-cluster-demo-1: could not get member data: is not a valid IP" cluster-name=test-pg-demo/tcl-minimal-cluster-demo pkg=cluster worker=2
time="2025-02-14T11:11:29Z" level=debug msg="syncing pod disruption budgets" cluster-name=test-pg-demo/tcl-minimal-cluster-demo pkg=cluster worker=2
time="2025-02-14T11:11:29Z" level=debug msg="syncing roles" cluster-name=test-pg-demo/tcl-minimal-cluster-demo pkg=cluster worker=2
time="2025-02-14T11:11:29Z" level=warning msg="could not connect to Postgres database: dial tcp 10.245.50.134:5432: connect: connection refused" cluster-name=test-pg-demo/tcl-minimal-cluster-demo pkg=cluster worker=2
time="2025-02-14T11:11:44Z" level=warning msg="could not connect to Postgres database: dial tcp 10.245.50.134:5432: connect: connection refused" cluster-name=test-pg-demo/tcl-minimal-cluster-demo pkg=cluster worker=2
time="2025-02-14T11:11:59Z" level=warning msg="could not connect to Postgres database: dial tcp 10.245.50.134:5432: connect: connection refused" cluster-name=test-pg-demo/tcl-minimal-cluster-demo pkg=cluster worker=2
time="2025-02-14T11:12:14Z" level=warning msg="could not connect to Postgres database: dial tcp 10.245.50.134:5432: connect: connection refused" cluster-name=test-pg-demo/tcl-minimal-cluster-demo pkg=cluster worker=2
time="2025-02-14T11:12:29Z" level=warning msg="could not connect to Postgres database: dial tcp 10.245.50.134:5432: connect: connection refused" cluster-name=test-pg-demo/tcl-minimal-cluster-demo pkg=cluster worker=2
time="2025-02-14T11:12:44Z" level=warning msg="could not connect to Postgres database: dial tcp 10.245.50.134:5432: connect: connection refused" cluster-name=test-pg-demo/tcl-minimal-cluster-demo pkg=cluster worker=2
time="2025-02-14T11:12:59Z" level=warning msg="could not connect to Postgres database: dial tcp 10.245.50.134:5432: connect: connection refused" cluster-name=test-pg-demo/tcl-minimal-cluster-demo pkg=cluster worker=2
time="2025-02-14T11:13:14Z" level=warning msg="could not connect to Postgres database: dial tcp 10.245.50.134:5432: connect: connection refused" cluster-name=test-pg-demo/tcl-minimal-cluster-demo pkg=cluster worker=2
time="2025-02-14T11:13:14Z" level=error msg="could not sync roles: could not init db connection: could not init db connection: still failing after 8 retries" cluster-name=test-pg-demo/tcl-minimal-cluster-demo pkg=cluster worker=2
time="2025-02-14T11:13:14Z" level=debug msg="syncing databases" cluster-name=test-pg-demo/tcl-minimal-cluster-demo pkg=cluster worker=2
time="2025-02-14T11:13:14Z" level=warning msg="could not connect to Postgres database: dial tcp 10.245.50.134:5432: connect: connection refused" cluster-name=test-pg-demo/tcl-minimal-cluster-demo pkg=cluster worker=2
time="2025-02-14T11:13:29Z" level=warning msg="could not connect to Postgres database: dial tcp 10.245.50.134:5432: connect: connection refused" cluster-name=test-pg-demo/tcl-minimal-cluster-demo pkg=cluster worker=2
time="2025-02-14T11:13:44Z" level=warning msg="could not connect to Postgres database: dial tcp 10.245.50.134:5432: connect: connection refused" cluster-name=test-pg-demo/tcl-minimal-cluster-demo pkg=cluster worker=2
time="2025-02-14T11:13:59Z" level=warning msg="could not connect to Postgres database: dial tcp 10.245.50.134:5432: connect: connection refused" cluster-name=test-pg-demo/tcl-minimal-cluster-demo pkg=cluster worker=2
time="2025-02-14T11:14:14Z" level=warning msg="could not connect to Postgres database: dial tcp 10.245.50.134:5432: connect: connection refused" cluster-name=test-pg-demo/tcl-minimal-cluster-demo pkg=cluster worker=2
time="2025-02-14T11:14:29Z" level=warning msg="could not connect to Postgres database: dial tcp 10.245.50.134:5432: connect: connection refused" cluster-name=test-pg-demo/tcl-minimal-cluster-demo pkg=cluster worker=2
time="2025-02-14T11:14:44Z" level=warning msg="could not connect to Postgres database: dial tcp 10.245.50.134:5432: connect: connection refused" cluster-name=test-pg-demo/tcl-minimal-cluster-demo pkg=cluster worker=2
time="2025-02-14T11:14:59Z" level=warning msg="could not connect to Postgres database: dial tcp 10.245.50.134:5432: connect: connection refused" cluster-name=test-pg-demo/tcl-minimal-cluster-demo pkg=cluster worker=2
time="2025-02-14T11:14:59Z" level=error msg="could not sync databases: could not init database connection" cluster-name=test-pg-demo/tcl-minimal-cluster-demo pkg=cluster worker=2
time="2025-02-14T11:14:59Z" level=debug msg="syncing prepared databases with schemas" cluster-name=test-pg-demo/tcl-minimal-cluster-demo pkg=cluster worker=2
time="2025-02-14T11:14:59Z" level=debug msg="syncing connection pooler (master, replica) from (false, nil) to (false, nil)" cluster-name=test-pg-demo/tcl-minimal-cluster-demo pkg=cluster worker=2
time="2025-02-14T11:14:59Z" level=info msg="identified non running pod, potentially skipping major version upgrade" cluster-name=test-pg-demo/tcl-minimal-cluster-demo pkg=cluster worker=2
time="2025-02-14T11:14:59Z" level=info msg="identified non running pod, potentially skipping major version upgrade" cluster-name=test-pg-demo/tcl-minimal-cluster-demo pkg=cluster worker=2
time="2025-02-14T11:14:59Z" level=info msg="cluster has been synced" cluster-name=test-pg-demo/tcl-minimal-cluster-demo pkg=controller worker=2
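To make the expected behavior concrete, here is a minimal Go sketch of the check I would expect before the status is patched. It is not the operator's actual code: `deriveStatus`, `podPhase`, and `errRolesSync` are hypothetical names used only for illustration, and using `SyncFailed` as the status for a failed sync is my assumption. The idea is that Running is reported only when the sync returned no errors and every pod is in the Running phase.

```go
package main

import (
	"errors"
	"fmt"
)

// Running appears in this report; SyncFailed is an assumed status value
// for a sync that did not fully succeed.
const (
	statusRunning    = "Running"
	statusSyncFailed = "SyncFailed"
)

// podPhase holds a Kubernetes pod phase string such as "Pending" or "Running".
type podPhase string

// errRolesSync stands in for one of the sync errors seen in the logs above.
var errRolesSync = errors.New("could not sync roles: could not init db connection")

// deriveStatus is a hypothetical helper. It returns Running only if the sync
// collected no errors and all pods are in the Running phase. An empty pod
// list is treated as a failed sync as well.
func deriveStatus(syncErrs []error, phases []podPhase) string {
	if len(syncErrs) > 0 || len(phases) == 0 {
		return statusSyncFailed
	}
	for _, p := range phases {
		if p != "Running" {
			return statusSyncFailed
		}
	}
	return statusRunning
}

func main() {
	// Situation from this report: both pods Pending, role sync failed.
	fmt.Println(deriveStatus([]error{errRolesSync}, []podPhase{"Pending", "Pending"}))
	// Healthy cluster: no sync errors, both pods Running.
	fmt.Println(deriveStatus(nil, []podPhase{"Running", "Running"}))
}
```

For the situation in this report (both pods Pending and the role sync failing), this check yields SyncFailed instead of Running.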