How to Configure Backup for S3-Compliant Services Using Helm #2759

Open
dotori1995 opened this issue Sep 12, 2024 · 8 comments

@dotori1995

Hello,

Thank you very much for all your hard work.

I apologize in advance if this is a beginner-level question. I have installed the Zalando postgres-operator and the operator UI using Helm and ArgoCD. After creating a cluster and integrating it with Keycloak, I have confirmed that the database works correctly. However, I have run into an issue.

For database backups I am using S3-compatible storage instead of AWS S3. I am unsure which property in the postgres-operator Helm chart needs to be configured to connect to S3-compatible storage, and whether a separate ConfigMap needs to be set up for this.

If it’s not too much trouble, could you kindly provide an example of the Helm configuration for this setup?

Below are the changes I have made to the chart's values.yaml from the Git repository
(https://github.com/zalando/postgres-operator/blob/master/charts/postgres-operator/values.yaml):


configAwsOrGcp:
  aws_region:
  enable_ebs_gp3_migration: false
  log_s3_bucket: ""
  wal_s3_bucket: ""

configLogicalBackup:
  logical_backup_docker_image: "ghcr.io/zalando/postgres-operator/logical-backup:v1.13.0"
  logical_backup_job_prefix: "logical-backup-"
  logical_backup_provider: "s3"
  logical_backup_s3_access_key_id: ""
  logical_backup_s3_bucket: ""
  logical_backup_s3_bucket_prefix: "spilo"
  logical_backup_s3_region: ""
  logical_backup_s3_endpoint: ""
  logical_backup_s3_secret_access_key: ""
  logical_backup_s3_sse: "AES256"
  logical_backup_s3_retention_time: ""
  logical_backup_schedule: "30 00 * * *"
  logical_backup_cronjob_environment_secret: ""


The following is the log from the Postgres pod installed with the above configuration:


root@postgre-db-0:/home/postgres/pgdata/pgroot/pg_log# cat postgresql-4.log
2024-09-12 07:55:21 UTC [67]: [5-1] 66e29e69.43 0 LOG: ending log output to stderr
2024-09-12 07:55:21 UTC [67]: [6-1] 66e29e69.43 0 HINT: Future log output will go to log destination "csvlog".
2024-09-12 08:41:35 UTC [457]: [5-1] 66e2a93f.1c9 0 LOG: ending log output to stderr
2024-09-12 08:41:35 UTC [457]: [6-1] 66e2a93f.1c9 0 HINT: Future log output will go to log destination "csvlog".
2024-09-12 11:53:53 UTC [64]: [5-1] 66e2d651.40 0 LOG: ending log output to stderr
2024-09-12 11:53:53 UTC [64]: [6-1] 66e2d651.40 0 HINT: Future log output will go to log destination "csvlog".
INFO: 2024/09/12 11:53:54.331337 Files will be read from storages: [default]
ERROR: 2024/09/12 11:53:54.564233 check file for existence in "default": failed to check s3 object 'spilo/postgre-db/01e92c1b-ac34-47b7-9e8f-ff59b2d15c45/wal/16/wal_005/00000004.history.lz4' existence: NoCredentialProviders: no valid providers in chain. Deprecated.
For verbose messaging see aws.Config.CredentialsChainVerboseErrors
INFO: 2024/09/12 11:53:54.699498 Files will be read from storages: [default]
ERROR: 2024/09/12 11:53:54.876852 check file for existence in "default": failed to check s3 object 'spilo/postgre-db/01e92c1b-ac34-47b7-9e8f-ff59b2d15c45/wal/16/wal_005/00000003.history.lz4' existence: NoCredentialProviders: no valid providers in chain. Deprecated.
For verbose messaging see aws.Config.CredentialsChainVerboseErrors
INFO: 2024/09/12 11:53:54.982675 Files will be read from storages: [default]
ERROR: 2024/09/12 11:53:55.110616 check file for existence in "default": failed to check s3 object 'spilo/postgre-db/01e92c1b-ac34-47b7-9e8f-ff59b2d15c45/wal/16/wal_005/000000030000000000000005.lz4' existence: NoCredentialProviders: no valid providers in chain. Deprecated.
For verbose messaging see aws.Config.CredentialsChainVerboseErrors
INFO: 2024/09/12 11:53:55.284958 Files will be read from storages: [default]
ERROR: 2024/09/12 11:53:55.575299 check file for existence in "default": failed to check s3 object 'spilo/postgre-db/01e92c1b-ac34-47b7-9e8f-ff59b2d15c45/wal/16/wal_005/000000030000000000000006.lz4' existence: NoCredentialProviders: no valid providers in chain. Deprecated.
For verbose messaging see aws.Config.CredentialsChainVerboseErrors
INFO: 2024/09/12 11:53:55.783974 Files will be read from storages: [default]
ERROR: 2024/09/12 11:53:55.973747 check file for existence in "default": failed to check s3 object 'spilo/postgre-db/01e92c1b-ac34-47b7-9e8f-ff59b2d15c45/wal/16/wal_005/000000030000000000000006.lz4' existence: NoCredentialProviders: no valid providers in chain. Deprecated.
For verbose messaging see aws.Config.CredentialsChainVerboseErrors
INFO: 2024/09/12 11:54:24.584261 Files will be read from storages: [default]
ERROR: 2024/09/12 11:54:24.730085 check file for existence in "default": failed to check s3 object 'spilo/postgre-db/01e92c1b-ac34-47b7-9e8f-ff59b2d15c45/wal/16/wal_005/00000004.history.lz4' existence: NoCredentialProviders: no valid providers in chain. Deprecated.
For verbose messaging see aws.Config.CredentialsChainVerboseErrors
INFO: 2024/09/12 11:54:24.810595 Files will be read from storages: [default]
ERROR: 2024/09/12 11:54:24.991250 check file for existence in "default": failed to check s3 object 'spilo/postgre-db/01e92c1b-ac34-47b7-9e8f-ff59b2d15c45/wal/16/wal_005/000000030000000000000007.lz4' existence: NoCredentialProviders: no valid providers in chain. Deprecated.
For verbose messaging see aws.Config.CredentialsChainVerboseErrors
INFO: 2024/09/12 11:54:25.190706 Files will be read from storages: [default]
ERROR: 2024/09/12 11:54:25.377577 check file for existence in "default": failed to check s3 object 'spilo/postgre-db/01e92c1b-ac34-47b7-9e8f-ff59b2d15c45/wal/16/wal_005/000000030000000000000007.lz4' existence: NoCredentialProviders: no valid providers in chain. Deprecated.
For verbose messaging see aws.Config.CredentialsChainVerboseErrors
INFO: 2024/09/12 11:54:25.587005 Files will be read from storages: [default]
ERROR: 2024/09/12 11:54:25.740130 check file for existence in "default": failed to check s3 object 'spilo/postgre-db/01e92c1b-ac34-47b7-9e8f-ff59b2d15c45/wal/16/wal_005/00000004.history.lz4' existence: NoCredentialProviders: no valid providers in chain. Deprecated.
For verbose messaging see aws.Config.CredentialsChainVerboseErrors
INFO: 2024/09/12 11:54:25.923450 Files will be read from storages: [default]
ERROR: 2024/09/12 11:54:26.098262 check file for existence in "default": failed to check s3 object 'spilo/postgre-db/01e92c1b-ac34-47b7-9e8f-ff59b2d15c45/wal/16/wal_005/00000003.history.lz4' existence: NoCredentialProviders: no valid providers in chain. Deprecated.
For verbose messaging see aws.Config.CredentialsChainVerboseErrors
boto ERROR Unable to read instance data, giving up
wal_e.main ERROR MSG: Could not retrieve secret key from instance profile.
HINT: Check that your instance has an IAM profile or set --aws-access-key-id
root@postgre-db-0:/home/postgres/pgdata/pgroot/pg_log#


@dotori1995
Author

This is the ArgoCD Application YAML:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: zalando
  namespace: argocd
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  destination:
    namespace: zalando
    server: https://kubernetes.default.svc
  project: default
  source:
    repoURL: <git_url>
    path:
    targetRevision: HEAD
    helm:
      valueFiles:
        - my_values.yaml
  syncPolicy:
    syncOptions:
      - CreateNamespace=true
    automated:
      prune: true

@yoshi314

Same here. I have no idea where to put the S3 credentials for WAL/DB backups.

@yoshi314

yoshi314 commented Sep 20, 2024

I figured it out based on a few articles, in case it helps.

You have to reference a ConfigMap or Secret in the operator's values.yaml:

configKubernetes:
  pod_environment_configmap: "postgres-operator/pod-config"

and pod-config is a ConfigMap (or Secret) like so:

apiVersion: v1
kind: ConfigMap
metadata:
  name: pod-config
data:
  WAL_S3_BUCKET: postgresql  # this bucket must exist, or it will fail in strange ways.
  WAL_BUCKET_SCOPE_PREFIX: "mybackups" # not sure if necessary, tbh
  WAL_BUCKET_SCOPE_SUFFIX: ""
#  USE_WALG_BACKUP: "true"
#  USE_WALG_RESTORE: "true"
  BACKUP_SCHEDULE: '00 10 * * *'
  # here are your s3 credentials
  AWS_ACCESS_KEY_ID: my_s3_account
  AWS_SECRET_ACCESS_KEY: sekritkey
  AWS_S3_FORCE_PATH_STYLE: "true" # allegedly necessary if using minIO
  AWS_ENDPOINT: https://my_s3.server.local
#  WALG_DISABLE_S3_SSE: "true"  # encryption of backups
  BACKUP_NUM_TO_RETAIN: "5"
#  CLONE_USE_WALG_RESTORE: "true"

I decided to go with WAL-E here; the WAL-G entries are commented out.

@dotori1995
Author

Thank you so much. I'll give it a try next Monday.

@dotori1995
Author

I ended up with the following. Once again, thank you very much.

{{- if .Values.configKubernetes.pod_environment_configmap }}
apiVersion: v1
kind: ConfigMap
metadata:
  name: pod-config
  namespace: {{ .Release.Namespace }}
data:
  WAL_S3_BUCKET: {{ .Values.configLogicalBackup.logical_backup_s3_bucket | quote }}  # this bucket must exist, or it will fail in strange ways.
  WAL_BUCKET_SCOPE_PREFIX: {{ .Values.configLogicalBackup.logical_backup_s3_bucket_prefix | quote }} # not sure if necessary, tbh
  WAL_BUCKET_SCOPE_SUFFIX: ""
#  USE_WALG_BACKUP: "true"
#  USE_WALG_RESTORE: "true"
  BACKUP_SCHEDULE: '00 10 * * *'
  # here are your s3 credentials
  AWS_ACCESS_KEY_ID: {{ .Values.configLogicalBackup.logical_backup_s3_access_key_id | quote }}
  AWS_SECRET_ACCESS_KEY: {{ .Values.configLogicalBackup.logical_backup_s3_secret_access_key | quote }}
  AWS_S3_FORCE_PATH_STYLE: "true" # allegedly necessary if using minIO
  AWS_ENDPOINT: {{ .Values.configLogicalBackup.logical_backup_s3_endpoint | quote }}
#  WALG_DISABLE_S3_SSE: "true"  # encryption of backups
  BACKUP_NUM_TO_RETAIN: "5"
#  CLONE_USE_WALG_RESTORE: "true"
{{- end }}

@yoshi314

I am still digging through the docs to see if I can give every cluster separate backup settings. I don't want to set this up at the operator level, since I can have many different PG clusters in many namespaces of one k8s cluster.

@Lebvanih

Lebvanih commented Oct 1, 2024

We are using per-cluster environment variables in our installation, and that works pretty well in 1.11.0 (we haven't checked newer versions yet). Also, I'd recommend using a Secret instead of a ConfigMap for what you did above.

Here is the relevant part of our chart for the cluster manifest file (including the variables we use when we do a restore):

{{- if or .Values.cluster.backup.pitr.enabled .Values.cluster.backup.pitr.restore}}
  env:
{{- if .Values.cluster.backup.pitr.enabled}}
    - name: WAL_S3_BUCKET
      valueFrom:
        secretKeyRef:
          name: {{ template "postgres.fullname" . }}-pitr-s3
          key: bucket
    - name: AWS_ACCESS_KEY_ID
      valueFrom:
        secretKeyRef:
          name: {{ template "postgres.fullname" . }}-pitr-s3
          key: access    
    - name: AWS_ENDPOINT
      valueFrom:
        secretKeyRef:
          name: {{ template "postgres.fullname" . }}-pitr-s3
          key: endpoint  
    - name: AWS_REGION
      valueFrom:
        secretKeyRef:
          name: {{ template "postgres.fullname" . }}-pitr-s3
          key: region  
    - name: AWS_SECRET_ACCESS_KEY
      valueFrom:
        secretKeyRef:
          name: {{ template "postgres.fullname" . }}-pitr-s3
          key: secret  
    - name: WALG_LIBSODIUM_KEY
      valueFrom:
        secretKeyRef:
          name: {{ template "postgres.fullname" . }}-sodium-key
          key: key  
    - name: USE_WALG_BACKUP
      value: "true"
    - name: AWS_S3_FORCE_PATH_STYLE
      value: "true"
    - name: BACKUP_NUM_TO_RETAIN
      value: {{ .Values.cluster.backup.pitr.retention | quote}}
    - name: WAL_BUCKET_SCOPE_PREFIX
      value: {{ .Release.Namespace }}/
{{- end }}
{{- if .Values.cluster.backup.pitr.restore}}
    - name: CLONE_WAL_S3_BUCKET
      valueFrom:
        secretKeyRef:
          name: {{ template "postgres.fullname" . }}-recovery
          key: bucket
    - name: CLONE_AWS_ACCESS_KEY_ID
      valueFrom:
        secretKeyRef:
          name: {{ template "postgres.fullname" . }}-recovery
          key: access    
    - name: CLONE_AWS_ENDPOINT
      valueFrom:
        secretKeyRef:
          name: {{ template "postgres.fullname" . }}-recovery
          key: endpoint  
    - name: CLONE_AWS_REGION
      valueFrom:
        secretKeyRef:
          name: {{ template "postgres.fullname" . }}-recovery
          key: region  
    - name: CLONE_AWS_SECRET_ACCESS_KEY
      valueFrom:
        secretKeyRef:
          name: {{ template "postgres.fullname" . }}-recovery
          key: secret  
    - name: CLONE_WALG_LIBSODIUM_KEY
      valueFrom:
        secretKeyRef:
          name: {{ template "postgres.fullname" . }}-recovery
          key: sodiumkey  
    - name: CLONE_USE_WALG_BACKUP
      value: "true"
    - name: CLONE_AWS_S3_FORCE_PATH_STYLE
      value: "true"
    - name: CLONE_WALG_DISABLE_S3_SSE
      value: "true"
    - name: CLONE_METHOD
      value: "CLONE_WITH_WALE"
    - name: CLONE_WALG_S3_PREFIX
      valueFrom:
        secretKeyRef:
          name: {{ template "postgres.fullname" . }}-recovery
          key: walgS3Prefix
    - name: CLONE_TARGET_TIME
      valueFrom:
        secretKeyRef:
          name: {{ template "postgres.fullname" . }}-recovery
          key: targetTime
{{- end }}
{{- end }}

Small edit: This is only for PITR, I didn't check logicalBackup.
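
For reference, the Secret that those secretKeyRef entries point at just needs keys matching the names used above (bucket, access, secret, endpoint, region). A minimal sketch with hypothetical placeholder values; the Secret name and namespace below are assumptions and must match whatever your release renders for "postgres.fullname"-pitr-s3:

apiVersion: v1
kind: Secret
metadata:
  name: my-cluster-pitr-s3              # hypothetical; must match the "postgres.fullname"-pitr-s3 name rendered by the template
  namespace: my-postgres-namespace      # placeholder; same namespace as the cluster manifest
type: Opaque
stringData:
  bucket: postgresql-wal                 # placeholder bucket name (must already exist)
  access: my_s3_account                  # placeholder access key id
  secret: sekritkey                      # placeholder secret access key
  endpoint: https://my_s3.server.local   # placeholder S3-compatible endpoint
  region: us-east-1                      # placeholder region

The -sodium-key and -recovery Secrets referenced above would follow the same pattern, with keys matching their secretKeyRef entries (key, walgS3Prefix, targetTime, and so on).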

@yoshi314

yoshi314 commented Oct 1, 2024

So from what I have read, there are a few options for per-PG-cluster backup settings.

  1. You can use the env: section in your cluster definition to provide all the variables.

  2. You can use pod_environment_secret to reference a Secret that is expected to exist alongside your PostgreSQL cluster (in the same namespace); see the sketch below.

This assumes that each PG cluster has its own namespace, that every such namespace contains a Secret with the name given in that parameter, and that there is no conflicting ConfigMap referenced from pod_environment_configmap providing the same environment values.

This way every PG cluster gets its own env variables.
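
A minimal sketch of option 2, assuming the chart's configKubernetes.pod_environment_secret setting; the Secret name (pod-secrets), namespace, and all credential values below are placeholders. In the operator's values.yaml:

configKubernetes:
  pod_environment_secret: "pod-secrets"

and one Secret per cluster namespace, holding that cluster's backup settings (the operator injects its keys as env variables into the Spilo pods):

apiVersion: v1
kind: Secret
metadata:
  name: pod-secrets
  namespace: my-postgres-namespace           # placeholder; same namespace as the PG cluster
type: Opaque
stringData:
  AWS_ACCESS_KEY_ID: my_s3_account           # placeholder credentials
  AWS_SECRET_ACCESS_KEY: sekritkey
  AWS_ENDPOINT: https://my_s3.server.local   # placeholder S3-compatible endpoint
  AWS_S3_FORCE_PATH_STYLE: "true"
  WAL_S3_BUCKET: postgresql                  # this bucket must already exist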
