HIP 20 - wait for custom resource conditions #382

Open · wants to merge 4 commits into main
1 change: 1 addition & 0 deletions hips/README.md
@@ -28,3 +28,4 @@ restricted markdown format and can be found in the
- [hip-0017: Helm OCI MediaType Registration](hip-0017.md)
- [hip-0018: Include repository URL and tarball digest in chart's release metadata](hip-0018.md)
- [hip-0019: New annotations for displaying hook output](hip-0019.md)
- [hip-0020: Wait for custom resource conditions](hip-0020.md)
241 changes: 241 additions & 0 deletions hips/hip-0020.md
@@ -0,0 +1,241 @@
---
hip: 20
title: "Wait for custom resource conditions"
authors: "Tarek Sharafi <[email protected]>"
created: "2024-01-16"
type: "feature"
status: "draft"
---

## Abstract

This proposal introduces a first-class feature in Helm to natively support waiting for custom resource conditions.

Helm simplifies Kubernetes workload management by supporting wait mechanisms for built-in resources like Deployments and Jobs, ensuring that charts install correctly and conveniently. However, as the Kubernetes ecosystem evolves, the concept of "application readiness" has become increasingly complex, often requiring support for custom resources. These resources frequently report readiness through a standardized list of conditions within their status sections.

While popular tools like kubectl and Argo CD address this need by offering native support for condition-based waiting or providing robust tooling and documentation, Helm currently lacks similar functionality for custom resources.

## Motivation

A common workaround today is to add an ad-hoc Job to the chart that wraps a `kubectl wait ... --for=condition=MyCondition` command. This adds real complexity, since the wrapper is itself an additional workload installed by the chart: RBAC rules, a service account, image specifications, possibly pull secrets, and so on, all of which increase operational overhead. And that is before considering the security risks of running this operational workload as part of the application manifest.
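
For illustration, such a wrapper is typically a hook Job along these lines (a minimal sketch with hypothetical names; the chart must additionally ship the service account and the RBAC rules that allow it to read the custom resource):

```yaml
# Sketch of the ad-hoc workaround this HIP aims to make unnecessary.
# "myresources/my-instance", "MyCondition", and "wait-sa" are placeholders.
apiVersion: batch/v1
kind: Job
metadata:
  name: wait-for-my-resource
  annotations:
    "helm.sh/hook": post-install
    "helm.sh/hook-delete-policy": hook-succeeded
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      serviceAccountName: wait-sa  # needs RBAC to get/list the custom resource
      containers:
        - name: wait
          image: bitnami/kubectl:1.28  # yet another image to maintain and pull
          command:
            - kubectl
            - wait
            - myresources/my-instance
            - --for=condition=MyCondition
            - --timeout=300s
```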

In-chart solutions mix the application definition with its installation, which is properly the concern of the installation tool being used. The application manifest should describe the application itself, not the infrastructure for installing it.

Providing native support for waiting on custom resource conditions directly in Helm would address these challenges, reducing operational overhead, enhancing security, and improving overall user experience.

## Rationale

## Specification

A custom annotation at the resource level tells Helm which condition to wait for to become true:
```yaml
"helm.sh/wait-condition": MyCondition
```
and a command-line flag makes Helm actually wait for these conditions:
```text
--wait-for-conditions
```
During `install` and `upgrade`, if the `--wait-for-conditions` flag is set, Helm records all resources that carry the annotation and waits for them as part of the existing wait phase (Deployments, Pods, ...).

The wait succeeds for a resource once it reports a condition whose `type` equals the value in the annotation and whose `status` equals `True`.
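
For example, a templated resource annotated as follows (a minimal sketch; the `example.com/v1` group, `MyResource` kind, and `MyCondition` type are hypothetical placeholders) would be recorded when the operator runs `helm install my-release ./chart --wait-for-conditions`:

```yaml
apiVersion: example.com/v1
kind: MyResource
metadata:
  name: my-instance
  annotations:
    # Condition type Helm waits on when --wait-for-conditions is set.
    "helm.sh/wait-condition": MyCondition
spec: {}
# After creation, Helm would poll this object until its controller reports:
#
#   status:
#     conditions:
#     - type: MyCondition
#       status: "True"
```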


## Backwards compatibility

The new command-line flag defaults to `false`, which means the new resource annotation has no effect unless explicitly requested by the operator. This makes the behavior opt-in and backwards compatible.

## Security implications

Security is actually improved: charts that adopt this feature no longer need ad-hoc workloads or the RBAC rules that go with them.

## How to teach this

In the first instance, documentation plus the help text for `helm install` would explain the feature.

An example template could be provided in documentation showing how to use this feature with a generic command used in a hook.

A more advanced example showing how to use the new feature with Troubleshoot.sh to provide preflight checks could be linked in the documentation, provided directly in the documentation, or provided on the Troubleshoot.sh documentation site independently.

## Reference implementation

[Pull Request for Documentation](https://github.com/helm/helm-www/pull/1242)

[Pull Request for Helm](https://github.com/helm/helm/pull/10309) - the most upvoted open PR on the Helm repository


## Rejected ideas

N/A

## Open issues

N/A

## References

[Troubleshoot.sh](https://troubleshoot.sh/) - the tool that is the motivation for this HIP.

[safe-install plugin](https://github.com/z4ce/helm-safe-install) - Plugin that provides a similar experience to what I hope this HIP will provide natively.

## Reference - Example Usage

### Example using `false`

Template:
```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: "{{ .Release.Name }}-false-job"
  labels:
    app.kubernetes.io/managed-by: {{ .Release.Service | quote }}
    app.kubernetes.io/instance: {{ .Release.Name | quote }}
    app.kubernetes.io/version: {{ .Chart.AppVersion }}
    helm.sh/chart: "{{ .Chart.Name }}-{{ .Chart.Version }}"
  annotations:
    "helm.sh/hook": pre-install, pre-upgrade
    "helm.sh/hook-output-log-policy": hook-failed, hook-succeeded
    "helm.sh/hook-weight": "-5"
    "helm.sh/hook-delete-policy": hook-succeeded, hook-failed
spec:
  backoffLimit: 0
  template:
    metadata:
      name: "{{ .Release.Name }}"
      labels:
        app.kubernetes.io/managed-by: {{ .Release.Service | quote }}
        app.kubernetes.io/instance: {{ .Release.Name | quote }}
        helm.sh/chart: "{{ .Chart.Name }}-{{ .Chart.Version }}"
    spec:
      restartPolicy: Never
      containers:
        - name: pre-install-job
          image: "alpine:3.18"
          command: ["sh", "-c", "echo foo ; false"]
```

What it should look like when running:

```text
$ helm install my-release ./
Logs for pod: my-release-false-job-bgbz6, container: pre-install-job
foo
Error: INSTALLATION FAILED: failed pre-install: job failed: BackoffLimitExceeded
```

### Example using Troubleshoot Preflight Checks

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: "{{ .Release.Name }}-preflight-job"
  labels:
    app.kubernetes.io/managed-by: {{ .Release.Service | quote }}
    app.kubernetes.io/instance: {{ .Release.Name | quote }}
    app.kubernetes.io/version: {{ .Chart.AppVersion }}
    helm.sh/chart: "{{ .Chart.Name }}-{{ .Chart.Version }}"
  annotations:
    "helm.sh/hook": pre-install, pre-upgrade
    "helm.sh/hook-output-log-policy": hook-failed
    "helm.sh/hook-weight": "-5"
    "helm.sh/hook-delete-policy": hook-succeeded, hook-failed
spec:
  backoffLimit: 0
  template:
    metadata:
      name: "{{ .Release.Name }}"
      labels:
        app.kubernetes.io/managed-by: {{ .Release.Service | quote }}
        app.kubernetes.io/instance: {{ .Release.Name | quote }}
        helm.sh/chart: "{{ .Chart.Name }}-{{ .Chart.Version }}"
    spec:
      restartPolicy: Never
      volumes:
        - name: preflights
          configMap:
            name: "{{ .Release.Name }}-preflight-config"
      containers:
        - name: pre-install-job
          image: "replicated/preflight:latest"
          command: ["preflight", "--interactive=false", "/preflights/preflights.yaml"]
          volumeMounts:
            - name: preflights
              mountPath: /preflights

---
apiVersion: v1
kind: ConfigMap
metadata:
  annotations:
    "helm.sh/hook": pre-install, pre-upgrade
    "helm.sh/hook-weight": "-6"
    "helm.sh/hook-delete-policy": hook-succeeded, hook-failed
  labels:
    app.kubernetes.io/managed-by: {{ .Release.Service | quote }}
    app.kubernetes.io/instance: {{ .Release.Name | quote }}
    app.kubernetes.io/version: {{ .Chart.AppVersion }}
    helm.sh/chart: "{{ .Chart.Name }}-{{ .Chart.Version }}"
  name: "{{ .Release.Name }}-preflight-config"
data:
  preflights.yaml: |
    apiVersion: troubleshoot.sh/v1beta2
    kind: Preflight
    metadata:
      name: preflight-tutorial
    spec:
      collectors:
        {{ if eq .Values.mariadb.enabled false }}
        - mysql:
            collectorName: mysql
            uri: '{{ .Values.externalDatabase.user }}:{{ .Values.externalDatabase.password }}@tcp({{ .Values.externalDatabase.host }}:{{ .Values.externalDatabase.port }})/{{ .Values.externalDatabase.database }}?tls=false'
        {{ end }}
      analyzers:
        - clusterVersion:
            outcomes:
              - fail:
                  when: "< 1.16.0"
                  message: The application requires at least Kubernetes 1.16.0, and recommends 1.18.0.
                  uri: https://kubernetes.io
              - warn:
                  when: "< 1.18.0"
                  message: Your cluster meets the minimum version of Kubernetes, but we recommend you update to 1.18.0 or later.
                  uri: https://kubernetes.io
              - pass:
                  message: Your cluster meets the recommended and required versions of Kubernetes.
        {{ if eq .Values.mariadb.enabled false }}
        - mysql:
            checkName: Must be MySQL 8.x or later
            collectorName: mysql
            outcomes:
              - fail:
                  when: connected == false
                  message: Cannot connect to MySQL server
              - fail:
                  when: version < 8.x
                  message: The MySQL server must be at least version 8
              - pass:
                  message: The MySQL server is ready
        {{ end }}
```

This should yield the following output on stdout:

```text
$ helm install my-release ./
Logs for pod: my-release-preflight-job-bgbz6, container: pre-install-job
--- FAIL: Required Kubernetes Version
--- The application requires at least Kubernetes 1.16.0, and recommends 1.18.0.
--- FAIL: Must be MySQL 8.x or later
--- Cannot connect to MySQL server
--- FAIL preflight-tutorial
FAILED
name: cluster-resources status: completed completed: 1 total: 3
name: mysql/mysql status: running completed: 1 total: 3
name: mysql/mysql status: completed completed: 2 total: 3
name: cluster-info status: running completed: 2 total: 3
name: cluster-info status: completed completed: 3 total: 3

Error: INSTALLATION FAILED: failed pre-install: job failed: BackoffLimitExceeded

```