Casual RFC: Dynamic host volume plugins #24862
Comments
First of all, thank you for all the work that has gone into this! A Nomad native alternative to CSI is highly appreciated :) Two quick questions that came to mind:
Also, would it not make sense to expose additional information via the |
Thanks for the feedback @henrikjohansen! I think the first two questions are more about the overall feature, where this issue is more specifically about the plugin interface, but I'll try to address all of them.
What kind of telemetry are you looking for? The
Doesn't look like this is in code right now, but I'll see about adding it!
Great suggestion! I'll look into wiring up those values: We had been assuming that |
This is a copy of an internal RFC, seeking community feedback!
This is how the feature is currently implemented (slated for release in Nomad 1.10). Large-scale re-imaginings of the whole system are likely not to be entertained, but we do want to be responsive to concerns or requests to improve ease of use.
Thank you for your attention!
Background
We are building a feature to allow operators to dynamically create Nomad host volumes. Host volumes today require modifying the Nomad agent configuration on a client node and restarting the agent. Dynamic host volumes (DHV) allow users to create volumes via the Nomad API.
We will not describe the entire feature here. There was lots of discussion in #15489 and in our own internal RFC process. This document describes only the plugin specification and considerations for plugin authors.
When a user sends a request (via Nomad CLI or API) to create (or delete) a volume, the server forwards the request to an appropriate client agent. The client invokes the plugin declared in the volume specification file. A plugin is an executable file that adheres to the specification described by this document.
Of utmost importance for us is ease of adoption, since we expect this feature to be used in bespoke, artisanal environments. Host volume plugins should be easy for non-developer system administrators to write and maintain.
Within the overall orchestration ecosystem, consider two contrasting approaches to plugin architecture: CSI and CNI.
Within HashiCorp’s ecosystem, generally plugins are built with go-plugin (including Nomad task drivers, Terraform providers, Vault plugins, etc). It has many advantages, but gRPC is certainly more difficult to work with than a bash/powershell/python/etc script, which just about every systems administrator in the world has written extensively.
Proposal
External plugins manage the lifecycle of dynamic host volumes.
Inspired by the relative simplicity of CNI, our DHV plugin interface requires only an executable file that adheres to the specification described here. Plugins may be written in any language, even as a simple shell script.
A plugin will be registered with Nomad if:
- [...] `host_volume_plugins` directory specific to this purpose
- [...] `fingerprint` call (described below)
Each volume specification includes a `name` that must be unique per Nomad client node. The `name` is used in job scheduling to place allocations, as with host volumes today. Additionally, Nomad assigns a volume `ID` that is the true unique identifier for the volume. The volume `ID` is unique across the whole cluster.
Operations
A plugin must implement these operations: `fingerprint`, `create`, and `delete`. For all operations, exit code `0` indicates success.
We pass in the absolutely-required parameters as CLI arguments. They are also passed as environment variables for authors who prefer referring to them by name (and e.g. so that a script may `set -u` to error by name if they are not set). The environment variables are prefixed with `"DHV_"` (i.e. Dynamic Host Volume). No parameters are passed in stdin (in contrast to CNI), because while parsing input JSON is not especially complicated, our general data types are flat and simple, so well-suited to environment variables.
fingerprint
Called when a Nomad client agent starts (or is reloaded) to discover valid plugins. The returned `"version"` is used to register the plugin on the Nomad node for volume scheduling.
CLI arguments:
$1=fingerprint
Environment variables:
Expected response on stdout:
{"version": "0.0.1"}
Requirements:
create
Called when a volume is created with `nomad volume create`. Also called when the client agent is started (as with an agent restart or host reboot).
CLI Arguments:
$1=create
$2=/path/to/expected/volume/destination
Environment variables:
Expected response on stdout:
{"path": $HOST_PATH, "bytes": 50000000}
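The `create` contract can be sketched as a small shell function. This is a minimal sketch for a directory-backed volume, not the built-in implementation: the function name `dhv_create` is hypothetical, and reporting `0` bytes for a plain directory is an assumption made for illustration.

```shell
# Sketch: an idempotent create handler for a directory-backed volume.
# dhv_create is a hypothetical helper; a plugin would call it as
#   dhv_create "$2"
# when "$1" is "create".
dhv_create() {
  host_path="$1"
  # mkdir -p succeeds whether or not the directory already exists, so the
  # same call is safe on first creation and on every client agent restart.
  mkdir -p "$host_path" || return 1
  # Assumption: a plain directory has no enforced size, so report 0 bytes.
  # A plugin with a real backing store would report its actual capacity.
  # Note: the path is not JSON-escaped here; paths containing quotes or
  # backslashes would need escaping.
  printf '{"path": "%s", "bytes": %d}\n' "$host_path" 0
}
```

Because the function only prints the JSON document on success and returns non-zero otherwise, the exit code and stdout stay consistent with each other.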
Requirements:
- [...] `create`, and on Nomad client agent (re)start
delete
Called when the volume is deleted with `nomad volume delete`. Also run when an initial volume `create` operation fails, since it may have been partially completed.
CLI Arguments:
$1=delete
$2=/path/to/expected/volume/destination
Environment variables:
Expected response: none; stdout is discarded.
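Since `delete` may run against a volume that was only partially created, a handler has to tolerate a missing path. The following is a minimal sketch under that assumption. The function name `dhv_delete` and the guard against empty or root paths are illustrative additions, not requirements from this document.

```shell
# Sketch: a delete handler that tolerates partially created volumes.
# dhv_delete is a hypothetical helper; a plugin would call it as
#   dhv_delete "$2"
# when "$1" is "delete".
dhv_delete() {
  host_path="${1:-}"
  # Illustrative safety guard: refuse obviously dangerous targets.
  case "$host_path" in
    ""|"/")
      echo "refusing to delete '$host_path'" >&2
      return 1
      ;;
  esac
  # rm -rf exits 0 when the path is already absent, which covers cleanup
  # after a create that failed part-way through.
  rm -rf -- "$host_path"
}
```

Nothing is written to stdout, which matches the note that stdout is discarded for this operation. Diagnostics go to stderr.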
Requirements:
General considerations
Plugin authors should consider these details when writing plugins.
- [...] `nomad` agent (likely root).
- Since `name` is only unique per node, plugins that write into a SAN will need to take care not to `delete` remote/shared state by `name` unless they know that there are no other volumes with that name. Volume `ID` is unique cluster-wide, but may not be across a group of federated clusters.
- [...] `DHV_PLUGIN_DIR` to refer to the directory.
- [...] `mount_options`. Per-volume configuration should be set in the volume `parameters`. Per-node configuration should be in config file(s) as described above.
- [...] `id` in the volume specification and re-issue `create`. Plugins are expected to handle this appropriately, or error (exit non-0) if they cannot. E.g. if the volume size is changed, and the plugin cannot modify the actual size, it should exit non-0 to reject the request.
- [...] `create` while restoring a volume during Nomad agent start will not halt the client. The error will be in client logs, and the volume will not be registered as available on the node.
Example plugin
This example is a simple bash script that creates a directory. There is a plugin built into Nomad called `"mkdir"` that does this, but this serves as a basic example.
The plugin needs to be placed in an appropriate plugin directory, which is configurable on the client and defaults to:
<nomad data dir>/host_volume_plugins
$ touch custom-mkdir && chmod +x custom-mkdir
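A minimal sketch of what a `custom-mkdir` script could contain, assuming only the interface described above: the operation arrives as `$1`, the destination path as `$2`, and exit code `0` means success. The function name `main`, the error message, and the `0` reported for `bytes` are illustrative choices rather than part of the specification.

```shell
#!/usr/bin/env bash
# custom-mkdir: sketch of a directory-backed dynamic host volume plugin.
# Names and messages here are illustrative assumptions.
set -u  # referencing an unset variable (e.g. a missing $2) is an error

main() {
  op="${1:-}"
  case "$op" in
    fingerprint)
      # Report the plugin version so the client can register the plugin.
      echo '{"version": "0.0.1"}'
      ;;
    create)
      # $2 is the destination path passed by Nomad. mkdir -p keeps this
      # idempotent across repeated calls.
      mkdir -p "$2" || return 1
      # Assumption: a plain directory has no size limit, so report 0.
      printf '{"path": "%s", "bytes": 0}\n' "$2"
      ;;
    delete)
      # Succeeds even if the directory was never (fully) created.
      rm -rf -- "$2"
      ;;
    *)
      echo "unknown operation: '$op'" >&2
      return 1
      ;;
  esac
}

# Nomad always passes the operation as $1. With no arguments (for example
# when the file is sourced for testing), do nothing.
if [ "$#" -gt 0 ]; then
  main "$@"
fi
```

Running `./custom-mkdir fingerprint` by hand should print the version document, which is a quick way to check the script before placing it in the plugin directory.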
Setting "custom-mkdir" as the `plugin_id` in a volume specification will make use of this plugin:
Future Work
We may or may not do the following at some point in the future.
- [...] `healthcheck` command in plugins
- [...] `DHV_IN_USE` env var or similar when `create` is called, so a plugin may decide whether to make changes while the volume is in use