Skip to content

feat(kubernetes): support PVC subPath driver config#2034

Open
mjamiv wants to merge 7 commits into
NVIDIA:mainfrom
mjamiv:codex/kubernetes-pvc-subpath
Open

feat(kubernetes): support PVC subPath driver config#2034
mjamiv wants to merge 7 commits into
NVIDIA:mainfrom
mjamiv:codex/kubernetes-pvc-subpath

Conversation

@mjamiv

@mjamiv mjamiv commented Jun 27, 2026

Copy link
Copy Markdown
Contributor

Summary

Add Kubernetes driver-config support for mounting existing PVCs into the agent container with optional sub_path values. This gives deployments a narrow driver-owned storage topology for durable per-user PVC data without broad PodSpec overrides.

Related Issue

Fixes #2033

Changes

  • Adds driver_config.kubernetes.volumes[] for PVC-backed pod volumes.
  • Adds driver_config.kubernetes.containers.agent.volume_mounts[] for agent PVC mounts with optional sub_path.
  • Defaults user-supplied PVC volumes and mounts to read-only unless read_only: false is explicit.
  • Validates duplicate names, unknown volume references, protected mount paths, invalid sub_path values, and attempts to weaken read-only PVC volumes.
  • Skips the default /sandbox workspace PVC injection when an explicit Kubernetes driver-config mount targets a path below /sandbox/.
  • Documents the schema and a single-PVC multi-subPath example in the Kubernetes driver README and compute-driver reference docs.

Testing

  • mise run pre-commit passes
    • Not completed: this environment initially lacked mise; after installing it, mise run pre-commit refused to run until the repo config was trusted. I did not run mise trust because it persistently changes local trust settings.
    • Ran direct underlying checks instead:
      • cargo fmt --all -- --check
      • cargo test -p openshell-driver-kubernetes
      • cargo clippy -p openshell-driver-kubernetes --all-targets -- -D warnings
      • npx --yes markdownlint-cli2@0.22.0 crates/openshell-driver-kubernetes/README.md docs/reference/sandbox-compute-drivers.mdx
      • UV_CACHE_DIR=.cache/uv uv run python scripts/update_license_headers.py --check
  • Unit tests added/updated
    • Added positive validation coverage for a read-write PVC with multiple read-write subPath mounts.
  • E2E tests added/updated (if applicable)
    • Not run locally; a k3d or Kubernetes cluster proof with a pre-existing PVC should verify actual subPath mounts and pod recreation behavior.

Checklist

  • Follows Conventional Commits
  • Commits are signed off (DCO)
  • Architecture docs updated (if applicable)
    • Published docs and the Kubernetes driver README are updated; no top-level architecture doc was needed for this driver-local config addition.

Signed-off-by: mjamiv <michael.commack@gmail.com>
@copy-pr-bot

copy-pr-bot Bot commented Jun 27, 2026

Copy link
Copy Markdown

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@austin-shih

Copy link
Copy Markdown

Thanks for this — the caller-owned-PVC-by-claim_name + subPath model is exactly what disposable-pod / durable-per-user-storage deployments need. A few test-scope observations from the perspective of someone planning to run this pattern:

  1. Positive validation coverage. The render test exercises a writable PVC, but it goes through sandbox_to_k8s_spec, which doesn't run validate(). The from_template tests are all rejection cases, so there's no happy-path assertion that a writable config is actually accepted. A test that from_template returns Ok for a read-write PVC with several RW subPath mounts (the real per-user shape: workspace, memory, sessions) would lock that path in.

  2. Missing-PVC behavior is untested. When a referenced claim_name doesn't exist, behavior currently falls to the kubelet's FailedMount. Worth covering (ideally in e2e) so the failure surfaces clearly rather than as a stuck pod.

  3. E2E is where the core claims live. The unit tests prove the pod template renders; they can't prove that a pre-existing PVC actually mounts at the subPaths, that the agent can read/write, that the default /sandbox injection is correctly skipped, and — most importantly — that data persists across pod recreation against the same PVC. The PR notes e2e wasn't run; a k3d-based e2e (pre-create PVC → create sandbox with this driver_config → verify mounts + RW → recreate referencing the same PVC → data persists) would de-risk the durability story.

Happy to help validate — we're deploying this exact pattern and can share e2e findings.

Signed-off-by: mjamiv <michael.commack@gmail.com>
@mjamiv

mjamiv commented Jun 27, 2026

Copy link
Copy Markdown
Contributor Author

Thanks for the test-scope review. I added positive from_template validation coverage for a read-write PVC with multiple read-write sub_path mounts (workspace, memory, sessions) and also normalized the example names in the docs/tests to keep the contribution generic.

I agree the missing-PVC and durability claims need cluster-level coverage: kubelet FailedMount, actual subPath mount behavior, read/write checks, default /sandbox injection being skipped, and data surviving pod recreation all belong in k3d or Kubernetes e2e rather than unit tests.

@mjamiv mjamiv marked this pull request as ready for review June 29, 2026 01:43
WORKSPACE_VOLUME_NAME,
];

const KUBERNETES_DRIVER_PROTECTED_MOUNT_PATHS: &[&str] = &[

@elezar elezar Jun 29, 2026

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we redefining what should be shared across drivers (see driver_mounts::RESERVED_MOUNT_TARGETS)? Note that the openshell_core::driver_utils::SUPERVISOR_CONTAINER_* constants added to the list later would already be covered by /opt/openshell which is included in the shared list.

@elezar

elezar commented Jun 29, 2026

Copy link
Copy Markdown
Member

Review: PVC subPath driver config

Thanks for this — the direction is good and it correctly routes subpath/source/target validation through the shared openshell_core::driver_mounts module. Two themes below: code reuse (so the K8s driver doesn't become a fourth divergent copy of mount logic) and implementation issues (a few I'd gate merge on).

Code reuse / overlap with other drivers

All container drivers already converge on openshell_core::driver_mounts (Docker lib.rs:1779-1909, Podman container.rs:546-688), so subpath rules, NUL/empty checks, .. rejection, and the /sandbox + reserved-target base list are already unified. This PR reuses most of that — good. Remaining duplication worth consolidating:

  1. path_is_or_under is an exact copy of the private fn in driver_mounts.rs:99-104. Make the shared one pub and import it instead of re-declaring.
  2. default_true() is now defined a third time (already in docker/lib.rs:342 and podman/container.rs:136). Along with the #[serde(default = "default_true")] read_only: bool idiom, this belongs in the shared module.
  3. Duplicate mount-target detection is identical in three places — Docker (lib.rs:1857-1917), Podman (container.rs:625-681), and now here. A shared validate_unique_mount_targets(&[...], driver_name) would collapse all three.
  4. Protected-path list overlaps and risks drift. KUBERNETES_DRIVER_PROTECTED_MOUNT_PATHS repeats /etc/openshell, /etc/openshell-tls, /run/netns (already in driver_mounts::RESERVED_MOUNT_TARGETS) and redundantly lists SUPERVISOR_CONTAINER_DIR / SUPERVISOR_CONTAINER_BINARY, both already under /opt/openshell which the shared list covers. Since validate_kubernetes_mount_path calls validate_container_mount_target first, keep only the K8s-unique entries (/var/run/secrets/openshell, /spiffe-workload-api) locally and let the shared list cover the rest — otherwise the two lists will silently diverge.
  5. Consider promoting the stricter path checks into the shared validator. validate_kubernetes_mount_path adds rejection of internal empty segments (/a//b), . segments, and surrounding whitespace — checks validate_container_mount_target lacks today (Docker/Podman currently accept /a//b). If these are correctness improvements, hoist them so every driver benefits and the K8s wrapper shrinks to just the protected-list check.

Correctly kept driver-local (don't hoist): the reserved volume-name list (PVC names are K8s-only), the PVC config structs, and the SPIFFE runtime-conflict check.

Note for reviewers: subpath support is intentionally inconsistent across drivers — Docker supports it on volume mounts, Podman rejects it (reject_subpath). This PR adding subpath for PVC mounts matches Docker; it's deliberate, not an oversight.

Implementation issues

Gate on these:

  1. Validation is split across three sites with inconsistent handling of the same check. validate_kubernetes_mount_path is used in validate() (returns Err), in kubernetes_driver_volume_mount_to_k8s (.expect()panics), and in has_explicit_sandbox_data_mount (.ok()silently ignores). In the normal flow create_sandbox validates before rendering, so the panic is currently unreachable — but this PR widens the panic surface, since from_template() now fails on semantic checks (duplicate/reserved names, bad subpath, protected paths, read-only weakening), not just serde errors. Any future render path not preceded by validate_driver_config_for_sandbox will panic the driver instead of returning InvalidArgument. Render functions should take an already-validated, normalized config rather than re-parsing and .expect()-ing.

  2. Re-validation at render exists only to recompute the normalized string. kubernetes_driver_volume_mount_to_k8s re-runs validate_kubernetes_mount_path / validate_mount_subpath purely to get the normalized mountPath/subPath. Validate once, store the normalized values on the struct, render from those — removes the redundant work and the panic in (1).

  3. Reserved volume-name list duplicates pod-spec string literals. KUBERNETES_DRIVER_RESERVED_VOLUME_NAMES hardcodes "openshell-client-tls" and "openshell-sa-token"; the pod spec hardcodes the same literals (driver.rs:1465,1478). They match today, but if either managed volume is renamed the guard silently stops matching and a user volume could collide. Extract both to named constants used in both places (the other three already use constants).

Design — worth discussing:

  1. Any subpath mount under /sandbox disables all default workspace persistence. has_explicit_sandbox_data_mount flips inject_workspace off for any mount at-or-under /sandbox, so the documented example (/sandbox/.openshell/workspace, /sandbox/.openshell/memory) makes /sandbox itself ephemeral image content while only the two leaf subpaths persist. Kubernetes supports nested volumeMounts, so disabling the whole workspace isn't strictly required — it's a coarse simplification that silently drops persistence for everything else under /sandbox. At minimum this needs a prominent warning in the docs/README; better, reconsider whether the default workspace should coexist with narrow user subpath mounts.

  2. Double read-only opt-in is easy to trip. Both the PVC volume and the mount default to read_only: true, and the validator rejects a RW mount on a RO volume — so RW requires read_only: false in two places. Consider defaulting the mount's read_only from the volume's value to halve the opt-in.

Minor:

  1. validate_kubernetes_reference_name and the claim_name check don't enforce DNS-1123 label syntax, so invalid-but-printable names pass driver validation and fail later at the API server.
  2. validate_kubernetes_driver_runtime_mounts re-runs validate_kubernetes_mount_path on every mount already validated by validate() (~3× per create) — folds away once (1)/(2) land.

Checked and not issues: the reserved volume-name list does currently cover all five managed volume names; protected-path coverage of managed mounts (TLS, sa-token, supervisor) is complete via the protected list + shared RESERVED_MOUNT_TARGETS; mount_path_conflicts_with_protected_path correctly also blocks mounting a parent of a reserved path.

Net: approve the direction; I'd like 1–3 (and ideally reuse items 1–4) addressed before merge.

@elezar elezar left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: PVC subPath driver config

Thanks for this — the direction is good and it correctly routes subpath/source/target validation through the shared openshell_core::driver_mounts module. Two themes below: code reuse (so the K8s driver doesn't become a fourth divergent copy of mount logic) and implementation issues (a few I'd gate merge on).

Code reuse / overlap with other drivers

All container drivers already converge on openshell_core::driver_mounts (Docker lib.rs:1779-1909, Podman container.rs:546-688), so subpath rules, NUL/empty checks, .. rejection, and the /sandbox + reserved-target base list are already unified. This PR reuses most of that — good. Remaining duplication worth consolidating:

  1. path_is_or_under is an exact copy of the private fn in driver_mounts.rs:99-104. Make the shared one pub and import it instead of re-declaring.
  2. default_true() is now defined a third time (already in docker/lib.rs:342 and podman/container.rs:136). Along with the #[serde(default = "default_true")] read_only: bool idiom, this belongs in the shared module.
  3. Duplicate mount-target detection is identical in three places — Docker (lib.rs:1857-1917), Podman (container.rs:625-681), and now here. A shared validate_unique_mount_targets(&[...], driver_name) would collapse all three.
  4. Protected-path list overlaps and risks drift. KUBERNETES_DRIVER_PROTECTED_MOUNT_PATHS repeats /etc/openshell, /etc/openshell-tls, /run/netns (already in driver_mounts::RESERVED_MOUNT_TARGETS) and redundantly lists SUPERVISOR_CONTAINER_DIR / SUPERVISOR_CONTAINER_BINARY, both already under /opt/openshell which the shared list covers. Since validate_kubernetes_mount_path calls validate_container_mount_target first, keep only the K8s-unique entries (/var/run/secrets/openshell, /spiffe-workload-api) locally and let the shared list cover the rest — otherwise the two lists will silently diverge.
  5. Consider promoting the stricter path checks into the shared validator. validate_kubernetes_mount_path adds rejection of internal empty segments (/a//b), . segments, and surrounding whitespace — checks validate_container_mount_target lacks today (Docker/Podman currently accept /a//b). If these are correctness improvements, hoist them so every driver benefits and the K8s wrapper shrinks to just the protected-list check.

Correctly kept driver-local (don't hoist): the reserved volume-name list (PVC names are K8s-only), the PVC config structs, and the SPIFFE runtime-conflict check.

Note for reviewers: subpath support is intentionally inconsistent across drivers — Docker supports it on volume mounts, Podman rejects it (reject_subpath). This PR adding subpath for PVC mounts matches Docker; it's deliberate, not an oversight.

Implementation issues

Gate on these:

  1. Validation is split across three sites with inconsistent handling of the same check. validate_kubernetes_mount_path is used in validate() (returns Err), in kubernetes_driver_volume_mount_to_k8s (.expect()panics), and in has_explicit_sandbox_data_mount (.ok()silently ignores). In the normal flow create_sandbox validates before rendering, so the panic is currently unreachable — but this PR widens the panic surface, since from_template() now fails on semantic checks (duplicate/reserved names, bad subpath, protected paths, read-only weakening), not just serde errors. Any future render path not preceded by validate_driver_config_for_sandbox will panic the driver instead of returning InvalidArgument. Render functions should take an already-validated, normalized config rather than re-parsing and .expect()-ing.

  2. Re-validation at render exists only to recompute the normalized string. kubernetes_driver_volume_mount_to_k8s re-runs validate_kubernetes_mount_path / validate_mount_subpath purely to get the normalized mountPath/subPath. Validate once, store the normalized values on the struct, render from those — removes the redundant work and the panic in (1).

  3. Reserved volume-name list duplicates pod-spec string literals. KUBERNETES_DRIVER_RESERVED_VOLUME_NAMES hardcodes "openshell-client-tls" and "openshell-sa-token"; the pod spec hardcodes the same literals (driver.rs:1465,1478). They match today, but if either managed volume is renamed the guard silently stops matching and a user volume could collide. Extract both to named constants used in both places (the other three already use constants).

Design — worth discussing:

  1. Any subpath mount under /sandbox disables all default workspace persistence. has_explicit_sandbox_data_mount flips inject_workspace off for any mount at-or-under /sandbox, so the documented example (/sandbox/.openshell/workspace, /sandbox/.openshell/memory) makes /sandbox itself ephemeral image content while only the two leaf subpaths persist. Kubernetes supports nested volumeMounts, so disabling the whole workspace isn't strictly required — it's a coarse simplification that silently drops persistence for everything else under /sandbox. At minimum this needs a prominent warning in the docs/README; better, reconsider whether the default workspace should coexist with narrow user subpath mounts.

  2. Double read-only opt-in is easy to trip. Both the PVC volume and the mount default to read_only: true, and the validator rejects a RW mount on a RO volume — so RW requires read_only: false in two places. Consider defaulting the mount's read_only from the volume's value to halve the opt-in.

Minor:

  1. validate_kubernetes_reference_name and the claim_name check don't enforce DNS-1123 label syntax, so invalid-but-printable names pass driver validation and fail later at the API server.
  2. validate_kubernetes_driver_runtime_mounts re-runs validate_kubernetes_mount_path on every mount already validated by validate() (~3× per create) — folds away once (1)/(2) land.

Checked and not issues: the reserved volume-name list does currently cover all five managed volume names; protected-path coverage of managed mounts (TLS, sa-token, supervisor) is complete via the protected list + shared RESERVED_MOUNT_TARGETS; mount_path_conflicts_with_protected_path correctly also blocks mounting a parent of a reserved path.

Net: approve the direction; I'd like 1–3 (and ideally reuse items 1–4) addressed before merge.

@mjamiv

mjamiv commented Jun 29, 2026

Copy link
Copy Markdown
Contributor Author

@austin-shih PR updated and ready for review.

@austin-shih

Copy link
Copy Markdown

Re-reviewed 3e86e520 — the gating items look addressed: normalization moved into validate() so the render helpers read normalized values (no re-parse/.expect()), reserved volume names are now shared constants, and path_is_or_under / default_true / validate_unique_mount_targets live in driver_mounts with Docker + Podman switched over. DNS-1123 validation and the /sandbox-injection doc note are in too. (One note: #5 — defaulting the mount's read_only from the volume — reads as documented rather than implemented; the double opt-in remains, which is fine by us, just flagging.)

Remaining gap is the cluster-level coverage we discussed: missing-PVC FailedMount and data-persisting-across-pod-recreation. We're set up to run that as a k3d e2e against this branch and will share results, so the durability claim has a proof.

@mjamiv mjamiv requested a review from elezar June 29, 2026 16:57
Comment thread crates/openshell-driver-kubernetes/README.md Outdated
&self.volumes,
&mut self.containers.agent.volume_mounts,
)
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the normalize_ functions, I think that they can all actually be validate_* functions. Note that I've created #2086 to make the driver_mounts::validate_* functions consistent with this.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 96ac311. The Kubernetes config pass is now named validate and the helpers no longer mutate the deserialized config. I left the broader shared driver_mounts validator cleanup to #2086.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2086 is merged. Let's rebase this one.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in a9cf61c. Merged current upstream/main, including #2086, into this branch. The PR is mergeable again and the driver_mounts changes now follow main

Comment thread crates/openshell-driver-kubernetes/src/driver.rs
Ok(normalized)
}

fn validate_kubernetes_mount_path(mount_path: &str) -> Result<String, String> {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not be modifying the path in a validate function.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 96ac311. The mount_path and sub_path validation helpers now return validation success only; non-normalized values are rejected instead of written back into the config.

let spiffe_mount_path = spiffe_socket_mount_path(socket_path);
for mount in &config.containers.agent.volume_mounts {
let mount_path = mount.mount_path.as_str();
if mount_path_conflicts_with_protected_path(mount_path, &spiffe_mount_path) {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you comment on why the spiffe path is handled separately from other mounts?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a comment in 96ac311. The static OpenShell-managed paths are validated during from_template, while the SPIFFE socket mount path is derived from gateway runtime config and can only be checked once that runtime path is available.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a fair explanation of why the split exists, but there's a related issue worth addressing: /spiffe-workload-api currently appears in both KUBERNETES_DRIVER_PROTECTED_MOUNT_PATHS and the dynamic check in validate_kubernetes_driver_runtime_mounts. The static entry is a hard-coded approximation of what is actually a runtime-derived path — if a gateway configures a non-default SPIFFE socket path, the static entry blocks the wrong directory while the dynamic check is doing the real work.

A cleaner approach would be to split the validation into two explicit concerns:

  • Structural correctness (absolute path, normalized, no .., no NUL bytes) — stays in validate_kubernetes_mount_path, runs at from_template time, no change needed there.
  • Protected path conflicts — moves entirely to kubernetes_driver_config_for_spec, where the complete set of protected paths (static + runtime-derived) is assembled once and passed to a single conflict-check pass.

This eliminates validate_kubernetes_driver_runtime_mounts as a separate function, removes /spiffe-workload-api from the static constant (leaving only paths that are truly always protected regardless of runtime config), and makes KUBERNETES_DRIVER_PROTECTED_MOUNT_PATHS an accurate invariant rather than a conservative approximation. It also scales cleanly to additional dynamic paths: kubernetes_driver_config_for_spec simply pushes each one onto protected_paths before the single conflict-check call, or accepts a list as input, rather than requiring a new validation function and loop per path.

Roughly:

-const KUBERNETES_DRIVER_PROTECTED_MOUNT_PATHS: &[&str] =
-    &[SERVICE_ACCOUNT_TOKEN_MOUNT_PATH, "/spiffe-workload-api"];
+const KUBERNETES_DRIVER_PROTECTED_MOUNT_PATHS: &[&str] = &[SERVICE_ACCOUNT_TOKEN_MOUNT_PATH];

 fn validate_kubernetes_mount_path(mount_path: &str) -> Result<(), String> {
     // structural checks only: whitespace, normalization, absolute path, etc.
-    for protected_path in KUBERNETES_DRIVER_PROTECTED_MOUNT_PATHS {
-        if mount_path_conflicts_with_protected_path(mount_path, protected_path) {
-            return Err(format!(
-                "mount path '{mount_path}' conflicts with reserved OpenShell path '{protected_path}'"
-            ));
-        }
-    }
     Ok(())
 }

+fn validate_kubernetes_protected_path_conflicts(
+    volume_mounts: &[KubernetesDriverVolumeMountConfig],
+    protected_paths: &[&str],
+) -> Result<(), String> {
+    for mount in volume_mounts {
+        let mount_path = mount.mount_path.as_str();
+        for protected_path in protected_paths {
+            if mount_path_conflicts_with_protected_path(mount_path, protected_path) {
+                return Err(format!(
+                    "mount path '{mount_path}' conflicts with reserved OpenShell path '{protected_path}'"
+                ));
+            }
+        }
+    }
+    Ok(())
+}
+
-fn validate_kubernetes_driver_runtime_mounts(
-    config: &KubernetesSandboxDriverConfig,
-    provider_spiffe_workload_api_socket_path: Option<&str>,
-) -> Result<(), String> {
-    let Some(socket_path) = provider_spiffe_workload_api_socket_path else {
-        return Ok(());
-    };
-    let spiffe_mount_path = spiffe_socket_mount_path(socket_path);
-    for mount in &config.containers.agent.volume_mounts {
-        let mount_path = mount.mount_path.as_str();
-        if mount_path_conflicts_with_protected_path(mount_path, &spiffe_mount_path) {
-            return Err(format!(
-                "mount path '{mount_path}' conflicts with reserved OpenShell path '{spiffe_mount_path}'"
-            ));
-        }
-    }
-    Ok(())
-}
-
 fn kubernetes_driver_config_for_spec(
     spec: Option<&SandboxSpec>,
     provider_spiffe_workload_api_socket_path: Option<&str>,
 ) -> Result<KubernetesSandboxDriverConfig, String> {
     let config = ...?;
-    validate_kubernetes_driver_runtime_mounts(&config, provider_spiffe_workload_api_socket_path)?;
+    let mut protected_paths: Vec<&str> = KUBERNETES_DRIVER_PROTECTED_MOUNT_PATHS.to_vec();
+    let spiffe_mount_path;
+    if let Some(socket_path) = provider_spiffe_workload_api_socket_path {
+        spiffe_mount_path = spiffe_socket_mount_path(socket_path);
+        protected_paths.push(&spiffe_mount_path);
+    }
+    validate_kubernetes_protected_path_conflicts(
+        &config.containers.agent.volume_mounts,
+        &protected_paths,
+    )?;
     Ok(config)
 }

Happy to open a follow-up issue or take a crack at the patch if useful.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in a9cf61c. Removed /spiffe-workload-api from the static Kubernetes protected-path list, moved protected-path conflict checking into kubernetes_driver_config_for_spec, and now assemble static plus runtime-derived protected paths in one pass before rendering.

Ok(())
}

fn validate_kubernetes_dns1123_label(value: &str, field: &str) -> Result<String, String> {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: Are we validating general resource names here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 96ac311. Volume names and volume_mount references now use DNS-1123 label validation, while PVC claim_name uses DNS-1123 subdomain validation. Docs and tests were updated for that split.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One cleanup worth considering here: validate_kubernetes_name_text creates an indirect dependency on driver_mounts::validate_mount_source (a mount-specific helper) just to check whitespace and NUL bytes — neither of which belongs in DNS name validation. The whitespace check is also redundant, since whitespace isn't in the allowed character set and is_kubernetes_dns1123_label would reject it anyway with an equally descriptive error.

is_kubernetes_dns1123_label is also a local reimplementation of is_dns_label from openshell-core/src/config.rs (and openshell-server/src/grpc/service.rs). Moving these to openshell-core as a follow-up would be easier if the local versions are already self-contained and structurally identical to what they'll replace.

Proposed cleanup (all within this PR's scope since these functions were introduced here):

-fn validate_kubernetes_name_text(value: &str, field: &str) -> Result<(), String> {
-    if value != value.trim() {
-        return Err(format!(
-            "{field} must not contain leading or trailing whitespace"
-        ));
-    }
-    driver_mounts::validate_mount_source(value, field).map(|_| ())
+// TODO: replace with openshell_core::k8s_names::is_dns_label once available
+fn is_dns_label(label: &str) -> bool {
+    if label.is_empty() || label.len() > 63 || label.starts_with('-') || label.ends_with('-') {
+        return false;
+    }
+    label
+        .bytes()
+        .all(|byte| byte.is_ascii_lowercase() || byte.is_ascii_digit() || byte == b'-')
 }
 
-fn is_kubernetes_dns1123_label(value: &str) -> bool { ... }
+// TODO: replace with openshell_core::k8s_names::is_dns_subdomain once available
+fn is_dns_subdomain(value: &str) -> bool {
+    value.len() <= 253 && value.split('.').all(is_dns_label)
+}
 
 fn validate_kubernetes_dns1123_label(value: &str, field: &str) -> Result<(), String> {
-    validate_kubernetes_name_text(value, field)?;
-    if !is_kubernetes_dns1123_label(value) {
+    if !is_dns_label(value) {
         return Err(format!("..."));
     }
     Ok(())
 }
 
 fn validate_kubernetes_dns1123_subdomain(value: &str, field: &str) -> Result<(), String> {
-    validate_kubernetes_name_text(value, field)?;
-    let is_dns1123_subdomain = value.len() <= 253 && value.split('.').all(is_kubernetes_dns1123_label);
-    if !is_dns1123_subdomain {
+    if !is_dns_subdomain(value) {
         return Err(format!("..."));
     }
     Ok(())
 }

The is_dns_label implementation here matches openshell-core/src/config.rs exactly, so the follow-up extraction becomes a one-line import swap.

Comment on lines +334 to +339
let normalized = driver_mounts::validate_mount_source(value, field)?;
if normalized != value {
return Err(format!(
"{field} must not contain leading or trailing whitespace"
));
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about just: if value != value.trim()?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 96ac311. The whitespace check is now the direct value != value.trim() guard before the common source validation runs.

Comment on lines +529 to +536
validate_kubernetes_driver_runtime_mounts(
&config,
self.config.provider_spiffe_enabled().then_some(
self.config
.provider_spiffe_workload_api_socket_path
.as_str(),
),
)?;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quuestion: Since we call validate / normalize in KubernetesSandboxDriverConfig::from_template, does it not make sense to also check this there?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done/clarified in 96ac311. Template-owned protected paths are checked during from_template; the remaining SPIFFE conflict depends on runtime gateway config. The new spec-level helper handles that runtime-dependent validation before rendering.

Comment on lines +453 to +726
let _ = KubernetesSandboxDriverConfig::from_sandbox(sandbox)
let driver_config = self
.validate_driver_config_for_sandbox(sandbox)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that neither sandbox nor driver_config can change after this point, does it make sense to rather only construct driver_config in sandbox_to_k8s_spec instead of passing it as an argument?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 96ac311. sandbox_to_k8s_spec now constructs and validates the Kubernetes driver_config itself and returns a Result. create_sandbox renders the spec before Kubernetes API discovery so invalid driver_config still returns InvalidArgument before any API call.

Signed-off-by: mjamiv <michael.commack@gmail.com>
@mjamiv

mjamiv commented Jul 1, 2026

Copy link
Copy Markdown
Contributor Author

@elezar Updated in 96ac311 for the July 1 review pass.

What changed:

  • Renamed the Kubernetes config normalization pass to validation and made it non-mutating.
  • Removed the Kubernetes field-level read_only serde defaults and rolled back the driver_mounts::default_true move so Docker/Podman keep local helpers.
  • Split Kubernetes name checks: pod volume names and mount references use DNS-1123 labels; PVC claim_name uses DNS-1123 subdomain validation.
  • Added a comment explaining why the SPIFFE mount conflict check is runtime-config-dependent.
  • Moved Kubernetes driver_config construction/validation into sandbox_to_k8s_spec, which now returns Result; create_sandbox renders before Kubernetes API discovery so invalid config returns InvalidArgument before any API call.
  • Reordered the README accepted-key list so containers.agent settings are colocated.

Validation run locally:

  • cargo test -p openshell-driver-kubernetes driver_config
  • cargo test -p openshell-core driver_mounts
  • cargo test -p openshell-driver-docker driver_config
  • cargo test -p openshell-driver-podman driver_config
  • cargo check -p openshell-core -p openshell-driver-kubernetes -p openshell-driver-docker -p openshell-driver-podman
  • git diff --check

@mjamiv mjamiv requested a review from elezar July 1, 2026 13:20
|| driver_mounts::path_is_or_under(protected_path, mount_path)
}

fn kubernetes_driver_volume_to_k8s(volume: &KubernetesDriverVolumeConfig) -> serde_json::Value {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we use the existing k8s-openapi types here instead of assembling loose JSON? We could implement From<&KubernetesDriverVolumeConfig> for Volume and the
equivalent for VolumeMount, then serialize those at the dynamic JSON boundary. This would keep Kubernetes field naming and optional-field handling schema-
aware without broadening the PR into refactoring the rest of the pod-template builder.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TaylorMutch do you have thoughts one this?

}

fn validate_kubernetes_mount_path(mount_path: &str) -> Result<(), String> {
if mount_path != mount_path.trim() {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the structural checks in this function are already provided by driver_mounts::validate_container_mount_target on main (which rejects whitespace, NUL bytes, non-absolute paths, .., /sandbox, and the shared reserved paths without normalizing). The only checks here that aren't already shared are the empty-segment (//) and .-segment cases.

If those two checks were added to validate_container_mount_target in driver_mounts, this function could be removed entirely — the structural validation would be shared across all drivers automatically. Combined with the suggestion on the SPIFFE comment to move protected-path conflict checking into kubernetes_driver_config_for_spec, there would be no Kubernetes-specific mount path validation function needed at all:

  • Structural correctness → driver_mounts::validate_container_mount_target (shared, all drivers)
  • Protected path conflicts → kubernetes_driver_config_for_spec (Kubernetes-specific, assembled at runtime)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in a9cf61c. Added the internal empty-segment and dot-segment checks to driver_mounts::validate_container_mount_target, removed the Kubernetes-specific mount path/subpath validation helpers, and now use the shared target/subpath validators from Kubernetes, Docker, and Podman paths.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. That's much cleaner now. Does the implementation also handle .. segments?

…c-subpath

Signed-off-by: mjamiv <michael.commack@gmail.com>

# Conflicts:
#	crates/openshell-core/src/driver_mounts.rs
#	crates/openshell-driver-docker/src/lib.rs
#	crates/openshell-driver-kubernetes/src/driver.rs
#	crates/openshell-driver-podman/src/container.rs
"mount target must be normalized and must not contain empty path segments or '.'"
);
assert_eq!(
validate_container_mount_target("/sandbox/work/./tmp").unwrap_err(),

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What bout /sandbox/work/../../tmp?

path_is_or_under(mount_path, protected_path) || path_is_or_under(protected_path, mount_path)
}

fn path_is_or_under(path: &str, parent: &str) -> bool {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This duplicates driver_mounts::path_is_or_under, which is already public. The local copy can be removed and all three call sites updated to use the shared function instead.

@mjamiv

mjamiv commented Jul 2, 2026

Copy link
Copy Markdown
Contributor Author

Hi @elezar, I pushed 83e2d1d1 with the small follow-up fixes from the open threads:

  • Made driver_mounts::path_is_or_under public and removed the Kubernetes-local copy.
  • Added regression coverage for /sandbox/work/../../tmp to pin the existing ParentDir rejection.
  • Made the Kubernetes DNS label/subdomain helpers self-contained, removing the mount-source validation dependency.
  • Used k8s_openapi core/v1 Volume, VolumeMount, and PersistentVolumeClaimVolumeSource for the driver-config PVC adapters, serializing only at the existing dynamic JSON boundary.

Local validation passed:

cargo test -p openshell-core driver_mounts
cargo test -p openshell-driver-kubernetes driver_config
cargo test -p openshell-driver-docker driver_config
cargo test -p openshell-driver-podman driver_config
cargo check -p openshell-core -p openshell-driver-kubernetes -p openshell-driver-docker -p openshell-driver-podman
git diff --check

The required OpenShell / Branch Checks and OpenShell / Helm Lint contexts are still waiting for /ok to test mirror. Could you please re-review the latest head and, if appropriate, start the mirror gate so CI can run? Trying to keep this moving while the review context is fresh. Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat(kubernetes): support PVC subPath mounts via driver_config

3 participants