feat(helm): add extensible Gateway API routing with TLS passthrough#2108

Open
krishicks wants to merge 1 commit into main from hicks/push-umwsnmztxknl

Conversation

@krishicks

Collaborator

Summary

Refactor the Helm chart’s Gateway API configuration under a route-neutral gatewayApi hierarchy, allowing Gateway resources and individual route types to be managed independently. Replace the Envoy TLS termination approach introduced by abcd15d1 with TLS passthrough, preserving end-to-end TLS and mTLS while avoiding certificate management at the Envoy listener.

Related Issue

Follow-up to #2015 (abcd15d1).

Changes

  • Replace the route-specific grpcRoute.gateway values hierarchy with:
    • gatewayApi.gateway for shared Gateway resource configuration.
    • gatewayApi.routes.grpc for independently managed GRPCRoute configuration.
    • gatewayApi.routes.tls for independently managed TLSRoute configuration.
  • Use gatewayApi as the parent key to distinguish the Kubernetes Gateway API integration from the OpenShell gateway server.
  • Allow the chart to create:
    • Only a Gateway.
    • Only a GRPCRoute or TLSRoute attached to a pre-existing Gateway.
    • A Gateway and either or both supported route types.
  • Preserve the fixed port 80 HTTP listener and existing GRPCRoute behavior.
  • Add a fixed port 443 TLS listener in Passthrough mode when TLSRoute and chart-managed Gateway creation are enabled.
  • Add a gateway.networking.k8s.io/v1 TLSRoute that routes by SNI and forwards the encrypted stream to the OpenShell service.
  • Keep TLS termination and client certificate validation at the OpenShell gateway server, preserving existing mTLS behavior through Envoy.
  • Avoid Gateway listener certificate references:
    • The OpenShell server certificate does not need to be reused as an Envoy certificate.
    • Operators do not need to provision a separate edge certificate or Secret.
    • The backend server certificate only needs a SAN covering the external SNI hostname.
  • Reject TLSRoute passthrough when server.disableTls=true.
  • Add migration validation for the removed grpcRoute values.
  • Standardize the local Envoy Gateway setup and documentation on gateway-helm v1.8.1.
  • Add Helm unit tests for Gateway-only, route-only, combined, pre-existing Gateway, TLS passthrough, hostname, and invalid plaintext configurations.
  • Update CI overlays, generated Helm documentation, the Kubernetes ingress guide, Skaffold guidance, and cluster troubleshooting instructions.
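As a rough illustration of the restructured values, a route-only install that attaches a TLSRoute to a pre-existing Gateway might look like the sketch below. Only the top-level keys `gatewayApi.gateway`, `gatewayApi.routes.grpc`, `gatewayApi.routes.tls`, and `server.disableTls` come from this description. The nested keys (`enabled`, `parentRefs`, `hostnames`) and every resource name are assumptions for illustration and may not match the chart's actual schema.

```yaml
# Hypothetical values.yaml sketch; nested keys are assumptions, not the chart's confirmed schema.
gatewayApi:
  gateway:
    enabled: false                    # assumed toggle: do not create a Gateway
  routes:
    grpc:
      enabled: false                  # assumed toggle: no GRPCRoute in this install
    tls:
      enabled: true                   # assumed toggle: render a TLSRoute
      parentRefs:                     # assumed key: the pre-existing Gateway to attach to
        - name: shared-gateway        # hypothetical Gateway name
          namespace: gateway-system   # hypothetical namespace
      hostnames:                      # assumed key: SNI hostnames the route matches
        - openshell.example.com       # hypothetical hostname
server:
  disableTls: false                   # per the list above, passthrough is rejected when this is true
```

The generated Helm documentation in the PR is the authority on the real key names; this sketch only shows how Gateway creation and each route type can be toggled separately.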

Testing

  • mise run pre-commit passes
  • Unit tests added/updated
  • E2E tests added/updated (if applicable)

Checklist

  • Follows Conventional Commits
  • Commits are signed off (DCO)
  • Architecture docs updated (if applicable)

@github-actions

github-actions Bot commented Jul 1, 2026


@krishicks

krishicks commented Jul 1, 2026

Collaborator Author

cc @zhaohuabing I think this solves the problem you created #2015 to fix while retaining mTLS support, and it is a simpler implementation overall because it does TLS passthrough instead of termination.

This revises the ingress approach introduced by commit abcd15d to use TLS
passthrough instead of termination at the Envoy proxy. Envoy reads the
ClientHello SNI for routing and forwards the encrypted stream to the OpenShell
gateway, which remains the TLS termination endpoint. That preserves end-to-end
TLS and client certificate authentication, so existing mTLS continues to work
through Envoy.

Passthrough also avoids certificate management at the Envoy listener. It
does not require reusing the OpenShell server certificate as a Gateway
certificate or provisioning a separate edge certificate and Secret.
The server certificate only needs to cover the external SNI hostname.

This also restores the HTTP listener on the Gateway, adding TLS as an
additional listener when enabled.

The values were restructured to be under a gatewayApi umbrella rather than
grpcRoute as there are now both GRPCRoute and TLSRoute resources.

Signed-off-by: Kris Hicks <khicks@nvidia.com>
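
The passthrough layout described in the commit message can be sketched with plain Gateway API resources roughly as follows. This is not the chart's rendered output. The resource names, the GatewayClass name, the hostname, and the backend port are all hypothetical, and the TLSRoute `apiVersion` is simply the one stated in this PR (it depends on the Gateway API CRDs installed in the cluster).

```yaml
# Illustrative sketch only; names, hostname, and ports below are assumptions.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: openshell                     # hypothetical
spec:
  gatewayClassName: envoy-gateway     # hypothetical GatewayClass name
  listeners:
    - name: http                      # fixed port 80 HTTP listener
      protocol: HTTP
      port: 80
    - name: tls                       # fixed port 443 TLS listener
      protocol: TLS
      port: 443
      hostname: openshell.example.com # hypothetical external SNI hostname
      tls:
        mode: Passthrough             # no certificateRefs: Envoy never decrypts
---
apiVersion: gateway.networking.k8s.io/v1   # version as stated in this PR
kind: TLSRoute
metadata:
  name: openshell                     # hypothetical
spec:
  parentRefs:
    - name: openshell
      sectionName: tls                # attach to the Passthrough listener
  hostnames:
    - openshell.example.com           # matched against the ClientHello SNI
  rules:
    - backendRefs:
        - name: openshell             # hypothetical Service name
          port: 8080                  # hypothetical Service port
```

Because the listener carries no certificate, the only certificate requirement in this sketch is that the backend server certificate has a SAN covering `openshell.example.com`.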
@krishicks force-pushed the hicks/push-umwsnmztxknl branch from 6024a4d to 898069c on July 2, 2026 15:02
@zhaohuabing

zhaohuabing commented Jul 3, 2026

Contributor

Hi @krishicks, thanks for pushing this forward — I really like the route-neutral gatewayApi.{gateway,routes} refactor. I'd like to reconsider the TLS approach before we lock it in, though.

My concern with Passthrough is that it makes Envoy a pure L4 SNI router. In that mode EG never sees the decrypted HTTP/2 stream, so none of its L7 surface applies — no SecurityPolicy (JWT/OIDC validation, ext-auth, rate limiting, CORS), no BackendTrafficPolicy (retries, circuit breaking, health checks, LB policy), no ClientTrafficPolicy. Functionally it's equivalent to a pure Service type=LoadBalancer, which makes me question what adopting Gateway API buys us on this path.

If in-cluster TLS is desired, I think we can keep the end-to-end-TLS guarantee that motivated passthrough and unlock L7, via terminate + re-encrypt:

client → HTTPS → EG :443 (Terminate) → [GRPCRoute + L7 policies] → re-encrypt (one-way TLS) → HTTPS → openshell gateway

The only thing passthrough preserves that termination gives up is the client's cert-based mTLS identity at the edge — but that advantage is moot in exactly the context where an EG ingress runs. Client-cert user identity is by design a single-user, local-gateway feature (Docker/Podman/VM); our own security guidance says Kubernetes deployments should use OIDC or a trusted proxy for user auth, and treats the transport certs as transport-only (docs/security/best-practices.mdx). The code defaults enforce the same split — mTLS user auth doesn't auto-enable on the Kubernetes driver and is disabled whenever an OIDC issuer is set.

An EG ingress is inherently a Kubernetes, multi-user deployment, so passthrough is going out of its way to preserve a cert identity that has no supported use in its own deployment context — while giving up the L7 features that same deployment actually wants.

Concretely this would mean a Terminate 443 listener with certificateRefs, the GRPCRoute pointed at it, and a BackendTLSPolicy (the upstream Gateway API one, not EG's BackendTrafficPolicy) referencing the chart CA with an SNI matching a server-cert SAN. Passthrough could stay as an opt-in for operators who specifically need end-to-end cert identity.
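
For comparison, the terminate + re-encrypt layout proposed above could be sketched with upstream Gateway API resources roughly as follows. This is an illustration under assumptions, not something taken from the chart: the Secret, ConfigMap, Service, and hostname names are hypothetical, and the `BackendTLSPolicy` `apiVersion` depends on the Gateway API version installed (older releases serve it under an alpha version).

```yaml
# Illustrative sketch only; every name, hostname, and port below is an assumption.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: openshell                       # hypothetical
spec:
  gatewayClassName: envoy-gateway       # hypothetical GatewayClass name
  listeners:
    - name: https
      protocol: HTTPS
      port: 443
      hostname: openshell.example.com   # hypothetical
      tls:
        mode: Terminate                 # Envoy decrypts, so L7 policies can apply
        certificateRefs:
          - kind: Secret
            name: openshell-edge-tls    # hypothetical edge certificate Secret
---
apiVersion: gateway.networking.k8s.io/v1
kind: GRPCRoute
metadata:
  name: openshell                       # hypothetical
spec:
  parentRefs:
    - name: openshell
      sectionName: https                # attach to the Terminate listener
  hostnames:
    - openshell.example.com
  rules:
    - backendRefs:
        - name: openshell               # hypothetical Service name
          port: 8080                    # hypothetical Service port
---
apiVersion: gateway.networking.k8s.io/v1   # may be an alpha version on older CRDs
kind: BackendTLSPolicy
metadata:
  name: openshell-backend-tls           # hypothetical
spec:
  targetRefs:
    - group: ""
      kind: Service
      name: openshell
  validation:
    caCertificateRefs:
      - group: ""
        kind: ConfigMap
        name: openshell-ca              # hypothetical ConfigMap holding the chart CA
    hostname: openshell.example.com     # must match a SAN on the server certificate
```

In this variant the client's TLS session ends at Envoy, so any client certificate is seen by the Gateway rather than by the OpenShell server, which is the trade-off the comment describes.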
