
tailscale-operator

Metadata

| Field | Value |
| --- | --- |
| Service | tailscale-operator |
| Purpose | Expose ozilab through Tailscale and Tailscale Funnel without a public IP or inbound NAT |
| Criticality | Tier 1 |
| Owner | Platform / Networking owner |
| Clusters | ozilab |
| Namespace | tailscale |
| Exposure | VPN and internet via Funnel |
| Stateful | no |
| Backup class | none |
| RPO / RTO | Git-backed config, 15 to 30 minutes to restore |
| Last reviewed | 2026-04-22 |

1. Service Overview

Tailscale Operator replaces cloudflared for the ozilab cluster edge path. It gives the cluster an outbound-only connectivity model that works in the corporate environment, then exposes Traefik through a Tailscale Funnel ingress.

Summary

Without this service, ozilab cannot publish HTTP services from inside the corporate network without reopening the Cloudflare Tunnel path that is currently blocked by the firewall.

Dependencies

| Dependency | Type | Why it matters |
| --- | --- | --- |
| traefik | ingress | Shared L7 router and TLS endpoint for cluster workloads |
| Tailscale control plane | control plane | Authenticates the operator and provisions ingress proxies |
| Cloudflare DNS | DNS | Keeps the public domain and points it at the generated Funnel hostname |
| Tailnet ACL policy | identity | Grants tag ownership, Funnel permission, and OAuth client scope |

2. Architecture Diagram

[Client]
  -> [Cloudflare DNS / proxy]
  -> [ozilab-edge.<tailnet>.ts.net]
  -> [Tailscale Funnel ingress proxy]
  -> [Service traefik:443]
  -> [Traefik routes by Host header]
  -> [Cluster services]

[Tailscale operator]
  -> [Tailscale control plane]
  -> [OAuth client and tagged proxy devices]

3. Deployment Specifications

| Item | Value |
| --- | --- |
| Source path | tailscale-operator |
| Deployment model | Fleet native Helm chart with Kustomize post-render |
| Namespace | tailscale |
| Workload kind | Deployment plus Tailscale-managed ingress proxy StatefulSet |
| Chart or image version | tailscale-operator chart 1.96.5, appVersion v1.96.5 |
| Config files | fleet.yaml, overlays/ozilab/kustomization.yaml, overlays/ozilab/values.yaml, overlays/ozilab/.operator-oauth.env.example, overlays/ozilab/traefik-funnel-ingress.yaml |

Cluster mapping

| Cluster | Overlay path | Notes |
| --- | --- | --- |
| ozilab | tailscale-operator/overlays/ozilab | Replaces cloudflared as the edge transport for Traefik |

4. Configuration Guide

Environment variables

Fleet renders the external Tailscale Helm chart directly from tailscale-operator/fleet.yaml, then runs the ozilab Kustomize overlay as a post-render step to add the namespace, bootstrap Secret, and Funnel ingress.

The Helm chart mounts a precreated Secret named operator-oauth into the operator pod. The Secret is generated locally by Kustomize from a non-committed env file.

| Variable | Source | Purpose | Secret? |
| --- | --- | --- | --- |
| client_id | tailscale-operator/overlays/ozilab/.operator-oauth.env via secretGenerator to Secret operator-oauth | OAuth client identity for the operator | yes |
| client_secret | tailscale-operator/overlays/ozilab/.operator-oauth.env via secretGenerator to Secret operator-oauth | OAuth client secret for the operator | yes |
| operatorConfig.hostname | tailscale-operator/overlays/ozilab/values.yaml | Device hostname registered in the tailnet | no |
| proxyConfig.defaultTags | tailscale-operator/overlays/ozilab/values.yaml | Default tag for ingress proxies and Funnel devices | no |
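
Because the operator-oauth Secret is rendered from a local, non-committed env file, a missing or empty key only surfaces once the operator pod fails to authenticate. The following is a minimal pre-flight sketch, not part of the repository: it assumes the env file uses plain KEY=value lines with the two keys listed above, and the parser is a simplification that may not match every detail of how Kustomize reads env files. The function names are hypothetical.

```python
# Pre-flight check for the local .operator-oauth.env file (illustrative sketch).
# Assumption: the file contains plain KEY=value lines; "#" starts a comment line.

REQUIRED_KEYS = ("client_id", "client_secret")


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=value lines, skipping blank lines and comment lines."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a KEY=value line: {raw!r}")
        values[key.strip()] = value.strip()
    return values


def missing_oauth_keys(text: str) -> list[str]:
    """Return the required keys that are absent or empty in the env text."""
    values = parse_env(text)
    return [key for key in REQUIRED_KEYS if not values.get(key)]
```

To use it, read tailscale-operator/overlays/ozilab/.operator-oauth.env into a string, pass it to missing_oauth_keys, and stop the rollout if the returned list is not empty.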

ConfigMaps

The operator creates its own runtime configuration artifacts for proxies inside the tailscale namespace. They are not stored as static manifests in this repository.

Secrets management

  • Secret names: operator-oauth for bootstrap credentials, plus Tailscale-managed runtime secrets for generated proxy pods
  • Source of truth: local tailscale-operator/overlays/ozilab/.operator-oauth.env file rendered by secretGenerator into Secret operator-oauth, populated from a Tailscale OAuth client created in the admin console
  • Rotation trigger: OAuth client rotation, tailnet policy changes, or compromise response
  • Recovery note: replacing the OAuth client requires rotating the client in Tailscale, updating the local .operator-oauth.env file, and reconciling the overlay so the operator pod remounts fresh credentials

Tailnet prerequisites

Before rollout, the tailnet policy must provide:

  • tagOwners for tag:k8s-operator and tag:k8s
  • an OAuth client scoped with Devices Core, Auth Keys, and Services write permissions
  • a nodeAttrs rule granting the funnel attribute to tag:k8s or whatever proxy tag you choose
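
The policy-side prerequisites can be checked mechanically before rollout. The sketch below is illustrative only: it assumes the tailnet policy has already been loaded into a Python dict (the real policy file may contain comments, so it may need converting to plain JSON first) and that it uses the tagOwners and nodeAttrs layout with target and attr lists. The function name and the example policy are hypothetical. OAuth client scopes are configured in the admin console, not in the policy file, so they are not covered here.

```python
# Check a tailnet policy dict for the tag and Funnel prerequisites (sketch).
# Assumption: nodeAttrs rules look like {"target": [...], "attr": [...]}.


def missing_prerequisites(
    policy: dict,
    proxy_tag: str = "tag:k8s",
    operator_tag: str = "tag:k8s-operator",
) -> list[str]:
    """Return human-readable problems; an empty list means the checks pass."""
    problems: list[str] = []

    tag_owners = policy.get("tagOwners", {})
    for tag in (operator_tag, proxy_tag):
        if tag not in tag_owners:
            problems.append(f"tagOwners is missing {tag}")

    # Funnel must be granted to the tag carried by the ingress proxies.
    has_funnel = any(
        proxy_tag in rule.get("target", []) and "funnel" in rule.get("attr", [])
        for rule in policy.get("nodeAttrs", [])
    )
    if not has_funnel:
        problems.append(f"no nodeAttrs rule grants funnel to {proxy_tag}")

    return problems


# Hypothetical policy fragment that satisfies the checks above.
EXAMPLE_POLICY = {
    "tagOwners": {
        "tag:k8s-operator": [],
        "tag:k8s": ["tag:k8s-operator"],
    },
    "nodeAttrs": [{"target": ["tag:k8s"], "attr": ["funnel"]}],
}
```

If you choose a different proxy tag, pass it as proxy_tag so the Funnel check follows the tag set in proxyConfig.defaultTags.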

5. Access Protocols

| Path | URL or endpoint | Audience | Auth | TLS terminates at |
| --- | --- | --- | --- | --- |
| Internal | traefik.traefik.svc.cluster.local:443 | Cluster workloads and the Funnel proxy | Traefik middleware chain | Traefik |
| Tailnet | https://ozilab-edge.<tailnet>.ts.net | Tailnet members | Tailnet ACLs | Tailscale ingress |
| Public | Cloudflare-hosted domain proxied to the Funnel hostname | Internet users | Cloudflare plus app-specific auth at Traefik | Cloudflare edge then Tailscale ingress |

Cloudflare integration

The Kubernetes manifests create only the Tailscale side. To keep using your Cloudflare-hosted domain, point the public hostname at the generated Funnel hostname, and keep the public hostname distinct from the upstream TLS name (SNI) used between Cloudflare and Tailscale.

Recommended pattern:

  1. Create a proxied CNAME from the public hostname to ozilab-edge.<tailnet>.ts.net.
  2. Keep Cloudflare SSL mode on Full (strict) so the edge validates the origin certificate.
  3. Keep the HTTP Host header sent upstream as the public hostname so Traefik can continue routing by Host or HostRegexp.
  4. If your Cloudflare plan exposes Origin Rules SNI override, set only the upstream SNI to ozilab-edge.<tailnet>.ts.net.
  5. Do not override the upstream Host header to the ts.net hostname unless you intentionally want Traefik to route on that ts.net host.

Precise flow for a public hostname such as app.example.com:

| Hop | Value to expect |
| --- | --- |
| Browser URL | https://app.example.com |
| Browser to Cloudflare Host header | app.example.com |
| Cloudflare proxied DNS target | ozilab-edge.<tailnet>.ts.net |
| Cloudflare to origin TCP destination | ozilab-edge.<tailnet>.ts.net:443 |
| Cloudflare to origin TLS SNI | ozilab-edge.<tailnet>.ts.net |
| Cloudflare to origin HTTP Host header | app.example.com |
| Certificate presented by Tailscale Funnel | ozilab-edge.<tailnet>.ts.net |
| Host seen by Traefik for routing | app.example.com |

Failure modes to avoid:

  • If Cloudflare sends SNI for the public hostname instead of the ts.net hostname, you can get a 526 because the Tailscale certificate only covers the ts.net hostname.
  • If Cloudflare overrides the HTTP Host header to the ts.net hostname, Traefik will route on the wrong host and your existing IngressRoute matches may fall through to the default router.
  • Cloudflare documents that a Host header override also rewrites SNI unless you set a separate SNI override, so do not use a Host header override as a shortcut here.

If your Cloudflare plan does not support Origin Rules SNI override, start with the proxied CNAME only and validate the cutover end to end. If the public hostname returns 526, you need either a plan or feature path that lets you override SNI, or a different edge pattern for that hostname.
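
The SNI and Host header rules in this section reduce to two independent comparisons, which the following sketch encodes as a small decision function. It is a simplified model for reasoning about a cutover, not a network probe: the function name and return labels are hypothetical, the hostname constant keeps the `<tailnet>` placeholder and must be substituted with the real name, and the model assumes Traefik routes match on the public hostname only.

```python
# Simplified model of the Cloudflare-to-Funnel hop (illustrative sketch).
# FUNNEL_HOST keeps the <tailnet> placeholder; substitute your tailnet name.

PUBLIC_HOST = "app.example.com"
FUNNEL_HOST = "ozilab-edge.<tailnet>.ts.net"


def predict_outcome(
    sni: str,
    host_header: str,
    public_host: str = PUBLIC_HOST,
    funnel_host: str = FUNNEL_HOST,
) -> str:
    """Classify an upstream request by the SNI and Host header Cloudflare sends."""
    # The Funnel certificate only covers the ts.net name, so any other SNI
    # fails origin validation (seen as a 526 at the Cloudflare edge).
    if sni != funnel_host:
        return "tls-mismatch"
    # Traefik routes on the Host header; anything but the public hostname
    # misses the existing routes and may fall through to the default router.
    if host_header != public_host:
        return "misrouted"
    return "ok"
```

Only the combination of ts.net SNI and public Host header returns "ok", which matches the expected values in the flow described above.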

6. Operations and Observability

  • Primary health indicators: operator deployment available, Funnel ingress has a hostname in status, and direct access to the ts.net hostname succeeds
  • Dashboards or alerts: operator and proxy pod logs, plus Tailscale admin console machine state
  • Log locations: kubectl logs in the tailscale namespace for the operator and generated proxy pods
  • Known failure modes: missing operator-oauth Secret, bad OAuth client scopes, missing Funnel nodeAttrs, namespace PSA blocking privileged pods, Cloudflare sending an SNI other than the generated ts.net hostname, or Cloudflare sending a Host header other than the public hostname to Traefik
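
The "Funnel ingress has a hostname in status" indicator can be checked from the Ingress object's JSON. The sketch below is a hedged example: it assumes the hostname is reported in the standard Ingress status field status.loadBalancer.ingress[].hostname and that the JSON was obtained separately, for example from kubectl get ingress with -o json in the tailscale namespace. The function names are hypothetical.

```python
# Extract Funnel hostnames from an Ingress object parsed from JSON (sketch).
# Assumption: hostnames appear under status.loadBalancer.ingress[].hostname.


def funnel_hostnames(ingress: dict) -> list[str]:
    """Return every non-empty hostname reported in the Ingress status."""
    entries = ingress.get("status", {}).get("loadBalancer", {}).get("ingress") or []
    return [entry["hostname"] for entry in entries if entry.get("hostname")]


def is_published(ingress: dict) -> bool:
    """True when the Ingress status reports at least one hostname."""
    return bool(funnel_hostnames(ingress))
```

An empty result right after rollout usually means the proxy has not registered yet; if it stays empty, work through the known failure modes listed above.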

7. Backup and Recovery Notes

  • Backup method: Git plus Tailscale admin console credential recreation
  • Restore prerequisites: working Tailscale tailnet, OAuth client, tag policy, and the traefik service in the ozilab cluster
  • Related runbook: ../runbooks/tailscale-operator.md

8. Release and Change Notes

  • Current deployed app version: Tailscale operator v1.96.5
  • Current chart version: 1.96.5
  • Last significant change: ozilab bundle switched from Kustomize helmCharts inflation to Fleet-native Helm rendering so Fleet can reconcile the chart without requiring unsupported --enable-helm behavior
  • Rollback reference: restore cloudflared to fleet/layer7/gitrepo-ozilab.yaml and remove the tailscale-operator path