
vaultwarden Runbook

Metadata

Field                 Value
Service               vaultwarden
Criticality           Tier 1
Owner                 Platform / Security owner
Namespace             vaultwarden
Clusters              local
Last validated        2026-06-29
Related service page  ../services/vaultwarden.md

Trigger Conditions

  • Vaultwarden UI or sync clients fail.
  • Websocket access breaks.
  • Users cannot unlock or sync vault items.
  • PVC-backed state or runtime secrets are unavailable.

1. Health Checks

kubectl -n vaultwarden get pods,svc,pvc,ingressroute
kubectl -n vaultwarden logs statefulset/vaultwarden --tail=200
# Readiness/liveness probes hit GET /alive on port 8080. wget prints the response body on success and exits non-zero on a non-2xx response:
kubectl -n vaultwarden exec statefulset/vaultwarden -- \
  wget -qO- http://localhost:8080/alive && echo

A running pod that is not Ready is most likely failing the /alive readiness probe. It is removed from the Service endpoints and receives no traffic until the probe passes.
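
To spot pods that are not Ready without scanning the full `get pods` output, a small helper along these lines may be useful. This is a sketch only: the function name `not_ready_pods` is hypothetical, and it assumes `kubectl` and `awk` are available on the operator's machine.

```shell
# List pods in the vaultwarden namespace whose Ready condition is not "True".
# Prints one pod name per line; prints nothing when every pod is Ready.
# Pods with no Ready condition yet (for example, unscheduled pods) are listed too.
not_ready_pods() {
  kubectl -n vaultwarden get pods \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="Ready")].status}{"\n"}{end}' \
  | awk -F '\t' 'NF > 0 && $2 != "True" { print $1 }'
}
```

Calling `not_ready_pods` with no arguments and getting empty output means every pod in the namespace reports Ready.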

2. Troubleshooting Workflows

Check ingress, websocket handling, and PVC health first.

kubectl -n vaultwarden describe statefulset vaultwarden
kubectl -n vaultwarden describe ingressroute
kubectl -n vaultwarden get secret
kubectl -n vaultwarden describe pod -l run=vaultwarden

Common causes:

  • Pod stuck NotReady / restarting: failing /alive probe, missing or malformed vaultwarden-secrets, or /data PVC not attached.
  • /tmp permission errors: the container runs with a read-only root filesystem; /tmp is an emptyDir. If vaultwarden-server cannot write, confirm the tmp volume is mounted.
  • SMTP errors: check SMTP_* values in ConfigMap vaultwarden-config and credentials in Secret vaultwarden-secrets.
  • Admin panel locked out: verify ADMIN_TOKEN in Secret vaultwarden-secrets.
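
A missing key in vaultwarden-secrets can be checked without printing any secret values. The helper below is a minimal sketch: the function name `missing_secret_keys` is hypothetical, and the key names passed to it should be whatever the deployment actually expects (this runbook names ADMIN_TOKEN and SMTP_PASSWORD; other keys may exist).

```shell
# Print each expected key that is absent from Secret vaultwarden-secrets.
# Only key names are read from the cluster; secret values are never printed.
missing_secret_keys() {
  present=$(kubectl -n vaultwarden get secret vaultwarden-secrets \
    -o go-template='{{range $k, $v := .data}}{{$k}}{{"\n"}}{{end}}') || return 1
  for key in "$@"; do
    # -x matches the whole line, -F treats the key as a literal string.
    printf '%s\n' "$present" | grep -qxF -- "$key" || printf 'missing: %s\n' "$key"
  done
}
```

For example, `missing_secret_keys ADMIN_TOKEN SMTP_PASSWORD` prints nothing when both keys are present. A malformed value still has to be checked by hand.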

3. Disaster Recovery

  1. Restore the data volume from snapshot.
  2. Recreate the runtime Secret: populate overlays/local/vault-secrets.env from vault-secrets.env.example (SMTP credentials + ADMIN_TOKEN).
  3. Reconcile vaultwarden/overlays/local.
  4. Validate /alive returns 200 and web login + client sync work.
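
The automated part of step 4 could be scripted roughly as follows. This is a sketch under assumptions: the function name `validate_restore` and the 300-second timeout are illustrative, and it reuses the in-pod wget probe from the Health Checks section. Web login and client sync still need a manual check.

```shell
# Wait for the StatefulSet to finish rolling out, then probe /alive in-pod.
# Returns non-zero if the rollout times out or the probe request fails.
validate_restore() {
  kubectl -n vaultwarden rollout status statefulset/vaultwarden --timeout=300s || return 1
  kubectl -n vaultwarden exec statefulset/vaultwarden -- \
    wget -qO- http://localhost:8080/alive >/dev/null || return 1
  echo "vaultwarden is rolled out and /alive responds"
}
```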

4. Scaling and Resource Management

kubectl -n vaultwarden top pod

Vaultwarden's resource needs are usually small. If the StatefulSet approaches its memory limit or the data PVC fills up, adjust the memory or storage values in Git.
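
`kubectl top pod` does not report volume usage, so storage pressure on the /data PVC needs a separate check. One possible approach, assuming the container image ships `df` (not confirmed by this runbook) and using a hypothetical helper name `data_usage_percent`:

```shell
# Print the used percentage (digits only) of the /data filesystem in the pod.
# df -P forces the POSIX single-line format, so the value is always in column 5.
data_usage_percent() {
  kubectl -n vaultwarden exec statefulset/vaultwarden -- df -P /data \
  | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}
```

The result is a bare number, so it can be compared directly in a script, for example `[ "$(data_usage_percent)" -lt 80 ]`, where the threshold of 80 is an arbitrary example.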

5. Maintenance Procedures

  • Rotate ADMIN_TOKEN and SMTP_PASSWORD in Secret vaultwarden-secrets.
  • Validate websocket behavior after Traefik changes.
  • Schedule updates for a low-usage window and validate login and sync afterwards: this service stores credentials, so a failed update blocks access to them.
  • Before a vaultwarden version upgrade or downgrade, snapshot the data PVC: data-format changes can make downgrades unsafe.
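
When rotating ADMIN_TOKEN, a fresh random value is needed. The sketch below shows one way to generate one. The function name `new_admin_token` is hypothetical, and whether this deployment stores the token as plain text or as a hash is not stated in this runbook, so confirm the expected format before writing the value into the secrets env file.

```shell
# Generate a random 64-character candidate value for ADMIN_TOKEN.
# 48 random bytes encode to exactly 64 base64 characters with no padding.
new_admin_token() {
  head -c 48 /dev/urandom | base64 | tr -d '\n'
}
```

The value is written to stdout only. Avoid pasting it into shell history or logs; place it in the secrets env file and reconcile the overlay.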

6. Rollback Strategy

  • Revert the overlay to the previous working revision in Git.
  • Restore the prior data snapshot if a configuration or version change leaves the service unable to start.
  • A vaultwarden image downgrade after a data-format change may be unsafe; restore from a pre-change snapshot rather than rolling the image back in place.
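
Recording the running image before any change makes it easier to tell which revision a rollback should return to. A minimal sketch, assuming the vaultwarden container is the first container in the pod template (the helper name `current_image` is hypothetical):

```shell
# Print the image of the first container in the vaultwarden StatefulSet.
current_image() {
  kubectl -n vaultwarden get statefulset vaultwarden \
    -o jsonpath='{.spec.template.spec.containers[0].image}'
}
```

Capture the output alongside the pre-change snapshot name so the image version and the data snapshot can be matched later.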

7. Post-Incident Actions

  1. Add a changelog fragment for recovery work.
  2. Update the service page if exposure or secret handling changed.
  3. Extend this runbook with any newly discovered failure mode.