Service Catalogue Strategy
Naming convention
The documentation model mirrors the repository layout.
- Top-level deployment directory: SERVICE_NAME/
- Service page: docs/services/SERVICE_NAME.md
- Runbook page when required: docs/runbooks/SERVICE_NAME.md
Because page names match directory names, operators can infer the documentation path directly from the repository tree.
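The mapping from a directory name to its documentation paths can be sketched in a few lines. This is an illustration only: the helper name `doc_paths` is hypothetical and is not taken from the repository's tooling.

```python
from pathlib import PurePosixPath


def doc_paths(service_name: str) -> dict[str, PurePosixPath]:
    """Return the conventional paths for a service, relative to the repository root.

    Hypothetical helper: it only encodes the naming convention described above.
    """
    return {
        "deployment": PurePosixPath(service_name),
        "service_page": PurePosixPath("docs/services") / f"{service_name}.md",
        "runbook": PurePosixPath("docs/runbooks") / f"{service_name}.md",
    }


paths = doc_paths("vaultwarden")
print(paths["service_page"])  # docs/services/vaultwarden.md
print(paths["runbook"])       # docs/runbooks/vaultwarden.md
```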
When a service page is mandatory
Every top-level deployment directory, whether it holds a platform component or an application workload, should eventually gain a service page.
The minimum standard is:
- Service overview and owner
- Architecture and dependency map
- Deployment source and namespace details
- Configuration and secrets handling notes
- Access methods and URLs
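One way to check a page against this minimum standard is to look for a heading per required topic. The sketch below assumes Markdown `#` headings and uses hypothetical section titles; the actual template headings may differ, and this is not the logic of scripts/validate_docs.py.

```python
import re

# Hypothetical heading titles, one per item in the minimum standard.
REQUIRED_SECTIONS = [
    "Overview",
    "Architecture",
    "Deployment",
    "Configuration",
    "Access",
]

# Matches an ATX heading line such as "## Overview" and captures the title.
HEADING_RE = re.compile(r"^#{1,6}[ \t]+(.+?)[ \t]*$", flags=re.MULTILINE)


def missing_sections(markdown_text: str, required=REQUIRED_SECTIONS) -> list[str]:
    """Return the required section titles that have no matching heading."""
    headings = {m.group(1).lower() for m in HEADING_RE.finditer(markdown_text)}
    return [name for name in required if name.lower() not in headings]


page = "# example\n\n## Overview\n\n## Access\n"
print(missing_sections(page))
```

A page that passes this check still needs human review; the check only confirms that the sections exist, not that they are filled in.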
When a runbook is mandatory
A dedicated runbook is required if any of the following are true:
- The service is Tier 0 or Tier 1
- The service is security-sensitive or part of the access path
- The service is shared, stateful, or required for platform recovery
- The service is externally exposed and has non-trivial rollback or restore steps
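The criteria above amount to a logical OR over four conditions, the last of which is itself an AND. A minimal sketch, assuming a hypothetical `ServiceProfile` record whose field names are invented for illustration:

```python
from dataclasses import dataclass


@dataclass
class ServiceProfile:
    """Hypothetical metadata record; field names are assumptions."""
    tier: int
    security_sensitive: bool = False
    on_access_path: bool = False
    shared: bool = False
    stateful: bool = False
    needed_for_recovery: bool = False
    externally_exposed: bool = False
    nontrivial_rollback_or_restore: bool = False


def runbook_required(s: ServiceProfile) -> bool:
    """Return True if any of the four runbook criteria applies."""
    return any([
        s.tier in (0, 1),
        s.security_sensitive or s.on_access_path,
        s.shared or s.stateful or s.needed_for_recovery,
        # External exposure alone is not enough; rollback or restore
        # must also be non-trivial.
        s.externally_exposed and s.nontrivial_rollback_or_restore,
    ])


print(runbook_required(ServiceProfile(tier=1)))                           # True
print(runbook_required(ServiceProfile(tier=3, externally_exposed=True)))  # False
```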
Current coverage snapshot
Status as of 2026-07-02:
- Current service pages: 26.
- Current runbooks: 23.
- Current deployment-like top-level directories detected by the validation heuristic: 58.
- Deployment directories with service pages: 25.
- fleet is documented as a GitOps bootstrap service even though it is not a normal deployment directory.
- Existing service pages and runbooks are enforced by scripts/validate_docs.py.
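The figures above put service page coverage at 25 of 58 detected deployment directories, roughly 43%. The sketch below shows how such a gap report could be produced. It is not the implementation of scripts/validate_docs.py: the function names are hypothetical, and the list of deployment directories is passed in explicitly because the detection heuristic is not described here.

```python
from pathlib import Path


def missing_service_pages(repo_root: Path, deployment_dirs: list[str]) -> list[str]:
    """Return the deployment directories that lack docs/services/NAME.md."""
    services_dir = repo_root / "docs" / "services"
    return sorted(
        name for name in deployment_dirs
        if not (services_dir / f"{name}.md").is_file()
    )


def coverage_percent(documented: int, total: int) -> float:
    """Return documented/total as a percentage rounded to one decimal place."""
    return round(100 * documented / total, 1)


# Figures from the snapshot above.
print(coverage_percent(25, 58))  # 43.1
```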
Current priority gaps
The next documentation backfill should prioritize services with the highest operational impact.
| Priority | Services | Why they come first |
|---|---|---|
| 1 | calico, csi-driver-nfs, longhorn, velero, democratic-csi | They control networking, storage, or recovery for the rest of the estate |
| 2 | cloudflared, crowdsec-lapi, headscale, wazuh | They protect or expose edge and security workflows |
| 3 | kube-prometheus-stack, mimir, grafana-agent, minio | They provide observability or shared state needed during incidents |
| 4 | arr-stack, netbox, seafile, portainer, uptime-kuma | They are stateful, externally useful, or operationally visible services |
| 5 | Remaining deployment directories | Lower blast radius or easier to backfill during routine maintenance |
Current catalogue pages
- actualbudget
- argocd
- authelia
- defectdojo
- Fleet
- forgejo
- gitea
- gitlab
- grafana-dashboard
- infisical
- k8s-monitoring
- lgtm-distributed
- loki
- metallb
- mealie
- nextcloud
- ollama
- openvas
- prometheus
- pulp3
- rancher
- renovate
- semaphore
- tailscale-operator
- Traefik
- vaultwarden
Service page creation workflow
- Copy the application template into docs/services/SERVICE_NAME.md.
- Fill the metadata block first so ownership, tier, clusters, and namespace are visible immediately.
- Add a separate runbook if the service meets the mandatory criteria.
- Add the service to mkdocs.yml and scripts/validate_docs.py once the README, service page, and required runbook exist.
- Add or update a changelog fragment when the service page is introduced during a broader deployment or validation change.
Completion standard
The catalogue is only complete when a new operator can locate a service, understand its dependencies, and open the correct runbook without searching outside the repository.