gitlab Runbook
Metadata
| Field | Value |
|---|---|
| Service | gitlab |
| Criticality | Tier 2 |
| Owner | Platform / SCM owner |
| Namespace | gitlab |
| Clusters | homelab |
| Last validated | 2026-05-20 |
| Related service page | ../services/gitlab.md |
Trigger Conditions
- GitLab UI or API is down.
- Git, registry, or web traffic fails.
- Omnibus startup loops or PostgreSQL errors occur.
- Storage pressure or migration failures are observed.
1. Health Checks
kubectl -n gitlab get pods,svc,pvc,ingressroute
kubectl -n gitlab logs deploy/gitlab --tail=200
kubectl -n gitlab get deploy,statefulset
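The pod listing above can be long, so a small filter helps surface only the pods that need attention. The following is a minimal sketch, not part of the original runbook: `unhealthy_pods` is a hypothetical helper name, and it assumes the default column layout of `kubectl get pods --no-headers` (name, ready count, status). The `gitlab-ctl status` line assumes the container runs the Omnibus image.

```shell
# Reads `kubectl get pods --no-headers` output on stdin and prints every pod
# that is not fully ready or not Running. Completed job pods are skipped.
unhealthy_pods() {
  awk '
    $3 == "Completed" { next }
    {
      split($2, ready, "/")
      if (ready[1] != ready[2] || $3 != "Running") print $1, $2, $3
    }'
}

# Usage (commented out so the block has no side effects when sourced):
#   kubectl -n gitlab get pods --no-headers | unhealthy_pods
#   kubectl -n gitlab rollout status deploy/gitlab --timeout=120s
#   kubectl -n gitlab exec deploy/gitlab -- gitlab-ctl status
```

Empty output from the filter means every listed pod is Running and fully ready.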
2. Troubleshooting Workflows
kubectl -n gitlab describe deploy gitlab
kubectl -n gitlab logs deploy/gitlab --tail=400
kubectl -n gitlab describe ingressroute
Focus on DB connectivity, storage saturation, ingress drift, and omnibus bootstrap errors.
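To narrow 400 lines of logs down to the focus areas listed above, the log stream can be bucketed by failure type. This is a rough sketch under stated assumptions: `triage_logs` is a hypothetical helper, and the match patterns are illustrative guesses at common PostgreSQL, disk, migration, and Omnibus reconfigure messages, not an exhaustive or authoritative list. Ingress drift does not show up in application logs and still needs the `describe ingressroute` output.

```shell
# Reads log text on stdin and counts lines per failure category.
# The patterns are illustrative assumptions; adjust them to the messages
# actually seen in this deployment.
triage_logs() {
  awk '
    { line = tolower($0) }
    line ~ /could not connect to server|connection refused|pg::|psql:/ { db++ }
    line ~ /no space left on device|disk quota exceeded|read-only file system/ { storage++ }
    line ~ /migrat/ && line ~ /error|fail/ { migration++ }
    line ~ /reconfigure|chef|bootstrap/ && line ~ /error|fail|fatal/ { bootstrap++ }
    END {
      printf "db=%d storage=%d migration=%d bootstrap=%d\n", db, storage, migration, bootstrap
    }'
}

# Usage (commented out so the block has no side effects when sourced):
#   kubectl -n gitlab logs deploy/gitlab --tail=400 | triage_logs
```

The category with the highest count is usually the place to start, but a single database error early in startup can cascade into many bootstrap errors, so read the earliest matching lines first.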
3. Disaster Recovery
- Restore DB state and omnibus data.
- Restore application secrets and TLS material.
- Reconcile gitlab/overlays/homelab.
- Validate UI, Git, and any exposed auxiliary endpoints.
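The in-container part of the restore can be sketched as below. Several things here are assumptions, not facts from this runbook: that backups were taken with the Omnibus `gitlab-backup` tool, that the `gitlab` Deployment is already running so `kubectl exec` works, and that the backup archive is already in the container's backup directory. `restore_gitlab` and `run` are hypothetical helper names. GitLab's backup archive does not include `/etc/gitlab/gitlab-secrets.json`, so secrets have to be restored separately before this step.

```shell
# DRY_RUN=1 (the default) prints each command instead of executing it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# restore_gitlab <backup-id>
# <backup-id> is the archive name without the _gitlab_backup.tar suffix.
# Assumes an Omnibus container and a backup created by gitlab-backup.
restore_gitlab() {
  backup_id="$1"
  [ -n "$backup_id" ] || { echo "usage: restore_gitlab <backup-id>" >&2; return 1; }

  # Stop the processes that hold database connections.
  run kubectl -n gitlab exec deploy/gitlab -- gitlab-ctl stop puma
  run kubectl -n gitlab exec deploy/gitlab -- gitlab-ctl stop sidekiq
  # Restore database and repository data; -it keeps the confirmation prompt.
  run kubectl -n gitlab exec -it deploy/gitlab -- gitlab-backup restore BACKUP="$backup_id"
  # Re-apply configuration, restart services, then run the built-in checks.
  run kubectl -n gitlab exec deploy/gitlab -- gitlab-ctl reconfigure
  run kubectl -n gitlab exec deploy/gitlab -- gitlab-ctl restart
  run kubectl -n gitlab exec deploy/gitlab -- gitlab-rake gitlab:check SANITIZE=true
}
```

Calling `restore_gitlab <backup-id>` with the default `DRY_RUN=1` prints the plan for review; setting `DRY_RUN=0` executes it. If the database runs outside the Omnibus container, the restore step needs to be replaced with that database's own restore procedure.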
4. Scaling and Resource Management
Resize CPU, memory, or storage in Git, not directly on the cluster, when the omnibus container or DB saturates.
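Before resizing storage, it helps to confirm which volumes are actually close to full. The sketch below is illustrative only: `over_threshold` is a hypothetical helper, it assumes `df` is available inside the container, and the 80 percent default is an arbitrary starting point. `kubectl top` additionally assumes metrics-server is installed in the cluster.

```shell
# Reads `df -P` output on stdin and prints mount points whose usage is at or
# above the given percentage (default 80).
over_threshold() {
  threshold="${1:-80}"
  awk -v t="$threshold" '
    NR > 1 {
      use = $5
      sub(/%/, "", use)
      if (use + 0 >= t + 0) print $6, use "%"
    }'
}

# Usage (commented out so the block has no side effects when sourced):
#   kubectl -n gitlab exec deploy/gitlab -- df -P | over_threshold 80
#   kubectl -n gitlab top pod --containers
```

Mount points containing spaces are not handled, which is acceptable for the usual container paths.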
5. Maintenance Procedures
- Rotate GitLab admin and DB secrets.
- Validate ingress and DNS after hostname changes.
- Schedule maintenance windows for image upgrades and DB-heavy tasks.
6. Rollback Strategy
- Revert the overlay to the last working revision.
- Restore the previous DB or omnibus backup if startup or migrations failed.
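The overlay revert can be sketched as a reviewable plan. The assumptions are stated up front: the overlay lives in a Git repository whose current branch is the one the cluster reconciles from, the remote is named `origin`, and something (a GitOps controller or a manual apply) rolls the change out after the push. `rollback_overlay` and `run` are hypothetical helper names. The offending commit can be found with `git log --oneline -- gitlab/overlays/homelab`.

```shell
# DRY_RUN=1 (the default) prints each command instead of executing it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# rollback_overlay <bad-commit>
# Reverts one commit, pushes the revert, and waits for the rollout.
# Assumes the remote is called origin and the branch is already checked out.
rollback_overlay() {
  bad_commit="$1"
  [ -n "$bad_commit" ] || { echo "usage: rollback_overlay <bad-commit>" >&2; return 1; }

  run git revert --no-edit "$bad_commit"
  run git push origin HEAD
  run kubectl -n gitlab rollout status deploy/gitlab --timeout=10m
}
```

A revert commit keeps the history intact, which makes the rollback itself easy to audit. If a migration already ran against the database, reverting the overlay alone may not be enough, and the backup restore in the second bullet applies.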
7. Post-Incident Actions
- Add a changelog fragment covering the recovery.
- Update the service page with any new operational constraints.
- Extend this runbook if the incident exposed missing steps.