nextcloud Runbook
Metadata
| Field | Value |
|---|---|
| Service | nextcloud |
| Criticality | Tier 2 |
| Owner | Application / platform owner |
| Namespace | nextcloud |
| Clusters | dev, local |
| Last validated | 2026-05-20 |
| Related service page | ../services/nextcloud.md |
Trigger Conditions
- Web UI or sync clients fail.
- File uploads or locking fail.
- MySQL or Redis is unhealthy.
- PVC-backed storage is unavailable.
1. Health Checks
kubectl -n nextcloud get pods,svc,pvc,ingressroute
kubectl -n nextcloud logs deploy/nextcloud --tail=200
kubectl -n nextcloud get deploy,statefulset
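To make the pod listing above quicker to scan, a small filter can surface only the pods that need attention. This is a minimal sketch, not part of the service's tooling: the function name `unhealthy_pods` is hypothetical, and it assumes the default column layout of `kubectl get pods --no-headers` (NAME, READY, STATUS, ...).

```shell
# Hypothetical helper: print pods that are not Running/Completed,
# or that are Running but not fully ready.
# Expects `kubectl get pods --no-headers` output on stdin.
unhealthy_pods() {
  awk '{
    split($2, ready, "/")   # READY column looks like "1/1"
    if ($3 != "Running" && $3 != "Completed")
      print $1, $3
    else if ($3 == "Running" && ready[1] != ready[2])
      print $1, "NotReady(" $2 ")"
  }'
}
```

One way to use it: `kubectl -n nextcloud get pods --no-headers | unhealthy_pods`. Empty output means every pod is Running and ready, or Completed.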
2. Troubleshooting Workflows
Check app, DB, cache, and storage together:
kubectl -n nextcloud logs deploy/nextcloud --tail=400
kubectl -n nextcloud logs statefulset/mysql --tail=100
kubectl -n nextcloud logs deploy/redis --tail=100
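Look for trusted-domain drift, pending or failed DB migrations, Redis lock issues, and PVC attach failures. PVC attach failures usually surface in pod events (kubectl -n nextcloud describe pod <pod>) rather than in container logs.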
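The four failure classes named above can be counted in one pass over captured logs or events. The sketch below is illustrative only: `triage_logs` is a hypothetical name, and the match patterns are assumptions about typical message text (for example `FailedAttachVolume` and `FailedMount` are Kubernetes event reasons), so adjust them to what this deployment actually logs.

```shell
# Hypothetical helper: count log/event lines per failure class.
# Patterns are assumptions, matched case-insensitively; tune as needed.
triage_logs() {
  awk '
    { line = tolower($0) }
    line ~ /trusted domain|untrusted domain/ { n["trusted-domain"]++ }
    line ~ /migration|occ upgrade/           { n["db-migration"]++ }
    line ~ /redis/                           { n["redis"]++ }
    line ~ /failedattachvolume|failedmount/  { n["pvc-attach"]++ }
    END {
      split("trusted-domain db-migration redis pvc-attach", order, " ")
      for (i = 1; i <= 4; i++) printf "%s=%d\n", order[i], n[order[i]]
    }'
}
```

For example, `kubectl -n nextcloud logs deploy/nextcloud --tail=400 | triage_logs` prints one `class=count` line per class. Piping `kubectl -n nextcloud get events` through the same function covers the PVC case, since attach failures are reported as events.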
3. Disaster Recovery
- Restore MySQL state.
- Restore file data from snapshot.
- Restore application secrets.
- Reconcile the active overlay.
- Validate web access and a sync-client login.
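The web-access half of the final validation step can be scripted against Nextcloud's `status.php` endpoint, which returns a small JSON document with `installed`, `maintenance`, and `needsDbUpgrade` fields. This is a minimal sketch under assumptions: `nextcloud_status_ok` is a hypothetical name, the JSON is assumed to be compact (no spaces after colons), and it does not replace the manual sync-client login.

```shell
# Hypothetical helper: succeed only if status.php JSON (on stdin) reports
# an installed instance that is out of maintenance mode and needs no
# DB upgrade. Assumes compact JSON as served by status.php.
nextcloud_status_ok() {
  body=$(cat)
  printf '%s\n' "$body" | grep -q '"installed":true' &&
  printf '%s\n' "$body" | grep -q '"maintenance":false' &&
  printf '%s\n' "$body" | grep -q '"needsDbUpgrade":false'
}
```

A possible invocation, with `NEXTCLOUD_HOST` as a placeholder for the real hostname: `curl -fsS "https://$NEXTCLOUD_HOST/status.php" | nextcloud_status_ok && echo "web OK"`.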
4. Scaling and Resource Management
Raise the resource requests and limits for the application, DB, or cache in Git when uploads, previews, or background jobs saturate the current allocation.
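Before changing allocations in Git, it can help to confirm which pods are actually close to saturation. The sketch below assumes metrics-server is installed so that `kubectl top` works, and that memory is reported in `Mi`; the function name `pods_over_memory` and the threshold are hypothetical.

```shell
# Hypothetical helper: list pods whose memory usage exceeds a threshold.
# Usage: kubectl top pods --no-headers | pods_over_memory <threshold_Mi>
# Assumes columns NAME CPU MEMORY and memory values such as "512Mi".
pods_over_memory() {
  awk -v limit="$1" '{
    mem = $3
    sub(/Mi$/, "", mem)          # strip the unit suffix
    if (mem + 0 > limit + 0) print $1, $3
  }'
}
```

For example, `kubectl -n nextcloud top pods --no-headers | pods_over_memory 512` lists pods using more than 512Mi. Pick a threshold relative to the limits currently set in the overlay.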
5. Maintenance Procedures
- Rotate DB and SMTP credentials.
- Review trusted domains after hostname changes.
- Plan upgrades during low-traffic windows because migrations can be slow.
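The trusted-domain review after a hostname change can be checked mechanically. Nextcloud's `occ config:system:get trusted_domains` prints one configured domain per line; the sketch below tests whether a given hostname is in that list. The function name `is_trusted_domain` is hypothetical, the match is exact (wildcard entries such as `*.example.test` are not expanded), and depending on the image `occ` may need to run as the web server user rather than the container default.

```shell
# Hypothetical helper: succeed if the hostname given as $1 appears
# verbatim in a newline-separated trusted_domains list on stdin.
is_trusted_domain() {
  grep -Fxq -- "$1"
}
```

A possible invocation, with `cloud.example.test` standing in for the real hostname: `kubectl -n nextcloud exec deploy/nextcloud -- php occ config:system:get trusted_domains | is_trusted_domain cloud.example.test || echo "hostname missing from trusted_domains"`.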
6. Rollback Strategy
- Revert the active overlay revision.
- Restore the previous DB and file snapshot if schema or config changes broke startup.
7. Post-Incident Actions
- Add a changelog fragment for manual interventions.
- Update the service page if dependencies or hostnames changed.
- Extend this runbook with any newly observed operational issue.