defectdojo Runbook
Metadata
| Field | Value |
|---|---|
| Service | defectdojo |
| Criticality | Tier 2 |
| Owner | Platform / Security owner |
| Namespace | defectdojo |
| Clusters | homelab |
| Last validated | 2026-05-23 |
| Related service page | ../services/defectdojo.md |
Trigger Conditions
- DefectDojo UI is unavailable.
- Login fails for the bootstrap admin on a fresh install.
- The initializer Job fails or stays blocked.
- PostgreSQL, Valkey, or the Django pod becomes unhealthy.
1. Health Checks
kubectl -n defectdojo get pods,job,svc,pvc,ingressroute
kubectl -n defectdojo logs job/defectdojo-initializer --tail=200
kubectl -n defectdojo logs deploy/defectdojo-django -c uwsgi --tail=200
kubectl -n defectdojo logs deploy/defectdojo-django -c nginx --tail=200
Check that the initializer Job completed successfully before debugging the login flow.
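The completion check can be scripted. The following is a minimal sketch, assuming the Job reports success through the standard `.status.succeeded` counter; the function name `initializer_succeeded` is hypothetical and not part of the deployment.

```shell
# Succeeds only when the initializer Job reports at least one successful pod.
# Uses the Job name and namespace from the health-check commands above.
initializer_succeeded() {
  succeeded="$(kubectl -n defectdojo get job defectdojo-initializer \
    -o jsonpath='{.status.succeeded}' 2>/dev/null)"
  # An empty or non-numeric value is treated as "not completed".
  [ "${succeeded:-0}" -ge 1 ] 2>/dev/null
}

# Example:
# initializer_succeeded && echo "initializer completed" || echo "initializer not complete"
```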
2. Troubleshooting Workflows
Bootstrap admin password does not work on first startup
- Confirm the expected password stored in the cluster Secret.
kubectl -n defectdojo get secret defectdojo \
-o jsonpath='{.data.DD_ADMIN_PASSWORD}' | base64 -d && echo
- Inspect the initializer logs.
- If the initializer is stuck in `CreateContainerConfigError` because `DD_ADMIN_PASSWORD` is missing, fix the local secret input and re-reconcile the overlay. This is the intended fail-fast behavior.
- If the initializer already completed during an earlier broken bootstrap and the `dojo-admin` password still fails, reset it from the `uwsgi` container.
kubectl -n defectdojo exec -it deploy/defectdojo-django -c uwsgi -- \
python manage.py changepassword dojo-admin
- If `dojo-admin` does not exist or bootstrap created the wrong account state, create a new superuser.
kubectl -n defectdojo exec -it deploy/defectdojo-django -c uwsgi -- \
python manage.py createsuperuser
- After manual recovery, sign in through the UI and record which account is now the primary emergency admin.
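To tell a blocked initializer apart from one that already completed, the waiting reason of its pods can be read directly. This is a sketch under assumptions: it relies on the `job-name` label that the Kubernetes Job controller adds to its pods, and both function names are hypothetical.

```shell
# Prints the waiting reason(s) of the initializer Job's pods, if any.
# Relies on the job-name label that the Job controller sets on its pods.
initializer_waiting_reason() {
  kubectl -n defectdojo get pods -l job-name=defectdojo-initializer \
    -o jsonpath='{.items[*].status.containerStatuses[*].state.waiting.reason}'
}

# Returns 0 when the initializer is blocked on container configuration,
# which is the state a missing DD_ADMIN_PASSWORD key is expected to produce.
initializer_blocked_on_config() {
  initializer_waiting_reason | grep -q 'CreateContainerConfigError'
}
```

If `initializer_blocked_on_config` succeeds, fix the secret input first; otherwise continue with the password reset or superuser steps.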
General runtime diagnostics
kubectl -n defectdojo describe job defectdojo-initializer
kubectl -n defectdojo describe deploy defectdojo-django
kubectl -n defectdojo logs deploy/defectdojo-celery-worker --tail=100
kubectl -n defectdojo logs statefulset/defectdojo-postgresql --tail=100
kubectl -n defectdojo logs statefulset/defectdojo-valkey --tail=100
Look first for missing secrets, DB auth failures, CSRF or host configuration issues, and uwsgi restarts caused by memory pressure.
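A log filter can speed up that first pass. The sketch below narrows log output to lines that commonly indicate those failure classes; the patterns are illustrative assumptions rather than an exhaustive list, and `triage_filter` is a hypothetical helper name.

```shell
# Filters log lines on stdin down to likely secret, DB auth, CSRF/host,
# and memory-related failures. Patterns are assumptions; extend as needed.
triage_filter() {
  grep -Ei 'secret.*not found|password authentication failed|csrf|disallowedhost|oomkilled|out of memory'
}

# Example:
# kubectl -n defectdojo logs deploy/defectdojo-django -c uwsgi --tail=500 | triage_filter
```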
3. Disaster Recovery
- Restore PostgreSQL from backup.
- Restore the
defectdojoand database-related secrets. - Reconcile defectdojo/overlays/homelab.
- Validate the initializer Job, login, and a basic page load.
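The final validation step can be partly automated. The following is a rough sketch, assuming the resource names used elsewhere in this runbook; the function name and the URL argument are hypothetical, and the interactive login check still has to be done by hand.

```shell
# Post-restore validation sketch.
# $1 is the external DefectDojo URL (a placeholder you must supply).
validate_defectdojo() {
  base_url="$1"
  # Wait for the initializer Job to report completion.
  kubectl -n defectdojo wait --for=condition=complete \
    job/defectdojo-initializer --timeout=300s || return 1
  # Wait for the Django Deployment to finish rolling out.
  kubectl -n defectdojo rollout status deploy/defectdojo-django \
    --timeout=300s || return 1
  # Basic page load: follow redirects, fail on HTTP errors, discard the body.
  curl -fsSL -o /dev/null "$base_url" || return 1
}

# Example (hypothetical hostname):
# validate_defectdojo "https://defectdojo.example.invalid"
```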
4. Scaling and Resource Management
If login POST requests or imports restart uwsgi, increase Django resources or reduce process count in defectdojo/overlays/homelab/values.yaml.
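Before changing resources, it helps to confirm that memory pressure is the actual cause. The sketch below reads the last termination reason of the `uwsgi` container; the function names are hypothetical. It only detects container-level OOM kills, so a worker process killed inside a still-running container would not show up here.

```shell
# Prints the last termination reason of every uwsgi container in the namespace.
uwsgi_last_termination() {
  kubectl -n defectdojo get pods \
    -o jsonpath='{.items[*].status.containerStatuses[?(@.name=="uwsgi")].lastState.terminated.reason}'
}

# Returns 0 when a uwsgi container was last terminated by the OOM killer.
uwsgi_oom_killed() {
  uwsgi_last_termination | grep -q 'OOMKilled'
}
```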
5. Maintenance Procedures
- Rotate `DD_ADMIN_PASSWORD`, `DD_SECRET_KEY`, and `DD_CREDENTIAL_AES_256_KEY` through the local secret workflow.
- Revalidate the external URL and CSRF trusted origins after ingress changes.
- Keep a tested emergency superuser recovery path using the `uwsgi` container.
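The CSRF revalidation after an ingress change can be sketched as a simple membership check. This assumes the trusted origins are a comma-separated list in which each entry includes its scheme, as Django expects; the helper name, the `DD_CSRF_TRUSTED_ORIGINS` variable name, and the example hostname are assumptions to verify against your overlay.

```shell
# Returns 0 when origin $1 appears in the comma-separated list $2.
# Matching is exact, so "https://host" and "host" are different origins.
origin_is_trusted() {
  origin="$1"
  printf '%s\n' "$2" | tr ',' '\n' | sed 's/^ *//; s/ *$//' | grep -Fxq -- "$origin"
}

# Example (variable name and hostname are assumptions):
# origin_is_trusted "https://defectdojo.example.invalid" \
#   "$(kubectl -n defectdojo exec deploy/defectdojo-django -c uwsgi -- printenv DD_CSRF_TRUSTED_ORIGINS)"
```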
6. Rollback Strategy
- Revert to the last known-good DefectDojo overlay and Fleet change in Git.
- If the failure involved bad bootstrap or migrations on disposable data only, consider restoring the database before redeploying.
7. Post-Incident Actions
- Update the README or service page if the bootstrap procedure changed again.
- Extend this runbook if a new admin recovery pattern was required.
- Record whether the issue came from missing secret material, bad bootstrap order, or runtime resource limits.