defectdojo Runbook

Metadata

Field | Value
Service | defectdojo
Criticality | Tier 2
Owner | Platform / Security owner
Namespace | defectdojo
Clusters | homelab
Last validated | 2026-05-23
Related service page | ../services/defectdojo.md

Trigger Conditions

  • DefectDojo UI is unavailable.
  • Login fails for the bootstrap admin on a fresh install.
  • The initializer Job fails or stays blocked.
  • PostgreSQL, Valkey, or the Django pod becomes unhealthy.

1. Health Checks

kubectl -n defectdojo get pods,job,svc,pvc,ingressroute
kubectl -n defectdojo logs job/defectdojo-initializer --tail=200
kubectl -n defectdojo logs deploy/defectdojo-django -c uwsgi --tail=200
kubectl -n defectdojo logs deploy/defectdojo-django -c nginx --tail=200

Check that the initializer Job completed successfully before debugging the login flow.
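
The completion check can be scripted. The following is a minimal sketch, assuming the Job name used above; the function name initializer_succeeded is illustrative and not part of the deployment. It reads the Job's .status.succeeded counter and succeeds only when at least one pod has finished successfully.

```shell
# Sketch: succeed only if the initializer Job reports at least one succeeded pod.
# The function name is hypothetical; the Job name comes from this runbook.
initializer_succeeded() {
  local succeeded
  succeeded="$(kubectl -n defectdojo get job defectdojo-initializer \
    -o jsonpath='{.status.succeeded}' 2>/dev/null)"
  # An empty value means the Job has not completed or does not exist.
  [ "${succeeded:-0}" -ge 1 ] 2>/dev/null
}
```

Usage: run `initializer_succeeded && echo "initializer complete" || echo "initializer not complete"` before moving on to the login troubleshooting steps.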

2. Troubleshooting Workflows

Bootstrap admin password does not work on first startup

  1. Confirm the expected password stored in the cluster Secret.
kubectl -n defectdojo get secret defectdojo \
  -o jsonpath='{.data.DD_ADMIN_PASSWORD}' | base64 -d && echo
  2. Inspect the initializer logs.
kubectl -n defectdojo logs job/defectdojo-initializer --tail=200
  3. If the initializer is stuck in CreateContainerConfigError because DD_ADMIN_PASSWORD is missing, fix the local secret input and re-reconcile the overlay. This is the intended fail-fast behavior.
  4. If the initializer already completed during an earlier broken bootstrap and the dojo-admin password still fails, reset it from the uwsgi container.
kubectl -n defectdojo exec -it deploy/defectdojo-django -c uwsgi -- \
  python manage.py changepassword dojo-admin
  5. If dojo-admin does not exist or bootstrap created the wrong account state, create a new superuser.
kubectl -n defectdojo exec -it deploy/defectdojo-django -c uwsgi -- \
  python manage.py createsuperuser
  6. After manual recovery, sign in through the UI and record which account is now the primary emergency admin.
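
The choice between fixing the secret input and resetting the account can be sketched as a small triage helper. This is an illustration only: the function name bootstrap_triage and its output labels are hypothetical, and it assumes the initializer pods carry the job-name label that the Job controller normally sets. The password value is only tested for presence and never printed.

```shell
# Sketch: print which recovery branch applies. Labels are illustrative only.
bootstrap_triage() {
  local reason encoded
  # Waiting reason(s) of the initializer pod containers, if any.
  reason="$(kubectl -n defectdojo get pods -l job-name=defectdojo-initializer \
    -o jsonpath='{.items[*].status.containerStatuses[*].state.waiting.reason}' 2>/dev/null)"
  # Base64-encoded value of the key; checked for presence, not decoded.
  encoded="$(kubectl -n defectdojo get secret defectdojo \
    -o jsonpath='{.data.DD_ADMIN_PASSWORD}' 2>/dev/null)"

  case "$reason" in
    *CreateContainerConfigError*) echo "fix-secret-input"; return 0 ;;
  esac
  if [ -z "$encoded" ]; then
    echo "fix-secret-input"
  else
    echo "reset-or-create-admin"
  fi
}
```

A result of fix-secret-input points at repairing the secret and re-reconciling the overlay; reset-or-create-admin points at the changepassword or createsuperuser steps.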

General runtime diagnostics

kubectl -n defectdojo describe job defectdojo-initializer
kubectl -n defectdojo describe deploy defectdojo-django
kubectl -n defectdojo logs deploy/defectdojo-celery-worker --tail=100
kubectl -n defectdojo logs statefulset/defectdojo-postgresql --tail=100
kubectl -n defectdojo logs statefulset/defectdojo-valkey --tail=100

Look first for missing secrets, DB auth failures, CSRF or host configuration issues, and uwsgi restarts caused by memory pressure.
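
To spot restarts caused by memory pressure without reading every describe output, restart counts and the last termination reason can be listed per container. This is a sketch using kubectl's JSONPath output; the helper names container_restarts and oom_killed_containers are illustrative.

```shell
# Sketch: one line per pod, formatted as "<pod>\t<container>=<restarts>:<last reason> ...".
container_restarts() {
  kubectl -n defectdojo get pods \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{range .status.containerStatuses[*]}{.name}={.restartCount}:{.lastState.terminated.reason} {end}{"\n"}{end}'
}

# Keep only pods with a container whose last termination reason was OOMKilled.
oom_killed_containers() {
  container_restarts | grep OOMKilled
}
```

An entry such as uwsgi=3:OOMKilled would indicate that the uwsgi container was last terminated by the kernel OOM killer, which ties in with the resource guidance in section 4.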

3. Disaster Recovery

  1. Restore PostgreSQL from backup.
  2. Restore the defectdojo and database-related secrets.
  3. Reconcile defectdojo/overlays/homelab.
  4. Validate the initializer Job, login, and a basic page load.
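
The validation in step 4 can be partly automated. The sketch below rests on assumptions: the base URL is passed by the caller because this runbook does not state the external hostname, and the /login path is assumed to be the DefectDojo login page. The function name validate_recovery is hypothetical. It does not replace an actual interactive login.

```shell
# Sketch: check the initializer Job and a basic page load after a restore.
# Argument 1: external base URL of the UI (supplied by the operator).
validate_recovery() {
  local base_url="$1" succeeded code
  succeeded="$(kubectl -n defectdojo get job defectdojo-initializer \
    -o jsonpath='{.status.succeeded}' 2>/dev/null)"
  if ! [ "${succeeded:-0}" -ge 1 ] 2>/dev/null; then
    echo "initializer Job has not succeeded"
    return 1
  fi
  # Assumption: the login page is served under /login.
  code="$(curl -s -o /dev/null -w '%{http_code}' "${base_url}/login")"
  case "$code" in
    2*|3*) echo "login page reachable (HTTP ${code})" ;;
    *) echo "login page check failed (HTTP ${code})"; return 1 ;;
  esac
}
```

Usage: `validate_recovery "https://<your-defectdojo-host>"`, then sign in manually to confirm that authentication works against the restored database.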

4. Scaling and Resource Management

kubectl -n defectdojo top pod
kubectl -n defectdojo describe deploy defectdojo-django

If login POST requests or imports cause uwsgi to restart, increase the Django resources or reduce the uwsgi process count in defectdojo/overlays/homelab/values.yaml.
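
Before changing values, it helps to confirm what memory limit the uwsgi container currently runs with. A minimal sketch, assuming the container is named uwsgi as in the log commands above; the function name uwsgi_memory_limit is illustrative.

```shell
# Sketch: print the memory limit of the uwsgi container, or "unset" if none is defined.
uwsgi_memory_limit() {
  local limit
  limit="$(kubectl -n defectdojo get deploy defectdojo-django \
    -o jsonpath='{.spec.template.spec.containers[?(@.name=="uwsgi")].resources.limits.memory}' 2>/dev/null)"
  echo "${limit:-unset}"
}
```

Comparing this value with the usage reported by kubectl top pod shows how much headroom the container has before it is restarted.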

5. Maintenance Procedures

  • Rotate DD_ADMIN_PASSWORD, DD_SECRET_KEY, and DD_CREDENTIAL_AES_256_KEY through the local secret workflow.
  • Revalidate the external URL and CSRF trusted origins after ingress changes.
  • Keep a tested emergency superuser recovery path using the uwsgi container.
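
After a rotation, a quick presence check can confirm that no key was dropped. This sketch assumes all three keys live in the defectdojo Secret; the runbook only confirms that for DD_ADMIN_PASSWORD, so adjust the Secret name if the others are stored elsewhere. The function name missing_secret_keys is hypothetical, and values are never printed.

```shell
# Sketch: print the name of each expected key that is absent or empty in the Secret.
# Assumption: all three keys are stored in the "defectdojo" Secret.
missing_secret_keys() {
  local key value
  for key in DD_ADMIN_PASSWORD DD_SECRET_KEY DD_CREDENTIAL_AES_256_KEY; do
    value="$(kubectl -n defectdojo get secret defectdojo \
      -o jsonpath="{.data.${key}}" 2>/dev/null)"
    [ -n "$value" ] || echo "$key"
  done
}
```

Empty output means every key is present; any printed name should be fixed in the local secret workflow before the overlay is reconciled.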

6. Rollback Strategy

  • Revert to the last known-good DefectDojo overlay and Fleet change in Git.
  • If the failure involved a bad bootstrap or bad migrations and only disposable data is affected, consider restoring the database before redeploying.

7. Post-Incident Actions

  1. Update the README or service page if the bootstrap procedure changed again.
  2. Extend this runbook if a new admin recovery pattern was required.
  3. Record whether the issue came from missing secret material, bad bootstrap order, or runtime resource limits.