ollama

Metadata

| Field | Value |
| --- | --- |
| Service | ollama |
| Purpose | Local LLM inference service with PVC-backed model cache |
| Criticality | Tier 3 |
| Owner | Application / platform owner |
| Clusters | homelab |
| Namespace | ollama |
| Exposure | internet |
| Stateful | yes |
| Backup class | snapshot |
| RPO / RTO | Model cache can be rebuilt (RPO); 1 to 2 hours to restore service (RTO) |
| Last reviewed | 2026-05-20 |

1. Service Overview

Ollama serves local model inference APIs for interactive or automation workloads and stores pulled models on a persistent volume.

Summary

If Ollama fails, LLM-backed workflows stop, but model artifacts can usually be re-pulled if the cache is lost.

Dependencies

| Dependency | Type | Why it matters |
| --- | --- | --- |
| PVC-backed storage | storage | Stores downloaded model artifacts |
| Traefik | ingress | Exposes the external API |
| Node scheduling | runtime | Determines where model execution occurs |

2. Architecture Diagram

[Client]
  -> [Traefik]
  -> [Ollama API]
  -> [PVC-backed model cache]

3. Deployment Specifications

| Item | Value |
| --- | --- |
| Source path | ollama/base and ollama/overlays/homelab |
| Deployment model | Kustomize plus Fleet bundle |
| Namespace | ollama |
| Workload kind | Deployment |
| Chart or image version | See base manifests for the current image tag |
| Config files | base/kustomization.yaml, overlays/homelab/kustomization.yaml, fleet.yaml |

Cluster mapping

| Cluster | Overlay path | Notes |
| --- | --- | --- |
| homelab | ollama/overlays/homelab | Current deployment target |

4. Configuration Guide

Environment variables

| Variable | Source | Purpose | Secret? |
| --- | --- | --- | --- |
| Ollama runtime settings | overlay manifests and optional secrets | API and model-cache behavior | mixed |

ConfigMaps

| Resource | Path | Purpose |
| --- | --- | --- |
| Kustomize-managed runtime config | ollama/base and ollama/overlays/homelab | Scheduling and service customization |

Secrets management

  • Secret names: optional runtime secrets in the ollama namespace
  • Source of truth: overlay inputs and generated manifests
  • Rotation trigger: API or external integration changes
  • Recovery note: restore secrets and remount the model cache PVC before restart if needed

5. Access Protocols

| Path | URL or endpoint | Audience | Auth | TLS terminates at |
| --- | --- | --- | --- | --- |
| Internal | Ollama service in the ollama namespace | Cluster workloads | namespace RBAC | Traefik / Ollama |
| External | https://ollama.mutana.site | Operators and integrated clients | ingress policy | Traefik |
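
As an illustration of how a client might use the external endpoint, the sketch below lists the models currently held in the cache through Ollama's /api/tags endpoint. It is a minimal sketch, not the integration used by any particular client: the function names are hypothetical, and any headers or credentials required by the ingress policy are not documented here and are therefore omitted.

```python
import json
import urllib.request

# Assumption: the external endpoint listed in the table above. In-cluster
# workloads would use the Service address in the ollama namespace instead.
BASE_URL = "https://ollama.mutana.site"


def build_tags_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a GET request for Ollama's model-listing endpoint (/api/tags)."""
    return urllib.request.Request(f"{base_url.rstrip('/')}/api/tags", method="GET")


def parse_model_names(payload: bytes) -> list[str]:
    """Extract model names from an /api/tags JSON response body."""
    data = json.loads(payload)
    return [model["name"] for model in data.get("models", [])]


def list_models(base_url: str = BASE_URL, timeout: float = 10.0) -> list[str]:
    """Fetch the names of cached models. This performs a network call."""
    request = build_tags_request(base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_model_names(response.read())
```

An empty list from list_models() after a restore would suggest the model cache PVC is empty or not mounted, and models need to be re-pulled.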

6. Operations and Observability

  • Primary health indicators: Deployment healthy, API responsive, and model cache mounted.
  • Dashboards or alerts: shared cluster monitoring and pod resource metrics.
  • Log locations: Ollama pod logs.
  • Known failure modes: large-model disk exhaustion, node scheduling issues, or ingress/API errors.
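
Since large-model disk exhaustion is a listed failure mode, a simple fill-level check on the model cache can give early warning. The following is a rough sketch under stated assumptions: the path /root/.ollama is Ollama's usual default data directory in a container, but the actual mount path is set in the base manifests and may differ, and the 85% threshold and function names are illustrative only.

```python
import shutil

# Assumption: default Ollama data directory inside the container. Check the
# volumeMount in the base manifests for the real path.
DEFAULT_CACHE_PATH = "/root/.ollama"


def cache_usage_ratio(path: str = DEFAULT_CACHE_PATH) -> float:
    """Return the used fraction (0.0 to 1.0) of the filesystem holding path."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Some virtual filesystems report zero capacity; treat as empty.
        return 0.0
    return usage.used / usage.total


def cache_is_near_full(path: str = DEFAULT_CACHE_PATH, threshold: float = 0.85) -> bool:
    """Return True when the cache filesystem is at or above the threshold."""
    return cache_usage_ratio(path) >= threshold
```

Run from inside the pod, a check like this could feed an existing alerting path; it does not replace the shared cluster monitoring mentioned above.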

7. Backup and Recovery Notes

  • Backup method: PVC snapshot when local model cache preservation matters.
  • Restore prerequisites: enough storage capacity and runtime secrets if used.
  • Related runbook: not required for this lower-blast-radius service.
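
For the PVC snapshot method, one possible shape of a snapshot request is sketched below. It builds a VolumeSnapshot manifest (snapshot.storage.k8s.io/v1) as JSON, which kubectl can apply directly. This assumes the cluster has a CSI driver with snapshot support installed; the PVC name, snapshot name, and snapshot class used here are hypothetical placeholders, not values taken from the manifests.

```python
import json


def volume_snapshot_manifest(
    pvc_name: str,
    snapshot_name: str,
    snapshot_class: str,
    namespace: str = "ollama",
) -> dict:
    """Build a VolumeSnapshot manifest for the given PVC."""
    return {
        "apiVersion": "snapshot.storage.k8s.io/v1",
        "kind": "VolumeSnapshot",
        "metadata": {"name": snapshot_name, "namespace": namespace},
        "spec": {
            "volumeSnapshotClassName": snapshot_class,
            "source": {"persistentVolumeClaimName": pvc_name},
        },
    }


if __name__ == "__main__":
    # Hypothetical names: replace with the real PVC and snapshot class.
    manifest = volume_snapshot_manifest(
        pvc_name="ollama-models",
        snapshot_name="ollama-models-manual",
        snapshot_class="csi-snapclass",
    )
    print(json.dumps(manifest, indent=2))
```

The printed JSON can be piped to kubectl apply -f - by an operator with access to the ollama namespace. Because the cache can be rebuilt, taking a snapshot is only worthwhile when re-pulling the models would be slow or costly.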

8. Release and Change Notes

  • Current deployed app version: see ollama/base image tags.
  • Current chart version: N/A.
  • Last significant change: homelab overlay documented with the current Traefik exposure path.
  • Rollback reference: previous overlay revision in Git.