Infrastructure Layer¶
Scope¶
The infrastructure layer documents the substrate on which the Kubernetes estate depends: nodes, bootstrap assumptions, cluster prerequisites, and external services required before any workload can run.
Components that belong here¶
Typical repository examples and adjacent dependencies:
- rancher for platform administration and cluster lifecycle visibility
- node-feature-discovery for node capability awareness
- kube-metrics-server for node and pod resource metrics
- Any external host, hypervisor, DNS, NTP, UPS, or hardware inventory references that are not stored directly in this repository
Mandatory topics¶
Every infrastructure-layer document should answer:
- What clusters exist and what is the role of each one?
- What node classes, labels, taints, and hardware constraints exist?
- What prerequisites are required before the Kubernetes control plane or workloads can start?
- Which external dependencies are single points of failure?
- What is the minimum recovery path after a full-cluster or node loss event?
Required artifacts¶
- Cluster inventory table with purpose, environment, and owner
- Node role and labeling strategy
- Bootstrap or rebuild checklist
- External dependency register, including DNS, certificate, or storage prerequisites
- Recovery assumptions, including what must exist before GitOps reconciliation can resume
Recommended evidence¶
- Current node list and versions
- Control-plane access method
- Hardware or virtualization dependencies
- Time synchronization and DNS dependencies
- Links to storage and networking layer documents when recovery crosses those boundaries
Cluster inventory¶
| Slug | Type | K8s version | CNI | Storage class | Fleet namespace |
|---|---|---|---|---|---|
| homelab | Rancher management cluster | managed by Rancher | Rancher default | varies | fleet-default |
| local | k3s 3-node mixed-arch | v1.35.4+k3s1 | flannel / WireGuard | local-path | fleet-default |
| jls | JLS downstream | varies | varies | jelastic-dynamic-volume | fleet-default |
Node inventory — cluster local¶
| Hostname | Status | Role | Architecture | OS | Kernel | Container runtime |
|---|---|---|---|---|---|---|
| oci-arm | Ready | control-plane, edge | arm64 | Ubuntu 24.04.4 LTS | 6.17.0-1011-oracle | containerd 2.2.3-k3s1 |
| layer7-vps1 | Ready | workload | amd64 | Ubuntu 24.04.4 LTS | 6.8.0-110-generic | containerd 2.2.3-k3s1 |
| oci-arm-free1 | Ready | workload | arm64 | Ubuntu 25.10 | 6.17.0-1010-realtime | containerd 2.2.3-k3s1 |
Network topology¶
- All three nodes are interconnected via flannel with the WireGuard backend.
- oci-arm: internal IP 192.168.3.240, external IP 89.168.42.7 (Oracle Cloud eu-PAR).
- oci-arm-free1: internal IP 192.168.2.203, external IP 130.61.236.200 (Oracle Cloud eu-fra).
- layer7-vps1: internal and external IP are both 185.55.240.102 (Layer7 eu-PAR).
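
The address layout above can be checked mechanically. The following is a minimal sketch, using only Python's standard `ipaddress` module, that records the listed addresses and reports which nodes have no private internal address. The `NODES` dictionary and the function name are illustrative and are not part of the repository.

```python
import ipaddress

# Addresses copied from the list above; the dictionary layout is illustrative.
NODES = {
    "oci-arm": {"internal": "192.168.3.240", "external": "89.168.42.7"},
    "oci-arm-free1": {"internal": "192.168.2.203", "external": "130.61.236.200"},
    "layer7-vps1": {"internal": "185.55.240.102", "external": "185.55.240.102"},
}


def nodes_without_private_address(nodes):
    """Return hostnames whose internal IP is not private per ipaddress.is_private."""
    return sorted(
        name
        for name, addrs in nodes.items()
        if not ipaddress.ip_address(addrs["internal"]).is_private
    )


if __name__ == "__main__":
    print(nodes_without_private_address(NODES))
```

Under these assumptions only layer7-vps1 is reported. The two Oracle nodes also sit in different private subnets, which is consistent with the cluster relying on the WireGuard backend across external addresses rather than on a shared private network.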
Node labels and scheduling conventions¶
| Label | Values | Purpose |
|---|---|---|
| kubernetes.io/arch | arm64, amd64 | Architecture awareness; prefer nodeAffinity over hard nodeSelector |
| node.io/ingress | "true", "false" | Pins edge singletons (Traefik, Authelia) to the ingress node |
| node.io/role | edge, workload | Expresses the node's intended function |
| node.io/provider | oracle, layer7 | Cloud or datacenter origin |
| node.io/region | eu-PAR, eu-fra | Geographic region |
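
The conventions in the table can be sketched as a helper that builds the `affinity` stanza of a pod spec: a soft preference on kubernetes.io/arch and, for edge singletons, a hard requirement on node.io/ingress. This is an illustration only; the function name, its parameters, and the default weight are assumptions, while the field names follow the Kubernetes pod spec.

```python
import json


def build_affinity(preferred_arch=None, pin_to_ingress=False, weight=50):
    """Build a pod-spec `affinity` stanza following the label conventions above.

    preferred_arch: soft preference on kubernetes.io/arch (for example "arm64").
    pin_to_ingress: hard requirement that node.io/ingress equals "true".
    weight: hypothetical default; Kubernetes accepts values from 1 to 100.
    """
    node_affinity = {}
    if preferred_arch is not None:
        # Soft rule: the scheduler prefers this architecture but may ignore it.
        node_affinity["preferredDuringSchedulingIgnoredDuringExecution"] = [
            {
                "weight": weight,
                "preference": {
                    "matchExpressions": [
                        {
                            "key": "kubernetes.io/arch",
                            "operator": "In",
                            "values": [preferred_arch],
                        }
                    ]
                },
            }
        ]
    if pin_to_ingress:
        # Hard rule: the pod only schedules on nodes labelled as ingress nodes.
        node_affinity["requiredDuringSchedulingIgnoredDuringExecution"] = {
            "nodeSelectorTerms": [
                {
                    "matchExpressions": [
                        {
                            "key": "node.io/ingress",
                            "operator": "In",
                            "values": ["true"],
                        }
                    ]
                }
            ]
        }
    return {"nodeAffinity": node_affinity} if node_affinity else {}


if __name__ == "__main__":
    print(json.dumps(build_affinity("arm64", pin_to_ingress=True), indent=2))
```

The printed JSON can be placed under `spec.affinity` of a pod template. Keeping the architecture rule soft lets a workload still schedule when the preferred architecture is unavailable, whereas the ingress rule is deliberately hard.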
Storage constraint¶
The cluster uses local-path as its only StorageClass. Each PV is provisioned on the node where the consuming pod first schedules and stays tied to that node. For stateful workloads, set the annotation volume.kubernetes.io/selected-node: <hostname> on the PVC to pin the volume to a specific node, so that cross-node rescheduling does not leave the PVC Pending.
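
As a minimal sketch of the pinning described above, the helper below builds a PVC manifest that carries the selected-node annotation. The claim name, namespace, and size are hypothetical placeholders; only the annotation key and the local-path StorageClass come from this document.

```python
import json


def pinned_pvc(name, namespace, hostname, size="1Gi"):
    """Build a PVC manifest pinned to one node via the selected-node annotation.

    name, namespace, and size are hypothetical placeholders for illustration.
    """
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {
            "name": name,
            "namespace": namespace,
            "annotations": {
                # Pins provisioning of the volume to the named node.
                "volume.kubernetes.io/selected-node": hostname,
            },
        },
        "spec": {
            "storageClassName": "local-path",
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    print(json.dumps(pinned_pvc("example-data", "default", "layer7-vps1"), indent=2))
```

Because JSON is a subset of YAML, the printed manifest can be committed as-is or converted to YAML. The hostname passed in should match a value from the Hostname column of the node inventory.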
Control-plane access¶
The control plane runs on oci-arm. Access the cluster API at https://89.168.42.7:6443 (the external IP of oci-arm) or through the kubeconfig issued by k3s.
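
For remote access, the kubeconfig issued by k3s usually needs its server address changed. The sketch below assumes the file was copied from the k3s default location (/etc/rancher/k3s/k3s.yaml), where the server field points to the loopback address by default; the function name and the sample text are illustrative.

```python
def point_kubeconfig_at(kubeconfig_text, api_url, local_url="https://127.0.0.1:6443"):
    """Return kubeconfig text with the loopback server URL replaced by api_url.

    Raises ValueError when the expected loopback URL is absent, so an already
    rewritten or non-k3s kubeconfig is not silently passed through.
    """
    if local_url not in kubeconfig_text:
        raise ValueError(f"{local_url} not found in kubeconfig")
    return kubeconfig_text.replace(local_url, api_url)


if __name__ == "__main__":
    # Shortened, illustrative kubeconfig fragment; a real file holds more fields.
    sample = "clusters:\n- cluster:\n    server: https://127.0.0.1:6443\n  name: default\n"
    print(point_kubeconfig_at(sample, "https://89.168.42.7:6443"))
```

This only works if the API server certificate lists the external IP as a subject alternative name, which k3s exposes through its tls-san option. Whether that option is set on oci-arm is not stated in this document.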