Zero Trust Kubernetes: a 2026 checklist
Practical hardening checklist for production Kubernetes — admission policies, identity, network, and supply chain.
A checklist we run before any cluster takes production traffic. Not exhaustive — opinionated.
Identity
- No long-lived credentials in pods. Use IRSA (AWS), Workload Identity (Azure / GCP), or SPIRE.
- CI / CD runners use OIDC federation. No static AWS access keys or service-principal secrets.
- Service-to-service auth via mTLS — Istio or Linkerd, with strict mode enabled.
- Human access via short-lived tokens through your IdP — no long-lived kubeconfig files distributed to laptops.
Admission
- Pod Security Standards: restricted profile in production namespaces. baseline is not enough.
- OPA Gatekeeper or Kyverno for custom policies: required labels, registry allowlist, image-signature verification.
- Image signature verification at admission. Unsigned images denied. (Cosign verifier or Notary v2.)
- Resource quotas per namespace. Prevents a misconfigured deployment from spawning 100 GPU nodes.
- NetworkPolicies default-deny. Explicit allow rules for required traffic.
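To make the gap between baseline and restricted concrete, here is a minimal sketch of a pre-merge check over a pod spec. It assumes the spec is already parsed into a plain dict, and it covers only four of the restricted controls (privilege escalation, non-root, dropped capabilities, seccomp). The function name is hypothetical, and this is not a substitute for Pod Security Admission itself.

```python
from typing import Any


def restricted_violations(pod_spec: dict[str, Any]) -> list[str]:
    """Return violations for a four-control subset of the restricted profile."""
    violations: list[str] = []
    pod_sc = pod_spec.get("securityContext") or {}
    containers = (
        (pod_spec.get("containers") or [])
        + (pod_spec.get("initContainers") or [])
        + (pod_spec.get("ephemeralContainers") or [])
    )

    for c in containers:
        name = c.get("name", "<unnamed>")
        sc = c.get("securityContext") or {}

        # Must be explicitly false on every container.
        if sc.get("allowPrivilegeEscalation") is not False:
            violations.append(f"{name}: allowPrivilegeEscalation must be false")

        # May be set on the pod and inherited; a container value overrides it.
        run_as_non_root = sc.get("runAsNonRoot", pod_sc.get("runAsNonRoot"))
        if run_as_non_root is not True:
            violations.append(f"{name}: runAsNonRoot must be true")

        dropped = (sc.get("capabilities") or {}).get("drop") or []
        if "ALL" not in dropped:
            violations.append(f"{name}: capabilities.drop must include ALL")

        seccomp = sc.get("seccompProfile") or pod_sc.get("seccompProfile") or {}
        if seccomp.get("type") not in ("RuntimeDefault", "Localhost"):
            violations.append(
                f"{name}: seccompProfile.type must be RuntimeDefault or Localhost"
            )

    return violations
```

A spec with a bare container and no securityContext fails all four checks, which is roughly what baseline would still let through.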
Supply chain
- SBOM generated on every image build (Syft / Trivy).
- Image signed at build time (Cosign / Sigstore or Notary v2).
- Vulnerability scan as a required CI check — block on CRITICAL / HIGH with no waiver.
- Base images pinned by digest, not tag.
- Quarantine registry layer — incoming images held until scans pass, then promoted to the production registry.
- Secrets scanning in CI — Gitleaks or similar, fails build on hit.
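Digest pinning is easy to check mechanically. The sketch below assumes sha256 digests only (other algorithms exist) and treats any reference that does not end in a digest as unpinned. The helper names and the registry host in the usage note are made up for illustration.

```python
import re

# Pinned means the reference ends in @sha256:<64 lowercase hex chars>.
_DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}\Z")


def is_pinned_by_digest(image_ref: str) -> bool:
    """True if the image reference carries a sha256 digest."""
    return bool(_DIGEST_RE.search(image_ref))


def unpinned_images(image_refs: list[str]) -> list[str]:
    """Return the references that rely on a tag (or nothing) instead of a digest."""
    return [ref for ref in image_refs if not is_pinned_by_digest(ref)]
```

Fed the image references from a Dockerfile or manifest, `unpinned_images` returns entries such as `registry.example.com/app:1.2.3`, which a CI step could then fail on.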
Privileged access
- No standing privileged access to the cluster. Just-in-time elevation through your PAM (CyberArk, Vault, BeyondTrust).
- Privileged sessions recorded for compliance.
- Break-glass account exists, is offline, and rotation is on the calendar.
- Audit logs shipped off-cluster to immutable storage. A compromised cluster cannot delete its own logs.
Network
- Calico or Cilium with network policy enforcement — not just installed, actually enforcing.
- Egress controls: explicit allowlist for outbound traffic. No surprise curl evil.example.com from a worker pod.
- Ingress behind a WAF. Cloud-native (ALB WAF, Azure WAF, Cloudflare) or in-cluster (e.g. ModSecurity).
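A default-deny baseline plus one explicit egress allowance could be sketched as below. The manifests are built as dicts and printed as JSON, which kubectl accepts. Assumptions: the namespace name payments is a placeholder, cluster DNS runs in kube-system with the conventional k8s-app=kube-dns label, and namespaces carry the automatic kubernetes.io/metadata.name label. Verify the labels against your own cluster before relying on this.

```python
import json


def default_deny(namespace: str) -> dict:
    """NetworkPolicy selecting every pod and allowing no ingress or egress."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            "podSelector": {},  # empty selector matches all pods in the namespace
            "policyTypes": ["Ingress", "Egress"],  # no rules listed, so nothing is allowed
        },
    }


def allow_dns_egress(namespace: str) -> dict:
    """NetworkPolicy letting every pod reach cluster DNS on port 53 (UDP and TCP)."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "allow-dns-egress", "namespace": namespace},
        "spec": {
            "podSelector": {},
            "policyTypes": ["Egress"],
            "egress": [
                {
                    "to": [
                        {
                            # Assumed labels; check what your DNS pods actually carry.
                            "namespaceSelector": {
                                "matchLabels": {"kubernetes.io/metadata.name": "kube-system"}
                            },
                            "podSelector": {"matchLabels": {"k8s-app": "kube-dns"}},
                        }
                    ],
                    "ports": [
                        {"protocol": "UDP", "port": 53},
                        {"protocol": "TCP", "port": 53},
                    ],
                }
            ],
        },
    }


if __name__ == "__main__":
    manifest = {
        "apiVersion": "v1",
        "kind": "List",
        "items": [default_deny("payments"), allow_dns_egress("payments")],
    }
    print(json.dumps(manifest, indent=2))
```

NetworkPolicies are additive, so the DNS policy opens only port 53 toward the DNS pods while everything else stays denied. Each further destination needs its own explicit allow rule.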
Secrets
- No secrets in env vars in pod spec. Mount from CSI driver pulling from Vault / AWS Secrets Manager / Azure Key Vault.
- Secret rotation automated. Manual rotation rots.
- No secrets in image layers. Trivy / Gitleaks check.
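The env-var rule can be linted before a manifest ever reaches the cluster. This sketch assumes a pod spec parsed into a dict and flags the two standard ways a Secret ends up in the environment: env entries using valueFrom.secretKeyRef and envFrom entries using secretRef. It does not catch secrets pasted in as literal values; that is what the secrets scanner is for. The function name is hypothetical.

```python
from typing import Any


def secret_env_findings(pod_spec: dict[str, Any]) -> list[str]:
    """List containers that expose Secret data through environment variables."""
    findings: list[str] = []
    containers = (pod_spec.get("containers") or []) + (
        pod_spec.get("initContainers") or []
    )

    for c in containers:
        name = c.get("name", "<unnamed>")

        # Single variables wired to a Secret key.
        for env in c.get("env") or []:
            if "secretKeyRef" in (env.get("valueFrom") or {}):
                findings.append(f"{name}: env {env.get('name')} reads from a Secret")

        # Whole Secrets imported as environment variables.
        for source in c.get("envFrom") or []:
            if "secretRef" in source:
                secret_name = (source.get("secretRef") or {}).get("name")
                findings.append(f"{name}: envFrom imports Secret {secret_name}")

    return findings
```

An empty result means no container in the spec pulls Secret data into its environment by either route.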
Observability
- Audit logs to SIEM with alerting on suspicious patterns (privilege escalation, exec into pods, secret access).
- Runtime threat detection — Falco, Sysdig, or equivalent — alerting on shell-spawning in containers, unexpected outbound connections.
- Failed authentication attempts dashboarded.
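As a rough illustration of what the SIEM rules look for, the sketch below classifies Kubernetes audit events (one JSON object per line, as the API server's log backend writes them) into the patterns named above. It relies only on the verb, objectRef, and user fields of the audit Event; the labels and function names are made up, and real detection rules would need allowlists for expected actors.

```python
import json
from typing import Any, Iterable, Iterator, Optional


def classify_audit_event(event: dict[str, Any]) -> Optional[str]:
    """Map one audit event to a coarse label, or None if it is uninteresting."""
    ref = event.get("objectRef") or {}
    verb = event.get("verb")
    resource = ref.get("resource")

    # Any request against pods/exec; the verb depends on the client transport.
    if resource == "pods" and ref.get("subresource") == "exec":
        return "exec-into-pod"
    if resource == "secrets" and verb in ("get", "list", "watch"):
        return "secret-read"
    # New or changed bindings are one common privilege-escalation path.
    if resource in ("rolebindings", "clusterrolebindings") and verb in (
        "create",
        "update",
        "patch",
    ):
        return "rbac-binding-change"
    return None


def flag_audit_lines(lines: Iterable[str]) -> Iterator[tuple[str, str]]:
    """Yield (label, username) for every flagged event in a JSON-lines stream."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        label = classify_audit_event(event)
        if label:
            username = (event.get("user") or {}).get("username", "<unknown>")
            yield label, username
```

Run over an audit log file, `flag_audit_lines(open(path))` yields pairs that can be counted per user to spot outliers.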
Common gotchas
- baseline PSS in prod: it blocks hostPath volumes and host networking, but still permits privilege escalation, running as root, and an undropped default capability set. Use restricted.
- Dangling Service Accounts with cluster-admin, left over from migrations. Audit every quarter.
- NetworkPolicy without DNS allow: pods can't resolve. Fix with explicit egress to kube-dns.
- Image signing without verification at admission — half the value. The verifier closes the loop.
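The quarterly cluster-admin audit is a good candidate for a script. A minimal sketch, assuming the input is the items list from `kubectl get clusterrolebindings -o json`; the function name is hypothetical, and RoleBindings that reference cluster-admin inside a single namespace would need a second pass.

```python
from typing import Any


def cluster_admin_service_accounts(bindings: list[dict[str, Any]]) -> list[str]:
    """Return 'namespace/name via binding' for each ServiceAccount bound to cluster-admin."""
    found: list[str] = []
    for binding in bindings:
        role = binding.get("roleRef") or {}
        if role.get("kind") != "ClusterRole" or role.get("name") != "cluster-admin":
            continue

        binding_name = (binding.get("metadata") or {}).get("name", "<unnamed>")
        for subject in binding.get("subjects") or []:
            # Users and Groups are skipped; this audit targets ServiceAccounts only.
            if subject.get("kind") == "ServiceAccount":
                found.append(
                    f"{subject.get('namespace')}/{subject.get('name')} via {binding_name}"
                )
    return found
```

Anything in the result that no one can name an owner for is a candidate for removal.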