Blog

Notes from production AI infrastructure.

MLOps patterns, GPU FinOps, DevSecOps, SRE — written from on-call experience, not theory.

May 1, 2026
Where 70% of EKS spend hides: a 5-step GPU FinOps audit

How to find the unallocated GPU and compute spend that cost dashboards can't see — and what to do about it.

#finops #gpu #eks #kubernetes

Apr 12, 2026
Why we bake model.pkl into Docker images instead of pulling from MLflow at runtime

MLflow is great for experiment tracking and as a model registry. It's not great as a runtime dependency for production inference pods.

#mlops #mlflow #kserve #kubernetes

Mar 18, 2026
Champion / challenger model promotion that doesn't break inference SLOs

A safe-by-default pipeline for promoting models in production: alias-based rollouts, evaluation gates, and canary traffic splits.

#mlops #mlflow #kserve #drift