Case Studies

Production wins, with the numbers.

Anonymized case studies from engagements across Tier-1 banking, Fortune 100 healthcare AI, and enterprise cybersecurity programs. Specific client identities are confidential by default.

2026
Featured
Recovering $170K/month in wasted GPU spend

A healthcare AI client running real-time RAG on EKS was burning ~$170–180K/month on idle GPUs and over-provisioned compute. We traced and remediated 70% of the unallocated spend.

EKS · Karpenter · KEDA · Nvidia DCGM · Datadog · Harness CCM
2026
Featured
Production RAG: KServe + Knative + Istio + champion/challenger MLflow

End-to-end MLOps stack for real-time RAG inference at a Fortune 100 healthcare AI program — full lifecycle from experiment tracking to canary rollout on drift.

EKS · AKS · KServe · Knative · Istio · MLflow · Kubeflow · Evidently
2025
Featured
Migrating a 5,000-server fleet to GitHub Actions

A Tier-1 retail brokerage replaced legacy Harness CI/CD with GitHub Actions across 5,000+ Linux/Windows servers, using a reusable workflow library, OIDC-federated runners, and security gates as required checks.

GitHub Actions · OIDC · Trivy · Semgrep · Cosign · Ansible · SaltStack
2019
65% team reduction: re-engineering a software asset platform

A global investment bank's Software Asset Management platform, re-architected as a privileged-access governance system that automates the access lifecycle and eliminates recurring incidents.

Ansible · Hadoop · SCCM · Hyper-V · HashiCorp Vault
2024
Multi-site failover for a 50,000-server estate

Tier-1 financial services firm: automated multi-site / multi-zone failover for business continuity across a 50,000+ server estate. Zero-downtime release patterns on VMware Tanzu and Kubernetes.

VMware Tanzu · Kubernetes · Kafka · Aerospike · Ansible · SaltStack
2024
One-click cyber range for security training

Built a one-click, scalable, automated cyber range for a research university's cybersecurity program — full enterprise IT environment with subnets, DMZ, AD, IDS/IPS, and ELK in a single provision command.

Proxmox · KVM · LXC · cloud-init · Ansible · ELK