The 40% Rule
Most organisations waste 40-60% of their cloud spend. Not through negligence, but through lack of visibility. Resources are provisioned for peak load, left running out of hours, and forgotten after projects end. A systematic audit finds this waste and eliminates it.
An instance running at 15% CPU utilisation looks busy in the console but is effectively idle. A storage volume attached to a terminated instance is invisible in the console but still charged. A development environment left running through evenings and weekends adds £200 a month for hours no one uses.
The root cause is organisational, not technical. Engineers provision resources for their immediate need and do not revisit them. Project teams disband, but their resources live on. Budget owners see aggregate spend but not individual resource utilisation. No one is responsible for resource lifecycle management. FinOps fixes this by making cost visible and assigning ownership.
Phase 1: Visibility
Before you optimise, you must see. Tag everything: environment, team, project, cost centre. Use cost allocation tags consistently. Set up billing alerts at 50%, 80%, and 100% of budget. Without visibility, optimisation is guesswork.
Tagging is the foundation of cost visibility. Every resource must have: Environment (production, staging, development), Team (engineering, data science, operations), Project (the specific initiative), and Cost Centre (the budget owner). We enforce tagging through policy-as-code: Terraform policies, AWS Organizations SCPs, or Azure Policy. Resources without required tags are automatically flagged, and after a grace period, terminated.
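The enforcement logic can be sketched in a few lines. This is a minimal, illustrative validator, not any particular policy engine: the `Resource` class, tag names, and helper functions are hypothetical stand-ins for whatever your inventory export provides.

```python
from dataclasses import dataclass, field

# The four required cost-allocation tags described above.
REQUIRED_TAGS = {"Environment", "Team", "Project", "CostCentre"}

@dataclass
class Resource:
    resource_id: str
    tags: dict = field(default_factory=dict)

def missing_tags(resource: Resource) -> set:
    """Return the required tags this resource lacks."""
    return REQUIRED_TAGS - resource.tags.keys()

def flag_untagged(resources) -> list:
    """Return IDs of resources that fail the tagging policy,
    i.e. the candidates for the grace-period-then-terminate flow."""
    return [r.resource_id for r in resources if missing_tags(r)]
```

In practice the same check would run inside Terraform validation or a scheduled compliance job; the filtering logic is identical.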
Cost dashboards provide ongoing visibility. We use native tools (AWS Cost Explorer, Azure Cost Management, GCP Cost Console) for high-level views, and third-party tools (CloudHealth, Datadog, Spot.io) for detailed analysis. The key metrics: total spend by service, by team, and by project; spend trend over time; and cost per transaction or user (unit economics).
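The unit-economics metric is worth making concrete. A hedged sketch, with made-up team names and figures, of cost per transaction derived from the tagged spend and a transaction count per team:

```python
def cost_per_transaction(spend_by_team: dict, transactions_by_team: dict) -> dict:
    """Unit economics: monthly spend divided by monthly transaction volume.

    Teams with no recorded transactions are skipped rather than
    dividing by zero; they need a different denominator (e.g. users).
    """
    return {
        team: round(spend / transactions_by_team[team], 4)
        for team, spend in spend_by_team.items()
        if transactions_by_team.get(team)
    }
```

Tracking this ratio over time separates "spend grew because the business grew" from "spend grew because we got less efficient", which raw totals cannot do.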
Phase 2: Right-Sizing
Analyse actual CPU and memory utilisation over 30 days. Rightsize instances to match observed peaks, not theoretical maximums. We typically find 30% of instances are oversized by two or more instance classes. Downsize them.
Right-sizing requires data. We collect CPU, memory, disk, and network metrics from CloudWatch, Azure Monitor, or Google Cloud Monitoring. The analysis covers 30 days to capture weekly patterns, monthly cycles, and any anomalies. We look at p95 and p99 utilisation, not average. Average utilisation of 30% might hide peaks of 90% that require larger instances.
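The gap between average and tail utilisation is easy to demonstrate. A small sketch with a synthetic CPU series (a nearest-rank percentile over hourly samples; real analysis would read the monitoring API):

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile (0 < p <= 100) of a list of samples."""
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))
    return ordered[rank - 1]

# Synthetic CPU series: 95 idle hours, 5 hours of heavy load.
cpu = [10.0] * 95 + [90.0] * 5

avg = sum(cpu) / len(cpu)   # 14.0 — the instance "looks idle"
p95 = percentile(cpu, 95)   # 10.0 — still looks idle
p99 = percentile(cpu, 99)   # 90.0 — the peak that sizing must cover
```

Sizing to the 14% average would cripple those five peak hours; this is why the analysis reads p95 and p99, not the mean.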
The right-sizing process: identify instances with sustained low utilisation (under 20% CPU or memory for 30 days), identify instances with bursty patterns that could use burstable instance types, identify instances that could be replaced by serverless functions or container services, and identify instances that are idle (zero utilisation for 7+ days). Each category has a different remediation: downsize, switch instance type, replatform, or terminate.
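The triage above can be expressed as a simple decision function. The thresholds here are the illustrative ones from the text (20% sustained, 7 idle days); the replatform-to-serverless category is omitted because it needs human judgment about the workload, not just metrics:

```python
def remediation(avg_cpu: float, p99_cpu: float, idle_days: int) -> str:
    """Map an instance's 30-day utilisation profile to a remediation.

    idle 7+ days          -> terminate
    sustained low (p99)   -> downsize
    low average, high p99 -> switch to a burstable instance type
    anything else         -> leave for manual review
    """
    if idle_days >= 7:
        return "terminate"
    if p99_cpu < 20:
        return "downsize"
    if avg_cpu < 20 and p99_cpu >= 60:
        return "switch to burstable"
    return "review"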
Phase 3: Scheduling
Development and test environments do not need to run 24/7. Implement automated start/stop schedules based on working hours. For global teams, stagger schedules by region. We typically see 20-30% savings from scheduling alone.
The simplest scheduling pattern: start development environments at 8 AM and stop them at 7 PM, Monday to Friday. That runs them for 55 of the 168 hours in a week, saving roughly two-thirds of development environment costs. Test environments can follow the same pattern, or they can run during test execution windows only.
The exception: environments used for overnight batch jobs, global teams, or 24/7 testing. These should not be scheduled off but should be right-sized to their actual load. We review scheduling policies monthly to catch exceptions.
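The savings arithmetic for any working-hours schedule is a one-liner worth having on hand. A minimal sketch (whole start/stop hours, same schedule every working day):

```python
HOURS_PER_WEEK = 7 * 24  # 168

def weekly_savings_pct(start_hour: int, stop_hour: int, days: int) -> float:
    """Percentage of the always-on bill avoided by a start/stop schedule."""
    running = (stop_hour - start_hour) * days
    return round(100 * (1 - running / HOURS_PER_WEEK), 1)

# The 8 AM - 7 PM, Monday-Friday pattern: 55 of 168 hours running.
savings = weekly_savings_pct(8, 19, 5)
```

The same function makes it easy to price alternatives, e.g. extending the window to 9 PM or adding Saturday mornings, before changing the policy.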
Phase 4: Storage Optimisation
Storage is often the largest hidden cost. Move infrequently accessed data to cheaper tiers. Delete orphaned snapshots and unused volumes. Implement lifecycle policies that transition data to archive storage automatically.
An EBS volume costs £0.10 per GB per month. A 1TB volume costs £1,200 per year. Most organisations have hundreds of volumes, many unused or oversized.
The storage optimisation process: identify unattached volumes and delete them after 7 days of non-attachment (with snapshot backup), identify idle volumes and downsize them, identify infrequently accessed data and transition to cheaper tiers (S3 Infrequent Access, Glacier, or Azure Cool Blob), and implement lifecycle policies that automatically transition data based on age and access patterns.
Phase 5: Reserved Capacity
For predictable baseline workloads, reserved instances or savings plans reduce costs by 30-60%. The key is committing only to predictable usage. Overcommitting wastes money; undercommitting leaves savings on the table.
Reserved instances require a 1- or 3-year commitment. The savings are substantial: roughly 30-40% for a 1-year term and 50-60% for a 3-year term, with the deepest discounts for upfront payment. The risk is committing to usage that changes. We recommend starting with 1-year reservations for the most stable workloads (production databases, core services), then expanding as usage patterns become clearer.
The analysis process: identify instances that run 24/7 with stable utilisation, calculate the baseline usage (minimum observed over 30 days), reserve that baseline, and leave the variable portion on-demand. This provides the best of both: savings on predictable usage, flexibility on variable usage.
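The baseline-reservation analysis can be sketched directly. Rates and instance counts here are made up for illustration; the reserved rate of 1.2 against an on-demand rate of 2.0 reflects the ~40% 1-year discount mentioned above:

```python
def reservation_plan(daily_instance_counts, on_demand_rate, reserved_rate):
    """Reserve the observed baseline; leave the variable portion on-demand.

    daily_instance_counts: instances running each day over the window.
    Rates are per instance-day.
    """
    baseline = min(daily_instance_counts)  # minimum observed usage
    days = len(daily_instance_counts)
    reserved_cost = baseline * reserved_rate * days
    on_demand_days = sum(n - baseline for n in daily_instance_counts)
    return {
        "baseline": baseline,
        "blended_cost": round(reserved_cost + on_demand_days * on_demand_rate, 2),
        "all_on_demand_cost": round(sum(daily_instance_counts) * on_demand_rate, 2),
    }
```

Comparing `blended_cost` against `all_on_demand_cost` quantifies the saving before any commitment is made, which keeps the overcommitment risk visible in the same analysis.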
Our Recommendation
Run this audit quarterly. Cloud drift is real — resources accumulate, teams change, and usage patterns shift. A quarterly 2-day audit consistently finds 15-25% savings in mature environments.
When engineers see cost as a feature, not an afterthought, waste reduces organically. When teams are accountable for their cloud spend, they optimise proactively.
Start with visibility. You cannot optimise what you cannot see. Then right-size, schedule, optimise storage, and reserve capacity. Each phase builds on the previous, creating compounding savings. The organisations that succeed treat cost optimisation as a continuous practice, not a one-time project.