AWS cost optimisation: reading your bill like an engineer

TL;DR

Cloud waste is an architecture problem, not a procurement problem
The five biggest waste categories each have a specific AWS tool to fix them
Quick wins (EC2 rightsizing, idle IPs, orphaned volumes) can be done today
Structural fixes (VPC endpoints, CloudFront, tagging) unlock the material savings and reduce data transfer costs

30% average cloud spend wasted (Flexera)

$0.045 per GB through NAT Gateway

$0.01 per GB cross-AZ, each direction

Cloud waste is not overspending on deliberate choices. It is spending on things that should have been switched off months ago, on services provisioned for a peak that never arrived, or on data moving between places it has no business moving. These are architectural decisions, or the absence of them. Most AWS cost optimisation fails because teams try to negotiate unit costs instead of fixing the architecture that generates the waste.

The accountant asks "how much?" The engineer asks "why?" Only one of those questions leads to a smaller bill next month.

Accountant reads

Total spend up 18%
EC2 is the biggest line item
Monthly delta flagged to engineering

Engineer reads

NAT Gateway processed 4TB last month - why?
3 RDS instances running at 8% CPU
Resources still running in ap-southeast-1

Five waste categories: where your AWS bill is actually going

Idle and over-provisioned EC2 instances

A proof-of-concept instance becomes production and nobody questions the size again. Idle instances from decommissioned workloads compound the problem. EC2 rightsizing is often the fastest way to reduce monthly costs.

Open Compute Optimizer - recommendations are already waiting for you
In Cost Explorer, group EC2 usage by instance and sort by lowest data transfer out - instances that send almost nothing are likely idle
Action the top five rightsizing recommendations immediately

Tool: AWS Compute Optimizer + Trusted Advisor

NAT Gateway charges

The $0.045/hour per-gateway cost looks fine at provisioning time. The $0.045/GB data processing charge is where it compounds - especially when Lambda or ECS in private subnets call AWS services at volume.

Gateway VPC endpoints for S3 and DynamoDB are free
Interface endpoints for Secrets Manager, SSM and ECR cost ~$0.01/hour per AZ plus a per-GB processing charge, but take that traffic off the NAT Gateway entirely
Check your top NAT data sources in VPC Flow Logs

Tool: VPC Endpoints (free for S3/DynamoDB)
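To make the arithmetic concrete, here is a rough sketch in Python using the NAT figures quoted above. The 730-hour month, the decimal terabyte, and the $0.01/GB interface endpoint processing rate are assumptions (check the price list for your region), and the function names are illustrative only.

```python
HOURS_PER_MONTH = 730     # assumption: average month length in hours

NAT_HOURLY = 0.045        # USD per gateway-hour (figure quoted above)
NAT_PER_GB = 0.045        # USD per GB processed (figure quoted above)
ENDPOINT_HOURLY = 0.01    # USD per interface endpoint per AZ-hour (approximate)
ENDPOINT_PER_GB = 0.01    # USD per GB; assumption, verify for your region


def nat_monthly_cost(gb_processed: float, gateways: int = 1) -> float:
    """Hourly charge for each gateway plus the per-GB processing charge."""
    return gateways * NAT_HOURLY * HOURS_PER_MONTH + gb_processed * NAT_PER_GB


def interface_endpoint_monthly_cost(gb_processed: float, azs: int = 1) -> float:
    """Cost of one interface endpoint deployed in `azs` Availability Zones."""
    return azs * ENDPOINT_HOURLY * HOURS_PER_MONTH + gb_processed * ENDPOINT_PER_GB


def saving_from_interface_endpoint(gb_moved: float, azs: int = 1) -> float:
    """NAT processing avoided minus the endpoint's own cost.

    The NAT Gateway's hourly charge is unchanged: the gateway usually
    stays in place for the remaining traffic.
    """
    return gb_moved * NAT_PER_GB - interface_endpoint_monthly_cost(gb_moved, azs)


if __name__ == "__main__":
    gb = 4000  # 4 TB, treated as 4000 GB for simplicity
    print(f"NAT Gateway:            ${nat_monthly_cost(gb):.2f}/month")
    print(f"Via interface endpoint: ${saving_from_interface_endpoint(gb):.2f} saved")
    print(f"Via gateway endpoint:   ${gb * NAT_PER_GB:.2f} saved (S3/DynamoDB)")
```

Under these assumptions, 4 TB through a single NAT Gateway costs roughly $213 a month, of which $180 is data processing. If that traffic is S3 or DynamoDB, a gateway endpoint removes the full $180; through a single-AZ interface endpoint the saving is closer to $133.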

Cross-AZ and egress data transfer

Cross-AZ is $0.01/GB in each direction. In a high-volume microservices architecture with services spread across AZs, this becomes a meaningful number fast. Egress to the internet starts at $0.09/GB.

Instrument traffic at the AZ level to find cross-AZ patterns
Serve large assets through CloudFront - origin fetch from within AWS is free or near-free
CloudFront eliminates a large proportion of origin requests entirely via caching

Tool: CloudFront + VPC Flow Logs
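One way to instrument traffic at the AZ level is to aggregate flow records by AZ pair. The sketch below assumes you have already enriched each record with a source and destination AZ; raw VPC Flow Logs do not give you both, so the `Flow` shape here is hypothetical. It also assumes the $0.01/GB charge applies on both the sending and the receiving side, which makes the effective rate $0.02 per GB moved.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Tuple

CROSS_AZ_PER_GB = 0.01      # USD per GB, each direction (figure quoted above)
BYTES_PER_GB = 1024 ** 3    # assumption: binary gigabytes


class Flow(NamedTuple):
    """Hypothetical enriched flow record; not a native Flow Logs format."""
    src_az: str
    dst_az: str
    bytes_sent: int


def cross_az_gb_by_pair(flows: Iterable[Flow]) -> Dict[Tuple[str, str], float]:
    """Total GB per unordered AZ pair, ignoring traffic that stays in one AZ."""
    totals: Dict[Tuple[str, str], float] = defaultdict(float)
    for flow in flows:
        if flow.src_az != flow.dst_az:
            pair = tuple(sorted((flow.src_az, flow.dst_az)))
            totals[pair] += flow.bytes_sent / BYTES_PER_GB
    return dict(totals)


def cross_az_cost(gb: float) -> float:
    """Each GB is billed once leaving one AZ and once entering the other."""
    return gb * CROSS_AZ_PER_GB * 2


if __name__ == "__main__":
    sample = [
        Flow("euw1-az1", "euw1-az2", 500 * BYTES_PER_GB),
        Flow("euw1-az2", "euw1-az1", 300 * BYTES_PER_GB),
        Flow("euw1-az1", "euw1-az1", 900 * BYTES_PER_GB),  # same AZ: no charge
    ]
    for pair, gb in cross_az_gb_by_pair(sample).items():
        print(pair, f"{gb:.0f} GB", f"${cross_az_cost(gb):.2f}")
```

Sorting the result by cost shows which service pairs are worth co-locating or routing with AZ affinity.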

Over-provisioned RDS

A db.r5.2xlarge for a dev environment that is active four hours a day is not a reasonable trade-off. A stopped RDS instance is restarted automatically after seven days, so stopping instances by hand does not scale.

Compute Optimizer now covers RDS and will flag specific downsize recommendations
Migrate variable-load databases to Aurora Serverless v2 - scales to 0.5 ACUs when idle

Tool: Compute Optimizer + Aurora Serverless v2
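The four-hours-a-day example is easy to quantify. The sketch below is illustrative only: the hourly rate is a placeholder, not a real price for any instance class, and a 30-day month is assumed.

```python
HOURS_PER_DAY = 24


def idle_fraction(active_hours_per_day: float) -> float:
    """Share of paid instance hours during which nobody uses the database."""
    return 1 - active_hours_per_day / HOURS_PER_DAY


def monthly_idle_cost(hourly_rate: float, active_hours_per_day: float,
                      days: int = 30) -> float:
    """Spend on hours outside the active window for an always-on instance."""
    return hourly_rate * (HOURS_PER_DAY - active_hours_per_day) * days


if __name__ == "__main__":
    placeholder_rate = 1.00  # USD/hour; hypothetical, look up your real rate
    print(f"Idle share of spend: {idle_fraction(4):.0%}")
    print(f"Idle cost per month: ${monthly_idle_cost(placeholder_rate, 4):.2f}")
```

At four active hours a day, about 83% of the instance hours you pay for are idle, whatever the instance class costs.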

Orphaned Elastic IPs and EBS volumes

Elastic IPs cost $0.005/hour ($3.60/month each). Since February 2024 AWS charges this for every public IPv4 address, attached or not, so an unattached address is pure waste. Detached EBS volumes from terminated instances accumulate silently. Manual snapshots from backup scripts never expire unless you enforce a retention policy.

Trusted Advisor surfaces unattached Elastic IPs directly
Run a monthly sweep for detached EBS volumes (state "available") in the EC2 console, and track the EBS line in Cost Explorer
Audit EBS and RDS snapshots older than your retention window

Tool: Trusted Advisor + Cost Explorer
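The monthly sweep can be scripted. The sketch below is one possible approach, not a definitive implementation: it assumes boto3 is installed and credentials are configured, and it uses the volume's creation time as a stand-in for age, because the EC2 API does not report when a volume was detached. The 30-day threshold matches the quick-win list; the function names are illustrative.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Iterable, List, Optional


def stale_detached_volumes(volumes: Iterable[Dict],
                           now: Optional[datetime] = None,
                           min_age_days: int = 30) -> List[str]:
    """IDs of volumes that are unattached ("available") and older than the cutoff.

    `volumes` uses the shape returned by EC2 DescribeVolumes:
    each item has "VolumeId", "State" and a timezone-aware "CreateTime".
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=min_age_days)
    return [
        v["VolumeId"]
        for v in volumes
        if v["State"] == "available" and v["CreateTime"] < cutoff
    ]


def fetch_volumes(region: str) -> List[Dict]:
    """Fetch every volume in one region (requires boto3 and AWS credentials)."""
    import boto3  # third-party; imported here so the filter above stays testable

    ec2 = boto3.client("ec2", region_name=region)
    volumes: List[Dict] = []
    for page in ec2.get_paginator("describe_volumes").paginate():
        volumes.extend(page["Volumes"])
    return volumes


if __name__ == "__main__":
    # Report only; review the list before deleting anything.
    for volume_id in stale_detached_volumes(fetch_volumes("eu-west-1")):
        print(volume_id)
```

The script only reports. Deleting a volume is irreversible, so a human review step, or a snapshot before deletion, is a sensible guard.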

Quick wins vs structural fixes

Do today

Run Compute Optimizer, action top 5 rightsizing recommendations
Release unattached Elastic IPs
Delete detached EBS volumes older than 30 days
Check for resources in regions you are not actively using

Sprint-level fixes

Deploy VPC endpoints for top AWS services used from private subnets
Move static asset delivery to CloudFront
Migrate dev databases to Aurora Serverless v2
Implement resource tagging and cost allocation by team
Add Infracost to CI/CD to surface cost impact at PR time

The material savings are in the structural fixes. A 10% reduction from EC2 rightsizing is useful. A 40% reduction from eliminating unnecessary data transfer and fixing NAT routing is a different conversation entirely. This is why AWS cost optimisation is fundamentally an architecture problem.

Ongoing visibility and cloud cost management

The goal is not to read the bill better once. Build the infrastructure so cost anomalies surface automatically, before they compound.

Tagging enforcement. Every resource needs at minimum an environment tag and a team tag. Use AWS Config rules to flag untagged resources on creation. Without tags, Cost Explorer tells you how much EC2 costs - not whose EC2 costs that much.
AWS Budgets alerts. Set alerts at 80% and 100% per environment and per top service. Send to a channel engineers actually read - not email.
Cost Anomaly Detection. Free, 15 minutes to configure. Catches a runaway Lambda loop or misconfigured pipeline the same day it starts, not at month end.
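The tagging rule is simple enough to express as a check you can run in CI or a scheduled job alongside AWS Config. This is a minimal sketch, assuming tags arrive in the Key/Value list shape that EC2 uses and that the two required keys are named "environment" and "team"; adjust both to your own conventions.

```python
from typing import Dict, Iterable, List, Set

REQUIRED_TAGS: Set[str] = {"environment", "team"}  # assumption: your key names


def missing_tags(tags: Iterable[Dict[str, str]]) -> Set[str]:
    """Required tag keys that are absent or have an empty value.

    Key matching is case-insensitive so "Team" and "team" both count.
    """
    present = {
        tag["Key"].lower()
        for tag in tags
        if tag.get("Value", "").strip()
    }
    return REQUIRED_TAGS - present


def untagged_resources(resources: Dict[str, List[Dict[str, str]]]) -> Dict[str, Set[str]]:
    """Map resource ID to its missing tag keys, omitting compliant resources."""
    report = {rid: missing_tags(tags) for rid, tags in resources.items()}
    return {rid: gaps for rid, gaps in report.items() if gaps}


if __name__ == "__main__":
    inventory = {
        "i-0aaa": [{"Key": "Environment", "Value": "prod"},
                   {"Key": "Team", "Value": "payments"}],
        "i-0bbb": [{"Key": "Environment", "Value": "dev"}],
        "vol-0ccc": [],
    }
    for resource_id, gaps in untagged_resources(inventory).items():
        print(resource_id, "missing:", ", ".join(sorted(gaps)))
```

Remember that tags only show up in Cost Explorer after they are activated as cost allocation tags in the Billing console.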
