Cost Optimization

Cloud computing shifts costs from upfront capital expenditure to ongoing operational expenditure. While this model offers flexibility, it also introduces the risk of unchecked spending. Without deliberate cost management, cloud bills can spiral out of control. Cost optimization is not just about spending less — it is about maximizing the business value of every dollar spent on cloud resources.


The Cost Optimization Mindset

Traditional IT Spending     Cloud Spending
────────────────────────    ────────────────────────
Buy servers upfront         Pay per hour/second
Fixed cost                  Variable cost
Long procurement            Instant provisioning
Depreciation model          OpEx model
Hard to scale down          Easy to scale down

The challenge: Easy provisioning means easy overspending.
The opportunity: Flexibility means you can optimize continuously.

Common Sources of Cloud Waste

Waste Type             Typical Impact               Example
─────────────────────  ───────────────────────────  ─────────────────────────────────────────────
Idle resources         20-30% of cloud spend        Dev/test instances running 24/7
Oversized instances    15-25% of compute cost       Running m5.xlarge when t3.medium suffices
Unused storage         10-15% of storage cost       Old snapshots, orphaned volumes
No reserved pricing    30-60% higher than needed    Running on-demand when usage is predictable
Data transfer          5-15% of total bill          Cross-region data movement
Zombie resources       5-10% of spend               Load balancers, IP addresses with no traffic

Right-Sizing

Right-sizing means matching resource allocations to actual workload requirements. It is typically one of the highest-impact optimizations you can make.

The Right-Sizing Process

Step 1: Monitor actual usage (2-4 weeks minimum)

  [Chart: CPU usage for web-server-prod-01, Monday to Sunday, y-axis 0-100%]

  Average CPU:       18%
  Peak CPU:          45%
  Current instance:  m5.xlarge (4 vCPUs, 16 GB RAM)
  Recommendation:    m5.large (2 vCPUs, 8 GB RAM)
  Savings:           ~50% on this instance

Step 2: Analyze memory, network, and disk I/O similarly
Step 3: Recommend a smaller instance type
Step 4: Test the recommendation in staging
Step 5: Apply in production with monitoring
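
The CPU part of this process can be sketched in a few lines of Python. This is only a rough illustration: it assumes CPU demand scales inversely with vCPU count, and the names `project_cpu`, `fits`, and the 90% peak ceiling are hypothetical choices, not part of any AWS tool. Memory, network, and disk I/O would need the same kind of check.

```python
from dataclasses import dataclass


@dataclass
class Projection:
    instance_type: str
    avg_cpu_pct: float
    peak_cpu_pct: float


def project_cpu(avg_cpu_pct: float, peak_cpu_pct: float, current_vcpus: int,
                target_type: str, target_vcpus: int) -> Projection:
    """Scale the observed CPU percentages by the ratio of vCPU counts."""
    scale = current_vcpus / target_vcpus
    return Projection(target_type, avg_cpu_pct * scale, peak_cpu_pct * scale)


def fits(projection: Projection, max_peak_pct: float = 90.0) -> bool:
    """True if the projected peak stays at or under the chosen ceiling."""
    return projection.peak_cpu_pct <= max_peak_pct


# Figures from the example above: m5.xlarge (4 vCPUs) -> m5.large (2 vCPUs)
p = project_cpu(18.0, 45.0, current_vcpus=4, target_type="m5.large", target_vcpus=2)
print(p, fits(p))
```

With the figures above, the projected average is 36% and the projected peak is 90%, which leaves little headroom. That is why the staging test in Step 4 matters before applying the change in production.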

Instance Type Selection Guide

Workload Type                      →  Best Instance Family
General purpose (balanced)         →  t3/t4g, m5/m6i, m6g (ARM)
Compute-intensive                  →  c5/c6i, c6g (ARM)
Memory-intensive                   →  r5/r6i, x1
Storage-intensive                  →  i3, d2
GPU / ML training                  →  p4, g5
Burstable (variable CPU)           →  t3/t4g (cheapest for low-avg CPU)
ARM-based (up to ~20% cheaper      →  m6g, c6g, r6g (Graviton)
  than comparable x86)

Pricing Models

On-Demand vs Reserved vs Spot

Price (relative to On-Demand)
100%   On-Demand              full price, no commitment
 70%   1-Year Reserved        ~30% savings, 1-year commitment
 55%   3-Year Reserved        ~45% savings, 3-year commitment
  -    Convertible Reserved   smaller discount than a standard reservation of the same term,
                              in exchange for the flexibility to change instance type
 10%   Spot Instances         up to 90% savings; can be interrupted with 2-minute notice

Pricing Model     Savings         Commitment     Risk                                              Best For
────────────────  ──────────────  ─────────────  ────────────────────────────────────────────────  ──────────────────────────────────────────
On-Demand         0% (baseline)   None           None                                              Short-term, unpredictable workloads
Reserved (1yr)    ~30%            1 year         Must pay even if unused                           Steady-state production workloads
Reserved (3yr)    ~45%            3 years        Longest commitment                                Databases, core infrastructure
Savings Plans     20-40%          1 or 3 years   Commit to dollar amount, not specific instances   Flexible workloads
Spot Instances    60-90%          None           2-minute interruption notice                      Batch processing, CI/CD, stateless workers
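
The "must pay even if unused" risk can be made concrete with a break-even calculation. A reservation is billed for every hour whether or not the instance runs, so it only beats on-demand when the instance runs for more than (1 - discount) of the time. The sketch below uses the approximate discounts from the table; the $0.10 hourly rate and the function names are hypothetical.

```python
HOURS_PER_MONTH = 730  # common billing approximation


def monthly_on_demand(hourly_rate: float, utilization: float) -> float:
    """On-demand cost: pay only for the fraction of hours actually used."""
    return hourly_rate * HOURS_PER_MONTH * utilization


def monthly_reserved(hourly_rate: float, discount: float) -> float:
    """Reserved cost: discounted rate, billed for every hour of the month."""
    return hourly_rate * HOURS_PER_MONTH * (1 - discount)


def break_even_utilization(discount: float) -> float:
    """Utilization above which the reservation is the cheaper option."""
    return 1 - discount


def cheaper_option(hourly_rate: float, utilization: float, discount: float) -> str:
    on_demand = monthly_on_demand(hourly_rate, utilization)
    reserved = monthly_reserved(hourly_rate, discount)
    return "reserved" if reserved < on_demand else "on-demand"


# Hypothetical $0.10/hour instance with the ~30% 1-year discount
print(cheaper_option(0.10, utilization=0.5, discount=0.30))
print(cheaper_option(0.10, utilization=0.9, discount=0.30))
```

Under these assumptions a 1-year reservation pays off only for instances running more than about 70% of the time, which is why reservations suit steady-state workloads rather than dev/test machines that are shut down overnight.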

Spot Instance Strategies

Spot instances offer massive savings but can be interrupted. Use them for workloads that are fault-tolerant and stateless.

Good for Spot                 Bad for Spot
────────────────────────────  ────────────────────────────────
Batch processing              Production databases
CI/CD build agents            Single-instance applications
Data processing pipelines     Stateful services
Machine learning training     Real-time trading systems
Dev/test environments         Anything with long startup time
Rendering farms
Distributed computing

# Request a spot fleet with mixed instance types
# (the account ID, AMI ID, and subnet IDs below are placeholders)
aws ec2 request-spot-fleet \
  --spot-fleet-request-config '{
    "IamFleetRole": "arn:aws:iam::123456789012:role/spot-fleet",
    "TargetCapacity": 10,
    "SpotPrice": "0.05",
    "LaunchSpecifications": [
      {
        "InstanceType": "m5.large",
        "ImageId": "ami-12345",
        "SubnetId": "subnet-abc",
        "WeightedCapacity": 1
      },
      {
        "InstanceType": "m5.xlarge",
        "ImageId": "ami-12345",
        "SubnetId": "subnet-abc",
        "WeightedCapacity": 2
      },
      {
        "InstanceType": "m4.large",
        "ImageId": "ami-12345",
        "SubnetId": "subnet-def",
        "WeightedCapacity": 1
      }
    ],
    "AllocationStrategy": "capacityOptimized",
    "Type": "maintain"
  }'
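
To see how `WeightedCapacity` relates to `TargetCapacity` in the request above, the following sketch counts how many instances of each type would be needed to reach 10 units. It is plain arithmetic, not a call to any AWS API; the helper names are hypothetical, and the mix a real fleet launches depends on the allocation strategy and available capacity.

```python
import math

# Weights copied from the LaunchSpecifications in the request above
WEIGHTS = {"m5.large": 1, "m5.xlarge": 2, "m4.large": 1}


def instances_needed(target_capacity: int, instance_type: str) -> int:
    """Instances of one type required to reach the target on their own."""
    return math.ceil(target_capacity / WEIGHTS[instance_type])


def fulfilled_capacity(launched: dict) -> int:
    """Capacity units provided by a mix of launched instances."""
    return sum(WEIGHTS[itype] * count for itype, count in launched.items())


print(instances_needed(10, "m5.large"))    # all m5.large
print(instances_needed(10, "m5.xlarge"))   # all m5.xlarge
print(fulfilled_capacity({"m5.xlarge": 3, "m5.large": 4}))
```

Weighting the m5.xlarge at 2 reflects that it has twice the vCPUs of the m5.large, so either five of the larger type or ten of the smaller type satisfies the same target.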

Auto-Scaling Strategies

Auto-scaling adjusts resource capacity based on demand, ensuring you have enough capacity during peak times and are not paying for idle resources during quiet times.

Types of Auto-Scaling

Type                 Description                                               Use Case
───────────────────  ────────────────────────────────────────────────────────  ─────────────────────────────────────────────
Target tracking      Maintain a target metric value (e.g., 70% CPU)            Most common; simple and effective
Step scaling         Add/remove capacity in steps based on alarm thresholds    Workloads with known traffic patterns
Scheduled scaling    Change capacity at specific times                         Predictable traffic patterns (business hours)
Predictive scaling   ML-based forecasting of future demand                     Recurring traffic patterns

Target Tracking Example (target: 70% CPU):

  [Chart: capacity (2 to 10 instances) over time, 6am to 12am. Capacity steps up while CPU is above 70% and steps back down once CPU falls below 70%.]
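
The idea behind target tracking can be approximated with a proportional rule: scale the current capacity by the ratio of the observed metric to the target, then clamp to the group's limits. This is a simplified sketch under that assumption, not the exact algorithm of any provider; real implementations also apply cooldowns and warm-up periods, and the function name and limits here are hypothetical.

```python
import math


def desired_capacity(current: int, metric: float, target: float,
                     min_size: int, max_size: int) -> int:
    """Proportional scaling: keep metric near target, within [min_size, max_size]."""
    if target <= 0:
        raise ValueError("target must be positive")
    raw = math.ceil(current * metric / target)
    return max(min_size, min(max_size, raw))


# 4 instances at 90% CPU with a 70% target: scale up
print(desired_capacity(4, metric=90.0, target=70.0, min_size=2, max_size=10))
# 8 instances at 35% CPU: scale down
print(desired_capacity(8, metric=35.0, target=70.0, min_size=2, max_size=10))
```

Rounding up rather than down errs toward having slightly too much capacity, which is usually the safer failure mode for production traffic.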

Auto-Scaling Best Practices


Storage Tiering

Cloud providers offer storage tiers at different price points. Moving data to cheaper tiers as it ages can dramatically reduce storage costs.

AWS S3 Storage Classes

Storage Class               Access Frequency   Price (per GB-month)   Retrieval Time   Retrieval Cost   Min Duration
──────────────────────────  ─────────────────  ─────────────────────  ───────────────  ───────────────  ────────────
S3 Std                      Frequent           $0.023                 Immediate        None             None
S3 IA                       Infrequent         $0.0125                Immediate        Per-GB fee       30 days
Glacier Instant Retrieval   Rare               $0.004                 Immediate        Per-GB fee       90 days
Glacier Deep Archive        Archive            $0.00099               12 hours         Per-GB fee       180 days

Lifecycle Policies

Object lifecycle:
  Day 0     →  S3 Std
  Day 30    →  S3 IA
  Day 90    →  Glacier Instant
  Day 365   →  Glacier Deep
  Day 730   →  Delete

Automated via lifecycle policy:
{
  "Rules": [
    {
      "ID": "archive-old-data",
      "Status": "Enabled",
      "Filter": {
        "Prefix": "logs/"
      },
      "Transitions": [
        {
          "Days": 30,
          "StorageClass": "STANDARD_IA"
        },
        {
          "Days": 90,
          "StorageClass": "GLACIER_IR"
        },
        {
          "Days": 365,
          "StorageClass": "DEEP_ARCHIVE"
        }
      ],
      "Expiration": {
        "Days": 730
      }
    }
  ]
}
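
To get a feel for what this policy saves, the sketch below maps an object's age to the storage class the policy would place it in and estimates the monthly storage charge using the per-GB prices listed earlier. It is an estimate under simplifying assumptions: retrieval fees, transition request fees, and minimum-duration charges are ignored, and the helper names are hypothetical.

```python
# Per-GB monthly prices from the storage class comparison above
PRICE_PER_GB_MONTH = {
    "STANDARD": 0.023,
    "STANDARD_IA": 0.0125,
    "GLACIER_IR": 0.004,
    "DEEP_ARCHIVE": 0.00099,
}

# (minimum age in days, storage class), checked from oldest to newest
TRANSITIONS = [(365, "DEEP_ARCHIVE"), (90, "GLACIER_IR"), (30, "STANDARD_IA")]
EXPIRATION_DAYS = 730


def storage_class_for_age(age_days: int):
    """Storage class under the policy, or None once the object has expired."""
    if age_days >= EXPIRATION_DAYS:
        return None
    for min_age, storage_class in TRANSITIONS:
        if age_days >= min_age:
            return storage_class
    return "STANDARD"


def monthly_cost(size_gb: float, age_days: int) -> float:
    """Approximate monthly storage charge for data of a given age."""
    storage_class = storage_class_for_age(age_days)
    if storage_class is None:
        return 0.0
    return size_gb * PRICE_PER_GB_MONTH[storage_class]


for age in (10, 45, 200, 500, 800):
    print(age, storage_class_for_age(age), round(monthly_cost(1000, age), 2))
```

For 1,000 GB of logs, the estimated charge falls from about $23 per month in S3 Standard to about $1 per month once the data reaches Deep Archive.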

Cost Monitoring and Tools

Cloud Provider Tools

Tool                     Provider   Capabilities
───────────────────────  ─────────  ─────────────────────────────────────────
AWS Cost Explorer        AWS        Visualize, understand, and manage costs
AWS Trusted Advisor      AWS        Cost optimization recommendations
AWS Compute Optimizer    AWS        Right-sizing recommendations
Azure Cost Management    Azure      Cost analysis and budgets
Azure Advisor            Azure      Right-sizing and optimization
GCP Cost Management      GCP        Cost breakdown and recommendations

Third-Party Tools

Tool                   Focus
─────────────────────  ──────────────────────────────────────────
CloudHealth (VMware)   Multi-cloud cost management
Spot.io (NetApp)       Spot instance management and optimization
Kubecost               Kubernetes cost monitoring
Infracost              Cost estimation in CI/CD (Terraform)
Vantage                Cloud cost transparency

Setting Up Cost Alerts

# Create a monthly budget with alerts
aws budgets create-budget \
  --account-id 123456789012 \
  --budget '{
    "BudgetName": "monthly-budget",
    "BudgetLimit": {
      "Amount": "5000",
      "Unit": "USD"
    },
    "TimeUnit": "MONTHLY",
    "BudgetType": "COST"
  }' \
  --notifications-with-subscribers '[
    {
      "Notification": {
        "NotificationType": "ACTUAL",
        "ComparisonOperator": "GREATER_THAN",
        "Threshold": 80,
        "ThresholdType": "PERCENTAGE"
      },
      "Subscribers": [
        {
          "SubscriptionType": "EMAIL",
          "Address": "finance@company.com"
        }
      ]
    },
    {
      "Notification": {
        "NotificationType": "FORECASTED",
        "ComparisonOperator": "GREATER_THAN",
        "Threshold": 100,
        "ThresholdType": "PERCENTAGE"
      },
      "Subscribers": [
        {
          "SubscriptionType": "EMAIL",
          "Address": "engineering@company.com"
        }
      ]
    }
  ]'
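
The two notifications above fire under different conditions: one when actual spend passes 80% of the budget, the other when forecasted spend passes 100%. The sketch below mirrors that logic locally so the thresholds are easy to reason about. It does not call the AWS Budgets service; the function name and the sample spend figures are hypothetical.

```python
BUDGET_LIMIT = 5000.0

# (notification type, threshold as a percentage of the budget limit)
RULES = [("ACTUAL", 80), ("FORECASTED", 100)]


def triggered_alerts(limit: float, actual: float, forecast: float, rules) -> list:
    """Return the notification types whose GREATER_THAN threshold is exceeded."""
    alerts = []
    for kind, threshold_pct in rules:
        value = actual if kind == "ACTUAL" else forecast
        if value > limit * threshold_pct / 100:
            alerts.append(kind)
    return alerts


# Mid-month: $3,000 spent so far, but the forecast is over budget
print(triggered_alerts(BUDGET_LIMIT, actual=3000.0, forecast=5300.0, rules=RULES))
# Late in the month: $4,100 spent, forecast still over budget
print(triggered_alerts(BUDGET_LIMIT, actual=4100.0, forecast=5300.0, rules=RULES))
```

The forecast-based alert gives engineering time to react before the money is spent, while the actual-spend alert tells finance that most of the budget is already gone.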

FinOps Principles

FinOps (a portmanteau of "Finance" and "DevOps") is the practice of bringing financial accountability to cloud spending. It is a cultural practice, not just a set of tools.

The Three Phases of FinOps

INFORM             →   OPTIMIZE              →   OPERATE
────────────────       ───────────────────       ─────────────────────────
- Visibility           - Right-sizing            - Continuous improvement
- Allocation           - Reserved pricing        - Automation
- Benchmarking         - Spot usage              - Policy enforcement
- Forecasting          - Waste removal

(Continuous cycle: the phases repeat)

FinOps Core Principles

  1. Teams need to collaborate: Engineering, finance, and business work together on cloud spending decisions.

  2. Everyone takes ownership: Engineers are accountable for the cost of the resources they provision.

  3. A centralized team drives FinOps: A FinOps team provides best practices, tools, and governance.

  4. Reports should be accessible and timely: Real-time cost data enables better decisions.

  5. Decisions are driven by business value: Cost optimization is about maximizing value, not just minimizing spend.

  6. Take advantage of the variable cost model: The cloud’s flexibility is an advantage, not just a risk.

Cost Allocation with Tags

Tagging resources enables cost attribution to teams, projects, and environments:

Required Tags for Cost Allocation:
Tag Key           Example Values         Purpose
───────────────   ────────────────────   ────────────────────────
team              platform, payments     Charge to team budget
environment       prod, staging, dev     Identify non-prod waste
project           checkout-v2, search    Track project costs
cost-center       CC-1234                Financial allocation
owner             jane.doe@company       Accountability
managed-by        terraform, manual      Infrastructure tracking
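
A tagging policy is only useful if it is enforced and the tags are then used for reporting. The sketch below shows both halves on an in-memory list of resources: finding resources that lack required tags, and summing cost by a tag key. The resource records and cost figures are made up for illustration; in practice they would come from a billing export or inventory tool.

```python
from collections import defaultdict

REQUIRED_TAGS = {"team", "environment", "project", "cost-center", "owner", "managed-by"}


def missing_tags(tags: dict) -> set:
    """Required tag keys that are absent from a resource's tags."""
    return REQUIRED_TAGS - set(tags)


def cost_by_tag(resources: list, key: str) -> dict:
    """Sum monthly cost per tag value; untagged resources are grouped together."""
    totals = defaultdict(float)
    for resource in resources:
        totals[resource["tags"].get(key, "untagged")] += resource["monthly_cost"]
    return dict(totals)


# Hypothetical inventory
resources = [
    {"id": "i-001", "monthly_cost": 120.0,
     "tags": {"team": "platform", "environment": "prod", "project": "search",
              "cost-center": "CC-1234", "owner": "jane.doe@company",
              "managed-by": "terraform"}},
    {"id": "i-002", "monthly_cost": 80.0,
     "tags": {"team": "payments", "environment": "dev"}},
    {"id": "i-003", "monthly_cost": 40.0, "tags": {}},
]

for r in resources:
    print(r["id"], sorted(missing_tags(r["tags"])))
print(cost_by_tag(resources, "team"))
```

Grouping untagged spend under its own label makes the size of the attribution gap visible, which is often the first number a FinOps team tracks.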

Quick Wins Checklist


Summary

Concept            Key Takeaway
─────────────────  ──────────────────────────────────────────────────────────────────────────
Right-sizing       Match instance size to actual workload needs; highest impact optimization
Reserved pricing   Commit to 1-3 year terms for 30-60% savings on predictable workloads
Spot instances     Up to 90% savings for fault-tolerant, stateless workloads
Auto-scaling       Scale capacity with demand to avoid paying for idle resources
Storage tiering    Automatically move aging data to cheaper storage classes
Cost monitoring    Set budgets, alerts, and use tools for continuous visibility
FinOps             Cultural practice of financial accountability for cloud spending
Tagging            Tag every resource for cost attribution and identification