
Serverless Architecture

Serverless computing lets you run code without provisioning or managing servers. The cloud provider dynamically allocates compute resources, executes your code, and charges you only for the actual execution time. Despite the name, servers still exist — you simply do not manage them.


How Serverless Works

Traditional Server:

┌─────────────────────────────────────────────────┐
│ Server (always running, always costing money)   │
│                                                 │
│ Request ──▶ [App] ──▶ Response                  │
│                                                 │
│ Idle... Idle... Request ──▶ [App] ──▶ Resp      │
│                                                 │
│ Idle... Idle... Idle... Idle...                 │
│ (still paying for idle time)                    │
└─────────────────────────────────────────────────┘

Serverless:

Request ──▶ [Spin up] ──▶ [Execute] ──▶ Response ──▶ [Shut down]

            ...nothing running... (pay: 0)

Request ──▶ [Spin up] ──▶ [Execute] ──▶ Response ──▶ [Shut down]

            ...nothing running... (pay: 0)

You only pay for the milliseconds of execution time.

The Execution Model

Event Source        Serverless Platform          Your Function
 (trigger)
    │              ┌─────────────────┐
    │   1. Event   │                 │
    ├─────────────▶│ 2. Find/Create  │
    │              │    container    │
    │              │                 │
    │              │ 3. Load code    │
    │              │                 │
    │              │ 4. Initialize   │──── Cold Start
    │              │    runtime      │     (if new container)
    │              │                 │
    │              │ 5. Execute      │──── Warm Start
    │              │    function     │     (if container reused)
    │              │                 │
    │              │ 6. Return       │
    │◀─────────────│    response     │
    │              │                 │
    │              │ 7. Keep warm    │
    │              │    or destroy   │
    │              └─────────────────┘

Serverless Platforms Compared

| Feature              | AWS Lambda                                    | Azure Functions                              | GCP Cloud Functions                        |
|----------------------|-----------------------------------------------|----------------------------------------------|--------------------------------------------|
| Max execution time   | 15 minutes                                    | 5-10 min (Consumption), up to 60 m (Premium) | 9 min (1st gen), 60 min (2nd gen)          |
| Memory range         | 128 MB - 10 GB                                | Up to 1.5 GB (Consumption)                   | 128 MB - 32 GB                             |
| Languages            | Python, Node.js, Java, Go, .NET, Ruby, custom | C#, JS, Python, Java, PowerShell             | Node.js, Python, Go, Java, .NET, Ruby, PHP |
| Triggers             | 200+ event sources                            | HTTP, Timer, Queue, Blob, Cosmos DB          | HTTP, Pub/Sub, Cloud Storage, Firestore    |
| Concurrency          | 1000 default (can increase)                   | Up to 200 instances (Consumption)            | 1000 per function                          |
| Cold start (typical) | 100 ms-1 s (interpreted), 3-10 s (Java)       | Similar to Lambda                            | Similar to Lambda                          |
| Pricing              | $0.20 per 1M requests + compute               | $0.20 per 1M requests + compute              | $0.40 per 1M requests + compute            |
| Free tier            | 1M requests/month                             | 1M requests/month                            | 2M requests/month                          |

Cold Starts

A cold start occurs when the serverless platform needs to create a new execution environment for your function. This includes downloading your code, starting the runtime, and initializing your application.

Cold Start Breakdown:

│◄──────────── Cold Start Time (100 ms - 10 s) ───────▶│
│                                                      │
│ Download  │ Start    │ Initialize │ Execute Function │
│ Code      │ Runtime  │ App Code   │                  │
│ (10-100ms)│(50-500ms)│ (varies)   │ (your code runs) │
│           │          │            │                  │

Warm Invocation (container reused):

│ Execute Function │
│ (your code runs) │
│ (much faster!)   │
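Container reuse can be observed directly from module-scope state, which survives between warm invocations. A small sketch (variable names are illustrative):

```python
import time

# Module scope runs once per container, at cold start
_container_start = time.time()
_invocations = 0

def handler(event, context):
    global _invocations
    _invocations += 1
    return {
        'cold_start': _invocations == 1,  # first call in this container
        'container_age_s': round(time.time() - _container_start, 3),
    }
```

On a warm invocation the module-level state is still there, so `cold_start` flips to False and `container_age_s` keeps growing until the platform destroys the container.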

Factors Affecting Cold Start Duration

| Factor              | Impact                                             | Mitigation                                                   |
|---------------------|----------------------------------------------------|--------------------------------------------------------------|
| Language/runtime    | Java/C# are slower; Python/Node.js are faster      | Choose interpreted languages for latency-sensitive functions |
| Package size        | Larger deployment packages take longer to download | Minimize dependencies, use layers                            |
| VPC attachment      | ENI creation adds 5-10 seconds (older Lambda)      | Use VPC-less designs when possible                           |
| Memory allocation   | More memory allocates proportionally more CPU      | Increase memory to reduce cold start time                    |
| Initialization code | Database connections, SDK setup in init phase      | Lazy initialization, connection pooling                      |

Cold Start Mitigation Strategies

# Keep 5 warm instances ready at all times
aws lambda put-provisioned-concurrency-config \
  --function-name my-function \
  --qualifier prod \
  --provisioned-concurrent-executions 5

# Schedule provisioned concurrency for peak hours
# using Application Auto Scaling
aws application-autoscaling register-scalable-target \
  --service-namespace lambda \
  --resource-id function:my-function:prod \
  --scalable-dimension lambda:function:ProvisionedConcurrency \
  --min-capacity 2 \
  --max-capacity 20
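At the code level, a common complement to provisioned concurrency is lazy initialization: create expensive clients once per container and reuse them on warm invocations. A minimal sketch, where the dict stands in for something like a boto3 client:

```python
_db_client = None

def get_db_client():
    """Create the client on first use, then reuse it across warm invocations."""
    global _db_client
    if _db_client is None:
        # Stand-in for expensive setup, e.g. boto3.client('dynamodb')
        _db_client = {'connected': True}
    return _db_client

def handler(event, context):
    client = get_db_client()  # cheap after the first call in this container
    return {'connected': client['connected']}
```

Functions that rarely need the client avoid paying its setup cost during cold start, because the work is deferred to the first invocation that actually uses it.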

Event-Driven Patterns

Serverless functions excel at event-driven architectures. Functions respond to events from various sources rather than running continuously.

Common Event Sources

┌─────────────────────────────────────────────────────────────┐
│                Event Sources for Serverless                 │
├──────────────────┬──────────────────────────────────────────┤
│ Synchronous      │ HTTP/API requests (API Gateway)          │
│                  │ SDK invocations                          │
├──────────────────┼──────────────────────────────────────────┤
│ Asynchronous     │ Message queues (SQS, Service Bus)        │
│                  │ Event streams (Kinesis, Event Hubs)      │
│                  │ Pub/Sub topics (SNS, Pub/Sub)            │
│                  │ File uploads (S3, Blob Storage)          │
│                  │ Database changes (DynamoDB Streams)      │
│                  │ Scheduled triggers (CloudWatch Events)   │
│                  │ IoT device messages                      │
│                  │ Email (SES)                              │
└──────────────────┴──────────────────────────────────────────┘

Pattern 1: API Backend

Client ──▶ API Gateway ──▶ Lambda ──▶ DynamoDB
                                  ──▶ S3
                                  ──▶ SES (email)
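A minimal sketch of the Lambda in this pattern, assuming API Gateway's HTTP API proxy event format and stubbing out the DynamoDB read:

```python
import json

def get_user(event, context):
    """Return one user as an HTTP response (Pattern 1: API backend)."""
    user_id = event['pathParameters']['id']
    # Stub: a real function would fetch the item from DynamoDB here
    user = {'userId': user_id, 'name': 'example'}
    return {
        'statusCode': 200,
        'headers': {'Content-Type': 'application/json'},
        'body': json.dumps(user),
    }
```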

Pattern 2: Event Processing Pipeline

S3 Upload ──▶ Lambda (resize) ──▶ S3 (thumbnails)
                              ──▶ DynamoDB (metadata)
                              ──▶ SNS (notification)

Pattern 3: Scheduled Jobs

CloudWatch Events ──▶ Lambda (cleanup)
(every hour)      ──▶ Lambda (report generation)
(daily at 2am)    ──▶ Lambda (data export)

Pattern 4: Stream Processing

Kinesis Stream ──▶ Lambda ──▶ Elasticsearch
(real-time           │    ──▶ S3 (archive)
 clickstream)        │    ──▶ CloudWatch (metrics)
                     │
              Batch of records
           processed per invocation
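On the consumer side, each invocation receives one batch under event['Records'], and each Kinesis payload arrives base64-encoded. A sketch of the Lambda in Pattern 4 (the 'click' event shape is illustrative):

```python
import base64
import json

def handler(event, context):
    """Count click events in one Kinesis batch (Pattern 4)."""
    clicks = 0
    for record in event['Records']:
        # Kinesis record data is base64-encoded bytes
        payload = json.loads(base64.b64decode(record['kinesis']['data']))
        if payload.get('type') == 'click':
            clicks += 1
    # A real function would forward results to Elasticsearch / S3 / CloudWatch
    return {'batchSize': len(event['Records']), 'clicks': clicks}
```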
The resize function from Pattern 2, for example, can be implemented as a Python handler:

import io

import boto3
from PIL import Image

s3 = boto3.client('s3')

def handler(event, context):
    """
    Triggered when an image is uploaded to S3.
    Creates a thumbnail and stores it in another bucket.
    """
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        key = record['s3']['object']['key']

        # Skip if already a thumbnail
        if key.startswith('thumbnails/'):
            continue

        # Download the image
        response = s3.get_object(Bucket=bucket, Key=key)
        image_data = response['Body'].read()

        # Create thumbnail (convert to RGB so images with alpha save as JPEG)
        image = Image.open(io.BytesIO(image_data)).convert('RGB')
        image.thumbnail((200, 200))

        # Save thumbnail
        buffer = io.BytesIO()
        image.save(buffer, format='JPEG', quality=85)
        buffer.seek(0)

        thumbnail_key = f"thumbnails/{key}"
        s3.put_object(
            Bucket=bucket,
            Key=thumbnail_key,
            Body=buffer,
            ContentType='image/jpeg'
        )
        print(f"Created thumbnail: {thumbnail_key}")

    return {'statusCode': 200}

Serverless Databases

Serverless databases automatically scale compute and storage, eliminating the need to manage database instances.

| Database               | Type                          | Provider    | Key Feature                                     |
|------------------------|-------------------------------|-------------|-------------------------------------------------|
| DynamoDB               | Key-value / Document          | AWS         | Single-digit ms latency at any scale            |
| Aurora Serverless      | Relational (MySQL/PostgreSQL) | AWS         | Auto-scales capacity units                      |
| Cosmos DB (Serverless) | Multi-model                   | Azure       | Global distribution, multiple consistency levels|
| PlanetScale            | MySQL-compatible              | Independent | Git-like branching for schemas                  |
| Neon                   | PostgreSQL                    | Independent | Autoscaling, branching                          |
| Firestore              | Document                      | GCP         | Real-time sync, offline support                 |
| Upstash                | Redis-compatible              | Independent | Pay-per-request Redis                           |

Benefits and Limitations

Benefits

| Benefit               | Details                                                  |
|-----------------------|----------------------------------------------------------|
| No server management  | No patching, no capacity planning, no OS management      |
| Automatic scaling     | Scales from zero to thousands of concurrent executions   |
| Pay-per-use           | Pay only for actual compute time (often per millisecond) |
| High availability     | Built-in redundancy across availability zones            |
| Faster time to market | Focus on business logic, not infrastructure              |
| Built-in integrations | Native connections to other cloud services               |

Limitations

| Limitation            | Details                                                                   |
|-----------------------|---------------------------------------------------------------------------|
| Cold starts           | First invocation has higher latency (100 ms to 10 s)                      |
| Execution time limits | Functions time out after 15 minutes (Lambda)                              |
| Statelessness         | No persistent local state between invocations                             |
| Vendor lock-in        | Event source integrations are provider-specific                           |
| Debugging difficulty  | Harder to debug distributed, event-driven systems                         |
| Concurrency limits    | Default limits may throttle during traffic spikes                         |
| Cost at scale         | Can be more expensive than reserved instances at very high, constant load |

Cost Crossover Point

Monthly Cost
│                        ╱  Serverless
│                      ╱    (pay per invocation)
│                    ╱
│ ─────────────────╱────────── Reserved Instance
│                ╱             (fixed monthly cost)
│             ╱
│         ╱
│╱
├─────────────────────────────────────▶
0            Requests/Month        High

At low traffic: serverless wins (near-zero cost)
At high, constant traffic: reserved instances win
The crossover depends on function duration and memory
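The crossover can be estimated with simple arithmetic. This sketch uses the $0.20 per 1M requests figure from the comparison table plus an assumed per-GB-second compute rate (Lambda's published rate is roughly $0.0000166667, but check current pricing):

```python
def serverless_monthly_cost(requests, avg_duration_ms, memory_gb,
                            per_million_requests=0.20,
                            per_gb_second=0.0000166667):
    """Rough monthly cost: request charge + (duration x memory) compute charge.
    Rates are illustrative assumptions, not quotes from any provider."""
    request_cost = requests / 1_000_000 * per_million_requests
    gb_seconds = requests * (avg_duration_ms / 1000) * memory_gb
    return request_cost + gb_seconds * per_gb_second

# A 100 ms, 256 MB function at two traffic levels
cost_low = serverless_monthly_cost(1_000_000, 100, 0.25)
cost_high = serverless_monthly_cost(100_000_000, 100, 0.25)
```

With these assumptions, one million requests a month costs well under a dollar, while one hundred million costs over $60, so a hypothetical $30/month reserved instance would win at the higher volume. That is the crossover the diagram describes.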

When to Use Serverless

Use serverless for:
- APIs and workloads with variable or unpredictable traffic
- Event processing (file uploads, queue messages, stream records)
- Scheduled tasks and periodic jobs

Avoid serverless for:
- Long-running jobs that exceed platform execution time limits
- Constant, high-volume load where reserved capacity is cheaper
- Applications that depend on persistent local state

Serverless Framework Example

serverless.yml

service: user-api

provider:
  name: aws
  runtime: python3.12
  region: us-east-1
  memorySize: 256
  timeout: 30
  environment:
    TABLE_NAME: !Ref UsersTable
  iam:
    role:
      statements:
        - Effect: Allow
          Action:
            - dynamodb:GetItem
            - dynamodb:PutItem
            - dynamodb:Query
            - dynamodb:DeleteItem
          Resource:
            - !GetAtt UsersTable.Arn

functions:
  getUser:
    handler: handlers/users.get_user
    events:
      - httpApi:
          path: /users/{id}
          method: get
  createUser:
    handler: handlers/users.create_user
    events:
      - httpApi:
          path: /users
          method: post
  processOrder:
    handler: handlers/orders.process
    events:
      - sqs:
          arn: !GetAtt OrderQueue.Arn
          batchSize: 10
  dailyReport:
    handler: handlers/reports.generate
    events:
      - schedule: cron(0 2 * * ? *)

resources:
  Resources:
    UsersTable:
      Type: AWS::DynamoDB::Table
      Properties:
        TableName: Users
        BillingMode: PAY_PER_REQUEST
        AttributeDefinitions:
          - AttributeName: userId
            AttributeType: S
        KeySchema:
          - AttributeName: userId
            KeyType: HASH
    OrderQueue:
      Type: AWS::SQS::Queue
      Properties:
        QueueName: order-processing

Summary

| Concept              | Key Takeaway                                                   |
|----------------------|----------------------------------------------------------------|
| Serverless           | Run code without managing servers; pay per execution           |
| Cold starts          | First invocation latency; mitigate with provisioned concurrency|
| Event-driven         | Functions triggered by events (HTTP, queues, files, schedules) |
| Serverless databases | Auto-scaling databases with pay-per-use pricing                |
| Best for             | Variable traffic, event processing, scheduled tasks            |
| Not ideal for        | Long-running jobs, constant high load, stateful apps           |