Before designing any system, you need a solid grasp of the fundamental building blocks. This page covers the networking concepts, communication protocols, API design patterns, and core trade-offs that appear in every system design discussion.
The client-server model is the foundation of virtually all modern networked applications. A client initiates requests, and a server processes those requests and returns responses.
```
┌──────────┐       Request        ┌──────────┐
│          │ ───────────────────► │          │
│  Client  │                      │  Server  │
│ (Browser)│ ◄─────────────────── │ (Backend)│
│          │       Response       │          │
└──────────┘                      └──────────┘
```

| Property | Client | Server |
|---|---|---|
| Initiates | Requests | Responses |
| Lifecycle | Ephemeral (user session) | Long-running (always on) |
| Count | Many (millions of users) | Few (server fleet) |
| Location | Edge (user devices) | Data center |
| Trust | Untrusted | Trusted |
In modern systems, a server often acts as a client to other servers:
```
┌────────┐     ┌─────────────┐     ┌──────────┐     ┌──────────┐
│ Mobile │────►│             │────►│   Auth   │────►│ User DB  │
│  App   │     │             │     │ Service  │     │          │
└────────┘     │     API     │     └──────────┘     └──────────┘
               │   Gateway   │     ┌──────────┐     ┌──────────┐
┌────────┐     │             │────►│ Product  │────►│Product DB│
│  Web   │────►│             │     │ Service  │     │          │
│  App   │     └─────────────┘     └──────────┘     └──────────┘
└────────┘
```

Understanding the network stack helps you reason about where things can go wrong.
```
Layer 7 - Application   │ HTTP, WebSocket, gRPC
Layer 6 - Presentation  │ TLS/SSL encryption
Layer 5 - Session       │ Session management
Layer 4 - Transport     │ TCP, UDP
Layer 3 - Network       │ IP, routing
Layer 2 - Data Link     │ Ethernet, Wi-Fi
Layer 1 - Physical      │ Cables, radio waves
```

For system design, the most relevant layers are Transport (4) and Application (7).
Every machine on a network has an IP address (like a street address) and services listen on ports (like apartment numbers).
```
http://192.168.1.100:8080/api/users
  │          │          │     │
  │          │          │     └── Path (resource)
  │          │          └── Port (which service)
  │          └── IP Address (which machine)
  └── Protocol (how to communicate)
```

DNS translates human-readable domain names (like www.example.com) into IP addresses (like 93.184.216.34).
```
                      ┌──────────────┐
               ┌─────►│   Root DNS   │  (knows .com, .org, etc.)
               │      │    Server    │
               │      └──────────────┘
┌────────┐     │      ┌──────────────┐
│ Client │     ├─────►│   TLD DNS    │  (knows .com domains)
└────┬───┘     │      │    Server    │
     │         │      └──────────────┘
┌────▼──────┐  │      ┌──────────────┐
│ Local DNS │──┼─────►│ Authoritative│  (knows example.com IP)
│ Resolver  │  └─────►│  DNS Server  │
│   (ISP)   │         └──────────────┘
└───────────┘
```

| Record | Purpose | Example |
|---|---|---|
| A | Maps domain to IPv4 address | example.com → 93.184.216.34 |
| AAAA | Maps domain to IPv6 address | example.com → 2606:2800:... |
| CNAME | Alias to another domain | www.example.com → example.com |
| MX | Mail server routing | example.com → mail.example.com |
| NS | Authoritative name server | example.com → ns1.example.com |
| TXT | Arbitrary text (SPF, DKIM) | example.com → "v=spf1 ..." |
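Name resolution is also something you can exercise directly from code via the OS resolver, which walks the hierarchy above on your behalf. A minimal sketch using Python's standard library; `localhost` is used so the example runs offline, but any domain name resolves the same way:

```python
import socket

# Resolve a hostname to an IPv4 address through the system resolver
# (the same path a browser takes before connecting). AF_INET restricts
# the answer to A records (IPv4).
infos = socket.getaddrinfo("localhost", 80,
                           family=socket.AF_INET,
                           type=socket.SOCK_STREAM)
family, socktype, proto, canonname, sockaddr = infos[0]
ip, port = sockaddr
print(ip, port)   # 127.0.0.1 80
```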
The two primary transport-layer protocols have fundamentally different guarantees.
TCP Three-Way Handshake:
```
Client                     Server
  │                           │
  │──── SYN ─────────────────►│
  │                           │
  │◄─── SYN-ACK ──────────────│
  │                           │
  │──── ACK ─────────────────►│
  │                           │
  │   Connection Established  │
  │◄═════════════════════════►│
```

| Use Case | Protocol | Reason |
|---|---|---|
| Web pages (HTTP) | TCP | Need reliable, ordered delivery |
| File transfer | TCP | Cannot lose data |
| Email (SMTP) | TCP | Messages must arrive intact |
| Video streaming | UDP | Tolerates some packet loss, needs low latency |
| Voice calls (VoIP) | UDP | Real-time, latency-sensitive |
| DNS queries | UDP | Small, single request-response |
| Online gaming | UDP | Low latency more important than reliability |
| IoT sensor data | UDP | Lightweight, high-frequency updates |
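The DNS row above hints at why UDP suits small request/response exchanges: there is no handshake to amortize. A minimal UDP round trip on localhost with Python's standard library (port 0 asks the OS for any free port):

```python
import socket

# UDP is connectionless: no listen()/accept(), no handshake. Each
# sendto() emits one datagram that arrives whole or not at all.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))
host, port = server.getsockname()

client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.sendto(b"ping", (host, port))   # fire-and-forget: no delivery guarantee

data, addr = server.recvfrom(1024)     # blocks until a datagram arrives
print(data)                            # b'ping'

client.close()
server.close()
```

On a real network the datagram could simply be lost; the application, not the transport, must decide whether that matters.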
HTTP (HyperText Transfer Protocol) is the application-layer protocol that powers the web.
```
Client                                 Server
  │                                      │
  │──── GET /api/users HTTP/1.1 ────────►│
  │     Host: api.example.com            │
  │     Authorization: Bearer ...        │
  │                                      │
  │◄─── HTTP/1.1 200 OK ─────────────────│
  │     Content-Type: application/json   │
  │     [{"id": 1, "name": "..."}]       │
  │                                      │
```

| Method | Purpose | Idempotent | Safe |
|---|---|---|---|
| GET | Retrieve a resource | Yes | Yes |
| POST | Create a new resource | No | No |
| PUT | Replace a resource entirely | Yes | No |
| PATCH | Partially update a resource | Not guaranteed | No |
| DELETE | Remove a resource | Yes | No |
| HEAD | Same as GET but no body | Yes | Yes |
| OPTIONS | Describe communication options | Yes | Yes |
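The idempotency column is easiest to see with a toy in-memory store (the store and helper names below are illustrative, not a real API): replaying a PUT leaves the state unchanged, while replaying a POST creates a duplicate.

```python
users = {}

def put_user(user_id, data):
    # PUT: the client chooses the ID, so repeating the call is harmless
    users[user_id] = data

def post_user(data):
    # POST: the server assigns a new ID, so each call adds a resource
    uid = max(users, default=0) + 1
    users[uid] = data
    return uid

put_user(1, {"name": "Ada"})
put_user(1, {"name": "Ada"})   # replayed PUT: still one resource with id 1
post_user({"name": "Ada"})
post_user({"name": "Ada"})     # replayed POST: two new resources
print(len(users))              # 3
```

This is why retrying a failed PUT is safe, while retrying a POST usually requires an idempotency key.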
| Range | Category | Common Codes |
|---|---|---|
| 1xx | Informational | 101 Switching Protocols |
| 2xx | Success | 200 OK, 201 Created, 204 No Content |
| 3xx | Redirection | 301 Moved Permanently, 304 Not Modified |
| 4xx | Client Error | 400 Bad Request, 401 Unauthorized, 403 Forbidden, 404 Not Found, 429 Too Many Requests |
| 5xx | Server Error | 500 Internal Server Error, 502 Bad Gateway, 503 Service Unavailable |
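The full request/response cycle, including the status codes above, can be exercised end to end with Python's standard library. The route and payload here are illustrative:

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/api/users":
            body = b'[{"id": 1, "name": "Ada"}]'
            self.send_response(200)                       # 2xx: success
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)                          # 4xx: client error

    def log_message(self, *args):                         # silence request logging
        pass

# Port 0 lets the OS pick a free port; serve in a background thread.
server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
conn.request("GET", "/api/users")
resp = conn.getresponse()
body = resp.read().decode()
print(resp.status, resp.getheader("Content-Type"))   # 200 application/json
print(body)
conn.close()
server.shutdown()
```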
HTTPS wraps HTTP in a TLS (Transport Layer Security) layer, providing encryption (eavesdroppers cannot read traffic), authentication (certificates prove the server's identity), and integrity (tampering is detectable).
TLS Handshake (Simplified):
```
Client                                  Server
  │                                        │
  │── ClientHello (supported ciphers) ────►│
  │                                        │
  │◄─ ServerHello + Certificate ───────────│
  │                                        │
  │── Verify cert, key exchange ──────────►│
  │                                        │
  │◄─ Finished ────────────────────────────│
  │                                        │
  │══ Encrypted HTTP traffic ═════════════►│
  │◄═══════════════════════════════════════│
```

| Feature | HTTP/1.1 | HTTP/2 | HTTP/3 |
|---|---|---|---|
| Multiplexing | No (one request at a time per connection) | Yes (multiple streams) | Yes |
| Header compression | No | HPACK | QPACK |
| Server push | No | Yes | Yes |
| Transport | TCP | TCP | QUIC (UDP-based) |
| Head-of-line blocking | Yes | At TCP level | No |
APIs define how clients and servers communicate. Choosing the right API style depends on your use case.
REST is the most widely used API style for web services. It maps CRUD operations to HTTP methods on resources.
```
# Resource-based URL design
GET    /api/users          # List all users
GET    /api/users/123      # Get user 123
POST   /api/users          # Create a new user
PUT    /api/users/123      # Replace user 123
PATCH  /api/users/123      # Update user 123
DELETE /api/users/123      # Delete user 123

# Nested resources
GET    /api/users/123/posts   # List user 123's posts
POST   /api/users/123/posts   # Create a post for user 123

# Filtering, sorting, pagination
GET    /api/users?role=admin&sort=-created_at&page=2&limit=20
```

REST Best Practices:
- Use nouns for resources, not verbs (/users not /getUsers)
- Use plural resource names (/users not /user)
- Version your API (/api/v1/users)

GraphQL lets clients request exactly the data they need, solving the over-fetching and under-fetching problems of REST.
```graphql
# Client specifies exactly what data it needs
query {
  user(id: "123") {
    name
    email
    posts(last: 5) {
      title
      createdAt
      comments {
        text
        author { name }
      }
    }
  }
}
```

| Aspect | REST | GraphQL |
|---|---|---|
| Endpoints | Multiple (one per resource) | Single /graphql endpoint |
| Data fetching | Fixed response shape | Client specifies shape |
| Over-fetching | Common problem | Solved |
| Under-fetching | Requires multiple requests | Single request |
| Caching | HTTP caching built-in | Requires custom caching |
| File uploads | Native support | Needs workaround |
| Learning curve | Low | Moderate |
| Best for | Simple CRUD, public APIs | Complex, nested data; mobile apps |
gRPC uses Protocol Buffers for serialization and HTTP/2 for transport. It is ideal for internal service-to-service communication.
```protobuf
// Define the service in a .proto file
syntax = "proto3";

service UserService {
  rpc GetUser (GetUserRequest) returns (User);
  rpc ListUsers (ListUsersRequest) returns (stream User);
  rpc CreateUser (CreateUserRequest) returns (User);
}

message GetUserRequest {
  string user_id = 1;
}

message User {
  string id = 1;
  string name = 2;
  string email = 3;
  int64 created_at = 4;
}
```

| Feature | REST | gRPC |
|---|---|---|
| Serialization | JSON (text) | Protocol Buffers (binary) |
| Performance | Moderate | High (compact binary serialization, typically several times faster than JSON) |
| Streaming | Limited (SSE, WebSocket) | Native bidirectional streaming |
| Type safety | No (relies on docs) | Yes (generated code from .proto) |
| Browser support | Native | Requires gRPC-Web proxy |
| Best for | Public APIs, web clients | Internal microservice communication |
These two metrics are often confused but measure different things.
Analogy: A highway
```
Latency    = How long it takes ONE car to travel from A to B
Throughput = How many cars pass a point per hour
```

```
A highway can have:
- Low latency + High throughput   (fast, many lanes)  ← ideal
- Low latency + Low throughput    (fast, few lanes)
- High latency + High throughput  (slow, many lanes)
- High latency + Low throughput   (slow, few lanes)   ← worst
```

| Goal | Strategies |
|---|---|
| Reduce latency | Caching, CDNs, connection pooling, geographic proximity, async processing |
| Increase throughput | Horizontal scaling, load balancing, batching, connection multiplexing, queue-based processing |
In practice, optimizing for one can impact the other. Batching increases throughput but may increase latency for individual requests. Caching can improve both simultaneously.
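A toy cost model makes the batching trade-off concrete. The constants are arbitrary time units, chosen only to illustrate the shape of the curve:

```python
PER_ITEM_COST = 1.0      # fixed work per item
PER_CALL_OVERHEAD = 5.0  # fixed overhead per request/flush

def total_time(items, batch_size):
    batches = -(-items // batch_size)   # ceiling division
    return batches * PER_CALL_OVERHEAD + items * PER_ITEM_COST

items = 100
unbatched = total_time(items, 1)    # 100 calls: 100*5 + 100*1 = 600
batched = total_time(items, 20)     #   5 calls:   5*5 + 100*1 = 125
print(unbatched, batched)           # 600.0 125.0

# Throughput (items per unit time) improves with batching...
print(items / batched > items / unbatched)   # True
# ...but an individual item's latency grows: the first item placed in
# a batch waits for up to (batch_size - 1) more items before the flush.
```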
In distributed systems, you must often choose between strong consistency and high availability.
Availability is the percentage of time a system is operational and accessible.
```
Availability = Uptime / (Uptime + Downtime)
```

| Availability | Downtime/Year | Downtime/Month | Downtime/Week |
|---|---|---|---|
| 99% (two nines) | 3.65 days | 7.31 hours | 1.68 hours |
| 99.9% (three nines) | 8.77 hours | 43.83 minutes | 10.08 minutes |
| 99.99% (four nines) | 52.60 minutes | 4.38 minutes | 1.01 minutes |
| 99.999% (five nines) | 5.26 minutes | 26.30 seconds | 6.05 seconds |
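The "nines" table is easy to sanity-check: each target's allowed downtime falls straight out of the availability formula.

```python
def downtime_per_year(availability_pct):
    # Allowed downtime (in hours) for a given availability target.
    hours_per_year = 365.25 * 24
    return (1 - availability_pct / 100) * hours_per_year

for nines in (99, 99.9, 99.99, 99.999):
    hours = downtime_per_year(nines)
    if hours >= 24:
        print(f"{nines}%: {hours / 24:.2f} days/year")
    elif hours >= 1:
        print(f"{nines}%: {hours:.2f} hours/year")
    else:
        print(f"{nines}%: {hours * 60:.2f} minutes/year")
# 99%: 3.65 days/year
# 99.9%: 8.77 hours/year
# 99.99%: 52.60 minutes/year
# 99.999%: 5.26 minutes/year
```

Note how each extra nine cuts the downtime budget by a factor of ten; going from four to five nines leaves barely five minutes per year for deploys, failovers, and incidents combined.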
| Model | Guarantee | Use Case |
|---|---|---|
| Strong consistency | All reads see the most recent write | Banking, inventory |
| Eventual consistency | Reads will eventually see the latest write | Social media feeds, DNS |
| Causal consistency | Causally related operations are seen in order | Chat messages |
| Read-your-writes | A user always sees their own writes | User profile updates |
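Eventual consistency can be sketched with a toy primary/replica pair (the structure is illustrative): a write is visible on the primary immediately, but a reader hitting the replica sees stale data until asynchronous replication catches up.

```python
primary = {}
secondary = {}
replication_log = []   # writes shipped to the replica asynchronously

def write(key, value):
    primary[key] = value
    replication_log.append((key, value))

def replicate():
    # Simulated async replication: apply queued writes to the replica.
    while replication_log:
        key, value = replication_log.pop(0)
        secondary[key] = value

write("bio", "hello")
stale = secondary.get("bio")
print(stale)                 # None: the replica has not caught up yet
replicate()
print(secondary.get("bio"))  # hello: the replicas have converged
```

Read-your-writes consistency would be achieved here by routing a user's reads to the primary until replication of their own writes completes.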
The CAP theorem states that a distributed data store can provide at most two of three guarantees simultaneously:
```
              Consistency
                  /\
                 /  \
                /    \
               /  CP  \
              / Systems \
             /──────────\
            /            \
           /  CA      AP  \
          / Systems Systems\
         /──────────────────\
    Availability ──── Partition Tolerance
```

In any distributed system, network partitions will happen (cables fail, data centers lose connectivity), so partition tolerance is not optional: you must have P. This means you are choosing between:
| Choice | Guarantees | Sacrifice | Examples |
|---|---|---|---|
| CP | Consistency + Partition Tolerance | Availability (may reject requests during partitions) | HBase, MongoDB (tunable), Redis Cluster |
| AP | Availability + Partition Tolerance | Consistency (may serve stale data during partitions) | Cassandra, DynamoDB, CouchDB |
| CA | Consistency + Availability | Partition Tolerance (only works on single node) | Traditional RDBMS (single node PostgreSQL, MySQL) |
The PACELC theorem extends CAP: if there is a Partition, choose between Availability and Consistency; Else (when running normally), choose between Latency and Consistency.
```
If (Partition) → choose A or C
Else           → choose L or C
```

```
Examples:
- DynamoDB:   PA/EL (Available during partition, Low latency normally)
- MongoDB:    PC/EC (Consistent during partition, Consistent normally)
- Cassandra:  PA/EL (Available during partition, Low latency normally)
- PostgreSQL: PC/EC (Consistent always, single-node CA)
```

A forward proxy sits between clients and the internet, acting on behalf of the client.
```
┌────────┐     ┌─────────────┐     ┌──────────┐
│ Client │────►│   Forward   │────►│ Server A │
│   A    │     │    Proxy    │     └──────────┘
└────────┘     │             │     ┌──────────┐
┌────────┐     │ - Caching   │────►│ Server B │
│ Client │────►│ - Filtering │     └──────────┘
│   B    │     │ - Anonymity │
└────────┘     └─────────────┘
```

Use cases: Corporate content filtering, caching, IP anonymization.
A reverse proxy sits between the internet and servers, acting on behalf of the servers.
```
               ┌─────────────┐     ┌──────────┐
               │   Reverse   │────►│ Server A │
┌────────┐     │    Proxy    │     └──────────┘
│ Client │────►│             │     ┌──────────┐
│        │     │ - Load bal. │────►│ Server B │
└────────┘     │ - SSL term. │     └──────────┘
               │ - Caching   │     ┌──────────┐
               │ - Compress. │────►│ Server C │
               └─────────────┘     └──────────┘
```

Use cases: Load balancing, SSL termination, caching, compression, DDoS protection.
Common reverse proxies: Nginx, HAProxy, AWS ALB, Cloudflare.
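As a sketch of how this looks in practice, here is a minimal reverse-proxy configuration in Nginx's own syntax. The upstream addresses and certificate paths are hypothetical:

```nginx
# Load-balance across two backends and terminate TLS at the proxy.
upstream backend {
    server 10.0.0.11:8080;          # hypothetical backend addresses
    server 10.0.0.12:8080;
}

server {
    listen 443 ssl;
    ssl_certificate     /etc/ssl/example.pem;   # hypothetical paths
    ssl_certificate_key /etc/ssl/example.key;

    location / {
        proxy_pass http://backend;              # round-robin by default
        proxy_set_header Host $host;            # preserve the original host
    }
}
```

Clients speak HTTPS to the proxy; the backends see plain HTTP and never handle certificates themselves.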
With simple modulo hashing (hash(key) % N), adding or removing a server causes most keys to be remapped:
```
With 3 servers: hash("user1") % 3 = 1 → Server 1
With 4 servers: hash("user1") % 4 = 2 → Server 2 (remapped!)
```

This causes mass cache invalidation (a cache stampede): most cached data becomes useless when the fleet changes.
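The scale of the problem is easy to measure. Growing a fleet from 3 to 4 servers under modulo placement moves roughly three quarters of all keys (md5 stands in here for any stable hash; Python's built-in `hash()` is salted per process, so it is avoided):

```python
import hashlib

def bucket(key, n):
    # Stable, well-spread hash of the key, reduced modulo the fleet size.
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return h % n

keys = [f"user{i}" for i in range(10_000)]
moved = sum(bucket(k, 3) != bucket(k, 4) for k in keys)
print(f"{moved / len(keys):.0%} of keys remapped")   # ≈ 75%
```

A key stays put only when hash % 3 equals hash % 4, which happens for about a quarter of uniformly distributed hashes; the rest remap.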
Consistent hashing maps both servers and keys onto a circular ring. Each key is assigned to the next server clockwise on the ring.
```
              Server A
          ────────●────────
        /    key1 ↗        \
       /                    \
      ●                      ●
  Server D                Server B
       \                    /
        \    key2 ↗        /
          ────────●────────
              Server C
```
```
When Server B is removed:
- Only keys between A and B are remapped to C
- Keys assigned to A, C, D are unaffected
```

Benefits: When a server is added or removed, only K/N keys are remapped on average (where K = total keys, N = total servers), instead of nearly all keys.
Virtual nodes: Each physical server is mapped to multiple points on the ring for better balance.
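The scheme above fits in a few dozen lines. A minimal sketch of a consistent hash ring with virtual nodes (md5 is used only as a stable, uniform hash; the class and method names are illustrative):

```python
import bisect
import hashlib

class ConsistentHashRing:
    def __init__(self, vnodes=100):
        self.vnodes = vnodes
        self.ring = []            # sorted list of (point, server)

    def _point(self, key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add(self, server):
        # Place `vnodes` points per physical server for better balance.
        for i in range(self.vnodes):
            point = self._point(f"{server}#{i}")
            bisect.insort(self.ring, (point, server))

    def remove(self, server):
        self.ring = [(p, s) for p, s in self.ring if s != server]

    def get(self, key):
        # First virtual node clockwise from the key's point (wrap to 0).
        point = self._point(key)
        i = bisect.bisect(self.ring, (point,)) % len(self.ring)
        return self.ring[i][1]

ring = ConsistentHashRing()
for server in ("A", "B", "C", "D"):
    ring.add(server)

keys = [f"user{i}" for i in range(10_000)]
before = {k: ring.get(k) for k in keys}
ring.remove("B")
moved = sum(ring.get(k) != before[k] for k in keys)
print(f"{moved / len(keys):.0%} of keys remapped")   # ≈ 25%: only B's keys
```

Removing one of four servers remaps only the keys that server owned, about K/N of them, while every other key keeps its assignment.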
Core Protocols
TCP for reliable delivery, UDP for low latency, HTTP/HTTPS for web communication, DNS for name resolution. Know when to use each.
API Design
REST for public APIs, GraphQL for flexible data fetching, gRPC for high-performance internal services. Match the API style to the use case.
Key Trade-offs
Latency vs throughput, availability vs consistency, and the CAP theorem. Every design decision involves trade-offs — be explicit about which you are making.
Infrastructure
Proxies, load balancers, and consistent hashing are the building blocks for scalable, resilient systems. Understand them deeply.