Skip to content

SQL Fundamentals

SQL Categories

SQL statements are organized into four major categories, each serving a distinct purpose:

CategoryFull NamePurposeKey Statements
DDLData Definition LanguageDefine and modify schemaCREATE, ALTER, DROP, TRUNCATE
DMLData Manipulation LanguageQuery and modify dataSELECT, INSERT, UPDATE, DELETE
DCLData Control LanguageManage permissionsGRANT, REVOKE
TCLTransaction Control LanguageManage transactionsBEGIN, COMMIT, ROLLBACK, SAVEPOINT

This page focuses primarily on DML — the statements you will use most frequently as a developer.

Sample Database

All examples in this page use the following tables. Imagine a simple e-commerce system:

CREATE TABLE customers (
customer_id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL,
email VARCHAR(150) UNIQUE NOT NULL,
city VARCHAR(50),
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE products (
product_id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL,
category VARCHAR(50),
price DECIMAL(10, 2) NOT NULL
);
CREATE TABLE orders (
order_id SERIAL PRIMARY KEY,
customer_id INT REFERENCES customers(customer_id),
order_date DATE NOT NULL,
status VARCHAR(20) DEFAULT 'pending'
);
CREATE TABLE order_items (
item_id SERIAL PRIMARY KEY,
order_id INT REFERENCES orders(order_id),
product_id INT REFERENCES products(product_id),
quantity INT NOT NULL,
unit_price DECIMAL(10, 2) NOT NULL
);

Sample data:

customer_idnameemailcity
1Alice Johnsonalice@example.comNew York
2Bob Smithbob@example.comChicago
3Carol Leecarol@example.comNew York
4David Kimdavid@example.comSan Francisco
product_idnamecategoryprice
1LaptopElectronics999.99
2Wireless MouseElectronics29.99
3Python BookBooks45.00
4Standing DeskFurniture349.99
5USB-C CableElectronics12.99

The SELECT Statement

The SELECT statement is the workhorse of SQL. It retrieves data from one or more tables.

Basic Syntax

SELECT column1, column2, ...
FROM table_name
WHERE condition
ORDER BY column1 [ASC|DESC]
LIMIT count
OFFSET skip;

Selecting Columns

-- Select specific columns
SELECT name, email, city
FROM customers;
nameemailcity
Alice Johnsonalice@example.comNew York
Bob Smithbob@example.comChicago
Carol Leecarol@example.comNew York
David Kimdavid@example.comSan Francisco
-- Select all columns (avoid in production — be explicit)
SELECT * FROM customers;
-- Aliasing columns
SELECT name AS customer_name, city AS location
FROM customers;
-- Computed columns
SELECT name, price, price * 0.9 AS discounted_price
FROM products;
namepricediscounted_price
Laptop999.99899.99
Wireless Mouse29.9926.99
Python Book45.0040.50
Standing Desk349.99314.99
USB-C Cable12.9911.69

WHERE Clause — Filtering Rows

-- Simple comparison
SELECT name, price FROM products
WHERE price > 100;
nameprice
Laptop999.99
Standing Desk349.99
-- Multiple conditions
SELECT name, price, category FROM products
WHERE category = 'Electronics' AND price < 50;
namepricecategory
Wireless Mouse29.99Electronics
USB-C Cable12.99Electronics
-- IN operator
SELECT name, city FROM customers
WHERE city IN ('New York', 'Chicago');
-- BETWEEN operator
SELECT name, price FROM products
WHERE price BETWEEN 20 AND 100;
-- LIKE for pattern matching
SELECT name, email FROM customers
WHERE email LIKE '%@example.com';
-- NULL checks
SELECT name, city FROM customers
WHERE city IS NOT NULL;

ORDER BY and LIMIT

-- Sort by price descending, take top 3
SELECT name, price FROM products
ORDER BY price DESC
LIMIT 3;
nameprice
Laptop999.99
Standing Desk349.99
Python Book45.00
-- Pagination: page 2, 2 items per page
SELECT name, price FROM products
ORDER BY price DESC
LIMIT 2 OFFSET 2;
nameprice
Python Book45.00
Wireless Mouse29.99

DISTINCT

SELECT DISTINCT city FROM customers;
city
New York
Chicago
San Francisco

Aggregate Functions

Aggregate functions compute a single result from a set of rows.

FunctionDescriptionExample
COUNT()Number of rowsCOUNT(*), COUNT(column)
SUM()Sum of valuesSUM(price)
AVG()Average of valuesAVG(price)
MIN()Minimum valueMIN(price)
MAX()Maximum valueMAX(price)
SELECT
COUNT(*) AS total_products,
SUM(price) AS total_value,
AVG(price) AS avg_price,
MIN(price) AS cheapest,
MAX(price) AS most_expensive
FROM products;
total_productstotal_valueavg_pricecheapestmost_expensive
51437.96287.5912.99999.99

GROUP BY and HAVING

GROUP BY partitions rows into groups and applies aggregate functions to each group. HAVING filters groups after aggregation (like WHERE but for groups).

SELECT
category,
COUNT(*) AS num_products,
AVG(price) AS avg_price
FROM products
GROUP BY category;
categorynum_productsavg_price
Electronics3347.66
Books145.00
Furniture1349.99
-- Only categories with average price above 100
SELECT
category,
COUNT(*) AS num_products,
AVG(price) AS avg_price
FROM products
GROUP BY category
HAVING AVG(price) > 100;
categorynum_productsavg_price
Electronics3347.66
Furniture1349.99

JOINs

JOINs combine rows from two or more tables based on a related column. They are fundamental to working with relational databases.

Visual Overview

Table A Table B JOIN Types
┌─────┐ ┌─────┐
│ 1 │ │ 1 │ INNER JOIN: Only matching rows
│ 2 │ │ 2 │ LEFT JOIN: All from A + matches from B
│ 3 │ │ 3 │ RIGHT JOIN: All from B + matches from A
│ 4 │ │ 5 │ FULL OUTER JOIN: All from both
└─────┘ └─────┘ CROSS JOIN: Every combination (A x B)
INNER JOIN (A ∩ B) LEFT JOIN RIGHT JOIN FULL OUTER JOIN
┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ ┌───┐ ┌───┐ │ │ ┌───────┐ │ │ ┌───────┐ │ │ ┌───────────┐│
│ │ A │█│ B │ │ │ │ A ████│ │ │ │████ B │ │ │ │ A ██████ B ││
│ └───┘ └───┘ │ │ └───────┘ │ │ └───────┘ │ │ └───────────┘│
└─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘
Rows: 1, 2, 3 Rows: 1, 2, 3, 4 Rows: 1, 2, 3, 5 Rows: 1, 2, 3, 4, 5

INNER JOIN

Returns only rows that have matching values in both tables.

SELECT
c.name AS customer,
o.order_id,
o.order_date,
o.status
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id;
customerorder_idorder_datestatus
Alice Johnson12024-01-15completed
Alice Johnson22024-02-20completed
Bob Smith32024-03-10pending

LEFT JOIN (LEFT OUTER JOIN)

Returns all rows from the left table, and matching rows from the right table. Non-matching rows get NULL for the right table columns.

SELECT
c.name AS customer,
o.order_id,
o.status
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id;
customerorder_idstatus
Alice Johnson1completed
Alice Johnson2completed
Bob Smith3pending
Carol LeeNULLNULL
David KimNULLNULL
-- Find customers who have never placed an order
SELECT c.name
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
WHERE o.order_id IS NULL;
name
Carol Lee
David Kim

RIGHT JOIN (RIGHT OUTER JOIN)

Returns all rows from the right table, and matching rows from the left table. This is the mirror of LEFT JOIN.

SELECT
c.name AS customer,
o.order_id,
o.status
FROM customers c
RIGHT JOIN orders o ON c.customer_id = o.customer_id;

In practice, most developers prefer LEFT JOIN and rearrange the table order rather than using RIGHT JOIN.

FULL OUTER JOIN

Returns all rows from both tables. Non-matching rows on either side get NULL values.

SELECT
c.name AS customer,
o.order_id
FROM customers c
FULL OUTER JOIN orders o ON c.customer_id = o.customer_id;

CROSS JOIN

Produces the Cartesian product — every row from the first table paired with every row from the second table. Use cautiously as output size = rows_A x rows_B.

-- Generate a report template: every customer x every product
SELECT c.name, p.name AS product
FROM customers c
CROSS JOIN products p;
-- Result: 4 customers x 5 products = 20 rows

SELF JOIN

A table joined with itself. Useful for hierarchical data or comparing rows within the same table.

-- Find customers in the same city
SELECT
a.name AS customer_1,
b.name AS customer_2,
a.city
FROM customers a
INNER JOIN customers b
ON a.city = b.city
AND a.customer_id < b.customer_id;
customer_1customer_2city
Alice JohnsonCarol LeeNew York

Multi-Table JOINs

Real queries often join three or more tables:

-- Full order details: customer, order, items, products
SELECT
c.name AS customer,
o.order_id,
o.order_date,
p.name AS product,
oi.quantity,
oi.unit_price,
(oi.quantity * oi.unit_price) AS line_total
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id
INNER JOIN order_items oi ON o.order_id = oi.order_id
INNER JOIN products p ON oi.product_id = p.product_id
ORDER BY o.order_id, p.name;
customerorder_idorder_dateproductquantityunit_priceline_total
Alice Johnson12024-01-15Laptop1999.99999.99
Alice Johnson12024-01-15Wireless Mouse129.9929.99
Alice Johnson22024-02-20Python Book245.0090.00
Bob Smith32024-03-10Standing Desk1349.99349.99
Bob Smith32024-03-10USB-C Cable312.9938.97

ORM Equivalents

Here is how a common multi-table query looks in SQL versus popular ORMs:

SELECT
c.name AS customer,
COUNT(o.order_id) AS total_orders,
COALESCE(SUM(oi.quantity * oi.unit_price), 0) AS total_spent
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
LEFT JOIN order_items oi ON o.order_id = oi.order_id
GROUP BY c.customer_id, c.name
ORDER BY total_spent DESC;

Subqueries

A subquery is a SELECT statement nested inside another query. They come in three forms based on what they return.

Scalar Subquery (returns a single value)

-- Products priced above the average
SELECT name, price
FROM products
WHERE price > (SELECT AVG(price) FROM products);
nameprice
Laptop999.99
Standing Desk349.99

Row / Table Subquery (returns a set of values)

-- Customers who have placed at least one order
SELECT name, email
FROM customers
WHERE customer_id IN (
SELECT DISTINCT customer_id
FROM orders
);
nameemail
Alice Johnsonalice@example.com
Bob Smithbob@example.com

Correlated Subquery

A correlated subquery references the outer query, executing once per outer row:

-- For each customer, find their most recent order date
SELECT
c.name,
(SELECT MAX(o.order_date)
FROM orders o
WHERE o.customer_id = c.customer_id
) AS last_order_date
FROM customers c;
namelast_order_date
Alice Johnson2024-02-20
Bob Smith2024-03-10
Carol LeeNULL
David KimNULL

EXISTS

The EXISTS operator tests whether a subquery returns any rows:

-- Customers with at least one completed order
SELECT c.name
FROM customers c
WHERE EXISTS (
SELECT 1
FROM orders o
WHERE o.customer_id = c.customer_id
AND o.status = 'completed'
);
name
Alice Johnson

Common Table Expressions (CTEs)

CTEs create named temporary result sets that exist only for the duration of the query. They make complex queries more readable and maintainable.

WITH customer_totals AS (
SELECT
c.customer_id,
c.name,
COUNT(o.order_id) AS num_orders,
COALESCE(SUM(oi.quantity * oi.unit_price), 0) AS total_spent
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
LEFT JOIN order_items oi ON o.order_id = oi.order_id
GROUP BY c.customer_id, c.name
),
spending_ranks AS (
SELECT
name,
num_orders,
total_spent,
CASE
WHEN total_spent > 500 THEN 'High Value'
WHEN total_spent > 0 THEN 'Regular'
ELSE 'Inactive'
END AS customer_tier
FROM customer_totals
)
SELECT * FROM spending_ranks
ORDER BY total_spent DESC;
namenum_orderstotal_spentcustomer_tier
Alice Johnson21119.98High Value
Bob Smith1388.96Regular
Carol Lee00.00Inactive
David Kim00.00Inactive

Recursive CTEs

Recursive CTEs are powerful for hierarchical data like org charts or category trees:

-- Example: employee hierarchy
WITH RECURSIVE org_chart AS (
-- Base case: top-level manager
SELECT employee_id, name, manager_id, 1 AS level
FROM employees
WHERE manager_id IS NULL
UNION ALL
-- Recursive case: find direct reports
SELECT e.employee_id, e.name, e.manager_id, oc.level + 1
FROM employees e
INNER JOIN org_chart oc ON e.manager_id = oc.employee_id
)
SELECT
REPEAT(' ', level - 1) || name AS org_tree,
level
FROM org_chart
ORDER BY level, name;

Window Functions

Window functions perform calculations across a set of rows that are related to the current row, without collapsing the result into a single row (unlike GROUP BY).

Syntax

function_name(args) OVER (
[PARTITION BY column1, column2, ...]
[ORDER BY column3, column4, ...]
[frame_clause]
)

ROW_NUMBER, RANK, and DENSE_RANK

SELECT
name,
category,
price,
ROW_NUMBER() OVER (ORDER BY price DESC) AS row_num,
RANK() OVER (ORDER BY price DESC) AS rank,
DENSE_RANK() OVER (ORDER BY price DESC) AS dense_rank
FROM products;
namecategorypricerow_numrankdense_rank
LaptopElectronics999.99111
Standing DeskFurniture349.99222
Python BookBooks45.00333
Wireless MouseElectronics29.99444
USB-C CableElectronics12.99555

PARTITION BY

PARTITION BY creates separate windows for each group:

-- Rank products within each category by price
SELECT
name,
category,
price,
ROW_NUMBER() OVER (
PARTITION BY category
ORDER BY price DESC
) AS rank_in_category
FROM products;
namecategorypricerank_in_category
Python BookBooks45.001
LaptopElectronics999.991
Wireless MouseElectronics29.992
USB-C CableElectronics12.993
Standing DeskFurniture349.991

LAG and LEAD

Access values from previous or subsequent rows:

-- Compare each product's price to the next cheaper product
SELECT
name,
price,
LAG(price, 1) OVER (ORDER BY price DESC) AS prev_higher_price,
LEAD(price, 1) OVER (ORDER BY price DESC) AS next_lower_price,
price - LEAD(price, 1) OVER (ORDER BY price DESC) AS price_gap
FROM products;
namepriceprev_higher_pricenext_lower_priceprice_gap
Laptop999.99NULL349.99650.00
Standing Desk349.99999.9945.00304.99
Python Book45.00349.9929.9915.01
Wireless Mouse29.9945.0012.9917.00
USB-C Cable12.9929.99NULLNULL

Running Totals with SUM OVER

-- Running total of order item revenue, ordered by date
SELECT
o.order_date,
p.name AS product,
oi.quantity * oi.unit_price AS line_total,
SUM(oi.quantity * oi.unit_price) OVER (
ORDER BY o.order_date
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS running_total
FROM order_items oi
INNER JOIN orders o ON oi.order_id = o.order_id
INNER JOIN products p ON oi.product_id = p.product_id
ORDER BY o.order_date;
order_dateproductline_totalrunning_total
2024-01-15Laptop999.99999.99
2024-01-15Wireless Mouse29.991029.98
2024-02-20Python Book90.001119.98
2024-03-10Standing Desk349.991469.97
2024-03-10USB-C Cable38.971508.94

Window Function ORM Equivalents

SELECT
name,
category,
price,
RANK() OVER (PARTITION BY category ORDER BY price DESC) AS category_rank
FROM products;

SQL Execution Order

Understanding the logical execution order of a SQL query helps you write correct queries and debug issues:

1. FROM / JOIN ← Identify source tables and join them
2. WHERE ← Filter individual rows
3. GROUP BY ← Group rows together
4. HAVING ← Filter groups
5. SELECT ← Choose columns and compute expressions
6. DISTINCT ← Remove duplicate rows
7. ORDER BY ← Sort the result set
8. LIMIT / OFFSET ← Restrict the number of rows returned

Next Steps