Technical Decision-Making
Technical decisions shape the trajectory of software systems for years. The best technical leaders do not just make good decisions — they create transparent, inclusive processes that build organizational knowledge and consensus. This page covers the frameworks and practices that lead to better technical decision-making.
Architecture Decision Records (ADRs)
An ADR is a short document that captures an important architectural decision along with its context and consequences. ADRs create an institutional memory of why things are the way they are.
Why ADRs Matter
- New team members understand the reasoning behind existing architecture
- Teams avoid re-debating settled decisions
- When context changes, teams can revisit decisions with full understanding of the original trade-offs
- Decision-making quality improves because the process forces clear thinking
ADR Template

    # ADR-0042: Use PostgreSQL as Primary Database

    ## Status
    Accepted (2025-01-15)

    ## Context
    We need to choose a primary database for our new order management
    system. The system requires:
    - ACID transactions for financial data
    - Complex queries across related entities
    - Support for JSON documents for flexible metadata
    - High availability and proven reliability
    - Team familiarity

    We considered: PostgreSQL, MySQL, MongoDB, CockroachDB.

    ## Decision
    We will use **PostgreSQL 16** as our primary database.

    ## Rationale
    - PostgreSQL supports both relational and JSON data models,
      eliminating the need for a separate document store
    - Our team has 3+ years of production PostgreSQL experience
    - The JSONB type provides flexible schema for order metadata
      while maintaining query performance
    - Strong ecosystem: pgvector for future ML features, PostGIS
      if we need geospatial
    - Proven at our scale (millions of orders/day at similar companies)

    ## Alternatives Considered

    ### MySQL 8
    - Pros: Familiar, widely used, good performance
    - Cons: Weaker JSON support, less capable query planner,
      fewer advanced features (CTEs, window functions were added later)

    ### MongoDB
    - Pros: Flexible schema, good developer experience
    - Cons: Weaker transaction support across collections,
      team would need training, harder to maintain data integrity
      for financial records

    ### CockroachDB
    - Pros: Distributed by default, PostgreSQL-compatible
    - Cons: Higher operational complexity, higher cost,
      overkill for our current scale

    ## Consequences
    - We accept the operational burden of managing PostgreSQL
      (backups, replication, upgrades)
    - We will use Flyway for schema migrations
    - We will need read replicas if read traffic exceeds capacity
      of a single node
    - Team members unfamiliar with PostgreSQL-specific features
      will need training on JSONB and advanced SQL

    ## References
    - [PostgreSQL 16 Release Notes](https://www.postgresql.org/docs/16/release-16.html)
    - ADR-0038: Data model requirements for order management

ADR Best Practices
| Practice | Description |
|---|---|
| Keep them short | 1-2 pages maximum; nobody reads long documents |
| Number them sequentially | ADR-0001, ADR-0002, etc. |
| Store them in the repo | Version-controlled, close to the code they describe |
| Make them immutable | Never rewrite an accepted ADR; create a new one that supersedes it and change only the old ADR's status |
| Include “alternatives considered” | Shows the decision was thoughtful, not arbitrary |
| Record the status | Proposed, Accepted, Deprecated, Superseded by ADR-XXXX |
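Several of the practices in the table above (sequential numbering, storing ADRs as files in the repo, immutability, recording status) are easy to automate. The following is a minimal sketch, not a standard tool: the `NNNN-slug.md` filename scheme and the names `next_adr_number`, `adr_filename`, `Adr`, and `supersede` are assumptions made for illustration.

```python
import re
from dataclasses import dataclass, replace

# Assumed (hypothetical) naming scheme: 0042-use-postgresql.md
ADR_FILENAME = re.compile(r"^(\d{4})-[a-z0-9-]+\.md$")


def next_adr_number(existing_filenames: list[str]) -> int:
    """Return the next sequential ADR number, ignoring files outside the scheme."""
    numbers = [
        int(match.group(1))
        for name in existing_filenames
        if (match := ADR_FILENAME.match(name))
    ]
    return max(numbers, default=0) + 1


def adr_filename(number: int, title: str) -> str:
    """Build a zero-padded, slugified filename from an ADR number and title."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"{number:04d}-{slug}.md"


@dataclass(frozen=True)  # frozen: an ADR record cannot be mutated in place
class Adr:
    number: int
    title: str
    status: str  # "Proposed", "Accepted", "Deprecated", or "Superseded by ADR-XXXX"


def supersede(old: Adr, new: Adr) -> Adr:
    """Return a copy of `old` in which only the status line has changed."""
    if old.status != "Accepted":
        raise ValueError("only an accepted ADR can be superseded")
    return replace(old, status=f"Superseded by ADR-{new.number:04d}")
```

Returning a modified copy rather than editing the record mirrors the immutability rule: the decision text of the old ADR stays as written, and only its status points to the replacement.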
ADR Status Lifecycle

    Proposed ──▶ Accepted ──▶ Deprecated
        │            │             │
        ▼            ▼             ▼
     Rejected    Superseded    (archived)
                 by ADR-XXX

The RFC Process
For larger, cross-team decisions, an RFC (Request for Comments) process provides a structured way to propose, discuss, and decide.
RFC vs ADR
| Aspect | ADR | RFC |
|---|---|---|
| Scope | Single team or component | Cross-team or organization-wide |
| Length | 1-2 pages | 3-10 pages with detailed design |
| Audience | Team members, future developers | Broader engineering organization |
| Discussion | Informal or in PR review | Formal review period with structured feedback |
| Timeline | Days | 1-3 weeks |
RFC Template

    # RFC: Migrate Authentication to OAuth 2.0 / OIDC

    **Author:** Jane Smith
    **Reviewers:** Auth team, Platform team, Security team
    **Status:** Open for comments (closes 2025-02-01)
    **Created:** 2025-01-15

    ## Summary
    Migrate our custom authentication system to an OAuth 2.0 / OpenID
    Connect-based architecture using Auth0 as our identity provider.

    ## Motivation
    - Our custom auth system has 3 known CVEs that require urgent patches
    - Password reset flow has a 15% failure rate
    - No support for MFA, SSO, or social login
    - Auth code is maintained by a single engineer (bus factor = 1)
    - SOC 2 compliance requires documented auth controls

    ## Detailed Design

    ### Architecture
    [Detailed architecture diagrams and descriptions]

    ### Migration Plan
    [Phase-by-phase migration strategy]

    ### API Changes
    [Breaking changes, deprecation timeline]

    ### Security Considerations
    [Threat model, token handling, session management]

    ## Alternatives Considered
    1. Fix the existing system
    2. Build our own OAuth server
    3. Use Keycloak (self-hosted)
    4. Use Auth0 (managed) ← proposed

    ## Open Questions
    1. How do we handle the 30-day token migration window?
    2. Should we support both old and new auth simultaneously?

    ## Timeline
    - Phase 1 (Feb): Auth0 setup and internal service migration
    - Phase 2 (Mar): Customer-facing app migration
    - Phase 3 (Apr): Deprecate old auth system

RFC Process Flow

    Author writes RFC
            │
            ▼
    Circulate for early feedback (1-2 trusted reviewers)
            │
            ▼
    Publish RFC (open for comments, 1-2 week period)
            │
            ▼
    Collect and address feedback
            │
            ▼
    Decision meeting (if needed)
            │
            ├──▶ Accepted → Begin implementation
            ├──▶ Rejected → Document reasons
            └──▶ Needs revision → Update and re-circulate

Evaluating Trade-offs
Every technical decision involves trade-offs. The best leaders make trade-offs explicit rather than leaving them implicit.
Common Trade-off Dimensions

                  Speed of Development
                          │
            ┌─────────────┼─────────────┐
            │             │             │
       Consistency ───────┼─────── Flexibility
            │             │             │
       Simplicity ────────┼─────── Power
            │             │             │
       Performance ───────┼─────── Maintainability
            │             │             │
       Control ───────────┼─────── Convenience
            │             │             │
            └─────────────┼─────────────┘
                          │
                   Long-term Cost

Decision Matrix
When comparing multiple options, use a weighted decision matrix:
| Criteria (weight) | Option A | Option B | Option C |
|---|---|---|---|
| Performance (5) | 4 (20) | 5 (25) | 3 (15) |
| Team familiarity (4) | 5 (20) | 2 (8) | 4 (16) |
| Maintainability (4) | 4 (16) | 3 (12) | 5 (20) |
| Cost (3) | 3 (9) | 4 (12) | 5 (15) |
| Ecosystem/community (3) | 5 (15) | 4 (12) | 3 (9) |
| Scalability (3) | 3 (9) | 5 (15) | 4 (12) |
| Total | 89 | 84 | 87 |
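Each weighted score in the matrix is the raw score multiplied by the criterion weight, and an option's total is the sum of its weighted scores. A small sketch that recomputes the totals from the weights and raw scores above; the function names `weighted_totals` and `rank` are illustrative, not part of any library.

```python
# Weights and raw 1-5 scores copied from the example matrix.
WEIGHTS = {
    "Performance": 5,
    "Team familiarity": 4,
    "Maintainability": 4,
    "Cost": 3,
    "Ecosystem/community": 3,
    "Scalability": 3,
}

SCORES = {
    "Option A": {"Performance": 4, "Team familiarity": 5, "Maintainability": 4,
                 "Cost": 3, "Ecosystem/community": 5, "Scalability": 3},
    "Option B": {"Performance": 5, "Team familiarity": 2, "Maintainability": 3,
                 "Cost": 4, "Ecosystem/community": 4, "Scalability": 5},
    "Option C": {"Performance": 3, "Team familiarity": 4, "Maintainability": 5,
                 "Cost": 5, "Ecosystem/community": 3, "Scalability": 4},
}


def weighted_totals(weights: dict[str, int],
                    scores: dict[str, dict[str, int]]) -> dict[str, int]:
    """Sum score * weight per option; every option must cover every criterion."""
    totals = {}
    for option, per_criterion in scores.items():
        if per_criterion.keys() != weights.keys():
            raise ValueError(f"{option} must be scored on every criterion")
        totals[option] = sum(weights[c] * s for c, s in per_criterion.items())
    return totals


def rank(totals: dict[str, int]) -> list[str]:
    """Return option names ordered from highest to lowest total."""
    return sorted(totals, key=totals.get, reverse=True)


totals = weighted_totals(WEIGHTS, SCORES)
print(totals)        # {'Option A': 89, 'Option B': 84, 'Option C': 87}
print(rank(totals))  # ['Option A', 'Option C', 'Option B']
```

Option A leads Option C by only two points, so it is worth re-running the calculation with alternative weights before treating the ranking as settled.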
(Score: 1-5, weighted score in parentheses)

Reversible vs Irreversible Decisions
Jeff Bezos categorizes decisions as “one-way doors” and “two-way doors”:
| Type | Characteristics | Approach |
|---|---|---|
| One-way door (Type 1) | Irreversible or very costly to reverse | Careful analysis, broad input, thorough documentation |
| Two-way door (Type 2) | Easily reversible, low switching cost | Decide quickly, experiment, course-correct |
Examples:
- One-way door: Choosing a primary database, public API design, programming language for a core system
- Two-way door: Internal library choice, CI/CD tool, code formatting rules, feature flag decisions
Build vs Buy
One of the most impactful decisions in software engineering is whether to build a solution in-house or adopt an existing one.
Decision Framework
| Build when | Buy when |
|---|---|
| Core differentiator | Commodity capability |
| Unique requirements | Standard requirements |
| Need full control | Vendor meets 80%+ of needs |
| Strong internal expertise | Faster time to market |
| Long-term cost advantage | Team should focus elsewhere |
| Regulatory/compliance needs | Operational burden too high |

Total Cost of Ownership Comparison

    Build Costs:                          Buy Costs:
    ────────────                          ──────────
    Initial development     $$$           License/subscription    $$
    Testing and QA          $$            Integration work        $$
    Documentation           $             Customization           $
    Ongoing maintenance     $$$/year      Training                $
    Security patches        $$/year       Vendor lock-in risk     $$
    Feature development     $$$/year      Ongoing fees            $$/year
    On-call/operations      $$/year       Migration cost (exit)   $$
    Knowledge continuity    $/year

    Total 3-year cost: $$$$$$             Total 3-year cost: $$$$
    (often 3-5x more than estimated)      (more predictable)

When Organizations Get It Wrong
| Mistake | Consequence |
|---|---|
| Building when you should buy | Engineering time wasted on non-differentiating work |
| Buying when you should build | Vendor lock-in, inability to customize, hidden costs |
| Not evaluating total cost | Hidden maintenance burden for build; hidden integration cost for buy |
| “Not invented here” syndrome | Rejecting good external solutions due to ego |
| Ignoring exit costs | Locked into a vendor with no migration plan |
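One way to avoid the "not evaluating total cost" and "ignoring exit costs" mistakes is to write the comparison down as one-time costs plus recurring costs over a fixed horizon. The sketch below assumes a three-year horizon, as in the comparison earlier in this section. Every monetary figure is a made-up placeholder, and `CostModel` and its `overrun` parameter are hypothetical names introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class CostModel:
    one_time: dict[str, float] = field(default_factory=dict)
    per_year: dict[str, float] = field(default_factory=dict)

    def total(self, years: int = 3, overrun: float = 1.0) -> float:
        """One-time costs plus recurring costs over `years`, scaled by `overrun`.

        `overrun` models estimates that turn out too low; 1.0 means
        the estimate is taken at face value.
        """
        base = sum(self.one_time.values()) + years * sum(self.per_year.values())
        return base * overrun


# Placeholder figures for illustration only, not real quotes or benchmarks.
build = CostModel(
    one_time={"initial development": 300_000, "testing and QA": 100_000,
              "documentation": 20_000},
    per_year={"maintenance": 120_000, "security patches": 40_000},
)
buy = CostModel(
    one_time={"integration": 80_000, "customization": 20_000,
              "training": 10_000, "exit migration": 60_000},
    per_year={"subscription": 90_000},
)

print(build.total())             # 900000.0
print(build.total(overrun=3.0))  # 2700000.0
print(buy.total())               # 440000.0
```

Including the exit migration as an explicit line item on the buy side keeps the cost of leaving the vendor visible from the start.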
Technology Radar
A technology radar (popularized by Thoughtworks) helps organizations track and communicate their stance on technologies.

       ┌──────────┐
       │  ADOPT   │      Use in production, recommended
      ┌┴──────────┴┐
      │   TRIAL    │     Worth exploring in non-critical projects
     ┌┴────────────┴┐
     │    ASSESS    │    Explore to understand, not for production
    ┌┴──────────────┴┐
    │      HOLD      │   Do not start new projects with this
    └────────────────┘

    Quadrants:
      1. Languages & Frameworks
      2. Tools
      3. Platforms
      4. Techniques

Example Technology Radar Entries
| Technology | Ring | Rationale |
|---|---|---|
| TypeScript | Adopt | Standard for all new frontend and Node.js projects |
| Rust | Trial | Evaluating for performance-critical services |
| Deno | Assess | Interesting runtime, monitoring ecosystem maturity |
| jQuery | Hold | Legacy; migrate to modern frameworks |
| PostgreSQL | Adopt | Primary database for OLTP workloads |
| GraphQL | Trial | Using in new mobile API; evaluating DX and performance |
| Microservices | Adopt | Standard architecture for new services (with caveats) |
| Monorepo (Turborepo) | Trial | Piloting with frontend team |
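A radar is easier to enforce when it lives as data that tooling can read. The sketch below models a few of the entries above and turns the ring descriptions into a simple check. The names `Ring`, `RadarEntry`, `group_by_ring`, and `allowed_in_new_production_project` are illustrative, and the policy function is one possible reading of the ring definitions rather than an official rule.

```python
from dataclasses import dataclass
from enum import Enum


class Ring(Enum):
    ADOPT = "Adopt"
    TRIAL = "Trial"
    ASSESS = "Assess"
    HOLD = "Hold"


@dataclass(frozen=True)
class RadarEntry:
    technology: str
    ring: Ring
    rationale: str


# A subset of the example entries from the table.
RADAR = [
    RadarEntry("TypeScript", Ring.ADOPT, "Standard for all new frontend and Node.js projects"),
    RadarEntry("Rust", Ring.TRIAL, "Evaluating for performance-critical services"),
    RadarEntry("Deno", Ring.ASSESS, "Interesting runtime, monitoring ecosystem maturity"),
    RadarEntry("jQuery", Ring.HOLD, "Legacy; migrate to modern frameworks"),
]


def group_by_ring(entries: list[RadarEntry]) -> dict[Ring, list[str]]:
    """Map every ring to the technologies currently placed in it."""
    grouped: dict[Ring, list[str]] = {ring: [] for ring in Ring}
    for entry in entries:
        grouped[entry.ring].append(entry.technology)
    return grouped


def allowed_in_new_production_project(ring: Ring, business_critical: bool) -> bool:
    """Interpret the ring descriptions as a yes/no policy (an assumption)."""
    if ring is Ring.ADOPT:
        return True
    if ring is Ring.TRIAL:
        return not business_critical  # trial only in non-critical projects
    return False  # Assess is not for production; Hold bars new projects
```

A check like this could run in a design-review checklist or a dependency audit, with the radar file itself versioned alongside the ADRs.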
Managing Technical Debt
Technical debt is the accumulated cost of shortcuts, deferred work, and evolving requirements that slow down future development.
Types of Technical Debt
Martin Fowler’s technical debt quadrant:

                    Deliberate             Inadvertent
             ┌─────────────────────┬─────────────────────┐
             │                     │                     │
    Prudent  │ "We know this is    │ "Now we know how    │
             │  a shortcut and     │  we should have     │
             │  will pay it back"  │  done it"           │
             │                     │                     │
             ├─────────────────────┼─────────────────────┤
             │                     │                     │
    Reckless │ "We don't have      │ "What's layered     │
             │  time for design"   │  architecture?"     │
             │                     │                     │
             └─────────────────────┴─────────────────────┘

Measuring Technical Debt
| Metric | What It Tells You |
|---|---|
| Cycle time increase | Time to deliver features is growing |
| Bug rate | Percentage of changes that introduce bugs |
| Code churn | Same files modified repeatedly (indicates unclear design) |
| Dependency age | How outdated are your dependencies |
| Test coverage gaps | Areas with no tests are likely to have hidden debt |
| Developer survey | Ask engineers: “How confident are you making changes in area X?” |
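Of the metrics above, code churn is among the simplest to compute once a list of changed files per commit is available (for example, exported from version control). A minimal sketch follows; the function name `churn_hotspots`, the `min_changes` threshold, and the sample file paths are hypothetical.

```python
from collections import Counter


def churn_hotspots(commits: list[list[str]],
                   min_changes: int = 3) -> list[tuple[str, int]]:
    """Return (path, commit_count) pairs for files touched in at least
    `min_changes` commits, most frequently changed first."""
    # set(files) counts a file once per commit even if it is listed twice.
    counts = Counter(path for files in commits for path in set(files))
    hotspots = [(path, n) for path, n in counts.items() if n >= min_changes]
    # Sort by descending count, then by path for a stable, readable order.
    return sorted(hotspots, key=lambda item: (-item[1], item[0]))


# Hypothetical history: each inner list is the set of files in one commit.
history = [
    ["auth/session.py", "auth/tokens.py"],
    ["auth/session.py"],
    ["billing/invoice.py", "auth/session.py"],
    ["auth/tokens.py", "auth/tokens.py"],
]
print(churn_hotspots(history, min_changes=2))
# [('auth/session.py', 3), ('auth/tokens.py', 2)]
```

High churn alone does not prove a design problem, since actively developed files also change often, so it is best read together with bug rate and the developer survey.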
Strategies for Managing Technical Debt
- Make it visible — Track tech debt items in the backlog with estimated cost
- Allocate time — Reserve 15-20 percent of sprint capacity for debt reduction
- Pay it incrementally — Fix debt as you touch related code (the “Boy Scout Rule”)
- Prioritize by impact — Focus on debt that slows down the most teams or the most critical paths
- Prevent accumulation — Code reviews, design reviews, and quality standards
- Communicate business impact — “This tech debt adds 2 weeks to every feature in the payments module”
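Communicating business impact usually comes down to a short payback calculation: time saved per change, multiplied by the number of planned changes, compared with the up-front investment. The sketch below uses illustrative figures and assumes a 5-day working week; `refactor_payback` is a hypothetical helper, not an established formula.

```python
from math import ceil


def refactor_payback(days_before: float, days_after: float,
                     planned_changes: int, investment_days: float,
                     days_per_week: int = 5) -> dict[str, float]:
    """Estimate gross and net savings, and the break-even point, of a refactor."""
    saved_per_change = days_before - days_after
    if saved_per_change <= 0:
        raise ValueError("the refactor must make each change faster")
    gross_days = saved_per_change * planned_changes
    return {
        "gross_weeks_saved": gross_days / days_per_week,
        "net_weeks_saved": (gross_days - investment_days) / days_per_week,
        # Number of changes after which the investment has paid for itself.
        "break_even_changes": ceil(investment_days / saved_per_change),
    }


# Illustrative inputs: a change takes 10 working days today and 3 afterwards,
# 10 changes are planned, and the refactor costs 30 working days.
print(refactor_payback(10, 3, planned_changes=10, investment_days=30))
# {'gross_weeks_saved': 14.0, 'net_weeks_saved': 8.0, 'break_even_changes': 5}
```

Reporting the net figure and the break-even point alongside the gross saving makes the proposal easier to compare with feature work competing for the same capacity.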
The Tech Debt Conversation with Product

    INEFFECTIVE:
      Engineer: "We need 3 sprints to refactor the authentication module."
      PM:       "Why? It works fine."
      Engineer: "Because the code is messy."
      PM:       "We have features to ship."

    EFFECTIVE:
      Engineer: "The authentication module currently takes 2 weeks for any
                 feature change. With a 3-sprint investment, we can reduce
                 that to 3 days. Over the next year, the 10 planned auth
                 features would take about 6 weeks instead of 20, saving
                 roughly 14 weeks of engineering time."
      PM:       "Let's schedule it for next quarter."

Communicating Technical Decisions
To Your Team
- Share the decision and rationale in a team meeting
- Publish the ADR in the repository
- Be open to questions and concerns
- Explain what changes in their day-to-day work
To Other Teams
- Send a summary to affected teams
- Highlight any breaking changes or migration needs
- Offer support during the transition
- Set clear timelines for deprecation
To Leadership
- Focus on business outcomes, not technical details
- Quantify the impact (cost savings, velocity improvement, risk reduction)
- Present options, not just your recommendation
- Be transparent about trade-offs and risks