How to Run an Effective RFC Process in Your Engineering Team
Introduction
"Why did we build it this way?"
Six months from now, someone will ask this question. Maybe it's a new hire. Maybe it's you, having forgotten the context. Maybe it's the team lead investigating why the system is struggling.
If you have an RFC process, the answer is documented—along with the alternatives you considered, the tradeoffs you accepted, and the context that shaped the decision. If you don't, you're left with git archaeology and Slack searches that turn up nothing.
An RFC (Request for Comments) process is how engineering teams make important technical decisions collaboratively and transparently. Done well, it improves decision quality, distributes knowledge, and creates an institutional memory that outlasts any individual. Done poorly, it becomes bureaucratic overhead that slows everything down.
This guide covers how to implement an RFC process that actually works—one that improves decisions without killing velocity.
What Is an RFC Process?
RFC: REQUEST FOR COMMENTS
════════════════════════════════════════════════════════════════════
A structured process for proposing, discussing, and deciding on
significant technical changes before implementation begins.
┌─────────────────────────────────────┐
│ │
│ PROBLEM / IDEA │
│ │
└─────────────────┬───────────────────┘
│
▼
┌─────────────────────────────────────┐
│ │
│ WRITE RFC │
│ (proposal, alternatives, │
│ tradeoffs, plan) │
│ │
└─────────────────┬───────────────────┘
│
▼
┌─────────────────────────────────────┐
│ │
│ REVIEW PERIOD │
│ (comments, questions, │
│ suggestions, debate) │
│ │
└─────────────────┬───────────────────┘
│
▼
┌─────────────────────────────────────┐
│ │
│ DECISION │
│ (approve, reject, revise, │
│ defer) │
│ │
└─────────────────┬───────────────────┘
│
▼
┌─────────────────────────────────────┐
│ │
│ IMPLEMENTATION │
│ (RFC becomes reference │
│ during and after) │
│ │
└─────────────────────────────────────┘
Why RFCs Matter
┌─────────────────────────────────────────────────────────────────────┐
│ BENEFITS OF AN RFC PROCESS │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ BETTER DECISIONS │
│ ───────────────────────────────────────────────────────────────── │
│ • More perspectives before committing │
│ • Alternatives explicitly considered │
│ • Hidden assumptions surface early │
│ • Cross-team impacts identified │
│ • Reduces "I wish we had thought of that" moments │
│ │
│ KNOWLEDGE DISTRIBUTION │
│ ───────────────────────────────────────────────────────────────── │
│ • Team learns about changes before they happen │
│ • Senior engineers can guide without blocking │
│ • New hires learn architectural context │
│ • Expertise spreads beyond individuals │
│ │
│ INSTITUTIONAL MEMORY │
│ ───────────────────────────────────────────────────────────────── │
│ • Decisions documented with context │
│ • "Why did we do this?" is answerable │
│ • Onboarding is faster │
│ • Reduces repeated debates │
│ │
│ ALIGNMENT │
│ ───────────────────────────────────────────────────────────────── │
│ • Stakeholders have input before commitment │
│ • Reduces surprises during implementation │
│ • Creates shared understanding │
│ • Conflicts surface early, when cheap to resolve │
│ │
│ INCLUSION │
│ ───────────────────────────────────────────────────────────────── │
│ • Async-friendly for distributed teams │
│ • Written format helps non-native speakers │
│ • Introverts can contribute thoughtfully │
│ • Everyone can participate, not just the loudest │
│ │
└─────────────────────────────────────────────────────────────────────┘
When to Write an RFC
Not everything needs an RFC. The art is knowing when to use the process and when to skip it.
The RFC Threshold
WHEN TO WRITE AN RFC:
════════════════════════════════════════════════════════════════════
DEFINITELY RFC
══════════════
• New service or major component
• Public API changes
• Breaking changes
• Cross-team dependencies
• New technology adoption
• Security-critical changes
• Performance-critical design
• Significant refactoring
• Process changes
PROBABLY NOT RFC
════════════════
• Bug fixes
• Small features
• Implementation details
• Reversible changes
• Local optimizations
• Routine maintenance
• Changes within team scope
• Clear best practices
• Urgent fixes
THE DECISION MATRIX:
════════════════════════════════════════════════════════════════════
HIGH IMPACT
│
┌─────────────────┼─────────────────┐
│ │ │
│ RFC │ RFC │
│ REQUIRED │ REQUIRED │
│ │ │
│ High impact, │ High impact, │
│ hard to │ easy to │
│ reverse │ reverse │
LOW │ │ │ HIGH
REVERSIBILITY──────────────────────────────── REVERSIBILITY
│ │ │
│ CONSIDER │ NO RFC │
│ RFC │ NEEDED │
│ │ │
│ Might want │ Easy to │
│ alignment │ change later │
│ │ │
└─────────────────┼─────────────────┘
│
LOW IMPACT
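The matrix above can be written down as a small lookup. This is a minimal sketch, not a prescribed tool: the `Level` enum and the `rfc_recommendation` function are hypothetical names chosen for illustration, and the two-level scale simply mirrors the matrix's axes.

```python
from enum import Enum


class Level(Enum):
    LOW = "low"
    HIGH = "high"


def rfc_recommendation(impact: Level, reversibility: Level) -> str:
    """Map the two axes of the decision matrix to a recommendation."""
    if impact is Level.HIGH:
        # Top row of the matrix: high impact calls for an RFC
        # regardless of how reversible the change is.
        return "RFC required"
    if reversibility is Level.LOW:
        # Bottom-left: low impact, but hard to undo.
        return "Consider RFC"
    # Bottom-right: low impact and easy to change later.
    return "No RFC needed"
```

For example, `rfc_recommendation(Level.LOW, Level.LOW)` returns `"Consider RFC"`. Real decisions are rarely this binary, so treat the output as a starting point for a conversation.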
Practical Thresholds
┌─────────────────────────────────────────────────────────────────────┐
│ PRACTICAL RFC TRIGGERS │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ SCOPE TRIGGERS (any of these): │
│ □ Affects more than one team │
│ □ Changes a public API or contract │
│ □ Introduces a new external dependency │
│ □ Takes more than 2 weeks to implement │
│ □ Creates ongoing maintenance burden │
│ │
│ RISK TRIGGERS (any of these): │
│ □ Could cause data loss or corruption │
│ □ Has security implications │
│ □ Affects system reliability/availability │
│ □ Hard to reverse once deployed │
│ □ Involves significant cost (infra, licenses) │
│ │
│ ORGANIZATIONAL TRIGGERS (any of these): │
│ □ Sets precedent for future decisions │
│ □ Involves technology not currently used │
│ □ Requires buy-in from multiple stakeholders │
│ □ Has been a topic of debate │
│ □ Someone asks "should we RFC this?" │
│ │
│ WHEN IN DOUBT: │
│ Write a lightweight RFC. Takes 30 minutes, saves hours of debate. │
│ │
└─────────────────────────────────────────────────────────────────────┘
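If you want the trigger list in a pull request template or a bot, it can be encoded as data. The sketch below covers only a subset of the triggers from the box above; the `Change` dataclass, its field names, and the helper functions are assumptions made for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class Change:
    # Scope triggers
    affects_multiple_teams: bool = False
    changes_public_api: bool = False
    adds_external_dependency: bool = False
    implementation_weeks: float = 0.0
    # Risk triggers
    risks_data_loss: bool = False
    has_security_implications: bool = False
    hard_to_reverse: bool = False
    # Organizational triggers
    sets_precedent: bool = False
    someone_asked_for_rfc: bool = False


def fired_triggers(change: Change) -> list[str]:
    """Return the names of all triggers that apply to this change."""
    fired = []
    for f in fields(change):
        value = getattr(change, f.name)
        # Only the boolean fields are yes/no triggers.
        if isinstance(value, bool) and value:
            fired.append(f.name)
    # "Takes more than 2 weeks to implement" is a threshold, not a flag.
    if change.implementation_weeks > 2:
        fired.append("takes_more_than_2_weeks")
    return fired


def needs_rfc(change: Change) -> bool:
    """Any single trigger is enough, matching "any of these" above."""
    return bool(fired_triggers(change))
```

Returning the list of fired triggers, rather than a bare yes or no, gives the author something concrete to cite in the RFC's Motivation section.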
The RFC Template
A good template guides authors without being bureaucratic.
Standard RFC Template
# RFC: [Title]
**RFC ID:** RFC-NNNN
**Author(s):** [Names]
**Status:** Draft | In Review | Approved | Implemented | Rejected | Deferred | Withdrawn | Superseded
**Created:** YYYY-MM-DD
**Last Updated:** YYYY-MM-DD
**Review Deadline:** YYYY-MM-DD
---
## Summary
[2-3 sentences describing what this RFC proposes. A busy reader should
understand the core proposal from this section alone.]
## Motivation
[Why are we doing this? What problem does it solve? What's the current
state and why is it insufficient? Include data where possible.]
### Goals
- [Primary goal]
- [Secondary goal]
### Non-Goals
- [What this RFC explicitly does NOT address]
- [Scope boundaries]
## Proposal
[Detailed description of the proposed solution. Include:]
### Overview
[High-level description of the approach]
### Detailed Design
[Technical details, architecture, interfaces, etc.]
### Data Model
[If applicable: schemas, data flow, storage]
### API Changes
[If applicable: new endpoints, changed contracts]
### Migration Strategy
[How do we get from here to there?]
## Alternatives Considered
### Alternative 1: [Name]
[Description]
**Pros:**
- [Advantage]
**Cons:**
- [Disadvantage]
**Why not chosen:**
[Explanation]
### Alternative 2: [Name]
[Similar structure]
### Do Nothing
[What happens if we don't do this?]
## Tradeoffs and Drawbacks
[What are the downsides of this proposal? Be honest. Every decision
has costs.]
- [Tradeoff 1]
- [Tradeoff 2]
## Risks
| Risk | Likelihood | Impact | Mitigation |
|------|------------|--------|------------|
| [Risk 1] | High/Med/Low | High/Med/Low | [How we'll handle it] |
| [Risk 2] | ... | ... | ... |
## Security Considerations
[Security implications. If none, state "No significant security
implications identified."]
## Privacy Considerations
[Privacy implications, especially for user data.]
## Operational Considerations
[How does this affect operations? Monitoring, alerting, runbooks,
on-call burden.]
## Dependencies
[What does this depend on? What depends on this?]
- [Dependency 1]
- [Dependency 2]
## Timeline
[Rough implementation timeline. Not a commitment, but a sense of scale.]
| Phase | Description | Duration |
|-------|-------------|----------|
| Phase 1 | ... | X weeks |
| Phase 2 | ... | Y weeks |
## Open Questions
[Unresolved questions that need input during review.]
1. [Question 1]
2. [Question 2]
## References
- [Link to related RFCs]
- [Link to external resources]
- [Link to relevant documentation]
---
## Appendix
[Optional: detailed diagrams, extended examples, data that supports
the proposal but would clutter the main text.]
Lightweight RFC Template
For smaller decisions that still benefit from documentation:
# RFC: [Title]
**Author:** [Name]
**Status:** Draft | Approved | Rejected
**Date:** YYYY-MM-DD
## Context
[What's the situation? Why are we making a decision?]
## Decision
[What are we going to do?]
## Alternatives Considered
[What else did we consider and why didn't we choose it?]
## Consequences
[What are the implications of this decision?]
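Either template lends itself to a simple completeness check before review starts. The following is a rough sketch: which sections count as required is an assumption (the lists below are one possible choice), and `missing_sections` is a hypothetical helper, not part of any existing tool.

```python
import re

# Assumed "must have" sections for each template; adjust to taste.
FULL_SECTIONS = [
    "Summary",
    "Motivation",
    "Proposal",
    "Alternatives Considered",
    "Tradeoffs and Drawbacks",
    "Risks",
    "Open Questions",
]
LIGHTWEIGHT_SECTIONS = [
    "Context",
    "Decision",
    "Alternatives Considered",
    "Consequences",
]


def missing_sections(markdown: str, required: list[str]) -> list[str]:
    """Return required level-2 headings ("## Title") absent from the text."""
    found = {
        match.group(1).strip()
        for match in re.finditer(r"^## (.+)$", markdown, flags=re.MULTILINE)
    }
    return [section for section in required if section not in found]
```

A check like this only confirms that the headings exist. Whether the Alternatives section contains genuine alternatives is still a job for reviewers.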
The RFC Lifecycle
State Machine
RFC LIFECYCLE:
════════════════════════════════════════════════════════════════════
┌─────────────┐
│ │
│ DRAFT │
│ │
└──────┬──────┘
│
Author ready for review
│
▼
┌─────────────┐
│ │
┌─────────│ IN REVIEW │─────────┐
│ │ │ │
│ └──────┬──────┘ │
│ │ │
Major issues Discussion No significant
found complete issues
│ │ │
▼ ▼ ▼
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ │ │ │ │ │
│ REVISION │ │ DEFERRED │ │ APPROVED │
│ REQUESTED │ │ │ │ │
│ │ │ │ │ │
└──────┬──────┘ └─────────────┘ └──────┬──────┘
│ │
│ Implementation
│ begins
│ │
│ ▼
│ ┌─────────────┐
│ │ │
└─────────────────────────│ IMPLEMENTED │
(after revision) │ │
└──────┬──────┘
│
Time passes,
new RFC supersedes
│
▼
┌─────────────┐
│ │
│ SUPERSEDED │
│ │
└─────────────┘
Also possible:
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ WITHDRAWN │ │ REJECTED │ │ ABANDONED │
│ (by author)│ │ (by review)│ │ (no action)│
└─────────────┘ └─────────────┘ └─────────────┘
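The lifecycle above can be enforced with a small transition table. This sketch makes several assumptions the diagram does not spell out: that a revised RFC goes back to In Review, that a deferred RFC can be reopened for review, and that Withdrawn, Rejected, Abandoned, and Superseded are terminal. `Status`, `ALLOWED`, and `transition` are illustrative names.

```python
from enum import Enum


class Status(Enum):
    DRAFT = "Draft"
    IN_REVIEW = "In Review"
    REVISION_REQUESTED = "Revision Requested"
    DEFERRED = "Deferred"
    APPROVED = "Approved"
    IMPLEMENTED = "Implemented"
    SUPERSEDED = "Superseded"
    WITHDRAWN = "Withdrawn"
    REJECTED = "Rejected"
    ABANDONED = "Abandoned"


# Allowed moves between states. States missing from this table are terminal.
ALLOWED = {
    Status.DRAFT: {Status.IN_REVIEW, Status.WITHDRAWN, Status.ABANDONED},
    Status.IN_REVIEW: {
        Status.REVISION_REQUESTED,
        Status.DEFERRED,
        Status.APPROVED,
        Status.REJECTED,
        Status.WITHDRAWN,
    },
    Status.REVISION_REQUESTED: {
        Status.IN_REVIEW,
        Status.WITHDRAWN,
        Status.ABANDONED,
    },
    Status.DEFERRED: {Status.IN_REVIEW, Status.ABANDONED},
    Status.APPROVED: {Status.IMPLEMENTED},
    Status.IMPLEMENTED: {Status.SUPERSEDED},
}


def transition(current: Status, new: Status) -> Status:
    """Return the new status, or raise if the move is not allowed."""
    if new not in ALLOWED.get(current, set()):
        raise ValueError(
            f"Cannot move RFC from {current.value} to {new.value}"
        )
    return new
```

Keeping the table as plain data makes it easy to adapt when your team's process differs, for example if approved RFCs can still be withdrawn.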
Timeline Guidelines
TYPICAL RFC TIMELINE:
════════════════════════════════════════════════════════════════════
Week 1: DRAFTING
────────────────────────────────────────────────────────────────────
Day 1-3: Author writes initial draft
Day 3-4: Author gets early feedback from 1-2 trusted reviewers
Day 4-5: Author revises based on early feedback
Day 5: RFC submitted for formal review
Week 2: REVIEW
────────────────────────────────────────────────────────────────────
Day 6: RFC announced to relevant stakeholders
Day 6-10: Review period (comments, questions, discussion)
Day 10: Author addresses comments, revises as needed
Week 3: DECISION
────────────────────────────────────────────────────────────────────
Day 11-12: Final review of revisions
Day 13: Decision meeting (if needed) or async approval
Day 14: RFC approved or specific changes requested
Day 15: Final version published
VARIATIONS BY RFC SIZE:
════════════════════════════════════════════════════════════════════
Small RFC (minor change):
• Draft: 1-2 days
• Review: 3-5 days
• Total: ~1 week
Standard RFC (typical feature/change):
• Draft: 3-5 days
• Review: 5-7 days
• Total: ~2 weeks
Large RFC (major system change):
• Draft: 1-2 weeks
• Review: 2-3 weeks
• Total: ~1 month
Critical RFC (breaking change, new architecture):
• Draft: 2-4 weeks
• Review: 2-4 weeks
• Total: 1-2 months
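To turn these ranges into a concrete Review Deadline for the template header, you can count working days from the submission date. This is a sketch under stated assumptions: it uses the upper bound of each review window, treats a week as five working days, and ignores public holidays. The names `REVIEW_DAYS`, `add_working_days`, and `review_deadline` are hypothetical.

```python
from datetime import date, timedelta

# Upper bound of each review window above, in working days (assumed).
REVIEW_DAYS = {"small": 5, "standard": 7, "large": 15, "critical": 20}


def add_working_days(start: date, days: int) -> date:
    """Advance from start by the given number of Monday-Friday days."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday is 0, Friday is 4
            remaining -= 1
    return current


def review_deadline(submitted: date, size: str) -> date:
    """Return the date on which review closes for an RFC of this size."""
    return add_working_days(submitted, REVIEW_DAYS[size])
```

A small RFC submitted on Monday 2024-01-15 would close on Monday 2024-01-22 under these assumptions.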
Writing Effective RFCs
The Writing Process
BEFORE YOU WRITE:
════════════════════════════════════════════════════════════════════
1. VALIDATE THE PROBLEM
─────────────────────
□ Is this actually a problem worth solving?
□ Do others agree it's a problem?
□ Is now the right time?
□ Is an RFC the right vehicle?
2. DO YOUR HOMEWORK
─────────────────
□ Research existing solutions
□ Understand current system state
□ Identify stakeholders
□ Gather relevant data
□ Explore alternatives (don't just validate your first idea)
3. GET EARLY INPUT
────────────────
□ Discuss with 1-2 knowledgeable people
□ Identify obvious gaps in your thinking
□ Learn about constraints you might not know
WHILE YOU WRITE:
════════════════════════════════════════════════════════════════════
1. START WITH THE SUMMARY
───────────────────────
Write the summary first. If you can't summarize it in 2-3
sentences, you don't understand it well enough yet.
2. BE SPECIFIC, NOT VAGUE
───────────────────────
✗ "This will improve performance"
✓ "This reduces p99 latency from 500ms to 100ms for /api/search"
✗ "We'll add some caching"
✓ "We'll add a Redis cache layer with 15-minute TTL for user
profile data, keyed by user_id"
3. SHOW YOUR WORK
───────────────
Include the reasoning, not just the conclusion.
Readers should understand WHY, not just WHAT.
4. BE HONEST ABOUT TRADEOFFS
──────────────────────────
Every decision has costs. Acknowledge them.
If you can't think of any, you haven't thought hard enough.
5. ADDRESS THE OBVIOUS OBJECTIONS
───────────────────────────────
If you know someone will ask "why not X?", address it
proactively in Alternatives Considered.
Common Writing Mistakes
┌─────────────────────────────────────────────────────────────────────┐
│ RFC WRITING ANTI-PATTERNS │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ THE RUBBER STAMP │
│ ───────────────────────────────────────────────────────────────── │
│ Problem: RFC written after decision already made, just for │
│ documentation. No real input wanted. │
│ Sign: "We're already building this, here's the RFC" │
│ Fix: Write RFC before commitment. If urgent, acknowledge it. │
│ │
│ THE NOVEL │
│ ───────────────────────────────────────────────────────────────── │
│ Problem: 20-page RFC that nobody reads. │
│ Sign: Eyes glaze over, no comments, rubber-stamp approval. │
│ Fix: Ruthlessly edit. Put details in appendix. Summary matters. │
│ │
│ THE SINGLE OPTION │
│ ───────────────────────────────────────────────────────────────── │
│ Problem: Alternatives section is "do nothing or do my idea." │
│ Sign: No genuine alternatives explored. │
│ Fix: Seriously consider at least 2-3 real alternatives. │
│ │
│ THE SALES PITCH │
│ ───────────────────────────────────────────────────────────────── │
│ Problem: All benefits, no drawbacks. Cherry-picked data. │
│ Sign: Sounds too good to be true. │
│ Fix: Add honest Tradeoffs section. Include counter-evidence. │
│ │
│ THE VAGUE HAND-WAVE │
│ ───────────────────────────────────────────────────────────────── │
│ Problem: "We'll figure out the details later." │
│ Sign: Key design questions unanswered. │
│ Fix: If you can't be specific, you're not ready to RFC. │
│ │
│ THE KITCHEN SINK │
│ ───────────────────────────────────────────────────────────────── │
│ Problem: RFC tries to solve multiple unrelated problems. │
│ Sign: Scope creep, long debates, can't get consensus. │
│ Fix: Split into multiple focused RFCs. │
│ │
│ THE ORPHAN │
│ ───────────────────────────────────────────────────────────────── │
│ Problem: RFC submitted and author disappears. │
│ Sign: Comments go unanswered, review stalls. │
│ Fix: Author must actively shepherd through review. │
│ │
└─────────────────────────────────────────────────────────────────────┘
The Review Process
Who Should Review
REVIEWER SELECTION:
════════════════════════════════════════════════════════════════════
REQUIRED REVIEWERS (must approve):
────────────────────────────────────
• Domain expert(s) for affected area
• Tech lead of owning team
• Security reviewer (if security-relevant)
• On-call representative (if operational impact)
OPTIONAL REVIEWERS (informed, can comment):
────────────────────────────────────────────
• All engineers (open for comment)
• Dependent teams
• Platform/infrastructure team
• Product/design (if user-facing)
ESCALATION REVIEWERS (for significant decisions):
─────────────────────────────────────────────────
• Architecture team
• Engineering leadership
• Staff+ engineers
TIPS FOR GETTING GOOD REVIEWS:
════════════════════════════════════════════════════════════════════
1. TAG SPECIFIC PEOPLE
Generic "please review" gets ignored. Name names.
2. EXPLAIN WHY THEM
"Alice, I'd value your input on the caching strategy
given your experience with Redis at scale."
3. SET EXPECTATIONS
"I need approval from Bob and Carol. Would appreciate
feedback from others by Friday."
4. MAKE IT EASY
Provide context. Link to relevant code/docs.
Don't make reviewers do archaeology.
How to Review
RFC REVIEW CHECKLIST:
════════════════════════════════════════════════════════════════════
UNDERSTANDING (First pass)
□ Do I understand what is being proposed?
□ Is the problem well-defined?
□ Does the summary accurately reflect the proposal?
COMPLETENESS (Second pass)
□ Are all sections adequately addressed?
□ Are there obvious gaps in the design?
□ Are dependencies identified?
□ Is the migration strategy realistic?
ALTERNATIVES (Critical thinking)
□ Were alternatives genuinely considered?
□ Is there a better approach not mentioned?
□ Does "do nothing" have a fair assessment?
FEASIBILITY (Reality check)
□ Is the timeline realistic?
□ Are the technical claims accurate?
□ Are there hidden complexities?
□ Does this match the team's capacity?
RISK (What could go wrong)
□ Are risks adequately identified?
□ Are mitigations realistic?
□ What's the worst-case scenario?
□ Is this reversible if it doesn't work?
OPERATIONS (Living with it)
□ How does this affect on-call?
□ Is it observable/debuggable?
□ What's the maintenance burden?
BROADER IMPACT (Ecosystem)
□ How does this affect other teams?
□ Does this align with architectural direction?
□ Are there precedent implications?
Giving Good Feedback
┌─────────────────────────────────────────────────────────────────────┐
│ HOW TO GIVE RFC FEEDBACK │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ BE SPECIFIC │
│ ───────────────────────────────────────────────────────────────── │
│ ✗ "This seems risky" │
│ ✓ "The migration strategy doesn't account for the case where │
│ jobs are in-flight during the cutover. What happens to them?" │
│ │
│ BE CONSTRUCTIVE │
│ ───────────────────────────────────────────────────────────────── │
│ ✗ "This won't work" │
│ ✓ "I think this approach will struggle with X. Have you │
│ considered Y as an alternative?" │
│ │
│ CATEGORIZE YOUR FEEDBACK │
│ ───────────────────────────────────────────────────────────────── │
│ Use prefixes to indicate severity: │
│ │
│ [BLOCKING] Must be addressed before approval │
│ [MAJOR] Significant concern, should discuss │
│ [MINOR] Suggestion, take it or leave it │
│ [NIT] Trivial (typo, formatting) │
│ [QUESTION] Need clarification to evaluate │
│ │
│ SEPARATE WHAT FROM HOW │
│ ───────────────────────────────────────────────────────────────── │
│ Challenge the proposal if warranted, but don't dictate │
│ implementation details unless necessary. │
│ │
│ ACKNOWLEDGE GOOD WORK │
│ ───────────────────────────────────────────────────────────────── │
│ If the RFC is well-written, say so. If a tricky problem is │
│ elegantly solved, recognize it. │
│ │
│ BE TIMELY │
│ ───────────────────────────────────────────────────────────────── │
│ Late feedback is worse than no feedback. If you can't review │
│ by deadline, say so early. │
│ │
└─────────────────────────────────────────────────────────────────────┘
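The severity prefixes are easy to process mechanically, which helps when an RFC collects dozens of comments. The sketch below assumes each comment is a plain string that starts with one of the prefixes listed above; `classify`, `summarize`, and `ready_for_approval` are illustrative names, and the input is assumed to contain only comments that are still open.

```python
import re
from collections import Counter

SEVERITIES = ("BLOCKING", "MAJOR", "MINOR", "NIT", "QUESTION")
PREFIX = re.compile(r"^\[(%s)\]\s*(.*)$" % "|".join(SEVERITIES), re.DOTALL)


def classify(comment: str) -> tuple[str, str]:
    """Split a comment into (severity, body)."""
    text = comment.strip()
    match = PREFIX.match(text)
    if match is None:
        # No recognized prefix: flag it so the reviewer can be asked.
        return ("UNLABELED", text)
    return (match.group(1), match.group(2))


def summarize(comments: list[str]) -> Counter:
    """Count open comments per severity."""
    return Counter(classify(comment)[0] for comment in comments)


def ready_for_approval(open_comments: list[str]) -> bool:
    """An RFC is ready once no open comment is marked [BLOCKING]."""
    return summarize(open_comments)["BLOCKING"] == 0
```

Surfacing unlabeled comments separately nudges reviewers toward the habit of stating severity up front.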
Receiving Feedback
AS THE AUTHOR, WHEN RECEIVING FEEDBACK:
════════════════════════════════════════════════════════════════════
1. ASSUME GOOD INTENT
───────────────────
Reviewers are trying to help, even when critical.
They took time to engage with your work.
2. DON'T BE DEFENSIVE
───────────────────
✗ "You don't understand what I'm saying"
✓ "Let me clarify that section—I can see how it's confusing"
3. ENGAGE WITH THE SUBSTANCE
──────────────────────────
Address the concern, not the tone.
If feedback is unclear, ask clarifying questions.
4. KNOW WHEN TO PUSH BACK
───────────────────────
You don't have to accept every suggestion.
"I considered that, but chose this approach because..."
is perfectly valid.
5. TRACK AND CLOSE COMMENTS
─────────────────────────
Respond to every comment, even if just "Acknowledged, updated."
Don't leave reviewers wondering if you saw their feedback.
6. SUMMARIZE CHANGES
──────────────────
When you revise, note what changed:
"Updated based on feedback: added migration rollback plan,
clarified caching TTL strategy, addressed API versioning."
Decision Making
Who Decides?
DECISION AUTHORITY MODELS:
════════════════════════════════════════════════════════════════════
MODEL 1: CONSENSUS
──────────────────
All required reviewers must approve.
Pros:
• Everyone aligned
• No one feels overruled
Cons:
• Can take forever
• One blocker halts everything
• Encourages watered-down compromises
Best for: Small teams, high-trust environments
MODEL 2: DESIGNATED DECIDER
───────────────────────────
One person has final authority (tech lead, architect, etc.)
Others advise, decider decides.
Pros:
• Clear accountability
• Faster resolution
• Can break ties
Cons:
• Single point of failure
• Could ignore good feedback
• Decider bottleneck
Best for: Hierarchical orgs, urgent decisions
MODEL 3: LAZY CONSENSUS
───────────────────────
Proposal passes unless someone objects within review period.
Silence = consent.
Pros:
• Low overhead for uncontroversial changes
• Doesn't require active approval from everyone
Cons:
• Important RFCs can slip through
• Requires people to actually read
Best for: Open source, high-volume RFC environments
MODEL 4: TIERED AUTHORITY
─────────────────────────
Different decision levels for different RFC impact.
Small RFC: Team lead approves
Standard RFC: Tech lead + one domain expert
Large RFC: Architecture review board
Critical RFC: Engineering leadership
Best for: Larger organizations, varying RFC significance
RECOMMENDED FOR MOST TEAMS: TIERED AUTHORITY
With designated decider as fallback for disagreements.
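The tiered model reduces to a lookup from RFC size to the roles that must sign off. Here is a minimal sketch, assuming roles are tracked as plain strings; `REQUIRED_APPROVERS` and both functions are hypothetical, and the role names are taken from the tier list above.

```python
# Roles that must approve at each tier, following the list above.
REQUIRED_APPROVERS = {
    "small": {"team lead"},
    "standard": {"tech lead", "domain expert"},
    "large": {"architecture review board"},
    "critical": {"engineering leadership"},
}


def outstanding_approvals(size: str, approved_by: set[str]) -> set[str]:
    """Return the roles that still need to approve an RFC of this size."""
    return REQUIRED_APPROVERS[size] - approved_by


def is_approved(size: str, approved_by: set[str]) -> bool:
    """True once every required role for the tier has approved."""
    return not outstanding_approvals(size, approved_by)
```

The designated-decider fallback is deliberately left out: it is a judgment call made by a person when reviewers disagree, not something to automate.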
Breaking Deadlocks
WHEN REVIEWERS DISAGREE:
════════════════════════════════════════════════════════════════════
Step 1: CLARIFY THE DISAGREEMENT
────────────────────────────────
What specifically do parties disagree on?
• Technical facts? (Can be resolved with evidence)
• Predictions? (Can be resolved with experiments/spikes)
• Values/priorities? (Need escalation)
• Misunderstanding? (Can be resolved with discussion)
Step 2: FACILITATE DISCUSSION
─────────────────────────────
Bring parties together (sync or async).
Focus on:
• What are the concerns on each side?
• What evidence would change your mind?
• Is there a compromise that addresses both concerns?
Step 3: TIME-BOX THE DEBATE
───────────────────────────
"We need to decide by [date]. If we can't reach agreement,
we'll escalate to [decider]."
Prevents endless debate.
Step 4: ESCALATE IF NEEDED
──────────────────────────
If no resolution:
1. Document both positions clearly
2. Escalate to designated decider
3. Decider makes call, documents reasoning
4. Everyone commits to decision
COMMON DEADLOCK BREAKERS:
════════════════════════════════════════════════════════════════════
"Let's prototype both approaches and compare"
→ Evidence over opinion
"What would make this reversible if we're wrong?"
→ Reduces stakes of decision
"Can we start with X and build toward Y later?"
→ Incremental path forward
"What's the cost of being wrong with each option?"
→ Asymmetric risk assessment
"If we can't decide, what does the tie-breaker think?"
→ Clear escalation
After Approval
POST-APPROVAL CHECKLIST:
════════════════════════════════════════════════════════════════════
□ Update RFC status to "Approved"
□ Add approval date and approvers
□ Announce decision to relevant stakeholders
□ Create implementation tickets/epics
□ Link implementation work to RFC
□ Schedule any necessary kick-off meetings
□ Update any dependent documentation
□ Close the review discussion
DURING IMPLEMENTATION:
════════════════════════════════════════════════════════════════════
□ Reference RFC in relevant PRs
□ Update RFC if significant deviations occur
□ Track open questions from RFC as they're resolved
□ Note any decisions that differ from RFC
AFTER IMPLEMENTATION:
════════════════════════════════════════════════════════════════════
□ Update RFC status to "Implemented"
□ Add link to actual implementation
□ Note any significant deviations from plan
□ Optional: Write retrospective on prediction accuracy
RFC Infrastructure
Where to Store RFCs
RFC STORAGE OPTIONS:
════════════════════════════════════════════════════════════════════
OPTION 1: REPOSITORY (Recommended for most teams)
─────────────────────────────────────────────────
rfcs/
├── README.md # Index and process guide
├── 0000-template.md # RFC template
├── 0001-api-versioning.md
├── 0002-event-bus-migration.md
├── 0003-auth-redesign.md
└── archive/
└── 0000-rejected-example.md
Pros:
✓ Version controlled
✓ Review via PR process
✓ Lives with code
✓ Markdown + diagrams
✓ Easy to link from code/commits
Cons:
✗ Less discoverable than wiki
✗ Search can be limited
OPTION 2: WIKI/NOTION/CONFLUENCE
────────────────────────────────
Pros:
✓ Easy to browse
✓ Rich formatting
✓ Better search
✓ Non-engineers can access
Cons:
✗ No version control (usually)
✗ Can get messy over time
✗ Separated from code
OPTION 3: DEDICATED TOOL (Almanac, Slab, etc.)
──────────────────────────────────────────────
Pros:
✓ Purpose-built features
✓ Templates, workflows
✓ Better collaboration
Cons:
✗ Another tool to maintain
✗ Vendor lock-in
✗ Cost
RECOMMENDATION:
Start with a repository. Simple, free, version-controlled.
Move to dedicated tool if volume justifies it.
RFC Numbering and Naming
NUMBERING SCHEMES:
════════════════════════════════════════════════════════════════════
SEQUENTIAL (Simple):
RFC-001, RFC-002, RFC-003...
Pros: Simple, unique
Cons: No metadata in number
DATE-BASED:
RFC-2024-001, RFC-2024-002...
Pros: Shows when written
Cons: Numbers restart each year
CATEGORY-BASED:
RFC-ARCH-001, RFC-API-001, RFC-INFRA-001...
Pros: Grouped by area
Cons: Category debates, harder to find
RECOMMENDATION:
Sequential with descriptive filename:
0042-implement-rate-limiting.md
The number ensures uniqueness.
The name ensures discoverability.
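Picking the next number by hand invites collisions when two authors start RFCs on the same day. A small helper can derive the filename from the files already in the directory. This is a sketch that assumes four-digit prefixes as in the example above; `slugify` and `next_rfc_filename` are hypothetical names.

```python
import re


def slugify(title: str) -> str:
    """Lowercase the title and join its words with hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    return slug.strip("-")


def next_rfc_filename(existing: list[str], title: str) -> str:
    """Build "NNNN-slug.md" using the next free sequential number."""
    numbers = [
        int(match.group(1))
        for name in existing
        if (match := re.match(r"^(\d{4})-", name))
    ]
    next_number = max(numbers, default=0) + 1
    return f"{next_number:04d}-{slugify(title)}.md"
```

With `0000-template.md` and `0041-caching-strategy.md` already present, the title "Implement Rate Limiting" yields `0042-implement-rate-limiting.md`. Two open pull requests can still claim the same number, so the final number is best confirmed at merge time.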
RFC Index
Maintain an index for discoverability:
# RFC Index
## Active RFCs
| RFC | Title | Status | Author | Updated |
|-----|-------|--------|--------|---------|
| [0042](./0042-rate-limiting.md) | Implement Rate Limiting | Approved | @alice | 2024-01-15 |
| [0043](./0043-event-sourcing.md) | Event Sourcing for Orders | In Review | @bob | 2024-01-18 |
## By Area
### API
- [0012](./0012-api-versioning.md) - API Versioning Strategy
- [0042](./0042-rate-limiting.md) - Rate Limiting
### Infrastructure
- [0008](./0008-k8s-migration.md) - Kubernetes Migration
- [0035](./0035-observability.md) - Observability Stack
## Recently Decided
| RFC | Decision | Date |
|-----|----------|------|
| [0041](./0041-caching-strategy.md) | Approved | 2024-01-10 |
| [0040](./0040-graphql.md) | Rejected | 2024-01-05 |
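An index maintained by hand drifts quickly, so it is worth generating. The sketch below renders the "Active RFCs" table from structured metadata; how that metadata is collected (for example, by parsing each RFC's header) is left out, and `RfcEntry` and `render_active_table` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class RfcEntry:
    number: int
    slug: str      # filename part after the number, without ".md"
    title: str
    status: str
    author: str    # handle without the leading "@"
    updated: str   # ISO date, YYYY-MM-DD


def render_active_table(entries: list[RfcEntry]) -> str:
    """Render the Active RFCs table in the index format shown above."""
    lines = [
        "| RFC | Title | Status | Author | Updated |",
        "|-----|-------|--------|--------|---------|",
    ]
    for entry in sorted(entries, key=lambda e: e.number):
        number = f"{entry.number:04d}"
        link = f"[{number}](./{number}-{entry.slug}.md)"
        lines.append(
            f"| {link} | {entry.title} | {entry.status} "
            f"| @{entry.author} | {entry.updated} |"
        )
    return "\n".join(lines)
```

Running this in CI on every merge to the RFC directory keeps the index in step with the files without anyone having to remember.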
Scaling the Process
For Growing Teams
RFC PROCESS AT DIFFERENT SCALES:
════════════════════════════════════════════════════════════════════
SMALL TEAM (5-15 engineers)
───────────────────────────
• Informal process, low overhead
• Everyone reviews everything (or can)
• Tech lead is default decider
• Review in team Slack channel
• Weekly: discuss open RFCs in team meeting
MEDIUM TEAM (16-50 engineers)
─────────────────────────────
• More formal process
• Designated reviewers per RFC
• Area/domain leads for decisions
• Dedicated RFC channel or tool
• Monthly: RFC review meeting
• Consider lightweight RFCs for smaller decisions
LARGE ORGANIZATION (51+ engineers)
──────────────────────────────────
• Tiered RFC process (team/org/company level)
• Architecture review board for cross-cutting
• RFC champions/shepherds
• Tooling for tracking and discovery
• Quarterly: process retrospective
• Consider separate tracks (technical, process, etc.)
SIGNS YOU NEED TO EVOLVE:
════════════════════════════════════════════════════════════════════
Scale UP if:
• Important decisions happening without visibility
• Same debates recurring
• Cross-team changes causing surprises
• New hires struggling to understand decisions
Scale DOWN if:
• RFCs sitting unreviewed for weeks
• Process feels like bureaucracy
• People avoiding RFC process
• Trivial things getting RFC'd
Avoiding Bureaucracy
┌─────────────────────────────────────────────────────────────────────┐
│ KEEPING RFC PROCESS LIGHTWEIGHT │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ SET CLEAR THRESHOLDS │
│ ───────────────────────────────────────────────────────────────── │
│ Not everything needs an RFC. Be explicit about what does. │
│ When in doubt, use the lightweight template, not the full one. │
│ │
│ HAVE A LIGHTWEIGHT OPTION │
│ ───────────────────────────────────────────────────────────────── │
│ One-pager RFCs for smaller decisions. │
│ Full RFC template for bigger ones. │
│ │
│ TIME-BOX REVIEWS │
│ ───────────────────────────────────────────────────────────────── │
│ "Review closes in 5 days. Silence = consent." │
│ Prevents RFCs lingering forever. │
│ │
│ AUTOMATE ADMINISTRIVIA │
│ ───────────────────────────────────────────────────────────────── │
│ • Auto-assign reviewers based on area │
│ • Auto-update index │
│ • Reminders for stale RFCs │
│ • Templates that fill in boilerplate │
│ │
│ RETROSPECT REGULARLY │
│ ───────────────────────────────────────────────────────────────── │
│ Quarterly: Is the process helping or hindering? │
│ Adjust based on feedback. │
│ │
│ MAKE IT VALUABLE │
│ ───────────────────────────────────────────────────────────────── │
│ If people don't see value, they'll route around the process. │
│ Ensure RFCs actually influence decisions. │
│ Reference them during implementation. │
│ Celebrate good RFCs. │
│ │
└─────────────────────────────────────────────────────────────────────┘
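Of the automation ideas above, stale-RFC reminders are probably the cheapest to build. This sketch assumes you can obtain the last-activity date for each RFC in review (from your repository host or wiki) and pass it in as a dictionary; the five-day default echoes the example review window, and `stale_rfcs` is a hypothetical helper.

```python
from datetime import date


def stale_rfcs(
    last_activity: dict[str, date],
    today: date,
    max_idle_days: int = 5,
) -> list[str]:
    """Return IDs of in-review RFCs idle for more than max_idle_days."""
    return sorted(
        rfc_id
        for rfc_id, last_seen in last_activity.items()
        if (today - last_seen).days > max_idle_days
    )
```

The result can feed a daily message to the RFC channel that names the author and required reviewers for each stale item.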
Common Pitfalls and Solutions
┌─────────────────────────────────────────────────────────────────────┐
│ RFC PROCESS PITFALLS │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ PITFALL: RFCs Never Get Reviewed │
│ ───────────────────────────────────────────────────────────────── │
│ Symptoms: RFCs sit for weeks, authors frustrated │
│ Causes: No clear reviewers, no deadline, no accountability │
│ Fixes: │
│ • Assign specific reviewers (not "the team") │
│ • Set review deadline upfront │
│ • Make review part of sprint work │
│ • Track review metrics (time to first review) │
│ │
│ PITFALL: Bike-shedding │
│ ───────────────────────────────────────────────────────────────── │
│ Symptoms: Hours debating minor details, big issues ignored │
│ Causes: Minor details easier to have opinions on │
│ Fixes: │
│ • Categorize comments (blocking vs nit) │
│ • Author can dismiss nits │
│ • Facilitator redirects to substance │
│ • Time-box debates │
│ │
│ PITFALL: RFC-in-Name-Only │
│ ───────────────────────────────────────────────────────────────── │
│ Symptoms: Decisions already made, RFC is rubber stamp │
│ Causes: Urgency, culture of "ask forgiveness" │
│ Fixes: │
│ • Call it out: "Are we actually open to change here?" │
│ • For urgent cases, acknowledge in RFC │
│ • If repeated, address cultural issue │
│ │
│ PITFALL: Analysis Paralysis │
│ ───────────────────────────────────────────────────────────────── │
│ Symptoms: Endless revisions, fear of committing │
│ Causes: Perfectionism, unclear decision criteria │
│ Fixes: │
│ • "Good enough" is fine—RFCs can be revised │
│ • Set decision deadline │
│ • Ask "what would change our decision?" │
│ • Prototype instead of debating │
│ │
│ PITFALL: RFCs Become Outdated │
│ ───────────────────────────────────────────────────────────────── │
│ Symptoms: RFC says one thing, code does another │
│ Causes: Implementation diverged, no one updated │
│ Fixes: │
│ • Note significant deviations in RFC │
│ • Link RFC from code ("See RFC-042") │
│ • Mark outdated RFCs as superseded │
│ • Accept that RFCs are point-in-time, not living docs │
│ │
│ PITFALL: Senior Dominance │
│ ───────────────────────────────────────────────────────────────── │
│ Symptoms: Junior voices ignored, seniors override without debate │
│ Causes: Power dynamics, implicit authority │
│ Fixes: │
│ • Anonymous feedback option │
│ • Explicitly invite junior input │
│ • Seniors speak last in discussions │
│ • Evaluate ideas, not sources │
│ │
│ PITFALL: Process for Process's Sake │
│ ───────────────────────────────────────────────────────────────── │
│ Symptoms: RFCs for trivial things, overhead exceeds value │
│ Causes: Over-broad RFC requirements, fear of skipping │
│ Fixes: │
│ • Clear threshold: "You DON'T need RFC for..." │
│ • Lightweight option for smaller decisions │
│ • Retrospect and adjust │
│ │
└─────────────────────────────────────────────────────────────────────┘
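The "time to first review" metric suggested under the first pitfall is cheap to compute once you record when an RFC entered review and when the first comment arrived. A minimal sketch, assuming a hypothetical record format (the field names and sample data are illustrative and not tied to any particular tool):

```python
from datetime import datetime
from statistics import median

# Hypothetical records: when each RFC entered review and when the first
# review comment arrived (None means nobody has reviewed it yet).
rfcs = [
    {"id": "RFC-001", "submitted": datetime(2024, 1, 8), "first_review": datetime(2024, 1, 9)},
    {"id": "RFC-002", "submitted": datetime(2024, 1, 10), "first_review": datetime(2024, 1, 17)},
    {"id": "RFC-003", "submitted": datetime(2024, 1, 15), "first_review": None},
]


def days_to_first_review(rfc):
    """Return whole days between submission and first review, or None if unreviewed."""
    if rfc["first_review"] is None:
        return None
    return (rfc["first_review"] - rfc["submitted"]).days


def review_report(records):
    """Summarize the median wait and list RFCs still waiting for a first review."""
    waits = [d for d in map(days_to_first_review, records) if d is not None]
    unreviewed = [r["id"] for r in records if r["first_review"] is None]
    return {"median_days": median(waits) if waits else None, "unreviewed": unreviewed}


report = review_report(rfcs)
```

The list of unreviewed RFCs is often more actionable than the median: it names the proposals that need a reviewer assigned today.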
Cultural Aspects
Making RFCs Part of Culture
HOW TO BUILD RFC CULTURE:
════════════════════════════════════════════════════════════════════
1. LEADERSHIP PARTICIPATION
─────────────────────────
Tech leads and senior engineers should:
• Write RFCs themselves (not just review)
• Reference RFCs in discussions
• Praise good RFCs publicly
• Follow the process even when they could skip it
2. CELEBRATE GOOD RFCs
────────────────────
• Highlight well-written RFCs in team meetings
• Share particularly good examples
• Thank authors for thorough analysis
• Recognize good reviews too
3. MAKE IT SAFE TO PROPOSE
────────────────────────
• Rejected RFCs are not failures
• "Good idea, wrong time" is a valid outcome
• Encourage junior engineers to RFC
• Separate idea from author in critique
4. REFERENCE RFCs REGULARLY
─────────────────────────
• Link from code comments
• Cite in PR descriptions
• Reference in discussions
• Use for onboarding
5. CLOSE THE LOOP
───────────────
• Did the RFC predictions hold?
• What did we learn?
• Should the process change?
Encouraging Participation
┌─────────────────────────────────────────────────────────────────────┐
│ GETTING PEOPLE TO ENGAGE WITH RFCs │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ FOR AUTHORS: │
│ ───────────────────────────────────────────────────────────────── │
│ • Lower the bar for first RFC │
│ • Pair with experienced author │
│ • Provide good examples │
│ • Give constructive feedback, not just critique │
│ • Recognize effort publicly │
│ │
│ FOR REVIEWERS: │
│ ───────────────────────────────────────────────────────────────── │
│ • Make review expectations clear (time commitment) │
│ • Include review in sprint work │
│ • Rotate review responsibility │
│ • Don't let same people always review │
│ • Acknowledge good reviews │
│ │
│ FOR EVERYONE: │
│ ───────────────────────────────────────────────────────────────── │
│ • Share context: why RFCs matter │
│ • Show value: "RFC-023 prevented X problem" │
│ • Make discoverable: easy to find and read │
│ • Keep process lightweight │
│ • Actually use RFCs in decisions │
│ │
└─────────────────────────────────────────────────────────────────────┘
Example RFC
Here's a complete example of a well-written RFC:
# RFC: Implement Rate Limiting for Public API
**RFC ID:** RFC-2024-042
**Author:** Alice Chen
**Status:** Approved
**Created:** 2024-01-10
**Last Updated:** 2024-01-18
**Review Deadline:** 2024-01-17
**Approvers:** @bob (API Lead), @carol (Security)
---
## Summary
Implement token bucket rate limiting for our public API to prevent
abuse, ensure fair usage, and protect system stability. Sustained limits
will be 600 requests/minute for free users, 1000/minute for standard,
and 5000/minute for premium.
## Motivation
We're seeing increasing API abuse:
- 3 incidents in past month where single clients consumed >50% capacity
- Customer complaints about degraded performance during abuse
- No mechanism to protect against runaway scripts or attacks
### Goals
- Protect system from abuse and runaway clients
- Ensure fair resource allocation across customers
- Provide clear feedback to clients hitting limits
- Enable tiered limits for different customer plans
### Non-Goals
- Per-endpoint rate limiting (future RFC)
- Request prioritization/queuing
- Cost-based rate limiting
## Proposal
### Overview
Implement rate limiting at the API gateway layer using token bucket
algorithm. Each API key gets a bucket that refills at a fixed rate.
┌─────────────────────────────────────────────────────┐
│                                                     │
│   Client Request                                    │
│        │                                            │
│        ▼                                            │
│   ┌──────────┐     ┌─────────────┐                  │
│   │   API    │────▶│ Rate Limit  │                  │
│   │ Gateway  │     │   Check     │                  │
│   └──────────┘     └──────┬──────┘                  │
│                           │                         │
│             ┌─────────────┴─────────────┐           │
│             │                           │           │
│             ▼                           ▼           │
│        Under Limit                 Over Limit       │
│             │                           │           │
│             ▼                           ▼           │
│      Process Request              429 Response      │
│                                   + Retry-After     │
│                                                     │
└─────────────────────────────────────────────────────┘
### Detailed Design
**Algorithm: Token Bucket**
Each client has a bucket with capacity C and refill rate R.
- Bucket starts full (C tokens)
- Each request consumes 1 token
- Bucket refills at rate R per second
- Request rejected if bucket empty
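
A minimal in-memory sketch of these four rules, for illustration only. The class name `TokenBucket` and the injectable `now` argument are assumptions of this sketch; a production version would keep this state in shared storage and update it atomically.

```python
import time


class TokenBucket:
    """In-memory token bucket with capacity C and refill rate R tokens/second."""

    def __init__(self, capacity, refill_rate, now=None):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.tokens = float(capacity)  # bucket starts full
        self.last_updated = time.monotonic() if now is None else now

    def allow(self, now=None):
        """Consume one token if available; return True if the request may proceed."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.last_updated)
        # Refill for the elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # bucket empty: the caller should respond with 429


# Usage with explicit timestamps so the behaviour is deterministic.
bucket = TokenBucket(capacity=3, refill_rate=1, now=0.0)
results = [bucket.allow(now=0.0) for _ in range(4)]  # three pass, the fourth is rejected
after_refill = bucket.allow(now=1.0)  # one token has refilled after one second
```

Refilling lazily on each request, rather than with a background timer, means only a token count and a timestamp need to be stored per client.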
**Configuration by Tier:**
| Tier | Bucket Size | Refill Rate | Effective Limit |
|------|-------------|-------------|-----------------|
| Free | 100 | 10/sec | 600/min sustained |
| Standard | 200 | 17/sec | ~1000/min sustained |
| Premium | 500 | 84/sec | ~5000/min sustained |
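
The "Effective Limit" column follows from the refill rate: once the initial burst is spent, a client can sustain only as many requests as tokens are refilled. A short sketch of that arithmetic, using the table's values (the helper names are hypothetical):

```python
# Tier values copied from the table: (bucket size, refill rate per second).
TIERS = {
    "Free": (100, 10),
    "Standard": (200, 17),
    "Premium": (500, 84),
}


def sustained_per_minute(refill_rate):
    """Long-run request rate once the initial burst allowance is spent."""
    return refill_rate * 60


def seconds_to_refill(bucket_size, refill_rate):
    """Time for an empty bucket to refill completely."""
    return bucket_size / refill_rate


sustained = {tier: sustained_per_minute(rate) for tier, (_, rate) in TIERS.items()}
free_refill_seconds = seconds_to_refill(*TIERS["Free"])
```

Because the refill rates are whole numbers, Standard and Premium work out to 1020 and 5040 requests per minute, slightly above the nominal 1000 and 5000.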
**Storage: Redis**
Rate limit state stored in Redis for:
- Distributed enforcement across API instances
- Sub-millisecond lookup
- Atomic increment operations
- Automatic expiry of inactive buckets
Key structure: `ratelimit:{api_key}`
Value: `{tokens}:{last_updated_timestamp}`
**Response Headers:**
All responses include:
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 847
X-RateLimit-Reset: 1705612800
Rate-limited responses (429):
Retry-After: 30
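
One way the gateway might assemble these headers, as a sketch. The function name and its parameters are hypothetical, and it assumes `X-RateLimit-Reset` is a Unix timestamp in seconds with `Retry-After` derived from it, which this RFC does not state explicitly.

```python
import math


def rate_limit_headers(limit, remaining, reset_epoch, now_epoch, allowed):
    """Build rate limit headers; Retry-After is added only for rejected requests."""
    headers = {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(0, remaining)),
        "X-RateLimit-Reset": str(reset_epoch),
    }
    if not allowed:
        # Seconds until the reset time, rounded up and never negative.
        headers["Retry-After"] = str(max(0, math.ceil(reset_epoch - now_epoch)))
    return headers


ok_headers = rate_limit_headers(1000, 847, 1705612800, 1705612770, allowed=True)
limited_headers = rate_limit_headers(1000, 0, 1705612800, 1705612770, allowed=False)
```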
### Migration Strategy
1. Week 1: Deploy in monitoring mode (log, don't enforce)
2. Week 2: Notify customers of upcoming limits
3. Week 3: Enforce with generous limits (2x final)
4. Week 4: Move to final limits
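
The four weeks can be expressed as configuration so that moving between phases is a config change rather than a deploy. A sketch under stated assumptions: the `RolloutPhase` type and its fields are hypothetical, and week 2 is assumed to keep week 1's log-only behaviour while customers are notified.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RolloutPhase:
    week: int
    enforce: bool            # False = monitoring mode (log, don't reject)
    limit_multiplier: float  # applied to the final per-tier limit


PHASES = [
    RolloutPhase(week=1, enforce=False, limit_multiplier=1.0),
    RolloutPhase(week=2, enforce=False, limit_multiplier=1.0),
    RolloutPhase(week=3, enforce=True, limit_multiplier=2.0),  # generous: 2x final
    RolloutPhase(week=4, enforce=True, limit_multiplier=1.0),
]


def decide(phase, over_limit):
    """Return 'allow', 'log_only', or 'reject' for a request in the given phase."""
    if not over_limit:
        return "allow"
    return "reject" if phase.enforce else "log_only"


def effective_limit(phase, final_limit):
    """Per-minute limit in force during this phase."""
    return int(final_limit * phase.limit_multiplier)
```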
## Alternatives Considered
### Alternative 1: Sliding Window
**Description:** Count requests in rolling time window.
**Pros:**
- Smoother limit enforcement
- No burst allowance
**Cons:**
- Higher memory usage (store timestamps)
- More complex implementation
**Why not chosen:** Token bucket handles bursts better and is
simpler to implement. Can revisit if burst handling becomes an issue.
### Alternative 2: Leaky Bucket
**Description:** Process requests at fixed rate, queue excess.
**Pros:**
- Smoothest traffic shaping
**Cons:**
- Requires request queuing
- Higher latency for queued requests
**Why not chosen:** We want fast rejection, not queuing.
### Do Nothing
Customer experience continues to degrade during abuse incidents.
Risk of system instability grows. Not acceptable.
## Tradeoffs and Drawbacks
- **Legitimate users may hit limits:** Mitigated by generous limits
and clear upgrade path.
- **Redis dependency:** Rate limiting is unavailable if Redis is down.
  The gateway will fail open (allow requests) rather than fail closed.
- **Operational complexity:** New system to monitor and maintain.
## Risks
| Risk | Likelihood | Impact | Mitigation |
|------|------------|--------|------------|
| Redis failure | Low | Medium | Fail-open, alert on-call |
| Limits too aggressive | Medium | High | Start generous, monitor |
| Customer complaints | Medium | Low | Clear docs, support ready |
## Security Considerations
- API keys must be validated before rate limit check (prevents
enumeration via rate limit responses)
- Rate limit state in Redis is not sensitive, but Redis should be
on a private network
## Operational Considerations
- New Redis cluster required (or reuse the existing one after evaluating its capacity)
- Dashboard for rate limit metrics
- Alert on: Redis latency, high rejection rate, limit changes
## Dependencies
- Redis 6.x+ (Lua scripting, used for atomic bucket updates, has been available since Redis 2.6)
- API gateway update (nginx module or custom code)
## Timeline
| Phase | Description | Duration |
|-------|-------------|----------|
| Implementation | Core rate limiting | 1 week |
| Testing | Load testing, edge cases | 1 week |
| Rollout | Monitoring → Enforcement | 2 weeks |
## Open Questions
1. ~~Should we allow limit increases via self-service?~~
Resolved: Yes, in billing portal. Implementation separate.
2. ~~Per-IP limiting for unauthenticated endpoints?~~
Resolved: Out of scope, separate RFC if needed.
## References
- [Token bucket algorithm](https://en.wikipedia.org/wiki/Token_bucket)
- [Stripe's rate limiting blog post](https://stripe.com/blog/rate-limiters)
- Previous incident reports: INC-234, INC-256, INC-271
---
## Revision History
- 2024-01-18: Approved by @bob, @carol
- 2024-01-16: Updated based on review feedback (added fail-open)
- 2024-01-10: Initial draft
Quick Reference
┌─────────────────────────────────────────────────────────────────────┐
│ RFC PROCESS QUICK REFERENCE │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ WHEN TO RFC │
│ ───────────────────────────────────────────────────────────────── │
│ ✓ Affects multiple teams │
│ ✓ Changes public APIs │
│ ✓ Introduces new dependencies │
│ ✓ Has security implications │
│ ✓ Hard to reverse │
│ ✗ Bug fixes, small features, implementation details │
│ │
│ RFC STRUCTURE │
│ ───────────────────────────────────────────────────────────────── │
│ 1. Summary (2-3 sentences) │
│ 2. Motivation (why are we doing this?) │
│ 3. Proposal (what are we doing?) │
│ 4. Alternatives (what else did we consider?) │
│ 5. Tradeoffs (what are the downsides?) │
│ │
│ AUTHOR RESPONSIBILITIES │
│ ───────────────────────────────────────────────────────────────── │
│ □ Get early feedback before formal review │
│ □ Be specific and concrete │
│ □ Genuinely explore alternatives │
│ □ Respond to all comments │
│ □ Shepherd through to decision │
│ │
│ REVIEWER RESPONSIBILITIES │
│ ───────────────────────────────────────────────────────────────── │
│ □ Review within deadline │
│ □ Be specific and constructive │
│ □ Categorize feedback (blocking/major/minor/nit) │
│ □ Focus on substance, not style │
│ │
│ LIFECYCLE │
│ ───────────────────────────────────────────────────────────────── │
│ Draft → In Review → Approved/Rejected → Implemented │
│ │
│ DECISION ESCALATION │
│ ───────────────────────────────────────────────────────────────── │
│ 1. Clarify disagreement │
│ 2. Facilitate discussion │
│ 3. Time-box debate │
│ 4. Escalate to designated decider │
│ │
│ SUCCESS METRICS │
│ ───────────────────────────────────────────────────────────────── │
│ • Time from draft to decision < 2 weeks (typical) │
│ • Most RFCs get substantive feedback │
│ • Decisions are referenced during implementation │
│ • Team finds process valuable, not burdensome │
│ │
└─────────────────────────────────────────────────────────────────────┘
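If RFC status lives in tooling rather than in people's heads, the lifecycle line in the quick reference can be enforced as a small state machine. A minimal sketch: the `Status` enum and `advance` helper are hypothetical names, and the In Review → Draft edge (for a "revise" outcome) is an assumption of this sketch.

```python
from enum import Enum


class Status(Enum):
    DRAFT = "Draft"
    IN_REVIEW = "In Review"
    APPROVED = "Approved"
    REJECTED = "Rejected"
    IMPLEMENTED = "Implemented"


# Allowed moves, following Draft → In Review → Approved/Rejected → Implemented.
TRANSITIONS = {
    Status.DRAFT: {Status.IN_REVIEW},
    Status.IN_REVIEW: {Status.APPROVED, Status.REJECTED, Status.DRAFT},
    Status.APPROVED: {Status.IMPLEMENTED},
    Status.REJECTED: set(),
    Status.IMPLEMENTED: set(),
}


def advance(current, target):
    """Return the new status, or raise ValueError if the move is not allowed."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target
```

Rejecting a jump straight from Draft to Approved is one lightweight guard against the "RFC-in-Name-Only" pitfall.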
Conclusion
An RFC process is a tool. Like any tool, it can be used well or poorly.
Used well, it:
- Improves decision quality through diverse input
- Creates shared understanding before implementation
- Builds institutional knowledge that outlasts individuals
- Gives everyone a voice in technical direction
Used poorly, it:
- Slows everything down without adding value
- Becomes a rubber stamp for decisions already made
- Creates bureaucracy people route around
- Frustrates authors and reviewers alike
The difference is in the details: clear thresholds, lightweight options, engaged reviewers, responsive authors, and a culture that values both velocity and thoughtfulness.
Start simple. One template. One place to store RFCs. One clear threshold for when to use them. Iterate from there based on what works for your team.
The goal isn't a perfect process. It's better decisions, documented for the future, made by people who feel heard. Everything else is details.