AI-Powered Code Review and PR Analysis Architecture
Real-World Problem Context
A frontend team merges 20+ pull requests per day. Code reviews are a bottleneck: senior engineers spend hours reviewing PRs, catching bugs the author missed, suggesting better patterns, and enforcing coding standards. Many issues that could be flagged automatically — unused imports, missing error handling, accessibility violations, performance anti-patterns, security vulnerabilities — slip through human review fatigue. AI-powered code review tools (CodeRabbit, GitHub Copilot Code Review, Sourcery, Codium PR-Agent) analyze diffs, understand the intent of changes, and provide actionable review comments automatically. This post covers the architecture: how AI tools ingest PR diffs, build context from the full codebase, generate review comments with specific line references, detect categories of issues (bugs, security, performance, style), and integrate into GitHub/GitLab workflows as automated reviewers.
Problem Statements
- Diff Understanding: How does an AI reviewer parse a PR diff, understand what changed and why (the intent, not just the syntax), and generate contextually relevant review comments anchored to specific lines?
- Full-Codebase Context: A diff alone doesn't tell the full story; the AI needs to understand the existing codebase (types, APIs, conventions). How do AI review tools build this context without exceeding token limits?
- Actionable Feedback: How do you ensure AI review comments are specific, actionable, and low-noise (not flagging stylistic preferences or obvious patterns) so developers trust and act on them?
Deep Dive: Internal Mechanisms
1. PR Review Pipeline Architecture
/*
* End-to-end AI code review flow:
*
* PR opened/updated on GitHub
* │
* ▼
* Webhook → AI Review Service
* │
* ▼
* 1. Fetch PR diff (changed files, hunks)
* 2. Fetch full file content (before + after)
* 3. Fetch related files (imports, tests, types)
* 4. Build context & chunk by file
* │
* ▼
* 5. Per-file analysis:
* ┌────────────────────────────┐
* │ Bug detection │
* │ Security vulnerability scan│
* │ Performance anti-patterns │
* │ Style/convention checks │
* │ Missing error handling │
* │ Accessibility issues │
* │ Type safety concerns │
* └────────────────────────────┘
* │
* ▼
* 6. Post-process:
* - De-duplicate findings
* - Filter low-confidence comments
* - Rank by severity
* - Map to exact diff lines
* │
* ▼
* 7. Post review comments via GitHub API
* - Inline comments on specific lines
* - Summary comment with overview
* - Approval/request-changes status
*/
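The webhook entry point at the top of this pipeline needs a filter so that only relevant events kick off a (costly) review run. A minimal sketch of that gate, with illustrative event names matching GitHub's `pull_request` webhook payload (`shouldTriggerReview` and the draft/bot checks are assumptions, not a specific tool's API):

```javascript
// Decide whether an incoming GitHub webhook event should start a review.
// Only non-draft, human-authored PR open/update events qualify.
const REVIEW_ACTIONS = new Set(['opened', 'synchronize', 'reopened']);

function shouldTriggerReview(eventName, payload) {
  if (eventName !== 'pull_request') return false;
  if (!REVIEW_ACTIONS.has(payload.action)) return false;
  // Skip draft PRs (not ready for review) and bot-authored PRs
  // (e.g. dependency updates) to save review cycles and cost:
  if (payload.pull_request.draft) return false;
  if (payload.pull_request.user.type === 'Bot') return false;
  return true;
}
```

A real service would also verify the webhook signature and enqueue the job rather than reviewing inline, so the webhook responds within GitHub's timeout.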
2. Diff Parsing and Context Assembly
/*
* GitHub's unified diff format:
*
* diff --git a/src/components/UserList.tsx b/src/components/UserList.tsx
* index abc123..def456 100644
* --- a/src/components/UserList.tsx
* +++ b/src/components/UserList.tsx
* @@ -15,6 +15,12 @@ function UserList({ users }) {
* const [search, setSearch] = useState('');
* const [page, setPage] = useState(1);
*
* + const filteredUsers = users.filter(user =>
* + user.name.toLowerCase().includes(search)
* + );
* +
* + const paginatedUsers = filteredUsers.slice((page-1)*10, page*10);
* +
* return (
* <div>
*
* The AI needs more than just the diff:
*/
async function buildReviewContext(prNumber, repo) {
  // 1. Get PR metadata first (title, description, head/base refs):
  const prDetails = await github.pulls.get({
    owner: repo.owner,
    repo: repo.name,
    pull_number: prNumber,
  });
  const prHead = prDetails.data.head.sha; // The PR's branch head
  const prBase = prDetails.data.base.sha; // The base branch
  // 2. Get the diff:
  const diff = await github.pulls.get({
    owner: repo.owner,
    repo: repo.name,
    pull_number: prNumber,
    mediaType: { format: 'diff' },
  });
  // 3. Parse into structured hunks:
  const parsedFiles = parseDiff(diff.data);
  // 4. For each changed file, get the FULL file content (not just the diff):
  const fileContexts = await Promise.all(
    parsedFiles.map(async (file) => {
      const fullContent = await github.repos.getContent({
        owner: repo.owner,
        repo: repo.name,
        path: file.path,
        ref: prHead,
      });
      // Get the BEFORE version too (for understanding what changed):
      const beforeContent = await github.repos.getContent({
        owner: repo.owner,
        repo: repo.name,
        path: file.oldPath || file.path,
        ref: prBase,
      }).catch(() => null); // File might be new
      return {
        path: file.path,
        diff: file.hunks,
        fullContent: decodeBase64(fullContent.data.content),
        beforeContent: beforeContent
          ? decodeBase64(beforeContent.data.content)
          : null,
        additions: file.additions,
        deletions: file.deletions,
      };
    })
  );
  // 5. Fetch related files (imports, types):
  const relatedFiles = await fetchRelatedFiles(fileContexts, repo);
  // 6. Include the PR description and commit messages for intent:
  return {
    files: fileContexts,
    relatedFiles,
    prDescription: prDetails.data.body,
    prTitle: prDetails.data.title,
    commitMessages: await getCommitMessages(prNumber, repo),
  };
}
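The `parseDiff` helper above is left abstract; its core is parsing the `@@ -15,6 +15,12 @@` hunk headers and tracking which NEW-file line numbers were added, since those are the only lines the reviewer may comment on. A self-contained sketch (function names are illustrative):

```javascript
// Parse a unified-diff hunk header: "@@ -oldStart,oldLen +newStart,newLen @@".
// A missing length defaults to 1 per the unified diff format.
function parseHunkHeader(header) {
  const match = /^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@/.exec(header);
  if (!match) return null;
  return {
    oldStart: Number(match[1]),
    oldLines: match[2] !== undefined ? Number(match[2]) : 1,
    newStart: Number(match[3]),
    newLines: match[4] !== undefined ? Number(match[4]) : 1,
  };
}

// Walk a hunk body and collect the NEW-file line numbers of added lines.
function addedLineNumbers(hunkHeader, hunkBody) {
  const { newStart } = parseHunkHeader(hunkHeader);
  const added = [];
  let newLine = newStart;
  for (const line of hunkBody.split('\n')) {
    if (line.startsWith('+')) {
      added.push(newLine);
      newLine++;
    } else if (line.startsWith('-')) {
      // Deleted line: exists only in the old file, no new line consumed
    } else {
      newLine++; // Context line: present in both versions
    }
  }
  return added;
}
```

This added-lines set is exactly what the post-processing step later uses to drop comments on unchanged lines.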
3. AI Review Prompt Engineering
/*
* The prompt structure determines review quality.
* Key elements:
* 1. Role definition (expert reviewer persona)
* 2. Review instructions (what to look for)
* 3. Context (PR description, full file, diff)
* 4. Output format (structured JSON for parsing)
*/
function buildReviewPrompt(fileContext, projectConventions) {
return `You are an expert frontend code reviewer. Review the following code changes.
## Review Guidelines
- Focus on bugs, security issues, performance problems, and logic errors.
- Do NOT comment on style preferences (formatting, naming conventions) unless they violate the project's explicit rules.
- Every comment must be actionable: explain WHAT is wrong, WHY it matters, and HOW to fix it.
- Only comment on lines that were ADDED or MODIFIED in the diff (+ lines).
- Rate each finding: critical (blocks merge), warning (should fix), suggestion (nice to have).
## Project Conventions
${projectConventions}
## PR Context
Title: ${fileContext.prTitle}
Description: ${fileContext.prDescription}
## File: ${fileContext.path}
### Full file content (for context):
\`\`\`typescript
${fileContext.fullContent}
\`\`\`
### Diff (changes to review):
\`\`\`diff
${fileContext.diff}
\`\`\`
### Related type definitions:
${fileContext.relatedTypes}
## Output Format
Respond with a JSON array of review comments:
[
{
"line": <line number in the NEW file>,
"severity": "critical" | "warning" | "suggestion",
"category": "bug" | "security" | "performance" | "accessibility" | "error-handling" | "logic",
"comment": "<specific, actionable review comment>",
"suggestedFix": "<optional code suggestion>"
}
]
If no issues found, respond with an empty array: []`;
}
/*
* Prompt optimization techniques:
*
* 1. Few-shot examples: show good and bad review comments
* Good: "user.name.toLowerCase() will throw if user.name is undefined.
* Add optional chaining: user.name?.toLowerCase()"
* Bad: "Consider using a different variable name"
*
* 2. Negative instructions: explicitly say what NOT to flag
* "Do not flag: formatting, import ordering, missing JSDoc,
* semicolons, single vs double quotes"
*
* 3. Severity calibration with examples:
* Critical: null dereference, SQL injection, infinite loop
* Warning: missing error handling, performance issue
* Suggestion: better naming, simpler approach available
*/
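Even with a strict "respond with a JSON array" instruction, models sometimes wrap the output in a markdown fence or return malformed JSON, so the response must be parsed defensively. A hedged sketch of that step (`parseModelFindings` is an illustrative name; the schema matches the output format above):

```javascript
// Safely parse the model's review output: strip an optional ```json fence,
// tolerate unparseable responses, and drop findings that fail validation.
const VALID_SEVERITIES = new Set(['critical', 'warning', 'suggestion']);

function parseModelFindings(raw) {
  const stripped = raw
    .replace(/^\s*```(?:json)?\s*/i, '') // leading fence, if any
    .replace(/\s*```\s*$/, '');          // trailing fence, if any
  let parsed;
  try {
    parsed = JSON.parse(stripped);
  } catch {
    return []; // Unparseable output: treat as no findings, log for debugging
  }
  if (!Array.isArray(parsed)) return [];
  // Keep only findings that match the declared schema:
  return parsed.filter(f =>
    Number.isInteger(f.line) &&
    VALID_SEVERITIES.has(f.severity) &&
    typeof f.comment === 'string' && f.comment.length > 0
  );
}
```

Failing closed (an empty array rather than a crash) keeps one bad model response from blocking the whole review.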
4. Issue Detection Categories
/*
* Common issues AI reviewers catch in frontend PRs:
*
* ┌──────────────────┬────────────────────────────────────────┐
* │ Category │ Examples │
* ├──────────────────┼────────────────────────────────────────┤
* │ Bugs │ Null dereference, off-by-one, │
* │ │ stale closure, missing deps in │
* │ │ useEffect, race conditions │
* │ │ │
* │ Security │ XSS (dangerouslySetInnerHTML), │
* │ │ CSRF missing, eval(), hardcoded │
* │ │ secrets, unsafe URL construction │
* │ │ │
* │ Performance │ Re-renders (missing useMemo/useCallback│
* │ │ where needed), large bundle imports, │
* │ │ layout thrashing, missing keys in lists│
* │ │ │
* │ Error Handling │ Unhandled promise rejections, missing │
* │ │ try/catch, empty catch blocks, │
* │ │ missing error boundaries │
* │ │ │
* │ Accessibility │ Missing alt text, no keyboard handler, │
* │ │ missing ARIA labels, color contrast │
* │ │ │
* │ Logic │ Dead code, unreachable branches, │
* │ │ tautological conditions, wrong │
* │ │ comparison operator │
* └──────────────────┴────────────────────────────────────────┘
*/
// Structured detection for React-specific issues:
const reactReviewRules = {
'stale-closure': {
description: 'useEffect/useCallback with stale closure over state',
detect: (diff) => {
// Look for hooks that reference state but missing from deps array
},
severity: 'critical',
},
'missing-key-prop': {
description: 'Array .map() in JSX without key prop',
detect: (diff) => {
// Look for .map( that returns JSX without key
},
severity: 'warning',
},
'state-in-render': {
description: 'setState called during render (infinite loop)',
detect: (diff) => {
// Look for setState outside useEffect/event handlers
},
severity: 'critical',
},
'missing-error-boundary': {
description: 'Async data fetching without error handling',
detect: (diff) => {
// Look for fetch/axios without try/catch or .catch()
},
severity: 'warning',
},
};
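To make one of these `detect` stubs concrete, here is a heuristic sketch of the `missing-key-prop` rule operating on added diff lines. Real tools parse an AST; a regex pass like this is only a cheap first filter, and the line-window size is an arbitrary assumption:

```javascript
// Heuristic missing-key detector: flag an added line calling .map() when
// the JSX it opens (within a small look-ahead window) has no key= prop.
function detectMissingKeyProp(addedLines) {
  const findings = [];
  addedLines.forEach(({ line, text }, i) => {
    if (!/\.map\s*\(/.test(text)) return;
    // Inspect this line plus the next two for an opening JSX tag and key=:
    const window = addedLines.slice(i, i + 3).map(l => l.text).join('\n');
    if (/<[A-Za-z]/.test(window) && !/\bkey\s*=/.test(window)) {
      findings.push({ line, ruleId: 'missing-key-prop', severity: 'warning' });
    }
  });
  return findings;
}
```

Running deterministic rules like this alongside the LLM pass gives high-precision findings that don't consume tokens.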
5. Comment Deduplication and Filtering
/*
* Raw AI output often has:
* - Duplicate findings (same issue mentioned differently)
* - False positives (flagging correct code)
* - Low-value comments (nitpicks, style preferences)
* - Comments on unchanged lines (context, not changes)
*
* Post-processing is critical for trust.
*/
function postProcessReviewComments(comments, diffContext) {
let filtered = comments;
// 1. Remove comments on unchanged lines:
filtered = filtered.filter(comment => {
return isLineInDiff(comment.line, diffContext.addedLines);
});
// 2. Deduplicate by similarity:
filtered = deduplicateComments(filtered);
// 3. Remove likely false positives:
filtered = filtered.filter(comment => {
// If comment says "variable might be undefined" but
// TypeScript strict mode handles it → false positive
if (comment.category === 'bug' &&
diffContext.hasStrictTypeScript &&
comment.comment.includes('might be undefined')) {
return false;
}
return true;
});
// 4. Cap total comments per file (noise reduction):
const MAX_COMMENTS_PER_FILE = 8;
if (filtered.length > MAX_COMMENTS_PER_FILE) {
// Keep highest severity:
filtered.sort((a, b) => {
const severityOrder = { critical: 0, warning: 1, suggestion: 2 };
return severityOrder[a.severity] - severityOrder[b.severity];
});
filtered = filtered.slice(0, MAX_COMMENTS_PER_FILE);
}
  // 5. Cap total comments per PR (assumes `comments` spans the whole PR;
  //    otherwise the caller applies this cap across files):
  const MAX_COMMENTS_PER_PR = 25;
  return filtered.slice(0, MAX_COMMENTS_PER_PR);
}
function deduplicateComments(comments) {
const unique = [];
for (const comment of comments) {
const isDuplicate = unique.some(existing => {
// Same line, similar message:
if (Math.abs(existing.line - comment.line) <= 2) {
const similarity = computeTextSimilarity(
existing.comment, comment.comment
);
return similarity > 0.7;
}
return false;
});
if (!isDuplicate) {
unique.push(comment);
}
}
return unique;
}
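The `computeTextSimilarity` helper used above can be as simple as Jaccard similarity over word tokens; it is cheap and good enough for near-duplicate review comments, with embedding similarity as an upgrade path. A minimal sketch:

```javascript
// Jaccard similarity over lowercase word tokens: |A ∩ B| / |A ∪ B|.
// Returns a value in [0, 1]; 1 means identical token sets.
function computeTextSimilarity(a, b) {
  const tokenize = s => new Set(s.toLowerCase().match(/[a-z0-9]+/g) || []);
  const setA = tokenize(a);
  const setB = tokenize(b);
  if (setA.size === 0 && setB.size === 0) return 1;
  let intersection = 0;
  for (const token of setA) {
    if (setB.has(token)) intersection++;
  }
  const union = setA.size + setB.size - intersection;
  return intersection / union;
}
```

Note this treats word order as irrelevant, which is usually what you want when two comments describe the same issue with different phrasing.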
6. GitHub API Integration
/*
* Posting review comments via GitHub API:
*
* Two types of comments:
* 1. PR review comment: attached to specific diff line
* 2. PR comment: general comment on the PR
*
* Best approach: create a single review with multiple comments
* (not individual comments — batching reduces notification noise).
*/
async function postReview(prNumber, repo, reviewComments, summary) {
// Map AI findings to GitHub review comment format:
const comments = reviewComments.map(finding => ({
path: finding.filePath,
line: finding.line,
side: 'RIGHT', // Comment on the new version
body: formatReviewComment(finding),
}));
// Determine review action:
const hasCritical = reviewComments.some(c => c.severity === 'critical');
const event = hasCritical ? 'REQUEST_CHANGES' : 'COMMENT';
// Create the review (all comments posted atomically):
await github.pulls.createReview({
owner: repo.owner,
repo: repo.name,
pull_number: prNumber,
event,
body: formatSummary(summary, reviewComments),
comments,
});
}
function formatReviewComment(finding) {
const severityEmoji = {
critical: '🔴',
warning: '🟡',
suggestion: '🔵',
};
let body = `${severityEmoji[finding.severity]} **${finding.severity.toUpperCase()}** — ${finding.category}\n\n`;
body += finding.comment + '\n';
if (finding.suggestedFix) {
body += `\n**Suggested fix:**\n\`\`\`suggestion\n${finding.suggestedFix}\n\`\`\`\n`;
// GitHub's "suggestion" code fence creates a one-click apply button
}
return body;
}
function formatSummary(summary, comments) {
const counts = {
critical: comments.filter(c => c.severity === 'critical').length,
warning: comments.filter(c => c.severity === 'warning').length,
suggestion: comments.filter(c => c.severity === 'suggestion').length,
};
return `## AI Code Review Summary
${summary}
### Findings
| Severity | Count |
|----------|-------|
| 🔴 Critical | ${counts.critical} |
| 🟡 Warning | ${counts.warning} |
| 🔵 Suggestion | ${counts.suggestion} |
---
*Automated review by AI Code Reviewer. Please verify all findings.*`;
}
7. Incremental Review (Updated PRs)
/*
* When a PR is updated (new commits pushed), the AI should:
* - Only review NEW changes (not re-review already-reviewed code)
* - Check if previous findings are addressed
* - Update/resolve previous comments
*
* This prevents notification spam and duplicate comments.
*/
async function handlePRUpdate(prNumber, repo, previousReview) {
// Get the diff between the previous review and current state:
const lastReviewedCommit = previousReview.commitSha;
const currentHead = await getCurrentHead(prNumber, repo);
// Incremental diff (only new changes since last review):
const incrementalDiff = await github.repos.compareCommits({
owner: repo.owner,
repo: repo.name,
base: lastReviewedCommit,
head: currentHead,
});
// Review only the new changes:
const newFindings = await reviewDiff(incrementalDiff);
// Check if previous findings are fixed:
const resolvedFindings = [];
for (const previous of previousReview.findings) {
const fileStillChanged = incrementalDiff.files.some(
f => f.filename === previous.filePath
);
if (fileStillChanged) {
// Re-check if the specific issue is still present:
const stillPresent = await checkIfIssuePresent(
previous, currentHead, repo
);
if (!stillPresent) {
resolvedFindings.push(previous);
}
}
}
// Resolve previous comments that are fixed:
for (const resolved of resolvedFindings) {
await github.pulls.updateReviewComment({
owner: repo.owner,
repo: repo.name,
comment_id: resolved.commentId,
body: resolved.body + '\n\n✅ *This issue appears to be resolved.*',
});
}
// Post new findings only:
if (newFindings.length > 0) {
await postReview(prNumber, repo, newFindings, 'Incremental review of new changes');
}
}
8. Learning from Feedback
/*
* AI review improves when developers give feedback:
*
* 👍 = correct finding (reinforce)
* 👎 = false positive (suppress similar)
* "Resolved" = finding was addressed (positive signal)
* Ignored = possibly low value
*
* Feedback loop:
*
* AI posts comment → Developer reacts
* │ │
* │ ▼
* │ ┌──────────┐
* │ │ 👍 Useful │ → Reinforce this pattern
* │ │ 👎 Wrong │ → Add to suppression list
* │ │ Resolved │ → Confirm detection works
* │ │ Ignored │ → Lower confidence score
* │ └──────────┘
* │ │
* ▼ ▼
* Feedback database → Fine-tune detection rules
*/
class ReviewFeedbackTracker {
constructor(db) {
this.db = db;
}
  async recordFeedback(commentId, reaction, context) {
    await this.db.collection('review_feedback').insertOne({
      commentId,
      reaction, // 'positive', 'negative', 'resolved', 'ignored'
      repo: context.repo, // needed for the per-repo suppression query below
      category: context.category,
      ruleId: context.ruleId,
      fileType: context.fileType,
      timestamp: new Date(),
    });
  }
async getSuppressionRules(repo) {
// Find patterns with high false-positive rate:
const stats = await this.db.collection('review_feedback').aggregate([
{ $match: { repo } },
{ $group: {
_id: '$ruleId',
positive: { $sum: { $cond: [{ $eq: ['$reaction', 'positive'] }, 1, 0] } },
negative: { $sum: { $cond: [{ $eq: ['$reaction', 'negative'] }, 1, 0] } },
total: { $sum: 1 },
}},
{ $addFields: {
falsePositiveRate: { $divide: ['$negative', '$total'] },
}},
{ $match: { falsePositiveRate: { $gt: 0.5 }, total: { $gte: 5 } } },
]).toArray();
return stats.map(s => s._id);
}
}
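Wiring the suppression list back into post-processing is then a simple filter over findings. A sketch (field names match the tracker above; `applySuppressions` is an illustrative name):

```javascript
// Drop findings whose rule has a high historical false-positive rate,
// as reported by getSuppressionRules().
function applySuppressions(findings, suppressedRuleIds) {
  const suppressed = new Set(suppressedRuleIds);
  return findings.filter(f => !suppressed.has(f.ruleId));
}
```

Running this per-repo means a rule that annoys one team can stay active for another team where it performs well.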
9. Review Summary and PR Description Enhancement
/*
* Beyond line-level comments, AI generates:
* 1. PR summary (what changed and why)
* 2. Risk assessment (low/medium/high)
* 3. Test coverage analysis (are new paths tested?)
* 4. Suggested reviewers (based on file ownership)
*/
async function generatePRSummary(prContext) {
const prompt = `Analyze this PR and generate a structured summary.
## PR Info
Title: ${prContext.prTitle}
Description: ${prContext.prDescription}
Files changed: ${prContext.files.length}
Lines added: ${prContext.totalAdditions}
Lines removed: ${prContext.totalDeletions}
## Changed Files
${prContext.files.map(f => `- ${f.path} (+${f.additions}/-${f.deletions})`).join('\n')}
## Key Diffs
${prContext.files.slice(0, 5).map(f => f.diff).join('\n---\n')}
Generate:
1. A concise summary of what this PR does (2-3 sentences)
2. Risk level: low (style/docs), medium (logic changes), high (auth/data/infra)
3. Areas that need careful human review
4. Whether new code paths appear to have test coverage`;
const response = await callLLM(prompt);
return response;
}
10. Configuration and Customization
/*
* Teams need to customize AI review behavior:
*
* .ai-review.yml in repo root:
*/
const defaultConfig = {
// What to review:
review: {
enabled: true,
autoReview: true, // Review on every PR
triggerOn: ['opened', 'synchronize'],
ignorePaths: [
'**/*.test.ts', // Don't review test files
'docs/**', // Don't review docs
'**/*.generated.*', // Don't review generated code
'package-lock.json',
],
maxFilesPerReview: 20, // Skip if too many files
maxDiffSize: 5000, // Skip if diff too large (lines)
},
// What to check:
rules: {
security: { enabled: true, severity: 'critical' },
bugs: { enabled: true, severity: 'warning' },
performance: { enabled: true, severity: 'suggestion' },
accessibility: { enabled: true, severity: 'warning' },
errorHandling: { enabled: true, severity: 'warning' },
style: { enabled: false }, // Defer to ESLint/Prettier
},
// Comment behavior:
comments: {
maxPerFile: 8,
maxPerPR: 25,
minSeverity: 'suggestion', // Don't post below this
includeCodeSuggestions: true, // GitHub suggestion blocks
groupByFile: true,
},
// Custom instructions:
instructions: `
This is a Next.js 14 project using App Router.
We use Zod for validation - flag raw type assertions.
We use React Query for data fetching - flag useEffect for fetching.
All API routes must validate request body with Zod.
`,
// Labels:
labels: {
addOnCritical: ['ai-review-critical'],
addOnClean: ['ai-review-passed'],
},
};
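Applying `ignorePaths` requires glob matching against changed file paths. Production code would use a library like minimatch; the sketch below hand-rolls just enough of the `**`/`*` semantics to handle the patterns in the config above (placeholder characters avoid the converted `.*` being re-expanded):

```javascript
// Convert a simple glob (supporting '**/', '**', '*') to a RegExp.
function globToRegExp(glob) {
  const escaped = glob.replace(/[.+^${}()|[\]\\]/g, '\\$&');
  const withTokens = escaped
    .replace(/\*\*\//g, '\u0001') // '**/' → any directory prefix (or none)
    .replace(/\*\*/g, '\u0002')   // '**'  → anything, including slashes
    .replace(/\*/g, '[^/]*');     // '*'   → anything except a slash
  const pattern = withTokens
    .replace(/\u0001/g, '(?:.*/)?')
    .replace(/\u0002/g, '.*');
  return new RegExp(`^${pattern}$`);
}

function isIgnored(path, ignorePaths) {
  return ignorePaths.some(glob => globToRegExp(glob).test(path));
}
```

Skipping ignored paths before context assembly also avoids paying token costs for files nobody wants reviewed.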
/*
* Cost consideration:
*
* Per-PR cost with GPT-4 class model:
* - Small PR (5 files, 100 lines): ~$0.05-0.10
* - Medium PR (15 files, 500 lines): ~$0.20-0.50
* - Large PR (50 files, 2000 lines): ~$1.00-3.00
*
* At 100 PRs/week: ~$20-100/week
* Compare to: senior engineer time saved (~hours/week)
*
* Cost optimization:
* - Use cheaper models for initial triage
* - Only use GPT-4 for flagged files
* - Cache repeated patterns
* - Skip trivial PRs (docs, config)
*/
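A back-of-envelope estimator for the per-PR figures above might look like this. The tokens-per-line heuristic and the per-million-token prices are illustrative assumptions, not current rates for any specific model:

```javascript
// Rough per-PR cost estimate: context tokens in, findings tokens out.
// ASSUMPTIONS: ~13 tokens per code line of context; ~500 output tokens
// per reviewed file; prices are placeholders in $/million tokens.
function estimateReviewCost(files, pricePerMTokIn = 2.5, pricePerMTokOut = 10) {
  const inputTokens = files.reduce((sum, f) => sum + f.contextLines * 13, 0);
  const outputTokens = files.length * 500;
  return (inputTokens / 1e6) * pricePerMTokIn +
         (outputTokens / 1e6) * pricePerMTokOut;
}
```

An estimator like this is mainly useful as a pre-flight budget check: skip or downgrade the model for PRs whose projected cost exceeds a configured cap.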
Trade-offs & Considerations
| Aspect | AI Review | Human Review | AI + Human |
|---|---|---|---|
| Speed | Seconds | Hours/days | Minutes |
| Consistency | Always same rules | Varies by reviewer | Standardized |
| Context | Limited to code | Understands business | Best coverage |
| False positives | 10-30% | 2-5% | 5-10% |
| Novel bugs | May miss | Catches creative | Best coverage |
| Trust | Must be earned | Inherent | Earned over time |
| Cost | ~$0.10-3/PR | ~$50-200/hour | Optimized |
Best Practices
- Include full file content alongside the diff, not just the changed lines. The AI needs surrounding context to understand what function a change is inside, what types are available, and what the code does; send the full file (before and after) with the diff highlighted, plus imported type definitions.
- Post a single review with batched comments rather than individual comments. Creating one GitHub review with all comments reduces notification noise (one email instead of 25), and developers see the full picture at once; use GitHub's review API `createReview` with a comments array.
- Cap comments per file (5-8) and per PR (20-25) to prevent review fatigue. Too many comments make developers ignore all of them; prioritize by severity (critical > warning > suggestion), and if there are many issues, summarize the pattern instead of commenting on every instance.
- Support incremental review on PR updates so developers don't see duplicate comments. When new commits are pushed, only review the new changes and mark previous findings that have been addressed as resolved; this respects the developer's effort and reduces frustration.
- Let teams customize rules via a config file and learn from thumbs-up/down feedback. Every project has different conventions; let teams disable rule categories, add custom instructions ("we use React Query, not useEffect, for fetching"), and track which findings get positive vs. negative reactions to tune the model over time.
Conclusion
AI-powered code review ingests PR diffs via webhooks, builds rich context (full file content, imported types, PR description, commit messages), and generates review comments anchored to specific diff lines. The prompt instructs the model to focus on bugs, security, performance, and error handling — not style preferences — and output structured findings with severity, category, and suggested fixes. Post-processing filters false positives, deduplicates similar findings, caps the total comment count to prevent review fatigue, and maps findings to exact diff line positions for GitHub's review comment API. Incremental review on PR updates only analyzes new changes and resolves previous findings that have been fixed. The key to developer trust: low false-positive rate (through aggressive filtering), actionable comments (explain what, why, and how to fix), and customizability (team-specific rules via config files). AI review doesn't replace human reviewers — it handles the systematic checks (security anti-patterns, missing error handling, accessibility violations) so human reviewers can focus on architecture, logic, and business intent. Feedback loops (thumbs-up/down on comments) let the system learn which findings are valuable for each team over time.