AI Design-to-Code and Screenshot-to-Code Pipeline Architecture
Real-World Problem Context
A designer hands you a Figma file with 15 screens. Traditionally, you'd spend days manually translating visual designs into React components — pixel-matching spacing, extracting colors, recreating layouts with CSS, and building component hierarchies. AI design-to-code tools (v0 by Vercel, Locofy, Anima, Builder.io, screenshot-to-code by Abi Raja) take a visual input — a Figma design, a screenshot, or even a hand-drawn wireframe — and generate production-quality frontend code. This post covers the internal pipeline: how vision models understand UI layouts, how they decompose visuals into component trees, how they map visual elements to semantic HTML and CSS/Tailwind, how they handle responsive design, and the gap between generated code and production-ready code.
Problem Statements
- Visual Understanding: How does a vision model look at a screenshot and understand the layout structure — headers, sidebars, cards, buttons, form fields — and their spatial relationships (hierarchical nesting, alignment, spacing)?
- Code Quality: How do you generate code that uses semantic HTML, proper component decomposition, and modern CSS (Flexbox/Grid/Tailwind) instead of absolute-positioned divs with hardcoded pixel values?
- Interactive Behavior: A screenshot is static — how does the AI infer interactive behavior (hover states, click handlers, form validation, navigation) that isn't visible in a single image?
Deep Dive: Internal Mechanisms
1. End-to-End Design-to-Code Pipeline
/*
* Design-to-code pipeline:
*
* Visual Input
* (Figma/Screenshot/Wireframe)
* │
* ▼
* ┌─────────────────────────────┐
* │ 1. Vision Model Processing │
* │ - OCR (text extraction) │
* │ - Element detection │
* │ - Layout analysis │
* │ - Color/font extraction │
* └──────────────┬──────────────┘
* │
* ▼
* ┌─────────────────────────────┐
* │ 2. UI Structure Tree │
* │ - Component hierarchy │
* │ - Element types (button, │
* │ input, card, nav) │
* │ - Bounding boxes │
* │ - Text content │
* └──────────────┬──────────────┘
* │
* ▼
* ┌─────────────────────────────┐
* │ 3. Code Generation │
* │ - React/HTML components │
* │ - Tailwind/CSS styling │
* │ - Responsive layout │
* │ - Component composition │
* └──────────────┬──────────────┘
* │
* ▼
* ┌─────────────────────────────┐
* │ 4. Post-Processing │
* │ - Visual diff validation │
* │ - Code formatting │
* │ - Accessibility attrs │
* │ - Responsive breakpoints │
* └─────────────────────────────┘
*/
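The four stages above can be sketched as a thin async driver. The stage functions here (`analyzeVisual`, `buildUITree`, `generateCode`, `postProcess`) are hypothetical placeholders, not any vendor's actual API; a real pipeline plugs a vision model, a tree builder, an LLM call, and a renderer/validator into these slots:

```javascript
// Minimal sketch of the four-stage pipeline as a composable driver.
// Each stage is injected, so the driver stays independent of any
// specific model or rendering backend.
async function runDesignToCodePipeline(input, stages) {
  const perception = await stages.analyzeVisual(input); // 1. vision model processing
  const uiTree = await stages.buildUITree(perception);  // 2. UI structure tree
  const code = await stages.generateCode(uiTree);       // 3. code generation
  return stages.postProcess(code);                      // 4. validation / formatting
}
```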
2. Vision Model UI Understanding
/*
* How vision models (GPT-4V, Claude Vision, Gemini) process UI images:
*
* The image is encoded as tokens (patch embeddings):
*
* ┌─────────────────────────────────┐
* │ Screenshot (1280x800) │
* │ ┌─────┬─────┬─────┬─────┬────┐ │
* │ │patch│patch│patch│patch│... │ │
* │ │ 1 │ 2 │ 3 │ 4 │ │ │ Each 14x14 or 32x32 pixel
* │ ├─────┼─────┼─────┼─────┼────┤ │ patch → one visual token
* │ │patch│patch│patch│patch│... │ │
* │ │ 5 │ 6 │ 7 │ 8 │ │ │ 1280x800 → ~1500-3000 tokens
* │ ├─────┼─────┼─────┼─────┼────┤ │
* │ │... │... │... │... │... │ │
* │ └─────┴─────┴─────┴─────┴────┘ │
* └─────────────────────────────────┘
*
* The model understands:
* 1. Spatial layout (grid, flex, absolute)
* 2. UI patterns (header/footer, sidebar, cards)
* 3. Element types (buttons, inputs, text, images)
* 4. Text content via built-in OCR
* 5. Colors, fonts, spacing (approximate)
* 6. Design system patterns (Material, Shadcn, etc.)
*/
// Prompt structure for screenshot-to-code:
function buildDesignToCodePrompt(imageBase64, options) {
return {
messages: [
{
role: 'user',
content: [
{
type: 'image_url',
image_url: {
url: `data:image/png;base64,${imageBase64}`,
detail: 'high', // High resolution mode
},
},
{
type: 'text',
text: `Convert this UI design into a React component.
Requirements:
- Framework: ${options.framework} (React/Next.js/Vue)
- Styling: ${options.styling} (Tailwind CSS/CSS Modules/styled-components)
- Component library: ${options.componentLib || 'none (custom)'}
- Responsive: Yes, mobile-first
- Accessibility: Include ARIA labels, semantic HTML
Output a single React component file with all styles inline (Tailwind classes).
Use semantic HTML elements (nav, main, section, article, button — not just divs).
Extract repeated patterns into sub-components.
Use realistic placeholder data.`,
},
],
},
],
};
}
3. Figma API Integration (Structured Input)
/*
* Unlike screenshots, Figma provides STRUCTURED design data:
*
* Figma file → API → JSON tree of nodes
*
* Each node has:
* - type: FRAME, TEXT, RECTANGLE, COMPONENT, INSTANCE
* - absoluteBoundingBox: { x, y, width, height }
* - fills: [{ type: 'SOLID', color: { r, g, b, a } }]
* - strokes, effects (shadows, blur)
* - children: nested elements
* - style: fontFamily, fontSize, fontWeight, lineHeight
* - layoutMode: 'HORIZONTAL' | 'VERTICAL' (auto-layout = flex)
* - constraints: how it resizes
*
* This is MUCH richer than a screenshot.
*/
async function extractFigmaDesign(fileKey, nodeId) {
const response = await fetch(
`https://api.figma.com/v1/files/${fileKey}/nodes?ids=${nodeId}`,
{ headers: { 'X-Figma-Token': process.env.FIGMA_TOKEN } }
);
const data = await response.json();
return data.nodes[nodeId].document;
}
function figmaNodeToUITree(node, parent = null) {
const element = {
type: mapFigmaType(node),
name: node.name,
bounds: node.absoluteBoundingBox,
styles: extractStyles(node),
text: node.type === 'TEXT' ? node.characters : null,
children: [],
layout: extractLayoutInfo(node),
};
if (node.children) {
element.children = node.children.map(child =>
figmaNodeToUITree(child, element)
);
}
return element;
}
function mapFigmaType(node) {
// Map Figma types to semantic HTML:
const nameHints = node.name.toLowerCase();
if (nameHints.includes('button') || nameHints.includes('btn')) return 'button';
if (nameHints.includes('input') || nameHints.includes('field')) return 'input';
if (nameHints.includes('nav')) return 'nav';
if (nameHints.includes('header')) return 'header';
if (nameHints.includes('footer')) return 'footer';
if (nameHints.includes('card')) return 'article';
if (nameHints.includes('image') || nameHints.includes('avatar')) return 'img';
if (node.type === 'TEXT') return 'text';
// Layout inference from auto-layout:
if (node.layoutMode === 'HORIZONTAL') return 'flex-row';
if (node.layoutMode === 'VERTICAL') return 'flex-col';
return 'div';
}
function extractStyles(node) {
const styles = {};
// Colors:
if (node.fills?.length > 0) {
const fill = node.fills[0];
if (fill.type === 'SOLID') {
styles.backgroundColor = rgbToHex(fill.color);
styles.opacity = fill.opacity;
}
}
// Typography:
if (node.style) {
styles.fontFamily = node.style.fontFamily;
styles.fontSize = node.style.fontSize;
styles.fontWeight = node.style.fontWeight;
styles.lineHeight = node.style.lineHeightPx;
styles.letterSpacing = node.style.letterSpacing;
styles.textAlign = node.style.textAlignHorizontal?.toLowerCase();
}
// Spacing (from auto-layout):
if (node.paddingLeft != null) {
styles.padding = {
top: node.paddingTop,
right: node.paddingRight,
bottom: node.paddingBottom,
left: node.paddingLeft,
};
}
if (node.itemSpacing != null) {
styles.gap = node.itemSpacing;
}
// Border radius:
if (node.cornerRadius) {
styles.borderRadius = node.cornerRadius;
}
// Shadows:
if (node.effects?.length > 0) {
styles.shadows = node.effects
.filter(e => e.type === 'DROP_SHADOW' && e.visible)
.map(e => ({
x: e.offset.x,
y: e.offset.y,
blur: e.radius,
color: rgbToHex(e.color),
}));
}
return styles;
}
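`extractStyles` above assumes an `rgbToHex` helper. Figma's API reports color channels as 0–1 floats, so a plausible implementation scales each channel to 0–255 and hex-encodes it:

```javascript
// Convert a Figma color object ({ r, g, b } as 0-1 floats) to a hex string.
function rgbToHex({ r, g, b }) {
  const toHex = (channel) =>
    Math.round(channel * 255).toString(16).padStart(2, '0');
  return `#${toHex(r)}${toHex(g)}${toHex(b)}`;
}
```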
4. Layout Algorithm: Visual to CSS
/*
* Converting spatial positions to CSS layout is non-trivial:
*
* Visual arrangement:
* ┌──────────────────────────────────┐
* │ ┌────────────────────────┐ │
* │ │ Logo │ Nav│ Nav│ CTA │ │
* │ └────────────────────────┘ │
* │ ┌──────────┐ ┌──────────────┐ │
* │ │ Sidebar │ │ Main │ │
* │ │ │ │ ┌────┐┌────┐│ │
* │ │ Link 1 │ │ │Card││Card││ │
* │ │ Link 2 │ │ └────┘└────┘│ │
* │ │ Link 3 │ │ ┌────┐┌────┐│ │
* │ │ │ │ │Card││Card││ │
* │ └──────────┘ │ └────┘└────┘│ │
* │ └──────────────┘ │
* └──────────────────────────────────┘
*
* Must infer:
* - Header is flex-row with space-between
* - Body is two-column grid/flex
* - Cards are grid with auto-fill columns
*/
function inferCSSLayout(element) {
if (!element.children || element.children.length === 0) {
return element;
}
const children = element.children;
// Check if children are arranged horizontally:
const isHorizontal = children.every((child, i) => {
if (i === 0) return true;
const prev = children[i - 1];
// Child starts after previous child ends (with tolerance):
return child.bounds.x >= prev.bounds.x + prev.bounds.width - 5;
});
// Check if children are arranged vertically:
const isVertical = children.every((child, i) => {
if (i === 0) return true;
const prev = children[i - 1];
return child.bounds.y >= prev.bounds.y + prev.bounds.height - 5;
});
// Check for grid pattern (rows of similar items):
const gridPattern = detectGridPattern(children);
if (gridPattern) {
element.cssLayout = {
display: 'grid',
gridTemplateColumns: `repeat(${gridPattern.columns}, 1fr)`,
gap: `${gridPattern.gap}px`,
};
} else if (isHorizontal) {
element.cssLayout = {
display: 'flex',
flexDirection: 'row',
gap: inferGap(children, 'horizontal'),
justifyContent: inferJustifyContent(element, children),
alignItems: inferAlignItems(element, children),
};
} else if (isVertical) {
element.cssLayout = {
display: 'flex',
flexDirection: 'column',
gap: inferGap(children, 'vertical'),
};
}
// Recurse into children:
element.children = children.map(child => inferCSSLayout(child));
return element;
}
function detectGridPattern(children) {
if (children.length < 4) return null;
// Group children into rows by Y-position (within tolerance):
const rows = [];
let currentRow = [children[0]];
for (let i = 1; i < children.length; i++) {
const yDiff = Math.abs(children[i].bounds.y - currentRow[0].bounds.y);
if (yDiff < 10) {
currentRow.push(children[i]);
} else {
rows.push(currentRow);
currentRow = [children[i]];
}
}
rows.push(currentRow);
// Check if all rows have same column count:
const columnCounts = rows.map(r => r.length);
const isUniformGrid = columnCounts.every(c => c === columnCounts[0]);
if (isUniformGrid && rows.length >= 2) {
return {
rows: rows.length,
columns: columnCounts[0],
gap: rows[0].length > 1
? rows[0][1].bounds.x - (rows[0][0].bounds.x + rows[0][0].bounds.width)
: 0,
};
}
return null;
}
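`inferCSSLayout` leaves `inferGap` and `inferJustifyContent` undefined. One plausible sketch (an assumption, not the post's implementation): the gap is the median space between consecutive siblings, and children that span nearly the full parent width suggest `space-between`:

```javascript
// Median gap between consecutive children along one axis, clamped at 0.
function inferGap(children, direction) {
  if (children.length < 2) return 0;
  const gaps = [];
  for (let i = 1; i < children.length; i++) {
    const prev = children[i - 1].bounds;
    const curr = children[i].bounds;
    gaps.push(direction === 'horizontal'
      ? curr.x - (prev.x + prev.width)
      : curr.y - (prev.y + prev.height));
  }
  gaps.sort((a, b) => a - b);
  return Math.max(0, gaps[Math.floor(gaps.length / 2)]);
}

// Heuristic: if the children collectively span the parent (little leftover
// space) and there are several of them, space-between is a likely intent.
function inferJustifyContent(parent, children) {
  const first = children[0].bounds;
  const last = children[children.length - 1].bounds;
  const contentWidth = last.x + last.width - first.x;
  const leftover = parent.bounds.width - contentWidth;
  return leftover < parent.bounds.width * 0.1 && children.length > 2
    ? 'space-between'
    : 'flex-start';
}
```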
5. Design Token Extraction
/*
* Extract reusable design tokens from the design:
*
* Colors used → map to CSS variables or Tailwind config
* Font sizes → map to type scale
* Spacing values → map to spacing scale
* Border radii → map to radius tokens
*/
function extractDesignTokens(uiTree) {
const tokens = {
colors: new Map(),
fontSizes: new Set(),
spacing: new Set(),
radii: new Set(),
};
function traverse(node) {
// Collect colors:
if (node.styles.backgroundColor) {
const color = node.styles.backgroundColor;
const usage = tokens.colors.get(color) || { count: 0, uses: [] };
usage.count++;
usage.uses.push(node.name);
tokens.colors.set(color, usage);
}
// Collect font sizes:
if (node.styles.fontSize) {
tokens.fontSizes.add(node.styles.fontSize);
}
// Collect spacing:
if (node.styles.padding) {
Object.values(node.styles.padding).forEach(v => tokens.spacing.add(v));
}
if (node.styles.gap) {
tokens.spacing.add(node.styles.gap);
}
if (node.children) {
node.children.forEach(traverse);
}
}
traverse(uiTree);
// Map to Tailwind classes:
return {
tailwindColors: mapColorsToTailwind(tokens.colors),
tailwindSpacing: mapSpacingToTailwind(tokens.spacing),
fontSizeScale: mapToTypeScale(tokens.fontSizes),
radiusScale: mapToRadiusScale(tokens.radii),
};
}
function mapColorsToTailwind(colorMap) {
const tailwindColors = {
slate: { 50: '#f8fafc', 100: '#f1f5f9', /* ... */ },
blue: { 50: '#eff6ff', 500: '#3b82f6', /* ... */ },
// ... all Tailwind colors
};
const mappings = {};
for (const [hexColor, usage] of colorMap) {
// Find closest Tailwind color:
let closestMatch = null;
let closestDistance = Infinity;
for (const [name, shades] of Object.entries(tailwindColors)) {
for (const [shade, tailwindHex] of Object.entries(shades)) {
const distance = colorDistance(hexColor, tailwindHex);
if (distance < closestDistance) {
closestDistance = distance;
closestMatch = `${name}-${shade}`;
}
}
}
mappings[hexColor] = {
tailwindClass: closestMatch,
exact: closestDistance === 0,
uses: usage.uses,
};
}
return mappings;
}
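`mapColorsToTailwind` relies on a `colorDistance` helper. A simple sketch is Euclidean distance in RGB space; perceptual spaces like CIELAB match human judgment better, but RGB is usually adequate for nearest-shade snapping:

```javascript
// Euclidean distance between two hex colors in RGB space.
function colorDistance(hexA, hexB) {
  const toRGB = (hex) => {
    const n = parseInt(hex.replace('#', ''), 16);
    return [(n >> 16) & 0xff, (n >> 8) & 0xff, n & 0xff];
  };
  const [r1, g1, b1] = toRGB(hexA);
  const [r2, g2, b2] = toRGB(hexB);
  return Math.sqrt((r1 - r2) ** 2 + (g1 - g2) ** 2 + (b1 - b2) ** 2);
}
```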
6. Component Decomposition
/*
* The AI must decide how to split the design into components:
*
* Heuristics for component boundaries:
*
* 1. Repeated elements → single component with props
* (e.g., 4 similar cards → <Card> component)
*
* 2. Semantic sections → separate components
* (Header, Sidebar, MainContent, Footer)
*
* 3. Interactive groups → component with state
* (Form with inputs, SearchBar with autocomplete)
*
* 4. Size threshold → split large components
* (>200 lines → break into sub-components)
*/
function decomposeIntoComponents(uiTree) {
const components = [];
// 1. Find repeated patterns:
const repeated = findRepeatedPatterns(uiTree);
for (const pattern of repeated) {
components.push({
name: generateComponentName(pattern),
type: 'reusable',
props: extractVariations(pattern.instances),
template: pattern.structure,
instances: pattern.instances.length,
});
}
// 2. Identify semantic sections:
const sections = identifySections(uiTree);
for (const section of sections) {
components.push({
name: section.name, // 'Header', 'Sidebar', etc.
type: 'section',
children: section.children,
});
}
// 3. Generate the page component that composes them:
const page = {
name: 'Page',
type: 'page',
imports: components.map(c => c.name),
layout: uiTree.cssLayout,
};
return { page, components };
}
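`decomposeIntoComponents` calls a `generateComponentName` helper that the post does not define. One reasonable sketch: PascalCase the Figma layer name of the first instance, falling back to a generic name:

```javascript
// Derive a React component name from the first instance's layer name.
// 'product card' → 'ProductCard'; missing names fall back to 'RepeatedItem'.
function generateComponentName(pattern) {
  const raw = pattern.instances[0]?.name || 'RepeatedItem';
  return raw
    .replace(/[^a-zA-Z0-9]+(.)/g, (_, ch) => ch.toUpperCase()) // camel-case at separators
    .replace(/^./, (ch) => ch.toUpperCase())                   // capitalize first letter
    .replace(/[^a-zA-Z0-9]/g, '');                             // strip any leftovers
}
```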
function findRepeatedPatterns(uiTree) {
const allNodes = flattenTree(uiTree);
const patterns = [];
// Group nodes by structural similarity:
for (let i = 0; i < allNodes.length; i++) {
const similar = allNodes.filter((node, j) =>
j !== i && structuralSimilarity(allNodes[i], node) > 0.85
);
if (similar.length >= 2) {
// Found a repeated pattern:
const instances = [allNodes[i], ...similar];
// Check if already captured:
const alreadyCaptured = patterns.some(p =>
p.instances.some(inst => instances.includes(inst))
);
if (!alreadyCaptured) {
patterns.push({
structure: generalizeStructure(instances),
instances,
});
}
}
}
return patterns;
}
function extractVariations(instances) {
// Find what differs between instances → those become props:
const props = [];
// Compare text content:
const texts = instances.map(i => extractAllText(i));
if (new Set(texts.map(t => JSON.stringify(t))).size > 1) {
// Text varies → prop
props.push({ name: 'title', type: 'string' });
}
// Compare images:
const images = instances.map(i => extractImages(i));
if (new Set(images.map(img => JSON.stringify(img))).size > 1) {
props.push({ name: 'imageSrc', type: 'string' });
}
// Compare colors (e.g., status badges):
const colors = instances.map(i => i.styles?.backgroundColor);
if (new Set(colors).size > 1) {
props.push({ name: 'variant', type: "'primary' | 'secondary' | 'success'" });
}
return props;
}
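`findRepeatedPatterns` assumes a `structuralSimilarity` score in [0, 1]. A plausible sketch (an assumption; real tools compare deeper structure and styles) weights element type, child count, and positional child-type matches:

```javascript
// Score two UI-tree nodes for structural similarity in [0, 1]:
// 0.4 for matching type, 0.2 for matching child count,
// 0.4 scaled by the fraction of children whose types line up.
function structuralSimilarity(a, b) {
  let score = 0;
  if (a.type === b.type) score += 0.4;
  const aKids = a.children || [];
  const bKids = b.children || [];
  if (aKids.length === bKids.length) score += 0.2;
  const matching = aKids.filter((child, i) => bKids[i]?.type === child.type);
  const denom = Math.max(aKids.length, bKids.length) || 1;
  score += 0.4 * (matching.length / denom);
  return score;
}
```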
7. Responsive Design Inference
/*
* A screenshot shows one viewport. The AI must infer responsive behavior.
*
* Strategies:
* 1. If Figma file includes mobile + desktop frames → explicit
* 2. If only desktop screenshot → AI infers breakpoints
*
* Common responsive patterns the AI applies:
*
* Desktop (≥1024px) Tablet (768px) Mobile (<768px)
* ┌────┬────────────┐ ┌──────────────┐ ┌──────────┐
* │Side│ Main │ │ Main │ │ Main │
* │bar │ ┌──┐┌──┐ │ │ ┌──┐┌──┐ │ │ ┌──────┐ │
* │ │ │ ││ │ │ → │ │ ││ │ │ → │ │ │ │
* │ │ └──┘└──┘ │ │ └──┘└──┘ │ │ └──────┘ │
* │ │ ┌──┐┌──┐ │ │ ┌──┐┌──┐ │ │ ┌──────┐ │
* │ │ │ ││ │ │ │ │ ││ │ │ │ │ │ │
* └────┴────────────┘ └──────────────┘ │ └──────┘ │
* └──────────┘
* Sidebar hidden Sidebar collapsed Cards stack
* → hamburger menu to icons to single column
*/
function generateResponsiveCode(layout, components) {
const responsiveRules = [];
// Sidebar → hidden on mobile:
if (layout.hasSidebar) {
responsiveRules.push({
component: 'Sidebar',
desktop: 'block w-64',
mobile: 'hidden',
tablet: 'w-16', // Icon-only mode
mobileAlternative: 'hamburger-menu',
});
}
// Grid → fewer columns on smaller screens:
for (const comp of components) {
if (comp.layout?.display === 'grid') {
// gridTemplateColumns is a string like 'repeat(4, 1fr)' — parse out the count:
const cols = parseInt(comp.layout.gridTemplateColumns.match(/\d+/)?.[0] ?? '1', 10);
responsiveRules.push({
component: comp.name,
desktop: `grid-cols-${cols}`,
tablet: `grid-cols-${Math.ceil(cols / 2)}`,
mobile: 'grid-cols-1',
});
}
}
// Navigation → mobile hamburger:
if (layout.hasNavbar) {
responsiveRules.push({
component: 'Navigation',
desktop: 'flex gap-6 items-center',
mobile: 'hidden md:flex', // Hidden, shown via button
mobileAlternative: `
// Mobile hamburger menu:
const [isOpen, setIsOpen] = useState(false);
<>
<button className="md:hidden" onClick={() => setIsOpen(!isOpen)}>
<MenuIcon />
</button>
{isOpen && <MobileNav onClose={() => setIsOpen(false)} />}
</>
`,
});
}
return responsiveRules;
}
8. Iterative Refinement via Visual Diff
/*
* After generating code, render it and compare to the original:
*
* Original design Generated code render
* ┌─────────────┐ ┌─────────────┐
* │ │ │ │
* │ (design) │ diff │ (render) │
* │ │ ───► │ │
* │ │ │ │
* └─────────────┘ └─────────────┘
* │
* ▼
* Visual diff score: 85%
* Problem areas highlighted
* │
* ▼
* Feed diff back to LLM
* "Fix the spacing in the header
* and the card border radius"
*/
async function iterativeRefinement(originalImage, generatedCode, maxIterations = 3) {
let currentCode = generatedCode;
let similarity = 0; // Declared outside the loop so the final return can read it
for (let i = 0; i < maxIterations; i++) {
// Render the generated code to an image:
const renderedImage = await renderToScreenshot(currentCode);
// Compute visual similarity:
similarity = await computeVisualSimilarity(
originalImage, renderedImage
);
if (similarity > 0.95) {
return { code: currentCode, similarity, iterations: i };
}
// Find specific differences:
const diffRegions = await highlightDifferences(
originalImage, renderedImage
);
// Ask the LLM to fix specific issues:
const fixPrompt = buildFixPrompt(currentCode, diffRegions);
currentCode = await callLLM(fixPrompt);
}
return { code: currentCode, similarity, iterations: maxIterations };
}
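The `computeVisualSimilarity` call above is assumed. A dependency-free sketch can operate on raw RGBA pixel buffers (e.g. a headless-browser screenshot decoded with a library such as pngjs; the decode step is omitted here). It counts pixels whose summed RGB delta falls under a tolerance:

```javascript
// Fraction of pixels in two same-sized RGBA buffers that match within
// a per-pixel RGB tolerance. Returns a similarity score in [0, 1].
function pixelSimilarity(pixelsA, pixelsB, threshold = 30) {
  if (pixelsA.length !== pixelsB.length) return 0;
  if (pixelsA.length === 0) return 1;
  let matching = 0;
  const pixelCount = pixelsA.length / 4; // RGBA = 4 bytes per pixel
  for (let i = 0; i < pixelsA.length; i += 4) {
    const delta =
      Math.abs(pixelsA[i] - pixelsB[i]) +         // R
      Math.abs(pixelsA[i + 1] - pixelsB[i + 1]) + // G
      Math.abs(pixelsA[i + 2] - pixelsB[i + 2]);  // B
    if (delta <= threshold) matching++;
  }
  return matching / pixelCount;
}
```

Production tools typically go further (structural similarity, region-level diffs) so the LLM can be told *where* the render diverges, not just by how much.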
function buildFixPrompt(code, diffRegions) {
return `The generated UI doesn't match the design. Fix these specific issues:
${diffRegions.map(region => `
Region: ${region.area} (${region.description})
Issue: ${region.issue}
Expected: ${region.expected}
Current: ${region.actual}
`).join('\n')}
Current code:
\`\`\`tsx
${code}
\`\`\`
Fix the code to match the design more closely. Only change what's needed.`;
}
9. Handling Interactive Behavior
/*
* Screenshots are static. How the AI infers interactivity:
*
* 1. Element type → default behavior:
* Button → onClick handler (placeholder)
* Form → onSubmit with validation
* Link → navigation (href)
* Input → onChange with state
* Toggle → boolean state
* Dropdown → open/close state
* Tab → active tab state
*
* 2. Visual cues → behavior:
* Hover shadow on card → clickable, cursor-pointer
* Search icon in input → search functionality
* X button → close/dismiss
* Arrow icon → expandable/collapsible
* Heart icon → like/favorite toggle
* Pagination dots → carousel/slider
*
* 3. Context from multiple screens:
* If user provides before/after screenshots,
* AI infers state transitions.
*/
function inferInteractiveBehavior(element, context) {
const behaviors = [];
// Button behavior:
if (element.type === 'button') {
const text = element.text?.toLowerCase() || '';
if (text.includes('submit') || text.includes('save')) {
behaviors.push({
event: 'onClick',
action: 'submitForm',
code: `
const handleSubmit = async () => {
setLoading(true);
try {
await onSubmit(formData);
} catch (error) {
setError(error.message);
} finally {
setLoading(false);
}
};
`,
});
} else if (text.includes('delete') || text.includes('remove')) {
behaviors.push({
event: 'onClick',
action: 'confirmDelete',
code: `
const handleDelete = () => {
if (window.confirm('Are you sure?')) {
onDelete(id);
}
};
`,
});
}
}
// Form behavior:
if (element.type === 'form' || hasFormChildren(element)) {
behaviors.push({
type: 'form-state',
code: `
const [formData, setFormData] = useState({
${element.children
.filter(c => c.type === 'input')
.map(c => `${c.name}: ''`)
.join(',\n ')}
});
const handleChange = (e) => {
setFormData(prev => ({
...prev,
[e.target.name]: e.target.value,
}));
};
`,
});
}
// Tab/navigation behavior:
if (element.children?.some(c => c.name?.includes('tab'))) {
behaviors.push({
type: 'tab-state',
code: `
const [activeTab, setActiveTab] = useState(0);
`,
});
}
return behaviors;
}
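`inferInteractiveBehavior` calls a `hasFormChildren` helper that is not defined in the post. A plausible sketch: any descendant of type `'input'` makes the element form-like:

```javascript
// Recursively check whether a UI-tree element contains any input descendants.
function hasFormChildren(element) {
  if (!element.children) return false;
  return element.children.some(
    (child) => child.type === 'input' || hasFormChildren(child)
  );
}
```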
10. Multi-Screen and Design System Awareness
/*
* Advanced design-to-code handles full design systems:
*
* 1. Process all screens in the Figma file
* 2. Extract shared components across screens
* 3. Generate a component library + pages
*
* Figma File
* ├── Components (shared)
* │ ├── Button (Primary, Secondary, Ghost)
* │ ├── Input (Text, Email, Password)
* │ ├── Card (Product, User, Stats)
* │ └── Modal (Confirm, Form, Alert)
* │
* ├── Pages
* │ ├── Dashboard
* │ ├── Settings
* │ ├── User Profile
* │ └── Product List
* │
* └──▶ Generated Code
* ├── components/
* │ ├── ui/
* │ │ ├── Button.tsx
* │ │ ├── Input.tsx
* │ │ └── Card.tsx
* │ └── layout/
* │ ├── Header.tsx
* │ └── Sidebar.tsx
* ├── pages/
* │ ├── Dashboard.tsx
* │ ├── Settings.tsx
* │ └── UserProfile.tsx
* └── styles/
* └── tokens.css (design tokens)
*/
function generateDesignSystem(figmaComponents) {
// Extract variants of each component:
const componentVariants = {};
for (const component of figmaComponents) {
const baseName = component.name.split('/')[0]; // "Button/Primary" → "Button"
const variant = component.name.split('/')[1] || 'default';
if (!componentVariants[baseName]) {
componentVariants[baseName] = [];
}
componentVariants[baseName].push({
variant,
styles: component.styles,
children: component.children,
});
}
// Generate component with variants:
const generatedComponents = {};
for (const [name, variants] of Object.entries(componentVariants)) {
generatedComponents[name] = {
props: ['variant', ...extractCommonProps(variants)],
variants: variants.map(v => ({
name: v.variant,
className: stylesToTailwind(v.styles),
})),
template: `
interface ${name}Props {
variant?: ${variants.map(v => `'${v.variant}'`).join(' | ')};
children?: React.ReactNode;
// ... extracted props
}
export function ${name}({ variant = '${variants[0].variant}', ...props }: ${name}Props) {
const baseStyles = ''; // TODO: classes shared by all variants
const variantStyles = {
${variants.map(v => `'${v.variant}': '${stylesToTailwind(v.styles)}'`).join(',\n ')}
};
return (
<${inferHTMLElement(name)}
className={\`\${baseStyles} \${variantStyles[variant]}\`}
{...props}
/>
);
}`,
};
}
return generatedComponents;
}
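`generateDesignSystem` assumes a `stylesToTailwind` mapper. A minimal sketch covering a few properties follows; it leans on Tailwind's arbitrary-value syntax (`bg-[#hex]`, `text-[14px]`), whereas real tools snap values to the nearest entry on the configured Tailwind scale:

```javascript
// Map a small subset of extracted style properties to Tailwind classes.
// Uses arbitrary-value syntax rather than scale snapping for brevity.
function stylesToTailwind(styles = {}) {
  const classes = [];
  if (styles.backgroundColor) classes.push(`bg-[${styles.backgroundColor}]`);
  if (styles.fontSize) classes.push(`text-[${styles.fontSize}px]`);
  if (styles.fontWeight >= 700) classes.push('font-bold');
  if (styles.borderRadius) classes.push(`rounded-[${styles.borderRadius}px]`);
  if (styles.gap) classes.push(`gap-[${styles.gap}px]`);
  return classes.join(' ');
}
```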
Trade-offs & Considerations
| Aspect | Screenshot-to-Code | Figma API | AI + Manual |
|---|---|---|---|
| Input quality | Lossy (pixel-level) | Structured (exact values) | Both available |
| Layout accuracy | ~80-90% | ~95-99% | Production-ready |
| Responsive | Must infer | Can read constraints | Explicitly designed |
| Design tokens | Approximated | Exact from Figma | Curated |
| Component reuse | Heuristic detection | Figma components known | Architect decides |
| Interactive behavior | Inferred from visuals | Inferred + prototypes | Specified |
| Speed | Seconds | Minutes (API calls) | Hours (with review) |
Best Practices
- Use the Figma API when available — structured data produces far better code than screenshots. Screenshots lose precision (exact colors, spacing, font sizes); Figma's API provides exact design tokens, auto-layout information (which maps directly to flexbox), component instances, and variant properties. Always prefer the API over screenshots for production work.
- Extract design tokens first, then generate components — before generating any component code, extract all colors, font sizes, spacing values, and border radii from the design and map them to Tailwind config or CSS custom properties. This ensures all generated components reference tokens instead of hardcoded values, making future design changes a config update rather than a code rewrite.
- Use iterative visual diffing to refine generated code — render the generated code to a screenshot, compute a visual diff against the original design, and feed specific discrepancies back to the AI for targeted fixes. Two to three iterations typically bring accuracy from 80% to 95%+; automate this loop rather than comparing manually.
- Always generate semantic HTML and ARIA attributes, not just visual stubs — the AI should produce <nav>, <button>, and <form>, not <div onClick> everywhere. Include alt text on images, role attributes where needed, and keyboard event handlers alongside mouse events; visual output should be accessible from the start.
- Treat generated code as a starting point — review component boundaries, add interaction logic, and test responsiveness. AI-generated code handles layout and styling well but lacks business logic, real API integration, form validation, and edge-case handling. Review the component decomposition, add proper state management, connect real data sources, and test across viewport sizes before shipping.
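The "tokens first" practice can be mechanized: emit the extracted tokens as CSS custom properties so generated components never hardcode raw values. The token shape below mirrors the `extractDesignTokens` output sketched earlier (a Map of colors, a Set of spacing values) and is an assumption:

```javascript
// Serialize extracted design tokens as CSS custom properties on :root.
function tokensToCSS(tokens) {
  const lines = [':root {'];
  let i = 0;
  for (const [hex] of tokens.colors) {        // Map of hex → usage info
    lines.push(`  --color-${i++}: ${hex};`);
  }
  [...tokens.spacing].sort((a, b) => a - b).forEach((value, idx) => {
    lines.push(`  --space-${idx}: ${value}px;`);
  });
  lines.push('}');
  return lines.join('\n');
}
```

A real pipeline would also assign semantic names (`--color-primary`) by usage frequency rather than numeric indices.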
Conclusion
AI design-to-code pipelines take visual input (screenshots via vision models, or Figma designs via structured API) and transform them into component trees with layout, styling, and structure. The pipeline involves: visual understanding (detecting elements, inferring layouts), structured analysis (extracting colors, typography, spacing as design tokens), layout inference (converting spatial positions to CSS Flexbox/Grid), component decomposition (identifying repeated patterns as reusable components, semantic sections as layout components), responsive inference (applying mobile-first breakpoints to sidebar, grid, navigation patterns), and behavior inference (mapping element types and visual cues to interactive state). Figma's structured API produces significantly better results than screenshots because it provides exact values for every design decision. The iterative visual diff loop — render code, compare to design, fix specific discrepancies — typically converges to 95%+ visual accuracy in 2-3 iterations. Generated code handles layout and styling well but always requires human review for component boundaries, state management, API integration, form validation, accessibility, and responsive edge cases.