AI Form Intelligence: NLP-Powered Data Extraction and Smart Forms
Real-World Problem Context
An insurance company's quote request form has 35 fields across 4 steps. The average completion rate is 23%: users abandon at Step 2 when asked for vehicle details (VIN, make, model, year, mileage). Customer support receives 200+ daily calls from users who start online but can't complete the form.
The competitor's form has a single text box, "Tell us about your vehicle", and uses NLP to extract structured data from natural language input. A healthcare portal asks patients to describe symptoms in free text, then extracts structured medical codes. An expense reporting tool lets users photograph receipts and auto-populates merchant, amount, date, and category fields.
The frontend team integrates AI at four points: (1) natural language input parsing, which converts free-text descriptions into structured form data, (2) intelligent autocomplete that predicts field values from partial input and context, (3) document/image entity extraction using OCR + NLP to populate forms from uploaded files, and (4) real-time validation that understands semantic correctness beyond format checks (e.g., "the mileage seems unusually low for a 2015 vehicle"). This post covers how each mechanism works.
Problem Statements
- Natural Language to Structured Data: How do you parse a free-text input like "2019 blue Honda Civic with 45k miles" into structured fields (year: 2019, color: blue, make: Honda, model: Civic, mileage: 45000)? How do you handle ambiguity, abbreviations, and missing data?
- Context-Aware Autocomplete: How does AI predict the next field value based on previously entered data? How do you implement autocomplete that understands relationships between fields (entering a ZIP code pre-fills city and state, entering a VIN pre-fills make/model/year)?
- Document Entity Extraction: How does a frontend application extract structured data from photos of documents (receipts, IDs, insurance cards) using OCR + NLP? How do you handle poor image quality, varied layouts, and confidence scoring?
Deep Dive: Internal Mechanisms
1. Natural Language Input Parsing
/*
* Converting free text → structured data using NLP:
*
* Input: "I have a 2019 blue Honda Civic EX with about 45k miles,
* bought it in March. Clean title."
*
* Output:
* {
* year: 2019,
* color: "blue",
* make: "Honda",
* model: "Civic",
* trim: "EX",
* mileage: 45000,
* purchaseMonth: "March",
* titleStatus: "clean"
* }
*
* Architecture:
*
* ┌──────────────────────────────────────────────────┐
* │  Free text input                                 │
* │           │                                      │
* │           ▼                                      │
* │  ┌─────────────────────┐                         │
* │  │ 1. Entity extraction│  LLM with structured    │
* │  │    (NER via LLM)    │  output format          │
* │  └────────┬────────────┘                         │
* │           │                                      │
* │           ▼                                      │
* │  ┌─────────────────────┐                         │
* │  │ 2. Validation       │  Check extracted values │
* │  │    & normalization  │  against known data     │
* │  └────────┬────────────┘                         │
* │           │                                      │
* │           ▼                                      │
* │  ┌─────────────────────┐                         │
* │  │ 3. Confidence       │  Per-field confidence   │
* │  │    scoring          │  for UI treatment       │
* │  └────────┬────────────┘                         │
* │           │                                      │
* │           ▼                                      │
* │  ┌─────────────────────┐                         │
* │  │ 4. Form population  │  Fill fields, highlight │
* │  │    with review UI   │  uncertain values       │
* │  └─────────────────────┘                         │
* └──────────────────────────────────────────────────┘
*/
async function parseNaturalLanguageInput(text, formSchema) {
// Build the extraction prompt from the form schema:
const fieldDescriptions = formSchema.fields.map(f =>
`- ${f.name} (${f.type}): ${f.description}${f.enum ? ` [options: ${f.enum.join(', ')}]` : ''}${f.pattern ? ` [format: ${f.pattern}]` : ''}`
).join('\n');
const prompt = `Extract structured data from this free-text input.
FORM FIELDS TO EXTRACT:
${fieldDescriptions}
USER INPUT:
"${text}"
RULES:
1. Only extract values explicitly stated or clearly implied
2. For ambiguous values, provide both the extracted value and a note
3. Normalize values: "45k" → 45000, "blue" → "Blue", abbreviations expanded
4. If a field cannot be determined from the input, set it to null
5. Include a confidence score (0-1) for each extracted value
Return JSON:
{
"fields": {
"<fieldName>": {
"value": <extracted value or null>,
"confidence": <0.0-1.0>,
"source": "<exact text span that implies this value>",
"note": "<optional note about ambiguity>"
}
},
"unmappedText": "<any input text that didn't map to a field>"
}`;
const result = JSON.parse(await callLLM(prompt, {
model: 'gpt-4o-mini',
temperature: 0,
maxTokens: 500,
response_format: { type: 'json_object' },
}));
// Post-process: validate and normalize each field:
for (const [fieldName, extraction] of Object.entries(result.fields)) {
const fieldDef = formSchema.fields.find(f => f.name === fieldName);
if (!fieldDef || extraction.value === null) continue;
// Validate against field constraints:
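const validation = validateExtractedValue(extraction.value, fieldDef);
extraction.valid = validation.valid;
extraction.validationError = validation.error;
// Apply a fuzzy-matched correction (e.g., a misspelled enum option) so it is not silently dropped:
if (validation.correctedValue !== undefined) {
extraction.value = validation.correctedValue;
}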
// Adjust confidence based on validation:
if (!validation.valid) {
extraction.confidence *= 0.5;
}
}
return result;
}
function validateExtractedValue(value, fieldDef) {
// Type validation:
if (fieldDef.type === 'number') {
const num = Number(value);
if (isNaN(num)) return { valid: false, error: 'Not a valid number' };
if (fieldDef.min !== undefined && num < fieldDef.min)
return { valid: false, error: `Below minimum (${fieldDef.min})` };
if (fieldDef.max !== undefined && num > fieldDef.max)
return { valid: false, error: `Above maximum (${fieldDef.max})` };
}
// Enum validation:
if (fieldDef.enum) {
const match = fieldDef.enum.find(e =>
e.toLowerCase() === String(value).toLowerCase()
);
if (!match) {
// Fuzzy match:
const closest = findClosestMatch(String(value), fieldDef.enum);
if (closest && closest.distance < 3) {
return { valid: true, correctedValue: closest.value };
}
return { valid: false, error: `Not a valid option` };
}
}
// Pattern validation:
if (fieldDef.pattern) {
const regex = new RegExp(fieldDef.pattern);
if (!regex.test(String(value))) {
return { valid: false, error: `Doesn't match expected format` };
}
}
return { valid: true };
}
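The enum check above calls `findClosestMatch`, which the post does not define. A minimal sketch follows, assuming it should return the nearest option by Levenshtein edit distance in the `{ value, distance }` shape that `validateExtractedValue` reads. The `levenshtein` helper and the case-insensitive comparison are assumptions made for illustration, not the author's implementation.

```javascript
// Classic Levenshtein edit distance, computed with a single rolling row.
function levenshtein(a, b) {
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diagonal = row[0]; // D[i-1][j-1]
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j]; // D[i-1][j]
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      row[j] = Math.min(above + 1, row[j - 1] + 1, diagonal + cost);
      diagonal = above;
    }
  }
  return row[b.length];
}

// Hypothetical helper: returns the closest option and its distance,
// or null when there are no options to compare against.
function findClosestMatch(value, options) {
  const needle = value.trim().toLowerCase();
  let best = null;
  for (const option of options) {
    const distance = levenshtein(needle, option.toLowerCase());
    if (best === null || distance < best.distance) {
      best = { value: option, distance };
    }
  }
  return best;
}

console.log(findClosestMatch('Hunda', ['Honda', 'Toyota', 'Ford']));
```

An absolute cutoff such as `distance < 3` is loose for short options: "BMW" and "GMC" are only two edits apart, so a threshold relative to the option length may be worth considering.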
2. Smart Form Component with NLP Input
/*
* The UI presents BOTH a free-text input AND traditional fields.
* Users can:
* 1. Type naturally → fields auto-populate
* 2. Review & correct auto-populated fields
* 3. Manually fill any fields the AI missed
*/
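// Assumed imports: React's useState, plus useDebouncedCallback from the use-debounce package.
import { useState } from 'react';
import { useDebouncedCallback } from 'use-debounce';
function SmartForm({ schema, onSubmit }) {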
const [formData, setFormData] = useState({});
const [extractions, setExtractions] = useState({});
const [nlpInput, setNlpInput] = useState('');
const [parsing, setParsing] = useState(false);
const handleNlpInput = useDebouncedCallback(async (text) => {
if (text.length < 10) return; // Too short to extract
setParsing(true);
try {
const result = await parseNaturalLanguageInput(text, schema);
setExtractions(result.fields);
// Auto-fill fields with high confidence. A functional update avoids
// overwriting edits made while the request was in flight (stale closure):
setFormData(prev => {
const newData = { ...prev };
for (const [field, extraction] of Object.entries(result.fields)) {
if (extraction.value !== null && extraction.confidence >= 0.8) {
// Prefer a fuzzy-corrected value when validation supplied one:
newData[field] = extraction.correctedValue ?? extraction.value;
}
}
return newData;
});
} finally {
setParsing(false);
}
}, 500);
return (
<form onSubmit={(e) => { e.preventDefault(); onSubmit(formData); }}>
{/* Natural language input */}
<div className="nlp-input-section">
<label htmlFor="nlp-input">
Describe in your own words:
</label>
<textarea
id="nlp-input"
value={nlpInput}
onChange={(e) => {
setNlpInput(e.target.value);
handleNlpInput(e.target.value);
}}
placeholder="e.g., I have a 2019 blue Honda Civic with 45k miles..."
rows={3}
/>
{parsing && <span className="parsing-indicator">Analyzing...</span>}
</div>
<hr />
{/* Traditional form fields with AI annotations */}
{schema.fields.map(field => {
const extraction = extractions[field.name];
const isAiFilled = extraction?.value !== null &&
extraction?.confidence >= 0.8;
return (
<FormField
key={field.name}
field={field}
value={formData[field.name] ?? ''}
onChange={(value) => setFormData(prev => ({
...prev, [field.name]: value
}))}
extraction={extraction}
isAiFilled={isAiFilled}
/>
);
})}
<button type="submit">Submit</button>
</form>
);
}
function FormField({ field, value, onChange, extraction, isAiFilled }) {
return (
<div className={`form-field ${isAiFilled ? 'ai-filled' : ''}`}>
<label htmlFor={field.name}>
{field.label}
{isAiFilled && (
<span className="ai-badge" title={`Extracted from: "${extraction.source}"`}>
AI-filled ({Math.round(extraction.confidence * 100)}%)
</span>
)}
</label>
<input
id={field.name}
type={field.inputType || 'text'}
value={value}
onChange={(e) => onChange(e.target.value)}
className={isAiFilled ? 'ai-highlight' : ''}
/>
{/* Show low-confidence extraction as suggestion */}
{extraction?.value !== null && extraction?.confidence < 0.8 &&
extraction?.confidence >= 0.4 && !value && (
<div className="ai-suggestion">
<span>Suggested: {extraction.value}</span>
<button
type="button"
onClick={() => onChange(extraction.value)}
>
Use this
</button>
</div>
)}
{extraction?.validationError && (
<div className="field-warning">
{extraction.validationError}
</div>
)}
</div>
);
}
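The two thresholds used above (0.8 for auto-fill, 0.4 for a suggestion) are spread across `SmartForm` and `FormField`. The sketch below collects them into one pure function that is easier to unit test. The name `triageExtractions` and the rule that a value the user already typed is never overwritten are assumptions, not part of the original component.

```javascript
// Thresholds mirror the values hard-coded in SmartForm and FormField.
const AUTO_FILL_THRESHOLD = 0.8;
const SUGGEST_THRESHOLD = 0.4;

// Hypothetical helper: split extractions into values to fill immediately
// and values to offer as "Use this" suggestions.
function triageExtractions(fields, currentFormData = {}) {
  const autoFill = {};
  const suggestions = {};
  for (const [name, extraction] of Object.entries(fields)) {
    if (extraction == null || extraction.value == null) continue;
    const current = currentFormData[name];
    const userHasValue = current !== undefined && current !== null && current !== '';
    if (userHasValue) continue; // assumption: never overwrite user input
    if (extraction.confidence >= AUTO_FILL_THRESHOLD) {
      autoFill[name] = extraction.value;
    } else if (extraction.confidence >= SUGGEST_THRESHOLD) {
      suggestions[name] = extraction.value;
    }
  }
  return { autoFill, suggestions };
}

const triaged = triageExtractions(
  {
    year: { value: 2019, confidence: 0.95 },
    color: { value: 'blue', confidence: 0.6 },
    trim: { value: 'EX', confidence: 0.2 },
    make: { value: null, confidence: 0 },
    model: { value: 'Civic', confidence: 0.9 },
  },
  { model: 'Accord' } // the user already typed a model
);
console.log(triaged);
```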
3. Context-Aware Field Autocomplete
/*
* Traditional autocomplete: match input against a static list.
* AI autocomplete: predict the value based on OTHER field values.
*
* Examples:
* - User enters ZIP "94105" → auto-fill city "San Francisco", state "CA"
* - User selects make "Toyota" → model dropdown shows Toyota models
* - User enters DOB → auto-calculate age, suggest insurance tier
* - User enters job title "Senior SWE" → suggest salary range
*
* For cross-field prediction, the AI uses ALL current form values
* as context.
*/
class ContextualAutocomplete {
constructor(formSchema) {
this.schema = formSchema;
this.lookupCache = new Map();
// Define field relationships:
this.relationships = {
zipCode: { triggers: ['city', 'state', 'county'] },
make: { triggers: ['model', 'year'] },
vin: { triggers: ['make', 'model', 'year', 'trim', 'engineType'] },
dateOfBirth: { triggers: ['age'] },
};
// Deterministic lookups (no AI needed):
this.deterministicLookups = {
zipCode: this.lookupZipCode.bind(this),
vin: this.decodeVIN.bind(this),
};
}
async onFieldChange(fieldName, value, currentFormData) {
const suggestions = {};
// 1. Check deterministic lookups first:
if (this.deterministicLookups[fieldName]) {
const result = await this.deterministicLookups[fieldName](value);
if (result) {
Object.assign(suggestions, result);
}
}
// 2. Check if this field triggers other fields:
const triggered = this.relationships[fieldName]?.triggers || [];
const emptyTriggered = triggered.filter(f => !currentFormData[f]);
if (emptyTriggered.length > 0 && !this.deterministicLookups[fieldName]) {
// Use AI for non-deterministic predictions:
const aiSuggestions = await this.predictFields(
emptyTriggered,
{ ...currentFormData, [fieldName]: value }
);
Object.assign(suggestions, aiSuggestions);
}
return suggestions;
}
async lookupZipCode(zip) {
if (!/^\d{5}$/.test(zip)) return null;
// Use a ZIP code API (deterministic, no AI):
const cached = this.lookupCache.get(`zip:${zip}`);
if (cached) return cached;
const response = await fetch(`/api/lookup/zip/${zip}`);
if (!response.ok) return null;
const data = await response.json();
const result = {
city: { value: data.city, confidence: 1.0 },
state: { value: data.state, confidence: 1.0 },
county: { value: data.county, confidence: 0.95 },
};
this.lookupCache.set(`zip:${zip}`, result);
return result;
}
async decodeVIN(vin) {
if (vin.length !== 17) return null;
// NHTSA VIN decoder (deterministic API):
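const response = await fetch(
`https://vpic.nhtsa.dot.gov/api/vehicles/DecodeVin/${encodeURIComponent(vin)}?format=json`
);
// Mirror lookupZipCode: treat a failed request as "no suggestion":
if (!response.ok) return null;
const data = await response.json();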
const getValue = (variableId) =>
data.Results?.find(r => r.VariableId === variableId)?.Value;
return {
make: { value: getValue(26), confidence: 1.0 },
model: { value: getValue(28), confidence: 1.0 },
year: { value: getValue(29), confidence: 1.0 },
trim: { value: getValue(38), confidence: 0.9 },
engineType: { value: getValue(71), confidence: 0.9 },
};
}
async predictFields(fieldsToPredict, currentFormData) {
// AI prediction for non-deterministic relationships:
const context = Object.entries(currentFormData)
.filter(([, v]) => v !== null && v !== undefined && v !== '')
.map(([k, v]) => `${k}: ${v}`)
.join('\n');
const prompt = `Given these form values, predict the most likely values for the empty fields.
CURRENT VALUES:
${context}
PREDICT THESE FIELDS:
${fieldsToPredict.map(f => {
const fieldDef = this.schema.fields.find(fd => fd.name === f);
return `- ${f}: ${fieldDef?.description || ''}${fieldDef?.enum ? ` [options: ${fieldDef.enum.join(', ')}]` : ''}`;
}).join('\n')}
Return JSON: { "<field>": { "value": ..., "confidence": 0-1, "reasoning": "..." } }
Only predict if you're reasonably confident. Set confidence to 0 if uncertain.`;
const result = JSON.parse(await callLLM(prompt, {
model: 'gpt-4o-mini',
temperature: 0,
maxTokens: 300,
response_format: { type: 'json_object' }, // same JSON mode as parseNaturalLanguageInput
}));
// Filter out low-confidence predictions:
for (const [field, prediction] of Object.entries(result)) {
if (prediction.confidence < 0.5) {
delete result[field];
}
}
return result;
}
}
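`decodeVIN` only checks the length before calling the decoder. A cheap deterministic pre-check is the VIN check digit (position 9), sketched below with the standard transliteration table and position weights. The function name `hasValidVinCheckDigit` is hypothetical. The check digit is mandated for North American VINs, so treating a failure as a hard error for vehicles from other markets would be an assumption worth revisiting.

```javascript
// Letter values used by the VIN check-digit calculation (I, O, Q are not allowed).
const VIN_VALUES = {
  A: 1, B: 2, C: 3, D: 4, E: 5, F: 6, G: 7, H: 8,
  J: 1, K: 2, L: 3, M: 4, N: 5, P: 7, R: 9,
  S: 2, T: 3, U: 4, V: 5, W: 6, X: 7, Y: 8, Z: 9,
};
// Weight per position; position 9 (index 8) is the check digit itself.
const VIN_WEIGHTS = [8, 7, 6, 5, 4, 3, 2, 10, 0, 9, 8, 7, 6, 5, 4, 3, 2];

// Hypothetical helper: true when the 9th character matches the computed check digit.
function hasValidVinCheckDigit(vin) {
  const normalized = String(vin).trim().toUpperCase();
  if (!/^[A-HJ-NPR-Z0-9]{17}$/.test(normalized)) return false;
  let sum = 0;
  for (let i = 0; i < 17; i++) {
    const ch = normalized[i];
    const value = ch >= '0' && ch <= '9' ? Number(ch) : VIN_VALUES[ch];
    sum += value * VIN_WEIGHTS[i];
  }
  const remainder = sum % 11;
  const expected = remainder === 10 ? 'X' : String(remainder);
  return normalized[8] === expected;
}

console.log(hasValidVinCheckDigit('1M8GDM9AXKP042788'));
```

Running this before the network call lets the form flag a likely typo immediately instead of waiting for the decoder response.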
4. Document OCR and Entity Extraction
/*
* Extracting data from photographed documents:
*
* ┌──────────────────────────────────────────────────┐
* │  User uploads photo of receipt                   │
* │        │                                         │
* │        ▼                                         │
* │  1. Image preprocessing                          │
* │     ├─ Deskew (correct rotation)                 │
* │     ├─ Contrast enhancement                      │
* │     └─ Crop to document bounds                   │
* │        │                                         │
* │        ▼                                         │
* │  2. OCR (Tesseract.js or Cloud Vision API)       │
* │     ├─ Extract text with bounding boxes          │
* │     └─ Per-word confidence scores                │
* │        │                                         │
* │        ▼                                         │
* │  3. Entity extraction (LLM)                      │
* │     ├─ Map OCR text to form fields               │
* │     ├─ Handle OCR errors ("$l2.50" → "$12.50")   │
* │     └─ Per-field confidence                      │
* │        │                                         │
* │        ▼                                         │
* │  4. Form population with visual verification     │
* │     └─ Show extracted values overlaid on image   │
* └──────────────────────────────────────────────────┘
*/
class DocumentExtractor {
constructor() {
this.ocrWorker = null;
}
async init() {
const { createWorker } = await import('tesseract.js');
this.ocrWorker = await createWorker('eng');
}
async extractFromImage(imageFile, documentType) {
// 1. Preprocess image:
const processedImage = await this.preprocessImage(imageFile);
// 2. Run OCR:
const ocrResult = await this.runOCR(processedImage);
// 3. Extract entities based on document type:
const entities = await this.extractEntities(
ocrResult, documentType
);
return {
rawText: ocrResult.text,
entities,
ocrConfidence: ocrResult.confidence,
boundingBoxes: ocrResult.words,
};
}
async preprocessImage(imageFile) {
// Use Canvas for client-side image preprocessing:
const img = await loadImage(imageFile);
const canvas = document.createElement('canvas');
const ctx = canvas.getContext('2d');
canvas.width = img.width;
canvas.height = img.height;
ctx.drawImage(img, 0, 0);
// Apply contrast enhancement:
const imageData = ctx.getImageData(0, 0, canvas.width, canvas.height);
const data = imageData.data;
// Simple contrast stretch:
let min = 255, max = 0;
for (let i = 0; i < data.length; i += 4) {
const gray = (data[i] + data[i+1] + data[i+2]) / 3;
if (gray < min) min = gray;
if (gray > max) max = gray;
}
const range = max - min || 1;
for (let i = 0; i < data.length; i += 4) {
data[i] = ((data[i] - min) / range) * 255;
data[i+1] = ((data[i+1] - min) / range) * 255;
data[i+2] = ((data[i+2] - min) / range) * 255;
}
ctx.putImageData(imageData, 0, 0);
return new Promise(resolve =>
canvas.toBlob(resolve, 'image/png')
);
}
async runOCR(imageBlob) {
const { data } = await this.ocrWorker.recognize(imageBlob);
return {
text: data.text,
confidence: data.confidence,
words: data.words.map(w => ({
text: w.text,
confidence: w.confidence,
bbox: w.bbox, // { x0, y0, x1, y1 }
})),
lines: data.lines.map(l => ({
text: l.text,
confidence: l.confidence,
bbox: l.bbox,
})),
};
}
async extractEntities(ocrResult, documentType) {
const extractionPrompts = {
receipt: `Extract from this receipt OCR text:
- merchant: store/restaurant name
- date: transaction date (normalize to YYYY-MM-DD)
- total: total amount (number only, e.g., 42.50)
- subtotal: subtotal before tax (if visible)
- tax: tax amount (if visible)
- paymentMethod: payment method (if visible)
- items: array of { name, quantity, price } (if visible)
OCR may have errors. Common OCR mistakes:
- "l" ↔ "1", "O" ↔ "0", "S" ↔ "5", "$" ↔ "S"
- Spaces in numbers: "1 2.50" → "12.50"
- Missing decimal: "1250" for a restaurant bill likely means "$12.50"`,
id_card: `Extract from this ID card OCR text:
- fullName: full name
- dateOfBirth: date of birth (YYYY-MM-DD)
- address: full address
- idNumber: ID/license number
- expirationDate: expiration date (YYYY-MM-DD)
- state: issuing state`,
insurance_card: `Extract from this insurance card OCR text:
- insurerName: insurance company name
- policyNumber: policy number
- groupNumber: group number
- memberId: member ID
- memberName: member name
- effectiveDate: effective date
- copay: copay amounts`,
};
const prompt = `${extractionPrompts[documentType] || 'Extract all structured data.'}
OCR TEXT (may contain errors):
"${ocrResult.text}"
OCR CONFIDENCE: ${ocrResult.confidence.toFixed(1)}%
Return JSON with each field having:
{ "value": <extracted>, "confidence": <0-1>, "ocrSource": "<exact OCR text>" }
Set value to null if not found.`;
return JSON.parse(await callLLM(prompt, {
model: 'gpt-4o-mini',
temperature: 0,
maxTokens: 600,
}));
}
}
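Several of the OCR confusions listed in the receipt prompt ("l" for "1", "O" for "0", "S" for "$", stray spaces inside numbers) can also be repaired deterministically once a token is known to be a money amount. The sketch below is one heuristic way to do that. `normalizeOcrAmount` and its substitution table are assumptions for illustration, and it should only be applied to tokens already identified as amounts, since it would mangle ordinary words.

```javascript
// Letters that OCR commonly produces in place of digits.
const OCR_DIGIT_FIXES = { l: '1', I: '1', O: '0', o: '0', S: '5', B: '8' };

// Hypothetical helper: returns a number, or null when the token
// still does not look like an amount after the repairs.
function normalizeOcrAmount(raw) {
  if (typeof raw !== 'string') return null;
  const cleaned = raw
    .trim()
    .replace(/^[$S€£]/, '')                 // leading currency symbol; "S" is a common misread of "$"
    .replace(/\s+/g, '')                     // "1 2.50" → "12.50"
    .replace(/[lIOoSB]/g, ch => OCR_DIGIT_FIXES[ch])
    .replace(/,(?=\d{3}(\D|$))/g, '');       // drop thousands separators
  if (!/^\d+(\.\d{1,2})?$/.test(cleaned)) return null;
  return Number(cleaned);
}

console.log(normalizeOcrAmount('$l2.50'));
console.log(normalizeOcrAmount('Thank you'));
```

A repaired amount can be passed to the LLM alongside the raw OCR text, or used afterwards to cross-check the `total` the model returns.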
5. Semantic Form Validation
/*
* Traditional validation: "Is this a valid email format?"
* Semantic validation: "Does this data make sense in context?"
*
* Examples:
* - Mileage: 5,000 on a 2015 vehicle → "Unusually low, please verify"
* - Salary: $500,000 for "Junior Developer" → "Seems high for this role"
* - DOB: makes person 150 years old → "Please check the year"
* - Address: ZIP doesn't match city/state → "ZIP code mismatch"
*/
class SemanticValidator {
constructor(formSchema) {
this.schema = formSchema;
// Deterministic cross-field rules:
this.rules = [
{
name: 'vehicle-mileage-year',
fields: ['mileage', 'year'],
validate: (data) => {
if (!data.mileage || !data.year) return null;
const vehicleAge = new Date().getFullYear() - data.year;
const avgMilesPerYear = data.mileage / Math.max(vehicleAge, 1);
if (avgMilesPerYear < 1000) {
return {
severity: 'warning',
field: 'mileage',
message: `${data.mileage.toLocaleString()} miles seems low for a ${data.year} vehicle (${Math.round(avgMilesPerYear).toLocaleString()} miles/year). Please verify.`,
};
}
if (avgMilesPerYear > 30000) {
return {
severity: 'warning',
field: 'mileage',
message: `${data.mileage.toLocaleString()} miles seems high for a ${data.year} vehicle (${Math.round(avgMilesPerYear).toLocaleString()} miles/year). Please verify.`,
};
}
return null;
},
},
{
name: 'age-from-dob',
fields: ['dateOfBirth'],
validate: (data) => {
if (!data.dateOfBirth) return null;
const age = calculateAge(data.dateOfBirth);
if (age < 0 || age > 120) {
return {
severity: 'error',
field: 'dateOfBirth',
message: `Calculated age is ${age}, which seems incorrect.`,
};
}
return null;
},
},
];
}
validate(formData) {
const results = [];
// Run deterministic rules:
for (const rule of this.rules) {
const result = rule.validate(formData);
if (result) {
results.push({ ...result, rule: rule.name });
}
}
return results;
}
// AI-powered validation for complex relationships:
async validateWithAI(formData, formContext) {
const filledFields = Object.entries(formData)
.filter(([, v]) => v !== null && v !== undefined && v !== '')
.map(([k, v]) => `${k}: ${v}`)
.join('\n');
const prompt = `Review these form values for semantic consistency.
This is a ${formContext.formType} form.
VALUES:
${filledFields}
Check for:
1. Values that seem inconsistent with each other
2. Values that seem unrealistic (but don't flag valid edge cases)
3. Potential data entry errors (transposed digits, wrong units)
Return JSON array of issues:
[{ "field": "...", "severity": "warning|error", "message": "...", "suggestion": "..." }]
Return empty array [] if everything looks consistent.
Only flag issues you're confident about.`;
const issues = JSON.parse(await callLLM(prompt, {
model: 'gpt-4o-mini',
temperature: 0,
maxTokens: 300,
}));
return issues;
}
}
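The `age-from-dob` rule relies on a `calculateAge` helper that the post does not show. A minimal sketch, assuming dates arrive as `YYYY-MM-DD` strings (the format the extraction prompts ask for); the optional `today` parameter is an addition to keep the function testable.

```javascript
// Hypothetical helper: whole years between a YYYY-MM-DD date of birth and `today`.
// Returns NaN for input that is not in YYYY-MM-DD form.
function calculateAge(dateOfBirth, today = new Date()) {
  const match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(String(dateOfBirth));
  if (!match) return NaN;
  const [year, month, day] = match.slice(1).map(Number);
  let age = today.getFullYear() - year;
  const currentMonth = today.getMonth() + 1; // getMonth() is zero-based
  const beforeBirthday =
    currentMonth < month || (currentMonth === month && today.getDate() < day);
  if (beforeBirthday) age -= 1;
  return age;
}

console.log(calculateAge('1990-06-15', new Date(2024, 5, 14)));
```

Parsing the string by hand avoids the timezone shift that `new Date('1990-06-15')` can introduce. Because comparisons against `NaN` are always false, the rule as written would let an unparseable date pass, so an explicit `Number.isNaN(age)` check in the rule may be worth adding.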
6. Address Parsing and Standardization
/*
* Address input is notoriously messy:
* - "123 Main St Apt 4B, SF CA 94105"
* - "123 Main Street, Apartment 4B, San Francisco, California 94105"
* - "123 Main, #4B, San Francisco 94105"
* All refer to the same address.
*
* AI parses free-form addresses into structured components.
*/
async function parseAddress(freeformAddress) {
const prompt = `Parse this US address into structured components.
INPUT: "${freeformAddress}"
Return JSON:
{
"streetNumber": "...",
"streetName": "...",
"streetType": "...", // St, Ave, Blvd, etc. (abbreviated)
"unit": "...", // Apt/Suite/Unit number or null
"city": "...",
"state": "...", // 2-letter abbreviation
"zipCode": "...", // 5-digit
"zipPlus4": "...", // 4-digit extension or null
"confidence": 0-1,
"standardized": "..." // USPS-standardized format
}
Standardization rules:
- Capitalize all letters
- Abbreviate street types: STREET→ST, AVENUE→AVE, BOULEVARD→BLVD
- State as 2-letter code
- Unit with # prefix if not specified`;
const parsed = JSON.parse(await callLLM(prompt, {
model: 'gpt-4o-mini',
temperature: 0,
maxTokens: 200,
}));
// Validate with USPS or geocoding API:
const validated = await validateAddress(parsed);
return {
...parsed,
validated: validated.valid,
suggestions: validated.suggestions,
};
}
// React address input component:
function SmartAddressInput({ value, onChange, onParsed }) {
const [inputMode, setInputMode] = useState('freeform'); // or 'structured'
const [parsing, setParsing] = useState(false);
const [parsedResult, setParsedResult] = useState(null);
const handleFreeformChange = useDebouncedCallback(async (text) => {
if (text.length < 10) return;
setParsing(true);
try {
const result = await parseAddress(text);
setParsedResult(result);
if (result.confidence >= 0.8) {
onChange(result);
onParsed?.(result);
}
} finally {
setParsing(false);
}
}, 500);
if (inputMode === 'freeform') {
return (
<div className="address-input">
<textarea
placeholder="Enter your full address..."
onChange={(e) => handleFreeformChange(e.target.value)}
rows={2}
/>
{parsing && <span>Parsing address...</span>}
{parsedResult && (
<div className="parsed-preview">
<p>Parsed as: <strong>{parsedResult.standardized}</strong></p>
{parsedResult.confidence < 0.9 && (
<button type="button" onClick={() => setInputMode('structured')}>
Correct manually
</button>
)}
</div>
)}
<button
type="button"
onClick={() => setInputMode('structured')}
className="link-button"
>
Enter address manually
</button>
</div>
);
}
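// Propagate an edit to a single address component back to the parent
// (controlled inputs without onChange would be read-only):
const updatePart = (key) => (e) => onChange({ ...value, [key]: e.target.value });
return (
<div className="address-structured">
{/* The combined street line is display-only here; it is derived from three parsed parts */}
<input placeholder="Street address" readOnly
value={value?.streetNumber ? `${value.streetNumber} ${value.streetName} ${value.streetType}` : ''} />
<input placeholder="Apt, suite, etc." value={value?.unit || ''} onChange={updatePart('unit')} />
<input placeholder="City" value={value?.city || ''} onChange={updatePart('city')} />
<input placeholder="State" value={value?.state || ''} onChange={updatePart('state')} />
<input placeholder="ZIP code" value={value?.zipCode || ''} onChange={updatePart('zipCode')} />
<button type="button" onClick={() => setInputMode('freeform')} className="link-button">
Type address freely
</button>
</div>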
}
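The standardization rules in the `parseAddress` prompt (uppercase, abbreviated street types, `#` prefix for bare unit numbers) are simple enough to enforce in code after the model responds, rather than trusting the `standardized` string it returns. The sketch below assumes the parsed object has the shape requested in the prompt. `formatStandardizedAddress` and the abbreviation table are illustrative and far from a complete USPS implementation.

```javascript
// A small, incomplete table of street-type abbreviations (illustrative only).
const STREET_TYPE_ABBREVIATIONS = {
  STREET: 'ST', AVENUE: 'AVE', BOULEVARD: 'BLVD',
  DRIVE: 'DR', ROAD: 'RD', LANE: 'LN', COURT: 'CT',
};

// Hypothetical helper: rebuild the one-line standardized form from parsed parts.
function formatStandardizedAddress(parsed) {
  const upper = (s) => String(s ?? '').trim().toUpperCase();
  const rawType = upper(parsed.streetType).replace(/\.$/, '');
  const streetType = STREET_TYPE_ABBREVIATIONS[rawType] ?? rawType;
  let unit = upper(parsed.unit);
  // Add a "#" prefix only when no unit designator was given:
  if (unit && !/^(APT|STE|UNIT|#)/.test(unit)) unit = `#${unit}`;
  const line1 = [upper(parsed.streetNumber), upper(parsed.streetName), streetType, unit]
    .filter(Boolean)
    .join(' ');
  const zip = parsed.zipPlus4
    ? `${parsed.zipCode}-${parsed.zipPlus4}`
    : upper(parsed.zipCode);
  return `${line1}, ${upper(parsed.city)} ${upper(parsed.state)} ${zip}`;
}

console.log(formatStandardizedAddress({
  streetNumber: '123', streetName: 'Main', streetType: 'Street',
  unit: '4B', city: 'San Francisco', state: 'ca',
  zipCode: '94105', zipPlus4: null,
}));
// → 123 MAIN ST #4B, SAN FRANCISCO CA 94105
```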
7. Multi-Turn Conversational Form Filling
/*
* Instead of a traditional form, present a CHAT interface
* that asks questions conversationally and extracts data
* from natural responses.
*
* Bot: "What vehicle do you want to insure?"
* User: "My Honda Civic, it's a 2019"
* Bot: "Great, a 2019 Honda Civic. What color is it and roughly how many miles?"
* User: "Blue, around 45 thousand"
* Bot: "Got it. I have: 2019 Blue Honda Civic, ~45,000 miles. Is that right?"
*
* The system maintains:
* 1. Form state (what's filled, what's missing)
* 2. Conversation context
* 3. Next question selection (what to ask next)
*/
class ConversationalFormAgent {
constructor(formSchema) {
this.schema = formSchema;
this.formState = {};
this.conversationHistory = [];
this.requiredFields = formSchema.fields
.filter(f => f.required)
.map(f => f.name);
}
async processUserMessage(userMessage) {
// 1. Add to conversation history:
this.conversationHistory.push({
role: 'user',
content: userMessage,
});
// 2. Extract any new data from the message:
const extraction = await this.extractFromMessage(userMessage);
// 3. Update form state:
for (const [field, data] of Object.entries(extraction.fields || {})) {
if (data.value !== null && data.confidence >= 0.7) {
this.formState[field] = data.value;
}
}
// 4. Determine what to ask next:
const missingRequired = this.requiredFields.filter(
f => !this.formState[f]
);
// 5. Generate next response:
const response = await this.generateResponse(
extraction, missingRequired
);
this.conversationHistory.push({
role: 'assistant',
content: response,
});
return {
message: response,
formState: { ...this.formState },
missingFields: missingRequired,
isComplete: missingRequired.length === 0,
};
}
async extractFromMessage(message) {
const missingFields = this.schema.fields
.filter(f => !this.formState[f.name])
.map(f => `${f.name}: ${f.description}`)
.join('\n');
const prompt = `Extract form data from the user's message.
CONVERSATION SO FAR:
${this.conversationHistory.slice(-6).map(m =>
`${m.role}: ${m.content}`
).join('\n')}
CURRENT FORM STATE:
${Object.entries(this.formState).map(([k, v]) => `${k}: ${v}`).join('\n') || '(empty)'}
FIELDS STILL NEEDED:
${missingFields}
USER'S LATEST MESSAGE: "${message}"
Extract any field values mentioned. Return JSON:
{ "fields": { "<name>": { "value": ..., "confidence": 0-1 } } }`;
return JSON.parse(await callLLM(prompt, {
model: 'gpt-4o-mini',
temperature: 0,
maxTokens: 200,
}));
}
async generateResponse(extraction, missingFields) {
const prompt = `You're helping a user fill out a ${this.schema.name} form via chat.
FORM STATE:
${Object.entries(this.formState).map(([k, v]) => `✓ ${k}: ${v}`).join('\n') || '(nothing yet)'}
JUST EXTRACTED:
${Object.entries(extraction.fields || {})
.filter(([, d]) => d.value !== null)
.map(([k, d]) => `${k}: ${d.value}`)
.join('\n') || '(nothing new)'}
STILL MISSING (required):
${missingFields.map(f => {
const field = this.schema.fields.find(fd => fd.name === f);
return `- ${f}: ${field?.description}`;
}).join('\n') || '(all required fields filled!)'}
Generate a BRIEF, conversational response that:
1. Acknowledges what they just told you (if anything new was extracted)
2. Asks for the NEXT most important missing field naturally
3. If all fields are filled, confirm the complete information
Keep it under 2 sentences. Be friendly but efficient.`;
return await callLLM(prompt, {
model: 'gpt-4o-mini',
temperature: 0.3,
maxTokens: 100,
});
}
}
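The agent above leaves "next question selection" to the LLM. A deterministic selector is one alternative, sketched here: it picks the first missing required field plus any other missing fields in the same group, so related questions ("what color, and roughly how many miles?") can be asked in a single turn. The `group` property on schema fields and the name `selectNextFields` are assumptions; the original schema does not define them.

```javascript
// Hypothetical helper: choose which required fields to ask about next.
function selectNextFields(schema, formState, maxPerTurn = 2) {
  // Treat only undefined, null, and '' as missing, so 0 and false count as answers.
  const isMissing = (name) =>
    formState[name] === undefined || formState[name] === null || formState[name] === '';
  const missing = schema.fields.filter(f => f.required && isMissing(f.name));
  if (missing.length === 0) return [];
  const first = missing[0];
  const sameGroup = missing.filter(
    f => f !== first && f.group !== undefined && f.group === first.group
  );
  return [first, ...sameGroup].slice(0, maxPerTurn).map(f => f.name);
}

const vehicleSchema = {
  fields: [
    { name: 'make', required: true, group: 'vehicle' },
    { name: 'model', required: true, group: 'vehicle' },
    { name: 'color', required: true, group: 'appearance' },
    { name: 'mileage', required: true, group: 'appearance' },
    { name: 'notes', required: false },
  ],
};
console.log(selectNextFields(vehicleSchema, { make: 'Honda', model: 'Civic' }));
```

The explicit `isMissing` check differs from the `!this.formState[f]` test in the class, which would keep asking for a field whose valid answer is `0` or `false`.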
8. Receipt and Invoice Processing
/*
* Receipt processing pipeline for expense reporting:
*
* ┌──────────────────────────────────────────────────┐
* │  Photo of receipt                                │
* │        │                                         │
* │        ▼                                         │
* │  1. Quality check:                               │
* │    - Is image sharp enough for OCR?              │
* │    - Is it actually a receipt/invoice?           │
* │        │                                         │
* │        ▼                                         │
* │  2. OCR with layout awareness:                   │
* │    - Preserve column alignment                   │
* │    - Identify header/items/totals sections       │
* │        │                                         │
* │        ▼                                         │
* │  3. LLM entity extraction:                       │
* │    - Merchant, date, total, tax, line items      │
* │    - Handle multi-currency, tips, discounts      │
* │        │                                         │
* │        ▼                                         │
* │  4. Expense categorization:                      │
* │    - Map merchant to category (Uber → Transport) │
* │    - Suggest GL code from company chart          │
* │        │                                         │
* │        ▼                                         │
* │  5. Form auto-fill with verification overlay     │
* └──────────────────────────────────────────────────┘
*/
class ReceiptProcessor {
constructor(categoryMappings) {
this.categories = categoryMappings;
this.documentExtractor = new DocumentExtractor();
}
async processReceipt(imageFile) {
// 1. Quality check:
const qualityCheck = await this.checkImageQuality(imageFile);
if (!qualityCheck.acceptable) {
return {
success: false,
error: qualityCheck.reason,
suggestion: qualityCheck.suggestion,
};
}
// 2. Extract raw data:
const extraction = await this.documentExtractor.extractFromImage(
imageFile, 'receipt'
);
// 3. Categorize the expense:
const category = await this.categorizeExpense(extraction.entities);
// 4. Build expense entry. Parse amounts without turning a legitimate 0
// (e.g., zero tax) into null:
const toAmount = (v) => {
const n = parseFloat(v);
return Number.isNaN(n) ? null : n;
};
return {
success: true,
expense: {
merchant: extraction.entities.merchant?.value,
date: extraction.entities.date?.value,
total: toAmount(extraction.entities.total?.value),
subtotal: toAmount(extraction.entities.subtotal?.value),
tax: toAmount(extraction.entities.tax?.value),
category: category.category,
glCode: category.glCode,
lineItems: extraction.entities.items?.value || [],
currency: extraction.entities.currency?.value || 'USD',
},
confidence: {
overall: extraction.ocrConfidence / 100,
perField: Object.fromEntries(
Object.entries(extraction.entities).map(([k, v]) =>
[k, v?.confidence || 0]
)
),
},
rawOCR: extraction.rawText,
};
}
async checkImageQuality(imageFile) {
const img = await loadImage(imageFile);
// Check resolution:
if (img.width < 300 || img.height < 300) {
return {
acceptable: false,
reason: 'Image resolution too low',
suggestion: 'Please take a clearer photo at higher resolution',
};
}
// Check if it's a photo of a receipt (not a random image):
// Use aspect ratio and content heuristics:
const aspectRatio = img.height / img.width;
if (aspectRatio < 0.5 || aspectRatio > 5) {
return {
acceptable: true, // Don't block, but note unusual aspect ratio
warning: 'Unusual aspect ratio — may not be a standard receipt',
};
}
return { acceptable: true };
}
async categorizeExpense(entities) {
const merchant = entities.merchant?.value?.toLowerCase() || '';
// Check known merchant mappings first:
for (const [pattern, mapping] of Object.entries(this.categories)) {
if (merchant.includes(pattern.toLowerCase())) {
return mapping;
}
}
// AI categorization for unknown merchants:
const prompt = `Categorize this expense.
Merchant: ${entities.merchant?.value || 'unknown'}
Amount: ${entities.total?.value || 'unknown'}
Items: ${entities.items?.value?.map(i => i.name).join(', ') || 'unknown'}
Categories: Travel, Meals & Entertainment, Office Supplies, Software/SaaS,
Transportation, Accommodation, Communication, Professional Services, Other
Return JSON: { "category": "...", "glCode": "...", "confidence": 0-1 }`;
return JSON.parse(await callLLM(prompt, {
model: 'gpt-4o-mini',
temperature: 0,
maxTokens: 50,
}));
}
}
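The pipeline diagram lists "Is image sharp enough for OCR?" as part of the quality check, but `checkImageQuality` only tests resolution and aspect ratio. One common blur heuristic is the variance of a Laplacian filter response: blurry images have few strong edges, so the variance is low. The sketch below works on the `ImageData` returned by `ctx.getImageData`. The function names and the default threshold of 100 are assumptions, and the threshold would need tuning on real receipt photos.

```javascript
// Convert RGBA pixel data to one grayscale value per pixel.
function toGrayscale(rgba) {
  const gray = new Float32Array(rgba.length / 4);
  for (let i = 0; i < gray.length; i++) {
    gray[i] = (rgba[4 * i] + rgba[4 * i + 1] + rgba[4 * i + 2]) / 3;
  }
  return gray;
}

// Variance of the 4-neighbour Laplacian over all interior pixels.
function laplacianVariance(gray, width, height) {
  const responses = [];
  for (let y = 1; y < height - 1; y++) {
    for (let x = 1; x < width - 1; x++) {
      const i = y * width + x;
      responses.push(
        gray[i - width] + gray[i + width] + gray[i - 1] + gray[i + 1] - 4 * gray[i]
      );
    }
  }
  if (responses.length === 0) return 0;
  const mean = responses.reduce((sum, r) => sum + r, 0) / responses.length;
  return responses.reduce((sum, r) => sum + (r - mean) ** 2, 0) / responses.length;
}

// Hypothetical helper: imageData is { data, width, height } as returned by getImageData.
function isSharpEnough(imageData, threshold = 100) {
  const gray = toGrayscale(imageData.data);
  return laplacianVariance(gray, imageData.width, imageData.height) >= threshold;
}

// A flat gray 5x5 image has no edges at all:
const flat = { width: 5, height: 5, data: new Uint8ClampedArray(5 * 5 * 4).fill(128) };
console.log(isSharpEnough(flat));
```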
9. Progressive Form Enhancement
/*
* Not all users benefit equally from AI features.
* Progressive enhancement ensures:
* 1. Form works without AI (traditional input)
* 2. AI features enhance progressively as available
* 3. Users control their experience
*
* ┌──────────────────────────────────────────────────┐
* │  Level 0: Plain form fields                      │
* │  Level 1: + Client-side validation               │
* │  Level 2: + Autocomplete from static data        │
* │  Level 3: + AI field prediction                  │
* │  Level 4: + NLP free-text input                  │
* │  Level 5: + Document OCR extraction              │
* │  Level 6: + Conversational interface             │
* └──────────────────────────────────────────────────┘
*/
import { useState, useEffect } from 'react';

// checkAPIHealth, DocumentUploadSection, NLPInputSection, and
// TraditionalFormFields are assumed to be defined elsewhere.
function ProgressiveSmartForm({ schema, onSubmit }) {
  const [aiLevel, setAiLevel] = useState(0);
  const [capabilities, setCapabilities] = useState({
    nlp: false,
    ocr: false,
    autocomplete: false,
  });
  // Detect available capabilities once on mount:
  useEffect(() => {
    let cancelled = false; // Prevents state updates after unmount
    async function detectCapabilities() {
      // Check the AI API and the lookup API in parallel:
      const [nlpAvailable, autocompleteAvailable] = await Promise.all([
        checkAPIHealth('/api/ai/extract'),
        checkAPIHealth('/api/lookup/health'),
      ]);
      // Check if camera/file upload is available:
      const ocrAvailable = 'mediaDevices' in navigator ||
        'FileReader' in window;
      if (cancelled) return;
      setCapabilities({
        nlp: nlpAvailable,
        ocr: ocrAvailable,
        autocomplete: autocompleteAvailable,
      });
      // Set initial AI level based on capabilities:
      if (nlpAvailable) setAiLevel(4);
      else if (autocompleteAvailable) setAiLevel(2);
      else setAiLevel(0);
    }
    detectCapabilities();
    return () => { cancelled = true; };
  }, []);
  return (
    <div className="smart-form">
      {/* AI level selector */}
      <div className="form-mode-selector">
        <label htmlFor="form-mode">Input mode:</label>
        <select
          id="form-mode"
          value={aiLevel}
          onChange={(e) => setAiLevel(Number(e.target.value))}
        >
          <option value={0}>Standard form</option>
          {capabilities.autocomplete && (
            <option value={2}>Smart autocomplete</option>
          )}
          {capabilities.nlp && (
            <option value={4}>Describe in your words</option>
          )}
          {capabilities.ocr && (
            <option value={5}>Upload a document</option>
          )}
        </select>
      </div>
      {/* Render the AI sections for the chosen level; the traditional
          fields are always rendered so the form works without AI */}
      {aiLevel >= 5 && <DocumentUploadSection schema={schema} />}
      {aiLevel >= 4 && <NLPInputSection schema={schema} />}
      <TraditionalFormFields
        schema={schema}
        aiLevel={aiLevel}
        onSubmit={onSubmit}
      />
    </div>
  );
}
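The level-selection logic in the component can also be pulled out into pure functions, which makes it easier to unit-test and to honor a saved user choice. This is a sketch under assumptions: `MODES`, `availableModes`, `resolveAiLevel`, and the `preferred` argument (for example a value read from local storage) are illustrative names that do not appear in the component above.

```javascript
// Selectable modes; levels follow the diagram above.
const MODES = [
  { level: 0, label: 'Standard form', requires: null },
  { level: 2, label: 'Smart autocomplete', requires: 'autocomplete' },
  { level: 4, label: 'Describe in your words', requires: 'nlp' },
  { level: 5, label: 'Upload a document', requires: 'ocr' },
];

// Modes the user may pick given the detected capabilities.
function availableModes(capabilities) {
  return MODES.filter((m) => m.requires === null || capabilities[m.requires]);
}

// Pick the starting level. `preferred` is a hypothetical saved user choice.
function resolveAiLevel(capabilities, preferred = null) {
  const modes = availableModes(capabilities);
  if (preferred !== null && modes.some((m) => m.level === preferred)) {
    return preferred;
  }
  // Default mirrors the component: NLP if available, else autocomplete, else plain.
  if (capabilities.nlp) return 4;
  if (capabilities.autocomplete) return 2;
  return 0;
}
```

A saved preference that is no longer available (for example NLP input while the AI API is down) silently falls back to the best remaining default, so the user never lands on a mode that cannot render.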
10. Privacy-Preserving Local Entity Extraction
/*
* Sending form data to external AI APIs raises privacy concerns,
* especially for PII (addresses, SSN, health data).
*
* Options for local processing:
* 1. On-device NER models (Transformers.js)
* 2. WebAssembly-based inference
* 3. Edge functions (data stays in region)
*/
// Local NER using Transformers.js:
class LocalEntityExtractor {
  constructor() {
    this.pipeline = null;
  }
  async init() {
    const { pipeline } = await import('@xenova/transformers');
    // Load a NER model that runs entirely in the browser:
    this.pipeline = await pipeline(
      'token-classification',
      'Xenova/bert-base-NER',
      {
        // Use quantized weights (roughly 100 MB for a BERT-base model;
        // check the model card for the exact download size):
        quantized: true,
        progress_callback: (p) => {
          // Only 'progress' events carry a numeric percentage:
          if (p.status === 'progress') {
            console.log(`Loading model: ${Math.round(p.progress)}%`);
          }
        },
      }
    );
  }
  async extractEntities(text) {
    if (!this.pipeline) await this.init();
    // NOTE: the code below assumes grouped entities with `entity_group`,
    // `start`, and `end`, as Python's transformers returns with
    // aggregation_strategy: 'simple'. Some Transformers.js versions return
    // one result per token (entity: 'B-PER') with no offsets instead, so
    // verify the output shape against the version you ship.
    const results = await this.pipeline(text, {
      aggregation_strategy: 'simple',
    });
    // results = [
    //   { entity_group: 'PER', word: 'John Smith', score: 0.98, ... },
    //   { entity_group: 'LOC', word: 'San Francisco', score: 0.95, ... },
    //   { entity_group: 'ORG', word: 'Honda', score: 0.92, ... },
    // ]
    // Map NER entities to form fields. When several entities compete for
    // the same field, keep the highest-confidence one:
    const fieldByGroup = { PER: 'name', LOC: 'location', ORG: 'organization' };
    const mapped = {};
    for (const entity of results) {
      const candidate = { value: entity.word, confidence: entity.score };
      if (entity.entity_group === 'MISC') {
        mapped.other = mapped.other || [];
        mapped.other.push(candidate);
        continue;
      }
      const field = fieldByGroup[entity.entity_group];
      if (field && (!mapped[field] || candidate.confidence > mapped[field].confidence)) {
        mapped[field] = candidate;
      }
    }
    return {
      entities: mapped,
      raw: results,
      processedLocally: true,
    };
  }
}
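The mapping above relies on grouped entities (`entity_group`, `start`, `end`). If the pipeline in your Transformers.js version returns one result per token instead (for example `entity: 'B-PER'`, with `##` marking word-piece continuations), the tokens can be merged before mapping. A minimal sketch, assuming each token carries `entity`, `word`, `score`, and `index`; `groupEntities` is a hypothetical helper, and the offsets are recovered by searching the original text, which can fail when the tokenizer normalizes characters.

```javascript
// Merge per-token NER results (B-/I- tags) into grouped entities.
function groupEntities(tokens, text) {
  const groups = [];
  let current = null;
  for (const token of tokens) {
    const [prefix, type] = token.entity.split('-'); // 'B-PER' -> ['B', 'PER']
    if (!type) { current = null; continue; }        // Skip 'O' or untagged tokens
    const isSubword = token.word.startsWith('##');
    const piece = isSubword ? token.word.slice(2) : token.word;
    const continues = current !== null &&
      current.entity_group === type &&
      token.index === current.lastIndex + 1 &&
      (prefix === 'I' || isSubword);
    if (continues) {
      current.word += isSubword ? piece : ` ${piece}`;
      current.scores.push(token.score);
      current.lastIndex = token.index;
    } else {
      current = { entity_group: type, word: piece, scores: [token.score], lastIndex: token.index };
      groups.push(current);
    }
  }
  // Recover character offsets by scanning the source text left to right.
  let cursor = 0;
  return groups.map(({ scores, lastIndex, ...group }) => {
    const start = text.indexOf(group.word, cursor);
    const found = start !== -1;
    if (found) cursor = start + group.word.length;
    return {
      ...group,
      score: scores.reduce((sum, s) => sum + s, 0) / scores.length, // Mean token score
      start: found ? start : null,
      end: found ? start + group.word.length : null,
    };
  });
}
```

The output has the same shape the mapping loop and `redactPII` expect, so it could be applied to the raw pipeline result before either of them runs.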
// Hybrid approach: local for PII, cloud for non-sensitive:
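class HybridExtractor {
  constructor() {
    this.localExtractor = new LocalEntityExtractor();
    this.piiFields = new Set([
      'ssn', 'dateOfBirth', 'driverLicense', 'bankAccount',
      'creditCard', 'healthCondition', 'income',
    ]);
  }
  // apiExtract(text, fields) is assumed to be implemented elsewhere; it
  // sends `text` to the cloud extraction API and returns field values.
  async extract(text, formSchema) {
    // Classify fields by sensitivity:
    const isSensitive = (f) => this.piiFields.has(f.name) || f.sensitive === true;
    const sensitiveFields = formSchema.fields.filter(isSensitive);
    const nonSensitiveFields = formSchema.fields.filter((f) => !isSensitive(f));
    if (sensitiveFields.length > 0) {
      // Run NER locally so the raw text is never sent as-is.
      // NOTE: bert-base-NER only tags PER/LOC/ORG/MISC, so pattern-based
      // PII (SSN, card numbers) is neither extracted nor redacted by this
      // step and needs separate handling before the text leaves the device.
      const localResult = await this.localExtractor.extractEntities(text);
      // Extract non-sensitive data via API:
      let apiResult = {};
      if (nonSensitiveFields.length > 0) {
        // Redact PII before sending to API:
        const redacted = this.redactPII(text, localResult.raw);
        apiResult = await this.apiExtract(redacted, nonSensitiveFields);
      }
      // Local results win when both sources produce the same key:
      return { ...apiResult, ...localResult.entities };
    }
    // No sensitive fields: use the cloud API directly.
    return this.apiExtract(text, formSchema.fields);
  }
  redactPII(text, entities) {
    let redacted = text;
    // Replace detected PII entities with placeholders, working from the end
    // of the string so earlier offsets stay valid. Copy before sorting so
    // the caller's array is not mutated:
    const byOffsetDesc = [...entities].sort((a, b) => b.start - a.start);
    for (const entity of byOffsetDesc) {
      // Skip entities that carry no character offsets:
      if (!Number.isInteger(entity.start) || !Number.isInteger(entity.end)) continue;
      if (['PER', 'LOC'].includes(entity.entity_group)) {
        redacted = redacted.slice(0, entity.start) +
          `[${entity.entity_group}]` +
          redacted.slice(entity.end);
      }
    }
    return redacted;
  }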
}
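`redactPII` only replaces spans that the NER model tags as `PER` or `LOC`. Identifiers with a fixed shape, such as US Social Security numbers or payment card numbers, are not NER entity types, so a regex pass before any cloud call is one way to cover them. The sketch below is illustrative only: `PII_PATTERNS` and `redactPatterns` are hypothetical names, and the two patterns are simplified examples that would need locale-specific rules and checksum validation in practice.

```javascript
// Illustrative patterns only; not an exhaustive PII detector.
const PII_PATTERNS = [
  // US SSN written as 123-45-6789
  { label: 'SSN', regex: /\b\d{3}-\d{2}-\d{4}\b/g },
  // 13 to 16 digits, optionally separated by spaces or hyphens
  { label: 'CARD', regex: /\b\d(?:[ -]?\d){12,15}\b/g },
];

// Replace pattern-based PII with placeholders and report what was found.
function redactPatterns(text) {
  let redacted = text;
  const found = [];
  for (const { label, regex } of PII_PATTERNS) {
    redacted = redacted.replace(regex, (match) => {
      found.push({ label, value: match }); // Kept on-device for local field filling
      return `[${label}]`;
    });
  }
  return { redacted, found };
}
```

Running this before `redactPII` keeps the matched values on the device (in `found`) for filling sensitive fields locally, while the cloud API only sees the placeholders.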
Trade-offs & Considerations
| Aspect | Traditional Form | AI NLP Input | Document OCR | Conversational |
|---|---|---|---|---|
| Completion rate | 23-40% | 50-70% | 60-80% | 55-75% |
| Data accuracy | User-dependent | 85-95% + review | 80-90% + review | 85-95% + review |
| Time to complete | 5-15 min | 2-5 min | 1-3 min | 3-8 min |
| Accessibility | High (standard) | Medium (needs text) | Low (needs camera) | High (chat-like) |
| Privacy risk | Low (no API) | Medium (cloud AI) | Medium (cloud OCR) | Medium (cloud AI) |
| Works offline | Yes | No (needs API) | Partial (local OCR) | No (needs API) |
| Development cost | Low | Medium | High | High |
| Edge cases | Manual handling | Needs fallback | Needs fallback | Needs fallback |
Best Practices
-
Always provide a traditional form fallback alongside AI input. AI is an enhancement, not a replacement: some users prefer structured fields, some have accessibility needs, and AI APIs can be unavailable. Render the NLP/conversational input and the traditional fields together, auto-populate the fields from AI extraction but let users edit every value, and never block form submission on AI availability.
-
Show confidence scores visually and highlight AI-filled fields for user verification — distinguish AI-populated fields (highlighted border, "AI-filled 92%" badge) from user-entered fields; show low-confidence extractions as suggestions ("Suggested: Honda Civic — Use this?") rather than auto-filling; always show the source text span that led to each extraction so users can verify.
-
Use deterministic lookups before AI for fields with known mappings: ZIP code → city/state, VIN → vehicle details, and phone area code → region are API lookups, not AI tasks. Deterministic lookups are faster, cheaper, and as accurate as the reference data behind them; reserve AI for fields that genuinely require natural language understanding (free-text descriptions, document extraction, ambiguous inputs).
-
Process sensitive data locally using on-device models for PII fields: use Transformers.js or TensorFlow.js for Named Entity Recognition on fields containing SSN, date of birth, health data, or financial information; only send non-sensitive data to cloud AI APIs; redact detected PII entities before cloud API calls. This reduces the data exposed to third parties and supports GDPR/HIPAA compliance work without sacrificing AI functionality for non-sensitive fields, but it does not establish compliance on its own, and general-purpose NER models do not detect pattern-based identifiers such as SSNs without additional rules.
-
Validate extracted data semantically, not just syntactically — beyond "is this a valid number?", check cross-field consistency: mileage vs vehicle year, salary vs job title, ZIP code vs city/state; use deterministic rules for known relationships and AI for open-ended consistency checks; present semantic warnings as "Please verify" rather than blocking errors — the user may have correct but unusual data.
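As an illustration of the last practice, a deterministic cross-field rule can flag implausible mileage for a vehicle's age and return a non-blocking warning. This is a sketch, not a production rule: `checkMileage` is a hypothetical helper, and the 1,000 to 25,000 miles-per-year band is an assumed threshold that would need tuning against real data.

```javascript
// Assumed plausibility band for annual mileage; tune to your own data.
const MILES_PER_YEAR = { min: 1000, max: 25000 };

// Returns "please verify" warnings; never blocks submission.
function checkMileage({ year, mileage }, currentYear = new Date().getFullYear()) {
  const warnings = [];
  // Skip the check when either field is missing or not numeric.
  if (!Number.isInteger(year) || !Number.isFinite(mileage)) return warnings;

  const age = Math.max(1, currentYear - year); // Treat new vehicles as one year old
  const perYear = mileage / age;
  if (perYear < MILES_PER_YEAR.min) {
    warnings.push({
      field: 'mileage',
      severity: 'verify',
      message: `The mileage seems unusually low for a ${year} vehicle. Please verify.`,
    });
  } else if (perYear > MILES_PER_YEAR.max) {
    warnings.push({
      field: 'mileage',
      severity: 'verify',
      message: `The mileage seems unusually high for a ${year} vehicle. Please verify.`,
    });
  }
  return warnings;
}
```

Because the result is a list of warnings with `severity: 'verify'` rather than errors, the form can show them next to the field and still accept the submission if the user confirms the value.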
Conclusion
AI-powered form intelligence transforms data collection from a tedious multi-field experience into a natural interaction. NLP parsing converts free-text input ("2019 blue Honda Civic with 45k miles") into structured field values using LLM extraction with per-field confidence scores and source text attribution. Context-aware autocomplete uses field relationships — ZIP code triggers city/state lookup (deterministic), VIN triggers vehicle decode (API), and cross-field AI prediction fills remaining gaps. Document OCR combines image preprocessing (deskew, contrast enhancement), Tesseract.js text extraction with bounding boxes and word-level confidence, and LLM entity extraction that handles OCR errors ("$l2.50" → "$12.50") and varied document layouts. Semantic validation goes beyond format checks to verify cross-field consistency (mileage appropriate for vehicle age, salary reasonable for job title). The conversational form interface maintains form state across chat turns, extracts data from each user message, and asks for missing required fields naturally. Progressive enhancement ensures every form works without AI (Level 0: standard fields) while layering smart features when available (autocomplete → NLP input → document upload → conversational). Privacy-preserving extraction uses on-device NER models (Transformers.js) for PII fields and redacts sensitive entities before sending non-PII to cloud APIs. The key principle is that AI is an enhancement layer that populates the same underlying form fields — users can always review, correct, and manually fill any value.