Frontend Observability Architecture
Introduction
Backend observability is table stakes. Frontend observability is where most engineering organizations still stumble. You can have perfect server-side metrics and still be blind to the user experience: JavaScript errors silently swallowed, performance regressions invisible until NPS tanks, rage clicks accumulating without anyone noticing.
Frontend observability is fundamentally different from backend observability. You don't control the execution environment. Browsers vary wildly in capabilities and behaviors. Network conditions span the spectrum from 5G to 2G. User interactions are unpredictable and non-deterministic. Client-side code runs on millions of devices you'll never see.
This deep dive examines how to build comprehensive frontend observability: from the browser APIs that capture performance data, to the telemetry pipelines that deliver it, to the dashboards and alerts that make it actionable. We'll cover what to measure, how to measure it without killing performance, and how to build systems that scale to millions of users.
Scale Context
Production frontend observability at scale:
| Metric | Value |
|---|---|
| Daily Active Users | 25M |
| Page Views per Day | 200M |
| RUM Events per Day | 5B+ |
| Error Events per Day | 10M |
| Telemetry Payload Size (avg) | 2-5KB |
| Telemetry Data Volume | 10-25TB/day |
| Beacon Success Rate | 97-99% |
| Client-side Storage Budget | <50KB |
| Performance Overhead Target | <3% |
| Alert P50 Latency (event → alert) | <60 seconds |
| Dashboard Query P95 | <5 seconds |
At this scale, every byte of telemetry matters. Every unnecessary event is millions of wasted requests.
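The volume row in the table follows directly from the event count and the average payload size. The sketch below reproduces that arithmetic, assuming the payload size applies per event and that KB and TB are decimal units (the reading that matches the 10-25TB/day figure); the function names are illustrative.

```typescript
// Back-of-envelope telemetry sizing.
// Assumption: decimal units (1 KB = 1,000 bytes, 1 TB = 1e12 bytes).
const BYTES_PER_KB = 1_000;
const BYTES_PER_TB = 1e12;
const SECONDS_PER_DAY = 86_400;

// Total telemetry volume per day, in terabytes.
function dailyVolumeTB(eventsPerDay: number, avgPayloadKB: number): number {
  return (eventsPerDay * avgPayloadKB * BYTES_PER_KB) / BYTES_PER_TB;
}

// Average ingest rate the collectors must sustain (peaks will be higher).
function averageEventsPerSecond(eventsPerDay: number): number {
  return eventsPerDay / SECONDS_PER_DAY;
}

console.log(dailyVolumeTB(5e9, 2)); // low end of the payload range
console.log(dailyVolumeTB(5e9, 5)); // high end of the payload range
console.log(Math.round(averageEventsPerSecond(5e9)));
```

Trimming even 100 bytes from the average payload removes about half a terabyte per day at this event count, which is why the per-event budget gets so much attention.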
Observability Pillars for Frontend
The Three Pillars Adapted for Frontend
┌─────────────────────────────────────────────────────────────────────────────┐
│ FRONTEND OBSERVABILITY PILLARS │
│ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ METRICS │ │
│ │ │ │
│ │ Aggregated numerical measurements over time │ │
│ │ │ │
│ │ Frontend-specific: │ │
│ │ • Core Web Vitals (LCP, FID/INP, CLS) │ │
│ │ • Custom timing metrics │ │
│ │ • JavaScript error rates │ │
│ │ • API call success/failure rates │ │
│ │ • User interaction metrics │ │
│ │ │ │
│ │ Cardinality challenge: Dimensions like browser version, device, │ │
│ │ network type, geographic location explode cardinality │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ LOGS │ │
│ │ │ │
│ │ Discrete events with context │ │
│ │ │ │
│ │ Frontend-specific: │ │
│ │ • JavaScript errors with stack traces │ │
│ │ • Console warnings/errors │ │
│ │ • Network request failures │ │
│ │ • User actions (breadcrumbs) │ │
│ │ • Custom debug events │ │
│ │ │ │
│ │ Challenge: Stack traces from minified code are useless without │ │
│ │ source maps. Privacy concerns with user data in logs. │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ TRACES │ │
│ │ │ │
│ │ Request flow across systems │ │
│ │ │ │
│ │ Frontend-specific: │ │
│ │ • User journey traces (click → page load → API → render) │ │
│ │ • Long tasks breakdown │ │
│ │ • Resource loading waterfall │ │
│ │ • Cross-origin request correlation │ │
│ │ │ │
│ │ Challenge: Connecting frontend traces to backend requires │ │
│ │ trace context propagation through headers. │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ SESSION REPLAY (Bonus Pillar) │ │
│ │ │ │
│ │ Visual reconstruction of user sessions │ │
│ │ │ │
│ │ • DOM snapshots and mutations │ │
│ │ • User interactions (clicks, scrolls, inputs) │ │
│ │ • Network activity timeline │ │
│ │ • Console output │ │
│ │ │ │
│ │ Challenge: Privacy (PII masking), data volume, performance impact │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
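One common way to contain the cardinality problem noted in the metrics box is to coarsen each dimension before it becomes a metric label. The following is a minimal sketch of that idea; the field names, viewport breakpoints, and bucket labels are assumptions for illustration, not values from this architecture.

```typescript
// Raw values as the browser reports them (hypothetical shape).
interface RawDimensions {
  browserVersion: string; // e.g. "Chrome 126.0.6478.127"
  viewportWidth: number;  // CSS pixels
  effectiveType?: string; // Network Information API value, if available
}

// Coarsened values that are safe to use as metric labels.
interface MetricDimensions {
  browser: string;
  viewport: 'small' | 'medium' | 'large';
  network: string;
}

const KNOWN_NETWORK_TYPES = ['slow-2g', '2g', '3g', '4g'];

function toLowCardinality(raw: RawDimensions): MetricDimensions {
  // Keep only the browser name and major version.
  const match = /^(\D+?)\s*(\d+)/.exec(raw.browserVersion);
  const browser = match ? `${match[1].trim()} ${match[2]}` : 'other';

  // Bucket viewport width instead of recording every distinct value.
  // The 768 and 1280 breakpoints are arbitrary examples.
  const viewport =
    raw.viewportWidth < 768 ? 'small' : raw.viewportWidth < 1280 ? 'medium' : 'large';

  // Collapse anything unexpected into a single label.
  const network =
    raw.effectiveType !== undefined && KNOWN_NETWORK_TYPES.includes(raw.effectiveType)
      ? raw.effectiveType
      : 'unknown';

  return { browser, viewport, network };
}
```

The full-resolution values can still travel with raw events for ad hoc analysis; only the aggregated metric series need the bucketed form.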
Browser APIs for Observability
Performance APIs
// Core browser APIs for performance observability
// 1. Performance Timeline API
// Unified access to performance entries
const entries = performance.getEntriesByType('navigation');
const navigationTiming = entries[0] as PerformanceNavigationTiming;
console.log({
// DNS lookup
dnsLookup: navigationTiming.domainLookupEnd - navigationTiming.domainLookupStart,
// TCP connection
tcpConnect: navigationTiming.connectEnd - navigationTiming.connectStart,
// TLS negotiation (if HTTPS)
tlsNegotiation: navigationTiming.secureConnectionStart > 0
? navigationTiming.connectEnd - navigationTiming.secureConnectionStart
: 0,
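// Time to First Byte, measured from navigation start so that it includes
// redirects, DNS, and connection setup (not just server wait time)
ttfb: navigationTiming.responseStart - navigationTiming.startTime,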
// Content download
contentDownload: navigationTiming.responseEnd - navigationTiming.responseStart,
// DOM parsing
domParsing: navigationTiming.domContentLoadedEventEnd - navigationTiming.responseEnd,
// Total page load
totalLoad: navigationTiming.loadEventEnd - navigationTiming.startTime
});
// 2. PerformanceObserver API
// Observe specific performance events as they occur
// Observe Largest Contentful Paint
const lcpObserver = new PerformanceObserver((list) => {
const entries = list.getEntries();
const lastEntry = entries[entries.length - 1];
console.log('LCP:', lastEntry.startTime);
});
lcpObserver.observe({ type: 'largest-contentful-paint', buffered: true });
// Observe First Input Delay (FID was replaced by INP as a Core Web Vital in
// March 2024; it is shown here for completeness)
const fidObserver = new PerformanceObserver((list) => {
for (const entry of list.getEntries()) {
const fidEntry = entry as PerformanceEventTiming;
console.log('FID:', fidEntry.processingStart - fidEntry.startTime);
}
});
fidObserver.observe({ type: 'first-input', buffered: true });
// Observe Cumulative Layout Shift
let clsValue = 0;
const clsObserver = new PerformanceObserver((list) => {
for (const entry of list.getEntries() as LayoutShift[]) {
if (!entry.hadRecentInput) {
clsValue += entry.value;
}
}
});
clsObserver.observe({ type: 'layout-shift', buffered: true });
// Observe Long Tasks (blocking main thread)
const longTaskObserver = new PerformanceObserver((list) => {
for (const entry of list.getEntries()) {
console.log('Long task:', {
duration: entry.duration,
startTime: entry.startTime,
name: entry.name,
attribution: (entry as any).attribution
});
}
});
longTaskObserver.observe({ type: 'longtask', buffered: true });
// 3. Resource Timing API
// Detailed timing for every resource
const resources = performance.getEntriesByType('resource') as PerformanceResourceTiming[];
for (const resource of resources) {
console.log({
name: resource.name,
type: resource.initiatorType,
duration: resource.duration,
transferSize: resource.transferSize,
decodedBodySize: resource.decodedBodySize,
cached: resource.transferSize === 0 && resource.decodedBodySize > 0
});
}
// 4. User Timing API
// Custom performance marks and measures
performance.mark('feature-start');
// ... do work ...
performance.mark('feature-end');
performance.measure('feature-duration', 'feature-start', 'feature-end');
const measures = performance.getEntriesByName('feature-duration');
console.log('Feature took:', measures[0].duration, 'ms');
// 5. Element Timing API
// Track specific element render times
// In HTML: <img elementtiming="hero-image" src="..." />
const elementObserver = new PerformanceObserver((list) => {
for (const entry of list.getEntries() as PerformanceElementTiming[]) {
console.log('Element rendered:', entry.identifier, entry.renderTime);
}
});
elementObserver.observe({ type: 'element', buffered: true });
Error Capture APIs
// Comprehensive error capture
interface CapturedError {
type: 'unhandled' | 'promise' | 'network' | 'console';
message: string;
stack?: string;
filename?: string;
lineno?: number;
colno?: number;
timestamp: number;
breadcrumbs: Breadcrumb[];
context: ErrorContext;
}
interface ErrorContext {
url: string;
userAgent: string;
viewport: { width: number; height: number };
memory?: MemoryInfo;
connection?: ConnectionInfo;
}
class ErrorCapture {
private breadcrumbs: Breadcrumb[] = [];
private maxBreadcrumbs = 50;
initialize(): void {
this.setupGlobalErrorHandler();
this.setupPromiseRejectionHandler();
this.setupNetworkErrorCapture();
this.setupConsoleCapture();
this.setupBreadcrumbCapture();
}
private setupGlobalErrorHandler(): void {
window.onerror = (
message: string | Event,
filename?: string,
lineno?: number,
colno?: number,
error?: Error
): boolean => {
this.captureError({
type: 'unhandled',
message: typeof message === 'string' ? message : message.type,
stack: error?.stack,
filename,
lineno,
colno,
timestamp: Date.now(),
breadcrumbs: [...this.breadcrumbs],
context: this.getContext()
});
return false; // Don't prevent default handling
};
}
private setupPromiseRejectionHandler(): void {
window.onunhandledrejection = (event: PromiseRejectionEvent): void => {
const error = event.reason;
this.captureError({
type: 'promise',
message: error?.message || String(error),
stack: error?.stack,
timestamp: Date.now(),
breadcrumbs: [...this.breadcrumbs],
context: this.getContext()
});
};
}
private setupNetworkErrorCapture(): void {
// Intercept fetch
const originalFetch = window.fetch;
window.fetch = async (input: RequestInfo | URL, init?: RequestInit) => {
const url = typeof input === 'string' ? input : input instanceof URL ? input.href : input.url;
const startTime = performance.now();
try {
const response = await originalFetch(input, init);
this.addBreadcrumb({
category: 'fetch',
message: `${init?.method || 'GET'} ${url}`,
data: { status: response.status },
level: response.ok ? 'info' : 'warning',
timestamp: Date.now()
});
if (!response.ok) {
this.captureError({
type: 'network',
message: `HTTP ${response.status}: ${url}`,
timestamp: Date.now(),
breadcrumbs: [...this.breadcrumbs],
context: {
...this.getContext(),
request: { url, method: init?.method || 'GET' },
response: { status: response.status }
}
});
}
return response;
} catch (error) {
this.captureError({
type: 'network',
message: `Network error: ${url}`,
stack: error instanceof Error ? error.stack : undefined,
timestamp: Date.now(),
breadcrumbs: [...this.breadcrumbs],
context: this.getContext()
});
throw error;
}
};
}
private setupBreadcrumbCapture(): void {
// Capture clicks
document.addEventListener('click', (event) => {
const target = event.target as HTMLElement;
const selector = this.getElementSelector(target);
this.addBreadcrumb({
category: 'ui.click',
message: selector,
// Element text can contain PII; mask or drop it where that is a concern
data: { tag: target.tagName, text: target.innerText?.slice(0, 50) },
level: 'info',
timestamp: Date.now()
});
}, true);
// Capture navigation
const originalPushState = history.pushState;
history.pushState = (...args) => {
this.addBreadcrumb({
category: 'navigation',
message: String(args[2]),
data: { from: window.location.href },
level: 'info',
timestamp: Date.now()
});
return originalPushState.apply(history, args);
};
}
private addBreadcrumb(breadcrumb: Breadcrumb): void {
this.breadcrumbs.push(breadcrumb);
if (this.breadcrumbs.length > this.maxBreadcrumbs) {
this.breadcrumbs.shift();
}
}
private getContext(): ErrorContext {
return {
url: window.location.href,
userAgent: navigator.userAgent,
viewport: {
width: window.innerWidth,
height: window.innerHeight
},
memory: (performance as any).memory,
connection: (navigator as any).connection
};
}
private getElementSelector(element: HTMLElement): string {
const parts: string[] = [];
let current: HTMLElement | null = element;
while (current && parts.length < 5) {
let selector = current.tagName.toLowerCase();
if (current.id) {
selector += `#${current.id}`;
} else if (current.classList.length > 0) {
// classList also works for SVG elements, where className is not a string
selector += `.${Array.from(current.classList).join('.')}`;
}
parts.unshift(selector);
current = current.parentElement;
}
return parts.join(' > ');
}
private captureError(error: CapturedError): void {
// Send to telemetry endpoint
this.sendToBackend(error);
}
}
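The logs pillar flags privacy as a concern, and the fetch breadcrumbs above record full request URLs, which often carry tokens or email addresses in the query string. A minimal sketch of scrubbing a URL before it is attached to a breadcrumb or error follows; `scrubUrl` and the deny-list of parameter names are hypothetical and would need to match your own application.

```typescript
// Hypothetical deny-list of query parameter names to redact.
const SENSITIVE_PARAMS = new Set(['token', 'email', 'password', 'session', 'key']);

// Returns the URL with sensitive query values replaced and the fragment
// removed. `base` lets callers resolve relative URLs; without it, a relative
// input is reported as invalid rather than guessed at.
function scrubUrl(rawUrl: string, base?: string): string {
  let url: URL;
  try {
    url = new URL(rawUrl, base);
  } catch {
    return '[invalid-url]';
  }
  // Copy the keys first: mutating searchParams while iterating it is unsafe.
  for (const name of Array.from(url.searchParams.keys())) {
    if (SENSITIVE_PARAMS.has(name.toLowerCase())) {
      url.searchParams.set(name, 'REDACTED');
    }
  }
  // Fragments can hold state such as OAuth tokens, so drop them entirely.
  url.hash = '';
  return url.toString();
}

console.log(scrubUrl('https://example.com/a?token=abc&page=2#frag'));
```

Scrubbing on the client keeps sensitive values out of the pipeline altogether, which is simpler to reason about than redacting after ingestion.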
Telemetry Pipeline Architecture
Data Collection Architecture
┌─────────────────────────────────────────────────────────────────────────────┐
│ FRONTEND TELEMETRY PIPELINE │
│ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ BROWSER LAYER │ │
│ │ │ │
│ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ │
│ │ │ Performance │ │ Error │ │ User │ │ │
│ │ │ Metrics │ │ Capture │ │ Interactions│ │ │
│ │ └──────┬──────┘ └──────┬──────┘ └──────┬──────┘ │ │
│ │ │ │ │ │ │
│ │ └────────────────┼────────────────┘ │ │
│ │ │ │ │
│ │ ▼ │ │
│ │ ┌─────────────────────────────────────────────────────────────┐ │ │
│ │ │ TELEMETRY SDK │ │ │
│ │ │ │ │ │
│ │ │ • Event batching (100 events or 5 seconds) │ │ │
│ │ │ • Sampling (1-100% based on event type) │ │ │
│ │ │ • Compression (gzip payload) │ │ │
│ │ │ • Local buffering (IndexedDB for offline) │ │ │
│ │ │ • Retry logic (exponential backoff) │ │ │
│ │ │ │ │ │
│ │ └─────────────────────────────────────────────────────────────┘ │ │
│ │ │ │ │
│ │ ▼ │ │
│ │ ┌─────────────────────────────────────────────────────────────┐ │ │
│ │ │ BEACON TRANSPORT │ │ │
│ │ │ │ │ │
│ │ │ Priority: │ │ │
│ │ │ 1. navigator.sendBeacon() - survives page unload │ │ │
│ │ │ 2. fetch() with keepalive - more control │ │ │
│ │ │ 3. XMLHttpRequest - fallback │ │ │
│ │ │ │ │ │
│ │ └─────────────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ │ HTTPS POST │
│ │ (gzipped JSON) │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ INGESTION LAYER │ │
│ │ │ │
│ │ ┌─────────────────────────────────────────────────────────────┐ │ │
│ │ │ EDGE COLLECTOR │ │ │
│ │ │ │ │ │
│ │ │ • Deployed at CDN edge (low latency ingest) │ │ │
│ │ │ • Request validation │ │ │
│ │ │ • Rate limiting per client │ │ │
│ │ │ • Source map lookup (if stack trace) │ │ │
│ │ │ • GeoIP enrichment │ │ │
│ │ │ │ │ │
│ │ └─────────────────────────────────────────────────────────────┘ │ │
│ │ │ │ │
│ │ ▼ │ │
│ │ ┌─────────────────────────────────────────────────────────────┐ │ │
│ │ │ MESSAGE QUEUE │ │ │
│ │ │ (Kafka / Kinesis / Pub/Sub) │ │ │
│ │ │ │ │ │
│ │ │ Topics: │ │ │
│ │ │ • rum.metrics - Performance data │ │ │
│ │ │ • rum.errors - Error events │ │ │
│ │ │ • rum.interactions - User behavior │ │ │
│ │ │ • rum.replay - Session replay data │ │ │
│ │ │ │ │ │
│ │ └─────────────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ PROCESSING LAYER │ │
│ │ │ │
│ │ ┌──────────────────┐ ┌──────────────────┐ ┌──────────────────┐ │ │
│ │ │ Stream │ │ Aggregation │ │ Anomaly │ │ │
│ │ │ Processing │ │ Engine │ │ Detection │ │ │
│ │ │ (Flink/Spark) │ │ (Time-series) │ │ (ML Pipeline) │ │ │
│ │ └──────────────────┘ └──────────────────┘ └──────────────────┘ │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ STORAGE LAYER │ │
│ │ │ │
│ │ ┌──────────────────┐ ┌──────────────────┐ ┌──────────────────┐ │ │
│ │ │ Time-Series DB │ │ Raw Event │ │ Error │ │ │
│ │ │ (InfluxDB/ │ │ Storage │ │ Aggregation │ │ │
│ │ │ TimescaleDB) │ │ (S3/BigQuery) │ │ (Postgres/ES) │ │ │
│ │ └──────────────────┘ └──────────────────┘ └──────────────────┘ │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
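The SDK box lists retry with exponential backoff but gives no numbers. The sketch below shows one way to compute a capped, jittered delay; the one-second base, the sixty-second cap, and the "full jitter" strategy are assumptions rather than values from this pipeline.

```typescript
// Delay before retry number `attempt` (0-based), in milliseconds.
// The ceiling doubles with each attempt up to `capMs`; the actual delay is a
// random point below that ceiling so clients do not retry in lockstep.
// `random` is injectable so the function can be tested deterministically.
function backoffDelayMs(
  attempt: number,
  baseMs: number = 1000,
  capMs: number = 60000,
  random: () => number = Math.random
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Example: ceilings for the first few attempts with the default settings.
for (let attempt = 0; attempt < 5; attempt++) {
  console.log(attempt, backoffDelayMs(attempt, 1000, 60000, () => 1));
}
```

Jitter matters more on the frontend than on the backend: after a collector outage, millions of browsers retrying on the same schedule would arrive as a synchronized spike.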
SDK Implementation
// Production-grade telemetry SDK
interface TelemetryConfig {
endpoint: string;
apiKey: string;
sampleRate: number; // 0-1
batchSize: number; // Events per batch
flushInterval: number; // ms
maxQueueSize: number; // Max events before dropping
enableCompression: boolean;
enableOfflineSupport: boolean;
}
class TelemetrySDK {
private queue: TelemetryEvent[] = [];
private flushTimer: number | null = null;
private db: IDBDatabase | null = null;
constructor(private config: TelemetryConfig) {
this.initialize();
}
private async initialize(): Promise<void> {
if (this.config.enableOfflineSupport) {
await this.initializeOfflineStorage();
}
this.setupFlushTimer();
this.setupBeforeUnload();
await this.sendPendingFromStorage();
}
track(event: TelemetryEvent): void {
// Apply sampling
// Math.random() is in [0, 1), so >= makes a sampleRate of 0 drop everything
if (Math.random() >= this.config.sampleRate) {
return;
}
// Enrich event
const enrichedEvent = this.enrichEvent(event);
// Add to queue
if (this.queue.length >= this.config.maxQueueSize) {
// Drop oldest events
this.queue.shift();
}
this.queue.push(enrichedEvent);
// Flush if batch size reached
if (this.queue.length >= this.config.batchSize) {
this.flush();
}
}
private enrichEvent(event: TelemetryEvent): TelemetryEvent {
return {
...event,
timestamp: Date.now(),
sessionId: this.getSessionId(),
pageUrl: window.location.href,
viewport: {
width: window.innerWidth,
height: window.innerHeight
},
connection: this.getConnectionInfo(),
deviceMemory: (navigator as any).deviceMemory,
hardwareConcurrency: navigator.hardwareConcurrency
};
}
private async flush(): Promise<void> {
if (this.queue.length === 0) return;
const batch = this.queue.splice(0, this.config.batchSize);
const payload = JSON.stringify({ events: batch });
try {
const compressed = this.config.enableCompression
? await this.compress(payload)
: payload;
const success = await this.sendBatch(compressed);
if (!success) {
// Store for retry
await this.storeForRetry(batch);
}
} catch (error) {
// Store for retry
await this.storeForRetry(batch);
}
}
private async sendBatch(payload: string | ArrayBuffer): Promise<boolean> {
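// Try sendBeacon first (best for page unload). It cannot set custom headers,
// so the endpoint must accept beacons without X-API-Key, and a true return
// value only means the browser queued the request, not that it was delivered.
if (typeof payload === 'string') {
const blob = new Blob([payload], { type: 'application/json' });
// The roughly 64KB limit applies to bytes, not to UTF-16 code units
if (blob.size < 65536 && navigator.sendBeacon(this.config.endpoint, blob)) {
return true;
}
}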
// Fallback to fetch with keepalive
try {
const response = await fetch(this.config.endpoint, {
method: 'POST',
headers: {
'Content-Type': this.config.enableCompression
? 'application/octet-stream'
: 'application/json',
'Content-Encoding': this.config.enableCompression ? 'gzip' : 'identity',
'X-API-Key': this.config.apiKey
},
body: payload,
keepalive: true
});
return response.ok;
} catch {
return false;
}
}
private async compress(data: string): Promise<ArrayBuffer> {
const encoder = new TextEncoder();
const stream = new CompressionStream('gzip');
const writer = stream.writable.getWriter();
writer.write(encoder.encode(data));
writer.close();
const reader = stream.readable.getReader();
const chunks: Uint8Array[] = [];
while (true) {
const { done, value } = await reader.read();
if (done) break;
chunks.push(value);
}
const totalLength = chunks.reduce((acc, chunk) => acc + chunk.length, 0);
const result = new Uint8Array(totalLength);
let offset = 0;
for (const chunk of chunks) {
result.set(chunk, offset);
offset += chunk.length;
}
return result.buffer;
}
private setupFlushTimer(): void {
this.flushTimer = window.setInterval(() => {
this.flush();
}, this.config.flushInterval);
}
private setupBeforeUnload(): void {
// visibilitychange is dispatched on document; listen there for consistent browser support
document.addEventListener('visibilitychange', () => {
if (document.visibilityState === 'hidden') {
this.flush();
}
});
window.addEventListener('pagehide', () => {
this.flush();
});
}
// IndexedDB for offline support
private async initializeOfflineStorage(): Promise<void> {
return new Promise((resolve, reject) => {
const request = indexedDB.open('telemetry', 1);
request.onerror = () => reject(request.error);
request.onupgradeneeded = () => {
const db = request.result;
if (!db.objectStoreNames.contains('pending')) {
db.createObjectStore('pending', { autoIncrement: true });
}
};
request.onsuccess = () => {
this.db = request.result;
resolve();
};
});
}
private async storeForRetry(events: TelemetryEvent[]): Promise<void> {
if (!this.db) return;
const transaction = this.db.transaction('pending', 'readwrite');
const store = transaction.objectStore('pending');
for (const event of events) {
store.add(event);
}
}
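private async sendPendingFromStorage(): Promise<void> {
if (!this.db) return;
const db = this.db;
// Read the pending events first. An IndexedDB transaction auto-commits once
// control returns to the event loop, so it cannot stay open across a network call.
const events = await new Promise<TelemetryEvent[]>((resolve, reject) => {
const request = db.transaction('pending', 'readonly').objectStore('pending').getAll();
request.onsuccess = () => resolve(request.result);
request.onerror = () => reject(request.error);
});
if (events.length === 0) return;
const success = await this.sendBatch(JSON.stringify({ events }));
if (success) {
// Clear in a fresh transaction now that the send has completed
db.transaction('pending', 'readwrite').objectStore('pending').clear();
}
}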
}
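The `track` method above samples each event independently, so a sampled session can end up with its errors but not the interactions that led to them. An alternative is to derive the decision from the session ID, so that events of a given type are kept or dropped consistently within a session. The sketch below illustrates the idea with an FNV-1a-style hash over UTF-16 code units; the per-type rates and function names are hypothetical.

```typescript
// Map a string to a stable number in [0, 1) using an FNV-1a-style hash.
function hashToUnitInterval(input: string): number {
  let hash = 0x811c9dc5; // FNV offset basis (32-bit)
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime, 32-bit multiply
  }
  return (hash >>> 0) / 0x100000000; // unsigned 32-bit value divided by 2^32
}

// Hypothetical per-event-type sample rates.
const DEFAULT_SAMPLE_RATES: Record<string, number> = {
  error: 1.0,
  'web-vital': 0.5,
  interaction: 0.1,
};

// Deterministic sampling: the same session and event type always get the
// same answer, on every page of the session.
function shouldSample(
  sessionId: string,
  eventType: string,
  rates: Record<string, number> = DEFAULT_SAMPLE_RATES
): boolean {
  const rate = rates[eventType] ?? 0.1; // fallback rate is an assumption
  return hashToUnitInterval(`${sessionId}:${eventType}`) < rate;
}
```

Because the decision is a pure function of the session ID, the backend can recompute it to scale sampled counts back up without the client sending its sample rate.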
Core Web Vitals Monitoring
LCP (Largest Contentful Paint) Tracking
// Comprehensive LCP tracking with attribution
interface LCPData {
value: number;
element: string;
elementType: string;
url?: string;
loadTime?: number;
renderDelay?: number;
attribution: LCPAttribution;
}
interface LCPAttribution {
ttfb: number;
resourceLoadDelay: number;
resourceLoadTime: number;
elementRenderDelay: number;
}
class LCPTracker {
private lcpEntries: PerformanceEntry[] = [];
private observer: PerformanceObserver | null = null;
start(): void {
this.observer = new PerformanceObserver((list) => {
const entries = list.getEntries();
this.lcpEntries.push(...entries);
});
this.observer.observe({ type: 'largest-contentful-paint', buffered: true });
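// Finalize a few seconds after load, or as soon as the page is hidden,
// whichever comes first. Waiting only for load would lose the value
// when the user leaves early.
window.addEventListener('load', () => {
setTimeout(() => this.finalize(), 5000);
});
document.addEventListener('visibilitychange', () => {
if (document.visibilityState === 'hidden') this.finalize();
});
}
private finalize(): void {
// Guard against double reporting: finalize() can be reached from both paths
if (!this.observer) return;
// Collect entries that were queued but not yet delivered to the callback
this.lcpEntries.push(...this.observer.takeRecords());
this.observer.disconnect();
this.observer = null;
if (this.lcpEntries.length === 0) return;
const finalEntry = this.lcpEntries[this.lcpEntries.length - 1] as any;
const lcpData = this.buildLCPData(finalEntry);
// Send to telemetry
this.report(lcpData);
}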
private buildLCPData(entry: any): LCPData {
const element = entry.element;
const elementType = element?.tagName || 'unknown';
const url = entry.url;
// Calculate attribution. Every part is measured on the navigation timeline,
// so the four parts sum to the LCP value.
const navEntry = performance.getEntriesByType('navigation')[0] as PerformanceNavigationTiming;
const ttfb = navEntry.responseStart;
// Find resource timing for LCP element (if applicable)
let resourceLoadDelay = 0;
let resourceLoadTime = 0;
if (url) {
const resourceEntry = performance
.getEntriesByType('resource')
.find(r => r.name === url) as PerformanceResourceTiming | undefined;
if (resourceEntry) {
// Delay between the first byte of the document and the start of the resource request
resourceLoadDelay = Math.max(0, resourceEntry.startTime - ttfb);
resourceLoadTime = Math.max(0, resourceEntry.responseEnd - resourceEntry.startTime);
}
}
const elementRenderDelay = Math.max(0, entry.startTime - (ttfb + resourceLoadDelay + resourceLoadTime));
return {
value: entry.startTime,
element: this.getElementSelector(element),
elementType,
url,
attribution: {
ttfb,
resourceLoadDelay,
resourceLoadTime,
elementRenderDelay
}
};
}
private report(data: LCPData): void {
// Categorize for alerting
const rating = this.getLCPRating(data.value);
telemetry.track({
type: 'web-vital',
name: 'LCP',
value: data.value,
rating,
...data
});
}
private getLCPRating(value: number): 'good' | 'needs-improvement' | 'poor' {
if (value <= 2500) return 'good';
if (value <= 4000) return 'needs-improvement';
return 'poor';
}
}
INP (Interaction to Next Paint) Tracking
// INP tracking with detailed attribution
interface INPData {
value: number;
eventType: string;
target: string;
attribution: INPAttribution;
}
interface INPAttribution {
inputDelay: number; // Time from interaction to handler start
processingTime: number; // Time in handler
presentationDelay: number; // Time from handler to paint
}
class INPTracker {
private interactions: Map<number, INPData> = new Map();
private observer: PerformanceObserver | null = null;
start(): void {
this.observer = new PerformanceObserver((list) => {
for (const entry of list.getEntries() as PerformanceEventTiming[]) {
// Only track discrete interactions
if (!['pointerdown', 'pointerup', 'keydown', 'keyup', 'click'].includes(entry.name)) {
continue;
}
const duration = entry.duration;
const interactionId = entry.interactionId;
if (!interactionId) continue;
// INP = max duration per interaction
const existing = this.interactions.get(interactionId);
if (!existing || duration > existing.value) {
this.interactions.set(interactionId, {
value: duration,
eventType: entry.name,
target: this.getTargetSelector(entry),
attribution: {
inputDelay: entry.processingStart - entry.startTime,
processingTime: entry.processingEnd - entry.processingStart,
presentationDelay: duration - (entry.processingEnd - entry.startTime)
}
});
}
}
});
this.observer.observe({ type: 'event', buffered: true, durationThreshold: 16 });
// Report periodically and on page hide
document.addEventListener('visibilitychange', () => {
if (document.visibilityState === 'hidden') {
this.reportINP();
}
});
}
private reportINP(): void {
if (this.interactions.size === 0) return;
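// INP ignores one of the slowest interactions for every 50 recorded, which
// approximates the 98th percentile on interaction-heavy pages
const sorted = [...this.interactions.values()].sort((a, b) => b.value - a.value);
const inp = sorted[Math.min(sorted.length - 1, Math.floor(sorted.length / 50))];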
const rating = this.getINPRating(inp.value);
telemetry.track({
type: 'web-vital',
name: 'INP',
value: inp.value,
rating,
...inp
});
}
private getINPRating(value: number): 'good' | 'needs-improvement' | 'poor' {
if (value <= 200) return 'good';
if (value <= 500) return 'needs-improvement';
return 'poor';
}
}
CLS (Cumulative Layout Shift) Tracking
// CLS tracking with shift source identification
interface CLSData {
value: number;
shifts: LayoutShiftInfo[];
largestShiftSource?: string;
}
interface LayoutShiftInfo {
value: number;
timestamp: number;
sources: string[];
hadRecentInput: boolean;
}
class CLSTracker {
// Worst session window seen so far; this is the value reported as CLS
private sessionValue = 0;
private sessionEntries: LayoutShiftInfo[] = [];
// Session window currently being accumulated
private windowValue = 0;
private windowEntries: LayoutShiftInfo[] = [];
private currentSessionStart = 0;
private lastEntryTime = 0;
start(): void {
const observer = new PerformanceObserver((list) => {
for (const entry of list.getEntries() as LayoutShift[]) {
// Ignore shifts from user input
if (entry.hadRecentInput) continue;
const currentTime = entry.startTime;
// Session window: a gap of 1s+ or a total span of 5s+ starts a new window
if (
currentTime - this.lastEntryTime > 1000 ||
currentTime - this.currentSessionStart > 5000
) {
this.currentSessionStart = currentTime;
this.windowValue = 0;
this.windowEntries = [];
}
this.windowValue += entry.value;
this.lastEntryTime = currentTime;
this.windowEntries.push({
value: entry.value,
timestamp: currentTime,
sources: this.getShiftSources(entry),
hadRecentInput: entry.hadRecentInput
});
// CLS is the largest session window, not the most recent one
if (this.windowValue > this.sessionValue) {
this.sessionValue = this.windowValue;
this.sessionEntries = [...this.windowEntries];
}
}
});
observer.observe({ type: 'layout-shift', buffered: true });
// Report on page hide
document.addEventListener('visibilitychange', () => {
if (document.visibilityState === 'hidden') {
this.reportCLS();
}
});
}
private getShiftSources(entry: LayoutShift): string[] {
if (!entry.sources || entry.sources.length === 0) {
return ['unknown'];
}
return entry.sources.map(source => {
const node = source.node;
if (!node) return 'removed-node';
if (node instanceof HTMLElement) {
return this.getElementSelector(node);
}
return node.nodeName;
});
}
private reportCLS(): void {
const rating = this.getCLSRating(this.sessionValue);
// Find largest shift source
const largestShift = this.sessionEntries.reduce(
(max, shift) => (shift.value > max.value ? shift : max),
this.sessionEntries[0]
);
telemetry.track({
type: 'web-vital',
name: 'CLS',
value: this.sessionValue,
rating,
shifts: this.sessionEntries,
largestShiftSource: largestShift?.sources[0]
});
}
private getCLSRating(value: number): 'good' | 'needs-improvement' | 'poor' {
if (value <= 0.1) return 'good';
if (value <= 0.25) return 'needs-improvement';
return 'poor';
}
}
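The trackers above report one value per page view; on the backend those values are usually summarized at the 75th percentile, which is the level at which Core Web Vitals thresholds are assessed. A minimal sketch of that aggregation follows, using the nearest-rank method; the function name and the choice of nearest-rank over interpolation are assumptions.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of samples are less than or equal to it. Returns undefined for no samples.
function percentile(samples: readonly number[], p: number): number | undefined {
  if (samples.length === 0) return undefined;
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  const index = Math.min(sorted.length - 1, Math.max(0, rank - 1));
  return sorted[index];
}

// Example: LCP values in milliseconds collected from four page views.
const lcpSamples = [4000, 1000, 3000, 2000];
console.log(percentile(lcpSamples, 75));
```

At billions of events per day an exact sort per query is impractical, so a production aggregation engine would typically substitute a streaming sketch for this function while keeping the same definition.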
Source Map Integration
Stack Trace Symbolication
┌─────────────────────────────────────────────────────────────────────────────┐
│ SOURCE MAP SYMBOLICATION FLOW │
│ │
│ Build Time: │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ [Source Code] ──build──▶ [Minified JS] + [Source Map] │ │
│ │ │ │
│ │ app.ts (500KB) ──▶ app.a1b2c3.js (50KB) + app.a1b2c3.js.map │ │
│ │ │ │
│ │ Source map upload: │ │
│ │ POST /api/sourcemaps │ │
│ │ { │ │
│ │ "version": "a1b2c3", │ │
│ │ "files": [ │ │
│ │ { "name": "app.a1b2c3.js.map", "content": "..." } │ │
│ │ ] │ │
│ │ } │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
│ Runtime (Error Occurs): │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ Minified Stack: │ │
│ │ Error: Cannot read property 'x' of undefined │ │
│ │ at e.handleClick (app.a1b2c3.js:1:54321) │ │
│ │ at o (app.a1b2c3.js:1:12345) │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ Server-Side Symbolication: │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ 1. Extract version from filename (a1b2c3) │ │
│ │ 2. Fetch source map from storage │ │
│ │ 3. Parse stack trace │ │
│ │ 4. For each frame: │ │
│ │ - Look up (line, column) in source map │ │
│ │ - Get original (file, line, column, name) │ │
│ │ 5. Reconstruct symbolicated stack │ │
│ │ │ │
│ │ Symbolicated Stack: │ │
│ │ Error: Cannot read property 'x' of undefined │ │
│ │ at UserProfile.handleClick (src/UserProfile.tsx:42:15) │ │
│ │ at onClick (src/Button.tsx:18:5) │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
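Steps 1 and 3 of the flow above (extracting the version and parsing the stack) can be sketched as follows. The regular expressions assume V8-style frames and a `name.<hash>.js` filename convention like the one in the diagram; other engines format frames differently, and both function names are illustrative.

```typescript
interface ParsedFrame {
  function?: string;
  filename: string;
  lineno: number;
  colno: number;
}

// Matches V8-style frames: "at fn (file:line:col)" or "at file:line:col".
const V8_FRAME = /^\s*at (?:(.+?) \()?(.+?):(\d+):(\d+)\)?\s*$/;

// Parse a raw stack string into frames, skipping lines that are not frames
// (such as the leading "Error: ..." message line).
function parseV8Stack(stack: string): ParsedFrame[] {
  const frames: ParsedFrame[] = [];
  for (const line of stack.split('\n')) {
    const m = V8_FRAME.exec(line);
    if (!m) continue;
    frames.push({
      function: m[1],
      filename: m[2],
      lineno: Number(m[3]),
      colno: Number(m[4]),
    });
  }
  return frames;
}

// Extract the content hash from a filename such as "app.a1b2c3.js".
// Assumes a lowercase hex hash of at least six characters.
function extractVersion(filename: string): string | undefined {
  const m = /\.([0-9a-f]{6,})\.js$/.exec(filename);
  return m ? m[1] : undefined;
}

const exampleStack = [
  "Error: Cannot read property 'x' of undefined",
  '    at e.handleClick (app.a1b2c3.js:1:54321)',
  '    at o (app.a1b2c3.js:1:12345)',
].join('\n');
console.log(parseV8Stack(exampleStack));
```

Deriving the version from the filename works only when every bundle is content-hashed; sending an explicit release identifier with each error is a more robust lookup key.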
Symbolication Service:
// Source map symbolication service
import { SourceMapConsumer, RawSourceMap } from 'source-map';
interface StackFrame {
filename: string;
lineno: number;
colno: number;
function?: string;
}
interface SymbolicatedFrame extends StackFrame {
originalFilename: string;
originalLineno: number;
originalColno: number;
originalFunction: string;
contextLines?: {
pre: string[];
line: string;
post: string[];
};
}
class SymbolicationService {
private sourceMapCache = new Map<string, SourceMapConsumer>();
private storage: SourceMapStorage;
async symbolicate(
stack: StackFrame[],
appVersion: string
): Promise<SymbolicatedFrame[]> {
const symbolicated: SymbolicatedFrame[] = [];
for (const frame of stack) {
try {
const symbolicatedFrame = await this.symbolicateFrame(frame, appVersion);
symbolicated.push(symbolicatedFrame);
} catch (error) {
// Return original frame if symbolication fails
symbolicated.push({
...frame,
originalFilename: frame.filename,
originalLineno: frame.lineno,
originalColno: frame.colno,
originalFunction: frame.function || '<anonymous>'
});
}
}
return symbolicated;
}
private async symbolicateFrame(
frame: StackFrame,
appVersion: string
): Promise<SymbolicatedFrame> {
const consumer = await this.getSourceMapConsumer(frame.filename, appVersion);
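// The source-map library uses 1-based lines but 0-based columns, while
// browser stack traces report 1-based columns
const originalPosition = consumer.originalPositionFor({
line: frame.lineno,
column: Math.max(0, frame.colno - 1)
});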
if (!originalPosition.source) {
throw new Error('Source map position not found');
}
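// Get source context. Passing true returns null instead of throwing when
// the source map does not embed the original source.
const sourceContent = consumer.sourceContentFor(originalPosition.source, true);
const contextLines = this.extractContextLines(
sourceContent,
originalPosition.line ?? 0
);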
return {
...frame,
originalFilename: originalPosition.source,
originalLineno: originalPosition.line || 0,
originalColno: originalPosition.column || 0,
originalFunction: originalPosition.name || '<anonymous>',
contextLines
};
}
private async getSourceMapConsumer(
filename: string,
appVersion: string
): Promise<SourceMapConsumer> {
const cacheKey = `${appVersion}:${filename}`;
if (this.sourceMapCache.has(cacheKey)) {
return this.sourceMapCache.get(cacheKey)!;
}
// Fetch source map
const sourceMapContent = await this.storage.getSourceMap(filename, appVersion);
const rawSourceMap: RawSourceMap = JSON.parse(sourceMapContent);
const consumer = await new SourceMapConsumer(rawSourceMap);
this.sourceMapCache.set(cacheKey, consumer);
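// Evict the oldest inserted entry (Map preserves insertion order) and
// release the memory held by its consumer
if (this.sourceMapCache.size > 100) {
const oldestKey = this.sourceMapCache.keys().next().value;
if (oldestKey !== undefined) {
this.sourceMapCache.get(oldestKey)?.destroy();
this.sourceMapCache.delete(oldestKey);
}
}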
return consumer;
}
private extractContextLines(
source: string | null,
line: number,
contextSize = 3
): { pre: string[]; line: string; post: string[] } | undefined {
if (!source) return undefined;
const lines = source.split('\n');
const targetLine = lines[line - 1];
if (!targetLine) return undefined;
return {
pre: lines.slice(Math.max(0, line - 1 - contextSize), line - 1),
line: targetLine,
post: lines.slice(line, line + contextSize)
};
}
}
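The cache in `getSourceMapConsumer` evicts in insertion order, so a frequently used source map can be dropped while a stale one survives. A least-recently-used policy is one alternative. The sketch below is a minimal version under stated assumptions: `LruCache` and its `onEvict` hook are hypothetical names, not part of the original class.

```typescript
// Hypothetical helper: a small LRU cache with a disposal hook.
class LruCache<K, V> {
  private entries = new Map<K, V>();
  private maxSize: number;
  private onEvict: (value: V, key: K) => void;

  constructor(maxSize: number, onEvict: (value: V, key: K) => void = () => {}) {
    this.maxSize = maxSize;
    this.onEvict = onEvict;
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    if (this.entries.has(key)) this.entries.delete(key);
    this.entries.set(key, value);
    while (this.entries.size > this.maxSize) {
      // Map iterates in insertion order, so the first key is the least recently used.
      const oldest = this.entries.keys().next();
      if (oldest.done) break;
      const evicted = this.entries.get(oldest.value) as V;
      this.entries.delete(oldest.value);
      this.onEvict(evicted, oldest.value);
    }
  }

  get size(): number {
    return this.entries.size;
  }
}
```

In the symbolicator, the hook could call `consumer.destroy()`, which the `source-map` 0.7 API provides for releasing a consumer's memory, as long as no in-flight symbolication still holds that consumer.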
User Session Tracking
Session Architecture
// Session management for observability
interface Session {
id: string;
userId?: string;
startTime: number;
lastActivityTime: number;
pageViews: number;
interactions: number;
errors: number;
device: DeviceInfo;
referrer?: string;
utmParams?: UTMParams;
}
interface DeviceInfo {
type: 'mobile' | 'tablet' | 'desktop';
os: string;
browser: string;
viewport: { width: number; height: number };
deviceMemory?: number;
hardwareConcurrency?: number;
connectionType?: string;
}
class SessionManager {
private session: Session | null = null;
private sessionTimeout = 30 * 60 * 1000; // 30 minutes
private activityTimer: number | null = null;
initialize(): void {
this.loadOrCreateSession();
this.setupActivityTracking();
this.setupBeforeUnload();
}
private loadOrCreateSession(): void {
const stored = sessionStorage.getItem('obs_session');
if (stored) {
try {
const session = JSON.parse(stored) as Session;
// Reuse the stored session only if it has not expired
if (Date.now() - session.lastActivityTime < this.sessionTimeout) {
this.session = session;
return;
}
} catch {
// Corrupted payload: fall through and create a fresh session
}
}
// Create new session
this.session = this.createSession();
this.persistSession();
}
private createSession(): Session {
return {
id: this.generateSessionId(),
userId: this.getUserId(),
startTime: Date.now(),
lastActivityTime: Date.now(),
pageViews: 1,
interactions: 0,
errors: 0,
device: this.getDeviceInfo(),
referrer: document.referrer,
utmParams: this.extractUTMParams()
};
}
private getDeviceInfo(): DeviceInfo {
return {
type: this.detectDeviceType(),
os: this.detectOS(),
browser: this.detectBrowser(),
viewport: {
width: window.innerWidth,
height: window.innerHeight
},
deviceMemory: (navigator as any).deviceMemory,
hardwareConcurrency: navigator.hardwareConcurrency,
connectionType: (navigator as any).connection?.effectiveType
};
}
private setupActivityTracking(): void {
const events = ['click', 'scroll', 'keydown'];
const updateActivity = throttle(() => {
if (this.session) {
this.session.lastActivityTime = Date.now();
this.persistSession();
}
}, 5000);
events.forEach(event => {
document.addEventListener(event, updateActivity, { passive: true });
});
}
recordPageView(): void {
if (this.session) {
this.session.pageViews++;
this.session.lastActivityTime = Date.now();
this.persistSession();
}
}
recordInteraction(): void {
if (this.session) {
this.session.interactions++;
this.persistSession();
}
}
recordError(): void {
if (this.session) {
this.session.errors++;
this.persistSession();
}
}
getSessionContext(): Partial<Session> {
if (!this.session) return {};
return {
id: this.session.id,
userId: this.session.userId,
pageViews: this.session.pageViews,
interactions: this.session.interactions,
errors: this.session.errors,
device: this.session.device
};
}
private persistSession(): void {
if (this.session) {
sessionStorage.setItem('obs_session', JSON.stringify(this.session));
}
}
private generateSessionId(): string {
// Combine timestamp with random for uniqueness
return `${Date.now().toString(36)}-${Math.random().toString(36).slice(2, 9)}`;
}
}
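`setupActivityTracking` calls a `throttle` helper that the listing does not define; it may come from a utility library such as lodash. If no library is available, a leading-edge throttle along these lines would work. This is a minimal sketch, and the injectable `now` parameter is an assumption added here so the helper can be exercised with a fake clock.

```typescript
// Minimal leading-edge throttle: the first call runs immediately, and
// further calls are ignored until `intervalMs` has elapsed.
function throttle<A extends unknown[]>(
  fn: (...args: A) => void,
  intervalMs: number,
  now: () => number = Date.now // assumption: injectable clock for testing
): (...args: A) => void {
  let lastRun = Number.NEGATIVE_INFINITY;
  return (...args: A): void => {
    const t = now();
    if (t - lastRun >= intervalMs) {
      lastRun = t;
      fn(...args);
    }
    // Calls inside the interval are dropped; there is no trailing invocation.
  };
}
```

Because there is no trailing call, `lastActivityTime` can lag real activity by up to the throttle interval. With a 5-second interval and a 30-minute session timeout, that error is negligible.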
Alerting Architecture
Alert Definition and Evaluation
// Frontend alerting system
interface AlertRule {
id: string;
name: string;
metric: string;
condition: AlertCondition;
window: number; // seconds
severity: 'critical' | 'warning' | 'info';
channels: string[];
tags: Record<string, string>;
enabled: boolean;
}
interface AlertCondition {
type: 'threshold' | 'anomaly' | 'absence';
operator?: 'gt' | 'lt' | 'gte' | 'lte';
value?: number;
percentile?: number;
sensitivityLevel?: 'low' | 'medium' | 'high';
}
interface Alert {
id: string;
ruleId: string;
status: 'firing' | 'resolved';
startTime: number;
resolvedTime?: number;
value: number;
threshold: number;
message: string;
context: Record<string, unknown>;
}
// Alert rules for frontend
const frontendAlertRules: AlertRule[] = [
// Core Web Vitals
{
id: 'lcp-degradation',
name: 'LCP P75 Degradation',
metric: 'web_vitals.lcp',
condition: {
type: 'threshold',
operator: 'gt',
value: 2500,
percentile: 75
},
window: 300, // 5 minutes
severity: 'warning',
channels: ['slack-frontend', 'pagerduty-secondary'],
tags: { team: 'frontend', category: 'performance' },
enabled: true
},
{
id: 'lcp-critical',
name: 'LCP P75 Critical',
metric: 'web_vitals.lcp',
condition: {
type: 'threshold',
operator: 'gt',
value: 4000,
percentile: 75
},
window: 300,
severity: 'critical',
channels: ['slack-frontend', 'pagerduty-primary'],
tags: { team: 'frontend', category: 'performance' },
enabled: true
},
// Error rates
{
id: 'js-error-spike',
name: 'JavaScript Error Rate Spike',
metric: 'errors.javascript.rate',
condition: {
type: 'anomaly',
sensitivityLevel: 'medium'
},
window: 600,
severity: 'warning',
channels: ['slack-frontend'],
tags: { team: 'frontend', category: 'errors' },
enabled: true
},
{
id: 'js-error-critical',
name: 'JavaScript Error Rate Critical',
metric: 'errors.javascript.rate',
condition: {
type: 'threshold',
operator: 'gt',
value: 5 // 5% error rate
},
window: 300,
severity: 'critical',
channels: ['slack-frontend', 'pagerduty-primary'],
tags: { team: 'frontend', category: 'errors' },
enabled: true
},
// API failures from frontend perspective
{
id: 'api-failure-rate',
name: 'Frontend API Failure Rate',
metric: 'api.error_rate',
condition: {
type: 'threshold',
operator: 'gt',
value: 1 // 1%
},
window: 300,
severity: 'warning',
channels: ['slack-frontend', 'slack-backend'],
tags: { team: 'frontend', category: 'api' },
enabled: true
},
// Rage clicks (user frustration)
{
id: 'rage-clicks',
name: 'Rage Click Detection',
metric: 'user.rage_clicks.rate',
condition: {
type: 'anomaly',
sensitivityLevel: 'high'
},
window: 900,
severity: 'warning',
channels: ['slack-product', 'slack-frontend'],
tags: { team: 'product', category: 'ux' },
enabled: true
},
// Data absence (telemetry pipeline issue)
{
id: 'telemetry-absence',
name: 'Telemetry Data Absence',
metric: 'telemetry.events.count',
condition: {
type: 'absence'
},
window: 300,
severity: 'critical',
channels: ['slack-infra', 'pagerduty-primary'],
tags: { team: 'infra', category: 'telemetry' },
enabled: true
}
];
// Alert evaluation engine
class AlertEvaluator {
private activeAlerts = new Map<string, Alert>();
async evaluate(rule: AlertRule): Promise<Alert | null> {
const metricValue = await this.queryMetric(
rule.metric,
rule.window,
rule.condition.percentile
);
const isTriggered = this.evaluateCondition(rule.condition, metricValue);
const existingAlert = this.activeAlerts.get(rule.id);
if (isTriggered && !existingAlert) {
// New alert
const alert: Alert = {
id: `${rule.id}-${Date.now()}`,
ruleId: rule.id,
status: 'firing',
startTime: Date.now(),
value: metricValue,
threshold: rule.condition.value || 0,
message: this.buildAlertMessage(rule, metricValue),
context: await this.gatherContext(rule)
};
this.activeAlerts.set(rule.id, alert);
await this.notifyChannels(alert, rule.channels);
return alert;
}
if (!isTriggered && existingAlert) {
// Resolve alert
existingAlert.status = 'resolved';
existingAlert.resolvedTime = Date.now();
await this.notifyChannels(existingAlert, rule.channels);
this.activeAlerts.delete(rule.id);
return existingAlert;
}
return null;
}
private evaluateCondition(condition: AlertCondition, value: number): boolean {
switch (condition.type) {
case 'threshold':
switch (condition.operator) {
case 'gt': return value > condition.value!;
case 'lt': return value < condition.value!;
case 'gte': return value >= condition.value!;
case 'lte': return value <= condition.value!;
default: return false;
}
case 'anomaly':
return this.detectAnomaly(value, condition.sensitivityLevel!);
case 'absence':
return value === 0;
default:
return false;
}
}
private async gatherContext(rule: AlertRule): Promise<Record<string, unknown>> {
// Gather additional context for debugging. The queries are independent,
// so run them in parallel to keep event-to-alert latency low.
const [affectedBrowsers, affectedPages, recentDeployments, relatedErrors] = await Promise.all([
this.queryTopBrowsers(rule.metric, rule.window),
this.queryTopPages(rule.metric, rule.window),
this.getRecentDeployments(),
this.getRelatedErrors(rule.metric)
]);
return { affectedBrowsers, affectedPages, recentDeployments, relatedErrors };
}
}
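The evaluator calls `detectAnomaly` for `anomaly` rules, but the listing does not show how it works. One common approach is a z-score test against a baseline of earlier windows. The sketch below assumes that approach; the extra `baseline` parameter and the cutoff values in `Z_THRESHOLDS` are illustrative placeholders, not the original implementation or tuned numbers.

```typescript
type Sensitivity = 'low' | 'medium' | 'high';

// Assumed z-score cutoffs: higher sensitivity means a lower cutoff.
const Z_THRESHOLDS: Record<Sensitivity, number> = { low: 4, medium: 3, high: 2 };

function mean(xs: number[]): number {
  return xs.reduce((sum, x) => sum + x, 0) / xs.length;
}

// Population standard deviation of the samples.
function stdDev(xs: number[]): number {
  const m = mean(xs);
  const variance = xs.reduce((sum, x) => sum + (x - m) ** 2, 0) / xs.length;
  return Math.sqrt(variance);
}

// Returns true when `value` deviates from the baseline mean by more than
// the cutoff for the given sensitivity. `baseline` holds the same metric
// from earlier evaluation windows.
function detectAnomaly(
  value: number,
  baseline: number[],
  sensitivity: Sensitivity
): boolean {
  if (baseline.length < 5) return false; // too little history to judge
  const m = mean(baseline);
  const sd = stdDev(baseline);
  if (sd === 0) return value !== m; // flat baseline: any change stands out
  return Math.abs(value - m) / sd > Z_THRESHOLDS[sensitivity];
}
```

This version flags drops as well as spikes. For metrics like error rate, where only increases matter, comparing `(value - m) / sd` without `Math.abs` would avoid alerting on improvements.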
Dashboard Architecture
Key Dashboard Views
┌─────────────────────────────────────────────────────────────────────────────┐
│ FRONTEND OBSERVABILITY DASHBOARDS │
│ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ 1. EXECUTIVE OVERVIEW │ │
│ │ │ │
│ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐│ │
│ │ │ LCP P75 │ │ INP P75 │ │ CLS P75 │ │ Error Rate ││ │
│ │ │ 2.1s ✓ │ │ 180ms ✓ │ │ 0.08 ✓ │ │ 0.3% ✓ ││ │
│ │ └─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘│ │
│ │ │ │
│ │ [7-day Web Vitals Trend Chart] │ │
│ │ [Traffic Volume by Region] │ │
│ │ [Active Alerts Summary] │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ 2. PERFORMANCE DEEP DIVE │ │
│ │ │ │
│ │ Filters: [Page] [Browser] [Device] [Connection] [Region] │ │
│ │ │ │
│ │ ┌───────────────────────────────────────────────────────────────┐ │ │
│ │ │ LCP Breakdown by Page │ │ │
│ │ │ /home ████████████████ 2.1s │ │ │
│ │ │ /product/* ██████████████████████ 2.8s │ │ │
│ │ │ /checkout ████████████████████████████ 3.4s │ │ │
│ │ └───────────────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ │ ┌───────────────────────────────────────────────────────────────┐ │ │
│ │ │ LCP Attribution │ │ │
│ │ │ TTFB: ████ 800ms │ │ │
│ │ │ Resource Load: ████████ 1200ms │ │ │
│ │ │ Render Delay: ██ 300ms │ │ │
│ │ └───────────────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ │ [Resource Waterfall] [Long Tasks Timeline] [CLS Sources] │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ 3. ERROR MONITORING │ │
│ │ │ │
│ │ ┌───────────────────────────────────────────────────────────────┐ │ │
│ │ │ Error Groups (Deduplicated) │ │ │
│ │ │ │ │ │
│ │ │ [!] TypeError: Cannot read property 'map' of undefined │ │ │
│ │ │ Users: 12,450 | Sessions: 8,230 | First: 2h ago │ │ │
│ │ │ src/ProductList.tsx:42 │ │ │
│ │ │ │ │ │
│ │ │ [!] NetworkError: Failed to fetch │ │ │
│ │ │ Users: 8,120 | Sessions: 5,670 | First: 4h ago │ │ │
│ │ │ src/api/client.ts:128 │ │ │
│ │ │ │ │ │
│ │ └───────────────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ │ [Error Rate Over Time] [Affected Browsers] [Stack Trace] │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ 4. USER EXPERIENCE │ │
│ │ │ │
│ │ ┌───────────────────────────────────────────────────────────────┐ │ │
│ │ │ Frustration Signals │ │ │
│ │ │ │ │ │
│ │ │ Rage Clicks: ███████ 2.1% of sessions │ │ │
│ │ │ Dead Clicks: █████ 1.4% of sessions │ │ │
│ │ │ Error + Bounce: ████████████ 4.2% of sessions │ │ │
│ │ └───────────────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ │ [Session Replay List] [User Journey Funnels] │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
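The frustration signals in the User Experience view depend on a concrete client-side definition of a rage click. A common heuristic is several clicks in quick succession within a small area. The sketch below assumes that heuristic; the `isRageClick` name and the default thresholds (3 clicks, 1 second, 30 pixels) are illustrative placeholders rather than values taken from this architecture.

```typescript
interface ClickSample {
  x: number;
  y: number;
  timestamp: number; // milliseconds
}

// Checks whether the most recent `minClicks` clicks form a rage click:
// all within `windowMs` of each other and within `radiusPx` of the first.
function isRageClick(
  clicks: ClickSample[],
  minClicks = 3,
  windowMs = 1000,
  radiusPx = 30
): boolean {
  if (clicks.length < minClicks) return false;
  const recent = clicks.slice(-minClicks);
  const first = recent[0];
  const last = recent[recent.length - 1];
  if (!first || !last) return false;
  // The burst must fit inside the time window.
  if (last.timestamp - first.timestamp > windowMs) return false;
  // Every click must land close to where the burst started.
  return recent.every(
    (c) => Math.hypot(c.x - first.x, c.y - first.y) <= radiusPx
  );
}
```

A dashboard figure such as "rage clicks in 2.1% of sessions" would then be the share of sessions in which this check fired at least once.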
Performance Overhead Management
Minimizing Observer Impact
// Performance-conscious observability
class LowOverheadObserver {
private isHighPriorityPage = false;
private samplingRate = 0.1; // 10% default
private batchBuffer: TelemetryEvent[] = [];
private idleCallbackId: number | null = null;
constructor() {
this.detectPagePriority();
this.adjustSamplingRate();
}
private detectPagePriority(): void {
// Critical pages get full observability
const criticalPaths = ['/checkout', '/payment', '/signup'];
this.isHighPriorityPage = criticalPaths.some(
path => window.location.pathname.startsWith(path)
);
}
private adjustSamplingRate(): void {
// Adjust based on page priority and device
if (this.isHighPriorityPage) {
this.samplingRate = 1.0; // 100% for critical pages
} else if (this.isLowEndDevice()) {
this.samplingRate = 0.01; // 1% for low-end devices
} else {
this.samplingRate = 0.1; // 10% default
}
}
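private isLowEndDevice(): boolean {
const memory: number | undefined = (navigator as any).deviceMemory;
const cores: number | undefined = navigator.hardwareConcurrency;
const connection: string | undefined = (navigator as any).connection?.effectiveType;
// Unknown values (APIs the browser does not expose) are treated as "not low-end"
return (
(memory !== undefined && memory < 4) ||
(cores !== undefined && cores < 4) ||
(connection !== undefined && ['slow-2g', '2g'].includes(connection))
);
}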
observe(eventType: string, handler: () => TelemetryEvent | null): void {
// Sample first
if (Math.random() > this.samplingRate) {
return;
}
// Use requestIdleCallback for non-critical work
if ('requestIdleCallback' in window) {
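requestIdleCallback(
(deadline) => {
// When the 2s timeout fires, timeRemaining() is 0. Run the handler
// anyway so the event is not silently dropped.
if (deadline.didTimeout || deadline.timeRemaining() > 5) {
const event = handler();
if (event) this.queueEvent(event);
}
},
{ timeout: 2000 }
);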
} else {
// Fallback: defer to next frame
requestAnimationFrame(() => {
setTimeout(() => {
const event = handler();
if (event) this.queueEvent(event);
}, 0);
});
}
}
private queueEvent(event: TelemetryEvent): void {
this.batchBuffer.push(event);
// Batch send during idle time
if (!this.idleCallbackId && this.batchBuffer.length >= 10) {
this.scheduleFlush();
}
}
private scheduleFlush(): void {
if ('requestIdleCallback' in window) {
this.idleCallbackId = requestIdleCallback(
() => this.flush(),
{ timeout: 10000 }
);
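} else {
// Track the timer ID too, so the fallback path does not schedule duplicate flushes
this.idleCallbackId = window.setTimeout(() => this.flush(), 5000);
}
}
private flush(): void {
// Clear the pending marker first so an empty buffer cannot block future flushes
this.idleCallbackId = null;
if (this.batchBuffer.length === 0) return;
const batch = this.batchBuffer.splice(0, this.batchBuffer.length);
this.sendBatch(batch);
}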
}
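The `observe` method samples each event independently with `Math.random()`, so a sampled session can contain some of its events and miss others. An alternative is to decide once per session by hashing the session ID, which keeps sampled sessions complete. The sketch below assumes that approach; `fnv1a` and `isSessionSampled` are hypothetical helpers, not part of the class above.

```typescript
// FNV-1a 32-bit hash, applied to UTF-16 code units rather than bytes.
// That variation is fine for sampling, where only spread matters.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force unsigned 32-bit
}

// Decide once per session: the same session ID always gets the same answer.
function isSessionSampled(sessionId: string, samplingRate: number): boolean {
  if (samplingRate >= 1) return true;
  if (samplingRate <= 0) return false;
  // Map the hash into [0, 1) and compare against the rate.
  return fnv1a(sessionId) / 0x100000000 < samplingRate;
}
```

The decision is also reproducible on the server from the session ID alone, which helps when scaling sampled counts back up to population estimates.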
Measuring Overhead
// Self-monitoring for observability overhead
class OverheadMonitor {
private longTasksFromObservers = 0;
private totalObserverTime = 0;
private measurements: number[] = [];
measureObserverImpact<T>(name: string, fn: () => T): T {
const start = performance.now();
const result = fn();
const duration = performance.now() - start;
this.measurements.push(duration);
this.totalObserverTime += duration;
// Alert if observer is taking too long
if (duration > 16) { // More than one frame
console.warn(`Observer ${name} took ${duration.toFixed(2)}ms`);
this.longTasksFromObservers++;
}
return result;
}
getOverheadReport(): OverheadReport {
return {
totalTime: this.totalObserverTime,
measurementCount: this.measurements.length,
averageTime: this.measurements.length > 0
? this.totalObserverTime / this.measurements.length
: 0,
longTasks: this.longTasksFromObservers,
p95Time: this.calculatePercentile(this.measurements, 95),
recommendation: this.getRecommendation()
};
}
private getRecommendation(): string {
const avg = this.totalObserverTime / Math.max(this.measurements.length, 1);
if (avg > 10) {
return 'CRITICAL: Observability overhead too high. Reduce sampling or simplify observers.';
} else if (avg > 5) {
return 'WARNING: Observability overhead elevated. Consider optimization.';
} else {
return 'OK: Observability overhead within acceptable limits.';
}
}
}
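`getOverheadReport` relies on a `calculatePercentile` method that the listing does not show. A nearest-rank percentile is one simple way to fill it in. The sketch below is written as a standalone function for clarity, and returning 0 for an empty input is an assumption that mirrors how `averageTime` falls back to 0.

```typescript
// Nearest-rank percentile: the smallest sample such that at least
// `percentile` percent of all samples are less than or equal to it.
function calculatePercentile(values: number[], percentile: number): number {
  if (values.length === 0) return 0; // assumption: empty input reports 0
  // Copy before sorting so the caller's array is left untouched.
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((percentile * sorted.length) / 100);
  const index = Math.min(sorted.length - 1, Math.max(0, rank - 1));
  return sorted[index];
}
```

Sorting on every call costs O(n log n). That is acceptable for an occasional report, but a long-lived page may want to cap the size of the `measurements` array.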
Summary
Frontend observability is not optional infrastructure—it's essential for understanding how your application behaves in the wild. Backend metrics tell you what your servers see; frontend observability tells you what users experience.
Key Architectural Principles:
- Capture Core Web Vitals with attribution - Not just the numbers, but why they are what they are. Attribution enables debugging.
- Sample intelligently - 100% capture for critical pages, reduced sampling for high-volume low-value events. Balance data fidelity with overhead.
- Buffer and batch - Never send one event at a time. Batch to reduce requests. Buffer for offline resilience.
- Use sendBeacon for page unload - Critical events must survive navigation. Send them with sendBeacon (or fetch with keepalive: true) on visibilitychange or pagehide; ordinary requests started during unload are often cancelled.
- Source maps are mandatory - Minified stack traces are useless. Automate source map upload in CI/CD.
- Session context enriches everything - Errors, performance issues, and user frustration are more actionable with session context.
- Alert on user impact, not infrastructure - Users don't care about server CPU. Alert on LCP, error rates, and rage clicks.
- Monitor your monitoring - The telemetry SDK adds overhead. Measure it. Keep it under 3%.
Frontend observability done right gives you superpowers: you can see performance regressions before users complain, debug errors with full context, and understand the true user experience. Done wrong, it's noise that bogs down your site and overwhelms your dashboards.
Invest in the architecture. The visibility it provides is worth it.