Engineering · February 28, 2026 · 12 min read

How we built real-time canvas sync for 10,000 concurrent users

A deep dive into the WebSocket architecture, CRDT conflict resolution, and the performance optimizations that power SketchPad's collaboration engine.

Alex Kim
SketchPad editorial team

The challenge

Real-time whiteboarding breaks down quickly when every cursor move, shape edit, and text mutation competes for the same network budget. Our target was ambitious: keep collaboration fluid at 10,000 concurrent users, even when a large workshop pushes thousands of simultaneous interactions onto a single canvas.

The core problem was not just transport. We needed a model that could absorb latency, preserve intent, and recover gracefully when clients disconnected or rejoined mid-session.

Transport and presence

We use Socket.IO for its automatic reconnection, heartbeat handling, and fallback to HTTP long-polling on networks where a WebSocket connection cannot be established. Presence updates are intentionally lightweight so cursor movement never blocks structural document changes.

That split between ephemeral presence and durable canvas mutations reduced bandwidth pressure and made the live experience feel more immediate.
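
To make the split concrete, here is a minimal sketch of how a client might treat the two kinds of traffic differently. It is not SketchPad's actual code: the `Transport` interface, the message shapes, and the `SyncChannel` class are all hypothetical names introduced for illustration. The idea is that presence keeps only the newest cursor position and may be dropped, while canvas mutations are retained until the server acknowledges them.

```typescript
// Hypothetical message shapes; the real wire format is not described in this post.
type CursorUpdate = { userId: string; x: number; y: number };
type Mutation = { id: number; elementId: string; patch: Record<string, unknown> };

// Minimal transport abstraction (an assumption) so the sketch does not depend
// on any particular client library.
interface Transport {
  sendLossy(event: string, payload: unknown): void;    // may be dropped
  sendReliable(event: string, payload: unknown): void; // must be delivered
}

class SyncChannel {
  private transport: Transport;
  private latestCursor: CursorUpdate | null = null;
  private unacked = new Map<number, Mutation>();

  constructor(transport: Transport) {
    this.transport = transport;
  }

  // Presence: only the newest position matters, so older ones are overwritten.
  setCursor(update: CursorUpdate): void {
    this.latestCursor = update;
  }

  // Intended to run on a timer; sends at most one presence message per call.
  flushPresence(): void {
    if (this.latestCursor === null) return;
    this.transport.sendLossy("presence", this.latestCursor);
    this.latestCursor = null;
  }

  // Durable mutations: sent immediately and kept until acknowledged.
  sendMutation(mutation: Mutation): void {
    this.unacked.set(mutation.id, mutation);
    this.transport.sendReliable("mutation", mutation);
  }

  ack(id: number): void {
    this.unacked.delete(id);
  }

  // After a reconnect, replay whatever the server never acknowledged.
  resendUnacked(): void {
    this.unacked.forEach((m) => this.transport.sendReliable("mutation", m));
  }

  pendingCount(): number {
    return this.unacked.size;
  }
}
```

With Socket.IO, one plausible mapping is `socket.volatile.emit(...)` for the lossy path and a regular `socket.emit(...)` with an acknowledgement callback for the reliable path, though the post does not say which calls the production system uses.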

Conflict resolution

For document state, we resolve conflicts per element rather than per canvas, in the CRDT style. Instead of forcing a full-canvas lock, we let users act locally and merge changes using deterministic rules for position, styling, and deletion conflicts.

That decision kept the interface responsive under contention while avoiding the operational complexity of synchronizing giant snapshots on every change.
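
As an illustration of what deterministic element-level rules can look like, the sketch below uses one common CRDT-style approach: a last-writer-wins register per property group, ordered by a logical clock with the client ID as a tie-breaker, plus a deletion flag that always wins. The post does not specify SketchPad's actual rules, so the types, the `mergeElement` function, and the "delete wins" policy are all assumptions.

```typescript
// Logical timestamp: higher clock wins; clientId breaks ties deterministically.
type Stamp = { clock: number; clientId: string };
type Field<T> = { value: T; stamp: Stamp };

// Hypothetical element model with independently mergeable property groups.
type ElementState = {
  position: Field<{ x: number; y: number }>;
  style: Field<{ fill: string }>;
  deleted: boolean; // tombstone; assumed here to win over concurrent edits
};

function isNewer(a: Stamp, b: Stamp): boolean {
  if (a.clock !== b.clock) return a.clock > b.clock;
  return a.clientId > b.clientId;
}

// Last-writer-wins register: keep whichever write carries the newer stamp.
function mergeField<T>(a: Field<T>, b: Field<T>): Field<T> {
  return isNewer(b.stamp, a.stamp) ? b : a;
}

// Each property group merges independently, so a move and a restyle made
// concurrently by two users are both preserved.
function mergeElement(a: ElementState, b: ElementState): ElementState {
  return {
    position: mergeField(a.position, b.position),
    style: mergeField(a.style, b.style),
    deleted: a.deleted || b.deleted,
  };
}
```

Because the result depends only on the stamps and not on arrival order, two replicas that see the same pair of states in opposite order end up with the same element, which is the property that lets clients act locally without a lock.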

Performance wins

The biggest gains came from aggressively narrowing the work each client had to do: viewport culling, dirty-region rendering, and batching state reconciliation on the animation frame.
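
Viewport culling and dirty-region rendering both reduce to rectangle intersection tests. The following is a minimal sketch under simplifying assumptions: axis-aligned bounding boxes, a linear scan, and hypothetical `Rect` and `Shape` types that are not taken from SketchPad's codebase.

```typescript
type Rect = { x: number; y: number; width: number; height: number };
type Shape = { id: string; bounds: Rect };

// Two axis-aligned rectangles overlap when they overlap on both axes.
// Rectangles that only touch along an edge are treated as not overlapping.
function intersects(a: Rect, b: Rect): boolean {
  return (
    a.x < b.x + b.width &&
    a.x + a.width > b.x &&
    a.y < b.y + b.height &&
    a.y + a.height > b.y
  );
}

// Viewport culling: skip every shape that lies entirely off-screen.
function visibleShapes(shapes: Shape[], viewport: Rect): Shape[] {
  return shapes.filter((s) => intersects(s.bounds, viewport));
}

// Dirty-region rendering: of the visible shapes, redraw only those that
// overlap the region that actually changed.
function shapesToRedraw(shapes: Shape[], viewport: Rect, dirty: Rect): Shape[] {
  return visibleShapes(shapes, viewport).filter((s) => intersects(s.bounds, dirty));
}
```

A linear scan is enough to show the idea; on very large canvases a spatial index such as a grid or quadtree would typically replace the `filter` call.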

Those optimizations kept interactions smooth on much larger canvases, which mattered more than any single backend tweak.
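
Batching reconciliation on the animation frame can be sketched as a small queue that coalesces incoming patches per element and applies them once per frame. The `FrameBatcher` class and `Patch` type below are hypothetical, and the scheduler is injected so the same code can run outside a browser.

```typescript
type Patch = { elementId: string; props: Record<string, unknown> };
type Scheduler = (callback: () => void) => void;

class FrameBatcher {
  private pending = new Map<string, Record<string, unknown>>();
  private scheduled = false;
  private schedule: Scheduler;
  private apply: (patches: Patch[]) => void;

  constructor(schedule: Scheduler, apply: (patches: Patch[]) => void) {
    this.schedule = schedule;
    this.apply = apply;
  }

  // Merge the patch into any pending patch for the same element, so several
  // updates arriving within one frame collapse into a single one.
  enqueue(patch: Patch): void {
    const existing = this.pending.get(patch.elementId) ?? {};
    this.pending.set(patch.elementId, { ...existing, ...patch.props });
    if (!this.scheduled) {
      this.scheduled = true;
      this.schedule(() => this.flush());
    }
  }

  // Runs once per scheduled frame: hand over the coalesced patches and reset.
  private flush(): void {
    const patches: Patch[] = [];
    this.pending.forEach((props, elementId) => {
      patches.push({ elementId, props });
    });
    this.pending.clear();
    this.scheduled = false;
    this.apply(patches);
  }
}
```

In a browser, the scheduler would typically wrap `requestAnimationFrame`, for example `(cb) => requestAnimationFrame(() => cb())`, so that however many remote updates arrive between frames, the canvas reconciles at most once per repaint.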