JavaScript Guide

The browser client is a single framework-agnostic class, RemoteFramebufferView. All decoding runs in a Web Worker that owns the WebSocket, the decoder, and a transferred OffscreenCanvas, so the main thread stays free for your UI.

The client lives in widgets/ and publishes as @habemus-papadum/rfb-widgets. During development, import from the source (../src/index); when consumed as a package, import from @habemus-papadum/rfb-widgets.

Quick start

import { RemoteFramebufferView } from "@habemus-papadum/rfb-widgets";

const view = new RemoteFramebufferView(document.getElementById("stage")!, {
  url: "ws://localhost:8765",
  onState: (s) => console.log("state:", s),
  onStats: (s) => console.log(s.transport, s.framesDisplayed, "fps-ish"),
});

// when you're done (route change, component unmount, ...):
view.dispose();

Pass either a <canvas> (used directly) or any container element (a canvas is created to fill it). The view sizes the canvas backing store, transfers it to the worker, opens the connection, and starts forwarding input events.
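As a rough illustration of what "sizing the backing store" involves, the sketch below multiplies the CSS size by the device-pixel ratio and applies an optional cap. The function name backingSize and the uniform-shrink behavior at the cap are assumptions for illustration, not the package's computeBackingSize; only the pwidth / pheight / ratio field names come from this guide.

```typescript
interface BackingSize {
  pwidth: number;  // physical (backing-store) width in device pixels
  pheight: number; // physical (backing-store) height in device pixels
  ratio: number;   // effective device-pixel ratio after any capping
}

// Hypothetical helper: CSS size x DPR, optionally capped on the longest side.
function backingSize(
  cssWidth: number,
  cssHeight: number,
  dpr: number,
  maxDim: number = Infinity,
): BackingSize {
  const longest = Math.max(cssWidth, cssHeight) * dpr;
  // Assumption: when the cap is exceeded, both axes shrink by the same factor
  // so the aspect ratio is preserved.
  const ratio = longest > maxDim ? dpr * (maxDim / longest) : dpr;
  return {
    pwidth: Math.max(1, Math.round(cssWidth * ratio)),
    pheight: Math.max(1, Math.round(cssHeight * ratio)),
    ratio,
  };
}
```

A 640×480 CSS box at DPR 2 yields a 1280×960 backing store; with a cap of 1024 the same box comes out at 1024×768 under this sketch's rule.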

Options

interface RfbViewOptions {
  url: string;                       // ws:// or wss:// endpoint
  workerFactory?: () => Worker;      // override worker construction (see CSP below)
  autoResize?: boolean;              // default true (ResizeObserver -> set_viewport)
  devicePixelRatio?: number;         // override window.devicePixelRatio
  maxBackingDimension?: number;      // cap backing pixels (decoder/GPU limits)
  imageOnly?: boolean;               // force the image transport (skip H.264)
  maxInflight?: number;              // client-side decode backpressure ceiling
  fit?: "contain" | "cover" | "fill"; // frame-vs-canvas AR handling (default "contain")
  background?: string;               // letterbox fill for "contain" (CSS color; default "#000")
  zoom?: number;                     // initial zoom on top of the fit (1 = fit)
  panX?: number; panY?: number;      // initial pan offset (backing device px)
  gestures?: boolean;                // wheel-zoom / drag-pan / pinch (default false)
  minZoom?: number; maxZoom?: number; // zoom clamp for gestures + zoomBy (default 0.05 .. 64)
  surface?: "2d" | "webgl" | "webgpu" | "auto"; // compositing backend (default "2d"; "auto" prefers GPU)
  token?: string;                    // auth credential sent in `hello` (e.g. a Google ID token)
  debug?: boolean;                   // verbose client-side console logging (default false)
  onState?: (s: ConnectionState) => void;
  onStats?: (s: Stats) => void;
  onError?: (e: Error) => void;
  onViewport?: (framing: ViewportFraming) => void; // fires on any fit/zoom/pan change
}

Fit modes. When the stream's aspect ratio differs from the canvas, fit decides: "contain" (default) letterboxes with background, "cover" crops, "fill" stretches each axis (the pre-fit-modes behavior). Change it live with view.setFit(fit, background?).
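The three fit modes reduce to a choice of scale factor. The sketch below computes the destination rectangle of the frame inside the canvas for each mode; fitRect is a hypothetical name, and centering the frame is an assumption (the real geometry lives in viewport.ts and may differ in detail).

```typescript
type Fit = "contain" | "cover" | "fill";

interface Rect { x: number; y: number; w: number; h: number; }

// Hypothetical helper: where a frameW x frameH frame lands in a canvasW x canvasH canvas.
function fitRect(frameW: number, frameH: number, canvasW: number, canvasH: number, fit: Fit): Rect {
  // "fill" stretches each axis independently to cover the whole canvas.
  if (fit === "fill") return { x: 0, y: 0, w: canvasW, h: canvasH };
  const sx = canvasW / frameW;
  const sy = canvasH / frameH;
  // "contain" uses the smaller scale (letterbox); "cover" the larger one (crop).
  const s = fit === "contain" ? Math.min(sx, sy) : Math.max(sx, sy);
  const w = frameW * s;
  const h = frameH * s;
  // Assumption: the scaled frame is centered in the canvas.
  return { x: (canvasW - w) / 2, y: (canvasH - h) / 2, w, h };
}
```

For a 1280×720 stream in an 800×800 canvas, "contain" gives an 800×450 rectangle with 175 px bars above and below, while "cover" scales the frame to 800 px tall and crops the sides.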

Display backend (surface). How the decoded frame is drawn to the canvas is pluggable across three backends: "2d" (default) composites with Canvas2D drawImage — the always-available floor, and already fast for a hardware VideoFrame; "webgl" and "webgpu" upload each frame to a GPU texture and sample it; "auto" prefers the GPU tier (WebGPU → WebGL2) and falls back to Canvas2D. If a requested GPU context is unavailable the view silently falls back, and the active backend is reported on stats.surface ("2d" | "webgl" | "webgpu"). All three are pixel-equivalent — same present geometry, same output, verified against a controlled test frame (including zoom/pan) — so the GPU paths exist for client-side compositing, not different pixels. Switch it live with view.setSurface(kind) (see Viewport). See the display backend design.
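The fallback order described above can be written as a small pure function. This is only a sketch: resolveSurface and GpuAvailability are hypothetical names, and the choice to drop straight to "2d" when an explicitly requested GPU backend is missing is an assumption (the text says only that the view falls back).

```typescript
type Surface = "2d" | "webgl" | "webgpu";
type SurfaceRequest = Surface | "auto";

interface GpuAvailability { webgpu: boolean; webgl: boolean; }

// Hypothetical helper: pick the backend that would actually be used.
function resolveSurface(requested: SurfaceRequest, avail: GpuAvailability): Surface {
  if (requested === "auto") {
    // "auto" prefers the GPU tier in the order WebGPU, then WebGL2, then Canvas2D.
    if (avail.webgpu) return "webgpu";
    if (avail.webgl) return "webgl";
    return "2d";
  }
  // Assumption: an unavailable explicit GPU request falls back directly to "2d".
  if (requested === "webgpu" && !avail.webgpu) return "2d";
  if (requested === "webgl" && !avail.webgl) return "2d";
  return requested;
}
```

Whatever the real rule is, stats.surface is the place to read the outcome rather than inferring it from the option you passed.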

Which backend pays? Measured (see the benchmark in that proposal's Measurements section): at 1280×720 the per-frame main-thread present cost is negligible for all three backends — under 0.1 ms at p95 to submit setFrame+present, because the actual pixel-moving is a deferred GPU/compositor blit, not main-thread work. So the default is "2d": it costs nothing extra on the thread and drawImage of a hardware VideoFrame is already a GPU blit. Reach for "webgl"/"webgpu" (or FrameTextureFeed, below) when you need the frame as a texture in your own scene (compositing, overlays, later zoom/pan) — that is the reason the GPU paths exist, not a faster present. Absolute GPU throughput is workload- and hardware-specific; run the benchmark on your target to compare there. The worker→main frame transfer + upload that FrameTextureFeed (Mode B) adds is measured too (proposal Measurements §15.1): the cross-thread handoff is a handle move below the timer floor, and uploading a transferred frame costs the same as an on-thread one.

Debug logging. debug: true turns on a tagged console play-by-play from the main thread and the decode worker ([rfb:worker] ws / config / keyframe / frame and [rfb:view] state), so you can watch the connection negotiate, keyframes get requested, and frames decode. Genuine failures (WebSocket error, decoder error, image-decode throw) are logged to console.error whether or not debug is set, because silently swallowing them was a footgun; debug layers the verbose stream on top. The pdum-rfb demo UI exposes this as a toggle (also honored from ?debug=1 / localStorage).

Coordinate mapping. The client owns a single frame↔canvas transform (viewport.ts), so it maps every pointer/wheel event to framebuffer pixels through the current fit before sending: the publisher receives coordinates that index its frame directly, correct under any fit / DPR (see Input events).

Wide gamut. Wide-gamut streams (those the server tagged color=DISPLAY_P3) render on a matching display-p3 canvas automatically.

ConnectionState is connecting | open | negotiated | closed | error. Stats reports the local decode side — framesDisplayed, framesDropped, lastDisplayedSeq, decodeQueueSize, transport (image | webcodecs | none), and surface (the active draw path, 2d | webgl | webgpu). When the server is started with stats_interval (and/or adaptive), it also pushes authoritative server-truth metrics that Stats surfaces as optional fields: serverRttMs, serverFpsSent, serverBitrateBps, serverEncodeMs, serverDropped, and the adaptive targetBitrate / targetFps (undefined until the server sends them). For the full loop and a worked stats-HUD example, see Metrics & adaptive quality.
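If you want a frames-per-second figure for a HUD, one option is to difference framesDisplayed between two stats callbacks. The sketch below assumes framesDisplayed is a running total, which this guide does not state explicitly; FpsMeter is a hypothetical helper, not part of the package.

```typescript
// Hypothetical helper: turns a cumulative frame counter into a rate.
class FpsMeter {
  private lastCount: number | null = null;
  private lastTimeMs = 0;

  /**
   * Feed the counter and a timestamp in milliseconds. Returns frames per
   * second since the previous sample, or null when no rate can be computed
   * (first sample, or a timestamp that did not advance).
   */
  sample(framesDisplayed: number, nowMs: number): number | null {
    const prevCount = this.lastCount;
    const prevTimeMs = this.lastTimeMs;
    this.lastCount = framesDisplayed;
    this.lastTimeMs = nowMs;
    if (prevCount === null || nowMs <= prevTimeMs) return null;
    return ((framesDisplayed - prevCount) * 1000) / (nowMs - prevTimeMs);
  }
}
```

In a page this would be fed from the callback, for example onStats: (s) => meter.sample(s.framesDisplayed, performance.now()). When the server pushes serverFpsSent, that value describes what was sent rather than what was displayed, so the two can legitimately differ.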

Viewport: crop / zoom / pan

The client frames the stream itself — the server never zooms and never learns your zoom/pan. One shared encoded stream, N viewers each looking wherever they like, and the client does its own coordinate transforms so input events still map back to frame space. The transform is a uniform zoom on top of the fit plus a backing-pixel pan; the default (zoom: 1, no pan) is exactly the classic fit.

Drive it imperatively:

view.setViewport({ zoom: 2, panX: -120 });   // any subset of { fit, zoom, panX, panY, background }
view.zoomBy(1.25, cssX, cssY);                // multiply zoom, anchored at a CSS point (default: center)
view.applyPreset("one-to-one");               // named preset: "fit" | "cover" | "fill" | "one-to-one"
view.resetView();                             // back to fit (zoom 1, no pan)
const { fit, zoom, panX, panY } = view.getViewport();

Everything re-presents the retained last frame immediately (present is decoupled from decode), so zoom/pan stay smooth even on a sparse stream, with no wire traffic. onViewport fires on every change so chrome can show the current zoom. Events remain correct under any zoom/pan — the same viewport.ts geometry that draws the frame inverts a pointer back to frame pixels (inside is false once you pan a click off the frame).
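To make the geometry concrete, here is a minimal sketch of a uniform zoom on top of a fit scale plus a backing-pixel pan, with the inverse mapping used for input and an anchored zoom. All names (Framing, backingToFrame, zoomAbout) are hypothetical, the frame is assumed to be centered before panning, and only a uniform fit scale ("contain" / "cover") is modeled; viewport.ts is the authoritative version.

```typescript
interface Size { w: number; h: number; }

// fitScale is the uniform scale chosen by the fit mode; zoom multiplies it.
interface Framing { fitScale: number; zoom: number; panX: number; panY: number; }

// Inverse transform: a point in backing pixels -> frame pixels, plus an inside flag.
function backingToFrame(bx: number, by: number, frame: Size, canvas: Size, f: Framing) {
  const k = f.fitScale * f.zoom;
  const x = frame.w / 2 + (bx - canvas.w / 2 - f.panX) / k;
  const y = frame.h / 2 + (by - canvas.h / 2 - f.panY) / k;
  const inside = x >= 0 && x < frame.w && y >= 0 && y < frame.h;
  return { x, y, inside };
}

// Multiply the zoom while keeping the frame point under (ax, ay) fixed on screen.
function zoomAbout(f: Framing, factor: number, ax: number, ay: number, canvas: Size): Framing {
  const cx = ax - canvas.w / 2; // anchor relative to the canvas center
  const cy = ay - canvas.h / 2;
  return {
    fitScale: f.fitScale,
    zoom: f.zoom * factor,
    panX: cx - (cx - f.panX) * factor,
    panY: cy - (cy - f.panY) * factor,
  };
}
```

The anchor here is in backing pixels; a CSS point such as the one passed to view.zoomBy would first be multiplied by the device-pixel ratio. Because the same transform is used in both directions, a click lands on the same frame pixel before and after an anchored zoom.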

Gestures (gestures: true, off by default): the wheel zooms toward the cursor, a middle- or right-button drag pans, and a two-finger pinch zooms/pans. Left-button drag, clicks, and hover still pass through to the publisher, so app interaction is unaffected; only the pan/zoom gestures are consumed. The batteries <RemoteFramebuffer> turns gestures on and adds zoom + / − / fit buttons.

Switching the display backend live

view.setSurface("2d" | "webgl" | "webgpu" | "auto") changes the draw path on a running view. A canvas's context type is immutable — a canvas that yielded a 2d context can never yield webgl2 — so this cannot mutate the existing canvas: it tears down and rebuilds the view/worker with the new backend and reconnects the stream (a brief reconnect flash; the current fit/zoom/pan carry over). It therefore only works when the view owns its canvas — i.e. you constructed it with a container element (as the batteries wrappers do), not a caller-provided <canvas> (whose context is already bound). The batteries chrome exposes this as a backend dropdown next to capture PNG.

Recording what the viewer sees

Record the live view — exactly what's on screen, fit/zoom/pan and all — to a video file in the browser. This is the client-side counterpart to the server-side Display.record(...) tap (the Python guide): the display canvas is captured with HTMLCanvasElement.captureStream() and encoded by a MediaRecorder (WebM/MP4 per browser support).

import { MainThreadPresentView, recordingSupported } from "@habemus-papadum/rfb-widgets";

const view = new MainThreadPresentView(container, { url: "wss://…" });
if (view.canRecord) {
  view.startRecording({ frameRate: 30 });          // optional: mimeType, videoBitsPerSecond, timeslice
  // …later…
  const blob = await view.stopRecording();          // a WebM (or MP4) Blob of what was shown
  const a = Object.assign(document.createElement("a"), {
    href: URL.createObjectURL(blob),
    download: "framebuffer.webm",
  });
  a.click();
}

captureStream() needs a real on-screen canvas. Recording is only available on a view that owns a main-thread <canvas>: MainThreadPresentView (the anywidget's default present path). The default RemoteFramebufferView (Mode A) transfers its canvas to the decode worker via transferControlToOffscreen(), so the main thread has no drawable surface to capture, and OffscreenCanvas.captureStream() is not portable. On that view canRecord is false and startRecording() throws RECORDING_UNSUPPORTED_MESSAGE, so guard with view.canRecord.

API: view.canRecord (capability gate — MediaRecorder + a main-thread canvas), view.isRecording, view.startRecording(options?), and view.stopRecording(): Promise<Blob>. dispose() cancels an in-progress recording. Helpers recordingSupported() and pickRecordingMimeType(preferred?) are exported for feature-detection. The batteries chrome surfaces this as a record toggle next to capture PNG (disabled when the active view can't record). Sparse/on-demand scenes record sparse frames — the view redraws (and thus the stream captures) on each displayed frame.
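Container support differs by browser, so feature detection usually means walking a preference list and taking the first type the recorder accepts. The sketch below shows that pattern with the support check injected; pickMimeType and the candidate order are assumptions for illustration and are not the package's pickRecordingMimeType.

```typescript
// Assumed preference order; adjust to taste.
const RECORDING_CANDIDATES: string[] = [
  "video/webm;codecs=vp9",
  "video/webm;codecs=vp8",
  "video/webm",
  "video/mp4",
];

// Hypothetical helper: first candidate the predicate accepts, else undefined.
function pickMimeType(
  candidates: string[],
  isSupported: (mime: string) => boolean,
): string | undefined {
  for (const mime of candidates) {
    if (isSupported(mime)) return mime;
  }
  return undefined;
}
```

In a browser the predicate would be (m) => MediaRecorder.isTypeSupported(m), and the result could be passed as the mimeType option of startRecording. An undefined result means none of the listed types is supported, in which case leaving mimeType unset lets the browser choose.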

Frame-as-texture (Mode B): composite the stream into your own scene

RemoteFramebufferView owns a canvas and presents the frame for you (Mode A). When you instead want the decoded frame as a GPU texture in your own WebGL2/WebGPU context — to composite it into a hardware scene (a windowed frame + a corner minimap + a client-drawn GUI overlay, custom blends) — use FrameTextureFeed. A decode worker owns the WebSocket + decoder and transfers each decoded frame to your thread; the feed uploads it into a texture; you drive your own render loop and sample it.

import { FrameTextureFeed } from "@habemus-papadum/rfb-widgets";

const gl = myCanvas.getContext("webgl2")!;              // YOUR context and render loop
const feed = new FrameTextureFeed({ url: "wss://…", gl,
  onFrame: () => scheduleRender(),                      // a new frame is in feed.texture
});
// in your rAF loop:
drawScene(gl, feed.texture, feed.frameW, feed.frameH);  // sample it like any texture

  • WebGL2: feed.texture is the live WebGLTexture. Its UV origin (0,0) is the frame's top-left; v increases toward the bottom.
  • WebGPU: pass device instead of gl. feed.currentTexture() is a persistent RGBA GPUTexture you can sample and retain; feed.importCurrentFrame() is a zero-copy GPUExternalTexture for the current VideoFrame — call it inside the render pass that samples it (it expires at task end).

You own the compositing and the cadence; decode and present are naturally decoupled (the worker decodes, your loop renders). See the display backend design (§6) for the full model.

Authentication

Pass token (e.g. a Google OAuth ID token your page already obtained) and it is sent in the hello message; the server's authenticate hook verifies it before streaming and closes the socket with code 4401 if it's rejected (see the Python guide). Resolve the token before constructing the view; for short-lived tokens you currently reconnect with a fresh one (there is no built-in refresh/reconnect yet).
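Since there is no built-in refresh, a page holding a short-lived token has to decide for itself when to rebuild the view. If the token is a JWT (Google ID tokens are), its exp claim gives the expiry in seconds since the epoch. The sketch below reads that claim without verifying anything; tokenExpiryMs and needsFreshToken are hypothetical helpers, the 60-second margin is arbitrary, and the server's authenticate hook remains the only real check.

```typescript
const B64URL = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

// Minimal base64url decoder; adequate for the ASCII-only claims read here.
function base64UrlToAscii(input: string): string {
  let bits = 0;
  let nBits = 0;
  let out = "";
  for (let i = 0; i < input.length; i++) {
    const v = B64URL.indexOf(input.charAt(i));
    if (v < 0) continue; // skip padding and anything outside the alphabet
    bits = ((bits << 6) | v) & 0xffff;
    nBits += 6;
    if (nBits >= 8) {
      nBits -= 8;
      out += String.fromCharCode((bits >> nBits) & 0xff);
    }
  }
  return out;
}

// Expiry in milliseconds since the epoch, or null if the token is not a JWT with exp.
function tokenExpiryMs(jwt: string): number | null {
  const parts = jwt.split(".");
  const payload = parts[1];
  if (parts.length !== 3 || payload === undefined) return null;
  try {
    const claims = JSON.parse(base64UrlToAscii(payload));
    return typeof claims.exp === "number" ? claims.exp * 1000 : null;
  } catch (_err) {
    return null;
  }
}

// True when the token expires within marginMs of nowMs (or already has).
function needsFreshToken(jwt: string, nowMs: number, marginMs: number = 60000): boolean {
  const exp = tokenExpiryMs(jwt);
  return exp !== null && nowMs >= exp - marginMs;
}
```

A page could poll needsFreshToken(token, Date.now()) on a timer and, when it returns true, obtain a new token, dispose the old view, and construct a new one. Opaque tokens yield null from tokenExpiryMs and are never flagged.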

Methods/getters: view.state, view.stats, view.surface, view.lastCaptureSeq, view.capture("imagedata" | "blob") (a debug/test hook that reads back the current canvas pixels); the viewport controls view.setFit, view.setViewport, view.zoomBy, view.applyPreset, view.resetView, view.getViewport, and view.setSurface; recording (MainThreadPresentView only) view.canRecord, view.startRecording, view.stopRecording (see Recording); and view.dispose().

Framework integration

The core (@habemus-papadum/rfb-widgets) has no framework dependency, and there are thin idiomatic wrappers for the big three. Each ships two tiers:

  • Tier 1 (headless). A hook / action / primitive that owns the view lifecycle and exposes reactive state / stats / error + capture / reconnect. No markup, no CSS: you render and style everything.
  • Tier 2 (batteries). A <RemoteFramebuffer> component with a status pill, a compact latency badge, a toggleable stats HUD, an error banner, and a toolbar (screenshot / record / fullscreen / transport toggle / HUD toggle, a display-backend selector, and zoom + / − / fit controls; crop/zoom/pan gestures are on). Opt-in stylesheet, fully themeable (see Theming).

| Framework | Package | Tier 1 | Tier 2 |
| --- | --- | --- | --- |
| React (≥18) | @habemus-papadum/rfb-react | useRemoteFramebuffer / useRemoteFramebufferStats | <RemoteFramebuffer> |
| Svelte (5) | @habemus-papadum/rfb-svelte | createRemoteFramebuffer (use: action + stores) | <RemoteFramebuffer> |
| Solid (≥1.8) | @habemus-papadum/rfb-solid | createRemoteFramebuffer (ref + signals) | <RemoteFramebuffer> |

Each wrapper declares the core as a peer dependency, so you install both (e.g. pnpm add @habemus-papadum/rfb-react @habemus-papadum/rfb-widgets react react-dom). The Web Worker is inlined in the core, so no extra bundler config is needed.

Recreate-on-change: the core has no setters, so changing a connect-critical option (url, token, imageOnly, dpr, maxBackingDimension, maxInflight, autoResize) disposes and rebuilds the connection — the remote stream genuinely restarts. Cosmetic props and fresh callback closures do not recreate it.
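The recreate rule amounts to comparing a fixed set of keys between the previous and next props. A minimal sketch, assuming shallow equality and using the option names listed above; needsRecreate and ConnectProps are hypothetical and do not describe how any particular wrapper is implemented:

```typescript
interface ConnectProps {
  url: string;
  token?: string;
  imageOnly?: boolean;
  dpr?: number;
  maxBackingDimension?: number;
  maxInflight?: number;
  autoResize?: boolean;
}

// The connect-critical options named in this guide.
const CONNECT_CRITICAL: ReadonlyArray<keyof ConnectProps> = [
  "url", "token", "imageOnly", "dpr", "maxBackingDimension", "maxInflight", "autoResize",
];

// True when any connect-critical option differs; other props are ignored.
function needsRecreate<T extends ConnectProps>(prev: T, next: T): boolean {
  return CONNECT_CRITICAL.some((key) => prev[key] !== next[key]);
}
```

The practical consequence is to keep these values stable across renders: deriving url from freshly concatenated state is fine because strings compare by value, whereas anything that changes on every render would restart the stream each time.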

React

import { RemoteFramebuffer, useRemoteFramebuffer, useRemoteFramebufferStats } from "@habemus-papadum/rfb-react";
import "@habemus-papadum/rfb-react/styles.css"; // only needed for the batteries component

// Batteries:
<RemoteFramebuffer url="ws://localhost:8765" style={{ width: 640, height: 480 }} />;

// Headless: build your own UI on the hook.
function MyView({ url }: { url: string }) {
  const { containerRef, state, view } = useRemoteFramebuffer({ url });
  const stats = useRemoteFramebufferStats(view); // opt-in; no re-render storm at frame rate
  return (
    <div style={{ width: 640, height: 480 }}>
      <div ref={containerRef} style={{ width: "100%", height: "100%" }} />
      <span>{state} · {stats.transport}</span>
    </div>
  );
}

Svelte

<script lang="ts">
  import { RemoteFramebuffer, createRemoteFramebuffer } from "@habemus-papadum/rfb-svelte";
  import "@habemus-papadum/rfb-svelte/styles.css";

  // Headless: `use:` action + stores.
  const fb = createRemoteFramebuffer({ url: "ws://localhost:8765" });
  const { state, stats } = fb;
</script>

<!-- Batteries -->
<RemoteFramebuffer url="ws://localhost:8765" style="width:640px;height:480px" />

<!-- Headless -->
<div class="viewport" use:fb.action={{ url: "ws://localhost:8765" }}></div>
<p>{$state} · {$stats.transport}</p>

Solid

import { RemoteFramebuffer, createRemoteFramebuffer } from "@habemus-papadum/rfb-solid";
import "@habemus-papadum/rfb-solid/styles.css";

// Batteries:
<RemoteFramebuffer url="ws://localhost:8765" style={{ width: "640px", height: "480px" }} />;

// Headless: ref + signals (pass an accessor for reactive connect params).
function MyView(props: { url: string }) {
  const fb = createRemoteFramebuffer(() => ({ url: props.url }));
  return (
    <div style={{ width: "640px", height: "480px" }}>
      <div ref={fb.ref} style={{ width: "100%", height: "100%" }} />
      <span>{fb.state()} · {fb.stats().transport}</span>
    </div>
  );
}

Theming the batteries component

Tier 1 ships no CSS. Tier 2's stylesheet is opt-in and restyleable three ways, without forking:

  1. CSS custom properties on .rfb-root: --rfb-accent, --rfb-bg, --rfb-fg, --rfb-overlay-bg, --rfb-status-{connecting,open,closed,error}, --rfb-radius, --rfb-font, … Override on any ancestor to reskin.
  2. Stable part classes: .rfb-root[data-state], .rfb-viewport, .rfb-toolbar, .rfb-button, .rfb-status, .rfb-badge, .rfb-hud, .rfb-banner, for precise CSS targeting.
  3. Structural replacement: React/Solid renderStatus / renderToolbar / renderHud / renderError render-props (each given the reactive chrome context) and children; Svelte named slots. Drop regions entirely with toolbar={false} / hud={false} / status={false} / badge={false}.

Other frameworks / vanilla

The core class works anywhere — instantiate in a mount hook, dispose() on cleanup:

import { RemoteFramebufferView } from "@habemus-papadum/rfb-widgets";
const view = new RemoteFramebufferView(el, { url: "ws://localhost:8765" });
// … later …
view.dispose();

For example, in Vue: onMounted(() => (view = new RemoteFramebufferView(el.value!, { url }))) and onBeforeUnmount(() => view?.dispose()).

Input events

The view captures DOM events on the canvas and forwards normalized versions to the server, following the renderview spec — the event vocabulary shared by jupyter_rfb / pygfx / fastplotlib — so events feed those consumers without translation. It forwards pointermove/down/up, wheel, and keydown/keyup, and:

  • sends pointer/wheel x/y as physical framebuffer pixels (top-left origin): the worker maps CSS → backing → frame through the current fit (viewport.ts), so the publisher receives coordinates that index its frame directly — correct under any fit mode or DPR. It also adds inside (false in letterbox padding / a cover crop) and pixel_ratio (the frame's render DPR echo), so a publisher rendering in logical coordinates can divide it out;
  • reports button as renderview's 0=none, 1=left, 2=right, 3=middle and buttons as the tuple of currently-pressed buttons (not a DOM bitmask);
  • capitalizes modifiers: "Shift", "Control", "Alt", "Meta";
  • keeps a code (physical-key) field on key events — an additive extra over renderview — and a timestamp (seconds) on every input event;
  • normalizes wheel deltaMode (line/page) to pixels;
  • sets tabindex on the canvas so it can receive keyboard focus, and uses setPointerCapture so drags that leave the canvas keep reporting;
  • observes resize (and DPR changes) and sends set_viewport (logical width/height, physical pwidth/pheight, ratio), after which the worker resizes the OffscreenCanvas and requests a fresh keyframe.
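The button and wheel rules in the list above translate DOM conventions into renderview ones. The sketch below shows the idea using the standard DOM values (MouseEvent.button 0/1/2 = left/middle/right, MouseEvent.buttons bits 1/2/4 = left/right/middle, WheelEvent.deltaMode 0/1/2 = pixel/line/page). The function names are hypothetical stand-ins for the package's mapButton / mapButtons, the line and page sizes are guesses, and buttons beyond the three listed are simplified to 0.

```typescript
// DOM MouseEvent.button -> renderview button (0=none, 1=left, 2=right, 3=middle).
function toRenderviewButton(domButton: number): number {
  switch (domButton) {
    case 0: return 1; // left
    case 1: return 3; // middle (DOM "auxiliary")
    case 2: return 2; // right
    default: return 0; // -1 (no button changed, e.g. pointermove) and unmapped buttons
  }
}

// DOM MouseEvent.buttons bitmask -> list of currently pressed renderview buttons.
function toRenderviewButtons(domButtons: number): number[] {
  const pressed: number[] = [];
  if (domButtons & 1) pressed.push(1); // left
  if (domButtons & 2) pressed.push(2); // right
  if (domButtons & 4) pressed.push(3); // middle
  return pressed;
}

// Normalize a wheel delta to pixels. linePx and pagePx are assumed values.
function wheelDeltaToPixels(
  delta: number,
  deltaMode: number,
  linePx: number = 16,
  pagePx: number = 800,
): number {
  if (deltaMode === 1) return delta * linePx; // DOM_DELTA_LINE
  if (deltaMode === 2) return delta * pagePx; // DOM_DELTA_PAGE
  return delta;                               // DOM_DELTA_PIXEL
}
```

Note the swap that trips people up: the DOM numbers the middle button 1 and the right button 2 in button, while renderview uses 2 for right and 3 for middle.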

The server receives the common event vocabulary ({type, x, y, button, buttons, modifiers, timestamp}, etc.); you drain it (tagged with client_id/principal) from display.poll_events() in your own loop (see the Python guide).

Transport selection

The worker probes WebCodecs (VideoDecoder.isConfigSupported) and advertises webcodecs/h264-annexb only when avc1 decode is actually supported; otherwise it advertises image formats only. The server then picks H.264 or the image path. Force the image path with imageOnly: true (useful for debugging or environments without H.264 decode).
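The decision itself is small once the probe result is known. A sketch, with the probe outcome passed in as a boolean; advertisedTransports is a hypothetical name, and "image" is a placeholder because the concrete image format identifiers are not listed in this guide:

```typescript
// Hypothetical helper: what the client would advertise to the server.
function advertisedTransports(h264DecodeSupported: boolean, imageOnly: boolean): string[] {
  const transports: string[] = [];
  // H.264 is offered only when decode support was confirmed and not overridden.
  if (h264DecodeSupported && !imageOnly) transports.push("webcodecs/h264-annexb");
  // Placeholder entry standing in for the image formats, which are always offered.
  transports.push("image");
  return transports;
}
```

In a browser the boolean could come from (await VideoDecoder.isConfigSupported({ codec: "avc1.42E01E" })).supported, guarded by a check that VideoDecoder exists at all. The exact codec string the worker probes is not given here, so that one is a guess.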

Worker packaging & CSP

By default the worker is inlined into the published bundle (Vite ?worker&inline), so RemoteFramebufferView works with any bundler — or none — with zero worker configuration. The cost is that it constructs the worker from a blob: URL, which requires the CSP directive worker-src blob:.

For strict-CSP sites that disallow blob: workers, the package ships the same worker as a standalone, self-contained ES module at the /worker subpath export (@habemus-papadum/rfb-widgets/worker). Point workerFactory at it via whatever asset-URL mechanism your bundler provides, and the worker loads from a real, cacheable URL under worker-src 'self' — no blob: needed:

// Vite: `?url` yields the emitted asset's URL.
import rfbWorkerUrl from "@habemus-papadum/rfb-widgets/worker?url";

new RemoteFramebufferView(el, {
  url,
  workerFactory: () => new Worker(rfbWorkerUrl, { type: "module" }),
});

// webpack 5 / Rollup / Parcel: the `new URL(..., import.meta.url)` form is
// statically detected and emits the worker as a real asset.
new RemoteFramebufferView(el, {
  url,
  workerFactory: () =>
    new Worker(new URL("@habemus-papadum/rfb-widgets/worker", import.meta.url), { type: "module" }),
});

The worker is versioned in lockstep with the main entry (same package), so it never drifts from the client that drives it. You no longer need to vendor src/worker/entry.ts.

Advanced: protocol & helpers

The package also exports the lower-level pieces for custom integrations: unpackBinaryMessage / packBinaryMessage, probeCapabilities / isCodecSupported, BackpressureController / KeyframeGate, the event normalizers (normalizePointerEvent, pointerToCanvas, mapButton/mapButtons, computeBackingSize, …), and all the wire/event TypeScript types. See Internals for the wire format and worker design.
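For orientation, the sketch below shows the kind of logic a decode backpressure ceiling and a keyframe gate tend to combine: cap the number of frames in flight, and after skipping any frame, refuse delta frames until the next keyframe (an H.264 delta frame cannot be decoded without its references). DecodeGate is a hypothetical class written for this guide; it is not the package's BackpressureController or KeyframeGate, whose APIs are not documented here.

```typescript
// Hypothetical combined gate: inflight ceiling plus wait-for-keyframe after a skip.
class DecodeGate {
  private readonly maxInflight: number;
  private inflight = 0;
  private waitingForKey = true; // decoding has to start on a keyframe

  constructor(maxInflight: number) {
    this.maxInflight = maxInflight;
  }

  /** Returns true if the frame should be submitted to the decoder. */
  admit(isKeyframe: boolean): boolean {
    // A delta frame is useless while we are waiting for a keyframe.
    if (this.waitingForKey && !isKeyframe) return false;
    if (this.inflight >= this.maxInflight) {
      // Skipping this frame breaks the reference chain for later deltas.
      this.waitingForKey = true;
      return false;
    }
    this.waitingForKey = false;
    this.inflight++;
    return true;
  }

  /** Call when the decoder outputs (or fails) a previously admitted frame. */
  settle(): void {
    if (this.inflight > 0) this.inflight--;
  }
}
```

A real client would also ask the server for a fresh keyframe when it enters the waiting state, rather than idling until one happens to arrive.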

Building & developing

pnpm install
pnpm dev          # demo at http://localhost:5173 (?ws=...&transport=image|video)
pnpm typecheck    # tsc for library + worker (separate DOM / WebWorker libs)
pnpm test         # Vitest unit tests
pnpm build        # dist/index.js (+ .d.ts), worker inlined
pnpm e2e          # Playwright headless e2e (boots the Python server + demo)