
Terminal session streaming: minimal message shape that stays correct

A durable terminal stream is small and explicit: Input(text), Output(bytes), Resize(cols, rows). Treat output as a byte stream and keep limits strict.

Published: 2026-01-02 · Last updated: 2026-01-04

Direct answer

A minimal terminal streaming protocol is: Input(text) from client to server, Output(bytes) from server to client, and Resize(cols, rows) from client to server. Output must be treated as raw bytes (not “lines”), resize must be explicit, and the server must bound message sizes and enforce session identity.

Mechanism (why “minimal” is safer and more durable)

1) Terminals are byte streams, not text boxes

A terminal produces bytes that include:

  • printable characters
  • ANSI escape codes
  • cursor movement
  • screen clearing
  • partial UTF-8 sequences (depending on timing)

If you “helpfully” split on newlines or assume JSON strings are safe for output, you will eventually corrupt the display.

Rule: send output as bytes; let the terminal emulator render it.
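
To make the rule concrete, here is a minimal sketch of what goes wrong when each output chunk is decoded as text on its own. The function names are illustrative and not taken from any library; the example assumes a chunk boundary that happens to fall in the middle of a two-byte UTF-8 character.

```rust
/// Decode each chunk on its own, the way a "send strings" design would.
/// Incomplete or invalid sequences become U+FFFD replacement characters.
fn decode_per_chunk(chunks: &[&[u8]]) -> String {
    chunks
        .iter()
        .map(|chunk| String::from_utf8_lossy(chunk).into_owned())
        .collect()
}

/// Forward the bytes untouched and decode only after reassembly,
/// which is roughly what a terminal emulator does with its input buffer.
fn decode_whole_stream(chunks: &[&[u8]]) -> String {
    let joined: Vec<u8> = chunks.concat();
    String::from_utf8_lossy(&joined).into_owned()
}
```

With the chunks `b"caf\xC3"` and `b"\xA9\n"` (the two bytes of "é" split across the boundary), per-chunk decoding yields two replacement characters, while the byte-preserving path yields "café" intact.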

2) Input is not symmetrical with output

User input is typically UTF-8 text (key presses, paste, shortcuts). Output is a byte stream.

A practical split is:

  • client -> server: Input(String) (UTF-8)
  • server -> client: Output(Vec<u8>) (bytes)

If you need special keys (arrows, ctrl), the client can encode them into the input stream the same way it would for a local terminal.
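
As a rough illustration, a client might map its UI key events to the sequences a local terminal would send. The `Key` enum and `encode_key` function below are hypothetical names; the encodings assume normal (non-application) cursor mode, where arrow keys are sent as ESC [ A through ESC [ D and Ctrl plus a letter is the letter's code masked with 0x1f.

```rust
/// Hypothetical key model for a client UI; the names are illustrative.
enum Key {
    Char(char),
    Ctrl(char), // Ctrl plus a letter, e.g. Ctrl('c')
    ArrowUp,
    ArrowDown,
    ArrowRight,
    ArrowLeft,
}

/// Encode a key as the text a local terminal would send (normal cursor mode).
fn encode_key(key: Key) -> String {
    match key {
        Key::Char(c) => c.to_string(),
        // Ctrl+letter maps to a control code: Ctrl-C is 0x03, Ctrl-D is 0x04.
        Key::Ctrl(c) if c.is_ascii_alphabetic() => {
            (((c.to_ascii_uppercase() as u8) & 0x1f) as char).to_string()
        }
        // Assumption: anything else passes through unchanged.
        Key::Ctrl(c) => c.to_string(),
        Key::ArrowUp => "\x1b[A".to_string(),
        Key::ArrowDown => "\x1b[B".to_string(),
        Key::ArrowRight => "\x1b[C".to_string(),
        Key::ArrowLeft => "\x1b[D".to_string(),
    }
}
```

The result is ordinary text, so it travels in the same Input(String) message as typed characters and needs no extra message type.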

3) Resize must be a first-class message

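On desktop you can ignore resizing for a while, because windows change size rarely. On mobile you cannot: rotation, the on-screen keyboard, and address bar changes all alter the viewport.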

The terminal needs the current dimensions:

  • columns (character cells)
  • rows

Without explicit resize, full-screen programs (curses, editors, pagers) break quickly.

4) Session identity is part of the protocol even if it isn’t a field

You can model session identity in the URL path rather than inside each message, but it is still a protocol contract:

  • the server interprets a connection as “attach to this session”
  • the server refuses attach if the session is unauthorized/expired
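
One way to picture this contract is a check that runs before any message is exchanged. The sketch below is only an assumption about how a server might do it: the `/sessions/<id>/attach` path shape, the `SessionRecord` fields, and the error names are all hypothetical.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum AttachError {
    BadPath,
    UnknownSession,
    Expired,
    Unauthorized,
}

/// Hypothetical server-side record for one terminal session.
struct SessionRecord {
    owner: String,
    expires_at: u64, // seconds since some epoch; the unit is an assumption
}

/// Extract the session id from a path shaped like `/sessions/<id>/attach`.
fn session_id_from_path(path: &str) -> Option<&str> {
    let id = path.strip_prefix("/sessions/")?.strip_suffix("/attach")?;
    if id.is_empty() || id.contains('/') {
        None
    } else {
        Some(id)
    }
}

/// Decide whether this connection may attach, before any message is read.
fn authorize_attach<'a>(
    path: &'a str,
    user: &str,
    now: u64,
    sessions: &HashMap<String, SessionRecord>,
) -> Result<&'a str, AttachError> {
    let id = session_id_from_path(path).ok_or(AttachError::BadPath)?;
    let record = sessions.get(id).ok_or(AttachError::UnknownSession)?;
    if now >= record.expires_at {
        return Err(AttachError::Expired);
    }
    if record.owner != user {
        return Err(AttachError::Unauthorized);
    }
    Ok(id)
}
```

Because the check happens at connection time, the Input, Output, and Resize messages themselves never need to carry a session field.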

A minimal message shape (illustrative)

This is the concept, not a library commitment:

Client -> Server:
  Input("ls -la\n")
  Resize(104, 28)

Server -> Client:
  Output(<bytes>)
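
The same shape can be sketched as Rust types with a very small binary encoding. This is an illustration under stated assumptions, not a wire format the note commits to: the one-byte tags are hypothetical, and the encoding assumes the transport already preserves message boundaries (as a WebSocket binary message would).

```rust
#[derive(Debug, PartialEq)]
enum ClientMsg {
    Input(String),
    Resize { cols: u16, rows: u16 },
}

#[derive(Debug, PartialEq)]
enum ServerMsg {
    Output(Vec<u8>),
}

// Hypothetical one-byte tags; any stable numbering would do.
const TAG_INPUT: u8 = 0x01;
const TAG_RESIZE: u8 = 0x02;
const TAG_OUTPUT: u8 = 0x03;

fn encode_client(msg: &ClientMsg) -> Vec<u8> {
    match msg {
        ClientMsg::Input(text) => {
            let mut frame = vec![TAG_INPUT];
            frame.extend_from_slice(text.as_bytes());
            frame
        }
        ClientMsg::Resize { cols, rows } => {
            let mut frame = vec![TAG_RESIZE];
            frame.extend_from_slice(&cols.to_be_bytes());
            frame.extend_from_slice(&rows.to_be_bytes());
            frame
        }
    }
}

/// Returns None for empty frames, unknown tags, invalid UTF-8 in Input,
/// or a Resize body that is not exactly four bytes.
fn decode_client(frame: &[u8]) -> Option<ClientMsg> {
    let (&tag, body) = frame.split_first()?;
    match tag {
        TAG_INPUT => String::from_utf8(body.to_vec()).ok().map(ClientMsg::Input),
        TAG_RESIZE => match body {
            &[c0, c1, r0, r1] => Some(ClientMsg::Resize {
                cols: u16::from_be_bytes([c0, c1]),
                rows: u16::from_be_bytes([r0, r1]),
            }),
            _ => None,
        },
        _ => None,
    }
}

fn encode_server(msg: &ServerMsg) -> Vec<u8> {
    match msg {
        // Output bytes are copied verbatim; no decoding or line splitting.
        ServerMsg::Output(bytes) => {
            let mut frame = vec![TAG_OUTPUT];
            frame.extend_from_slice(bytes);
            frame
        }
    }
}
```

Note the asymmetry: Input is validated as UTF-8 on decode, while Output is passed through untouched.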

Implementation choices you can make without changing the shape:

  • binary frames vs JSON frames
  • message batching (multiple Output chunks per frame)
  • compression off by default (small interactive frames gain little from it, and it adds latency variability)

Common failure modes

Failure mode: “Output is sent as UTF-8 strings”

Symptom: random rendering corruption, broken cursor state, garbled colors.

Root cause: output bytes are not guaranteed to be valid UTF-8 sequences at chunk boundaries.

Fix: send bytes; keep chunk boundaries arbitrary.

Failure mode: “Resize is inferred, not explicit”

Symptom: layout breaks after rotation or address bar changes.

Root cause: server never learns real cols/rows; it guesses.

Fix: client measures viewport; client sends Resize; server applies to the session.
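
A minimal sketch of the client side of this fix, assuming the client knows its viewport size and the pixel size of one character cell. The names `grid_size`, `ResizeTracker`, and the `MAX_DIM` bound are hypothetical; the tracker exists only to avoid resending an unchanged size when the viewport fires repeated events.

```rust
// Hypothetical upper bound on either dimension.
const MAX_DIM: f64 = 1000.0;

/// Convert a viewport in pixels into whole character cells.
/// Returns None if the cell size is not positive or the result is out of range.
fn grid_size(viewport_w: f64, viewport_h: f64, cell_w: f64, cell_h: f64) -> Option<(u16, u16)> {
    if !(cell_w > 0.0 && cell_h > 0.0) {
        return None;
    }
    let cols = (viewport_w / cell_w).floor();
    let rows = (viewport_h / cell_h).floor();
    let in_range = |v: f64| v >= 1.0 && v <= MAX_DIM;
    if in_range(cols) && in_range(rows) {
        Some((cols as u16, rows as u16))
    } else {
        None
    }
}

/// Remembers the last size sent so repeated viewport events do not spam Resize.
#[derive(Default)]
struct ResizeTracker {
    last_sent: Option<(u16, u16)>,
}

impl ResizeTracker {
    /// Returns Some(size) when a Resize message should be sent, None otherwise.
    fn update(&mut self, size: (u16, u16)) -> Option<(u16, u16)> {
        if self.last_sent == Some(size) {
            None
        } else {
            self.last_sent = Some(size);
            Some(size)
        }
    }
}
```

For example, an 832 by 504 pixel viewport with 8 by 18 pixel cells gives 104 columns and 28 rows; after rotation to 504 by 832 the same cells give 63 columns and 46 rows, which the tracker reports as a change.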

Failure mode: “Message sizes are unbounded”

Symptom: memory spikes or stuck connections on paste or on very verbose output.

Root cause: server accepts arbitrarily large Input; client tries to render huge frames.

Fix: enforce a maximum message size on Input; split Output into bounded chunks; truncate or reject anything past a safe threshold.
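
The two limits can be sketched as follows. The function names and the error type are hypothetical, and this version rejects oversized input rather than truncating it, which is one of several reasonable policies.

```rust
#[derive(Debug, PartialEq)]
enum LimitError {
    InputTooLarge { len: usize, max: usize },
}

/// Reject an Input message whose UTF-8 byte length exceeds the limit.
fn check_input(text: &str, max_bytes: usize) -> Result<&str, LimitError> {
    if text.len() > max_bytes {
        Err(LimitError::InputTooLarge { len: text.len(), max: max_bytes })
    } else {
        Ok(text)
    }
}

/// Split output into chunks of at most `max_chunk` bytes.
/// Boundaries are arbitrary; the bytes are never inspected or decoded.
fn chunk_output(bytes: &[u8], max_chunk: usize) -> Vec<&[u8]> {
    // `chunks` panics on a size of zero, so clamp to at least one byte.
    bytes.chunks(max_chunk.max(1)).collect()
}
```

Because output is treated as raw bytes, chunking needs no awareness of escape sequences or character boundaries; the client simply feeds each chunk to the emulator in order.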

Failure mode: “Multiple clients attach and interleave input”

Symptom: keystrokes collide; output becomes confusing.

Root cause: no policy for multi-attach.

Fix: allow only one attached client at a time, or implement a controlled handoff (a new attach closes the old connection).
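
Both policies fit in a small state machine. The sketch below uses hypothetical names (`Session`, `Policy`, `AttachOutcome`) and leaves the actual closing of the old connection to the caller.

```rust
type ClientId = u64;

enum Policy {
    /// Refuse a second attach while a client is connected.
    SingleClient,
    /// A new attach takes over; the caller closes the old connection.
    Handoff,
}

#[derive(Debug, PartialEq)]
enum AttachOutcome {
    Attached,
    ReplacedOld(ClientId),
    Rejected,
}

struct Session {
    policy: Policy,
    attached: Option<ClientId>,
}

impl Session {
    fn attach(&mut self, new_client: ClientId) -> AttachOutcome {
        match (self.attached, &self.policy) {
            (None, _) => {
                self.attached = Some(new_client);
                AttachOutcome::Attached
            }
            (Some(old), Policy::Handoff) => {
                self.attached = Some(new_client);
                AttachOutcome::ReplacedOld(old)
            }
            (Some(_), Policy::SingleClient) => AttachOutcome::Rejected,
        }
    }

    /// Only the currently attached client may send Input.
    fn accepts_input_from(&self, client: ClientId) -> bool {
        self.attached == Some(client)
    }
}
```

Either way, there is exactly one source of Input at any moment, so keystrokes from two clients cannot interleave.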

How to verify

  1. Byte correctness

    • Run commands that emit ANSI sequences (colors, progress bars), then disconnect and reconnect.
    • Expected: no corruption after reconnects.

  2. Full-screen apps

    • Run a curses UI and rotate the device.
    • Expected: redraw behaves correctly; no broken wraps.

  3. Large paste

    • Paste a long command or file fragment.
    • Expected: server enforces limits; UI stays responsive.

Scope boundary

This note describes protocol shape and failure modes. It does not prescribe a production exposure model. Any public-facing terminal requires auth, rate limiting, and strict process/filesystem containment.