
NoHang Client — The Reliability Layer Under Agent Workflows

How to build an OpenAI-compatible async client that doesn’t hang: semaphores, timeouts, retries, and SSE streaming.

Most “agent failures” aren’t philosophical — they’re operational:

  • hung HTTP calls
  • unbounded concurrency
  • rate-limit storms
  • streaming treated like JSON

This post explains the design of a small wedge: an OpenAI-compatible async chat client that stays stable under long-running workloads.

The four failure modes (and the fixes)

1) Unbounded concurrency

If you asyncio.gather() 200 tasks with no limit, you’ll spike connection counts and hit 429s.

Fix: a semaphore. One client instance owns a hard concurrency ceiling.
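
A minimal sketch of that ceiling, assuming httpx as the HTTP layer; NoHangClient, chat(), and the endpoint path are illustrative names, not NoHang’s actual API:

    import asyncio
    import httpx  # assumption: httpx is the underlying HTTP library

    class NoHangClient:  # hypothetical name, for illustration only
        def __init__(self, base_url: str, api_key: str, max_concurrency: int = 8):
            # One semaphore per client instance: a hard ceiling on in-flight requests.
            self._sem = asyncio.Semaphore(max_concurrency)
            self._http = httpx.AsyncClient(
                base_url=base_url,
                headers={"Authorization": f"Bearer {api_key}"},
            )

        async def chat(self, payload: dict) -> dict:
            # gather() may schedule 200 tasks, but only max_concurrency
            # of them ever hold a connection at the same time.
            async with self._sem:
                resp = await self._http.post("/v1/chat/completions", json=payload)
                resp.raise_for_status()
                return resp.json()

Callers keep their plain asyncio.gather(); the ceiling lives inside the client, so no call site can forget it.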

2) No hard timeouts

Without timeouts, a single stuck call can stall a whole run.

Fix: set connect/read/total timeouts and handle failure deterministically.
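
A sketch of the same idea, again assuming httpx; the per-phase values are illustrative, and the total deadline uses asyncio.timeout (Python 3.11+):

    import asyncio
    import httpx

    # Per-phase timeouts: never wait forever to connect, read, write,
    # or acquire a pooled connection. Values here are illustrative.
    TIMEOUT = httpx.Timeout(connect=5.0, read=30.0, write=10.0, pool=5.0)
    client = httpx.AsyncClient(timeout=TIMEOUT)

    async def chat_once(payload: dict) -> dict:
        # asyncio.timeout adds a total wall-clock deadline on top of
        # httpx's per-phase limits, so a stuck call fails deterministically.
        async with asyncio.timeout(120):
            resp = await client.post("/v1/chat/completions", json=payload)
            resp.raise_for_status()
            return resp.json()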

3) Retry storms

Retries are necessary, but naive retries multiply traffic at the worst moment.

Fix: retry only transient failures, use exponential backoff + jitter, respect Retry-After.
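
A sketch of that retry policy, assuming httpx; which status codes count as transient, and the backoff cap, are illustrative choices:

    import asyncio
    import random
    import httpx

    TRANSIENT = {429, 500, 502, 503, 504}  # retry these; fail fast on other 4xx

    async def post_with_retries(client: httpx.AsyncClient, url: str,
                                payload: dict, attempts: int = 5) -> httpx.Response:
        for attempt in range(attempts):
            last = attempt == attempts - 1
            try:
                resp = await client.post(url, json=payload)
            except (httpx.TimeoutException, httpx.ConnectError):
                if last:
                    raise
                delay = min(2 ** attempt, 30) + random.uniform(0, 1)
            else:
                if resp.status_code not in TRANSIENT:
                    resp.raise_for_status()  # non-transient errors are never retried
                    return resp
                if last:
                    resp.raise_for_status()
                # Respect the server's Retry-After when present; otherwise use
                # exponential backoff with jitter to avoid synchronized storms.
                # (Assumes a numeric Retry-After; the HTTP-date form isn't handled.)
                retry_after = resp.headers.get("Retry-After")
                delay = (float(retry_after) if retry_after
                         else min(2 ** attempt, 30) + random.uniform(0, 1))
            await asyncio.sleep(delay)
        raise AssertionError("unreachable: last attempt returns or raises")

The jitter matters as much as the backoff: without it, every client that failed at the same moment retries at the same moment too.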

4) Streaming isn’t JSON

Providers often stream via SSE:

  • data: {...}
  • data: [DONE]

Fix: parse SSE and aggregate deltas into a normal response shape.
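
A sketch of the SSE path, assuming httpx and an OpenAI-style chunk shape (choices[0].delta.content):

    import json
    import httpx

    async def stream_chat(client: httpx.AsyncClient, payload: dict) -> str:
        # Read the response as Server-Sent Events and stitch the deltas
        # back together, so callers get the same shape as a non-streaming call.
        parts: list[str] = []
        async with client.stream("POST", "/v1/chat/completions",
                                 json={**payload, "stream": True}) as resp:
            resp.raise_for_status()
            async for line in resp.aiter_lines():
                if not line.startswith("data: "):
                    continue  # blank separators and comments are not events
                data = line[len("data: "):]
                if data == "[DONE]":
                    break  # sentinel, not JSON: never json.loads() this
                chunk = json.loads(data)
                delta = chunk["choices"][0]["delta"].get("content")
                if delta:
                    parts.append(delta)
        return "".join(parts)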

Why this is a wedge (not a framework)

NoHang doesn’t replace orchestration. It makes orchestration safe:

  • you can run many stages concurrently without melting down
  • you can close sessions cleanly and avoid leaked sockets (see the shutdown sketch after this list)
  • you can stream reliably and still return consistent outputs
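
A sketch of that clean shutdown, extending the hypothetical NoHangClient from the first example with async context-manager support:

    import asyncio

    class NoHangClient:
        # ... __init__ and chat() as in the first sketch ...

        async def __aenter__(self) -> "NoHangClient":
            return self

        async def __aexit__(self, *exc) -> None:
            # Close the pooled session explicitly so sockets are released
            # now, not whenever the garbage collector gets around to it.
            await self._http.aclose()

    # Usage: stages share one bounded client, and shutdown is deterministic.
    async def run_all(payloads: list[dict]) -> list[dict]:
        async with NoHangClient("https://api.example.com", "sk-...", 8) as client:
            return await asyncio.gather(*(client.chat(p) for p in payloads))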
