NoHang Client — The Reliability Layer Under Agent Workflows
How to build an OpenAI-compatible async client that doesn’t hang: semaphores, timeouts, retries, and SSE streaming.
Most “agent failures” aren’t philosophical — they’re operational:
- hung HTTP calls
- unbounded concurrency
- rate-limit storms
- streaming treated like JSON
This post explains the design of a small wedge: an OpenAI-compatible async chat client that stays stable under long-running workloads.
The four failure modes (and the fixes)
1) Unbounded concurrency
If you gather() 200 tasks, you’ll spike connections and hit 429s.
Fix: a semaphore. One client instance owns a hard concurrency ceiling.
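A minimal sketch of that pattern, assuming httpx as the HTTP layer; the `BoundedClient` name, endpoint, and concurrency ceiling are illustrative placeholders, not NoHang’s actual API:

```python
import asyncio

import httpx

MAX_CONCURRENCY = 8  # assumed ceiling; tune to your provider's limits


class BoundedClient:
    """One client instance owns the semaphore, so every call shares the cap."""

    def __init__(self, base_url: str, api_key: str) -> None:
        self._semaphore = asyncio.Semaphore(MAX_CONCURRENCY)
        self._http = httpx.AsyncClient(
            base_url=base_url,
            headers={"Authorization": f"Bearer {api_key}"},
        )

    async def chat(self, payload: dict) -> dict:
        # gather() can schedule hundreds of these coroutines; the semaphore
        # ensures only MAX_CONCURRENCY requests are in flight at once.
        async with self._semaphore:
            resp = await self._http.post("/chat/completions", json=payload)
            resp.raise_for_status()
            return resp.json()

    async def aclose(self) -> None:
        await self._http.aclose()
```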
2) No hard timeouts
Without timeouts, a single stuck call can stall a whole run.
Fix: set connect/read/total timeouts and handle failure deterministically.
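One way to express this, assuming httpx per-phase timeouts plus an outer `asyncio.timeout` deadline (Python 3.11+); the numbers are placeholders:

```python
import asyncio

import httpx

# Per-phase limits: connect, read, write, and connection-pool acquisition.
TIMEOUT = httpx.Timeout(connect=5.0, read=30.0, write=10.0, pool=5.0)
TOTAL_DEADLINE = 60.0  # hard ceiling for the whole call (assumed value)


async def chat_once(http: httpx.AsyncClient, payload: dict) -> dict:
    try:
        # The outer deadline sits on top of the per-phase timeouts, so a
        # slow-but-not-stuck response still cannot stall the whole run.
        async with asyncio.timeout(TOTAL_DEADLINE):
            resp = await http.post("/chat/completions", json=payload, timeout=TIMEOUT)
            resp.raise_for_status()
            return resp.json()
    except (httpx.TimeoutException, TimeoutError) as exc:
        # Deterministic failure: surface a typed error instead of hanging.
        raise RuntimeError(f"chat call timed out: {exc}") from exc
```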
3) Retry storms
Retries are necessary, but naive retries multiply traffic at the worst moment.
Fix: retry only transient failures, use exponential backoff + jitter, respect Retry-After.
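A hedged sketch of that retry policy, again assuming httpx; the status set, attempt count, and base delay are illustrative, and `Retry-After` is assumed to be given in seconds:

```python
import asyncio
import random

import httpx

RETRYABLE_STATUS = {429, 500, 502, 503, 504}
MAX_ATTEMPTS = 5   # assumed; tune per workload
BASE_DELAY = 0.5   # seconds


async def chat_with_retry(http: httpx.AsyncClient, payload: dict) -> dict:
    for attempt in range(1, MAX_ATTEMPTS + 1):
        delay = None
        try:
            resp = await http.post("/chat/completions", json=payload)
            if resp.status_code not in RETRYABLE_STATUS:
                resp.raise_for_status()  # non-transient 4xx errors fail fast
                return resp.json()
            # Respect Retry-After when the provider sends one (assumed seconds).
            retry_after = resp.headers.get("Retry-After")
            delay = float(retry_after) if retry_after else None
        except (httpx.ConnectError, httpx.ReadTimeout):
            pass  # transient network failure: fall through to backoff
        if attempt == MAX_ATTEMPTS:
            raise RuntimeError("chat call failed after retries")
        if delay is None:
            # Exponential backoff with full jitter spreads retries out
            # instead of synchronizing a retry storm.
            delay = random.uniform(0, BASE_DELAY * (2 ** (attempt - 1)))
        await asyncio.sleep(delay)
```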
4) Streaming isn’t JSON
Providers often stream via SSE:
data: {...}
data: [DONE]
Fix: parse SSE and aggregate deltas into a normal response shape.
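A sketch of that path, assuming httpx streaming and OpenAI-style chunks where incremental text lives under `choices[0].delta.content` (the endpoint and field names follow the OpenAI convention and are assumptions here):

```python
import json

import httpx


async def stream_chat(http: httpx.AsyncClient, payload: dict) -> str:
    """Parse `data: {...}` SSE events and aggregate content deltas into one string."""
    text_parts: list[str] = []
    async with http.stream(
        "POST", "/chat/completions", json={**payload, "stream": True}
    ) as resp:
        resp.raise_for_status()
        async for line in resp.aiter_lines():
            if not line.startswith("data:"):
                continue  # skip comments and blank keep-alive lines
            data = line[len("data:"):].strip()
            if data == "[DONE]":
                break
            chunk = json.loads(data)
            choices = chunk.get("choices") or []
            delta = choices[0].get("delta", {}).get("content") if choices else None
            if delta:
                text_parts.append(delta)
    return "".join(text_parts)
```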
Why this is a wedge (not a framework)
NoHang doesn’t replace orchestration. It makes orchestration safe:
- you can run many stages concurrently without melting down
- you can close sessions cleanly and avoid leaked sockets
- you can stream reliably and still return consistent outputs
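As a rough usage sketch of those three points, reusing the hypothetical `BoundedClient` from above (model name, URL, and prompts are placeholders): gather many stages behind one shared ceiling, then close the session in a `finally` block so nothing leaks even when a stage fails.

```python
import asyncio


async def run_stages(prompts: list[str]) -> list[dict]:
    # One shared client = one shared semaphore and one connection pool.
    client = BoundedClient("https://api.example.com/v1", api_key="sk-placeholder")
    try:
        return await asyncio.gather(
            *(client.chat({
                "model": "placeholder-model",
                "messages": [{"role": "user", "content": p}],
            }) for p in prompts)
        )
    finally:
        # Clean shutdown: no leaked sockets even if a stage raises.
        await client.aclose()


asyncio.run(run_stages(["summarize A", "summarize B"]))
```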
Where to go next
- Reliability pillar: /llm-workflow-reliability/
- Audit log schema tool: /tools/llm-audit-log-schema/
- If you want this productionized: /request-blueprint/