# Streaming delivery

RustyGPT streams conversation activity to authenticated clients over Server-Sent Events (SSE). The implementation lives in
`rustygpt-server/src/handlers/streaming.rs` and is gated by `features.sse_v1`.
## Flow

```mermaid
sequenceDiagram
    participant Client
    participant API as Axum /api
    participant Hub as StreamHub
    participant DB as rustygpt.sse_event_log
    Client->>API: POST /api/threads/{conversation}/root
    API->>Hub: publish ConversationStreamEvent
    Hub->>Client: SSE event (thread.new)
    Note over Hub,DB: if persistence enabled
    Hub->>DB: sp_record_sse_event
    Client->>API: reconnect with Last-Event-ID
    API->>Hub: subscribe(after)
    Hub->>DB: sp_sse_replay
    DB-->>Hub: persisted events
    Hub-->>Client: replay then live stream
```
Clients subscribe to `/api/stream/conversations/:conversation_id`. The route is protected by the auth middleware when
`features.auth_v1` is enabled, so callers must present a valid session cookie (the CLI handles this automatically).
## Event payloads

Events are instances of `shared::models::ConversationStreamEvent` and are encoded as JSON envelopes with `type` and `payload`
fields. See Threaded conversations for the full list of variants.
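As a rough illustration of the wire format, the sketch below builds one SSE frame carrying such an envelope. The exact envelope shape is defined by `shared::models::ConversationStreamEvent`; the field contents and the use of the `event:` field here are assumptions, not the real schema.

```rust
/// Sketch: framing a JSON envelope with `type` and `payload` fields as a
/// single SSE event. An SSE frame is a series of `id:`/`event:`/`data:`
/// lines terminated by a blank line. Payload contents are illustrative.
fn sse_frame(id: u64, event_type: &str, payload_json: &str) -> String {
    format!(
        "id: {id}\nevent: {event_type}\ndata: {{\"type\":\"{event_type}\",\"payload\":{payload_json}}}\n\n"
    )
}

fn main() {
    // The `id:` line is what the browser echoes back as Last-Event-ID.
    let frame = sse_frame(42, "thread.new", r#"{"thread_id":"abc"}"#);
    print!("{frame}");
}
```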
The SSE handler assigns monotonically increasing sequence numbers per conversation. When persistence is enabled the sequence is
also stored in `rustygpt.sse_event_log`, allowing reconnecting clients to pass `Last-Event-ID` and receive any missed events
before resuming the live stream.
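The sequencing and replay behaviour can be sketched with an in-memory log keyed by conversation id. The real server routes events through `StreamHub` and `rustygpt.sse_event_log`; the type and method names below are illustrative only.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for the per-conversation event log.
#[derive(Default)]
struct EventLog {
    // conversation id -> ordered (sequence, payload) pairs
    events: HashMap<String, Vec<(u64, String)>>,
}

impl EventLog {
    /// Append an event, assigning the next monotonically increasing
    /// sequence number for that conversation.
    fn publish(&mut self, conversation: &str, payload: String) -> u64 {
        let log = self.events.entry(conversation.to_string()).or_default();
        let seq = log.last().map(|(s, _)| s + 1).unwrap_or(1);
        log.push((seq, payload));
        seq
    }

    /// Everything with a sequence strictly greater than `last_event_id`,
    /// i.e. what a reconnecting client missed while disconnected.
    fn replay_after(&self, conversation: &str, last_event_id: u64) -> Vec<&(u64, String)> {
        self.events
            .get(conversation)
            .map(|log| log.iter().filter(|(s, _)| *s > last_event_id).collect())
            .unwrap_or_default()
    }
}

fn main() {
    let mut log = EventLog::default();
    log.publish("conv-1", "thread.new".into());
    log.publish("conv-1", "message.created".into());
    log.publish("conv-1", "typing.update".into());
    // A client that last saw sequence 1 gets events 2 and 3 on reconnect,
    // then resumes the live stream.
    println!("{}", log.replay_after("conv-1", 1).len()); // prints "2"
}
```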
## Persistence and retention

Configure persistence via `[sse.persistence]` in `config.toml`:

```toml
[sse.persistence]
enabled = true
max_events_per_user = 500
prune_batch_size = 100
retention_hours = 48
```
`services::sse_persistence` stores events using the stored procedures in `scripts/pg/schema/050_sse_persistence.sql`. The pruning
logic runs after each insert to keep the table bounded.
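The bounding behaviour amounts to the following: after each insert, delete up to `prune_batch_size` of the oldest rows while the per-user count exceeds `max_events_per_user`. The real logic lives in the SQL stored procedures; this in-memory sketch only mirrors the idea.

```rust
/// Sketch of per-user event-log pruning. `events` stands in for one
/// user's rows, oldest first; the parameters mirror the config keys
/// `max_events_per_user` and `prune_batch_size`.
fn prune(events: &mut Vec<u64>, max_events_per_user: usize, prune_batch_size: usize) {
    let excess = events.len().saturating_sub(max_events_per_user);
    // Remove the oldest entries first, but never more than one batch
    // per call, so pruning stays cheap on the insert path.
    let to_remove = excess.min(prune_batch_size);
    events.drain(..to_remove);
}

fn main() {
    // 510 event ids for one user, cap of 500, batch of 100.
    let mut events: Vec<u64> = (1..=510).collect();
    prune(&mut events, 500, 100);
    assert_eq!(events.len(), 500); // only the 10 excess rows were removed
    assert_eq!(events[0], 11);     // and they were the oldest ones
}
```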
## Backpressure handling

The in-memory queue for each conversation defaults to `channel_capacity = 128`. Configure behaviour under `[sse.backpressure]`:

- `drop_strategy = "drop_tokens"` drops assistant token events first
- `drop_strategy = "drop_tokens_and_system"` also discards system events once the queue fills
- `warn_queue_ratio` controls when a warning is logged about queue pressure
These settings keep hot conversations from exhausting memory while still delivering key state changes (presence, membership, unread counters).
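The `"drop_tokens"` strategy can be sketched as a bounded queue that, when full, evicts the oldest assistant-token event before enqueueing, so state changes survive pressure. The names and exact eviction order below are illustrative, not the server's implementation.

```rust
use std::collections::VecDeque;

#[derive(Debug, PartialEq, Clone, Copy)]
enum Kind {
    Token,  // assistant token chunk: cheap to lose
    System, // system notice
    State,  // presence, membership, unread counters: keep if possible
}

/// Sketch of a "drop_tokens"-style bounded enqueue.
fn enqueue(queue: &mut VecDeque<Kind>, capacity: usize, event: Kind) {
    if queue.len() == capacity {
        if let Some(pos) = queue.iter().position(|k| *k == Kind::Token) {
            queue.remove(pos); // sacrifice a token event first
        } else {
            queue.pop_front(); // last resort: drop the oldest event
        }
    }
    queue.push_back(event);
}

fn main() {
    let mut q = VecDeque::new();
    for _ in 0..3 {
        enqueue(&mut q, 3, Kind::Token);
    }
    // The queue is full; a state event displaces a token event rather
    // than being lost itself.
    enqueue(&mut q, 3, Kind::State);
    assert_eq!(q.len(), 3);
    assert!(q.contains(&Kind::State));
}
```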
## Client responsibilities

- Reconnect with `Last-Event-ID` so the server can replay persisted events when available
- Handle `401` responses by re-running the session refresh flow (`/api/auth/refresh`); the CLI and web client do this automatically
- Clear typing state on `typing.update` and update unread counters when `unread.update` arrives
Use REST API endpoints to backfill state when the requested `Last-Event-ID` falls outside the retention window.
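Putting the client responsibilities together, a reconnecting client tracks the last event id it saw and decides between replay and REST backfill. The struct, method names, and the `oldest_retained` cursor below are hypothetical; the CLI and web client implement their own versions of this logic.

```rust
/// Sketch of client-side reconnect state.
struct StreamClient {
    last_event_id: Option<u64>,
}

enum Recovery {
    /// Resume the stream, sending this id as Last-Event-ID so the
    /// server replays what was missed.
    Resume { last_event_id: u64 },
    /// First connect, or the cursor fell outside the retention window:
    /// refetch state over the REST API, then stream live.
    Backfill,
}

impl StreamClient {
    /// Record the id of each received event as the replay cursor.
    fn on_event(&mut self, id: u64) {
        self.last_event_id = Some(id);
    }

    /// Decide how to recover, given the oldest sequence the server
    /// still retains for this conversation.
    fn reconnect_plan(&self, oldest_retained: u64) -> Recovery {
        match self.last_event_id {
            // Cursor is still inside the retention window: replay works.
            Some(id) if id + 1 >= oldest_retained => Recovery::Resume { last_event_id: id },
            // Cursor too old or absent: backfill via REST endpoints.
            _ => Recovery::Backfill,
        }
    }
}

fn main() {
    let mut client = StreamClient { last_event_id: None };
    client.on_event(120);
    // Server still retains sequence 100 onward: replay is possible.
    assert!(matches!(client.reconnect_plan(100), Recovery::Resume { last_event_id: 120 }));
    // Retention has advanced past the cursor: fall back to REST backfill.
    assert!(matches!(client.reconnect_plan(500), Recovery::Backfill));
}
```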