RustyGPT Documentation
Welcome! This mdBook describes the RustyGPT workspace in depth: how the server and clients are structured, how the PostgreSQL-backed features behave, and how to operate the platform locally or in shared environments.
Start with the guides to get a local environment running, then dive into the reference and architecture chapters for precise APIs and design notes.
Quick navigation
- Quickstart – configure and launch the server, web client, and CLI.
- Local development – watcher workflows, debugging tools, and environment variables.
- REST API – endpoint catalogue for conversations, streaming, authentication, and admin features.
- Service topology – how the Axum server, Yew SPA, PostgreSQL, and SSE stream hub fit together.
What RustyGPT ships today
RustyGPT focuses on a cohesive Rust stack:
- rustygpt-server exposes REST + SSE endpoints with cookie-based auth (handlers/auth.rs), rate limiting (middleware/rate_limit.rs), and OpenAPI documentation (openapi.rs).
- rustygpt-web is a Yew SPA that consumes the server API via src/api.rs and renders threaded conversations, presence, and typing indicators.
- rustygpt-cli shares the same models as the server, providing commands for login, conversation inspection, SSE following, and OpenAPI generation (src/commands).
- rustygpt-shared houses configuration loading, llama.cpp bindings, and the data transfer objects used across all crates.
Each documentation section links back to the relevant modules so you can cross-reference behaviour with the implementation.
Guide Overview
These walkthroughs assume you are starting from a fresh checkout. They show how to configure config.toml, run the
Axum server, keep the Yew frontend hot-reloading, and validate that authentication + streaming work end to end.
- Quickstart bootstraps the database, enables feature flags, and walks through the setup flow.
- Local Development documents the just recipes, watcher processes, and debugging tips.
For conceptual background jump to Concepts; for task-specific runbooks see How-to.
Quickstart
TL;DR – copy config.example.toml, enable the feature flags you need, start PostgreSQL, run the Axum server, complete the /api/setup flow, then bring up the Yew frontend and CLI.
1. Prerequisites
- Rust toolchain (rustup default stable), cargo, just, and trunk for the web client (cargo install trunk)
- PostgreSQL 15+ running locally (the provided docker-compose.yaml exposes one at postgres://tinroof:rusty@localhost:5432/rusty_gpt)
- Optional: llama.cpp-compatible model files if you plan to exercise assistant streaming
Fetch dependencies once:
cargo fetch --workspace
2. Configure the server
Create config.toml and adjust it for your environment:
cp config.example.toml config.toml
At minimum set:
[db]
url = "postgres://tinroof:rusty@localhost/rustygpt_dev"
[features]
auth_v1 = true
sse_v1 = true
well_known = true
rustygpt-shared::config::server::Config supports TOML/YAML/JSON files and environment overrides (e.g.
RUSTYGPT__SERVER__PORT=8080). See Configuration for the complete matrix of keys.
3. Start PostgreSQL
If you are using Docker Compose:
docker compose up postgres -d
On server startup the bootstrap runner executes every SQL script under scripts/pg/{schema,procedures,indexes,seed} in order.
The seed stage enables feature flags and inserts the default rate-limit profile (conversation.post).
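The bootstrap ordering described above can be sketched as a small function. This is illustrative only, not the actual bootstrap runner, and the per-stage lexical sort is an assumption on top of "in order":

```rust
// Sketch: stages run in a fixed sequence; scripts inside each stage run in
// lexical filename order (an assumption; the doc only says "in order").
fn bootstrap_order(stages: Vec<(&str, Vec<&str>)>) -> Vec<String> {
    let mut out = Vec::new();
    for (stage, mut files) in stages {
        files.sort(); // deterministic per-stage ordering
        for f in files {
            out.push(format!("{stage}/{f}"));
        }
    }
    out
}

fn main() {
    let order = bootstrap_order(vec![
        ("schema", vec!["010_auth.sql", "001_core.sql"]),
        ("procedures", vec!["010_auth.sql"]),
        ("seed", vec!["900_flags.sql"]),
    ]);
    assert_eq!(order[0], "schema/001_core.sql");
    assert_eq!(order.last().unwrap().as_str(), "seed/900_flags.sql");
}
```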
4. Run the backend
Either launch directly:
cargo run -p rustygpt-server -- serve --port 8080
or use the helper recipe that also builds configuration if needed:
just run-server
The process listens on http://127.0.0.1:8080. Health probes are available at /api/healthz and /api/readyz.
5. Complete initial setup
The first authenticated user is created by POSTing to /api/setup:
curl -X POST http://127.0.0.1:8080/api/setup \
-H 'Content-Type: application/json' \
-d '{"username":"admin","email":"admin@example.com","password":"change-me"}'
Subsequent calls return 400 once a user already exists.
6. Start the web client
In a new terminal:
cd rustygpt-web
trunk serve
The SPA proxies /api/* requests to the backend. After logging in you should see:
- Conversation list populated by GET /api/conversations/{conversation_id}/threads
- Thread view that streams updates from /api/stream/conversations/{conversation_id}
- Presence and typing indicators driven by ConversationStreamEvent payloads
7. Exercise the CLI
The rustygpt binary shares configuration and cookie handling with the server:
cargo run -p rustygpt-cli -- login
cargo run -p rustygpt-cli -- chat --conversation <conversation-uuid>
cargo run -p rustygpt-cli -- follow --root <thread-uuid>
follow connects to the SSE endpoint, reconstructs events, and prints deltas as they arrive. If you see authentication required
errors, confirm you completed the setup step and that [features].auth_v1 is true.
8. Next steps
- Review Local Development for watcher workflows, linting, and debugging tips.
- Explore REST API for a full endpoint catalogue and payload shapes.
- Consult Service Topology to understand how the components interact at runtime.
Local Development
TL;DR – keep config.toml in sync with your environment, use just dev for paired watchers, and rely on the CLI for quick smoke tests of authentication and streaming.
Environment configuration
All binaries load configuration through rustygpt-shared::config::server::Config. The loader merges:
- Built-in defaults selected by the active profile (Dev/Test/Prod)
- Optional config.toml / config.yaml / config.json
- Environment variables such as RUSTYGPT__SERVER__PORT=9000
- CLI overrides (e.g. cargo run -p rustygpt-server -- serve --port 9000)
Keep secrets out of the repo—override them with environment variables or a private config.toml. See
Configuration for the full key list.
Watcher workflows
The Justfile orchestrates the common flows:
# Run server + web watchers together (uses rustygpt-tools/confuse)
just dev
# Backend only hot-reload
just watch-server
# Run fmt, check, and clippy
just check
just dev spawns two subprocesses:
- rustygpt-server via cargo watch -x 'run -- serve --port 8080'
- rustygpt-web via trunk watch
Logs stream to stdout so you can confirm when migrations finish (db_bootstrap_* metrics) and when the SSE hub accepts
connections.
CLI smoke tests
The CLI binary lives at rustygpt-cli. Useful commands while iterating:
# Launch the server directly from the CLI crate
cargo run -p rustygpt-cli -- serve --port 8080
# Generate the OpenAPI spec
cargo run -p rustygpt-cli -- spec openapi.yaml
# Generate config skeletons
cargo run -p rustygpt-cli -- config --format toml
# Manage sessions
cargo run -p rustygpt-cli -- login
cargo run -p rustygpt-cli -- me
cargo run -p rustygpt-cli -- logout
CLI commands reuse the same cookie jar as the web client. Cookies are stored under ~/.config/rustygpt/session.cookies by
default (see [cli] in the configuration schema).
Debugging tips
- Enable verbose tracing: RUST_LOG=rustygpt_server=debug,tower_http=info just run-server
- Inspect SSE payloads: curl -N http://127.0.0.1:8080/api/stream/conversations/<conversation-id> (requires an authenticated session and features.sse_v1 = true)
- Verify configuration resolution: cargo run -p rustygpt-cli -- config --format json and inspect the generated file
- Regenerate database bindings or seed data by restarting the server; bootstrap scripts rerun automatically when the process starts
- Use docker compose logs postgres if migrations fail during bootstrap
For operational playbooks (e.g. Docker deployment or rotating secrets) see the How-to section.
Concepts Overview
This section explains the core ideas that appear across the server, web client, and CLI. Use it to understand the vocabulary used in API responses and stream payloads before diving into the reference material.
- Threaded conversations describes how messages, threads, and ConversationStreamEvent values relate to each other.
- Shared models covers the rustygpt-shared crate, focusing on how typed DTOs and enums keep clients in sync with the server.
Pair these concepts with the Architecture diagrams to see where each part lives at runtime.
Threaded conversations
RustyGPT models chat history as threaded conversations stored in PostgreSQL. Each conversation has participants, invites,
thread roots, and replies. The server exposes this structure through the DTOs in rustygpt-shared/src/models/chat.rs.
Core data types
| Type | Purpose | Defined in |
|---|---|---|
| ConversationCreateRequest | Payload for creating a new conversation. | shared::models::chat |
| ThreadTreeResponse | Depth-first snapshot of a thread including metadata for each node. | shared::models::chat |
| MessageChunk | Persisted assistant output chunk (used when streaming replies). | shared::models::chat |
| ConversationStreamEvent | Enum describing SSE events (thread.new, message.delta, etc.). | shared::models::chat |
Each thread is anchored by a root message (POST /api/threads/{conversation_id}/root). Replies hang off the tree using parent
IDs (POST /api/messages/{message_id}/reply). The ThreadTreeResponse payload includes ancestry hints so clients can render the
structure without additional queries.
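The parent-ID structure above can be sketched with a minimal depth-first walk. Node here is a hypothetical, stripped-down stand-in; the real ThreadTreeResponse in shared::models::chat uses UUIDs and carries richer metadata:

```rust
use std::collections::HashMap;

// Hypothetical, simplified message node (the real DTOs use UUIDs and
// per-node metadata).
struct Node {
    id: u32,
    parent: Option<u32>, // None marks the thread root
    body: &'static str,
}

// Depth-first render of a reply tree keyed by parent id.
fn render(nodes: &[Node]) -> Vec<String> {
    let mut children: HashMap<Option<u32>, Vec<&Node>> = HashMap::new();
    for n in nodes {
        children.entry(n.parent).or_default().push(n);
    }
    fn walk(
        children: &HashMap<Option<u32>, Vec<&Node>>,
        parent: Option<u32>,
        depth: usize,
        out: &mut Vec<String>,
    ) {
        if let Some(kids) = children.get(&parent) {
            for k in kids {
                out.push(format!("{}{}", "  ".repeat(depth), k.body));
                walk(children, Some(k.id), depth + 1, out);
            }
        }
    }
    let mut out = Vec::new();
    walk(&children, None, 0, &mut out);
    out
}

fn main() {
    let nodes = [
        Node { id: 1, parent: None, body: "root" },
        Node { id: 2, parent: Some(1), body: "reply" },
        Node { id: 3, parent: Some(2), body: "nested reply" },
    ];
    assert_eq!(render(&nodes), ["root", "  reply", "    nested reply"]);
}
```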
Streaming lifecycle
When features.sse_v1 = true, the server emits ConversationStreamEvent variants via StreamHub (handlers/streaming.rs). The
naming mirrors the enum variants:
- thread.new – new thread summary created
- thread.activity – updated last_activity_at
- message.delta – incremental assistant tokens (ChatDeltaChunk)
- message.done – completion marker with usage stats
- presence.update – user presence heartbeat
- typing.update – typing indicator state
- unread.update – unread count per thread root
- membership.changed – conversation membership change
- error – terminal failure while streaming
Events carry both the conversation_id and (when applicable) root_id so clients can scope updates precisely. SSE persistence is
optional: enable [sse.persistence] in configuration to record events in rustygpt.sse_event_log via
services::sse_persistence and replay them on reconnect.
Access control
The chat service (services::chat_service.rs) enforces membership checks and rate limits before mutating data. Rate limit
profiles are backed by tables in scripts/pg/schema/040_rate_limits.sql and can be tuned via the admin API. Presence updates
mark the acting user online and emit events so all subscribers stay consistent.
For endpoint details see REST API; for the transport-level diagram visit Streaming Delivery.
Shared models
rustygpt-shared centralises the data structures consumed by the server, web client, and CLI. Keeping these DTOs in one crate
prevents drift between components and allows serde + utoipa derives to stay consistent.
Configuration loader
src/config/server.rs defines the Config struct and associated sub-structures (ServerConfig, RateLimitConfig,
SseConfig, etc.). Every binary loads configuration through Config::load_config, which merges defaults, optional files, and
environment overrides. Feature flags such as features.auth_v1 gate server subsystems without requiring code changes.
API payloads
The src/models directory contains strongly typed request/response structs:
- models/chat.rs – conversations, threads, message payloads, and streaming events
- models/oauth.rs – GitHub/Apple OAuth exchanges
- models/setup.rs – first-time setup contract (SetupRequest, SetupResponse)
- models/limits.rs – rate limit admin DTOs (CreateRateLimitProfileRequest, RateLimitAssignment, ...)
- models/session.rs – session summaries returned by /api/auth/*
All types derive Serialize, Deserialize, and when relevant utoipa::ToSchema so the OpenAPI generator stays in sync.
LLM abstractions
src/llms exposes traits (LLMProvider, LLMModel) and helpers for llama.cpp integration. The server’s
AssistantService uses these traits to stream responses and emit metrics (llm_model_cache_hits_total, llm_model_load_seconds).
When you add a new provider, implement the traits here and update the configuration schema.
Why it matters
- Type safety – clients compile against the same structs the server uses, catching breaking changes early.
- Single-source documentation – OpenAPI docs and mdBook pages pull names directly from these types.
- Testing – shared fixtures in shared::models make it easier to write integration tests that cover both server handlers and CLI commands.
Whenever you extend the API, add or update the relevant struct in rustygpt-shared first, then regenerate the OpenAPI spec with
cargo run -p rustygpt-cli -- spec.
Architecture Overview
These chapters document the runtime architecture of RustyGPT: how the Axum server is composed, how the SSE stream hub works, and how rate limiting integrates with PostgreSQL. Use them alongside the concepts and reference sections when exploring the code.
Streaming delivery
RustyGPT streams conversation activity to authenticated clients over Server-Sent Events (SSE). The implementation lives in
rustygpt-server/src/handlers/streaming.rs and is gated by features.sse_v1.
Flow
sequenceDiagram
participant Client
participant API as Axum /api
participant Hub as StreamHub
participant DB as rustygpt.sse_event_log
Client->>API: POST /api/threads/{conversation}/root
API->>Hub: publish ConversationStreamEvent
Hub->>Client: SSE event (thread.new)
Note over Hub,DB: if persistence enabled
Hub->>DB: sp_record_sse_event
Client->>API: reconnect with Last-Event-ID
API->>Hub: subscribe(after)
Hub->>DB: sp_sse_replay
DB-->>Hub: persisted events
Hub-->>Client: replay then live stream
Clients subscribe to /api/stream/conversations/:conversation_id. The route is protected by the auth middleware when
features.auth_v1 is enabled, so callers must present a valid session cookie (the CLI handles this automatically).
Event payloads
Events are instances of shared::models::ConversationStreamEvent and are encoded as JSON envelopes with type and payload
fields. See Threaded conversations for the full list of variants.
The SSE handler assigns monotonically increasing sequence numbers per conversation. When persistence is enabled the sequence is
also stored in rustygpt.sse_event_log, allowing reconnecting clients to pass Last-Event-ID and receive any missed events
before resuming the live stream.
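As a sketch of the replay contract described above (monotone sequence numbers, Last-Event-ID cursor), consider a toy in-memory log. The names here are hypothetical; this is not the actual StreamHub or persistence code:

```rust
// Toy event log keyed by per-conversation sequence number.
struct EventLog {
    events: Vec<(u64, String)>, // (sequence number, encoded event)
}

impl EventLog {
    // Everything published after `last_event_id`, oldest first.
    fn replay_after(&self, last_event_id: u64) -> Vec<&str> {
        self.events
            .iter()
            .filter(|(seq, _)| *seq > last_event_id)
            .map(|(_, payload)| payload.as_str())
            .collect()
    }
}

fn main() {
    let log = EventLog {
        events: vec![
            (1, "thread.new".to_string()),
            (2, "message.delta".to_string()),
            (3, "message.done".to_string()),
        ],
    };
    // A client reconnecting with Last-Event-ID: 1 receives events 2 and 3
    // before the live stream resumes.
    assert_eq!(log.replay_after(1), ["message.delta", "message.done"]);
    assert!(log.replay_after(3).is_empty());
}
```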
Persistence and retention
Configure persistence via [sse.persistence] in config.toml:
[sse.persistence]
enabled = true
max_events_per_user = 500
prune_batch_size = 100
retention_hours = 48
services::sse_persistence stores events using the stored procedures in scripts/pg/schema/050_sse_persistence.sql. The pruning
logic runs after each insert to keep the table bounded.
Backpressure handling
The in-memory queue for each conversation defaults to channel_capacity = 128. Configure behaviour under [sse.backpressure]:
- drop_strategy = "drop_tokens" drops assistant token events first
- drop_strategy = "drop_tokens_and_system" also discards system events once the queue fills
- warn_queue_ratio controls when a warning is logged about queue pressure
These settings keep hot conversations from exhausting memory while still delivering key state changes (presence, membership, unread counters).
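A minimal sketch of the "drop_tokens" strategy, assuming a bounded in-memory queue; the capacity and eviction details are illustrative, not the real StreamHub behaviour:

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
enum Event {
    Token,    // assistant token delta: droppable under pressure
    Presence, // state change: should survive
}

// When the queue is full, incoming token events are discarded first so
// state changes (presence, membership, unread) still get through.
fn enqueue(queue: &mut Vec<Event>, capacity: usize, ev: Event) -> bool {
    if queue.len() < capacity {
        queue.push(ev);
        return true;
    }
    match ev {
        Event::Token => false, // queue full: drop the token delta
        other => {
            // Make room by evicting the oldest token event, if any.
            if let Some(pos) = queue.iter().position(|e| *e == Event::Token) {
                queue.remove(pos);
                queue.push(other);
                true
            } else {
                false
            }
        }
    }
}

fn main() {
    let mut q = vec![Event::Token, Event::Token];
    assert!(!enqueue(&mut q, 2, Event::Token)); // token dropped
    assert!(enqueue(&mut q, 2, Event::Presence)); // presence evicts a token
    assert_eq!(q, [Event::Token, Event::Presence]);
}
```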
Client responsibilities
- Reconnect with Last-Event-ID so the server can replay persisted events when available
- Handle 401 responses by re-running the session refresh flow (/api/auth/refresh); the CLI and web client do this automatically
- Clear typing state on typing.update and update unread counters when unread.update arrives
Use REST API endpoints to backfill state when the requested Last-Event-ID falls outside the retention
window.
Service topology
RustyGPT is composed of a single Axum process backed by PostgreSQL. The web client and CLI talk to the same REST + SSE surface.
Components
- Axum API (rustygpt-server) – exposes /api/* endpoints, authentication middleware, rate limiting, SSE, OpenAPI docs, health probes, metrics, and static file hosting.
- PostgreSQL – stores users, sessions, conversations, threads, SSE history, and rate-limit configuration. Schema and stored procedures live in scripts/pg and are applied automatically during bootstrap.
- SSE StreamHub – in-memory fan-out implemented in handlers/streaming.rs, optionally persisting events through services::sse_persistence.
- Yew SPA (rustygpt-web) – compiled to WebAssembly with Trunk. src/api.rs handles authentication, CSRF, SSE reconnection, and REST calls for conversations/threads.
- CLI (rustygpt-cli) – shares configuration and models with the server. Provides helper commands for session management, SSE following, and OpenAPI/config generation.
Data flow
flowchart LR
subgraph Clients
web[Yew web app]
cli[CLI]
end
api[Axum /api]
stream[StreamHub]
db[(PostgreSQL)]
web --> api
cli --> api
api --> db
api --> stream
stream --> web
stream --> cli
Requests hit the Axum router, which talks to PostgreSQL via SQLx and fans out live events via StreamHub. Clients subscribe to
/api/stream/conversations/:conversation_id to receive updates.
Feature flags
Many subsystems are gated by [features] in configuration:
- auth_v1 – enables session middleware, /api/auth/*, protected routes, and the rate-limit admin API
- sse_v1 – enables the SSE route and persistence options
- well_known – serves .well-known/* entries from the config
Toggle these flags without recompiling the binaries.
Scaling notes
The server is stateless apart from in-memory SSE buffers. For horizontal scaling you must either:
- Disable persistence and tolerate best-effort delivery, or
- Configure [sse.persistence] so each instance replays from PostgreSQL on reconnect
Rate limiting already supports multi-instance deployments because configuration is stored in the database and periodically
reloaded (RateLimitState::reload_from_db).
Rate-limit architecture
RustyGPT enforces per-route throttling using a leaky-bucket strategy implemented in middleware::rate_limit. Configuration
comes from two tables managed by stored procedures in scripts/pg/procs/034_limits.sql.
Data model
- rustygpt.rate_limit_profiles – named profiles containing algorithm + JSON parameters (currently gcra style with requests_per_second / burst options).
- rustygpt.rate_limit_assignments – maps HTTP method + path pattern to a profile.
- rustygpt.message_rate_limits – per-user, per-conversation state used by sp_user_can_post to throttle message posting.
RateLimitState::reload_from_db loads profiles and assignments into memory. The admin API under /api/admin/limits/* can
create, update, or delete records at runtime; after each change the state refreshes automatically.
Matching logic
enforce_rate_limits computes a cache key as "{METHOD} {path}" and finds the first matching pattern. Supported patterns:
- Exact path matches (/api/messages/{id}/reply becomes /api/messages/:id/reply in the database)
- * suffix for prefixes (e.g. /api/admin/*)
If no assignment matches, the middleware falls back to the default strategy derived from [rate_limits.default_rps] and
[rate_limits.burst]. Login routes (/api/auth/login) use the dedicated auth_login_per_ip_per_min limiter.
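The matching rule above (exact key, or "*"-suffixed prefix, against a "{METHOD} {path}" cache key) can be sketched as follows; this is illustrative, and the real logic lives in enforce_rate_limits:

```rust
// Exact match, or prefix match when the pattern ends with '*'.
fn pattern_matches(pattern: &str, key: &str) -> bool {
    match pattern.strip_suffix('*') {
        Some(prefix) => key.starts_with(prefix),
        None => pattern == key,
    }
}

fn main() {
    assert!(pattern_matches(
        "POST /api/messages/:id/reply",
        "POST /api/messages/:id/reply"
    ));
    assert!(pattern_matches("GET /api/admin/*", "GET /api/admin/limits/profiles"));
    assert!(!pattern_matches("GET /api/admin/*", "GET /api/conversations"));
}
```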
Metrics and headers
When a request is evaluated the middleware records:
- http_rate_limit_requests_total{profile,result} – allowed vs denied counts
- http_rate_limit_remaining{profile} – remaining tokens after the decision
- http_rate_limit_reset_seconds{profile} – seconds until the bucket refills
- rustygpt_limits_profiles / rustygpt_limits_assignments – gauges updated on reload
Responses include the standard headers RateLimit-Limit, RateLimit-Remaining, RateLimit-Reset, and
X-RateLimit-Profile so clients can react accordingly.
Admin API payloads
All admin payloads live in shared::models::limits:
- CreateRateLimitProfileRequest
- UpdateRateLimitProfileRequest
- AssignRateLimitRequest
- RateLimitProfile / RateLimitAssignment
These endpoints require an authenticated session with the admin role (handlers/admin_limits.rs).
Conversation posting limits
ChatService::post_root_message and ChatService::reply_message call sp_user_can_post, which enforces a GCRA window per
(user_id, conversation_id) using rustygpt.message_rate_limits. Tweak the conversation.post profile via SQL or the admin
API to adjust posting cadence.
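To make the GCRA behaviour concrete, here is a minimal single-bucket sketch. The parameter names mirror the profile options above, but the real enforcement is sp_user_can_post in PostgreSQL, not this code:

```rust
// Minimal GCRA: track a theoretical arrival time (TAT) per bucket.
struct Gcra {
    tat: f64,             // theoretical arrival time, in seconds
    emission: f64,        // seconds per request (1 / requests_per_second)
    burst_tolerance: f64, // how far ahead of schedule a bucket may run
}

impl Gcra {
    // Returns true if a post at time `now` is allowed, advancing the TAT.
    fn allow(&mut self, now: f64) -> bool {
        let tat = self.tat.max(now);
        if tat - now > self.burst_tolerance {
            return false; // too far ahead of schedule: deny
        }
        self.tat = tat + self.emission;
        true
    }
}

fn main() {
    // 1 request/second with room for 2 extra burst requests.
    let mut g = Gcra { tat: 0.0, emission: 1.0, burst_tolerance: 2.0 };
    assert!(g.allow(0.0));
    assert!(g.allow(0.0));
    assert!(g.allow(0.0));
    assert!(!g.allow(0.0)); // fourth instantaneous post is throttled
    assert!(g.allow(1.1)); // allowed again once time has passed
}
```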
Reference Overview
Authoritative details for RustyGPT’s public surfaces:
- Authentication – session cookies, setup flow, and rotation behaviour
- REST API – endpoint catalogue grouped by feature area
- Configuration –
config.tomlstructure and environment overrides
For operational workflows see How-to; for high-level design context visit the Architecture section.
Authentication
RustyGPT uses cookie-based sessions backed by PostgreSQL. Session management lives in rustygpt-server/src/auth/session.rs and
is exposed through /api/auth/* handlers when features.auth_v1 = true.
Session lifecycle
- Setup – POST /api/setup hashes the supplied password and inserts the first user (admin + member roles). Further calls are rejected.
- Login – POST /api/auth/login verifies credentials via sp_auth_login. Successful responses include Set-Cookie: SESSION_ID=...; HttpOnly; Secure?; SameSite=Lax, Set-Cookie: CSRF-TOKEN=...; SameSite=Strict, and X-Session-Rotated: 1.
- Authenticated requests – non-GET operations must include the CSRF header X-CSRF-TOKEN with the cookie value. The web client (rustygpt-web/src/api.rs) and CLI handle this automatically.
- Refresh – POST /api/auth/refresh rotates cookies inside the idle window (default 8 hours). If either idle or absolute expiry is exceeded, the call returns 401 session_expired.
- Logout – POST /api/auth/logout clears the session and CSRF cookies.
Sessions are stored in rustygpt.user_sessions. The idle and absolute windows come from [session] in configuration. When
max_sessions_per_user is set the newest session evicts the oldest via sp_auth_login.
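The eviction rule can be sketched as a bounded session list. This is illustrative only; the real eviction happens inside sp_auth_login:

```rust
// On login, drop the oldest session once the per-user cap is reached.
fn login(sessions: &mut Vec<&'static str>, max_sessions: usize, new_session: &'static str) {
    if sessions.len() >= max_sessions {
        sessions.remove(0); // sessions are ordered oldest-first
    }
    sessions.push(new_session);
}

fn main() {
    let mut s = vec!["s1", "s2", "s3", "s4", "s5"];
    login(&mut s, 5, "s6"); // cap of 5: s1 is evicted
    assert_eq!(s, ["s2", "s3", "s4", "s5", "s6"]);
}
```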
Cookie configuration
config.toml controls cookie behaviour:
[session]
idle_seconds = 28800
absolute_seconds = 604800
session_cookie_name = "SESSION_ID"
csrf_cookie_name = "CSRF-TOKEN"
max_sessions_per_user = 5
[security.cookie]
domain = ""
secure = false
same_site = "lax"
[security.csrf]
cookie_name = "CSRF-TOKEN"
header_name = "X-CSRF-TOKEN"
enabled = true
Adjust security.cookie.secure and security.cookie.domain for production deployments. When security.csrf.enabled = false
the middleware skips header validation (useful for service-to-service calls but not recommended for browsers).
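The double-submit check described above can be sketched as follows. This is a simplification; the actual middleware lives in the server crate and handles more cases:

```rust
// Non-GET requests must echo the CSRF cookie value in the X-CSRF-TOKEN
// header; GET requests and disabled CSRF skip the check.
fn csrf_ok(method: &str, cookie: Option<&str>, header: Option<&str>, enabled: bool) -> bool {
    if !enabled || method == "GET" {
        return true;
    }
    matches!((cookie, header), (Some(c), Some(h)) if c == h)
}

fn main() {
    assert!(csrf_ok("GET", None, None, true));
    assert!(csrf_ok("POST", Some("tok"), Some("tok"), true));
    assert!(!csrf_ok("POST", Some("tok"), None, true)); // header missing
    assert!(csrf_ok("POST", None, None, false)); // security.csrf.enabled = false
}
```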
CLI workflow
The CLI wraps the same endpoints:
cargo run -p rustygpt-cli -- login
cargo run -p rustygpt-cli -- me
cargo run -p rustygpt-cli -- logout
Cookies are stored at ~/.config/rustygpt/session.cookies by default (see [cli.session_store]). The follow and chat
commands automatically attach the CSRF header when present.
Observability
Authentication currently relies on logs for troubleshooting. Set RUST_LOG=rustygpt_server=debug to trace session decisions
(SessionService::authenticate, SessionService::refresh_session). Prometheus metrics for auth flows are not yet implemented.
REST API
All endpoints are served under /api unless noted otherwise. Session cookies are required on protected routes, and CSRF headers on non-GET requests, when features.auth_v1 is enabled. The OpenAPI schema is generated from rustygpt-server/src/openapi.rs and can
be exported with cargo run -p rustygpt-cli -- spec.
Setup
| Method | Path | Description |
|---|---|---|
| GET | /api/setup | Returns { "is_setup": bool } by calling is_setup() in PostgreSQL. |
| POST | /api/setup | Creates the first administrator account (see scripts/pg/procs/010_auth.sql::init_setup). Subsequent calls return 400. |
Authentication
| Method | Path | Description |
|---|---|---|
| POST | /api/auth/login | Email/password login. Returns LoginResponse with session + CSRF cookies. |
| POST | /api/auth/logout | Revokes the current session. Requires CSRF header. |
| POST | /api/auth/refresh | Rotates session cookies inside the idle window. |
| GET | /api/auth/me | Returns MeResponse (requires authenticated session). |
OAuth helpers
Handlers in handlers/github_auth.rs and handlers/apple_auth.rs expose optional OAuth flows when credentials are present:
| Method | Path | Notes |
|---|---|---|
| GET | /api/oauth/github | Returns an authorization URL based on GITHUB_* environment variables. |
| GET | /api/oauth/github/callback | Exchanges the code for a session via ProductionOAuthService. |
| POST | /api/oauth/github/manual | Developer helper that accepts a raw auth code. |
| GET | /api/oauth/apple | Same as GitHub but for Apple. |
| GET | /api/oauth/apple/callback | Callback handler. |
| POST | /api/oauth/apple/manual | Manual exchange helper. |
Conversations & membership
Routes implemented in handlers/conversations.rs:
| Method | Path | Description |
|---|---|---|
| POST | /api/conversations | Create a new conversation. |
| POST | /api/conversations/{conversation_id}/participants | Invite/add a participant. Emits membership + presence SSE events. |
| DELETE | /api/conversations/{conversation_id}/participants/{user_id} | Remove a participant. |
| POST | /api/conversations/{conversation_id}/invites | Create an invite token. |
| POST | /api/invites/accept | Accept an invite token. |
| POST | /api/invites/{token}/revoke | Revoke an invite token. |
| GET | /api/conversations/{conversation_id}/threads | List thread summaries (supports after + limit query params). |
| GET | /api/conversations/{conversation_id}/unread | Return unread counts per thread. |
Threads & messages
Routes from handlers/threads.rs:
| Method | Path | Description |
|---|---|---|
| GET | /api/threads/{root_id}/tree | Depth-first thread slice (cursor_path + limit optional). |
| POST | /api/threads/{conversation_id}/root | Create a new thread root. Triggers assistant streaming when role = assistant. |
| POST | /api/messages/{parent_id}/reply | Reply to an existing message. |
| GET | /api/messages/{message_id}/chunks | Retrieve persisted assistant chunks. |
| POST | /api/threads/{root_id}/read | Mark thread as read (MarkThreadReadRequest). |
| POST | /api/messages/{message_id}/delete | Soft-delete a message. |
| POST | /api/messages/{message_id}/restore | Restore a previously deleted message. |
| POST | /api/messages/{message_id}/edit | Replace message content. |
| POST | /api/typing | Set typing state (TypingRequest). |
| POST | /api/presence/heartbeat | Update presence heartbeat. |
Streaming
| Method | Path | Description |
|---|---|---|
| GET | /api/stream/conversations/{conversation_id} | SSE endpoint producing ConversationStreamEvent values. Requires session cookie and (optionally) Last-Event-ID. |
Copilot-compatible endpoints
These helpers live in handlers/copilot.rs and provide simple echo responses for integration tests:
| Method | Path | Description |
|---|---|---|
| GET | /v1/models | Returns ModelsResponse with two static models (gpt-4, gpt-3.5). |
| POST | /v1/chat/completions | Echoes provided messages as assistant responses (ChatCompletionResponse). |
Admin rate limit API
Available when features.auth_v1 = true and rate_limits.admin_api_enabled = true (handlers/admin_limits.rs):
| Method | Path | Description |
|---|---|---|
| GET | /api/admin/limits/profiles | List profiles. |
| POST | /api/admin/limits/profiles | Create a profile. |
| PUT | /api/admin/limits/profiles/{id} | Update profile parameters/description. |
| DELETE | /api/admin/limits/profiles/{id} | Delete a profile (fails if still assigned). |
| GET | /api/admin/limits/assignments | List route assignments. |
| POST | /api/admin/limits/assignments | Assign a profile to a route. |
| DELETE | /api/admin/limits/assignments/{id} | Remove an assignment. |
Health and observability
Outside of the /api prefix, the server exposes:
| Method | Path | Description |
|---|---|---|
| GET | /healthz | Liveness probe. |
| GET | /readyz | Readiness probe (verifies PostgreSQL bootstrap). |
| GET | /metrics | Prometheus metrics via metrics-exporter-prometheus. |
| GET | /.well-known/{path} | Served when features.well_known = true; entries configured via [well_known.entries]. |
| GET | /openapi.json / /openapi.yaml | Generated OpenAPI spec. |
Refer to Authentication for cookie details and Configuration for the relevant keys.
Configuration
RustyGPT uses a layered configuration loader (rustygpt-shared::config::server::Config). Defaults are determined by the active
profile (Dev/Test/Prod), then merged with an optional file and environment overrides. CLI flags can override specific values
(e.g. --port).
Loading order
- Profile defaults (Config::default_for_profile(Profile::Dev))
- Optional config file (config.toml, config.yaml, or config.json)
- Environment variables using double underscores (e.g. RUSTYGPT__SERVER__PORT=9000)
- CLI overrides (currently the server/CLI --port flag)
Config::load_config(path, override_port) performs this merge and validates required fields.
Key sections
[logging]
[logging]
level = "info" # tracing level passed to tracing-subscriber
format = "text" # "text" or "json"
[server]
[server]
host = "127.0.0.1"
port = 8080
public_base_url = "http://localhost:8080"
request_id_header = "x-request-id"
[server.cors]
allowed_origins = ["http://localhost:3000", "http://127.0.0.1:3000"]
allow_credentials = false
max_age_seconds = 600
public_base_url is derived automatically when not supplied (scheme depends on profile). request_id_header controls which
header the middleware reads when assigning request IDs.
[security]
[security.hsts]
enabled = false
max_age_seconds = 63072000
include_subdomains = true
preload = false
[security.cookie]
domain = ""
secure = false
same_site = "lax"
[security.csrf]
cookie_name = "CSRF-TOKEN"
header_name = "X-CSRF-TOKEN"
enabled = true
[rate_limits]
[rate_limits]
auth_login_per_ip_per_min = 10
default_rps = 50.0
burst = 100
admin_api_enabled = false
When admin_api_enabled = true the /api/admin/limits/* routes become available.
[session]
[session]
idle_seconds = 28800
absolute_seconds = 604800
session_cookie_name = "SESSION_ID"
csrf_cookie_name = "CSRF-TOKEN"
max_sessions_per_user = 5
Set max_sessions_per_user = 0 (or null) to disable automatic eviction.
[oauth]
[oauth]
redirect_base = "http://localhost:8080/api/auth/github/callback"
[oauth.github]
client_id = "..."
client_secret = "..."
If oauth.github is omitted the GitHub endpoints still respond but return empty URLs. Apple support reads APPLE_*
environment variables directly in the handler.
[db]
[db]
url = "postgres://tinroof:rusty@localhost/rustygpt_dev"
statement_timeout_ms = 5000
max_connections = 10
bootstrap_path = "../scripts/pg"
bootstrap_path points to the directory containing schema/, procedures/, indexes/, and seed/ folders.
[sse]
[sse]
heartbeat_seconds = 20
channel_capacity = 128
id_prefix = "evt_"
[sse.persistence]
enabled = false
max_events_per_user = 500
prune_batch_size = 100
retention_hours = 48
[sse.backpressure]
drop_strategy = "drop_tokens"
warn_queue_ratio = 0.75
[features]
[features]
auth_v1 = true
sse_v1 = true
well_known = true
Flags gate optional subsystems without recompiling the binary.
[cli] and [web]
[cli]
session_store = "~/.config/rustygpt/session.cookies"
[web]
static_dir = "../rustygpt-web/dist"
spa_index = "../rustygpt-web/dist/index.html"
[llm]
Config embeds LLMConfiguration from rustygpt-shared::config::llm. Use it to describe llama.cpp models/providers:
[llm.global_settings]
persist_stream_chunks = true
[llm.providers.default]
provider_type = "llama_cpp"
model_path = "./models/your-model.gguf"
See rustygpt-shared/src/config/llm.rs for the full schema.
Environment variable syntax
Nested keys map to uppercase names with double underscores. Examples:
- RUSTYGPT__SERVER__PORT=9001
- RUSTYGPT__SECURITY__COOKIE__SECURE=true
- RUSTYGPT__FEATURES__SSE_V1=true
Booleans and numbers follow standard Rust parsing rules. Paths can be relative or absolute.
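The naming convention above can be captured in a few lines. This helper is hypothetical; the real loader handles the mapping internally:

```rust
// Map a dotted config key path to its RUSTYGPT__ environment variable name.
fn env_var_name(key_path: &str) -> String {
    let tail = key_path
        .split('.')
        .map(|segment| segment.to_uppercase())
        .collect::<Vec<_>>()
        .join("__");
    format!("RUSTYGPT__{tail}")
}

fn main() {
    assert_eq!(env_var_name("server.port"), "RUSTYGPT__SERVER__PORT");
    assert_eq!(
        env_var_name("security.cookie.secure"),
        "RUSTYGPT__SECURITY__COOKIE__SECURE"
    );
    assert_eq!(env_var_name("features.sse_v1"), "RUSTYGPT__FEATURES__SSE_V1");
}
```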
How-to Overview
Task-focused guides for operating RustyGPT. These assume you already understand the system from the Guides and Reference sections.
- Docker Deploy – build/publish images and run Compose.
- Rotate Secrets – refresh credentials and validate sessions.
# Docker deploy

This guide covers building the RustyGPT container image and running it alongside PostgreSQL with Docker Compose.

## Build the image

The repository ships a multi-stage Dockerfile that builds the workspace and bundles the server binary plus static assets:

```shell
docker build -t rustygpt/server:latest -f Dockerfile .
```

Set `BUILD_PROFILE=release` to compile with optimisations. The final image exposes the server on port 8080.
## Compose stack

`docker-compose.yaml` defines two services:

- `backend` – builds from the Dockerfile (target `runtime`). Environment variables include `DATABASE_URL`, OAuth credentials, and feature toggles. Update them to match your deployment.
- `postgres` – `postgres:17-alpine` with credentials matching `config.example.toml`.
Bring the stack up:

```shell
docker compose up --build
```

The compose file mounts `./.data/postgres` for database storage and `./.data/postgres-init` for init scripts. To reuse the
workspace schema, copy the contents of `scripts/pg` into that directory before the first run:

```shell
mkdir -p .data/postgres-init
cp -r scripts/pg/* .data/postgres-init/
```

Alternatively, rely on the server's bootstrap runner by exposing the same directory inside the backend container and pointing
`[db].bootstrap_path` at it.
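That alternative can be sketched as a compose override plus a config tweak. This is illustrative only: the in-container mount point `/app/scripts/pg` is an assumption, not a path shipped by the repository.

```yaml
# docker-compose.override.yaml (sketch)
services:
  backend:
    volumes:
      - ./scripts/pg:/app/scripts/pg:ro
```

```toml
# config.toml: point the bootstrap runner at the mounted directory
[db]
bootstrap_path = "/app/scripts/pg"
```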
## Configuration and secrets

- Copy `config.example.toml` to a volume or bake it into the image and set `RUSTYGPT__CONFIG` variables as needed.
- Provide OAuth credentials (`GITHUB_*`, `APPLE_*`) if you plan to use those flows; otherwise the endpoints return placeholder URLs.
- Set `features.auth_v1`, `features.sse_v1`, and `features.well_known` to `true` via environment variables or the config file.
- If running behind TLS, terminate HTTPS at the reverse proxy and set `server.public_base_url` to the external URL.
## Post-deployment checks

- Hit `http://HOST:8080/healthz` and `http://HOST:8080/readyz` until both return `200`.
- POST to `/api/setup` to create the initial admin account.
- Use the CLI container (or a local build) to log in: `docker compose exec backend rustygpt login`.
- Visit `/metrics` and confirm counters increment when making requests.
- If using the web UI, serve the `rustygpt-web` build either from the same container (set `[web.static_dir]`) or via a separate static host.
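The health-check part of the list above can be sketched as a small polling helper. `wait_for_200` is a hypothetical function, not part of the repository, and `HOST` remains a placeholder for your deployment host:

```shell
# Sketch: poll an endpoint until it answers 200, with a bounded
# number of attempts; returns non-zero on timeout.
wait_for_200() {
  url="$1"; tries="${2:-30}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    if curl -sSf "$url" > /dev/null 2>&1; then
      echo "ok: $url"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "timeout: $url" >&2
  return 1
}

# Usage (HOST is a placeholder):
# wait_for_200 "http://HOST:8080/healthz"
# wait_for_200 "http://HOST:8080/readyz"
```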
## Rollback

Tag each release in your registry. To roll back, point the `backend` service at the previous image tag (for example via the `image:` field in `docker-compose.yaml`), then:

```shell
docker compose pull backend
docker compose up -d backend
```

Note that `docker compose pull` takes a service name, not an `image:tag` reference; the tag itself is selected in the compose file. The PostgreSQL data directory is persisted on disk, so sessions and conversations remain intact.
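One way to make the rollback explicit is a compose override pinning the image tag. This is a sketch only; `previous-tag` is a placeholder for your last known-good release:

```yaml
# docker-compose.override.yaml (sketch)
services:
  backend:
    image: rustygpt/server:previous-tag
```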
# Rotate secrets

Use this runbook to update credentials (database passwords, OAuth secrets, session keys) while keeping RustyGPT online.

## Preparation

- Inventory the secrets in use (e.g. `DATABASE_URL`, `GITHUB_CLIENT_SECRET`, config entries under `[security.cookie]`).
- Update your secret manager or environment files with new values, but do not apply them yet.
- Coordinate a maintenance window if session cookie rotation is expected to log users out.
## Rotation steps

1. Stage – write new values to your secret store or `.env` file.
2. Deploy – restart the server with updated environment variables/config (`docker compose restart backend` or a rolling restart in your orchestrator). The bootstrap runner is idempotent, so restarting is safe.
3. Verify – run smoke tests:

   ```shell
   cargo run -p rustygpt-cli -- login
   cargo run -p rustygpt-cli -- me
   curl -sSf http://HOST:8080/readyz
   ```

4. Cleanup – remove old secrets from the manager and audit logs for unexpected errors.

Session cookies are independent of database passwords or OAuth secrets. If you change `[security.cookie]` settings (e.g. enable `secure` or change `session_cookie_name`), expect users to sign in again.
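For reference, such a change might look like this in `config.toml`. Only the keys `secure` and `session_cookie_name` are taken from the text above; the cookie name value is an assumed example:

```toml
[security.cookie]
secure = true
session_cookie_name = "rustygpt_session"  # assumed example value
```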
## Observability

- Watch application logs for `SessionService` warnings.
- Confirm `http_rate_limit_requests_total` continues to increment after the restart.
- Verify the CLI can still access streaming endpoints (`cargo run -p rustygpt-cli -- follow --root <id>`).
## Incident response

If a rotation fails:

- Roll back to the previous secret values and restart the server.
- Capture logs around the failure (authentication errors, database connection failures, etc.).
- File an issue or ADR documenting the change and follow-up actions.
# Release notes

The authoritative changelog lives in `CHANGELOG.md`. No tagged releases have been published yet; the
`[Unreleased]` section tracks ongoing development across the workspace crates.

When a release is tagged (`vMAJOR.MINOR.PATCH`), update the changelog and, if applicable, regenerate the docs index with
`just docs-index` so the summaries reflect the new features.