
RustyGPT Documentation

Welcome! This mdBook describes the RustyGPT workspace in depth: how the server and clients are structured, how the PostgreSQL-backed features behave, and how to operate the platform locally or in shared environments.

Start with the guides to get a local environment running, then dive into the reference and architecture chapters for precise APIs and design notes.

Quick navigation

  • Quickstart – configure and launch the server, web client, and CLI.
  • Local development – watcher workflows, debugging tools, and environment variables.
  • REST API – endpoint catalogue for conversations, streaming, authentication, and admin features.
  • Service topology – how the Axum server, Yew SPA, PostgreSQL, and SSE stream hub fit together.

What RustyGPT ships today

RustyGPT focuses on a cohesive Rust stack:

  • rustygpt-server exposes REST + SSE endpoints with cookie-based auth (handlers/auth.rs), rate limiting (middleware/rate_limit.rs), and OpenAPI documentation (openapi.rs).
  • rustygpt-web is a Yew SPA that consumes the server API via src/api.rs and renders threaded conversations, presence, and typing indicators.
  • rustygpt-cli shares the same models as the server, providing commands for login, conversation inspection, SSE following, and OpenAPI generation (src/commands).
  • rustygpt-shared houses configuration loading, llama.cpp bindings, and the data transfer objects used across all crates.

Each documentation section links back to the relevant modules so you can cross-reference behaviour with the implementation.

Guide Overview

These walkthroughs assume you are starting from a fresh checkout. They show how to configure config.toml, run the Axum server, keep the Yew frontend hot-reloading, and validate that authentication + streaming work end to end.

  • Quickstart bootstraps the database, enables feature flags, and walks through the setup flow.
  • Local Development documents the just recipes, watcher processes, and debugging tips.

For conceptual background, jump to Concepts; for task-specific runbooks, see How-to.

Quickstart

TL;DR – copy config.example.toml, enable the feature flags you need, start PostgreSQL, run the Axum server, complete the /api/setup flow, then bring up the Yew frontend and CLI.

1. Prerequisites

  • Rust toolchain (rustup default stable), cargo, and just
  • trunk for the web client (cargo install trunk)
  • PostgreSQL 15+ running locally (the provided docker-compose.yaml exposes one at postgres://tinroof:rusty@localhost:5432/rusty_gpt)
  • Optional: llama.cpp-compatible model files if you plan to exercise assistant streaming

Fetch dependencies once:

cargo fetch --workspace

2. Configure the server

Create config.toml and adjust it for your environment:

cp config.example.toml config.toml

At minimum set:

[db]
url = "postgres://tinroof:rusty@localhost/rustygpt_dev"

[features]
auth_v1 = true
sse_v1 = true
well_known = true

rustygpt-shared::config::server::Config supports TOML/YAML/JSON files and environment overrides (e.g. RUSTYGPT__SERVER__PORT=8080). See Configuration for the complete matrix of keys.

3. Start PostgreSQL

If you are using Docker Compose:

docker compose up postgres -d

On server startup the bootstrap runner executes every SQL script under scripts/pg/{schema,procedures,indexes,seed} in order. The seed stage enables feature flags and inserts the default rate-limit profile (conversation.post).

4. Run the backend

Either launch directly:

cargo run -p rustygpt-server -- serve --port 8080

or use the helper recipe that also builds configuration if needed:

just run-server

The process listens on http://127.0.0.1:8080. Health probes are available at /api/healthz and /api/readyz.

5. Complete initial setup

The first authenticated user is created by POSTing to /api/setup:

curl -X POST http://127.0.0.1:8080/api/setup \
  -H 'Content-Type: application/json' \
  -d '{"username":"admin","email":"admin@example.com","password":"change-me"}'

Subsequent calls return 400 once a user already exists.

6. Start the web client

In a new terminal:

cd rustygpt-web
trunk serve

The SPA proxies /api/* requests to the backend. After logging in you should see:

  • Conversation list populated by GET /api/conversations/{conversation_id}/threads
  • Thread view that streams updates from /api/stream/conversations/{conversation_id}
  • Presence and typing indicators driven by ConversationStreamEvent payloads

7. Exercise the CLI

The rustygpt binary shares configuration and cookie handling with the server:

cargo run -p rustygpt-cli -- login
cargo run -p rustygpt-cli -- chat --conversation <conversation-uuid>
cargo run -p rustygpt-cli -- follow --root <thread-uuid>

follow connects to the SSE endpoint, reconstructs events, and prints deltas as they arrive. If you see authentication required errors, confirm you completed the setup step and that [features].auth_v1 is true.

8. Next steps

  • Review Local Development for watcher workflows, linting, and debugging tips.
  • Explore REST API for a full endpoint catalogue and payload shapes.
  • Consult Service Topology to understand how the components interact at runtime.

Local Development

TL;DR – keep config.toml in sync with your environment, use just dev for paired watchers, and rely on the CLI for quick smoke tests of authentication and streaming.

Environment configuration

All binaries load configuration through rustygpt-shared::config::server::Config. The loader merges:

  1. Built-in defaults selected by the active profile (Dev/Test/Prod)
  2. Optional config.toml / config.yaml / config.json
  3. Environment variables such as RUSTYGPT__SERVER__PORT=9000
  4. CLI overrides (e.g. cargo run -p rustygpt-server -- serve --port 9000)

Keep secrets out of the repo—override them with environment variables or a private config.toml. See Configuration for the full key list.
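The layering above can be sketched as a simple "later layers win" merge. The types below are illustrative only (they are not the actual rustygpt-shared structs); each optional layer overrides only the fields it sets:

```rust
// Hypothetical, simplified config types for illustration.
#[derive(Debug, PartialEq)]
struct Config {
    port: u16,
    log_level: String,
}

struct Overlay {
    port: Option<u16>,
    log_level: Option<String>,
}

// Apply overlays in order: defaults, then file, then environment, then CLI.
fn resolve(defaults: Config, layers: &[Overlay]) -> Config {
    layers.iter().fold(defaults, |mut acc, layer| {
        if let Some(p) = layer.port {
            acc.port = p;
        }
        if let Some(l) = &layer.log_level {
            acc.log_level = l.clone();
        }
        acc
    })
}

fn main() {
    let defaults = Config { port: 8080, log_level: "info".into() };
    let file = Overlay { port: None, log_level: Some("debug".into()) };
    // e.g. RUSTYGPT__SERVER__PORT=9000 in the environment
    let env = Overlay { port: Some(9000), log_level: None };
    let cfg = resolve(defaults, &[file, env]);
    println!("{cfg:?}"); // port comes from the env layer, log_level from the file
}
```

The key property is that an unset field in a later layer leaves the earlier value untouched, which is why a minimal config.toml plus a couple of environment variables is usually enough.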

Watcher workflows

The Justfile orchestrates the common flows:

# Run server + web watchers together (uses rustygpt-tools/confuse)
just dev

# Backend only hot-reload
just watch-server

# Run fmt, check, and clippy
just check

just dev spawns two subprocesses:

  • rustygpt-server via cargo watch -x 'run -- serve --port 8080'
  • rustygpt-web via trunk watch

Logs stream to stdout so you can confirm when migrations finish (db_bootstrap_* metrics) and when the SSE hub accepts connections.

CLI smoke tests

The CLI binary lives at rustygpt-cli. Useful commands while iterating:

# Launch the server directly from the CLI crate
cargo run -p rustygpt-cli -- serve --port 8080

# Generate the OpenAPI spec
cargo run -p rustygpt-cli -- spec openapi.yaml

# Generate config skeletons
cargo run -p rustygpt-cli -- config --format toml

# Manage sessions
cargo run -p rustygpt-cli -- login
cargo run -p rustygpt-cli -- me
cargo run -p rustygpt-cli -- logout

CLI commands reuse the same cookie jar as the web client. Cookies are stored under ~/.config/rustygpt/session.cookies by default (see [cli] in the configuration schema).

Debugging tips

  • Enable verbose tracing: RUST_LOG=rustygpt_server=debug,tower_http=info just run-server
  • Inspect SSE payloads: curl -N http://127.0.0.1:8080/api/stream/conversations/<conversation-id> (requires an authenticated session and features.sse_v1 = true)
  • Verify configuration resolution: cargo run -p rustygpt-cli -- config --format json and inspect the generated file
  • Regenerate database bindings or seed data by restarting the server; bootstrap scripts rerun automatically when the process starts
  • Use docker compose logs postgres if migrations fail during bootstrap

For operational playbooks (e.g. Docker deployment or rotating secrets) see the How-to section.

Concepts Overview

This section explains the core ideas that appear across the server, web client, and CLI. Use it to understand the vocabulary used in API responses and stream payloads before diving into the reference material.

  • Threaded conversations describes how messages, threads, and ConversationStreamEvent values relate to each other.
  • Shared models covers the rustygpt-shared crate, focusing on how typed DTOs and enums keep clients in sync with the server.

Pair these concepts with the Architecture diagrams to see where each part lives at runtime.

Threaded conversations

RustyGPT models chat history as threaded conversations stored in PostgreSQL. Each conversation has participants, invites, thread roots, and replies. The server exposes this structure through the DTOs in rustygpt-shared/src/models/chat.rs.

Core data types

| Type | Purpose | Defined in |
|---|---|---|
| ConversationCreateRequest | Payload for creating a new conversation. | shared::models::chat |
| ThreadTreeResponse | Depth-first snapshot of a thread including metadata for each node. | shared::models::chat |
| MessageChunk | Persisted assistant output chunk (used when streaming replies). | shared::models::chat |
| ConversationStreamEvent | Enum describing SSE events (thread.new, message.delta, etc.). | shared::models::chat |

Each thread is anchored by a root message (POST /api/threads/{conversation_id}/root). Replies hang off the tree using parent IDs (POST /api/messages/{message_id}/reply). The ThreadTreeResponse payload includes ancestry hints so clients can render the structure without additional queries.
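A client consuming such a payload essentially rebuilds a tree from (id, parent) pairs and walks it depth-first. The sketch below is illustrative only, using bare IDs where the real ThreadTreeResponse carries full message metadata:

```rust
use std::collections::HashMap;

// Rebuild a reply tree from (id, parent_id) pairs and emit a depth-first
// listing with nesting depth, the order a client would render it in.
fn depth_first(messages: &[(u32, Option<u32>)]) -> Vec<(u32, usize)> {
    let mut children: HashMap<Option<u32>, Vec<u32>> = HashMap::new();
    for &(id, parent) in messages {
        children.entry(parent).or_default().push(id);
    }
    let mut out = Vec::new();
    // Roots are the messages with no parent; push reversed so the first
    // root is popped (and rendered) first.
    let mut stack: Vec<(u32, usize)> = children
        .get(&None)
        .into_iter()
        .flatten()
        .rev()
        .map(|&id| (id, 0))
        .collect();
    while let Some((id, depth)) = stack.pop() {
        out.push((id, depth));
        if let Some(kids) = children.get(&Some(id)) {
            for &kid in kids.iter().rev() {
                stack.push((kid, depth + 1));
            }
        }
    }
    out
}

fn main() {
    // Root 1 has replies 2 and 4; message 3 replies to 2.
    let msgs = [(1, None), (2, Some(1)), (3, Some(2)), (4, Some(1))];
    for (id, depth) in depth_first(&msgs) {
        println!("{}{}", "  ".repeat(depth), id);
    }
}
```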

Streaming lifecycle

When features.sse_v1 = true, the server emits ConversationStreamEvent variants via StreamHub (handlers/streaming.rs). The naming mirrors the enum variants:

  • thread.new – new thread summary created
  • thread.activity – updated last_activity_at
  • message.delta – incremental assistant tokens (ChatDeltaChunk)
  • message.done – completion marker with usage stats
  • presence.update – user presence heartbeat
  • typing.update – typing indicator state
  • unread.update – unread count per thread root
  • membership.changed – conversation membership change
  • error – terminal failure while streaming

Events carry both the conversation_id and (when applicable) root_id so clients can scope updates precisely. SSE persistence is optional: enable [sse.persistence] in configuration to record events in rustygpt.sse_event_log via services::sse_persistence and replay them on reconnect.

Access control

The chat service (services::chat_service.rs) enforces membership checks and rate limits before mutating data. Rate limit profiles are backed by tables in scripts/pg/schema/040_rate_limits.sql and can be tuned via the admin API. Presence updates mark the acting user online and emit events so all subscribers stay consistent.

For endpoint details see REST API; for the transport-level diagram visit Streaming Delivery.

Shared models

rustygpt-shared centralises the data structures consumed by the server, web client, and CLI. Keeping these DTOs in one crate prevents drift between components and allows serde + utoipa derives to stay consistent.

Configuration loader

src/config/server.rs defines the Config struct and associated sub-structures (ServerConfig, RateLimitConfig, SseConfig, etc.). Every binary loads configuration through Config::load_config, which merges defaults, optional files, and environment overrides. Feature flags such as features.auth_v1 gate server subsystems without requiring code changes.

API payloads

The src/models directory contains strongly typed request/response structs:

  • models/chat.rs – conversations, threads, message payloads, and streaming events
  • models/oauth.rs – GitHub/Apple OAuth exchanges
  • models/setup.rs – first-time setup contract (SetupRequest, SetupResponse)
  • models/limits.rs – rate limit admin DTOs (CreateRateLimitProfileRequest, RateLimitAssignment, ...)
  • models/session.rs – session summaries returned by /api/auth/*

All types derive Serialize, Deserialize, and when relevant utoipa::ToSchema so the OpenAPI generator stays in sync.

LLM abstractions

src/llms exposes traits (LLMProvider, LLMModel) and helpers for llama.cpp integration. The server’s AssistantService uses these traits to stream responses and emit metrics (llm_model_cache_hits_total, llm_model_load_seconds). When you add a new provider, implement the traits here and update the configuration schema.

Why it matters

  • Type safety – clients compile against the same structs the server uses, catching breaking changes early.
  • Single-source documentation – OpenAPI docs and mdBook pages pull names directly from these types.
  • Testing – shared fixtures in shared::models make it easier to write integration tests that cover both server handlers and CLI commands.

Whenever you extend the API, add or update the relevant struct in rustygpt-shared first, then regenerate the OpenAPI spec with cargo run -p rustygpt-cli -- spec.

Architecture Overview

These chapters document the runtime architecture of RustyGPT: how the Axum server is composed, how the SSE stream hub works, and how rate limiting integrates with PostgreSQL. Use them alongside the concepts and reference sections when exploring the code.

Streaming delivery

RustyGPT streams conversation activity to authenticated clients over Server-Sent Events (SSE). The implementation lives in rustygpt-server/src/handlers/streaming.rs and is gated by features.sse_v1.

Flow

sequenceDiagram
  participant Client
  participant API as Axum /api
  participant Hub as StreamHub
  participant DB as rustygpt.sse_event_log

  Client->>API: POST /api/threads/{conversation}/root
  API->>Hub: publish ConversationStreamEvent
  Hub->>Client: SSE event (thread.new)
  Note over Hub,DB: if persistence enabled
  Hub->>DB: sp_record_sse_event
  Client->>API: reconnect with Last-Event-ID
  API->>Hub: subscribe(after)
  Hub->>DB: sp_sse_replay
  DB-->>Hub: persisted events
  Hub-->>Client: replay then live stream

Clients subscribe to /api/stream/conversations/:conversation_id. The route is protected by the auth middleware when features.auth_v1 is enabled, so callers must present a valid session cookie (the CLI handles this automatically).

Event payloads

Events are instances of shared::models::ConversationStreamEvent and are encoded as JSON envelopes with type and payload fields. See Threaded conversations for the full list of variants.

The SSE handler assigns monotonically increasing sequence numbers per conversation. When persistence is enabled the sequence is also stored in rustygpt.sse_event_log, allowing reconnecting clients to pass Last-Event-ID and receive any missed events before resuming the live stream.
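On the wire, each event could be framed as below. This is a sketch of standard SSE framing rather than the server's actual code; it assumes the sequence number, prefixed with the configured id_prefix ("evt_"), becomes the id: field that clients echo back as Last-Event-ID:

```rust
// Build one Server-Sent Events frame: an id line, an event-type line, a
// data line with the JSON envelope, and a blank line terminating the frame.
fn sse_frame(sequence: u64, event_type: &str, json_payload: &str) -> String {
    format!("id: evt_{sequence}\nevent: {event_type}\ndata: {json_payload}\n\n")
}

fn main() {
    let frame = sse_frame(42, "message.delta", r#"{"content":"Hel"}"#);
    print!("{frame}");
}
```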

Persistence and retention

Configure persistence via [sse.persistence] in config.toml:

[sse.persistence]
enabled = true
max_events_per_user = 500
prune_batch_size = 100
retention_hours = 48

services::sse_persistence stores events using the stored procedures in scripts/pg/schema/050_sse_persistence.sql. The pruning logic runs after each insert to keep the table bounded.
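The real pruning lives in SQL, but the policy the settings describe can be sketched in a few lines. This illustrative version assumes events are ordered oldest-first and uses an age in hours where the table stores timestamps:

```rust
// Drop events outside the retention window, then cap the log at
// max_events by discarding the oldest entries (front of the vector).
fn prune(
    events: &mut Vec<(u64, u64)>, // (sequence, age_hours), oldest first
    max_events: usize,
    retention_hours: u64,
) {
    events.retain(|&(_, age)| age <= retention_hours);
    if events.len() > max_events {
        let excess = events.len() - max_events;
        events.drain(..excess);
    }
}

fn main() {
    let mut log = vec![(1, 60), (2, 10), (3, 5), (4, 1)];
    prune(&mut log, 2, 48); // retention_hours = 48, max_events_per_user = 2
    println!("{log:?}");    // event 1 aged out, event 2 evicted by the cap
}
```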

Backpressure handling

The in-memory queue for each conversation defaults to channel_capacity = 128. Configure behaviour under [sse.backpressure]:

  • drop_strategy = "drop_tokens" drops assistant token events first
  • drop_strategy = "drop_tokens_and_system" also discards system events once the queue fills
  • warn_queue_ratio controls when a warning is logged about queue pressure

These settings keep hot conversations from exhausting memory while still delivering key state changes (presence, membership, unread counters).
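The drop_tokens policy can be sketched as follows. This is a hypothetical simplification of the hub's queueing (the real StreamHub uses bounded channels): when the queue is full, the oldest assistant token delta is evicted so state-change events still fit:

```rust
#[derive(Debug, PartialEq)]
enum Event {
    Token(String),    // message.delta payload: droppable under pressure
    Presence(String), // state change: keep
    Unread(u32),      // state change: keep
}

// Try to enqueue; on a full queue, evict the oldest Token event to make
// room. Returns false if nothing was droppable and the event was rejected.
fn enqueue(queue: &mut Vec<Event>, capacity: usize, event: Event) -> bool {
    if queue.len() < capacity {
        queue.push(event);
        return true;
    }
    if let Some(pos) = queue.iter().position(|e| matches!(e, Event::Token(_))) {
        queue.remove(pos);
        queue.push(event);
        true
    } else {
        false
    }
}

fn main() {
    let mut q = vec![Event::Token("He".into()), Event::Presence("alice".into())];
    let accepted = enqueue(&mut q, 2, Event::Unread(3));
    println!("accepted={accepted}, queue={q:?}");
}
```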

Client responsibilities

  • Reconnect with Last-Event-ID so the server can replay persisted events when available
  • Handle 401 responses by re-running the session refresh flow (/api/auth/refresh); the CLI and web client do this automatically
  • Clear typing state on typing.update and update unread counters when unread.update arrives

Use REST API endpoints to backfill state when the requested Last-Event-ID falls outside the retention window.

Service topology

RustyGPT is composed of a single Axum process backed by PostgreSQL. The web client and CLI talk to the same REST + SSE surface.

Components

  • Axum API (rustygpt-server) – exposes /api/* endpoints, authentication middleware, rate limiting, SSE, OpenAPI docs, health probes, metrics, and static file hosting.
  • PostgreSQL – stores users, sessions, conversations, threads, SSE history, and rate-limit configuration. Schema and stored procedures live in scripts/pg and are applied automatically during bootstrap.
  • SSE StreamHub – in-memory fan-out implemented in handlers/streaming.rs, optionally persisting events through services::sse_persistence.
  • Yew SPA (rustygpt-web) – compiled to WebAssembly with Trunk. src/api.rs handles authentication, CSRF, SSE reconnection, and REST calls for conversations/threads.
  • CLI (rustygpt-cli) – shares configuration and models with the server. Provides helper commands for session management, SSE following, and OpenAPI/config generation.

Data flow

flowchart LR
  subgraph Clients
    web[Yew web app]
    cli[CLI]
  end
  api[Axum /api]
  stream[StreamHub]
  db[(PostgreSQL)]

  web --> api
  cli --> api
  api --> db
  api --> stream
  stream --> web
  stream --> cli

Requests hit the Axum router, which talks to PostgreSQL via SQLx and fans out live events via StreamHub. Clients subscribe to /api/stream/conversations/:conversation_id to receive updates.

Feature flags

Many subsystems are gated by [features] in configuration:

  • auth_v1 – enables session middleware, /api/auth/*, protected routes, and the rate-limit admin API
  • sse_v1 – enables the SSE route and persistence options
  • well_known – serves .well-known/* entries from the config

Toggle these flags without recompiling the binaries.

Scaling notes

The server is stateless apart from in-memory SSE buffers. For horizontal scaling you must either:

  • Disable persistence and tolerate best-effort delivery, or
  • Configure [sse.persistence] so each instance replays from PostgreSQL on reconnect

Rate limiting already supports multi-instance deployments because configuration is stored in the database and periodically reloaded (RateLimitState::reload_from_db).

Rate-limit architecture

RustyGPT enforces per-route throttling using a leaky-bucket strategy implemented in middleware::rate_limit. Configuration comes from two tables managed by stored procedures in scripts/pg/procs/034_limits.sql.

Data model

  • rustygpt.rate_limit_profiles – named profiles containing algorithm + JSON parameters (currently gcra style with requests_per_second / burst options).
  • rustygpt.rate_limit_assignments – maps HTTP method + path pattern to a profile.
  • rustygpt.message_rate_limits – per-user, per-conversation state used by sp_user_can_post to throttle message posting.

RateLimitState::reload_from_db loads profiles and assignments into memory. The admin API under /api/admin/limits/* can create, update, or delete records at runtime; after each change the state refreshes automatically.

Matching logic

enforce_rate_limits computes a cache key as "{METHOD} {path}" and finds the first matching pattern. Supported patterns:

  • Exact path matches (/api/messages/{id}/reply becomes /api/messages/:id/reply in the database)
  • * suffix for prefixes (e.g. /api/admin/*)

If no assignment matches, the middleware falls back to the default strategy derived from [rate_limits].default_rps and [rate_limits].burst. Login routes (/api/auth/login) use the dedicated auth_login_per_ip_per_min limiter.
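The two pattern forms can be sketched with plain string matching. The semantics below are an assumption drawn from the rules listed above, not the middleware's actual code: :param segments match any single path segment, and a trailing * matches any key with the given prefix:

```rust
// Match a "{METHOD} {path}" cache key against a stored pattern.
fn route_matches(pattern: &str, key: &str) -> bool {
    // "* suffix" patterns are plain prefix matches.
    if let Some(prefix) = pattern.strip_suffix('*') {
        return key.starts_with(prefix);
    }
    // Otherwise compare segment by segment; ":name" matches anything.
    let p: Vec<&str> = pattern.split('/').collect();
    let k: Vec<&str> = key.split('/').collect();
    p.len() == k.len()
        && p.iter().zip(&k).all(|(ps, ks)| ps.starts_with(':') || ps == ks)
}

fn main() {
    println!("{}", route_matches("POST /api/messages/:id/reply",
                                 "POST /api/messages/123/reply"));
    println!("{}", route_matches("GET /api/admin/*",
                                 "GET /api/admin/limits/profiles"));
}
```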

Metrics and headers

When a request is evaluated the middleware records:

  • http_rate_limit_requests_total{profile,result} – allowed vs denied counts
  • http_rate_limit_remaining{profile} – remaining tokens after the decision
  • http_rate_limit_reset_seconds{profile} – seconds until the bucket refills
  • rustygpt_limits_profiles / rustygpt_limits_assignments – gauges updated on reload

Responses include the standard headers RateLimit-Limit, RateLimit-Remaining, RateLimit-Reset, and X-RateLimit-Profile so clients can react accordingly.

Admin API payloads

All admin payloads live in shared::models::limits:

  • CreateRateLimitProfileRequest
  • UpdateRateLimitProfileRequest
  • AssignRateLimitRequest
  • RateLimitProfile / RateLimitAssignment

These endpoints require an authenticated session with the admin role (handlers/admin_limits.rs).

Conversation posting limits

ChatService::post_root_message and ChatService::reply_message call sp_user_can_post, which enforces a GCRA window per (user_id, conversation_id) using rustygpt.message_rate_limits. Tweak the conversation.post profile via SQL or the admin API to adjust posting cadence.
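For intuition, here is a minimal in-memory GCRA sketch. It is illustrative only (the real enforcement happens in stored procedures, and the exact burst accounting may differ): each accepted request advances a "theoretical arrival time", and a request is denied once that time runs further ahead of now than the burst allowance:

```rust
struct Gcra {
    emission_interval: f64, // seconds per request = 1 / requests_per_second
    burst_allowance: f64,   // how far tat may run ahead of now
    tat: f64,               // theoretical arrival time of the next request
}

impl Gcra {
    fn new(requests_per_second: f64, burst: u32) -> Self {
        let emission_interval = 1.0 / requests_per_second;
        Self {
            emission_interval,
            burst_allowance: burst as f64 * emission_interval,
            tat: 0.0,
        }
    }

    // Decide one request at time `now` (seconds); true = allowed.
    fn check(&mut self, now: f64) -> bool {
        let tat = self.tat.max(now);
        if tat - now > self.burst_allowance {
            false // bucket exhausted: deny without advancing tat
        } else {
            self.tat = tat + self.emission_interval;
            true
        }
    }
}

fn main() {
    // 1 request/second with a burst of 2: a short burst is absorbed,
    // then further instantaneous requests are denied.
    let mut limiter = Gcra::new(1.0, 2);
    let decisions: Vec<bool> = (0..4).map(|_| limiter.check(0.0)).collect();
    println!("{decisions:?}");
}
```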

Reference Overview

Authoritative details for RustyGPT’s public surfaces:

  • Authentication – session cookies, setup flow, and rotation behaviour
  • REST API – endpoint catalogue grouped by feature area
  • Configuration – config.toml structure and environment overrides

For operational workflows see How-to; for high-level design context visit the Architecture section.

Authentication

RustyGPT uses cookie-based sessions backed by PostgreSQL. Session management lives in rustygpt-server/src/auth/session.rs and is exposed through /api/auth/* handlers when features.auth_v1 = true.

Session lifecycle

  1. Setup – POST /api/setup hashes the supplied password and inserts the first user (admin + member roles). Further calls are rejected.
  2. Login – POST /api/auth/login verifies credentials via sp_auth_login. Successful responses include:
    • Set-Cookie: SESSION_ID=...; HttpOnly; Secure?; SameSite=Lax
    • Set-Cookie: CSRF-TOKEN=...; SameSite=Strict
    • X-Session-Rotated: 1
  3. Authenticated requests – non-GET operations must include the CSRF header X-CSRF-TOKEN with the cookie value. The web client (rustygpt-web/src/api.rs) and CLI handle this automatically.
  4. Refresh – POST /api/auth/refresh rotates cookies inside the idle window (default 8 hours). If either the idle or absolute expiry is exceeded, the call returns 401 session_expired.
  5. Logout – POST /api/auth/logout clears the session and CSRF cookies.

Sessions are stored in rustygpt.user_sessions. The idle and absolute windows come from [session] in configuration. When max_sessions_per_user is set the newest session evicts the oldest via sp_auth_login.
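The expiry rule combines both windows. The sketch below uses assumed field names and Unix-second timestamps for illustration; a session stays valid only while the idle window (since last activity) and the absolute window (since creation) both hold:

```rust
// true while the session is still inside both the idle and absolute windows.
fn session_valid(
    now: u64,
    created_at: u64,
    last_seen_at: u64,
    idle_seconds: u64,
    absolute_seconds: u64,
) -> bool {
    now - last_seen_at <= idle_seconds && now - created_at <= absolute_seconds
}

fn main() {
    let (idle, absolute) = (28_800, 604_800); // defaults from [session]
    // Active ten minutes ago, created a day ago: still valid.
    println!("{}", session_valid(100_000, 13_600, 99_400, idle, absolute));
    // Idle for nine hours: expired, even though the absolute window remains.
    println!("{}", session_valid(100_000, 13_600, 67_600, idle, absolute));
}
```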

config.toml controls cookie behaviour:

[session]
idle_seconds = 28800
absolute_seconds = 604800
session_cookie_name = "SESSION_ID"
csrf_cookie_name = "CSRF-TOKEN"
max_sessions_per_user = 5

[security.cookie]
domain = ""
secure = false
same_site = "lax"

[security.csrf]
cookie_name = "CSRF-TOKEN"
header_name = "X-CSRF-TOKEN"
enabled = true

Adjust security.cookie.secure and security.cookie.domain for production deployments. When security.csrf.enabled = false the middleware skips header validation (useful for service-to-service calls but not recommended for browsers).

CLI workflow

The CLI wraps the same endpoints:

cargo run -p rustygpt-cli -- login
cargo run -p rustygpt-cli -- me
cargo run -p rustygpt-cli -- logout

Cookies are stored at ~/.config/rustygpt/session.cookies by default (see [cli.session_store]). The follow and chat commands automatically attach the CSRF header when present.

Observability

Authentication currently relies on logs for troubleshooting. Set RUST_LOG=rustygpt_server=debug to trace session decisions (SessionService::authenticate, SessionService::refresh_session). Prometheus metrics for auth flows are not yet implemented.

REST API

All endpoints are served under /api unless noted otherwise. When features.auth_v1 is enabled, protected endpoints require a session cookie, and non-GET requests must also carry the CSRF header. The OpenAPI schema is generated from rustygpt-server/src/openapi.rs and can be exported with cargo run -p rustygpt-cli -- spec.

Setup

| Method | Path | Description |
|---|---|---|
| GET | /api/setup | Returns { "is_setup": bool } by calling is_setup() in PostgreSQL. |
| POST | /api/setup | Creates the first administrator account (see scripts/pg/procs/010_auth.sql::init_setup). Subsequent calls return 400. |

Authentication

| Method | Path | Description |
|---|---|---|
| POST | /api/auth/login | Email/password login. Returns LoginResponse with session + CSRF cookies. |
| POST | /api/auth/logout | Revokes the current session. Requires CSRF header. |
| POST | /api/auth/refresh | Rotates session cookies inside the idle window. |
| GET | /api/auth/me | Returns MeResponse (requires authenticated session). |

OAuth helpers

Handlers in handlers/github_auth.rs and handlers/apple_auth.rs expose optional OAuth flows when credentials are present:

| Method | Path | Notes |
|---|---|---|
| GET | /api/oauth/github | Returns an authorization URL based on GITHUB_* environment variables. |
| GET | /api/oauth/github/callback | Exchanges the code for a session via ProductionOAuthService. |
| POST | /api/oauth/github/manual | Developer helper that accepts a raw auth code. |
| GET | /api/oauth/apple | Same as GitHub but for Apple. |
| GET | /api/oauth/apple/callback | Callback handler. |
| POST | /api/oauth/apple/manual | Manual exchange helper. |

Conversations & membership

Routes implemented in handlers/conversations.rs:

| Method | Path | Description |
|---|---|---|
| POST | /api/conversations | Create a new conversation. |
| POST | /api/conversations/{conversation_id}/participants | Invite/add a participant. Emits membership + presence SSE events. |
| DELETE | /api/conversations/{conversation_id}/participants/{user_id} | Remove a participant. |
| POST | /api/conversations/{conversation_id}/invites | Create an invite token. |
| POST | /api/invites/accept | Accept an invite token. |
| POST | /api/invites/{token}/revoke | Revoke an invite token. |
| GET | /api/conversations/{conversation_id}/threads | List thread summaries (supports after + limit query params). |
| GET | /api/conversations/{conversation_id}/unread | Return unread counts per thread. |

Threads & messages

Routes from handlers/threads.rs:

| Method | Path | Description |
|---|---|---|
| GET | /api/threads/{root_id}/tree | Depth-first thread slice (cursor_path + limit optional). |
| POST | /api/threads/{conversation_id}/root | Create a new thread root. Triggers assistant streaming when role = assistant. |
| POST | /api/messages/{parent_id}/reply | Reply to an existing message. |
| GET | /api/messages/{message_id}/chunks | Retrieve persisted assistant chunks. |
| POST | /api/threads/{root_id}/read | Mark thread as read (MarkThreadReadRequest). |
| POST | /api/messages/{message_id}/delete | Soft-delete a message. |
| POST | /api/messages/{message_id}/restore | Restore a previously deleted message. |
| POST | /api/messages/{message_id}/edit | Replace message content. |
| POST | /api/typing | Set typing state (TypingRequest). |
| POST | /api/presence/heartbeat | Update presence heartbeat. |

Streaming

| Method | Path | Description |
|---|---|---|
| GET | /api/stream/conversations/{conversation_id} | SSE endpoint producing ConversationStreamEvent values. Requires session cookie and (optionally) Last-Event-ID. |

Copilot-compatible endpoints

These helpers live in handlers/copilot.rs and provide simple echo responses for integration tests:

| Method | Path | Description |
|---|---|---|
| GET | /v1/models | Returns ModelsResponse with two static models (gpt-4, gpt-3.5). |
| POST | /v1/chat/completions | Echoes provided messages as assistant responses (ChatCompletionResponse). |

Admin rate limit API

Available when features.auth_v1 = true and rate_limits.admin_api_enabled = true (handlers/admin_limits.rs):

| Method | Path | Description |
|---|---|---|
| GET | /api/admin/limits/profiles | List profiles. |
| POST | /api/admin/limits/profiles | Create a profile. |
| PUT | /api/admin/limits/profiles/{id} | Update profile parameters/description. |
| DELETE | /api/admin/limits/profiles/{id} | Delete a profile (fails if still assigned). |
| GET | /api/admin/limits/assignments | List route assignments. |
| POST | /api/admin/limits/assignments | Assign a profile to a route. |
| DELETE | /api/admin/limits/assignments/{id} | Remove an assignment. |

Health and observability

Outside of the /api prefix, the server exposes:

| Method | Path | Description |
|---|---|---|
| GET | /healthz | Liveness probe. |
| GET | /readyz | Readiness probe (verifies PostgreSQL bootstrap). |
| GET | /metrics | Prometheus metrics via metrics-exporter-prometheus. |
| GET | /.well-known/{path} | Served when features.well_known = true; entries configured via [well_known.entries]. |
| GET | /openapi.json or /openapi.yaml | Generated OpenAPI spec. |

Refer to Authentication for cookie details and Configuration for the relevant keys.

Configuration

RustyGPT uses a layered configuration loader (rustygpt-shared::config::server::Config). Defaults are determined by the active profile (Dev/Test/Prod), then merged with an optional file and environment overrides. CLI flags can override specific values (e.g. --port).

Loading order

  1. Profile defaults (Config::default_for_profile(Profile::Dev))
  2. Optional config file (config.toml, config.yaml, or config.json)
  3. Environment variables using double underscores (e.g. RUSTYGPT__SERVER__PORT=9000)
  4. CLI overrides (currently the server/CLI --port flag)

Config::load_config(path, override_port) performs this merge and validates required fields.

Key sections

[logging]

[logging]
level = "info"        # tracing level passed to tracing-subscriber
format = "text"        # "text" or "json"

[server]

[server]
host = "127.0.0.1"
port = 8080
public_base_url = "http://localhost:8080"
request_id_header = "x-request-id"

[server.cors]
allowed_origins = ["http://localhost:3000", "http://127.0.0.1:3000"]
allow_credentials = false
max_age_seconds = 600

public_base_url is derived automatically when not supplied (scheme depends on profile). request_id_header controls which header the middleware reads when assigning request IDs.

[security]

[security.hsts]
enabled = false
max_age_seconds = 63072000
include_subdomains = true
preload = false

[security.cookie]
domain = ""
secure = false
same_site = "lax"

[security.csrf]
cookie_name = "CSRF-TOKEN"
header_name = "X-CSRF-TOKEN"
enabled = true

[rate_limits]

[rate_limits]
auth_login_per_ip_per_min = 10
default_rps = 50.0
burst = 100
admin_api_enabled = false

When admin_api_enabled = true the /api/admin/limits/* routes become available.

[session]

[session]
idle_seconds = 28800
absolute_seconds = 604800
session_cookie_name = "SESSION_ID"
csrf_cookie_name = "CSRF-TOKEN"
max_sessions_per_user = 5

Set max_sessions_per_user = 0 (or null) to disable automatic eviction.

[oauth]

[oauth]
redirect_base = "http://localhost:8080/api/auth/github/callback"

[oauth.github]
client_id = "..."
client_secret = "..."

If oauth.github is omitted the GitHub endpoints still respond but return empty URLs. Apple support reads APPLE_* environment variables directly in the handler.

[db]

[db]
url = "postgres://tinroof:rusty@localhost/rustygpt_dev"
statement_timeout_ms = 5000
max_connections = 10
bootstrap_path = "../scripts/pg"

bootstrap_path points to the directory containing schema/, procedures/, indexes/, and seed/ folders.

[sse]

[sse]
heartbeat_seconds = 20
channel_capacity = 128
id_prefix = "evt_"

[sse.persistence]
enabled = false
max_events_per_user = 500
prune_batch_size = 100
retention_hours = 48

[sse.backpressure]
drop_strategy = "drop_tokens"
warn_queue_ratio = 0.75

[features]

[features]
auth_v1 = true
sse_v1 = true
well_known = true

Flags gate optional subsystems without recompiling the binary.

[cli] and [web]

[cli]
session_store = "~/.config/rustygpt/session.cookies"

[web]
static_dir = "../rustygpt-web/dist"
spa_index = "../rustygpt-web/dist/index.html"

[llm]

Config embeds LLMConfiguration from rustygpt-shared::config::llm. Use it to describe llama.cpp models/providers:

[llm.global_settings]
persist_stream_chunks = true

[llm.providers.default]
provider_type = "llama_cpp"
model_path = "./models/your-model.gguf"

See rustygpt-shared/src/config/llm.rs for the full schema.

Environment variable syntax

Nested keys map to uppercase names with double underscores. Examples:

  • RUSTYGPT__SERVER__PORT=9001
  • RUSTYGPT__SECURITY__COOKIE__SECURE=true
  • RUSTYGPT__FEATURES__SSE_V1=true

Booleans and numbers follow standard Rust parsing rules. Paths can be relative or absolute.
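The name mapping above is mechanical enough to sketch. This illustrative helper (not part of the actual loader) turns a dotted key path into the corresponding variable name:

```rust
// "security.cookie.secure" -> "RUSTYGPT__SECURITY__COOKIE__SECURE"
fn env_var_name(key_path: &str) -> String {
    let tail = key_path
        .split('.')
        .map(|seg| seg.to_ascii_uppercase())
        .collect::<Vec<_>>()
        .join("__");
    format!("RUSTYGPT__{tail}")
}

fn main() {
    println!("{}", env_var_name("server.port"));
    println!("{}", env_var_name("features.sse_v1"));
}
```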

How-to Overview

Task-focused guides for operating RustyGPT. These assume you already understand the system from the Guides and Reference sections.

Docker deploy

This guide covers building the RustyGPT container image and running it alongside PostgreSQL with Docker Compose.

Build the image

The repository ships a multi-stage Dockerfile that builds the workspace and bundles the server binary plus static assets:

docker build -t rustygpt/server:latest -f Dockerfile .

Set BUILD_PROFILE=release to compile with optimisations. The final image exposes the server on port 8080.

Compose stack

docker-compose.yaml defines two services:

  • backend – builds from the Dockerfile (target runtime). Environment variables include DATABASE_URL, OAuth credentials, and feature toggles. Update them to match your deployment.
  • postgres – postgres:17-alpine with credentials matching config.example.toml.

Bring the stack up:

docker compose up --build

The compose file mounts ./.data/postgres for database storage and ./.data/postgres-init for init scripts. To reuse the workspace schema, copy the contents of scripts/pg into that directory before the first run:

mkdir -p .data/postgres-init
cp -r scripts/pg/* .data/postgres-init/

Alternatively, rely on the server’s bootstrap runner by exposing the same directory inside the backend container and pointing [db].bootstrap_path at it.

Configuration and secrets

  • Copy config.example.toml to a volume or bake it into the image, and set RUSTYGPT__* environment variables as needed.
  • Provide OAuth credentials (GITHUB_*, APPLE_*) if you plan to use those flows; otherwise the endpoints return placeholder URLs.
  • Set features.auth_v1, features.sse_v1, and features.well_known to true via environment variables or the config file.
  • If serving over TLS, terminate HTTPS at the reverse proxy and set server.public_base_url to the external URL.

Post-deployment checks

  1. Hit http://HOST:8080/healthz and http://HOST:8080/readyz until both return 200.
  2. POST to /api/setup to create the initial admin account.
  3. Use the CLI container (or a local build) to log in: docker compose exec backend rustygpt login.
  4. Visit /metrics and confirm counters increment when making requests.
  5. If using the web UI, serve the rustygpt-web build either from the same container (set [web] static_dir) or via a separate static host.
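The first check can be scripted. The helper below is a sketch (check_200 is an illustrative name; host, port, and retry policy are yours to adjust):

```shell
# Return success when the given URL answers with HTTP 200; use it to
# gate deployment steps, e.g.:
#   until check_200 "http://$HOST:8080/readyz"; do sleep 2; done
check_200() {
  [ "$(curl -s -o /dev/null -w '%{http_code}' "$1")" = "200" ]
}
```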

Rollback

Tag each release in your registry. To roll back, point the backend service at the previous image tag (for example by editing the image reference in docker-compose.yaml), then pull and restart:

docker compose pull backend
docker compose up -d backend

The PostgreSQL data directory is persisted on disk so sessions and conversations remain intact.

Rotate secrets

Use this runbook to update credentials (database passwords, OAuth secrets, session keys) while keeping RustyGPT online.

Preparation

  1. Inventory the secrets in use (e.g. DATABASE_URL, GITHUB_CLIENT_SECRET, config entries under [security.cookie]).
  2. Update your secret manager or environment files with new values, but do not apply them yet.
  3. Coordinate a maintenance window if session cookie rotation is expected to log users out.

Rotation steps

  1. Stage – write new values to your secret store or .env file.
  2. Deploy – restart the server with updated environment variables/config (docker compose restart backend or rolling restart in your orchestrator). The bootstrap runner is idempotent, so restarting is safe.
  3. Verify – run smoke tests:
    cargo run -p rustygpt-cli -- login
    cargo run -p rustygpt-cli -- me
    curl -sSf http://HOST:8080/readyz
    
  4. Cleanup – remove old secrets from the manager and audit logs for unexpected errors.

Session cookies are independent of database passwords or OAuth secrets. If you change [security.cookie] settings (e.g. enable secure or change session_cookie_name), expect users to sign in again.

Observability

  • Watch application logs for SessionService warnings.
  • Confirm http_rate_limit_requests_total continues to increment after the restart.
  • Verify the CLI can still access streaming endpoints (cargo run -p rustygpt-cli -- follow --root <id>).

Incident response

If a rotation fails:

  1. Roll back to the previous secret values and restart the server.
  2. Capture logs around the failure (authentication errors, database connection failures, etc.).
  3. File an issue or ADR documenting the change and follow-up actions.

Release notes

The authoritative changelog lives in CHANGELOG.md. No tagged releases have been published yet; the [Unreleased] section tracks ongoing development across the workspace crates.

When a release is tagged (vMAJOR.MINOR.PATCH), update the changelog and, if applicable, regenerate the docs index with just docs-index so the summaries reflect the new features.