Self-Hosting

LongMem runs comfortably on a single small server — the reference deployment is a 4 GB box serving the public instance.

The stack

Component	Role
PostgreSQL + pgvector	memories, collections, embeddings, full-text — one database for everything; tenant isolation via row-level security
Redis	queues (async ingest + extraction), result cache, one-time tokens
FastAPI / uvicorn	the API (`app.main`)
Ingest worker	`python -m app.ingest_worker` — file processing + graph extraction; never loads the embedding model (embeds via the API)
nginx	TLS, static landing/docs, API proxy; serve HTML with `Cache-Control: no-cache`

Configuration (`.env`)

DATABASE_URL=postgresql+asyncpg://longmem:…@localhost:5432/longmem
REDIS_URL=redis://localhost:6379/0

EMBEDDING_PROVIDER=local        # on-box ONNX embeddings — no text leaves the box
OPENAI_API_KEY=sk-…             # only used for extraction / Whisper / vision captions
STORAGE_SECRET_KEY=…            # Fernet key for encrypting BYO credentials at rest

# file storage
S3_ENDPOINT_URL=…
S3_BUCKET=…
S3_ACCESS_KEY=…
S3_SECRET_KEY=…
RESEND_API_KEY=…                # transactional email (optional)

With `EMBEDDING_PROVIDER=local` and a BYO/local extraction endpoint, the only third-party touchpoints left are Whisper and vision captioning for media. Skip media and nothing leaves your infrastructure.
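
A quick pre-flight check of the `.env` can catch a missing or malformed value before the services start. The following is a minimal sketch, not part of LongMem: the helper names (`new_fernet_key`, `parse_env`, `check_env`) and the choice of which keys count as required are assumptions for illustration. It relies on the fact that a Fernet key is 32 random bytes in URL-safe base64, so one can be generated and validated with the standard library alone.

```python
import base64
import os

# Assumption: the keys a minimal deployment needs; adjust to your setup.
REQUIRED = ("DATABASE_URL", "REDIS_URL", "EMBEDDING_PROVIDER", "STORAGE_SECRET_KEY")


def new_fernet_key() -> str:
    """Return a key in Fernet format: 32 random bytes, URL-safe base64."""
    return base64.urlsafe_b64encode(os.urandom(32)).decode("ascii")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines; skip blanks and full-line comments."""
    env = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing " # comment" from an unquoted value.
        value = value.split(" #", 1)[0].strip()
        env[key.strip()] = value
    return env


def check_env(env: dict[str, str]) -> list[str]:
    """Return a list of human-readable problems (empty means OK)."""
    problems = [f"missing {k}" for k in REQUIRED if not env.get(k)]
    key = env.get("STORAGE_SECRET_KEY", "")
    if key:
        try:
            ok = len(base64.urlsafe_b64decode(key)) == 32
        except ValueError:  # bad padding or non-ASCII input
            ok = False
        if not ok:
            problems.append("STORAGE_SECRET_KEY is not a valid Fernet key")
    return problems
```

Running `print(new_fernet_key())` once yields a value suitable for `STORAGE_SECRET_KEY`, and `check_env(parse_env(open(".env").read()))` returns an empty list when the required keys are present and the key is well-formed.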

Database setup

CREATE EXTENSION vector;
CREATE EXTENSION pg_trgm;
# apply versioned migrations as a DDL-capable role (the app role stays unprivileged):
sudo -u postgres .venv/bin/python3 -m app.migrate apply --dsn postgresql:///longmem

The app's database role must NOT be a superuser: PostgreSQL superusers bypass row-level security policies, so tenant isolation depends on the role staying unprivileged.
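
The role's attributes can be verified from the standard `pg_roles` catalog view. In PostgreSQL, both superusers and roles with the `BYPASSRLS` attribute skip row-level security, so the sketch below flags either. The function name `rls_problems` is hypothetical, and the role name `longmem` used in the note is only inferred from the example `DATABASE_URL`.

```python
# pg_roles is a standard PostgreSQL catalog view; $1 is the role name to check.
ROLE_QUERY = "SELECT rolname, rolsuper, rolbypassrls FROM pg_roles WHERE rolname = $1"


def rls_problems(row) -> list[str]:
    """Given one pg_roles row (a mapping), list attributes that bypass RLS."""
    if row is None:
        return ["role not found"]
    problems = []
    if row["rolsuper"]:
        problems.append(f"{row['rolname']} is a superuser")
    if row["rolbypassrls"]:
        problems.append(f"{row['rolname']} has BYPASSRLS")
    return problems
```

With asyncpg, for instance, the row could come from `await conn.fetchrow(ROLE_QUERY, "longmem")`; an empty result from `rls_problems` means neither attribute is set on that role.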

Services

Systemd units ship in `deploy/`: the API, the ingest worker, an SMTP receiver (email drop), and an uptime checker on a 1-minute timer that emails on consecutive health-check failures. The reference deploy model is autopull: a timer fetches the git branch, resets to it, reinstalls, rsyncs the landing page, restarts the services, and runs a health check. A merge is live in ~60–120 s.
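
Because a timer-driven checker starts as a fresh process on every run, alerting on consecutive failures means the failure streak has to be persisted between runs. The sketch below shows one way to do that; it is not the checker shipped in `deploy/`, and the state file, the function name `record_check`, and the threshold of 3 are all assumptions.

```python
import json
from pathlib import Path


def record_check(state_file: Path, healthy: bool, threshold: int = 3) -> bool:
    """Update the failure streak; return True exactly when an alert is due."""
    try:
        failures = json.loads(state_file.read_text())["failures"]
    except (FileNotFoundError, ValueError, KeyError, TypeError):
        failures = 0  # no usable state yet: start a fresh streak
    # A healthy result resets the streak; a failure extends it.
    failures = 0 if healthy else failures + 1
    state_file.write_text(json.dumps({"failures": failures}))
    # Alert once, on the run where the streak first reaches the threshold.
    return failures == threshold
```

Returning `True` only when the streak first hits the threshold avoids sending a new email every minute during a long outage, and a single transient failure never alerts.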

Sizing

4 GB RAM / 2 vCPU runs Postgres + Redis + API (with the local embedding model) + worker with headroom.
The worker embeds through the API's internal endpoint so only one copy of the embedding model is ever loaded.
