Self-Hosting

LongMem runs comfortably on a single small server — the reference deployment is a 4 GB box serving the public instance.

The stack

Component               Role
PostgreSQL + pgvector   memories, collections, embeddings, full-text (one database for everything); tenant isolation via row-level security
Redis                   queues (async ingest + extraction), result cache, one-time tokens
FastAPI / uvicorn       the API (app.main)
Ingest worker           python -m app.ingest_worker: file processing + graph extraction; never loads the embedding model (embeds via the API)
nginx                   TLS, static landing/docs, API proxy; serve HTML with Cache-Control: no-cache

Configuration (.env)

DATABASE_URL=postgresql+asyncpg://longmem:…@localhost:5432/longmem
REDIS_URL=redis://localhost:6379/0

EMBEDDING_PROVIDER=local        # on-box ONNX embeddings — no text leaves the box
OPENAI_API_KEY=sk-…             # only used for extraction / Whisper / vision captions
STORAGE_SECRET_KEY=…            # Fernet key for encrypting BYO credentials at rest

# file storage
S3_ENDPOINT_URL=…
S3_BUCKET=…
S3_ACCESS_KEY=…
S3_SECRET_KEY=…
RESEND_API_KEY=…                # transactional email (optional)

With EMBEDDING_PROVIDER=local and a BYO/local extraction endpoint, the only third-party touchpoints left are Whisper/vision for media — skip media and nothing leaves your infrastructure.
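
As a rough illustration of the settings above, the sketch below parses .env-style text, reports missing keys, and produces a key in the format Fernet expects (32 random bytes, URL-safe base64 encoded). The REQUIRED tuple and all helper names are assumptions made for this example, not LongMem's actual startup checks; the configuration above only marks RESEND_API_KEY as optional and does not say which other variables are mandatory.

```python
import base64
import os

# Assumption for this sketch: the keys we choose to treat as mandatory.
REQUIRED = ("DATABASE_URL", "REDIS_URL", "EMBEDDING_PROVIDER", "STORAGE_SECRET_KEY")


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and '#' comment lines."""
    env: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment (" # ...") and surrounding whitespace.
        # Limitation: a value that itself contains " #" would be truncated.
        env[key.strip()] = value.split(" #", 1)[0].strip()
    return env


def missing_keys(env: dict[str, str]) -> list[str]:
    """Return the REQUIRED keys that are absent or empty."""
    return [key for key in REQUIRED if not env.get(key)]


def is_valid_fernet_key(value: str) -> bool:
    """True if value decodes (URL-safe base64) to exactly 32 bytes."""
    try:
        return len(base64.urlsafe_b64decode(value.encode())) == 32
    except ValueError:  # binascii.Error is a ValueError subclass
        return False


def new_fernet_key() -> str:
    """Generate a fresh key in the Fernet key format."""
    return base64.urlsafe_b64encode(os.urandom(32)).decode()
```

Running missing_keys(parse_env(open(".env").read())) before starting the services is one way to catch an incomplete file early; new_fernet_key() gives a value that could be pasted into STORAGE_SECRET_KEY.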

Database setup

CREATE EXTENSION vector;
CREATE EXTENSION pg_trgm;

# apply versioned migrations as a DDL-capable role (the app role stays unprivileged):
sudo -u postgres .venv/bin/python3 -m app.migrate apply --dsn postgresql:///longmem

The app's database role must NOT be a superuser, and must not have the BYPASSRLS attribute: PostgreSQL skips row-level security policies for such roles, so tenant isolation would silently stop applying.
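
One way to verify the role requirement is to query the standard pg_roles catalog view while connected as the application role. The sketch below keeps the database call out of the code so it stays driver-independent: ROLE_CHECK_SQL is the query to run, and rls_problems is a hypothetical helper (not part of LongMem) that interprets the two boolean columns it returns.

```python
# Run this while connected as the application role.
# pg_roles, rolsuper and rolbypassrls are standard PostgreSQL catalog names.
ROLE_CHECK_SQL = (
    "SELECT rolsuper, rolbypassrls FROM pg_roles WHERE rolname = current_user"
)


def rls_problems(rolsuper: bool, rolbypassrls: bool) -> list[str]:
    """Return human-readable reasons why this role would bypass RLS."""
    problems: list[str] = []
    if rolsuper:
        problems.append("role is a superuser; row-level security is bypassed")
    if rolbypassrls:
        problems.append("role has BYPASSRLS; row-level security is bypassed")
    return problems
```

An empty list means neither attribute is set. Feed the function the two values from whichever PostgreSQL driver you already use, and treat any non-empty result as a reason not to start the API.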

Services

Systemd units ship in deploy/: the API, the ingest worker, an SMTP receiver (email drop), and an uptime checker on a 1-minute timer that emails on consecutive health-check failures. The reference deploy model is autopull: a timer fetches the git branch, resets, reinstalls, rsyncs the landing, restarts, and health-checks, so a merge is live in ~60–120 s.
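
The "emails on consecutive health-check failures" behaviour of the uptime checker can be sketched as a small state machine. This is an illustration only, not the shipped checker: the class name, the default threshold of 3, and the choice to alert once per outage rather than on every failed check are all assumptions, since the text above does not specify them.

```python
from dataclasses import dataclass


@dataclass
class UptimeState:
    threshold: int = 3              # assumption: failures in a row before alerting
    consecutive_failures: int = 0
    alerted: bool = False           # True once the current outage was reported

    def record(self, healthy: bool) -> bool:
        """Record one health check; return True if an alert should go out now."""
        if healthy:
            # Any success ends the outage and re-arms the alert.
            self.consecutive_failures = 0
            self.alerted = False
            return False
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.threshold and not self.alerted:
            self.alerted = True
            return True
        return False
```

With a 1-minute timer, each run would call record() with the result of one health check and send the email only when it returns True. Because a timer-driven job starts fresh each run, the two counters would need to be persisted between runs (for example in a small state file), which this sketch leaves out.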

Sizing