Multi-process¶
mneme ships three modes for running the cache across processes. Pick with multi_process_mode= when you instantiate the cache:
    SemanticCache(path="cache.db", embedder=..., multi_process_mode="single")  # default
    SemanticCache(path="cache.db", embedder=..., multi_process_mode="stale-tolerant",
                  stale_check_interval=0.5)
    SemanticCache(path="cache.db", embedder=..., multi_process_mode="mmap-shared")
At a glance¶
| Mode | Coordination cost | Read consistency | Best for |
|---|---|---|---|
| `single` (default) | none | own writes only | One process owning the cache file |
| `stale-tolerant` | periodic poll of `version_counter` + `iter_since` deltas | eventually consistent (configurable lag) | Multiple processes on one host, mostly-readonly workloads |
| `mmap-shared` | `fcntl.flock` (POSIX) / `msvcrt.locking` (Windows) on a shared mmap matrix | strong on the same host | Read-heavy, latency-sensitive, willing to operate on a single host |
Cross-host coordination is a different problem - use a shared store backend (Redis, Postgres, DynamoDB) instead of a multi-process mode.
single (default)¶
Assumes one process owns the cache file. Writes go to the store; reads serve from the in-memory index. No coordination logic at all. Fastest path; correct only when you actually have one writer.
If two processes both run single against the same SQLite file, SQLite's WAL mode prevents corruption - but the in-memory indices drift. One process's put() is invisible to the other until it reopens the cache.
stale-tolerant¶
The cache periodically polls store.version_counter. When the counter has advanced, it asks the store for entries inserted since the last seen last_id (store.iter_since) and applies the delta to its in-memory index.
    cache = SemanticCache(
        path="cache.db",
        embedder=embedder,
        multi_process_mode="stale-tolerant",
        stale_check_interval=0.5,  # poll every 500 ms
    )
Trade-offs:
- Lag is bounded by `stale_check_interval`. Smaller is fresher, larger is cheaper. 0.1–1 s is a typical range.
- Writes are immediately visible to the writer; readers see them after the next poll.
- Tombstones from `delete()` propagate the same way: a reader can still return a deleted entry on its very next `get`, and the deletion takes effect for that reader after its next stale check.
- Above a threshold of pending changes, the cache full-rebuilds instead of applying deltas. The threshold is tuned to keep stale checks cheap; the rebuild is amortized.
This mode is the workhorse for multiple processes sharing a SQLite file on one host - typical for Gunicorn/Uvicorn workers, multi-process job runners, or simple production setups.
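The poll-and-apply-deltas loop described in this section can be sketched with a toy store. This is a minimal illustration under stated assumptions, not mneme's implementation: `ToyStore`, `PollingReader`, and the injectable `clock` argument are hypothetical names, and only `version_counter`, `iter_since`, and `stale_check_interval` come from the text above. Tombstones and the full-rebuild threshold are left out for brevity.

```python
import time


class ToyStore:
    """Hypothetical stand-in for a store: append-only rows plus a version counter."""

    def __init__(self):
        self.rows = []              # list of (row_id, key, value)
        self.version_counter = 0

    def put(self, key, value):
        # The data write and the counter bump happen together.
        self.rows.append((len(self.rows) + 1, key, value))
        self.version_counter += 1

    def iter_since(self, last_id):
        # Rows inserted after last_id, in insertion order.
        return iter(self.rows[last_id:])


class PollingReader:
    """Keeps a local index and refreshes it inline on get(), never in a thread."""

    def __init__(self, store, stale_check_interval, clock=time.monotonic):
        self.store = store
        self.interval = stale_check_interval
        self.clock = clock
        self.index = {}
        self.last_id = 0
        self.seen_version = 0
        self.next_check = clock() + stale_check_interval

    def _maybe_refresh(self):
        now = self.clock()
        if now < self.next_check:
            return                  # interval not elapsed: serve possibly stale data
        self.next_check = now + self.interval
        version = self.store.version_counter
        if version == self.seen_version:
            return                  # nothing changed: the poll cost one comparison
        for row_id, key, value in self.store.iter_since(self.last_id):
            self.index[key] = value
            self.last_id = row_id
        self.seen_version = version

    def get(self, key):
        self._maybe_refresh()
        return self.index.get(key)


# A fake clock makes the staleness window visible without sleeping.
now = [0.0]
store = ToyStore()
reader = PollingReader(store, stale_check_interval=0.5, clock=lambda: now[0])
store.put("a", 1)
before_poll = reader.get("a")   # interval has not elapsed yet
now[0] = 0.6
after_poll = reader.get("a")    # the poll runs and applies the delta
```

Here `before_poll` is `None` and `after_poll` is `1`: the write only becomes visible to the reader once the interval has elapsed and a `get` triggers the check.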
mmap-shared¶
Advanced. The in-memory index is replaced with a single mmap-backed matrix shared across processes. Writes acquire an exclusive file lock; reads acquire a shared lock.
Pros:
- Strong read consistency on the same host. After a write commits, all readers see the new entry on their next `get`.
- No periodic polling overhead.
- One physical copy of the matrix in RAM regardless of process count.
Cons:
- POSIX (`fcntl.flock`) is the first-class implementation. Windows (`msvcrt.locking`) is best-effort and not exhaustively tested.
- Compaction at 25% tombstone density rewrites the file via atomic-rename; readers transparently re-mmap.
- Careful integration: `MmapSharedCoordinator` is exposed as a separately-instantiable primitive, used directly by advanced users (it's how the showcase's stress tests verify the coordination semantics).
- Single-host only. For cross-host shared cache, use a network store backend.
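The locking discipline behind this mode (exclusive lock for writes, shared lock for reads, one file-backed mapping) can be sketched with the standard library alone. This is a POSIX-only illustration, not the `MmapSharedCoordinator` API: the helper names, the fixed slot layout, and re-mapping the file on every call are all simplifying assumptions. A real implementation would presumably keep the mapping open and handle compaction.

```python
import fcntl
import mmap
import os
import struct
import tempfile

SLOT = struct.Struct("<d")   # hypothetical layout: one float64 per slot
N_SLOTS = 4


def create_matrix(path):
    # Pre-size the file so every process maps the same region.
    with open(path, "wb") as f:
        f.truncate(SLOT.size * N_SLOTS)


def write_slot(path, index, value):
    with open(path, "r+b") as f:
        fcntl.flock(f, fcntl.LOCK_EX)      # exclusive: waits out readers and other writers
        try:
            with mmap.mmap(f.fileno(), 0) as m:
                SLOT.pack_into(m, index * SLOT.size, value)
                m.flush()
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)


def read_slot(path, index):
    with open(path, "rb") as f:
        fcntl.flock(f, fcntl.LOCK_SH)      # shared: many readers may hold it at once
        try:
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
                return SLOT.unpack_from(m, index * SLOT.size)[0]
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "matrix.bin")
    create_matrix(path)
    write_slot(path, 2, 3.5)
    value = read_slot(path, 2)
```

Because the mapping is backed by a shared file, any process that calls `read_slot` after the writer releases its lock reads the value the writer stored, which is the same-host consistency property listed under Pros.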
Cross-host: use a network store¶
For genuinely distributed deployments, the multi-process modes don't help - they coordinate processes on one machine. Across hosts, use a network-backed Store:
| Backend | When |
|---|---|
| `RedisStore` | Lowest latency, ephemeral OK, you already run Redis |
| `PostgresStore` | Durable, transactional, you already run Postgres |
| `DynamoDBStore` | Serverless, multi-region, AWS shop |
Each one bumps version_counter in the same transaction as every data write, so even cross-host the stale-tolerant polling pattern works. Set multi_process_mode="stale-tolerant" plus a network store and your processes converge on the shared state with bounded staleness.
What mneme doesn't do¶
- No background threads. The cache never spawns its own thread. The `stale-tolerant` polling happens inline on each `get` if the interval has elapsed; it's a piggyback, not a worker.
- No `multiprocessing.shared_memory`. That API has known cleanup issues across Python versions; `mmap-shared` uses plain `mmap` + `fcntl.flock` instead.
- No `asyncio.Lock` over the cache `RLock`. The async cache reuses the sync core's `RLock`; there's no second async lock layered on top.
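The last point, a single `RLock` with no async lock on top, can be pictured as follows. This is a hedged sketch of the pattern rather than the actual `AsyncSemanticCache` code: `ToyAsyncCache`, `fake_embed`, and `demo` are hypothetical, and the lookup is an exact-match dictionary instead of a similarity search. The slow embedder call is awaited outside the lock and the critical section contains no `await`, so a plain thread lock never stalls the event loop across I/O.

```python
import asyncio
import threading


class ToyAsyncCache:
    """Hypothetical sketch: one threading.RLock guards the index; awaits stay outside it."""

    def __init__(self, embed):
        self._embed = embed                  # async callable: str -> tuple of floats
        self._lock = threading.RLock()
        self._index = {}

    async def put(self, text, value):
        vector = await self._embed(text)     # slow I/O, lock not held
        with self._lock:                     # short critical section, no await inside
            self._index[vector] = value

    async def get(self, text):
        vector = await self._embed(text)
        with self._lock:
            return self._index.get(vector)


async def fake_embed(text):
    await asyncio.sleep(0.05)                # stands in for a network call
    return (float(len(text)),)


async def demo():
    cache = ToyAsyncCache(fake_embed)
    await cache.put("hello", "world")
    loop = asyncio.get_running_loop()
    start = loop.time()
    results = await asyncio.gather(*(cache.get("hello") for _ in range(10)))
    return results, loop.time() - start
```

Running `asyncio.run(demo())` returns ten hits in roughly the time of one embedder call rather than ten, because the ten `get` calls overlap their awaits.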
Concurrency invariants¶
Whichever mode you pick, the cache's locking semantics are the same:
- One `RLock` per `SemanticCache` instance, held for the full duration of every public method.
- Writes (`put`, `delete`, `clear_namespace`, `clear`) bump `version_counter` in the same transaction as the data write at the store level.
- The lock is released across embedder awaits in `AsyncSemanticCache` so concurrent `get` calls overlap their I/O.
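The second invariant, bumping `version_counter` in the same transaction as the data write, is what keeps pollers from seeing a counter that disagrees with the data. A minimal sketch with the standard `sqlite3` module shows the idea. The `entries` and `meta` tables and all helper names are assumptions made for illustration, not mneme's schema.

```python
import sqlite3


def open_store(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS entries (id INTEGER PRIMARY KEY, key TEXT, value TEXT)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS meta (name TEXT PRIMARY KEY, value INTEGER NOT NULL)"
    )
    conn.execute("INSERT OR IGNORE INTO meta VALUES ('version_counter', 0)")
    conn.commit()
    return conn


def put(conn, key, value, fail=False):
    # `with conn` commits on success and rolls back if the body raises,
    # so the row and the counter bump land together or not at all.
    with conn:
        conn.execute("INSERT INTO entries (key, value) VALUES (?, ?)", (key, value))
        if fail:
            raise RuntimeError("simulated crash between the two statements")
        conn.execute("UPDATE meta SET value = value + 1 WHERE name = 'version_counter'")


def version(conn):
    return conn.execute(
        "SELECT value FROM meta WHERE name = 'version_counter'"
    ).fetchone()[0]


def count(conn):
    return conn.execute("SELECT COUNT(*) FROM entries").fetchone()[0]
```

If `put` fails partway through, both the inserted row and the counter revert, so a reader polling the counter never misses a committed row and never chases a phantom one.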
Where to go next¶
- Stores: SQLite - the default `single`/`stale-tolerant` backing.
- Stores: Redis / Postgres / DynamoDB - cross-host options.
- Performance tuning - picking `stale_check_interval` for your workload.