Re-embed migration
When you change the embedder model or its dimension, every vector already in the cache becomes incompatible with the new embedder's vector space. The cache validates this on open and refuses to mix vectors:
SemanticCache(path="cache.db", embedder=NewEmbedder())
# raises EmbedderMismatchError(
# "Stored fingerprint 'old:v1' does not match supplied 'new:v1'.
# Remediation: open with the original embedder, or use
# mneme.tools.migrate.reembed() to migrate."
# )
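The check on open can be pictured as a simple fingerprint comparison. The following is a minimal, self-contained sketch of that idea, not mneme's actual implementation: check_fingerprint is a hypothetical helper, and EmbedderMismatchError is redefined locally as a stand-in so the snippet runs on its own.

```python
from typing import Optional


class EmbedderMismatchError(Exception):
    """Local stand-in for the library's error, so this sketch is runnable alone."""


def check_fingerprint(stored: Optional[str], supplied: str) -> str:
    """Return the fingerprint the cache should carry, or raise on a mismatch.

    Hypothetical helper: `stored` is the fingerprint persisted in the cache
    file (None for a brand-new cache), `supplied` is the fingerprint of the
    embedder passed at open time.
    """
    if stored is None:
        # New cache: adopt the supplied embedder's fingerprint.
        return supplied
    if stored != supplied:
        # Vectors from two different embedders must never share one index.
        raise EmbedderMismatchError(
            f"Stored fingerprint {stored!r} does not match supplied {supplied!r}."
        )
    return stored
```

Comparing an opaque fingerprint string, rather than inspecting the vectors themselves, keeps the check cheap enough to run on every open.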
The supported migration path is mneme.tools.migrate.reembed: walk every entry through the new embedder, write to a new cache, leave the source untouched.
The basic call
from mneme import SemanticCache
from mneme.tools.migrate import reembed
# Source cache - opened with the old embedder.
source = SemanticCache(path="old.db", embedder=old_embedder)
# Destination cache - opened with the new embedder.
dest = SemanticCache(path="new.db", embedder=new_embedder)
# Walks source.iter_all() through new_embedder.embed() and inserts into dest.
n = reembed(source=source, dest=dest, batch_size=128)
print(f"migrated {n} entries")
source.close()
dest.close()
reembed returns the count of entries successfully migrated.
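Conceptually, the walk is a batched read, embed, and insert loop. The sketch below illustrates that shape with plain Python containers. It is an assumption-laden stand-in, not the library's code: reembed_sketch, batched, the dict-shaped entries, and the embed_batch callable are all hypothetical.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def batched(items: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while batch := list(islice(it, size)):
        yield batch


def reembed_sketch(
    source_entries: Iterable[dict],
    embed_batch: Callable[[List[str]], List[List[float]]],
    dest: List[dict],
    batch_size: int = 128,
) -> int:
    """Re-embed every entry's query and append the result to `dest`.

    Returns the number of entries written. The source is only read.
    """
    migrated = 0
    for batch in batched(source_entries, batch_size):
        # One embedder call per batch, not per entry.
        vectors = embed_batch([entry["query"] for entry in batch])
        for entry, vector in zip(batch, vectors):
            # Copy the entry and attach the freshly computed vector.
            dest.append({**entry, "vector": vector})
            migrated += 1
    return migrated


dest: List[dict] = []
n = reembed_sketch(
    [{"query": "hi"}, {"query": "hello"}, {"query": "hey"}],
    embed_batch=lambda texts: [[float(len(t))] for t in texts],  # toy embedder
    dest=dest,
    batch_size=2,
)
print(n)  # 3
```

Because the source is never written to, a failed run leaves it exactly as it was.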
What's preserved, what's not
Preserved across migration:
- Query, response, metadata - same strings, same dict.
- Namespace. Each entry lands in its original namespace.
- TTL. If the source had ttl=, the dest gets the same TTL (counted from the original created_at, not from migration time).
- Counters. Per-namespace metrics counters copy over.
Not preserved:
- Embedding bytes. Re-computed from scratch - that's the whole point.
- Embedder fingerprint. Now matches the new embedder.
- ids. The dest assigns its own ids; expect them to differ from the source.
- last_accessed_at. Reset to created_at (migration is not an access).
- Index backend state. The dest builds its index fresh from the new vectors.
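The TTL rule means an entry's expiry clock is not reset by migration. A small worked example, assuming timestamps and TTLs are plain seconds (remaining_ttl is a hypothetical helper, not part of mneme):

```python
def remaining_ttl(created_at: float, ttl: float, now: float) -> float:
    """Seconds left before expiry, measured from the original created_at."""
    return max(0.0, created_at + ttl - now)


# Entry created at t=1000 with a one-hour TTL, migrated at t=4000.
# It has 600 seconds left after migration, not a fresh 3600.
print(remaining_ttl(created_at=1000.0, ttl=3600.0, now=4000.0))  # 600.0
```

Under this rule, an entry that was already past its TTL at migration time has nothing left in the destination either.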
Async variant
For async embedders, use areembed:
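from mneme import SemanticCache
from mneme.tools.migrate import areembed
# AsyncSemanticCache and to_sync_embedder must also be imported; their module
# paths are not shown on this page, see the API reference.

async def migrate():
    source = SemanticCache(path="old.db", embedder=to_sync_embedder(old_async))
    dest = AsyncSemanticCache(path="new.db", embedder=new_async_embedder)
    try:
        n = await areembed(source=source, dest=dest, batch_size=128)
    finally:
        # Close both caches even if the migration fails partway.
        source.close()
        await dest.close()
    return n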
The source can be either sync or async (use to_sync_embedder to adapt); the destination's embedder shape determines which migration variant you call.
Cost
Re-embedding is expensive - every cached entry runs through the new embedder. For an OpenAI embedder, that's one API call per entry unless you batch, billed per input token at embedding rates. For a local model, it's one forward pass per entry on your own CPU/GPU.
A 100k-entry cache through text-embedding-3-small:
- ~100k API calls at OpenAI's pricing
- Or batched, with the dimensions= parameter and input= as a list; reembed's batch_size= controls the batch size
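To see what batching does to the request count, here is a small arithmetic sketch. It assumes one API request per batch; request_count is a hypothetical helper, and provider-side limits on inputs per request may cap the usable batch size.

```python
import math


def request_count(n_entries: int, batch_size: int) -> int:
    """Number of embedder requests needed if each request carries one batch."""
    return math.ceil(n_entries / batch_size)


print(request_count(100_000, 1))    # 100000
print(request_count(100_000, 128))  # 782
```

Batching reduces the number of round trips, not the number of tokens embedded, so it mainly helps latency and rate limits rather than the token bill.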
For very large caches, consider:
- Migrate during a maintenance window. The cache returns misses while migration runs (you're using the old cache against new traffic).
- Migrate top-k LRU instead of everything. Drop the cold tail; only re-embed the hot entries.
  def hot_iter():
      for id_ in source._store.iter_lru_ids(n=10_000):  # private API; see Custom stores
          yield source._store.get_by_id(id_)
- Run migration in parallel against multiple destinations. Each destination shard handles a fraction of the corpus; the script splits by id-modulo or namespace.
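The id-modulo split mentioned in the last option can be sketched as follows. This is an illustrative assumption, not a mneme feature: shard_for and split_by_shard are hypothetical helpers, and ids are treated as strings. A stable hash (CRC32) is used instead of Python's built-in hash(), which is randomized per process for strings and would assign entries inconsistently across workers.

```python
import zlib
from typing import Iterable, List


def shard_for(entry_id: str, n_shards: int) -> int:
    """Map an entry id to a shard index that is stable across processes."""
    return zlib.crc32(entry_id.encode("utf-8")) % n_shards


def split_by_shard(entry_ids: Iterable[str], n_shards: int) -> List[List[str]]:
    """Partition ids so that each id lands in exactly one shard."""
    shards: List[List[str]] = [[] for _ in range(n_shards)]
    for entry_id in entry_ids:
        shards[shard_for(entry_id, n_shards)].append(entry_id)
    return shards
```

Each worker would then migrate only the ids in its own shard into its own destination cache.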
When not to migrate
If you're just tuning similarity_threshold, vector_dtype, or index_backend - don't re-embed. Those are runtime settings, not embedder properties:
| Change | Re-embed? |
|---|---|
| similarity_threshold | no - runtime knob |
| vector_dtype (fp32 ↔ fp16 ↔ int8) | no - cache.requantize() |
| index_backend (numpy ↔ hnsw) | no - cache.requantize() rebuilds the index from the store vectors |
| multi_process_mode | no - runtime knob |
| max_entries / namespace_quotas | no - runtime knob |
| Embedder model | yes - different vector space |
| Embedder dim (e.g. OpenAI dimensions=) | yes - different vector space |
| Tokenizer change in the same model | depends - if vectors drift, yes |
Migration is for vector-space-incompatible changes only.
Failure handling
reembed doesn't checkpoint mid-stream. If the new embedder is flaky, the dest cache ends up with whatever entries succeeded before the failure. Re-running the migration is idempotent - re-puts use the same (namespace, query_hash) key, so the previous attempt's entries get overwritten.
For very expensive migrations, wrap your own checkpoint logic:
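seen_ids = set()
try:
    n = reembed(source=source, dest=dest)
except Exception:
    # Recover: dest's iter_all tells you what was migrated; rerun on the rest.
    # Note: seen_ids holds query hashes, not entry ids, and _store is private API.
    # Persist seen_ids (or recompute it from dest) before retrying, since
    # re-raising ends this run.
    seen_ids = {e.query_hash for e in dest._store.iter_all()}
    raise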
Then on retry, filter source.iter_all() against seen_ids before passing to a custom migration loop.
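A custom retry loop along those lines might look like the sketch below. It is a minimal illustration under stated assumptions: entries are modeled as dicts with namespace and query_hash keys, migrate_one is a hypothetical callable that embeds and inserts a single entry, and the seen set is keyed on (namespace, query_hash) to match the key that re-puts use.

```python
from typing import Callable, Iterable, Set, Tuple

Key = Tuple[str, str]  # (namespace, query_hash)


def resume_migration(
    source_entries: Iterable[dict],
    seen: Set[Key],
    migrate_one: Callable[[dict], None],
) -> int:
    """Migrate only entries whose key is not already in `seen`.

    `seen` is updated in place after each success, so it can be persisted
    and reused if this run also fails partway.
    """
    migrated = 0
    for entry in source_entries:
        key = (entry["namespace"], entry["query_hash"])
        if key in seen:
            continue  # already in the destination from an earlier attempt
        migrate_one(entry)  # may raise; `seen` still reflects prior successes
        seen.add(key)
        migrated += 1
    return migrated
```

Skipping already-migrated entries avoids paying the embedder again for work that a plain re-run would simply overwrite.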
Where to go next
- Embedders concept - why fingerprints matter.
- Checkpoints - for backup-without-migration.
- API reference: tools - reembed / areembed signatures.