docs(self-hosting): document local embedding memory behavior #1094
Open
chetanunadkat wants to merge 1 commit into
Each local embedding worker's WASM linear memory only grows (up to ~4 GB) and is reclaimed only when the pool goes fully idle. Add a 'Memory considerations' note so self-hosters can budget peak memory (~POOL_SIZE x 4 GB) and avoid OOM on small/continuously-ingesting hosts.

Refs supermemoryai#1093

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@chetanunadkat I see some conflicts here
What
Adds a Memory considerations subsection to the self-hosting embedding docs explaining that each local embedding worker's WebAssembly heap only grows (up to ~4 GB) and is reclaimed only when the pool goes fully idle.
Why
Self-hosters on small hosts hit unbounded RSS growth (20–50 GB) under continuous ingestion because peak memory scales as roughly `POOL_SIZE × ~4 GB` and the only reclamation path (idle shutdown) never fires while the server is busy. The current docs present `POOL_SIZE` purely as a throughput knob, with no memory-cost guidance. This adds the missing budgeting note and the safe-default recommendation for ≤16 GB hosts.

Documents the user-facing/operational side of #1093. The worker/pool implementation lives in the closed binary, so this PR only covers the docs; the issue proposes the code-level fix (recycle busy workers after N batches / a memory ceiling).
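
For reviewers who want to sanity-check the budgeting rule, here is a minimal sketch of the arithmetic. It assumes the ~4 GB per-worker ceiling stated above; the function names and the `reserved_gb` headroom for the OS and the rest of the server are hypothetical illustrations, not values taken from the docs or the binary.

```python
# Rough peak-memory budgeting for the local embedding pool.
# Assumption: each worker's WASM heap can grow to about 4 GB and is not
# reclaimed while the pool stays busy (per the description above).
WORKER_PEAK_GB = 4.0


def peak_embedding_memory_gb(pool_size: int,
                             worker_peak_gb: float = WORKER_PEAK_GB) -> float:
    """Worst-case memory held by the pool: POOL_SIZE x ~4 GB."""
    if pool_size < 1:
        raise ValueError("pool_size must be at least 1")
    return pool_size * worker_peak_gb


def max_pool_size(host_ram_gb: float,
                  reserved_gb: float = 4.0,
                  worker_peak_gb: float = WORKER_PEAK_GB) -> int:
    """Largest pool size whose worst case fits in the host's RAM.

    reserved_gb is a hypothetical allowance for the OS and other
    processes. The result is clamped to 1, so on very small hosts even
    a single worker may exceed the budget.
    """
    usable_gb = host_ram_gb - reserved_gb
    return max(1, int(usable_gb // worker_peak_gb))


if __name__ == "__main__":
    for ram_gb in (8, 16, 64):
        size = max_pool_size(ram_gb)
        print(f"{ram_gb} GB host: POOL_SIZE <= {size}, "
              f"worst case {peak_embedding_memory_gb(size):.0f} GB")
```

Under these assumptions a 16 GB host fits at most three workers at their worst case, which is consistent with recommending a conservative default for hosts of that size.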
Change
One subsection added to `apps/docs/self-hosting/configuration.mdx`, right after the embedding-performance table. Docs-only, no code changes.