feat: add DeepSeek/Mistral/OpenAI-compatible providers and model CLI #9
Expands the model plugin ecosystem beyond the existing 5 providers so users can plug in a much wider range of LLM backends without writing glue code.
- model-deepseek: DeepSeek-V3.2 / R1 / V4 via OpenAI-compatible API (cheap reasoning, 164K context). LLM-only; pair with a secondary embedding provider.
- model-mistral: Mistral Small 4 (MoE), Large 2.1, Codestral, Pixtral. LLM + 1024-dim `mistral-embed`, vision-capable.
- model-openai-compatible: generic provider for any OpenAI-compatible endpoint (vLLM, LM Studio, Together, Fireworks, Groq, DeepInfra, SiliconFlow, OpenRouter). Supports `extraHeaders` and `disableEmbedding`.
Also wires the three new providers into `PROVIDER_MAP` / `EMBEDDING_DIMENSIONS` in server bootstrap, adds documentation and Gemma 4 Ollama examples, and updates the README Cloud Providers table.
Gemma 3/4 (and any other Ollama model) is already supported via the existing `model-ollama` plugin; it only requires `ollama pull gemma3:27b` and setting `model.llm` in the config.
All 14 new unit tests pass. The existing `packages/cli/tests/commands/ask.test.ts` RAG timeout failure reproduces on unchanged bootstrap code and is unrelated.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
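As an illustration of how an LLM-only OpenAI-compatible endpoint might be paired with a secondary embedding provider, here is a minimal sketch. The interface name `ModelBlockSketch`, the `provider` field, the placeholder model name, and the header are assumptions for illustration; only `llm`, `baseUrl`, `extraHeaders`, `disableEmbedding`, `embedding`, and `embeddingProvider` appear in this PR, and the real schema of opendocuments.config.ts may differ.

```typescript
// Hypothetical shape of the `model:` block. Field names come from this PR's
// description; the actual config schema may differ.
interface ModelBlockSketch {
  provider: string
  llm: string
  baseUrl?: string
  extraHeaders?: Record<string, string>
  disableEmbedding?: boolean
  embeddingProvider?: string
  embedding?: string
}

const model: ModelBlockSketch = {
  provider: 'openai-compatible',
  llm: 'my-served-model', // placeholder: whatever name the endpoint serves
  baseUrl: 'http://localhost:8000/v1', // e.g. a local OpenAI-compatible server
  extraHeaders: { 'X-Example-Header': 'value' }, // illustrative header only
  disableEmbedding: true, // the endpoint is LLM-only
  embeddingProvider: 'ollama', // embeddings come from a secondary provider
  embedding: 'bge-m3',
}
```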
Makes model setup and switching a first-class CLI flow instead of requiring hand-editing opendocuments.config.ts. Also extends init to cover the three new providers added in the previous commit (DeepSeek, Mistral, openai-compatible).
New command: opendocuments model
- model list [--suggestions]: current config + installed Ollama models with per-model disk footprint, plus a curated catalog of local/cloud options.
- model pull <name>: streams /api/pull progress, checks Ollama reachability, estimates disk footprint, and refuses to clobber a low-disk machine silently. Offers the official install command when Ollama is missing.
- model rm <name>: deletes a local Ollama model.
- model test [-p prompt]: round-trips a short prompt against the configured LLM and embedder, reports latency/chunks/embedding-dim. Surfaces degraded-mode issues immediately.
- model switch: interactive provider swap. Rewrites only the `model:` block of opendocuments.config.ts (preserves the rest of the config). Supports all 8 providers including openai-compatible with baseUrl.
init improvements
- Cloud menu now includes DeepSeek and Mistral.
- Third backend option: "OpenAI-compatible endpoint" (vLLM / LM Studio / Groq / Together / Fireworks / OpenRouter) with baseUrl prompt.
- API key validation extended to grok, deepseek, mistral (previously openai / anthropic / google only).
- "Secondary embedding provider" flow generalised from anthropic-only to any provider that lacks an embeddings API (deepseek joins anthropic).
- Ollama auto-install: on macOS/Linux, offers to run the official install script and waits for the daemon to come up.
- Pre-pull disk-space check with per-model size estimates (1.5GB headroom).
- Progress line updates in place instead of spamming stdio.inherit.
- Local model recommendations refreshed for April 2026 (Gemma 3 / Qwen 3.5 / Llama 4 / DeepSeek R1 distilled).
Supporting utilities
- packages/cli/src/utils/ollama.ts: shared Ollama client (isRunning, listModels, pullModel w/ progress, deleteModel, disk-space + size estimator, install-command selector).
- 4 unit tests for rewriteModelBlock (cloud swap, openai-compatible with baseUrl, round-trip back to ollama, missing-block error).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
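The pre-pull disk-space check described in this commit could be sketched roughly as follows. This is not the code in `ollama.ts`: the names `HEADROOM_BYTES` and `canPull` are hypothetical, and reading "1.5GB" as 1.5 GiB is an assumption. The available-bytes figure is passed in so the check stays pure; in practice it might come from a filesystem query.

```typescript
// Assumption: "1.5GB headroom" is interpreted as 1.5 GiB.
const GIB = 1024 ** 3
const HEADROOM_BYTES = 1.5 * GIB

// Returns true when the estimated model size plus headroom fits in the
// available space. Both inputs are byte counts.
function canPull(
  estimatedModelBytes: number,
  availableBytes: number,
  headroomBytes: number = HEADROOM_BYTES,
): boolean {
  return availableBytes >= estimatedModelBytes + headroomBytes
}
```

With a 17 GiB estimate, 20 GiB free would pass the check while 18 GiB free would not, because the headroom pushes the requirement to 18.5 GiB.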
Round 2 of model UX improvements. Covers the paper cuts left after the initial `model` command landed.
model command
- pull now accepts multiple names: `opendocuments model pull gemma3:27b bge-m3` runs pulls sequentially, prints per-model + total size estimates, sums disk headroom, and reports which pulls failed at the end.
- install-ollama: standalone command for the common "Ollama missing" case. Runs the official install script on macOS/Linux, waits up to 15s for the daemon to come up, and exits with a clear message if it doesn't. Windows users get a direct download link.
- set-key <provider>: prompts (password mode) or accepts --key inline, then writes `<PROVIDER>_API_KEY=...` to .env. Updates an existing line instead of appending, and warns when .env isn't in .gitignore.
doctor command
- Adds a per-provider diagnostics section for openai, anthropic, google, grok, deepseek, mistral, and openai-compatible. Pings each provider's models endpoint and reports 401/403 distinctly from network errors. Includes the API-key provisioning URL in the "key not set" message.
- Secondary embedding provider (e.g. deepseek + ollama) is also pinged independently.
- Ollama model presence check now only fires for models Ollama is actually responsible for (previously it would fail noisily when the LLM was on a cloud provider and only the embedding was Ollama).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
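The "update an existing line instead of appending" behaviour of set-key can be sketched as a pure string transform. The helper names `envKeyFor` and `upsertEnvLine` are hypothetical, and the mapping of `openai-compatible` to `OPENAI_COMPATIBLE_API_KEY` is an assumption; the commit only states the `<PROVIDER>_API_KEY=...` pattern.

```typescript
// Derives the env variable name from a provider id.
// Assumption: non-alphanumeric characters become underscores.
function envKeyFor(provider: string): string {
  return `${provider.toUpperCase().replace(/[^A-Z0-9]/g, '_')}_API_KEY`
}

// Replaces an existing `KEY=` line in place, or appends one if none exists.
function upsertEnvLine(envText: string, key: string, value: string): string {
  const line = `${key}=${value}`
  const lines = envText === '' ? [] : envText.split('\n')
  const idx = lines.findIndex((l) => l.startsWith(`${key}=`))
  if (idx >= 0) {
    lines[idx] = line
    return lines.join('\n')
  }
  // No existing entry: append, making sure the previous line is terminated.
  const base = envText === '' || envText.endsWith('\n') ? envText : envText + '\n'
  return base + line + '\n'
}
```

Keeping the transform separate from file I/O makes the replace-versus-append cases easy to unit test without touching a real .env file.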
Lists the full set of opendocuments model subcommands added in the last two commits (install-ollama, batch pull, set-key, test, switch) alongside the existing admin commands. Also clarifies that `opendocuments doctor` now does per-provider API pings. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
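One way to report 401/403 distinctly from other failures in the per-provider ping is to classify the HTTP status separately from thrown network errors. The sketch below is an assumption about how that might look, not the code in `doctor`; the type `ProbeResult` and the function `classifyStatus` are hypothetical names.

```typescript
// Hypothetical result type for a provider ping.
type ProbeResult =
  | { kind: 'ok' }
  | { kind: 'auth'; status: number } // key missing, invalid, or forbidden
  | { kind: 'http'; status: number } // reachable, but an unexpected status
  | { kind: 'network'; message: string } // request never completed

// Maps an HTTP status code to a diagnostic category.
function classifyStatus(status: number): ProbeResult {
  if (status === 401 || status === 403) return { kind: 'auth', status }
  if (status >= 200 && status < 300) return { kind: 'ok' }
  return { kind: 'http', status }
}
```

A caller would produce the `network` variant in its `catch` branch when the request itself throws, so authentication problems and connectivity problems never share a message.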
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c4c71ad3de
log.info('Using local Ollama BGE-M3 as secondary embedding provider.')
embeddingProvider = 'ollama'
embeddingModel = 'bge-m3'
Set disableEmbedding when choosing Ollama embed fallback
When users answer “no” to Does this endpoint provide embeddings?, this branch only sets embeddingProvider = 'ollama', but the primary provider remains openai-compatible with embedding still enabled. In bootstrap.loadModelPlugin, the main plugin is probed for embeddings before secondary providers are loaded, so LLM-only endpoints (e.g. Groq) fail that probe and fall back to stub models instead of using Ollama embeddings. This makes the default no-embedding path degrade at runtime.
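A minimal sketch of the suggested fix, assuming the wizard collects its answers into a small object before writing the config. The names `EndpointAnswers`, `ModelSettings`, and `buildOpenAICompatibleModel` are hypothetical; only `disableEmbedding`, `embeddingProvider`, and the `bge-m3` fallback come from the PR.

```typescript
// Hypothetical answers gathered by the init wizard.
interface EndpointAnswers {
  baseUrl: string
  llm: string
  providesEmbeddings: boolean
}

// Hypothetical subset of the model settings written to the config.
interface ModelSettings {
  provider: 'openai-compatible'
  baseUrl: string
  llm: string
  disableEmbedding: boolean
  embeddingProvider?: string
  embedding?: string
}

function buildOpenAICompatibleModel(a: EndpointAnswers): ModelSettings {
  const base = { provider: 'openai-compatible' as const, baseUrl: a.baseUrl, llm: a.llm }
  if (a.providesEmbeddings) return { ...base, disableEmbedding: false }
  // LLM-only endpoint: disable embeddings on the primary plugin so the
  // bootstrap probe does not fail before the Ollama fallback is loaded.
  return { ...base, disableEmbedding: true, embeddingProvider: 'ollama', embedding: 'bge-m3' }
}
```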
 * Preserves surrounding config; throws if the model block can't be located.
 */
export function rewriteModelBlock(src: string, update: ModelBlockUpdate): string {
  const re = /model:\s*\{[\s\S]*?\n\s*\},/m
Parse model block safely before rewriting config
This regex stops at the first \n\s*}, inside model: { ... }, so a nested object such as extraHeaders: { ... }, causes partial replacement and can leave a syntactically broken config after model switch. Since the new openai-compatible provider documents nested model options, this string-based matcher should be replaced with brace-aware parsing (or at least detect nested objects and abort).
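The brace-aware alternative could be sketched as a depth counter that starts at the `model: {` opener and skips over string literals. This is an illustrative sketch, not a drop-in replacement: `findModelBlock` is a hypothetical name, and the scanner deliberately ignores comments and template-literal interpolation, which a full parser would need to handle.

```typescript
// Returns the [start, end) range of `model: { ... }` in the source, or null
// if no balanced block is found. Tracks brace depth and skips quoted strings.
function findModelBlock(src: string): { start: number; end: number } | null {
  const m = /\bmodel\s*:\s*\{/.exec(src)
  if (!m) return null
  const open = m.index + m[0].length - 1 // index of the opening '{'
  let depth = 0
  let quote: string | null = null
  for (let i = open; i < src.length; i++) {
    const ch = src[i]
    if (quote !== null) {
      if (ch === '\\') i++ // skip the escaped character
      else if (ch === quote) quote = null
      continue
    }
    if (ch === '"' || ch === "'" || ch === '`') {
      quote = ch
      continue
    }
    if (ch === '{') depth++
    else if (ch === '}') {
      depth--
      if (depth === 0) return { start: m.index, end: i + 1 }
    }
  }
  return null // unbalanced braces
}
```

Because the range ends at the matching closing brace rather than the first `},`, a nested `extraHeaders: { ... },` stays inside the block that gets replaced.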
grok: { llm: 'grok-4', embedding: 'grok-2-embed' },
deepseek: { llm: 'deepseek-chat', embedding: 'bge-m3', embeddingProvider: 'ollama' },
mistral: { llm: 'mistral-small-latest', embedding: 'mistral-embed' },
'openai-compatible': { llm: '', embedding: 'bge-m3', embeddingProvider: 'ollama' },
Require a non-empty LLM default for openai-compatible switch
The openai-compatible default uses an empty llm, so pressing Enter on the prompt can produce an empty value; rewriteModelBlock then omits the llm line entirely (update.llm ? ... : null). The config loader will backfill its generic default model, which is usually invalid for OpenAI-compatible endpoints and leads to generation failures after switching providers.
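One possible guard, sketched under the assumption that the prompt returns a raw string and the provider table supplies a default: reject the switch when both are empty instead of silently omitting the `llm` line. The helper name `resolveLlm` is hypothetical.

```typescript
// Picks the user's input, falling back to the provider default, and fails
// loudly when neither yields a usable model name.
function resolveLlm(input: string, providerDefault: string): string {
  const value = input.trim() || providerDefault.trim()
  if (value === '') {
    throw new Error('An LLM model name is required for this provider.')
  }
  return value
}
```

For `openai-compatible`, where the default is an empty string, pressing Enter would then raise an error (or trigger a re-prompt in the caller) rather than producing a config without `llm`.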
Summary
Expands OpenDocuments's model ecosystem from 5 to 8 providers and introduces a first-class CLI for model management so users no longer need to hand-edit `opendocuments.config.ts` or `.env`.
New model plugins (3)
- `model-deepseek`
- `model-mistral`: `mistral-embed` (1024-dim)
- `model-openai-compatible`: `extraHeaders` and `disableEmbedding`
Gemma 3/4 and other Ollama models continue to work through the existing `model-ollama` plugin (`ollama pull gemma3:27b` + `model.llm` in config).
New CLI: `opendocuments model` (7 subcommands)
- `model list [--suggestions]`: current config + installed Ollama models (size, params, quantization) + curated catalog of local/cloud options
- `model pull <a> <b> <c>`: batch pull with per-model size estimates, summed disk-headroom check, in-place progress, and a failure summary at the end
- `model install-ollama`: runs the official Ollama install script on macOS/Linux and waits for the daemon (noop if already running; clear fallback for Windows)
- `model set-key <provider>`: password-masked prompt (or `--key` inline), updates existing `.env` line instead of appending, warns if `.env` isn't in `.gitignore`
- `model test`: round-trips the configured LLM + embedder, reports latency / chunks / embedding dimension
- `model switch`: interactive provider swap that rewrites only the `model:` block of the config (preserves everything else)
- `model rm <name>`: delete an Ollama model
`init` wizard
- `baseUrl` prompt (vLLM / LM Studio / Groq / Together / Fireworks / OpenRouter)
`doctor` diagnostics
Shared utility
- `packages/cli/src/utils/ollama.ts`: `isOllamaRunning`, `listOllamaModels`, `pullOllamaModel` with streaming progress, `deleteOllamaModel`, `getAvailableDiskBytes` (via `statfsSync`), `estimateModelSize` (curated hint table), `getOllamaInstallCommand` (platform-aware).
Server bootstrap
- `PROVIDER_MAP` and `EMBEDDING_DIMENSIONS` extended to cover the three new providers. No breaking changes to existing config.
Test plan
- `npm run build`: 30/30 workspaces build clean
- `npm run typecheck`: all packages pass
- `rewriteModelBlock` helper tests: 4/4 (cloud swap, openai-compatible + baseUrl, round-trip to ollama, missing-block error)
- `opendocuments model --help`, `model list --suggestions` against real Ollama install
- `ask.test.ts` RAG timeout failure reproduces on unchanged bootstrap (tracked separately, not caused by this PR)
- `init` new-provider flow (manual)
- `doctor` ping against at least one configured cloud provider (manual)
Out of scope / follow-ups
- `ollama pull` (we stream the API; it's close but not identical)