feat: add DeepSeek/Mistral/OpenAI-compatible providers and model CLI by joungminsung · Pull Request #9 · joungminsung/OpenDocuments

joungminsung · 2026-04-20T05:45:08Z

Summary

Expands OpenDocuments's model ecosystem from 5 to 8 providers and introduces a first-class CLI for model management so users no longer need to hand-edit opendocuments.config.ts or .env.

New model plugins (3)

| Plugin | Models | Embedding | Notes |
| --- | --- | --- | --- |
| `model-deepseek` | DeepSeek-V3.2 / R1 / V4 (upcoming) | None | OpenAI-compatible API, 164K context, cheap reasoning |
| `model-mistral` | Small 4 (MoE) / Large 2.1 / Codestral / Pixtral | `mistral-embed` (1024-dim) | Chat + embeddings + vision |
| `model-openai-compatible` | Any OpenAI-compatible endpoint | Optional | vLLM / LM Studio / Together / Fireworks / Groq / DeepInfra / SiliconFlow / OpenRouter; supports `extraHeaders` and `disableEmbedding` |

Gemma 3/4 and other Ollama models continue to work through the existing `model-ollama` plugin (run `ollama pull gemma3:27b` and set `model.llm` in the config).
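For orientation, here is a rough sketch of what a `model:` block for the generic provider might look like. The field names `llm`, `embedding`, `embeddingProvider`, `baseUrl`, `extraHeaders`, and `disableEmbedding` appear in this PR; the `provider` key, the `ModelBlockSketch` type, and every value below are assumptions made for illustration, so check the plugin docs for the real schema.

```typescript
// Hypothetical shape of the `model:` block; NOT the project's actual type.
interface ModelBlockSketch {
  provider: string;                      // assumed key name
  baseUrl?: string;
  llm: string;
  embedding: string;
  embeddingProvider?: string;
  extraHeaders?: Record<string, string>;
  disableEmbedding?: boolean;
}

// Example: an LLM served by OpenRouter, embeddings served by local Ollama.
// The model name and header values are placeholders.
const modelBlock: ModelBlockSketch = {
  provider: 'openai-compatible',
  baseUrl: 'https://openrouter.ai/api/v1',
  llm: 'meta-llama/llama-3.1-8b-instruct',
  // Optional attribution headers some gateways accept.
  extraHeaders: { 'HTTP-Referer': 'https://example.com', 'X-Title': 'OpenDocuments' },
  // The endpoint is used for chat only, so its embedding side is switched off
  // and a secondary provider handles embeddings.
  disableEmbedding: true,
  embeddingProvider: 'ollama',
  embedding: 'bge-m3',
};
```

In a real `opendocuments.config.ts` this object would sit under the `model` key of the exported config.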

New CLI: `opendocuments model` (7 subcommands)

model list [--suggestions] — current config + installed Ollama models (size, params, quantization) + curated catalog of local/cloud options
model pull <a> <b> <c> — batch pull with per-model size estimates, summed disk-headroom check, in-place progress, and a failure summary at the end
model install-ollama — runs the official Ollama install script on macOS/Linux and waits for the daemon (noop if already running; clear fallback for Windows)
model set-key <provider> — password-masked prompt (or --key inline), updates existing .env line instead of appending, warns if .env isn't in .gitignore
model test — round-trips the configured LLM + embedder, reports latency / chunks / embedding dimension
model switch — interactive provider swap that rewrites only the model: block of the config (preserves everything else)
model rm <name> — delete an Ollama model
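The "updates existing .env line instead of appending" behavior of `model set-key` can be sketched as a pure string transform. This is a minimal illustration, not the PR's implementation: `upsertEnvLine` and `envKeyFor` are hypothetical names, and the way `envKeyFor` maps hyphens to underscores is an assumption (the PR only states the `<PROVIDER>_API_KEY` pattern).

```typescript
// Hypothetical helper: derive the env var name from a provider id.
// Assumption: non-alphanumeric characters become underscores.
function envKeyFor(provider: string): string {
  return provider.toUpperCase().replace(/[^A-Z0-9]/g, '_') + '_API_KEY';
}

// Hypothetical helper: replace the first `KEY=...` line if present,
// otherwise append one. Always leaves a single trailing newline on append.
function upsertEnvLine(envText: string, key: string, value: string): string {
  const lines = envText === '' ? [] : envText.split('\n');
  const entry = `${key}=${value}`;
  let replaced = false;
  const next = lines.map((line) => {
    if (!replaced && line.startsWith(`${key}=`)) {
      replaced = true;
      return entry;
    }
    return line;
  });
  if (!replaced) {
    // Drop the empty element produced by a final newline before appending.
    if (next.length > 0 && next[next.length - 1] === '') next.pop();
    next.push(entry, '');
  }
  return next.join('\n');
}
```

For example, `upsertEnvLine(text, envKeyFor('deepseek'), 'sk-...')` rewrites an existing `DEEPSEEK_API_KEY=` line in place and leaves every other line untouched.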

`init` wizard

Cloud menu adds DeepSeek and Mistral
Third backend option: OpenAI-compatible endpoint with baseUrl prompt (vLLM / LM Studio / Groq / Together / Fireworks / OpenRouter)
API-key validation extended to grok, deepseek, mistral (previously openai / anthropic / google only)
Secondary embedding provider flow generalized from anthropic-only to any provider without embeddings (deepseek joins anthropic)
Ollama auto-install: offers to run the official install script and waits up to 15s for daemon startup
Pre-pull disk-space check with per-model size estimates and 1.5GB headroom
Local model recommendations refreshed for April 2026 (Gemma 3 / Qwen 3.5 / Llama 4 / DeepSeek R1 distilled)
Progress updates render in place instead of spamming stdio
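The pre-pull disk check described above amounts to comparing free space against the summed size estimates plus a fixed headroom. A minimal sketch, assuming "1.5GB" means 1.5 GiB and using a hypothetical `checkDiskHeadroom` helper (the PR's actual function names and units may differ):

```typescript
const GIB = 1024 * 1024 * 1024;

interface HeadroomResult {
  ok: boolean;
  requiredBytes: number;
  shortfallBytes: number;
}

// Hypothetical helper: sum the per-model estimates, add headroom,
// and report whether the available space covers the total.
function checkDiskHeadroom(
  availableBytes: number,
  estimatedSizes: number[],
  headroomBytes: number = 1.5 * GIB,
): HeadroomResult {
  const requiredBytes =
    estimatedSizes.reduce((sum, size) => sum + size, 0) + headroomBytes;
  return {
    ok: availableBytes >= requiredBytes,
    requiredBytes,
    shortfallBytes: Math.max(0, requiredBytes - availableBytes),
  };
}
```

Summing the estimates before checking matters for batch pulls: each model might fit on its own while the batch as a whole does not.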

`doctor` diagnostics

Per-provider API ping section for openai, anthropic, google, grok, deepseek, mistral, openai-compatible
Distinguishes 401/403 (bad key) from network errors and missing env vars, and links to each provider's key-provisioning URL
Secondary embedding provider is pinged independently
Ollama model-presence check now only fires for models Ollama actually owns (avoids spurious failures when LLM is cloud + embedder is Ollama)
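The three-way distinction the ping section makes (key not set, key rejected, network failure) can be sketched as a small classifier. The `PingInput` shape and the `diagnosePing` name are hypothetical; only the 401/403 versus network-error versus missing-env-var split comes from this PR.

```typescript
type PingDiagnosis = 'ok' | 'key-not-set' | 'bad-key' | 'network-error' | 'http-error';

// Hypothetical input: what a doctor check might know after one ping attempt.
interface PingInput {
  apiKey?: string;        // value read from the environment, if any
  status?: number;        // HTTP status when a response arrived
  networkError?: string;  // error message when the request never completed
}

function diagnosePing(input: PingInput): PingDiagnosis {
  // No key: skip the request entirely and point the user at key provisioning.
  if (!input.apiKey || input.apiKey.trim() === '') return 'key-not-set';
  // The request failed before any HTTP response (DNS, TLS, timeout, ...).
  if (input.networkError !== undefined) return 'network-error';
  // 401/403 mean the provider was reachable but rejected the credentials.
  if (input.status === 401 || input.status === 403) return 'bad-key';
  if (input.status !== undefined && input.status >= 200 && input.status < 300) return 'ok';
  return 'http-error';
}
```

Checking for the missing key first keeps the "key not set" message from being masked by a failed request.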

Shared utility

packages/cli/src/utils/ollama.ts — isOllamaRunning, listOllamaModels, pullOllamaModel with streaming progress, deleteOllamaModel, getAvailableDiskBytes (via statfsSync), estimateModelSize (curated hint table), getOllamaInstallCommand (platform-aware).
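To illustrate the streaming-progress part: Ollama's `/api/pull` endpoint streams newline-delimited JSON objects carrying a `status` and, during layer downloads, `total` and `completed` byte counts. A rough sketch of turning one such line into a progress label follows; `formatPullProgress` is a hypothetical name and not necessarily how `pullOllamaModel` is written.

```typescript
interface PullEvent {
  status: string;
  total?: number;
  completed?: number;
}

// Hypothetical helper: convert one NDJSON line from the pull stream into a
// short label. Returns null for blank or unparsable lines.
function formatPullProgress(line: string): string | null {
  const trimmed = line.trim();
  if (trimmed === '') return null;
  let event: PullEvent;
  try {
    event = JSON.parse(trimmed) as PullEvent;
  } catch {
    return null;
  }
  if (typeof event.status !== 'string') return null;
  if (
    typeof event.total === 'number' &&
    event.total > 0 &&
    typeof event.completed === 'number'
  ) {
    const percent = Math.floor((event.completed / event.total) * 100);
    return `${event.status} ${percent}%`;
  }
  return event.status;
}
```

A caller could write each non-null label to the terminal prefixed with a carriage return (`\r`) so the line updates in place rather than scrolling.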

Server bootstrap

PROVIDER_MAP and EMBEDDING_DIMENSIONS extended to cover the three new providers. No breaking changes to existing config.

Test plan

npm run build — 30/30 workspaces build clean
npm run typecheck — all packages pass
New plugin tests: 14/14 (deepseek 5 + mistral 4 + openai-compatible 5)
rewriteModelBlock helper tests: 4/4 (cloud swap, openai-compatible + baseUrl, round-trip to ollama, missing-block error)
Smoke-tested opendocuments model --help, model list --suggestions against real Ollama install
Pre-existing CLI ask.test.ts RAG timeout failure reproduces on unchanged bootstrap (tracked separately, not caused by this PR)
Interactive test of init new-provider flow (manual)
doctor ping against at least one configured cloud provider (manual)

Out of scope / follow-ups

Web UI model manager (CLI only for now)
Ollama-native progress parity with ollama pull (we stream the API; it's close but not identical)
Changeset entries for the three new plugin packages (add when cutting a release)

Expands the model plugin ecosystem beyond the existing 5 providers so users can plug in a much wider range of LLM backends without writing glue code.

- model-deepseek: DeepSeek-V3.2 / R1 / V4 via OpenAI-compatible API (cheap reasoning, 164K context). LLM-only; pair with a secondary embedding provider.
- model-mistral: Mistral Small 4 (MoE), Large 2.1, Codestral, Pixtral. LLM + 1024-dim `mistral-embed`, vision-capable.
- model-openai-compatible: generic provider for any OpenAI-compatible endpoint (vLLM, LM Studio, Together, Fireworks, Groq, DeepInfra, SiliconFlow, OpenRouter). Supports `extraHeaders` and `disableEmbedding`.

Also wires the three new providers into `PROVIDER_MAP` / `EMBEDDING_DIMENSIONS` in server bootstrap, adds documentation and Gemma 4 Ollama examples, and updates the README Cloud Providers table.

Gemma 3/4 (and any other Ollama model) is already supported via the existing `model-ollama` plugin; it only requires `ollama pull gemma3:27b` and setting `model.llm` in the config.

All 14 new unit tests pass. The existing `packages/cli/tests/commands/ask.test.ts` RAG timeout failure reproduces on unchanged bootstrap code and is unrelated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Makes model setup and switching a first-class CLI flow instead of requiring hand-editing opendocuments.config.ts. Also extends init to cover the three new providers added in the previous commit (DeepSeek, Mistral, openai-compatible).

New command: opendocuments model

- model list [--suggestions]: current config + installed Ollama models with per-model disk footprint, plus a curated catalog of local/cloud options.
- model pull <name>: streams /api/pull progress, checks Ollama reachability, estimates disk footprint, and refuses to clobber a low-disk machine silently. Offers the official install command when Ollama is missing.
- model rm <name>: deletes a local Ollama model.
- model test [-p prompt]: round-trips a short prompt against the configured LLM and embedder, reports latency/chunks/embedding-dim. Surfaces degraded-mode issues immediately.
- model switch: interactive provider swap. Rewrites only the `model:` block of opendocuments.config.ts (preserves the rest of the config). Supports all 8 providers including openai-compatible with baseUrl.

init improvements

- Cloud menu now includes DeepSeek and Mistral.
- Third backend option: "OpenAI-compatible endpoint" (vLLM / LM Studio / Groq / Together / Fireworks / OpenRouter) with baseUrl prompt.
- API key validation extended to grok, deepseek, mistral (previously openai / anthropic / google only).
- "Secondary embedding provider" flow generalised from anthropic-only to any provider that lacks an embeddings API (deepseek joins anthropic).
- Ollama auto-install: on macOS/Linux, offers to run the official install script and waits for the daemon to come up.
- Pre-pull disk-space check with per-model size estimates (1.5GB headroom).
- Progress line updates in place instead of spamming stdio.inherit.
- Local model recommendations refreshed for April 2026 (Gemma 3 / Qwen 3.5 / Llama 4 / DeepSeek R1 distilled).

Supporting utilities

- packages/cli/src/utils/ollama.ts: shared Ollama client (isRunning, listModels, pullModel w/ progress, deleteModel, disk-space + size estimator, install-command selector).
- 4 unit tests for rewriteModelBlock (cloud swap, openai-compatible with baseUrl, round-trip back to ollama, missing-block error).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Round 2 of model UX improvements. Covers the paper cuts left after the initial `model` command landed.

model command

- pull now accepts multiple names: `opendocuments model pull gemma3:27b bge-m3` runs pulls sequentially, prints per-model + total size estimates, sums disk headroom, and reports which pulls failed at the end.
- install-ollama: standalone command for the common "Ollama missing" case. Runs the official install script on macOS/Linux, waits up to 15s for the daemon to come up, and exits with a clear message if it doesn't. Windows users get a direct download link.
- set-key <provider>: prompts (password mode) or accepts --key inline, then writes `<PROVIDER>_API_KEY=...` to .env. Updates an existing line instead of appending, and warns when .env isn't in .gitignore.

doctor command

- Adds a per-provider diagnostics section for openai, anthropic, google, grok, deepseek, mistral, and openai-compatible. Pings each provider's models endpoint and reports 401/403 distinctly from network errors. Includes the API-key provisioning URL in the "key not set" message.
- Secondary embedding provider (e.g. deepseek + ollama) is also pinged independently.
- Ollama model presence check now only fires for models Ollama is actually responsible for (previously it would fail noisily when the LLM was on a cloud provider and only the embedding was Ollama).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Lists the full set of opendocuments model subcommands added in the last two commits (install-ollama, batch pull, set-key, test, switch) alongside the existing admin commands. Also clarifies that `opendocuments doctor` now does per-provider API pings.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c4c71ad3de


chatgpt-codex-connector · 2026-04-20T05:50:33Z

+          log.info('Using local Ollama BGE-M3 as secondary embedding provider.')
+          embeddingProvider = 'ollama'
+          embeddingModel = 'bge-m3'


Set disableEmbedding when choosing Ollama embed fallback

When users answer “no” to Does this endpoint provide embeddings?, this branch only sets embeddingProvider = 'ollama', but the primary provider remains openai-compatible with embedding still enabled. In bootstrap.loadModelPlugin, the main plugin is probed for embeddings before secondary providers are loaded, so LLM-only endpoints (e.g. Groq) fail that probe and fall back to stub models instead of using Ollama embeddings. This makes the default no-embedding path degrade at runtime.
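One way to read this suggestion, sketched under assumptions: when the user says the endpoint has no embeddings, the wizard should both route embeddings to Ollama and mark the primary plugin's embedding side as disabled. The `EndpointAnswers` and `WizardModelBlock` types and the `buildOpenAICompatibleBlock` function are hypothetical, as is the `provider` key; only `disableEmbedding`, `embeddingProvider`, and the `bge-m3` fallback come from the PR.

```typescript
interface EndpointAnswers {
  baseUrl: string;
  llm: string;
  endpointHasEmbeddings: boolean;
  embeddingModel?: string; // required when endpointHasEmbeddings is true
}

interface WizardModelBlock {
  provider: 'openai-compatible';
  baseUrl: string;
  llm: string;
  embedding: string;
  embeddingProvider?: string;
  disableEmbedding?: boolean;
}

// Hypothetical wizard step: translate the answers into a model block.
function buildOpenAICompatibleBlock(a: EndpointAnswers): WizardModelBlock {
  const base = { provider: 'openai-compatible' as const, baseUrl: a.baseUrl, llm: a.llm };
  if (a.endpointHasEmbeddings) {
    if (!a.embeddingModel) throw new Error('embedding model name is required');
    return { ...base, embedding: a.embeddingModel };
  }
  // LLM-only endpoint: disable embeddings on the primary plugin so the
  // embedding probe is skipped, and hand embeddings to local Ollama.
  return { ...base, disableEmbedding: true, embeddingProvider: 'ollama', embedding: 'bge-m3' };
}
```

Whether the bootstrap then skips the embedding probe depends on how `disableEmbedding` is honored in `bootstrap.loadModelPlugin`, which this sketch does not cover.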


chatgpt-codex-connector · 2026-04-20T05:50:33Z

+ * Preserves surrounding config — throws if the model block can't be located.
+ */
+export function rewriteModelBlock(src: string, update: ModelBlockUpdate): string {
+  const re = /model:\s*\{[\s\S]*?\n\s*\},/m


Parse model block safely before rewriting config

This regex stops at the first \n\s*}, inside model: { ... }, so a nested object such as extraHeaders: { ... }, causes partial replacement and can leave a syntactically broken config after model switch. Since the new openai-compatible provider documents nested model options, this string-based matcher should be replaced with brace-aware parsing (or at least detect nested objects and abort).
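A brace-aware alternative could look roughly like the following. This is a sketch of the general technique (depth counting with string skipping), not the fix that was merged; `findModelBlock` is a hypothetical name. It does not handle comments or `${...}` inside template literals, so a real implementation would need more care or a proper parser.

```typescript
// Hypothetical helper: locate the span of `model: { ... }` in config source,
// counting nested braces and ignoring braces inside string literals.
function findModelBlock(src: string): { start: number; end: number } {
  const head = /\bmodel:\s*\{/.exec(src);
  if (!head) throw new Error('model block not found');
  const open = head.index + head[0].length - 1; // index of the opening '{'
  let depth = 0;
  let quote = '';
  for (let i = open; i < src.length; i++) {
    const ch = src.charAt(i);
    if (quote !== '') {
      if (ch === '\\') i++;             // skip the escaped character
      else if (ch === quote) quote = ''; // closing quote
      continue;
    }
    if (ch === "'" || ch === '"' || ch === '`') {
      quote = ch;
    } else if (ch === '{') {
      depth++;
    } else if (ch === '}') {
      depth--;
      if (depth === 0) return { start: head.index, end: i + 1 };
    }
  }
  throw new Error('unbalanced braces in model block');
}
```

The returned range covers `model: { ... }` up to and including its closing brace, so a rewriter can splice in a new block with `src.slice(0, start) + replacement + src.slice(end)`.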


chatgpt-codex-connector · 2026-04-20T05:50:33Z

+        grok: { llm: 'grok-4', embedding: 'grok-2-embed' },
+        deepseek: { llm: 'deepseek-chat', embedding: 'bge-m3', embeddingProvider: 'ollama' },
+        mistral: { llm: 'mistral-small-latest', embedding: 'mistral-embed' },
+        'openai-compatible': { llm: '', embedding: 'bge-m3', embeddingProvider: 'ollama' },


Require a non-empty LLM default for openai-compatible switch

The openai-compatible default uses an empty llm, so pressing Enter on the prompt can produce an empty value; rewriteModelBlock then omits the llm line entirely (update.llm ? ... : null). The config loader will backfill its generic default model, which is usually invalid for OpenAI-compatible endpoints and leads to generation failures after switching providers.
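A guard along these lines would address the empty-value case. It is a minimal sketch with a hypothetical `resolveLlmName` helper; how the actual prompt library reports validation errors is not shown in this PR.

```typescript
// Hypothetical helper: pick the model name from the prompt answer, falling
// back to the provider default, and refuse to continue if both are empty.
function resolveLlmName(answer: string, providerDefault: string): string {
  const chosen = answer.trim() !== '' ? answer.trim() : providerDefault.trim();
  if (chosen === '') {
    throw new Error('A model name is required for this provider.');
  }
  return chosen;
}
```

With the `openai-compatible` default of `''`, pressing Enter would then raise an error instead of silently dropping the `llm` line from the rewritten block.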


joungminsung and others added 4 commits April 20, 2026 12:50

chatgpt-codex-connector Bot reviewed Apr 20, 2026


joungminsung added 2 commits May 21, 2026 12:46

merge main into model providers branch

e0724ed

fix: rebase model providers onto current core

81544a5

joungminsung merged commit 68b7404 into main May 21, 2026
2 checks passed

joungminsung merged 6 commits into main from feat/add-more-models
