Codex can connect to an OpenAI Responses-compatible gateway by configuring a
custom model provider. LMDeploy exposes POST /v1/responses, so Codex can route
requests to a local or remote LMDeploy api_server.
The request path is:
Codex -> http://<server>:<port>/v1/responses -> LMDeploy api_server
Launch an LMDeploy API server with the model you want Codex to use:
lmdeploy serve api_server Qwen/Qwen3.5-35B-A3B --backend pytorch --server-port 23333
For tool calling, launch the server with a tool parser supported by your model:
lmdeploy serve api_server <model> \
--server-port 23333 \
--tool-call-parser <parser-name>
Before configuring Codex, check that LMDeploy responds to Responses-style requests:
curl http://127.0.0.1:23333/v1/responses \
-H "content-type: application/json" \
-d '{
"model": "Qwen/Qwen3.5-35B-A3B",
"input": "Say hello from LMDeploy",
"max_output_tokens": 32
}'
The model value must match a model name exposed by your LMDeploy server. You can
check the model list with:
curl http://127.0.0.1:23333/v1/models
Add the LMDeploy provider configuration to ~/.codex/config.toml:
model = "Qwen/Qwen3.5-35B-A3B"
model_provider = "lmdeploy"
[model_providers.lmdeploy]
name = "LMDeploy"
base_url = "http://127.0.0.1:23333/v1"
env_key = "LMDEPLOY_API_KEY"
wire_api = "responses"
requires_openai_auth = false
stream_idle_timeout_ms = 300000
Set base_url to the /v1 root, not to /v1/responses. Codex appends
/responses itself when wire_api = "responses".
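The base_url rule above can be illustrated with a small Python sketch. It only mirrors the behavior described here (Codex appending /responses to base_url) and is not Codex's actual implementation; the helper names responses_url and base_url_problems are hypothetical.

```python
def responses_url(base_url: str) -> str:
    """Return the URL that results from appending /responses to base_url."""
    # Drop any trailing slash first so the result never contains "//responses".
    return base_url.rstrip("/") + "/responses"


def base_url_problems(base_url: str) -> list:
    """Return human-readable problems with a base_url value (empty if none)."""
    problems = []
    trimmed = base_url.rstrip("/")
    if trimmed.endswith("/responses"):
        # With this value the final request would go to /v1/responses/responses.
        problems.append("base_url should be the /v1 root, not /v1/responses")
    elif not trimmed.endswith("/v1"):
        problems.append("base_url is expected to end with /v1")
    return problems


if __name__ == "__main__":
    for candidate in (
        "http://127.0.0.1:23333/v1",
        "http://127.0.0.1:23333/v1/responses",
    ):
        print(candidate, "->", responses_url(candidate), base_url_problems(candidate))
```

Running the script prints the effective request URL for a correct and an incorrect base_url, which makes the doubled /responses suffix easy to spot.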
Set LMDEPLOY_API_KEY before starting Codex. If the LMDeploy server was not
launched with --api-keys, any non-empty value is sufficient. Otherwise, use the
matching key instead of dummy:
export LMDEPLOY_API_KEY=dummy
Start Codex:
codex
Or run a single non-interactive request:
codex exec "Say hello from LMDeploy"
Codex commonly uses tool calls for shell commands and file inspection. LMDeploy's
/v1/responses endpoint supports function-call output when the API server is
launched with --tool-call-parser.
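To make the function-call flow more concrete, here is a minimal Python sketch of a Responses-style request that offers one function tool, plus a helper that reads function_call items from a reply. The field layout follows the OpenAI Responses API; that LMDeploy's output matches it item for item is an assumption, and the tool name run_shell and both helper names are hypothetical.

```python
import json


def build_tool_request(model: str, prompt: str) -> dict:
    """Build a Responses-style request body that offers a single function tool."""
    return {
        "model": model,
        "input": prompt,
        "tools": [
            {
                "type": "function",
                "name": "run_shell",  # hypothetical tool name, for illustration only
                "description": "Run a shell command and return its output.",
                "parameters": {
                    "type": "object",
                    "properties": {"command": {"type": "string"}},
                    "required": ["command"],
                },
            }
        ],
    }


def extract_function_calls(response: dict) -> list:
    """Collect (name, parsed_arguments) pairs from function_call output items."""
    calls = []
    for item in response.get("output", []):
        if item.get("type") == "function_call":
            # In the Responses format, arguments arrive as a JSON-encoded string.
            calls.append((item["name"], json.loads(item["arguments"])))
    return calls
```

If a reply to such a request contains no function_call items, one thing worth checking is whether the server was started with a --tool-call-parser that suits the model.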
Codex uses the OpenAI Responses-compatible /v1/responses endpoint. The
Anthropic-compatible /v1/messages endpoint is for Anthropic-style clients such
as Claude Code.
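As a preflight step, the model-name check described earlier (the model value must match a name exposed by /v1/models) can be scripted. The sketch below assumes the model list uses the OpenAI-style shape {"data": [{"id": ...}]} and that the API key is sent as a Bearer token; both are assumptions about the server, and the function names are hypothetical.

```python
import json
import urllib.request


def fetch_models(base_url: str, api_key: str = "dummy") -> dict:
    """GET <base_url>/models and return the decoded JSON body."""
    request = urllib.request.Request(
        base_url.rstrip("/") + "/models",
        # Assumption: the server accepts the key as a Bearer token.
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=10) as reply:
        return json.load(reply)


def model_ids(models_response: dict) -> list:
    """Extract model names from an OpenAI-style model list."""
    return [entry["id"] for entry in models_response.get("data", [])]


def is_served(model: str, models_response: dict) -> bool:
    """Return True if the configured model name is exposed by the server."""
    return model in model_ids(models_response)
```

For example, is_served("Qwen/Qwen3.5-35B-A3B", fetch_models("http://127.0.0.1:23333/v1")) would be expected to return True once the server from this guide is running.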