fix(verl): preserve multi-turn tool-call prefix extension for math tool agent for Qwen 3 models by JasonWei05 · Pull Request #516 · rllm-org/rllm

JasonWei05 · 2026-04-29T19:07:13Z

Summary

This PR adds rllm.multi_turn_extension support for Qwen 3 multi-turn tool-calling rollouts through the verl gateway path. When enabled, the gateway renders chat prompts with rLLM's chat template parser and forwards raw text to vLLM's /v1/completions, avoiding vLLM chat-template canonicalization that breaks prefix-extension checks. Qwen 3 and Deepseek-Qwen parsers now preserve model-emitted assistant whitespace around tool calls so replayed turns remain byte-identical. This is not an issue for Qwen 3.5 and Qwen 3 coder models, which use qwen3_coder and qwen3_xml parsers. This is an issue with the hermes parser.

Type of change

What changed

Added parser-backed gateway transport for multi_turn_extension=true, converting chat-completions requests to raw-text completions while preserving trace token ids and logprobs.
Threaded multi_turn_extension, tokenizer name, thinking mode, and accumulated reasoning config through the gateway and verl rollout setup.
Updated Qwen and Deepseek-Qwen parser rendering/parsing so tool-call turns can preserve model-emitted whitespace.
Made cookbooks/math_tool_agent/train_verl.sh default to MULTI_TURN_EXTENSION=true with legacy vLLM tool parsing only when disabled.
Added focused parser and gateway tests for raw-text transport, logprob normalization, tool-call finish reasons, and Qwen/Deepseek-Qwen whitespace behavior.

Validation

pre-commit run --all-files
Targeted tests: pytest ...
Manual validation performed
Not run (reason below)

Validation details:

PYTHONPATH=. uv run --no-project --with pytest python -m pytest tests/parser/test_multi_turn_extension.py -q passed: 4 passed.
PYTHONPATH=src uv run --no-project --with pytest --with pytest-asyncio --with fastapi --with uvicorn --with httpx --with 'pydantic>=2' --with aiosqlite --with PyYAML python -m pytest tests/unit/test_parser_transport.py tests/unit/test_server.py -q passed: 60 passed, 2 warnings.
PYTHONPATH=src uv run --no-project --with pytest --with pytest-asyncio --with fastapi --with uvicorn --with httpx --with 'pydantic>=2' --with aiosqlite --with PyYAML python -m pytest tests/unit/ -q passed: 186 passed, 2 warnings.
uv run --no-project --with ruff ruff check ... passed on changed Python files.
python -m compileall -q ..., bash -n cookbooks/math_tool_agent/train_verl.sh, and git diff --check passed.

Breaking changes / migration notes

Adds rllm.multi_turn_extension, defaulting to False in base config.
For multi_turn_extension=true, gateway parser transport requires a tokenizer/model path and either rllm.disable_thinking=true or rllm.accumulate_reasoning=true.
Streaming, n>1, multimodal chat content, and required tool-choice enforcement are not supported by the v1 parser transport path.

Docs / examples

Not needed
Updated docs
Updated examples
Follow-up docs needed

Related issues / PRs

Fixes #
Related to #
Stacked on / depends on #

Screenshots / logs

N/A

jason.wei and others added 2 commits April 30, 2026 02:58

add multi-turn extension for qwen models

0c04aed

ruff formatting

e4de856

JasonWei05 marked this pull request as draft April 30, 2026 08:31

JasonWei05 changed the title ~~fix(verl): preserve multi-turn tool-call prefix extension for math tool agent~~ fix(verl): preserve multi-turn tool-call prefix extension for math tool agent for Qwen 3 models Apr 30, 2026

listar2000 requested a review from kylemontgomery1 April 30, 2026 22:36

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(verl): preserve multi-turn tool-call prefix extension for math tool agent for Qwen 3 models#516

fix(verl): preserve multi-turn tool-call prefix extension for math tool agent for Qwen 3 models#516
JasonWei05 wants to merge 2 commits into
mainfrom
feat/multi-turn-extension

JasonWei05 commented Apr 29, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

JasonWei05 commented Apr 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Type of change

What changed

Validation

Breaking changes / migration notes

Docs / examples

Related issues / PRs

Screenshots / logs

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

JasonWei05 commented Apr 29, 2026 •

edited

Loading