verl-project · HarishKMurali · Jun 5, 2026 · Jun 5, 2026
diff --git a/context_management/README.md b/context_management/README.md
@@ -0,0 +1,62 @@
+# Context-management agent loops
+
+Plug-in **context management** for verl agent loops: keep multi-turn / long-horizon rollouts within
+the model's context window by compressing the trajectory on the fly, instead of truncating or
+failing once the window is exceeded.
+
+This recipe provides two ready-to-use agent loops and the `ContextManager` abstraction they share:
+
+| Agent loop (`name`) | Class | Strategy |
+|---|---|---|
+| `naive_summarizer_agent` | `SummarizerAgentLoop` | When the model emits a `<summary>...</summary>` block, replace the history with `(initial prompt + summary)` and continue. |
+| `tool_sliding_window_agent` | `ToolSlidingWindowAgentLoop` | Keep a sliding window over tool-calling turns, dropping the oldest turns when the window is exceeded. |
+
+Both subclass `AgentLoopWithContextManagement`, which drives a generic
+`generate → check_and_compress → continue` loop around any `ContextManager`
+(`SummarizerContextManager`, `SlidingWindowContextManager`, or your own).
+
+## Background
+
+This code was originally proposed for verl core in
+[volcengine/verl#5636](https://github.com/verl-project/verl/pull/5636)
+("[algo] feat: supporting agentic rl with context management", see issue
+[#5375](https://github.com/verl-project/verl/issues/5375)). At the maintainers' request it now lives
+here as a self-contained recipe rather than in `verl/experimental/agent_loop/`, so it can evolve
+independently of the core library. The multi-trajectory / session-level GRPO training support that
+complements it lands separately in core (see verl#5401, #5969).
+
+## Layout
+
+```
+context_management/
+  context_manager.py                      # ContextManager + Sliding-window / Summarizer implementations
+  agent_loop_with_context_management.py   # AgentLoopWithContextManagement + the two agent loops
+  context_manager_plugin.md               # design notes / how to write a custom ContextManager
+  test_context_manager.py                 # CPU unit tests
+  test_agent_loop_with_context_management.py
+  example/                                # runnable GRPO example wiring the summarizer loop
+```
+
+## Usage
+
+The loops register themselves under the `name`s above. Point verl at this recipe's agent-loop config
+and select a loop:
+
+```bash
+actor_rollout_ref.rollout.agent.agent_loop_config_path=recipe/context_management/example/agent.yaml
+actor_rollout_ref.rollout.agent.default_agent_loop=naive_summarizer_agent
+```
+
+See [`example/`](example/) for a full run script, and
+[`context_manager_plugin.md`](context_manager_plugin.md) for writing your own `ContextManager`.
+
+## Required verl version
+
+See [`REQUIRED_VERL.txt`](REQUIRED_VERL.txt) for the upstream repo and the pinned core-library commit.
+
+## Tests
+
+```bash
+pytest recipe/context_management/test_context_manager.py
+pytest recipe/context_management/test_agent_loop_with_context_management.py
+```
diff --git a/context_management/REQUIRED_VERL.txt b/context_management/REQUIRED_VERL.txt
@@ -0,0 +1,11 @@
+# context_management — rolling; refresh the commit against your verl checkout before publishing
+UPSTREAM=https://github.com/verl-project/verl.git
+MODE=rolling
+BRANCH=main
+# Core-library commit this recipe was developed/tested against. Refresh before opening the PR.
+VERL_COMMIT=9c38b8bb1876a81273d76de3e79328b2dd2b7b32
+PIP_INSTALL=pip install verl@git+https://github.com/verl-project/verl.git@9c38b8bb1876a81273d76de3e79328b2dd2b7b32
+GIT_SETUP=git clone https://github.com/verl-project/verl.git && cd verl && git checkout 9c38b8bb1876a81273d76de3e79328b2dd2b7b32 && git submodule update --init --recursive recipe
+RECIPE_FOLDER=context_management
+NOTES=Depends only on stable verl core APIs: verl.experimental.agent_loop.agent_loop (AgentLoopBase, register, AgentLoopOutput, AgentLoopMetrics), verl.tools, verl.utils.chat_template, verl.utils.tokenizer, verl.workers.rollout.replica.TokenOutput. No core code changes are required to use this recipe.
+REFRESH=Recompute VERL_COMMIT: (cd verl && git rev-parse HEAD). Re-run the tests under recipe/context_management/ after bumping.
diff --git a/context_management/__init__.py b/context_management/__init__.py
@@ -0,0 +1,13 @@
+# Copyright 2024 Bytedance Ltd. and/or its affiliates
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.