[WIP] SWE-Agent recipe adapted to the new black-box Agent Framework by wangtiance · Pull Request #91 · verl-project/verl-recipe

wangtiance · 2026-04-27T01:32:52Z

Adapted from https://github.com/verl-project/verl-recipe/tree/main/swe_agent, replacing the AgentLoopManager with the new gateway-based agent framework (verl-project/verl#5931 and zackcxb/verl#1)

gemini-code-assist

Code Review

This pull request introduces the SWE-agent framework integration for VERL, enabling reinforcement learning on software engineering tasks. The implementation includes a robust configuration system, Docker-based sandboxing, dataset preparation utilities for synthetic tasks, and a specialized reward function for patch-based evaluation. Feedback focuses on improving code quality and robustness, specifically by removing redundant initializations and unused imports, tightening file permissions for lock files, enhancing the regex used for git diff parsing, and ensuring that configuration parsing errors are not silently swallowed.

gemini-code-assist · 2026-04-27T01:34:42Z

+        from transformers import AutoTokenizer
+        tokenizer = AutoTokenizer.from_pretrained(model_config.path, trust_remote_code=True)
+
+        load_balancer = GlobalRequestLoadBalancer.remote(
+            server_actor_ids=server_addresses,
+        )
+
+        gateway_count = int(OmegaConf.select(config, "actor_rollout_ref.rollout.agent_framework.gateway_count", default=1))
+        servers = list(zip(server_addresses, server_handles, strict=True))
+
+        from transformers import AutoTokenizer
+        tokenizer = AutoTokenizer.from_pretrained(model_config.path, trust_remote_code=True)
+
+        from verl.agent.gateway.runtime import GatewayServingRuntime


The AutoTokenizer is imported and initialized twice within the create method. This is redundant and should be cleaned up to improve readability and avoid unnecessary overhead.

Suggested change

-        from transformers import AutoTokenizer
-        tokenizer = AutoTokenizer.from_pretrained(model_config.path, trust_remote_code=True)
-
-        load_balancer = GlobalRequestLoadBalancer.remote(
-            server_actor_ids=server_addresses,
-        )
-
-        gateway_count = int(OmegaConf.select(config, "actor_rollout_ref.rollout.agent_framework.gateway_count", default=1))
-        servers = list(zip(server_addresses, server_handles, strict=True))
-
-        from transformers import AutoTokenizer
-        tokenizer = AutoTokenizer.from_pretrained(model_config.path, trust_remote_code=True)
-
-        from verl.agent.gateway.runtime import GatewayServingRuntime
+        from transformers import AutoTokenizer
+        tokenizer = AutoTokenizer.from_pretrained(model_config.path, trust_remote_code=True)
+
+        load_balancer = GlobalRequestLoadBalancer.remote(
+            server_actor_ids=server_addresses,
+        )
+
+        gateway_count = int(OmegaConf.select(config, "actor_rollout_ref.rollout.agent_framework.gateway_count", default=1))
+        servers = list(zip(server_addresses, server_handles, strict=True))
+
+        from verl.agent.gateway.runtime import GatewayServingRuntime

gemini-code-assist · 2026-04-27T01:34:42Z

+import uuid
+from typing import Any, Optional
+
+import numpy as np


The numpy import is unused in this file and should be removed to keep the code clean.

gemini-code-assist · 2026-04-27T01:34:42Z

+        while True:
+            for slot_idx in range(max_parallel_tasks_per_worker):
+                lock_path = os.path.join(lock_dir, f"slot_{slot_idx}.lock")
+                fd = os.open(lock_path, os.O_CREAT | os.O_RDWR | getattr(os, "O_CLOEXEC", 0), 0o666)


Using 0o666 for lock file permissions makes them world-writable. It is recommended to use more restrictive permissions like 0o600 (owner read/write only) to enhance security, especially for lock files in shared temporary directories.

Suggested change

-                fd = os.open(lock_path, os.O_CREAT | os.O_RDWR | getattr(os, "O_CLOEXEC", 0), 0o666)
+                fd = os.open(lock_path, os.O_CREAT | os.O_RDWR | getattr(os, "O_CLOEXEC", 0), 0o600)
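
For illustration, here is a minimal sketch of a slot-lock helper that creates its lock files with 0o600. The PR excerpt does not show how the lock is actually taken, so the use of `fcntl.flock` and the helper name `try_acquire_slot` are assumptions, not the recipe's code. Note that the mode passed to `os.open` is further masked by the process umask, so 0o600 is an upper bound on the resulting permissions.

```python
import fcntl
import os


def try_acquire_slot(lock_dir, num_slots):
    """Return (slot_idx, fd) for the first free slot, or None if all are held.

    Hypothetical helper: assumes an advisory flock-based scheme on a Unix host.
    """
    flags = os.O_CREAT | os.O_RDWR | getattr(os, "O_CLOEXEC", 0)
    for slot_idx in range(num_slots):
        lock_path = os.path.join(lock_dir, f"slot_{slot_idx}.lock")
        # 0o600: owner read/write only (still subject to the umask).
        fd = os.open(lock_path, flags, 0o600)
        try:
            # Non-blocking exclusive lock; fails if another holder exists.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            os.close(fd)
            continue
        return slot_idx, fd
    return None
```

The caller keeps `fd` open for as long as it owns the slot and closes it to release the lock.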

gemini-code-assist · 2026-04-27T01:34:42Z

+    """Extract set of changed files from a patch."""
+    if not patch:
+        return set()
+    pattern = r"diff --git a/(.+?) b/(.+)"


The regex r"diff --git a/(.+?) b/(.+)" is fragile and may fail if filenames contain spaces (which git diff quotes) or if the sequence b/ appears within a filename. Consider using a more robust parsing method or looking for --- a/ and +++ b/ lines to extract filenames.
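
One way to follow this suggestion is sketched below. The function name `changed_files` is hypothetical, and the sketch assumes a standard unified diff as produced by `git diff`. It reads only the `---` and `+++` header lines that appear between a `diff --git` line and the first hunk, so removed or added content that happens to look like a header is ignored. It does not decode git's C-style quoted paths, and it will miss entries that have no `---`/`+++` lines at all (pure renames, mode-only changes, binary files).

```python
def changed_files(patch):
    """Extract the set of changed file paths from a unified git diff."""
    if not patch:
        return set()
    files = set()
    in_header = False  # True between "diff --git" and the first "@@" hunk
    for line in patch.splitlines():
        if line.startswith("diff --git "):
            in_header = True
        elif line.startswith("@@"):
            in_header = False
        elif in_header and line.startswith(("--- ", "+++ ")):
            # Git appends a tab after paths that contain spaces; drop it.
            path = line[4:].split("\t", 1)[0].strip()
            if path == "/dev/null":
                continue  # added or deleted file: the other side has the name
            if path.startswith(("a/", "b/")):
                path = path[2:]
            files.add(path)
    return files
```

Because each header line is read on its own, a path such as `src/b/x.py` that contains `b/` no longer confuses the parser.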

gemini-code-assist · 2026-04-27T01:34:42Z

+            return json.loads(val)
+        except (json.JSONDecodeError, TypeError):
+            return {}


Swallowing json.JSONDecodeError and returning an empty dictionary can lead to silent configuration failures. It would be better to log a warning or raise an error if the input string was expected to be valid JSON but failed to parse, ensuring that configuration issues are visible to the user.

gemini-code-assist · 2026-04-27T01:34:42Z

+        exec_dir: Working directory for the subprocess
+                  (avoids YAML parsing issues with ``docker`` subdir).
+        swe_agent_timeout: Overall timeout in seconds.
+        proxy_port: ModelProxy port (for logging only).


The docstring mentions a proxy_port argument which is not present in the function signature of execute_swe_agent. This should be removed to maintain consistency between documentation and code.

black box agent framework based swe recipe

62533db

gemini-code-assist Bot reviewed Apr 27, 2026

wangtiance wants to merge 1 commit into verl-project:main from wangtiance:main
