Skip to content

[WIP] SWE-Agent recipe adapted to the new black-box Agent Framework#91

Draft
wangtiance wants to merge 1 commit into
verl-project:mainfrom
wangtiance:main
Draft

[WIP] SWE-Agent recipe adapted to the new black-box Agent Framework#91
wangtiance wants to merge 1 commit into
verl-project:mainfrom
wangtiance:main

Conversation

@wangtiance

Copy link
Copy Markdown

Adapted from https://github.com/verl-project/verl-recipe/tree/main/swe_agent, replacing the AgentLoopManager with the new gateway-based agent framework (verl-project/verl#5931 and zackcxb/verl#1)

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces the SWE-agent framework integration for VERL, enabling reinforcement learning on software engineering tasks. The implementation includes a robust configuration system, Docker-based sandboxing, dataset preparation utilities for synthetic tasks, and a specialized reward function for patch-based evaluation. Feedback focuses on improving code quality and robustness, specifically by removing redundant initializations and unused imports, tightening file permissions for lock files, enhancing the regex used for git diff parsing, and ensuring that configuration parsing errors are not silently swallowed.

Comment on lines +158 to +171
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained(model_config.path, trust_remote_code=True)

load_balancer = GlobalRequestLoadBalancer.remote(
server_actor_ids=server_addresses,
)

gateway_count = int(OmegaConf.select(config, "actor_rollout_ref.rollout.agent_framework.gateway_count", default=1))
servers = list(zip(server_addresses, server_handles, strict=True))

from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained(model_config.path, trust_remote_code=True)

from verl.agent.gateway.runtime import GatewayServingRuntime

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The AutoTokenizer is imported and initialized twice within the create method. This is redundant and should be cleaned up to improve readability and avoid unnecessary overhead.

Suggested change
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained(model_config.path, trust_remote_code=True)
load_balancer = GlobalRequestLoadBalancer.remote(
server_actor_ids=server_addresses,
)
gateway_count = int(OmegaConf.select(config, "actor_rollout_ref.rollout.agent_framework.gateway_count", default=1))
servers = list(zip(server_addresses, server_handles, strict=True))
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained(model_config.path, trust_remote_code=True)
from verl.agent.gateway.runtime import GatewayServingRuntime
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained(model_config.path, trust_remote_code=True)
load_balancer = GlobalRequestLoadBalancer.remote(
server_actor_ids=server_addresses,
)
gateway_count = int(OmegaConf.select(config, "actor_rollout_ref.rollout.agent_framework.gateway_count", default=1))
servers = list(zip(server_addresses, server_handles, strict=True))
from verl.agent.gateway.runtime import GatewayServingRuntime

import uuid
from typing import Any, Optional

import numpy as np

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The numpy import is unused in this file and should be removed to keep the code clean.

while True:
for slot_idx in range(max_parallel_tasks_per_worker):
lock_path = os.path.join(lock_dir, f"slot_{slot_idx}.lock")
fd = os.open(lock_path, os.O_CREAT | os.O_RDWR | getattr(os, "O_CLOEXEC", 0), 0o666)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-medium medium

Using 0o666 for lock file permissions makes them world-writable. It is recommended to use more restrictive permissions like 0o600 (owner read/write only) to enhance security, especially for lock files in shared temporary directories.

Suggested change
fd = os.open(lock_path, os.O_CREAT | os.O_RDWR | getattr(os, "O_CLOEXEC", 0), 0o666)
fd = os.open(lock_path, os.O_CREAT | os.O_RDWR | getattr(os, "O_CLOEXEC", 0), 0o600)

"""Extract set of changed files from a patch."""
if not patch:
return set()
pattern = r"diff --git a/(.+?) b/(.+)"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The regex r"diff --git a/(.+?) b/(.+)" is fragile and may fail if filenames contain spaces (which git diff quotes) or if the sequence b/ appears within a filename. Consider using a more robust parsing method or looking for --- a/ and +++ b/ lines to extract filenames.

Comment on lines +144 to +146
return json.loads(val)
except (json.JSONDecodeError, TypeError):
return {}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Swallowing json.JSONDecodeError and returning an empty dictionary can lead to silent configuration failures. It would be better to log a warning or raise an error if the input string was expected to be valid JSON but failed to parse, ensuring that configuration issues are visible to the user.

exec_dir: Working directory for the subprocess
(avoids YAML parsing issues with ``docker`` subdir).
swe_agent_timeout: Overall timeout in seconds.
proxy_port: ModelProxy port (for logging only).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The docstring mentions a proxy_port argument which is not present in the function signature of execute_swe_agent. This should be removed to maintain consistency between documentation and code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant