feat: add Daytona sandbox backend for code execution #460
JiwaniZakir
left a comment
In daytona_tool.py, the retry logic in forward has a subtle issue: when all retries are exhausted, _restart_sandbox is called to create a fresh sandbox, but then an error is immediately returned without attempting to execute code on the new sandbox — so the restart effort is entirely wasted for the current call. The caller receives an error and must retry manually, defeating the purpose of the restart.
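A minimal sketch of the control flow I have in mind, assuming the restart is meant to rescue the current call. `run_with_retries`, `Result`, and `FlakySandbox` are hypothetical stand-ins for illustration, not the code in daytona_tool.py:

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Result:
    """Hypothetical stand-in for the tool's output type."""
    output: Optional[str] = None
    error: Optional[str] = None


def run_with_retries(
    execute: Callable[[], str],
    restart: Callable[[], None],
    max_retries: int = 3,
) -> Result:
    # Normal retry loop against the existing sandbox.
    for _ in range(max_retries):
        try:
            return Result(output=execute())
        except Exception:
            continue

    # Retries exhausted: restart, then make one more attempt on the
    # fresh sandbox instead of returning an error straight away.
    restart()
    try:
        return Result(output=execute())
    except Exception as exc:
        return Result(error=f"failed after restart: {exc}")


class FlakySandbox:
    """Fake sandbox that only works once it has been restarted."""

    def __init__(self) -> None:
        self.healthy = False
        self.restarts = 0

    def execute(self) -> str:
        if not self.healthy:
            raise RuntimeError("sandbox unavailable")
        return "ok"

    def restart(self) -> None:
        self.healthy = True
        self.restarts += 1


sandbox = FlakySandbox()
result = run_with_retries(sandbox.execute, sandbox.restart, max_retries=2)
print(result)
```

With this shape the caller only sees an error if the fresh sandbox also fails, so the restart is not wasted.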
The __main__ block's async test is also broken: interpreter(code=..., use_async=True) passes use_async through **kwargs to forward, which never inspects it, so the call returns a CodeToolOutput synchronously rather than a coroutine — await coro will then raise a TypeError. Either forward needs an async def variant or the example should be removed.
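To illustrate the failure and one possible fix, here is a small sketch. The `forward`, `forward_async`, and `CodeToolOutput` definitions below are simplified stand-ins I wrote for this comment, and wrapping the sync call with `asyncio.to_thread` is just one option, not necessarily how the PR should do it:

```python
import asyncio


class CodeToolOutput:
    """Simplified stand-in for the real output type."""

    def __init__(self, output: str) -> None:
        self.output = output


def forward(code: str, **kwargs) -> CodeToolOutput:
    # use_async ends up in kwargs and is never inspected,
    # so this always returns a plain object, not a coroutine.
    return CodeToolOutput(output=f"ran: {code}")


async def forward_async(code: str, **kwargs) -> CodeToolOutput:
    # One possible async variant: run the blocking call in a worker thread.
    return await asyncio.to_thread(forward, code, **kwargs)


async def main():
    broken = None
    coro = forward(code="print(1)", use_async=True)
    try:
        await coro  # not awaitable, so this raises TypeError
    except TypeError as exc:
        broken = str(exc)

    fixed = await forward_async(code="print(1)")
    return broken, fixed.output


broken, fixed_output = asyncio.run(main())
print(broken)
print(fixed_output)
```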
Minor: super().__init__() is called after self._init_sandbox() in __init__, which means sandboxes are already allocated before the parent class is fully initialized — if the base CodeTool.__init__ raises, you'd leak sandbox resources. Calling super().__init__() first would be safer.
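A toy example of why the order matters. `BaseTool`, `LeakyTool`, and `SaferTool` are made-up classes, and the `allocated` list stands in for real sandbox allocation:

```python
allocated = []  # records which class allocated a (fake) sandbox


class BaseTool:
    def __init__(self, name: str) -> None:
        if not name:
            raise ValueError("name is required")
        self.name = name


class LeakyTool(BaseTool):
    def __init__(self, name: str) -> None:
        # Sandbox is allocated before the parent has validated anything.
        self._init_sandbox()
        super().__init__(name)

    def _init_sandbox(self) -> None:
        allocated.append("leaky")


class SaferTool(BaseTool):
    def __init__(self, name: str) -> None:
        # Parent runs first; if it raises, nothing has been allocated yet.
        super().__init__(name)
        self._init_sandbox()

    def _init_sandbox(self) -> None:
        allocated.append("safer")


for cls in (LeakyTool, SaferTool):
    try:
        cls(name="")  # the base class rejects the empty name
    except ValueError:
        pass

print(allocated)  # ['leaky']
```

Only the class that allocates before calling the parent constructor leaves a resource behind when the parent raises.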
Signed-off-by: Muhammad Hashmi <muhash@Muhammads-MacBook-Pro-2.local>
4a040ea to 06b20ef
06b20ef to f7a8f8d
f7a8f8d to 726f8ee
@JiwaniZakir updated the PR to address the review comments (apologies for letting the PR go stale for this long!)
@JiwaniZakir just wanted to gently bump this: I've updated the PR to address the issues you brought up. Appreciate the review!
fixes #459
Summary
Adds daytona as a new cloud sandbox backend. This uses the daytona package and the daytona.Daytona SDK.
What changed
- rllm/tools/code_tools/daytona_tool.py
- PythonInterpreter(backend="daytona") to the new backend
- daytona to the [code-tools] extra
Implementation Notes
- sandbox.process.code_run(...) instead of the code interpreter/session path, so execution stays stateless per call and matches current CodeTool usage in rllm
- api_url, snapshot, and env_vars
- stdout, and non-zero exits to stderr plus the last traceback line in error
Validation