Add ManiSkill3 environments (Franka Panda + SIMPLER Bridge) #251
Open
qhua360 wants to merge 11 commits into
Wrap ManiSkill3 (SAPIEN) manipulation tasks as standard SWM gym envs via a single robot-agnostic ManiSkillWrapper, registering two stationary-arm families: Franka Panda table-top manipulation (swm/MS*) and the SIMPLER / real2sim Bridge digital twins on WidowX (swm/Simpler*).
- ManiSkillWrapper mirrors FetchWrapper: wraps gym.make at num_envs=1, debatches torch->numpy, maps the native success detector onto `terminated` (what World.evaluate scores), and lifts env_name/proprio/state/instruction into info. Robot-agnostic: proprio flattens obs['agent'], the render camera auto-detects, and **kwargs pass through to gym.make. mani_skill is imported lazily (GPU-only).
- tasks.py holds a declarative TASK_SPECS registry; adding a robot/task is a one-line entry. envs/__init__.py registers all ids in one generic loop.
- Optional [maniskill] extra (mani-skill), kept out of [all] (GPU-only).
- Tests: CPU-safe registry checks + GPU rollout test that skips without mani_skill installed.
- Docs page + mkdocs nav entry.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
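The debatching and success-to-terminated mapping described in this commit can be sketched as follows. This is a minimal illustration rather than the PR's code: `debatch` and `step_outputs` are hypothetical names, the observation is assumed to be a plain array, and tensors are detected by duck typing so the snippet runs with NumPy alone.

```python
import numpy as np


def debatch(x):
    """Convert a batched value (leading dim of 1) to an unbatched NumPy value.

    Assumption: torch tensors are recognized by duck typing (detach/cpu/numpy),
    so this sketch does not require torch to be installed.
    """
    if hasattr(x, "detach"):  # looks like a torch.Tensor
        x = x.detach().cpu().numpy()
    arr = np.asarray(x)
    if arr.ndim >= 1 and arr.shape[0] == 1:
        arr = arr[0]  # drop the num_envs=1 batch dimension
    return arr


def step_outputs(obs, reward, info):
    """Hypothetical helper: turn a batched step result into single-env outputs."""
    # The native success flag drives `terminated`.
    terminated = bool(debatch(info.get("success", False)))
    out_info = {key: debatch(value) for key, value in info.items()}
    return debatch(obs), float(debatch(reward)), terminated, out_info
```

With a batch of one, `step_outputs(np.zeros((1, 4)), np.array([0.5]), {"success": np.array([True])})` yields a `(4,)` observation, a Python float reward, and `terminated=True`.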
Verified on a Lambda H100 (CUDA 12.8, driver 580.105.08): both families render real (224,224,3) frames, step with scalar reward + bool terminated, expose success/instruction, and run end-to-end through World.evaluate. pytest tests/envs/test_maniskill.py: 4 passed (2 CPU + 2 GPU).
Two fixes surfaced against the real ManiSkill3 API:
- render_mode is a read-only property on gym.Wrapper; pass it to gym.make instead of assigning it on the wrapper.
- The Bridge digital-twin tasks only support obs_mode='rgb+segmentation' (not plain 'rgb'); set it per-task in TASK_SPECS.
Docs: note first-run asset downloads (scene + WidowX robot; public, no token).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
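A per-task `obs_mode` override in a declarative registry might look like the sketch below. The field names, the default values, and the `swm/SimplerCarrotOnPlate-v0` id are illustrative assumptions; the PR's actual TASK_SPECS layout is not shown on this page. The point is only that `render_mode` and `obs_mode` end up as keyword arguments for `gym.make` rather than being assigned on the wrapper.

```python
# Hypothetical registry layout; the real TASK_SPECS fields may differ.
TASK_SPECS = {
    "swm/MSPickCube-v0": {"task": "PickCube"},
    # Assumed id: Bridge digital twins need obs_mode='rgb+segmentation'.
    "swm/SimplerCarrotOnPlate-v0": {
        "task": "carrot-on-plate",
        "obs_mode": "rgb+segmentation",
    },
}

# Assumed defaults, passed to gym.make instead of being set on the wrapper.
DEFAULT_KWARGS = {"obs_mode": "rgb", "render_mode": "rgb_array", "num_envs": 1}


def make_kwargs(swm_id, **overrides):
    """Merge defaults, the per-task spec, and caller overrides (in that order)."""
    spec = dict(TASK_SPECS[swm_id])
    spec.pop("task")  # the task name is not a gym.make keyword in this sketch
    return {**DEFAULT_KWARGS, **spec, **overrides}
```

Later entries win in the merge, so a caller can still override a per-task setting explicitly.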
Set MS_SKIP_ASSET_DOWNLOAD_PROMPT=1 (via os.environ.setdefault) in the wrapper so missing scene/robot assets download automatically on first gym.make instead of prompting — the interactive prompt raises EOFError under headless/non-interactive stdin. Overridable: set the var to 0 to be prompted. Verified on the H100 by deleting the WidowX robot asset and re-creating the Bridge env non-interactively (auto-downloaded, no EOFError). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
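The `setdefault` behavior is what keeps the variable overridable: a value already set by the user is left alone. A minimal sketch, where the helper name `enable_auto_asset_download` is hypothetical and only the `os.environ.setdefault` call comes from the commit message:

```python
import os


def enable_auto_asset_download(environ=os.environ):
    """Default MS_SKIP_ASSET_DOWNLOAD_PROMPT to '1' without clobbering user values."""
    environ.setdefault("MS_SKIP_ASSET_DOWNLOAD_PROMPT", "1")
    return environ["MS_SKIP_ASSET_DOWNLOAD_PROMPT"]
```

Passing a plain dict as `environ` makes the behavior easy to check: an empty mapping ends up with `"1"`, while a mapping that already holds `"0"` keeps `"0"`.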
Note in the ManiSkill docs that ManiSkill's motion-planning solvers (mplib==0.1.1, pinned on Linux) segfault under NumPy 2 due to a NumPy-1.x ABI mismatch in mplib's compiled extension. Clarify this does NOT affect the integration — the simulator, wrapper, rendering, and World.evaluate all run on NumPy 2; only the optional MP trajectory generators are hit — and give the workarounds (numpy<2 env, or replay download_demo datasets). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Remove the mplib/NumPy-2 limitation admonition: ManiSkill's motion-planning solvers are not part of this integration's intended workflow, so calling out their NumPy-2 incompatibility was more confusing than useful. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Resolves PR feedback on the ManiSkill envs.
Factors of Variation: replace the empty variation_space with applied,
verified visual FoV (SIMPLER-aligned), each confirmed to change the rendered
frame on an A100 via reset(options={'variation_values': ...}):
- light.intensity -> scene.set_ambient_light (17074 px)
- camera.angle_delta -> render-camera set_local_pose (15300 px)
- object.color -> cube material set_base_color (102 px)
- rendering.transparent_arm -> robot-link material alpha (5129 px)
Applied in _apply_variations using ManiSkill's own SAPIEN idioms (cf.
digital_twins/so100_arm/grasp_cube.py), then scene.update_render()+get_obs()
so the frame reflects the change. Sampled values flow to info['variation.*'].
table.color/background.color are deferred: those surfaces are textured, so
set_base_color is a no-op (verified) — they need a texture swap / greenscreen.
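The commit does not say how the pixel counts above were computed. One plausible check, assuming the figure is the number of pixel positions where any channel differs between a baseline frame and a varied frame, is sketched here; `changed_pixels` is a hypothetical helper name.

```python
import numpy as np


def changed_pixels(frame_a, frame_b):
    """Count pixel positions where any channel differs between two (H, W, 3) frames."""
    a = np.asarray(frame_a)
    b = np.asarray(frame_b)
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")
    # A pixel counts once, however many of its channels changed.
    return int(np.any(a != b, axis=-1).sum())
```

Under this metric a no-op variation (such as `set_base_color` on a textured surface) would report 0.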
Comment style: remove the `# -----` dividers (env.py) and `# --- X ---`
section labels (tasks.py).
Success rate: add scripts/examples/maniskill_demo_replay.py — restores each
ManiSkill demo's initial env_state and replays its actions through the wrapper.
PickCube motionplanning demos reproduce 20/20 = 100%, confirming the env +
success->terminated wiring yields and detects real successes.
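The replay procedure described above (restore the initial state, replay the recorded actions, count an episode as a success when `terminated` fires) can be sketched with a stand-in environment. Everything here is illustrative: `ToyEnv`, the `set_state` method name, and the demo dictionary layout are assumptions, not the API of `maniskill_demo_replay.py`.

```python
def replay_demo(env, initial_state, actions):
    """Restore a demo's initial state, replay its actions, return True on success."""
    env.reset()
    env.set_state(initial_state)  # hypothetical method name
    for action in actions:
        _, _, terminated, truncated, _ = env.step(action)
        if terminated:  # success is mapped onto `terminated`
            return True
        if truncated:
            break
    return False


def success_rate(env, demos):
    """Fraction of demos whose replay ends in success."""
    results = [replay_demo(env, d["state"], d["actions"]) for d in demos]
    return sum(results) / len(results) if results else 0.0


class ToyEnv:
    """Stand-in env: the state is a number and success means reaching 3."""

    def reset(self):
        self.state = 0
        return self.state, {}

    def set_state(self, state):
        self.state = state

    def step(self, action):
        self.state += action
        return self.state, 0.0, self.state >= 3, False, {}
```

Restoring the state instead of reseeding mirrors the approach the commit describes; the toy env only demonstrates the control flow.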
Verified on an A100 (CUDA 12.8): 5/5 pytest, FoV pixel-diffs above, demo
replay 100%.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Missed two `# --- ... ---` section dividers in tests/envs/test_maniskill.py during the earlier comment cleanup (which only touched env.py/tasks.py). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add a "World-model (MPC) evaluation — follow-up" note to the ManiSkill docs: what eval_wm.py needs (a sim-collected dataset in the pusht_expert_train.h5 layout: episode_idx/step_idx/ep_len/ep_offset + HWC pixels + flat state; goal-conditioning callables; a model-block checkpoint config), the validated 2% goal-reaching result, and that real Open-X datasets (DROID/Bridge) are for training policies, not MPC-eval datasets. Eval scaffolding lives on the maniskill-wm-eval branch for the follow-up PR. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Fold the collect/train/eval scaffolding into the env integration:
- env.py: _set_state/_set_goal_state + flat get_state() recording + goal-distance terminated (PushT-style goal-conditioned MPC eval). Validated on H100 (exact state round-trip; goal-distance termination fires).
- scripts/data/collect_maniskill.py + config: RandomPolicy collection on PickCube.
- scripts/train/config/data/maniskill.yaml: LeWM training data config.
- scripts/plan/config/maniskill.yaml: goal-conditioned eval config.
Docs: fold the world-model MPC result (LeWM 2% / 1-of-50 goal-reaching on swm/MSPickCube-v0) and the collect->train->eval setup into the Success rate section; note that the MPC eval consumes a dataset in the benchmark layout (episode_idx/step_idx/ep_len/ep_offset + HWC pixels + flat state), supplied separately like PushT's pusht_expert_train.h5, and that the checkpoint config must be the model block (config_key='model').
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
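Goal-distance termination on a flat state vector can be sketched as below. The Euclidean norm is an assumption (the page does not state which distance is used), `goal_reached` is a hypothetical name, and the default threshold of 2.0 is the value quoted for the PickCube eval in the PR description.

```python
import numpy as np


def goal_reached(state, goal_state, threshold=2.0):
    """Return True when the flat state is within `threshold` of the goal state.

    Assumption: distance is the Euclidean norm over the whole flat state.
    """
    diff = np.asarray(state, dtype=float) - np.asarray(goal_state, dtype=float)
    return bool(np.linalg.norm(diff) < threshold)
```

A check like this could feed `terminated` during goal-conditioned MPC eval, in place of the task's native success detector.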
The world-model MPC eval does not run on raw collect output. Mark the config as a template and point it at the artifacts it actually needs:
- header note: requires a benchmark-format dataset (episode_idx/step_idx/ep_len/ep_offset + HWC pixels + flat state, like pusht_expert_train.h5) and a model-block checkpoint config (save_pretrained config_key='model').
- eval.dataset_name: maniskill/mspickcube_random.lance -> maniskill/mspickcube_eval.h5 (collect output isn't readable by eval; supply a prepared dataset).
- policy: lewm_mspickcube -> lewm_mspickcube/weights_epoch_30.pt (a bare run name is ambiguous across epochs).
- docs: note that scripts/plan/config/maniskill.yaml is a template.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
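To make the index columns of the benchmark layout concrete, the sketch below derives `step_idx`, `ep_len`, and `ep_offset` from a per-row `episode_idx` column. The semantics are inferred from the column names and are assumptions: `ep_len[i]` is taken to be the number of rows in the i-th episode, `ep_offset[i]` its first row, and each episode's rows are assumed contiguous. `episode_index_columns` is a hypothetical helper, not part of the PR.

```python
import numpy as np


def episode_index_columns(episode_idx):
    """Derive (step_idx, ep_len, ep_offset) from a per-row episode_idx column."""
    episode_idx = np.asarray(episode_idx)
    n_rows = len(episode_idx)
    if n_rows == 0:
        empty = np.zeros(0, dtype=int)
        return empty, empty, empty
    # Rows where a new episode starts (row 0 always does).
    is_start = np.r_[True, episode_idx[1:] != episode_idx[:-1]]
    ep_offset = np.flatnonzero(is_start)
    ep_len = np.diff(np.r_[ep_offset, n_rows])
    # Position of each row within its own episode.
    step_idx = np.arange(n_rows) - np.repeat(ep_offset, ep_len)
    return step_idx, ep_len, ep_offset
```

For `episode_idx = [0, 0, 0, 1, 1]` this gives `step_idx = [0, 1, 2, 0, 1]`, `ep_len = [3, 2]`, and `ep_offset = [0, 3]`.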
CI's `uv sync --all-extras` installed the GPU-only maniskill extra, pulling mani-skill -> mplib -> toppra. No toppra version ships cp311/cp312 wheels (0.6.8 is cp310-only, no sdist), so the install failed on 3.11/3.12; on 3.10 it installed and the GPU rollout tests ran and crashed on missing Vulkan.
mani-skill is GPU-only (CUDA + Vulkan) with no clean wheel install on linux py3.11/3.12, so it doesn't belong in `--all-extras`. Move it from [project.optional-dependencies] to a [dependency-groups] `maniskill` group (same mechanism as `dev`); `--all-extras` ignores non-default groups, so CI resolves cleanly with no workflow change. Opt in via `uv sync --group maniskill`.
Also:
- test: skip the rollout test when no CUDA is available (defensive: it needs a GPU/Vulkan even if the group is installed locally).
- docs: install via `uv sync --group maniskill`; note it's an opt-in group.
Verified: `uv export --all-extras` excludes mani-skill/mplib/toppra; `uv export --group maniskill` includes them; uv.lock unchanged.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
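The two skip conditions for the GPU rollout test (package missing, or no CUDA) can be expressed as one small predicate. This is a sketch under assumptions: `rollout_skip_reason` is a hypothetical helper, and the caller is expected to supply the CUDA flag, for example from `torch.cuda.is_available()`.

```python
import importlib.util


def rollout_skip_reason(cuda_available, module_name="mani_skill"):
    """Return a skip reason, or None when the GPU rollout test can run."""
    # find_spec returns None for a top-level module that is not installed.
    if importlib.util.find_spec(module_name) is None:
        return f"{module_name} is not installed"
    if not cuda_available:
        return "CUDA is not available"
    return None
```

In a pytest module the result could drive `pytest.mark.skipif(reason is not None, reason=reason or "")`, which keeps the CPU-only registry tests running in CI while the rollout test skips.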
Summary
Adds ManiSkill3 (SAPIEN) manipulation tasks as standard SWM environments via a single robot-agnostic wrapper. Two stationary-arm families:
- Franka Panda table-top (`swm/MS*`): PickCube, PushCube, PullCube, PokeCube, StackCube, LiftPegUpright, PegInsertionSide, PlugCharger, PickSingleYCB, RollBall.
- SIMPLER Bridge on WidowX (`swm/Simpler*`): carrot-on-plate, spoon-on-towel, stack-cube, eggplant-in-basket.
Includes data collection (`collect_maniskill.py`) and world-model eval support (LeWM training data config + goal-conditioned MPC eval config). `World.evaluate` + SWM's policies/solvers also handle policy eval via the task's native success detector (mapped onto `terminated`).
Design
- `ManiSkillWrapper(gym.Wrapper)` mirrors `FetchWrapper`: wraps `gym.make(..., num_envs=1)`, debatches torch→numpy, maps `info['success'] → terminated`, lifts `env_name`/`proprio`/`state`/`instruction` into `info`, renders `(H,W,3)` uint8. No robot-specific branching.
- `TASK_SPECS` is the single extension point: one line per task/embodiment.
- `_set_state`/`_set_goal_state` restore/compare the flat ManiSkill sim state (`info['state']`), mirroring the PushT env, so SWM's MPC eval (`eval_wm.py`) works.
- `maniskill` dependency group (GPU-only: needs CUDA+Vulkan; excluded from `uv sync --all-extras` / `[all]`, installed via `uv sync --group maniskill`), lazily imported so `import stable_worldmodel` stays CPU-importable.
- Missing assets download automatically on first `gym.make` (`MS_SKIP_ASSET_DOWNLOAD_PROMPT=1`).
Factors of Variation
The env declares a `variation_space` of visual distribution-shift knobs (SIMPLER-aligned), applied in `_apply_variations` using ManiSkill's own SAPIEN idioms (cf. `digital_twins/so100_arm/grasp_cube.py`), then `scene.update_render()` + `get_obs()` so the frame reflects them; sampled values flow into `info['variation.*']`. Each was verified on an A100 to change the rendered frame via `reset(options={'variation_values': ...})`:
- light.intensity -> scene.set_ambient_light
- camera.angle_delta -> render-camera set_local_pose
- object.color -> cube material set_base_color
- rendering.transparent_arm -> robot-link material alpha
Deferred (documented): `table.color` / `background.color`. Those surfaces are textured, so `set_base_color` is a verified no-op; they need a texture swap / greenscreen. Distractors and object-pose-as-settable-factor are follow-ups (object pose is already seed-randomized per episode).
Success rate
Demo replay. No trained model is bundled, so `scripts/examples/maniskill_demo_replay.py` replays ManiSkill's official demonstrations through the wrapper: it restores each demo's initial `env_state`, replays its recorded actions, and reports `success_rate`. The PickCube motion-planning demos reproduce 20/20 = 100%.
This confirms the env + `success→terminated` wiring yields and detects real successes (vs. ~0% from a random policy). Reproduction uses the initial env_state, not the seed: batched-GPU demos aren't seed-reproducible, while the absolute-joint demos replay deterministically.
World-model MPC. A LeWM world model trained on a collected PickCube dataset and evaluated with goal-conditioned MPC reached 2% (1/50) goal-reaching success on `swm/MSPickCube-v0` (random-policy data, 30 epochs, goal threshold 2.0 on the flat sim state), a real but low baseline.
Creating an eval dataset
World-model MPC eval is wired (env callables + collect/train/eval configs), but, exactly like PushT, it runs on a prepared dataset, not raw `World.collect` output. The dataset must be in the benchmark layout: per-row `episode_idx`/`step_idx` + `ep_len`/`ep_offset`, with `pixels` (HWC), `action`, `proprio`, and `state` (the flat ManiSkill sim state used by `_set_state`), the same format as the provided `pusht_expert_train.h5`. `World.collect` output isn't natively in that layout (Lance hides the index columns; HDF5 omits them), so the eval dataset is produced/provided separately. The trained checkpoint's `config.json` must be the model block (`save_pretrained(config_key='model')`). Real Open-X datasets (DROID / BridgeData V2) are for training policies, not for use as MPC-eval datasets (they carry no ManiSkill sim state to restore).
Verification (A100, CUDA 12.8, NumPy 2)
- `pytest tests/envs/test_maniskill.py` → 5 passed (registry + variation-space CPU tests + 2 GPU rollouts). On CPU/CI without ManiSkill the GPU tests skip.
- `render()` frames visually confirmed (real Panda/WidowX scenes).
- ... (`swm/MSPickCube-v0`).
Notes
- `control_mode='pd_ee_delta_pose'` for a uniform 7-D EE action.
🤖 Generated with Claude Code