cross-vendor parity tracking #59

meta-issue. whenever an apple-specific optimization lands in
`MargueriteMetalExt`, open a tracking issue for the CUDA and AMDGPU
equivalents so apple can pull ahead on perf but never on functionality.

watch-list:
- [ ] `gpu_storage` keyword (when added) — no-op pass-through on CUDA / AMDGPU
- [ ] `@autoreleasepool` boundary — Metal-specific; not needed on CUDA / AMDGPU,
  but document why and how their equivalents work (or don't)
- [ ] benchmark rows in `docs/src/gpu_backends.md` are currently "—" for CUDA
  and AMDGPU — fill in when measured
- [ ] device tests under `MARGUERITE_TEST_GROUP=gpu` — currently auto-detects
  Metal first; verify CUDA and AMDGPU code paths exercise the same coverage
- [ ] **`c.gap` device-resident on non-Metal backends** — current `BatchCache.gap::Vector{T}` is host-allocated and the GPU LMO kernels write to it directly. Works on Metal (unified memory). On CUDA the host-pointer write from a device kernel is technically UB. Make `gap` (and `objective`, `obj_trial`, `step_gamma` if any kernel writes them) device-resident on non-CPU backends, with a host shadow for the convergence check.
- [ ] **GPU↔CPU numerical agreement asserts in `test_gpu_backend.jl`**: current smoke tests only check feasibility (`sum≈1`, in-bounds). Add asserts that `Array(X_gpu) ≈ X_cpu_reference` with an atol scaled by eltype (`1e-5` for `Float64`, `1e-3` for `Float32`). Catches buggy reductions that land feasible-but-wrong.
- [ ] **rrule-on-real-GPU testset** — one `rrule(batch_solve, expr, ScalarBox, X0_gpu, θ)` testset under `MARGUERITE_TEST_GROUP=gpu` with FD check vs CPU. The `adapt(Array, ...)` host-pull boundary in `batch_diff.jl` is end-to-end untested today.
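
rough sketch of the numerical-agreement check from the watch-list, in case it helps whoever picks that item up. `agreement_atol` and `agrees_with_cpu` are hypothetical names, not existing Marguerite functions; the tolerances are the ones proposed above. it only assumes the device array supports `Array(...)`, so it also runs on plain host arrays:

```julia
using LinearAlgebra  # isapprox on arrays compares via a norm

# Hypothetical helper: absolute tolerance per element type, using the
# values proposed in the watch-list.
agreement_atol(::Type{Float64}) = 1e-5
agreement_atol(::Type{Float32}) = 1e-3

# Hypothetical helper: copy the device result to the host and compare it
# against a CPU reference. For a host array `Array(X)` is just a copy.
function agrees_with_cpu(X_dev::AbstractArray{T}, X_ref::AbstractArray) where {T}
    X_host = Array(X_dev)
    size(X_host) == size(X_ref) || return false
    # rtol = 0 so only the eltype-scaled absolute tolerance applies
    return isapprox(X_host, X_ref; atol = agreement_atol(T), rtol = 0)
end
```

in a testset this would be something like `@test agrees_with_cpu(X_gpu, X_cpu_reference)`. note that `isapprox` on arrays is a norm-wise comparison, not element-wise, so the tolerance may need to grow with problem size.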
