spectraplex GPU eigendecomposition #52

@samtalki

The spectraplex LMO is eigen!(Symmetric(...)) over a reshape of an n²-vector to n×n,
which is O(n³). It dominates per-iteration cost on PSD-cone problems past n≈100 (75–80% of
per-iteration time for trace regression at n=200; see examples/bench_spectraplex_apple.jl).
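
For readers outside the codebase, here is a minimal sketch of what that LMO amounts to. The function name `spectraplex_lmo_dense`, the unit trace bound, and the explicit symmetrization are assumptions for illustration, not the package's actual code. The minimizer of ⟨G, X⟩ over {X ⪰ 0, tr X = 1} is v vᵀ, where v is a unit eigenvector for the smallest eigenvalue of G.

```julia
using LinearAlgebra

# Hypothetical helper: g is the gradient stored as a length-n² vector.
function spectraplex_lmo_dense(g::AbstractVector{<:Real}, n::Int)
    G = reshape(g, n, n)
    # Symmetrize into a fresh matrix so eigen! does not overwrite the caller's data.
    S = Symmetric((G .+ G') ./ 2)
    F = eigen!(S)                 # full decomposition, O(n³); eigenvalues ascending
    v = F.vectors[:, 1]           # unit eigenvector of the smallest eigenvalue
    return v * v'                 # rank-one vertex of the unit-trace spectraplex
end
```

Only the first column of `F.vectors` is used, so everything else the full decomposition computes is discarded.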

AppleAccelerate doesn't help: dsyevd / ssyevd lands within noise of
OpenBLAS at n≤500 on Apple silicon. The real lever is GPU eigendecomposition.

paths:

  • MPS eigensolver on Metal. Verify NaN behavior at large n (see JuliaGPU/Metal.jl#381, "MPS: Large matrix multiplications using can contains NaNs").
  • cuSOLVER syevd on CUDA.
  • cross-vendor: Lanczos via KernelAbstractions.jl. Frank-Wolfe (FW) only needs the
    eigenvector of the smallest eigenvalue, so a partial decomposition is much cheaper than a full one.
  • dispatch on n: above some crossover, GPU; below, CPU.
  • crossover benchmark — extend bench_spectraplex_apple.jl.
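
A rough sketch of the Lanczos and dispatch-on-n ideas above, written for CPU arrays only. The names `lanczos_smallest`, `smallest_eigvec`, and `CROSSOVER_N`, the step count `k`, and the crossover value are all hypothetical placeholders. The Lanczos loop uses only matrix-vector products, dot products, and norms, which is the kind of kernel set a KernelAbstractions.jl port would need, but nothing here has been run on a GPU backend.

```julia
using LinearAlgebra, Random

# Hypothetical helper: k Lanczos steps with full reorthogonalization,
# returning the Ritz pair that approximates the smallest eigenpair of symmetric A.
function lanczos_smallest(A::AbstractMatrix{T}; k::Int = 30,
                          rng = MersenneTwister(0)) where {T<:AbstractFloat}
    n = size(A, 1)
    k = min(k, n)
    Q = zeros(T, n, k)               # Lanczos basis vectors as columns
    α = zeros(T, k)                  # diagonal of the tridiagonal matrix
    β = zeros(T, max(k - 1, 0))      # off-diagonal of the tridiagonal matrix
    q = randn(rng, T, n)
    q ./= norm(q)
    m = k                            # number of basis vectors actually built
    for j in 1:k
        Q[:, j] = q
        w = A * q
        α[j] = dot(q, w)
        Qj = view(Q, :, 1:j)
        w -= Qj * (Qj' * w)          # orthogonalize against all previous vectors
        j == k && break
        b = norm(w)
        if b <= 100 * eps(T) * max(one(T), abs(α[j]))
            m = j                    # invariant subspace reached, stop early
            break
        end
        β[j] = b
        q = w / b
    end
    F = eigen(SymTridiagonal(α[1:m], β[1:m-1]))
    i = argmin(F.values)
    v = Q[:, 1:m] * F.vectors[:, i]
    return F.values[i], v ./ norm(v)
end

# Placeholder value, NOT measured; the real crossover must come from the benchmark.
const CROSSOVER_N = 512

# Below the crossover use dense LAPACK (smallest pair only); above it use the
# iterative path, which stands in for the eventual GPU solver in this sketch.
function smallest_eigvec(G::AbstractMatrix{<:AbstractFloat}; crossover::Int = CROSSOVER_N)
    n = size(G, 1)
    if n < crossover
        F = eigen(Symmetric(G), 1:1)
        return F.values[1], F.vectors[:, 1]
    else
        return lanczos_smallest(G)
    end
end
```

The Ritz pair from `lanczos_smallest` is approximate when k < n, so a real implementation would likely want a residual-based stopping rule rather than a fixed step count.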
