| 1 | +# PyDiffGame benchmark study — where a differential game actually helps |
| 2 | + |
| 3 | +This directory is an honest, reproducible head-to-head between **PyDiffGame** and |
| 4 | +the **[python-control](https://python-control.readthedocs.io/)** package across a |
| 5 | +catalogue of standard control systems (carts, vehicles, aircraft, drones and |
| 6 | +flexible structures), with rendered GIFs and a formal verification of *where* |
| 7 | +the differential-game approach beats classical optimal control — and where it |
| 8 | +honestly does not. |
| 9 | + |
| 10 | +## TL;DR (the honest scientific bottom line) |
| 11 | + |
| 12 | +1. **On a single shared quadratic cost, a differential game cannot beat a |
| 13 | + centralized LQR; at best it ties it.** This is not a tuning failure but |
| 14 | + theory: the centralized LQR is the optimum of that cost, and a Nash game is a |
| 15 | + *constrained* (decomposed) design, so its cost is `>= LQR` (price of anarchy), |
| 16 | + with equality when the decomposition is lossless. We verify the *lossless* |
| 17 | + case directly: on the masses-on-springs system PyDiffGame's modal game |
| 18 | + reproduces the python-control LQR **to ~5e-12 on every metric**, disturbances |
| 19 | + included. No boost — and we say so. |
| 20 | + |
| 21 | +2. **The real, formally verifiable win is robustness.** Robust (H-infinity) |
| 22 | + control *is* a differential game — the saddle point of a controller-vs- |
| 23 | + adversarial-disturbance zero-sum game. PyDiffGame now ships this as |
| 24 | + `ContinuousHInfinityControl`, and it **provably reduces the worst-case |
| 25 | + disturbance gain** an LQR leaves on the table — at a documented nominal-cost |
| 26 | + price, and only when the plant has worst-case gain to recover. |
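
For reference, the zero-sum game behind item 2 can be written in the textbook full-information state-feedback form. The disturbance input matrix `E`, the disturbance `w`, the gain bound `γ` and the Riccati solution `X` are notation assumed here for illustration, not names taken from PyDiffGame.

```latex
\dot{x} = A x + B u + E w, \qquad
J_\gamma(u, w) = \int_0^\infty \left( x^\top Q x + u^\top R u - \gamma^2 w^\top w \right) dt

\min_u \max_w J_\gamma
\;\Longrightarrow\;
A^\top X + X A + Q - X \left( B R^{-1} B^\top - \gamma^{-2} E E^\top \right) X = 0

u^\star = -R^{-1} B^\top X x, \qquad w^\star = \gamma^{-2} E^\top X x
```

As `γ → ∞` the `E Eᵀ` term vanishes and the equation reduces to the LQR Riccati equation, so the LQR is the limiting member of this family. When a stabilizing `X ⪰ 0` exists for a finite `γ`, the closed-loop gain from `w` to `z` is at most `γ`.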
| 27 | + |
| 28 | +## What is measured |
| 29 | + |
| 30 | +For every system we design two state-feedback controllers on the **same** |
| 31 | +weights `(Q, R)`: |
| 32 | + |
| 33 | +| controller | what it optimizes | |
| 34 | +| --- | --- | |
| 35 | +| `control.lqr` (python-control) | nominal cost (no disturbance) | |
| 36 | +| `PyDiffGame.ContinuousHInfinityControl` | worst-case disturbance gain (the game) | |
| 37 | + |
| 38 | +and report, on the same closed loop: |
| 39 | + |
| 40 | +- **`‖G_zw‖∞`** — the closed-loop worst-case L2 gain from the disturbance to the |
| 41 | + weighted performance output `z = [Q^{1/2}x; R^{1/2}u]` (the formal robustness |
| 42 | + metric; lower is more robust). Computed slycot-free by a refined frequency |
| 43 | + sweep. |
| 44 | +- **time-domain peak** of the output under the single worst-case sinusoidal |
| 45 | + disturbance (at the LQR's most vulnerable frequency). |
| 46 | +- **nominal LQ cost penalty** — how much nominal performance the robust design |
| 47 | + gives up (always `>= 0`, since the LQR is the nominal optimum; this is the |
| 48 | + *price* of robustness, reported honestly alongside the gain). |
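
The first metric can be approximated with NumPy alone. The sketch below is a minimal illustration of a refined frequency sweep, not the benchmark's own implementation: the function names, grid sizes and the test plant are all assumptions. For a state-feedback gain `K` with `u = -Kx`, one would pass the closed-loop matrix `A - B K`, the disturbance input matrix, and the output matrix stacking `Q^{1/2}` on top of `-R^{1/2} K`.

```python
import numpy as np


def sigma_max(A, B, C, D, w):
    """Largest singular value of G(jw) = C (jwI - A)^-1 B + D."""
    n = A.shape[0]
    G = C @ np.linalg.solve(1j * w * np.eye(n) - A, B) + D
    return np.linalg.svd(G, compute_uv=False)[0]


def hinf_norm_sweep(A, B, C, D, w_min=1e-3, w_max=1e3, n_grid=400, n_refine=4):
    """Approximate the H-infinity norm of a stable system.

    A coarse logarithmic sweep locates the peak; each refinement re-sweeps
    the interval between the neighbours of the current peak sample.
    Returns (peak gain, frequency at which it was found).
    """
    lo, hi = np.log10(w_min), np.log10(w_max)
    best_g, best_w = sigma_max(A, B, C, D, 0.0), 0.0  # include the DC gain
    for _ in range(n_refine + 1):
        ws = np.logspace(lo, hi, n_grid)
        gains = np.array([sigma_max(A, B, C, D, w) for w in ws])
        k = int(np.argmax(gains))
        if gains[k] > best_g:
            best_g, best_w = gains[k], ws[k]
        # Zoom into the bracket around the current peak sample.
        lo = np.log10(ws[max(k - 1, 0)])
        hi = np.log10(ws[min(k + 1, n_grid - 1)])
    return best_g, best_w


# Illustrative lightly damped plant: G(s) = 1 / (s^2 + 2*zeta*s + 1).
zeta = 0.1
A = np.array([[0.0, 1.0], [-1.0, -2.0 * zeta]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])

peak, w_peak = hinf_norm_sweep(A, B, C, D)
# Analytic resonant peak for comparison: 1 / (2*zeta*sqrt(1 - zeta^2)).
analytic_peak = 1.0 / (2.0 * zeta * np.sqrt(1.0 - zeta**2))
print(peak, analytic_peak, w_peak)
```

A sweep like this only gives a lower bound on the true norm, so the grid has to be fine enough to catch narrow resonant peaks; the refinement step is what keeps the cost manageable.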
| 49 | + |
| 50 | +## Nominal regime (two honest outcomes) |
| 51 | + |
| 52 | +On the shared cost a game never beats the centralized LQR; it either ties it or |
| 53 | +pays a small price. Both happen, and both are shown: |
| 54 | + |
| 55 | +- **Lossless tie** — `run_masses.py`: with the *modal* decomposition the |
| 56 | + objectives decouple, so PyDiffGame's Nash game reproduces the LQR to ~5e-12 on |
| 57 | + every metric (`results/masses_pdg_vs_lqr.gif`). |
| 58 | +- **Price of anarchy** — `run_anarchy.py`: two carts with *competing* |
| 59 | + per-cart objectives and one actuator each. The decentralized Nash equilibrium |
| 60 | + costs **+0.33%** on the joint objective vs the centralized LQR, while using |
| 61 | + *less* control energy and showing less overshoot (`results/coupled_carts_anarchy.gif`). The |
| 62 | + tiny price buys decentralization and compositionality. |
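
A relative cost penalty of this kind can be evaluated for any stabilizing gain by solving one Lyapunov equation. The sketch below assumes SciPy is available and uses a double integrator with an arbitrary gain `K_other` as a hypothetical stand-in for a decomposed design. It is not the Nash equilibrium computed by `run_anarchy.py`, and `lq_cost_matrix` is an illustrative helper, not a PyDiffGame function.

```python
import numpy as np
from scipy.linalg import solve_continuous_are, solve_continuous_lyapunov


def lq_cost_matrix(A, B, Q, R, K):
    """Return P with x0' P x0 = integral of (x'Qx + u'Ru) dt under u = -K x.

    Requires A - B K to be stable; otherwise the cost is infinite.
    """
    A_cl = A - B @ K
    if np.max(np.linalg.eigvals(A_cl).real) >= 0:
        raise ValueError("gain K does not stabilize the plant")
    # Lyapunov equation: A_cl' P + P A_cl = -(Q + K' R K)
    return solve_continuous_lyapunov(A_cl.T, -(Q + K.T @ R @ K))


# Illustrative plant (double integrator), not one of the benchmark systems.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

# Centralized LQR gain from the Riccati solution.
X = solve_continuous_are(A, B, Q, R)
K_lqr = np.linalg.solve(R, B.T @ X)

# Any other stabilizing gain (hypothetical stand-in for a decomposed design).
K_other = np.array([[2.0, 2.0]])

P_lqr = lq_cost_matrix(A, B, Q, R, K_lqr)
P_other = lq_cost_matrix(A, B, Q, R, K_other)

# The cost averaged over unit-covariance initial states is trace(P).
penalty_pct = 100.0 * (np.trace(P_other) - np.trace(P_lqr)) / np.trace(P_lqr)
print(f"relative cost penalty: {penalty_pct:.2f}%")
```

Because `P_other - P_lqr` is positive semidefinite for every stabilizing gain, the penalty is never negative, which is the "at best it ties" statement in numerical form.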
| 63 | + |
| 64 | +## Results (robustness) |
| 65 | + |
| 66 | +See [`results/ROBUSTNESS_REPORT.md`](results/ROBUSTNESS_REPORT.md) for the full |
| 67 | +auto-generated table, and `results/robust_<system>.gif` for each animation |
| 68 | +(left: time response to the worst-case disturbance; right: the `σmax(ω)` curves |
| 69 | +whose peak *is* `‖G_zw‖∞`). |
| 70 | + |
| 71 | +<p align="center"> |
| 72 | + <img alt="H-infinity game vs LQR under worst-case disturbance (inverted pendulum)" src="results/robust_inverted_pendulum.gif" width="820"/> |
| 73 | +</p> |
| 74 | + |
| 75 | +*Inverted pendulum: the LQR leaves a sharp resonant peak (`σmax ≈ 1.92`) that the |
| 76 | +H∞ game flattens to `≈ 1.25` (−35% worst-case gain), roughly halving the |
| 77 | +pendulum-angle response to the worst-case disturbance — at a documented nominal-cost |
| 78 | +price. The honest high-frequency trade-off (H∞ slightly higher past the peak) is |
| 79 | +visible too.* |
| 80 | + |
| 81 | +Headline (honest): all 13 systems show a *relative* worst-case-gain reduction, |
| 82 | +but only those with a **non-negligible absolute gain** matter in practice: |
| 83 | +**10/13 are practically significant** (inverted pendulum 35%, PVTOL/quadrotor |
| 84 | +26%, seismic building 24%, flexible two-mass / cart / DC motor ~22%, gantry |
| 85 | +crane / active suspension / aircraft ~15%, ...), exactly the lightly damped and |
| 86 | +unstable plants where the LQR leaves a sharp resonant peak. The two cars |
| 87 | +(cruise, bicycle) have an absolute worst-case gain of ~0, because the disturbance is |
| 88 | +already rejected by *any* reasonable controller, so their relative |
| 89 | +reductions, though real, are **not practically meaningful**, and we say so rather than headline |
| 90 | +a "13/13 win". (The bicycle's earlier apparent "tie" turned out to be a `γ*` |
| 91 | +numerical artifact, caught by the methodology review and fixed; the corrected |
| 92 | +result is a real-but-immaterial relative reduction.) |
| 93 | + |
| 94 | +## Reproduce |
| 95 | + |
| 96 | +```bash |
| 97 | +uv run --extra dev python -m benchmarks.run_masses # nominal: game == LQR (lossless) |
| 98 | +uv run --extra dev python -m benchmarks.robust_compare # one robust comparison + GIF |
| 99 | +uv run --extra dev python -m benchmarks.run_robust_suite # the full 13-system suite + report |
| 100 | +``` |
| 101 | + |
| 102 | +## Rigor |
| 103 | + |
| 104 | +The catalogue models were verified entry-for-entry against the controls / |
| 105 | +vehicle-dynamics / flight-dynamics literature, and the comparison methodology |
| 106 | +(GARE solve, `γ*` search, the worst-case-gain metric, the nominal-cost |
| 107 | +accounting, and the fairness of scoring both controllers on the same output) was |
| 108 | +adversarially reviewed; the review hardened PyDiffGame's `γ*` search against a |
| 109 | +boundary numerical instability (now regression-tested). |
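
To make the `γ*` search concrete, here is a minimal sketch of one common approach: bisect on `γ`, and at each step test whether the game Riccati equation has a stabilizing positive-semidefinite solution, found from the stable invariant subspace of the Hamiltonian matrix. This is a textbook construction under assumed names (`solve_gare`, `gamma_star`, the disturbance input matrix `E`), not PyDiffGame's implementation. The tolerance checks stand in for the kind of boundary guard mentioned above, since Hamiltonian eigenvalues approach the imaginary axis as `γ` nears `γ*`.

```python
import numpy as np


def solve_gare(A, B, E, Q, R, gamma, tol=1e-9):
    """Stabilizing PSD solution X of
        A'X + XA + Q - X (B R^-1 B' - gamma^-2 E E') X = 0,
    or None when no such solution is found at this gamma."""
    n = A.shape[0]
    S = B @ np.linalg.solve(R, B.T) - (E @ E.T) / gamma**2
    H = np.block([[A, -S], [-Q, -A.T]])  # Hamiltonian matrix
    vals, vecs = np.linalg.eig(H)
    stable = vals.real < -tol
    if np.count_nonzero(stable) != n:
        return None  # eigenvalues on (or too near) the imaginary axis
    X1, X2 = vecs[:n, stable], vecs[n:, stable]
    if np.linalg.cond(X1) > 1e12:
        return None  # stable subspace is not a graph subspace
    X = np.real(X2 @ np.linalg.inv(X1))
    X = 0.5 * (X + X.T)  # symmetrize away rounding error
    if np.min(np.linalg.eigvalsh(X)) < -tol:
        return None
    return X


def gamma_star(A, B, E, Q, R, rel_tol=1e-6, max_iter=200):
    """Smallest gamma (to rel_tol) for which solve_gare succeeds."""
    lo, hi = 0.0, 1.0
    while solve_gare(A, B, E, Q, R, hi) is None:  # grow until feasible
        lo, hi = hi, 2.0 * hi
        if hi > 1e8:
            raise RuntimeError("no feasible gamma found")
    for _ in range(max_iter):
        if hi - lo <= rel_tol * hi:
            break
        mid = 0.5 * (lo + hi)
        if solve_gare(A, B, E, Q, R, mid) is None:
            lo = mid
        else:
            hi = mid
    return hi  # always a feasible value


# Illustrative scalar integrator x' = u + w with Q = R = 1.
# For this plant the equation reads 1 - (1 - gamma^-2) X^2 = 0, so gamma* = 1.
A = np.array([[0.0]])
B = np.array([[1.0]])
E = np.array([[1.0]])
Q = np.array([[1.0]])
R = np.array([[1.0]])

g_star = gamma_star(A, B, E, Q, R)
X_at_2 = solve_gare(A, B, E, Q, R, gamma=2.0)
print(g_star, X_at_2)
```

Returning the last feasible `hi` rather than the midpoint keeps the reported value on the side of the boundary where a controller actually exists.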