[Feature] Major Updates to SPT Web by MarcelMatsal · Pull Request #428 · galilai-group/stable-pretraining

MarcelMatsal · 2026-06-05T14:20:38Z

Description

This PR adds a large batch of web viewer improvements.

State persistence & URL sharing

All interactive state (visible runs, filters, grouping, sort, axes, theme, etc.) is serialised to localStorage on every mutation and restored on page load. Visible run IDs and the active tab are also reflected in the URL fragment (#runs=...&tab=...) via history.replaceState, so shared URLs land on the sender's exact view.

Chart value annotations

Each uPlot chart panel now renders a compact annotation table below it showing the last and best logged value per visible series. A lowerIsBetter heuristic handles loss/error/perplexity metrics automatically. The best-value row is bolded. Clicking a row toggles that series' visibility.
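The last/best selection above can be sketched roughly as follows. This is a Python illustration only: the viewer itself is JavaScript, and the substring matching, the function names, and the choice to skip non-finite values are assumptions rather than the PR's actual code.

```python
import math
from typing import List, Optional, Tuple

# Name fragments taken from the description (loss / error / perplexity).
LOWER_IS_BETTER_HINTS = ("loss", "error", "perplexity")


def lower_is_better(metric_name: str) -> bool:
    """Guess the direction of a metric from its name (hypothetical rule)."""
    name = metric_name.lower()
    return any(hint in name for hint in LOWER_IS_BETTER_HINTS)


def last_and_best(
    metric_name: str, values: List[float]
) -> Tuple[Optional[float], Optional[float]]:
    """Return (last, best) over the finite values of one series."""
    finite = [v for v in values if math.isfinite(v)]
    if not finite:
        return None, None
    best = min(finite) if lower_is_better(metric_name) else max(finite)
    return finite[-1], best
```

For example, `last_and_best("train/loss", [1.0, 0.4, 0.6])` gives `(0.6, 0.4)`, while an accuracy-style metric would report its maximum as the best value.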

Runs table view

A fourth Table tab renders all visible runs as a sortable, filterable, horizontally-scrollable table. Columns are the union of all hparams and summary keys. Cells that differ across runs are highlighted amber; uniform cells are dimmed. Headers and the run-ID column are sticky. Column sort cycles asc → desc → none. A debounced column search box with a count badge is persisted to localStorage. Row clicks open the existing detail modal.

Run display names & inline editing

Sidebar rows now show display_name as the primary label with the raw run_id dimmed below it. Double-clicking the name opens an inline editor that commits via PATCH /api/run-meta with optimistic rollback on failure. The table tab was also fixed to use display_name.

Notes editing

The run detail modal gained a Notes textarea that auto-resizes to content. Edits commit on blur or Ctrl+Enter via the same PATCH /api/run-meta endpoint.

Live log tail

The .out and .err log tabs auto-refresh every 10 s for running/stale runs, showing a pulsing live badge. A pause button and manual refresh button are provided. Scroll position is preserved across refreshes.

Run elapsed time / duration

Backend: Logger.finalize() now records ended_at and exposes it through the sidecar and scan serialiser. Frontend: each run row shows a formatted elapsed/total duration, updated on a 60 s tick. The detail modal shows started, ended, and duration fields.

Heartbeat staleness indicator

Runs whose heartbeat file mtime is more than 5 minutes old are treated as stale. Stale rows show an amber ⚠ dot with a tooltip. The status stat chip, filter/group-by/sort consumers, and the figures-tab overview (activity timeline, stat card, status bars) all reflect the stale state via a centralised effectiveStatus() helper.
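As a rough illustration of the staleness rule, here is a Python sketch. The real effectiveStatus() helper lives in the frontend; the function signature, the status strings, and the handling of a missing heartbeat file are assumptions made for this example.

```python
import os
import time
from pathlib import Path
from typing import Optional

STALE_AFTER_SECONDS = 5 * 60  # threshold from the description: 5 minutes


def effective_status(
    status: str, heartbeat_path: Path, now: Optional[float] = None
) -> str:
    """Return "stale" for a run reported as running whose heartbeat is too old."""
    if status != "running":
        # Assumption: only runs that claim to be running can be stale.
        return status
    now = time.time() if now is None else now
    try:
        mtime = heartbeat_path.stat().st_mtime
    except OSError:
        # Assumption: with no readable heartbeat, keep the reported status.
        return status
    return "stale" if now - mtime > STALE_AFTER_SECONDS else status
```

Routing every consumer (filters, grouping, sorting, overview charts) through one helper like this keeps the threshold in a single place.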

Scatter plot

A scatter plot is rendered at the bottom of the figures tab when two or more runs are visible. X and Y axes are user-selectable from any numeric hparams.* or summary.* key across visible runs. Each run is a single coloured dot; axis selection and visibility are persisted to localStorage.

Metric CSV export

A download CSV button exports all visible runs × filtered metrics × steps from in-memory state (no server round-trip) as spt_metrics.csv with columns run_id, run_name, metric, step, epoch, value.

Zoom reset on charts

Each chart panel has a ⤢ reset button that stays hidden until a drag-zoom is applied. Clicking it resets all charts simultaneously (they share a uPlot sync key). The drag-selection rectangle is now visually styled.

Sidebar resize + virtual scroll

A drag handle lets the user resize the sidebar between 160 and 600 px (persisted to localStorage). The run list scrolls independently of the fixed sidebar controls. Lists exceeding 300 ungrouped runs switch to a virtual scroll renderer using spacer divs and a scroll listener, keeping DOM size bounded.

Keyboard shortcuts

Six shortcuts: / (focus metric search), r (focus run search), t (cycle tabs), Shift+A (select all), Shift+C (clear selection), ? (toggle help popover). All are suppressed inside text inputs. A styled help popover lists all shortcuts with <kbd> key chips.

Tag editing from UI

Tags in the run detail modal are now interactive pills with × remove buttons. An expanding inline input with datalist autocomplete (populated from other runs' tags) adds a tag on Enter or comma. All mutations go through PATCH /api/run-meta with optimistic rollback. No backend changes were needed.

Combine metrics into one chart

Each chart panel title bar has a + toggle button. Selecting two or more metrics reveals a Combine (N) button in the figures toolbar; clicking it creates a persistent combined chart panel pinned above the metric grid. Combined panels show all visible runs' series for all listed metrics in one uPlot instance, with a full annotation table. Combined chart state persists to localStorage.

Per-chart PNG export

Each chart panel has a ⬇ download button (revealed on hover). Clicking it composites the uPlot canvas onto an offscreen canvas with a padded border and bold metric-name title, filled with the current theme's colours. The result is downloaded as a full-resolution (HiDPI-aware) PNG named after the metric.

Per-metric direction override

Each chart panel title bar has a ↓ min / ↑ max toggle button. On first render it reflects the lowerIsBetter heuristic; clicking flips the direction and stores an explicit override in state.metricDir. The button turns accent colour when an override is active. Overrides persist to localStorage and immediately re-highlight the best-value annotation row.

Hide same-value columns in table

A hide same toggle in the table controls bar filters the rendered columns to only those whose values differ across visible runs. The column-count badge updates to show "N diff / M cols" when active. State persists to localStorage as tableHideSame.

Light mode visual refinement

The light-mode palette was replaced with warm off-white neutrals (--bg: #f0ede9, --surface: #f9f8f6, --surface-2: #f3f1ee) and the accent shifted to indigo (#4f46e5) for stronger contrast. Run count text and stale-run chip backgrounds were hardened for both themes.

File-deletion safety (TOCTOU fixes + deletion toast)

Three check-then-open races in scan.py (metrics_json, metrics_stream, log_content) were replaced with direct try/except OSError opens, eliminating windows where file deletion between the check and the open would crash a server thread. On the frontend, if a run that was selected or open in the detail modal disappears from disk, a dismissible amber toast notification is shown automatically.
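The check-then-open pattern and its replacement can be sketched as below. The function names are hypothetical and this is not the code in scan.py; it only contrasts the two shapes under the assumption that a missing file should yield "no content".

```python
from pathlib import Path
from typing import Optional


def read_log_racy(path: Path) -> Optional[str]:
    # Check-then-open: the file can be deleted between exists() and open(),
    # in which case open() raises FileNotFoundError in the caller.
    if not path.exists():
        return None
    with path.open("r", encoding="utf-8") as fh:
        return fh.read()


def read_log_safe(path: Path) -> Optional[str]:
    # Open directly and treat any OSError (missing file, permissions, ...)
    # as "no content", so there is no window between a check and the open.
    try:
        with path.open("r", encoding="utf-8") as fh:
            return fh.read()
    except OSError:
        return None
```

The second form has one failure path instead of two, which is why it cannot crash on a deletion that lands between the check and the open.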

Rename / edit robustness

Four edge-case bugs fixed: (1) renderRunList now calls .blur() on any active inline rename input before tearing out the DOM, so in-flight edits are never silently lost; (2) do_PATCH catches OSError from write_sidecar and returns a 500 JSON response instead of dropping the TCP connection; (3) patch_run_meta validates field types (display_name/notes → str | None, tags → list, archived → bool); (4) rename and notes failure handlers now call showToast(...) with the server error message and look up a fresh run reference from state.runs instead of using a potentially stale closure.
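The field-type check in item (3) might look roughly like this. It is a sketch under assumptions: the helper name, the error strings, and the rejection of unknown fields are illustrative and not taken from patch_run_meta.

```python
from typing import Any, Dict, Optional

# Expected types per field, as listed in the description.
_FIELD_TYPES = {
    "display_name": (str, type(None)),
    "notes": (str, type(None)),
    "tags": (list,),
    "archived": (bool,),
}


def validate_patch(fields: Dict[str, Any]) -> Optional[str]:
    """Return an error message for the first invalid field, or None if all pass."""
    for key, value in fields.items():
        allowed = _FIELD_TYPES.get(key)
        if allowed is None:
            # Assumption: unknown keys are rejected rather than ignored.
            return f"unknown field: {key}"
        if not isinstance(value, allowed):
            return f"invalid type for {key}: {type(value).__name__}"
    return None
```

Note that `archived` is checked against `bool` only, so an integer such as `1` is rejected even though `bool` is a subclass of `int`.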

Checklist

I have read the Contributing document.
The documentation is up-to-date with the changes I made (check build artifacts).
All tests passed, and additional code has been covered with new tests.
I have added the PR to the RELEASES.rst file.

RandallBalestriero · 2026-06-16T23:21:02Z

Thank you @MarcelMatsal! Super nice, just some last tiny updates:

Tests (stable_pretraining/tests/unit/test_web.py)

test_cache_hit_avoids_reparse patches builtins.open, but the code reads via mpath.open (Path.open), which doesn't route through builtins.open, so the AssertionError side-effect can never fire and the test passes even if the cache is fully broken. Patch pathlib.Path.open instead (like the OSError tests already do), or assert on a parse counter.

Add a test for the do_PATCH body-too-large branch (Content-Length > 64 KiB → 400 "request body too large"), currently uncovered.

Add a test for the non-dict JSON body branch (e.g. send [1,2] or 42 → 400 "body must be a JSON object"), currently uncovered.

Add a test for run_id present but non-string / empty ({"run_id": 123} / {"run_id": ""} → 400); only missing-run_id is covered today.

Assert persistence of archived and notes by re-reading the sidecar (we have this for display_name and tags, but not these two; a validate-then-drop-on-write bug would slip through).

Assert value round-trip for ended_at and heartbeat_at in _serialize, not just key-presence: set ended_at in the sidecar / create a heartbeat file and check the serialized values (this also exercises the new except OSError guard around heartbeat_mtime).

Add a test that patch_run_meta publishes the SSE update (_publish("update", {"changed":[run_id],"removed":[]})) on success, and does not publish on the unknown-run / False path.
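The first test item (patching pathlib.Path.open rather than builtins.open) could be approached along these lines. The cached reader below is a hypothetical stand-in, not the project's code; only the patch target is the point of the sketch.

```python
import json
import tempfile
from pathlib import Path
from unittest import mock

_cache = {}  # hypothetical cache keyed by (path, mtime_ns, size)


def load_metrics(mpath: Path):
    """Stand-in for the cached reader: re-parse only when the file changed."""
    st = mpath.stat()
    key = (str(mpath), st.st_mtime_ns, st.st_size)
    if key not in _cache:
        with mpath.open("r", encoding="utf-8") as fh:  # goes through Path.open
            _cache[key] = json.load(fh)
    return _cache[key]


def check_cache_hit_avoids_reparse():
    with tempfile.TemporaryDirectory() as d:
        mpath = Path(d) / "metrics.json"
        mpath.write_text('{"loss": [1.0, 0.5]}', encoding="utf-8")
        first = load_metrics(mpath)  # cold read populates the cache
        # Patching pathlib.Path.open makes any second file read fail loudly,
        # so the call below only succeeds if the cache is actually used.
        with mock.patch(
            "pathlib.Path.open",
            side_effect=AssertionError("file re-read on cache hit"),
        ):
            second = load_metrics(mpath)
        return first, second
```

If the cache were broken, the second `load_metrics` call would reach `mpath.open` and raise the AssertionError, which is the failure mode the original builtins.open patch could never trigger.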

Frontend (stable_pretraining/web/assets/app.js)

exportMetricsCSV under-escapes: run_id is never quoted, and values containing ", newlines, or \r aren't handled (it only quotes on ,). Use a proper CSV-quote helper (quote when the value contains ,/"/newline/CR, escape inner " by doubling), and consider neutralizing leading =/+/-/@ to avoid spreadsheet formula injection.
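The quoting rule requested here is language-independent; a Python sketch of it follows. The helper names are hypothetical, and prefixing a single quote is just one common way to neutralize formula-like cells, assumed here for illustration. The actual fix belongs in app.js.

```python
import re

_NEEDS_QUOTES = re.compile(r'[",\r\n]')
_FORMULA_PREFIXES = ("=", "+", "-", "@")


def csv_cell(value, neutralize_formulas: bool = True) -> str:
    """Render one CSV cell with quoting and optional formula neutralization."""
    text = "" if value is None else str(value)
    # Numbers are left alone so that negative values such as -0.5 stay numeric.
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    if neutralize_formulas and not is_number and text.startswith(_FORMULA_PREFIXES):
        text = "'" + text
    # Quote when the cell contains a comma, double quote, CR, or LF,
    # and escape inner double quotes by doubling them.
    if _NEEDS_QUOTES.search(text):
        text = '"' + text.replace('"', '""') + '"'
    return text


def csv_row(values) -> str:
    """Join already-escaped cells into one CSV line (no trailing newline)."""
    return ",".join(csv_cell(v) for v in values)
```

Applying the same helper to every column, including run_id, avoids the case where one field is assumed to be safe and is left unquoted.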

MarcelMatsal · 2026-06-18T22:42:31Z

Done @RandallBalestriero !

MarcelMatsal and others added 16 commits May 19, 2026 09:05

6f47c0d Merge pull request #3 from galilai-group/main
    Updating Main Branch
21780f9 implemented state persistence
fa2423a Merge branch 'galilai-group:main' into main
2ddada9 Merge pull request #4 from MarcelMatsal/main
    Merging Main Updates into this branch
7df08bd many new features for spt web, more coming soon
985605a making it possible to update the names internally and have persistence
f708f62 now you can add notes per note that persist throughout, in addition, the text box changes in size based on the amount of notes
6009554 added a small live display to logs outputting, in addition, automatic refreshing for those .out and .err
5bcbfc1 added duration for the runs
ae86b50 added functionality to detect stale runs, that would be shown as running but are not actually
b40300f added scatterplot to compare different metrics across runs
fc11d27 csv download functionality added
e235294 added resizing functionality for the tables allowing to easily reset zooming in
dd3d97a added resizeable sidebar
9095392 added keyboard shortcuts
4d1b488 added functionality to add new tags

MarcelMatsal requested a review from RandallBalestriero as a code owner June 5, 2026 14:20

c95b0cf fixing format

MarcelMatsal changed the title ~~[Feature] Major Updates to SPT Web~~ [WIP][Feature] Major Updates to SPT Web Jun 5, 2026

MarcelMatsal and others added 11 commits June 5, 2026 11:07

b0b69c9 Merge branch 'main' into spt_web_update
bd06035 made it so that removing a zoom, zooms out of all graphs
2599ffe added button to table that will hide all the things that are the same
e9dffb4 functionality to download tables
fef791f functionality to combine different metrics into a single graph for a run
5eff3a9 update to light mode, making it more pleasing to the eyes
58faa18 feature to be able to choose if min or max is better for a graph
d4a8c33 added graceful failure when files are deleted
0769506 added more graceful error catching to renaming
7dfc150 updates for agents to understand how to interact with spt web
e55a34c added lots of unit tests

MarcelMatsal changed the title ~~[WIP][Feature] Major Updates to SPT Web~~ [Feature] Major Updates to SPT Web Jun 7, 2026

MarcelMatsal and others added 6 commits June 7, 2026 14:55

2714a7d fixing precommit errors
155e9ef fixed final precommit errors from the jax implementation
9daba1a small updates to pass all tests
29c0474 fixing failing integration tests
d271b05 Revert "fixing failing integration tests"
    This reverts commit 29c0474.
4a5bb07 Merge branch 'main' into spt_web_update

MarcelMatsal added 3 commits June 18, 2026 18:25

1020dd7 first round of additional tests and fixes
d70d5d0 Merge branch 'spt_web_update' of https://github.com/MarcelMatsal/stable-pretraining into spt_web_update
be2201c new safety fix to frontend

RandallBalestriero and others added 2 commits June 19, 2026 13:33

ff9ddf0 Merge branch 'main' into spt_web_update
f788bdd Merge branch 'main' into spt_web_update

RandallBalestriero approved these changes Jun 20, 2026

RandallBalestriero merged commit 35263d4 into galilai-group:main Jun 20, 2026
7 checks passed

RandallBalestriero merged 39 commits into galilai-group:main from MarcelMatsal:spt_web_update
