feat(trajectory_planning): add BEV-only multi-sample trajectory scorer and shifted inference schedule by FLagbusted · Pull Request #76 · autowarefoundation/auto_e2e

FLagbusted · 2026-06-18T20:50:39Z

PR title (Conventional Commits, per CONTRIBUTING.md semantic-pull-request check)

feat(trajectory_planning): add BEV-only multi-sample trajectory scorer and shifted inference schedule

PR body

Summary

Adds a Phase 1, training-free upgrade to the flow-matching driving policy,
scoped to exactly what was discussed in the 17/06 WG meeting: get
multi-sample scoring working against the BEV/map data we already have,
without yet building GoalFlow's full goal-point vocabulary and learned
drivable-area compliance (DAC) classifier (tracked separately as Phase 2).

Implements #75.

What changed

New: Model/model_components/trajectory_planning/trajectory_scorer.py
adds TrajectoryComplianceScorer, a zero-trainable-parameter wrapper that
draws K samples from any BasePlanner and re-ranks them by drivable-area
compliance (raw rasterized map pixel lookup) plus kinematic comfort
(acceleration/curvature bound violations).
Modified: Model/model_components/trajectory_planning/flow_matching_planner.py
adds an optional timestep_schedule="shifted" inference mode
implementing GoalFlow's (alpha·t)/(1+(alpha-1)·t) timestep warp.
The default ("uniform") is byte-for-byte identical to current behaviour;
this change is purely additive.
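
For reference, the timestep warp quoted above can be sketched in a few lines of plain Python. This is only an illustration of the formula, not the code in flow_matching_planner.py; the function name shifted_timesteps and the choice to return num_steps + 1 grid points on [0, 1] are assumptions made for the example.

```python
def shifted_timesteps(num_steps: int, alpha: float) -> list[float]:
    """Warp a uniform grid on [0, 1] with t' = (alpha*t) / (1 + (alpha-1)*t).

    Hypothetical helper for illustration; not the planner's actual API.
    """
    if num_steps < 1:
        raise ValueError("num_steps must be >= 1")
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    # Uniform grid: 0, 1/N, 2/N, ..., 1
    uniform = [i / num_steps for i in range(num_steps + 1)]
    # Apply the warp point by point; alpha == 1 leaves every point unchanged.
    return [(alpha * t) / (1.0 + (alpha - 1.0) * t) for t in uniform]
```

With alpha = 1.0 the warp is the identity, so the grid equals the uniform one. With alpha = 3.0 and four steps the grid is approximately 0, 0.5, 0.75, 0.9, 1: the endpoints stay fixed and the interior points move toward 1.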

Why

FlowMatchingPlanner (#40) generates one trajectory per forward call with
no scoring or compliance check. GoalFlow (arXiv:2503.05689, discussed this
week) shows that multi-sample generation plus re-ranking meaningfully
improves both safety compliance and final score (PDMS), but its full method
needs a goal-point vocabulary and a learned BEV semantic segmentation head
that neither KITScenes nor L2D currently feeds into this repo. This PR takes
only the part of that idea that needs nothing new: the rasterized map image
already produced by RasterizedMapEncoder's input pipeline (#55) is enough to
do a deterministic drivable-area check without training anything.
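
To make the idea of a deterministic drivable-area check concrete, here is a minimal sketch in plain Python. It is not the implementation in trajectory_scorer.py. The pixel convention (ego at the image centre, x forward mapped to decreasing rows, y left mapped to decreasing columns), the meters_per_pixel parameter, the function names, and the choice to count out-of-bounds points as non-drivable are all assumptions for illustration; the real convention still has to be confirmed against the map renderer.

```python
import math


def project_xy_to_pixel(x, y, height, width, meters_per_pixel):
    """Map an ego-frame point (metres) to (row, col).

    ASSUMED convention: ego at the image centre, x forward = up, y left = left.
    """
    row = math.floor(height / 2 - x / meters_per_pixel)
    col = math.floor(width / 2 - y / meters_per_pixel)
    return row, col


def drivable_fraction(trajectory, drivable_mask, meters_per_pixel=0.5):
    """Fraction of waypoints that land on a drivable pixel.

    trajectory: list of (x, y) waypoints in metres, ego frame.
    drivable_mask: 2D list of booleans (True = drivable), indexed [row][col].
    Points outside the raster count as non-drivable (an assumption).
    """
    if not trajectory:
        return 0.0
    height, width = len(drivable_mask), len(drivable_mask[0])
    hits = 0
    for x, y in trajectory:
        row, col = project_xy_to_pixel(x, y, height, width, meters_per_pixel)
        in_bounds = 0 <= row < height and 0 <= col < width
        if in_bounds and drivable_mask[row][col]:
            hits += 1
    return hits / len(trajectory)
```

Because the check is a pure lookup, it needs no learned parameters and gives the same answer for the same trajectory and map every time.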

Code snippet — usage example

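from Model.model_components.trajectory_planning import build_planner
from Model.model_components.trajectory_planning.trajectory_scorer import (
    TrajectoryComplianceScorer, ScorerConfig,
)

# Build the planner as before; the scorer adds no trainable parameters.
planner = build_planner("flow_matching", embed_dim=256, num_timesteps=64,
                        num_signals=2, egomotion_dim=256,
                        visual_history_dim=896)

# Wrap the planner; num_timesteps matches the planner's setting above.
scorer = TrajectoryComplianceScorer(
    planner, num_timesteps=64,
    config=ScorerConfig(selection="nearest", dac_weight=1.0, comfort_weight=0.5),
)

# bev_features, visual_history, egomotion_history and map_input come from the
# existing data pipeline and are not constructed in this snippet.
# Draw 8 candidate samples with a fixed seed and score them.
trajectory, ego_hidden, scores = scorer.sample_and_score(
    bev_features, visual_history, egomotion_history, map_input,
    num_samples=8, seed=0,
)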
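
As a rough illustration of how per-sample scores could be combined, the sketch below computes a simple comfort score from finite-difference accelerations and ranks candidates by a weighted sum. It is not the scorer's actual logic. The function names, the dt and max_accel parameters, and the highest-total selection rule are assumptions, and the rule is not claimed to match selection="nearest". The curvature bound is omitted for brevity. Only the weight names mirror dac_weight and comfort_weight from ScorerConfig.

```python
import math


def comfort_score(trajectory, dt, max_accel):
    """Fraction of interior steps whose acceleration magnitude is within max_accel.

    trajectory: list of (x, y) waypoints in metres, spaced dt seconds apart.
    Acceleration is estimated with a second-order finite difference.
    """
    if len(trajectory) < 3:
        return 1.0  # too short to estimate acceleration; treat as comfortable
    within = 0
    total = 0
    for (x0, y0), (x1, y1), (x2, y2) in zip(trajectory, trajectory[1:], trajectory[2:]):
        ax = (x2 - 2 * x1 + x0) / dt ** 2
        ay = (y2 - 2 * y1 + y0) / dt ** 2
        total += 1
        if math.hypot(ax, ay) <= max_accel:
            within += 1
    return within / total


def rank_samples(dac_scores, comfort_scores, dac_weight=1.0, comfort_weight=0.5):
    """Return (index of the best sample, list of weighted totals).

    ASSUMED rule: pick the sample with the highest weighted sum.
    """
    if not dac_scores or len(dac_scores) != len(comfort_scores):
        raise ValueError("score lists must be non-empty and of equal length")
    totals = [dac_weight * d + comfort_weight * c
              for d, c in zip(dac_scores, comfort_scores)]
    best = max(range(len(totals)), key=totals.__getitem__)
    return best, totals
```

A weighted sum keeps the two criteria independent and easy to tune, at the cost of letting a high comfort score partly offset a drivable-area violation.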

Testing done

Smoke test (tests/test_trajectory_scorer.py): unit tests cover
decode_trajectory_to_xy, project_xy_to_bev_pixel,
drivable_area_compliance, kinematic_comfort_score, and
TrajectoryComplianceScorer.sample_and_score against a fake planner
satisfying the BasePlanner contract. Shapes, out-of-bounds (OOB) point
handling, both selection modes, and the invalid-selection-mode error path
are all verified. Runs on CPU with no trained model or dataset required.
Pending (hardware constraint): the GPU inference test against real
KITScenes data and a trained FlowMatchingPlanner checkpoint cannot be
run for the next few days because the contributor's machine is
unavailable. Will update this PR with ADE/quality numbers once hardware
is back. This does not block code or calibration review, only the
final merge decision.
Pending (calibration review): project_xy_to_bev_pixel has not yet been
verified against the actual KITScenes/L2D renderer pixel convention.
Requesting review from whoever owns the map renderer before merge.
Pending (shifted schedule quality): the quality impact of
timestep_schedule="shifted" on a real validation split has not yet been
measured. Only the alpha=1.0 == uniform algebraic identity and basic
schedule shapes were confirmed locally.

Checklist

Conventional Commit PR title
References the discussion/issue above
Default behaviour of FlowMatchingPlanner.forward() unchanged
(timestep_schedule="uniform" is the default)
Smoke test passes (CPU, no GPU required)
ruff check . passes
Map-pixel calibration confirmed by renderer-owning reviewer
ADE/quality validation on real KITScenes data — pending hardware

FLagbusted force-pushed the feat/trajectory-scorer branch from 216c0bb to a98d436 Compare June 18, 2026 21:12

riita10069 requested a review from m-zain-khawaja June 22, 2026 15:17

feat(trajectory_planning): add BEV-only multi-sample trajectory scorer and shifted inference schedule

c5850c2

Signed-off-by: Flagbusted <justthefourofus@proton.me>

FLagbusted force-pushed the feat/trajectory-scorer branch from a98d436 to c5850c2 Compare June 22, 2026 17:13

FLagbusted wants to merge 1 commit into autowarefoundation:main from FLagbusted:feat/trajectory-scorer
