Add OpenStreetMap SD-map prior (SDTagNet) as a vector map branch by immel-f · Pull Request #89 · autowarefoundation/auto_e2e

immel-f · 2026-06-24T00:16:37Z

Hi, as mentioned in issue #47, I tried to integrate the SDTagNet SD map encoder as a vector map branch. It would be awesome if vector map branches could make it into the model somehow! This is an early draft, and you are very welcome to propose or implement any changes 😄 I checked that the tests still pass as before (3 failed even before this change), and there is a test visualization of the parsed OSM map data (from a random position, which is why it is not on a road):

I tried to make as few changes as possible to the original encoder, but having one map cache that works for both branches required some changes there: the raw OSM XML maps are now cached and converted at load time for the original branch. I have not tested whether the original branch still works with these changes, because I did not find a visualization that would let me check that easily. If I missed a script for that, I can quickly try it.

Changes

Model (`Model/model_components/map_encoder/`)

osm_vector/osm_map_encoder.py — OSMVectorMapEncoder, ported from
OSMMapEncoderPointLevel, stripped of mmdetection3d/mmcv and ablation-only
code (kept only the canonical config: point-level tokens, NLP tag embeddings,
ORF graph identifiers with fixed order, sine continuous positional encoding).
Returns (tokens [B,N,C], key_padding_mask [B,N]). NLP model is loaded lazily
from a local path or injected (for tests); constructed on CPU so the module is
single-device until .to().
map_bev_fusion/osm_cross_attn_fusion.py — OSMCrossAttnFusion: BEV cells
cross-attend to OSM tokens via F.scaled_dot_product_attention (mem-efficient
backend, so the full 450×300 grid never materialises a QK matrix), honours the
padding mask, has a learnable null token (no NaN when a sample has no OSM), and
a zero-init per-channel gate (training starts identical to no-map).
Registered osm_vector / osm_cross_attn; wired into AutoE2E behind
map_type (default path byte-identical; new map_encoder_kwargs passthrough).
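
For reviewers who want the intuition behind the null token and the zero-init gate, here is a minimal dependency-free sketch for a single BEV cell. It is not the code in this PR: the real module works on batched tensors through F.scaled_dot_product_attention, and the names `attend` and `fuse` as well as the toy numbers are made up for illustration.

```python
import math


def attend(query, keys, values, padding_mask, null_key, null_value):
    """Single-query softmax attention over map tokens plus one null token.

    padding_mask[i] is True when token i is padding and must be ignored.
    The null token is always visible, so the softmax never runs over an
    empty set, even when every real token is masked.
    """
    all_keys = [null_key] + [k for k, m in zip(keys, padding_mask) if not m]
    all_values = [null_value] + [v for v, m in zip(values, padding_mask) if not m]
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in all_keys]
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(null_value)
    return [sum(w * v[d] for w, v in zip(weights, all_values)) / total
            for d in range(dim)]


def fuse(bev_cell, attended, gate):
    """Residual fusion with a per-channel gate; a zero gate returns bev_cell."""
    return [b + g * a for b, g, a in zip(bev_cell, gate, attended)]


# Toy inputs (hypothetical values, two channels, two map tokens).
bev = [0.5, -1.0]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[2.0, 2.0], [4.0, 4.0]]
null_key, null_value = [0.0, 0.0], [0.0, 0.0]

# Sample without any OSM tokens: everything is padding, only the null token remains.
no_map = attend(bev, keys, values, [True, True], null_key, null_value)

# Zero-initialised gate: the fused cell equals the input cell.
with_map = attend(bev, keys, values, [False, False], null_key, null_value)
zero_gate = fuse(bev, with_map, [0.0, 0.0])
```

With all tokens masked, the output is just the null value instead of a NaN from an empty softmax. With a zero gate, the fused feature is the unchanged BEV feature, which is why training can start from the no-map behaviour.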

Data pipeline (`Model/data_parsing/osm_sd_map/`)

osm_parser.py — OSM XML parser + ego-centric patch extraction (ported;
city-coord conversion replaced by generic wgs84_to_local.py, ego frame
X=forward/Y=left/Z=up).
overpass.py — the exact tested SDTagNet Overpass query; trajectory-bbox
fetch with a shared gzip cache and containment reuse (a request is served
by any cached XML whose bbox covers it).
osm_tokenize.py — fixed-point way resampling + tag tokenisation → the
ragged osm_map_data dict.
cache.py — per-episode tokenised .pt shards; one Overpass fetch per
episode sized to the trajectory.
collate.py — collate_osm_batch for ragged OSM fields (standard keys still
default-collated).
nlp_download.py — downloads + extracts the trained tag encoder from
immel-f/SDTagNet (nlp_encoder/bert-144-osm-tags-embed-from_scratch.tar.gz).
visualize.py — SDTagNet-style SD-map figure + a real-data validation CLI
(python -m data_parsing.osm_sd_map.visualize ... --run-encoder).
L2DDataset — flag-gated osm_cache_dir merges per-frame OSM into samples,
plus episode_ego_poses() for building the cache.
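
The containment reuse described for overpass.py can be sketched as follows. This is an illustrative in-memory version under stated assumptions, not the PR's implementation: `BBox`, `BBoxCache`, and `fake_fetch` are hypothetical names, the real cache stores gzip files on disk, and the sketch ignores boxes that cross the antimeridian.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BBox:
    south: float
    west: float
    north: float
    east: float

    def covers(self, other: "BBox") -> bool:
        """True when `other` lies entirely inside this box."""
        return (self.south <= other.south and self.west <= other.west
                and self.north >= other.north and self.east >= other.east)


class BBoxCache:
    """Serve a request from any cached entry whose bbox covers it."""

    def __init__(self, fetch):
        self._fetch = fetch      # callable: BBox -> payload (e.g. raw OSM XML)
        self._entries = []       # list of (BBox, payload)

    def get(self, request: BBox):
        for bbox, payload in self._entries:
            if bbox.covers(request):
                return payload   # cache hit, no download
        payload = self._fetch(request)
        self._entries.append((request, payload))
        return payload


# Stand-in for the Overpass download; it only records how often it is called.
calls = []


def fake_fetch(bbox):
    calls.append(bbox)
    return "xml-%d" % len(calls)


cache = BBoxCache(fake_fetch)
episode = BBox(48.98, 8.39, 49.00, 8.42)    # hypothetical episode-sized box
inner = BBox(48.99, 8.40, 48.995, 8.41)     # lies inside `episode`
other = BBox(49.10, 8.40, 49.12, 8.42)      # lies outside `episode`

first = cache.get(episode)   # download
second = cache.get(inner)    # reused from `episode`
third = cache.get(other)     # second download
```

Only two downloads happen for the three requests, because the inner box is answered from the episode-sized entry.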

Shared cache (both branches)

Both the rendered-tile and vector branches now fetch by trajectory bounding
box + margin instead of a fixed centroid radius — this fixes a latent bug
where episodes longer than ~1.5 km silently lost OSM at the trajectory ends.
The renderer (map_rendering/cache.py) builds its graph from the same shared
raw-OSM XML via ox.graph_from_xml. The vector branch's fetch margin defaults
to the renderer's render radius (DEFAULT_FETCH_MARGIN_M=800) so they share one
download; pass fetch_margin_m=patch_reach_m(pc_range) for a minimal
vector-only fetch.
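
To illustrate why a fixed centroid radius drops the trajectory ends while a bounding box plus margin does not, here is a small sketch in local metric coordinates. The function names, the straight 3 km trajectory, and the 800 m values are assumptions chosen for the example; the radius used by the old code path is not stated here.

```python
import math


def covered_by_centroid_radius(points, radius_m):
    """Per-pose flag: is the pose within radius_m of the trajectory centroid?"""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return [math.hypot(x - cx, y - cy) <= radius_m for x, y in points]


def trajectory_bbox(points, margin_m):
    """Axis-aligned bounding box of the trajectory, grown by margin_m on each side."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs) - margin_m, min(ys) - margin_m,
            max(xs) + margin_m, max(ys) + margin_m)


def covered_by_bbox(bbox, points):
    """Per-pose flag: is the pose inside the bounding box?"""
    x0, y0, x1, y1 = bbox
    return [x0 <= x <= x1 and y0 <= y <= y1 for x, y in points]


# Hypothetical straight episode: poses every 500 m along 3 km.
poses = [(float(x), 0.0) for x in range(0, 3001, 500)]

radius_flags = covered_by_centroid_radius(poses, radius_m=800.0)
bbox = trajectory_bbox(poses, margin_m=800.0)
bbox_flags = covered_by_bbox(bbox, poses)
```

In this example the centroid radius only reaches the three middle poses, while the bounding box contains every pose and still leaves 800 m of map context around the first and last one.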

Dependencies

Added to requirements.txt (NLP path is required, not optional):
sentence-transformers, transformers, shapely, pyproj, requests,
huggingface_hub, matplotlib.

Misc

.gitignore: checkpoints/, cache/, *.osm.gz, osm_bbox_*.xml.gz,
graph_*.pkl, osm_map_vis.png.

Testing

New offline suites (tests/test_osm_vector.py, tests/test_osm_sd_map_data.py):
encoder shapes/mask/relations/grad, fusion (zero-gate / null-token / masking),
registries, collate, AutoE2E osm_vector end-to-end, WGS84→ego transform,
exact Overpass query string, parser+tokenise, trace-bbox + containment-reuse
cache, and the visualizer (realistic scene). Heavy deps guarded with
importorskip; a fake NLP module keeps the model tests checkpoint-free.
Full suite passes except 3 pre-existing TestBatchIndependence cases that
fail only on CUDA (batch-size-dependent reduction nondeterminism in the
default rasterized path; unrelated to this change — they pass on CPU).

Validating on real data

cd Model
python -m data_parsing.osm_sd_map.visualize --lat 48.9930 --lon 8.4037 --heading 0.0 --download-nlp --run-encoder --out osm_map_vis.png

Fetches real OSM, runs the full pipeline, renders the SD map, and runs the
encoder forward — for eyeballing geometry/ego-framing/tag decoding.

Notes / follow-ups

Confirm the L2D heading convention (vehicle[1]) against the visualization.
Confirm ox.graph_from_xml filtering parity with network_type="drive" on
your osmnx version (cosmetic for the tile).
Architecture diagram in Model/README.md still predates this branch.
Possible follow-ups: a --from-l2d EPISODE FRAME flag for the visualizer;
token-level fusion variants; NLP-encoder pretraining scripts.
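
To make the heading check in the first note concrete, the sketch below shows how the same heading value places a map point differently in the ego frame (X=forward, Y=left) depending on the convention. Both conventions and the helper names `enu_to_ego` and `compass_to_yaw` are assumptions for illustration; which one L2D actually uses is exactly what still needs confirming.

```python
import math


def enu_to_ego(point_en, ego_en, yaw_rad):
    """Rotate a local east/north point into the ego frame (x forward, y left).

    yaw_rad is assumed to be measured counterclockwise from east.
    """
    de = point_en[0] - ego_en[0]
    dn = point_en[1] - ego_en[1]
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    return (c * de + s * dn, -s * de + c * dn)


def compass_to_yaw(heading_rad):
    """Convert a compass heading (clockwise from north) to yaw from east."""
    return math.pi / 2 - heading_rad


ego = (0.0, 0.0)
point_north = (0.0, 10.0)   # a map point 10 m north of the vehicle
heading = 0.0

# Convention A: heading is yaw from east, so the vehicle faces east
# and the point ends up 10 m to the left.
as_yaw = enu_to_ego(point_north, ego, heading)

# Convention B: heading is a compass bearing, so the vehicle faces north
# and the point ends up 10 m straight ahead.
as_compass = enu_to_ego(point_north, ego, compass_to_yaw(heading))
```

If the rendered SD map looks rotated by 90 degrees or mirrored relative to the camera view, a mismatch of this kind is a likely cause.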

Signed-off-by: Fabian Immel <fabian.immel@kit.edu>

draft integration of sdtagnet sd map encoder

e67b2a1


immel-f force-pushed the sdtagnet_sd_map_encoder branch from e5f2512 to e67b2a1 Compare June 24, 2026 00:18

immel-f wants to merge 1 commit into autowarefoundation:main from immel-f:sdtagnet_sd_map_encoder
