The University of Southampton
University of Southampton Institutional Repository

AugMapNet: improving spatial latent structure via BEV grid augmentation for enhanced vectorized online HD map construction

Pages: 8541-8550
Publisher: IEEE
Monninger, Thomas
4b9da19d-b0db-44fa-81df-85cfa01bb716
Anwar, Md Zafar
6757b332-586c-4dce-9ff2-f740e38a681b
Antol, Stanislaw
63498576-45e5-4b9b-9484-b19c968b2f9c
Staab, Steffen
bf48d51b-bd11-4d58-8e1c-4e6e03b30c49
Ding, Sihao
509a57ec-06d6-4f50-8013-71078448906c

Monninger, Thomas, Anwar, Md Zafar, Antol, Stanislaw, Staab, Steffen and Ding, Sihao (2026) AugMapNet: improving spatial latent structure via BEV grid augmentation for enhanced vectorized online HD map construction. In 2026 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). IEEE. pp. 8541-8550. (doi:10.1109/WACV61042.2026.00824).

Record type: Conference or Workshop Item (Paper)

Abstract

Autonomous driving requires understanding infrastructure elements, such as lanes and crosswalks. To navigate safely, this understanding must be derived from sensor data in real time and represented in vectorized form. Learned Bird’s-Eye View (BEV) encoders are commonly used to combine a set of camera images from multiple views into one joint latent BEV grid. Traditionally, an intermediate raster map is predicted from this latent space, which provides dense spatial supervision but requires post-processing into the desired vectorized form. More recent models directly derive infrastructure elements as polylines using vectorized map decoders, which provides instance-level information. Our approach, Augmentation Map Network (AugMapNet), proposes latent BEV feature grid augmentation, a novel technique that significantly enhances the latent BEV representation. AugMapNet combines vector decoding and dense spatial supervision more effectively than existing architectures while remaining easier to integrate than other hybrid approaches. It additionally benefits from extra processing on its latent BEV features. Experiments on the nuScenes and Argoverse2 datasets demonstrate significant improvements in vectorized map prediction of up to 13.3 % over the StreamMapNet baseline at 60 m range, and greater improvements at larger ranges. We confirm transferability by applying our method to another baseline, SQD-MapNet, and find similar improvements. A detailed analysis of the latent BEV grid confirms that AugMapNet has a more structured latent space and shows the value of our novel concept beyond pure performance improvement. The code can be found at https://github.com/tmonnin/augmapnet.

Text
Monninger_AugMapNet_Improving_Spatial_Latent_Structure_via_BEV_Grid_Augmentation_for_WACV_2026_paper - Accepted Manuscript
Available under License Creative Commons Attribution.
Download (5MB)

More information

Published date: 5 May 2026
Venue - Dates: The IEEE/CVF Winter Conference on Applications of Computer Vision, Tucson, Arizona, United States, 2026-03-06 - 2026-03-10

Identifiers

Local EPrints ID: 511653
URI: http://eprints.soton.ac.uk/id/eprint/511653
PURE UUID: 01bc9c4e-2950-4a00-9802-ff73f8b4aef8
ORCID for Steffen Staab: orcid.org/0000-0002-0780-4154

Catalogue record

Date deposited: 26 May 2026 17:00
Last modified: 27 May 2026 01:48

Contributors

Author: Thomas Monninger
Author: Md Zafar Anwar
Author: Stanislaw Antol
Author: Steffen Staab
Author: Sihao Ding
