CornerPoint3D: look at the nearest corner instead of the center

3D object detection aims to predict object centers, dimensions, and rotations from LiDAR point clouds. This formulation is simple, but LiDAR captures only the near side of objects, which makes center-based detectors prone to poor localization accuracy in cross-domain tasks where point distributions vary. Meanwhile, existing evaluation metrics, designed for single-domain assessment, also suffer from overfitting caused by dataset-specific size variations. A key question arises: do we really need models to maintain excellent performance over entire 3D bounding boxes after being applied across domains? In practice, one of our main concerns is preventing collisions between vehicles and other obstacles, especially in cross-domain scenarios where correctly predicting object sizes is much more difficult. To address these issues, we rethink cross-domain 3D object detection from a practical perspective. We propose two new metrics that evaluate a model's ability to detect the surfaces of objects that are closer to the LiDAR sensor. Additionally, we introduce EdgeHead, a refinement head that guides models to focus more on the learnable closer surfaces, significantly improving cross-domain performance under both our new metrics and the traditional BEV/3D metrics. Furthermore, we argue that predicting the nearest corner rather than the object center enhances robustness. We propose a novel 3D object detector, named CornerPoint3D, which is built upon CenterPoint and uses heatmaps to supervise the learning and detection of the nearest corner of each object. Our proposed methods achieve a balanced trade-off between the detection quality of entire bounding boxes and the localization accuracy of the surfaces closer to the LiDAR sensor, outperforming the traditional center-based detector CenterPoint in multiple cross-domain tasks and providing a more practical and robust cross-domain 3D object detection solution.
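To make the idea of a "nearest corner" concrete, the following is a minimal sketch and not the authors' implementation. It assumes the LiDAR sensor sits at the origin of the bird's-eye-view (BEV) plane and that "nearest" means smallest Euclidean distance in that plane; the paper may define the target differently. The function names `bev_corners` and `nearest_corner` and the box parameterization (center, length, width, yaw) are hypothetical choices made for illustration.

```python
import numpy as np


def bev_corners(cx, cy, length, width, yaw):
    """Return the four BEV corners of a rotated box as a (4, 2) array."""
    # Corner offsets in the box frame: x along the length, y along the width.
    local = 0.5 * np.array([
        [length, width],
        [length, -width],
        [-length, -width],
        [-length, width],
    ])
    c, s = np.cos(yaw), np.sin(yaw)
    rot = np.array([[c, -s], [s, c]])
    # Rotate each offset by yaw, then translate to the box center.
    return local @ rot.T + np.array([cx, cy])


def nearest_corner(cx, cy, length, width, yaw, sensor_xy=(0.0, 0.0)):
    """Return the box corner closest to the sensor (assumed at the origin)."""
    corners = bev_corners(cx, cy, length, width, yaw)
    dists = np.linalg.norm(corners - np.asarray(sensor_xy), axis=1)
    # On an exact tie, argmin picks the first corner in the order above.
    return corners[np.argmin(dists)]
```

For an axis-aligned box centered at (10, 5) with length 4 and width 2, `nearest_corner(10.0, 5.0, 4.0, 2.0, 0.0)` returns the corner at (8, 4). A corner-based detector along the lines described in the abstract would place its heatmap target at such a point instead of at the box center.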

cs.CV, cs.AI

10.48550/arXiv.2504.02464

arXiv

Zhang, Ruixiao

fc3c4eb9-b692-4ab3-8056-030cb6731fc5

Guan, Runwei

c9bbd12d-493e-4e99-a2eb-7b6150a0bde8

Chen, Xiangyu

ad3807ce-7bd5-43fb-9cdd-43674b20393f

Prugel-Bennett, Adam

b107a151-1751-4d8b-b8db-2c395ac4e14e

Cai, Xiaohao

de483445-45e9-4b21-a4e8-b0427fc72cee

3 April 2025

Record type: UNSPECIFIED

Text

2504.02464v1 - Author's Original

Available under License Creative Commons Attribution.

More information

Published date: 3 April 2025

Additional Information: arXiv admin note: substantial text overlap with arXiv:2407.04061

Keywords: cs.CV, cs.AI

Identifiers

Local EPrints ID: 501769

URI: http://eprints.soton.ac.uk/id/eprint/501769

DOI: 10.48550/arXiv.2504.02464

PURE UUID: 12ef9d33-1a49-444c-be1e-67b19e6722b7

ORCID for Xiaohao Cai: orcid.org/0000-0003-0924-2834

Catalogue record

Date deposited: 09 Jun 2025 18:06

Last modified: 10 Jun 2025 02:04

Contributors

Author: Ruixiao Zhang

Author: Runwei Guan

Author: Xiangyu Chen

Author: Adam Prugel-Bennett

Author: Xiaohao Cai
