Cross-domain analysis of 3D object detection with point clouds
University of Southampton
Zhang, Ruixiao
July 2025
Cai, Xiaohao
Prugel-Bennett, Adam
Zhang, Ruixiao (2025) Cross-domain analysis of 3D object detection with point clouds. University of Southampton, Doctoral Thesis, 145pp.
Record type: Thesis (Doctoral)
Abstract
Deep learning models such as convolutional neural networks and transformers have been widely applied to 3D object detection problems in autonomous driving. However, the robustness of these models on cross-domain tasks is usually ignored. To adapt existing 3D object detection methods to other domains, including different cities, countries and weather conditions, re-training with target-domain data is usually necessary, which hinders the wide application of autonomous driving. To better understand the challenges of cross-domain 3D object detection, we first analyse in depth the cross-domain performance of state-of-the-art models. We observe that most models overfit their training domains and are difficult to adapt directly to other domains. Meanwhile, existing domain adaptation solutions shift the knowledge domain of the models rather than improving their generalisation ability. To counter the overfitting problem, we propose a novel sub-task that pays more attention to the detection quality of the bounding box surfaces and corners closer to the LiDAR sensor. We propose two additional evaluation metrics and a novel refinement head, named EdgeHead, for this new task. By guiding models to focus more on the learnable closer surfaces, EdgeHead greatly improves the cross-domain performance of existing models. Furthermore, we propose a more robust cross-domain 3D object detector, CornerPoint3D, which focuses on the visible point cloud data and the nearest corners of objects towards the LiDAR sensor. It realises a balanced trade-off between the detection quality of entire bounding boxes and of the surfaces closer to the LiDAR sensor, providing a more practical and robust cross-domain 3D object detection solution. In addition, we also explore the 3D referring expression comprehension (REC) task.
We introduce Talk2Radar, the first dataset containing both LiDAR and radar data for this task, and an efficient LiDAR/radar-based 3D REC model, T-RadarNet, which achieves state-of-the-art performance on the Talk2Radar dataset. Together, these works present in-depth investigations and more robust solutions for cross-domain 3D object detection, as well as advances in the interactive perception capabilities of LiDAR and 4D radar for environmental understanding in autonomous driving.
Text: Ruixiao_Southampton_PhD_Thesis_Final_PDFA - Version of Record
Text: Final-thesis-submission-Examination-Mr-Ruixiao-Zhang - Restricted to Repository staff only
More information
Published date: July 2025
Identifiers
Local EPrints ID: 503215
URI: http://eprints.soton.ac.uk/id/eprint/503215
PURE UUID: 11aced06-fc54-439b-a4db-6fa8e266db4c
Catalogue record
Date deposited: 24 Jul 2025 16:38
Last modified: 10 Sep 2025 10:21
Contributors
Author: Ruixiao Zhang
Thesis advisor: Xiaohao Cai
Thesis advisor: Adam Prugel-Bennett
Download statistics
Downloads from ePrints over the past year. Other digital versions may also be available to download e.g. from the publisher's website.