Talk2Radar: bridging natural language with 4D mmWave radar for 3D referring expression comprehension
Embodied perception is essential for intelligent vehicles and robots to understand interactive environments. However, existing advances focus primarily on vision, with limited attention to 3D modeling sensors, restricting a comprehensive understanding of objects in response to prompts containing qualitative and quantitative queries. Recently, 4D millimeter-wave radars, affordable and promising automotive sensors, have provided denser point clouds than conventional radars and can perceive both the semantic and physical characteristics of objects, thereby enhancing the reliability of perception systems. To foster the development of natural-language-driven context understanding in radar scenes for 3D visual grounding, we construct the first such dataset, Talk2Radar, which bridges these two modalities for 3D Referring Expression Comprehension (REC). Talk2Radar contains 8,682 referring prompt samples with 20,558 referred objects. Moreover, we propose a novel model, T-RadarNet, for 3D REC on point clouds, achieving state-of-the-art (SOTA) performance on the Talk2Radar dataset compared with counterpart models. Deformable-FPN and Gated Graph Fusion are meticulously designed for efficient point cloud feature modeling and for cross-modal fusion between radar and text features, respectively. Comprehensive experiments provide deep insights into radar-based 3D REC. We release our project at https://github.com/GuanRunwei/Talk2Radar.
pp. 10884-10891
Guan, Runwei, Zhang, Ruixiao, Ouyang, Ningwei, Liu, Jianan, Man, Ka Lok, Cai, Xiaohao, Xu, Ming, Smith, Jeremy, Lim, Eng Gee, Yue, Yutao and Xiong, Hui
(2025)
Talk2Radar: bridging natural language with 4D mmWave radar for 3D referring expression comprehension.
Ott, Christian, Admoni, Henny, Behnke, Sven, Bogdan, Stjepan, Bolopion, Aude, Choi, Youngjin, Ficuciello, Fanny, Gans, Nicholas, Gosselin, Clement, Harada, Kensuke, Kayacan, Erdal, Kim, H. Jin, Leutenegger, Stefan, Liu, Zhe, Maiolino, Perla, Marques, Lino, Matsubara, Takamitsu, Mavromatti, Anastasia, Minor, Mark, O'Kane, Jason, Park, Hae Won, Park, Hae-Won, Rekleitis, Ioannis, Renda, Federico, Ricci, Elisa, Riek, Laurel D., Sabattini, Lorenzo, Shen, Shaojie, Sun, Yu, Wieber, Pierre-Brice, Yamane, Katsu and Yu, Jingjin
(eds.)
In 2025 IEEE International Conference on Robotics and Automation, ICRA 2025. IEEE. (doi:10.48550/arXiv.2405.12821).
Record type: Conference or Workshop Item (Paper)
Text: 2405.12821v2 (Author's Original). Available under License Other.
More information
Published date: 2 September 2025
Venue - Dates: 2025 IEEE International Conference on Robotics and Automation, ICRA 2025, Atlanta, United States, 2025-05-19 - 2025-05-23
Keywords: cs.RO, cs.CV
Identifiers
Local EPrints ID: 498021
URI: http://eprints.soton.ac.uk/id/eprint/498021
ISSN: 1050-4729
PURE UUID: 71281a4c-3c2e-47f2-90a4-8fb72e28b6d2
Catalogue record
Date deposited: 06 Feb 2025 17:32
Last modified: 08 Jan 2026 03:06