ENWAR 2.0: an agentic multimodal wireless LLM framework with reasoning, situation-aware explainability and beam tracking
Nazar, Ahmad M., Celik, Abdulkadir, Selim, Mohamed Y., Abdallah, Asmaa, Qiao, Daji and Eltawil, Ahmed M. (2025) ENWAR 2.0: an agentic multimodal wireless LLM framework with reasoning, situation-aware explainability and beam tracking. IEEE Transactions on Mobile Computing. doi:10.1109/TMC.2025.3629736.
Abstract
The evolution of next-generation wireless networks demands intelligent, adaptive, and explainable decision-making for robust communication in dynamic environments. This paper presents ENWAR 2.0, the first agentic large language model (LLM) framework integrating adaptive retrieval-augmented generation (RAG) and chain-of-thought (CoT) reasoning into situation-aware and explainable wireless network management. ENWAR 2.0 introduces two specialized agents: a transformer fusion (TransFusion)-based beam prediction agent and an environment perception agent, both of which fuse multi-modal sensory inputs (camera, LiDAR, radar, and GPS) from the DeepSense6G dataset. The beam prediction agent enables infrastructure-to-vehicle (I2V) target-in-the-loop beam tracking and real-time adaptation to dynamic environmental conditions, while the environment perception agent provides situation-aware reasoning and justifications for beam decisions. Unlike its predecessor, ENWAR 1.0, which relied on static knowledge bases (KBs) and text-only LLMs, ENWAR 2.0 is designed for CoT reasoning, leverages LLaMa3.2-3B/LLaMa3.1-8B/LLaMa3.3-70B for text generation and the multi-modal capabilities of LLaMa 3.2, and employs LlamaIndex for fine-grained, dynamic context retrieval, eliminating retrieval ambiguities and enhancing response relevance. Numerical results show that the beam prediction agent achieves up to 90.0% Top-3 accuracy at t + 3, effectively predicting optimal beam selections three time steps ahead. Overall, ENWAR 2.0 achieves state-of-the-art performance, with up to 89.7%/83.5% interpretation/perception correctness, 81.6%/80.9% faithfulness, and 89.9%/88.2% relevancy. In comparison, the baseline pretrained LLaMa3 models without adaptive RAG achieve up to 80.3%/77.3% correctness, and the baseline without RAG performs significantly worse at 67.1%/64.8%. Additionally, ENWAR 2.0 more than halves processing time relative to the baseline, while its adaptive RAG improves performance by up to 13.7% over static RAG.
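As a concrete illustration of the fine-grained, dynamic context retrieval the abstract attributes to LlamaIndex, the sketch below indexes a few hypothetical per-frame scene summaries and retrieves only the most relevant ones for a query. The frame texts, the query, and the similarity_top_k value are illustrative assumptions, not the paper's actual pipeline, and a default embedding model must be configured separately (e.g., an OpenAI API key or a local model via LlamaIndex Settings).

```python
# Minimal sketch of adaptive top-k context retrieval with LlamaIndex
# (llama-index >= 0.10 import paths; all data below is made up).
from llama_index.core import VectorStoreIndex, Document

# Hypothetical per-frame scene summaries of the kind a perception
# agent might derive from camera/LiDAR/radar/GPS inputs.
frames = [
    "Frame 101: vehicle approaching intersection, clear line of sight to base station.",
    "Frame 102: bus partially blocking line of sight, vehicle decelerating.",
    "Frame 103: line of sight restored, vehicle accelerating northbound.",
]
documents = [Document(text=t) for t in frames]

# Build an in-memory vector index and retrieve only the k most
# similar chunks per query, instead of consulting a static KB.
index = VectorStoreIndex.from_documents(documents)
retriever = index.as_retriever(similarity_top_k=2)

for hit in retriever.retrieve("Is the line of sight to the base station blocked?"):
    print(f"{hit.score:.3f}  {hit.node.get_content()}")
```

Retrieving a small, query-specific slice of context in this way is what keeps a generated answer grounded in the frames that actually matter, rather than in an entire static knowledge base.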
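The "Top-3 accuracy at t + 3" figure can be read as follows: a prediction counts as correct if the ground-truth optimal beam three time steps ahead is among the model's three highest-scoring beams. A minimal NumPy sketch of that metric, using made-up stand-in arrays rather than DeepSense6G data:

```python
import numpy as np

def top_k_accuracy(scores: np.ndarray, true_beams: np.ndarray, k: int = 3) -> float:
    """Fraction of samples whose ground-truth beam index appears among
    the k highest-scoring predicted beams.

    scores     : (num_samples, num_beams) per-beam scores predicted
                 for a future time step (e.g., t + 3).
    true_beams : (num_samples,) ground-truth optimal beam indices.
    """
    top_k = np.argsort(scores, axis=1)[:, -k:]           # k best beams per sample
    hits = (top_k == true_beams[:, None]).any(axis=1)    # true beam among them?
    return float(hits.mean())

# Toy example: 4 samples over an 8-beam codebook (illustrative values only).
rng = np.random.default_rng(0)
scores = rng.random((4, 8))
true_beams = np.array([2, 5, 7, 0])
print(f"Top-3 accuracy: {top_k_accuracy(scores, true_beams, k=3):.1%}")
```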
Text: EnwarV2___Beam_Prediction-2 (Accepted Manuscript)
More information
e-pub ahead of print date: 6 November 2025
Identifiers
Local EPrints ID: 508740
URI: http://eprints.soton.ac.uk/id/eprint/508740
ISSN: 1536-1233
PURE UUID: 9a739834-9b87-4e44-a186-a45d6d5b2b26
Catalogue record
Date deposited: 02 Feb 2026 17:58
Last modified: 03 Feb 2026 03:15
Contributors
Author: Ahmad M. Nazar
Author: Abdulkadir Celik
Author: Mohamed Y. Selim
Author: Asmaa Abdallah
Author: Daji Qiao
Author: Ahmed M. Eltawil