Using reinforcement learning to combine Green Light Optimised Speed Advisory and responsive traffic control systems with non-autonomous vehicles: Effects of imperfect training and incomplete information
The advantages of Green Light Optimal Speed Advisory (GLOSA) include improvements in travel times, fuel consumption, battery life, emissions, the number of stops, and ride smoothness for cars, buses, HGVs, HEVs, and EVs in scenarios ranging from isolated junctions to connected networks. However, GLOSA performs poorly when paired with responsive traffic control systems (RTCs), which are currently widespread, greatly reducing the number of locations where its advantages can be obtained. This is because RTCs cannot provide accurate long-term future signal plans without some mitigating solution. However, as Reinforcement Learning methods can, by trial and error, identify unseen solutions to complicated problems, they could be used to create frameworks that combine GLOSA and RTCs to benefit Connected Non-Autonomous Vehicles (CNAVs), allowing the advantages of GLOSA to be obtained with present traffic and traffic control make-ups. This thesis therefore investigates the performance of such frameworks in both ideal and non-ideal conditions. To achieve this, an initial framework was constructed and evaluated on a simulated isolated junction. The initial framework had a positive impact on stopping time and junction entry speed when traffic densities were at 55% of saturation levels, but a negative impact at densities of 70%, as the Reinforcement Learning Traffic Control (RLTC) system could not regularly enough reach states in which vehicles could approach the junction unobstructed by queues. Also, in the 55% scenario, when the training penetration rate (TPR) was below 20%, performance degraded as the evaluation penetration rate (EPR) increased, because vehicle behaviour differed greatly from what the models expected. To extend testing to arterial flows, a revised framework was created and tested on a simulated arterial testbed.
Alterations were also made to the framework to account for lessons learned from the isolated-junction experiments. The revised framework had a positive impact at all traffic densities, with vehicle speed increasing and stops and waiting time decreasing compared to benchmark systems, even when performance was degraded by the absence of vehicle route information. However, performance was still degraded at higher EPRs when the TPR was below 40%. Overall, this research shows that GLOSA can be combined with RLTC systems in a way that benefits CNAVs in terms of waiting time, number of stops, junction entry speed, and average speed, and demonstrates that GLOSA could be deployed despite present traffic and traffic control make-ups. It also highlights the need for future frameworks to be tested at a variety of EPRs, both with and without route information, to avoid performance decreases or outright failures as conditions at deployment sites change over time.
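As background to the abstract above, GLOSA's core calculation (choosing an advisory speed that lets an approaching vehicle arrive during a predicted green window) can be sketched as follows. This is a simplified illustration only, not the framework developed in the thesis; the function name, parameters, and speed limits are illustrative assumptions.

```python
def glosa_advisory_speed(distance_m, green_start_s, green_end_s,
                         v_min=5.0, v_max=13.9):
    """Return an advisory speed in m/s that lets the vehicle arrive
    during the green window, or None if no feasible speed exists."""
    # Speed that reaches the stop line exactly as the green window closes
    # (the slowest useful speed) and exactly as it opens (the fastest).
    v_late = distance_m / green_end_s
    v_early = distance_m / max(green_start_s, 1e-6)
    # Intersect the feasible band with the vehicle's speed limits.
    lo = max(v_late, v_min)
    hi = min(v_early, v_max)
    if lo > hi:
        return None  # no legal speed reaches this green phase
    return hi  # prefer the fastest feasible speed to minimise travel time
```

Returning the fastest feasible speed is just one possible policy; practical GLOSA systems also weigh fuel use and ride comfort. The difficulty the abstract describes is that with an RTC the green window itself is uncertain, which is where the RLTC models come in.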
University of Southampton
Paine, William
2025
Waterson, Ben
Cherrett, Tom
Paine, William (2025) Using reinforcement learning to combine Green Light Optimised Speed Advisory and responsive traffic control systems with non-autonomous vehicles: Effects of imperfect training and incomplete information. University of Southampton, Doctoral Thesis, 218pp.
Record type: Thesis (Doctoral)
Text
Using Reinforcement Learning to Combine Green Light Optimised Speed Advisory and Responsive Traffic Control Systems with Non-Autonomous Vehicles: Effects of Imperfect Training and Incomplete Information
- Accepted Manuscript
Text
Final-thesis-submission-Examination-Mr-William-Paine
Restricted to Repository staff only
More information
Published date: 2025
Identifiers
Local EPrints ID: 502165
URI: http://eprints.soton.ac.uk/id/eprint/502165
PURE UUID: 59d41d28-f567-4353-b6a1-41bfed933ac5
Catalogue record
Date deposited: 17 Jun 2025 16:58
Last modified: 11 Sep 2025 03:10
Contributors
Author: William Paine