Realising the benefits of dynamic DNNs on reconfigurable hardware
Dimitriou, Anastasios (2025) Realising the benefits of dynamic DNNs on reconfigurable hardware. University of Southampton, Doctoral Thesis, 137pp.
Record type: Thesis (Doctoral)
Abstract
Deep Neural Networks (DNNs) are increasingly capable of solving many cognitive problems, with countless everyday applications such as autonomous driving, voice assistants, and medical diagnosis. Their growing computational, memory, and energy demands have led to the widespread adoption of powerful GPU cloud servers for their execution. However, there is increasing interest in moving the computation of DNNs to the edge. By processing data locally, the dependency on cloud infrastructure is reduced, thereby decreasing the data transmission load, improving privacy and security, and increasing availability in environments with limited internet connectivity. However, edge devices (IoT devices, mobile phones, etc.) are typically restricted in processing power, memory, and power consumption. This research aims to contribute towards enabling the deployment of complex DNNs on such resource-restricted devices.
Field Programmable Gate Arrays (FPGAs) have proven very effective in accelerating neural networks due to their configurability, parallelisation capabilities, and low energy consumption. Nonetheless, mapping modern DNNs onto them without compression is not feasible. Dynamic DNNs go beyond the limits of static model compression by tuning the computational workload to the difficulty of each input on a per-sample basis. This thesis first explores the challenges introduced by these DNN approaches when their FPGA implementation is targeted. Three limiting factors were identified: the lack of software libraries and frameworks, the lack of hardware modules and frameworks, and the dependencies on intermediate feature maps.
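To make the per-sample workload tuning concrete, the following is a minimal sketch of confidence-based early exiting, a common form of dynamic DNN; the stage and exit-head functions, the 0.9 threshold, and the toy dimensions are illustrative assumptions rather than the thesis implementation.

```python
# Minimal sketch (not from the thesis) of per-sample early exiting in a
# dynamic DNN: each "stage" stands in for a block of layers and has an
# attached exit head producing class scores. Inference stops at the first
# exit whose softmax confidence clears a threshold, so easy inputs use
# fewer stages than hard ones. Stages, heads, and the threshold are
# illustrative assumptions.
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def early_exit_inference(x, stages, exit_heads, threshold=0.9):
    """Run stages in order; return (prediction, exit index) at the first
    exit whose maximum class probability reaches the threshold."""
    h = x
    for i, (stage, head) in enumerate(zip(stages, exit_heads)):
        h = stage(h)                      # intermediate feature map
        p = softmax(head(h))              # class probabilities at exit i
        if p.max() >= threshold or i == len(stages) - 1:
            return int(p.argmax()), i     # confident (or final) prediction

# Toy usage: three random 16-d stages, each with a 10-class exit head.
rng = np.random.default_rng(0)
stages = [lambda h, W=rng.standard_normal((16, 16)): np.tanh(W @ h) for _ in range(3)]
heads = [lambda h, W=rng.standard_normal((10, 16)): W @ h for _ in range(3)]
pred, exit_used = early_exit_inference(rng.standard_normal(16), stages, heads)
print(f"predicted class {pred} using exit {exit_used}")
```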
Having identified these challenges, the next contribution of this thesis is a first realisation of dynamic networks on FPGAs. The design followed the standard architecture and achieved at least 3.2x faster execution than a Jetson embedded device and latency comparable to a CPU/GPU system, while maintaining very low energy consumption (at least 1.8x less than the Jetson). This highlighted the feasibility and efficiency of deploying dynamic networks on FPGAs. Nonetheless, FPGAs are inherently parallel devices and, leveraging this, the third contribution of the thesis is a second design approach that explores the simultaneous execution of the two main components of dynamic networks. It addressed two main challenges concerning the dependencies on intermediate feature maps and further accelerated the execution of the dynamic network by up to 23%.
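As an illustration of what overlapping these two components could look like, the sketch below runs the exit classifier for the current stage and the next backbone stage concurrently, since both depend only on the intermediate feature map already produced; threads stand in for parallel FPGA compute units, and all functions and sizes are toy assumptions rather than the thesis design.

```python
# Sketch of overlapping the backbone and the exit classifiers of an
# early-exit network (illustrative assumption, not the thesis design).
# Once stage i has produced its feature map, the exit head for stage i
# and stage i+1 both depend only on that map, so they can be evaluated
# simultaneously; an early exit simply discards the speculative work.
from concurrent.futures import ThreadPoolExecutor
import numpy as np

rng = np.random.default_rng(1)
stages = [lambda h, W=rng.standard_normal((16, 16)): np.tanh(W @ h) for _ in range(3)]
heads = [lambda h, W=rng.standard_normal((10, 16)): W @ h for _ in range(3)]

def overlapped_inference(x, threshold=0.9):
    with ThreadPoolExecutor(max_workers=2) as pool:
        h = stages[0](x)
        for i in range(len(stages)):
            # Speculatively start the next backbone stage while the exit
            # head for the current feature map is being evaluated.
            next_h = pool.submit(stages[i + 1], h) if i + 1 < len(stages) else None
            logits = pool.submit(heads[i], h).result()
            p = np.exp(logits - logits.max())
            p /= p.sum()
            if p.max() >= threshold or next_h is None:
                return int(p.argmax()), i
            h = next_h.result()

print(overlapped_inference(rng.standard_normal(16)))
```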
Finally, targeting the versatility and reconfigurability of FPGAs, the last contribution of the thesis is the exploration of two confidence-controlled dynamic schemes. Utilising control values generated by the dynamic network, the first dynamically selects the location of the exit points within the network, and the second selects the applied quantisation level. Both approaches enhance the adaptability and performance of the early-exit dynamic DNN, achieving an 18% reduction in computations and up to a 21.9% reduction in latency, respectively, with minimal accuracy drops.
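As a flavour of how a control value could drive such a scheme, the sketch below maps an exit confidence to the bit-width used to quantise the weights of later layers; the thresholds, bit-widths, and the uniform quantiser are illustrative assumptions rather than the exact scheme used in the thesis.

```python
# Sketch of a confidence-controlled quantisation scheme (assumed, not the
# thesis implementation): the confidence produced at an earlier exit acts
# as a control value that selects the bit-width for quantising the weights
# of later layers, so low-confidence (hard) inputs keep more precision.
import numpy as np

def quantise(w, bits):
    """Uniform symmetric quantisation of a weight tensor to `bits` bits."""
    levels = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / levels
    return np.round(w / scale) * scale

def select_bitwidth(confidence, thresholds=(0.5, 0.8), widths=(8, 6, 4)):
    """Map an exit confidence to a bit-width for the remaining layers."""
    if confidence < thresholds[0]:
        return widths[0]      # hard sample: keep 8-bit weights
    if confidence < thresholds[1]:
        return widths[1]
    return widths[2]          # easy sample: aggressive 4-bit quantisation

w = np.random.default_rng(2).standard_normal((4, 4))
for conf in (0.3, 0.7, 0.95):
    bits = select_bitwidth(conf)
    print(f"confidence {conf:.2f} -> {bits}-bit weights")
    print(quantise(w, bits))
```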
Text: Realising the Benefits of Dynamic DNNs on Reconfigurable Hardware - Version of Record
Text: Final-thesis-submission-Examination-Mr-Anastasios-Dimitriou (Restricted to Repository staff only)
More information
Published date: 29 April 2025
Identifiers
Local EPrints ID: 500908
URI: http://eprints.soton.ac.uk/id/eprint/500908
PURE UUID: 176ac618-2513-4af4-9069-49e0bac305da
Catalogue record
Date deposited: 15 May 2025 17:07
Last modified: 01 Oct 2025 04:01
Contributors
Author: Anastasios Dimitriou
Thesis advisor: Geoff Merrett
Thesis advisor: Jonathon Hare