University of Southampton Institutional Repository

Realising the benefits of dynamic DNNs on reconfigurable hardware


Dimitriou, Anastasios (2025) Realising the benefits of dynamic DNNs on reconfigurable hardware. University of Southampton, Doctoral Thesis, 137pp.

Record type: Thesis (Doctoral)

Abstract

Deep Neural Networks (DNNs) are increasingly capable of solving many cognitive problems, with countless everyday applications such as autonomous driving, voice assistants, and medical diagnosis. Their growing computational, memory, and energy demands have led to the widespread adoption of powerful GPU cloud servers for their execution. However, there is increasing interest in moving the computation of DNNs to the edge. Processing data locally reduces the dependency on cloud infrastructure, thereby decreasing the data transmission load, improving privacy and security, and increasing the availability of networks in environments with limited internet connectivity. Edge devices (IoT devices, mobile phones, etc.), though, are typically restricted in processing power, memory, and power consumption. This research aims to contribute towards enabling the deployment of complex DNNs on such resource-restricted devices.
Field Programmable Gate Arrays (FPGAs) have proven very effective in accelerating neural networks due to their configurability, parallelisation capabilities, and low energy consumption. Nonetheless, mapping modern DNNs onto them without compression is not feasible. Dynamic DNNs go beyond the limits of static model compression by tuning the computational workload to the difficulty of the input on a per-sample basis. This thesis first explores the challenges these DNN approaches introduce when an FPGA implementation is targeted. Three limiting factors were identified: the lack of software libraries and frameworks, the lack of hardware modules and frameworks, and the dependencies on intermediate feature maps.
Having identified these challenges, the next contribution of this thesis is a first realisation of dynamic networks on FPGAs. The design followed the standard architecture and executed at least 3.2x faster than a Jetson embedded device, with latency comparable to a CPU/GPU system, while maintaining very low energy consumption (at least 1.8x less than the Jetson). This highlighted the feasibility and efficiency of deploying dynamic networks on FPGAs. Because FPGAs are inherently parallel devices, the third contribution of the thesis is a second design approach that explores the simultaneous execution of the two main components of dynamic networks. It addressed two main challenges concerning the dependencies on intermediate feature maps and further accelerated the execution of the dynamic network by up to 23%.
Finally, targeting the versatility and reconfigurability of FPGAs, the last contribution of the thesis is the exploration of two confidence-controlled dynamic schemes. Utilising control values generated by dynamic networks, the first dynamically selects the location of the exit points within the network, and the second dynamically selects the applied quantisation level. Both approaches enhance the adaptability and performance of the early-exit dynamic DNN, achieving an 18% reduction in computations and up to a 21.9% reduction in latency, respectively, with minimal accuracy drops.
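
For readers unfamiliar with early-exit dynamic DNNs, the per-sample idea mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the thesis: the function names, the softmax-based confidence measure, and the 0.9 threshold are all assumptions chosen for illustration, and the logits are precomputed here, whereas a real network would compute each deeper stage only when the previous exit was not confident enough.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_predict(exit_logits, threshold=0.9):
    """Return (predicted_class, exit_index) for a single sample.

    exit_logits: one list of logits per exit, ordered from the shallowest
    exit to the final classifier (hypothetical, precomputed values).
    threshold: assumed confidence level at which an exit is taken.
    """
    last = len(exit_logits) - 1
    for index, logits in enumerate(exit_logits):
        probs = softmax(logits)
        confidence = max(probs)
        # Stop at the first confident exit; the final classifier always answers.
        if confidence >= threshold or index == last:
            return probs.index(confidence), index


# An "easy" sample leaves at the first exit, a "hard" one runs to the end.
easy_sample = [[4.0, 0.0, 0.0], [5.0, 0.0, 0.0]]
hard_sample = [[1.0, 0.8, 0.0], [0.0, 6.0, 0.0]]
print(early_exit_predict(easy_sample))
print(early_exit_predict(hard_sample))
```

Raising the threshold sends more samples to deeper exits, trading computation for confidence; a confidence-controlled scheme of the kind the abstract describes could, under the same assumptions, use that confidence value to drive other choices as well.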

Text
Realising the Benefits of Dynamic DNNs on Reconfigurable Hardware - Version of Record
Available under License University of Southampton Thesis Licence.

More information

Published date: 29 April 2025

Identifiers

Local EPrints ID: 500908
URI: http://eprints.soton.ac.uk/id/eprint/500908
PURE UUID: 176ac618-2513-4af4-9069-49e0bac305da
ORCID for Anastasios Dimitriou: orcid.org/0009-0005-0925-8459
ORCID for Geoff Merrett: orcid.org/0000-0003-4980-3894
ORCID for Jonathon Hare: orcid.org/0000-0003-2921-4283

Catalogue record

Date deposited: 15 May 2025 17:07
Last modified: 01 Oct 2025 04:01

Contributors

Author: Anastasios Dimitriou
Thesis advisor: Geoff Merrett
Thesis advisor: Jonathon Hare
