The University of Southampton
University of Southampton Institutional Repository

A Suitability Score to optimize CNNs on an FPGA accelerator


Saglam, Serkan (2025) A Suitability Score to optimize CNNs on an FPGA accelerator. University of Southampton, Doctoral Thesis, 143pp.

Record type: Thesis (Doctoral)

Abstract

This thesis presents a structured optimisation methodology for deploying convolutional neural networks (CNNs) on field-programmable gate arrays (FPGAs), targeting high-throughput operation under constraints of computational resources and latency. The proposed approach integrates model-level restructuring, hardware-aware scheduling, and hardware–software co-design and deployment on FPGAs to deliver high-throughput performance while preserving CNN model accuracy. Oesophageal cancer detection is used as a representative case study, providing a computationally intensive and accuracy-critical scenario for evaluating the proposed methods.
The proposed methodology introduces the Suitability Score, a metric identifying which convolutional layers benefit most from hardware-aware optimisation. This analysis enables selective adjustments that reduce computational cost without sacrificing model accuracy. Based on these insights, a layer-specific pipelining strategy improves the hardware resource efficiency and inference latency of the deployed CNN accelerator. The optimised model is deployed on an FPGA using a co-design framework, demonstrating high throughput and competitive accuracy while consuming fewer hardware resources than FPGA-based CNN accelerators reported in the literature.
The proposed accelerator is deployed on an AMD Kintex UltraScale+ FPGA and evaluated against graphics processing unit (GPU)-based inference and existing FPGA implementations. Compared to a GPU baseline, the accelerator achieves at least 47.6% higher throughput and more than twice the energy efficiency. In FPGA-based comparisons, it processes up to 7.8× more images per second while using fewer hardware resources. The accelerator achieves a throughput of 76.19 images/s with 97.45% accuracy while maintaining low resource and power consumption. These results demonstrate that the proposed FPGA-based approach supports real-time CNN inference with high accuracy, high throughput, and efficient hardware usage, making it suitable for broader use in embedded, latency-sensitive image analysis applications.

Text
Thesis_of_Doctor_of_Philosophy_Serkan_Saglam__Final
Available under License University of Southampton Thesis Licence.
Download (14MB)
Text
Final-thesis-submission-Examination-MR-Serkan-Saglam
Restricted to Repository staff only

More information

Published date: 2025

Identifiers

Local EPrints ID: 505575
URI: http://eprints.soton.ac.uk/id/eprint/505575
PURE UUID: c7d80f63-d21b-4ce2-a351-cf3d4f1ab4c5
ORCID for Mark Zwolinski: orcid.org/0000-0002-2230-625X
ORCID for Gopal Ramchurn: orcid.org/0000-0001-9686-4302
ORCID for Tim Underwood: orcid.org/0000-0001-9455-2188

Catalogue record

Date deposited: 14 Oct 2025 16:41
Last modified: 18 Oct 2025 01:41

Contributors

Author: Serkan Saglam
Thesis advisor: Mark Zwolinski
Thesis advisor: Gopal Ramchurn
Thesis advisor: Tim Underwood
