A Suitability Score to optimize CNNs on an FPGA accelerator

This thesis presents a structured optimisation methodology for deploying convolutional neural networks (CNNs) on field-programmable gate arrays (FPGAs), targeting high-throughput operation under tight computational-resource and latency constraints. The approach combines model-level restructuring, hardware-aware scheduling, and hardware–software co-design to deliver high throughput on FPGAs while preserving CNN model accuracy. Oesophageal cancer detection serves as a representative case study, providing a computationally intensive and accuracy-critical scenario for evaluating the proposed methods.
The methodology introduces the Suitability Score, a metric that identifies which convolutional layers benefit most from hardware-aware optimisation. This analysis enables selective adjustments that reduce computational cost without sacrificing model accuracy. Building on these insights, a layer-specific pipelining strategy improves the hardware resource efficiency and inference latency of the deployed CNN accelerator. The optimised model is deployed on an FPGA using a co-design framework, demonstrating high throughput and competitive accuracy while consuming fewer hardware resources than FPGA-based CNN accelerators reported in the literature.
The proposed accelerator is deployed on an AMD Kintex UltraScale+ FPGA and evaluated against graphics processing unit (GPU)-based inference and existing FPGA implementations. Compared to a GPU baseline, the accelerator achieves at least 47.6% higher throughput and more than twice the energy efficiency. Compared with existing FPGA implementations, it processes up to 7.8× more images per second while using fewer hardware resources. It reaches a throughput of 76.19 images/s with 97.45% accuracy while maintaining low resource and power consumption. These results show that the proposed FPGA-based approach supports real-time CNN inference with high accuracy, high throughput, and efficient hardware usage, making it suitable for broader use in embedded, latency-sensitive image analysis applications.
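To put the reported throughput in more familiar terms, the following minimal sketch converts 76.19 images/s into a mean time per completed image and works out the GPU throughput that a 47.6% gain would imply. The helper names are hypothetical, and the second calculation assumes the 47.6% figure is measured against the 76.19 images/s result, which the abstract does not state explicitly.

```python
def frame_period_ms(images_per_second: float) -> float:
    """Mean time between completed images, in milliseconds."""
    return 1000.0 / images_per_second


def implied_baseline(throughput: float, gain_percent: float) -> float:
    """Baseline throughput that a given percentage gain would imply."""
    return throughput / (1.0 + gain_percent / 100.0)


fpga_throughput = 76.19  # images/s, as reported in the abstract
gpu_gain = 47.6          # percent, reported as a lower bound ("at least")

period_ms = frame_period_ms(fpga_throughput)

# Assumption: the 47.6% gain refers to the 76.19 images/s figure.
# Because the gain is a lower bound, this value is an upper bound.
baseline = implied_baseline(fpga_throughput, gpu_gain)

print(f"Mean time per image: {period_ms:.2f} ms")
print(f"Implied GPU baseline: at most {baseline:.2f} images/s")
```

The mean time per image equals the end-to-end latency only if a single image is in flight at a time; in a pipelined accelerator the latency of an individual image can be longer than this period.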

University of Southampton

Saglam, Serkan

7507c7d4-d9ca-46e9-b29b-4dc7509e9280

2025

Zwolinski, Mark

adfcb8e7-877f-4bd7-9b55-7553b6cb3ea0

Ramchurn, Gopal

1d62ae2a-a498-444e-912d-a6082d3aaea3

Underwood, Tim

8e81bf60-edd2-4b0e-8324-3068c95ea1c6

Saglam, Serkan (2025) A Suitability Score to optimize CNNs on an FPGA accelerator. University of Southampton, Doctoral Thesis, 143pp.

Record type: Thesis (Doctoral)

Text

Thesis_of_Doctor_of_Philosophy_Serkan_Saglam__Final

Available under License University of Southampton Thesis Licence.

More information

Published date: 2025

Learn more about the School of Electronics and Computer Science

Identifiers

Local EPrints ID: 505575

URI: http://eprints.soton.ac.uk/id/eprint/505575

PURE UUID: c7d80f63-d21b-4ce2-a351-cf3d4f1ab4c5

ORCID for Mark Zwolinski: orcid.org/0000-0002-2230-625X

ORCID for Gopal Ramchurn: orcid.org/0000-0001-9686-4302

ORCID for Tim Underwood: orcid.org/0000-0001-9455-2188

Catalogue record

Date deposited: 14 Oct 2025 16:41

Last modified: 08 Jan 2026 02:42

Contributors

Author: Serkan Saglam

Thesis advisor: Mark Zwolinski

Thesis advisor: Gopal Ramchurn

Thesis advisor: Tim Underwood
