The University of Southampton
University of Southampton Institutional Repository

Enabling ImageNet-scale deep learning on MCUs for accurate and efficient inference

Sadiq, Sulaiman, Hare, Jonathon, Craske, Simon, Maji, Partha and Merrett, Geoff (2023) Enabling ImageNet-scale deep learning on MCUs for accurate and efficient inference. IEEE Internet of Things Journal. (In Press)

Record type: Article

Abstract

Conventional approaches to TinyML achieve high accuracy by deploying the largest deep learning model, at the highest input resolution, that fits within the size constraints imposed by a microcontroller unit's (MCU's) fast internal storage and memory. In this paper, we perform an in-depth analysis of prior works to show that models derived within these constraints suffer from low accuracy and, surprisingly, high latency. We propose an alternative approach that enables the deployment of efficient models with low inference latency, free from the constraints of internal memory. We take a holistic view of typical MCU architectures and utilise plentiful but slower external memories to relax internal storage and memory constraints. To prevent the lower speed of external memory from impacting inference latency, we build on the TinyOps inference framework, which performs operation partitioning and uses overlays via DMA to reduce latency. Using insights from our study, we deploy efficient models from the TinyOps design space onto a range of embedded MCUs, achieving record performance on TinyML ImageNet classification with up to 6.7% higher accuracy and 1.4x lower latency compared to state-of-the-art internal memory approaches.

Text
Enabling ImageNet-Scale Deep Learning on MCUs for Accurate and Efficient Inference - Accepted Manuscript

More information

Accepted/In Press date: 31 October 2023

Identifiers

Local EPrints ID: 483972
URI: http://eprints.soton.ac.uk/id/eprint/483972
ISSN: 2327-4662
PURE UUID: 87d1efc4-19e1-4144-a500-1e128a5e1555
ORCID for Jonathon Hare: orcid.org/0000-0003-2921-4283
ORCID for Geoff Merrett: orcid.org/0000-0003-4980-3894

Catalogue record

Date deposited: 08 Nov 2023 17:51
Last modified: 18 Mar 2024 03:03

Contributors

Author: Sulaiman Sadiq
Author: Jonathon Hare
Author: Simon Craske
Author: Partha Maji
Author: Geoff Merrett
