The University of Southampton
University of Southampton Institutional Repository

YONO: modeling multiple heterogeneous neural networks on microcontrollers


Internet of Things (IoT) systems provide large amounts of data on all aspects of human behavior. Machine learning techniques, especially deep neural networks (DNNs), have shown promise in making sense of this data at a large scale. The research community has also worked to reduce the computational and resource demands of DNNs so that they can run on low-resource microcontrollers (MCUs). However, most current work in embedded deep learning focuses on solving a single task efficiently, while the multi-tasking nature and applications of IoT devices demand systems that can simultaneously handle a diverse range of tasks (such as activity, gesture, voice, and context recognition) with input from a variety of sensors. In this paper, we propose YONO, a product quantization (PQ) based approach that compresses multiple heterogeneous models and enables in-memory model execution and model switching for dissimilar multi-task learning on MCUs. We first adopt PQ to learn codebooks that store the weights of different models. We also propose a novel network optimization and heuristics to maximize the compression rate and minimize the accuracy loss. Then, we develop an online component of YONO for efficient model execution and switching between multiple tasks on an MCU at run time without relying on an external storage device. YONO compresses multiple heterogeneous models by up to 12.37x with negligible or no loss of accuracy. Furthermore, YONO's online component enables efficient execution (latency of 16-159 ms and energy consumption of 3.8-37.9 mJ per operation) and reduces model loading/switching latency and energy consumption by 93.3-94.5% and 93.9-95.0%, respectively, compared to external storage access. Interestingly, YONO can also compress various architectures trained with datasets that were not seen during YONO's offline codebook learning phase, showing the generalizability of our method.
To summarize, YONO shows great potential and opens further doors to enable multi-task learning systems on extremely resource-constrained devices.
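
The abstract describes learning PQ codebooks that store the weights of several models. The idea can be illustrated with a minimal sketch; this is not the authors' implementation. It assumes plain k-means over fixed-length sub-vectors, a single codebook shared by two randomly generated weight matrices, and 8-bit codes. All function names, the sub-vector length `d`, and the codebook size `k` are hypothetical choices made for illustration.

```python
import numpy as np

def split_subvectors(weights, d):
    """Flatten a weight tensor and cut it into rows of length d (zero-padded)."""
    flat = np.asarray(weights, dtype=np.float32).ravel()
    pad = (-flat.size) % d
    flat = np.concatenate([flat, np.zeros(pad, dtype=np.float32)])
    return flat.reshape(-1, d)

def encode(subvectors, codebook):
    """Return, for each sub-vector, the index of its nearest codeword."""
    diffs = subvectors[:, None, :] - codebook[None, :, :]
    dists = (diffs ** 2).sum(axis=2)
    return dists.argmin(axis=1).astype(np.uint8)  # assumes k <= 256

def learn_codebook(subvectors, k, iters=20, seed=0):
    """Plain k-means (Lloyd's algorithm); returns a (k, d) codebook."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(subvectors), size=k, replace=False)
    codebook = subvectors[start].copy()
    for _ in range(iters):
        codes = encode(subvectors, codebook)
        for j in range(k):
            members = subvectors[codes == j]
            if len(members):  # keep the old codeword if a cluster is empty
                codebook[j] = members.mean(axis=0)
    return codebook

def decode(codes, codebook, shape):
    """Rebuild an approximate weight tensor from codes and the codebook."""
    flat = codebook[codes].ravel()
    return flat[: int(np.prod(shape))].reshape(shape)

# Two stand-in "models" with different shapes (random data, illustration only).
rng = np.random.default_rng(1)
model_a = rng.normal(size=(64, 32)).astype(np.float32)
model_b = rng.normal(size=(10, 50)).astype(np.float32)

d, k = 4, 64  # hypothetical sub-vector length and codebook size
subs_a = split_subvectors(model_a, d)
subs_b = split_subvectors(model_b, d)

# One codebook is learned over the sub-vectors of both models.
codebook = learn_codebook(np.vstack([subs_a, subs_b]), k)

codes_a = encode(subs_a, codebook)
codes_b = encode(subs_b, codebook)
approx_a = decode(codes_a, codebook, model_a.shape)
approx_b = decode(codes_b, codebook, model_b.shape)

# Storage: one byte per code plus the shared float32 codebook.
original_bytes = model_a.nbytes + model_b.nbytes
compressed_bytes = codes_a.nbytes + codes_b.nbytes + codebook.nbytes
print("compression ratio:", original_bytes / compressed_bytes)
print("MSE model_a:", float(np.mean((approx_a - model_a) ** 2)))
```

In this sketch each model is reduced to an array of one-byte indices, and switching models only means switching which index array is decoded against the shared codebook. The compression ratio and error printed here depend entirely on the illustrative `d` and `k` and say nothing about the figures reported in the paper.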

Microcontrollers, Multi Task Learning, Product Quantization
285-297
IEEE
Kwon, Young D.
3e8c3dcd-214c-4771-90f4-b36ede48d763
Chauhan, Jagmohan
831a12dc-6df9-40ea-8bb3-2c5da8882804
Mascolo, Cecilia
e4a7bcf7-72c8-43b7-b6b3-4f8980da245d

Kwon, Young D., Chauhan, Jagmohan and Mascolo, Cecilia (2022) YONO: modeling multiple heterogeneous neural networks on microcontrollers. In 2022 21st ACM/IEEE International Conference on Information Processing in Sensor Networks (IPSN). IEEE. pp. 285-297. (doi:10.1109/IPSN54338.2022.00030).

Record type: Conference or Workshop Item (Paper)

This record has no associated files available for download.

More information

Published date: 18 July 2022
Venue - Dates: 21st ACM/IEEE International Conference on Information Processing in Sensor Networks, IPSN 2022, Virtual, Online, Italy, 2022-05-04 - 2022-05-06
Keywords: Microcontrollers, Multi Task Learning, Product Quantization

Identifiers

Local EPrints ID: 491967
URI: http://eprints.soton.ac.uk/id/eprint/491967
PURE UUID: 79997b2f-4f1a-4ade-8dd8-4373086b5e4d

Catalogue record

Date deposited: 09 Jul 2024 17:42
Last modified: 11 Jul 2024 01:46

Contributors

Author: Young D. Kwon
Author: Jagmohan Chauhan
Author: Cecilia Mascolo
