The University of Southampton
University of Southampton Institutional Repository

Nucleus: finding the sharing limit of heterogeneous cores

Vougioukas, Ilias, Sandberg, Andreas, Diestelhorst, Stephan, Al-Hashimi, Bashir and Merrett, Geoffrey (2017) Nucleus: finding the sharing limit of heterogeneous cores. ACM Transactions on Embedded Computing Systems, 1 (1).

Record type: Article


Heterogeneous multi-processors are designed to bridge the gap between performance and energy efficiency in modern embedded systems. They do so by pairing Out-of-Order (OoO) cores, which deliver performance through aggressive speculation and latency masking, with In-Order (InO) cores, which save energy through a simpler design. By migrating between the two core types, workloads can select the best setting for any given energy/delay envelope. However, migrations introduce execution overheads that can hurt performance if they happen too frequently. Finding the optimal migration frequency is therefore critical to maximizing energy savings while maintaining acceptable performance. We develop a simulation methodology that can 1) isolate the hardware effects of migrations from the software, 2) directly compare the performance of different core types, 3) quantify the performance degradation and 4) calculate the cost of migrations for each case. To showcase our methodology we run MiBench, a microbenchmark suite, and show that migrations can happen as frequently as every 100k instructions with little performance loss. We also show that, contrary to numerous recent studies, hypothetical designs do not need to share all of their internal components to be able to migrate at that frequency. Instead, we propose a feasible system, sharing level 2 caches and a translation lookaside buffer, that matches performance and efficiency. Our results show that in up to 10% of phases a migration to the OoO core yields performance benefits at no additional energy cost relative to running on the InO core, and that in up to 6% of phases a migration to the InO core saves energy without affecting performance. When considering a policy that focuses on improving the energy-delay product, results show that on average 66% of the phases can be migrated to deliver equal or better system operation without having to aggressively share the entire memory system or to resort to migration periods finer than 100k instructions.
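As a rough illustration of the energy-delay product (EDP) policy mentioned in the abstract, the sketch below compares one execution phase on the current core against the same phase on the other core and recommends a migration only when the EDP is equal or better. It is not the authors' simulation methodology: the names `PhaseSample` and `should_migrate`, the optional migration-cost parameters, and all numbers are hypothetical and chosen only to show the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class PhaseSample:
    """Energy and delay of one execution phase on one core (hypothetical units)."""
    energy_j: float
    delay_s: float

    @property
    def edp(self) -> float:
        # Energy-delay product: lower is better.
        return self.energy_j * self.delay_s


def should_migrate(current: PhaseSample,
                   other: PhaseSample,
                   migration_energy_j: float = 0.0,
                   migration_delay_s: float = 0.0) -> bool:
    """Return True if running the phase on the other core, including the
    assumed migration overhead, gives an equal or better (lower) EDP."""
    migrated = PhaseSample(other.energy_j + migration_energy_j,
                           other.delay_s + migration_delay_s)
    return migrated.edp <= current.edp


# Made-up numbers for a single phase; they are not taken from the paper.
ino = PhaseSample(energy_j=1.0, delay_s=2.0)   # EDP = 2.0
ooo = PhaseSample(energy_j=1.5, delay_s=1.0)   # EDP = 1.5

free_migration = should_migrate(ino, ooo)
costly_migration = should_migrate(ino, ooo, migration_delay_s=0.5)
```

With these made-up values the migration pays off when it is free (EDP 1.5 against 2.0) but not once an assumed 0.5 s migration delay is added (EDP 2.25 against 2.0), which is the trade-off behind the search for a migration frequency that keeps overheads small.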

PDF 38_Vougioukas - Accepted Manuscript
Available under License Creative Commons Attribution Share Alike.

More information

Accepted/In Press date: 30 June 2017
Published date: 1 October 2017


Local EPrints ID: 412917
ISSN: 1539-9087
PURE UUID: 94471bd3-75b4-49a2-9c0c-52ba8db2dba4

Catalogue record

Date deposited: 08 Aug 2017 16:31
Last modified: 08 Aug 2017 16:31


Author: Ilias Vougioukas
Author: Andreas Sandberg
Author: Stephan Diestelhorst
Author: Geoffrey Merrett
