A combined system metrics approach to cloud service reliability using artificial intelligence
Identifying and anticipating potential failures in the cloud is an effective way to increase cloud reliability and enable proactive failure management. Many studies have sought to predict potential failures, but none have combined SMART (self-monitoring, analysis, and reporting technology) hard drive metrics with other system metrics, such as central processing unit (CPU) utilisation. We therefore propose a combined system metrics approach to failure prediction, based on artificial intelligence, to improve reliability. We evaluated four artificial intelligence algorithms (random forest, gradient boosting, long short-term memory, and gated recurrent unit) on data from over 100 cloud servers, and also performed a correlation analysis. The correlation analysis sheds light on the relationships between system metrics and failure, and the experimental results demonstrate the advantages of combining system metrics, with the combined approach outperforming the state of the art.
Chhetri, Tek Raj
c3431de5-4860-43e5-b09f-3dbb752c8490
Dehury, Chinmaya Kumar
70fd2764-04aa-4cd0-bc58-b74c66243efd
Lind, Artjom
03b6e2d1-3bf4-41d6-917e-052d4cd7ba9e
Srirama, Satish Narayana
9c7b33c8-ee2b-45f1-81eb-3d603b8be475
Fensel, Anna
6d0be8a7-8261-48f1-9214-fc5fc59c40d3
1 March 2022
Chhetri, Tek Raj, Dehury, Chinmaya Kumar, Lind, Artjom, Srirama, Satish Narayana and Fensel, Anna (2022) A combined system metrics approach to cloud service reliability using artificial intelligence. Big Data and Cognitive Computing, 6 (1), [26]. (doi:10.3390/bdcc6010026).
Text: BDCC-06-00026-v2 (Version of Record)
More information
Accepted/In Press date: 24 February 2022
Published date: 1 March 2022
Identifiers
Local EPrints ID: 481456
URI: http://eprints.soton.ac.uk/id/eprint/481456
ISSN: 2504-2289
PURE UUID: 0bd2d513-527a-4b33-a738-405bcfc09af3
Catalogue record
Date deposited: 29 Aug 2023 16:52
Last modified: 17 Mar 2024 04:21
Contributors
Author: Tek Raj Chhetri
Author: Chinmaya Kumar Dehury
Author: Artjom Lind
Author: Satish Narayana Srirama
Author: Anna Fensel