Nirvana: A non-intrusive black-box monitoring framework for rack-level fault detection
Ciccotelli, Claudio, Aniello, Leonardo, Lombardi, Federico, Montanari, Luca, Querzoni, Leonardo and Baldoni, Roberto (2015) Nirvana: A non-intrusive black-box monitoring framework for rack-level fault detection. In 2015 IEEE 21st Pacific Rim International Symposium on Dependable Computing (PRDC). IEEE. (doi:10.1109/PRDC.2015.22).
Record type: Conference or Workshop Item (Paper)
Abstract
Many organizations today still manage mid-sized or large in-house data centers that require very expensive maintenance efforts, including fault detection. Common monitoring frameworks used to quickly detect faults are complex to deploy and maintain, expensive, and intrusive, as they require the installation of probes on the monitored hardware/software to collect raw data. Such intrusiveness can be problematic, as it imposes installation and management overhead and may interfere with security and privacy policies. In this paper we introduce NIRVANA, a novel monitoring system for fault detection that works at rack level and is (i) non-intrusive, i.e., it does not require the installation of software probes on the hosts to be monitored, and (ii) black-box, i.e., agnostic with respect to the monitored applications. At the core of our solution lies the observation that aggregated features that can be monitored at rack level in a non-intrusive and black-box way show predictable behaviors while the system works in both fault-free and faulty states. It is therefore possible to detect and identify faults by monitoring and analyzing any perturbations to these behaviors. An extensive experimental evaluation shows that non-intrusiveness does not significantly hamper the fault detection capabilities of the monitoring system, thus validating our approach.
Text
NIRVANA: A Non-Intrusive Black-Box Monitoring Framework for Rack-level Fault Detection
Restricted to Repository staff only
More information
Published date: 2015
Identifiers
Local EPrints ID: 431130
URI: http://eprints.soton.ac.uk/id/eprint/431130
PURE UUID: df3297b1-2821-44e3-8a95-14e64ec10cbc
Catalogue record
Date deposited: 24 May 2019 16:30
Last modified: 16 Mar 2024 04:32
Contributors
Author: Claudio Ciccotelli
Author: Leonardo Aniello
Author: Federico Lombardi
Author: Luca Montanari
Author: Leonardo Querzoni
Author: Roberto Baldoni