A fault-tolerant mechanism for desktop cloud systems
Alwabel, Abdulelah (2015) A fault-tolerant mechanism for desktop cloud systems. University of Southampton, Physical Sciences and Engineering, Doctoral Thesis, 150pp.
Record type: Thesis (Doctoral)
Abstract
Cloud computing is a paradigm that promises to move IT another step towards the age of computing utility. Traditionally, Clouds employ dedicated resources located in data centres to provide services to clients. The resources in such Cloud systems are known to be highly reliable with a low probability of failure. Desktop Cloud computing is a new type of Cloud computing that aims to provide Cloud services at little or no cost. This ambition can be achieved by combining Cloud computing and Volunteer computing into Desktop Clouds, harnessing non-dedicated resources when idle.
The resources can be any type of computing machine, for example a standard PC, but such resources are renowned for their volatility: failures can happen at any time without warning. In Cloud computing, tasks are submitted by Cloud users or brokers to be processed and executed by virtual machines (VMs), and VMs are hosted by physical machines (PMs). In this context, throughput is defined as the proportion of the total number of submitted tasks that are successfully processed. The failure of a PM therefore reduces the throughput of a Desktop Cloud system, because it destroys all hosted VMs and the tasks they are processing are lost. The aim of this research is to design a VM allocation mechanism for Desktop Cloud systems that is tolerant to node failure. VM allocation mechanisms are responsible for allocating VMs to PMs and migrating them during runtime with the objective of optimisation, yet existing mechanisms pay little attention to node failure events.
The contribution of this research is a Fault-Tolerant (FT) VM allocation mechanism that handles PM failure events in Desktop Clouds, employing a replication technique to keep the throughput of a Desktop Cloud system within acceptable levels. Since replication increases the power consumption of PMs, the mechanism is enhanced with a migration policy to minimise this effect. The mechanism is evaluated using three metrics: throughput of tasks, power consumption of PMs, and service availability. The evaluation is conducted using DesktopCloudSim, a tool developed in this study as an extension to CloudSim, the well-known Cloud simulation tool, to simulate node failure events in Cloud systems. Node failure is analysed using real data sets collected from Failure Trace Archives. The experiments demonstrate that, in the presence of node failures, the FT mechanism yields a statistically significant improvement in the throughput of Cloud systems compared with traditional mechanisms (First Come First Serve, Greedy and RoundRobin). When its migration policy is employed, the FT mechanism also achieves a statistically significant reduction in power consumption.
Text
__soton.ac.uk_ude_personalfiles_users_jo1d13_mydesktop_PhD Thesis.pdf
- Other
More information
Published date: June 2015
Organisations: University of Southampton, Electronic & Software Systems
Identifiers
Local EPrints ID: 387007
URI: http://eprints.soton.ac.uk/id/eprint/387007
PURE UUID: 20ac3800-0a2d-434c-9b99-39b9e84461cb
Catalogue record
Date deposited: 17 Feb 2016 14:38
Last modified: 15 Mar 2024 02:51
Contributors
Author: Abdulelah Alwabel
Thesis advisor: Gary Wills