The University of Southampton
University of Southampton Institutional Repository

Multi–Armed Bandit Models for Efficient Long–Term Information Collection in Wireless Sensor Networks

Tran-Thanh, Long (2010) Multi–Armed Bandit Models for Efficient Long–Term Information Collection in Wireless Sensor Networks (In Press)

Record type: Monograph (Project Report)

Abstract

We are entering a new age in the evolution of computer systems, in which pervasive computing technologies seamlessly interact with human users. These technologies serve people in their everyday lives at home and at work by functioning invisibly in the background, creating a smart environment around them, such as an intelligent building or a smart traffic control system. Since such smart environments need information about their surroundings to function effectively, they rely first and foremost on sensory data from the real world. This data is typically provided by wireless sensor networks, which are networks of small, autonomous sensor devices. The advantages of wireless sensor networks, such as flexibility, low cost and ease of deployment, have attracted significant attention from both researchers and manufacturers. However, due to the resource constraints of such sensors (e.g. hardware limitations, low computational capacity, or limited energy budget), there are still a number of significant and specific research challenges to be addressed in this domain.

To overcome these challenges, we believe an efficient solution for long–term information collection in wireless sensor networks should fulfil the following requirements: (i) adaptivity to environmental changes; (ii) robustness and flexibility; (iii) computational feasibility; and (iv) limited use of communication. In more detail, since wireless sensor networks are typically deployed in dynamic environments, a solution must take environmental changes into account and be able to adapt to them. Furthermore, since future changes of the environment are typically unknown a priori, we cannot accurately predict them. Thus, in order to adapt efficiently, a good solution must be on–line, so that it can quickly react to environmental changes. It must also cope with topological and physical changes (e.g. node or communication failures). Finally, due to the limited resources of the sensors, communication and computational costs should remain low relative to the size of the network.

Previous work on information collection in wireless sensor networks has typically focused on optimising data sampling, routing, information valuation and energy management in order to achieve efficient information collection. However, it usually fails to meet all of the aforementioned requirements. Specifically, existing solutions are typically not designed for long–term operation, since they cannot adapt to environmental changes. That is, they cannot modify their behaviour to suit the new characteristics of the environment. Other algorithms rely on a centralised control mechanism (i.e. a central unit is responsible for all the calculations and decision making). These solutions, however, are neither robust nor flexible, since the central unit may represent a computational bottleneck.

Against this background, this transfer report focuses on the challenge of developing decentralised adaptive on–line algorithms for efficient long–term information collection in the wireless sensor network domain. In particular, we focus on developing energy management and information–centric data routing policies that adapt their behaviour according to the energy that is harvested, in order to achieve efficient performance. In so doing, we introduce two new energy management techniques, based on multi–armed bandit learning, that allow each sensor to adaptively allocate its energy budget across the tasks of data sampling, receiving and transmitting. These approaches are devised to deal with two different situations: (i) when the sensors can harvest energy from the environment; and (ii) when energy harvesting from the environment is not possible. By using these approaches, each sensor can learn the optimal energy budget settings that give it efficient information collection in the long run. In addition, we propose a novel decentralised algorithm for information–centric routing.

In more detail, we first tackle the energy management problem with energy–harvesting sensors from the multi–armed bandit perspective. That is, we reduce the energy management problem to a non–stochastic multi–armed bandit model. Then, through extensive simulations, we demonstrate that this approach outperforms state–of–the–art non–learning algorithms. For the case of energy management with non–harvesting sensors, we show that existing multi–armed bandit models are not suitable for modelling this problem. Given this, we introduce a new bandit model, the budgeted multi–armed bandit with pulling cost, in order to efficiently tackle the energy management problem. Following this, we propose an epsilon–first approach for this new bandit problem, in which the first epsilon portion of the total budget is allocated to exploration (i.e. learning which actions are the most efficient). Finally, for routing, we introduce an information–centric routing problem, the maximal information throughput routing problem. Since existing routing algorithms are not suitable for solving this problem, we devise a simple but provably optimal decentralised algorithm that maximises the information throughput in the network.
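
The epsilon–first idea described in the abstract can be illustrated with a minimal Python sketch. The abstract only states that the first epsilon portion of the budget is spent on exploration; everything else here is an assumption made for illustration: the function name `epsilon_first`, the round-robin exploration order, the known per-arm costs, and the greedy exploitation rule that ranks arms by estimated reward per unit cost. It should not be read as the algorithm given in the report itself.

```python
import random


def epsilon_first(costs, pull, budget, epsilon):
    """Illustrative epsilon-first policy for a bandit with pulling costs.

    costs[i] is the (assumed known) cost of pulling arm i, pull(i) returns
    a reward, and budget is the total amount that may be spent.
    Returns (total_reward, pulls_per_arm).
    """
    if not 0.0 <= epsilon <= 1.0 or any(c <= 0 for c in costs):
        raise ValueError("need 0 <= epsilon <= 1 and strictly positive costs")

    n = len(costs)
    counts = [0] * n      # number of pulls per arm
    sums = [0.0] * n      # accumulated reward per arm
    total = 0.0
    remaining = budget

    def play(i):
        nonlocal total, remaining
        reward = pull(i)
        counts[i] += 1
        sums[i] += reward
        total += reward
        remaining -= costs[i]

    # Exploration: cycle through the arms until the next arm in the cycle
    # no longer fits into the exploration budget (epsilon * budget).
    explore_left = epsilon * budget
    i = 0
    while costs[i] <= explore_left:
        play(i)
        explore_left -= costs[i]
        i = (i + 1) % n

    # Exploitation (assumed rule): freeze the estimates, rank arms by
    # estimated mean reward per unit cost, and spend greedily.
    def density(j):
        return (sums[j] / counts[j]) / costs[j] if counts[j] else 0.0

    for j in sorted(range(n), key=density, reverse=True):
        while costs[j] <= remaining:
            play(j)

    return total, counts


# Usage with hypothetical arms: Bernoulli rewards with made-up success rates.
rng = random.Random(0)
means = [0.2, 0.9, 0.6]
reward, pulls = epsilon_first(
    costs=[1.0, 2.0, 3.0],
    pull=lambda i: 1.0 if rng.random() < means[i] else 0.0,
    budget=200.0,
    epsilon=0.1,
)
print(reward, pulls)
```

In this sketch a small epsilon leaves more budget for exploitation but makes the estimates noisier, which is the trade-off the exploration fraction controls.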

Text
18-month_report_Long_Tran-Thanh.pdf - Other
Download (1MB)

More information

Accepted/In Press date: 15 April 2010
Organisations: Agents, Interactions & Complexity

Identifiers

Local EPrints ID: 272585
URI: http://eprints.soton.ac.uk/id/eprint/272585
PURE UUID: 5658d236-6811-4d68-93b2-d84bd8809f38
ORCID for Long Tran-Thanh: orcid.org/0000-0003-1617-8316

Catalogue record

Date deposited: 20 Jul 2011 14:04
Last modified: 14 Mar 2024 10:05

Contributors

Author: Long Tran-Thanh
