The University of Southampton
University of Southampton Institutional Repository

SmallClient for big data: an indexing framework towards fast data retrieval


Siddiqa, Aisha, Karim, Ahmad and Chang, Victor (2017) SmallClient for big data: an indexing framework towards fast data retrieval. Cluster Computing. (doi:10.1007/s10586-016-0712-4).

Record type: Article

Abstract

Numerous applications continuously generate massive amounts of data, and it has become critical to extract useful information while maintaining acceptable computing performance. The objective of this work is to design an indexing framework that minimizes indexing overhead and improves query execution and data search performance with optimum aggregation of computing performance. We propose SmallClient, an indexing framework to speed up query execution. SmallClient has three modules: block creation, index creation and query execution. The block creation module improves data retrieval performance with minimal data uploading overhead. The index creation module allows the maximum number of indexes on a dataset, increasing the index hit ratio while minimizing indexing overhead. Finally, the query execution module lets incoming queries use these indexes. The evaluation shows that SmallClient outperforms a Hadoop full scan, with a search performance gain of more than 90%. Meanwhile, the indexing overhead of SmallClient is reduced to approximately 50% and 80% for index size and indexing time, respectively.

Text
CLUS_Comp_Aisha_VC_accepted.pdf - Accepted Manuscript

More information

e-pub ahead of print date: 20 December 2016
Published date: June 2017
Keywords: Big data, Big data indexing, Big data retrieval, Big data analytics, Query execution, Data search performance
Organisations: Electronics & Computer Science, Electronic & Software Systems

Identifiers

Local EPrints ID: 403456
URI: http://eprints.soton.ac.uk/id/eprint/403456
ISSN: 1386-7857
PURE UUID: ce43c077-f279-4b01-8f70-7ddcb60dc397

Catalogue record

Date deposited: 30 Nov 2016 15:56
Last modified: 15 Mar 2024 03:43

Contributors

Author: Aisha Siddiqa
Author: Ahmad Karim
Author: Victor Chang

