The University of Southampton
University of Southampton Institutional Repository

Benchmarking scalability of stream processing frameworks deployed as microservices in the cloud

Henning, Sören and Hasselbring, Wilhelm (2023) Benchmarking scalability of stream processing frameworks deployed as microservices in the cloud. Journal of Systems and Software, 208, [111879]. (doi:10.1016/j.jss.2023.111879).

Record type: Article

Abstract

Context: The combination of distributed stream processing with microservice architectures is an emerging pattern for building data-intensive software systems. In such systems, stream processing frameworks such as Apache Flink, Apache Kafka Streams, Apache Samza, Hazelcast Jet, or the Apache Beam SDK are used inside microservices to continuously process massive amounts of data in a distributed fashion. While all of these frameworks promote scalability as a core feature, there is little empirical research evaluating and comparing their scalability.
Objective: The goal of this study is to obtain evidence about the scalability of state-of-the-art stream processing frameworks in different execution environments and regarding different scalability dimensions.
Method: We benchmark five modern stream processing frameworks regarding their scalability using a systematic method. We conduct over 740 h of experiments on Kubernetes clusters in the Google Cloud and in a private cloud, where we deploy up to 110 simultaneously running microservice instances, which process up to one million messages per second.
Results: All benchmarked frameworks exhibit approximately linear scalability as long as sufficient cloud resources are provisioned. However, the frameworks show considerable differences in the rate at which resources have to be added to cope with increasing load. There is no clearly superior framework; instead, the ranking of the frameworks depends on the use case. Using Apache Beam as an abstraction layer still comes at the cost of significantly higher resource requirements regardless of the use case. Our results hold regardless of whether we scale the load on a microservice or the computational work performed inside the microservice, and regardless of the selected cloud environment. Moreover, vertical scaling can be a complementary measure to achieve scalability of stream processing frameworks.
Conclusion: While scalable microservices can be designed with all evaluated frameworks, the choice of a framework and its deployment has a considerable impact on the cost of operating it.

Text
1-s2.0-S0164121223002741-main - Version of Record
Available under License Creative Commons Attribution.

More information

Accepted/In Press date: 12 October 2023
e-pub ahead of print date: 24 October 2023
Published date: 27 October 2023
Keywords: Benchmarking, Microservices, Scalability, Stream processing

Identifiers

Local EPrints ID: 488788
URI: http://eprints.soton.ac.uk/id/eprint/488788
ISSN: 0164-1212
PURE UUID: 48cdf7cd-3504-40c0-b770-e699c16e0a83
ORCID for Wilhelm Hasselbring: orcid.org/0000-0001-6625-4335

Catalogue record

Date deposited: 05 Apr 2024 16:40
Last modified: 10 Apr 2024 02:15

Contributors

Author: Sören Henning
Author: Wilhelm Hasselbring
