Building a streaming data solution requires thorough testing at the scale it will operate at in a production environment. Streaming applications operating at scale often handle large volumes of up to GBs per second, and it's challenging for developers to simulate high-traffic Amazon Kinesis-based applications to generate such load easily.
Amazon Kinesis Data Streams and Amazon Kinesis Data Firehose are capable of capturing and storing terabytes of data per hour from numerous sources. Creating Kinesis data streams or Firehose delivery streams is straightforward through the AWS Management Console, AWS Command Line Interface (AWS CLI), or Kinesis API. However, generating a continuous stream of test data requires a custom process or script to run continuously. Although the Amazon Kinesis Data Generator (KDG) provides a user-friendly UI for this purpose, it has some limitations, such as bandwidth constraints and increased round trip latency. (For more information on the KDG, refer to Test Your Streaming Data Solution with the New Amazon Kinesis Data Generator.)
To overcome these limitations, this post describes how to use Locust, a modern load testing framework, to conduct large-scale load testing for a more comprehensive evaluation of your streaming data solution.
Overview
This project emits temperature sensor readings via Locust to Kinesis. We set up the Amazon Elastic Compute Cloud (Amazon EC2) Locust instance via the AWS Cloud Development Kit (AWS CDK) to load test Kinesis-based applications. You can access the Locust dashboard to perform and observe the load test, and connect via Session Manager, a capability of AWS Systems Manager, for configuration changes. The following diagram illustrates this architecture.

In our testing with the largest recommended instance (c7g.16xlarge), the setup was capable of emitting over 1 million events per second to Kinesis data streams in on-demand capacity mode, with a batch size (simulated users per Locust user) of 500. You can find more details on what this means and how to configure the load test later in this post.
Locust overview
Locust is an open-source, scriptable, and scalable performance testing tool that lets you define user behavior using Python code. It offers an easy-to-use interface, making it developer-friendly and highly extensible. With its distributed and scalable design, Locust can simulate millions of simultaneous users to mimic real user behavior during a performance test.
Each Locust user represents a scenario or a specific set of actions that a real user might perform on your system. When you run a performance test with Locust, you can specify the number of concurrent Locust users you want to simulate, and Locust will create an instance for each user, allowing you to assess the performance and behavior of your system under different user loads.
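As a minimal illustration of this concept (a sketch, not the project's actual code; the class and task names are hypothetical), a Locust user is simply a Python class with one or more tasks:

```python
from locust import User, constant, task


class SensorUser(User):
    """One simulated user; Locust creates an instance of this class per concurrent user."""

    wait_time = constant(1)  # pause one second between task executions

    @task
    def emit_reading(self):
        # In this project, each task iteration would send sensor readings to Kinesis.
        pass
```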
For more information on Locust, refer to the Locust documentation.
Prerequisites
To get started, clone or download the code from the GitHub repository.
Test locally
To test Locust locally before deploying it to the cloud, you have to install the necessary Python dependencies. If you're new to Python, refer to the README for more information on getting started.
Navigate to the load-test directory and run the following code:
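A minimal sketch of those commands, assuming the Python dependencies are listed in a requirements.txt file (the file name is an assumption):

```bash
cd load-test
pip install -r requirements.txt  # assumed dependency file name
```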
To send events to a Kinesis data stream from your local machine, you need to have AWS credentials. For more information, refer to Configuration and credential file settings.
To perform the test locally, stay in the load-test directory and run the following code:
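For example, assuming the Locust file is locust-load-test.py (the file referenced later in this post), you could start Locust locally with:

```bash
locust -f locust-load-test.py
```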
You can now access the Locust dashboard via http://0.0.0.0:8089/. Enter the number of Locust users, the spawn rate (users added per second), and the target Amazon Kinesis data stream name for Host. By default, the project deploys the Kinesis data stream DemoStream that you can use for testing.

To see the generated events logged, run the following command, which filters only Locust and root logs (for example, no Botocore logs):
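The exact command depends on how you run Locust; one possible approach, assuming Locust runs in the foreground, is to pipe its output through grep:

```bash
# Keep only lines from the locust and root loggers (drops Botocore output)
locust -f locust-load-test.py 2>&1 | grep -E "locust|root"
```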
Set up resources with the AWS CDK
The GitHub repository contains the AWS CDK code to create all the necessary resources for the load test. This removes opportunities for manual error, increases efficiency, and ensures consistent configurations over time. To deploy the resources, complete the following steps:
- If not already downloaded, clone the GitHub repository to your local computer using the following command:
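A sketch of the command, with the repository URL and directory left as placeholders:

```bash
git clone <repository-url>
cd <repository-directory>
```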
- Download and install the latest Node.js.
- Navigate to the root folder of the project and run the following command to install the latest version of AWS CDK:
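For example, using npm (installed with Node.js in the previous step):

```bash
npm install -g aws-cdk
```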
- Install the necessary dependencies:
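For example, assuming the project's Node.js dependencies are declared in its package.json:

```bash
npm install
```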
- Run cdk bootstrap to initialize the AWS CDK environment in your AWS account. Replace your AWS account ID and Region before running the following command:
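A sketch of the bootstrap command, with placeholders for your account ID and Region:

```bash
cdk bootstrap aws://<ACCOUNT-ID>/<REGION>
```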
To learn more about the bootstrapping process, refer to Bootstrapping.
- After the dependencies are installed, you can run the following command to deploy the stack of the AWS CDK template, which sets up the infrastructure within 5 minutes:
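For example:

```bash
cdk deploy
```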
The template sets up the Locust EC2 test instance, which is by default a c7g.xlarge instance, which at the time of publishing costs approximately $0.145 per hour in us-east-1. To find the most accurate pricing information, see Amazon EC2 On-Demand Pricing. You can find more details on how to change your instance size according to your scale of load testing later in this post.
It's important to consider that the expenses incurred during load testing aren't solely attributed to EC2 instance costs, but are also heavily influenced by data transfer costs.
Access the Locust dashboard
You can access the dashboard using the AWS CDK output KinesisLocustLoadTestingStack.locustdashboardurl, for example http://1.2.3.4:8089.
The Locust dashboard is password protected. By default, it's set to the user name locust-user and the password locust-dashboard-pwd.
With the default configuration, you can achieve up to 15,000 emitted events per second. Enter the number of Locust users (times the batch size), the spawn rate (users added per second), and the target Kinesis data stream name for Host.

After you have started the load test, you can observe its progress on the Charts tab.

You can also monitor the load test on the Kinesis Data Streams console by navigating to the stream that you're load testing. If you used the default settings, navigate to DemoStream. On the detail page, choose the Monitoring tab to see the ingested load.

Adapt workloads
By default, this project generates random temperature sensor readings for every sensor with the following format:
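The exact fields are defined in the project code; an illustrative reading could look like the following (field names and values here are assumptions):

```json
{
  "sensorId": "sensor-01",
  "temperature": 23.4,
  "status": "OK",
  "timestamp": 1700000000
}
```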
The project comes packaged with Faker, which you can use to adapt the payload to your needs. You just have to update the generate_sensor_reading function in the locust-load-test.py file:
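A minimal sketch of how such a function could look, assuming Faker's standard providers (this is not the project's actual implementation, and the function signature shown is an assumption):

```python
import random
import time

from faker import Faker

fake = Faker()


def generate_sensor_reading(sensor_id: str) -> dict:
    """Return one payload record; adapt the fields to match your own schema."""
    return {
        "sensorId": sensor_id,
        "temperature": round(random.uniform(15.0, 35.0), 2),
        "location": fake.city(),  # example of a Faker-generated field
        "timestamp": int(time.time()),
    }
```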
Change configurations
After the initial deployment of the load testing tool, you can change the configuration in two ways:
- Connect to the EC2 instance, make any configuration and code changes, and restart the Locust process
- Change the configuration and load testing code locally and redeploy it via cdk deploy
The first option helps you iterate more quickly on the remote instance without the need to redeploy. The latter uses the infrastructure as code (IaC) approach and makes sure that your configuration changes can be committed to your source control system. For a fast development cycle, it's recommended to test your load test configuration locally first, connect to your instance to apply the changes, and after successful implementation, codify it as part of your IaC repository and then redeploy.
Locust is created on the EC2 instance as a systemd service and can therefore be managed with systemctl. If you want to change the configuration of Locust as needed without redeploying the stack, you can connect to the instance via Systems Manager, navigate to the project directory at /usr/local/load-test, change the locust.env file, and restart the service by running sudo systemctl restart locust.
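For example, the sequence could look like this (the instance ID is a placeholder, and the editor choice is up to you):

```bash
# Open a shell on the Locust instance via Systems Manager Session Manager
aws ssm start-session --target <instance-id>

# On the instance: adjust the configuration and restart the service
cd /usr/local/load-test
sudo vi locust.env
sudo systemctl restart locust
```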
Large-scale load testing
This setup is capable of emitting over 1 million events per second to the Kinesis data stream, with a batch size of 500 and 64 secondaries on a c7g.16xlarge instance.
To achieve peak performance with Locust and Kinesis, keep the following in mind:
- Instance size – Your performance is bound by the underlying EC2 instance, so refer to EC2 instance types for more information about scaling. To set the right instance size, you can configure it in the file kinesis-locust-load-testing.ts.
- Number of secondaries – Locust benefits from a distributed setup. Therefore, the setup spins up a primary, which does the coordination, and multiple secondaries, which do the actual work. To fully take advantage of the cores, you should specify one secondary per core. You can configure the number in the locust.env file.
- Batch size – The number of Kinesis data stream events you can send per Locust user is limited due to the resource overhead of switching Locust users and threads. To overcome this, you can configure a batch size that defines how many users are simulated per Locust user. These are sent as a single Kinesis data stream put_records call, as sketched after this list. You can configure the number in the locust.env file.
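To illustrate the batching mechanic, the following sketch shows how such a batch could be sent with boto3 (an assumption; this is not the project's actual code, and send_batch and BATCH_SIZE are hypothetical names):

```python
import json
import random
import time

import boto3

kinesis = boto3.client("kinesis")
BATCH_SIZE = 500  # "simulated users" sent per Locust user; 500 is also the put_records limit


def send_batch(stream_name: str) -> None:
    """Send BATCH_SIZE sensor readings to Kinesis in a single put_records call."""
    records = [
        {
            "Data": json.dumps(
                {
                    "sensorId": f"sensor-{i}",
                    "temperature": round(random.uniform(15.0, 35.0), 2),
                    "timestamp": int(time.time()),
                }
            ).encode("utf-8"),
            "PartitionKey": f"sensor-{i}",
        }
        for i in range(BATCH_SIZE)
    ]
    kinesis.put_records(StreamName=stream_name, Records=records)


if __name__ == "__main__":
    # Example: send one batch to the default stream deployed by this project
    send_batch("DemoStream")
```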
Running with a batch size of 500 and 64 secondaries on a c7g.16xlarge instance, the load test emits over 1 million events per second to the Kinesis data stream.

You can observe this on the Monitoring tab for the Kinesis data stream as well.

Clean up
To avoid incurring unnecessary costs, delete the stack by running the following code:
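For example, from the project's root folder:

```bash
cdk destroy
```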
Summary
Kinesis is already popular for its ease of use among users building streaming applications. With this load testing capability using Locust, you can now test your workloads in a more straightforward and faster way. Visit the GitHub repo to embark on your testing journey.
The project is licensed under the Apache 2.0 license, giving you the freedom to clone and modify it according to your needs. Additionally, you can contribute to the project by submitting issues or pull requests via GitHub, fostering collaboration and improvement in the testing ecosystem.
About the author

Luis Morales works as a Senior Solutions Architect with digital native businesses to support them in constantly reinventing themselves in the cloud. He is passionate about software engineering, cloud-native distributed systems, test-driven development, and all things code and security.
