Amazon Redshift is a widely used, fully managed, petabyte-scale cloud data warehouse. Tens of thousands of customers use Amazon Redshift to process exabytes of data every day to power their analytics workloads. With the launch of Amazon Redshift Serverless and the various deployment options Amazon Redshift provides (such as instance types and cluster sizes), customers are looking for tools that help them determine the optimal data warehouse configuration to support their Redshift workload.
In this post, we answer that question by using Redshift Test Drive, an open-source tool that lets you evaluate which data warehouse configuration options are best suited for your workload. We created Redshift Test Drive from SimpleReplay and redshift-config-compare (see Compare different node types for your workload using Amazon Redshift for more details) to provide a single entry point for finding the best Amazon Redshift configuration for your workload. Redshift Test Drive also provides additional features such as a self-hosted analysis UI and the ability to replicate external objects that a Redshift workload may interact with.
Amazon Redshift RA3 with managed storage is the newest instance type for provisioned clusters. It allows you to scale and pay for compute and storage independently, as well as use advanced features such as cross-cluster data sharing and cross-Availability Zone cluster relocation. Many customers using previous generation instance types want to upgrade their clusters to RA3 instance types. In this post, we show you how to use Redshift Test Drive to evaluate the performance of an RA3 cluster configuration for your Redshift workloads.
Solution overview
At its core, Redshift Test Drive replicates a workload by extracting queries from the source Redshift data warehouse logs (shown as Workload Extractor in the following figure) and replaying the extracted workload against the target Redshift data warehouses (Workload Replayer).
If these workloads interact with external objects via Amazon Redshift Spectrum (such as the AWS Glue Data Catalog) or COPY commands, Redshift Test Drive offers an external object replicator utility to clone these objects to facilitate replay.
Redshift Test Drive uses this process of workload replication for two main functionalities: comparing configurations and comparing replays.
Compare Amazon Redshift configurations
Redshift Test Drive’s ConfigCompare utility (based on the redshift-config-compare tool) helps you find the best Redshift data warehouse configuration by using your workload to run performance and functional tests on different configurations in parallel. This utility’s automation starts by creating a new AWS CloudFormation stack based on this CloudFormation template. The CloudFormation stack creates an AWS Step Functions state machine, which internally uses AWS Lambda functions to trigger AWS Batch jobs that run the workload comparison across different Redshift instance types. These jobs extract the workload from the source Redshift data warehouse log location for the specified workload time window (as provided in the config parameters) and then replay the extracted workload against the list of target Redshift data warehouse configurations provided in the configuration file. When the replay is complete, the Step Functions state machine uploads the performance stats for the target configurations to an Amazon Simple Storage Service (Amazon S3) bucket and creates external schemas that can then be queried from any Redshift target to identify a target configuration that meets your performance requirements.
The following diagram illustrates the architecture of this utility.
Compare replay performance
Redshift Test Drive also provides the ability to compare the replay runs visually using a self-hosted UI tool. This tool reads the stats generated by the workload replicator (stored in Amazon S3) and helps compare the replay runs across key performance indicators such as longest running queries, error distribution, queries with the most deviation in latency across runs, and more.
The following diagram illustrates the architecture for the UI.
Walkthrough overview
In this post, we provide a step-by-step walkthrough of using Redshift Test Drive to automatically replay your workload against different Amazon Redshift configurations with the ConfigCompare utility. Subsequently, we use the self-hosted analysis UI utility to analyze the output of ConfigCompare and determine the optimal target warehouse configuration to migrate or upgrade to. The following diagram illustrates the workflow.
Prerequisites
The following prerequisites should be addressed before we run the ConfigCompare utility:
- Enable audit logging and user-activity logging in your source cluster.
- Take a snapshot of the source Redshift data warehouse.
- Export your source parameter group and WLM configurations to Amazon S3. The parameter group can be exported using the AWS Command Line Interface (AWS CLI), for example, from CloudShell, with the aws redshift describe-cluster-parameters command, saving its JSON output to a file.
- The WLM configurations can be copied as JSON in the console, from where you can enter them into a file and upload it to Amazon S3. If you want to test any alternative WLM configurations (such as comparing manual vs. auto WLM or enabling concurrency scaling), you can create a separate file with that target configuration and upload it to Amazon S3 as well.
- Identify the target configurations you want to test. If you’re upgrading from DC2 to RA3 node types, refer to Upgrading to RA3 node types for recommendations.
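The parameter group export from the prerequisites can also be scripted. The following is a minimal Python sketch, assuming a boto3 Redshift client is passed in; the function name, the placeholder group name, and the output file name are illustrative, and whether the utility expects exactly this file layout is an assumption to check against the Readme.

```python
import json


def export_parameter_group(redshift_client, parameter_group_name, out_path):
    """Write every parameter of a Redshift parameter group to a JSON file.

    redshift_client is expected to behave like boto3.client("redshift").
    The {"Parameters": [...]} layout mirrors the API response; it is an
    assumption that this is the layout the utility wants.
    """
    parameters = []
    kwargs = {"ParameterGroupName": parameter_group_name}
    while True:
        page = redshift_client.describe_cluster_parameters(**kwargs)
        parameters.extend(page.get("Parameters", []))
        marker = page.get("Marker")  # present only when more pages remain
        if not marker:
            break
        kwargs["Marker"] = marker
    with open(out_path, "w") as handle:
        json.dump({"Parameters": parameters}, handle, indent=2, default=str)
    return len(parameters)


# Usage (requires boto3 and AWS credentials; the group name is a placeholder):
#   import boto3
#   export_parameter_group(boto3.client("redshift"),
#                          "my-source-parameter-group",
#                          "param_group_src.json")
```

The resulting file can then be uploaded to Amazon S3 alongside the WLM configuration file.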
For this walkthrough, let’s assume you have an existing Redshift data warehouse configuration with a two-node dc2.8xlarge provisioned cluster. You want to validate whether upgrading your current configuration to a decoupled architecture using the RA3 provisioned node type or Redshift Serverless would meet your workload price/performance requirements.
The following table summarizes the Redshift data warehouse configurations that are evaluated as part of this test.
Warehouse Type | Number of Nodes/Base RPU | Option |
dc2.8xlarge | 2 | default auto WLM |
ra3.4xlarge | 4 | default auto WLM |
Redshift Serverless | 64 | auto scaling |
Redshift Serverless | 128 | auto scaling |
Run the ConfigCompare utility
Before you run the utility, customize the details of the workload to replay, including the time period and the target warehouse configurations to test, in a JSON file. Upload this file to Amazon S3 and copy the S3 URI path to use as an input parameter for the CloudFormation template that deploys the resources for the remaining orchestration.
You can read more about the individual components and inputs of the JSON file in the Readme.
For our use case, the JSON input file describes the four warehouse configurations listed in the preceding table.
The utility deploys all the data warehouse configurations included in the CONFIGURATIONS section of the JSON file. A copy of the source configuration is also included to be used as a baseline of the existing workload performance.
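To make the shape of the CONFIGURATIONS section more concrete, here is a rough Python sketch that turns the rows of the comparison table into a list of entries. Only the CONFIGURATIONS name comes from this post; every other key name (warehouse_type, node_type, number_of_nodes, base_rpu) is a hypothetical placeholder, so consult the Readme for the real input schema before writing your file.

```python
import json

# Rows from the comparison table in this post: (warehouse type, nodes or base RPU).
WAREHOUSES = [
    ("dc2.8xlarge", 2),
    ("ra3.4xlarge", 4),
    ("Redshift Serverless", 64),
    ("Redshift Serverless", 128),
]


def build_configurations(rows):
    """Turn table rows into entries for a CONFIGURATIONS list.

    All key names inside each entry are hypothetical placeholders,
    not the utility's actual schema.
    """
    entries = []
    for warehouse_type, size in rows:
        if warehouse_type == "Redshift Serverless":
            entries.append({"warehouse_type": "serverless", "base_rpu": size})
        else:
            entries.append(
                {
                    "warehouse_type": "provisioned",
                    "node_type": warehouse_type,
                    "number_of_nodes": size,
                }
            )
    return entries


config = {"CONFIGURATIONS": build_configurations(WAREHOUSES)}
print(json.dumps(config, indent=2))
```

Generating the list from one table-like structure keeps the source baseline and the target configurations in a single place, which makes it harder to forget the baseline entry.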
After this file is fully configured and uploaded to Amazon S3, navigate to the AWS CloudFormation console, create a new stack based on this CloudFormation template, and specify the relevant parameters. For more details on the individual parameters, refer to the GitHub repo. The following screenshot shows the parameters used for this walkthrough.
After the parameters are updated, proceed with the next steps on the AWS CloudFormation console to launch the new stack.
When the stack is fully created, select the stack and open the Resources tab. Here, you can search for the term StepFunctions and choose the link next to the RedshiftConfigTestingStepFunction physical ID to open the Step Functions state machine to run the utility.
On the Step Functions page that opens, choose Start execution. Leave the default values and choose Start execution to trigger the run. Monitor the progress of the state machine’s run on the graph view of the page. The full run will take approximately the same time as the time window that was specified in the JSON configuration file.
When the status of the run changes from Running to Succeeded, the run is complete.
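If you prefer to watch for completion from a script rather than the console, the status check can be sketched as a simple polling loop. This is a minimal Python sketch, assuming a boto3 Step Functions client is passed in; the function name, the polling interval, and the execution ARN you supply are illustrative.

```python
import time

# Statuses after which a Step Functions execution no longer changes.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "TIMED_OUT", "ABORTED"}


def wait_for_execution(sfn_client, execution_arn, poll_seconds=60, sleep=time.sleep):
    """Poll an execution until it reaches a terminal status and return that status.

    sfn_client is expected to behave like boto3.client("stepfunctions").
    The sleep argument is injectable so the loop can be exercised without waiting.
    """
    while True:
        response = sfn_client.describe_execution(executionArn=execution_arn)
        status = response["status"]
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)


# Usage (requires boto3 and AWS credentials; the ARN is whatever your run shows):
#   import boto3
#   final = wait_for_execution(boto3.client("stepfunctions"), "<your-execution-arn>")
```

Because the replay takes about as long as the extracted time window, a polling interval of a minute or more is usually sufficient.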
Analyze the results
When the Step Functions state machine run is complete, the performance metrics are uploaded to the S3 bucket initially created by the CloudFormation template. To analyze the performance of the workload across different configurations, you can use the self-hosted UI tool that comes with Redshift Test Drive. Set up this tool for your workload by following the instructions provided in this Readme.
After you point the UI to the S3 location that has the stats from the ConfigCompare run, the Replays section will populate with the analysis for replays found in the input S3 location. Select the target configurations you want to compare and choose Analysis to navigate to the comparisons page.
You can use the Filter Results section to specify which query types, users, and time frame to compare, and the Analysis section will expand to show an analysis of all the selected replays. Here you can see a comparison of the SELECT queries run by the ad hoc user of the replay.
The following screenshot shows an example of the analysis of a replay. These results show the distribution of queries completed over the full run for a given user and query type, allowing us to identify periods of high and low activity. We can also see runtimes of these queries, aggregated as percentiles, average, and standard deviation. For example, the P50 value indicates that 50% of queries ran within 26.564 seconds. The parameters used to filter for specific users, query types, and runtimes can be dynamically updated, so the results and comparisons can be investigated according to the specific performance requirements of each use case.
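To illustrate what these aggregates mean, the following Python sketch computes percentiles, average, and standard deviation for a list of query runtimes using the standard library. The sample runtimes are made up for illustration, and the interpolation method used here is an assumption; the UI may compute its percentiles differently.

```python
import statistics


def summarize_runtimes(runtimes_seconds):
    """Return percentile, average, and standard deviation summaries of runtimes."""
    # quantiles(n=100) returns 99 cut points; index k-1 holds the k-th percentile.
    cuts = statistics.quantiles(runtimes_seconds, n=100, method="inclusive")
    return {
        "p50": cuts[49],  # half of the queries finished within this runtime
        "p90": cuts[89],
        "p99": cuts[98],
        "avg": statistics.fmean(runtimes_seconds),
        "std": statistics.stdev(runtimes_seconds),  # sample standard deviation
    }


# Illustrative runtimes in seconds (invented for this sketch, not replay output).
sample = [1.0, 2.0, 3.0, 4.0, 5.0]
summary = summarize_runtimes(sample)
print(summary)
```

Comparing the same percentile across two replays shows whether a target configuration shifts typical runtimes (P50) or mainly the slowest queries (P99).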
Troubleshooting
As shown in the solution architecture, the main moving parts in the ConfigCompare automation are AWS CloudFormation, Step Functions (internally using Lambda), and AWS Batch.
If any resource in the CloudFormation stack fails to deploy, we recommend troubleshooting the issue based on the error shown on the AWS CloudFormation console.
To troubleshoot errors with the Step Functions state machine, locate the Amazon CloudWatch logs for a step by navigating to the state machine’s latest run on the Step Functions console and choosing CloudWatch Logs for the failed Step Functions step. After resolving the error, you can restart the state machine by choosing New execution.
For AWS Batch errors, locate the AWS Batch logs by navigating to the AWS CloudFormation console and choosing the Resources tab in the CloudFormation stack. On this tab, search for LogGroup to find the AWS Batch run logs.
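The same Resources tab search can be approximated from a script. The following is a minimal Python sketch, assuming a boto3 CloudFormation client is passed in; the function name and the stack name in the usage comment are illustrative.

```python
def find_stack_resources(cfn_client, stack_name, search_term):
    """Return (logical ID, physical ID) pairs whose IDs or type contain search_term.

    cfn_client is expected to behave like boto3.client("cloudformation").
    Matching is case-insensitive, similar to typing a term into the console search.
    """
    term = search_term.lower()
    matches = []
    kwargs = {"StackName": stack_name}
    while True:
        page = cfn_client.list_stack_resources(**kwargs)
        for resource in page.get("StackResourceSummaries", []):
            searchable = " ".join(
                [
                    resource.get("LogicalResourceId", ""),
                    resource.get("PhysicalResourceId", ""),
                    resource.get("ResourceType", ""),
                ]
            ).lower()
            if term in searchable:
                matches.append(
                    (resource["LogicalResourceId"], resource.get("PhysicalResourceId", ""))
                )
        token = page.get("NextToken")  # present only when more pages remain
        if not token:
            break
        kwargs["NextToken"] = token
    return matches


# Usage (requires boto3 and AWS credentials; the stack name is a placeholder):
#   import boto3
#   find_stack_resources(boto3.client("cloudformation"), "my-stack", "LogGroup")
```

Searching for StepFunctions instead of LogGroup with the same function locates the state machine described earlier in the walkthrough.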
For more information about common errors and their solutions, refer to the Test Drive Readme.
Clean up
When you’ve completed the evaluation, we recommend manually deleting the deployed Redshift warehouses to avoid any on-demand charges that could accrue. After this, you can delete the CloudFormation stack to clean up other resources.
Limitations
Some of the limitations of the WorkloadReplicator (the core utility supporting the ConfigCompare tool) are outlined in the Readme.
Conclusion
In this post, we demonstrated the process of finding the right Redshift data warehouse configuration using Redshift Test Drive. The utility offers an easy-to-use tool to replicate the workload of your choice against customizable data warehouse configurations. It also provides a self-hosted analysis UI to help you dive deeper into the stats generated during the replication process.
Get started with Test Drive today by following the instructions provided in the Readme. For an in-depth overview of the config compare automation, refer to Compare different node types for your workload using Amazon Redshift. If you’re migrating from DC2 or DS2 node types to RA3, refer to our recommendations on node count and type as a benchmark.
About the Authors
Sathiish Kumar is a Software Development Manager at Amazon Redshift and has worked on building end-to-end applications using different database and technology solutions over the last 10 years. He is passionate about helping his customers find the quickest and most optimized solution to their problems by leveraging open-source technologies.
Julia Beck is an Analytics Specialist Solutions Architect at AWS. She supports customers in validating analytics solutions by architecting proof of concept workloads designed to meet their specific needs.
Ranjan Burman is an Analytics Specialist Solutions Architect at AWS. He specializes in Amazon Redshift and helps customers build scalable analytical solutions. He has more than 16 years of experience in different database and data warehousing technologies. He is passionate about automating and solving customer problems with cloud solutions.