To improve a Spark application's efficiency, it's important to monitor its performance and behavior. In this post, we demonstrate how to publish detailed Spark metrics from Amazon EMR to Amazon CloudWatch. This gives you the ability to identify bottlenecks while optimizing resource utilization.
CloudWatch provides a robust, scalable, and cost-effective monitoring solution for AWS resources and applications, with powerful customization options and seamless integration with other AWS services. By default, Amazon EMR sends basic metrics to CloudWatch to track the activity and health of a cluster. Spark's configurable metrics system allows metrics to be collected in a variety of sinks, including HTTP, JMX, and CSV files, but additional configuration is required to enable Spark to publish metrics to CloudWatch.
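For context, Spark's metrics system is configured through a metrics.properties file (or equivalent `spark.metrics.conf.*` properties). As a minimal sketch, the built-in CSV sink can be enabled like this; a custom sink such as the one in this post is wired up the same way, with the class name below shown only as a hypothetical placeholder:

```properties
# Route metrics from all instances (driver, executors) to Spark's built-in CSV sink
*.sink.csv.class=org.apache.spark.metrics.sink.CsvSink
*.sink.csv.period=30
*.sink.csv.unit=seconds
*.sink.csv.directory=/tmp/spark-metrics

# A custom sink is registered the same way; this class name is a placeholder
#*.sink.cloudwatch.class=com.example.metrics.CloudWatchSink
```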
Solution overview
This solution includes Spark configuration to send metrics to a custom sink. The custom sink collects only the metrics defined in a Metricfilter.json file. It uses the CloudWatch agent to publish the metrics to a custom CloudWatch namespace. The included bootstrap action script is responsible for installing and configuring the CloudWatch agent and the metric library on the Amazon Elastic Compute Cloud (Amazon EC2) EMR instances. A CloudWatch dashboard can provide instant insight into the performance of an application.
The following diagram illustrates the solution architecture and workflow.
The workflow includes the following steps:
- Users start a Spark EMR job, creating a step on the EMR cluster. With Apache Spark, the workload is distributed across the different nodes of the EMR cluster.
- In each node (EC2 instance) of the cluster, a Spark library captures and pushes metric data to a CloudWatch agent, which aggregates the metric data before pushing it to CloudWatch every 30 seconds.
- Users can view the metrics by accessing the custom namespace on the CloudWatch console.
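The aggregation step above is handled by the CloudWatch agent's configuration file. The fragment below is a sketch under assumptions: it assumes the library hands metrics to the agent over StatsD (the post doesn't specify the transport), and the namespace value is a placeholder; the `metrics_collection_interval` matches the 30-second push cadence described above:

```json
{
  "metrics": {
    "namespace": "EMRCustomSparkCloudWatchSink",
    "metrics_collected": {
      "statsd": {
        "service_address": ":8125",
        "metrics_collection_interval": 30,
        "metrics_aggregation_interval": 60
      }
    }
  }
}
```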
We provide an AWS CloudFormation template in this post as a general guide. The template demonstrates how to configure a CloudWatch agent on Amazon EMR to push Spark metrics to CloudWatch. You can review and customize it as needed to include your Amazon EMR security configurations. As a best practice, we recommend including your Amazon EMR security configurations in the template to encrypt data in transit.
You should also be aware that some of the resources deployed by this stack incur costs while they remain in use. EMR metrics themselves don't incur CloudWatch charges; however, custom metrics incur charges based on CloudWatch metrics pricing. For more information, see Amazon CloudWatch Pricing.
In the next sections, we go through the following steps:
- Create and upload the metrics library, installation script, and filter definition to an Amazon Simple Storage Service (Amazon S3) bucket.
- Use the CloudFormation template to create the following resources:
- Monitor the Spark metrics on the CloudWatch console.
Prerequisites
This post assumes that you have the following:
- An AWS account.
- An S3 bucket for storing the bootstrap script, library, and metric filter definition.
- A VPC created in Amazon Virtual Private Cloud (Amazon VPC), where your EMR cluster will be launched.
- Default IAM service roles for Amazon EMR permissions to AWS services and resources. You can create these roles with the aws emr create-default-roles command in the AWS Command Line Interface (AWS CLI).
- An optional EC2 key pair, if you plan to connect to your cluster through SSH rather than Session Manager, a capability of AWS Systems Manager.
Define the required metrics
To avoid sending unnecessary data to CloudWatch, our solution implements a metric filter. Review the Spark documentation to get familiar with the namespaces and their associated metrics. Determine which metrics are relevant to your specific application and performance goals. Different applications may require different metrics to monitor, depending on the workload, data processing requirements, and optimization objectives. The metric names you'd like to monitor should be defined in the Metricfilter.json file, along with their associated namespaces.
We've created an example Metricfilter.json definition, which includes capturing metrics related to data I/O, garbage collection, memory and CPU pressure, and Spark job, stage, and task metrics.
Note that certain metrics are not available in all Spark release versions (for example, appStatus was introduced in Spark 3.0).
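To illustrate the filtering idea, here is a small sketch in Python. The JSON shape and the metric names below are assumptions for illustration only; the actual Metricfilter.json schema used by the library may differ, so treat this as a model of the concept rather than the library's implementation:

```python
import json

# Hypothetical Metricfilter.json content: a list of namespaces, each with
# the metric names to keep. The real schema may differ from this sketch.
metric_filter = json.loads("""
{
  "filters": [
    {"namespace": "appStatus", "metrics": ["jobs.succeededJobs", "jobs.failedJobs"]},
    {"namespace": "executor", "metrics": ["jvmGCTime.count", "bytesRead.count"]}
  ]
}
""")

def is_allowed(namespace: str, metric: str, filters: dict) -> bool:
    """Return True if the metric is listed under its namespace in the filter."""
    for f in filters["filters"]:
        if f["namespace"] == namespace and metric in f["metrics"]:
            return True
    return False

print(is_allowed("appStatus", "jobs.failedJobs", metric_filter))        # True
print(is_allowed("executor", "threadpool.activeTasks", metric_filter))  # False
```

Only metrics that pass a check like this would be forwarded to the CloudWatch agent, keeping the custom-metric bill down.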
Create and upload the required files to an S3 bucket
For more information, see Uploading objects and Installing and running the CloudWatch agent on your servers.
To create and upload the bootstrap script, complete the following steps:
- On the Amazon S3 console, choose your S3 bucket.
- On the Objects tab, choose Upload.
- Choose Add files, then choose the Metricfilter.json, installer.sh, and examplejob.sh files.
- Additionally, add the emr-custom-cw-sink-0.0.1.jar metrics library file that corresponds to the Amazon EMR release version you will be using.
- Choose Upload, and note the S3 URIs for the files.
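The S3 URIs you note in the last step feed directly into the CloudFormation parameters later. A small sketch that derives them (the bucket name is a placeholder; each file would be uploaded with, for example, `aws s3 cp <file> s3://<bucket>/<file>`):

```python
# Placeholder bucket name; substitute your own.
bucket = "my-emr-metrics-bucket"

# File names from this walkthrough.
files = [
    "Metricfilter.json",
    "installer.sh",
    "examplejob.sh",
    "emr-custom-cw-sink-0.0.1.jar",
]

# The resulting URIs are what the CloudFormation parameters expect.
s3_uris = {f: f"s3://{bucket}/{f}" for f in files}
for uri in s3_uris.values():
    print(uri)
```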
Provision resources with the CloudFormation template
Choose Launch Stack to launch a CloudFormation stack in your account and deploy the template:
This template creates an IAM role, IAM instance profile, EMR cluster, and CloudWatch dashboard. The cluster starts a basic Spark example application. You will be billed for the AWS resources used if you create a stack from this template.
The CloudFormation wizard will ask you to modify or provide these parameters:
- InstanceType – The type of instance for all instance groups. The default is m5.2xlarge.
- InstanceCountCore – The number of instances in the core instance group. The default is 4.
- EMRReleaseLabel – The Amazon EMR release label you want to use. The default is emr-6.9.0.
- BootstrapScriptPath – The S3 path of the installer.sh installation bootstrap script that you copied earlier.
- MetricFilterPath – The S3 path of your Metricfilter.json definition that you copied earlier.
- MetricsLibraryPath – The S3 path of your CloudWatch emr-custom-cw-sink-0.0.1.jar library that you copied earlier.
- CloudWatchNamespace – The name of the custom CloudWatch namespace to be used.
- SparkDemoApplicationPath – The S3 path of your examplejob.sh script that you copied earlier.
- Subnet – The EC2 subnet where the cluster launches. You must provide this parameter.
- EC2KeyPairName – An optional EC2 key pair for connecting to cluster nodes, as an alternative to Session Manager.
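If you'd rather launch the stack programmatically than through the wizard, the same parameters can be assembled for boto3's `create_stack` call. All values below are placeholders, the optional EC2KeyPairName is omitted, and the actual API call is left commented out since it requires AWS credentials and the template's location:

```python
# Build the parameter list boto3's CloudFormation create_stack expects.
# Parameter keys come from the template described in this post; values are
# placeholders you would replace with your own.
params = {
    "InstanceType": "m5.2xlarge",
    "InstanceCountCore": "4",
    "EMRReleaseLabel": "emr-6.9.0",
    "BootstrapScriptPath": "s3://my-emr-metrics-bucket/installer.sh",
    "MetricFilterPath": "s3://my-emr-metrics-bucket/Metricfilter.json",
    "MetricsLibraryPath": "s3://my-emr-metrics-bucket/emr-custom-cw-sink-0.0.1.jar",
    "CloudWatchNamespace": "EMRCustomSparkCloudWatchSink",
    "SparkDemoApplicationPath": "s3://my-emr-metrics-bucket/examplejob.sh",
    "Subnet": "subnet-0123456789abcdef0",
}
cfn_parameters = [
    {"ParameterKey": k, "ParameterValue": v} for k, v in params.items()
]

# import boto3
# boto3.client("cloudformation").create_stack(
#     StackName="EMR-CloudWatch-Demo",
#     TemplateURL="https://example-bucket.s3.amazonaws.com/template.yaml",  # assumption
#     Parameters=cfn_parameters,
#     Capabilities=["CAPABILITY_IAM"],  # required: the template creates IAM resources
# )
```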
View the metrics
After the CloudFormation stack deploys successfully, the example job starts automatically and takes approximately 15 minutes to complete. On the CloudWatch console, choose Dashboards in the navigation pane. Then filter the list by the prefix SparkMonitoring.
The example dashboard includes information on the cluster and an overview of the Spark jobs, stages, and tasks. Metrics are also available under a custom namespace starting with EMRCustomSparkCloudWatchSink.
Memory, CPU, I/O, and more task distribution metrics are also included.
Finally, detailed Java garbage collection metrics are available per executor.
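Beyond the console, the published metrics can also be queried programmatically. The sketch below assembles a CloudWatch GetMetricData request for the custom namespace; the metric name is hypothetical (check the console for the exact names your cluster publishes), and the API call itself is commented out:

```python
from datetime import datetime, timedelta, timezone

# Namespace prefix noted in this post; metric name is an assumption.
namespace = "EMRCustomSparkCloudWatchSink"
end = datetime.now(timezone.utc)

query = {
    "MetricDataQueries": [{
        "Id": "gc_time",
        "MetricStat": {
            "Metric": {
                "Namespace": namespace,
                "MetricName": "jvmGCTime.count",  # hypothetical metric name
            },
            "Period": 30,  # matches the agent's 30-second push interval
            "Stat": "Average",
        },
    }],
    "StartTime": end - timedelta(minutes=15),  # the example job's ~15-minute run
    "EndTime": end,
}

# import boto3
# resp = boto3.client("cloudwatch").get_metric_data(**query)
```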
Clean up
To avoid future charges in your account, delete the resources you created in this walkthrough. The EMR cluster will incur charges as long as the cluster is active, so stop it when you're done. Complete the following steps:
- On the CloudFormation console, in the navigation pane, choose Stacks.
- Choose the stack you launched (EMR-CloudWatch-Demo), then choose Delete.
- Empty the S3 bucket you created.
- Delete the S3 bucket you created.
Conclusion
Now that you have completed the steps in this walkthrough, the CloudWatch agent is running on your cluster hosts and configured to push Spark metrics to CloudWatch. With this feature, you can effectively monitor the health and performance of your Spark jobs running on Amazon EMR, detecting critical issues in real time and identifying root causes quickly.
You can package and deploy this solution through a CloudFormation template like this example template, which creates the IAM instance profile role, CloudWatch dashboard, and EMR cluster. The source code for the library is available on GitHub for customization.
To take this further, consider using these metrics in CloudWatch alarms. You could combine them with other alarms into a composite alarm, or configure alarm actions such as sending Amazon Simple Notification Service (Amazon SNS) notifications to trigger event-driven processes such as AWS Lambda functions.
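As one hedged sketch of that idea, the dictionary below holds the parameters for a CloudWatch alarm on a custom Spark metric, with an SNS topic as the alarm action. The alarm name, metric name, threshold, and topic ARN are all illustrative placeholders, and the `put_metric_alarm` call is left commented out:

```python
# Illustrative alarm parameters: fire when any Spark job fails, and notify
# an SNS topic. All names, thresholds, and ARNs are placeholders.
alarm = {
    "AlarmName": "spark-failed-jobs",             # hypothetical alarm name
    "Namespace": "EMRCustomSparkCloudWatchSink",  # custom namespace from this post
    "MetricName": "jobs.failedJobs",              # hypothetical metric name
    "Statistic": "Maximum",
    "Period": 60,
    "EvaluationPeriods": 1,
    "Threshold": 0,
    "ComparisonOperator": "GreaterThanThreshold",
    "AlarmActions": ["arn:aws:sns:us-east-1:111122223333:spark-alerts"],  # placeholder ARN
}

# import boto3
# boto3.client("cloudwatch").put_metric_alarm(**alarm)
```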
About the Author
Le Clue Lubbe is a Principal Engineer at AWS. He works with our largest enterprise customers to solve some of their most complex technical problems. He drives broad solutions through innovation to impact and improve the life of our customers.