Amazon EMR is the industry-leading cloud big data solution, offering a set of open-source frameworks such as Spark, Hive, Hudi, and Presto, fully managed and with per-second billing. Amazon EMR on Amazon EKS is a deployment option that lets you deploy Amazon EMR on the same Amazon Elastic Kubernetes Service (Amazon EKS) clusters that are multi-tenant and used by other applications, improving resource utilization, reducing cost, and simplifying infrastructure management. EMR on EKS provides up to 5.37 times better performance than OSS Spark v3.3.1 with 76.8% cost savings. It also provides a wide variety of job submission methods, such as an AWS API called StartJobRun, or a declarative approach with a Kubernetes controller through the AWS Controllers for Kubernetes for Amazon EMR on EKS.
This consolidation comes with a trade-off: it is harder to measure fine-grained costs for showback or chargeback by team or application. According to a CNCF and FinOps Foundation survey, 68% of Kubernetes users either rely on monthly estimates or don’t monitor Kubernetes costs at all. Among respondents reporting active Kubernetes cost monitoring, AWS Cost Explorer and Kubecost were ranked as the most popular tools.
Currently, you can distribute costs per tenant using hard multi-tenancy with separate EKS clusters in dedicated AWS accounts, or soft multi-tenancy using separate node groups in a shared EKS cluster. To reduce costs and improve resource utilization, you can use namespace-based segregation, where nodes are shared across different namespaces. However, calculating and attributing costs to teams by workload or namespace, while taking into account compute optimization (like Savings Plans or Spot Instance cost) and the cost of AWS services like EMR on EKS, is a challenging and non-trivial task.
In this post, we present a cost chargeback solution for EMR on EKS that combines the AWS-native capabilities of AWS Cost and Usage Reports (AWS CUR) with the in-depth Kubernetes cost visibility and insights of Kubecost on Amazon EKS.
Solution overview
A job in EMR on EKS incurs costs primarily on two dimensions: compute resources and a marginal uplift charge for EMR on EKS usage. To track the cost associated with each of the dimensions, we use data from the following sources:
- AWS CUR – We use this to get the EMR on EKS cost uplift per job and for Kubecost to reconcile the compute cost with any Savings Plans or Reserved Instances used. The supporting infrastructure for CUR is deployed as defined in Setting up Athena using AWS CloudFormation templates.
- Kubecost – We use this to get the compute cost incurred by the executor and driver pods.
The cost allocation process includes the following components:
- The compute cost is provided by Kubecost. However, in order to do an in-depth analysis, we define an hourly Kubernetes CronJob that starts a pod to retrieve data from Kubecost and store it in Amazon Simple Storage Service (Amazon S3).
- CUR files are stored in an S3 bucket.
- We use Amazon Athena to create a view that provides a consolidated picture of the total cost to run an EMR on EKS job.
- Finally, you can connect your preferred business intelligence tools using the JDBC or ODBC connections to Athena. In this post, we use the Amazon QuickSight native integration for visualization purposes.
The following diagram shows the overall architecture as well as how the different components interact with each other.
We provide a shell script to deploy the monitoring solution. The shell script configures the infrastructure using an AWS CloudFormation template, the AWS Command Line Interface (AWS CLI), and eksctl and kubectl commands. This script runs the following actions:
- Start the CloudFormation deployment.
- Create and configure an AWS Cost and Usage Report.
- Configure and deploy Kubecost backed by Amazon Managed Service for Prometheus.
- Deploy a Kubernetes CronJob.
Prerequisites
You need the following prerequisites:
This post assumes you already have an EKS cluster and run EMR on EKS jobs. If you don’t have an EKS cluster ready to test the solution, we suggest starting with a standard EMR on EKS blueprint that configures a cluster to submit EMR on EKS jobs.
Set up the solution
To run the shell script, complete the following steps:
- Clone the next GitHub repository.
- Go to the folder cost-tracking with the following command:
cd cost-tracking
- Run the script with the following command:
sh deploy-emr-eks-cost-tracking.sh REGION KUBECOST-VERSION EKS-CLUSTER-NAME ACCOUNT-ID
After you run the script, you’re ready to use Kubecost and the CUR data to understand the cost associated with your EMR on EKS jobs.
Tracking cost
In this section, we show you how to analyze the compute cost that is retrieved from Kubecost, how to query EMR on EKS uplift data, and how to combine them into a single consolidated view of the cost.
Compute cost
Kubecost offers various ways to track cost per Kubernetes object. For example, you can track cost by pod, controller, job, label, or deployment. It also allows you to understand the cost of idle resources, like Amazon Elastic Compute Cloud (Amazon EC2) instances that aren’t fully utilized by pods. In this post, we assume that no nodes are provisioned if no EMR on EKS job is running, and we use the Karpenter Cluster Autoscaler to provision nodes when jobs are submitted. Karpenter also does bin packing, which optimizes EC2 resource utilization and in turn reduces the cost of idle resources.
To track the compute cost associated with EMR on EKS pods, we query the Kubecost allocation API by passing pod and labels in the aggregate parameter. We use the emr-containers.amazonaws.com/job.id and emr-containers.amazonaws.com/virtual-cluster-id labels, which are always present in executor and driver pods. The labels are used to filter Kubecost data to get only the cost associated with EMR on EKS pods. You can review various levels of granularity at the pod, job, and virtual cluster level to understand the cost of a driver vs. an executor, or of using Spot Instances in jobs. You can also use the virtual cluster cost to understand the overall cost of EMR on EKS when it’s used in a namespace that is shared with applications other than EMR on EKS.
We also provide the instance_id, instance size, and capacity type (On-Demand or Spot) that was used to run the pod. This is retrieved by querying the Kubecost assets API. This data can be useful to understand how you run your jobs and which capacity type you use most often.
The data about the cost of running the pods, as well as the assets, is retrieved with a Kubernetes CronJob that submits the request to the Kubecost API, joins the two data sources (allocation and assets data) on the instance_id, cleans the data, and stores it in Amazon S3 in CSV format.
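The join-and-export step described above can be sketched in a few lines. This is an illustration under assumptions, not the script the CronJob actually runs: the row shapes and the instance_size and capacity_type column names are hypothetical, and the upload to Amazon S3 is left out so the example stays self-contained.

```python
import csv
import io


def join_on_instance_id(allocations, assets):
    """Attach asset fields to each allocation row that shares its instance_id."""
    assets_by_id = {asset["instance_id"]: asset for asset in assets}
    joined = []
    for row in allocations:
        # Rows with no matching asset keep empty asset fields instead of being dropped.
        asset = assets_by_id.get(row["instance_id"], {})
        joined.append({
            **row,
            "instance_size": asset.get("instance_size", ""),
            "capacity_type": asset.get("capacity_type", ""),
        })
    return joined


def to_csv(rows):
    """Serialize rows to CSV text; uploading the text to S3 is omitted here."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Keeping unmatched allocation rows is a deliberate choice in this sketch: a pod whose node has already been terminated still incurred cost and should remain visible in the chargeback data.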
The compute cost data has several fields that are of interest, including cpucost, ramcost (cost of memory), pvcost (cost of Amazon EBS storage), efficiency of use of CPU and RAM, as well as total cost, which represents the aggregate cost of all the resources used, either at pod, job, or virtual cluster level.
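To make the three levels of granularity concrete, the following sketch rolls pod-level rows up to the job or virtual cluster level. The cpucost, ramcost, and pvcost names come from this post; the totalcost, job_id, and virtual_cluster_id column names are assumptions chosen for illustration.

```python
from collections import defaultdict

COST_FIELDS = ("cpucost", "ramcost", "pvcost", "totalcost")


def roll_up(rows, key):
    """Sum the cost columns of pod-level rows, grouped by the given key."""
    totals = defaultdict(lambda: {field: 0.0 for field in COST_FIELDS})
    for row in rows:
        bucket = totals[row[key]]
        for field in COST_FIELDS:
            # CSV exports carry numbers as strings, so convert defensively.
            bucket[field] += float(row.get(field, 0.0))
    return dict(totals)


pod_rows = [
    {"job_id": "j-1", "virtual_cluster_id": "vc-1",
     "cpucost": 0.5, "ramcost": 0.25, "pvcost": 0.0, "totalcost": 0.75},
    {"job_id": "j-1", "virtual_cluster_id": "vc-1",
     "cpucost": 1.0, "ramcost": 0.5, "pvcost": 0.25, "totalcost": 1.75},
    {"job_id": "j-2", "virtual_cluster_id": "vc-1",
     "cpucost": 0.25, "ramcost": 0.25, "pvcost": 0.0, "totalcost": 0.5},
]

per_job = roll_up(pod_rows, "job_id")
per_virtual_cluster = roll_up(pod_rows, "virtual_cluster_id")
```

The same function serves both levels because only the grouping key changes; in the solution itself this grouping would more naturally be expressed as an Athena query over the stored CSV files.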
To view this knowledge, full the next steps:
- On the Athena console, navigate to the question editor.
- Select
athenacurcfn_c_u_r
for the database andcost_data
for the desk. - Run the next question:
The next screenshot exhibits the question outcomes.
To question the info about info on the pod stage, you may run the next SQL assertion:
EMR on EKS uplift
The cost associated with the EMR on EKS uplift is available through AWS CUR and is stored in an S3 bucket. The script you ran in the setup step created an Athena table associated with the data in the S3 bucket. The following steps take you through how to query the data:
- On the Athena console, navigate to the query editor.
- Choose athenacurcfn_c_u_r for the database and cur_data for the table.
- Run the following query:
This query provides you with the cost per job. The following screenshot shows the results.
You may need to wait up to 24 hours for the CUR data to be available. As such, you should only run the preceding query after the CUR data is available and you have run the EMR on EKS jobs.
Overall cost
To view the overall cost and perform analysis on it, create a view in Athena as follows:
Now that the view is created, you can query and analyze the cost of running your EMR on EKS jobs:
The following screenshot shows an example output of the query on the created view.
Finally, you can use QuickSight for a graphical high-level view of your EMR on EKS spend. The following screenshot shows an example dashboard.
You can now adapt this solution to your specific needs and build your custom analysis.
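As one possible starting point for custom analysis, the consolidation of the two cost dimensions can also be expressed outside Athena. The sketch below assumes you have already exported per-job compute cost (from the Kubecost data) and per-job uplift (from the CUR data) as dictionaries keyed by job ID. The function name and output field names are hypothetical, and this is not the definition of the Athena view.

```python
def consolidate(compute_by_job, uplift_by_job):
    """Return compute, uplift, and overall cost for every job seen in either source."""
    result = {}
    for job_id in sorted(set(compute_by_job) | set(uplift_by_job)):
        compute = compute_by_job.get(job_id, 0.0)
        uplift = uplift_by_job.get(job_id, 0.0)
        result[job_id] = {
            "compute_cost": compute,
            "uplift_cost": uplift,
            "overall_cost": compute + uplift,
        }
    return result


overall = consolidate(
    compute_by_job={"j-1": 2.5, "j-2": 0.5},
    uplift_by_job={"j-1": 0.25},
)
```

In this example, j-2 has no uplift entry. Because CUR data can take up to 24 hours to become available, a zero uplift for a recent job may simply mean the report has not caught up yet.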
Clean up
Throughout this post, you deployed and configured the required infrastructure components to track cost for your EMR on EKS workloads. To avoid incurring additional charges for this solution, delete all the resources you created:
- Empty the S3 buckets cost-data-REGION-ACCOUNT_ID and aws-athena-query-results-cur-REGION-ACCOUNT_ID.
- Delete the Athena workgroup kubecost-cur-workgroup.
- Empty and delete the ECR repository emreks-compute-cost-exporter.
- Run the script destroy-emr-eks-cost-tracking.sh, which will delete the AWS CloudFormation deployment, uninstall Kubecost, delete the CronJob, and delete the Cost and Usage Reports.
Conclusion
In this post, we showed how you can use Kubecost capabilities alongside Cost and Usage Reports to closely monitor the costs for Amazon EMR on EKS per virtual cluster or per job. This solution allows you to achieve more granular costs for chargebacks using Athena, Amazon Managed Service for Prometheus, and QuickSight.
The solution provided steps to set up Cost and Usage Reports and Kubecost, and to configure an hourly CronJob to get the cost of running pods spun up by EMR on EKS. You can modify the provided solution to run at longer intervals or to collect data on different EKS clusters. You can also modify the Python script run by the CronJob to further clean data or reduce the amount of data stored by eliminating fields you don’t need. You can use the insights provided to drive cost optimization efforts over time, detect any increase in costs, and measure the impact of new deployments or particular events on resource utilization and cost performance. For more information about integrating EMR on EKS in your existing Amazon EKS deployment, refer to Design considerations for Amazon EMR on EKS in a multi-tenant Amazon EKS environment.
About the Authors
Lotfi Mouhib is a Senior Solutions Architect working for the Public Sector team with Amazon Web Services. He helps public sector customers across EMEA realize their ideas, build new services, and innovate for citizens. In his spare time, Lotfi enjoys cycling and running.
Hamza Mimi is a Principal Solutions Architect in the French Public Sector team at Amazon Web Services (AWS). With long experience in the telecommunications industry, he is currently working as a customer advisor on topics ranging from digital transformation to architectural guidance.