Introducing in-place version upgrades with Amazon MWAA


Today, AWS is announcing the availability of in-place version upgrades for Amazon Managed Workflows for Apache Airflow (Amazon MWAA). This enhancement lets you upgrade your existing Apache Airflow version 2.x environments to newer available versions while retaining the workflow run history and environment configurations. You can now take advantage of the latest capabilities of the Apache Airflow platform without having to create an entirely new Amazon MWAA environment.

Until now, if you wanted to upgrade your Amazon MWAA environment to a different Apache Airflow version, you had to follow the Amazon MWAA environment migration instructions. This involved creating a new Amazon MWAA environment and then migrating all of your configurations and Directed Acyclic Graphs (DAGs) to it. If you also needed to preserve the history of DAG runs, you had to take a backup of your metadata database and then restore that backup on the newly created environment. This process was manual, error prone, and involved additional costs to maintain two separate Amazon MWAA environments until you could verify the new one and decommission the old one.

In this post, we provide an overview of the in-place version upgrade feature, explore applicable use cases, detail the steps to use it, and provide additional guidance on its capabilities.

Overview of solution

The newly introduced in-place version upgrades by Amazon MWAA provide a streamlined transition from your existing Apache Airflow version 2.x-based environments to newer available Apache Airflow versions. Amazon MWAA manages the entire upgrade process, from provisioning new Apache Airflow versions to upgrading the metadata database. In the event of an upgrade failure, Amazon MWAA is designed to roll back to the previous stable version using the associated metadata database snapshot.

Upgrading your existing environments on Amazon MWAA is a straightforward process. You can upgrade your existing Apache Airflow 2.0 and later environments on Amazon MWAA with just a few clicks on the Amazon MWAA console, by using the Amazon MWAA API or the AWS Command Line Interface (AWS CLI), or by using tools like AWS CloudFormation, the AWS Cloud Development Kit (AWS CDK), or Terraform. This feature is available in all currently supported Amazon MWAA Regions.

On the Amazon MWAA console, simply edit the environment and select an available Apache Airflow version higher than the current version of your existing environment. You can also use the UpdateEnvironment API and specify the new Apache Airflow version to trigger an upgrade process. To learn more about in-place version upgrades, refer to Upgrading the Apache Airflow version in the Amazon MWAA documentation.
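
As a rough illustration of the API route, the following minimal sketch wraps the UpdateEnvironment call in a small Python function. It assumes a client object that behaves like the one returned by boto3.client("mwaa"); the function name request_version_upgrade and the environment name used in the note below are hypothetical and chosen only for this example.

```python
def request_version_upgrade(mwaa_client, environment_name, target_version):
    """Ask Amazon MWAA to upgrade an environment in place.

    mwaa_client is assumed to behave like boto3.client("mwaa").
    """
    response = mwaa_client.update_environment(
        Name=environment_name,
        AirflowVersion=target_version,
    )
    # UpdateEnvironment responds with the ARN of the environment being updated.
    return response["Arn"]
```

With boto3 installed and AWS credentials configured, a call might look like request_version_upgrade(boto3.client("mwaa"), "my-mwaa-env", "2.4.3"). Passing the client in as an argument keeps the function easy to exercise against a stub before pointing it at a real environment.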

During an upgrade, Amazon MWAA first creates a snapshot of the existing environment’s metadata database, which then serves as the basis for a new database. Subsequently, all Apache Airflow components (web server, scheduler, and workers) are upgraded. Finally, the newly created metadata database is upgraded, effectively completing the transition to the new environment.

Applicable use cases

You should consider upgrading your Apache Airflow version on Amazon MWAA if your existing workflows can accommodate the change and a new version is available with features or improvements that align with your use case. By upgrading, you can take advantage of the latest capabilities of the Apache Airflow platform and maintain compatibility with new features and best practices like data-driven scheduling and new Amazon provider packages released in Apache Airflow 2.4.3. The upgrade process involves an environment downtime that can take up to 2 hours to complete depending on the environment size, and it can be performed on demand at a time that best suits you. If your existing environment is so heavily used that you can’t afford any downtime, consider creating a new environment instead.

Prerequisites

When preparing for the upgrade, make sure you complete the following prerequisite steps:

  1. Verify Apache Airflow changes between your current and new versions of the environment. Review the Apache Airflow release notes to understand the impact of new features, significant changes, and bug fixes that all intermediate Apache Airflow releases made between your source and destination versions.
  2. Review your existing requirements.txt file to verify the correct set of dependencies required for your target environment. Additionally, verify that your requirements.txt file has the correct constraints file added at the top of the file to match your target environment. The Apache Airflow constraints file specifies the dependent modules and provider versions available at the time of an Apache Airflow release. Adding a constraints file prevents incompatible libraries from being installed in your environment. In the following example, replace {Airflow-version} with your target environment’s version number, and {Python-version} with the version of Python that’s compatible with your environment: --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-{Airflow-version}/constraints-{Python-version}.txt"
  3. Review the compatibility of additional Python libraries mentioned in your requirements.txt file to match your target environment. Apache Airflow v2.4.3 and above use Python v3.10, whereas older Apache Airflow versions use Python v3.7. Therefore, if you are trying to upgrade your existing Apache Airflow v2.0.2/2.2.2-based environment to Apache Airflow v2.4.3 or higher, you should update your additional Python libraries to match Python v3.10.
  4. With Apache Airflow v2.4.3 and above, the list of provider packages Amazon MWAA installs by default for your environment has changed. Note that some imports and operator names have changed in the new provider package in Apache Airflow in order to standardize the naming convention across the provider packages. Compare the list of provider packages installed by default in Apache Airflow v2.2.2 or v2.0.2, and configure any additional packages you might need for your new Apache Airflow v2.4.3 and higher environment.
  5. Make sure that your DAGs and other workflow resources are compatible with the new Apache Airflow version you’re upgrading to.
  6. Use the aws-mwaa-local-runner utility to test your existing DAGs, requirements, plugins, and dependencies locally before deploying to Amazon MWAA. You can create a target Apache Airflow environment that’s similar to an Amazon MWAA production image locally using aws-mwaa-local-runner and verify that all your components work before attempting to upgrade your Amazon MWAA environment. Additionally, test the new environment upgrade process in lower Amazon MWAA environments like dev or staging before rolling out the upgrade in production environments.
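
The constraints check in steps 2 and 3 can be sketched in plain Python. This is an illustrative helper, not an official tool: the function names are hypothetical, the Python version mapping encodes only the rule stated in this post (v3.10 from Apache Airflow v2.4.3, v3.7 before that), and the comparison is deliberately strict, so it would reject an equivalent line written with the short -c flag or without quotes.

```python
def _version_tuple(version):
    # "2.4.3" -> (2, 4, 3) so versions compare numerically, not as strings.
    return tuple(int(part) for part in version.split("."))


def python_version_for(airflow_version):
    """Python version per the rule in this post: 3.10 from Airflow 2.4.3, else 3.7."""
    return "3.10" if _version_tuple(airflow_version) >= (2, 4, 3) else "3.7"


def constraints_line(airflow_version):
    """Build the --constraint line expected at the top of requirements.txt."""
    python_version = python_version_for(airflow_version)
    url = (
        "https://raw.githubusercontent.com/apache/airflow/"
        f"constraints-{airflow_version}/constraints-{python_version}.txt"
    )
    return f'--constraint "{url}"'


def has_matching_constraint(requirements_text, airflow_version):
    """True if the first non-blank, non-comment line is the expected constraint."""
    for line in requirements_text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            return stripped == constraints_line(airflow_version)
    return False
```

Running has_matching_constraint over the contents of your requirements.txt with the target version gives a quick signal on whether the constraints line still points at the old release before you start the upgrade.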

Upgrade process

When an upgrade has been initiated, Amazon MWAA stops the existing underlying Apache Airflow components (web server, scheduler, and workers). This process halts any worker tasks that are currently running. The status of your environment at this stage will show as UPDATING. The upgrade process then creates a database snapshot of the metadata database, marked by the status CREATING_SNAPSHOT. When the snapshot is complete, the environment status returns to UPDATING as Amazon MWAA triggers the creation of a new Apache Airflow environment that matches your version selection and applies the necessary schema changes to the existing metadata database to align it with the target Apache Airflow environment. During this phase, your specified requirements, plugins, and other dependencies are installed.

Upon completion, your new environment is marked as AVAILABLE, indicating that the upgrade process has been successful and the environment is ready for testing. You can now log in to your Apache Airflow UI to verify the presence of your existing DAGs, their historical runs, configured connections, and more.

However, if there are failures in installing your specified requirements, plugins, and dependencies files, the environment initiates a rollback to the previous stable version. During this process, your environment status will show as ROLLING_BACK. If the rollback is successful, your previous stable environment will be available and the status will display as UPDATE_FAILED until a new update is attempted and succeeds. If the rollback fails, the status will show as UNAVAILABLE, indicating that your environment is not functional.
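
The status sequence described above lends itself to a simple polling loop. The following is a minimal sketch that assumes a client behaving like boto3.client("mwaa") and reads the Status field returned by GetEnvironment. The function name, the polling interval, and the polling limit are hypothetical defaults. The sketch treats any status outside the in-progress set as final, so it should be started only after the environment has entered UPDATING.

```python
import time

# Statuses this post describes as intermediate steps of an upgrade or rollback.
IN_PROGRESS = {"UPDATING", "CREATING_SNAPSHOT", "ROLLING_BACK"}


def wait_for_upgrade(mwaa_client, environment_name, poll_seconds=60,
                     max_polls=150, sleep=time.sleep):
    """Poll the environment until it leaves the in-progress statuses.

    Returns (final_status, transitions), where transitions lists each
    distinct status in the order it was observed.
    """
    transitions = []
    for _ in range(max_polls):
        environment = mwaa_client.get_environment(Name=environment_name)
        status = environment["Environment"]["Status"]
        if not transitions or transitions[-1] != status:
            transitions.append(status)
        if status not in IN_PROGRESS:
            # For example AVAILABLE, UPDATE_FAILED, or UNAVAILABLE.
            return status, transitions
        sleep(poll_seconds)
    raise TimeoutError(
        f"{environment_name} was still in progress after {max_polls} polls"
    )
```

A final status of AVAILABLE corresponds to a successful upgrade, UPDATE_FAILED to a successful rollback onto the previous version, and UNAVAILABLE to a failed rollback.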

If your environment upgrade process fails, it’s likely that the underlying Amazon Elastic Container Service (Amazon ECS) AWS Fargate clusters had stabilization issues due to conflicting requirements and plugins, networking issues, or DB migration issues after the Apache Airflow component upgrade. To mitigate these issues, make sure that your DAGs and requirements work without issues using the aws-mwaa-local-runner utility and, ideally, test in a staging Amazon MWAA environment.

Additional considerations

Keep in mind the following additional information about this feature:

  • The upgrade process is available on demand and is limited to moving to newer versions. In-place version upgrades on Amazon MWAA aren’t supported for version 1.10.z. To perform a major version upgrade, for example from version 1.y.z to 2.y.z, you must create a new environment and migrate your resources.
  • You can only select applicable higher versions that you can upgrade to. Downgrading to a lower version is not available.
  • The rollback process can take additional time and, if you have Amazon Simple Storage Service (Amazon S3) bucket versioning enabled, Amazon MWAA is designed to revert the environment to the previous working configuration, including plugins and requirements. However, any manual changes made to your DAGs will not be reverted during this process.
  • After the upgrade process has completed successfully and the environment is available, any running DAGs that were interrupted during the upgrade are scheduled for a retry, depending on the way you configure retries for your DAGs. You can also trigger them manually or wait for the next scheduled run.
  • You should iteratively upgrade your environments, starting with the least critical ones first.
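
The first two considerations amount to a simple version rule, which can be sketched as a local pre-check. This is only an illustration of the constraints stated above (no in-place upgrades from 1.10.z, no downgrades); the function name is hypothetical, and the set of versions that Amazon MWAA actually offers for a given environment is decided by the service, not by this check.

```python
def _parse(version):
    # "2.4.3" -> (2, 4, 3) so that versions compare numerically.
    return tuple(int(part) for part in version.split("."))


def can_upgrade_in_place(current_version, target_version):
    """Apply the rules from this post: 2.x and later only, strictly upward."""
    current, target = _parse(current_version), _parse(target_version)
    if current[0] < 2:
        # Version 1.10.z environments require a new environment and migration.
        return False
    # Downgrades and no-op "upgrades" to the same version are rejected.
    return target > current
```

A check like this is useful mainly as an early guard in deployment scripts, so that an obviously invalid request is caught before it reaches the service.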

Conclusion

In this post, we talked about the new feature of Amazon MWAA that allows you to upgrade your existing Amazon MWAA environment to higher Apache Airflow versions. This feature is supported on new and existing Amazon MWAA environments running Apache Airflow 2.x and above. Use this feature to upgrade your Apache Airflow versions while retaining your existing workflow run histories and environment configurations. By upgrading, you can take advantage of the latest capabilities of the Apache Airflow platform, maintain compatibility with new features, and adhere to best practices.

For more details and code examples on Amazon MWAA, visit the Amazon MWAA User Guide and the Amazon MWAA examples GitHub repo.

Apache, Apache Airflow, and Airflow are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.


About the Authors

Parnab Basak is a Solutions Architect and a Serverless Specialist at AWS. He specializes in creating new solutions that are cloud native using modern software development practices like serverless, DevOps, and analytics. Parnab works closely in the analytics and integration services space, helping customers adopt AWS services for their workflow orchestration needs.

Fernando Gamero is a Senior Solutions Architect engineer at AWS, with more than 25 years of experience in the technology industry, from telecommunications and banking to startups. He is now helping customers build Event Driven Architectures, adopt IoT solutions on the Edge, and transform their data and machine learning pipelines at scale.

Shubham Mehta is an experienced product manager with over eight years of experience and a proven track record of delivering successful products. In his current role as a Senior Product Manager at AWS, he oversees Amazon Managed Workflows for Apache Airflow (Amazon MWAA) and spearheads the Apache Airflow open-source contributions to further enhance the product’s functionality.
