Blue/Green Deployments with AWS CloudFormation
Of late I've been working a fair bit on Amazon Web Services. AWS
provides a wide range of mature, high-end services which, when used in
combination with each other, can give an enterprise a highly
available, scalable and resilient infrastructure solution.
In the past few months, I've worked a fair bit on implementing blue/green deployments for our services. Since we were already using CloudFormation to implement zero-downtime deployments, we decided to re-use the same code and approach. What slowed us down was that I couldn't find a decent article, blog post or anything else that documented an approach to blue/green deployments with CloudFormation. So here's the approach that I implemented.
The first thing to remember is to split your CloudFormation templates into smaller, task-based templates. The other important thing to remember is that blue/green deployments create new resources with each deployment, so it is worth raising your account limits for Auto Scaling groups, launch configurations and stacks to an appropriately higher number.
In our case, we wrote multiple templates: one that sets up the VPC and another that handles deployment of application code and config. This means that each time the team does a deployment, the VPC stays put and only a new stack is created, parallel to the stack in operation.
The second question to answer is how to make the resources in the new stack unique. The way we approach deployments, any change to the currently running code and/or config is a deployment. Our CloudFormation templates are versioned too, so any change to them is an infrastructure-level change that requires a deployment, irrespective of any code/config change. The last variable component is the application AMI used in the Auto Scaling group. The AMI ID changes whenever Amazon rolls out a new version (in order to keep ourselves sane, we use the Amazon Linux AMIs). How do we take all four components (code, config, AMI and template) and create a unique stack each time one or more of them changes? Simple answer -> Use Git SHAs!
We create the stack name as <application>-<code-scm-revision>-<config-scm-revision>-<ami-id>-<template-scm-revision>. Looks like we have a solution. The same script now creates a new stack each time one or more of the four components change, since the stack name changes each time.
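As a rough illustration of the naming scheme above, here is a minimal Python sketch. The function name `build_stack_name` and the choice to abbreviate each Git SHA to 7 characters are my assumptions for the example, not something the scheme requires; CloudFormation stack names are limited to 128 characters, so some abbreviation is usually a good idea.

```python
def build_stack_name(application, code_rev, config_rev, ami_id, template_rev, sha_len=7):
    """Join the four versioned components into one stack name.

    Assumption: revisions are Git SHAs and are abbreviated to `sha_len`
    characters to keep the name short. The AMI ID is used as-is.
    """
    parts = [
        application,
        code_rev[:sha_len],      # application code revision
        config_rev[:sha_len],    # configuration revision
        ami_id,                  # e.g. "ami-12345678"
        template_rev[:sha_len],  # CloudFormation template revision
    ]
    return "-".join(parts)


# Hypothetical values, purely for illustration.
stack_name = build_stack_name(
    "myapp", "a1b2c3d4e5", "0f9e8d7c6b", "ami-12345678", "deadbeefcafe"
)
print(stack_name)
```

Any change to code, config, AMI or template yields a different name, so the deploy script ends up creating a fresh stack instead of updating the running one.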
When you give this solution a go, things go smoothly until you start setting up resources. The trick with resources is that they have to have unique names as well, else the template would modify any existing resources with the same name and type. Well, one would think that since we have already figured out a way to keep the stack name unique, we can just use the stack name as a prefix or suffix for the resource names. Agreed. Once you give that a go, you'll realize that some resources, like ELBs, don't support names longer than 32 characters. So how do we get the naming right, yet keep the stack name as it is, since it makes it easier for anyone looking at the stack to identify the component versions? Simple answer -> SHASUM!
We added a new parameter to the CloudFormation template -> UniqueID, which is basically the application name followed by a shasum of the stack name: <application>-<shasum{stack-name}>. If you use only the first 7 or 8 characters of the shasum, the unique ID will never be longer than, say, 20 characters, unless you have an atrocious application name, in which case you need to re-think your application naming altogether. Use the UniqueID parameter as a prefix for all resource names.
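A minimal sketch of how such a UniqueID could be computed, assuming SHA-1 (the default of the `shasum` command) and an 8-character prefix of the hex digest. The helper names `unique_id` and `resource_name` are hypothetical, and the 32-character check reflects the ELB name limit mentioned above.

```python
import hashlib

ELB_NAME_LIMIT = 32  # ELB names cannot be longer than 32 characters


def unique_id(application, stack_name, digest_len=8):
    """Return <application>-<first digest_len hex chars of sha1(stack_name)>."""
    digest = hashlib.sha1(stack_name.encode("utf-8")).hexdigest()
    return f"{application}-{digest[:digest_len]}"


def resource_name(uid, suffix, limit=ELB_NAME_LIMIT):
    """Prefix a resource suffix with the UniqueID and enforce the length limit."""
    name = f"{uid}-{suffix}"
    if len(name) > limit:
        raise ValueError(f"{name!r} is longer than {limit} characters")
    return name


# Hypothetical stack name, purely for illustration.
uid = unique_id("myapp", "myapp-a1b2c3d-0f9e8d7-ami-12345678-deadbee")
print(resource_name(uid, "elb"))
```

The value would then be passed to the template as the UniqueID parameter, while the full stack name stays readable for anyone inspecting the stack.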
Once the Blue (Next-To-Go) stack is set up, you can run a set of smoke tests and switch DNS once the smokes turn green, making the code available to the public. To make this zero downtime even as far as DNS resolution goes, make sure to use weighted DNS.
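To make the weighted DNS idea a little more concrete, here is a sketch that builds a Route 53 change batch shifting traffic between the live stack and the next-to-go stack. The function name, the record and target names, the "live"/"next" set identifiers and the use of CNAME records are all assumptions for illustration; Route 53 weights range from 0 to 255.

```python
def weighted_change_batch(record_name, live_target, next_target, next_weight, ttl=60):
    """Build a Route 53 ChangeBatch splitting traffic between two stacks.

    `next_weight` is the weight given to the next-to-go stack; the live
    stack receives the remainder out of 255.
    """
    if not 0 <= next_weight <= 255:
        raise ValueError("Route 53 weights must be between 0 and 255")

    def upsert(set_id, target, weight):
        # One weighted record per stack, told apart by SetIdentifier.
        return {
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": record_name,
                "Type": "CNAME",
                "SetIdentifier": set_id,
                "Weight": weight,
                "TTL": ttl,
                "ResourceRecords": [{"Value": target}],
            },
        }

    return {
        "Comment": "blue/green switch",
        "Changes": [
            upsert("live", live_target, 255 - next_weight),
            upsert("next", next_target, next_weight),
        ],
    }


# Hypothetical names: send all traffic to the next-to-go stack.
batch = weighted_change_batch(
    "app.example.com.", "live-elb.example.com", "next-elb.example.com", 255
)
```

With boto3, a batch like this could be applied via `boto3.client("route53").change_resource_record_sets(HostedZoneId=..., ChangeBatch=batch)`, where the hosted zone ID is your own. Raising `next_weight` in steps rather than jumping straight to 255 gives a gradual cut-over.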
Hi! Would you have DNS weighting on internal hosted zone over public hosted zone?
It depends on your use case. The above is a simple way of doing blue/green with a public-facing endpoint. If you have an internal DNS endpoint fronted by a public endpoint, you'd be looking at the internal DNS being updated with deployments.