Blue/Green Deployment on ECS with EC2 Cluster

Nov 18, 2019

Blue/Green deployment is an approach to deploying with zero downtime. It works by running two separate environments: one hosts the current application (called “Blue”) and the other hosts the version you are about to deploy (called “Green”).

This approach can greatly reduce risk, especially in a production environment. It lets you serve the new application only when it’s ready, giving your users a smooth transition. It also lets you roll back to the previous application if something goes wrong with the new one.

Blue/Green Deployment on ECS

On ECS, these two separate environments are two Load Balancing Target Groups serving two Task sets. In brief, CodeDeploy deploys our new application (the Green Task set) while the original application (the Blue Task set) is still serving users. After some initial tests, it reroutes the traffic to the new application and, after a while, terminates the original application.

With a Fargate cluster, we don’t have to worry about servers. With an EC2 cluster, however, other problems can arise: port conflicts, not enough CPU or memory on the instances, and so on.

So in this article, I would like to share my take on the Blue/Green deployment approach on ECS with an EC2 cluster.

Preparation

Launch ECS Instances with Auto Scaling Groups

When we create a new ECS cluster with EC2 instances, AWS automatically creates a Launch Configuration and an Auto Scaling Group (ASG) for us. This is the easiest way to start instances for our ECS cluster: we set the desired number of instances on the ASG, and it launches the ECS instances for us.

Launching ECS instances through an ASG is always recommended. First, it makes sure we always have the minimum number of running instances. Second, as the name suggests, it lets us scale out or scale in automatically whenever we need to.

Two Target Groups, Two Auto Scaling Groups

Blue/Green deployment requires two Load Balancing Target Groups. One Target Group handles the instances running the old application (the “Blue”), and the other handles the instances running the new application (the “Green”). ECS can create these for us during deployment, but we want to create them right now so that we can configure them later.

Having only one ASG for two Target Groups can cause problems. For example, during a deployment, the instances serving the old application may accidentally get terminated and cause downtime. And if our old instances get terminated, we can no longer roll back to the original application. Having one ASG for two Target Groups is a BIG NO, so we create two ASGs, one for each Target Group.

The next step is attaching each ASG to its Target Group. We can attach the Target Group while creating the ASG or afterwards. Once attached, every instance launched by the ASG is automatically registered with that Target Group, so we don’t have to worry about instances ending up in the wrong Target Group.
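
The attachment can also be scripted instead of clicked through. Below is a minimal sketch, assuming the caller passes in a boto3 Auto Scaling client as `autoscaling`; the function name, the ASG names and the ARN placeholders are hypothetical and not from the setup above. It keeps the pairing one-to-one by rejecting a mapping where two ASGs share a Target Group.

```python
def attach_target_groups(autoscaling, target_group_by_asg):
    """Attach every ASG to its own Target Group.

    autoscaling: an object exposing the boto3 Auto Scaling client method
        attach_load_balancer_target_groups (assumption: passed in by the caller).
    target_group_by_asg: dict mapping ASG name -> Target Group ARN.
    """
    arns = list(target_group_by_asg.values())
    # Each ASG key maps to a single ARN; also refuse ARNs shared between ASGs.
    if len(set(arns)) != len(arns):
        raise ValueError("every ASG needs its own Target Group")

    for asg_name, target_group_arn in target_group_by_asg.items():
        autoscaling.attach_load_balancer_target_groups(
            AutoScalingGroupName=asg_name,
            TargetGroupARNs=[target_group_arn],
        )
```

With boto3 installed, this might be called as `attach_target_groups(boto3.client("autoscaling"), {"blue-asg": BLUE_TG_ARN, "green-asg": GREEN_TG_ARN})`, where both ARN variables stand for your own Target Groups.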

Attaching Target Group when creating ASG

Attaching Target Group by editing ASG after it’s created

Deployment

Now that our ASGs and Target Groups are ready, let’s see how Blue/Green deployment works. I assume we already have an ECS Service running and configured to use Blue/Green deployment with the two Target Groups and ASGs that we just created.

ECS can only start a new Task when there are instances available to run it. If no instance is available, ECS will not proceed with the deployment. So the first step is to launch instances in the ASG that is not serving traffic, simply by setting its min. and desired number of instances (if its max. is 0, set the max. too). The ASG will start launching instances; let’s wait until they are Healthy and InService.

Now that the instances are ready, let’s deploy our app. We can do this simply by updating our Service with the latest version of the Task Definition. The update triggers a deployment process powered by CodeDeploy.

ECS will start the new application on the new instances. After some initial checks, CodeDeploy will route the traffic to the instances in the second Target Group. At this point, our ECS cluster is serving the newest version of the app with zero downtime.

We have around 1 hour (depending on your configuration) before it automatically terminates the original application. During this time, it’s always good to do some final checking. If we choose to roll back, CodeDeploy will route the traffic back to the instances of the first Target Group.
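
The scale-out and the wait can be sketched in code as well. This is a hedged example, assuming a boto3 Auto Scaling client is passed in as `autoscaling`; the function names, the polling defaults and the ASG name in the usage note are my own assumptions. Note that an instance that is InService in the ASG may still need a moment to register with the ECS cluster.

```python
import time


def scale_out(autoscaling, asg_name, count):
    """Set min and desired to `count`, raising max only if it is too low."""
    group = autoscaling.describe_auto_scaling_groups(
        AutoScalingGroupNames=[asg_name]
    )["AutoScalingGroups"][0]
    autoscaling.update_auto_scaling_group(
        AutoScalingGroupName=asg_name,
        MinSize=count,
        DesiredCapacity=count,
        MaxSize=max(group["MaxSize"], count),
    )


def ready_instances(group):
    """Return the IDs of instances that are both Healthy and InService."""
    return [
        instance["InstanceId"]
        for instance in group["Instances"]
        if instance["LifecycleState"] == "InService"
        and instance["HealthStatus"] == "Healthy"
    ]


def wait_until_ready(autoscaling, asg_name, count,
                     poll_seconds=15, max_polls=40, sleep=time.sleep):
    """Poll the ASG until at least `count` instances are ready, or give up."""
    for _ in range(max_polls):
        group = autoscaling.describe_auto_scaling_groups(
            AutoScalingGroupNames=[asg_name]
        )["AutoScalingGroups"][0]
        ready = ready_instances(group)
        if len(ready) >= count:
            return ready
        sleep(poll_seconds)
    raise TimeoutError(f"{asg_name} did not reach {count} ready instances")
```

A typical call would be `scale_out(client, "green-asg", 2)` followed by `wait_until_ready(client, "green-asg", 2)`, with `client = boto3.client("autoscaling")`.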
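
Outside the console, one way to start the same kind of deployment is to hand CodeDeploy an AppSpec that names the new Task Definition. The sketch below assumes a boto3 CodeDeploy client is passed in as `codedeploy`; the function names and every argument value (application, deployment group, container name and port) are placeholders for your own setup, not values from this article.

```python
import json


def build_appspec(task_definition_arn, container_name, container_port):
    """Build a minimal AppSpec (as a JSON string) for an ECS Blue/Green deployment."""
    return json.dumps({
        "version": 0.0,
        "Resources": [{
            "TargetService": {
                "Type": "AWS::ECS::Service",
                "Properties": {
                    # The new Task Definition revision to roll out.
                    "TaskDefinition": task_definition_arn,
                    # The container and port the load balancer sends traffic to.
                    "LoadBalancerInfo": {
                        "ContainerName": container_name,
                        "ContainerPort": container_port,
                    },
                },
            }
        }],
    })


def start_deployment(codedeploy, application, deployment_group, appspec):
    """Create a CodeDeploy deployment from inline AppSpec content; return its ID."""
    response = codedeploy.create_deployment(
        applicationName=application,
        deploymentGroupName=deployment_group,
        revision={
            "revisionType": "AppSpecContent",
            "appSpecContent": {"content": appspec},
        },
    )
    return response["deploymentId"]
```

The returned deployment ID is what you would use afterwards to follow, finish or roll back the deployment.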

If we decide to terminate the original application right away, we can do this easily with a click of a button. It will finish the deployment and mark everything as succeeded.

But wouldn’t it be great if we could automate the whole thing? Please give a thumbs up on this issue.
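
The two outcomes of the waiting period, finishing early or rolling back, can also be expressed as API calls. This is only a sketch, assuming a boto3 CodeDeploy client is passed in as `codedeploy`; the function name and the `checks_passed` flag are hypothetical, and how you decide that the final checks passed is up to you.

```python
def resolve_deployment(codedeploy, deployment_id, checks_passed):
    """Finish the deployment if the final checks passed, otherwise roll back."""
    if checks_passed:
        # Skip the rest of the termination wait and terminate the original Task set.
        codedeploy.continue_deployment(
            deploymentId=deployment_id,
            deploymentWaitType="TERMINATION_WAIT",
        )
        return "finished"

    # Stop the deployment and ask CodeDeploy to roll back to the original Task set.
    codedeploy.stop_deployment(
        deploymentId=deployment_id,
        autoRollbackEnabled=True,
    )
    return "rolled back"
```

Keeping both branches in one function makes it harder to terminate the original application by accident while a rollback is still wanted.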

Random Tips

For High Availability, configure the ASGs to launch instances in multiple AZs. If there’s a problem in one AZ, the ASG can quickly launch instances in another AZ to minimise downtime.
To save operational cost, terminate the instances in the ASG that is not serving traffic. To do this, set the min. and the desired number of instances to 0.
Always update your instances with the latest AMI. Simply copy the Launch Configuration, edit the copy to use the latest AMI, then modify the ASGs to use the new Launch Configuration.
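
The cost-saving tip above can be sketched as a small helper. It assumes a boto3 Auto Scaling client is passed in as `autoscaling`; the function name, the mapping and all names in it are hypothetical. It should only run once the deployment has finished, because a rollback still needs the old instances.

```python
def scale_in_idle_groups(autoscaling, asg_by_target_group, live_target_group):
    """Set min and desired to 0 on every ASG except the one serving traffic.

    asg_by_target_group: dict mapping Target Group name -> ASG name.
    live_target_group: the Target Group currently receiving traffic.
    """
    # Refuse to act on an unknown live group, which would scale in everything.
    if live_target_group not in asg_by_target_group:
        raise ValueError(f"unknown Target Group: {live_target_group}")

    idle = [
        asg_name
        for target_group, asg_name in asg_by_target_group.items()
        if target_group != live_target_group
    ]
    for asg_name in idle:
        autoscaling.update_auto_scaling_group(
            AutoScalingGroupName=asg_name,
            MinSize=0,
            DesiredCapacity=0,
        )
    return idle
```

For example, `scale_in_idle_groups(client, {"blue-tg": "blue-asg", "green-tg": "green-asg"}, "green-tg")` would scale in only `blue-asg`.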