Deploying to ECS (Elastic Container Service) is the bane of every DevOps engineer, with implementations ranging from in-house built tools to ones provided by CI/CD platforms. Although there is no single definitive approach, you are likely to encounter some of the challenges we faced during project development at Povio.

In my experience, it always boils down to separation of concerns. Do you configure the app within infrastructure as code (IAC) or within the app code itself? What should be considered configuration, and what is the inherent logic of the app? The solution described in this article sidesteps these questions entirely in favor of a more generic approach.

Ignoring the drift

This is by far the most common approach, short of not using IAC at all (gasp!). Typically, the whole infrastructure is set up through IAC, including the ECS Service and Task Definition, and the deploy pipeline then updates that same task definition through the ECS API, using any of the available tools or gems and taking the previous task definition as the base for the next one.
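
To make the mechanics concrete, here is a minimal sketch of the "previous task as the base" step. It assumes the pipeline has already fetched the current definition (for example the `taskDefinition` object returned by the ECS DescribeTaskDefinition call) and will pass the result to RegisterTaskDefinition. The function name `next_task_definition` and the list of read-only fields are my own illustration, not taken from any particular tool.

```python
import copy

# Fields that DescribeTaskDefinition returns but RegisterTaskDefinition
# does not accept as input. Assumption: this list may need extending as
# the ECS API evolves.
READ_ONLY_FIELDS = {
    "taskDefinitionArn",
    "revision",
    "status",
    "requiresAttributes",
    "compatibilities",
    "registeredAt",
    "registeredBy",
    "deregisteredAt",
}


def next_task_definition(previous, container_name, image):
    """Build the payload for the next revision from the previous one.

    Everything is carried forward as-is except the image of the named
    container, which is exactly why changes made outside the IAC pile up.
    """
    candidate = {
        key: copy.deepcopy(value)
        for key, value in previous.items()
        if key not in READ_ONLY_FIELDS
    }
    matched = False
    for container in candidate.get("containerDefinitions", []):
        if container.get("name") == container_name:
            container["image"] = image
            matched = True
    if not matched:
        raise ValueError(f"no container named {container_name!r} in task definition")
    return candidate
```

Note that nothing here consults the IAC: whatever was in the last revision, intended or not, ends up in the next one.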

Terraform, for example, is often blamed here, as it does not let you drill down below the task definition level in its state and treats each revision of a task definition as a separate object. You can read more about that in the related GitHub issue.

In short, while you can carry forward environment variables, you cannot do the same for secrets. Refreshing the Terraform state will always detect the outside changes as drift and require creating a new task definition, even if the definition in the IAC is unchanged.

Handing over keys to IAC

Seen as a fix and preferred to the status quo, this approach has the deploy pipeline prepare the application image and hand it over to the infrastructure deploy pipeline (check Terraform API-driven pipelines for examples). This approach is preferred by Site Reliability Engineering (SRE) teams at Povio, since it allows complete control over the code in production by forcing an additional confirmation step.

Development teams with a more frequent release schedule might find the overhead of running IAC pipelines unnecessary, along with the cost and manual labor involved. Automating the IAC deploy strikes fear into the bones of every seasoned DevOps engineer.

Taking control

“We need to deploy today; we need to have those variables now” and similar statements typically come right before things go wrong, with the worst cases discovered months after the fact. This approach keeps the whole ECS task definition in the deploy pipeline, either coordinated manually with the IAC or, worse, with the IAC having zero control of the environment.

This is often the result of blowback from the previous approach. While it does speed up deployment, it fragments infrastructure definitions and makes changes a coordination nightmare at best and a game of whack-a-mole at worst.

The solution using IAC defined task templates

As we learned, we cannot trust the deploy pipeline to overwrite a previous task definition, and we cannot trust it to keep the latest definitions itself. So let’s separate those out.

The IAC in our solution is still the source of truth for the task definition template, but it does not control the task itself. That is left to the pipeline, which overlays the template with application-specific environment variables and the application image.
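
The overlay step can be sketched in a few lines. This is a simplified illustration, not the actual implementation of our tooling: the function name `overlay_template`, the choice to let pipeline values win over template values on a name clash, and the way the template reaches the pipeline (here simply a dict shaped like an ECS task definition) are all assumptions.

```python
import copy


def overlay_template(template, container_name, image, app_environment):
    """Render a registrable task definition from the IAC-owned template.

    template        -- dict shaped like an ECS task definition, owned by the IAC
    container_name  -- which container receives the application image
    image           -- image reference produced by the build step
    app_environment -- {name: value} of application-specific variables
    """
    rendered = copy.deepcopy(template)  # never mutate the source of truth
    for container in rendered["containerDefinitions"]:
        if container["name"] != container_name:
            continue
        container["image"] = image
        # Start from the variables the IAC put in the template, then let
        # the application-specific ones override on a name clash.
        merged = {e["name"]: e["value"] for e in container.get("environment", [])}
        merged.update(app_environment)
        container["environment"] = [
            {"name": name, "value": value} for name, value in sorted(merged.items())
        ]
        return rendered
    raise ValueError(f"no container named {container_name!r} in template")
```

Because every deploy starts from the template rather than from the previous revision, nothing drifts: whatever is not in the template or in the pipeline's input simply does not exist in the next task definition.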

Deployments and rollbacks

Since the ECS Service does not update with the rest of the infrastructure, an infrastructure change requires another app deployment afterwards. This is not a bug: keep in mind that deploying the IAC does not orchestrate the application's dependencies, which can leave the app unresponsive until they are fully resolved. Decoupling the deployment means we can (and need to) think about backwards compatibility of the infrastructure while the application is not fully deployed.

Source of configuration and secrets

The beauty of this setup is that we don't need to define where to store configuration or secrets. While infrastructure-dependent variables like database connection details could fit within the task definition template, they would not be misplaced within SSM Parameter Store or Secrets Manager along with non-infrastructure variables. Both keeping configuration within the application code and, for purists, keeping no environment-specific configuration within the application code are supported.
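
As a small illustration of mixing sources, the sketch below builds the `secrets` entries of an ECS container definition (each entry is a `name` plus a `valueFrom` reference) and reports any variable name that is defined both as a plain environment variable and as a secret. The helper names and the parameter ARN with its account ID are hypothetical placeholders.

```python
def build_secrets(references):
    """Turn {ENV_NAME: reference} into container-definition `secrets` entries.

    `reference` is whatever ECS accepts in `valueFrom`, for example the ARN
    of an SSM parameter or of a Secrets Manager secret.
    """
    return [
        {"name": name, "valueFrom": reference}
        for name, reference in sorted(references.items())
    ]


def conflicting_names(container):
    """Names defined both in `environment` and in `secrets` of one container."""
    env_names = {entry["name"] for entry in container.get("environment", [])}
    secret_names = {entry["name"] for entry in container.get("secrets", [])}
    return sorted(env_names & secret_names)


# Hypothetical container: the host lives in the template, the password in SSM.
container = {
    "name": "app",
    "environment": [{"name": "DB_HOST", "value": "db.internal"}],
    "secrets": build_secrets(
        {
            "DB_PASSWORD": "arn:aws:ssm:eu-central-1:123456789012:parameter/myapp/prod/DB_PASSWORD"
        }
    ),
}
```

A check like `conflicting_names` is optional, but it is a cheap guard for a pipeline that accepts variables from more than one place.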

Defining infrastructure

Since the base template never changes unless the infrastructure itself changes, we can avoid keeping the state of ECS Service Tasks, completely sidestepping the tainted state issues.

While there may never be a single correct way to deploy to ECS, my hope is that this article will help you find one that satisfies your needs. By embracing the principles of separation of concerns and leveraging IAC-defined task templates, you can take a step towards a more efficient and sustainable deployment process. Povio uses an internally developed deployment tool, which is publicly available on GitHub, free of charge, under the name "ecs-deploy-cli".

Happy deploying!