How we manage stable and secure deploys
By Melissa Chan
Stable deployments prevent surprises on production; secure deployments mean the app has been thoroughly tested. In this post, we lay out our process for managing stable and secure deploys.
At Assemble, we value quality work. One way we make sure our developers can do quality work is by applying best practices to how we create and deliver it. We give developers a workflow they can apply within the team, one that helps them complete projects successfully. We want every employee to feel confident at the end of each sprint that their work meets the client's expectations.
What are stable and secure deploys and why are they important?
Stable deployments help prevent an app from crashing. Being stable means there are no surprises when it comes to deploying on production. There shouldn't be any gotchas. At Assemble, when we deploy, it should feel calm.
Secure means the app has been thoroughly tested and there are no known security issues. We've tested and thought through all the cases, especially for a public-facing application. One of the most effective ways to get secure production deployments is to have a process where we deploy through multiple stages before the app reaches production.
How do we achieve stable and secure deploys?
At Assemble, we organize each sprint's release process around three environments to help us achieve stable and secure deploys. The sprint model sets a cadence for when we release and sets realistic expectations with our clients, so they know what to expect and when. Each project has a release cycle:
- Dev Staging Environment
- User Acceptance Testing Environment (UAT)
- Production Environment
Step One: Dev Staging Environment
The Dev Staging Environment runs on a two-week cadence. We have a Dev branch, which means all developers work off the same branch. This branch reflects our two-week sprint tasks and goals. Each day, the branch is updated with Pull Requests (PRs) as well as new code committed on top. This means developers work with ever-changing code for two weeks. That can be unstable, so we follow best practices to prevent regressions and accidental changes when a developer merges their code on top of new code. Both the team and each individual developer are responsible for keeping the Dev Environment stable at all times. Then, to ensure a stable deployment, we use a CI/CD tool.
The CI/CD pipeline runs its build and test steps every time we merge something into Dev, keeping the environment almost parallel to production. We set up this Dev Environment as a small-scale version of what the production environment will look like. It doesn't have the scalability of a production site, but it has the same core functionality and features, plus whatever is in the current sprint. The Dev branch is where we catch potential build issues, because the CI/CD tool flags them on every merge. We recommend writing scripts so these steps run automatically.
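The merge-then-verify loop can be sketched as a fail-fast pipeline runner. This is a minimal sketch, not our actual tooling; the step names and callables are illustrative, and in practice each step would shell out to a real lint, test, or build command:

```python
from typing import Callable

def run_pipeline(steps: list[tuple[str, Callable[[], bool]]]) -> list[tuple[str, bool]]:
    """Run each CI step in order, stopping at the first failure so a
    broken build never reaches the deploy step."""
    results = []
    for name, step in steps:
        ok = step()
        results.append((name, ok))
        if not ok:
            break  # fail fast: remaining steps (e.g. deploy) are skipped
    return results

# Illustrative steps; a failing test suite blocks everything after it.
results = run_pipeline([
    ("lint", lambda: True),
    ("test", lambda: False),
    ("deploy-dev", lambda: True),  # never reached when tests fail
])
```

The fail-fast ordering is the point: a merge that breaks the build stops in CI rather than landing on the Dev Environment.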
Quality Assurance (QA)
The sprint process also includes Quality Assurance (QA). Some projects add a fourth environment, the QA environment, which lets the QA team test features independently, without disruption from new code landing on the Dev branch. Whether to add one depends on the development and QA teams and how they prefer to work: some prefer batch QA directly on the Dev branch, while others prefer testing features individually in their own environment. Isolated testing is another way to stabilize a release, because it shows whether an individual feature causes a regression or a major bug.
In this post, we're using a three-environment setup. Dev is where both developers and QA test to make sure there are no regressions that could cause an issue on production, using the Dev Environment as their best mimic of production, given that it is a sandbox version of it.
There's always the possibility of regressions, where a previously fixed bug comes back or a new change breaks something that used to work. We prevent regressions by having multiple deployment phases.
Step Two: User Acceptance Testing Environment (UAT)
Unlike the Dev Staging Environment, the User Acceptance Testing Environment (UAT) and the Production Environment are each updated toward the end of a sprint. UAT is where user-acceptance testing happens: more continuous testing, plus the client reviewing what was done during the sprint before approving it to move on to production. Typically we run UAT testing for at least a few days, ideally a week. This is pure testing, to make sure there are no regressions or missed client expectations.
This environment is more about user experience. Is this stable for a user? Does it make sense? It's a sandbox environment similar to the Dev Staging Environment: a little bigger, but not as big as the Production Environment. More users can visit the site and test, but not as many as will be in production. It's another guardrail to make sure everything is stable and that we aren't introducing new regressions.
We keep UAT in parity with production as much as possible. For a typical user, there shouldn't be any difference. The Dev Staging Environment is the one where, in most cases, we use a developer API, which is a work-in-progress API. It also doesn't have the same database as the production API because hosting a full database is expensive. We want to be able to stress test it enough, but not too much.
For the most part, some teams point UAT at the production API, as long as nothing there writes to the production database; it's purely retrieval, matching the production site as closely as we can. There will be some differences in what the data looks like, but it should match what production sees.
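One way to keep UAT in parity with production while Dev stays on its own work-in-progress API is to make the differences explicit in per-environment configuration. This is a hypothetical sketch; the URLs, setting names, and the `read_only` flag are invented for illustration:

```python
# Hypothetical per-environment settings; URLs and names are illustrative.
SETTINGS = {
    "dev":  {"api_base": "https://api.dev.example.com", "db": "dev-subset",  "read_only": False},
    "uat":  {"api_base": "https://api.example.com",     "db": "uat-replica", "read_only": True},
    "prod": {"api_base": "https://api.example.com",     "db": "prod",        "read_only": False},
}

def settings_for(env: str) -> dict:
    """Look up settings for an environment. UAT points at the production
    API but is flagged read-only, so testing never writes production data."""
    return SETTINGS[env]
```

Keeping this table in one place makes environment switches "pretty automatic": deploying to a different stage only changes which row is loaded, not the application code.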
Step Three: Production Environment
At this point, we deploy to production. Until now, Production, UAT, and Dev have worked nearly identically across the staging environments. So if there are any build errors, we will have caught them on Dev or UAT, especially when it comes to switching environments. With the way we set up these deployments, it should be pretty automatic.
When it comes to production, you just tag the release and ship it. That's really how you prevent any unknowns in production. You don't want to be fighting fires on deployment day; fix the fires before they start. The goal is always for production to be completely smooth. Any disruption should surface in UAT or Dev instead, where a regression can be caught before it ships.
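Tag-and-release can be as simple as computing the next version tag and letting the CI/CD tool deploy whatever that tag points at. A minimal sketch, assuming semver-style tags like `v1.4.2`; a real workflow would read the existing tags from git rather than take them as a list:

```python
import re

def next_release_tag(existing: list[str], bump: str = "patch") -> str:
    """Compute the next release tag from existing semver-style tags.
    Tags that don't match the vMAJOR.MINOR.PATCH pattern are ignored."""
    versions = []
    for tag in existing:
        m = re.fullmatch(r"v(\d+)\.(\d+)\.(\d+)", tag)
        if m:
            versions.append(tuple(int(g) for g in m.groups()))
    major, minor, patch = max(versions, default=(0, 0, 0))
    if bump == "major":
        major, minor, patch = major + 1, 0, 0
    elif bump == "minor":
        minor, patch = minor + 1, 0
    else:
        patch += 1
    return f"v{major}.{minor}.{patch}"
```

For example, `next_release_tag(["v1.2.3", "v1.2.4"])` yields `"v1.2.5"`. Because the tag is immutable and the pipeline is automated, deployment day is just pushing the tag, not hand-assembling a build.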
Why do we do this?
This process is a core best practice for developers, especially as your company grows. You should never treat production as the place to discover problems. You want a safe environment where you can push code, and following this process is one way we accomplish that.
Work with us
If you just read this article and thought, "Man, that's how I'd like to work," we're hiring and we'd love to hear from you.