The Release Process
This project currently follows a Continuous Delivery process, but it's gradually moving toward Continuous Deployment.
Whenever a commit is pushed to this repository's main branch, the deployment pipeline kicks in, deploying the changeset to the stage environment.
After the deployment is complete, accessing the __version__ endpoint will show the commit hash of the deployed version, which will eventually match that of the latest commit on the main branch (a node with an older version might still serve the request before it is shut down).
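The comparison described above can be sketched as a small helper. This is only an illustration, not part of the project's tooling: it assumes the __version__ endpoint returns a JSON object with a `commit` field, which this document does not specify, and the function name and sample payload are hypothetical.

```python
import json


def deployed_commit_matches(version_body: str, expected_commit: str) -> bool:
    """Return True if the version payload reports the expected commit.

    Assumption: the payload is a JSON object with a "commit" key.
    """
    payload = json.loads(version_body)
    if not isinstance(payload, dict):
        return False
    deployed = str(payload.get("commit", "")).strip().lower()
    expected = expected_commit.strip().lower()
    # Require a non-empty hash so a missing field never counts as a match.
    return bool(deployed) and deployed == expected


# Hypothetical response body; a real payload may carry other fields.
sample_body = '{"commit": "0123456789abcdef0123456789abcdef01234567"}'
print(deployed_commit_matches(sample_body, "0123456789abcdef0123456789abcdef01234567"))
```

Because a node running an older version may still answer for a short while, a mismatch right after a deployment is not necessarily an error; repeating the check a little later is a reasonable approach.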
Versioning
The commit hash of the deployed code is considered its version identifier. The commit hash can be retrieved locally via git rev-parse HEAD.
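For scripts that need the version identifier, the same git command can be wrapped in Python. The sketch below is illustrative only: the helper names are hypothetical, and the 40-character check assumes the repository uses SHA-1 object names (a SHA-256 repository would produce 64 characters).

```python
import re
import subprocess

# Assumption: SHA-1 object names, i.e. 40 lowercase hexadecimal characters.
FULL_SHA1 = re.compile(r"[0-9a-f]{40}")


def is_full_commit_hash(value: str) -> bool:
    """Check that a string looks like a full 40-character SHA-1 hash."""
    return FULL_SHA1.fullmatch(value) is not None


def local_commit_hash(repo_dir: str = ".") -> str:
    """Return the hash of HEAD by running `git rev-parse HEAD` in repo_dir."""
    result = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=repo_dir,
        check=True,           # raise CalledProcessError if git fails
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()
```

Calling `local_commit_hash()` from inside a checkout returns the value to compare against the deployed version; `is_full_commit_hash` can guard against abbreviated hashes being passed around by mistake.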
Preventing deployment
Occasionally, developers might want to prevent a commit from triggering the deployment pipeline. While this should be discouraged, there are some legitimate cases for doing so (e.g. docs-only changes).
To prevent the deployment of the code from a PR when merging to main, the title of that PR must contain the [do not deploy] text. Note that, when generating the merge commit for a branch within the GitHub UI, the extended description must either be left unchanged or be edited with care, so that [do not deploy] is still present.
For example:
# PR title (NOT the commit message)
doc: Add documentation for the release process [do not deploy]
While the [do not deploy] text can be anywhere in the title, it is recommended to place it at the end to better fit the current PR title practices.
The deployment pipeline will analyse the message of the merge commit (which will contain the PR title) and make a decision based on it.
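The decision the pipeline makes can be pictured with the following sketch. It is not the pipeline's actual implementation, which this document does not show; the function name is hypothetical, and treating the marker as an exact, case-sensitive substring is an assumption.

```python
DO_NOT_DEPLOY_MARKER = "[do not deploy]"


def should_deploy(merge_commit_message: str) -> bool:
    """Return False when the merge commit message carries the marker.

    Assumption: the match is an exact, case-sensitive substring check.
    """
    return DO_NOT_DEPLOY_MARKER not in merge_commit_message


# The marker may appear anywhere in the message, including the end of the title.
print(should_deploy("doc: Add documentation for the release process [do not deploy]"))
print(should_deploy("feat: Add a new suggestion provider"))
```

Under these assumptions the first call prints False (deployment skipped) and the second prints True. It also shows why removing the marker while editing the merge commit description would let the deployment proceed.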
Releasing to production
Developers with write access to the Merino repository can initiate a deployment to production after a Pull-Request on the Merino GitHub repository is merged to the main branch.
While any developer with write access can trigger the deployment to production, the expectation is that the individual(s) who authored and merged the Pull-Request should do so, as they are the ones most familiar with their changes and can tell, by looking at the data, if anything looks anomalous.
In general, authors should feel responsible for the changes they make and shepherd them through their deployment.
Releasing to production can be done by:
- opening the CircleCI dashboard;
- looking up the pipeline named merino <PR NUMBER> running in the main-workflow; this pipeline should either be in a running status (if the required test jobs are still running) or in the "on hold" status, with the unhold-to-deploy-to-prod job being held;
- once in the "on hold" status, with all the other jobs successfully completed, clicking on the "thumbs up" action on the unhold-to-deploy-to-prod job row will approve it and trigger the deployment, unblocking the deploy-to-prod job;
- developers must monitor the Merino Application & Infrastructure dashboard for any anomaly, for example significant changes in HTTP response codes, increase in latency, CPU/memory usage (most things under the infrastructure heading).
What to do if production is broken?
Don't panic and follow the instructions below:
- depending on the severity of the problem, decide if this warrants kicking off an incident;
- if the root cause of the problem can be identified in a relatively short time, create a PR for the fix:
  - verify the fix locally;
  - verify the fix on stage, after it is reviewed by a Merino developer and merged;
  - deploy it to production.
OR
- if the root cause of the problem is harder to track down, revert the last commit and then deploy the revert commit to production.