The Release Process

This project currently follows a Continuous Deployment process.

Whenever a commit is pushed to this repository's main branch, a CircleCI workflow is triggered which performs code checks and runs automated tests. The workflow additionally builds a new Docker image of the service and pushes that Docker image to the Docker Hub registry (this requires all previous jobs to pass).

Pushing a new Docker image to the Docker Hub registry triggers a webhook that starts the Jenkins deployment pipeline (the Docker image tag determines the target environment). The deployment pipeline first deploys to the stage environment and subsequently to the production environment.

Activity diagram of CircleCI main-workflow

After the deployment is complete, accessing the __version__ endpoint will show the commit hash of the deployed version, which will eventually match to the one of the latest commit on the main branch (a node with an older version might still serve the request before it is shut down).

Release Best Practices

The expectation is that the author of the change will:

  • merge pull requests during hours when the majority of contributors are online
  • monitor the [Merino Application & Infrastructure][merino_app_info] dashboard for any anomaly

Versioning

The commit hash of the deployed code is considered its version identifier. The commit hash can be retrieved locally via git rev-parse HEAD.

Load Testing

Load testing can be performed either locally or during the deployment process. During deployment, load tests are run against the staging environment before Merino-py is promoted to production.

Load tests in continuous deployment are controlled by adding a specific label to the commit message being deployed. The format for the label is [load test: (abort|skip|warn)]. Typically, this label is added to the merge commit created when a GitHub pull request is integrated.

  • abort: Stops the deployment if the load test fails.
  • skip: Skips load testing entirely during deployment.
  • warn: Proceeds with the deployment even if the load test fails, but sends a warning notification through Slack.

If no label is included in the commit message, the default behavior is to run the load test and issue a warning if it fails.

For more detailed information about load testing procedures and conventions, please refer to the Load Test README.

Logs from load tests executed in continuous deployment are available in the /data volume of the Locust master kubernetes pod.

What to do if production breaks?

If your latest release causes problems and needs to be rolled back: don't panic and follow the instructions in the Rollback Runbook.

What to do if tests fail during deployment?

Please refer to What to do with Test Failures in CI?