What to do with test failures in CI?

  1. Investigate the cause of the test failure

  2. Fix or mitigate the failure

    • If a fix can be identified in a relatively short time, then submit a fix
    • If the failure is caused by a flaky or intermittent functional test and the risk to the end-user experience is low, then the test can be "skipped" during continued investigation by marking it as an expected failure with the pytest xfail decorator. Example:
      @pytest.mark.xfail(reason="Test Flake Detected (ref: DISCO-####)")
      
  3. Re-Deploy

    • A fix or mitigation will most likely require merging a PR to the main branch, which automatically triggers the deployment process. If a merge is not possible, a re-deployment can be initiated manually by triggering the CI pipeline in CircleCI.
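
As a minimal sketch of the mitigation in step 2, the xfail marker might be applied to a flaky test as shown here. The function fetch_suggestions and the test name are hypothetical stand-ins, not names from the actual test suite, and the DISCO-#### reference is kept as a placeholder for the real ticket number.

```python
import pytest


def fetch_suggestions(query: str) -> list:
    """Hypothetical stand-in for the call the flaky functional test exercises."""
    return [f"{query} result"]


# The test still runs in CI. A failure is reported as XFAIL and does not break
# the pipeline. With strict=False (the pytest default unless xfail_strict is
# configured), an unexpected pass is reported as XPASS and does not fail either.
@pytest.mark.xfail(reason="Test Flake Detected (ref: DISCO-####)", strict=False)
def test_suggestions_are_returned() -> None:
    assert fetch_suggestions("firefox") == ["firefox result"]
```

Note that an xfail test is still executed, so it keeps producing signal while the flake is investigated. If the test should not run at all (for example, because it hangs), pytest also accepts run=False on the same marker. Remove the marker once the underlying cause is fixed.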