Locust vs k6: Merino-py Performance Test Framework

  • Status: Accepted
  • Deciders: Nan Jiang, Raphael Pierzina & Katrina Anderson
  • Date: 2023-02-21

Context and Problem Statement

Performance testing for the Rust version of Merino was conducted with the Locust test framework and focused on the detection of HTTP request failures. During the migration of Merino from Rust to Python, performance testing was conducted with k6 and focused on the evaluation of request latency. Going forward, a unified performance testing solution is preferred: should the test framework be Locust or k6?

Decision Drivers

  1. The test framework supports the current load test design, a 10-minute test run with an average load of 1500 RPS (see Merino Load Test Plan)
  2. The test framework measures HTTP request failure and client-side latency metrics
  3. The test framework is compatible with the Rapid Release Model for Firefox Services initiative, meaning:
    • It can execute through command line
    • It can signal failures given check or threshold criteria
    • It can be integrated into a CD pipeline
    • It can report metrics to Grafana
  4. The members of the DISCO and ETE teams are able to contribute to and maintain load tests written with the test framework
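Driver 1 hinges on sustaining an average of 1500 RPS for 10 minutes. As a rough illustration of what that implies for either framework, the sketch below sizes the number of concurrent simulated users needed; the per-user wait time and average response time are hypothetical assumptions, not figures from this ADR or the Merino Load Test Plan.

```python
import math

# Back-of-the-envelope sizing for driver 1 (average 1500 RPS).
# Assumptions (hypothetical, not from the ADR): each simulated user
# waits `wait_s` between requests and the service responds in `resp_s`.

def users_needed(target_rps: float, wait_s: float, resp_s: float) -> int:
    """Concurrent users required to sustain target_rps.

    Each user completes one request every (wait_s + resp_s) seconds,
    i.e. contributes 1 / (wait_s + resp_s) RPS, so the user count is
    target_rps * (wait_s + resp_s), rounded up.
    """
    return math.ceil(target_rps * (wait_s + resp_s))

# With a 1 s wait and a 0.5 s average response, 1500 RPS needs:
print(users_needed(1500, 1.0, 0.5))  # → 2250
```

Numbers of this magnitude are why both options' distributed load generation features (or, for k6, single-instance efficiency) matter in the pros and cons below.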

Considered Options

  • A. Locust
  • B. k6

Decision Outcome

Chosen option:

  • A. Locust

Both k6 and Locust are able to execute the current load test design, report the required metrics and fulfill the Rapid Release Model for Firefox Services initiative; however, Locust's Python tech stack ultimately makes it the better fit for the Merino-py project. In line with the team's single repository direction (see PR), using Locust will:

  • Leverage existing testing, linting and formatting infrastructure
  • Promote dependency sharing and code re-use (models & backends)

Pros and Cons of the Options

A. Locust

Locust can be viewed as the status quo option, since it is the framework that is currently integrated into the Merino-py repository and is the basis for the CD load test integration currently underway (see DISCO-2113).

Pros

  • Locust has a mature distributed load generation feature and can easily support a 1500 RPS load
  • Locust has built-in RPS, HTTP request failure and time metrics with customizable URL breakdown
  • Locust scripting is in Python
  • Locust supports direct command line usage
  • Locust is used for load testing in other Mozilla projects and is recommended by the ETE team

Cons

  • Locust is 100% community driven (no commercial business), which means its contribution level can wane
  • Preliminary research indicates that reporting metrics from Locust to Grafana requires the creation of custom code, a plugin or a third party integration
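To give a sense of the scale of the "custom code" con above, Grafana's Graphite data source ingests a plaintext protocol of one `<path> <value> <timestamp>` line per metric, so the core of a bridge is a small formatter like the sketch below. The metric names, stat values and `merino.loadtest` prefix are hypothetical, not taken from Locust's API or the Merino-py codebase.

```python
import time

# Minimal sketch of a Locust-to-Grafana bridge: render aggregated
# load test stats as Graphite plaintext lines ("<path> <value> <ts>\n").
# The prefix and stat names here are hypothetical placeholders.

def to_graphite(prefix: str, stats: dict, ts: int) -> str:
    """Render a flat {metric: value} dict as Graphite plaintext lines."""
    return "".join(f"{prefix}.{name} {value} {ts}\n" for name, value in stats.items())

payload = to_graphite(
    "merino.loadtest",
    {"rps": 1480.2, "failures": 3, "p95_ms": 92},
    ts=1676937600,
)
print(payload, end="")
```

Sending `payload` over a TCP socket to the Graphite host would complete the bridge; the plugin and third-party integrations mentioned above package this same plumbing.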

B. k6

For the launch of Merino-py, performance benchmarking was conducted using a k6 load test script (see Merino Explorations). This script was reused from the Merino rewrite exploration effort and has proven successful in assessing whether Merino-py performance achieves the target p95 latency threshold, prompting preventative change (see PR). k6's effectiveness and popularity amongst team members is an incentive to pause and evaluate if it is a more suitable framework going forward.

Pros

  • k6 is an open-source commercially backed framework with a high contribution rate
  • k6 is built by Grafana Labs, implying straightforward integration with Grafana dashboards
  • k6 has built-in RPS, HTTP request failure and time metrics with customizable URL breakdown
  • k6 supports direct command line usage
  • k6 is feature rich, including built-in functions to generate pass/fail results and create custom metrics

Cons

  • The k6 development stack is in JavaScript/TypeScript. This means:
    • Modeling and backend layer code would need to be duplicated and maintained
    • Linting, formatting and dependency infrastructure would need to be added and maintained
  • k6 has an immature distributed load generation feature, with documented limitations
    • k6 runs more efficiently than other frameworks, so it may be possible to achieve 1500 RPS without distribution