Flaky Tests

The Flaky Tests report identifies tests whose results are inconsistent across recent runs, passing in some executions and failing in others. It helps improve test reliability and CI/CD stability by highlighting tests that may need maintenance.

What It Shows

For each flaky test case, the report surfaces:

  • Test case details (name, ID, automation status)
  • Flip count — number of status transitions within the analyzed window
  • Execution timeline — recent run statuses (name, color, success/failure indicator, timestamp)
  • Pass rate for the windowed executions

How "Flaky" Is Determined

The report analyzes each case's most recent executions within a configurable sliding window (default 10 consecutive runs, max 30). A case qualifies as flaky if it shows status instability — specifically:

  • Both successful and failing results within the window, or
  • Any non-success results (Blocked, Retest, Skipped, etc.) mixed with successes

A case is then surfaced only if its flip count is at least the configured flip threshold.
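
The rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the report's actual implementation: the function name `analyze_case`, the `"Passed"` status label, and the returned field names are hypothetical, and a "flip" is assumed to mean a change between a successful and a non-successful result in adjacent runs.

```python
from typing import Dict, List, Union

SUCCESS = "Passed"  # assumed label for a successful run (hypothetical)


def analyze_case(
    statuses: List[str],
    consecutive_runs: int = 10,
    flip_threshold: int = 5,
) -> Dict[str, Union[int, float, bool]]:
    """Summarize one test case; `statuses` is ordered oldest to newest."""
    if consecutive_runs < 1:
        raise ValueError("consecutive_runs must be at least 1")

    # Sliding window: keep only the most recent N executions.
    window = statuses[-consecutive_runs:]

    # Collapse every status to success / non-success (assumption: Blocked,
    # Retest, Skipped, and Failed all count as non-success).
    passed = [status == SUCCESS for status in window]

    # A flip is a change in outcome between two adjacent runs.
    flips = sum(1 for prev, cur in zip(passed, passed[1:]) if prev != cur)

    # Unstable means the window mixes successes with non-successes.
    unstable = any(passed) and not all(passed)

    pass_rate = sum(passed) / len(passed) if passed else 0.0

    return {
        "flip_count": flips,
        "pass_rate": pass_rate,
        "unstable": unstable,
        "surfaced": unstable and flips >= flip_threshold,
    }
```

For example, a case that alternates Passed and Failed over ten runs has nine flips and a pass rate of 0.5, so it is surfaced at the default threshold of 5. A case that passes five times and then fails five times is unstable but has a single flip, so it stays hidden.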

Filters

At generation time:

  • Consecutive Runs — how many recent executions to analyze per case (default 10, max 30)
  • Flip Threshold — minimum flip count to surface the test (default 5; range 2 to runs−1)
  • Automation Status — All, Automated only, or Manual only
  • Date Range — restrict the windowed executions to a specific period
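
The constraints on these filters can be expressed as a small validation sketch. The class name `FlakyReportFilters`, its field names, and the automation status values are hypothetical and not part of any documented API. The lower bound of 3 for Consecutive Runs is an inference from the Flip Threshold range (2 to runs−1), not something the report states.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple

# Hypothetical values mirroring "All, Automated only, or Manual only".
AUTOMATION_CHOICES = ("All", "Automated", "Manual")


@dataclass(frozen=True)
class FlakyReportFilters:
    consecutive_runs: int = 10   # default 10, max 30
    flip_threshold: int = 5      # default 5, range 2 to runs - 1
    automation_status: str = "All"
    date_range: Optional[Tuple[date, date]] = None  # (start, end), inclusive

    def __post_init__(self) -> None:
        # Assumed minimum of 3 so that a threshold of 2 is still reachable.
        if not 3 <= self.consecutive_runs <= 30:
            raise ValueError("consecutive_runs must be between 3 and 30")
        if not 2 <= self.flip_threshold <= self.consecutive_runs - 1:
            raise ValueError(
                "flip_threshold must be between 2 and consecutive_runs - 1"
            )
        if self.automation_status not in AUTOMATION_CHOICES:
            raise ValueError(f"automation_status must be one of {AUTOMATION_CHOICES}")
        if self.date_range is not None and self.date_range[0] > self.date_range[1]:
            raise ValueError("date_range start must not be after its end")
```

The upper limit on the threshold follows from the arithmetic: a window of N runs has at most N − 1 adjacent pairs, so no case can flip more than N − 1 times.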