Flaky Tests

Identifies tests whose results switch between passing and failing across recent runs. Highlighting these tests shows where maintenance is needed, which improves test reliability and CI/CD stability.

What It Shows

For each flaky test case, the report surfaces:

Test case details (name, ID, automation status)
Flip count — number of status transitions within the analyzed window
Execution timeline — recent run statuses (name, color, success/failure indicator, timestamp)
Pass rate for the windowed executions

How "Flaky" Is Determined

The report analyzes each case's most recent executions within a configurable sliding window (default 10 consecutive runs, max 30). A case qualifies as flaky if its statuses in that window are unstable, meaning the window contains either:

Both successful and failing results within the window, or
Any non-success results (Blocked, Retest, Skipped, etc.) mixed with successes

A flaky case is then surfaced in the report only if its flip count is greater than or equal to the configured flip threshold.
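The rules above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the report's actual implementation: the function name `analyze_case`, the status string `"Passed"`, and the oldest-to-newest ordering of the input are all hypothetical, and a flip is assumed to be any change between a success and a non-success result in consecutive runs.

```python
from typing import Dict, Sequence, Union

SUCCESS = "Passed"  # assumed label for a successful run


def analyze_case(
    statuses: Sequence[str],
    window: int = 10,
    flip_threshold: int = 5,
) -> Dict[str, Union[int, float, bool]]:
    """Evaluate one test case; `statuses` is ordered oldest to newest (assumption)."""
    recent = list(statuses)[-window:]          # keep only the sliding window
    outcomes = [s == SUCCESS for s in recent]  # collapse to success / non-success

    # A flip is a change of outcome between two consecutive runs (assumption).
    flips = sum(1 for prev, cur in zip(outcomes, outcomes[1:]) if prev != cur)

    # Unstable: at least one success mixed with at least one non-success.
    unstable = any(outcomes) and not all(outcomes)

    pass_rate = sum(outcomes) / len(outcomes) if outcomes else 0.0

    return {
        "flips": flips,
        "unstable": unstable,
        "pass_rate": pass_rate,
        "surfaced": unstable and flips >= flip_threshold,
    }


history = ["Passed", "Failed", "Passed", "Failed", "Passed",
           "Blocked", "Passed", "Passed", "Failed", "Passed"]
result = analyze_case(history)
print(result["flips"], result["pass_rate"], result["surfaced"])
```

Note that a case can be unstable without being surfaced: five passes followed by five failures mixes both outcomes but flips only once, so it falls below the default threshold of 5.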

Filters

At generation time:

Consecutive Runs — how many recent executions to analyze per case (default 10, max 30)
Flip Threshold — minimum flip count to surface the test (default 5; range 2 to runs−1)
Automation Status — All, Automated only, or Manual only
Date Range — restrict the windowed executions to a specific period
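To make the documented ranges concrete, here is a minimal Python sketch that validates a set of filter values. The class and field names (`FlakyReportFilters`, `consecutive_runs`, and so on) are hypothetical and do not reflect the product's API. The lower bound of 3 for Consecutive Runs is an assumption, derived only from the need for the threshold range 2 to runs−1 to be non-empty.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

AUTOMATION_CHOICES = ("All", "Automated only", "Manual only")


@dataclass(frozen=True)
class FlakyReportFilters:
    """Hypothetical container for the generation-time filters."""
    consecutive_runs: int = 10
    flip_threshold: int = 5
    automation_status: str = "All"
    start_date: Optional[date] = None  # optional Date Range bounds
    end_date: Optional[date] = None

    def __post_init__(self) -> None:
        # Max 30 is documented; min 3 is an assumption (see lead-in).
        if not 3 <= self.consecutive_runs <= 30:
            raise ValueError("consecutive_runs must be between 3 and 30")
        # Documented range: 2 to runs - 1.
        if not 2 <= self.flip_threshold <= self.consecutive_runs - 1:
            raise ValueError("flip_threshold must be between 2 and consecutive_runs - 1")
        if self.automation_status not in AUTOMATION_CHOICES:
            raise ValueError(f"automation_status must be one of {AUTOMATION_CHOICES}")
        if self.start_date and self.end_date and self.start_date > self.end_date:
            raise ValueError("start_date must not be after end_date")


defaults = FlakyReportFilters()                                   # documented defaults
widest = FlakyReportFilters(consecutive_runs=30, flip_threshold=29)
```

With 10 consecutive runs the highest valid threshold is 9, because a window of 10 runs contains at most 9 transitions.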
