How Argos detects visual differences

Argos uses deterministic pixel diffing, not AI-based visual comparison. Instead of compensating for flakiness, Argos focuses on eliminating it at the source. This keeps visual tests precise, explainable, and reliable over time.

What Argos compares

Argos compares rendered screenshots and ARIA snapshots produced by your E2E and Storybook tests.

Each screenshot is compared against its baseline using a pixel-level diff algorithm that runs in multiple refinement passes.

The question we answer is intentionally simple:

Did the UI visually change, or not?

No interpretation. No probability. No guesswork.

The diff algorithm

Argos relies on the open-source odiff library, maintained by the talented Dmitriy Kovalenko.

Our full diff implementation is also open source and available for inspection.

High-level flow

1. Image normalization: resolution, color space, and alpha channels are aligned.
2. Multiple diff passes: each pass uses different thresholds to detect both strict and subtle changes.
3. Pixel clustering: random noise is separated from meaningful visual changes.
4. Final diff output: a diff mask and score are produced for review in Argos.

Running multiple passes allows Argos to stay strict while remaining resilient to minor, explainable noise.
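
To make the idea of multiple passes and clustering concrete, here is a minimal TypeScript sketch. It is not Argos's or odiff's actual code: the names (RgbaImage, DiffPass, hasVisualChange), the distance formula, and the threshold values are all assumptions chosen for illustration. A strict pass flags any pixel that changed a lot, while a sensitive pass only counts subtle changes when they form a connected cluster.

```typescript
// Hypothetical illustration only; thresholds and names are assumptions.
type RgbaImage = { width: number; height: number; data: Uint8ClampedArray }; // 4 bytes per pixel
type DiffPass = { threshold: number; minClusterSize: number };

// Build an image filled with a single RGBA color.
function solidImage(width: number, height: number, rgba: [number, number, number, number]): RgbaImage {
  const data = new Uint8ClampedArray(width * height * 4);
  for (let i = 0; i < data.length; i++) data[i] = rgba[i % 4];
  return { width, height, data };
}

// Euclidean RGBA distance for pixel i, scaled to [0, 1] (maximum raw distance is 510).
function pixelDelta(a: RgbaImage, b: RgbaImage, i: number): number {
  const o = i * 4;
  let sum = 0;
  for (let c = 0; c < 4; c++) {
    const d = a.data[o + c] - b.data[o + c];
    sum += d * d;
  }
  return Math.sqrt(sum) / 510;
}

// One pass: mark every pixel whose delta exceeds the threshold.
function diffMask(a: RgbaImage, b: RgbaImage, threshold: number): Uint8Array {
  if (a.width !== b.width || a.height !== b.height) {
    throw new Error("Images must be normalized to the same size first");
  }
  const mask = new Uint8Array(a.width * a.height);
  for (let i = 0; i < mask.length; i++) {
    if (pixelDelta(a, b, i) > threshold) mask[i] = 1;
  }
  return mask;
}

// Sizes of 4-connected clusters of marked pixels, found with a flood fill.
function clusterSizes(mask: Uint8Array, width: number, height: number): number[] {
  const seen = new Uint8Array(mask.length);
  const sizes: number[] = [];
  for (let start = 0; start < mask.length; start++) {
    if (!mask[start] || seen[start]) continue;
    let size = 0;
    const stack: number[] = [start];
    seen[start] = 1;
    while (stack.length > 0) {
      const i = stack.pop()!;
      size++;
      const x = i % width;
      const y = (i - x) / width;
      const neighbors: number[] = [];
      if (x > 0) neighbors.push(i - 1);
      if (x < width - 1) neighbors.push(i + 1);
      if (y > 0) neighbors.push(i - width);
      if (y < height - 1) neighbors.push(i + width);
      for (const n of neighbors) {
        if (mask[n] && !seen[n]) {
          seen[n] = 1;
          stack.push(n);
        }
      }
    }
    sizes.push(size);
  }
  return sizes;
}

// A change is reported if any pass finds a large enough cluster.
function hasVisualChange(
  a: RgbaImage,
  b: RgbaImage,
  passes: DiffPass[] = [
    { threshold: 0.3, minClusterSize: 1 }, // strict pass: any strongly changed pixel counts
    { threshold: 0.05, minClusterSize: 4 }, // sensitive pass: subtle changes must cluster
  ],
): boolean {
  return passes.some(({ threshold, minClusterSize }) =>
    clusterSizes(diffMask(a, b, threshold), a.width, a.height).some((s) => s >= minClusterSize),
  );
}

const baseline = solidImage(4, 4, [255, 255, 255, 255]);
const noisy = solidImage(4, 4, [255, 255, 255, 255]);
noisy.data[0] = 195; // one isolated pixel, slightly off
console.log(hasVisualChange(baseline, noisy)); // false
```

The result depends only on the two input images and the fixed pass list, so running it twice on the same inputs always gives the same answer.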

Why pixel diffing instead of AI

Some tools use AI or ML models to decide whether a change is acceptable.

Argos intentionally does not.

AI compensates for flakiness.
Argos removes it.

AI-based approaches often:

Hide small changes without clear explanations
Mask rendering inconsistencies
Create ambiguity between what changed and what was approved

Over time this leads to silent regressions and declining trust in the test suite.

Flakiness is a signal

A flaky visual test usually points to an underlying problem, such as:

Non-deterministic animations
Time-dependent rendering
Uncontrolled fonts
Async layout shifts
Environment-specific rendering differences

Argos treats flakiness as technical debt to fix, not noise to ignore.

Flaky test management and resolution

Argos provides explicit tools to identify and resolve flakiness:

Flaky indicators and reports
Ignore changes on specific screenshots
SDK-level stabilization for fonts, animations, images, and loading states
Helpers to mask specific regions or elements

Learn more in managing flaky tests.
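
As a rough illustration of what masking a region means for a pixel diff, the sketch below paints the same rectangles with a solid color in both images before comparing them, so pixels inside those rectangles can never differ. This is not the Argos SDK API: Screenshot, Region, maskRegions, and countDifferentPixels are hypothetical names used only for this example.

```typescript
// Hypothetical illustration only; these helpers are not part of the Argos SDK.
type Screenshot = { width: number; height: number; data: Uint8ClampedArray }; // RGBA, 4 bytes per pixel
type Region = { x: number; y: number; width: number; height: number };

// Return a copy of the image with every region painted opaque magenta.
function maskRegions(image: Screenshot, regions: Region[]): Screenshot {
  const data = new Uint8ClampedArray(image.data); // copy so the input is not mutated
  for (const r of regions) {
    // Clamp the region to the image bounds.
    const x0 = Math.max(0, r.x);
    const y0 = Math.max(0, r.y);
    const x1 = Math.min(image.width, r.x + r.width);
    const y1 = Math.min(image.height, r.y + r.height);
    for (let y = y0; y < y1; y++) {
      for (let x = x0; x < x1; x++) {
        const o = (y * image.width + x) * 4;
        data[o] = 255;
        data[o + 1] = 0;
        data[o + 2] = 255;
        data[o + 3] = 255;
      }
    }
  }
  return { width: image.width, height: image.height, data };
}

// Count pixels whose RGBA values are not exactly equal.
function countDifferentPixels(a: Screenshot, b: Screenshot): number {
  if (a.width !== b.width || a.height !== b.height) {
    throw new Error("Images must have the same dimensions");
  }
  let count = 0;
  for (let o = 0; o < a.data.length; o += 4) {
    if (
      a.data[o] !== b.data[o] ||
      a.data[o + 1] !== b.data[o + 1] ||
      a.data[o + 2] !== b.data[o + 2] ||
      a.data[o + 3] !== b.data[o + 3]
    ) {
      count++;
    }
  }
  return count;
}
```

Masking both the baseline and the new screenshot with the same regions keeps the comparison deterministic: everything outside the mask is still compared pixel by pixel.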

Determinism over probability

Pixel diffing offers properties that matter in CI:

Deterministic: same input, same result
Explainable: exact pixels that changed are visible
Review-friendly: reviewers assess facts, not model guesses
Auditable: approvals have a clear meaning

AI introduces probability and hidden heuristics, which is a poor fit for regression testing.

Built for long-term health

Argos optimizes for:

Trust in failures
High signal-to-noise ratio
Stable baselines
Predictable reviews

The result is visual testing that teams rely on every day, not something they mute after a few weeks.

Open by design

The diff engine and its surrounding logic are open source:

No black box
No hidden thresholds
Fully inspectable and debuggable

Visual testing should be infrastructure, not magic.
