Free Tool

ABX Blind Test: Can You Hear the Difference?

Sample-accurate double-blind comparison. Audition A, B, and the unknown X with the playback position carrying across every switch - the closest thing to "is this thing actually different" you can do without a measurement rig. Try a built-in synthetic test in two clicks, or drag in your own pair.

My tests

Completed tests save here automatically. Click any saved test to re-open its result and certificate.

Score - Take the test to see your number.
p-value - Probability of scoring at least this well by random guessing.
Confidence - Five levels: chance · suggestive · significant · strong · publishable.

How ABX testing works

The protocol

ABX is the gold standard for auditory discrimination. You get two known references (A and B) and an unknown X, randomly assigned to be A or B each round. Audition each as many times as you like, then commit an answer.
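The round structure described above can be sketched in a few lines of Python. This is an illustrative model only, not the tool's own code; the names `assign_x` and `score` are hypothetical.

```python
import random


def assign_x(num_rounds, rng=None):
    """Pick the hidden identity of X ('A' or 'B') independently for each round."""
    rng = rng or random.Random()
    return [rng.choice("AB") for _ in range(num_rounds)]


def score(hidden, answers):
    """Count the rounds where the listener's answer matches the hidden identity."""
    if len(hidden) != len(answers):
        raise ValueError("need exactly one answer per round")
    return sum(h == a for h, a in zip(hidden, answers))


# Example: a 10-round session where the listener always answers "A".
hidden = assign_x(10, random.Random(0))
hits = score(hidden, ["A"] * 10)
```

Because each round is drawn independently, a listener who cannot hear a difference has a 50 % chance per round, which is the assumption the statistics rely on.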

Sample-accurate switching matters. The position carries across A→B→X in this tool, so you swap mid-phrase. Echoic (auditory sensory) memory lasts only roughly 3-4 seconds; if the switch resets the playhead, the gap erases the trace you were comparing against.
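As a toy illustration of what "the position carries across" means, the sketch below keeps one shared sample index for all three sources. It is a simplified model in plain Python, assuming A and B are equal-length sample lists; the class name `SharedPlayhead` is hypothetical and this is not how the tool itself is implemented.

```python
class SharedPlayhead:
    """Three sources (A, B, X) that share a single playback position."""

    def __init__(self, a, b, x_is):
        if len(a) != len(b):
            raise ValueError("A and B must have the same number of samples")
        if x_is not in ("A", "B"):
            raise ValueError("x_is must be 'A' or 'B'")
        self.sources = {"A": a, "B": b, "X": a if x_is == "A" else b}
        self.active = "A"
        self.position = 0  # shared sample index; never reset on a switch

    def switch(self, name):
        """Change the active source; the playhead stays where it is."""
        if name not in self.sources:
            raise KeyError(name)
        self.active = name

    def render(self, num_samples):
        """Return the next samples of the active source and advance the playhead."""
        source = self.sources[self.active]
        start = self.position
        end = min(start + num_samples, len(source))
        self.position = end
        return source[start:end]
```

Switching from A to B after two samples continues from the third sample of B, so the listener hears the same musical moment from the other source rather than a restart.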

The statistics

The p-value is the probability of scoring at least your hit count by pure random guessing, modeled as a binomial distribution with a 50 % chance of a correct answer in each round. The conventional "this is real" threshold is p < 0.05.

For 10 rounds: 8 hits gives p ≈ 0.055 (borderline, just short of significance), 9 hits gives p ≈ 0.011 (significant), and 10 hits gives p ≈ 0.001 (highly significant). Longer tests need a higher absolute hit count but a lower percentage: 15/20 (75 %) gives p ≈ 0.021, comfortably significant.
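The figures above can be reproduced with a one-sided binomial tail sum. The function below is a minimal sketch using only the Python standard library; the name `abx_p_value` is an assumption for illustration and is not taken from the tool.

```python
from math import comb


def abx_p_value(hits, rounds):
    """One-sided binomial p-value: P(at least `hits` correct in `rounds` 50/50 guesses)."""
    if not 0 <= hits <= rounds:
        raise ValueError("hits must be between 0 and rounds")
    # Count the outcomes with `hits` or more correct, out of 2**rounds equally likely ones.
    favourable = sum(comb(rounds, k) for k in range(hits, rounds + 1))
    return favourable / 2 ** rounds


print(round(abx_p_value(8, 10), 3))   # 0.055
print(round(abx_p_value(9, 10), 3))   # 0.011
print(round(abx_p_value(15, 20), 3))  # 0.021
```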

What your hit rate actually proves

Binomial probability of guessing the listed score or better by pure chance. The bar for "I really heard a difference" is p < 0.05 - anything below that and the guessing explanation runs out.

Score      p-value   Verdict
5 / 10     0.623     Pure guessing. No evidence of audible difference.
6 / 10     0.377     Inconclusive - could be a slight bias, could be luck.
7 / 10     0.172     Suggestive but not significant. Run a longer test.
8 / 10     0.055     On the edge of significance (p ≈ 0.05), but not below it.
9 / 10     0.011     Significant - only a 1.1 % chance of scoring this well by guessing.
10 / 10    0.001     Highly significant. You hear it. Go publish.
13 / 16    0.011     Equivalent confidence on a longer test.
15 / 20    0.021     Clean significant result on 20 rounds.
17 / 20    0.001     Highly significant. Beyond chance.
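To extend the table to other test lengths, one can search for the smallest hit count whose p-value falls below the threshold. The sketch below assumes the same one-sided binomial model with a 50 % chance per round; the function names are hypothetical.

```python
from math import comb


def p_value(hits, rounds):
    """P(at least `hits` correct in `rounds` 50/50 guesses)."""
    return sum(comb(rounds, k) for k in range(hits, rounds + 1)) / 2 ** rounds


def min_hits_for_significance(rounds, alpha=0.05):
    """Smallest hit count with p < alpha, or None if even a perfect score falls short."""
    # The p-value shrinks as hits grow, so the first count that passes is the minimum.
    for hits in range(rounds + 1):
        if p_value(hits, rounds) < alpha:
            return hits
    return None


for n in (4, 5, 10, 16, 20):
    print(n, min_hits_for_significance(n))
```

Under this model the minimum is 9 of 10 (90 %), 12 of 16 (75 %), and 15 of 20 (75 %), while a 4-round test cannot reach p < 0.05 even with a perfect score.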

FAQ

ABX blind test FAQ.

How the test works, what counts as significant, and whether your audio files are uploaded anywhere.

  1. What is an ABX blind test?

    An ABX test is a statistical method for proving you can hear the difference between two audio sources. You hear sample A, sample B, then a randomized sample X (either A or B) and decide which one X matches. Get enough trials right and the result is statistically significant; flip a coin and you will not.

  2. How many ABX rounds do I need to be statistically significant?

    The tool runs 10 rounds by default. Nine or more correct out of 10 is significant at p < 0.05 (less than a 5 % chance of scoring that well by guessing); eight out of 10 falls just short at p ≈ 0.055. Ten out of ten clears p < 0.001. Fewer rounds are less reliable; longer tests lower the percentage you need (15 of 20 is significant), with diminishing returns.

  3. Why does ABX testing matter for audiophile gear?

    Because most "I can hear it" claims fall apart under blind, level-matched conditions. ABX cuts through expectation bias by hiding which sample is which. If you can pass an ABX test on two cables, two DACs, or two file formats, the difference is real; if you cannot, it likely was not.

  4. Are my audio files uploaded anywhere?

    No. The ABX test runs entirely in your browser using Web Audio AudioBuffers. Your files never leave your device, never hit our servers, never get logged. The Certificate of Auditory Transparency you can share is just an image. It contains the result, not the audio.

  5. How do I level-match the two samples for a fair test?

    The tool auto-applies RMS level matching when you load two files, so loudness differences do not bias the test. If your two clips have very different peak levels but similar RMS, the perceived loudness will be close. For maximum rigor, pre-normalize the files in a DAW before loading.
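For readers who want to pre-match levels themselves, RMS matching amounts to scaling one clip by the ratio of the two RMS values. The sketch below is a minimal illustration on plain lists of float samples; it is an assumption about the general technique, not a description of the tool's exact procedure, and the function names are hypothetical.

```python
from math import sqrt


def rms(samples):
    """Root-mean-square level of a sequence of float samples."""
    if not samples:
        raise ValueError("clip is empty")
    return sqrt(sum(s * s for s in samples) / len(samples))


def match_rms(reference, clip):
    """Scale `clip` so its RMS equals that of `reference`; return (scaled clip, gain)."""
    clip_rms = rms(clip)
    if clip_rms == 0:
        raise ValueError("cannot level-match a silent clip")
    gain = rms(reference) / clip_rms
    return [s * gain for s in clip], gain


a = [0.5, -0.5, 0.5, -0.5]
b = [0.25, -0.25, 0.25, -0.25]
matched_b, gain = match_rms(a, b)
```

A gain above 1 can push peaks past full scale, so in practice it is often safer to attenuate the louder clip than to boost the quieter one.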