How Many Contacts Do You Need for a Valid Split Test?
Why Sample Size Matters
Every split test result contains some amount of random variation. If you flip a fair coin 10 times and get 7 heads, you cannot conclude the coin is biased. You need more flips to separate the real tendency from random noise. Split testing works the same way. A subject line that gets 25% opens in one segment and 22% in another might actually perform identically, with the 3-point gap caused entirely by random variation in who happened to open their email that day.
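The coin example can be checked directly. The sketch below uses only Python's standard library to compute how often a perfectly fair coin produces a result at least as lopsided as 7 heads in 10 flips. The function name `prob_at_least` and the 100-flip comparison are illustrative additions, not part of the original example.

```python
from math import comb

def prob_at_least(heads: int, flips: int, p: float = 0.5) -> float:
    """Probability of seeing `heads` or more heads in `flips` tosses."""
    return sum(
        comb(flips, k) * p**k * (1 - p) ** (flips - k)
        for k in range(heads, flips + 1)
    )

# Same 70% share of heads, two very different sample sizes.
print(f"7+ heads in 10 flips:   {prob_at_least(7, 10):.3f}")
print(f"70+ heads in 100 flips: {prob_at_least(70, 100):.6f}")
```

A fair coin lands on 7 or more heads in about 17% of 10-flip runs, so that result alone says very little. The same 70% share over 100 flips is far rarer, and larger test groups have the same effect on split test results.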
Sample size is what separates a real finding from a statistical accident. With enough contacts in each test group, random variation averages out and the true difference between your versions becomes visible. With too few contacts, you cannot tell whether version A actually outperformed version B or whether you just got lucky.
Practical Minimums by Test Type
Email Subject Line Tests (Open Rate)
Open rates typically range from 15% to 35%, and meaningful improvements are usually 3 to 7 percentage points. At these rates, a list of 1,000 contacts (500 per variation) is a practical floor for spotting a 5-point difference, although a real gap of that size will not always reach 95% confidence with groups this small. A list of 2,000 contacts gives you noticeably more reliable results. Below 500 contacts, subject line tests become unreliable because the expected difference is too small relative to the random variation.
Email Click-Through Rate Tests
Click-through rates are much lower than open rates, typically 2% to 8%. Because the base rate is lower, random variation has a proportionally larger effect, and you need more contacts to detect real differences. For click-through rate tests, aim for at least 5,000 contacts (2,500 per variation). With fewer than 2,000 total contacts, click-through rate tests rarely produce statistically meaningful results.
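One way to see why a lower base rate needs more contacts is to express the random variation as a fraction of the rate being measured. The sketch below uses the standard error of a proportion for that purpose. The function name `relative_noise`, the group size of 500, and the two example rates are assumptions chosen for illustration.

```python
from math import sqrt

def relative_noise(rate: float, contacts: int) -> float:
    """Standard error of an observed rate, as a fraction of the rate itself."""
    standard_error = sqrt(rate * (1 - rate) / contacts)
    return standard_error / rate

contacts_per_variation = 500  # illustrative group size, not a recommendation
for label, rate in [("open rate 25%", 0.25), ("click rate 3%", 0.03)]:
    noise = relative_noise(rate, contacts_per_variation)
    print(f"{label}: noise is about {noise:.0%} of the rate")
```

With 500 contacts per variation, the noise is roughly 8% of a 25% open rate but roughly 25% of a 3% click rate. That gap is why click-through tests need larger groups.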
Landing Page Conversion Tests
Landing page conversion rates vary widely depending on the offer, but a typical range is 2% to 10%. As with click-through rate tests, the low base rate means you need more traffic. Plan for at least 1,000 conversions total across both variations, which means you need enough traffic to generate 500 conversions per variation. If your page converts at 3%, that is roughly 16,700 visitors per variation, or about 33,300 total.
SMS Click-Through Rate Tests
SMS click-through rates are typically higher than email click-through rates, ranging from 10% to 30%. This higher base rate means you need fewer contacts for reliable results. A list of 1,000 to 2,000 contacts is usually sufficient for SMS tests measuring click-through rates.
What to Do When Your List Is Too Small
If your list is smaller than the recommended minimums, you have several options. First, focus on testing elements with the largest expected impact. Subject line tests require the smallest audiences, so start there. Second, accept a wider margin of error by only declaring winners when the gap between variations is very large, say 10 percentage points or more rather than 3 to 5. Third, accumulate data across multiple sends by running the same test over two or three campaigns and combining the results.
What you should not do is declare a winner with insufficient data and assume the result is reliable. A test that shows version A beating version B by 2 percentage points on a list of 300 contacts is essentially meaningless. The result could easily flip on the next send. See "How to Split Test When You Only Have 500 Subscribers" for more specific strategies for small lists.
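To put a number on how weak a 2-point gap on 300 contacts is, the sketch below runs a pooled two-proportion z-test, one common way to compare two rates. The counts are hypothetical: 150 contacts per variation with 39 opens for version A (26%) and 36 opens for version B (24%). The function name `two_proportion_p_value` is also an illustrative choice.

```python
from math import erfc, sqrt

def two_proportion_p_value(success_a: int, n_a: int, success_b: int, n_b: int) -> float:
    """Two-sided p-value from a pooled two-proportion z-test."""
    rate_a = success_a / n_a
    rate_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_a - rate_b) / standard_error
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for a standard normal.
    return erfc(abs(z) / sqrt(2))

# Hypothetical counts: a 26% vs 24% split on 300 contacts, then on 30,000.
small_list = two_proportion_p_value(39, 150, 36, 150)
large_list = two_proportion_p_value(3900, 15000, 3600, 15000)
print(f"300 contacts:    p = {small_list:.2f}")
print(f"30,000 contacts: p = {large_list:.5f}")
```

On the 300-contact list the p-value is about 0.69, meaning a gap this size shows up most of the time even when the two versions perform identically. The same 2-point gap on 30,000 contacts would be very unlikely to come from chance alone.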
The Quick Rule of Thumb
If you want a simple guideline that works for most situations: you need at least 100 successful events (opens, clicks, or conversions) per variation to have a reasonable chance of detecting a meaningful difference. If your open rate is 25% and you need 100 opens per variation, you need 400 contacts per variation, or 800 total. If your click rate is 3% and you need 100 clicks per variation, you need about 3,334 contacts per variation, or about 6,700 total.
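The rule of thumb is a single division, so it is easy to apply to your own rates. The sketch below rounds up to a whole contact. The function name `contacts_for_rule_of_thumb` is an illustrative choice, and the default of 100 events comes from the guideline above.

```python
from math import ceil

def contacts_for_rule_of_thumb(event_rate: float, events_needed: int = 100) -> int:
    """Contacts one variation needs to expect `events_needed` opens, clicks, or conversions."""
    return ceil(events_needed / event_rate)

for label, rate in [("25% open rate", 0.25), ("3% click rate", 0.03)]:
    per_variation = contacts_for_rule_of_thumb(rate)
    print(f"{label}: {per_variation:,} per variation, {2 * per_variation:,} total")
```

For a 25% open rate this gives 400 contacts per variation and 800 total. For a 3% click rate it gives 3,334 per variation, or about 6,700 total.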
This is a rough guideline, not a precise calculation. For exact sample size calculations, you would need to specify your desired confidence level (typically 95%), your desired statistical power (typically 80%), and the minimum detectable effect size. But for practical marketing purposes, the 100-events-per-variation rule gets you in the right ballpark.
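For readers who want to try the more exact calculation, the sketch below uses one common textbook formula: the normal approximation for comparing two proportions with unpooled variance. Dedicated sample size calculators may use slightly different variants and return somewhat different numbers. The function name `sample_size_per_variation` and the example inputs (a 25% baseline open rate and a 5-point minimum detectable effect) are assumptions for illustration.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_variation(
    baseline: float,
    min_detectable_effect: float,
    confidence: float = 0.95,
    power: float = 0.80,
) -> int:
    """Approximate contacts per variation for a two-sided test of two proportions."""
    p1 = baseline
    p2 = baseline + min_detectable_effect
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 at 95%
    z_beta = NormalDist().inv_cdf(power)                      # about 0.84 at 80%
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / min_detectable_effect**2
    return ceil(n)

# Hypothetical inputs: 25% baseline open rate, 5-point lift worth detecting.
print(sample_size_per_variation(0.25, 0.05))
# Hypothetical inputs: 3% baseline click rate, 1-point lift worth detecting.
print(sample_size_per_variation(0.03, 0.01))
```

With these inputs the formula asks for about 1,250 contacts per variation for the open rate test and over 5,000 per variation for the click rate test. Formal calculations often ask for more contacts than the quick rule, so treat the rule as a starting point rather than a guarantee.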
Need help designing a testing program that works with your audience size? Talk to our team about getting started.