How Long Should You Run a Split Test Before Deciding
Why Ending a Test Too Early Is the Biggest Mistake
The most common split testing error is calling a winner too soon. You send an email test at 9 AM, check the results at 11 AM, see that version A has a 28% open rate versus 22% for version B, and declare version A the winner. The problem is that at 11 AM, only your most engaged subscribers have opened the email. The rest of your list, who check email less frequently, have not weighed in yet. By the next morning, the gap might narrow to 2 points or even reverse entirely.
Early results are biased toward your most active subscribers. These people behave differently from your overall list. They open emails faster, click more often, and may have different content preferences than the casual subscribers who make up the bulk of your list. Waiting long enough for your full audience to engage gives you results that reflect your entire list, not just the most active segment.
Minimum Duration by Channel
Email Campaigns
Most email opens happen within the first 24 hours, but a meaningful share, often 10% to 20%, arrives on day two and beyond. Wait at least 24 hours before even looking at results, and ideally 48 hours before making a decision. For campaigns sent on Friday, wait until Monday, because weekend email behavior differs from weekday behavior.
Landing Page Tests
Landing page tests should run for at least one full business cycle, which for most businesses means one week. This ensures your results capture behavior across all days of the week, which can vary significantly. If your landing page gets fewer than 500 visitors per week, plan for two to four weeks to accumulate enough data for reliable results.
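If you want a quick way to sanity-check that timeline before launching, the rough planning sketch below estimates how many weeks a test needs to reach a target number of conversions per variation (the 100 to 200 guideline discussed later in this article). It assumes an even traffic split between two variations, and the traffic figures, conversion rates, and function name are illustrative; this is a back-of-the-envelope estimate, not a formal statistical power calculation.

```python
import math

def weeks_needed(weekly_visitors, conversion_rate, variations=2,
                 target_conversions_per_variation=200):
    """Rough estimate of how many full weeks a landing page test needs
    to reach a target conversion count per variation. Illustrative only."""
    visitors_per_variation_per_week = weekly_visitors / variations
    conversions_per_variation_per_week = visitors_per_variation_per_week * conversion_rate
    weeks = target_conversions_per_variation / conversions_per_variation_per_week
    # Never plan for less than one full week, even with heavy traffic
    return max(1, math.ceil(weeks))

# High-traffic page: 20,000 visitors/week at a 5% conversion rate
print(weeks_needed(20000, 0.05))  # 1 week (the one-week floor applies)

# Lower-traffic page: 3,000 visitors/week at a 4% conversion rate
print(weeks_needed(3000, 0.04))   # 4 weeks
```

If the estimate comes back at many months, that is a signal the page may not have enough traffic to test that variable at all, and you should test something with a bigger expected effect instead.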
Never run a landing page test for less than a full week, even if you get enough traffic in three days. Day-of-week effects are real: a Monday visitor and a Saturday visitor may behave differently, and a test that only captures weekday behavior will mislead you about weekend performance.
SMS Campaigns
SMS is the exception to the "wait longer" rule. Most text message engagement happens within the first 30 minutes of delivery, and nearly all of it within 4 hours. You can typically evaluate an SMS split test the same day you send it; just wait at least 4 hours after sending before checking results.
How to Know When You Have Enough Data
Duration alone is not sufficient. You also need enough responses. A test that runs for two weeks but only produces 50 conversions per variation is not reliable, regardless of how long it ran. The practical minimum is 100 to 200 successful events (opens, clicks, or conversions) per variation.
If your testing platform shows a confidence level, wait until it reaches 95% or higher before declaring a winner. If one version has been consistently ahead at 90% confidence or higher for several days, that is also a reasonable basis for a decision, though technically less rigorous.
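If you are curious where a number like "95% confidence" comes from, one common way to derive it from raw counts is a two-proportion z-test. The minimal sketch below shows that calculation; the function name and example counts are illustrative, and your platform may use a different statistical method entirely.

```python
import math

def split_test_confidence(conversions_a, visitors_a, conversions_b, visitors_b):
    """Approximate confidence that two variations truly differ,
    using a two-proportion z-test (one common approach)."""
    p_a = conversions_a / visitors_a
    p_b = conversions_b / visitors_b
    # Pooled rate under the assumption that there is no real difference
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    if se == 0:
        return 0.0
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return 1 - p_value  # e.g. 0.95 means 95% confidence

# Example: 120 conversions out of 2,000 visitors vs. 155 out of 2,000
print(round(split_test_confidence(120, 2000, 155, 2000), 3))  # ~0.97, clears the 95% bar
```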
If your platform does not show confidence levels, use this simple rule: wait until the gap between versions has been stable for at least two measurement periods. If version A is ahead by 4 points after 24 hours and still ahead by 4 points after 48 hours, the gap is probably real. If version A was ahead by 6 points after 24 hours but the gap shrank to 2 points by 48 hours, wait longer because the result is still unstable.
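The stability rule can also be written down as a quick check. In the sketch below, the one-point tolerance is an assumption chosen for illustration, not a standard; adjust it to your own volumes and typical variance.

```python
def gap_is_stable(gap_at_24h, gap_at_48h, tolerance=1.0):
    """Rough check of the 'stable gap' rule: the same version should still
    lead, and the lead (in percentage points) should not have moved by more
    than `tolerance` points between measurement periods. Illustrative only."""
    same_leader = (gap_at_24h > 0) == (gap_at_48h > 0)
    return same_leader and abs(gap_at_48h - gap_at_24h) <= tolerance

print(gap_is_stable(4.0, 4.0))  # True  -- lead held steady, probably real
print(gap_is_stable(6.0, 2.0))  # False -- lead shrank, keep waiting
```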
When to End an Inconclusive Test
Not every test produces a clear winner. If you have waited the appropriate duration, accumulated enough data, and the two versions are still performing within 2 percentage points of each other, call it a tie and move on. There is no value in extending a test indefinitely hoping for a winner to emerge. A tie is a valid result that tells you your audience does not strongly prefer one version over the other for that variable.
Set a maximum test duration before you start. For email tests, if the result is unclear after 72 hours with adequate data, call it a tie. For landing page tests, if the result is unclear after four weeks, end the test. Extended testing ties up resources and prevents you from running the next test that might produce a clear, actionable result.
Want to build a disciplined testing program that produces reliable, actionable results? Talk to our team.