The process of improving conversion is called A/B testing: the science of experimenting with changes to see whether they improve performance.
For example, you could rewrite the top half of your landing page, or you could switch from a paid trial to a free trial. Either change might increase your signup rate.
Your job is to figure out what's worth testing.
We'll cover:
In a test, each thing you're testing is called a variant. For example, your existing site may be Variant A. The change you're comparing it to may be called Variant B.
Hence, "A/B" testing.
Testing makes or breaks growth. I've worked with many companies that couldn't get Facebook ads to run profitably, then later achieved success through three months' worth of landing page A/B testing: they continuously made their visuals more enticing and their messaging clearer.
Here's the testing cycle:
Repeat these steps until you run out of variant ideas. Never have downtime; every day of the year, a test should be running—or you're letting traffic go to waste.
A/B testing isn't about striving for perfection with each variant. It's about iteration.
Here's where I source ideas from:
An A/B variant is only better if it increases your bottom line.
If you discover that a variant motivates visitors to click a button 10x more, but button clicking doesn’t actually lead to greater signups or purchases, then your variant isn’t better than the original. All it's done is distract users into clicking a button.
For each A/B test, keep your eye on the prize: What is the meaningful funnel metric you're trying to increase? Often, it's email captures, signups, purchases, or retention.
Of these, you'll more often A/B test earlier parts of the funnel—for two reasons:
There are two types of variants: micros and macros.
Micro variants are small, quick changes. They're unlikely to have a large impact. For example, changing a button's color (a micro variant) typically won't lift conversion by more than 2% at best.
Macro variants, on the other hand, are significant rethinkings of your asset. Entirely rewriting a landing page can increase conversion by 50-300%. This happens often. That said, you'll usually only get a couple of boosts like this before facing diminishing returns.
Your goal is to focus on big, macro impacts—because every A/B test has an opportunity cost: you're usually only running one test per audience at a time.
Macro variants require considerable effort: It’s hard to repeatedly summon the focus and company-wide collaboration needed to wholly rethink your assets.
But macros are the only way to see the forest for the trees.
Since the biggest obstacle to testing macros is committing the resources, I urge you to create an A/B testing calendar and adhere to it: Create a recurring event for, say, every 2 months. On that day, spend a couple of hours brainstorming a macro variant for a step in your growth funnel.
You can do so using one of five approaches:
Now here are micro ideas.
Despite micros being less important, I'm including them because if you piece together enough micros, you sometimes have yourself a macro.
When you run out of macros, this is the micro with the greatest impact: change your above-the-fold content.
Every page has an above-the-fold (ATF) section. This is what visitors see before scrolling to the rest of a page. The content placed in your ATF in part determines whether visitors continue scrolling.
Specifically, rewrite your header and subheader copy. Your header text is the first hook visitors encounter for your product. So, if you've been unknowingly showing visitors unenticing messaging here, fixing it can have an impact.
An A/B test has an opportunity cost; you only have so many visitors to test against. So prioritize thoughtfully.
Here are the factors I consider:
A reminder that the last page of this handbook has a downloadable cheatsheet that handily recaps most of what you're about to learn.
Two things to understand about proper test design:
Google Optimize handles all this A/B testing logic for you.
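If you're curious what that logic looks like under the hood, here's a minimal sketch of one common approach: hash a visitor ID so each visitor is randomly, but consistently, assigned to the same variant. This is an illustration, not how Google Optimize is actually implemented; the assign_variant function, the visitor_id parameter, and the 50/50 split are assumptions for the example.

```python
import hashlib

def assign_variant(visitor_id: str, experiment: str = "landing-page-rewrite") -> str:
    """Deterministically bucket a visitor into a variant.

    Hashing (experiment + visitor_id) spreads visitors roughly uniformly
    across buckets, and the same visitor always lands in the same bucket,
    so returning visitors keep seeing the variant they saw first.
    """
    digest = hashlib.sha256(f"{experiment}:{visitor_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100      # 0-99
    return "A" if bucket < 50 else "B"  # 50/50 split

# The same visitor ID always maps to the same variant:
print(assign_variant("visitor-123"))
print(assign_variant("visitor-123"))
```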
When setting up tests, consider who should be included in them. It doesn't have to be everyone.
For example, consider only showing an experiment to visitors arriving at your site for the first time. This ensures that everyone in the test has the same base level of familiarity with your product.
To target only new users in Google Optimize, follow Example 1 in these instructions:
For test results to be statistically valid, you need to reach a sufficiently large sample size. The math is simple:
The implication is that if you don’t have a lot of traffic, the opportunity cost is too great to run micro variants, which tend to show conversion increases in just the 1-5% range. Meanwhile, macros have the potential to produce 10-20%+ improvements, which is well above the 6.3% threshold.
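For reference, here's a rough sketch of the power calculation behind these thresholds, using a standard two-proportion formula. The sessions_per_variant helper, the 20% baseline conversion rate, and the 95% confidence / 80% power settings are assumptions I picked for illustration, not numbers from Google Optimize.

```python
from math import ceil, sqrt
from statistics import NormalDist

def sessions_per_variant(baseline_rate: float, relative_lift: float,
                         alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate sessions needed per variant to reliably detect a given
    relative lift in conversion rate (two-sided, two-proportion z-test)."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    p_bar = (p1 + p2) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96 for 95% confidence
    z_beta = NormalDist().inv_cdf(power)           # ~0.84 for 80% power
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p2 - p1) ** 2)

# Assuming a 20% baseline conversion rate:
print(sessions_per_variant(0.20, 0.30))  # macro-sized lift: hundreds of sessions per variant
print(sessions_per_variant(0.20, 0.05))  # micro-sized lift: tens of thousands per variant
```

The takeaway is the same as above: the smaller the lift you're trying to detect, the more traffic you need before the result means anything.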
Below is an example of an experiment I ran using Google Optimize:
Above, our page had 1,724 views throughout the testing period. Our test variant showed roughly a 30% improvement over our baseline (29 vs. 22).
This 30% number is likely inaccurate, by the way. It's just a reference for the variant's maximum potential. We don't yet have enough sessions to validate this conversion improvement with certainty. But 30% is likely good enough to validate that we improved conversion by at least 6.3% (the number from earlier).
Pay attention to the Google Optimize column labeled Probability to be Best. If a variant's probability is 70%+ and it has sufficient sessions (e.g. the 1,000 and 10,000 session thresholds I indicated above), the results are likely statistically sound, and the winning variant should be considered for implementation.
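Probability to be Best is a Bayesian measure: roughly, the chance that a variant's true conversion rate beats the others given the data so far. Optimize's actual model is more sophisticated, but here's a minimal sketch of the idea using Beta posteriors. The probability_to_be_best helper is hypothetical, and the conversion counts and the even 862/862 session split are assumptions loosely based on the example above.

```python
import random

def probability_to_be_best(conv_a: int, n_a: int, conv_b: int, n_b: int,
                           draws: int = 100_000, seed: int = 0) -> float:
    """Estimate P(variant B's true conversion rate > variant A's) by sampling
    from Beta(1 + conversions, 1 + non-conversions) posteriors."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        rate_a = rng.betavariate(1 + conv_a, 1 + (n_a - conv_a))
        rate_b = rng.betavariate(1 + conv_b, 1 + (n_b - conv_b))
        if rate_b > rate_a:
            wins += 1
    return wins / draws

# Illustrative numbers only: 22 vs. 29 conversions, assuming the 1,724
# sessions split evenly between the baseline and the test variant.
print(probability_to_be_best(conv_a=22, n_a=862, conv_b=29, n_b=862))
```

The more sessions you collect, the tighter those posteriors get and the more stable this probability becomes, which is why the session thresholds matter alongside the 70% bar.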
Now you can decide if the labor and implementation externalities are worth the 6.3%+ improvement in conversion.
What if our results weren't conclusive? What if we didn't surpass a 70% certainty?
Had the experiment revealed merely a 3% increase, for example, we would have to dismiss the sample size of 1,724 as too small for the 3% to be statistically valid.
We'd either end the experiment because our confidence in it is low, or accept the testing opportunity cost and keep it running until we reach 10,000 sessions. If, after 10,000 sessions, the 3% increase remained, we'd conclude it's likely valid.
But, as mentioned in the previous section, if you have little traffic to begin with, don't risk waiting on a small, 3% improvement. Instead, consider a new test.
However, if that small change is tied to a meaningful revenue objective (e.g. purchases) as opposed to, say, people providing their email addresses, then perhaps it's worth continuing.
In other words, the closer an experiment's conversion objective is to revenue, the more worthwhile it may be to confirm small conversion boosts.
Don't implement A/B variants that win negligibly. The unknown downsides of implementation often outweigh the expected value of the gain.
For example, a change may introduce unforeseen funnel consequences that won't become obvious for a few months. By then, it'll be difficult to identify the change as the root cause.
However, sometimes negligible wins are worth re-running on a new audience.
Consider this: when running A/B tests to improve conversion, you'll get diminishing returns on conversion gains for already high-intent traffic (e.g. organic search, referrals, and word of mouth). Those visitors came looking for you on their own merit. They're already interested. The onus is on you to reassure them that you sell what they're expecting, and to not scare them off.
In contrast, for, say, ad traffic, A/B testing has the potential to provide much larger returns. These are medium-intent eyeballs at best, often people who clicked your ad on a whim. They're looking for excuses to dismiss your value props and leave immediately.
This is where A/B tests shine: they're more effective at significantly improving conversion rates for low-to-medium intent traffic—because there's a greater interest gap to cover.
Here's the implication: If you only A/B test against high-intent traffic, you may not notice a significant improvement and may mistakenly dismiss your test as a global failure. When this happens, but you're confident the variant does have potential, retry the test on paid traffic. That's where the improvement may be large enough to notice its significance.
Three takeaways: