We run A/B tests in the hope of a better future, or at least to learn enough to eventually get there. But what method should we use to check whether we’re making progress?

Frequentist or Bayesian?

The difference between Frequentist and Bayesian methods in A/B testing can be understood by how these methods handle uncertainty and update knowledge.

Frequentist Method

In the Frequentist approach, we assume that the truth (e.g., whether version A or B is better) is fixed, but we don’t know what it is. Instead, we gather data and rely on repeated experiments to reveal the truth over time. The focus here is on whether the results we see could have happened just by random chance.

For example, let’s say you want to know if your new website (Version B) performs better than the old one (Version A). You run an A/B test. After collecting data, you calculate a p-value, which tells you how likely you would be to see a difference at least as large as the one you observed if there were no real difference between A and B.

If the p-value is low (typically below 0.05), you reject the idea that A and B are the same and conclude that one version is likely better. But the Frequentist method doesn’t let you directly calculate the probability that B is actually better; it only tells you how unusual your result would be if there were no difference.
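To make the p-value less abstract, here is a minimal sketch of one common way to compute it for conversion rates, a two-proportion z-test. The function name and the visitor and conversion counts are made up for illustration, and the normal approximation it relies on assumes reasonably large samples.

```python
import math


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for H0: A and B share the same conversion rate."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate: the best estimate of the shared rate if H0 is true.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        return 1.0  # no conversions (or only conversions): no evidence either way
    z = (rate_b - rate_a) / std_err
    # Probability of a |z| at least this large under a standard normal.
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical data: 10,000 visitors per version, 5.0% vs. 5.6% conversion.
p_value = two_proportion_p_value(conv_a=500, n_a=10_000, conv_b=560, n_b=10_000)
print(f"p-value: {p_value:.3f}")
```

With these made-up numbers the p-value comes out a little under 0.06, so at the conventional 0.05 threshold you would not reject the idea that A and B are the same, even though B's observed rate is higher.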

Bayesian Method

The Bayesian approach, on the other hand, is more flexible. It starts with what you already believe (called a “prior”) and updates that belief as you get more data. You don’t have to wait for large amounts of data to make a decision; you can continuously update your understanding.

In A/B testing, if you think Version B might be better based on some prior information, Bayesian analysis allows you to combine this prior belief with new data to calculate the probability that B is indeed better than A. This gives you a more intuitive result: “There’s a 75% chance that B is better than A,” for example.
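As a rough illustration of how such a probability can be obtained, the sketch below uses a Beta prior on each version's conversion rate and estimates the chance that B beats A by sampling from the two posteriors. The function name, the default flat Beta(1, 1) prior, and the data are assumptions chosen for the example, not a description of any particular tool.

```python
import random


def prob_b_beats_a(conv_a, n_a, conv_b, n_b,
                   prior_alpha=1.0, prior_beta=1.0,
                   draws=50_000, seed=0):
    """Monte Carlo estimate of P(rate_B > rate_A) under Beta priors."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        # Posterior of a conversion rate: Beta(prior + successes, prior + failures).
        sample_a = rng.betavariate(prior_alpha + conv_a, prior_beta + n_a - conv_a)
        sample_b = rng.betavariate(prior_alpha + conv_b, prior_beta + n_b - conv_b)
        wins += sample_b > sample_a
    return wins / draws


# Hypothetical data: 10,000 visitors per version, 5.0% vs. 5.6% conversion.
probability = prob_b_beats_a(conv_a=500, n_a=10_000, conv_b=560, n_b=10_000)
print(f"P(B is better than A): {probability:.1%}")
```

With this made-up data and a flat prior, the estimate is roughly 97%. Passing an informative prior instead, for example prior_alpha=50 and prior_beta=950 to encode about 1,000 past visitors converting at 5%, pulls both posteriors toward that historical rate.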

Now, the question many CRO professionals face: which method leads to healthier, more accurate experimentation? 

We wanted to explore the topic more in-depth, so we asked seven industry experts to weigh in on the debate. They were quick to reveal their thoughts and share the real-life examples that convinced them their approach is the best way forward. Keep reading to see what experts such as Valentin Radu, Ruben de Boer, Bradley Rodé, and more have to say about the subject!

Frequentist vs. Bayesian: What’s the Healthiest Approach to Running a CRO Program?

Frequentist methods, with their focus on long-run frequencies and hypothesis testing, are often favored for their simplicity and well-established metrics like p-values. 

On the other hand, Bayesian methods are gaining traction for their flexibility, allowing professionals to incorporate prior knowledge and update probabilities as new data is collected.

Let’s see which side wins the debate – with real-life examples to defend our experts’ preferences!


Ryan Thomas: Bayesian is a Simpler Way to Approach A/B Test Statistics

Frequentist.

Bayesian is sold as a simpler way to approach A/B test statistics where you don’t have to worry about things like test planning, error control, peeking, multiple comparisons, and so on.

But using it doesn’t actually make any of these things go away; you’re just sweeping them under the rug. Doing Bayesian properly (using informative priors, which is not how it’s usually done in online A/B testing) is actually much more complex and difficult to explain than frequentist stats, and it ends up introducing subjectivity into a process that is supposed to be objective.

Ryan describes himself as a “Contrarian CRO”, keen on poking holes in all the dogma and gatekeeping in the industry. You can follow him, or check out the work of Koalatative on their website.

Merritt Aho: They’re Not Mutually Exclusive

I use both every day and I also get evaluated regularly to ensure my mental health has not been compromised in doing so.

Seriously, they’re not mutually exclusive.

As for the real-life examples:

Some years ago I started doing a bunch of data simulations on different A/B test outcomes and scenarios I’d seen in my career.

That was when I learned a lot about errors in A/B testing and learned that the promises I was hearing surrounding Bayesian stats were hyperinflated. Statistical methods all have their limitations.

There’s no one-size-fits-all approach.

Merritt is in charge of Digital Analytics at Breeze Airways™. He got his start in CRO after entering a university digital analytics competition and discovering a passion for data-driven design.

Merritt has served as a chapter leader for the Digital Analytics Association Austin Chapter and occasionally shares his CRO experience and enthusiasm with others at conferences, in webinars, on blogs, and on Twitter.

You can get in touch with him on his LinkedIn profile – here.

Ruben de Boer: Involve Stakeholders in the Decision-Making Process

Frequentist vs. Bayesian?

It depends :) When measuring the effectiveness of your CRO program, both Frequentist and Bayesian methods can be effective. The healthiest approach is to choose the method that you and your stakeholders are comfortable with and agree upon. Also, decide together on the confidence level for Frequentist methods (e.g., 95% confidence) or the probability threshold for Bayesian methods (e.g., 90% probability).

By involving stakeholders in the decision-making process and agreeing on the statistical approach and confidence/probability levels, you ensure a more collaborative and supported CRO program. I applied this mindset in real life with all my clients :)

Ruben is a Lead CRO Manager and consultant with over 15 years of experience in data and optimization. He’s a bestselling instructor on Udemy with 17,000+ students and a public speaker on topics such as experimentation culture, change management, conversion rate optimization, and personal growth.

At Online Dialogue, as the Lead Conversion Manager, he is responsible for leading the Conversion Managers team, developing team skills and quality, setting the team strategy and goals, and business development. He’s also a Conversion Manager for various brands such as Vodafone, eBay, RTL, Swiss Sense, I Amsterdam, fonQ, and DPG. Follow him on LinkedIn and check out his work with OnlineDialogue here.

Matt Gershoff: The Frequentist or Bayesian Debate is a Second-Order Issue

It doesn't really matter. The Frequentist or Bayesian debate is a second-order issue at best, as long as whichever approach you choose is used with awareness.

The real efficacy of an experimentation program is not found in the technology or statistical methods but in providing a principled procedure for organizations to make decisions with intention and awareness. 

To be explicit about what problems they are solving, how solving them will help their customers, and to make decisions between competing alternatives at the margin. 

I have many real-life examples that explain this idea.

Conductrics uses both Frequentist and empirical Bayesian approaches depending on the use case. As applied to most of the simple cases found in basic A/B Tests, the numerical results will often be the same. 
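One way to see why the numbers often agree in simple cases is to put the two calculations side by side. The sketch below is only an illustration with made-up data: it uses a plain one-sided z-test and flat Beta(1, 1) priors, not the empirical Bayesian approach mentioned above, and compares one minus the one-sided p-value with the posterior probability that B is better.

```python
import math
import random
from statistics import NormalDist

# Hypothetical data: 10,000 visitors per version.
conv_a, n_a = 500, 10_000
conv_b, n_b = 560, 10_000

# Frequentist: one-sided z-test of H0 "B is not better than A".
pooled = (conv_a + conv_b) / (n_a + n_b)
std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
z = (conv_b / n_b - conv_a / n_a) / std_err
one_sided_p = 1 - NormalDist().cdf(z)

# Bayesian: P(B > A) with flat Beta(1, 1) priors, estimated by sampling.
rng = random.Random(0)
draws = 50_000
wins = sum(
    rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
    > rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
    for _ in range(draws)
)
prob_b_better = wins / draws

print(f"1 - one-sided p-value: {1 - one_sided_p:.3f}")
print(f"P(B > A | data):       {prob_b_better:.3f}")
```

The two printed values should land very close to each other here. The interpretations still differ, and the agreement weakens with small samples or informative priors.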

Matt Gershoff is the CEO of Conductrics, a digital optimization platform that combines A/B testing with artificial intelligence. 

With more than 15 years of experience in scientific marketing and web analytics, Matt holds dual MS degrees in Economics and Artificial Intelligence. Learn more about him from his public LinkedIn profile.

Bradley Rodé: Both Statistical Techniques are Prone to False Positives

The healthiest way is to use the one that works best for your organization and that your organization is proficient in using correctly. 

For us, it is the well-defined procedures of Frequentist statistics that, although rigorous, everyone in our organization is able to understand and execute with each experiment setup. Bayesian methods work great when the experimenter takes the time to study past data and incorporate a well-researched prior probability distribution into the experiment calculations.

In our experience, however, this is rarely done in practice because it is more technical to execute and too time-consuming when experimenting at scale. So it is not that one method is more accurate than the other; the best results are going to come from using the one whose best practices your organization is most likely to follow.

Any real-life examples you can think of?

Both statistical techniques are more prone to false positives with low sample sizes.

With Bayesian stats, this can be mitigated by choosing a strong prior probability distribution based on past data. With Frequentist stats, this can be mitigated by calculating a minimum sample size needed to control for error rates. Online Bayesian A/B Test calculators, however, use weak, uninformative priors that can be more impacted by random spikes in conversions.
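For the Frequentist side, a minimum sample size can be estimated before the test starts. The sketch below uses the standard normal-approximation formula for comparing two proportions; the function name, the 5% baseline, the 10% relative lift, and the 5% significance and 80% power defaults are illustrative assumptions, not values taken from the experts.

```python
import math
from statistics import NormalDist


def min_sample_size_per_variant(baseline_rate, min_detectable_lift,
                                alpha=0.05, power=0.80):
    """Approximate visitors needed per variant for a two-sided z-test."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + min_detectable_lift)  # lift is relative
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # controls false positives
    z_power = NormalDist().inv_cdf(power)           # controls false negatives
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_power) ** 2 * variance_sum / (p2 - p1) ** 2
    return math.ceil(n)


# Hypothetical plan: 5% baseline rate, detect a 10% relative lift (5.0% -> 5.5%).
needed = min_sample_size_per_variant(baseline_rate=0.05, min_detectable_lift=0.10)
print(f"Visitors needed per variant: {needed:,}")
```

Under these assumptions the answer is a little over 31,000 visitors per variant. Smaller lifts push the requirement up quickly, because the difference appears squared in the denominator.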

Many clients want to stop tests early because of the results they see from online calculators, but when we stick to Frequentist methods and wait until an agreed-upon predefined sample size is reached, we can more easily avoid these false positives and achieve better results for our clients in the long run.
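The cost of stopping early can be illustrated with a small simulation of A/A tests, where both versions share the same true conversion rate and every "significant" result is therefore a false positive. This is a sketch under assumed settings (10% conversion rate, 2,000 visitors per arm, ten evenly spaced peeks, a z-test at the 5% level); the function names are hypothetical.

```python
import math
import random


def p_value(conv_a, conv_b, n):
    """Two-sided z-test p-value for two arms with n visitors each."""
    pooled = (conv_a + conv_b) / (2 * n)
    if pooled in (0.0, 1.0):
        return 1.0  # no variation observed yet, so no evidence of a difference
    std_err = math.sqrt(pooled * (1 - pooled) * 2 / n)
    z = (conv_b - conv_a) / n / std_err
    return math.erfc(abs(z) / math.sqrt(2))


def false_positive_rates(true_rate=0.10, n_per_arm=2_000, looks=10,
                         simulations=500, alpha=0.05, seed=0):
    """Simulate A/A tests; return (fixed-horizon FPR, peeking FPR)."""
    rng = random.Random(seed)
    step = n_per_arm // looks
    fixed_hits = peeking_hits = 0
    for _ in range(simulations):
        conv_a = conv_b = 0
        significant_at_any_look = False
        for look in range(1, looks + 1):
            # Both arms share the same true rate, so any "win" is a false positive.
            conv_a += sum(rng.random() < true_rate for _ in range(step))
            conv_b += sum(rng.random() < true_rate for _ in range(step))
            p = p_value(conv_a, conv_b, look * step)
            if p < alpha:
                significant_at_any_look = True
        fixed_hits += p < alpha  # decision made only at the final look
        peeking_hits += significant_at_any_look
    return fixed_hits / simulations, peeking_hits / simulations


fixed_fpr, peeking_fpr = false_positive_rates()
print(f"fixed horizon: {fixed_fpr:.1%}, stop at first significant peek: {peeking_fpr:.1%}")
```

The fixed-horizon rate should land near the nominal 5%, while the rate for stopping at the first significant peek is typically several times higher. That gap is the reason for agreeing on a sample size up front.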

Bradley Rodé is the lead data scientist at Conversion Advocates. He studied business analytics at Pepperdine University and political science at UCLA. When not watching NBA games at two in the morning his local time, you can find him studying Bayesian inference or training Brazilian jiu-jitsu in Africa.

Florent Buisson: What's the p-value?

I'm 100% a Bayesian in principle (Jaynes's book closed the deal for me) and I believe that it's the most accurate approach, but I still use frequentist stats in my day-to-day to measure CRO because it's faster and simpler to explain to business partners.

What convinced you that this is the best one? 

My business partners always ask me "what's the p-value?", and I don't want to restart their statistical education from scratch. 

Florent Buisson is a behavioral economist with 13 years of experience in business, analytics, and behavioral science. 

He founded the behavioral science team at Allstate Insurance and led it for four years, before leading experimentation for the eCommerce company Cars Commerce, where he increased testing velocity by 300% over two years.

When he is not writing a new book, he advises companies and startups on how to improve products with behavioral science and experimentation.

Bhavik Patel: It Depends on the Needs of Your Organization and the Skills in Your Team

I'm not sure if there is a "healthy" or "unhealthy" approach, Frequentist or Bayesian. I've never considered one better or worse than the other.

It largely depends on the needs of your organization and the skills in your team. If you don't have statisticians in your company then you're bound by the method employed by your A/B Testing tool. Maybe you're bound by the method your analyst is familiar with.

Either way, I think the new age dogma that Bayesian > Frequentist is largely being challenged every day.

Bhavik is an experienced Product Analytics and Experimentation lead with a demonstrated history of working successfully across a number of industries. He is skilled in analytics, marketing, conversion optimisation, and leadership.

Valentin Radu: I am a Strong Believer in Experimenting

The debate between Frequentist and Bayesian methods isn't the main obstacle to making progress in experimentation, as long as teams are aware of the approach they’re using and the consequences of choosing one or the other.

The real challenge of an experimentation program lies in guiding organizations to make radical changes and embrace an experimentation mindset that goes beyond website tweaking. That’s why I am a strong believer (obsessed advocate) in experimenting towards more impactful KPIs (like CLV), rather than just website conversion rate.

Valentin Radu is one of the principal architects of the Customer Value Optimization methodology, the founder of Omniconvert, founder of The CVO Academy, and author of The CLV Revolution book. Get in touch with Valentin here and (or) buy his book on the Customer Value Optimization Methodology here.

Frequentist/Bayesian: No “Real” Winner

As you can see, there is no universal “best” method.  As with many other things in marketing, experimentation, or development, the healthiest approach is the one that aligns best with your team’s understanding, resources, and experimentation needs. 

Frequentist methods offer well-established, straightforward procedures, making them easier to communicate and implement, especially in organizations that value objectivity and simplicity. 

Bayesian methods, while more flexible and nuanced, require a deeper understanding and careful application of priors to be effective. Ultimately, the key takeaway is to choose the method your team can execute properly and consistently.

Whether you prefer the precision of Frequentist stats or the adaptability of Bayesian models, success lies in understanding the strengths and limitations of each—and ensuring your organization is equipped to follow best practices for whichever approach you adopt.

Did you know.

…that whether you rely on the Bayesian method, or the Frequentist method, Omniconvert Explore has you covered?

Our straightforward CRO tool runs on both methods, allowing you to draw accurate conclusions from any experiment, no matter your sample size.

Try Omniconvert Explore and take your A/B testing to the next level!