A/B testing

Suppose you’re running an A/B test to determine whether a web page produces more sales with one graphic versus another. You plan to randomly assign image A or B to 1,000 visitors to the page, but after only randomizing 500 visitors you want to look at the data. Is this OK or not?

Of course there’s nothing morally or legally wrong with looking at interim data, but is there anything statistically wrong? That depends on what you do with your observation.

There are basically two statistical camps, Frequentist and Bayesian. (There are others, but these two account for the lion’s share.) Frequentists say no, you should not look at interim data, unless you apply something called alpha spending. Bayesians, on the other hand, say go ahead. Why shouldn’t you look at your data? I remember one Bayesian colleague mocking alpha spending as being embarrassing.

So who is right, the Frequentists or the Bayesians? Both are right, given their respective criteria.

If you want to achieve success, as defined by Frequentists, you have to play by Frequentist rules, alpha spending and all. Suppose you design a hypothesis test to have significance level α, and you look at the data midway through the experiment. If the results are conclusive at the midpoint, you stop early. This procedure does not have significance level α. You would have to require stronger evidence for early stopping, as specified by alpha spending.
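
To see why, here is a minimal simulation sketch (not from the original post; it assumes Python with NumPy and SciPy, equal true conversion rates of 10% in both arms, 500 visitors per arm, and one unadjusted interim look halfway through). Stopping whenever the interim z-test looks significant pushes the overall false-positive rate above the nominal 5%.

```python
# Simulation sketch: under a true null (both variants convert at the same
# rate), "peeking" at the halfway point and stopping when the interim test
# is significant inflates the type I error above the nominal alpha.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
alpha = 0.05
p_true = 0.10          # assumed conversion rate, identical for A and B
n_per_arm = 500        # 1,000 visitors total, split evenly
n_interim = 250        # interim look after 500 visitors total
z_crit = norm.ppf(1 - alpha / 2)

def z_stat(a, b):
    """Two-proportion z statistic for equal-sized boolean arrays a and b."""
    n = len(a)
    p_pool = (a.sum() + b.sum()) / (2 * n)
    se = np.sqrt(p_pool * (1 - p_pool) * 2 / n)
    return 0.0 if se == 0 else (a.mean() - b.mean()) / se

n_sims, false_positives = 10_000, 0
for _ in range(n_sims):
    a = rng.random(n_per_arm) < p_true    # variant A conversions
    b = rng.random(n_per_arm) < p_true    # variant B conversions (same rate)
    # "Reject" if the test is significant at the peek OR at the final analysis.
    if abs(z_stat(a[:n_interim], b[:n_interim])) > z_crit or \
       abs(z_stat(a, b)) > z_crit:
        false_positives += 1

print(f"Nominal alpha: {alpha:.3f}")
print(f"Type I error with one unadjusted peek: {false_positives / n_sims:.3f}")
# Typically prints roughly 0.08, i.e. above the nominal 0.05.
```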

The Bayesian interprets the data differently. This approach says to quantify what is believed before conducting the experiment in the form of a prior probability distribution. Then after each data point comes in, you update your prior distribution using Bayes’ theorem to produce a posterior distribution, which becomes your new prior distribution. That’s it. At every point along the way, this distribution captures all that is known about what you’re testing. Your planned sample size is irrelevant, and the fact that you’ve looked at your data is irrelevant.
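
For a conversion rate, this updating is especially simple because the Beta distribution is conjugate to the Binomial likelihood. The sketch below uses illustrative counts (not data from the post) and a uniform Beta(1, 1) prior; it shows that the posterior after 1,000 visitors is the same whether you update in two batches of 500 or all at once, which is why looking along the way costs nothing.

```python
# Beta-Binomial sketch (illustrative counts): a Beta prior on the conversion
# rate is updated by adding conversions to alpha and non-conversions to beta.
from scipy.stats import beta

prior_a, prior_b = 1, 1                  # uniform Beta(1, 1) prior

def update(a, b, conversions, visitors):
    """Posterior Beta parameters after observing the given counts."""
    return a + conversions, b + (visitors - conversions)

# Update after the first 500 visitors (say 60 conversions), then again
# after 500 more (say 55 conversions) ...
a1, b1 = update(prior_a, prior_b, 60, 500)
a2, b2 = update(a1, b1, 55, 500)

# ... or update once with all 1,000 visitors. The posterior is identical,
# so the interim look changes nothing.
a_once, b_once = update(prior_a, prior_b, 115, 1000)
assert (a2, b2) == (a_once, b_once)

posterior = beta(a2, b2)
print(f"Posterior mean: {posterior.mean():.4f}")
print(f"95% credible interval: {posterior.interval(0.95)}")
```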

Now regarding your A/B test, why are you looking at the data early? If it’s simply out of curiosity, and cannot affect your actions, then it doesn’t matter. But if you act on your observation, you change the Frequentist operating characteristics of your experiment.

Stepping back a little, why are you conducting your test in the first place? If you want to make the correct decision with probability 1 − α over an infinite sequence of experiments, then you need to take alpha spending into account, or else you will lower that probability.

But if you don’t care about a hypothetical infinite sequence of experiments, you may find the Bayesian approach more intuitive. What do you know right at this point in the experiment? It’s all encapsulated in the posterior distribution.
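
As an illustration of what that summary can look like (made-up interim counts, Python with NumPy assumed), the posterior probability that B beats A at the halfway point can be read off by sampling from the two posteriors:

```python
# Sketch with illustrative numbers: estimate P(B has a higher conversion
# rate than A | data so far) by sampling from the two Beta posteriors.
import numpy as np

rng = np.random.default_rng(1)

# Made-up interim counts after 500 total visitors (about 250 per arm).
conv_a, n_a = 25, 250
conv_b, n_b = 34, 250

# Beta(1, 1) priors updated with the observed counts, then sampled.
samples_a = rng.beta(1 + conv_a, 1 + n_a - conv_a, size=100_000)
samples_b = rng.beta(1 + conv_b, 1 + n_b - conv_b, size=100_000)

print(f"P(B beats A | data so far) ~ {(samples_b > samples_a).mean():.3f}")
```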

Suppose your experiment size is a substantial portion of your potential visitors. You want to experiment with graphics, yes, but you also want to make money along the way. You’re ultimately concerned with profit, not publishing a scientific paper on your results. Then you could use a Bayesian approach to maximize your expected profit. This leads to things like “bandits,” so called by analogy to slot machines (“one-armed bandits”).
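
One common bandit strategy is Thompson sampling: show each visitor the variant that looks best in a random draw from the current posteriors, so traffic drifts toward the better graphic as evidence accumulates. Here is a minimal sketch with an assumed setup and made-up true rates, not something from the post:

```python
# Minimal Thompson-sampling sketch: each visitor sees the variant whose
# posterior draw is highest, trading exploration against profit.
import numpy as np

rng = np.random.default_rng(2)
true_rates = {"A": 0.10, "B": 0.12}       # unknown in practice; assumed here
post_a = {"A": 1, "B": 1}                 # Beta alpha parameters
post_b = {"A": 1, "B": 1}                 # Beta beta parameters

shown = {"A": 0, "B": 0}
for _ in range(10_000):                   # visitors
    # Sample a plausible conversion rate for each variant from its posterior.
    draws = {v: rng.beta(post_a[v], post_b[v]) for v in true_rates}
    choice = max(draws, key=draws.get)    # show the variant with the best draw
    shown[choice] += 1
    converted = rng.random() < true_rates[choice]
    # Bayesian update for the variant that was shown.
    post_a[choice] += converted
    post_b[choice] += 1 - converted

print("Visitors shown each variant:", shown)
# Most traffic typically ends up on B, the truly better variant.
```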

If you want to keep things simple, and if the experiment size is negligible relative to your expected number of visitors, just design your experiment according to Frequentist principles and don’t look at your data early.
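
That simple route might look like the following sketch (illustrative final counts, Python with NumPy and SciPy assumed): fix the sample size up front, collect all 1,000 visitors, and run a single two-proportion z-test at the end.

```python
# Sketch of the simple Frequentist route: one analysis at the end, at
# significance level 0.05, with no interim looks.
import numpy as np
from scipy.stats import norm

conv_a, n_a = 52, 500      # made-up final counts for variant A
conv_b, n_b = 71, 500      # made-up final counts for variant B

p_a, p_b = conv_a / n_a, conv_b / n_b
p_pool = (conv_a + conv_b) / (n_a + n_b)
se = np.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
z = (p_b - p_a) / se
p_value = 2 * norm.sf(abs(z))            # two-sided test

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
# Reject the null of equal conversion rates if p_value < 0.05.
```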

But if you have a good business reason to want to look at the data early, not simply to satisfy curiosity, then you should probably interpret the interim data from a Bayesian perspective.

I’d recommend taking either a Frequentist approach or a Bayesian approach, but not complicating things by hybrid approaches such as alpha spending or designing Bayesian experiments to have desired Frequentist operating characteristics. The middle ground is more complicated and prone to subtle mistakes. We have a lot of experience navigating the middle ground, and we can help you with that if you need to go there, but we’d recommend keeping things simple if that’s feasible.

Related posts

  • A/B testing and interaction effects
  • A Bayesian view of Amazon resellers

The post Can you look at experimental results along the way or not? first appeared on John D. Cook.
