Management And Accounting Web

Kohavi, R. and S. Thomke. 2017. The surprising power of online experiments: Getting the most out of A/B and other controlled tests. Harvard Business Review (September/October): 74-82.

Summary by James R. Martin, Ph.D., CMA
Professor Emeritus, University of South Florida


Many leading companies are using an "experiment with everything" approach to produce large payoffs. The purpose of this article is to share the authors' lessons learned from many years of conducting experiments and advising companies about how to design and execute them, ensure their integrity, and interpret their results. Rigorous online experiments should be used by all firms to transform decision making into a scientific, evidence-driven process.

Appreciate the Value of A/B Tests

An A/B test consists of two experiences: "A," the control, and "B," the treatment, where the treatment is a modification that attempts to improve something. Users are randomly assigned to the experiences, and the results are compared and analyzed. Any company with a few thousand daily online users can conduct A/B tests to improve its web-related decisions. Online, very small changes (in loading speeds, colors, etc.) can have very large effects. For example, experiments conducted by Bing revealed that slightly darker blues and greens in titles and a slightly lighter shade of black in captions improved users' experience. Everyone was skeptical, so the experiment was replicated on a large scale (32 million users) with similar results. When the color changes were rolled out to all users, they boosted revenue by more than $10 million annually.
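The comparison itself is standard statistics. Below is a minimal sketch in Python of how such a comparison might be analyzed, assuming hypothetical user and conversion counts for each experience (a textbook two-proportion z-test, not the authors' own tooling):

from statistics import NormalDist

def ab_test(users_a, conv_a, users_b, conv_b):
    """Return the treatment's lift over the control and a two-sided p-value."""
    p_a, p_b = conv_a / users_a, conv_b / users_b
    pooled = (conv_a + conv_b) / (users_a + users_b)  # pooled conversion rate
    se = (pooled * (1 - pooled) * (1 / users_a + 1 / users_b)) ** 0.5
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_b - p_a, p_value

# Hypothetical example: 1.0% vs. 1.2% conversion, 50,000 users per group.
lift, p = ab_test(50_000, 500, 50_000, 600)
print(f"lift={lift:.4f}, p={p:.4f}")  # small effects become detectable at scale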

Small website changes can create a huge impact

Build a Large-Scale Capability

Only a small number of experiments will pay off, so the idea is to test everything. But scientifically testing everything requires an infrastructure. Although many third-party A/B testing tools and services are available, leading companies (e.g., Amazon, Facebook, Google, and Microsoft's Bing) integrate the testing capability into their own processes. Experimentation personnel can be organized using a centralized model, a decentralized model, or a center-of-excellence model that combines a centralized function with specialists located in business units. Small companies typically use a centralized model, or start with a third-party testing tool.1

Address the Definition of Success

The idea is to define short-term metrics that predict long-term results. Key metrics might include web traffic, search engine queries, number of clicks, ad revenue, or online sales. Supporting metrics should also be tracked, e.g., which parts of a web page get user clicks.
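Since no single number captures success, short-term metrics are often combined into one score for comparing treatment to control. A hedged sketch of that idea (the metric names and weights below are hypothetical, not from the article):

def success_score(metrics, weights):
    """Weighted combination of short-term metric ratios (treatment / control)."""
    return sum(weights[name] * value for name, value in metrics.items())

# Hypothetical weights reflecting how well each metric predicts long-term value.
weights = {"sessions_per_user": 0.4, "clicks_per_session": 0.3, "revenue_per_user": 0.3}
# Each value is the treatment's metric expressed as a ratio to the control's.
treatment = {"sessions_per_user": 1.02, "clicks_per_session": 0.98, "revenue_per_user": 1.05}
print(success_score(treatment, weights))  # above 1.0 suggests an overall improvement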

Beware of Low-Quality Data

Experiments need checks and safeguards. Some methods include A/A tests, where a treatment is tested against itself, and replicating tests to make sure they are valid. Test data can be distorted by internet bots (i.e., web robots that perform automated tasks), outliers (e.g., massive book orders placed by libraries), heterogeneous treatment effects (e.g., from participants using different browsers), carryover effects (where control and treatment groups are reused), and a sample ratio mismatch between the control and treatment groups.
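The last of these safeguards is easy to automate. A minimal sketch of a sample ratio mismatch check, assuming a planned 50/50 split and hypothetical group counts (a normal approximation to the binomial; a chi-squared test is a common alternative):

from statistics import NormalDist

def srm_p_value(n_control, n_treatment, expected_ratio=0.5):
    """Two-sided p-value that the observed split matches the designed split."""
    n = n_control + n_treatment
    se = (expected_ratio * (1 - expected_ratio) / n) ** 0.5
    z = (n_control / n - expected_ratio) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical counts: the split has drifted well away from 50/50.
p = srm_p_value(50_912, 49_578)
print(f"SRM p-value: {p:.2e}")  # far below 0.001, so distrust this run's results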

Avoid Assumptions About Causality

The assumption made by some executives that correlation implies causality is incorrect. Randomized tests, not observational studies, are needed to establish causality. Analysts should also go beyond results indicating that one thing causes another and, where possible, find out why. But not knowing the why does not keep a company from benefiting from knowledge of the cause.
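What makes the causal claim valid is the randomization itself: because assignment is unrelated to any user characteristic, differences between the groups can be attributed to the treatment. A minimal sketch of one common assignment approach, deterministic hash-based bucketing (the user id and experiment name here are hypothetical):

import hashlib

def assign(user_id: str, experiment: str) -> str:
    """Deterministically bucket a user into control or treatment."""
    digest = hashlib.md5(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100  # pseudo-random but stable per user
    return "treatment" if bucket < 50 else "control"

print(assign("user-12345", "title-color-test"))  # same answer on every visit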

Combining software with scientifically controlled experiments can help companies develop the experimental capability necessary to reap returns in improved user experience, cost savings, increased revenue, and competitive advantage.

_______________________________________________________

Footnote:

1 For many A/B and multivariate testing tools, see MAAW's Experimental Research Tools and Links.

Related summaries:

Anderson, E. T. and D. Simester. 2011. A step-by-step guide to smart business experiments. Harvard Business Review (March): 98-105. (Summary).

Appelbaum, D., A. Kogan and M. A. Vasarhelyi. 2017. An introduction to data analysis for auditors and accountants. The CPA Journal (February): 32-37. (Summary).

Appelbaum, D., A. Kogan, M. Vasarhelyi and Z. Yan. 2017. Impact of business analytics and enterprise systems on managerial accounting. International Journal of Accounting Information Systems (25): 29-44. (Summary).

Davenport, T. H. 1998. Putting the enterprise into the enterprise system. Harvard Business Review (July-August): 121-131. (Summary).

Davenport, T. H. 2009. How to design smart business experiments. Harvard Business Review (February): 68-76. (Summary).

Davenport, T. H. and J. Glaser. 2002. Just-in-time delivery comes to knowledge management. Harvard Business Review (July): 107-111. (Summary).

Spear, S. J. 2004. Learning to lead at Toyota. Harvard Business Review (May): 78-86. (Summary).

Spear, S. and H. K. Bowen. 1999. Decoding the DNA of the Toyota production system. Harvard Business Review (September-October): 97-106. (Summary).

Thomke, S. and J. Manzi. 2014. The discipline of business experimentation. Increase your chances of success with innovation test-drives. Harvard Business Review (December): 70-79. (Summary).

Tschakert, N., J. Kokina, S. Kozlowski and M. Vasarhelyi. 2017. How business schools can integrate data analytics into the accounting curriculum. The CPA Journal (September): 10-12. (Summary).