Are A/B Tests Ethical?

10/31/2018
Hal Conick
Key Takeaways

What? A/B tests are easy and seem harmless, but many consumers become disturbed when they find out they're being tested without knowing it.

So what? Some argue that A/B testing tracks along the same ethical lines as a product launch; others believe organizations must be transparent about their testing even if it seems harmless.

Now what? Researchers say that companies may want to create landing pages that alert customers about any A/B tests they may have been pulled into.

A/B testing may seem harmless, but many consumers don’t like how easily companies can test them without their knowledge. Should marketers change how they test?

What ethical questions could a simple A/B test raise? What could be wrong with testing how people react to two different campaigns or website colors?  

Michelle Meyer would tell you: not much. She’s an assistant professor and associate director of research ethics at Geisinger’s Center for Translational Bioethics & Health Care Policy and says that she doesn’t see the bounds of A/B testing so differently from the bounds of business ethics. “Selling? Yes. Upselling? Hmm. Advertising? Yes. False advertising? No,” she says. 

But as the amount of data online grows, the line between research and business gets thinner. Companies can now A/B test large groups of consumers, and social media platforms can test even larger numbers of users. Large corporations, including Google, Amazon and Netflix, undertake many A/B tests every day, unknown and unseen by the users being tested. This is unsettling to some; many consumers have loudly—sometimes angrily—spoken out when they realized that they were part of A/B tests undertaken by companies that wield large pools of data.

In January 2012, Facebook tested about 700,000 of its users without telling them. The social platform gave some users a more positive newsfeed and others a more negative one, publishing the results in June 2014, in collaboration with researchers from Cornell University, in the journal Proceedings of the National Academy of Sciences. The resulting paper, titled “Experimental evidence of massive-scale emotional contagion through social networks,” found that users with negative newsfeeds posted more negative words, and those with positive newsfeeds posted more positive words. Many users decried Facebook’s use of “human experimentation.” The chorus of complaints grew so loud that Adam D.I. Kramer, one of the Facebook researchers who worked on the study, apologized.

“The reason we did this research is because we care about the emotional impact of Facebook and the people that use our product,” Kramer wrote in a Facebook blog post. “I can understand why some people have concerns about it, and my co-authors and I are very sorry for the way the paper described the research and any anxiety it caused.”

Meyer found the kerfuffle distasteful—not for what the Facebook and Cornell researchers studied, but for what she thought was a misguided, overheated response from many angry media members and Facebook users. The study’s sample size was huge, but its results were minimal—users who saw negative newsfeeds, for example, posted only four more negative words for every 10,000 words they wrote. What many reported as a giant wave of emotion was more of a minor blip. “People were wrong on the internet and it was annoying,” Meyer says.

She wrote a blog post arguing that Facebook’s testing could feasibly have been approved by an Institutional Review Board—the administrative body that reviews research performed on human subjects—had Facebook practiced slightly better informed consent: notifying potential subjects of active tests and alerting them to potential side effects. Her post was quickly picked up by Wired and shared thousands of times. “We can certainly have a conversation about the appropriateness of Facebook-like manipulations, data mining and other 21st-century practices,” Meyer wrote in the post. “But so long as we allow private entities freely to engage in these practices, we ought not unduly restrain academics trying to determine their effects.”
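To put that figure in perspective, a rough back-of-the-envelope calculation (using the four-words-per-10,000 figure cited above; the ~25-words-per-post assumption is ours, for illustration only) shows just how small the reported shift was:

```python
# Back-of-the-envelope: the reported effect was roughly four extra
# negative words per 10,000 words written by users shown negative feeds.
extra_negative_words = 4
words_written = 10_000

effect = extra_negative_words / words_written
print(f"{effect:.2%}")  # 0.04% -- a minor blip, not a wave of emotion

# Assuming a short status update runs ~25 words, that works out to
# about one extra negative word per 100 posts.
posts_per_extra_word = 1 / (effect * 25)
print(round(posts_per_extra_word))  # ~100 posts
```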

A year later, Meyer wrote a column for The New York Times with Christopher Chabris, an associate professor of psychology at Union College, which editors provocatively titled: “Please, Corporations, Experiment on Us.” Meyer and Chabris wrote that the outrage against Facebook’s testing was a “moral illusion,” a false choice between releasing a product or atmosphere and experimenting with different products or atmospheres.

“Companies—and other powerful actors, including lawmakers, educators and doctors—‘experiment’ on us without our consent every time they implement a new policy, practice or product without knowing its consequences,” they wrote. “We aren’t saying that every innovation requires A/B testing. Nor are we advocating nonconsensual experiments involving significant risk. But as long as we permit those in power to make unilateral choices that affect us, we shouldn’t thwart low-risk efforts … to rigorously determine the effects of those choices. Instead, we should cast off the A/B illusion and applaud them.”

But others disagree and see A/B testing as an ethical risk, whether the tests are run by researchers in a lab or by corporations in the office. Ehud Reiter, a professor of computing science at the University of Aberdeen and chief scientist of Arria NLG, teaches the Facebook study to his students as something to avoid. It was unethical, he says. “I would certainly never accept that as an academic research project,” Reiter says. “It’s not acceptable to me to manipulate people’s emotion. That is not acceptable without informed consent.”

Informed consent is the heart of modern research ethics, Reiter says. In 2017, he struggled with an A/B test proposal for this reason. He was asked to approve a project by a researcher who was working with a real-world service provider; they wanted to test different strategies on different clients, then evaluate the results. Reiter wrote on his website that he knew these real-world tests can be helpful, but he also knew that they exist in ethical gray areas. He wasn’t quite sure how to ask participants for informed consent. If researchers were transparent, they might bias the participants and ruin the test; if the researchers weren’t transparent, he didn’t believe that the testing would be ethical. 

“We decided to go with allowing [participants] to opt out and be transparent after the fact,” he says, meaning the study was explained to participants after they were tested. “Academically, I think that it’s the right thing to do, and I think that companies might also consider going down that path because, otherwise, it’s a danger to blow up in their face.”

Most companies solve this issue by avoiding it. They don’t ask for informed consent from the users whom they A/B test; at most, they inform users of potential tests in agreements users sign when they join. These agreements are filled with fine print and legalese; no one reads them, Reiter says. Users click away without even glancing at the fine print, thereby allowing companies to test their data and send it along to third-party aggregators. Reiter says that researchers need to hold themselves to a higher standard than this, but he believes that businesses should, too. Most companies decide their testing policies in terms of what will or won’t get them sued, he says, but to not account for ethics is to activate a ticking time bomb that he believes will destroy a company’s reputation. 

Ethics don’t lend themselves to binary choices, Meyer says. Ethical marketers, like researchers, will always face tough choices. The Common Rule—the U.S. ethics regulation that governs biomedical and behavioral research involving human subjects—guides researchers toward more ethical choices, but its standards may apply differently to each test. And even then, the Common Rule doesn’t apply to corporations. Companies that can afford ethical consultants might hire them for tough decisions, but those without the budget should still consider the ramifications of their testing.

There are obvious fault lines of A/B testing that marketers will encounter: A/B testing an alcohol brand on the Facebook fans of Alcoholics Anonymous is unethical to its core, whereas A/B testing two different headlines on the same marketing email is as benign as a single drop of rain. As Meyer noted, this is Business Ethics 101. 

There are many reasons why consumers might object to A/B testing, Meyer says: objections to randomization, a feeling of unfairness or inequality, an assumption that businesses already know what will work. But she doesn’t believe that it’s logically consistent to be against A/B testing just because it tests two different things. CEOs often decide to launch one product to market without testing—essentially a “B test.” Meyer says that she doesn’t believe anyone would be angry at the CEO of that company for giving consumers a B test without consent. 

“So why is the moral world shook upside down if half of people get A and half of people get B?” she asks. “That’s a bit of a mystery.”

One change would be more transparency, similar to what Reiter opted for in his academic A/B test. Meyer, like Reiter, says that companies should assuage consumer concerns by creating a landing page that explains ongoing A/B tests—a simple explanation of why the company is showing users multiple versions of the same page. Reiter says that users should also have the right to opt out if they don’t want their data used as part of an A/B test, and that they should be informed as to how their data is being protected.
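A minimal sketch of what honoring such an opt-out might look like in practice (Python; the function name, the `opted_out` store and the experiment ID are hypothetical illustrations, not drawn from any real platform):

```python
import hashlib

# Hypothetical store of user IDs who declined testing via the landing page.
opted_out = {"user-137", "user-401"}

def assign_variant(user_id: str, experiment_id: str) -> str:
    """Deterministically assign a user to variant A or B, unless they opted out.

    Opted-out users always see the control experience (A) and should be
    excluded from the test's metrics. Hashing user_id + experiment_id gives
    a stable 50/50 split without storing per-user assignments.
    """
    if user_id in opted_out:
        return "A"  # control experience; excluded from analysis
    digest = hashlib.sha256(f"{experiment_id}:{user_id}".encode()).digest()
    return "A" if digest[0] % 2 == 0 else "B"

# The same user always lands in the same bucket for a given experiment.
print(assign_variant("user-42", "headline-test"))
print(assign_variant("user-137", "headline-test"))  # opted out -> "A"
```

Deterministic hashing also makes the test auditable: a transparency page can truthfully say which bucket a given user is in, because the assignment never changes between visits.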

The alternative, of course, is to do nothing. This is what most companies do, save for a note when users sign up for their services. If users find out they’re being tested, they may be angry or they may not care. “It’s a dilemma,” Meyer says. 

Reiter says that much of this dilemma can be solved with transparency—informing users of the test. If users complain, he says, the company should change its future tests.

“If you don’t do it then it’s going to blow up in your face sooner or later,” he says. “As an academic, we tell people, ‘You’ve got to [test] properly. And if you try to hide that you are doing it, you’ll eventually get found out, and it will be a lot worse.’” 


ABOUT THE AUTHOR:
Hal Conick
Hal Conick is a staff writer for the AMA’s magazines and e-newsletters. He can be reached at hconick@ama.org or on Twitter at @HalConick.
