People are drawn to Alexa, Siri and other voice assistants because they simplify their lives. Brands can appeal to this sensibility by setting a voice strategy that values function over form.
Your next great brand ambassador lives inside a glowing cylinder roughly the size of a tuna can. The Amazon Echo Dot carries the hypnotic, homorhythmic voice of a semi-cyborgian woman and has access to the internet and the internet of things, performs household operations and delivers up-to-the-minute analytics about how customers are interacting with your brand. It may rest on kitchen counters and recite recipes, or on the bedside table to play white noise for light sleepers. Power users likely own multiple devices scattered throughout the home, which chat with one another over Wi-Fi. All are equally capable of placing product orders based on user preferences or of singing “Happy Birthday.” Part market maven, part consumer confidant and totally to-the-point: Users aren’t wasting any time by picking up their phones or performing a web search, and they expect quick, time-saving answers.
Amazon’s Alexa and other voice assistants have seen recent spikes in usage rates. According to predictions made by eMarketer, 111.8 million Americans (33.8% of the total population) will use a voice assistant in 2019 at least monthly, accounting for 39.4% of internet users. As for the access devices themselves, Voicebot.ai reported that 66.4 million Americans own smart home speakers as of 2018. But also consider that Statista estimates there are 265.9 million smartphone users in the U.S., and all of those devices ship standard with some form of voice assistant. These numbers illustrate a juggernaut industry trend that marketers can no longer afford to ignore.
Voice technology offers a vast world to explore, but brands benefit most by thinking small and contributing only basic functionality. Trends in consumer behavior reveal that brands that accomplish simple tasks on voice elegantly, effectively and quickly—such as allowing customers to reorder a favorite Starbucks coffee by uttering only a few words—receive better reviews and a steady rise in brand equity.
What’s unique about Alexa’s technology is that it’s tethered to Amazon-produced speakers and not available for download in app stores or included standard in phone purchases in the way Siri has been synonymous with iOS since 2011. Customers are consciously welcoming this voice assistant into their homes and making it privy to intimate conversations—all in the name of easing simple tasks like setting timers and adjusting smart thermostats. It’s therefore worth analyzing Alexa to better understand what drives its infiltration and adoption.
Two key factors motivate people to embrace Alexa’s presence in the home: simplicity and consistency. Alexa is ready to rock right out of the box. Questions asked of Alexa receive instant answers, or at least approximations. Regardless of what Alexa is doing, it maintains the same voice, cadence and tone. Brands therefore benefit most by interfacing with users in a way that values function over form: Perform one task well and ask for minimal user input aside from a one-line command.
“Branding is voice-based, so there’s not necessarily colors or labels or content, and it needs to be short and concise,” says Keval Baxi, CEO of UX and design agency Codal. “Brand loyalty is almost deterred to convenience. … It’s the race to better customer experience.”
One of Alexa’s key features is its ability to order products directly from e-commerce sites, which means users tell the device what to add to their shopping cart. Some brands have a head start on establishing themselves in voice because their name has become the generic term for the product—consider brands such as Kleenex or Ziploc. It’s highly unlikely a consumer would ask their voice assistant to “reorder facial tissues” or “sandwich bags,” as the two big-name brands have become synonymous with those products. This places Puffs and Glad at a disadvantage right off the bat.
Short of becoming Xerox, brands can get ahead by developing what’s known as a “skill,” essentially a third-party voice app built by the brand itself to execute basic functions. Customers look to skills to personalize their device, and the branded skills they choose will find a permanent home inside their Echo Dot. A staggering number of brands have arrived at that space. As of January, Alexa boasts 80,000 skills, from brands like Spotify and Jeopardy!, and Google Assistant has 4,200 actions—its version of a skill—according to Voicebot.ai.
Dennis Maloney, senior vice president and chief digital officer of Domino’s, attributes the success of his company’s highly ranked skill to understanding that his customers want pizza, not frills. “Customers come to Domino’s because they’re hungry and trying to order food,” he says. “Functionality is way more important for a consumer.” Using Alexa with the Domino’s skill, for example, customers can place a new order from scratch, repeat past orders or ask for a delivery status update using the pizza tracker function.
Donna Hoffman, marketing professor at George Washington University, notes that brands win big with expediency as well, with consumers even weighing functionality or ease of use over cost. “You might start to use [voice assistants] for things that you previously didn’t trust before, like to reorder products,” Hoffman says. “And if it doesn’t tell you the price, that’s pretty trusting. What will happen is we cede a lot of control to these devices over time and they become more autonomous and independent.”
When reached for comment, an Amazon spokesperson echoed that sentiment and confirmed that the company hopes to transform science fiction into technological fact: “Our goal is to shape [Alexa] through the lens of the customer and the holistic customer experience. … [We] want her to understand, anticipate and react to our customers based on their preferences, interests and needs.”
Currently, those needs are pretty simple. A look at Alexa’s top skills reveals that most of them perform one function very well. Skills from NPR and Reuters deliver quick briefings on the day’s top headlines, and Find My Phone does just that. Another top skill simply plays rain sounds.
Keith Soljacich, vice president and group director of experiential technology at Digitas, says that because consumers are currently making simple requests of their voice assistants, all brands can benefit by introducing even the most pared-down skill. “One thing I know that the platforms have done is they’ve made it easier for brands to come onto the platform and make sure you can get some basic information about your brand, your products and services,” he says. “You can’t be part of the game unless you actually play.”
SEO is crucial for brand success on Alexa, and the fight for the limited real estate is especially cutthroat. To win the optimization battle, brands must first get into the mind of Alexa to understand how searches are parsed. Soljacich notes that people have gotten used to how they interact with text-based engines to receive optimal results. For example, you may type “best cheeseburger Racine Wisconsin hours,” but when asking out loud, you’d say something to the effect of, “What’s a restaurant near Racine, Wisconsin, that’s open now and serves the best cheeseburger?” The results Alexa returns are not chosen based on the keywords of the voice query but on what’s called an “intent”—a translation of the voice request into the types of truncated word strings you might type into a search bar.
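To make the idea of an intent concrete, here is a minimal sketch in Python that turns the spoken cheeseburger request above into a structured intent and then into the kind of truncated keyword string a person might type. It is an illustration only: the `RestaurantIntent` class, the `parse_restaurant_request` function and the pattern-matching rules are hypothetical, and nothing here reflects how Alexa actually parses speech.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class RestaurantIntent:
    """Hypothetical structured form of a spoken restaurant request."""
    dish: Optional[str]
    place: Optional[str]
    open_now: bool

    def as_keywords(self) -> str:
        # Rebuild the truncated string a person might type into a search bar.
        parts = []
        if self.dish:
            parts.append(f"best {self.dish}")
        if self.place:
            parts.append(self.place)
        if self.open_now:
            parts.append("open now")
        return " ".join(parts)


def parse_restaurant_request(utterance: str) -> RestaurantIntent:
    # Assumption: simple patterns stand in for real speech understanding.
    # Only single-word dishes and "City, State" places are recognized.
    text = utterance.lower()
    dish = re.search(r"\bbest\s+([a-z]+)", text)
    place = re.search(r"\bnear\s+([a-z]+(?:,\s*[a-z]+)?)", text)
    return RestaurantIntent(
        dish=dish.group(1) if dish else None,
        place=place.group(1).replace(",", "") if place else None,
        open_now="open now" in text,
    )


spoken = ("What's a restaurant near Racine, Wisconsin, "
          "that's open now and serves the best cheeseburger?")
intent = parse_restaurant_request(spoken)
print(intent.as_keywords())  # best cheeseburger racine wisconsin open now
```

The point of the sketch is that the long spoken sentence and the short typed query collapse into the same few fields, which is why optimizing for the intent matters more than matching the exact words a customer says.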
Soljacich says that good SEO for voice resembles good SEO for web search: Focus on optimizing for the intent, and don’t try to compete on basic functions Alexa already handles. A company gains little by producing a skill that reads the weather forecast when Alexa is perfectly capable of doing so itself. He also says that the process by which Alexa determines which skill to open, like when ordering a pizza, is still somewhat of a “black box” for brands.
Dan Golden, president of digital marketing agency Be Found Online, says that in order to rank for voice, optimization needs to be relentless. Landing on the first page of Google or Bing results in a browser is an admirable achievement, but unless your company finds itself at No. 1 or is included as the featured snippet at the top of a page, your organization will get completely lost on voice assistants, which often return only the top result.
“That’s a whole new world for marketers, where you either win or lose,” Golden says. “I’ve been making brands loads of money for years by targeting position two and three because it’s more economical, but in the case of voice search, you have to be that answer.”
Even though Maloney admits Domino’s isn’t currently focused on SEO—he believes the industry best practices are still sorting themselves out—consumers are nevertheless discovering his company in pursuit of a better ordering experience beyond the top search result. “Our experience from a functionality standpoint is where we differentiate ourselves,” he says. “I’m really not interested in matching competition.”
A digital assistant with access to living spaces and the ability to spend users’ money raises obvious privacy concerns. A 2018 study by PwC found that the majority of people who don’t use voice assistants—and don’t plan to start—cited concerns for how their data was going to be mined and used. Of those surveyed, 38% cited the creep factor of having a device listening in the background, and 28% explicitly said that privacy regarding their data and security was the main reason they’d continue to abstain.
But these assistants are expected to further infiltrate every facet of our lives. In September, Amazon announced a slate of forthcoming products. In the near future, people will be able to access Alexa via earbuds, glasses, a ring, a nightlight, a smart oven and miniature cameras for the home. The company also previewed a collar tag for the family dog (named Fetch, which runs on its new Amazon Sidewalk network).
At the same event, Amazon addressed the elephant in the room, or rather the elephant ears in the room that monitor user activity. The company introduced two Alexa commands meant to show more overtly that people can trust the company to have their best interests at heart. Now users can say, “Alexa, tell me what you heard” and “Alexa, why did you do that?” to understand the source of any action Alexa took, whether the command behind it was intentional or not. This helps get to the root of a mispronunciation. Consumers also maintain the ability to delete any recordings whenever they’d like.
Privacy concerns aren’t unique to Alexa, but Amazon made headlines in April when Bloomberg reported that thousands of its employees around the world have been listening to recordings and annotating them in an attempt to improve the voice-recognition software. In particular, they work to eliminate accidental activation and ensure that Alexa is correctly answering questions it’s asked.
In a statement to Bloomberg, an Amazon spokesperson allayed fears of a forthcoming nanny state by clarifying just how hard it would be to abuse the system. “Employees do not have direct access to information that can identify the person or account as part of this workflow,” the spokesperson wrote in an email. “All information is treated with high confidentiality and we use multi-factor authentication to restrict access, service encryption and audits of our control environment to protect it.”
Hoffman counters that there are still issues that go beyond simple mining to further Alexa’s speech evolution. “These things come into your home in your most private spaces, when you’re in your intimate life with you and your family, and you start to think of them as ancillary family members,” she says. “That can be really dangerous. Now there’s research looking at what these devices are doing even when you’re not interacting with them. They’re connecting with other devices and sending information about you and your behaviors without your permission—and that’s pretty intrusive. We’re interested in what people’s thoughts on those sorts of things might be depending on how they view the device. If you see it as a family member, you might say it’s OK. But if you see it as an AI who’s out to maximize ad targeting, you might not think it’s OK.”
Jim Mourey, a researcher and marketing professor at DePaul University, says the data could also be mined and run through sentiment analysis, which is when companies scrape people’s social media feeds to piece together a rough idea of users’ general emotional state. “You can imagine the same thing happening, and it probably already is, of assessing the emotional state of users of voice-activated assistants,” he says. Brands could benefit from this data by aligning themselves with particular emotional states. The example he gives is if a user sounds distressed or overwhelmed, Alexa could suggest ordering a pizza and some ice cream.
Currently, brands are afforded access to customer data, but on a limited scale. Amazon maintains a “skill metrics” section for brands that have written a skill, which keeps tabs on skill-specific information. For example, a brand can learn how many times its skill was invoked, what sorts of phrases customers said while interacting with it and basic demographic information about users.
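As a rough illustration of what a brand might do with this kind of data, the Python sketch below tallies invocations and the most common customer phrase from a small interaction log. The record format, the intent names and the `summarize` function are all hypothetical; they are not Amazon’s actual skill metrics format or API.

```python
from collections import Counter

# Hypothetical interaction records. The field names and values are assumptions
# made for illustration, not the real shape of Amazon's skill metrics.
interactions = [
    {"intent": "RepeatOrder", "phrase": "order my usual"},
    {"intent": "TrackOrder", "phrase": "where is my pizza"},
    {"intent": "RepeatOrder", "phrase": "order my usual"},
    {"intent": "NewOrder", "phrase": "start a new order"},
]


def summarize(records):
    """Count total invocations, invocations per intent and the top phrase."""
    intents = Counter(r["intent"] for r in records)
    phrases = Counter(r["phrase"] for r in records)
    return {
        "invocations": len(records),
        "by_intent": dict(intents),
        "top_phrase": phrases.most_common(1)[0][0] if phrases else None,
    }


summary = summarize(interactions)
print(summary)
```

A summary like this would show a brand which of its functions customers actually use and how they phrase their requests, which is the sort of signal that can guide refinements to a skill’s intents.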
Maloney says Domino’s is able to use this data to better its voice strategy by homing in on its intents to make sure they are close to perfect. “We try and create as transparent an experience as possible and give customers the choice to use [our skill] or not—it is definitely not the only way to order Domino’s,” he says. “There is a level of privacy that consumers do give up, and that is really a result of just making the functionality better. [Otherwise], it’s really hard for the experience to get any better. The expectation should be that [data] is not being used for malicious things.”
The way users feel toward their voice assistant, and their fear that the technology may be too intrusive, can affect how they view the brands alongside it. “I think we generalize our overall experience of engaging with Alexa, which then could have negative ramifications for a company or brand,” Mourey says. “If the experience is negative, [there’s] kind of a halo effect, like we spread it out to whatever it is that she’s talking about.”
Brands also face an inherent bias that consumers hold against voice assistants. Mourey demonstrated how high the expectations are for Alexa in a recent study. His research team isolated three groups and asked each to make weather predictions: real flesh-and-bone meteorologists, apps on mobile devices and voice assistants such as Siri and Alexa. The results showed that when all three groups accurately predicted the forecast, there wasn’t much difference in how people felt about each. When the forecast was predicted incorrectly, people tended to offer excuses for the humans and the mobile apps, claiming mistakes are made via human error or miscalculations. However, they tended to get angry at voice assistants when they were wrong.
“We make immediate social judgments of people based on two dimensions,” Mourey adds. “One is how warm they are, and the other is on competence. No surprise, when you ask humans to assess the warmth and competence of other humans, they tend to say that humans are high on both because they otherwise would be basically saying that they themselves aren’t very warm or competent. When you look at apps, people [think they are] super competent but not particularly warm. But with Alexa and Siri and all these other digital voices, they kind of have to be the best of both worlds because they’re attempting to emulate a human.”
This phenomenon was on display last year when Google demonstrated its forthcoming Duplex technology. In front of a crowd, Google Assistant placed a phone call to a hair salon to make an appointment for a user, negotiating in real time with the receptionist to pick out an ideal time that jibed with the user’s calendar. The technology on display was certainly impressive, but notably the speech pattern Google Assistant used was peppered with conversational tics like “ums” and “ahs” to more accurately match how a person would speak.
It’s an example of why many worry about the ethical undertones of this technology. Mourey wonders what rules will be followed as far as letting the person on the other end of the line know that they are interfacing with a robot. Soljacich says that this kind of technology must consider different dialects, and that accessibility for folks with speech disabilities can get complicated. One Vox reporter was unimpressed with a demonstration of Google Duplex, noting that the technology kicked him to a real operator who works on behalf of Google when the AI failed.
True responsive voice intelligence is still a ways away, but there’s plenty to optimize for today. Domino’s is exploring the possible use of voice assistants in its physical locations, but Maloney maintains equal focus on how his brand can deliver hassle-free value for users right now, and encourages other brands to strike a similar balance between the present and future. “Find the one or two things that you can turn into a voice experience that create real, tangible benefit for your consumers. Get those right, and you can start building from there,” he says. “[At the same time,] recognize that this is not a short-term or a one-time investment; you are getting into something for the long haul, which has a long way to go before it replaces our other interfaces as the interface of choice. There is a long-term investment involved in part of that process, hopefully with good benefits at the end.”
Photos by Vince Cerasani.