Voice has been one of the most successful technologies of the last decade, especially when it comes to the smart home. Thanks to innovations in natural language processing (NLP) between 2012 and 2015, voice became ubiquitous on our phones and in our homes. But now, with news that Amazon is downsizing its Alexa business and that Google is questioning what it can eke out of its Google Assistant, it’s time to take a hard look at how to make money in voice and what the news about Amazon and Google’s struggles means for the smart home.
First up, voice and the smart home are related but entirely separate. Siri launched in 2011. IBM’s Watson was also playing Jeopardy back then. Thanks to hard work on speech-to-text and NLP software, we could talk to our phones and have them understand us, both to transcribe what we said and to complete programmed tasks. But speech wasn’t transformative on the phone, partly because the phone already had a convenient and established user interface in touch and tapping. Many people were impressed, but talking to your phone to set an alarm or a reminder was still clunky and not in widespread use. It was a trick, not a transformational technology for most.

Voice is a UI, not a platform
But the need for a new interface became clear when we started adding connected devices to our homes. I saw it back in 2014 when Amazon launched Alexa because I already had a home full of devices by then. Others were not so sure. Even Kevin doubted my enthusiasm. Voice was intrusive and still somewhat glitchy. When Amazon launched its smart home capabilities in the spring of 2015, voice finally found its killer app. Or so everyone thought.
But voice isn’t the smart home. And Amazon’s Alexa layoffs and losses and Google’s issues aren’t an indictment of voice. They merely show that no one has figured out how to monetize a digital assistant. Monetizing voice is a low-margin endeavor. Voice is the user interface and the digital assistant is the platform. It’s like we’re confusing the ability to touch to navigate our phones with an app store.
And because voice is going to be an essential way people communicate with ubiquitous computers, we have to get voice right. It’s not going anywhere. But trying to make money on it outside the traditional ways companies monetize user interfaces is a mistake. Logitech sells keyboards and mice. Apple and Google have operating systems that translate taps on touchscreens into instructions. Amazon, Google, and other companies can sell far-field microphones embedded with NLP software to provide voice.
Divorcing voice from the platform
But unlike touchscreens, voice has high barriers to standardization and understanding of intent that make it more difficult to divorce from a platform or OS. Alex Capecelatro, the CEO of Josh.ai, a company that builds a voice interface for custom integrators, points out that with voice, there are two layers of communication. The first is the actual words, and the second is the intent.

“When you’re dealing with apps, you have a dedicated destination you want to get to,” he said. “With voice, how do you prioritize words and voice commands? How does the system do the right thing and how does the user have control where a command goes?” In his example, asking a voice interface to turn off the light requires the interface to know what to do, but also what program to ask to do that. Today voice interfaces handle intent in two ways. In some cases, such as knowing which light to turn off, they rely on integrations such as Amazon’s Alexa Skills. In others, such as setting an alarm or answering a factual question, the assistant chooses its own source of information or action.
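To make the two layers concrete, here is a minimal sketch in Python of how words might become an intent and how that intent might be routed. It is an illustration only, not how Alexa, Google Assistant, or Josh.ai actually work. The names `Intent`, `parse_intent`, and `Router` are hypothetical, and the keyword matching stands in for real NLP.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class Intent:
    action: str            # what the user wants done, e.g. "turn_off"
    target: Optional[str]  # what it applies to, e.g. "light"


def parse_intent(transcript: str) -> Optional[Intent]:
    """Layer two: map recognized words to an intent (toy keyword rules)."""
    words = transcript.lower().strip(" .?!").split()
    text = " ".join(words)
    if text.startswith("turn off"):
        # Assumption: the last word names the device being controlled.
        target = words[-1] if len(words) > 2 else None
        return Intent("turn_off", target)
    if text.startswith("set an alarm"):
        return Intent("set_alarm", None)
    return None


Handler = Callable[[Intent], str]


class Router:
    """Decides which program receives a command."""

    def __init__(self) -> None:
        # Third-party integrations, keyed by (action, target).
        self._integrations: Dict[Tuple[str, Optional[str]], Handler] = {}
        # Tasks the assistant handles itself, keyed by action only.
        self._builtins: Dict[str, Handler] = {}

    def register_integration(self, action: str, target: str, handler: Handler) -> None:
        self._integrations[(action, target)] = handler

    def register_builtin(self, action: str, handler: Handler) -> None:
        self._builtins[action] = handler

    def dispatch(self, transcript: str) -> str:
        intent = parse_intent(transcript)
        if intent is None:
            return "Sorry, I did not understand that."
        # Prefer a matching integration, then fall back to a built-in.
        handler = self._integrations.get((intent.action, intent.target))
        if handler is None:
            handler = self._builtins.get(intent.action)
        if handler is None:
            return f"No handler for {intent.action}"
        return handler(intent)


router = Router()
router.register_integration("turn_off", "light", lambda i: "lighting integration: light off")
router.register_builtin("set_alarm", lambda i: "built-in: alarm set")
print(router.dispatch("Turn off the light"))
print(router.dispatch("Set an alarm for 7"))
```

Even in a toy like this, the hard question Capecelatro raises is visible: someone has to decide the order in which handlers are consulted, and the user has little say in where a command ends up.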
So let’s talk about digital assistants

I may be mistaken because of my own reliance on voice (I’m quadriparetic), but I thought that, from the beginning, voice for Amazon was a way to sell more of its other products. It has been successful in that, at least if you look at the demand for Alexa-enabled plugs, bulbs, and everything else that Amazon gets a cut from when they sell on its website. So it was never about making money on the voice assistant itself. If the average person who uses Alexa for voice control ends up buying 10 additional smart devices, that’s 10 more places that Amazon made a little money.
To me, and again, I am looking from my own edge case perspective, so I could be completely wrong on this, Alexa projects have failed most times they have tried to make money from the Echo device itself, whether it was selling movie tickets, or different kinds of specialty Echo devices, or anything other than, well, a UI that a lot of people enjoyed using.
The one exception has been the family monitoring service. I still find it annoying that it doesn’t have a free tier anymore, but it has become so essential for certain families that they will pay the subscription for it. And I assume the Music subscription makes money, but I don’t know for sure.
Other than that, meh. But I know a lot of people who will pay an extra 10% to get a voice-enabled thermostat or door lock or security camera or some other device where there are non-voice-enabled choices as well. I don’t know what the industry term is for those, “client products”? But in any case, selling those at slightly higher prices should be good for Amazon.
They just need to stop trying to build a pickup truck that flies, as one of my engineering professors used to say.
I think you missed a point. Echo gives me freedom to control lights without finding a light switch. I gain freedom to adjust comfort levels without leaving my recliner or walking to a central part of home to adjust a thermostat.
I like freedom.
Like so much of the “smart home” stuff, there seem to be two main markets for voice control: 1) the people who want things only if they are cheap ($50 Amazon Echos) and 2) the people who care not at all about price ($600 Josh.ai Nanos). Anyone stuck in the middle has to deal with those two groups.