For companies that have moved into connected products, most of their questions have been around what modules to build and where to store all the precious data they’ve collected. But speaking at a conference this week in Burlingame, Calif., Google’s Mark Spates offered a few more choices that a manufacturer must make, especially if they plan to build intelligence into their products.
Spates, who is a product lead for Google’s speaker products, started out by saying that most devices nowadays will comprise hardware, software, and artificial intelligence. Even lowly sensors might run basic models that can determine if the sound of glass breaking is from a broken window or a clumsy person who dropped a plate. But if you want to add AI to products, how should you think about their development and purpose?
Spates divided the choices into three categories. The first is whether you want your product to prioritize a centralized, cloud-based existence or a local, federated existence. A focus on the cloud means you can have a cheaper, dumber device because the serious computing happens in the cloud. You also can let the device have access to huge amounts of information, thanks to the ability to pull in data from other products. And you can get smarter faster, because the focus on reporting back to the cloud means data will get there more often. As Spates said, the device “can learn from every interaction.”
The downside is that your product will need the internet to work, and data privacy can become an issue.
In a local and federated model, the computing happens on the device. This will reduce latency and keep data more private. But you’ll have to process data locally, which means the end device will likely cost more because it needs more computing power. Additionally, the data that does make it up to the cloud will likely get sent there less frequently, which means the product won’t get better as quickly. Finally, the components used to build connected devices always lag behind the latest technology. So if you decide to go local, your ability to update the device will be limited by the hardware inside it.
Cloud vs. local really affects the hardware, but as the designer moves to software, they face another choice: do you want a specific or a convergent feature set? Spates said a convergent feature set offers the same set of features in a bunch of different packages. So your digital assistant might be in a phone, a car, or inside an Echo or Google Home, but it still offers the same functions, even if it may not make much sense to get recipe instructions in the car.
Or you can stay specific. As Spates put it: “Focus on the specific use case and use all the hardware for that.” Google’s dedicated Home Max speaker is an example of this. The downside is that if all your software uses all your hardware to deliver better AI-tweaked sound adapted for a room, you limit your market. A generalized device that does a lot of things is more useful to more people.
Speaking of AI, the choices that a designer must make will be based on how the company plans to evolve the user experience. Do they focus on breadth or depth? This is similar to the previous choice in that it will change how people use your product. If you focus on breadth, the models you build for your product will be more generalized, while a focus on depth means you will concentrate on doing one thing well and optimize for that. But with depth you risk having the market pass you by as the experience changes, said Spates.
There are no right or wrong answers, but Spates called on companies to make these choices consciously, rather than just adding a Wi-Fi module to a product and throwing it out into the market. He ended his talk with a deep dive into how the industry should start thinking about AI in the home. One of the toughest questions is, what should be user-driven and what should be AI-driven?
User-driven interfaces are things like voice, touch, or presence. AI-driven experiences can be curated, corrective, or predictive. Spates cautioned that if a company wants to lean toward AI-generated experiences, each of these options requires a different confidence interval. So in a curated experience (think Netflix or suggested songs from Pandora), an AI model should be 70% confident it has the “right” answer. In a corrective experience, such as when a motion sensor doesn’t detect motion for a while and then turns off the lights, the model should have an 85% confidence interval, because it’s pretty annoying to have to wave your arms around when you’ve been sitting still in a room for a while and the lights turn off.
The final level, which we haven’t really achieved yet, is predictive. This is where your devices anticipate the user’s needs and then proactively try to meet them. For example, your Google Assistant might pull data from your home to learn that the activity taking place means you’re about to leave for the airport, so it calls you a Lyft. Or maybe it recognizes that Tuesday nights you order pizza and so calls Domino’s on your behalf.
In either of these cases, a failed prediction has real-world consequences, such as an angry Lyft driver or an unwanted pizza. In these and similar situations, Spates said the designer needs to be pretty much certain, with a five-nines (99.999%) confidence interval. That’s a big ask.
It also shows why we’re so far off from an intuitive smart home.
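To make the three confidence bars concrete, here is a minimal Python sketch of how a device might gate an AI-driven action on the figures Spates quoted. It reads "confidence interval" loosely as a minimum confidence score between 0 and 1. The names `Experience`, `MIN_CONFIDENCE`, and `should_act` are hypothetical, invented for illustration, and do not come from the talk or from any Google product.

```python
from enum import Enum


class Experience(Enum):
    """The three kinds of AI-driven experience described in the talk."""
    CURATED = "curated"        # e.g. suggested shows or songs
    CORRECTIVE = "corrective"  # e.g. turning lights off in an empty room
    PREDICTIVE = "predictive"  # e.g. ordering a ride before being asked


# Minimum confidence per experience type, using the figures quoted above.
# Treating them as simple score cutoffs is an assumption of this sketch.
MIN_CONFIDENCE = {
    Experience.CURATED: 0.70,
    Experience.CORRECTIVE: 0.85,
    Experience.PREDICTIVE: 0.99999,  # "five nines"
}


def should_act(experience: Experience, confidence: float) -> bool:
    """Return True if the model is confident enough to act on its own."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return confidence >= MIN_CONFIDENCE[experience]


if __name__ == "__main__":
    score = 0.92  # hypothetical model output
    for experience in Experience:
        print(experience.value, should_act(experience, score))
```

With a hypothetical score of 0.92, the device would go ahead with a song suggestion or a lighting correction but would not call a car on its own, which is the gap the five-nines requirement describes.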