Voice is hot. It is rapidly becoming an accepted way of interacting with computers and even everyday objects as computers become embedded in those objects. Shouting for Siri to find an address or asking Alexa to turn on the lights is now commonplace in many homes in the U.S. Voice adoption is on the rise in China as well.
But with Google, Amazon, and to a lesser extent Apple poised to dominate this means of interaction, companies are searching for alternatives to avoid ceding too much power to those tech firms. For example, Nuance Communications, once a leader in speech-to-text (although not in natural language processing), has signed a deal to provide BMW with an intelligent voice platform for its cars.
Outside of established companies like Nuance, startups are trying to provide voice services that brands can white-label as their own along with algorithms they can use for natural language processing.
In the smart home, a company called Josh.ai has taken on the role of Alexa or Google Home for professional installers. Josh.ai is a platform that lets users control their lights, AC, televisions, and more using natural voice commands or an app. While Josh.ai integrates with popular DIY products such as Hue bulbs or Nest thermostats, it is designed for professional CEDIA installers and works with many professionally installed systems. Josh.ai has raised $11 million so far.
As a control system, Josh.ai is going almost head-to-head with Amazon Alexa and Google Home. Meanwhile, French company Snips is trying to replicate the natural language processing technology behind Alexa and Google Home in order to build white-label voice assistants for other companies. Such companies might include a car maker or even a giant enterprise that wants a bespoke (and private) voice assistant for the office.
Joseph Dureau, CTO of Snips, explains that many of its clients are either seeking control over what they see as an essential channel for accessing potentially proprietary data or looking to protect their brand. This makes sense. Natural language processing is a fundamental technology, and while I don't see people addressing different devices in their home by different digital assistants' names, I do think people could be trained to address a location-specific assistant without using up too much brainpower. For example, at work I might talk to Jonathan, while at home I address Alexa. In the car I might call on Mateo for directions to the supermarket.
At an even lower level, there's a startup called Aiqudo, which is building the actual algorithms for voice recognition as well as an interface. Instead of relying on traditional neural networks, Aiqudo is building up libraries based on semantic understanding. John Foster, the company's CEO, says the platform can understand what people say and, based on that information, open the appropriate app to fulfill the user's request.
As a test, users can download the Aiqudo (pronounced I-Q-Doe) app on an Android handset and tell the phone to message their mom. They can then select which app they want Aiqudo to use for messaging, whether it is Messages or WhatsApp. Foster thinks that app makers will eventually build in links to the Aiqudo platform because they don't want to be beholden to Amazon or Google to send them traffic.
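The routing Foster describes can be pictured in miniature. The sketch below is not Aiqudo's actual implementation; the intent library, the keyword-overlap matcher, and the per-intent preference table are all illustrative assumptions about how an utterance might be mapped first to an intent and then to the user's chosen app.

```python
# Toy sketch of voice-command-to-app routing (hypothetical, not Aiqudo's code).
# An intent library maps semantic cues to candidate apps; a saved user
# preference decides which app ultimately handles a matched intent.

from dataclasses import dataclass, field


@dataclass
class Intent:
    name: str
    keywords: set        # words whose presence signals this intent
    candidate_apps: list  # apps that could fulfill it


@dataclass
class Router:
    intents: list
    preferences: dict = field(default_factory=dict)  # intent name -> chosen app

    def match(self, utterance: str):
        """Return the intent with the most keyword overlap, or None."""
        words = set(utterance.lower().split())
        best = max(self.intents, key=lambda i: len(i.keywords & words))
        return best if best.keywords & words else None

    def route(self, utterance: str):
        """Map an utterance to (intent name, app), honoring user preference."""
        intent = self.match(utterance)
        if intent is None:
            return None
        app = self.preferences.get(intent.name, intent.candidate_apps[0])
        return (intent.name, app)


router = Router(intents=[
    Intent("send_message", {"message", "text", "tell"}, ["Messages", "WhatsApp"]),
    Intent("navigate", {"directions", "navigate", "drive"}, ["Maps"]),
])
router.preferences["send_message"] = "WhatsApp"  # the user picked WhatsApp once

print(router.route("message my mom"))                 # ('send_message', 'WhatsApp')
print(router.route("directions to the supermarket"))  # ('navigate', 'Maps')
```

The key design point in Foster's description survives even in this toy: the platform, not Amazon or Google, owns the mapping from utterance to app, so the app maker keeps the traffic.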
So there you have it. While most of us are excited to discover the power of voice through Alexa, Siri, and Google, there are plenty of people trying to build the tools that will let us use voice outside of those narrow confines.