Earlier this week, Google introduced Look to Speak, an experimental Android app that brings input access to people who can't speak or physically type. Look to Speak uses an Android phone's camera to track the user's eye gaze, letting them select items on screen simply by looking at them. Based on the video demonstration, it looks amazing.
And while I don't want to discount the positive accessibility aspect for the intended use, I can't help but think of how useful this could be in the smart home. I've written about the various smart home user interfaces several times before, and my take is: the more, the better for the overall experience.
Voice is a great invisible interface, letting us command the devices in our homes whether we use Alexa, the Google Assistant, or Siri. But there are times and situations when voice isn't ideal. I don't want to speak to a smart speaker, for example, when my wife has fallen asleep next to me on the couch.
Using a smartphone for such controls is always an option, provided the phone is within reach or in a pocket. But I see phone apps as an immature, less-than-optimal interface for smart home control. Phone apps were the original smart home interface, at least until digital assistants came along. And while gestures are a potential option for controlling devices in the future, that technology isn't quite mature enough yet.
So when I see something like eye-tracking for input, I get excited.
Implementing a Look to Speak-like experience isn't much of a stretch for today's smart displays. Many already have cameras for video calls, and some are advanced enough to use software that tracks you as you move around a room.

This year, Google’s smart displays even gained the ability to recognize which family member is looking at the hardware. That provides useful context to show information relevant specifically to that person.
Indeed, at our Level Up the Smarthome event in October, Jake Sprouse, Head of Technology at Synapse, shared thoughts on such context and the potential smart interfaces that could use it:
"If the system knows where I am and knows that I'm in front of the cooktop… it knows that I'm cooking. Then when I say 'turn the burner on high,' I don't need to say a wake word. The system knows what I'm doing."
That type of context is what I'm envisioning if Google's eye-gaze input technology were adapted for current or future smart displays. My Google Nest device already surfaces several contextual touch tiles when it sees me walk up to it. At night, it shows a Lights button, while in the middle of the day, a thermostat option appears.
Why speak a command or tap the screen to interact with these actionable buttons when a simple glance, and perhaps a blink, does the trick? That would further reduce interaction friction: no need to say a word or reach out to touch the screen.
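To make the idea concrete, here's a rough, hypothetical sketch of what that gaze-plus-blink selection logic might look like. None of this reflects a real Google API; the GazeTileSelector class, the region and command names, and the dwell-time threshold are all my own assumptions for illustration.

```kotlin
// Hypothetical sketch only: none of these types map to an actual Google API.
data class Tile(val label: String, val command: String)

class GazeTileSelector(
    private val tiles: Map<String, Tile>,       // screen region id -> contextual tile
    private val sendCommand: (String) -> Unit,  // e.g., relay a command to a smart home hub
    private val dwellMillis: Long = 800         // how long a glance must rest on a tile
) {
    private var focusedRegion: String? = null
    private var focusStart: Long = 0L

    // Called by an (assumed) eye-tracking pipeline with the screen region the
    // user is currently looking at, or null if their gaze has left the screen.
    fun onGaze(region: String?, now: Long = System.currentTimeMillis()) {
        if (region != focusedRegion) {
            focusedRegion = region
            focusStart = now
        }
    }

    // Called when the pipeline detects a deliberate blink. The tile's action
    // only fires if the gaze has dwelled long enough, so a stray blink while
    // glancing past the display doesn't toggle anything.
    fun onBlink(now: Long = System.currentTimeMillis()) {
        val region = focusedRegion ?: return
        if (now - focusStart >= dwellMillis) {
            tiles[region]?.let { sendCommand(it.command) }
        }
    }
}

fun main() {
    val selector = GazeTileSelector(
        tiles = mapOf(
            "left" to Tile("Lights", "lights/living_room/toggle"),
            "right" to Tile("Thermostat", "thermostat/living_room/set/68")
        ),
        sendCommand = { cmd -> println("would send: $cmd") }
    )
    selector.onGaze("left", now = 0L)
    selector.onBlink(now = 1_000L)  // prints "would send: lights/living_room/toggle"
}
```

The hard part, of course, is the eye tracking itself, not this glue code. But the dwell threshold is the kind of tuning knob that would keep an accidental glance at the display from toggling the lights.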
To be clear, I don't know whether Google plans to bring this technology to the smart home. And I sincerely appreciate the intended purpose of Look to Speak, which gives a voice to people who may not otherwise have one. But I'd like to see this, or a similar interactive experience, come to my smart home.