This story was originally published on June 9, 2023 in my weekly IoT newsletter. You can subscribe to it here.
One of the underreported aspects of the Federal Trade Commission’s two settlements with Amazon and its Ring division was that Amazon would need to cull face data gathered by Ring cameras and children’s data gathered from Alexa devices and to stop using that data in its work products. All of which is just a fancy way of telling Amazon not to use that data for training algorithms.
Presumably Amazon has already used that data to train face-recognition algorithms, so it’s unclear how it can remove the “illegal” data from those models. And will culling the data from the training set and then retraining the models work? Does Amazon even have that kind of careful annotation on its image data? Do most companies?
Before 2018, when the EU’s General Data Protection Regulation (GDPR) took effect, I would have said no. Most companies were not cataloguing their data before using it to train models. And to be clear, the FTC fine results from actions that Ring took prior to January 2018 (the GDPR rules went into effect in May of 2018), when Ring gathered consumer face data without permission.
But for companies operating in today’s regulatory climate, having tools that catalogue incoming data from video cameras, sensors, and other devices is simply good business. This includes data on whether a user has consented to sharing their data for use in training algorithms. But it can be hard to find such tools, and larger companies tend to build them themselves.
That’s why when I encountered a startup called Xailient at the Parks Associates conference in Dallas last month, I was eager to learn more. Xailient combines two nerdy things that I love: on-device machine learning (TinyML) and documentation associated with data collected by the device, including consent data.
Lars Oleson, the CEO and co-founder of Xailient, explained that the company runs machine vision on the device itself, so the actual images stay local. Insights such as whether a camera sees a person or a pet get sent along, not the actual image of the person or the pet. (See more on this concept here.)
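The pattern Oleson describes can be sketched in a few lines. The classifier below is a stand-in stub, not Xailient’s actual model; the point is only that the label leaves the device while the pixels never do:

```python
def classify_frame(frame_pixels):
    # Placeholder for an on-device TinyML model; a real one would run
    # a compact neural network over the frame. Here, a trivial rule.
    return "person" if sum(frame_pixels) > 100 else "nothing"

def build_event(frame_pixels, camera_id):
    """Build the message that leaves the device: insight only, no image."""
    label = classify_frame(frame_pixels)
    # The raw frame is deliberately excluded from the outgoing payload.
    return {"camera": camera_id, "detected": label}

event = build_event([50, 60, 70], "front-door")
print(event)  # {'camera': 'front-door', 'detected': 'person'}
```

Because the payload is a small dictionary rather than an image, this is also where the bandwidth savings discussed below come from.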
The technology is set to be used on cameras from Abode later this year. Abode’s CEO and co-founder, Chris Carney, was also at the conference, and he told me that on-device processing was important for cutting down on bandwidth costs associated with sending and storing camera data in the cloud. But he also appreciated the ability to only use data for training from users who have already consented to the practice.
It’s the cloud savings that will likely entice most of the customers who look at Xailient’s technology. Oleson said that when it comes to existing options for image recognition done in the cloud, 80% of the cost is associated with sending data and storing it in the cloud while 20% is associated with performing image recognition on that data using services such as AWS Rekognition or custom computing models.
By running the models on the device, Oleson said he can eliminate the 20% of costs associated with running the models in the cloud and cut roughly half of the costs associated with sending and storing data in the cloud. This is a hugely compelling option because it means device makers can offer advanced services such as image recognition without needing to charge high subscription prices to cover the costs of delivering them.
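Taking Oleson’s figures at face value, the back-of-the-envelope arithmetic works out to roughly a 60% reduction in cloud costs:

```python
# Of total cloud cost: 80% is sending/storing data, 20% is inference.
transport_storage = 0.80
inference = 0.20

# On-device processing removes the inference share entirely and,
# per Oleson, cuts the transport/storage share roughly in half.
remaining = transport_storage * 0.5
savings = inference + transport_storage * 0.5

print(f"remaining cost: {remaining:.0%}, savings: {savings:.0%}")
# → remaining cost: 40%, savings: 60%
```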
I am always keen on performing AI locally for privacy, cost savings, and energy savings, but I’m also really interested in the Orchestrait platform that Xailient uses to track how device makers catalogue their data. The need to get consent and prove users have consented to sharing their data is only going to become more important as regulators in every country try to figure out privacy laws and their impact on AI.
By showing regulators that getting and showing user consent is a manageable problem for tech companies, Xailient can help push them to hold tech firms to higher standards.