Latent AI is building software to make tiny ML better - Stacey on IoT

It’s time to start pruning those neural networks so they can run on tiny devices.

In this story, I want to introduce two things. The first is the concept of tiny machine learning, or tiny ML. The second is a startup, one of many businesses trying to build a community around that concept. Let’s tackle tiny ML first.

I unwittingly wrote about this idea two months ago after visiting Microsoft and chatting with Byron Changuion, a principal software engineer at Microsoft Research. He was building compilers for machine learning software so it could run on microprocessors. I was fascinated because most companies are trying really hard to make low-power chips that can perform machine learning inference on smaller, maybe even battery-powered, devices.

But no one is really promoting the idea of performing inference on something like a microcontroller. AI involves two stages: training a model, which takes large training data sets and runs them through a massive GPU or another type of processor in a data center, and inference. Inference is when new data is run through the trained model to see whether it matches the patterns the model learned. Inference requires less computing power than training, but it still can be processor-intensive.

After writing the Microsoft story I learned about an entire community of people from companies like Google, Qualcomm, and ARM who are trying to bring AI inference to the smallest and farthest edge. The idea is that by placing a machine learning model on a sensor (as opposed to a gateway) you can filter out a ton of useless data, protect user privacy, and improve latency.
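To make the filtering idea concrete, here is a minimal sketch in Python of a sensor that scores each reading locally and only forwards the ones its model flags. Everything in it is an assumption for illustration: the weights, the threshold, and the function names are hypothetical, and a real MCU deployment would use a trained model and would not be written in Python.

```python
import math

# Hypothetical weights for a tiny logistic model; a real device would
# load values produced by training, not hand-picked numbers like these.
WEIGHTS = [1.5, -0.8, 2.0]
BIAS = -1.0
THRESHOLD = 0.5  # assumed cutoff for "worth sending upstream"


def score(features):
    """Run inference: return a probability-like score for one reading."""
    z = BIAS + sum(w * x for w, x in zip(WEIGHTS, features))
    return 1.0 / (1.0 + math.exp(-z))


def filter_readings(readings):
    """Keep only the readings the on-sensor model considers interesting."""
    return [r for r in readings if score(r) >= THRESHOLD]


readings = [
    [0.0, 0.0, 0.0],
    [1.0, 0.5, 0.2],
    [0.1, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
to_send = filter_readings(readings)
print(f"sending {len(to_send)} of {len(readings)} readings")
```

In this toy run, half of the readings never leave the device, which is where the bandwidth, privacy, and latency benefits would come from.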

Most of the demos in this space involve person-detection computer vision on sensors the size of a dime or voice recognition on a tiny microcontroller (MCU). But there is huge potential here, especially as it relates to the internet of things. Which is why I was excited to run into Jags Kandasamy, the co-founder and CEO of an 11-month-old startup called Latent AI, which is writing software to help run models efficiently at the farthest edge.

Kandasamy said his software can optimize existing models so they can run on MCUs while losing little of the original model’s accuracy. There are two products. One is software running in the cloud that helps compress an existing neural network so it can run on smaller devices. The other is a compiler that the company says shrinks library files by 10x in a matter of hours. Kandasamy said he has 10 customers currently trialing the software and is in talks with silicon providers so he can optimize the software for their chips.
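Latent AI didn’t describe how its compression works, so the following is not the company’s method. It is a minimal sketch of two widely used, generic techniques for shrinking a network: magnitude pruning (zeroing the smallest weights) and 8-bit quantization (storing weights as small integers plus one scale factor). The function names and the sample weights are hypothetical.

```python
def prune_by_magnitude(weights, sparsity):
    """Zero out the given fraction of weights with the smallest magnitude."""
    k = int(len(weights) * sparsity)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    smallest = set(order[:k])
    return [0.0 if i in smallest else w for i, w in enumerate(weights)]


def quantize_int8(weights):
    """Symmetric linear quantization: map floats to integers in [-127, 127]."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    return [round(w / scale) for w in weights], scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the integers and the scale."""
    return [q * scale for q in quantized]


# Made-up weights standing in for one layer of a trained model.
weights = [0.82, -0.03, 0.40, 0.01, -0.65, 0.07, -0.22, 0.002]

pruned = prune_by_magnitude(weights, sparsity=0.5)
quantized, scale = quantize_int8(pruned)
restored = dequantize(quantized, scale)

print("pruned:   ", pruned)
print("quantized:", quantized)
```

Each quantized weight fits in one byte instead of the four a 32-bit float needs, and the zeros introduced by pruning can be stored compactly or skipped. The trade-off is a small rounding error in the restored weights, which is why accuracy has to be checked after compression.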

Kandasamy showed me how he took popular image recognition neural networks, tweaked them with the optimizer, and then compiled them to run on his phone. A phone is obviously a higher-powered device than a sensor, but as he pointed out, phones make for better demos. In the demo the model did perform well, identifying people, bottles, and other paraphernalia around the conference center.

Neural network models aren’t limited to computer vision or voice recognition. For example, a vibration sensor running a model designed to detect catastrophic failures could order a device’s shutdown. Or a hearing aid might run a model that could separate a conversation from background noise. We’ll likely see this type of software deployed in areas where companies need a rapid response, in use cases where privacy matters, and anywhere that battery life counts.

In June, Latent AI raised a $3.5 million seed round of financing led by Steve Jurvetson at Future Ventures and including SRI International Ventures, Perot Jain, and Gravity Ranch. Latent AI is a spinout of SRI International.
