Last week, I found myself having a conversation that covered edge computing, digital twins, and the concept of absolute truth. It started out as a discussion with Simon Crosby, the CTO of Swim.ai, about that company’s latest product, which is designed to bring Swim’s edge analytics software to the enterprise and industrial world. But it quickly broadened to a conversation about the way we think about data storage and compute when we want to act on real-time information and insights.
Basically, with IoT we’re trying to get a continuous and current view of machines, traffic, environmental conditions, or whatever else so we can use that information to take some sort of action. That action might be predicting when a machine will fail, or routing traffic more efficiently, but for many use cases, the time between gathering the data, offering an insight, and then taking action will be short.
And by short, I mean the data might need to be analyzed before a traffic light changes or a person walks more than a few feet away from a shelf in a grocery store. Figuring out how to analyze incoming data and then create a model based on it, such as a model of an intersection or of shoppers in a store, so that a computer can act on it is what led to our discussion of truth. Crosby’s point was that truth changes every second, so if we’re trying to build a digital twin that represents the truth of a machine or a model, it needs to constantly change. And that has a lot of implications for how we think about computing architectures for digital twins.
For example, Swim.ai is working with a U.S. telecommunications company to create a digital twin of the carrier’s network in real time and then optimize that network based on the ongoing movements of people and any applications they’re running. The carrier is tracking 150 million cellular devices, which together generate 4 petabytes of data each day. With 5G on the horizon and an increasing number of elements to track between devices and base stations, the carrier expects that the amount of data it will need to analyze will reach 20 petabytes.
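To get a feel for that scale, here is a rough back-of-the-envelope calculation based on the two figures above. It assumes decimal petabytes and traffic spread evenly across devices and across the day, neither of which the carrier has confirmed, so treat the results as orders of magnitude only.

```python
# Back-of-the-envelope scale check using the figures quoted in the article.
# Assumptions (mine, not the carrier's): 1 PB = 10**15 bytes, and data is
# spread evenly over devices and over the day.
PETABYTE = 10**15
SECONDS_PER_DAY = 24 * 60 * 60

devices = 150_000_000          # cellular devices tracked
daily_bytes = 4 * PETABYTE     # data generated per day

per_device_mb = daily_bytes / devices / 10**6
ingest_gb_per_s = daily_bytes / SECONDS_PER_DAY / 10**9

print(f"{per_device_mb:.1f} MB per device per day")
print(f"{ingest_gb_per_s:.1f} GB/s sustained ingest")
```

Under those assumptions, each device accounts for a few tens of megabytes a day, but the aggregate is a sustained stream of tens of gigabytes every second.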
Prior to Swim, the carrier would take that data and move it to a 400-node Hadoop cluster to analyze it in batches. It took roughly 6 hours and required a lot of servers. After switching to Swim’s software, the carrier can track those 150 million devices and base stations and start taking actions on its network in just 100 milliseconds. It does this by processing the data locally on available edge processors and by only using the data it needs to determine any actions it should take.
So network optimization is both faster and cheaper because the carrier no longer needs a giant Hadoop cluster to batch process giant chunks of data. And the digital twin of its cellular network is much more accurate, representing something close to the “truth,” which allows the carrier to immediately take action to solve problems or deliver a better experience.
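To make the contrast with batch processing concrete, here is a minimal sketch of a twin that is updated one event at a time and keeps only the state needed to decide on an action. This is not Swim’s software or API; the class names, the `on_event` method, and the congestion rule are all hypothetical and exist only to illustrate the idea.

```python
from dataclasses import dataclass, field


@dataclass
class CellTwin:
    """Hypothetical in-memory twin of one base station (not Swim's API)."""
    capacity: int                              # assumed comfortable device count
    devices: set = field(default_factory=set)  # devices currently attached

    def is_congested(self) -> bool:
        return len(self.devices) > self.capacity


class NetworkTwin:
    """Keeps only current state: which device is attached to which cell."""

    def __init__(self, capacity_per_cell: int):
        self.capacity_per_cell = capacity_per_cell
        self.cells = {}      # cell_id -> CellTwin
        self.location = {}   # device_id -> cell_id

    def on_event(self, device_id, cell_id):
        """Apply one 'device seen at cell' event.

        Returns a list with the cell id if that cell just became congested,
        otherwise an empty list. Raw events are never stored.
        """
        previous = self.location.get(device_id)
        if previous == cell_id:
            return []  # nothing changed, so nothing to act on
        if previous is not None:
            self.cells[previous].devices.discard(device_id)
        cell = self.cells.setdefault(cell_id, CellTwin(self.capacity_per_cell))
        was_congested = cell.is_congested()
        cell.devices.add(device_id)
        self.location[device_id] = cell_id
        return [cell_id] if cell.is_congested() and not was_congested else []


# Usage: feed events as they arrive and react immediately.
twin = NetworkTwin(capacity_per_cell=2)
events = [("d1", "A"), ("d2", "A"), ("d3", "A"), ("d3", "B"), ("d1", "A")]
alerts = [twin.on_event(device, cell) for device, cell in events]
print(alerts)
```

The design choice worth noticing is that the twin answers “what is true right now?” from a small amount of live state, rather than recomputing it from hours of stored history. A real deployment would also need to distribute this state across edge nodes, which the sketch ignores.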
In earlier conversations, Crosby said the biggest challenge he’d had was figuring out a way to use the Swim software to make money. Charging Uber or a city 25 cents per API call to deliver almost instantaneous predictions about traffic or the status of a red light wasn’t working. The use case was interesting, but the economics weren’t there. Not enough people want to pay such a relatively high amount to predict traffic flow.
With this particular carrier, Crosby has found a customer with a high-value problem that uses a lot of data and requires real-time insights so it can take action. But while I can think of lots of use cases where a digital twin represents the on-the-ground truth of the situation within a few seconds, I’m not sure how many are worth spending a lot of money on.
For example, grocery stores might want to map out where shoppers are so they can offer a recipe suggestion when the shopper is near a particular product. A logistics company might want to understand when its trucks are near the cheapest gasoline so it can order them to stop for a top-off. And as electric cars and trucks become more common, having per-second insights into the cars’ charge levels and the location of nearby chargers will be important.
Although I’m still not sure how to build business cases for these examples, I do think the discussion I had with Crosby helps clarify two big issues around the next phase of the IoT. The first is that digital twins must be built to deliver real-time insights and change within seconds. That means we need standards. By contrast, efforts by companies such as Microsoft to make it easy to manually build out digital twins of our offices or stores feel like the wrong strategy.
The second insight is that we need to rethink how we deal with data — what we store, where we store it, and how to link data and computing in ways that are more flexible than what we have today. If you are working on this, let me know.