InfluxData raises $81M for IoT database and real time data - Stacey on IoT

A former editor of mine once described fundings as “the atomic unit of the tech press,” which was basically his way of saying that anytime I wanted to write about a new company, I had to use a funding round as the way into the story. With that in mind, I’m eager to let y’all know that InfluxData has raised $81 million through a combination of debt ($30 million) and equity ($51 million) in a Series E round of funding. So far, the ten-year-old company has raised $171 million.

InfluxData CEO Evan Kaplan says the funding will help InfluxData reach profitability and keep the company growing through what is likely going to be a rough few years in technology investment. Kaplan doesn’t anticipate hitting the public markets soon, and plans to achieve profitability within the next two years. InfluxData sells services and support for the open source InfluxDB time series database used by hundreds of thousands of users and 1,900 paying customers.

Tesla is an InfluxData customer. Image courtesy of Tesla.

Time series databases are crucial for the internet of things, and about half of InfluxData’s customers are in the industrial IoT. Time series databases track and store a measurement, such as temperature or state, along with the time it was taken. That’s it. It’s not complex, but it can be overwhelming. Some sensors report their state every nanosecond and some implementations can contain thousands of sensors. That’s a lot of small data coming in at a huge velocity.

Even more challenging for those designing IoT systems is that usually the newest data is the most important. Anomalies in the data are also critical. That means the database has to be easily searchable and able to quickly provide the most relevant data points from what can be a massive amount of information. Other time series databases include GE’s proprietary Predix database and newer options, such as TimescaleDB and Amazon Timestream.
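To make those two requirements concrete (fast access to the newest points, and spotting anomalies), here is a minimal sketch in plain Python. It is not how InfluxDB works internally; the `Point` and `TinySeries` names are made up for illustration, and the anomaly check is a simple standard-deviation rule chosen only because it is easy to read.

```python
from bisect import insort
from dataclasses import dataclass, field
from statistics import mean, pstdev


@dataclass(order=True)
class Point:
    timestamp_ns: int                      # when the reading was taken
    value: float = field(compare=False)    # the measurement, e.g. temperature


class TinySeries:
    """Hypothetical in-memory series that keeps points sorted by time."""

    def __init__(self):
        self.points = []

    def write(self, timestamp_ns, value):
        # insort keeps the list ordered even when readings arrive out of order
        insort(self.points, Point(timestamp_ns, value))

    def latest(self, n):
        # Return the n newest points, newest first
        if n <= 0:
            return []
        return self.points[-n:][::-1]

    def anomalies(self, threshold=2.0):
        # Flag points more than `threshold` standard deviations from the mean
        if len(self.points) < 2:
            return []
        values = [p.value for p in self.points]
        mu, sigma = mean(values), pstdev(values)
        if sigma == 0:
            return []
        return [p for p in self.points if abs(p.value - mu) > threshold * sigma]
```

A real time series database does the same two jobs with on-disk indexes and far smarter statistics, but the shape of the problem is the same: every record is a timestamp plus a value, and the useful queries are "what just happened" and "what looks wrong."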

But the reason I’m writing about the funding is that it gives me a chance to revisit an idea I discussed five years ago with Kaplan: the most appropriate architecture stack for the IoT. When we last spoke, he pitched the idea of the TICK stack. From the story:

Together with tools called Telegraf, Chronograf, and Kapacitor, Kaplan is selling a concept called the TICK stack. It is designed to rapidly ingest and handle data while also giving users the tools to query it. As a lover of many IT stacks—from the historical LAMP (Linux, Apache, MySQL, PHP) stack for web development to the more recent SMACK (Spark, Mesos, Akka, Cassandra, and Kafka) stack for big data—I like the idea of one for the IoT.

The TICK stack didn’t pan out, but Kaplan is now pitching a new stack for real-time data analysis. (It’s worth noting that in this case, real-time does really mean real-time as opposed to within 15 minutes or other almost real-time analysis.) He now believes the appropriate stack consists of Apache Arrow, Apache Flight SQL, Apache Arrow DataFusion, and Apache Parquet. I have my doubts about an application stack for the IoT, since it’s far too broad a category, but the pitch here feels a bit too narrow.

What Kaplan is calling a stack is really just an open source, columnar in-memory ecosystem (that’s the Apache Arrow bit) that has tied in projects such as Flight SQL and DataFusion to handle storage, optimization, queries, etc. across different styles of databases, and Parquet to tie a super fast in-memory columnar database to a traditional SQL database and move data from in-memory storage to external storage. Kaplan is pitching this “stack” because InfluxDB’s new IOx storage engine uses Apache Arrow’s format for representing data and Parquet to move data to external storage. It also uses DataFusion to add SQL support.
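For readers wondering what “columnar” means in practice, here is a rough sketch using only Python’s standard library. It does not use Arrow or Parquet themselves, and the field names (`time_ns`, `sensor`, `temp_c`) are invented for the example. The idea is simply that all the values for one field sit together, so a query over that field doesn’t have to touch the others.

```python
from array import array

# Row-oriented: each reading is its own record (hypothetical sample data)
rows = [
    {"time_ns": 1, "sensor": "a", "temp_c": 20.0},
    {"time_ns": 2, "sensor": "a", "temp_c": 21.0},
    {"time_ns": 3, "sensor": "b", "temp_c": 22.0},
]


def to_columns(rows):
    # Column-oriented: one typed, contiguous sequence per field
    return {
        "time_ns": array("q", (r["time_ns"] for r in rows)),  # 64-bit ints
        "sensor": [r["sensor"] for r in rows],                # strings
        "temp_c": array("d", (r["temp_c"] for r in rows)),    # 64-bit floats
    }


def mean_temp(columns):
    # An aggregate over one field reads only that column
    temps = columns["temp_c"]
    return sum(temps) / len(temps)
```

Calling `mean_temp(to_columns(rows))` scans a single array of floats rather than walking every record, which is the property that makes columnar formats attractive for analytics on large volumes of sensor data.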

So basically, Kaplan is pitching the tools his company has built on as the ultimate solution for real-time analysis. But they are compelling tools and InfluxData has managed a series of updates to the InfluxDB open source project to help scale data ingestion and use. It also has worked out ways to link its cloud-based database to edge-based databases, which is especially helpful for companies in the IoT, which might be trying to analyze data at the edge for latency-sensitive use cases and then keep some of that data in the cloud for other analysis.

Today the IOx engine is only available in multi-tenant cloud instances, but Kaplan says in April, InfluxData will add the ability to have dedicated cloud instances, and in September it will bring IOx to edge devices. This focus on the edge is an advantage InfluxData has over rival time series databases offered by AWS and Microsoft. InfluxDB also has rival time series databases with different architectures entirely, such as TimescaleDB.

As part of the funding, InfluxData added Princeville Capital and Citi Ventures as new investors, while existing investors Battery Ventures, Mayfield Fund, and Sapphire Ventures also participated. The $30 million debt facility is provided by Silicon Valley Bank.

InfluxData is a sponsor of the newsletter and podcast.
