Database vendors pitch themselves as the cure for runaway AI costs

Irresistible force, meet immovable object. Tech leaders are under pressure to satisfy growing demand for AI while keeping a lid on costs.

That is becoming harder as Anthropic, OpenAI, and GitHub shift some services away from flat-rate subscriptions toward usage-based billing. Database vendors claim they can help by cutting the number of calls made to AI models and handling the new workloads generated by developer agents.

According to IDC research director Devin Pratt, "demand for the underlying capability is strong, because agentic adoption is already broad."

Around 79 percent of organizations are either investing significantly in agentic AI with a set budget or already running agentic applications in production, according to IDC.

"The appetite for the data infrastructure beneath it is real," Pratt said. "The open question is whether the specialists win or the capability gets absorbed into the platforms enterprises already run, much as it did with vector databases."

Among the specialists taking on the omnipresent cloud platforms and their embedded data services is Pinecone. It carved out a niche as a vector database vendor for users wanting to put LLMs into production. Although vector search has since become commonplace among mainstream database companies, Pinecone has always argued it maintains a technical advantage for the biggest use cases.

Building on top of its vector database technology, the vendor has recently launched Nexus – a "knowledge engine, not a retrieval system" – and embedded it in Microsoft's OneLake, a hybrid data lake and data warehouse environment.

Pinecone's idea is that by compiling a knowledge base of an organization's data structure and content, its technology can stop AI agents burning through tokens as they go back and forth to the data. Nexus is designed to structure, contextualize, and compose specialized contexts – derived artifacts – in advance of agent demand.

Pinecone product veep Jeff Zhu told The Register the idea was to prevent agents from repeating the same work to understand the structure of business data and its context.

"All these coding agents, for example, are really good at doing a bunch of exploratory work if you ask them a question," Zhu said. "It's going to make a call, get the table schema, do some exploratory work, figure out what the top rows of this one table are, and ultimately it will eventually get to the right answer most of the time, but it's going to burn through a bunch of tokens, because every single time it creates a specific answer to a question, it has to understand your business context and rediscover it every single time."

Nexus provides a semantic layer of business data for a given use case or outcome. Imagine a finance analyst agent versus an HR agent: it is very possible they could use the same data, but they would want very different outcomes as a result, Zhu said.

"Our system is an engine that can support any number of use cases. Based on your particular use case, and the tasks that you want to accomplish, we use your data sources – which can be SQL databases, unstructured documents, PDFs, and so on – we build a task-specific context which is used for that individual agent's job or role."

IDC's Pratt said that the hard part of agentic deployments has shifted from the model to the data plumbing around it. Agents reason continuously and act on live data, whereas the traditional split between operational stores, analytical warehouses, vector indexes, and the pipelines stitching them together was built for humans, not software running in loops. A recent IDC Data Management survey found that the two biggest data roadblocks IT leaders name for scaling generative and agentic AI are security and compliance constraints, and cost. Fragmentation compounds the problem, with nearly two-thirds of organizations running 11 or more distinct database technologies.

Pratt said Pinecone is betting that retrieval as practiced today does not scale to agents and that, rather than have an agent rediscover and re-read raw content on every call, Nexus does the reasoning once, upstream, and stores reusable, task-specific context. The idea is credible, and the positioning is smart, moving Pinecone from a vector database to a knowledge layer. Its query language, KnowQL, also includes a budget primitive, and the platform gives a single dashboard for token usage and spend. By doing that work once rather than on every inference call, the design directly addresses one of IT leaders' chief concerns: cost.
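The compile-once argument can be illustrated with a rough token count. The following is a minimal sketch, not Pinecone's implementation: the ContextCache class, the build_finance_context function, and the token figures are all hypothetical, chosen only to show how building context once per task differs from rediscovering it on every call.

```python
# Illustrative sketch only: names and token figures are hypothetical,
# not taken from Pinecone Nexus or KnowQL.

EXPLORE_TOKENS = 5_000  # assumed cost of rediscovering schema and business context
ANSWER_TOKENS = 500     # assumed cost of answering once context is available


class ContextCache:
    """Builds a task-specific context once and reuses it on later calls."""

    def __init__(self, build_context):
        self._build_context = build_context
        self._contexts = {}
        self.builds = 0  # how many times the expensive build actually ran

    def get(self, task):
        if task not in self._contexts:
            self._contexts[task] = self._build_context(task)
            self.builds += 1
        return self._contexts[task]


def build_finance_context(task):
    # Stand-in for the exploratory work an agent would otherwise repeat.
    return f"precompiled context for {task}"


def tokens_rediscovering(calls):
    # Every call pays for exploration and for the answer.
    return calls * (EXPLORE_TOKENS + ANSWER_TOKENS)


def tokens_precompiled(calls, builds):
    # Exploration is paid once per build; each call pays only for the answer.
    return builds * EXPLORE_TOKENS + calls * ANSWER_TOKENS


cache = ContextCache(build_finance_context)
calls = 100
for _ in range(calls):
    cache.get("finance-analyst")

print(tokens_rediscovering(calls))              # 100 * 5,500 tokens
print(tokens_precompiled(calls, cache.builds))  # 5,000 + 100 * 500 tokens
```

Under these assumed figures, the saving grows with the number of calls that share a context; it shrinks if contexts have to be rebuilt often because the underlying data changes.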

"What is interesting about Nexus is that it treats cost as a design constraint, not an afterthought, with a budget control and token accounting built into the query layer. That is exactly where IT leaders feel the pain," Pratt said.

Changes in the wider tech landscape have previously forced new ways of looking at databases and related software. The market responded to the so-called Big Data problem – the staggering volume of information generated by mobile devices and web clickstream data – with Hadoop and Apache Spark. The former has largely been superseded; the latter is still very much with us. The demands consumers and businesses place on globally scaled web systems prompted a generation of NoSQL databases like Couchbase, Cassandra, and MongoDB, as well as new distributed relational databases.

Another database vendor hoping to help tech teams manage the escalating costs of deploying AI agents is Tiger Data, the company behind the PostgreSQL time-series database TimescaleDB. It has built Ghost, a technology designed specifically for developers working with AI agents.

The company argues that agents helping build software experiment constantly and need isolation to do it safely. When an agent fails, the blast radius should be one database, not a shared environment that other agents and humans depend on. The Ghost platform offers instant PostgreSQL databases with fast forking, which agents can access through the Ghost CLI or MCP server. It comes with a terabyte of free storage.

Ajay Kulkarni, co-founder and CEO of Tiger Data, told The Register the company had also adopted a new database charging model better suited to AI agents, which typically produce spikes and troughs in demand.

"Instead of being packaged at the traditional database level, we provide usage-based pricing at the compute-hour level, so no matter how many databases you have – it could be one, it could be 50 – it'll cost the same, and you'll just be metered by how many compute hours you consume," Kulkarni said.

Tiger Data offers a free tier with 100 compute-hours per month. Users can pay for additional usage in 15-minute active windows. "That really allows for experimentation and allows for this idea of multiple databases, with every agent getting its own database, because you're not paying on that dimension, you're paying on the compute-hour dimension," he said.
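The arithmetic behind that pricing model can be sketched as follows. This is an assumption-laden illustration rather than Tiger Data's actual billing logic: the per-hour rate is invented, and rounding each database's active time up to whole 15-minute windows is a guess at how the metering works. Only the 100 free compute-hours and the 15-minute window size come from the figures above.

```python
import math

FREE_COMPUTE_HOURS = 100.0  # monthly free tier, as stated in the article
WINDOW_MINUTES = 15         # metering window, as stated in the article
HYPOTHETICAL_RATE = 0.10    # assumed price per compute-hour, not a real price


def billable_hours(active_minutes_per_database):
    """Total compute-hours, rounding each database up to whole windows (assumed)."""
    windows = sum(
        math.ceil(minutes / WINDOW_MINUTES)
        for minutes in active_minutes_per_database
    )
    return windows * WINDOW_MINUTES / 60


def monthly_cost(active_minutes_per_database, rate=HYPOTHETICAL_RATE):
    """Charge only for compute-hours beyond the free tier."""
    hours = billable_hours(active_minutes_per_database)
    return max(0.0, hours - FREE_COMPUTE_HOURS) * rate


# One database active for 9,000 minutes versus 50 databases active for
# 180 minutes each: both consume 150 compute-hours in this model.
one_database = monthly_cost([9000])
fifty_databases = monthly_cost([180] * 50)
print(one_database, fifty_databases)
```

In this sketch the number of databases never enters the calculation, only the total compute time, which is the property Kulkarni describes.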

IDC's Pratt said Ghost solves a different but equally concrete problem. It gives each agent or task its own disposable PostgreSQL database, allowing an agent to branch a dataset in seconds and discard it afterward. Because billing is based on usage rather than the number of databases, Tiger Data offers a practical way for organizations to stay with PostgreSQL. However, many Postgres vendors are racing toward the same capability.

"Ghost is the most pragmatic of these. It hands each agent a throwaway PostgreSQL workspace to experiment in without risking production, and by staying on PostgreSQL it asks IT teams to learn almost nothing new," he said.

Pratt argued that agent-ready data infrastructure is turning into its own category, and the platform vendors are the ones to watch. On the context and vector side, Pinecone competes with companies such as Weaviate and Qdrant, as well as the pgvector ecosystem inside every Postgres distribution. On the agent workspace side, Tiger Data's Ghost sits alongside Neon, now owned by Databricks, and Supabase.

Bigger platform vendors such as Snowflake, Oracle, and Microsoft are also absorbing these capabilities into the stacks customers know well.

"The independents are defining these categories, but watch the platform vendors. Most organizations tell us they expect to do their vector work inside the database or lakehouse they already run, not in a separate specialist tool," Pratt said.

Aaron Rosenbaum, Gartner senior director analyst, said that building this context layer is critical for both token efficiency and better answers. "We've seen significant development across the industry; Snowflake introduced Horizon Context; Databricks introduced Genie Ontology, and we will see significant growth in the quality and depth of new tools introduced throughout the year. While most of the leading data platform vendors have their own solution, there will be enterprises that look for solutions independent of those platforms that Pinecone Nexus can address," he said.

He also noted that a string of vendors were examining how the DBMS workloads created by agents differ from those created by humans.

"It's driving new innovation across the industry right now," Rosenbaum said.

As technology leaders try to negotiate better deals with AI providers to keep costs down, they can also look to their data platforms to ease the financial burden created by the often-unmanaged surge in demand from AI workloads.

"There are many options available today from all the major providers, and no clear leader has emerged. History has shown that the established DBMS vendors have been very agile at supporting new workloads, and it's likely that will continue," Rosenbaum said. ®

Originally published on The Register
