In the heat of the initial ChatGPT craze, I got a text from a former coworker who wanted to run an idea by me. Always one to enjoy a brainstorm, I hopped on a call with him, and he started off with “Remember how you used to always ask me to pull data for you? What if you could just do it yourself?” He then proceeded to pitch an idea that thousands (tens of thousands?) of other people were having at the same time: LLMs could be used for text-to-SQL to help less technical folks answer their own data questions.

I was hooked on the idea, but before diving in head first, I told Lei (now my CTO) that we had to do some validation. We contacted friends and former coworkers from various industries. There was a strong interest in real "self-service analytics." We knew it would be much more complicated than it seemed, but the opportunity felt too good to pass up. So Lei and I left The Shire and embarked on our journey to create our vision: Fabi.ai.

This post isn’t about our product itself (though, if you’re curious, you can read more about how some of the ideas below informed our recent product work here). Instead, I wanted to share the core learnings we’ve collected from working with LLMs for data analysis along our journey.

Note: This journey is woefully lacking in wizards and epic Middle-earth battles. 🧙

Why use AI for self-service analytics?

We won’t linger on the “why” too long. If you’re reading this, you likely fall into one of two groups:

  1. You wish you had self-service analytics available and don’t want to always wait on your data team.
  2. You’re on the data team and have been hearing about how AI is going to solve your ad hoc request problem.

Setting aside concerns about the role of data analysts and scientists, the idea of an all-knowing AI that can answer any question about an organization’s data sounds nice. Or at least, it sounds nice for the organization and its business leaders, whose creativity for new ways of asking questions knows no bounds. This AI could be the solution to creating a “data-driven” organization where every leader leans on empirical evidence to make strategic decisions, and all at a fraction of the usual cost. Finally! Organizations can capitalize on that “new oil” they’ve been hearing about since 2010.

But if this is such a valuable problem to solve and AI has gotten so good, why has no product actually solved it thus far?

Why AI for self-service analytics has failed thus far

Recent industry surveys paint a complex picture of AI adoption in the enterprise: 61% of companies are trying out AI agents, yet many worry about reliability and security, and 21% of organizations don’t use them at all. These hesitations are felt particularly strongly within data teams, where accuracy and trustworthiness are mission-critical to our ability to do our work.

Adopters of AI, especially in the enterprise, hold the technology to a high bar. In the context of data analytics and the self-serve dream, we expect our AI tooling to:

  1. Provide insights: Tables and charts are great, but they’re only a subset of what one might call “insights”. Insights are “Aha!” moments that come from spotting things in your data that run counter to your intuition and that you would not otherwise have considered. Sometimes a SQL query or a pivot can surface these insights, but generally it feels much more like finding a needle in a haystack.
  2. Work reliably nearly 100% of the time: The only thing worse than no data is bad data. If the AI can’t be trusted or hallucinates answers and data, that spells bad news for everyone. When the AI has the data, it should use it correctly; when it lacks the data, it should decline to answer (something LLMs are notoriously bad at).
  3. Be accessible to a wide range of technical skill sets: The beauty of LLMs is that you can interact with them the way you would with a coworker over Slack. You can use vague language, and the other party can likely understand your request in its business context. Conversely, the more a system requires exact terms in an exact form, the less accessible it is, and the more training and reinforcement it requires, which, as we all know, can be challenging.

Sadly, most current solutions use a traditional monolithic AI framework, which often fails to meet these expectations. Over the past few years, the Fabi.ai team and I worked hard on this problem, building prototypes for the enterprise and exploring many options. In the end, we concluded that neither Retrieval-Augmented Generation (RAG) nor fine-tuning could fix the problem within the monolithic framework.

When we tested this approach, a few things became clear to us:

After looking at these issues, we thought about how to make AI adapt better to problems. That’s when AI agents came into play and solidified this concept for us.

The future: Agent meshes

The minute we laid eyes on agentic frameworks, we knew they would change the game. We could suddenly let the AI decide how to answer questions, work through steps, and troubleshoot on its own. If the AI writes a SQL query that misses null values in the “Account type” field, it can dry-run the query, spot the error, and fix it itself. But what if we could take this a step further and let the AI operate mostly in Python and leverage LLMs? Then the AI does more than pull data: it can use Python packages or LLM calls to surface outliers, trends, or unique insights that you would normally have to hunt for manually.
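That write-check-repair loop can be sketched in a few lines. This is a minimal illustration, not Fabi.ai’s implementation: `llm_fix_query` is a hypothetical stand-in for the real LLM call, and SQLite’s `EXPLAIN` serves as the cheap dry run that catches errors like a misspelled column before execution.

```python
import sqlite3

def dry_run(conn, query):
    """EXPLAIN parses and plans the query without executing it; return the error, if any."""
    try:
        conn.execute("EXPLAIN " + query)
        return None
    except sqlite3.Error as e:
        return str(e)

def llm_fix_query(query, error):
    """Hypothetical stand-in for an LLM call that rewrites a failing query.

    A real system would prompt the model with the schema and the error message."""
    return query.replace("acount_type", "account_type")

def run_with_retries(conn, query, max_attempts=3):
    """Dry-run the query; on failure, let the 'LLM' repair it and try again."""
    for _ in range(max_attempts):
        error = dry_run(conn, query)
        if error is None:
            return conn.execute(query).fetchall()
        query = llm_fix_query(query, error)
    raise RuntimeError(f"Could not repair query: {error}")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER, account_type TEXT)")
conn.execute("INSERT INTO accounts VALUES (1, 'enterprise'), (2, NULL)")

# The misspelled column triggers one round of self-correction before the query runs.
rows = run_with_retries(conn, "SELECT id FROM accounts WHERE acount_type IS NOT NULL")
```

The key point is that the agent gets a feedback signal (the dry-run error) it can act on autonomously, rather than handing a broken query back to the user.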

But we still had one problem: messy enterprise data. In theory, organizations could solve this with strong data engineering practices, like a medallion architecture and a strict semantic layer. In practice, we rarely found organizations that actually did this. Most run on spreadsheets, half-baked tables, and ever-changing data models. From there, we came up with the idea of specialized AI agents that can be built quickly to answer a specific set of questions.

As companies grow, they handle more data and have more users. The agent mesh idea helps balance quick decision-making with the control needed for governance. Specialized agents help set clear boundaries and responsibilities for each AI. They also create a scalable way for agents to communicate. Plus, they can help manage resources efficiently across teams and companies.

Specialized AI agents

The idea behind a specialized agent is that it can and will only answer questions about a very tightly defined dataset. For example, you can create and launch an AI agent that answers questions about marketing campaigns, another that answers questions about the marketing pipeline, and so on.
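A toy sketch of that scoping contract, under loud assumptions: the keyword-based `in_scope` check and the canned `answer_fn` are placeholders (a production system would use the LLM itself, or embeddings, to classify scope, and real analysis to answer). The point is only the shape: out-of-scope questions get a refusal, never a guess.

```python
class ScopedAgent:
    """An agent that only answers questions within its tightly defined domain."""

    def __init__(self, name, topics, answer_fn):
        self.name = name
        self.topics = topics        # terms that define this agent's domain (toy scope check)
        self.answer_fn = answer_fn  # callable that answers in-scope questions

    def in_scope(self, question):
        q = question.lower()
        return any(topic in q for topic in self.topics)

    def ask(self, question):
        # Refusing out-of-scope questions is what lets builders share the agent safely.
        if not self.in_scope(question):
            return f"{self.name}: out of scope, please ask another agent."
        return self.answer_fn(question)

campaign_agent = ScopedAgent(
    "Marketing campaign agent",
    topics=["campaign", "ad spend"],
    answer_fn=lambda q: "Campaign ROAS last quarter was 3.2x",  # stand-in for real analysis
)
```

Here `campaign_agent.ask("What was our campaign ROAS?")` returns the canned answer, while a pipeline question gets the out-of-scope refusal.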

We recently launched Agent Analyst using this architecture, and the early signs are very promising. When the datasets are carefully curated and at the right level of granularity, these agents can answer a specific set of questions extremely reliably. The builder of these agents can share them with non-technical users and rest easy knowing that the AI won’t answer questions that are out of scope.

There’s just one flaw: users need to know which agent to ask which question. It’s like needing to know the right marketing analyst to direct a question to, versus asking a general question that someone on the team can route to the right person. This is where the concept of an “agent mesh” comes into play.

Connecting agents together

If a single agent can reliably answer domain-specific questions, then why not let agents talk to each other? Why can’t the marketing campaign agent, for example, ask the pipeline agent directly whether it can answer a question more easily? We believe it should be able to. In fact, we think the future holds networks of agents with a hierarchical structure: picture a “GTM agent” that calls a “Marketing agent,” which in turn calls both a “Pipeline agent” and a “Marketing campaign agent.”
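The hierarchy above can be sketched as routers over leaf agents. This is an illustrative skeleton, not a real protocol: the agent names come from the example in the text, and the `can_answer` keyword check is a placeholder for whatever scope signal real agents would expose to each other.

```python
class Agent:
    """Leaf agent: answers questions within its own domain."""

    def __init__(self, name, topics):
        self.name = name
        self.topics = topics

    def can_answer(self, question):
        return any(t in question.lower() for t in self.topics)

    def ask(self, question):
        return f"{self.name} answered: {question}"

class RouterAgent(Agent):
    """Parent agent: delegates to the first child that claims the question."""

    def __init__(self, name, children):
        super().__init__(name, topics=[])
        self.children = children

    def can_answer(self, question):
        # A router is "in scope" if any agent below it is.
        return any(c.can_answer(question) for c in self.children)

    def ask(self, question):
        for child in self.children:
            if child.can_answer(question):
                return child.ask(question)
        return f"{self.name}: no child agent can answer this."

# Hierarchy from the example: GTM -> Marketing -> {Pipeline, Marketing campaign}
pipeline = Agent("Pipeline agent", ["pipeline"])
campaigns = Agent("Marketing campaign agent", ["campaign"])
marketing = RouterAgent("Marketing agent", [pipeline, campaigns])
gtm = RouterAgent("GTM agent", [marketing])
```

A user only ever talks to the top-level agent; `gtm.ask("How is the pipeline trending?")` gets routed down to the Pipeline agent, and a question no leaf claims comes back as unanswerable rather than hallucinated.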

This idea resembles a broader concept floating around the AI community known as the “Internet of Agents”: a future where AI agents collaborate smoothly across organizations while keeping security and trust intact.

This mesh approach offers a few key advantages over a monolithic AI (on a pristine semantic layer):

At the end of the day, the mesh idea isn’t novel. It mirrors the mixture-of-experts concept, which has been shown to improve accuracy for LLMs; we’re simply taking that same idea and bringing it to AI agents.

Technical challenges of agent meshes

At Fabi.ai, we still have a long way to go as we build out the Agent Analyst mesh, but we’ve already overcome some of the big technical infrastructure challenges along the way.

AI data analyst agents need a unique architecture: one that lets them use Python or LLMs to answer questions, stay in sync with data sources, and fit into collaborative platforms, all while remaining secure and scalable. Each agent needs to operate in its own Python kernel, which must spin up or down quickly to reduce costs and stay in sync with the source data.
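A minimal sketch of that kernel lifecycle, assuming a toy in-process model: `KernelPool` and its idle-reaping policy are hypothetical names, and the dictionary entries stand in for real containers or processes that a production system would launch and tear down.

```python
import time

class KernelPool:
    """Tracks one (simulated) Python kernel per agent, reclaiming idle ones to save cost."""

    def __init__(self, idle_timeout=300.0):
        self.idle_timeout = idle_timeout
        self.kernels = {}  # agent_id -> last-used timestamp

    def get_kernel(self, agent_id):
        # Spin up a kernel on first use and refresh its last-used time on every call.
        # A real system would launch an isolated container or process here.
        self.kernels[agent_id] = time.monotonic()
        return agent_id

    def reap_idle(self):
        """Shut down kernels idle longer than the timeout; return which were reaped."""
        now = time.monotonic()
        idle = [a for a, last in self.kernels.items() if now - last > self.idle_timeout]
        for agent_id in idle:
            del self.kernels[agent_id]  # a real system would tear down the container here
        return idle

pool = KernelPool(idle_timeout=0.0)  # zero timeout just to demonstrate reaping
pool.get_kernel("marketing-agent")
time.sleep(0.01)
reaped = pool.reap_idle()
```

The isolation per agent is what prevents one agent's state or credentials from leaking into another's environment; the reaping policy is what keeps a mesh of many agents affordable.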

Architectures that don’t provide individual kernels to each agent can run into one of the following risks:

The challenge of building this type of platform is as much an AI challenge as it is a DevOps challenge.

Looking ahead: Embracing specialized, governed AI agents in data

As enterprise companies manage more AI applications in their operations, they need specialized and well-governed approaches. The agent mesh framework uses specialized AI data agents as a means for scaling AI in data analytics. This approach keeps security, reliability, and performance intact.

We might have expected AI to be everywhere by now, answering most data questions. But if we look closely, the progress in just two years since ChatGPT launched is impressive. We still have much to learn on this journey, but in my mind, agents and agent mesh frameworks will be key to enterprise AI.