
Data visualization is the final step in analytics, but also one of the most neglected. As two people deeply embedded in the business intelligence and AI space, we’ve watched dashboards turn into forests of bar charts and pie charts, often not because those charts fit the data best, but because they're familiar.

That got us thinking:

Can an AI recommend the best chart for any dataset automatically?

We teamed up to explore this idea. One of us focused on rule-based heuristics and data profiling logic; the other (shoutout to my co-author!) led the charge on LLM-driven enhancements. Together, we built a working prototype that marries structure with semantics.

This article is our build-in-public walkthrough: architecture, code, logic, and all.

The Problem with Traditional BI Charting

Let’s be honest: most BI tools (Tableau, Power BI, Looker, etc.) offer chart variety, but not guidance. They assume the user:

Knows what chart suits their data
Understands data types
Can infer the story the chart should tell

In reality:

Users default to safe/familiar charts
Visuals often misrepresent the data
Stakeholders misread them

We needed a tool that doesn’t just draw but thinks.

What If AI Picked the Chart?

The idea hit during a dashboard review: Why use a line chart for this categorical data?

Boom! Opportunity spotted.

What if a system could:

Read your CSV
Profile columns
Understand what you want to analyze
Recommend the most effective chart

Our vision: Upload your data → Get a tailored chart suggestion (with explanation)

This would empower non-technical users and speed up analysts.

How We Designed It

We split the engine into three layers:

Data Profiling Layer

This step scans the dataset to identify column types, cardinality, nulls, and more.

import pandas as pd

def profile_data(df):
    """Summarize each column: dtype, distinct-value count, and null count."""
    profile = []
    for col in df.columns:
        dtype = df[col].dtype
        unique_vals = df[col].nunique()        # distinct non-null values
        null_count = int(df[col].isnull().sum())
        profile.append({
            "column": col,
            "dtype": str(dtype),
            "unique_values": unique_vals,
            "nulls": null_count
        })
    return pd.DataFrame(profile)

Output:

column	dtype	unique_values	nulls
Region	object	5	0
Sales	float64	1000	3
Created_At	datetime64[ns]	12	0
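
Raw dtypes like these still have to be translated into roles a chart rule can reason about. Here is a minimal sketch of one way to do that, working on profile rows shaped like the output above. The function name `classify_role`, the role labels, and the `max_categories` cutoff of 20 are illustrative assumptions, not part of our prototype.

```python
def classify_role(dtype: str, unique_values: int, max_categories: int = 20) -> str:
    """Map a pandas dtype string plus cardinality to a coarse column role."""
    dtype = dtype.lower()
    if "datetime" in dtype:
        return "temporal"
    if "bool" in dtype:
        return "categorical"
    if "int" in dtype or "float" in dtype:
        # Low-cardinality integers (ratings, codes) often behave like categories
        if "int" in dtype and unique_values <= max_categories:
            return "categorical"
        return "numeric"
    if dtype in ("object", "category", "string"):
        # Free text or IDs have too many distinct values to chart as categories
        return "categorical" if unique_values <= max_categories else "text_or_id"
    return "unknown"


# Rows mirror the sample profile output (column, dtype, unique_values)
profile_rows = [
    {"column": "Region", "dtype": "object", "unique_values": 5},
    {"column": "Sales", "dtype": "float64", "unique_values": 1000},
    {"column": "Created_At", "dtype": "datetime64[ns]", "unique_values": 12},
]
roles = {row["column"]: classify_role(row["dtype"], row["unique_values"]) for row in profile_rows}
print(roles)  # {'Region': 'categorical', 'Sales': 'numeric', 'Created_At': 'temporal'}
```

The cutoff is a judgment call: a column with thousands of distinct strings is probably an ID or free text, and charting it as a category would produce an unreadable axis.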

Rule-Based & Heuristic Engine

We hardcoded rules based on common patterns. This covered 70–80% of cases.

def suggest_chart(df):
    """Pick a chart type from simple rules over the column profile."""
    profile = profile_data(df)

    # Split columns into categorical (object) and numeric (int/float) by dtype
    cat_cols = profile[profile['dtype'] == 'object']['column'].tolist()
    num_cols = profile[profile['dtype'].str.contains('float|int')]['column'].tolist()

    # Rules are checked in order; the first match wins
    if len(cat_cols) == 1 and len(num_cols) == 1:
        return f"Bar chart recommended for {cat_cols[0]} vs {num_cols[0]}"
    elif len(num_cols) == 2:
        return f"Scatter plot recommended for {num_cols[0]} vs {num_cols[1]}"
    elif any('date' in str(dtype).lower() for dtype in df.dtypes):
        # Matches datetime64 columns; dates stored as strings are not detected
        return "Line chart recommended for time series visualization"
    else:
        return "Default to table or manual selection"

These rules acted as guardrails: rigid, but fast and reliable.
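
Because the first matching rule wins, the order of the checks decides what happens when a dataset fits more than one pattern. One way to make that precedence explicit is to store the rules as an ordered table. The sketch below is a variation rather than our prototype code: the `Rule` structure, the role labels, and the choice to check time series first are all assumptions made for illustration.

```python
from typing import Callable, NamedTuple


class Rule(NamedTuple):
    name: str
    matches: Callable[[dict], bool]  # receives counts of columns per role
    chart: str


# Evaluated top to bottom; the first match wins, so order encodes priority
RULES = [
    Rule("time series", lambda c: c["temporal"] >= 1 and c["numeric"] >= 1, "line chart"),
    Rule("category vs value", lambda c: c["categorical"] == 1 and c["numeric"] == 1, "bar chart"),
    Rule("two measures", lambda c: c["numeric"] == 2, "scatter plot"),
]


def suggest_from_roles(roles: dict) -> str:
    """Return the chart for the first matching rule, or a fallback message."""
    counts = {"temporal": 0, "categorical": 0, "numeric": 0}
    for role in roles.values():
        if role in counts:
            counts[role] += 1
    for rule in RULES:
        if rule.matches(counts):
            return rule.chart
    return "table or manual selection"


# `roles` maps column name -> role label (hypothetical labels)
print(suggest_from_roles({"Region": "categorical", "Sales": "numeric"}))   # bar chart
print(suggest_from_roles({"Created_At": "temporal", "Sales": "numeric"}))  # line chart
print(suggest_from_roles({"notes": "text_or_id"}))                         # table or manual selection
```

Adding a rule becomes a one-line change, and reordering the list is enough to change which pattern takes priority.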

LLM-Enhanced Semantic Layer (Led by Co-Author!)

We added an LLM layer that interprets column names and user goals using natural language.

from openai import OpenAI

# Replace the placeholder with your own key (or load it from an environment variable)
client = OpenAI(api_key="API_KEY_HERE")

def get_llm_chart_suggestion(columns):
    user_prompt = f"""
      You are given a dataset or a description of data. Your task is to recommend the single most suitable type of chart or visualization to effectively represent the data. Your recommendation should:

      - Be limited to one concise sentence.
      - Focus on clarity and effectiveness of communication, based on the data structure and use case.
      - Take into account:
        - The type of data (categorical, numerical, time series, geographical, etc.)
        - The number of variables (univariate, bivariate, multivariate)
        - The intended analytical goal (e.g., comparison, distribution, trend over time, composition, correlation, ranking, or anomaly detection)
        - The audience if mentioned (e.g., general public, business analysts, data scientists)
        - The medium if known (e.g., slide, dashboard, report, mobile screen)

      Avoid generating the chart or describing how to build it. Just recommend the name of the chart type (e.g., bar chart, line chart, pie chart, histogram, box plot, scatter plot, bubble chart, heatmap, treemap, choropleth map, etc.) that best fits the scenario. If more than one chart could be appropriate, choose the most effective and commonly accepted option.
      Data is: {columns}
      """
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # or "gpt-3.5-turbo"
        messages=[
            {"role": "system", "content": "You are a data visualization expert."},
            {"role": "user", "content": user_prompt},
        ],
    )

    # Print the one-sentence recommendation and return it so callers can use it
    suggestion = response.choices[0].message.content
    print(suggestion)
    return suggestion

Sample result:

“Use a stacked bar chart to compare revenue by product line across regions.”

This LLM-backed logic helped in ambiguous cases where rule-based logic struggled.
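
That division of labor (rules first, the LLM only when the rules give up) could be wired together roughly as follows. This is a sketch under assumptions: `recommend` and its two injected callables are hypothetical names, and it assumes the rule engine signals "no match" with a message that starts with "Default". The stand-in lambdas let it run without an API key.

```python
from typing import Callable, Optional


def recommend(
    rule_fn: Callable[[], str],
    llm_fn: Optional[Callable[[], str]] = None,
) -> dict:
    """Try the rule engine first; ask the LLM only when the rules give up."""
    rule_answer = rule_fn()
    if not rule_answer.startswith("Default"):
        return {"source": "rules", "suggestion": rule_answer}
    if llm_fn is not None:
        try:
            return {"source": "llm", "suggestion": llm_fn()}
        except Exception as exc:  # e.g. network errors, rate limits, bad keys
            return {"source": "rules", "suggestion": rule_answer, "error": str(exc)}
    return {"source": "rules", "suggestion": rule_answer}


# Stand-in callables so the sketch runs offline
result = recommend(
    rule_fn=lambda: "Default to table or manual selection",
    llm_fn=lambda: "Use a stacked bar chart.",
)
print(result)  # {'source': 'llm', 'suggestion': 'Use a stacked bar chart.'}
```

Recording the source of each suggestion makes it easier to see later how often the rules were enough, and falling back to the rule answer keeps the tool usable when the API call fails.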

Optional: Auto-Render Charts

We even added a quick render option:

import matplotlib.pyplot as plt

def plot_bar(df, category_col, value_col):
    # Sum the values per category, then draw one bar per category
    grouped = df.groupby(category_col)[value_col].sum()
    grouped.plot(kind='bar')
    plt.title(f'{value_col} by {category_col}')
    plt.show()
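
A renderer like this needs to know which chart was picked, and both the rule engine and the LLM answer in sentences. Here is a minimal sketch of pulling the chart type out of a recommendation. `KNOWN_CHARTS` and `extract_chart_type` are hypothetical names, and the list of chart types is only a sample.

```python
import re

# Longest names first so "stacked bar chart" wins over "bar chart"
KNOWN_CHARTS = sorted(
    ["bar chart", "stacked bar chart", "line chart", "scatter plot",
     "pie chart", "histogram", "box plot", "heatmap"],
    key=len,
    reverse=True,
)


def extract_chart_type(recommendation: str):
    """Return the longest known chart name that appears in the text, or None."""
    text = recommendation.lower()
    for chart in KNOWN_CHARTS:
        if re.search(r"\b" + re.escape(chart) + r"\b", text):
            return chart
    return None


print(extract_chart_type("Use a stacked bar chart to compare revenue by product line across regions."))
# stacked bar chart
print(extract_chart_type("Default to table or manual selection"))
# None
```

A `None` result is a useful signal in its own right: it means there is nothing to auto-render, and the user should choose manually.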

What This Engine Does

Reads your dataset
Profiles its columns (types, cardinality, nulls)
Suggests a chart using logic + LLM
Optionally plots a quick chart

No more guessing. No more mismatched visuals. Just context-aware charting.

Challenges We Faced

Here’s where we struggled:

Ambiguous Column Names: With names like val1, x2, or abc123, LLMs helped, but only so much.
Overlapping Chart Options: Bar vs. Stacked Bar vs. Line? Context is everything.
Visualization Best Practices: Chart selection ≠ chart quality. Avoiding “chart junk” is another layer.
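
For the cryptic-name problem, one mitigation worth trying is to send the model a few sample values along with each column name, so it has more than "val1" to go on. The sketch below illustrates that idea and is not something the prototype is confirmed to do. `describe_columns` is a hypothetical helper, and it assumes the data is available as a list of row dictionaries.

```python
def describe_columns(rows: list, max_samples: int = 3) -> str:
    """Summarize each column as 'name (type): sample values' for an LLM prompt."""
    if not rows:
        return ""
    lines = []
    for name in rows[0]:
        values = [row[name] for row in rows if row.get(name) is not None]
        # Use the Python type of the first non-null value as a rough type hint
        type_name = type(values[0]).__name__ if values else "unknown"
        # Keep a few distinct values, preserving first-seen order
        samples = list(dict.fromkeys(values))[:max_samples]
        lines.append(f"{name} ({type_name}): {', '.join(map(str, samples))}")
    return "\n".join(lines)


rows = [
    {"val1": "North", "x2": 120.5},
    {"val1": "South", "x2": 98.0},
    {"val1": "North", "x2": 143.2},
]
print(describe_columns(rows))
# val1 (str): North, South
# x2 (float): 120.5, 98.0, 143.2
```

Sample values leave the dataset when they go into a prompt, so sensitive columns would need to be masked or skipped first.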

What’s Next?

Here’s how we plan to level this up:

Fine-tune LLMs with real-world datasets
Package this as a Power BI or Tableau extension
Add a feedback loop: "Was this chart helpful?"
Build a drag-and-drop UI (upload → chart preview)

Eventually, we want this to become your AI visualization assistant.

Key Takeaways

Most BI tools assume users know which chart to pick. Often, users don’t.
Rules + heuristics handle most cases. LLMs handle the rest.
The right chart = better decisions, better stories, better outcomes.
Automation can make visualization accessible to all.

Bonus: Try It Yourself

We’re open-sourcing the prototype soon on GitHub.

Want to collaborate, test it out, or use it in your BI workflows?

Reach out. Fork. Contribute. Or just tell us what you’d improve.

We Built an AI Engine That Picks the Best Chart for Your Data