A dramatic story about expectations, engines, and the brutal truth hidden beneath marketing slides.
When Microsoft announced Direct Lake, it felt like a moment.
A promise.
A quiet revolution.
“No more refreshes.”
“Instant performance.”
“Lakehouse-native analytics.”
Every BI engineer heard the same whisper: This is the end of Import mode.
So, I did what any responsible BI architect would do.
I asked the question nobody seemed eager to answer: “Is Direct Lake actually faster?”
Then I took a 99-million-row, production-grade table, built the same semantic model in three different modes - Import, Direct Lake, and DirectQuery - and put all of them through the same Performance Analyzer tests.
It didn’t take long for the truth to reveal itself. And it wasn’t what any of us expected.
Act I - The Hype Meets Reality
The test began the same way every “modern BI transformation” begins:
A quiet morning.
An empty dashboard.
A promise of instant insights.
I opened the Direct Lake report. The loading spinner pulsed once… twice…
Then it kept spinning.
Meanwhile, the Import mode copy - same model, same measures, same visuals - loaded like a sprinting cheetah: 150–200 ms on almost every visual.
Direct Lake wasn’t broken. It was working exactly as designed. Just not the way people imagined.
And that’s the moment I realized: This isn’t a storage problem. This is an engine problem.
Industry experts have long warned about this dynamic.
As SQLBI frequently explains in their performance guidance:
Import is fastest because it moves the work from query time to refresh time. Anything computed at query time will always be slower.
That was exactly what I was seeing.
Act II - The Engines Behind the Curtain
To understand why performance diverged so dramatically, you need to understand something vendors rarely explain:
Power BI isn’t one engine.
It’s three.
| VertiPaq Storage Engine (the columnar in-memory beast) | Formula Engine (the DAX brain) | The IO Layer (the bridge to anything outside memory) |
|---|---|---|
| Compresses data, executes vectorized scans, runs calculations inside memory, loves big fact tables, hates nothing | Coordinates queries, optimizes logic, translates DAX into storage operations | Parquet, SQL, remote engines |

Microsoft’s own documentation highlights this advantage: “VertiPaq achieves high performance through pre-compression, encoding, and in-memory scanning, enabling extremely fast query retrieval.”

SQLBI has written extensively about this distinction: “The Formula Engine only orchestrates; the Storage Engine does the heavy lifting. Slow performance usually means the SE is scanning too much data.”

Databricks, Snowflake, and BigQuery engineers have all echoed the same principle: “Querying parquet directly introduces IO and metadata overhead, especially with large file counts or non-optimized partitions.”
Each mode activates these differently.
Import Mode
Everything is pre-loaded into VertiPaq memory.
Every query becomes:
DAX → Formula Engine → VertiPaq → Instant
This is why Import visuals feel like they snap into existence.
Direct Lake
Marketing says “VertiPaq-like performance without refreshes.”
But in practice, it runs like:
DAX → Formula Engine → Parquet Scans → Partial In-Memory Caching
And it comes with limitations that Fabric’s own documentation acknowledges:
Direct Lake performance depends heavily on parquet organization, file sizes, and caching behavior. Complex models may not achieve Import-level performance in all cases.
No Import-style pre-compression. No pre-aggregation. No guaranteed warm cache.
Every time a measure touches a column, the engine might need to rehydrate Parquet segments.
For simple models? Fine.
For star schemas, snapshot tables, and 99M rows?
You’re in for a surprise.
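To make the "rehydration" cost concrete, here is a stdlib-only sketch. It is my own analogy, not the Fabric engine: a single integer column is persisted to disk (standing in for a parquet segment), and the same scan is timed cold (read from disk, then decode, then scan) versus warm (column already resident in memory). The file name `segment.bin` is illustrative.

```python
# My own analogy, NOT the Fabric engine: why an on-demand column read
# costs more than a scan over data that is already in memory.
import array
import time

n = 2_000_000
column = array.array("q", range(n))  # one "fact table" column of int64s

# Persist the column, standing in for a parquet segment in the lakehouse.
with open("segment.bin", "wb") as f:
    column.tofile(f)

# Cold path: open the file, load the bytes, decode, then scan.
t0 = time.perf_counter()
loaded = array.array("q")
with open("segment.bin", "rb") as f:
    loaded.fromfile(f, n)
cold_total = sum(loaded)
cold_ms = (time.perf_counter() - t0) * 1000

# Warm path: the column is already resident, so only the scan remains.
t0 = time.perf_counter()
warm_total = sum(column)
warm_ms = (time.perf_counter() - t0) * 1000

print(f"cold read+scan: {cold_ms:.1f} ms | warm scan: {warm_ms:.1f} ms")
assert cold_total == warm_total == n * (n - 1) // 2
```

On a warm cache, Direct Lake narrows the gap; the cold path is where the penalty shows, exactly as the cold/warm split in my measurements suggests.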
DirectQuery
The most honest mode. It doesn’t pretend to be fast.
DAX → SQL Translation → Remote Warehouse → Network → Wait
As SQLBI notes:
DirectQuery performance is limited by the database engine, network latency, and the inability to push down many DAX operations.
You’re not interacting with Power BI.
You’re interacting with a database wearing Power BI makeup.
Act III - The Numbers That Broke the Illusion
Here’s the performance comparison table from my real tests.
Same visuals.
Same measures.
Same F-SKU capacity.
Same model.
Performance Analyzer: 99M Row Test
| Metric | Import Mode | Direct Lake | DirectQuery |
|---|---|---|---|
| Initial Visual Load | 120–250 ms | 450–1,100 ms (cold) / 300–600 ms (warm) | 900–2,500 ms |
| Page Load (5 visuals) | 700–1,200 ms | 1.8–2.9 s | 4.5–8.0 s |
| DAX Query Time | 80–180 ms | 220–650 ms | 600–1,200 ms |
| Query Wait Time | 10–30 ms | 180–500 ms | 400–1,000 ms |
| “Other” Overhead | 20–50 ms | 150–300 ms | 100–200 ms |
| Slicer Interaction | 90–200 ms | 350–700 ms | 800–1,600 ms |
| Drill-through | 300–500 ms | 700–1,400 ms | 1.5–3.0 s |
| High Concurrency | ⭐ Stable | ⚠️ Noticeable slowdown | ❌ Heavy queuing |
| Overall Verdict | Fastest, most consistent | Good but inconsistent; depends on cache & IO | Slowest; real-time only |

*ms = milliseconds
As Microsoft MVPs often remind us:
Import remains the gold standard for performance. All other modes exist for use cases where Import isn’t feasible.
Act IV - Why Direct Lake Behaves This Way
Direct Lake is not “bad.” It is misunderstood.
Its promises are real:
- No refresh
- No duplication
- Lakehouse-native
- Near real-time
But the performance story is more complicated.
| Direct Lake shines when | Direct Lake struggles when |
|---|---|
| Models are simple; parquet files are optimized; data is low-cardinality; relationship chains are minimal; DAX is straightforward | You have snapshot fact tables; high-cardinality joins; multi-level relationships; complex measures; heavy CALCULATE logic; 50M–300M+ rows; large concurrency bursts |
Because in all those scenarios, the engine has to “do work on demand” that Import mode already did during refresh.
Fabric documentation itself states:
Direct Lake performance is influenced by parquet file sizes, partitioning, caching status, and data model complexity.
Import pre-computes. Direct Lake computes as needed.
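That one sentence is the whole trade-off, and it can be shown in a toy sketch. This is my own illustration, not Power BI internals: an "Import-style" path aggregates once up front (refresh time) so every query is a lookup, while an "on-demand" path re-scans the source rows on every query. The row data and function names are invented for the example.

```python
# A toy illustration (my own, NOT Power BI internals) of the trade-off:
# Import pays the cost once at refresh; on-demand modes pay at every query.
from collections import defaultdict

rows = [("East", 100), ("West", 250), ("East", 50), ("West", 25)]

# "Import" style: aggregate once during refresh, then queries are lookups.
refresh_cache = defaultdict(int)
for region, amount in rows:          # work done at refresh time
    refresh_cache[region] += amount

def query_import(region):
    return refresh_cache[region]     # query time: O(1) lookup

# "On-demand" style: every query re-scans the source rows.
def query_on_demand(region):
    return sum(a for r, a in rows if r == region)  # query time: full scan

# Same answers either way; only WHEN the work happens differs.
assert query_import("East") == query_on_demand("East") == 150
assert query_import("West") == query_on_demand("West") == 275
```

Both paths return identical results; the difference the Performance Analyzer numbers capture is purely where in the lifecycle the computation lands.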
Act V - The Comparison That Engineers Actually Need
Below is the Storage Mode Decision Table.
This is the part that architects screenshot and send to their teams.
Storage Mode Comparison Table (Advantages, Limitations, Use Cases)
| Category | Import | Direct Lake | DirectQuery |
|---|---|---|---|
| Performance | ⭐ Fastest | ⚠️ Inconsistent | ❌ Slowest |
| Advantages | Best speed; best DAX; stable | No refresh; near real time; lake-native | True real time; no duplication |
| Limitations | Refresh required; memory limits | IO overhead; depends on parquet & F-SKU | Latency; warehouse bottlenecks |
| Best Use Cases | Large fact tables; fast dashboards; heavy DAX | Near-real-time lakehouse analytics; simple models | Operational reporting; order lookups |
| Avoid When | You need second-level data freshness | Complex models; cold cache; high cardinality | Any real dashboarding scenario |
| Concurrency | ⭐ Excellent | ⚠️ Moderate | ❌ Poor |
| Ideal Model Complexity | Medium–High | Low–Medium | Low |
| Cold Start | Minimal | High penalty | Always cold |
This aligns with SQLBI guidance:
Anything deferred until query time will always be slower.
Act VI - The Plot Twist: Row Count Was Never the Problem
This is the part nobody expects:
Direct Lake isn’t slow because of the 99 million rows.
Import mode handles 99M rows like a joke.
Direct Lake slows because of:
- Semantic relationships
- Cardinality
- DAX design
- Parquet partitioning
- File count and size
- Metadata reads
- Concurrency pressure
- F-SKU throttling
This matches industry lakehouse research:
Parquet performance degrades with high file counts, small files, or non-optimal partitioning - regardless of engine.
The performance bottleneck isn't the data. It's the engine’s relationship to the data.
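The small-files effect is easy to demonstrate without any lakehouse at all. The stdlib sketch below is my own analogy, not a parquet benchmark: the same one megabyte is read back from a single file and from 1,000 small files, where each extra file adds its own open/metadata overhead. File names (`one_file.bin`, `many_parts/`) are invented for the example.

```python
# A stdlib sketch (my analogy, NOT a parquet benchmark) of the small-files
# problem: identical bytes cost more to read when split across many files.
import os
import time

payload = b"x" * 1024  # 1 KB of data
os.makedirs("many_parts", exist_ok=True)

# The same 1 MB stored two ways: one file vs. 1,000 small files.
with open("one_file.bin", "wb") as f:
    f.write(payload * 1000)
for i in range(1000):
    with open(f"many_parts/part_{i}.bin", "wb") as f:
        f.write(payload)

def read(path):
    with open(path, "rb") as f:
        return f.read()

t0 = time.perf_counter()
big = read("one_file.bin")
one_ms = (time.perf_counter() - t0) * 1000

t0 = time.perf_counter()
small = b"".join(read(f"many_parts/part_{i}.bin") for i in range(1000))
many_ms = (time.perf_counter() - t0) * 1000

print(f"one file: {one_ms:.2f} ms | 1,000 files: {many_ms:.2f} ms")
assert big == small and len(big) == 1_024_000
```

The content is byte-for-byte identical; only the layout differs. That is the sense in which the bottleneck is the engine's relationship to the data, not the data itself.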
Final Act - The Lesson Every BI Team Must Learn
Import Mode is still the fastest and most reliable choice for complex BI models.
Direct Lake is powerful, promising, and absolutely part of the future - just not the universal answer today.
SQLBI, Fabric documentation, MVPs, and lakehouse engineering research all point toward the same conclusion:
Import is the performance baseline. All other modes trade consistency for flexibility.
But here’s the part that matters:
This is not a verdict. This is a point-in-time observation.
Microsoft Fabric is shipping improvements rapidly.
The Direct Lake engine is evolving month by month.
Some of the gaps I measured will absolutely shrink.
This article reflects my experience, my testing, and the current state of the platform - not its final form.
And I’m committed to watching this story unfold.
Innovation never arrives all at once. It arrives as a series of small, significant steps, and Direct Lake is one of them.
I’ll continue testing.
I’ll continue measuring.
And I’ll share updates as Fabric evolves.
The story of Direct Lake is just beginning. And that’s the exciting part.
About the Author
Rupesh Ghosh works as a Lead Business Intelligence Engineer at Total Wine & More while pursuing his Executive PhD in Business at the University of Cumberlands. His research interests include data governance, decision intelligence, project management, and cost-effective BI analytics.