If you've ever pasted proprietary code into an AI assistant at 2 a.m., you've already lived the tension confidential AI is designed to resolve. AI spread through organizations the way spreadsheets did: quietly, everywhere, and faster than governance can keep up. Teams route contracts, customer tickets, and code through model endpoints because it works, not because anyone has verified where that data actually goes.

This convenience widens the blast radius of a breach. The average global breach cost rose to $4.88M in 2024, up 10% from the previous year. In H1 2025 alone, 1,732 breaches exposed over 165 million records.

The thing is, traditional security handles two states well: data at rest and data in transit. The third state, data in use, is where it falls short. While data is being processed, it is typically visible to the operating system, the hypervisor, and the people who operate them. Confidential computing goes after this third state: it keeps information encrypted in memory and processes it only inside a Trusted Execution Environment (TEE), a hardware-isolated enclave that even the operators can't peek into.

The verification piece is the hinge. If you can remotely prove where computation happened and confirm no one tampered with it, "trust" becomes an inspectable property of the stack.
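As a concrete (if simplified) illustration, here is what treating trust as an inspectable property might look like on the client side: before sending anything sensitive, the application checks an attestation document against an explicit policy. The field names and values below are placeholders rather than any specific vendor's format, and real attestation also involves verifying a signature chain back to the hardware vendor, which a later sketch touches on.

```python
# Minimal sketch: before sending a prompt, fetch the service's attestation
# document and check it against an explicit policy. Field names are illustrative.

EXPECTED_MEASUREMENT = "a3f1..."  # hash of the approved enclave image (placeholder)

def verify_attestation(doc: dict) -> bool:
    """Return True only if the attestation matches our policy."""
    checks = [
        doc.get("tee_type") in {"intel_tdx", "amd_sev_snp", "nvidia_cc"},  # genuine TEE
        doc.get("measurement") == EXPECTED_MEASUREMENT,                    # code we approved
        doc.get("debug_enabled") is False,                                 # no debug backdoor
    ]
    return all(checks)

attestation = {"tee_type": "intel_tdx", "measurement": "a3f1...", "debug_enabled": False}
if not verify_attestation(attestation):
    raise RuntimeError("Refusing to send data: enclave failed attestation policy")
```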

The Gap That Policies Can't Close

Developers are shipping faster than ever, and AI is the accelerant. But rapid, AI-assisted output can be hard to audit and easy to deploy with invisible risk, especially when provenance and accountability are fuzzy. That rough draft your AI assistant generated? It doesn't fully internalize your security constraints. It can't.

Keysight's recent research on PII disclosure flags how easily sensitive information (the kind companies least want appearing in logs or third-party systems) can leak through prompts and model outputs. Security teams are now treating "what leaks in prompts" as something measurable, attackable, and increasingly regulated.

Meanwhile, credential theft remains stubbornly effective. In basic web application attacks, about 88% of breaches involve stolen credentials. If the modern breach often starts with "someone logged in," then pushing more sensitive work into more tools doesn't just add risk; it multiplies it.

In standard deployments, prompts and context must exist in plaintext somewhere to be processed. You can redact. You can filter. But somewhere in the stack, the data has to be readable, which means it's vulnerable.
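To make that limit concrete, here is a minimal sketch of the redact-and-filter approach, using deliberately simplistic, illustrative patterns rather than a production PII filter. Even when the obvious patterns are caught, whatever survives still reaches the model endpoint in plaintext.

```python
import re

# Application-layer mitigation: strip obvious PII patterns before the prompt
# leaves your service. The patterns are intentionally naive; the point is that
# redaction only removes what it recognizes.

PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "ssn":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "card":  re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def redact(prompt: str) -> str:
    for label, pattern in PATTERNS.items():
        prompt = pattern.sub(f"[{label.upper()} REDACTED]", prompt)
    return prompt

print(redact("Contact jane.doe@example.com, SSN 123-45-6789, re: contract terms."))
# Names, account details, and free-text context slip through regexes untouched,
# and the redacted prompt itself is still plaintext wherever it is processed.
```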

Confidential computing shifts that baseline by running workloads inside isolated hardware environments that keep data and code protected during execution. This reduces who (or what) can observe the process, including privileged infrastructure layers. It doesn't eliminate every application-layer risk, but it narrows an entire class of exposure.

Qatar's Bet: Confidential AI Gets Physical

The fastest way to tell whether a category is real is to watch where money and megawatts go. Confidential AI is now entering the language of facilities and power grids.

Qatar is building a confidential AI data center that represents one of the first physical manifestations of this shift. The facility will use confidential computing chips (including Nvidia hardware) to keep data encrypted through every stage of processing. AILO AI, MBK Holding, and OLLM are partnering on the project, with an initial $183M investment and ambitions to scale over time.

OLLM’s role particularly matters here because there’s a difference between claiming support for confidential AI and actually building the infrastructure required to support it. Their pitch is less about "a better model" and more about a safer way to access and run a growing catalog of models. Think of it as an AI gateway: one API that aggregates access to multiple providers and AI models while offering verifiable privacy guarantees.
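In practice, the gateway pattern looks something like the sketch below. The endpoint URL, model name, and response shape are assumptions for illustration (an OpenAI-style chat-completions layout), not OLLM's documented API; the point is a single API surface in front of many providers, with privacy guarantees attached to that surface rather than to each provider individually.

```python
import os
import requests

# Hypothetical gateway call: one endpoint, many providers behind it.
# URL, model name, and response fields below are placeholders.

GATEWAY_URL = "https://gateway.example.com/v1/chat/completions"  # placeholder endpoint

resp = requests.post(
    GATEWAY_URL,
    headers={"Authorization": f"Bearer {os.environ['GATEWAY_API_KEY']}"},
    json={
        "model": "some-provider/some-model",  # routed by the gateway; placeholder name
        "messages": [{"role": "user", "content": "Summarize this internal memo: ..."}],
    },
    timeout=30,
)
resp.raise_for_status()
body = resp.json()
print(body["choices"][0]["message"]["content"])  # assumes an OpenAI-style response shape
```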

"Confidential AI" is an easy phrase to slap on a landing page. So the credibility test is simple: can you prove it? OLLM's partnership trail suggests they can.

One differentiator OLLM emphasizes is a confidential-compute–only posture paired with Zero Data Retention (ZDR): models are deployed on confidential computing chips, and the platform is designed not to retain prompts or outputs. The only data OLLM keeps is operationally necessary and deliberately non-content: token consumption, plus TEE attestation logs that aren't linked to users or user data, which lets attestations be published in the OLLM dashboard for transparency. That's the philosophical shift: treat "model access" like critical infrastructure, where minimizing retained data and making execution verifiable are first-class product requirements, not policy footnotes.
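To make the ZDR posture tangible, here is an illustrative shape for the kind of usage record such a platform could retain; it is not OLLM's actual schema. Everything kept is operational metadata, and nothing ties an attestation back to a user or a prompt.

```python
from dataclasses import dataclass

# Illustrative only: what a zero-data-retention usage record could reduce to.

@dataclass(frozen=True)
class UsageRecord:
    model_id: str           # which model served the request
    prompt_tokens: int      # billing/metering only
    completion_tokens: int
    attestation_id: str     # reference to a public TEE attestation, not linked to a user
    # Deliberately absent: user ID, prompt text, completion text, IP address.

record = UsageRecord("some-model", prompt_tokens=412, completion_tokens=187,
                     attestation_id="att_7f3c")
```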

OLLM’s partnership with Phala is the clearest “prove it” story today. OLLM expands its footprint through Phala’s confidential AI infrastructure, where workloads run inside TEEs (including Intel TDX, AMD SEV, and Nvidia H100/H200-class GPU confidential computing). That matters because each inference can generate a cryptographic attestation showing it ran on genuine TEE hardware, not on a standard VM with a trust-me policy.
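Under the hood, a per-inference attestation reduces to a signed report: the TEE signs a measurement of the code it loaded, and the client verifies that signature with a key it trusts. The sketch below shows only that cryptographic core, using Python's cryptography library; production verification also walks a certificate chain to the hardware vendor's root and checks freshness, which is omitted here.

```python
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec
from cryptography.exceptions import InvalidSignature

def verify_report(report: bytes, signature: bytes,
                  tee_public_key: ec.EllipticCurvePublicKey,
                  expected_measurement: bytes) -> bool:
    """Simplified check: is this report genuinely signed, and does it attest
    to the code we approved? (Cert-chain and freshness checks omitted.)"""
    try:
        tee_public_key.verify(signature, report, ec.ECDSA(hashes.SHA256()))
    except InvalidSignature:
        return False                       # report was forged or tampered with
    return expected_measurement in report  # naive check that the approved build's
                                           # measurement appears in the report
```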

NEAR AI is also building private inference infrastructure powered by TEEs, which means developers can treat confidential inference as a composable primitive (Phala is one route; NEAR AI’s private inference is another).

Bottom Line: Why Developers Should Care

For developers, confidential AI can unlock workflows that were previously awkward, risky, or stuck in security review limbo. It directly expands where AI can be used in practice. Proprietary data such as internal playbooks, product designs, and competitive intelligence can pass through AI systems without being captured in third-party logs. For regulated industries, that shift is even more consequential. Instead of being asked to “trust” how data is handled, banks and healthcare providers can point to cryptographic attestation as evidence, changing compliance discussions from risk avoidance to controlled deployment.

The same logic applies outside heavily regulated environments. Teams under pressure to ship quickly do not suddenly become less exposed when a prototype turns into a product. Confidential execution makes it possible to keep iteration speed while narrowing what inference can reveal. And as AI agents begin to manage credentials, trigger API calls, and interact with sensitive systems, the ability to run them without exposing their instructions or data, even to the operators running the infrastructure, becomes a baseline requirement rather than an advanced feature.

As AI becomes embedded in real workflows, the weakest point in the stack is no longer storage or transport, but execution. Data in use is where sensitive information is most exposed and least controlled. Confidential computing doesn’t solve every security problem, but it closes a gap that policy, contracts, and access controls can’t. For AI to move from experimentation to infrastructure, that gap has to be addressed.