What a 10 PB, 20-year archive really costs in AWS/GCP/Azure vs a tape-backed object store

Why long-term preservation is back on the board agenda

Most cloud bills started life as rounding errors.

A few terabytes here, a backup bucket there, a “just in case” archive for compliance. It looked cheap, flexible, and—critically—someone else’s problem.

Fast-forward a few years, and those rounding errors have grown into petabyte-scale line items that finance can no longer ignore.

This is where preservation storage (data you must keep for 5–20+ years) diverges from operational storage (data you actively use to run the business). Treat them the same, and you’ll overspend on one or under-protect the other.

Cloud cold storage is, in many ways, a rental model that works brilliantly for elastic workloads: spin capacity up and down, pay only while you use it, and walk away when you're done.

But for regulatory archives, historical content, medical images, logs, and R&D datasets that must live for a decade or more, the economics and risk profile start to look very different.

That’s why repatriation—moving data out of the cloud and into an on-premises or hosted environment you control—is back in serious conversations. Not because “cloud was a mistake,” but because not all data belongs on a permanent subscription model.

This article walks through a concrete scenario:

10 petabytes of cold data, accessed about 2.5% per month, over 5 / 10 / 20 years, compared across major cloud cold tiers and a tape-based object store.

We’ll keep the math visible, the tone honest, and the conclusion practical.

The scenario: 10 PB under glass

Let’s pin down the workload so the numbers aren’t hand-wavy.

So you have:

  • 10 PB (10,000,000 GB) of preservation data.
  • About 2.5% of the archive read back each month: roughly 250,000 GB/month, or 3,000,000 GB (3 PB) per year.
  • Planning horizons of 5, 10, and 20 years.

We'll compare:

  • Cloud cold storage classes (AWS, GCP, and Azure cold/archive tiers)
  • On-premises: a tape-backed object store

For the tape option, we'll use a cost profile based on LTO.org-style numbers for a mid-size tape infrastructure:

  • Initial CapEx: $238,600
  • Ongoing: operations at $60,000/year, maintenance at $2,000/year, and floor space at $54,000 in year 1, growing 4% per year
  • Refresh cycles: periodic media and drive refreshes, a handful of times over 20 years

And we’ll add a one-time cloud egress cost to pull the full 10 PB home.

For egress we’ll assume a large-volume blended rate of $0.05/GB for cloud → on-prem transfers. Real discount deals will vary, but public pricing for big volumes lands in that neighborhood.

10 PB = 10,000,000 GB → 10,000,000 GB × $0.05/GB ≈ $500,000 egress to repatriate.

Cloud cold storage: the actual cost of “keeping it just in case”

Let’s start with AWS, then we’ll look at GCP and Azure at a high level.

Anatomy of cloud cold storage cost

For each cloud class, your annual cost has two main components:

  1. Storage rent:

Annual storage cost = Data stored (GB) × Price ($/GB-month) × 12

  2. Retrieval + egress tolls:

Annual retrieval cost = Data retrieved annually (GB) × (retrieval fee + egress fee)
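
Here's that model as a minimal Python sketch (the function and parameter names are mine, for illustration):

```python
def annual_cloud_cost(
    data_gb: float,            # total data stored, in GB
    price_gb_month: float,     # storage price, in $/GB-month
    retrieved_gb_year: float,  # data read back per year, in GB
    retrieval_fee: float,      # retrieval fee, in $/GB
    egress_fee: float,         # egress fee, in $/GB
) -> float:
    """Annual cost of a cloud cold tier: storage rent plus read tolls."""
    storage_rent = data_gb * price_gb_month * 12
    read_tolls = retrieved_gb_year * (retrieval_fee + egress_fee)
    return storage_rent + read_tolls
```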

In our case:

  • Data stored: 10,000,000 GB (10 PB)
  • Data retrieved annually: 2.5% × 10,000,000 GB × 12 months = 3,000,000 GB

We’ll use typical US region prices (rounded) as of 2025 for illustration:

AWS (US region; approximate, based on AWS advertised pricing; your mileage may vary):

  • S3 Glacier Instant Retrieval: ~$0.004/GB-month storage, ~$0.03/GB retrieval
  • S3 Glacier Flexible Retrieval: ~$0.0036/GB-month storage, ~$0.01/GB standard retrieval
  • S3 Glacier Deep Archive: ~$0.00099/GB-month storage, ~$0.02/GB standard retrieval

GCP and Azure have similar shapes—slightly different list prices, but the same pattern: low storage, non-trivial retrieval and egress.


Worked example: AWS S3 Glacier Deep Archive

Let’s fully show the math for the cheapest major cloud cold tier: S3 Glacier Deep Archive.

Inputs:

  • Data stored: 10,000,000 GB
  • Storage price: $0.00099/GB-month
  • Data retrieved annually: 3,000,000 GB
  • Retrieval fee: $0.02/GB
  • Egress fee: $0.05/GB

Step 1 – Annual storage cost

Annual storage = 10,000,000 GB × $0.00099/GB-month × 12 = $118,800

Step 2 – Annual retrieval + egress cost

Annual retrieval = 3,000,000 GB × ($0.02 + $0.05)/GB = $210,000

Step 3 – Total annual cost

Annual total = $118,800 + $210,000 = $328,800

Step 4 – 5-, 10-, and 20-year totals

  • 5 years: $328,800 × 5 ≈ $1.64M
  • 10 years: $328,800 × 10 ≈ $3.29M
  • 20 years: $328,800 × 20 ≈ $6.58M
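
As a quick cross-check, the whole worked example fits in a few lines of Python (constants mirror the inputs above; names are illustrative):

```python
# S3 Glacier Deep Archive, 10 PB, 2.5% read back per month.
DATA_GB = 10_000_000            # 10 PB
STORAGE_PRICE = 0.00099         # $/GB-month
RETRIEVED_GB_YEAR = 3_000_000   # 250,000 GB/month x 12
RETRIEVAL_FEE = 0.02            # $/GB
EGRESS_FEE = 0.05               # $/GB, blended large-volume rate

storage = DATA_GB * STORAGE_PRICE * 12                        # $118,800
retrieval = RETRIEVED_GB_YEAR * (RETRIEVAL_FEE + EGRESS_FEE)  # $210,000
annual = storage + retrieval                                  # $328,800

for years in (5, 10, 20):
    print(f"{years:>2} years: ${annual * years / 1e6:.2f}M")
# ->  5 years: $1.64M
# -> 10 years: $3.29M
# -> 20 years: $6.58M
```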

So even in the cheapest AWS cold class, 10 PB with modest ongoing access runs roughly $330k a year: about $1.6M over 5 years, $3.3M over 10, and $6.6M over 20…

…and that’s ignoring request fees, the cost of staging restored copies before reading, early-deletion minimums, and two decades of potential price changes.

You pay rent forever and tolls every time you read.

High-level look at other cloud cold tiers

Using similar reasoning and current public price examples (storage + retrieval + egress), 10 PB with the same access pattern lands in the same order-of-magnitude TCO ballpark on all three clouds, with numbers rounded to the nearest $0.1M and assuming a similar $0.05/GB egress rate and representative retrieval fees.

The broad pattern:

  • The cheapest archive tiers on each cloud land in the same multi-million-dollar territory as Deep Archive over 10–20 years.
  • Warmer cold tiers (Glacier Instant Retrieval, GCP Nearline/Coldline, Azure Cool) trade higher storage rent for cheaper, faster access, and end up costing several million more over the same horizons.

Tape-backed object store: your own cold cloud

Now let’s look at the tape-backed object store, using the cost profile outlined above.

One-time CapEx

Initial CapEx = $35,000 + $25,000 + $80,000 + $63,600 + $35,000 = $238,600

So $238,600 to stand up the environment.

Annual OpEx (baseline)

Ignoring energy for now, your year-1 baseline OpEx:

Year 1 OpEx = $60,000 (ops) + $2,000 (maintenance) + $54,000 (floor space) = $116,000

From year 2 onward, floor space increases:

Floor space in year n = $54,000 × 1.04^(n−1)

Everything else (ops + maintenance) we’ll keep flat for simplicity.
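
The same baseline in a short Python sketch (energy and refresh cycles excluded here, as in the prose; refreshes are added back in the totals below):

```python
CAPEX = 238_600       # one-time, year 1
OPS = 60_000          # $/year, held flat
MAINTENANCE = 2_000   # $/year, held flat
FLOOR_YEAR1 = 54_000  # $/year in year 1, growing 4% annually

def tape_baseline_tco(years: int) -> float:
    """CapEx plus n years of flat ops/maintenance and growing floor space."""
    floor = sum(FLOOR_YEAR1 * 1.04 ** (n - 1) for n in range(1, years + 1))
    return CAPEX + (OPS + MAINTENANCE) * years + floor

for years in (5, 10, 20):
    print(f"{years:>2} years: ${tape_baseline_tco(years) / 1e6:.2f}M")
# -> about $0.84M, $1.51M, $3.09M before refresh cycles are layered in
```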

Refresh cycles

Periodic media and drive refreshes are big line items, but they happen only a few times over 20 years, and they're included in the totals below.

Putting it together: TCO over 5, 10, 20 years

Running the math with:

  • the $238,600 CapEx up front,
  • $62,000/year of flat ops and maintenance,
  • floor space starting at $54,000/year and growing 4% annually, and
  • refresh cycles layered in at their scheduled years,

you get approximate totals of $0.90M over 5 years, $1.77M over 10, and $3.61M over 20 for the tape environment alone.

Now add the one-time egress to bring the 10 PB home:

So “tape + migration” TCO:

Horizon | Tape env only | + 10 PB egress | Total tape + egress TCO
5 yr | $0.90M | $0.50M | $1.40M
10 yr | $1.77M | $0.50M | $2.27M
20 yr | $3.61M | $0.50M | $4.11M

Remember: that’s with no energy cost added. Even if you add $10–15k/year for power and cooling, the totals barely move compared to multi-million cloud bills.

Head-to-head: cloud vs tape for 10 PB

Versus AWS S3 Glacier Deep Archive

Compare “cheapest viable cloud cold” vs “your own tape cloud”:

Horizon | AWS S3 Glacier Deep Archive | Tape + egress | Cloud − tape (extra spend)
5 yr | $1.64M | $1.40M | +$0.24M
10 yr | $3.29M | $2.27M | +$1.02M
20 yr | $6.58M | $4.11M | +$2.47M

Interpretation:

  • Even over 5 years, tape plus the full $500k repatriation egress comes out about $240k ahead of Deep Archive.
  • The gap widens to roughly $1.0M at 10 years and $2.5M at 20, because the cloud keeps charging rent and read tolls while the tape environment's running costs stay nearly flat.

And that’s against the cheapest of the mainstream cloud cold tiers.
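
If the board asks when the move pays back, a rough cumulative comparison built from the figures above answers it (a sketch; refresh cycles ignored in the early years):

```python
# Year in which cumulative tape spend drops below cumulative cloud spend.
CLOUD_ANNUAL = 328_800             # Deep Archive: rent + read tolls
TAPE_UPFRONT = 238_600 + 500_000   # CapEx + one-time 10 PB egress
OPS_AND_MAINT = 62_000             # $/year, flat

def crossover_year(max_years: int = 30) -> int:
    cloud = tape = 0.0
    floor = 54_000.0               # floor space, grows 4%/year
    for year in range(1, max_years + 1):
        cloud += CLOUD_ANNUAL
        tape += OPS_AND_MAINT + floor + (TAPE_UPFRONT if year == 1 else 0)
        floor *= 1.04
        if tape < cloud:
            return year
    return -1

print(crossover_year())  # -> 4: repatriation pays for itself in about 4 years
```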

Versus more expensive tiers (Instant, Nearline, Cool), tape wins by multiple millions.

Pros and cons: this isn’t just about dollars

Cloud cold storage – PROs

  • No CapEx, no hardware to run, no refresh projects to plan.
  • Elastic: grow (or shrink) without procurement cycles.
  • Durability and replication handled by the provider.

Cloud cold storage – CONs

  • Rent forever, plus retrieval and egress tolls every time you read.
  • Exposure to provider price changes over a 10–20 year horizon.
  • Weaker control over jurisdiction, retention enforcement, and exit costs.

Tape-backed object store – PROs

  • Dramatically lower long-term TCO at petabyte scale, as the numbers above show.
  • Full operational control over governance, sovereignty, and retention.
  • Offline media is a natural air gap, and idle tape consumes almost no energy.

Tape-backed object store – CONs

  • Up-front CapEx and periodic media/drive refresh projects.
  • Requires operational expertise, in-house or via a managed service provider.
  • Retrieval is slower than disk-based tiers; this is preservation storage, not operational storage.

When does repatriation make sense?

From a C-suite / IT management lens, you repatriate preservation data when:

  1. The dataset is large enough
    • Order of magnitude: ≥ 5–10 PB and growing.
    • Below that, the complexity may outweigh the savings unless there are strong non-financial drivers.
  2. The access pattern is low but steady
    • You do touch the data (audits, research, investigations, training sets) but not daily.
    • Enough to be penalized by retrieval/egress, but not enough to justify hot storage.
  3. The retention horizon is long
    • 10+ years is where the cloud rent really starts to look ugly compared to “owning the house.”
    • Regulatory or institutional mandates often live in this band (records, medical, scientific, cultural).
  4. Governance and sovereignty matter
    • You need to prove, with confidence, that:
      • Data hasn’t been altered.
      • Retention and deletion policies are actually enforced.
      • Data is stored in specific jurisdictions under your operational control.
  5. You can staff or source the operational competency
    • Either in-house tape and archive expertise, or a managed service provider who can run the environment with clear SLAs.

If you check most of those boxes, cloud cold storage for preservation becomes less “strategic agility” and more “expensive comfort blanket.”

A practical decision framework for the board and CIO

If you’re trying to decide what to do with a 10 PB archive sitting in Glacier/Coldline/Blob Cool, here’s a simple roadmap you can put in front of the C-suite.

Step 1 – Inventory and segment the archive

  • Establish what you hold, who owns it, and what retention mandate applies to each dataset.
  • Separate true preservation data (keep for 5–20+ years, rarely read) from operational data that has merely drifted cold.

Step 2 – Build a real baseline of current cloud cost

This is where the 10 PB, 2.5% monthly access math above gives you a sanity check: if your model says your TCO will be $800k over 20 years, the model—not the math—is wrong.
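
One quick way to sanity-check any such model: storage rent alone sets a hard floor. A minimal sketch, using the Deep Archive list price and assuming zero reads:

```python
# Lower bound: 20 years of Deep Archive storage rent, with no retrieval at all.
floor_20yr = 10_000_000 * 0.00099 * 12 * 20  # GB x $/GB-month x months
print(f"${floor_20yr:,.0f}")  # -> $2,376,000; any 20-year model below this is broken
```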

Step 3 – Design an on-prem (or hosted) preservation tier

Compare the multi-year CapEx/OpEx profile against your cloud baseline.

Step 4 – Plan the migration and cutover

This is where the one-time $500k egress is a financial decision: pay now, buy back long-term control.

Step 5 – Embed governance and preservation practices

Regardless of where the data lives, for long-term preservation you need:

  • scheduled fixity and integrity checks, so you can prove data hasn't been altered;
  • enforced retention and deletion policies, with audit trails; and
  • a format and media migration plan measured in decades.

A tape-backed object store can become your institutional memory tier, but only if governance and process are designed along with the hardware.

Step 6 – Revisit every 5 years

The decision you make today is not irrevocable; what matters is not sleepwalking into paying millions in rent for what could have been an owned, well-run archival estate.

The bottom line for the C-suite

For operational workloads, cloud is still a powerful tool.

For 10 PB of long-term preservation data with modest ongoing access, cloud cold tiers look a lot like a lifetime subscription to your own history, with a toll booth every time you try to read it.

A tape-backed object store, properly designed and operated, turns that into an owned asset: a one-time build plus predictable, modest running costs, under your own governance.

The decision is not “cloud bad, tape good.” It’s:

Put fast, changeable, time-sensitive workloads where elasticity and rich services win (cloud). Put large, slow-changing, must-keep-for-decades collections where long-term economics and control win (tape-backed preservation tier).

If you’re holding 10 PB of “cold” data in the cloud today and expecting to keep it there for 10–20 years, you’re not just storing history—you’re renting it.

The question for the C-suite is simple:

Do you want to keep paying rent forever, or do you want to own the building where your institutional memory lives?