Petabyte leaks start with uncontrolled backup access

A single compromised credential reaching both production and backup storage across the same network boundary turns one day's data loss into months or years. Isolation done badly is barely isolation at all.

Data Breaches and Petabyte Leaks: Designing Backups That Isolate from Production

Telus Digital lost nearly a petabyte of data to ShinyHunters in 2026, and the entry point was a single set of GCP credentials buried in customer support tickets stolen during the earlier Salesloft Drift breach. Those credentials reached Telus’ Google BigQuery environment, where attackers ran TruffleHog to find more secrets, then exfiltrated data over several months. The ransom demand came in at $65 million. The reason the scale was so large has less to do with the initial credential theft and more to do with what those credentials could reach once inside.

That is the core problem with backup isolation done badly: the same access that lets a backup agent read production data frequently lets an attacker read all of it, including every historical copy sitting in the same cloud project or storage account.

Flat access is what turns a single credential into a petabyte problem

Backup agents need read access to production data. That is unavoidable. The problem is when the service account holding that access also has read and write permissions on the backup storage tier, and when both live in the same network boundary with no segmentation between them.

When an attacker gets hold of such a credential, the blast radius is not one day’s production state. It includes every backup snapshot within the retention window, potentially months or years of data. A 90-day retention window with daily incremental backups does not just represent 90 days of recovery capability; it represents 90 days of historical data that a single compromised account can read and exfiltrate.
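The arithmetic is worth making concrete. A minimal sketch, with illustrative figures rather than numbers from any real incident:

```python
# Rough exposure arithmetic for a compromised backup-tier credential.
# All figures below are illustrative assumptions, not measurements.

def exposed_gb(retention_days: int, daily_change_gb: float, full_backup_gb: float) -> float:
    """Total readable data if one credential can enumerate every snapshot
    in the retention window: the full base image plus every incremental."""
    return full_backup_gb + retention_days * daily_change_gb

# 90-day window, 2 TB base image, 50 GB of incremental change per day:
exposure = exposed_gb(90, 50.0, 2000.0)
print(f"{exposure / 1000:.1f} TB readable from one credential")  # 6.5 TB
```

The point of the sketch is that exposure scales with the retention window, not with the size of any single day's backup.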

Service accounts that share IAM roles or GCP project membership across production and backup tiers make this worse. If the credentials are valid in the production project, and the backup bucket sits inside the same project with no separate IAM boundary, no lateral movement is required. The attacker already has access. This is not a hypothetical attack path; it is exactly the pattern that played out at Telus.

Backup agents running on production hosts with broad read permissions are a high-value target after initial compromise precisely because they already have legitimate, non-alerting access to large volumes of data. Credential theft from a backup agent or its config files gives an attacker a pre-authorised path to bulk data, with no need to escalate privileges further.

Network segmentation cuts the path before it reaches stored data

The starting point for backup isolation at the network layer is a dedicated backup VLAN that production hosts cannot route into directly. Backup traffic should flow in one direction only: from production hosts to the backup server, initiated by the backup server or a pull-based agent, with no return path that allows the backup storage tier to be browsed from a production context.

In practice, this means:

  • Put backup servers and storage repositories on a separate VLAN with no default gateway routing back to the production network.
  • Use firewall rules that permit only the specific ports the backup agent requires (for Veeam, that is TCP 2500-3300 for the data movers, TCP 9392 for the backup service, and TCP 9419 for the REST API), and deny everything else.
  • Do not mount backup storage as a persistent network share on production hosts. NFS or SMB mounts left permanently attached to production systems mean the backup storage is reachable from any process running on those hosts.
  • In cloud environments, use separate projects or subscriptions for backup storage, with IAM bindings scoped only to the backup service identity. A GCP service account scoped to a dedicated backup project cannot be reused to access production resources, and vice versa.
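A deny-by-default rule set like the one above can be modelled in a few lines. This is a toy policy structure, not any vendor's firewall API; the data-mover port range matches Veeam's documented defaults, and the management port and VLAN names are assumptions:

```python
# Minimal model of a deny-by-default backup-VLAN rule set.
# VLAN names and the management rule are illustrative assumptions.

ALLOWED = [
    # Backup data movers, production -> backup only (Veeam default range).
    {"src": "prod-vlan", "dst": "backup-vlan", "proto": "tcp", "ports": range(2500, 3301)},
    # Management traffic from an admin VLAN; port per your product's docs.
    {"src": "admin-vlan", "dst": "backup-vlan", "proto": "tcp", "ports": [9392]},
]

def permitted(src: str, dst: str, proto: str, port: int) -> bool:
    """Deny by default; permit only explicitly listed flows."""
    return any(
        r["src"] == src and r["dst"] == dst and r["proto"] == proto and port in r["ports"]
        for r in ALLOWED
    )

assert permitted("prod-vlan", "backup-vlan", "tcp", 2500)      # backup data flow
assert not permitted("backup-vlan", "prod-vlan", "tcp", 2500)  # no return path
assert not permitted("prod-vlan", "backup-vlan", "tcp", 445)   # no SMB browsing
```

The asserts capture the one-directional property the section describes: backup traffic flows in, nothing routes back out, and ad-hoc file-sharing protocols never reach the backup tier.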

The goal is to make sure that a compromised production host cannot reach the backup tier directly, and that the backup tier’s credentials have no value on production systems.

Cold storage and air-gapped copies limit what an attacker can actually read

Logical network separation reduces the attack surface considerably, but it does not help if an attacker compromises the backup server itself. A second layer is offline or air-gapped cold storage, which removes the backup copy from any live network path entirely.

Tape remains the most practical air-gap for large volumes. LTO-9 gives you up to 18TB native per cartridge, and a tape that is physically ejected from a drive and shelved has no network attack surface at all. Object-lock-enabled cloud storage (S3 Object Lock in COMPLIANCE mode, or Azure Blob immutability policies) provides a logical equivalent for cloud-native environments: once a backup is written and the retention lock applied, no credential, including root or global administrator, can delete or overwrite it within the lock period.
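The semantics of a compliance-mode lock can be captured in a toy in-memory model. This is a hypothetical class, not the S3 or Azure API, but it encodes the same rule: within the lock period no caller, admin or otherwise, can delete or overwrite, while reads still succeed:

```python
# Toy model of COMPLIANCE-mode object-lock semantics. Hypothetical
# class for illustration, not a real cloud storage SDK.
import datetime as dt

class WormStore:
    def __init__(self):
        self._objects = {}  # key -> (data, retain_until)

    def put(self, key, data, retain_days, now):
        if key in self._objects and now < self._objects[key][1]:
            raise PermissionError("object is locked; overwrite denied")
        self._objects[key] = (data, now + dt.timedelta(days=retain_days))

    def delete(self, key, now, is_admin=False):
        # COMPLIANCE mode: admin status makes no difference to the lock.
        if now < self._objects[key][1]:
            raise PermissionError("object is locked; delete denied")
        del self._objects[key]

    def get(self, key):
        return self._objects[key][0]  # reads are never blocked by the lock
```

Note what `get` does: it succeeds unconditionally. That is precisely the gap the next paragraph describes, and why immutability alone does not stop exfiltration.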

The distinction between immutable storage and a true air gap matters. Immutable object storage still has a network path; an attacker with valid credentials can read and exfiltrate the data even if they cannot delete it. A physical air gap removes the read path entirely. For the most sensitive data, you need both: immutability so backups cannot be destroyed by a ransomware payload, and a physically separate copy that cannot be read over a network.

Retention windows define your historical exposure

Backup retention is almost always discussed in terms of recovery capability. It also defines how much historical data an attacker can access if the backup tier is compromised. A two-year retention window holds two years of data. If that data includes call-centre records, authentication logs, or any PII, the exposure is proportional to the retention period, not just the most recent backup.

Scope backup retention windows to the minimum your disaster recovery plan actually needs. If your RTO and RPO requirements are met with 30 days of daily backups plus a monthly snapshot kept for 12 months, do not keep 18 months of daily backups because storage is cheap. Every extra month of retained data is extra exposure.

Apply tiered retention with tiered access controls. Daily backups from the last 30 days might sit on fast disk with full backup-team access. Monthly snapshots older than 30 days should move to cold storage, with access requiring a separate approval step and separate credentials. Annual snapshots should move to offline media. Each tier boundary is also an access control boundary.
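A tiering policy like this reduces to a simple age-to-tier mapping. The tier names, credential labels, and approval flag below are illustrative assumptions, not any product's schema:

```python
# Sketch of the tiered retention / tiered access model described above.
# Tier and credential names are made up for illustration.

def tier_for(age_days: int) -> dict:
    if age_days <= 30:
        return {"tier": "hot-disk", "credential": "backup-team",
                "approval": False, "online": True}
    if age_days <= 365:
        return {"tier": "cold-storage", "credential": "cold-restore",
                "approval": True, "online": True}
    return {"tier": "offline-media", "credential": "vault-only",
            "approval": True, "online": False}

assert tier_for(7)["tier"] == "hot-disk"
assert tier_for(90)["approval"] is True    # cold tier requires an approval step
assert tier_for(400)["online"] is False    # annual snapshots leave the network
```

Each branch boundary in the function corresponds to an access control boundary, which is the property the paragraph above argues for.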

Separate credentials that cannot traverse tiers

The access control model that fails most often is the one where a single service account handles backup job execution, backup storage writes, and restore operations across all retention tiers. When that account is compromised, every tier is open simultaneously.

Structure credentials around the minimum permission each operation needs:

  • The backup agent service account needs read access to production sources and write access to the current-tier backup repository. Nothing else.
  • The restore service account needs read access to the appropriate backup tier. It should not have write access to production systems; restores should be staged to an isolated recovery environment and validated before being promoted.
  • Cold storage and offline tier access should require credentials that are not present on any live production or backup server at rest. Store them in a hardware security module or an offline secrets vault, retrieved only when a restore from that tier is authorised.

In GCP, this means separate service accounts per tier, with IAM bindings scoped to specific bucket paths, not project-level storage.admin roles. In AWS, it means per-tier IAM roles with S3 resource policies that block cross-role access. Neither platform enforces this by default; you have to configure it explicitly.
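One way to keep per-tier bindings honest is to generate them, so the scoping rules live in one place. A sketch of that idea: the role names are real GCP predefined storage roles, but the service-account and bucket names are hypothetical:

```python
# Hypothetical generator for per-tier, bucket-scoped IAM bindings.
# Role names are GCP predefined roles; account and bucket names are
# made up for illustration.

def tier_binding(tier: str) -> dict:
    sa = f"backup-{tier}@example-backup-project.iam.gserviceaccount.com"
    return {
        "resource": f"gs://example-backups-{tier}",  # bucket-scoped, not project-scoped
        # The hot-tier agent only writes; the cold-tier restore identity only reads.
        "role": "roles/storage.objectCreator" if tier == "hot"
                else "roles/storage.objectViewer",
        "member": f"serviceAccount:{sa}",
    }

hot, cold = tier_binding("hot"), tier_binding("cold")
assert hot["member"] != cold["member"]                   # no shared identity across tiers
assert "storage.admin" not in (hot["role"], cold["role"])  # never the broad project role
```

The generator makes the anti-patterns impossible by construction: no tier ever receives `storage.admin`, and no two tiers ever share a service account.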

Restoration testing is how you verify isolation is working

Backup isolation that has never been tested is configuration that may or may not do what you think. Run a restore drill quarterly, and make it cover the scenario that matters most: production is fully compromised and unavailable, and you are recovering from the cold-storage or air-gapped copy using only the credentials appropriate to that tier.

If that drill requires you to re-use a production credential, or if the restore path involves mounting storage that is also accessible from a running production host, the isolation has a gap. Document the exact credential, the exact network path, and the exact storage mount used for each recovery tier, and verify after every infrastructure change that the isolation boundaries remain intact.
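That verification step can be partially automated. A sketch of a post-drill check, where the field names are assumptions about what your drill log records rather than any tool's format:

```python
# Sketch of an automated isolation check to run after each restore drill.
# The drill-record field names are illustrative assumptions.

def isolation_gaps(drill: dict) -> list:
    """Return a list of isolation gaps found in a drill record."""
    gaps = []
    if drill["credential_tier"] == "production":
        gaps.append("restore re-used a production credential")
    if drill["mount_reachable_from_prod"]:
        gaps.append("restore mount is reachable from a production host")
    if not drill["restored_to_isolated_env"]:
        gaps.append("restore targeted production directly")
    return gaps

clean_drill = {
    "credential_tier": "cold-restore",
    "mount_reachable_from_prod": False,
    "restored_to_isolated_env": True,
}
assert isolation_gaps(clean_drill) == []
```

A non-empty result from a check like this is the signal that the documented boundaries have drifted since the last infrastructure change.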

The backup retention window, the network segmentation, and the access control model are only as reliable as the last time you tested them under conditions that resemble an actual incident.