Broadcom claims “up to -40% TCO” with VCF 9.1. If you’ve heard this kind of number before, you know it has to be taken apart before it goes into a budget slide. That’s exactly what this article is for.
VCF 9.1 is not an architectural revolution: the real break was 9.0. This is an efficiency release, with enhanced NVMe memory tiering, global vSAN deduplication, redesigned vSphere provisioning, and scale to 5000 hosts. Each building block chips away at a cost line. Stacked together, they produce the -40% number. We’ll see where it comes from, line by line, and what it changes for your design.
Series 'What's new in VCF 9.1' — 1/4
A mini-series on the new features of VMware Cloud Foundation 9.1:
- Infrastructure efficiency & TCO (this article)
- Networking & scale
- Kubernetes & self-service
- Security & resilience
Conceptual prerequisite: The new VCF 9 architecture.
Visual credits
Diagrams and screenshots © Broadcom, taken from the official VCF documentation and blog (links at the end of the article). Synthesis and analysis are my own.
Enhanced NVMe Memory Tiering: RAM that isn’t quite RAM
Memory tiering already existed in 9.0 as a tech preview. In 9.1 it becomes a production feature, and it’s the first lever behind the TCO number.
The principle. The hypervisor classifies memory pages by access heat. Hot pages — the ones constantly touched by active workloads — stay in DRAM. Cold pages — dormant buffers, rarely re-read allocations — are moved transparently to a local NVMe device on the host. The whole thing is exposed to VMs as a unified memory space: a VM sees “its” RAM without knowing that a fraction physically lives on flash.
Why it changes cost. DRAM is the most expensive hardware line on a modern host, and the least elastic. With tiering, you extend a host’s effective memory without adding a single stick: a host with 1 TB of DRAM can present 1.5 to 2 TB of addressable memory depending on the cold/hot ratio of workloads. Direct result: a higher consolidation ratio — more VMs per host, therefore fewer hosts for the same load.
Figure source: Broadcom, VCF Blog.
| Aspect | VCF 9.0 | VCF 9.1 |
|---|---|---|
| Status | Tech preview | Production, supported |
| Memory model | Distinct DRAM + NVMe visible | Unified model, transparent to VMs |
| Page classification | Basic heuristic | Refined heat detection, dynamic repromotion |
| Typical effective memory ratio | ~1.25x | ~1.5–2x depending on workloads |
| DRS integration | Limited | DRS aware of tiering for placement |
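
The consolidation effect of these ratios can be sketched with a few lines of arithmetic. This is a minimal sketch, assuming memory is the only binding constraint (no CPU limits, no HA headroom); the function name `hosts_needed` and the 40 TiB fleet are hypothetical illustration values, not Broadcom figures.

```python
import math


def hosts_needed(total_vm_memory_gib: float, dram_per_host_gib: float,
                 tiering_ratio: float = 1.0) -> int:
    """Hosts required when memory is the binding constraint.

    tiering_ratio is effective memory / physical DRAM (1.0 means no tiering).
    """
    if tiering_ratio < 1.0:
        raise ValueError("tiering_ratio must be >= 1.0")
    effective_per_host = dram_per_host_gib * tiering_ratio
    return math.ceil(total_vm_memory_gib / effective_per_host)


# Hypothetical fleet: 40 TiB of configured VM memory, 1 TiB of DRAM per host.
demand_gib = 40 * 1024
for ratio in (1.0, 1.25, 1.5, 2.0):
    count = hosts_needed(demand_gib, 1024, ratio)
    print(f"tiering ratio {ratio:.2f}x: {count} hosts")
```

In a real design the result is a lower bound: CPU, failover capacity and the “pure DRAM” pool all add hosts back.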
Architect impact. Tiering is not free RAM: a fault on a cold page implies an NVMe round trip, so latency on the order of tens to hundreds of microseconds instead of roughly a hundred nanoseconds for DRAM. For 90% of enterprise workloads (web apps, middleware, mid-size databases) the difference is invisible. Latency-sensitive workloads (trading, in-memory databases like SAP HANA, real-time analytics) must be explicitly pinned to hosts without tiering or with a conservative ratio. The design decision: segment your fleet into “aggressive tiering” and “pure DRAM” pools, and route workloads by sensitivity profile.
Migration. No breaking change: tiering is enabled per cluster. Plan the NVMe sizing (a dedicated device, not the vSAN datastore) and validate hosts against the compatibility matrix. Start with a conservative ratio, measure the swap-in rate in VCF Operations, then raise the ratio gradually.
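To see why the cold-page share matters so much, here is a rough weighted-average latency model. It is a sketch under stated assumptions: the 100 ns DRAM and 50 µs NVMe figures are illustrative placeholders, and `mean_access_latency_ns` is a hypothetical helper, not a VCF metric.

```python
def mean_access_latency_ns(nvme_fraction: float,
                           dram_ns: float = 100.0,
                           nvme_ns: float = 50_000.0) -> float:
    """Average memory access latency when a fraction of accesses
    must be served from the NVMe tier instead of DRAM.

    The default latencies are assumptions chosen for illustration.
    """
    if not 0.0 <= nvme_fraction <= 1.0:
        raise ValueError("nvme_fraction must be between 0 and 1")
    return (1.0 - nvme_fraction) * dram_ns + nvme_fraction * nvme_ns


for fraction in (0.0, 0.0001, 0.001, 0.01):
    latency = mean_access_latency_ns(fraction)
    print(f"{fraction:.2%} of accesses on NVMe: about {latency:.0f} ns on average")
```

Under these assumptions, even one access in a thousand landing on NVMe raises the average by about half, which is why the routing decision should rest on measured access patterns rather than on configured memory size.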
vSAN: global deduplication and extended compression
In 9.1, vSAN ESA gains global deduplication at cluster scale, no longer limited to the disk or disk-group level. This is the second TCO lever.
What changes. In 9.0, vSAN ESA dedup operated with a limited scope: identical blocks were deduplicated only within a restricted perimeter. In 9.1, the scope becomes the entire cluster: an identical block present on ten VMs spread across ten hosts is logically stored only once. Ratios rise as soon as there is data redundancy (OS templates, container images, shared datasets). Compression is extended in parallel, with better ratios on already-compressible data.
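The effect of widening the dedup scope can be illustrated with a toy model. This is not how vSAN implements deduplication: it simply fingerprints blocks with SHA-256 and counts unique ones, and it approximates the narrower 9.0 scope as “per host” for simplicity. The fleet layout and the `stored_blocks` helper are hypothetical.

```python
import hashlib


def stored_blocks(blocks_by_host: dict[str, list[bytes]], global_scope: bool) -> int:
    """Count blocks physically kept under a given dedup scope (toy model)."""
    def fingerprints(blocks):
        return {hashlib.sha256(block).digest() for block in blocks}

    if global_scope:
        # One fingerprint set for the whole cluster.
        every_block = (b for blocks in blocks_by_host.values() for b in blocks)
        return len(fingerprints(every_block))
    # One fingerprint set per host: duplicates across hosts are not detected.
    return sum(len(fingerprints(blocks)) for blocks in blocks_by_host.values())


# Hypothetical cluster: 10 hosts, each running 2 VMs cloned from the same
# 100-block OS template, plus 20 blocks of host-specific data.
template = [f"os-block-{i}".encode() for i in range(100)]
fleet = {}
for h in range(10):
    host = f"host-{h:02d}"
    private = [f"{host}-data-{i}".encode() for i in range(20)]
    fleet[host] = template * 2 + private

logical = sum(len(blocks) for blocks in fleet.values())
for label, scope in (("per-host scope", False), ("global scope", True)):
    kept = stored_blocks(fleet, scope)
    print(f"{label}: {kept} blocks stored, ratio {logical / kept:.2f}x")
```

The ratios printed here depend entirely on the made-up redundancy of the sample data; they show the mechanism, not what a production cluster will achieve.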
The missing piece. 9.1 supports at-rest encryption of deduplicated data. That’s the detail that unlocks regulated contexts: until now, some organizations had to choose between storage efficiency and at-rest encryption. The trade-off disappears.
| Aspect | VCF 9.0 | VCF 9.1 |
|---|---|---|
| Dedup scope | Disk / disk group | Entire cluster (global) |
| Typical observed ratio | 1.5–2x | 2–4x depending on data redundancy |
| Compression | Standard ESA | Extended ratios, more data types |
| Encryption + dedup | Mutually exclusive in some modes | Dedup on at-rest encrypted data supported |
| Management granularity | Per disk group | Cluster policy, driven by SPBM |
Architect impact. The capacity gain translates directly into vSAN TiBs not purchased — and vSAN licensing is billed per TiB. But global dedup has a CPU cost and a rebuild cost: reconstructing globally deduplicated data after a disk failure stresses the cluster more than a classic rebuild. Size hosts with a CPU margin, and test a disk-loss scenario on a representative cluster before promising the ratios in production.
vSphere Elastic Provisioning / Zero Touch Provisioning
ESX host provisioning has been thoroughly redesigned. Auto Deploy, the legacy PXE-boot mechanism, is being progressively replaced by a Zero Touch Provisioning (ZTP) model.
What ZTP brings. Network-based imaging, but modernized: automated discovery of bare hosts, parallel imaging of several hosts at once, and application of the Single Image (vLCM) from the first boot. Where Auto Deploy relied on a fragile PXE/TFTP chain and largely sequential imaging, ZTP industrializes cluster bring-up: going from a few hosts to several dozen no longer makes deployment time grow linearly.
Figure source: Broadcom, VCF Blog.
| Aspect | Auto Deploy (legacy) | ZTP / Elastic Provisioning 9.1 |
|---|---|---|
| Boot mechanism | PXE / TFTP, stateless or stateful cache | Modern network imaging, automated discovery |
| Parallelism | Largely sequential | Parallel multi-host imaging |
| Image model | Baselines or Single Image | Native Single Image from first boot |
| Host discovery | Manual / scripted | Automated |
| Trajectory | End of life | Designated replacement |
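
The difference between sequential and parallel imaging is easy to quantify. The sketch below assumes a constant imaging time per host and hosts imaged in fixed-size waves; the 30-minute duration, the parallelism of 16 and the `bringup_minutes` helper are hypothetical values for illustration, not documented ZTP limits.

```python
import math


def bringup_minutes(hosts: int, minutes_per_host: float, parallelism: int = 1) -> float:
    """Wall-clock bring-up time when hosts are imaged in waves of `parallelism`."""
    if hosts < 0:
        raise ValueError("hosts must be >= 0")
    if parallelism < 1:
        raise ValueError("parallelism must be >= 1")
    waves = math.ceil(hosts / parallelism)
    return waves * minutes_per_host


# Hypothetical cluster bring-up: 48 hosts, 30 minutes of imaging per host.
sequential = bringup_minutes(48, 30)
parallel = bringup_minutes(48, 30, parallelism=16)
print(f"sequential: {sequential / 60:.1f} h, 16-way parallel: {parallel / 60:.1f} h")
```

The engineer-hours saved are smaller than the wall-clock gain suggests, since much of a sequential bring-up is unattended waiting; that distinction matters when the figure goes into a TCO line.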
Architect impact. This is a forward-looking feature: Auto Deploy is still there, but its trajectory is clear. If you build a new VCF 9.1 platform, don’t reinvest in a custom Auto Deploy mechanism; go straight to the ZTP model. If you operate an existing fleet with heavily scripted Auto Deploy, plan the migration as a project in its own right: hooks, profiles, and Auto Deploy scripts don’t carry over as-is. The operational gain (sharply reduced bring-up time, fewer engineer-hours per host) is a real line in the TCO calculation.
Scale to 5000 hosts and vMotion encryption offload
Two evolutions that act on both scale and operational cost.
Scale. A VCF 9.1 instance now supports up to 5000 ESX hosts. Beyond the marketing figure, the value is domain consolidation: fewer instances to operate for the same physical fleet, therefore fewer control planes, fewer consoles, less governance overhead. The operational cost of a platform doesn’t grow linearly with the host count if you reduce the number of instances.
vMotion encryption offload. vMotion traffic encryption was until now carried by the host CPU — a notable cost during mass migrations (maintenance, DRS rebalancing, host evacuation). In 9.1, this encryption is offloaded to network hardware (capable NICs). Broadcom claims ~70% CPU savings during encrypted migrations. Concretely: maintenance windows shorten, and the recovered CPU stays available for workloads during operations.
| Aspect | VCF 9.0 | VCF 9.1 |
|---|---|---|
| Max hosts per instance | Below 5000 | Up to 5000 ESX hosts |
| vMotion encryption | Software, host CPU | Hardware offload on capable NICs |
| Encrypted migration CPU cost | Full CPU price | ~70% CPU savings claimed |
| Maintenance window impact | Limited by encryption CPU | Faster migrations, less workload impact |
Architect impact. vMotion offload only works on capable NICs: it’s a BOM decision, not a software flag. On a heterogeneous fleet, the benefit is partial until all NICs are aligned. If security policy mandates vMotion encryption, write the NIC requirement into your host purchasing standards.
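How partial the benefit is on a mixed fleet can be estimated roughly. The sketch below rests on two assumptions that the source does not confirm: that both the source and destination host need a capable NIC for a migration to be offloaded, and that migrations pair hosts uniformly at random. The 70% figure is Broadcom’s claim; the function names are hypothetical.

```python
def offload_eligible_fraction(capable_hosts: int, total_hosts: int) -> float:
    """Share of (source, destination) host pairs where both ends have a
    capable NIC. Assumes both ends are required and pairs are uniform."""
    if total_hosts < 2:
        raise ValueError("need at least two hosts to migrate between")
    if not 0 <= capable_hosts <= total_hosts:
        raise ValueError("capable_hosts must be between 0 and total_hosts")
    eligible_pairs = capable_hosts * (capable_hosts - 1)
    all_pairs = total_hosts * (total_hosts - 1)
    return eligible_pairs / all_pairs


def expected_cpu_savings(capable_hosts: int, total_hosts: int,
                         claimed_savings: float = 0.70) -> float:
    """Fleet-wide average encryption CPU savings across all migrations."""
    return claimed_savings * offload_eligible_fraction(capable_hosts, total_hosts)


# Hypothetical 32-host cluster at different stages of a NIC refresh.
for capable in (8, 16, 24, 32):
    savings = expected_cpu_savings(capable, 32)
    print(f"{capable}/32 capable hosts: about {savings:.0%} average CPU savings")
```

Under these assumptions the benefit grows roughly with the square of the capable share, so a half-refreshed fleet captures well under half of the claimed savings.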
VCF Management Services: a common runtime
VCF 9.1 unifies the execution of management services (lifecycle, operations) under a common runtime across the stack. Fewer redundant management components to patch and operate is an operational efficiency line often underestimated in TCO calculations.
The value for the architect: the management surface to maintain shrinks, dependencies between management components are rationalized, and management-layer patch windows simplify. It’s not spectacular in a demo, but over three years of operations it’s several hundred engineer-hours saved on a large platform.
-40% TCO: where does the number come from?
The number isn’t a single massive gain; it’s the combination of four contributions that compound. Here’s the honest breakdown.
Memory tiering — DRAM is the most expensive hardware line. Extending effective memory by 1.5 to 2x without buying a stick directly reduces hardware cost per VM. This is probably the largest contribution to the number.
Storage efficiency — global dedup and extended compression reduce consumed vSAN TiBs, hence both storage hardware and the vSAN license billed per TiB.
Consolidation — more VMs per host (combined memory + storage effect) means fewer physical hosts for the same load: fewer licensed VCF cores, less power, less rack, less cooling.
OpEx reduction — ZTP, 5000-host scale, unified management runtime: fewer engineer-hours to provision, operate and patch. OpEx weighs heavily in a three- or five-year TCO.
The honest reading. “Up to -40%” is a ceiling, not an average. The number assumes a workload mix favorable to tiering (lots of cold pages), highly redundant data for dedup, and an organization able to capitalize on the OpEx reduction. A fully latency-sensitive fleet with poorly redundant datasets will see a fraction of that gain. The right architect reflex: redo the calculation on your workload mix, with explicit assumptions, and present a range — not the ceiling figure alone.
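Redoing the calculation as a range can look like the sketch below. It is a deliberately simple weighted-sum model, not Broadcom’s methodology: each cost line gets a share of the baseline TCO plus a pessimistic and an optimistic reduction. Every share and percentage here is a made-up placeholder to be replaced with your own assumptions, and `CostLine` and `tco_reduction_range` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostLine:
    name: str
    share: float           # share of baseline TCO; all shares must sum to 1.0
    low_reduction: float   # pessimistic reduction on this line (0.0 to 1.0)
    high_reduction: float  # optimistic reduction on this line (0.0 to 1.0)


def tco_reduction_range(lines: list[CostLine]) -> tuple[float, float]:
    """Return (pessimistic, optimistic) overall TCO reduction."""
    total_share = sum(line.share for line in lines)
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError(f"shares must sum to 1.0, got {total_share}")
    low = sum(line.share * line.low_reduction for line in lines)
    high = sum(line.share * line.high_reduction for line in lines)
    return low, high


# Placeholder assumptions for illustration only.
lines = [
    CostLine("Compute hardware (DRAM-driven)", 0.30, 0.10, 0.40),
    CostLine("Storage hardware and vSAN licensing", 0.20, 0.10, 0.50),
    CostLine("VCF core licensing, power, rack", 0.25, 0.05, 0.35),
    CostLine("Operations (engineer-hours)", 0.25, 0.05, 0.30),
]
low, high = tco_reduction_range(lines)
print(f"TCO reduction range: {low:.1%} to {high:.1%}")
```

With these placeholder inputs the range runs from single digits to just under the ceiling, which is the shape of result worth putting on a budget slide: two bounds, each traceable to a named assumption.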
Pitfalls & points of attention
- Memory tiering and latency-sensitive workloads
- CPU cost and rebuild of global vSAN dedup
- Auto Deploy to ZTP migration
- vMotion offload: hardware dependency
- The -40% TCO is a conditional ceiling
- NVMe tiering needs a dedicated device, not the vSAN datastore
Conclusion
Memory lever
Production NVMe memory tiering is the primary TCO engine: 1.5 to 2x effective memory with no DRAM purchase, at the cost of added latency that must be weighed per workload profile.
Storage lever
Global vSAN dedup + at-rest encryption of deduplicated data: fewer TiBs purchased and licensed, without having to choose between efficiency and compliance.
Operational lever
ZTP, 5000-host scale and vMotion offload reduce OpEx and shorten windows — a line underestimated in a three- or five-year TCO.
Next step. If infrastructure efficiency is the cost lever, the network is the scale lever. The next article, Networking & scale, decodes the VCF 9.1 networking features and what they change for large-scale architectures — the logical continuation once the TCO calculation is set.
Further reading.
- VCF 9.1 Release Notes — the official reference, read before any project
- VCF 9.1 announcement — the Broadcom post that sets the TCO number
- What’s new vSphere 9.1 — Elastic Provisioning and ZTP detail
- William Lam and vmexplorer — reference community deep-dives