NeoClouds: An Introduction for Optical Engineers

1. Introduction

CoreWeave reported first-quarter 2026 revenue of $2.078 billion and a contracted revenue backlog of $99.4 billion, operating roughly 1 GW of active power across 49 data centers, on its way to a stated 8 GW by 2030 (CoreWeave Q1 2026 results, SEC Form 8-K). A company that did not run a single AI cluster a decade ago now sits on a backlog larger than the annual revenue of several established telecom operators. It rents almost nothing but graphics processing units (GPUs). It is the reference example of a NeoCloud.

A NeoCloud is a cloud provider built almost entirely around renting high-end GPUs for artificial intelligence (AI), rather than the general-purpose menu a hyperscaler offers. SemiAnalysis defines it as a new class of provider focused on GPU compute rental; McKinsey calls them independent GPU-as-a-service (GPUaaS) providers. The term arrived in late 2024 and went mainstream through 2025. The four most-cited "NeoCloud Giants" are CoreWeave, Nebius, Lambda, and Crusoe.

For an optical engineer, the business label matters less than the engineering consequence. A NeoCloud is a dense, greenfield GPU fabric where copper stops at the rack and everything beyond it is optical. The back-end fabric that ties GPUs together accounts for roughly 85% of the networking cost and 86% of the networking power in a representative GB300 NVL72 InfiniBand cluster (SemiAnalysis); transceivers alone can reach 10% of a cluster's total cost of ownership (TCO). The optics are not a supporting cast. They are the product's main physical constraint. This article explains what a NeoCloud is, how its three distinct networks fit together, and where the optical layer sets the limits on cost, power, and reliability.

2. What Defines a NeoCloud

A hyperscaler sells more than two hundred services, of which GPU instances are one. A NeoCloud sells GPU-hours, on bare metal or a thin virtual machine, usually as a single-tenant cluster reserved for one training job. The commercial difference is sharp: NeoClouds price 60–85% below hyperscalers for identical silicon. As of May/June 2026, an NVIDIA H100 rents at roughly $2.0–2.6/hour on a NeoCloud against about $6.88/hour on AWS and $12.29/hour on Azure (SemiAnalysis pricing index). The Uptime Institute measured an eight-GPU DGX H100 node at $98/hour on hyperscalers versus $34/hour on NeoClouds, a saving of about 65%.
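The discount figures follow from simple arithmetic on the quoted rates. The sketch below is a minimal illustration that uses only the prices cited in this section; the function and constant names are illustrative, not part of any pricing tool.

```python
def saving_pct(hyperscaler_rate: float, neocloud_rate: float) -> float:
    """Percentage saved by paying the NeoCloud rate instead of the hyperscaler rate."""
    if hyperscaler_rate <= 0:
        raise ValueError("hyperscaler_rate must be positive")
    return 100.0 * (hyperscaler_rate - neocloud_rate) / hyperscaler_rate


# Rates quoted in the text, in USD per hour.
H100_NEOCLOUD_RANGE = (2.0, 2.6)   # per GPU-hour
H100_AWS = 6.88                    # per GPU-hour
H100_AZURE = 12.29                 # per GPU-hour
DGX_NODE_HYPERSCALER = 98.0        # per eight-GPU node-hour
DGX_NODE_NEOCLOUD = 34.0           # per eight-GPU node-hour

if __name__ == "__main__":
    # Midpoint of the quoted NeoCloud range, as used in Figure 4.
    neocloud_mid = sum(H100_NEOCLOUD_RANGE) / 2
    print(f"H100 vs AWS:   {saving_pct(H100_AWS, neocloud_mid):.1f}% saving")
    print(f"H100 vs Azure: {saving_pct(H100_AZURE, neocloud_mid):.1f}% saving")
    print(f"DGX node:      {saving_pct(DGX_NODE_HYPERSCALER, DGX_NODE_NEOCLOUD):.1f}% saving")
```

Both per-GPU comparisons land inside the 60–85% band quoted above when the NeoCloud midpoint is used.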

That gap exists because the two architectures are built for different traffic. Hyperscaler platforms are central-processing-unit-first (CPU-first): heavily virtualized, running hypervisors over general-purpose CPUs to serve millions of small, independent workloads. NeoClouds are GPU-first from the foundation, stripping the managed-services stack to expose raw, tightly-coupled compute. The physical footprint follows: a traditional enterprise rack draws 3–8 kW, while an AI-dense GPU rack draws 30 kW to over 120 kW, with next-generation systems projected past 600 kW. NeoClouds design facilities around those densities and the liquid cooling they demand; hyperscalers more often retrofit legacy halls.

Why they emerged

Three forces created the category. GPU scarcity left labs and startups unable to secure NVIDIA allocation that hyperscalers had locked up. NVIDIA deliberately diversified its customer base, including direct investment and preferential allocation to NeoClouds. And the economics of dedicated AI infrastructure rewarded operators who optimized purely for GPU density and utilization rather than service breadth. CoreWeave runs Kubernetes-native bare metal and reports model FLOPs utilization (MFU) above 50% on Hopper, against roughly 30% for some competitors.

The financing model is the part most engineers miss. NeoClouds borrow against contracted cash flow to buy GPUs: multi-year take-or-pay contracts secure the revenue, and that revenue backs GPU-collateralized debt. CoreWeave's anchor contracts include OpenAI (around $22.4 billion through 2031) and Meta (an initial $14.2 billion, expanded by a further $21 billion through 2032). Nebius, spun out of Yandex in 2024, signed a Microsoft deal reported at $17.4–19.4 billion and took a $2 billion NVIDIA investment. The structure is efficient and fragile at once: customer concentration (Microsoft was reported at 60–71% of CoreWeave revenue at one point) and circular NVIDIA financing are real risks flagged in SEC filings, not settled facts.

Market context

Per Synergy Research Group (October 2025), NeoCloud revenue passed $5 billion in a single quarter, growing 205% year over year, and was tracking to exceed $23 billion for full-year 2025, with a forecast near $180 billion by 2030 at roughly 69% annual growth. McKinsey counts more than 100 NeoClouds worldwide, of which only 10–15 operate at meaningful scale in the United States. The tail is long; the leaders are few.

The standards framework that governs the physical layer is the same one optical engineers already work in. The back-end Ethernet path follows IEEE 802.3 (802.3df-2024 covers 200/400/800 Gigabit Ethernet at 100G per lane; 802.3dj adds 1.6 Terabit Ethernet and 200G per lane, targeting completion around mid-2026). Coherent inter-site links follow OIF Implementation Agreements (the 800ZR IA published October 2024) and ITU-T G.694.1 for the dense wavelength-division-multiplexing (DWDM) grid. InfiniBand follows the InfiniBand Trade Association specification (NDR at 400 Gb/s per port, XDR at 800 Gb/s). Ethernet back-ends increasingly follow the Ultra Ethernet Consortium specification 1.0, released June 2025. None of this is NeoCloud-specific; the NeoCloud just consumes it at unusual density. For a refresher on how router-hosted coherent optics reshaped this layer, see the MapYourTech guide on IP over DWDM (IPoDWDM).

Takeaway: A NeoCloud is a single-product cloud—GPU-hours—sold 60–85% below hyperscaler rates because it carries none of the general-purpose overhead. The cost discipline that makes it cheaper also makes the optical fabric its dominant capital and power line.

3. The Three Fabrics of an AI Cluster

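A modern AI cluster is not one network. It is three: the scale-up fabric inside the rack, the back-end (scale-out) fabric between racks, and the front-end fabric that feeds data and checkpoints. Each has a different job, a different protocol, and a different physical medium. Confusing them is the most common error a transport engineer makes when first costing a GPU build.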

Figure 1: The three fabrics of an AI cluster. Scale-up stays on copper inside the rack; scale-out is optical across the datacenter; the front-end Ethernet network feeds data and checkpoints; scale-across links sites over coherent DWDM. Sources: SemiAnalysis, NVIDIA GTC 2025, OIF.

Drawn as a physical topology, the back-end is a leaf-spine-core Clos that climbs from copper inside each rack to coherent DWDM between sites. Traffic changes optical medium at every tier, and the module and fiber type are set by the reach of that hop.

Figure 2: The physical optical topology of a NeoCloud back-end. GPU pods climb through rail leaves to a non-blocking spine fabric and a core/DCI edge, then out over a DWDM open line system to a remote cluster. The module and fiber change with the reach of each hop. Sources: NVIDIA platform guidance, OIF, ITU-T G.652/G.654.

The back-end fabric: where the optics live

The back-end (scale-out) fabric carries GPU-to-GPU traffic during distributed training—the synchronized all-reduce and all-gather collectives that keep model replicas consistent. It runs one network interface card (NIC) per GPU at 400G or 800G, wired as a rail-optimized fat-tree or Clos. "Rail-optimized" means each GPU connects to a dedicated leaf switch so same-rank traffic stays local and avoids extra hops; the cost is strict cabling discipline, because a single miswire breaks rail locality. This is the fabric that is roughly 85% of networking cost and power, and it is entirely optical, because at 800G PAM4 (four-level pulse-amplitude modulation) signaling copper degrades beyond about one meter. The MapYourTech overview of intra-DC versus inter-DC optics maps these reaches against module families.
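Rail locality can be checked mechanically against a cabling record. The sketch below assumes eight GPUs per server and that GPU k of every server in a pod lands on leaf k; both are illustrative assumptions rather than a description of any particular vendor's layout, and the function names are hypothetical.

```python
from typing import Dict, List, Tuple

GPUS_PER_NODE = 8  # assumption: eight GPUs, each with its own back-end NIC


def expected_leaf(gpu_index: int) -> int:
    """In this simplified pod, GPU k of every node connects to leaf k (rail k)."""
    return gpu_index % GPUS_PER_NODE


def find_miswires(cabling: Dict[Tuple[int, int], int]) -> List[Tuple[int, int, int, int]]:
    """Return (node, gpu_index, actual_leaf, expected_leaf) for every off-rail link.

    `cabling` maps (node, gpu_index) to the leaf switch id the cable was patched into.
    """
    errors = []
    for (node, gpu), leaf in sorted(cabling.items()):
        want = expected_leaf(gpu)
        if leaf != want:
            errors.append((node, gpu, leaf, want))
    return errors


if __name__ == "__main__":
    # Build a correctly cabled four-node pod, then swap two cables on node 2.
    cabling = {(n, g): expected_leaf(g) for n in range(4) for g in range(GPUS_PER_NODE)}
    cabling[(2, 3)], cabling[(2, 5)] = cabling[(2, 5)], cabling[(2, 3)]
    for node, gpu, actual, want in find_miswires(cabling):
        print(f"node {node} GPU {gpu}: on leaf {actual}, expected leaf {want}")
```

A single swapped pair shows up as two off-rail links, which is why one patching mistake is enough to break locality for both rails involved.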

The front-end fabric: keeping GPUs fed

The front-end is a conventional Ethernet network handling data loading and checkpointing, typically two to four GPUs per NIC. It is cheaper and lower-stakes than the back-end, but not optional: if the storage pipeline cannot stream datasets and checkpoint artifacts fast enough, expensive GPUs sit idle. NeoClouds attack this with software-defined storage that pairs flash for ingestion bursts with hybrid nodes for older checkpoints, reaching greater than 40% flash efficiency where one solid-state drive can serve the I/O of up to seven GPUs.

Takeaway: Cost the back-end fabric first. It is the optical network, it is about 85% of networking spend and power, and it is the only one of the three that has no copper option once the cluster spans more than a rack.

4. Scaling Up, Out, and Across

The industry organizes interconnect by reach, and each tier has a hard physical boundary where the medium changes. Getting the boundary right is the whole game in cluster design.

Figure 3: The reach ladder. Copper covers scale-up to about seven meters; pluggable optics carry scale-out across the building; coherent modules carry scale-across between sites. The medium changes at each boundary, and so does the failure mode.

Scale-up: copper by deliberate choice

Scale-up couples GPUs into one logical accelerator at the highest bandwidth and lowest latency. The NVIDIA GB200 NVL72 packs 72 Blackwell GPUs into a single rack as one NVLink domain—1.8 TB/s per GPU, around 130 TB/s aggregate bisection—over more than 5,000 copper cables on a blind-mate backplane. This is copper on purpose. NVIDIA's Ian Buck explained that a disaggregated optical NVLink design needed so many transceivers that roughly half the power went to optics, so it never reached production; copper saves about 20 kW per NVL72 rack. The boundary is physical: copper holds to roughly two meters at these rates, and stretches to five to seven meters only with active electrical cable.

Scale-out: optical the moment you leave the rack

Scale-out connects thousands of GPUs across racks and rows within one facility for east-west training traffic, on a three-tier leaf-spine-core Clos. Here a long-running architectural contest has shifted. InfiniBand held the AI back-end for years on native lossless transport and sub-microsecond latency, but RDMA over Converged Ethernet version 2 (RoCEv2) now accounts for roughly 70% of new AI infrastructure deployments (confirmed in Broadcom's Q1 2026 earnings commentary). InfiniBand keeps a latency edge—about 1–2 µs versus 5–10 µs for RoCEv2—but Ethernet wins on multi-vendor economics and operational familiarity, helped by Ultra Ethernet Consortium extensions for congestion control and packet spraying. The 650 Group projects around 91% of AI workloads on Ethernet by 2029.
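The number of tiers a Clos needs depends on switch radix. As a rough sizing aid, the sketch below uses the textbook bound for a non-blocking fat-tree built from identical switches, 2 × (k/2)^n endpoints for radix k and n tiers. It assumes one switch port per GPU NIC and no oversubscription; real rail-optimized designs deviate from this idealized figure.

```python
def max_endpoints(radix: int, tiers: int) -> int:
    """Endpoints supported by a non-blocking fat-tree of identical radix-port switches."""
    if radix < 2 or radix % 2 != 0:
        raise ValueError("radix must be an even number of ports, at least 2")
    if tiers < 1:
        raise ValueError("tiers must be at least 1")
    return 2 * (radix // 2) ** tiers


def tiers_needed(n_gpus: int, radix: int, max_tiers: int = 4) -> int:
    """Smallest tier count whose non-blocking fat-tree holds n_gpus endpoints."""
    for tiers in range(1, max_tiers + 1):
        if max_endpoints(radix, tiers) >= n_gpus:
            return tiers
    raise ValueError("cluster does not fit within max_tiers")


if __name__ == "__main__":
    for radix in (64, 128):
        for tiers in (2, 3):
            print(f"radix {radix}, {tiers} tiers: {max_endpoints(radix, tiers):,} endpoints")
```

Each extra tier adds another layer of links, and therefore transceivers, per GPU, which is why switch radix matters to the optics budget as much as to the switch count.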

Scale-across: training that outgrows a building

A single site has a power ceiling. When one training job needs more power than a campus can deliver, operators split it across buildings or metros and stitch the clusters together with 800ZR and 800ZR+ coherent pluggables over leased or dark fiber on DWDM line systems, reaching up to about 2,000 km (Marvell COLORZ class). The constraint is the collective-communication latency budget: demonstrations have held distributed training within roughly 1,000 km. This is the point where the datacom and telecom worlds meet, and where transport engineers earn their keep—the MapYourTech guide on 800G ZR/ZR+ coherent optics covers the module families used here, and the DCI technologies overview frames the line systems behind them.

Where it breaks

Coherent reach is bounded by optical signal-to-noise ratio (OSNR) and accumulated dispersion, but the scale-across ceiling is usually not the optics—it is latency. Every additional 100 km adds about 0.5 ms of one-way propagation delay in fiber, and synchronized collectives stall waiting for the slowest replica. You can buy more reach with a better forward-error-correction (FEC) mode; you cannot buy back the speed of light.
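The propagation figure can be reproduced from first principles. The sketch below assumes a group index of about 1.468, a typical value for standard single-mode fiber near 1550 nm, and counts only time of flight in the glass; transponder, FEC, and switching latency are ignored, and real fiber routes run longer than the straight-line distance.

```python
C_VACUUM_KM_PER_S = 299_792.458  # speed of light in vacuum
GROUP_INDEX = 1.468              # assumption: typical single-mode fiber near 1550 nm


def one_way_delay_ms(route_km: float, group_index: float = GROUP_INDEX) -> float:
    """One-way propagation delay over a fiber route, in milliseconds."""
    if route_km < 0:
        raise ValueError("route_km must be non-negative")
    return route_km * group_index / C_VACUUM_KM_PER_S * 1e3


def round_trip_ms(route_km: float, group_index: float = GROUP_INDEX) -> float:
    """Round-trip propagation delay, assuming the same route in both directions."""
    return 2.0 * one_way_delay_ms(route_km, group_index)


if __name__ == "__main__":
    for km in (100, 500, 1000, 2000):
        print(f"{km:>5} km: one-way {one_way_delay_ms(km):.2f} ms, "
              f"round trip {round_trip_ms(km):.2f} ms")
```

At 1,000 km the round trip is already close to 10 ms before any equipment delay, which is the budget a synchronized collective has to absorb on every exchange.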

5. The Optics Budget: Cost, Power, Failure

Two numbers decide whether a NeoCloud build pencils out: how much optical power the fabric consumes, and how often it fails. Both scale with transceiver count, and transceiver count scales faster than GPU count because each GPU link crosses multiple switch tiers.

How many transceivers, and how much power

In a rail-optimized multi-tier fat-tree, each GPU link traverses leaf and spine, and every endpoint carries a transceiver. SemiAnalysis and NVIDIA use a representative figure of about six transceivers per GPU. The power budget follows directly.

Optical power scaling of the back-end fabric

P_optics = N_GPU × T_per GPU × P_xcvr

Where: N_GPU = number of GPUs; T_per GPU ≈ 6 transceivers per GPU (multi-tier fat-tree); P_xcvr ≈ 30 W per 800G transceiver. Result P_optics is total back-end transceiver power, in watts.

Practical Example — the million-GPU optics bill

Take a hypothetical million-GPU fleet, the scale NVIDIA framed at GTC 2025. With six transceivers per GPU, the fabric needs about 6,000,000 transceivers. At 30 W each, that is 6,000,000 × 30 W = 180,000,000 W, or 180 MW of transceiver power alone—before a single GPU draws current. Per GPU that is 6 × 30 W = 180 W of optics and roughly $6,000 of transceiver cost (Jensen Huang's GTC 2025 framing). This is why a 3.05 kW saving per switch from co-packaged optics translates directly into freed-up GPU power, and why linear pluggable optics (LPO) and co-packaged optics (CPO) moved from curiosity to strategy. See the MapYourTech analysis of co-packaged optics and the path to 1.6T switches and the broader treatment of energy efficiency in optical networks.
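The scaling relation is easy to turn into a small calculator for what-if comparisons. The sketch below uses the article's representative values as defaults (six transceivers per GPU, 30 W per module); the lower per-module wattages in the sweep are illustrative placeholders, not measured figures for any product.

```python
def optics_power_watts(n_gpus: int,
                       xcvr_per_gpu: float = 6.0,
                       watts_per_xcvr: float = 30.0) -> float:
    """Total back-end transceiver power: N_GPU x transceivers per GPU x watts per transceiver."""
    if n_gpus < 0 or xcvr_per_gpu < 0 or watts_per_xcvr < 0:
        raise ValueError("inputs must be non-negative")
    return n_gpus * xcvr_per_gpu * watts_per_xcvr


if __name__ == "__main__":
    fleet = 1_000_000
    # 30 W is the article's figure; 20 W and 10 W are hypothetical comparison points.
    for watts in (30.0, 20.0, 10.0):
        total_mw = optics_power_watts(fleet, watts_per_xcvr=watts) / 1e6
        per_gpu = optics_power_watts(1, watts_per_xcvr=watts)
        print(f"{watts:>4.0f} W/module: {total_mw:6.1f} MW fleet-wide, {per_gpu:.0f} W per GPU")
```

Because the relation is a plain product, every watt removed per module comes back six times over per GPU under these assumptions.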

How often the fabric fails

A synchronized all-reduce stalls if any one link drops. With many thousands of independent transceivers, the aggregate reliability is far worse than any single link suggests, because failure rates add.

Expected time to first link failure

MTTF_system ≈ MTTF_link ÷ N_links

Where: MTTF_link = mean time to failure of one link; N_links = number of links in the fabric. For independent failures the system rate is the sum of the per-link rates, so the expected time to the first failure is the per-link MTTF divided by the link count.

Practical Example — 26 minutes to first failure

SemiAnalysis worked this for a 100,000-GPU cluster with one NIC-to-leaf link per GPU. Even at a generous 5-year MTTF per link—about 2,629,800 minutes—the expected time to the first failure is 2,629,800 ÷ 100,000 ≈ 26.3 minutes on a brand-new, fully working cluster. That is why NeoCloud operators engineer for failure rather than against it: memory-reconstruction recovery instead of checkpoint-restart, Common Management Interface Specification (CMIS) FEC telemetry to predict transceiver degradation before it drops the link, and pre-staged spares. At 100,000 GPUs you will see failures within the hour, every hour.
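The same arithmetic, plus the probability of at least one failure inside a given window, can be sketched as follows. The probability calculation assumes independent links with exponentially distributed lifetimes, which is a modelling assumption rather than something the cited analysis states.

```python
import math

MINUTES_PER_YEAR = 365.25 * 24 * 60  # 525,960 minutes


def minutes_to_first_failure(mttf_years: float, n_links: int) -> float:
    """Expected time to the first link failure: per-link MTTF divided by link count."""
    if n_links <= 0:
        raise ValueError("n_links must be positive")
    return mttf_years * MINUTES_PER_YEAR / n_links


def prob_failure_within(window_min: float, mttf_years: float, n_links: int) -> float:
    """P(at least one link fails within the window), assuming exponential lifetimes."""
    system_mttf = minutes_to_first_failure(mttf_years, n_links)
    return 1.0 - math.exp(-window_min / system_mttf)


if __name__ == "__main__":
    mttf = minutes_to_first_failure(5.0, 100_000)
    p_hour = prob_failure_within(60.0, 5.0, 100_000)
    print(f"Expected time to first failure: {mttf:.1f} minutes")
    print(f"Chance of a failure in any given hour: {p_hour:.0%}")
```

Under these assumptions the chance of at least one link failure in any given hour comes out at roughly 90%, which is the quantitative content of "within the hour, every hour".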

The economic chain

The cost stack is consistent across published analyses: the back-end fabric is roughly 85% of networking cost and power; transceivers are about 60% of networking cost; and transceivers can be around 10% of total cluster TCO. That last figure is what makes optics a board-level topic at a NeoCloud. Cutting transceiver power and cost is not an optimization—it directly changes how many GPUs a fixed power envelope can host, and a NeoCloud's whole business is converting a constrained power budget into billable GPU-hours.

Figure 4: On-demand H100 pricing per GPU-hour, May/June 2026. NeoCloud value is the midpoint of the $2.0–2.6 range. Source: SemiAnalysis pricing index; AWS and Azure list pricing. Data table: NeoCloud $2.30, AWS $6.88, Azure $12.29 per GPU-hour.

Takeaway: Transceiver count drives both the power bill and the failure rate. Both grow at least in proportion to GPU count, and faster whenever growth forces an extra switch tier. Design the optical fabric for predictable degradation and fast recovery, not for a heroic mean-time-between-failures number that the scale will erase within the hour.

6. Building a NeoCloud Network

The physical build comes down to choosing the right fiber and module for each reach, securing supply early, and not repeating the networking mistakes the first wave of NeoClouds made.

Fiber and connectors by reach

Two fiber types cover the datacenter. Multimode OM4 (aqua) carries short-reach SR modules to 100 m on VCSEL sources; single-mode OS2 to ITU-T G.652.D carries DR and FR modules from 150 m upward on DFB lasers. The mismatch is unforgiving: an SR transceiver needs multimode fiber, a DR/FR module needs single-mode, and the wrong pairing fails the link. Parallel optics dominate the connectors: MPO-12 for SR4/DR4 (8 fibers used), MPO-16 now preferred for 800G SR8/DR8 (16 fibers), and MPO-24 for 1.6T headroom. AI fabrics push trunk counts to 144, 288, and 432 strands; sizing trunks at 144-fiber ribbonized OS2 and above is the working recommendation. For inter-site coherent links, large-effective-area, ultra-low-loss G.654.E fiber (attenuation of roughly 0.16–0.18 dB/km at 1550 nm) extends terrestrial coherent reach. The MapYourTech reference on pluggable nomenclature and naming conventions decodes the SR/DR/FR/ZR labels in full.
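The module-to-fiber pairing rule lends itself to a pre-installation check on a cabling plan. The sketch below is a minimal illustration: the SR reach is the 100 m figure from the text, while the 500 m and 2 km limits for DR and FR are the nominal reaches usually associated with those module names and are included here as assumptions. The table and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ModuleFamily:
    fiber_kind: str   # "multimode" or "single-mode"
    reach_m: int      # nominal reach in meters


# SR reach is from the text; DR and FR reaches are nominal values (assumption).
MODULE_FAMILIES = {
    "SR": ModuleFamily("multimode", 100),
    "DR": ModuleFamily("single-mode", 500),
    "FR": ModuleFamily("single-mode", 2000),
}

FIBER_KINDS = {"OM4": "multimode", "OS2": "single-mode"}


def check_link(module: str, fiber: str, length_m: float) -> List[str]:
    """Return a list of problems for one planned link; an empty list means it passes."""
    family = MODULE_FAMILIES[module]
    problems = []
    if FIBER_KINDS[fiber] != family.fiber_kind:
        problems.append(f"{module} needs {family.fiber_kind} fiber, but {fiber} is {FIBER_KINDS[fiber]}")
    if length_m > family.reach_m:
        problems.append(f"{length_m:.0f} m exceeds the nominal {family.reach_m} m reach of {module}")
    return problems


if __name__ == "__main__":
    plan = [("SR", "OM4", 60), ("SR", "OS2", 60), ("DR", "OS2", 800)]
    for module, fiber, length in plan:
        issues = check_link(module, fiber, length)
        print(module, fiber, length, "OK" if not issues else "; ".join(issues))
```

Running a check like this against the cable schedule before trunks are pulled is cheaper than discovering a wrong pairing when the link fails to come up.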

Figure 5: Rack power density escalation. Enterprise racks sit at 3–8 kW; GB200 NVL72 reaches about 120 kW; the projected Rubin Ultra NVL576 reaches about 600 kW. Source: NVIDIA platform specifications, SemiAnalysis. Data table: Enterprise 8 kW, AI-dense 100 kW, GB200 NVL72 120 kW, NVL576 600 kW.

Module choice and supply

For new scale-out builds in 2026, 800G LPO or conventional pluggable remains the mainstream choice on multi-vendor availability and operational familiarity; CPO delivers the lowest power per port but trades away field replaceability and locks the build tighter to one vendor. Lambda was among the first NeoClouds to deploy NVIDIA's Quantum-X InfiniBand co-packaged switch, shown publicly in June 2026. Supply is a live constraint, not a footnote: one NeoCloud hit a six-month lead time on validated optics, so qualifying multiple merchant vendors (InnoLight, Eoptolink, Coherent, Lumentum) and standardizing trunk cabling early is now standard practice. Roughly 24 million 800G-and-above transceivers shipped in 2025, with about 63 million projected for 2026.

The vendor landscape

Switch silicon and switching platforms come from NVIDIA (Quantum InfiniBand, Spectrum Ethernet), Broadcom (Tomahawk, Jericho, Bailly CPO), Arista, Cisco (Silicon One), and Marvell. Coherent transport and DCI are supplied by Ciena, Nokia, Cisco/Acacia, Marvell, Ribbon, Adtran, SmartOptics, PacketLight, Huawei, and ZTE. For how router-hosted coherent has restructured this market, the MapYourTech analysis of how AI is reshaping optical transport hardware and the survey of open line systems for multi-vendor coherent wavelengths are good next stops.

Table 1: NeoCloud versus hyperscaler — engineering and commercial contrast
Dimension	Hyperscaler	NeoCloud
Core architecture	CPU-first, heavily virtualized, 200+ services	GPU-first, bare-metal / thin VM, single-tenant clusters
Pricing model	Layered, with ingress/egress and API fees	Transparent per-GPU-hour
H100 on-demand	~$6.88–12.29 /hr	~$2.0–2.6 /hr
Rack power density	Retrofit to 30 kW+; standard 3–8 kW	Native 30–120 kW+, liquid-cooled
Optical network	Broad mesh, custom optics, OCS, subsea	Dense greenfield, merchant optics, speed-to-deploy
Financing	Balance-sheet capex	Take-or-pay contracts, GPU-backed debt

The networking gap

NeoClouds scaled compute brilliantly and connectivity poorly. A 2026 Omdia audit of fifty NeoCloud providers found systemic immaturity: 46% controlled only small blocks of IPv4 addresses, one in five relied on a single IP transit provider (a single point of failure for an AI workload), and more than half used no internet exchange peering at all. The product a NeoCloud sells is the conversion of power, space, and hardware into billable output, and that conversion runs entirely over the network—so weak external connectivity caps the value of even a flawless GPU fabric.

Deployment checklist

Before committing a NeoCloud cluster: demand 400G InfiniBand or a tuned RoCEv2 back-end; run an NCCL all-reduce benchmark and require greater than 90% scaling efficiency at your GPU count; size aggregate storage at 250 GB/s or more; keep scale-up on copper inside the rack; and qualify at least two merchant optics vendors against the six-month lead-time risk. Train on a NeoCloud, and if you need 200+ managed services or strict multi-region compliance, serve inference on a hyperscaler—a common split.
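The quantitative items on the checklist can be encoded as a simple acceptance test. The sketch below assumes scaling efficiency is defined as the all-reduce bus bandwidth measured at the target GPU count divided by the bus bandwidth measured at a small baseline; that definition, the class, and the sample numbers are illustrative assumptions, and only the thresholds (90%, 250 GB/s, two vendors) come from the checklist.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClusterOffer:
    baseline_busbw_gbps: float   # all-reduce bus bandwidth at a small baseline (GB/s)
    scaled_busbw_gbps: float     # all-reduce bus bandwidth at the target GPU count (GB/s)
    storage_gb_per_s: float      # aggregate storage throughput (GB/s)
    optics_vendors: int          # number of qualified merchant optics vendors


def scaling_efficiency(offer: ClusterOffer) -> float:
    """Ratio of bus bandwidth at scale to baseline bus bandwidth (assumed definition)."""
    if offer.baseline_busbw_gbps <= 0:
        raise ValueError("baseline bus bandwidth must be positive")
    return offer.scaled_busbw_gbps / offer.baseline_busbw_gbps


def failed_checks(offer: ClusterOffer) -> List[str]:
    """Return the checklist items this offer fails; an empty list means it passes."""
    failures = []
    if scaling_efficiency(offer) <= 0.90:
        failures.append("all-reduce scaling efficiency is not above 90%")
    if offer.storage_gb_per_s < 250.0:
        failures.append("aggregate storage is below 250 GB/s")
    if offer.optics_vendors < 2:
        failures.append("fewer than two qualified optics vendors")
    return failures


if __name__ == "__main__":
    # Hypothetical measurements for illustration only.
    offer = ClusterOffer(baseline_busbw_gbps=48.0, scaled_busbw_gbps=44.5,
                         storage_gb_per_s=300.0, optics_vendors=1)
    print(f"Scaling efficiency: {scaling_efficiency(offer):.1%}")
    for item in failed_checks(offer) or ["all checks passed"]:
        print("-", item)
```

The remaining checklist items, such as the choice of back-end transport and keeping scale-up on copper, are design reviews rather than numeric thresholds and are left out of the sketch.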

7. Where NeoClouds Are Heading

The next refresh cycle pushes optics deeper into the switch. NVIDIA's Quantum-X Photonics (InfiniBand) and Spectrum-X Photonics (Ethernet, second half of 2026) integrate silicon-photonics engines onto the switch ASIC. NVIDIA's figures—3.5x better power efficiency (cutting a 1.6T port from 30 W to 9 W), 10x resiliency, 63x signal integrity—are vendor claims; the strongest independent corroboration is Meta's ECOC 2025 Bailly data showing 65% power saving across 15 million device-hours with zero failures. The roadmap items—IEEE 802.3dj around mid-2026, 1600ZR, the NVL576 at roughly 600 kW per rack—are announced or planned and may slip.

For an optical engineer, the skill set to develop is the convergence itself. The boundary between telecom long-haul coherent and datacom short-reach optics has collapsed: coherent pluggables now sit in router line cards, LPO is deployed in production AI networks, and scale-across makes DWDM line-system design a datacenter problem. An engineer who can budget OSNR on a 2,000 km coherent link and reason about a rail-optimized fat-tree in the same afternoon is exactly who this industry is short of. The MapYourTech piece on future-proofing an optical engineering career in the AI era develops that path.

8. Reference Section

Table 2: Key specifications referenced in this article
Parameter	Value	Class / source
Back-end fabric share of networking cost/power	~85%	Measured analysis (SemiAnalysis)
Transceivers per GPU (multi-tier fat-tree)	~6	Representative figure (SemiAnalysis/NVIDIA)
800G transceiver power	~30 W	Industry-typical
Copper reach at 800G PAM4	~1 m	Physical limit
GB200 NVL72 per-GPU NVLink bandwidth	1.8 TB/s	NVIDIA platform spec
InfiniBand vs RoCEv2 latency	~1–2 vs ~5–10 µs	Measured (vendor/industry)
800ZR amplified reach	~80–120 km	OIF 800ZR IA
GB200 NVL72 rack power	~120 kW	NVIDIA platform spec

Glossary

NeoCloud — a pure-play GPU cloud renting AI compute on bare metal or thin VMs.
Scale-up — tightly-coupled intra-rack GPU interconnect (NVLink), copper today.
Scale-out — the optical back-end fabric across a datacenter (InfiniBand or RoCEv2).
Scale-across — coherent DWDM links between separate sites for distributed training.
Rail-optimized fat-tree — topology where each GPU connects to a dedicated leaf to keep collective traffic local.
RoCEv2 — RDMA over Converged Ethernet v2, the Ethernet-based lossless transport for AI fabrics.
LPO / CPO — linear pluggable optics / co-packaged optics, two approaches to cutting transceiver power.
MFU — model FLOPs utilization, the fraction of peak compute a training job actually uses.

References

OIF, "800ZR Implementation Agreement," Optical Internetworking Forum.
IEEE, "IEEE 802.3df — 200/400/800 Gb/s Ethernet," IEEE Standards Association.
ITU-T, "G.652 — Characteristics of a single-mode optical fibre and cable," ITU-T Study Group 15.
Ultra Ethernet Consortium, "Ultra Ethernet Specification 1.0," Ultra Ethernet Consortium.
InfiniBand Trade Association, "InfiniBand Architecture Specification," IBTA.

Sanjay Yadav, "Optical Network Communications: An Engineer's Perspective" — Bridge the Gap Between Theory and Practice in Optical Networking.
