Hyperscale Market Shifts: AI, Neoclouds, and the New Limits of Data Centers
Introduction
AI is turning a big part of the world’s cloud estate into something new: AI factories. Unlike the general-purpose “cloud warehouses” of the last cycle, this new class of site is built around racks that draw more than 100 kW, fabric links running at 400G and 800G, and data-center interconnects measured in terabits per second.
In this first part of our Hyperscale Market Shifts series, we will look at the forces driving the move from cloud warehouse to AI factory, including:
- How user demand and model design are changing infrastructure loads
- How the resulting effects show up in cluster size, rack power, and network scale
- How power, geography, and capital markets now inform where AI capacity can live
We will also look at rack power and cooling, scale-up and scale-out network design, and what this all means for fiber types, interfaces, and evolving network topologies.
Demand, Models, and Workloads
AI use has moved from experiment to habit. By late 2025, ChatGPT reached around 800 million weekly active users. OpenAI has also said that ChatGPT now handles about 2.5 billion prompts per day, which works out to well over a million requests per minute.
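As a quick sanity check on that rate, here is a minimal back-of-the-envelope calculation using only the 2.5-billion-prompts-per-day figure quoted above (no other assumptions):

```python
# Back-of-the-envelope: request rate implied by ~2.5 billion prompts per day.

prompts_per_day = 2.5e9

prompts_per_minute = prompts_per_day / (24 * 60)
prompts_per_second = prompts_per_day / (24 * 60 * 60)

print(f"~{prompts_per_minute:,.0f} prompts per minute")  # ~1,736,111
print(f"~{prompts_per_second:,.0f} prompts per second")  # ~28,935
```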
For consumers, AI is now built into search, writing tools, coding assistants, translation, and media apps. For enterprises, pilots and production systems are running in customer support, software development, marketing, analytics, and internal knowledge search (around 30% of consumer AI usage is work-related, and global adoption is growing roughly four times faster in lower-income countries than in higher-income ones).
Three important shifts in model design underpin accelerated global AI adoption.
First, base models are richer. Frontier models from OpenAI, Google, Anthropic, and others are larger, better trained, and more efficiently optimized, delivering more reliable performance on reasoning, coding, and multimodal tasks. As a result, more kinds of work can be offloaded to these models, and with greater regularity.
Second, mixture-of-experts (MoE) and reasoning-optimized variants are now common. MoE models activate only part of the network for each token, which lowers the cost per token for a given level of quality (a rough sketch after the third shift below illustrates the arithmetic). Reasoning-style models and modes stretch out the “thinking” part of an answer to improve quality on hard problems. The net effect is that useful tokens get cheaper. As costs fall, people and systems increase overall usage. We have seen this pattern before in other technologies: efficiency gains lead to higher total use, not less.
Third, agentic systems change the load. Instead of one prompt and one reply, an agent may run many calls, use tools and APIs, and keep context alive for a long session. Agents talk to databases (and other services) and may call other models. From an infrastructure point of view, this looks less like a series of short web requests and more like a distributed application that keeps the fabric busy. All this pushes towards higher, more continuous utilization of accelerators and networks, both inside data centers and between them.
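To make the cost-per-token point concrete, here is a minimal sketch comparing the forward-pass FLOPs per token of a dense model with an MoE model of the same total size. The 1-trillion-parameter count and the 5% active fraction are illustrative assumptions, not figures for any specific model, and the "2 × active parameters" rule of thumb ignores attention and other overheads.

```python
# Illustrative only: FLOPs per token for a dense model vs. an MoE model of the
# same total size. Sizes and the active fraction are assumptions.

total_params = 1.0e12        # assumed 1-trillion-parameter model
moe_active_fraction = 0.05   # assumption: ~5% of parameters active per token

# Rule of thumb: forward-pass FLOPs per token ~= 2 x (active parameters)
dense_flops_per_token = 2 * total_params
moe_flops_per_token = 2 * total_params * moe_active_fraction

print(f"Dense model: ~{dense_flops_per_token:.1e} FLOPs per token")
print(f"MoE model:   ~{moe_flops_per_token:.1e} FLOPs per token "
      f"({dense_flops_per_token / moe_flops_per_token:.0f}x fewer)")
```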
Hardware and Cluster Scale
Modern accelerators (i.e., GPUs, TPUs, and other AI processors, often grouped under “XPUs”) have improved fast. For example, the NVIDIA GB300 NVL72 has a “35X inference performance boost” compared to the HGX H100. Each new generation of XPUs offers more FLOPS, more memory bandwidth, and larger on-package memory. Energy per token for training and inference has fallen thanks to better chips, lower-precision formats (such as FP8), and improved kernels. But because demand is so elastic, these gains have not led to lower total energy use. Instead, the gains have made many more applications viable, so total compute and energy consumption rise rather than fall.
A few years ago, a common building block for training was an 8-GPU server, perhaps around 10 kW under full load. Today, rack-scale systems (e.g., NVIDIA’s GB200 NVL72) change the picture. A single NVL72 rack links 36 Grace CPUs and 72 Blackwell GPUs through NVLink in a single 72-GPU domain, cooled by direct liquid cooling. NVIDIA’s DGX GB200 user guide puts the rack power at around 120 kW. That is one rack. Clusters are built by repeating that block. Four such racks equate to nearly 300 GPUs and roughly half a megawatt of accelerator power. Sixteen racks pass a thousand GPUs and draw on the order of 2 MW. At the top end, frontier training systems and large inference clusters run into many thousands of GPUs and tens of megawatts, often spread over multiple rooms or buildings.
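To give a feel for how quickly these blocks add up, here is a rough sizing sketch using only the figures above (72 GPUs and roughly 120 kW per NVL72-class rack). It counts accelerator rack power only and ignores cooling, networking, and facility overhead.

```python
# Rough cluster sizing from the NVL72-class rack figures quoted above.
# Accelerator rack power only; excludes cooling, networking, and facility overhead.

GPUS_PER_RACK = 72
KW_PER_RACK = 120

for racks in (1, 4, 16, 64):
    gpus = racks * GPUS_PER_RACK
    mw = racks * KW_PER_RACK / 1000
    print(f"{racks:>3} rack(s) -> {gpus:>5} GPUs, ~{mw:.2f} MW of rack power")

# Output:
#   1 rack(s) ->    72 GPUs, ~0.12 MW of rack power
#   4 rack(s) ->   288 GPUs, ~0.48 MW of rack power
#  16 rack(s) ->  1152 GPUs, ~1.92 MW of rack power
#  64 rack(s) ->  4608 GPUs, ~7.68 MW of rack power
```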
From a connectivity point of view, three things matter here.
The first is scale-up inside the rack or pod. NVLink and similar fabrics now span far more devices in a single domain than before. That puts pressure on very short-reach interconnects inside the rack. Copper still plays a role on the shortest paths, but high-speed optical links are moving ever closer to the chips as speeds climb.
The second is scale-out across racks. The cluster fabric that ties racks, rows, and rooms together now runs at 400G or 800G per port, usually on Ethernet or InfiniBand, with 1.6T in active development. Switches carry more ports at those speeds with each generation. Clusters are organized into larger pods and more pods per region. Every inter-rack hop is an optical link, so the amount of fiber and the number of connectors per rack and per megawatt are rising quickly (a rough sketch of that arithmetic follows after the third point below).
The third is failure domains. When a rack contains 72 accelerators, losing a rack is a big event. Designs for fiber routing, panel layout, and redundancy need to assume that much larger chunks of compute can disappear at once. That has consequences for how structured cabling is mapped to rows, rooms, and power feeds.
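As an illustration of the second point above, the sketch below estimates the optical links and fibers leaving a single NVL72-class rack for the GPU fabric alone. The one-NIC-port-per-GPU ratio and the parallel-fiber assumption (e.g., 400G-DR4-style, eight fibers per port) are illustrative, not a specific design; real builds vary and also need links for storage, management, and switch uplinks.

```python
# Illustrative only: scale-out optical links and fibers per rack for the GPU fabric.
# Assumes one 400G/800G NIC port per GPU and parallel-fiber optics (8 fibers per port).

GPUS_PER_RACK = 72
NIC_PORTS_PER_GPU = 1      # assumption: one scale-out port per GPU
FIBERS_PER_PORT = 8        # assumption: e.g. DR4-style parallel fiber (4 Tx + 4 Rx)

links_per_rack = GPUS_PER_RACK * NIC_PORTS_PER_GPU
fibers_per_rack = links_per_rack * FIBERS_PER_PORT

print(f"~{links_per_rack} scale-out optical links per rack")   # ~72
print(f"~{fibers_per_rack} fibers per rack, GPU fabric only")  # ~576
```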
AI also puts new weight on connectivity between data centers. Training data must move. Checkpoints and replicas have to be moved and protected. Inference often runs in more than one region, both for latency and for resilience. Data Center Interconnects (DCI) are therefore moving to 400G and 800G coherent optics, with multiple terabits per second per campus becoming a normal planning target for large AI builds (demanding robust dark-fiber routes, higher spectral efficiency on long-haul systems, and clean integration between campus networks and backbone transport).
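A simple way to see why multi-terabit DCI becomes a planning target is checkpoint movement. The sketch below assumes a 10 TB checkpoint, which is an illustrative size for a large frontier model rather than a quoted figure, and ignores protocol overhead.

```python
# Illustrative checkpoint-transfer times over DCI capacities of various sizes.
# The 10 TB checkpoint is an assumed size; protocol overhead is ignored.

checkpoint_bytes = 10e12                  # assumed 10 TB checkpoint
checkpoint_bits = checkpoint_bytes * 8

for gbps in (100, 400, 800, 3200):        # per-link or aggregate DCI capacity in Gb/s
    minutes = checkpoint_bits / (gbps * 1e9) / 60
    print(f"{gbps:>5} Gb/s -> ~{minutes:4.1f} minutes per checkpoint")

# Output: 100 Gb/s -> ~13.3 min, 400 Gb/s -> ~3.3 min,
#         800 Gb/s -> ~1.7 min, 3200 Gb/s -> ~0.4 min
```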
Power, Siting, and Neoclouds
At this point, power is the hard limit in many hubs. The International Energy Agency projects that data center electricity consumption will more than double by 2030, reaching about 945 TWh a year. That is roughly the current annual electricity use of Japan, and just under 3% of projected global electricity demand.
Gartner estimates that AI-optimized servers represent about 21% of total data-center power in 2025 and will rise to 44% by 2030. By then, AI-optimized servers could account for almost two-thirds of incremental data-center power demand.
These are central estimates. Some scenarios go higher and suggest that AI could approach half of all data-center power use by the end of the decade. The exact figure will depend on how quickly new hardware generations roll out, how much existing work is displaced onto AI systems, and how fast demand grows. What matters for planners is the direction: more power, concentrated in fewer, larger sites.
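For readers who want to see roughly where the "almost two-thirds of incremental demand" figure comes from, here is a simple reconstruction. The 945 TWh total for 2030 and the 21%/44% AI shares are the figures quoted above; the roughly 450 TWh baseline for 2025 is an assumption used only for illustration.

```python
# Rough reconstruction of the "almost two-thirds of incremental demand" estimate.
# 2030 total and AI shares are quoted above; the 2025 total is an assumed baseline.

total_2025_twh = 450     # assumption: approximate 2025 data-center electricity use
total_2030_twh = 945     # IEA 2030 projection quoted above

ai_2025_twh = 0.21 * total_2025_twh
ai_2030_twh = 0.44 * total_2030_twh

incremental_ai = ai_2030_twh - ai_2025_twh
incremental_total = total_2030_twh - total_2025_twh

print(f"AI share of incremental demand: {incremental_ai / incremental_total:.0%}")  # ~65%
```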
Grid connections are now often the gate on AI projects. In many regions, getting a new high-capacity connection approved and built can take years. Local authorities are under pressure from residents about water use, noise, land use, and emissions. Some high-profile data-center projects in the US and Europe have been delayed, scaled back, or cancelled because of concerns over power and water.
One response is to move to where power is easier to secure. We see this in the growth of data-center capacity on energy-rich grids in the southern and western United States, in the Nordics, and in other regions with large, low-carbon generation and spare transmission capacity. Another response is to bring generation to the campus. Large AI sites are exploring or deploying behind-the-meter gas plants, renewables, and, in some plans, nuclear facilities sized specifically for AI workloads. Aeroderivative gas turbines, adapted from aircraft engines, are attractive here because they can be deployed relatively quickly and offer high power density.
These moves change where AI factories sit on the map. Many are now further from the traditional carrier hotels and internet exchanges that defined the last decade of cloud growth. That raises the value of long-haul dark fiber into these new regions and of resilient, high-capacity DCI linking AI campuses back to core cloud regions and peering points.
A third trend is the rise of Neoclouds. A number of GPU-as-a-Service providers, often called Neoclouds, grew out of cryptocurrency mining or high-performance computing. CoreWeave, the best-known example, started in Ethereum mining and has become a major GPU cloud, backed by large funding rounds and long-term capacity deals. Crusoe Energy is another example, using stranded or flared gas to power modular data centers for AI and compute. Several Bitcoin miners are now planning to repurpose sites and grid connections entirely for AI and HPC. These companies inherit sites ready for AI clusters (i.e., the sites are already permitted for high-power, sometimes noisy operations, with substations and cooling infrastructure in place).
For connectivity teams, Neoclouds introduce three critical considerations:
- Sites may be remote from major peering hubs, creating a strong dependence on good long‑haul fiber and solid DCI design
- Inside the data center, operators often prefer dense, repeatable GPU pods (a model that aligns well with pre‑cabled racks, high‑density patching, and consistent structured cabling)
- The business model hinges on high utilization of expensive GPUs, making downtime and messy change work far less acceptable (cleanly designed failure domains and maintenance paths become a commercial issue, not just an engineering one)
Capital, Modularity, and Risk
The money going into AI infrastructure is enormous. Over this decade, estimates from banks and analysts place total investment in data centers and AI infrastructure in the trillions of dollars, with AI-optimized capacity taking a growing share of the total. Flagship programmes on the hyperscale side talk in terms of many tens or even hundreds of billions of dollars of capex over a few years.
The risk is unevenly spread. Large hyperscalers such as AWS, Microsoft, Google, and Meta have broad businesses and can absorb some overbuild or slow take-up, because AI is only one part of their revenue and profit. These hyperscalers can also use the same GPU capacity across many products (e.g., internal workloads, cloud services, enterprise offerings, and consumer apps).
Neoclouds and single-tenant AI campuses do not have the same cushion as the hyperscalers (if demand or pricing falls short of expectations, the impact is sharper). That is one reason why we are seeing more joint-venture and special-purpose structures, where infrastructure investors and utilities share part of the long-term risk of the data center and the power assets.
Even when capital is available, friction in the physical world can slow progress. Permitting, local consultation, environmental review, and grid-connection work can stretch timelines. There are also skills shortages in high-voltage electrical work, in advanced cooling, and in integrating large prefabricated systems on site.
A key response is modularity. Instead of building everything from scratch on site, more of the work is done in factories. Racks arrive pre-integrated with accelerators and cooling. Pods are built as standard units, perhaps 0.5–2 MW each, with known power and network envelopes. Power rooms and electrical skids are prefabricated and dropped in. This approach moves labour to controlled environments and compresses on-site work, making “time to AI capacity” a competitive metric.
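As a sense check on pod sizing, the sketch below divides the 0.5–2 MW pod envelopes mentioned above by the roughly 120 kW NVL72-class rack figure from earlier. Real pods also reserve power for networking, storage, and cooling, so actual rack counts will be lower than these simple ratios.

```python
# Illustrative pod sizing: how many ~120 kW racks fit in 0.5-2 MW pod envelopes.
# Ignores power reserved for networking, storage, and cooling overhead.

KW_PER_RACK = 120
GPUS_PER_RACK = 72

for pod_mw in (0.5, 1.0, 2.0):
    racks = int(pod_mw * 1000 // KW_PER_RACK)
    print(f"{pod_mw:.1f} MW pod -> ~{racks} racks, ~{racks * GPUS_PER_RACK} GPUs")

# Output: 0.5 MW -> ~4 racks (~288 GPUs), 1.0 MW -> ~8 racks (~576 GPUs),
#         2.0 MW -> ~16 racks (~1152 GPUs)
```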
The implications of modularity for connectivity include:
- Power and cooling envelopes per rack and per pod become standard patterns, which encourages standard patterns in cabling and connector choice
- Fiber counts and connector densities per block can be optimized once and repeated across many sites
- Expansion becomes more stepwise (i.e., new pods and blocks arrive and are lit in large increments, which in turn affects how DCI and campus-to-campus links are planned)
For fiber and connectivity vendors, the shift toward modular, factory‑built infrastructure favours solutions that are dense, repeatable, and easy to integrate into factory-assembled systems without complex on-site custom work.
Conclusion
AI is pushing data centers up against several hard limits at once. On one side, demand keeps rising. More users, more tokens, more agentic workflows. Rack-scale systems pack 72 GPUs into 120 kW racks and clusters stretch into the multi-megawatt range. Networks inside the building and between sites climb towards 800G and beyond. On the other side, power and siting are constrained. Grids cannot be upgraded overnight. Communities are pushing back against new data centers in some locations. Skilled labour is tight. Large projects take years from idea to energized racks.
The result is a map that looks different from the last cycle, including more AI capacity in power-rich regions and behind-the-meter sites, new operators (in the form of Neoclouds and repurposed crypto players), and a strong swing towards modular, factory-built racks, pods, and power systems.
For anyone working with physical infrastructure, fiber, and connectivity, three messages stand out:
- Density: Expect more fibers, more connectors, and more optical links per rack, per row, and per megawatt.
- Topology: Scale-up and scale-out networks are getting larger and more complex, both inside and between sites. Clean, well-understood designs are essential.
- Resilience with simplicity: Failure domains are larger. Deployment schedules are tighter. The more complex the environment, the more valuable simple, robust connectivity designs become.
In the next blog, we will step down a level and look at the physical reality inside the AI data center: rack power and cooling, the practical limits of scale-up within a rack, how scale-out fabrics are built across rows and rooms, and what all this means for the specific choices needed across fiber type, interface, and topology.
Contributors:
Antonio Castano, Global Market Development Director, AFL
Ben Atherton, Technical Writer, AFL