Basics Story

Chapter #7 – Data Centers

purpose, structure, capacity growth, geographic distribution

7.0  Prologue

A data center is a facility that houses the compute, storage, and networking infrastructure that runs the modern internet. Every web search, streamed video, cloud application, and AI inference request is ultimately served by servers inside a data center somewhere on earth. Understanding their purpose, physical structure, capacity trajectory, and global distribution gives concrete form to the abstract idea of “the cloud.”

7.1  Purpose and Use Cases

Data centers are built to host workloads that require continuous availability, high bandwidth, and coordinated access to large data sets — things that cannot run reliably on a single desktop or laptop. Primary use cases include:
  • Enterprise IT: internal applications, email, ERP, and file services that an organisation runs for its own employees. Historically the dominant use, its share is now declining as workloads migrate to the cloud.
  • Colocation (colo): businesses lease rack space and power inside a shared facility, bring their own servers, and share the building’s cooling, security, and connectivity. Equinix and Digital Realty are the largest colocation operators.
  • Cloud computing (hyperscale): AWS, Azure, and GCP each operate hundreds of massive data centers worldwide. Customers rent compute, storage, and services on demand without owning any hardware.
  • Content delivery: video streaming (Netflix, YouTube), social media, and software distribution require enormous storage capacity and servers positioned close to end users to minimise latency.
  • Artificial intelligence training and inference: training large language models and running inference at scale demands clusters of thousands of GPUs with very high-bandwidth interconnects (NVLink, InfiniBand). AI workloads are now the fastest-growing driver of new data center construction.
  • Financial services: stock exchanges, payment processors, and banks colocate servers in data centers specifically chosen for sub-millisecond network latency to trading venues.
  • Government and defence: classified and sensitive data processed in air-gapped or highly restricted facilities.

7.2  Physical Structure and Components

A data center is more than a room full of servers. It is a precisely engineered system in which power, cooling, networking, and physical security are co-designed to maximise uptime and efficiency.
Power infrastructure:
  • Utility power enters through one or more feeds at medium or high voltage (roughly 11 kV–115 kV) and is stepped down through transformers to low-voltage distribution, typically 480 V or 208 V in North America.
  • Uninterruptible Power Supplies (UPS) buffer brief utility interruptions using batteries or flywheels, providing ride-through until diesel generators start (typically within 10–30 seconds).
  • Large hyperscale facilities draw 100–500 MW; the largest AI-optimised campuses under construction exceed 1 GW, requiring dedicated grid infrastructure.
  • Power Usage Effectiveness (PUE) = total facility power ÷ IT equipment power. A PUE of 1.0 is perfect (all power goes to compute); legacy facilities often run at 1.5–2.0; modern hyperscale facilities target 1.1–1.2.
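The PUE definition above can be made concrete with a short calculation. The following is a minimal Python sketch; the function names and the 10 MW IT load are illustrative assumptions, and only the PUE formula and the 1.8 and 1.15 example values come from the ranges quoted above.

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power Usage Effectiveness = total facility power / IT equipment power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    if total_facility_kw < it_equipment_kw:
        raise ValueError("total facility power cannot be below IT power")
    return total_facility_kw / it_equipment_kw


def overhead_kw(it_equipment_kw: float, pue_value: float) -> float:
    """Power spent on everything except IT equipment (cooling, conversion losses, lighting)."""
    if pue_value < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return it_equipment_kw * (pue_value - 1.0)


# Hypothetical facility with 10 MW of IT load (illustrative figure, not from the text).
IT_LOAD_KW = 10_000.0
for label, value in [("legacy", 1.8), ("modern hyperscale", 1.15)]:
    print(f"{label}: PUE {value} -> {overhead_kw(IT_LOAD_KW, value):,.0f} kW of overhead")
```

For the same IT load, moving from a PUE of 1.8 to 1.15 cuts the non-IT overhead from about 8 MW to about 1.5 MW, which is why operators track the metric so closely.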
Cooling infrastructure:
  • Nearly all the electrical power a server draws is released as heat. Removing that heat is usually the largest overhead on top of the IT load itself, and a major operating cost.
  • Air cooling: hot-aisle/cold-aisle row layouts direct cold air from raised floors or overhead ducts through server chassis, exhausting hot air to Computer Room Air Handlers (CRAHs) or Computer Room Air Conditioners (CRACs).
  • Liquid cooling: direct liquid cooling (cold plates on CPUs/GPUs) or full immersion in dielectric fluid removes heat far more efficiently than air. It is required for high-density AI GPU racks that exceed 50–100 kW per rack, far beyond what air cooling can handle.
  • Some facilities use evaporative cooling, outside air economisation, or nearby river/ocean water as a heat sink, reducing energy spent on refrigeration.
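A rough calculation shows why air cooling runs out of headroom at the rack densities mentioned above. The sketch below uses the standard sensible-heat relation, airflow = power ÷ (air density × specific heat × temperature rise). The air properties are textbook approximations, and the 12 K inlet-to-outlet temperature rise is an assumed value, not a figure from this chapter.

```python
# Approximate properties of air near room temperature (textbook values).
AIR_DENSITY_KG_M3 = 1.2
AIR_SPECIFIC_HEAT_J_KG_K = 1005.0


def required_airflow_m3_s(rack_power_w: float, delta_t_k: float) -> float:
    """Volumetric airflow needed to carry away rack_power_w of heat
    when the air warms by delta_t_k while passing through the rack."""
    if delta_t_k <= 0:
        raise ValueError("temperature rise must be positive")
    return rack_power_w / (AIR_DENSITY_KG_M3 * AIR_SPECIFIC_HEAT_J_KG_K * delta_t_k)


ASSUMED_DELTA_T_K = 12.0  # assumption: inlet-to-exhaust temperature rise
for rack_kw in (10, 50, 100):
    flow = required_airflow_m3_s(rack_kw * 1000.0, ASSUMED_DELTA_T_K)
    print(f"{rack_kw:>3} kW rack -> {flow:.2f} m^3/s of air")
```

Under these assumptions a 100 kW rack would need about 7 cubic metres of air per second through a single cabinet, ten times the flow of a 10 kW rack, which illustrates why dense GPU racks move to liquid.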
Compute and storage:
  • Servers are mounted in 42U or 48U racks (a U is 1.75 in / 44.45 mm). A typical rack holds 20–40 1U or 2U servers.
  • Storage tiers: NVMe SSD (fastest, most expensive) → SATA SSD → spinning HDD → tape archive (slowest, cheapest per GB).
  • AI GPU clusters use specialised racks with 8 GPUs per server, high-speed NVLink between GPUs in a node, and InfiniBand or high-bandwidth Ethernet between nodes.
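The rack figures above follow from simple arithmetic on rack units. The sketch below is illustrative: the function names are hypothetical, and reserving 2U per rack for a switch and cabling is an assumption rather than a rule.

```python
RACK_UNIT_MM = 44.45  # one U, as defined above (1.75 in)


def rack_height_mm(rack_units: int) -> float:
    """Usable mounting height of a rack with the given number of units."""
    return rack_units * RACK_UNIT_MM


def servers_per_rack(rack_units: int, server_units: int, reserved_units: int = 2) -> int:
    """How many servers of a given height fit after reserving space
    for switches and cable management (reserved_units is an assumption)."""
    if server_units <= 0:
        raise ValueError("server height must be at least 1U")
    usable = rack_units - reserved_units
    if usable < 0:
        raise ValueError("reserved space exceeds rack size")
    return usable // server_units


for rack_units in (42, 48):
    print(
        f"{rack_units}U rack: {rack_height_mm(rack_units):.1f} mm, "
        f"{servers_per_rack(rack_units, 1)} x 1U or "
        f"{servers_per_rack(rack_units, 2)} x 2U servers"
    )
```

With 2U reserved, a 42U rack holds 40 1U servers or 20 2U servers, which matches the 20–40 range quoted above.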
Networking:
  • Within a rack: 10 GbE or 25 GbE links to a Top-of-Rack (ToR) switch.
  • Between racks and rows: 100 GbE or 400 GbE aggregation switches.
  • Data center to internet: multiple redundant fibre connections to internet exchange points (IXPs) and upstream carriers, often totalling hundreds of Gbps to multiple Tbps for large hyperscale facilities.
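The link speeds in this hierarchy determine how heavily a rack's uplinks are shared, usually expressed as an oversubscription ratio (total server-facing bandwidth divided by total uplink bandwidth). The sketch below is a minimal illustration; the port counts are hypothetical and only the 25 GbE and 100 GbE speeds come from the list above.

```python
def oversubscription_ratio(
    downlink_ports: int,
    downlink_gbps: float,
    uplink_ports: int,
    uplink_gbps: float,
) -> float:
    """Ratio of server-facing bandwidth to uplink bandwidth on one switch.
    1.0 means non-blocking; higher values mean the uplinks are shared more heavily."""
    uplink_total = uplink_ports * uplink_gbps
    if uplink_total <= 0:
        raise ValueError("switch needs at least one uplink with positive bandwidth")
    return (downlink_ports * downlink_gbps) / uplink_total


# Hypothetical Top-of-Rack switch: 48 x 25 GbE to servers, 6 x 100 GbE to aggregation.
ratio = oversubscription_ratio(48, 25.0, 6, 100.0)
print(f"oversubscription: {ratio:.1f}:1")
```

In this assumed configuration the servers can offer 1,200 Gbps against 600 Gbps of uplink, a 2:1 ratio. Doubling the uplinks to twelve would make the switch non-blocking.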
Physical security: perimeter fencing, biometric access control, mantraps, 24/7 security personnel, CCTV, and strict visitor logging. Under the Uptime Institute classification, Tier III and Tier IV facilities are designed for 99.982% and 99.995% annual availability respectively, achieved through N+1 or 2N redundancy in every critical system.
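The tier percentages are easier to interpret as permitted downtime per year. The following minimal sketch does the conversion; it assumes a 365-day year (8,760 hours), and the function name is illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def annual_downtime_minutes(availability_percent: float) -> float:
    """Minutes of downtime per year implied by an availability percentage."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100 percent")
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * HOURS_PER_YEAR * 60.0


for tier, availability in [("Tier III", 99.982), ("Tier IV", 99.995)]:
    minutes = annual_downtime_minutes(availability)
    print(f"{tier}: {availability}% -> {minutes:.1f} minutes of downtime per year")
```

The difference between the two tiers is roughly 1.6 hours versus 26 minutes of downtime per year.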

7.3  Capacity Evolution

Data center capacity is commonly measured in installed IT power (megawatts, MW) or in server count. Both have grown dramatically:
Era | Dominant driver | Typical facility | Scale
1960s–1980s | Mainframe computing | Corporate or government machine room | 10–100 kW; one to a few machines
1990s | Client/server; early web hosting | Purpose-built server room | 100 kW–1 MW
2000s | Dot-com boom; search engines; early SaaS | Colocation and early hyperscale | 1–10 MW per facility
2010s | Cloud computing; mobile; video streaming | AWS, Azure, GCP mega-campuses | 10–200 MW per campus
2020–2024 | Pandemic-driven cloud migration; early AI | Hyperscale campuses + GPU clusters | 100–500 MW per campus
2025+ | Large-scale AI training and inference | AI-optimised gigawatt campuses | 500 MW–1+ GW per campus
Global installed data center capacity reached approximately 60–70 GW of IT load by 2024, having roughly doubled every four years through the cloud era. The AI acceleration beginning in 2022–2023 is compressing that doubling interval: industry analysts project 200+ GW of global capacity by 2030 if announced construction projects proceed on schedule.
Water consumption has grown alongside power draw. A 100 MW facility may consume 1–3 million gallons of water per day for evaporative cooling, a significant concern in water-scarce regions that is driving adoption of dry cooling and liquid immersion alternatives.
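The doubling-time figures for installed capacity can be checked with a small exponential-growth calculation. The sketch below is illustrative only: it takes 65 GW as an assumed midpoint of the 60–70 GW range for 2024, and the function names are hypothetical.

```python
import math


def project_capacity_gw(base_gw: float, years: float, doubling_years: float) -> float:
    """Capacity after `years` of exponential growth with a fixed doubling time."""
    if doubling_years <= 0:
        raise ValueError("doubling time must be positive")
    return base_gw * 2 ** (years / doubling_years)


def implied_doubling_years(base_gw: float, target_gw: float, years: float) -> float:
    """Doubling time needed to grow from base_gw to target_gw in `years`."""
    if base_gw <= 0 or target_gw <= base_gw:
        raise ValueError("target must exceed a positive base")
    return years * math.log(2) / math.log(target_gw / base_gw)


BASE_2024_GW = 65.0  # assumption: midpoint of the 60-70 GW range
YEARS_TO_2030 = 6

at_historical_rate = project_capacity_gw(BASE_2024_GW, YEARS_TO_2030, 4.0)
needed = implied_doubling_years(BASE_2024_GW, 200.0, YEARS_TO_2030)
print(f"2030 at a four-year doubling time: {at_historical_rate:.0f} GW")
print(f"doubling time implied by 200 GW in 2030: {needed:.1f} years")
```

From the assumed 65 GW base, the historical four-year rate gives roughly 184 GW by 2030, so a 200 GW projection corresponds to a doubling time of about 3.7 years. Projections above 200 GW imply a correspondingly shorter interval.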

7.4  Geographic Distribution

Data centers concentrate where several factors align: reliable and affordable power, low-latency fibre networks, a skilled technical workforce, political stability, tax incentives, and (increasingly) cool climates that reduce cooling costs. The following figures approximate shares of global installed IT capacity as of 2025; AI-driven construction is shifting the percentages rapidly.
Region | Share of global IT capacity | Key concentrations
United States | ~35% | Northern Virginia (Ashburn) is the single largest data center market on earth, hosting the densest concentration of hyperscale and colocation capacity. Other major clusters: Dallas-Fort Worth, Silicon Valley, Chicago, Phoenix, Atlanta, Seattle.
Canada and Mexico | ~4% | Canada: Toronto, Montreal (cheap hydro), Vancouver. Mexico: Queretaro, Mexico City, Monterrey (growing nearshore hub).
Central and South America | ~3% | Brazil (São Paulo) is the dominant market; smaller but growing presence in Chile, Colombia, and Argentina.
European Union and UK | ~22% | Amsterdam, Frankfurt, Dublin, London, and Paris form the “FLAP-D” tier-1 cluster. Nordics (Stockholm, Helsinki) attract hyperscale operators with cold climate and renewable energy.
Russia | ~2% | Moscow dominates; data sovereignty laws require Russian user data to be stored on Russian soil, driving domestic investment by Yandex, Sber, and state operators.
China | ~15% | Beijing, Shanghai, Shenzhen, and Guangzhou are the primary markets. The government’s “Eastern Data, Western Computing” initiative is shifting new capacity westward to Guizhou and Inner Mongolia where power is cheaper and cooler climates reduce cooling costs.
Southeast Asia | ~6% | Singapore is the regional hub, though land and power constraints have pushed expansion to Johor Bahru (Malaysia), Jakarta (Indonesia), Bangkok, and Ho Chi Minh City.
India and Australia | ~7% | India: Mumbai, Chennai, Hyderabad, Pune; the fastest-growing major market, driven by domestic digital adoption and a government data localisation push. Australia: Sydney, Melbourne; serves as the regional hub for the Pacific and complies with strict data sovereignty requirements.
Rest of world | ~6% | Middle East (UAE, Saudi Arabia investing heavily in AI-scale campuses), South Africa (Johannesburg), Japan, South Korea, and other markets.
Latency and data sovereignty are the two forces that decentralise capacity away from the dominant US and EU clusters. Light in fibre travels at about two-thirds of its vacuum speed, so users experience roughly 1 ms of round-trip latency per 100 km of fibre; a user in Mumbai served from Virginia experiences 150–200 ms, which is perceptible in interactive applications. Simultaneously, regulations in the EU (GDPR), India, Russia, China, and elsewhere restrict cross-border transfers of certain categories of data or require them to be stored locally, which calls for local infrastructure even when economics would favour consolidation.
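The 1 ms per 100 km rule of thumb gives a lower bound on round-trip time between any two locations. The sketch below applies it to the great-circle distance computed with the haversine formula. The coordinates for Mumbai and Ashburn are approximate values assumed for illustration, and real fibre routes are longer than the great-circle path.

```python
import math

EARTH_RADIUS_KM = 6371.0   # mean Earth radius
RTT_MS_PER_100_KM = 1.0    # rule of thumb quoted above


def great_circle_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Shortest surface distance between two points, via the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def min_rtt_ms(distance_km: float) -> float:
    """Lower bound on round-trip time over fibre for a given path length."""
    return distance_km / 100.0 * RTT_MS_PER_100_KM


# Approximate coordinates (assumed for illustration).
MUMBAI = (19.08, 72.88)
ASHBURN_VA = (39.04, -77.49)

distance = great_circle_km(*MUMBAI, *ASHBURN_VA)
print(f"great-circle distance: {distance:,.0f} km")
print(f"minimum round trip:    {min_rtt_ms(distance):.0f} ms")
```

The great-circle path is roughly 13,000 km, giving a floor of about 130 ms. The 150–200 ms seen in practice reflects indirect cable routes plus switching and queuing delay along the way.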

7.5  Epilogue

Data centers are the physical substrate of the digital economy. Their design has evolved from air-conditioned mainframe rooms to gigawatt campuses engineered to train the next generation of AI models. The economic and geopolitical forces shaping their location — cheap power, cold climates, data sovereignty law, and proximity to users — make data center geography as strategically significant as port locations were in the industrial era.
