
8.0 Prologue

A data center is a facility that houses the compute, storage, and networking infrastructure that runs the modern internet. Every web search, streamed video, cloud application, and AI inference request is ultimately served by servers inside a data center somewhere on earth. Understanding their purpose, physical structure, capacity trajectory, and global distribution gives concrete form to the abstract idea of “the cloud.”

8.1 Purpose and Use Cases

Data centers are built to host workloads that require continuous availability, high bandwidth, and coordinated access to large data sets — things that cannot run reliably on a single desktop or laptop. Primary use cases include:

Enterprise IT: internal applications, email, ERP, and file services that an organisation runs for its own employees. Historically the dominant use; now declining in relative share as workloads migrate to the cloud.
Colocation (colo): businesses lease rack space and power inside a shared facility, bring their own servers, and share the building’s cooling, security, and connectivity. Equinix and Digital Realty are the largest colocation operators.
Cloud computing (hyperscale): providers such as AWS, Azure, and GCP operate large fleets of massive data centers spread across many regions worldwide. Customers rent compute, storage, and services on demand without owning any hardware.
Content delivery: video streaming (Netflix, YouTube), social media, and software distribution require enormous storage capacity and servers positioned close to end users to minimise latency.
Artificial intelligence training and inference: training large language models and running inference at scale demands clusters of thousands of GPUs with very high-bandwidth interconnects (NVLink, InfiniBand). AI workloads are now the fastest-growing driver of new data center construction.
Financial services: stock exchanges, payment processors, and banks colocate servers in data centers specifically chosen for sub-millisecond network latency to trading venues.
Government and defence: classified and sensitive data processed in air-gapped or highly restricted facilities.

8.2 Physical Structure and Components

A data center is more than a room full of servers. It is a precisely engineered system in which power, cooling, networking, and physical security are co-designed to maximise uptime and efficiency.

Power infrastructure:

Utility power enters through one or more feeds at medium or high voltage (11 kV–115 kV) and is stepped down through transformers to low-voltage distribution (480 V or 208 V in North American practice).
Uninterruptible Power Supplies (UPS) buffer brief utility interruptions using batteries or flywheels, providing ride-through until diesel generators start (typically within 10–30 seconds).
Large hyperscale facilities draw 100–500 MW; the largest AI-optimised campuses under construction exceed 1 GW, requiring dedicated grid infrastructure.
Power Usage Effectiveness (PUE) = total facility power ÷ IT equipment power. A PUE of 1.0 is perfect (all power goes to compute); legacy facilities often run at 1.5–2.0; modern hyperscale facilities target 1.1–1.2.
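The PUE formula above can be illustrated with a minimal Python sketch. The function names and the 10 MW IT load are hypothetical, chosen only to show how the PUE ranges quoted in the text translate into overhead power.

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power Usage Effectiveness: total facility power / IT equipment power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    if total_facility_kw < it_equipment_kw:
        raise ValueError("total facility power cannot be below IT power")
    return total_facility_kw / it_equipment_kw


def overhead_kw(it_equipment_kw: float, pue_value: float) -> float:
    """Power spent on everything other than IT equipment (cooling, losses, lighting)."""
    if pue_value < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return it_equipment_kw * (pue_value - 1.0)


if __name__ == "__main__":
    it_load_kw = 10_000  # hypothetical 10 MW IT load, for illustration only
    for label, value in [("legacy", 1.8), ("modern hyperscale", 1.15)]:
        print(f"{label}: PUE {value} -> {overhead_kw(it_load_kw, value):,.0f} kW overhead")
```

For the same IT load, moving from a PUE of 1.8 to 1.15 cuts the non-IT overhead from roughly 8 MW to roughly 1.5 MW in this hypothetical case.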

Cooling infrastructure:

Nearly all the electrical power a server draws ends up as heat. Removing that heat is typically the largest consumer of facility energy after the IT equipment itself.
Air cooling: hot-aisle/cold-aisle row layouts direct cold air from raised floors or overhead ducts through server chassis, exhausting hot air to Computer Room Air Handlers (CRAHs) or Computer Room Air Conditioners (CRACs).
Liquid cooling: direct liquid cooling (cold plates on CPUs/GPUs) or full immersion cooling in dielectric fluid removes heat far more efficiently than air. It is effectively required for high-density AI GPU racks drawing 50–100 kW or more per rack, well beyond what air cooling can practically handle.
Some facilities use evaporative cooling, outside air economisation, or nearby river/ocean water as a heat sink, reducing energy spent on refrigeration.

Compute and storage:

Servers are mounted in 42U or 48U racks (a U is 1.75 in / 44.45 mm). A typical rack holds 20–40 1U or 2U servers.
Storage tiers: NVMe SSD (fastest, most expensive) → SATA SSD → spinning HDD → tape archive (slowest, cheapest per GB).
AI GPU clusters use specialised racks with 8 GPUs per server, high-speed NVLink between GPUs in a node, and InfiniBand or high-bandwidth Ethernet between nodes.
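The rack arithmetic above can be sketched in a few lines of Python. The 2U reserved for a switch and cabling is an assumption made for illustration, not a figure from the text, and the function names are hypothetical.

```python
RACK_UNIT_MM = 44.45  # one rack unit (1U), as given in the text


def servers_per_rack(rack_u: int, server_u: int, reserved_u: int = 2) -> int:
    """Whole servers that fit after reserving space for switches and cabling.

    reserved_u is an assumed allowance, not a standard value.
    """
    if rack_u <= 0 or server_u <= 0 or reserved_u < 0:
        raise ValueError("rack and server sizes must be positive")
    usable_u = max(rack_u - reserved_u, 0)
    return usable_u // server_u


def mounting_height_mm(rack_u: int) -> float:
    """Vertical mounting space of a rack in millimetres."""
    return rack_u * RACK_UNIT_MM


if __name__ == "__main__":
    print(servers_per_rack(42, 2))   # 2U servers in a 42U rack
    print(servers_per_rack(42, 1))   # 1U servers in a 42U rack
    print(f"{mounting_height_mm(42):.1f} mm of mounting height")
```

Under that assumption, a 42U rack holds 20 servers at 2U each or 40 at 1U each, which lines up with the 20–40 range quoted above.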

Networking:

Within a rack: 10 GbE or 25 GbE links to a Top-of-Rack (ToR) switch.
Between racks and rows: 100 GbE or 400 GbE aggregation switches.
Data center to internet: multiple redundant fibre connections to internet exchange points (IXPs) and upstream carriers, often totalling hundreds of Gbps to multiple Tbps for large hyperscale facilities.

Physical security: perimeter fencing, biometric access control, mantraps, 24/7 security personnel, CCTV, and strict visitor logging. Tier III and Tier IV facilities (by the Uptime Institute classification) are commonly associated with 99.982% and 99.995% annual availability respectively, achieved through N+1 redundancy with concurrent maintainability (Tier III) or fault-tolerant 2N redundancy (Tier IV) in every critical system.
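To make the availability percentages above more concrete, the following Python sketch converts them into expected downtime per year. It assumes a 365-day year, and the function name is hypothetical.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; leap years ignored for simplicity


def annual_downtime_minutes(availability_percent: float) -> float:
    """Minutes per year a facility may be unavailable at a given availability."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100")
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * HOURS_PER_YEAR * 60


if __name__ == "__main__":
    for tier, availability in [("Tier III", 99.982), ("Tier IV", 99.995)]:
        minutes = annual_downtime_minutes(availability)
        print(f"{tier}: {availability}% -> about {minutes:.1f} minutes of downtime per year")
```

The two figures correspond to roughly 95 minutes and roughly 26 minutes of downtime per year, respectively.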

8.3 Capacity Evolution

Data center capacity is commonly measured in installed IT power (megawatts, MW) or in server count. Both have grown dramatically:

Era	Dominant driver	Typical facility	Scale
1960s–1980s	Mainframe computing	Corporate or government machine room	10–100 kW; one to a few machines
1990s	Client/server; early web hosting	Purpose-built server room	100 kW–1 MW
2000s	Dot-com boom; search engines; early SaaS	Colocation and early hyperscale	1–10 MW per facility
2010s	Cloud computing; mobile; video streaming	AWS, Azure, GCP mega-campuses	10–200 MW per campus
2020–2024	Pandemic-driven cloud migration; early AI	Hyperscale campuses + GPU clusters	100–500 MW per campus
2025+	Large-scale AI training and inference	AI-optimised gigawatt campuses	500 MW–1+ GW per campus

Global installed data center capacity reached approximately 60–70 GW of IT load by 2024, having roughly doubled every four years through the cloud era. The AI acceleration beginning in 2022–2023 is compressing that doubling interval: industry analysts project 200+ GW of global capacity by 2030 if announced construction projects proceed on schedule.
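The growth figures above can be checked with a short Python sketch. It converts a doubling interval into an annual growth rate and backs out the doubling interval implied by two capacity figures. The 65 GW starting value is an assumption (the midpoint of the 60–70 GW range), and the function names are hypothetical.

```python
import math


def annual_growth_rate(doubling_years: float) -> float:
    """Compound annual growth rate implied by a given doubling interval."""
    if doubling_years <= 0:
        raise ValueError("doubling interval must be positive")
    return 2 ** (1.0 / doubling_years) - 1.0


def implied_doubling_years(start_gw: float, end_gw: float, years: float) -> float:
    """Doubling interval implied by growing from start_gw to end_gw over `years`."""
    if start_gw <= 0 or end_gw <= start_gw or years <= 0:
        raise ValueError("need 0 < start_gw < end_gw and years > 0")
    return years * math.log(2) / math.log(end_gw / start_gw)


if __name__ == "__main__":
    print(f"Doubling every 4 years = {annual_growth_rate(4):.1%} per year")
    # Assumed midpoint of 60-70 GW in 2024, growing to 200 GW by 2030.
    print(f"Implied doubling interval: {implied_doubling_years(65, 200, 6):.1f} years")
```

With the 65 GW midpoint, the implied interval comes out a little under four years; the result is sensitive to the starting value and to how far above 200 GW actual capacity lands.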

Water consumption has grown alongside power draw. A 100 MW facility may consume 1–3 million gallons of water per day for evaporative cooling — a significant concern in water-scarce regions, driving adoption of dry cooling and liquid immersion alternatives.
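As a rough sanity check on the water figures above, this Python sketch converts gallons per day into litres per kWh of load. It assumes the 100 MW is drawn continuously for 24 hours and that the gallons are US gallons; both are assumptions, and the function name is hypothetical.

```python
LITRES_PER_US_GALLON = 3.785411784


def litres_per_kwh(gallons_per_day: float, load_mw: float) -> float:
    """Water used per kWh, assuming the load runs flat out for 24 hours."""
    if gallons_per_day < 0 or load_mw <= 0:
        raise ValueError("gallons must be non-negative and load positive")
    kwh_per_day = load_mw * 1000 * 24
    return gallons_per_day * LITRES_PER_US_GALLON / kwh_per_day


if __name__ == "__main__":
    for gallons in (1_000_000, 3_000_000):
        rate = litres_per_kwh(gallons, 100)
        print(f"{gallons:,} gal/day at 100 MW -> {rate:.2f} L/kWh")
```

Under these assumptions, the 1–3 million gallon range corresponds to roughly 1.6 to 4.7 litres of water per kWh.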

8.4 Geographic Distribution

Data centers concentrate where several factors align: reliable and affordable power, low-latency fibre networks, a skilled technical workforce, political stability, tax incentives, and (increasingly) cool climates that reduce cooling costs. The following figures approximate shares of global installed IT capacity as of 2025; AI-driven construction is shifting the percentages rapidly.

Region	Share (%)	Key concentrations
United States	~35%	Northern Virginia (Ashburn) is the single largest data center market on earth, hosting the densest concentration of hyperscale and colocation capacity. Other major clusters: Dallas-Fort Worth, Silicon Valley, Chicago, Phoenix, Atlanta, Seattle.
Canada and Mexico	~4%	Canada: Toronto, Montreal (cheap hydro), Vancouver. Mexico: Queretaro, Mexico City, Monterrey (growing nearshore hub).
Central and South America	~3%	Brazil (São Paulo) is the dominant market; smaller but growing presence in Chile, Colombia, and Argentina.
European Union and UK	~22%	Amsterdam, Frankfurt, Dublin, London, and Paris form the “FLAP-D” tier-1 cluster. Nordics (Stockholm, Helsinki) attract hyperscale operators with cold climate and renewable energy.
Russia	~2%	Moscow dominates; data sovereignty laws require Russian user data to be stored on Russian soil, driving domestic investment by Yandex, Sber, and state operators.
China	~15%	Beijing, Shanghai, Shenzhen, and Guangzhou are the primary markets. The government’s “Eastern Data, Western Computing” initiative is shifting new capacity westward to Guizhou and Inner Mongolia where power is cheaper and cooler climates reduce cooling costs.
Southeast Asia	~6%	Singapore is the regional hub, though land and power constraints have pushed expansion to Johor Bahru (Malaysia), Jakarta (Indonesia), Bangkok, and Ho Chi Minh City.
India and Australia	~7%	India: Mumbai, Chennai, Hyderabad, Pune; the fastest-growing major market, driven by domestic digital adoption and a government push for data localisation. Australia: Sydney, Melbourne; the regional hub for the Pacific, with local capacity also driven by strict data sovereignty requirements.
Rest of world	~6%	Middle East (UAE, Saudi Arabia investing heavily in AI-scale campuses), South Africa (Johannesburg), Japan, South Korea, and other markets.

Latency and data sovereignty are the two forces that decentralise capacity away from the dominant US and EU clusters. Users experience roughly 1 ms of round-trip latency per 100 km of fibre; a user in Mumbai served from Virginia experiences 150–200 ms, which is perceptible in interactive applications. Simultaneously, regulations in the EU (GDPR), India, Russia, China, and elsewhere restrict cross-border transfers or mandate that certain categories of data remain within national or regional borders, requiring local infrastructure even when economics would favour consolidation.
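The latency rule of thumb above can be sketched in Python by combining it with a great-circle (haversine) distance. The coordinates for Mumbai and Ashburn are approximate values assumed for illustration, and the function names are hypothetical. The result is a lower bound only, since real fibre routes are longer than the great-circle path and add switching and queuing delay.

```python
import math

EARTH_RADIUS_KM = 6371.0
RTT_MS_PER_100_KM = 1.0  # rule of thumb quoted in the text


def great_circle_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Haversine distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def min_rtt_ms(distance_km: float) -> float:
    """Lower-bound round-trip time for a fibre path of the given length."""
    return distance_km / 100.0 * RTT_MS_PER_100_KM


if __name__ == "__main__":
    mumbai = (19.08, 72.88)     # approximate coordinates (assumed)
    ashburn = (39.04, -77.49)   # approximate coordinates (assumed)
    distance = great_circle_km(*mumbai, *ashburn)
    print(f"Distance: about {distance:,.0f} km")
    print(f"Round-trip floor: about {min_rtt_ms(distance):.0f} ms")
```

The great-circle distance is roughly 12,900 km, giving a floor near 130 ms. That is consistent with the 150–200 ms observed in practice once real cable routes and network equipment are accounted for.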

8.5 Workload Distribution

The table below shows approximate shares of global data center compute consumed by workload category as of 2024–2025. Figures are estimates drawn from IEA, EPRI, and Lawrence Berkeley National Laboratory reports; exact values vary by source and shift as AI workloads grow.

Category	% of Total	Notes
Enterprise / Business IT	~25%	ERP, CRM, SaaS, email, internal apps; the largest legacy category, declining in relative share as cloud migration continues
AI / Machine Learning	~18%	Training large models and serving inference; fastest-growing workload, up from ~10% in 2022
Video Streaming	~15%	Netflix, YouTube, TikTok, Disney+; dominated by storage I/O and transcoding at scale
Social Media	~8%	Feed ranking, content storage, ad targeting, and real-time notifications for billions of users
E-commerce / Online Retail	~6%	Transactions, inventory, logistics, recommendation engines, and fraud detection
Search Engines	~5%	Continuous web crawling, index updates, and low-latency query serving at global scale
Financial Services	~5%	Algorithmic trading, fraud detection, payment processing, and regulatory reporting
Communications	~4%	Email, video conferencing, VoIP, and messaging platforms
Gaming / Interactive Entertainment	~4%	Online game servers, cloud gaming (GeForce NOW, Xbox Cloud), and esports infrastructure
Scientific Computing / HPC	~3%	Climate simulation, genomics sequencing, drug discovery, and particle physics analysis
Government & Defense	~3%	Intelligence analysis, logistics, satellite data processing, and air-gapped classified facilities
Healthcare IT	~2%	Electronic health records, medical imaging archives, genomic databases, and clinical AI
Cybersecurity Operations	~1%	SIEM platforms, threat intelligence feeds, malware sandboxing, and SOC analytics
Misc / Other	~1%	IoT device backends, edge orchestration, DNS/PKI infrastructure, and CDN origin servers

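When Claude was asked to search the web for the current AI share of data center use, it returned an estimate of 38% for 2026, attributed to ABI Research. That is roughly double the ~18% shown in the table above, which illustrates how widely such estimates vary with the source, the year, and the metric being measured.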

8.6 Epilogue

Data centers are the physical substrate of the digital economy. Their design has evolved from air-conditioned mainframe rooms to gigawatt campuses engineered to train the next generation of AI models. The economic and geopolitical forces shaping their location — cheap power, cold climates, data sovereignty law, and proximity to users — make data center geography as strategically significant as port locations were in the industrial era.

8.7 References

Uptime Institute Annual Data Center Survey
Data Center Map – global facility directory
Synergy Research Group – hyperscale market data
Wikipedia – Data Center
IEA Electricity 2024 – data center power demand

Chapter #8 – Data Centers

purpose, structure, capacity growth, geographic distribution