GPU Useful Life
Answer
The expected useful life of AI accelerator hardware is 5 years (central estimate), with an optimistic bound of 6 years and a conservative bound of 4 years. This reflects the economic useful life -- the period over which the hardware generates sufficient value to justify its capital cost -- rather than the physical lifetime, which can exceed 10 years.
The industry is converging toward 5 years as the standard depreciation period, down from 6 years during the 2022-2024 period. Amazon's January 2025 reduction from 6 to 5 years for AI-related servers is the clearest signal. Physical obsolescence is not the binding constraint; economic obsolescence driven by rapid generational performance improvements (3-4x per generation every 2 years) determines useful life.
Evidence
Hyperscaler depreciation schedules
| Company | Previous schedule | Current schedule | Change date | Notes |
|---|---|---|---|---|
| Amazon/AWS | 6 years | 5 years (subset) | Jan 2025 | AI/ML servers specifically |
| Google/Alphabet | 4 years | 6 years | 2023 | No public revision since |
| Microsoft | 4 years | 6 years | 2023 | No public revision since |
| Meta | 5 years | 5.5 years | Jan 2025 | Expected ~$2.9B reduction in 2025 depreciation |
Source: gpu-depreciation-schedules. AWS/Google/Microsoft all adopted 6-year schedules during 2022-2024; the industry is now converging toward 5-year schedules via the "value cascade" model. AI-native neoclouds use 4-5 year schedules. (Research compilation)
Source: Amazon 10-K / Deep Quarry analysis. Amazon shortened useful life for "a subset of servers and networking equipment" from 6 to 5 years effective January 1, 2025, citing "increased pace of technology development, particularly in the area of artificial intelligence and machine learning." Financial impact: $700M operating income reduction in 2025, plus $920M accelerated depreciation in Q4 2024 for early equipment retirements. (Amazon filing; Deep Quarry, 2025)
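To make the accounting mechanics concrete, here is a minimal straight-line depreciation sketch of why shortening a schedule from 6 to 5 years reduces operating income. The $10B fleet cost is a hypothetical figure for illustration, not an Amazon number.

```python
# Minimal sketch: effect of shortening a straight-line depreciation
# schedule. The $10B fleet cost is hypothetical, not an Amazon figure.

def annual_depreciation(cost: float, useful_life_years: float) -> float:
    """Straight-line annual depreciation expense (zero salvage value assumed)."""
    return cost / useful_life_years

fleet_cost = 10e9  # hypothetical AI server fleet cost, in dollars

old = annual_depreciation(fleet_cost, 6)  # prior 6-year schedule
new = annual_depreciation(fleet_cost, 5)  # revised 5-year schedule

print(f"6-year schedule: ${old / 1e9:.2f}B/year")
print(f"5-year schedule: ${new / 1e9:.2f}B/year")
print(f"Added annual expense: ${(new - old) / 1e9:.2f}B "
      f"({(new / old - 1) * 100:.0f}% increase)")
```

On a fleet of this hypothetical size, the schedule change alone raises annual depreciation expense by 20%, which is the mechanism behind Amazon's $700M operating income reduction.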
Neocloud depreciation schedules
| Company | Depreciation period | Notes |
|---|---|---|
| CoreWeave | 6 years | Longest schedule; aggressive accounting for a neocloud |
| Lambda Labs | 5 years | |
| Nebius | 4 years | Most conservative |
Source: SiliconANGLE / theCUBE Research. AI-first clouds "cannot afford stagnant infrastructure; performance/watt gains in successive GPU generations directly determine competitiveness." Predicts 5 years as the emerging equilibrium. (SiliconANGLE, Nov 2025)
The "value cascade" model
GPUs follow a three-stage lifecycle that supports extended useful lives:
Years 1-2: Frontier training. Peak performance required. Hardware is used for training foundation models where latest-generation compute provides the strongest competitive advantage.
Years 3-4: Production inference. Previous-generation GPUs move to high-value real-time serving. Performance remains adequate; latency requirements are less stringent than training synchronization demands.
Years 5-6: Batch inference and analytics. Final lifecycle stage supports cost-sensitive, latency-tolerant workloads where the hardware still generates positive economic returns.
Source: SiliconANGLE / theCUBE Research. This framework is the primary justification for 5-6 year depreciation schedules. (Nov 2025)
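A rough numerical sketch of the cascade follows; the hourly rates and utilization are illustrative assumptions, not figures from theCUBE Research. The point is that years 5-6 still add revenue even at a steep discount to frontier rates.

```python
# Sketch of the value-cascade model: cumulative revenue from one GPU as it
# moves through the three lifecycle stages. All rates are assumptions.

HOURS_PER_YEAR = 8760
UTILIZATION = 0.8  # assumed average utilization across the fleet

# (stage name, start year, end year, assumed $/GPU-hour)
stages = [
    ("frontier training",    0, 2, 2.00),
    ("production inference", 2, 4, 1.20),
    ("batch inference",      4, 6, 0.50),
]

cumulative = 0.0
for name, start, end, rate in stages:
    revenue = (end - start) * HOURS_PER_YEAR * UTILIZATION * rate
    cumulative += revenue
    print(f"years {start}-{end} ({name}): ${revenue:,.0f}/GPU; "
          f"cumulative ${cumulative:,.0f}")
```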
Dylan Patel / SemiAnalysis perspective on GPU economics
Source: Dylan Patel on Dwarkesh Patel podcast, "Deep Dive on 3 Big Bottlenecks." Key points:
- H100 all-in deployment cost is ~$1.40/hour amortized across 5 years. At the $2/hour market rate, this yields a ~30% gross margin (worked arithmetic in the sketch after this list).
- Every 2 years, NVIDIA triples/quadruples performance while increasing price by 50-100%. This compresses the market value of older GPUs.
- H100 market rate fell from ~$2/hour (2024) to ~$1/hour (2026) as Blackwell deployed at volume.
- "If your argument is that a GPU has a useful life of five years" -- Patel uses 5 years as the standard assumption.
- Michael Burry argued for 3-year or shorter depreciation, but Patel notes this is overly bearish.
- Counter-argument: if compute demand outstrips chip manufacturing capacity (constrained by ASML EUV tool production at ~100/year by 2030), older GPUs retain value longer. "Maybe the depreciation cycle is even longer than five years." (Dwarkesh podcast, Mar 2026)
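A worked version of the unit-economics arithmetic above, using the figures quoted on the podcast (a sketch of the calculation, not SemiAnalysis's full model):

```python
# H100 unit economics from the figures quoted above.

all_in_cost_per_hour = 1.40   # capex + power + opex, amortized over 5 years
rate_2024 = 2.00              # ~2024 H100 market rate, $/hour
rate_2026 = 1.00              # ~2026 H100 market rate, $/hour

margin_2024 = (rate_2024 - all_in_cost_per_hour) / rate_2024
margin_2026 = (rate_2026 - all_in_cost_per_hour) / rate_2026

print(f"Gross margin at $2/hour: {margin_2024:.0%}")   # ~30%
print(f"Gross margin at $1/hour: {margin_2026:.0%}")   # -40%: underwater
```

The second result is the economic-obsolescence mechanism in miniature: the hardware still works, but at post-Blackwell market rates it no longer covers its amortized cost.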
Physical vs. economic lifetime
Source: HN discussion on xAI/SpaceX orbital compute. Community estimates:
- Physical lifetime: "at least 10 years" -- bounded by capacitor degradation, not silicon wear. GPUs are stateless, so duty cycle has minimal impact on longevity.
- Silicon degradation (dopant migration) occurs but is slow. "Three years is probably too low but they do die."
- At scale (10,000+ GPUs), individual failures are frequent but manageable through hot-swap. Meta reported ~1 failure every 3 hours across a 16,000 H100 cluster during Llama 3 training.
- 15% of Blackwell GPUs deployed require RMA (Dylan Patel), but this is infant mortality that can be screened out. (HN discussion; Dwarkesh podcast)
Source: Meta, "Revisiting Reliability in Large-Scale ML Research Clusters." 11 months of data from 24K A100 GPUs at >80% utilization. Component MTTF data validates that individual GPU physical failure rates are low, but at scale, failures become a daily occurrence. (arxiv, 2024)
ChinaTalk analysis
Source: ChinaTalk, "How Much AI Does $1 Get You in China vs America?" Uses a 3-year useful life for hardware in its cost modeling. Notes: "data center GPUs often have lifespans for only that long." A footnote acknowledges this may be conservative: "Some conversations indicate that the lifespan can actually be much longer, and three years is simply when it is more cost-effective to upgrade the hardware." (ChinaTalk, Feb 2026)
Inference vs. training lifecycle differences
Training hardware: Frontier training demands latest-generation hardware for competitive advantage. Economic useful life for training-only is effectively 2-3 years before next-gen hardware provides compelling cost-per-FLOP improvements.
Inference hardware: Inference workloads are less sensitive to generational upgrades. Older hardware remains competitive for inference longer because:
- Inference is often memory-bandwidth-bound, not compute-bound
- Latency requirements are application-dependent and often tolerant
- Optimized inference software (TensorRT, vLLM) continues improving on older hardware
- The value cascade model naturally extends GPU life through inference workloads
Analysis
Why 5 years is the central estimate
Amazon's revision is the strongest data point. The largest cloud provider explicitly shortened its AI server depreciation from 6 to 5 years in January 2025, absorbing a $1.6B+ combined hit ($700M to 2025 operating income plus $920M in Q4 2024 accelerated depreciation) to do so. This is a revealed-preference signal that 6 years was too optimistic for AI hardware.
The neocloud range brackets 5 years. CoreWeave (6 years), Lambda (5 years), and Nebius (4 years) center on 5 years. These companies have the most direct exposure to GPU economics and no legacy fleet to subsidize optimistic assumptions.
The value cascade supports 5 years. The three-stage model (training -> inference -> batch) maps naturally to a 5-year lifecycle with diminishing returns in years 5-6.
NVIDIA's 2-year cadence creates natural breakpoints. With Hopper (2022) -> Blackwell (2024) -> Rubin (2026) -> next-gen (2028), each generation delivers 3-4x performance/watt. After two generations (4 years), older hardware is 9-16x less efficient per watt, making continued operation increasingly uneconomic except for latency-insensitive batch workloads.
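A small sketch of how the per-generation gains compound into the 9-16x figure above, assuming the 3-4x range holds for each generation:

```python
# Relative performance/watt gap vs. current hardware, assuming 3-4x
# gains per generation on NVIDIA's ~2-year cadence.

def efficiency_gap(generations_behind: int, gain_per_gen: float) -> float:
    """How many times less efficient per watt an older GPU is."""
    return gain_per_gen ** generations_behind

for gens in (1, 2, 3):
    low = efficiency_gap(gens, 3.0)
    high = efficiency_gap(gens, 4.0)
    print(f"{gens} generation(s) behind (~{2 * gens} years): "
          f"{low:.0f}-{high:.0f}x less efficient")
```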
Trend direction: slight shortening
The trend is toward slight shortening from the 6-year schedules adopted in 2022-2023:
- Amazon's explicit reduction from 6 to 5 years
- SiliconANGLE projects 5 years as the "emerging equilibrium"
- The pace of GPU performance improvement (3-4x per generation every 2 years) has not slowed
- AI-native companies use 4-5 years, and their practices tend to lead hyperscaler policy
However, a countervailing force exists: if chip manufacturing becomes the binding constraint (ASML EUV production limited to ~100 tools/year by 2030), older GPUs could retain economic value longer, potentially stabilizing or even extending useful life assumptions.
Implications for orbital economics
The 5-year useful life creates a hard constraint for orbital data centers:
Hardware must generate returns within 5 years. Any time spent on ground testing, launch, and orbital commissioning (estimated 3-6 months by Dylan Patel) reduces the productive window by 5-10%.
No mid-life upgrades. Terrestrial data centers can swap individual GPUs (15% RMA rate for Blackwell). Orbital systems must either over-provision for failures or accept degrading capacity.
No second-life cascade. Terrestrial GPUs can be redeployed from training to inference to batch workloads. Orbital GPUs are locked into their initial deployment configuration.
End-of-life disposal. Terrestrial hardware has residual value; orbital hardware must be deorbited, with the cost of the launch amortized over fewer productive years if hardware fails early.
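A sketch of the resulting amortization pressure, using a hypothetical $100M launch-plus-hardware cost and the 3-6 month commissioning estimate above:

```python
# Sketch: commissioning delay shrinks the window over which orbital
# launch + hardware costs amortize. Dollar figure is hypothetical.

def cost_per_productive_month(total_cost: float,
                              useful_life_months: float,
                              commissioning_months: float) -> float:
    """Amortized cost per month of revenue-generating operation."""
    return total_cost / (useful_life_months - commissioning_months)

TOTAL = 100e6      # hypothetical launch + hardware cost per deployment
LIFE_MONTHS = 60   # 5-year useful life

base = cost_per_productive_month(TOTAL, LIFE_MONTHS, 0)
for delay in (3, 6):
    c = cost_per_productive_month(TOTAL, LIFE_MONTHS, delay)
    print(f"{delay}-month commissioning: ${c / 1e6:.2f}M/month "
          f"(+{(c / base - 1) * 100:.1f}% vs. no delay)")
```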