GPU Useful Life

Answer

The expected useful life of AI accelerator hardware is 5 years (central estimate), with an optimistic bound of 6 years and a conservative bound of 4 years. This reflects the economic useful life -- the period over which the hardware generates sufficient value to justify its capital cost -- rather than the physical lifetime, which can exceed 10 years.

The industry is converging toward 5 years as the standard depreciation period, down from 6 years during the 2022-2024 period. Amazon's January 2025 reduction from 6 to 5 years for AI-related servers is the clearest signal. Physical obsolescence is not the binding constraint; economic obsolescence driven by rapid generational performance improvements (3-4x per generation every 2 years) determines useful life.

Evidence

Hyperscaler depreciation schedules

| Company | Previous schedule | Current schedule | Change date | Notes |
|---|---|---|---|---|
| Amazon/AWS | 6 years | 5 years (subset) | Jan 2025 | AI/ML servers specifically |
| Google/Alphabet | 4 years | 6 years | 2023 | No public revision since |
| Microsoft | 4 years | 6 years | 2023 | No public revision since |
| Meta | 4 years | 5.5 years | 2023 | Booked $2.9B depreciation savings |

Source: gpu-depreciation-schedules. AWS/Google/Microsoft: 6-year depreciation. Industry converging toward 5-year via "value cascade" model. AI-native neoclouds use 4-5 year schedules. (Research compilation)

Source: Amazon 10-K / Deep Quarry analysis. Amazon shortened useful life for "a subset of servers and networking equipment" from 6 to 5 years effective January 1, 2025, citing "increased pace of technology development, particularly in the area of artificial intelligence and machine learning." Financial impact: $700M operating income reduction in 2025, plus $920M accelerated depreciation in Q4 2024 for early equipment retirements. (Amazon filing; Deep Quarry, 2025)
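As a back-of-envelope check on Amazon's disclosed figures, the ~$700M annual operating-income impact is consistent with roughly $21B of affected net book value under straight-line depreciation, since moving from a 6-year to a 5-year schedule adds cost/5 - cost/6 = cost/30 of expense per year. The $21B fleet value below is an inferred illustration, not a disclosed number:

```python
# Sketch: effect of shortening straight-line depreciation from 6 to 5 years.
# The fleet value is inferred from the disclosed ~$700M impact, not reported.

def annual_depreciation(cost: float, useful_life_years: int) -> float:
    """Straight-line depreciation expense per year."""
    return cost / useful_life_years

fleet_cost = 21e9  # assumed net book value of affected servers, USD

dep_6yr = annual_depreciation(fleet_cost, 6)
dep_5yr = annual_depreciation(fleet_cost, 5)

# Shortening the schedule adds cost/5 - cost/6 = cost/30 of annual expense
extra_expense = dep_5yr - dep_6yr
print(f"6-year schedule: ${dep_6yr / 1e9:.1f}B/yr")
print(f"5-year schedule: ${dep_5yr / 1e9:.1f}B/yr")
print(f"Added annual depreciation: ${extra_expense / 1e9:.2f}B")
```

With a $21B base, the added expense works out to $0.70B per year, matching the disclosed $700M operating-income reduction.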

Neocloud depreciation schedules

| Company | Depreciation period | Notes |
|---|---|---|
| CoreWeave | 6 years | Aggressive for a neocloud |
| Lambda Labs | 5 years | |
| Nebius | 4 years | Most conservative |

Source: SiliconANGLE / theCUBE Research. AI-first clouds "cannot afford stagnant infrastructure; performance/watt gains in successive GPU generations directly determine competitiveness." Predicts 5 years as the emerging equilibrium. (SiliconANGLE, Nov 2025)

The "value cascade" model

GPUs follow a three-stage lifecycle that supports extended useful lives:

  1. Years 1-2: Frontier training. Peak performance required. Hardware is used for training foundation models where latest-generation compute provides the strongest competitive advantage.

  2. Years 3-4: Production inference. Previous-generation GPUs move to high-value real-time serving. Performance remains adequate; latency requirements are less stringent than training synchronization demands.

  3. Years 5-6: Batch inference and analytics. Final lifecycle stage supports cost-sensitive, latency-tolerant workloads where the hardware still generates positive economic returns.

Source: SiliconANGLE / theCUBE Research. This framework is the primary justification for 5-6 year depreciation schedules. (Nov 2025)
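The cascade's economic logic can be sketched with a toy revenue model. All rates, the stage multipliers, and the utilization figure below are illustrative assumptions, not market data; the three stages and their year ranges follow the framework above:

```python
# Toy model of the "value cascade": hourly revenue by lifecycle stage.
# Stage rate multipliers and utilization are assumed placeholders.

STAGES = [
    ("frontier training",    range(1, 3), 1.00),  # years 1-2: peak rate
    ("production inference", range(3, 5), 0.55),  # years 3-4: reduced rate
    ("batch inference",      range(5, 7), 0.25),  # years 5-6: residual rate
]

def lifetime_value(hourly_peak_rate: float, utilization: float = 0.7) -> float:
    """Total undiscounted revenue over the 6-year cascade."""
    hours_per_year = 8760
    total = 0.0
    for _name, years, rate_mult in STAGES:
        for _year in years:
            total += hourly_peak_rate * rate_mult * utilization * hours_per_year
    return total

# e.g. a GPU renting at a $2.00/hr peak rate, 70% utilized
print(f"${lifetime_value(2.00):,.0f} over 6 years")
```

Even under these placeholder multipliers, years 5-6 contribute only about 14% of lifetime revenue, which is why the final stage is described as diminishing returns rather than a core part of the investment case.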

Dylan Patel / SemiAnalysis perspective on GPU economics

Source: Dylan Patel on the Dwarkesh Patel podcast, "Deep Dive on 3 Big Bottlenecks," on the economics of GPU deployment and replacement cycles.

Physical vs. economic lifetime

Source: HN discussion on xAI/SpaceX orbital compute. Community estimates put physical GPU lifetimes well beyond the economic useful life, consistent with the 10+ year physical lifetime noted above.

Source: Meta, "Revisiting Reliability in Large-Scale ML Research Clusters." 11 months of data from 24K A100 GPUs at >80% utilization. Component MTTF data validates that individual GPU physical failure rates are low, but at scale, failures become a daily occurrence. (arxiv, 2024)
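Meta's finding that low individual failure rates still produce daily failures at fleet scale follows directly from the arithmetic. The per-GPU MTTF below is an assumed placeholder (Meta's paper reports component-level figures), used only to show the scaling effect:

```python
# Sketch: low per-GPU failure rates still imply daily failures at scale.
# The 50,000-hour per-GPU MTTF is an assumed illustrative value.

def expected_failures_per_day(fleet_size: int, mttf_hours: float) -> float:
    """Expected failures per day, assuming independent failures with
    a constant per-GPU rate of 1/MTTF (exponential model)."""
    per_gpu_daily_rate = 24.0 / mttf_hours
    return fleet_size * per_gpu_daily_rate

# 24K GPUs (the Meta cluster size) with an assumed 50,000-hour MTTF
print(expected_failures_per_day(24_000, 50_000.0))
```

Under these assumptions the cluster sees roughly 11-12 failures per day, even though any single GPU would be expected to run for nearly six years between failures.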

ChinaTalk analysis

Source: ChinaTalk, "How Much AI Does $1 Get You in China vs America?" Uses 3-year useful life as the hardware lifetime for cost modeling. Notes: "data center GPUs often have lifespans for only that long." Footnote acknowledges this may be conservative: "Some conversations indicate that the lifespan can actually be much longer, and three years is simply when it is more cost-effective to upgrade the hardware." (ChinaTalk, Feb 2026)
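The gap between ChinaTalk's 3-year assumption and the 5-year central estimate matters a great deal for cost modeling, because the amortization period directly sets the capital cost per GPU-hour. The accelerator price and utilization below are illustrative assumptions:

```python
# Sketch: how the assumed useful life drives capital cost per GPU-hour.
# The $30K all-in accelerator price and 80% utilization are assumptions.

def capital_cost_per_hour(gpu_price: float, life_years: float,
                          utilization: float = 0.8) -> float:
    """Straight-line capital cost per utilized GPU-hour."""
    utilized_hours = life_years * 8760 * utilization
    return gpu_price / utilized_hours

price = 30_000.0  # assumed all-in cost per accelerator, USD
for life in (3, 5):
    print(f"{life}-year life: ${capital_cost_per_hour(price, life):.2f}/hr")
```

Moving from a 5-year to a 3-year amortization raises the hourly capital cost by two-thirds, which is why the choice of useful life dominates cross-country cost comparisons of this kind.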

Inference vs. training lifecycle differences

Training hardware: Frontier training demands latest-generation hardware for competitive advantage. Economic useful life for training-only is effectively 2-3 years before next-gen hardware provides compelling cost-per-FLOP improvements.

Inference hardware: Inference workloads are less sensitive to generational upgrades. Older hardware remains competitive for inference longer because:

  1. Serving latency requirements are less stringent than the tight synchronization demands of large training runs, so older interconnects remain adequate.

  2. Inference fleets can be partitioned into many independent replicas, avoiding the hardware homogeneity that frontier training clusters require.

  3. Cost-sensitive, latency-tolerant workloads (the cascade's final stage) compete on cost per query, which favors hardware whose capital cost is already largely depreciated.

Analysis

Why 5 years is the central estimate

  1. Amazon's signal is the strongest data point. Amazon is the largest cloud provider and explicitly shortened its AI server depreciation from 6 to 5 years in January 2025, taking a $1.6B+ financial hit to do so. This is a revealed-preference signal that 6 years was too optimistic for AI hardware.

  2. The neocloud range brackets 5 years. CoreWeave (6 years), Lambda (5 years), and Nebius (4 years) center on 5 years. These companies have the most direct exposure to GPU economics and no legacy fleet to subsidize optimistic assumptions.

  3. The value cascade supports 5 years. The three-stage model (training -> inference -> batch) maps naturally to a 5-year lifecycle with diminishing returns in years 5-6.

  4. NVIDIA's 2-year cadence creates natural breakpoints. With Hopper (2022) -> Blackwell (2024) -> Rubin (2026) -> next-gen (2028), each generation delivers 3-4x performance/watt. After two generations (4 years), older hardware is 9-16x less efficient per watt, making continued operation increasingly uneconomic except for latency-insensitive batch workloads.
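The 9-16x figure in point 4 is just the stated per-generation gain compounded over two generations. A minimal sketch of that arithmetic, using the document's 3-4x range:

```python
# Sketch: compounding perf/watt gap across GPU generations.
# Per-generation gains of 3-4x are the document's stated range.

def efficiency_gap(generations_behind: int, gain_per_gen: float) -> float:
    """Relative perf/watt disadvantage after n generations: gain**n."""
    return gain_per_gen ** generations_behind

for gain in (3.0, 4.0):
    # 4-year-old hardware is two generations behind on a 2-year cadence
    print(f"{gain}x/gen -> {efficiency_gap(2, gain):.0f}x gap after 4 years")
```

By year 6 (three generations behind), the gap would compound to 27-64x, which is why even batch workloads eventually stop justifying the power bill.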

Trend direction: slight shortening

The trend is toward slight shortening from the 6-year schedules adopted in 2022-2023: Amazon has already moved to 5 years for AI servers, the neocloud range (4-6 years) centers on 5, and analysts expect 5 years to become the industry equilibrium.

However, a countervailing force exists: if chip manufacturing becomes the binding constraint (ASML EUV production limited to ~100 tools/year by 2030), older GPUs could retain economic value longer, potentially stabilizing or even extending useful life assumptions.

Implications for orbital economics

The 5-year useful life creates a hard constraint for orbital data centers:

  1. Hardware must generate returns within 5 years. Any time spent on ground testing, launch, and orbital commissioning (estimated 3-6 months by Dylan Patel) reduces the productive window by 5-10%.

  2. No mid-life upgrades. Terrestrial data centers can swap individual GPUs (15% RMA rate for Blackwell). Orbital systems must either over-provision for failures or accept degrading capacity.

  3. No second-life cascade. Terrestrial GPUs can be redeployed from training to inference to batch workloads. Orbital GPUs are locked into their initial deployment configuration.

  4. End-of-life disposal. Terrestrial hardware has residual value; orbital hardware must be deorbited, with the cost of the launch amortized over fewer productive years if hardware fails early.
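The 5-10% figure in point 1 follows from dividing the 3-6 month commissioning delay by the 60-month useful life. A minimal sketch of that calculation:

```python
# Sketch: commissioning delay consumes part of the fixed 5-year useful life.
# The 3-6 month delay range is the Dylan Patel estimate cited above.

def productive_fraction_lost(delay_months: float,
                             life_years: float = 5.0) -> float:
    """Fraction of the useful life consumed before revenue begins."""
    return delay_months / (life_years * 12.0)

for delay in (3.0, 6.0):
    print(f"{delay:.0f}-month delay: "
          f"{productive_fraction_lost(delay):.0%} of useful life lost")
```

Unlike terrestrial deployments, this lost window cannot be recovered by redeploying the hardware to a later cascade stage, so the delay translates directly into forgone revenue.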