GPU Useful Life
Answer
The expected useful life of AI accelerator hardware is 5 years (central estimate), with an optimistic bound of 6 years and a conservative bound of 4 years. This reflects the economic useful life -- the period over which the hardware generates sufficient value to justify its capital cost -- rather than the physical lifetime, which can exceed 10 years.
The industry is converging toward 5 years as the standard depreciation period, down from 6 years during the 2022-2024 period. Amazon's January 2025 reduction from 6 to 5 years for AI-related servers is the clearest signal. Physical obsolescence is not the binding constraint; economic obsolescence driven by rapid generational performance improvements (3-4x per generation every 2 years) determines useful life.
Evidence
Hyperscaler depreciation schedules
| Company | Previous schedule | Current schedule | Change date | Notes |
|---|---|---|---|---|
| Amazon/AWS | 6 years | 5 years (subset) | Jan 2025 | AI/ML servers specifically |
| Google/Alphabet | 4 years | 6 years | 2023 | No public revision since |
| Microsoft | 4 years | 6 years | 2023 | No public revision since |
| Meta | 5 years | 5.5 years | Jan 2025 | Expected ~$2.9B reduction in 2025 depreciation |
Source: gpu-depreciation-schedules. AWS/Google/Microsoft all adopted 6-year schedules during 2022-2024; the industry is now converging toward 5-year schedules via the "value cascade" model. AI-native neoclouds use 4-5 year schedules. (Research compilation)
Source: Amazon 10-K / Deep Quarry analysis. Amazon shortened useful life for "a subset of servers and networking equipment" from 6 to 5 years effective January 1, 2025, citing "increased pace of technology development, particularly in the area of artificial intelligence and machine learning." Financial impact: $700M operating income reduction in 2025, plus $920M accelerated depreciation in Q4 2024 for early equipment retirements. (Amazon filing; Deep Quarry, 2025)
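To make the accounting mechanics concrete, here is a minimal straight-line depreciation sketch of why shortening a schedule from 6 to 5 years reduces operating income. The $10B fleet cost is a hypothetical figure for illustration, not an Amazon number.

```python
# Minimal sketch: effect of shortening a straight-line depreciation
# schedule. The $10B fleet cost is hypothetical, not an Amazon figure.

def annual_depreciation(cost: float, useful_life_years: float) -> float:
    """Straight-line annual depreciation expense (zero salvage value assumed)."""
    return cost / useful_life_years

fleet_cost = 10e9  # hypothetical AI server fleet cost, in dollars

old = annual_depreciation(fleet_cost, 6)  # prior 6-year schedule
new = annual_depreciation(fleet_cost, 5)  # revised 5-year schedule

print(f"6-year schedule: ${old / 1e9:.2f}B/year")
print(f"5-year schedule: ${new / 1e9:.2f}B/year")
print(f"Added annual expense: ${(new - old) / 1e9:.2f}B "
      f"({(new / old - 1) * 100:.0f}% increase)")
```

On a fleet of this hypothetical size, the schedule change alone raises annual depreciation expense by 20%, which is the mechanism behind Amazon's $700M operating income reduction.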
Neocloud depreciation schedules
| Company | Depreciation period | Notes |
|---|---|---|
| CoreWeave | 6 years | Longest schedule; aggressive accounting for a neocloud |
| Lambda Labs | 5 years | |
| Nebius | 4 years | Most conservative |
Source: SiliconANGLE / theCUBE Research. AI-first clouds "cannot afford stagnant infrastructure; performance/watt gains in successive GPU generations directly determine competitiveness." Predicts 5 years as the emerging equilibrium. (SiliconANGLE, Nov 2025)
The "value cascade" model
GPUs follow a three-stage lifecycle that supports extended useful lives:
Years 1-2: Frontier training. Peak performance required. Hardware is used for training foundation models where latest-generation compute provides the strongest competitive advantage.
Years 3-4: Production inference. Previous-generation GPUs move to high-value real-time serving. Performance remains adequate; latency requirements are less stringent than training synchronization demands.
Years 5-6: Batch inference and analytics. Final lifecycle stage supports cost-sensitive, latency-tolerant workloads where the hardware still generates positive economic returns.
Source: SiliconANGLE / theCUBE Research. This framework is the primary justification for 5-6 year depreciation schedules. (Nov 2025)
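A rough numerical sketch of the cascade follows; the hourly rates and utilization are illustrative assumptions, not figures from theCUBE Research. The point is that years 5-6 still add revenue even at a steep discount to frontier rates.

```python
# Sketch of the value-cascade model: cumulative revenue from one GPU as it
# moves through the three lifecycle stages. All rates are assumptions.

HOURS_PER_YEAR = 8760
UTILIZATION = 0.8  # assumed average utilization across the fleet

# (stage name, start year, end year, assumed $/GPU-hour)
stages = [
    ("frontier training",    0, 2, 2.00),
    ("production inference", 2, 4, 1.20),
    ("batch inference",      4, 6, 0.50),
]

cumulative = 0.0
for name, start, end, rate in stages:
    revenue = (end - start) * HOURS_PER_YEAR * UTILIZATION * rate
    cumulative += revenue
    print(f"years {start}-{end} ({name}): ${revenue:,.0f}/GPU; "
          f"cumulative ${cumulative:,.0f}")
```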
Dylan Patel / SemiAnalysis perspective on GPU economics
Source: Dylan Patel on Dwarkesh Patel podcast, "Deep Dive on 3 Big Bottlenecks." Key points:
- H100 all-in deployment cost is ~$1.40/hour amortized across 5 years. At the $2/hour market rate, this yields a ~30% gross margin (worked arithmetic in the sketch after this list).
- Every 2 years, NVIDIA triples/quadruples performance while increasing price by 50-100%. This compresses the market value of older GPUs.
- H100 market rate fell from ~$2/hour (2024) to ~$1/hour (2026) as Blackwell deployed at volume.
- "If your argument is that a GPU has a useful life of five years" -- Patel uses 5 years as the standard assumption.
- Michael Burry argued for 3-year or shorter depreciation, but Patel notes this is overly bearish.
- Counter-argument: if compute demand outstrips chip manufacturing capacity (constrained by ASML EUV tool production at ~100/year by 2030), older GPUs retain value longer. "Maybe the depreciation cycle is even longer than five years." (Dwarkesh podcast, Mar 2026)
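A worked version of the unit-economics arithmetic above, using the figures quoted on the podcast (a sketch of the calculation, not SemiAnalysis's full model):

```python
# H100 unit economics from the figures quoted above.

all_in_cost_per_hour = 1.40   # capex + power + opex, amortized over 5 years
rate_2024 = 2.00              # ~2024 H100 market rate, $/hour
rate_2026 = 1.00              # ~2026 H100 market rate, $/hour

margin_2024 = (rate_2024 - all_in_cost_per_hour) / rate_2024
margin_2026 = (rate_2026 - all_in_cost_per_hour) / rate_2026

print(f"Gross margin at $2/hour: {margin_2024:.0%}")   # ~30%
print(f"Gross margin at $1/hour: {margin_2026:.0%}")   # -40%: underwater
```

The second result is the economic-obsolescence mechanism in miniature: the hardware still works, but at post-Blackwell market rates it no longer covers its amortized cost.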
Physical vs. economic lifetime
Source: HN discussion on xAI/SpaceX orbital compute. Community estimates:
- Physical lifetime: "at least 10 years" -- bounded by capacitor degradation, not silicon wear. GPUs are stateless, so duty cycle has minimal impact on longevity.
- Silicon degradation (dopant migration) occurs but is slow. "Three years is probably too low but they do die."
- At scale (10,000+ GPUs), individual failures are frequent but manageable through hot-swap. Meta reported ~1 failure every 3 hours across a 16,000 H100 cluster during Llama 3 training.
- 15% of Blackwell GPUs deployed require RMA (Dylan Patel), but this is infant mortality that can be screened out. (HN discussion; Dwarkesh podcast)
Source: Meta, "Revisiting Reliability in Large-Scale ML Research Clusters." 11 months of data from 24K A100 GPUs at >80% utilization. Component MTTF data validates that individual GPU physical failure rates are low, but at scale, failures become a daily occurrence. (arxiv, 2024)
ChinaTalk analysis
Source: ChinaTalk, "How Much AI Does $1 Get You in China vs America?" Uses a 3-year useful life for hardware in its cost modeling. Notes: "data center GPUs often have lifespans for only that long." A footnote acknowledges this may be conservative: "Some conversations indicate that the lifespan can actually be much longer, and three years is simply when it is more cost-effective to upgrade the hardware." (ChinaTalk, Feb 2026)
Inference vs. training lifecycle differences
Training hardware: Frontier training demands latest-generation hardware for competitive advantage. Economic useful life for training-only is effectively 2-3 years before next-gen hardware provides compelling cost-per-FLOP improvements.
Inference hardware: Inference workloads are less sensitive to generational upgrades. Older hardware remains competitive for inference longer because:
- Inference is often memory-bandwidth-bound, not compute-bound
- Latency requirements are application-dependent and often tolerant
- Optimized inference software (TensorRT, vLLM) continues improving on older hardware
- The value cascade model naturally extends GPU life through inference workloads
Analysis
Why 5 years is the central estimate
Amazon's revision is the strongest data point. The largest cloud provider explicitly shortened its AI server depreciation from 6 to 5 years in January 2025, absorbing a $1.6B+ combined hit ($700M to 2025 operating income plus $920M in Q4 2024 accelerated depreciation) to do so. This is a revealed-preference signal that 6 years was too optimistic for AI hardware.
The neocloud range brackets 5 years. CoreWeave (6 years), Lambda (5 years), and Nebius (4 years) center on 5 years. These companies have the most direct exposure to GPU economics and no legacy fleet to subsidize optimistic assumptions.
The value cascade supports 5 years. The three-stage model (training -> inference -> batch) maps naturally to a 5-year lifecycle with diminishing returns in years 5-6.
NVIDIA's 2-year cadence creates natural breakpoints. With Hopper (2022) -> Blackwell (2024) -> Rubin (2026) -> next-gen (2028), each generation delivers 3-4x performance/watt. After two generations (4 years), older hardware is 9-16x less efficient per watt, making continued operation increasingly uneconomic except for latency-insensitive batch workloads.
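A small sketch of how the per-generation gains compound into the 9-16x figure above, assuming the 3-4x range holds for each generation:

```python
# Relative performance/watt gap vs. current hardware, assuming 3-4x
# gains per generation on NVIDIA's ~2-year cadence.

def efficiency_gap(generations_behind: int, gain_per_gen: float) -> float:
    """How many times less efficient per watt an older GPU is."""
    return gain_per_gen ** generations_behind

for gens in (1, 2, 3):
    low = efficiency_gap(gens, 3.0)
    high = efficiency_gap(gens, 4.0)
    print(f"{gens} generation(s) behind (~{2 * gens} years): "
          f"{low:.0f}-{high:.0f}x less efficient")
```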
Trend direction: slight shortening
The trend is toward slight shortening from the 6-year schedules adopted in 2022-2023:
- Amazon's explicit reduction from 6 to 5 years
- SiliconANGLE projects 5 years as the "emerging equilibrium"
- The pace of GPU performance improvement (3-4x per generation every 2 years) has not slowed
- AI-native companies use 4-5 years, and their practices tend to lead hyperscaler policy
However, a countervailing force exists: if chip manufacturing becomes the binding constraint (ASML EUV production limited to ~100 tools/year by 2030), older GPUs could retain economic value longer, potentially stabilizing or even extending useful life assumptions.
Implications for orbital economics
The 5-year useful life creates a hard constraint for orbital data centers:
Hardware must generate returns within 5 years. Any time spent on ground testing, launch, and orbital commissioning (estimated 3-6 months by Dylan Patel) reduces the productive window by 5-10%.
No mid-life upgrades. Terrestrial data centers can swap individual GPUs (15% RMA rate for Blackwell). Orbital systems must either over-provision for failures or accept degrading capacity.
No second-life cascade. Terrestrial GPUs can be redeployed from training to inference to batch workloads. Orbital GPUs are locked into their initial deployment configuration.
End-of-life disposal. Terrestrial hardware has residual value; orbital hardware must be deorbited, with the cost of the launch amortized over fewer productive years if hardware fails early.
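A sketch of the resulting amortization pressure, using a hypothetical $100M launch-plus-hardware cost and the 3-6 month commissioning estimate above:

```python
# Sketch: commissioning delay shrinks the window over which orbital
# launch + hardware costs amortize. Dollar figure is hypothetical.

def cost_per_productive_month(total_cost: float,
                              useful_life_months: float,
                              commissioning_months: float) -> float:
    """Amortized cost per month of revenue-generating operation."""
    return total_cost / (useful_life_months - commissioning_months)

TOTAL = 100e6      # hypothetical launch + hardware cost per deployment
LIFE_MONTHS = 60   # 5-year useful life

base = cost_per_productive_month(TOTAL, LIFE_MONTHS, 0)
for delay in (3, 6):
    c = cost_per_productive_month(TOTAL, LIFE_MONTHS, delay)
    print(f"{delay}-month commissioning: ${c / 1e6:.2f}M/month "
          f"(+{(c / base - 1) * 100:.1f}% vs. no delay)")
```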