What Burn-in Actually Does — and Doesn't
Burn-in is a stress applied to a finished assembly with one goal: to push the population through the infant-mortality region of the bath-tub curve before the unit ships to a customer. Anything you catch in burn-in is a unit you don't have to repair in the field. Anything you don't catch is a field return.
The mechanism is straightforward: most thermally driven early-life failure modes accelerate with temperature along an Arrhenius curve. Running a part at 85°C for 168 hours simulates roughly the first 2–6 months of typical 35°C operating life, depending on the activation energy of the dominant failure mode. Running at 125°C for the same time simulates anywhere from several months to a few years, again depending on activation energy.
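The scaling behind those figures is the Arrhenius acceleration factor, AF = exp[(Ea / k)(1/T_use − 1/T_stress)], with temperatures in kelvin. Below is a minimal Python sketch, assuming a single dominant failure mechanism. The function names and the activation energies tried (0.4 to 0.6 eV) are illustrative choices, not measured values from any programme.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333e-5  # Boltzmann constant in eV/K


def arrhenius_af(ea_ev: float, t_use_c: float, t_stress_c: float) -> float:
    """Acceleration factor between a use temperature and a stress temperature."""
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    exponent = (ea_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_use_k - 1.0 / t_stress_k)
    return math.exp(exponent)


def equivalent_use_hours(burn_in_hours: float, ea_ev: float,
                         t_use_c: float, t_stress_c: float) -> float:
    """Use-condition hours represented by a burn-in of the given length."""
    return burn_in_hours * arrhenius_af(ea_ev, t_use_c, t_stress_c)


if __name__ == "__main__":
    HOURS_PER_MONTH = 730.0  # average calendar month
    for ea in (0.4, 0.5, 0.6):  # assumed activation energies, eV
        hours = equivalent_use_hours(168, ea, t_use_c=35, t_stress_c=85)
        print(f"Ea = {ea:.1f} eV: about {hours / HOURS_PER_MONTH:.1f} months at 35°C")
```

With these assumed activation energies, 168 hours at 85°C works out to roughly two to five and a half months of operation at 35°C, which is where the 2–6 month range comes from.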
What burn-in catches
- Marginal solder joints — cold joints, head-in-pillow on BGAs, micro-voids that grow under thermal cycling.
- Latent silicon defects — gate oxide weak spots that fail in early life, particularly common in older process nodes.
- Wire-bond degradation — bonds with marginal pull strength that pass at zero-hours and fail after thermal stress.
- Capacitor weak spots — electrolytics with high ESR that fail rapidly under elevated temperature.
- Workmanship defects on connectors, sockets, mechanical mounts — anything that needs a few hundred thermal cycles to expose.
What burn-in doesn't catch
- Wear-out failures (the right-hand side of the bath-tub): these will happen at the same calendar time regardless of burn-in.
- Random failures (the flat bottom of the bath-tub): burn-in does not reduce their rate, which stays constant once the early failures have been screened out.
- Field-specific failures (vibration, humidity, lightning): these need their own stress tests.
"Burn-in is not a magic wand for reliability. It's a screen for infant mortality. If you don't have an infant-mortality problem, you don't need it; if you do, 168 hours is the starting point, not the ending point." — Pioneer Horizon test team
The Study — Three Programmes, Two Protocols
To answer the 168-vs-1000 question with data rather than opinion, we ran a parallel-cohort study across three customer programmes between 2022 and 2025. We set the study up because the question kept coming up in customer scoping calls and we were tired of answering with "it depends".
Programme A — Industrial automation controller
- Volume: 4,200 units across two years.
- Operating environment: indoor cabinet, 25–55°C, low humidity.
- Burn-in cohorts: 2,100 units at 168h / 70°C (cohort A1), 2,100 units at 1,000h / 70°C (cohort A2).
Programme B — Medical diagnostic board
- Volume: 1,800 units across 18 months.
- Operating environment: clinical bench, 20–28°C, frequent power cycling.
- Burn-in cohorts: 900 units at 168h / 65°C (B1), 900 units at 1,000h / 65°C (B2).
Programme C — Outdoor edge gateway
- Volume: 2,800 units across two years.
- Operating environment: pole-mounted enclosure, -10 to +55°C, condensing humidity, vibration.
- Burn-in cohorts: 1,400 units at 168h / 75°C (C1), 1,400 units at 1,000h / 75°C (C2).
What we measured
- Burn-in fallout rate: failures during the burn-in window itself, logged by hour.
- 0–90 day field failure rate.
- 90–365 day field failure rate.
- Total cost-per-board: burn-in cost (energy, fixture amortisation, test time, labour) plus field-failure cost (RMA logistics, repair labour, customer credit).
All burn-in was done in our own chambers at the Madurai facility with continuous functional monitoring — any unit failing during burn-in was logged with the failure hour and the failure mode, not just a pass/fail at the end.
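A per-failure log of this kind can be represented and summarised by time window with very little code. The sketch below is illustrative only: the record fields, the window edges and the sample entries are hypothetical, not the schema or data used in the study.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class BurnInFailure:
    serial: str   # hypothetical unit identifier
    hour: float   # elapsed burn-in hours when the failure was detected
    mode: str     # failure mode recorded by the operator or monitor


def failures_by_window(failures, edges):
    """Count failures in half-open windows [edges[i], edges[i+1])."""
    windows = list(zip(edges, edges[1:]))
    counts = Counter()
    for failure in failures:
        for lo, hi in windows:
            if lo <= failure.hour < hi:
                counts[(lo, hi)] += 1
                break
    # Counter returns 0 for windows that never saw a failure.
    return [((lo, hi), counts[(lo, hi)]) for lo, hi in windows]


if __name__ == "__main__":
    # Made-up entries, for illustration only.
    log = [
        BurnInFailure("SN-0001", 10.0, "solder joint"),
        BurnInFailure("SN-0002", 30.5, "wire bond"),
        BurnInFailure("SN-0003", 200.0, "capacitor ESR"),
    ]
    for (lo, hi), n in failures_by_window(log, [0, 48, 168, 600, 1000]):
        print(f"hour {lo:>4} to {hi:<4}: {n} failure(s)")
```

Binning by hour rather than recording a single end-of-test pass/fail is what makes it possible to see whether failures have stopped before the burn-in window closes.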
Results — Where the Failures Actually Cluster
The clearest finding from the study is also the most counter-intuitive: burn-in failures are not spread evenly across the burn-in window. They cluster, and where they cluster tells you whether 168 hours is enough.
Programme A — industrial controller
- Cohort A1 (168h): 14 burn-in failures (0.67%); failures clustered between hour 8 and hour 48, then nothing.
- Cohort A2 (1,000h): 19 burn-in failures (0.90%); the additional 5 occurred between hour 168 and hour 600, mostly hour 200–350.
- Field-failure rate 0–90 days: A1 0.81%, A2 0.43% — a 47% reduction.
- Field-failure rate 90–365 days: A1 0.62%, A2 0.55% — statistically indistinguishable.
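One way to check a claim like "statistically indistinguishable" is a two-proportion z-test. The sketch below is an assumption about method, since the study does not say which test was applied. It also assumes the full cohort size of 2,100 as the denominator for each field-failure rate.

```python
import math


def two_proportion_z(p1: float, n1: int, p2: float, n2: int):
    """Return (z, two-sided p-value) for H0: the two underlying rates are equal."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    std_err = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / std_err
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


if __name__ == "__main__":
    # 90-365 day field-failure rates for cohorts A1 and A2.
    z, p = two_proportion_z(0.0062, 2100, 0.0055, 2100)
    print(f"z = {z:.2f}, two-sided p = {p:.2f}")
```

For the 90–365 day rates this gives a z statistic of about 0.3 and a two-sided p-value near 0.77, which is consistent with no detectable difference between the two cohorts.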
Programme B — medical diagnostic
- Cohort B1 (168h): 7 burn-in failures (0.78%).
- Cohort B2 (1,000h): 9 burn-in failures (1.00%); the additional 2 were both at hour 750+, both due to electrolytic capacitor ESR drift.
- Field-failure rate 0–90 days: B1 0.34%, B2 0.22% — modest improvement.
- Field-failure rate 90–365 days: B1 0.28%, B2 0.31% — no improvement.
Programme C — outdoor gateway
- Cohort C1 (168h): 31 burn-in failures (2.21%); a much higher rate driven by the outdoor BOM.
- Cohort C2 (1,000h): 49 burn-in failures (3.50%); the additional 18 mostly between hour 168 and hour 500.
- Field-failure rate 0–90 days: C1 1.93%, C2 0.79% — a 59% reduction.
- Field-failure rate 90–365 days: C1 1.41%, C2 1.18% — modest improvement.
The pattern
Where the 168-hour burn-in caught most of the infant mortality (Programme B), extending to 1,000 hours added negligible value. Where 168 hours left a long tail of failures still in front of the 0–90 day field window (Programmes A and C), extending to 1,000 hours produced a substantial drop in field returns. The deciding factor is the BOM's intrinsic infant-mortality profile, not a one-size-fits-all rule.
The Cost Crossover — Where 1,000 Hours Pays For Itself
The interesting number is not the failure-rate reduction; it's the total cost. Extended burn-in is expensive: chamber occupancy, fixturing, test cycles, energy, and the opportunity cost of holding inventory for an extra five weeks. It only makes sense when the avoided field failures cost more than the additional burn-in.
The cost breakdown (₹ per board, our facility, 2024 pricing)
- 168-hour burn-in: ₹180 per board on average (chamber + fixture + monitor + labour, amortised over a 100-board chamber).
- 1,000-hour burn-in: ₹620 per board — a delta of ₹440.
- Field-failure cost — Programme A: ₹4,200 per RMA (logistics + repair + credit).
- Field-failure cost — Programme B: ₹11,500 per RMA (medical service-level requirements).
- Field-failure cost — Programme C: ₹8,800 per RMA (outdoor service truck-roll).
The crossover math
For extended burn-in to pay for itself, the field-failure cost it avoids per board has to exceed the ₹440 burn-in delta. Put as a failure-rate reduction in the 0–90 day window, the break-even threshold is ₹440 / field-failure cost:
- Programme A: required reduction = 440 / 4,200 = 10.5%. Observed reduction = 47%. Strongly favours 1,000h. Net saving: ₹160 per board.
- Programme B: required reduction = 440 / 11,500 = 3.8%. Observed reduction = 0.12 percentage points / 0.34% = 35%. Favours 1,000h, but the absolute numbers are small. Net saving: ₹14 per board — barely worth the schedule cost.
- Programme C: required reduction = 440 / 8,800 = 5.0%. Observed reduction = 59%. Strongly favours 1,000h. Net saving: ₹540 per board.
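The threshold and reduction figures above can be reproduced from the numbers quoted in this section. The sketch below uses illustrative names throughout. Note that the threshold (₹440 divided by the cost per RMA) is a fraction of boards shipped, while the observed reduction is quoted relative to the 168h cohort's own rate, so the sketch also reports the absolute drop in percentage points to allow a comparison in the same units.

```python
from dataclasses import dataclass

BURN_IN_DELTA_INR = 620 - 180  # extra cost per board of 1,000h over 168h


@dataclass(frozen=True)
class Programme:
    name: str
    rma_cost_inr: float  # field-failure cost per RMA
    rate_168h: float     # 0-90 day field-failure rate, 168h cohort
    rate_1000h: float    # 0-90 day field-failure rate, 1,000h cohort


def break_even_threshold(p: Programme, delta: float = BURN_IN_DELTA_INR) -> float:
    """Failure-rate reduction (fraction of boards) at which the delta is repaid."""
    return delta / p.rma_cost_inr


def relative_reduction(p: Programme) -> float:
    """Reduction expressed as a fraction of the 168h cohort's rate."""
    return (p.rate_168h - p.rate_1000h) / p.rate_168h


def absolute_drop(p: Programme) -> float:
    """Reduction expressed as a fraction of boards shipped."""
    return p.rate_168h - p.rate_1000h


PROGRAMMES = [
    Programme("A", 4200, 0.0081, 0.0043),
    Programme("B", 11500, 0.0034, 0.0022),
    Programme("C", 8800, 0.0193, 0.0079),
]

if __name__ == "__main__":
    for p in PROGRAMMES:
        print(f"Programme {p.name}: threshold {break_even_threshold(p):.1%}, "
              f"relative reduction {relative_reduction(p):.0%}, "
              f"absolute drop {absolute_drop(p) * 100:.2f} percentage points")
```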
The decision framework we use now
From this data, we now recommend 1,000-hour burn-in when at least two of these conditions hold:
- Field-failure cost per RMA exceeds ₹5,000 (truck-roll, medical, regulated industries).
- The 168-hour cohort's burn-in failure curve has not flattened — failures still occurring at hour 120+.
- The BOM includes electrolytic capacitors on a power rail that operates above 60% of rated voltage continuously.
- The deployment environment includes thermal cycling, humidity, or vibration not represented in burn-in conditions.
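The two-of-four rule above can be written down as a small check. This is a sketch under stated assumptions: the field names are hypothetical, "failures still occurring at hour 120+" is interpreted as the last logged failure in the 168h cohort falling at or after hour 120, and unknown inputs (None) are treated as conditions that do not hold.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ScreenInputs:
    rma_cost_inr: float                          # field-failure cost per RMA
    last_failure_hour_168h: Optional[float]      # None if unknown or no failures
    electrolytic_voltage_ratio: Optional[float]  # continuous rail voltage / rated voltage
    unscreened_field_stress: bool                # cycling, humidity or vibration not in burn-in


def conditions_met(x: ScreenInputs) -> list:
    """Evaluate the four conditions in the order they are listed above."""
    return [
        x.rma_cost_inr > 5000,
        x.last_failure_hour_168h is not None and x.last_failure_hour_168h >= 120,
        x.electrolytic_voltage_ratio is not None and x.electrolytic_voltage_ratio > 0.60,
        x.unscreened_field_stress,
    ]


def recommend_1000h(x: ScreenInputs) -> bool:
    """True when at least two of the four conditions hold."""
    return sum(conditions_met(x)) >= 2


if __name__ == "__main__":
    # Outdoor gateway profile: ₹8,800 per RMA, humidity and vibration in the field.
    outdoor = ScreenInputs(8800, None, None, True)
    print(recommend_1000h(outdoor))  # two conditions hold
```

Treating unknown inputs as "not met" keeps the rule conservative: a programme only qualifies on conditions that have actually been established.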
Alternatives and Pairing — What Sits Around the Burn-in Decision
Extended burn-in is one tool. It's not the only one, and on some programmes it's not even the right one. Four other screens sit alongside it, and the right answer is usually a combination.
HALT (Highly Accelerated Life Test)
HALT applies temperature, vibration, and combined stress at levels beyond operating spec, with the goal of finding latent design weaknesses early. It's a design-validation tool, not a production screen — run it once on the first 20 prototypes, find the weak spots, fix them in the design, and don't repeat in production. Cost per unit: high (₹15,000+). Value: enormous if applied at DVT, marginal if applied at production.
HASS (Highly Accelerated Stress Screen)
HASS is HALT's production-screen sibling — a shorter, less destructive version applied to 100% or sampled production. It's faster than burn-in (typically 4–8 hours) and screens different failure modes — particularly vibration-sensitive workmanship faults. Pairing 168h burn-in with HASS often outperforms 1,000h burn-in alone, at lower cost. Cost per unit: ₹250–500.
Thermal cycling alone
For programmes where the dominant failure mode is solder-joint fatigue or BGA fracture, a thermal-cycling profile (e.g., -40 to +85°C, 200 cycles) catches what static burn-in won't. Cost per unit: ₹400–700.
Component-level burn-in
Some semiconductor families (FPGAs, high-density SoCs) benefit from chip-level burn-in by the manufacturer before they ever reach the board. If you're buying parts that have manufacturer-side burn-in already done, the assembly-level burn-in case weakens — you're screening the assembly process, not the silicon.
"Burn-in is a tool, not a virtue. The teams that go straight to '1,000 hours, always' end up paying for failures they could have caught faster with HASS plus a thermal cycle. The trick is matching the screen to the failure mode you actually have." — Pioneer Horizon test team
Our recommended starting matrix
- Consumer / commercial, low field-failure cost: 168h burn-in, no extension. The extension rarely justifies its cost.
- Industrial, moderate cost: 168h burn-in + HASS, paired. Catches the most for the least.
- Outdoor / vibration / humidity: 1,000h burn-in + thermal cycling. The bath-tub tail is too long for 168h alone.
- Medical / aerospace / safety-critical: HALT during DVT, then 1,000h burn-in + HASS + thermal cycling on 100% of production.
If you're scoping a reliability protocol for a new programme, share the BOM and the deployment environment and we'll come back with a screen plan matched to the failure modes that matter for your case, with the cost-per-board math attached.