Introduction: Time-to-event data from clinical trials are routinely extrapolated using parametric models to estimate the cost effectiveness of novel therapies, but how this approach performs in the presence of heterogeneous populations remains unknown. Methods: We performed a simulation study of seven scenarios with varying exponential distributions modelling treatment and prognostic effects across subgroup and complement populations, with follow-up typical of clinical trials used to appraise the cost effectiveness of therapies by agencies such as the UK National Institute for Health and Care Excellence (NICE). We compared established and emerging methods of estimating population life-years (LYs) using parametric models. We also proved analytically that an exponential model fitted to censored heterogeneous survival times sampled from two distinct exponential distributions will produce a biased estimate of the hazard rate and LYs. Results: The methods compared underestimate LYs in the presence of heterogeneity, which results in either under- or overestimation of the incremental benefit. In scenarios where overestimation of benefit is likely, which are of interest to the healthcare provider, taking the average LYs from all plausible models has the least bias. LY estimates from complete Kaplan–Meier curves have high variation, suggesting mature data may not be a reliable solution. We also explored the effect of increasing trial sample size and of accounting for detected treatment–subgroup interactions. Conclusions: The bias associated with heterogeneous populations suggests that NICE may need to be more cautious when appraising therapies and to consider model averaging or the separate modelling of subgroups when heterogeneity is suspected or detected.
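The analytic result in the Methods can be illustrated with a minimal numerical sketch. It is not the authors' proof or simulation design. It assumes a single trial arm made up of two exponential subpopulations, administrative censoring at a fixed time `c`, and the standard censored-data maximum likelihood estimate of an exponential hazard (events divided by total exposure time). The function names, mixing proportion, hazard rates, and follow-up time are all hypothetical values chosen for illustration.

```python
import math
import random


def true_life_years(p, lam1, lam2):
    """Mean survival of a two-component exponential mixture (no censoring)."""
    return p / lam1 + (1 - p) / lam2


def limiting_fitted_life_years(p, lam1, lam2, c):
    """Large-sample LYs (1 / hazard) from one exponential model fitted to the
    mixture with administrative censoring at time c (assumed censoring scheme)."""
    a1 = 1 - math.exp(-lam1 * c)  # probability of an observed event, subgroup
    a2 = 1 - math.exp(-lam2 * c)  # probability of an observed event, complement
    expected_events = p * a1 + (1 - p) * a2
    # E[min(T, c)] for an exponential with rate lam is (1 - exp(-lam * c)) / lam
    expected_exposure = p * a1 / lam1 + (1 - p) * a2 / lam2
    return expected_exposure / expected_events


def simulated_fitted_life_years(p, lam1, lam2, c, n, seed=0):
    """Monte Carlo check: fit a single exponential to n censored mixture draws."""
    rng = random.Random(seed)
    events = 0
    exposure = 0.0
    for _ in range(n):
        lam = lam1 if rng.random() < p else lam2
        t = rng.expovariate(lam)
        if t <= c:
            events += 1
            exposure += t
        else:
            exposure += c  # censored at end of follow-up
    return exposure / events  # reciprocal of the MLE hazard (events / exposure)


if __name__ == "__main__":
    # Hypothetical inputs: half the patients have hazard 0.5, half 0.1,
    # with 3 time units of follow-up.
    p, lam1, lam2, c = 0.5, 0.5, 0.1, 3.0
    print("true LYs:            ", true_life_years(p, lam1, lam2))
    print("limiting fitted LYs: ", limiting_fitted_life_years(p, lam1, lam2, c))
    print("simulated fitted LYs:", simulated_fitted_life_years(p, lam1, lam2, c, 20000))
```

Under these assumptions the fitted LYs are a weighted average of the two subgroup means, with weights proportional to each subgroup's probability of an observed event. The higher-hazard subgroup contributes more events before censoring, so the fit is pulled towards the shorter mean survival, which matches the direction of bias reported in the Results.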