Background Knowledge of gestational age is critical for guiding preterm neonatal care. Over the last decade, metabolic gestational dating approaches have emerged in response to a global health need, because accurate antenatal gestational age estimates are not feasible in most of the developing world. These methods, initially developed in North America, have now been externally validated in two studies in developing countries; however, they require shipment of samples at sub-zero temperatures.
Methods A subset of 330 pairs of heel-prick dried blood spot samples was shipped on dry ice and at ambient temperature from field sites in Tanzania, Bangladesh and Pakistan to a laboratory in Iowa (USA). We evaluated the impact of shipment temperature on analyte recovery, then developed and evaluated models for predicting gestational age using a limited set of metabolic screening analytes, after excluding the 17 of 44 analytes that were affected by shipment conditions.
Results With the machine learning model using all 44 analytes, samples shipped on dry ice yielded a root mean square error (RMSE) of 1.19 weeks, compared to 1.58 weeks for samples shipped at ambient temperature. Recovery of 17 of the 44 screening analytes differed significantly between the two shipment methods, and these analytes were excluded from further machine learning model development. The final model, restricted to stable analytes, provided an RMSE of 1.24 weeks (95% confidence interval (CI) = 1.10-1.37) for samples shipped on dry ice and 1.28 weeks (95% CI = 1.15-1.39) for samples shipped at ambient temperature. Analysis for discriminating preterm births (gestational age <37 weeks) yielded an area under the curve (AUC) of 0.76 (95% CI = 0.71-0.81) for samples shipped on dry ice and 0.73 (95% CI = 0.67-0.78) for samples shipped at ambient temperature.
Conclusions In this study, we demonstrate that machine learning algorithms developed using a subset of newborn screening analytes that are not sensitive to shipment at ambient temperature can provide estimates of gestational age with accuracy comparable to that of published regression models from North America that use all analytes. If validated in larger samples, especially with more newborns of <34 weeks' gestation, this technology could substantially facilitate implementation in low- and middle-income countries (LMICs).
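As an illustration of the two evaluation metrics reported above, the following Python sketch computes an RMSE and a rank-based AUC for preterm discrimination. It is not the study's analysis code. The function names, the use of the estimated gestational age itself as the classification score, and the six example values are all assumptions made for illustration only.

```python
import math


def rmse(reference, estimated):
    """Root mean square error, in weeks, between reference and estimated gestational age."""
    if len(reference) != len(estimated) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(r - e) ** 2 for r, e in zip(reference, estimated)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def preterm_auc(reference, estimated, cutoff=37.0):
    """AUC for discriminating preterm births (reference gestational age < cutoff).

    Assumption: a lower estimated gestational age is treated as stronger
    evidence of preterm birth. The AUC is then the probability that a randomly
    chosen preterm newborn has a lower estimate than a randomly chosen term
    newborn, with ties counted as one half.
    """
    preterm = [e for r, e in zip(reference, estimated) if r < cutoff]
    term = [e for r, e in zip(reference, estimated) if r >= cutoff]
    if not preterm or not term:
        raise ValueError("both preterm and term newborns are required")
    concordant = 0.0
    for p in preterm:
        for t in term:
            if p < t:
                concordant += 1.0
            elif p == t:
                concordant += 0.5
    return concordant / (len(preterm) * len(term))


# Made-up values in weeks, for illustration only (not study data).
reference_ga = [34.0, 36.0, 38.0, 39.0, 40.0, 41.0]
estimated_ga = [35.0, 38.0, 37.0, 39.0, 41.0, 40.0]

print(f"RMSE = {rmse(reference_ga, estimated_ga):.2f} weeks")
print(f"AUC  = {preterm_auc(reference_ga, estimated_ga):.3f}")
```

The pairwise AUC loop is quadratic in sample size, which is adequate for a few hundred pairs; a rank-based implementation would be preferable for larger cohorts.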