TY - JOUR
T1 - Chest x-ray analysis with deep learning-based software as a triage test for pulmonary tuberculosis
T2 - a prospective study of diagnostic accuracy for culture-confirmed disease
AU - Khan, Faiz Ahmad
AU - Majidulla, Arman
AU - Tavaziva, Gamuchirai
AU - Nazish, Ahsana
AU - Abidi, Syed Kumail
AU - Benedetti, Andrea
AU - Menzies, Dick
AU - Johnston, James C.
AU - Khan, Aamir Javed
AU - Saeed, Saima
N1 - Funding Information:
The authors thank Coralie Geric (MScPH candidate, Department of Epidemiology, Biostatistics, & Occupational Health, McGill University, Montreal, QC, Canada) for generating the forest plots, and qure.ai and Delft for providing technical support with the local installation and usage of the software used in this study.
Publisher Copyright:
© 2020 The Author(s). Published by Elsevier Ltd. This is an Open Access article under the CC BY-NC-ND 4.0 license
PY - 2020/11
Y1 - 2020/11
AB - Background: Deep learning-based radiological image analysis could facilitate use of chest x-rays as triage tests for pulmonary tuberculosis in resource-limited settings. We sought to determine whether commercially available chest x-ray analysis software meet WHO recommendations for minimal sensitivity and specificity as pulmonary tuberculosis triage tests. Methods: We recruited symptomatic adults at the Indus Hospital, Karachi, Pakistan. We compared two software, qXR version 2.0 (qXRv2) and CAD4TB version 6.0 (CAD4TBv6), with a reference of mycobacterial culture of two sputa. We assessed qXRv2 using its manufacturer prespecified threshold score for chest x-ray classification as tuberculosis present versus not present. For CAD4TBv6, we used a data-derived threshold, because it does not have a prespecified one. We tested for non-inferiority to preset WHO recommendations (0·90 for sensitivity, 0·70 for specificity) using a non-inferiority limit of 0·05. We identified factors associated with accuracy by stratification and logistic regression. Findings: We included 2198 (92·7%) of 2370 enrolled participants. 2187 (99·5%) of 2198 were HIV-negative, and 272 (12·4%) had culture-confirmed pulmonary tuberculosis. For both software, accuracy was non-inferior to WHO-recommended minimum values (qXRv2 sensitivity 0·93 [95% CI 0·89–0·95], non-inferiority p=0·0002; CAD4TBv6 sensitivity 0·93 [0·90–0·96], p<0·0001; qXRv2 specificity 0·75 [0·73–0·77], p<0·0001; CAD4TBv6 specificity 0·69 [0·67–0·71], p=0·0003). Sensitivity was lower in smear-negative pulmonary tuberculosis for both software, and in women for CAD4TBv6. Specificity was lower in men and in those with previous tuberculosis, and reduced with increasing age and decreasing body mass index. Smoking and diabetes did not affect accuracy. Interpretation: In an HIV-negative population, these software met WHO-recommended minimal accuracy for pulmonary tuberculosis triage tests. Sensitivity will be lower when smear-negative pulmonary tuberculosis is more prevalent. Funding: Canadian Institutes of Health Research.
UR - http://www.scopus.com/inward/record.url?scp=85092928782&partnerID=8YFLogxK
U2 - 10.1016/S2589-7500(20)30221-1
DO - 10.1016/S2589-7500(20)30221-1
M3 - Article
C2 - 33328086
AN - SCOPUS:85092928782
SN - 2589-7500
VL - 2
SP - e573
EP - e581
JO - The Lancet Digital Health
JF - The Lancet Digital Health
IS - 11
ER -