TY - JOUR
T1 - A versatile tool for precise variant calling in Mycobacterium tuberculosis genetic polymorphisms
AU - Razzak, Safina Abdul
AU - Hasan, Zahra
AU - Azim, M. Kamran
AU - Kanji, Akber
AU - Shakoor, Sadia
AU - Hasan, Rumina
PY - 2023/7/31
Y1 - 2023/7/31
N2 - Background: Whole genome sequencing (WGS) facilitates the diagnosis of multidrug-resistant tuberculosis (MDR-TB) through the interpretation of sequence variations (SV) in Mycobacterium tuberculosis (MTB) genes. As information on phenotypic and genotypic resistance associations continues to evolve, it is important to identify SV within genes of interest. We developed MTB-VCF, a variant calling pipeline that can compare sequences against the reference genome for any gene of interest. We demonstrate its utility for calling SV in genes associated with Rifampicin (RIF), Isoniazid (INH), Ethambutol (EM), and Streptomycin (SM) resistance. Methods: MTB-VCF is a Python-based command-line variant calling pipeline designed to streamline batch processing of raw read (FASTQ) files. SV called by MTB-VCF were compared with those identified by the TB-Profiler, KvarQ, CASTB, Mykrobe Predictor, and PhyResSE pipelines. The sensitivity and specificity of MTB-VCF SV calling were calculated against the drug susceptibility testing (DST) phenotype. Results: MTB-VCF identified 868 SV in 200 phenotypically resistant MDR-TB isolates. These were found across the rpsL, rrs, rpoB, inhA, katG, ahpC, gidB, and embCAB genes. Of these, 684 SV were known to be associated with a resistance genotype, leading to a specificity of 97.75%. The SV called by MTB-VCF were compared separately to resistance genotypes called by the TB-Profiler, KvarQ, CASTB, Mykrobe Predictor, and PhyResSE pipelines, demonstrating a sensitivity of 99.5%. Conclusion: The MTB-VCF pipeline offers a rapid and accurate solution for identifying SV in target genes for later interpretation. It can be run in large batches, providing flexible computing that allows for the customization of core bioinformatic pipelines and enables the analysis of WGS data from different technologies.
AB - Background: Whole genome sequencing (WGS) facilitates the diagnosis of multidrug-resistant tuberculosis (MDR-TB) through the interpretation of sequence variations (SV) in Mycobacterium tuberculosis (MTB) genes. As information on phenotypic and genotypic resistance associations continues to evolve, it is important to identify SV within genes of interest. We developed MTB-VCF, a variant calling pipeline that can compare sequences against the reference genome for any gene of interest. We demonstrate its utility for calling SV in genes associated with Rifampicin (RIF), Isoniazid (INH), Ethambutol (EM), and Streptomycin (SM) resistance. Methods: MTB-VCF is a Python-based command-line variant calling pipeline designed to streamline batch processing of raw read (FASTQ) files. SV called by MTB-VCF were compared with those identified by the TB-Profiler, KvarQ, CASTB, Mykrobe Predictor, and PhyResSE pipelines. The sensitivity and specificity of MTB-VCF SV calling were calculated against the drug susceptibility testing (DST) phenotype. Results: MTB-VCF identified 868 SV in 200 phenotypically resistant MDR-TB isolates. These were found across the rpsL, rrs, rpoB, inhA, katG, ahpC, gidB, and embCAB genes. Of these, 684 SV were known to be associated with a resistance genotype, leading to a specificity of 97.75%. The SV called by MTB-VCF were compared separately to resistance genotypes called by the TB-Profiler, KvarQ, CASTB, Mykrobe Predictor, and PhyResSE pipelines, demonstrating a sensitivity of 99.5%. Conclusion: The MTB-VCF pipeline offers a rapid and accurate solution for identifying SV in target genes for later interpretation. It can be run in large batches, providing flexible computing that allows for the customization of core bioinformatic pipelines and enables the analysis of WGS data from different technologies.
U2 - 10.1101/2023.07.24.550283
DO - 10.1101/2023.07.24.550283
M3 - Article
JO - Department of Paediatrics and Child Health
JF - Department of Paediatrics and Child Health
ER -