TY - JOUR
T1 - Eco-Evolutionary Drivers of Vibrio parahaemolyticus Sequence Type 3 Expansion
T2 - Retrospective Machine Learning Approach
AU - Campbell, Amy Marie
AU - Hauton, Chris
AU - van Aerle, Ronny
AU - Martinez-Urtaza, Jaime
N1 - Publisher Copyright: © 2024, JMIR Publications Inc. All rights reserved.
PY - 2024/11/28
Y1 - 2024/11/28
N2 - BACKGROUND: Environmentally sensitive pathogens exhibit ecological and evolutionary responses to climate change that result in the emergence and global expansion of well-adapted variants. It is imperative to understand the mechanisms that facilitate pathogen emergence and expansion, as well as the drivers behind the mechanisms, to understand and prepare for future pandemic expansions. OBJECTIVE: The unique, rapid, global expansion of a clonal complex of Vibrio parahaemolyticus (a marine bacterium causing gastroenteritis infections) named Vibrio parahaemolyticus sequence type 3 (VpST3) provides an opportunity to explore the eco-evolutionary drivers of pathogen expansion. METHODS: The global expansion of VpST3 was reconstructed using VpST3 genomes, which were then classified into metrics characterizing the stages of this expansion process, indicative of the stages of emergence and establishment. We used machine learning, specifically a random forest classifier, to test a range of ecological and evolutionary drivers for their potential in predicting VpST3 expansion dynamics. RESULTS: We identified a range of evolutionary features, including mutations in the core genome and accessory gene presence, associated with expansion dynamics. A range of random forest classifier approaches were tested to predict expansion classification metrics for each genome. The highest predictive accuracies (ranging from 0.722 to 0.967) were achieved for models using a combined eco-evolutionary approach. While population structure and the difference between introduced and established isolates could be predicted to a high accuracy, our model reported multiple false positives when predicting the success of an introduced isolate, suggesting potential limiting factors not represented in our eco-evolutionary features. Regional models produced for 2 countries reporting the most VpST3 genomes had varying success, reflecting the impacts of class imbalance. CONCLUSIONS: These novel insights into evolutionary features and ecological conditions related to the stages of VpST3 expansion showcase the potential of machine learning models using genomic data and will contribute to the future understanding of the eco-evolutionary pathways of climate-sensitive pathogens.
KW - VpST3
KW - climate change
KW - ecology
KW - evolution
KW - genomics
KW - machine learning
KW - pathogen expansion
KW - sequence type 3
KW - sequencing
KW - Vibrio parahaemolyticus
UR - http://www.scopus.com/inward/record.url?scp=85211347018&partnerID=8YFLogxK
UR - https://www.mendeley.com/catalogue/89bc1af8-3568-3fa9-a8b9-6b8969b5fc97/
UR - https://portalrecerca.uab.cat/en/publications/09e48d67-f6d2-4ac6-b08e-75db2b819ec7
U2 - 10.2196/62747
DO - 10.2196/62747
M3 - Article
C2 - 39607996
SN - 2563-3570
VL - 5
JO - JMIR Bioinformatics and Biotechnology
JF - JMIR Bioinformatics and Biotechnology
M1 - e62747
ER -