Source: UNIV OF MARYLAND submitted to NRP
DSFAS: TOWARDS THE BIOGEOGRAPHY OF FOODBORNE PATHOGEN GENOMIC DIVERSITY
Sponsoring Institution
National Institute of Food and Agriculture
Project Status
ACTIVE
Funding Source
Reporting Frequency
Annual
Accession No.
1030674
Grant No.
2023-67022-40017
Cumulative Award Amt.
$299,958.00
Proposal No.
2022-11599
Multistate No.
(N/A)
Project Start Date
Aug 1, 2023
Project End Date
May 31, 2026
Grant Year
2023
Program Code
[A1541]- Food and Agriculture Cyberinformatics and Tools
Recipient Organization
UNIV OF MARYLAND
(N/A)
COLLEGE PARK,MD 20742
Performing Department
(N/A)
Non Technical Summary
Foodborne illness impacts nearly 50 million people in the United States per year, causing severe economic damages and healthcare burdens. There is an urgent need for more effective strategies to manage pathogen transmission across food and agricultural systems. This proposal aims to define the spatial and temporal distributions of expansive genomic diversity of foodborne pathogens by integrating open-source molecular data with artificial intelligence (AI) and machine learning (ML) technologies, in order to inform practical applications that will promote food safety. Focusing on a variety of pathogens that present unique risks (i.e., Campylobacter jejuni, Cronobacter sakazakii, and Listeria monocytogenes), we will build and validate a computational workflow to understand how selective pressures across food chains drive bacterial environmental tropisms (e.g., genes underlying persistence, antimicrobial resistance, or virulence linked to specific source types). Leveraging publicly available genomic and phenotypic data will enable high-throughput identification of new molecular markers associated with foodborne pathogen transmission and evolution. These data will further serve as key reference material to improve strategies for pathogen molecular surveillance with metagenomics. Overall, the analytical framework that we develop and promote for use by the food safety research community will have applications that extend to broader foodborne pathogen groups and emerging food safety concerns.
Animal Health Component
30%
Research Effort Categories
Basic
70%
Applied
30%
Developmental
(N/A)
Classification

Knowledge Area (KA) | Subject of Investigation (SOI) | Field of Science (FOS) | Percent
712 | 4010 | 1080 | 50%
723 | 4010 | 1080 | 50%
Goals / Objectives
Our proposal aims to integrate AI and ML with open-source molecular data to define the spatial and temporal distributions of expansive genomic diversity of foodborne pathogens. The data-analytic workflows that we build will have innovative technological applications for food and agricultural industries to support food safety, nutrition, and health.
Objective 1: Establish a systematic pipeline to elucidate the biogeography of genomic diversity of foodborne pathogens.
Objective 2: Benchmark metagenomic tools for foodborne pathogen detection in simulated microbial communities.
These independent, yet interrelated, objectives will test our overarching hypothesis that environmental stressors along the agricultural continuum to consumers impose selective pressures that drive adaptive responses of diverse foodborne pathogens (C. jejuni, C. sakazakii, and L. monocytogenes) and the emergence of key genes and pathways (e.g., persistence, antimicrobial resistance, virulence) that hold potential for informing novel applications to promote food safety.
Project Methods
Objective 1 will establish a reproducible bioinformatics pipeline to annotate and compare pangenomes of critical foodborne pathogens that represent different bacterial lifestyles, survival strategies, and foodborne illness implications (C. jejuni, C. sakazakii, and L. monocytogenes). Leveraging publicly available resources, including >100,000 open-source genome assemblies archived in 'NCBI Pathogen Detection,' will enable high-throughput identification of new molecular markers associated with pathogen transmission and clinical health risks. Genome assembly accessions will be sourced from bacterial isolate metadata, focusing on isolates with assigned source categories (food, environment, and clinical), origin location, and year. Quality control programs will filter and select for high-quality assemblies, which will be used in downstream pangenome analysis to characterize the comprehensive genetic diversity of the respective target species groups. Additional tools will be employed for feature identification within pangenomes (e.g., antimicrobial resistance genes, virulence genes, metabolic pathways). Machine learning models will be used to uncover associations between the metadata, notably source type, and genomic features. Key genes and pathways putatively associated with environmental tropisms with food safety implications will be identified across geographic and temporal scales for an extensive set of food commodities. These data will further serve as reference material to optimize metagenomic frameworks for pathogen molecular surveillance in Objective 2. Expanding on acquired knowledge of the biogeography of genomic diversity of C. jejuni, C. sakazakii, and L. monocytogenes, we will explore how strain-level variation impacts detection potential for microbiome profiling with metagenomics.
Traditional metagenomic tools will be benchmarked for efficacy in predicting abundances of the respective pathogens in simulated microbial communities, testing effects of pathogen concentration and strain-level variation (i.e., different lineages or clades of each species) on metagenomic predictions. Overall, the AI-based pangenome and metagenome workflows that we develop will have broad applications in future research that extends to additional foodborne pathogen groups and emerging food safety concerns.
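The metadata filtering and source-type association steps described for Objective 1 can be illustrated with a minimal Python sketch. It is not the project's pipeline: the record fields, isolate identifiers, and gene names below are hypothetical placeholders (not the NCBI Pathogen Detection schema), and a simple per-source gene prevalence table stands in for the machine learning models mentioned above.

```python
from collections import Counter

# Hypothetical isolate records; field names and values are illustrative only
# and do not reflect the actual NCBI Pathogen Detection metadata schema.
ISOLATES = [
    {"id": "iso1", "source": "food", "location": "USA", "year": 2019, "genes": {"geneA", "geneB"}},
    {"id": "iso2", "source": "food", "location": "USA", "year": 2020, "genes": {"geneA"}},
    {"id": "iso3", "source": "clinical", "location": "Canada", "year": 2021, "genes": {"geneB"}},
    {"id": "iso4", "source": "clinical", "location": "", "year": 2021, "genes": {"geneA"}},
    {"id": "iso5", "source": "environment", "location": "USA", "year": None, "genes": {"geneA"}},
    {"id": "iso6", "source": "clinical", "location": "USA", "year": 2018, "genes": {"geneB"}},
]

REQUIRED_FIELDS = ("source", "location", "year")
SOURCE_CATEGORIES = {"food", "environment", "clinical"}


def has_complete_metadata(record):
    """Keep isolates with a recognized source category, a location, and a year."""
    if not all(record.get(field) for field in REQUIRED_FIELDS):
        return False
    return record["source"] in SOURCE_CATEGORIES


def gene_prevalence_by_source(records):
    """Return, for each gene, the fraction of isolates per source type carrying it."""
    totals = Counter()   # number of isolates per source type
    hits = {}            # gene -> Counter of carrying isolates per source type
    for record in records:
        totals[record["source"]] += 1
        for gene in record["genes"]:
            hits.setdefault(gene, Counter())[record["source"]] += 1
    return {
        gene: {source: counts[source] / totals[source] for source in totals}
        for gene, counts in hits.items()
    }


kept = [r for r in ISOLATES if has_complete_metadata(r)]
prevalence = gene_prevalence_by_source(kept)
for gene in sorted(prevalence):
    print(gene, prevalence[gene])
```

Genes whose prevalence differs sharply between source types would be candidates for closer statistical or model-based testing; the table by itself does not establish an association.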

Progress 08/01/23 to 07/31/24

Outputs
Target Audience: Our target audience has been researchers and stakeholders in microbial genomics, food safety and public health, and machine learning applications to support food science. Across these themes, we have provided advanced training in genomics and big data analysis to a postdoctoral fellow, a graduate student, and an undergraduate student researcher. Our findings have been shared at regional and international meetings, thus reaching the broader scientific community.
Changes/Problems: There have not been major changes from the proposed study. However, the project began in August 2023 and it took longer than anticipated to hire the postdoctoral associate (Gao), which delayed the start of parts of our proposed investigation until around January 2024. Nevertheless, substantial progress in both objectives has been made during the first reporting period.
What opportunities for training and professional development has the project provided? Blaustein (PD) participated in a workshop on 'Unlocking the Power of Artificial Intelligence in Food' with the Institute of Food Technologists. Members of the team include a postdoctoral associate (Mairui Gao) and a PhD student (Madhusudan Timilsina), who are receiving advanced training in food safety and data analytics at the University of Maryland. Gao shared findings at regional and national conferences (and received a travel award), thus engaging with communication and professional networking opportunities.
How have the results been disseminated to communities of interest? Results have been shared in a presentation at a NIFA-PD meeting (Blaustein, 2024), a presentation at the American Society for Microbiology: Conference on Rapid Applied Microbial Next-Generation Sequencing and Bioinformatic Pipelines (Gao et al. 2024), a poster at the International Association for Food Protection Annual Meeting (Blaustein and Lam, 2023), and a poster at the American Society for Microbiology: Maryland Branch regional meeting (Gao et al. 2024). Themes related to applications of artificial intelligence and machine learning to improve food safety were further discussed in a panel at IFT-First (Blaustein, 2024) and in a collaborative review article (Ma et al. 2024).
What do you plan to do during the next reporting period to accomplish the goals? The next reporting period will focus on publishing research from Objective 1 and Objective 2. Two publications are currently in the preparation stage (Cronobacter pangenome, metagenome simulation benchmarking), and another 1-2 focusing on genomic diversity of Campylobacter and Listeria are anticipated during the next period as well. The seed project will provide foundational data that will be used to target continued funding support on themes from this research in future grants that will be prepared during the next reporting period.

Impacts
What was accomplished under these goals? Objective 1: The pangenomics workflow has been applied to characterize the biogeography of genomic diversity of Cronobacter sakazakii. Our analytical framework involved developing a novel machine learning model to improve the structure of big data from 'NCBI Pathogen Detection' that was sourced for the analysis. Key gene clusters and functions encoded by Cronobacter have been found to be correlated with geographic region and source type of diverse isolates of the foodborne pathogen. The reproducible workflow is set to be applied to Listeria monocytogenes and Campylobacter jejuni under the continued research plan. Objective 2: Metagenomes representative of different pathogen-food microbiome systems have been simulated, i.e., Campylobacter-chicken meat microbiota, Listeria-dairy milk microbiota, and Cronobacter-powdered food microbiota. Alternative metagenome analytical workflows are being benchmarked for efficacy in applications for molecular surveillance as a function of pathogen concentration among the microbiota and pathogen strain type.
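One way to summarize a benchmark of this kind is to compare each tool's predicted pathogen abundance against the known abundance in the simulated community, grouped by spiked-in concentration. The Python sketch below is only an assumption-laden illustration: the tool name, lineage labels, and abundance values are invented, and detection rate and mean relative error are example metrics, not necessarily those used in the project.

```python
from statistics import mean

# Hypothetical benchmark runs: (tool, strain, true_fraction, predicted_fraction).
# All names and numbers are invented for illustration.
RUNS = [
    ("toolX", "lineage1", 0.01, 0.008),
    ("toolX", "lineage2", 0.01, 0.012),
    ("toolX", "lineage1", 0.001, 0.0),
    ("toolX", "lineage2", 0.001, 0.0005),
]


def summarize(runs):
    """Group runs by (tool, true pathogen fraction) and score predictions."""
    grouped = {}
    for tool, _strain, true_frac, pred_frac in runs:
        grouped.setdefault((tool, true_frac), []).append((true_frac, pred_frac))

    summary = {}
    for key, pairs in grouped.items():
        summary[key] = {
            # Share of runs in which the pathogen was reported at all.
            "detection_rate": mean(1.0 if pred > 0 else 0.0 for _, pred in pairs),
            # Average |predicted - true| / true across strains.
            "mean_rel_error": mean(abs(pred - true) / true for true, pred in pairs),
        }
    return summary


results = summarize(RUNS)
for (tool, true_frac), metrics in sorted(results.items()):
    print(tool, true_frac, metrics)
```

Grouping by strain instead of concentration, using the same function shape, would expose lineage-specific detection gaps.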

Publications

  • Type: Journal Articles Status: Accepted Year Published: 2024 Citation: Ma P, Jia X, Gao M, Yi Z, Tsai S, He Y, Zhen D, Blaustein RA, Wang Q, Wei C. Innovative Food Supply Chain through Spatial computing technologies: A review. Comprehensive Reviews in Food Science and Food Safety.
  • Type: Conference Papers and Presentations Status: Published Year Published: 2023 Citation: Blaustein R, Lam K. 2023. Standardized workflow to define the biogeography of genomic diversity of foodborne pathogens. International Association for Food Protection Annual Meeting. Toronto, Ontario.
  • Type: Conference Papers and Presentations Status: Published Year Published: 2024 Citation: Gao M, Pradhan A, Blaustein R. 2024. Standardized pan-genomic workflow demonstrates genomic variation of C. sakazakii across different isolation sources. American Society for Microbiology Conference on Rapid Applied Microbial Next-Generation Sequencing and Bioinformatic Pipelines. Washington D.C.