Source: CORNELL UNIVERSITY submitted to NRP
USING WHOLE GENOME SEQUENCING DATA TO IMPROVE MYCOBACTERIUM BOVIS OUTBREAK INVESTIGATION EFFICIENCY
Sponsoring Institution
National Institute of Food and Agriculture
Project Status
COMPLETE
Funding Source
Reporting Frequency
Annual
Accession No.
1022937
Grant No.
2020-67034-31732
Cumulative Award Amt.
$120,000.00
Proposal No.
2019-07280
Multistate No.
(N/A)
Project Start Date
Jun 15, 2020
Project End Date
Jun 14, 2022
Grant Year
2020
Program Code
[A7101]- AFRI Predoctoral Fellowships
Recipient Organization
CORNELL UNIVERSITY
(N/A)
ITHACA,NY 14853
Performing Department
Pop. Med. & Diag. Sci.
Non Technical Summary
Bovine tuberculosis (bTB) is an important agricultural disease from both an economic and a public health perspective. Mycobacterium bovis, the cause of bTB, has a wide host range including cattle, white-tailed deer (Odocoileus virginianus), and humans, and causes decreased milk and meat production in cattle. Humans can become infected by contact with infected animals or, more commonly, through consumption of unpasteurized milk. Risk of zoonotic bTB infection has been greatly reduced in developed countries by a combination of effective test and slaughter control programs and milk pasteurization; however, M. bovis is still responsible for 1-2% of human tuberculosis cases in the US. Due to the risk of zoonotic infection and the production losses caused by bTB, the US instituted a bTB eradication program in 1917. This program includes slaughter surveillance and a lengthy testing protocol for infected herds, estimated to cost over $1.5 million in a 1,000-cow dairy herd. Despite over 100 years of directed national control, outbreaks still occur in cattle every year.
There are two explanations for the difficulty in eliminating bTB. First, bTB infection in cattle remains subclinical, and potentially undetected, for months to years until lesions become large enough to impair organ function. Second, there are multiple sources of bTB infection in cattle, including spillover from wildlife, animal movement, zoonotic transmission, and fomite transmission. The only known wildlife reservoir of bTB in the US is a population of white-tailed deer in Michigan, and cross-species transmission remains an important source of cattle infection there. Cattle trade between Mexico and the US is common, and bTB prevalence is 14.2% in some regions of Mexico. Movement of animals from Mexico to the US could result in transmission; however, animal movement and wildlife do not account for all outbreaks in the US. Remaining cases may be linked to zoonotic transmission.
Because of the potential lag time between infection and detection, and because of the multiple possible sources of infection, identifying the source of infection in a herd is difficult. Although thorough sampling is routine in traceback investigations, animal movement records are often incomplete, and other potential sources of infection, including zoonotic transmission, are difficult to characterize. To maximize the information available from outbreak investigations, whole genome sequencing (WGS) of M. bovis isolates from infected herds was incorporated into the bTB eradication program by the National Veterinary Services Laboratories in 2013. We will fill a critical gap in determining the source of an outbreak by predicting time and location of infection. Our objective is to utilize M. bovis WGS data to increase efficiency in outbreak investigations by 1) estimating time since herd infection and drivers of variation in M. bovis evolutionary rate, and 2) predicting the geographical source of outbreaks by analyzing pan-genomic population structure.
Animal Health Component
25%
Research Effort Categories
Basic
75%
Applied
25%
Developmental
0%
Classification

Knowledge Area (KA)    Subject of Investigation (SOI)    Field of Science (FOS)    Percent
311                    4010                              1170                      50%
311                    7310                              2090                      50%
Goals / Objectives
Despite over 100 years of a national eradication program, bovine tuberculosis remains an important cattle disease in the United States and around the world. Mycobacterium bovis, the causative agent of bovine tuberculosis, has a wide host range including humans, making disease control in animal populations a public health priority. Bovine tuberculosis is difficult to control because of a long latency period before infected animals show clinical signs of disease, and because transmission patterns are often ambiguous due to limited animal movement records, cross-species transmission from infected wildlife, and potential zoonotic transmission from humans to animals.
Our central objective is to utilize M. bovis whole genome sequencing data to increase efficiency in outbreak investigations. Our supporting objectives are to:
  • Estimate time since herd infection and investigate drivers of variation in M. bovis evolutionary rate.
  • Predict the geographical source of outbreaks by analyzing pan-genomic population structure.
Through close collaboration with partners at the National Veterinary Services Laboratories and the Center for Epidemiology and Animal Health, predictive models developed in this proposal will be incorporated into ongoing and future M. bovis outbreak investigations to narrow down the source of an outbreak in both time and space. Our project aligns closely with the Farm Bill Priority area: Animal health and production and animal products. Research produced in this project will have a direct impact on production animal health by helping to decrease M. bovis transmission, contributing to the ultimate goal of eliminating bovine tuberculosis from the United States and the world.
Project Methods
Efforts:
Estimating the amount of time since a herd was infected: Determining the time since herd infection depends on accurately estimating the evolutionary rate of M. bovis, or the relationship between the bacteria's mutation rate and evolutionary time. To estimate time since infection, we will develop Bayesian phylogenetic models and a novel convolutional neural network based tool. First, we will use BEAST2 to build time-calibrated phylogenies and estimate the evolutionary rate of different M. bovis lineages. Estimates of evolutionary rate will then be incorporated into linear models capable of predicting time since herd infection in an outbreak. Second, we will build a convolutional neural network that will predict the time since a herd was infected using an alignment of outbreak M. bovis sequences. The model will be trained on simulated outbreak sequences with associated time data. The fitted neural network will then be tested using real outbreak data with reliable historical estimates of the amount of time a herd was infected before the outbreak was detected, and the estimates will be compared to those generated using the Bayesian phylogenetic approach.
Estimating the geographical source of an outbreak: Two population structure algorithms will be applied to estimate geographic structure. The first, RhierBAPS, an implementation of hierBAPS in the programming language R, is a model-based population structure algorithm that incorporates sample geographic coordinates and a Markovian sequence clustering model to estimate population structure. hierBAPS cannot incorporate accessory genes, so it will be run using only core genes, which are present in every sequence in the database. The second algorithm, popPUNK, incorporates both core and accessory genes. popPUNK calculates the pairwise Jaccard distance between sequence k-mers to estimate sequence divergence. Clusters are then created from core and accessory distances using Gaussian mixture models. An undirected network is then created for each cluster from samples (nodes) connected by distances (edges). This network then serves as a reference network, which can be updated to incorporate information from new sequences. We will compare the accuracy of these two algorithms on labeled data and produce a dynamic database of geographic results.
Evaluation: For each aim, two separate methodologies will be used to generate estimates. In aim 1, the estimate of time since herd infection generated by the Bayesian phylogenetic approach will be compared to estimates generated from the deep learning approach. Both estimates will be compared against real outbreak data, where epidemiologic investigation led to conclusive evidence for a time associated with M. bovis introduction into a herd. In the second aim, the accuracy of hierBAPS and popPUNK will be compared to real outbreak data with accurate animal source data.
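The pairwise k-mer Jaccard distance mentioned in the geographic source methods can be illustrated with a minimal sketch. This is not the popPUNK implementation, which works on sketches of whole genomes; it only computes an exact Jaccard distance on short toy strings. The function names, the toy sequences, and the k-mer length are assumptions chosen for illustration.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings (k-mers) of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard_distance(seq_a, seq_b, k=3):
    """Exact Jaccard distance between the k-mer sets of two sequences.

    0.0 means identical k-mer content, 1.0 means no shared k-mers.
    """
    a, b = kmers(seq_a, k), kmers(seq_b, k)
    union = a | b
    if not union:
        # Both sequences are shorter than k: treat them as indistinguishable.
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Hypothetical toy sequences, not real M. bovis data.
isolate_1 = "ACGTAC"
isolate_2 = "ACGTTC"  # differs from isolate_1 at one position

# The two isolates share 2 of 6 distinct 3-mers, so the distance is 1 - 2/6.
distance = jaccard_distance(isolate_1, isolate_2, k=3)
print(round(distance, 4))
```

A single substitution changes every k-mer that overlaps it, which is why even closely related sequences can show a sizeable Jaccard distance at small sequence lengths.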

Progress 06/15/20 to 06/14/22

Outputs
Target Audience: The target audience of this work is the agencies and individuals involved in the US National Bovine Tuberculosis Eradication Program, including collaborators at the National Veterinary Services Laboratories and other academic institutions around the world interested in Mycobacterium bovis and Mycobacterium tuberculosis complex evolutionary biology.
Changes/Problems: Nothing Reported
What opportunities for training and professional development has the project provided? Kristina attended two virtual workshops as part of the University of Washington Summer Institute for Statistical Genomics series:
  • Workshop 1: Population Genetics
  • Workshop 2: MCMC for Genetics
Kristina also participated in the following weekly seminars:
  • Epi seminar fall 2020: SARS-CoV-2 epidemiology and public health interventions.
  • Epi seminar spring 2021: Decision Making: Is our research helping, hindering, or making any difference?
  • BTRY 6890: Population Genetics journal club
Kristina also took Human Genomics (BioMG 6871) in the fall 2021 semester for general interest and to learn more about vertebrate genomics. She also attended the 2020 and 2021 Conferences for Research Workers in Animal Disease, as well as several virtual conferences including the Biodiversity Genomics conference and the Planetary Health meeting. Lastly, Kristina successfully defended her PhD thesis in May 2022.
How have the results been disseminated to communities of interest? Kristina gave a virtual oral presentation at the Conference for Research Workers in Animal Disease (CRWAD), 2020, Chicago IL, titled: Exploring mechanisms of accessory genome evolution in the clonally evolving Mycobacterium tuberculosis complex. She also presented a talk on the M. bovis pangenome work at the Conference for Research Workers in Animal Disease in Chicago IL in December 2021 and was awarded the American College of Veterinary Microbiologists Don Kahn Award for Best Overall Presentation. This work was published in Microbial Genomics in 2022. Kristina also participated in quarterly Zoom meetings and regular email communication with US National Bovine TB Eradication Program leaders including Dr. Suelee Robbe-Austerman, Dr. Kathy Orloski, Dr. Tyler Thacker and Dr. Claudia Perea.
What do you plan to do during the next reporting period to accomplish the goals? Nothing Reported

Impacts
What was accomplished under these goals?
Goal 1. Estimating time since herd infection and investigating drivers of variation in M. bovis evolutionary rate.
Through Kristina's training, she learned that success of goal 1 relied heavily upon understanding how M. bovis evolves within hosts and between hosts, and how that evolution, quantified by acquired mutations, is related to time. Because a deep understanding of within host evolution is necessary to develop predictive models, Kristina developed additional models beyond what was originally described in the project proposal and conducted careful sequence analyses to better understand M. bovis genome dynamics within hosts.
Year 1 Accomplishments: Development of a simulation framework to study within host evolution. Our first task was to study how different evolutionary forces, including population bottlenecks and clonal progeny skew, impact the trajectory of mutations in a population, using a parameter estimation framework that does not rely on narrow assumptions such as that of equal variance offspring distributions in Kingman's coalescent. Kristina developed a forward genetic simulation structure to study how demographic forces and infection dynamics influence the evolution of M. bovis. This framework was further developed in year 2.
Year 2 Accomplishments: Testing evolutionary hypotheses using simulation. We studied the within host evolution of M. bovis using the empirical distribution of mutations that arise de novo within an animal host and forward genetic simulation to test hypotheses for evolutionary mechanisms within the host. We found that the empirical distribution of mutations does not deviate from a null expectation of random mutation, suggesting that drift, not selection, is the primary evolutionary force driving mutation during infection in animal hosts. We also found that a within host evolutionary scenario involving skewed offspring distributions, panmictic within host populations, and a rapid mutation rate was best supported by the data. The diversity generating effect of a faster mutation rate is counteracted by the diversity reducing effect of large variance in offspring distributions, where a minority of individuals produces offspring in the next generation. Our simulation framework can be applied to other aspects of M. bovis molecular epidemiology, including transmission analysis and studying the evolution of important phenotypes like antimicrobial resistance.
During this time Kristina also developed a machine learning prediction framework to estimate characteristics of an outbreak using whole genome sequencing data. We are currently testing convolutional neural networks (CNNs) and other machine learning methods to predict the relationship between inputs (outbreak SNP alignments and indel presence/absence data) and time. CNNs are a class of deep learning models commonly used in image processing and are designed to recognize patterns in data without needing to compress input data into a predefined feature vector. A CNN framework was selected for this problem because of the ability of CNNs to accurately identify evolutionary processes from labeled population genetic data, with equal or greater accuracy than other methods. Additionally, CNNs take matrices as inputs, and sequence alignments can be easily formatted as matrices without loss of genome position information. The fitted model will allow us to estimate the relationship between mutations and real time to provide a method for determining the time since a herd was infected. The model will also be capable of estimating the population bottleneck size, analogous to the transmission dose, which is currently unknown. Furthermore, this modeling framework provides an alternative to coalescent-based methods for studying outbreak dynamics in clonally evolving, intracellular pathogens.
Goal 2. Predict the geographical source of outbreaks by analyzing pan-genomic population structure.
Recent work published in 2021 suggested that M. bovis has an open pangenome, implying that new genes would be discovered with each new genome sequenced. Since M. bovis is thought to evolve strictly clonally, with no known mechanism for horizontal gene transfer, we were skeptical of this result. In the absence of horizontal gene transfer, new genes should not be acquired by mechanisms other than duplication and mutation, so M. bovis should in theory have a limited capacity for new gene evolution compared to other prokaryotes.
Year 1 Accomplishments: Whole genome de novo assembly and pangenome characterization. Before addressing our goal of utilizing gene content variation in outbreak analysis, we needed to confirm whether the M. bovis pangenome is truly open, or if sequencing, annotation, or assembly errors artificially inflated the accessory gene count and pangenome size in previous studies. First, Kristina de novo assembled a sample of 1463 globally distributed M. bovis genomes, representative of all known M. bovis lineages. She then constructed the pangenome and developed a series of quality control analyses to ensure that each accessory gene identified was truly a variable gene, and not an annotation error.
Year 2 Accomplishments: Using this large sample of M. bovis genomes, and thorough bioinformatic analyses involving additional non-standard quality control procedures, we confirmed that the M. bovis pangenome is in fact compact, consistent with a closed pangenome and ongoing clonal evolution. Kristina also found that indel variation is common among outbreak sequences, even with similar or identical SNP patterns, so we concluded that although gene content variation is very limited in M. bovis, indel variation could be a useful source of information in outbreak analysis. This project was published in Microbial Genomics in 2022.
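The distinction between an open and a closed pangenome discussed under Goal 2 can be illustrated with a gene accumulation curve: the cumulative number of distinct genes seen as genomes are added one at a time. The sketch below is an illustration only and is not the quality control pipeline used in the project. Gene identifiers, toy genomes, and function names are hypothetical, and a real analysis would typically average the curve over many random genome orderings.

```python
def accumulation_curve(genomes):
    """Cumulative pangenome size after adding each genome in order.

    genomes: list of sets, each holding the gene identifiers found in one genome.
    """
    seen = set()
    sizes = []
    for genes in genomes:
        seen |= genes
        sizes.append(len(seen))
    return sizes


def new_genes_per_genome(sizes):
    """Number of previously unseen genes contributed by each genome."""
    return [sizes[0]] + [later - earlier for earlier, later in zip(sizes, sizes[1:])]


# Hypothetical toy data: variation arises only through gene loss (closed pattern).
closed_like = [
    {"g1", "g2", "g3", "g4"},
    {"g1", "g2", "g3"},        # g4 deleted
    {"g1", "g2", "g3", "g4"},
    {"g1", "g2", "g4"},        # g3 deleted
]

# Hypothetical toy data: every genome brings a new gene (open pattern).
open_like = [{"a"}, {"a", "b"}, {"a", "b", "c"}]

print(new_genes_per_genome(accumulation_curve(closed_like)))  # [4, 0, 0, 0]
print(new_genes_per_genome(accumulation_curve(open_like)))    # [1, 1, 1]
```

In the closed-like toy data the curve plateaus after the first genome, whereas in the open-like data each genome keeps adding genes. Annotation errors that create spurious accessory genes would push a truly closed curve toward the open pattern, which is the concern the quality control analyses address.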

Publications

  • Type: Journal Articles Status: Accepted Year Published: 2022 Citation: Ceres, Kristina M., Stanhope, Michael J., Grohn, Yrjo T. "A critical evaluation of Mycobacterium bovis pangenomics, with reference to its utility in outbreak investigation." Microbial Genomics (2022).


Progress 06/15/21 to 06/14/22

Outputs
Target Audience: The target audiences reached in the June 2021-June 2022 funding period include 1) USDA National Veterinary Services Laboratories scientists through virtual meetings, and 2) the general animal health community through presentation at the Conference for Research Workers in Animal Disease in Chicago, IL, December 2021.
Changes/Problems: Nothing Reported
What opportunities for training and professional development has the project provided? I took Human Genomics (BioMG 6871) in the fall 2021 semester for general interest and to learn more about vertebrate genomics. In addition to the 2021 Conference for Research Workers in Animal Disease, I also attended several virtual conferences including the Biodiversity Genomics conference and the Planetary Health meeting.
How have the results been disseminated to communities of interest? I presented a talk on the M. bovis pangenome work at the Conference for Research Workers in Animal Disease in Chicago IL in December 2021 and was awarded the American College of Veterinary Microbiologists Don Kahn Award for Best Overall Presentation. This work has also been accepted and is pending open access publication in Microbial Genomics.
What do you plan to do during the next reporting period to accomplish the goals? This is the final reporting period; however, I plan to continue work on the convolutional neural network through the end of 2022. Specifically, I will use the training database developed during the funding period to train the convolutional neural network. The training process will include hyperparameter tuning to achieve optimal prediction accuracy. The model will then be tested on United States 2021-2022 M. bovis outbreak data provided by USDA partners at the National Veterinary Services Laboratories.

Impacts
What was accomplished under these goals?
Goal 1. Estimating time since herd infection and investigating drivers of variation in M. bovis evolutionary rate.
Success of goal 1 relies upon understanding how M. bovis evolves within hosts and between hosts, and how that evolution, quantified by acquired mutations, is related to time. Our first task was to study how different evolutionary forces, including population bottlenecks and clonal progeny skew, impact the trajectory of mutations in a population, using a parameter estimation framework that does not rely on narrow assumptions such as that of equal variance offspring distributions in Kingman's coalescent.
Goal 1a. Develop a parameter estimation framework for studying M. bovis within host evolution using forward genetic simulation. We studied the within host evolution of M. bovis using the empirical distribution of mutations that arise de novo within an animal host and forward genetic simulation to test hypotheses for evolutionary mechanisms within the host. We found that the empirical distribution of mutations does not deviate from a null expectation of random mutation, suggesting that drift, not selection, is the primary evolutionary force driving mutation during infection in animal hosts. We also found that a within host evolutionary scenario involving skewed offspring distributions, panmictic within host populations, and a rapid mutation rate was best supported by the data. The diversity generating effect of a faster mutation rate is counteracted by the diversity reducing effect of large variance in offspring distributions, where a minority of individuals produces offspring in the next generation. Our simulation framework can be applied to other aspects of M. bovis molecular epidemiology, including transmission analysis and studying the evolution of important phenotypes like antimicrobial resistance.
Goal 1b. Using the modeling framework designed in Goal 1a, develop a prediction model to estimate characteristics of an outbreak using whole genome sequencing data. We are developing a convolutional neural network (CNN) to predict the relationship between inputs (outbreak SNP alignments and indel presence/absence data) and time. CNNs are a class of deep learning models commonly used in image processing and are designed to recognize patterns in data without needing to compress input data into a predefined feature vector. A CNN framework was selected for this problem because of the ability of CNNs to accurately identify evolutionary processes from labeled population genetic data, with equal or greater accuracy than other methods. Additionally, CNNs take matrices as inputs, and sequence alignments can be easily formatted as matrices without loss of genome position information. The fitted model will allow us to estimate the relationship between mutations and real time to provide a method for determining the time since a herd was infected. The model will also be capable of estimating the population bottleneck size, analogous to the transmission dose, which is currently unknown. Furthermore, this modeling framework provides an alternative to coalescent-based methods for studying outbreak dynamics in clonally evolving, intracellular pathogens.
Goal 1a status: Analysis is complete, manuscript draft is complete, publication pending; anticipated completion August 2022
Goal 1b status: Simulation design is complete, training database runs in progress; anticipated completion December 2022
Goal 2. Predict the geographical source of outbreaks by analyzing pan-genomic population structure.
Recent work published in 2021 suggested that M. bovis has an open pangenome, implying that new genes would be discovered with each new genome sequenced. Since M. bovis is thought to evolve strictly clonally, with no known mechanism for horizontal gene transfer, we were skeptical of this result. In the absence of horizontal gene transfer, new genes should not be acquired by mechanisms other than duplication and mutation, so M. bovis should in theory have a limited capacity for new gene evolution compared to other prokaryotes. Before addressing our goal of utilizing gene content variation in outbreak analysis, we needed to confirm whether the M. bovis pangenome is truly open, or if sequencing, annotation, or assembly errors artificially inflated the accessory gene count and pangenome size in previous studies. First, I de novo assembled a sample of 1463 globally distributed M. bovis genomes, representative of all known M. bovis lineages. I then constructed the pangenome and developed a series of quality control analyses to ensure that each accessory gene identified was truly a variable gene, and not an annotation error. Using this large sample, and thorough bioinformatic analyses involving additional non-standard quality control procedures, we confirmed that the M. bovis pangenome is in fact compact, consistent with a closed pangenome and ongoing clonal evolution. I also found that indel variation is common among outbreak sequences, even with similar or identical SNP patterns, so we concluded that although gene content variation is very limited in M. bovis, indel variation could be a useful source of information in outbreak analysis.
Goal 2 status: complete, manuscript accepted and awaiting publication at Microbial Genomics
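The diversity reducing effect of skewed offspring distributions described under Goal 1a can be illustrated with a small forward simulation. This is a minimal sketch under strong assumptions, not the simulation framework developed in the project: generations are discrete, population size is constant, there is no mutation or selection, and skew is modeled by letting only a random fraction of individuals reproduce. The function names and all parameter values are hypothetical.

```python
import random


def next_generation_parents(pop_size, breeder_fraction, rng):
    """Choose a parent index for every offspring.

    Only a randomly drawn subset of the population ("breeders") may reproduce;
    a small breeder_fraction gives a highly skewed offspring distribution.
    """
    n_breeders = max(1, int(pop_size * breeder_fraction))
    breeders = rng.sample(range(pop_size), n_breeders)
    return [rng.choice(breeders) for _ in range(pop_size)]


def simulate_lineage_survival(pop_size, breeder_fraction, generations, seed=0):
    """Track how many founding lineages still have descendants in each generation."""
    rng = random.Random(seed)
    lineage = list(range(pop_size))  # every founder starts as its own lineage
    surviving = []
    for _ in range(generations):
        parents = next_generation_parents(pop_size, breeder_fraction, rng)
        lineage = [lineage[p] for p in parents]  # offspring inherit parent lineage
        surviving.append(len(set(lineage)))
    return surviving


# Hypothetical parameter values chosen only to contrast the two scenarios.
skewed = simulate_lineage_survival(pop_size=100, breeder_fraction=0.1, generations=20)
uniform = simulate_lineage_survival(pop_size=100, breeder_fraction=1.0, generations=20)
print("skewed :", skewed[:5])
print("uniform:", uniform[:5])
```

With a breeder fraction of 0.1, at most ten founding lineages can survive the first generation, so standing diversity collapses much faster than when every individual may reproduce. A faster mutation rate would be needed to maintain the same level of diversity, which is the trade-off the text describes.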

Publications

  • Type: Journal Articles Status: Accepted Year Published: 2022 Citation: Ceres, Kristina M., Stanhope, Michael J., Grohn, Yrjo T. "A critical evaluation of Mycobacterium bovis pangenomics, with reference to its utility in outbreak investigation." Microbial Genomics (2022).


Progress 06/15/20 to 06/14/21

Outputs
Target Audience: The target audience was our USDA partners involved in the US National Bovine Tuberculosis Eradication Program. Through regular meetings in 2020 and 2021, we refined our project plan to optimize collaboration and Kristina's training goals.
Changes/Problems: Nothing Reported
What opportunities for training and professional development has the project provided? Kristina attended two virtual workshops as part of the University of Washington Summer Institute for Statistical Genomics series:
  • Workshop 1: Population Genetics
  • Workshop 2: MCMC for Genetics
Kristina also participated in the following weekly seminars:
  • Epi seminar fall 2020: SARS-CoV-2 epidemiology and public health interventions.
  • Epi seminar spring 2021: Decision Making: Is our research helping, hindering, or making any difference?
  • BTRY 6890: Population Genetics journal club
How have the results been disseminated to communities of interest?
  • Virtual oral presentation at the Conference for Research Workers in Animal Disease (CRWAD), 2020, Chicago IL: Exploring mechanisms of accessory genome evolution in the clonally evolving Mycobacterium tuberculosis complex
  • Quarterly Zoom meetings and regular email communication with US National Bovine TB Eradication Program leaders including Dr. Suelee Robbe-Austerman, Dr. Kathy Orloski, Dr. Tyler Thacker and Dr. Claudia Perea.
What do you plan to do during the next reporting period to accomplish the goals?
Goal 1a: Understand the distinct roles of selection, demography, and progeny skew in M. bovis evolution over the course of an outbreak in a cattle herd.
  • Complete simulations
  • Submit manuscript 3 (Evolutionary pressures governing Mycobacterium bovis within and between host evolution)
Goal 1b: Develop a convolutional neural network (CNN) to predict the relationship between inputs (outbreak SNP alignments) and time.
  • Complete test data simulations
  • Develop CNN structure and tune hyperparameters
  • Submit manuscript 4 (Using convolutional neural networks to uncover Mycobacterium bovis outbreak population dynamics)
Goal 2: Predict the geographical source of outbreaks by analyzing pan-genomic population structure.
  • Publish manuscript 1 (Mycobacterium bovis pangenome evolution)
  • Publish manuscript 2 (Genotype to outbreak source prediction using deep learning)

Impacts
What was accomplished under these goals?
Goal 1: Estimate time since herd infection and investigate drivers of variation in M. bovis evolutionary rate.
The focus of project 1 is to link M. bovis genetic diversity generated over the course of an outbreak to time, which depends on accurate estimation of mutation rate. Most methods for combining sampling time data with sequence data rely on a set of assumptions about the underlying mutation generating process, including the assumption that each individual in a population has an equal chance of creating offspring. However, recent work has shown that M. tuberculosis has skewed offspring distributions caused by clonal replication and serial population bottlenecks created by transmission. In light of this recent work, we have modified Goal 1 to incorporate two parts.
Goal 1a is first to understand the distinct roles of selection, demography, and progeny skew in M. bovis evolution over the course of an outbreak in a cattle herd. Our hypothesis is that the dominant evolutionary mechanism producing site frequency spectra skewed toward rare variants is demography: transmission bottlenecks and subsequent compartmentalization of M. bovis within the host lead to highly structured populations. Results from this project will be used to build simulation models to accurately depict M. bovis within host evolution.
Goal 1b is to develop a convolutional neural network (CNN) to predict the relationship between inputs (outbreak SNP alignments) and time. CNNs are a class of deep learning models commonly used in image processing and are designed to recognize patterns in data without needing to compress input data into a predefined feature vector. A CNN framework was selected for this problem because of the ability of CNNs to accurately identify evolutionary processes from labeled population genetic data, with equal or greater accuracy than other methods. Additionally, CNNs take matrices as inputs, and sequence alignments can be easily formatted as matrices without loss of genome position information. The fitted model will allow us to estimate the relationship between mutations and real time to provide a method for determining the time since a herd was infected. The model will also be capable of estimating the population bottleneck size, analogous to the transmission dose, which is currently unknown. Furthermore, this modeling framework provides an alternative to coalescent-based methods for studying outbreak dynamics in clonally evolving, intracellular pathogens.
Goal 1 status: in progress
  • Empirical data: Outbreak sequences have been identified, and phylogenetic trees and summary statistics for each outbreak have been created.
  • Simulation data: Key simulation parameter values are being estimated in coordination with USDA partners. After these parameters are estimated, simulations will commence on high powered computers.
  • Test data: Simulations have been designed to generate test data. Exact parameter values depend on results from Goal 1a.
  • CNN structure: I have conducted a literature review to determine the best starting places for CNN structure design. During the testing process, model hyperparameters will be tuned to produce optimal results.
Goal 2: Predict the geographical source of outbreaks by analyzing pan-genomic population structure.
The purpose of project 2 is first to determine how the M. bovis pangenome has evolved over space and time, and then to develop a model that can predict geographic location from M. bovis genome sequences. This project involves determining patterns of evolution in the core genome (genes shared by all members of the species) and the accessory genome (genes not present in all members), and determining whether these partitions of the pangenome evolve clonally. M. bovis is thought to be a strictly clonal pathogen, with no known mechanisms for ongoing horizontal gene transfer. Therefore, my overarching hypothesis is that the accessory genome evolves through serial gene deletions, and that these deletions, along with purifying selection and serial bottlenecks created by transmission, cause strong population structure. This strong population structure will lead to genetic signatures that are specific to and diagnostic of geographic location. I have found evidence of events that look like gene deletion, consistent with my hypothesis, and, surprisingly, strong evidence for gene addition. Upon closer inspection of the gene addition events, I have found that many have arisen from gene duplication events, suggesting that gene duplication and subsequent mutation, instead of horizontal gene transfer, may be an important mechanism for generating gene content diversity in M. bovis. I am currently building supervised machine learning models with SNP data alone, and with SNP data and gene presence/absence data together, to explore whether gene presence/absence information increases prediction accuracy.
Goal 2 status: near complete
  • Evolutionary analysis: complete, manuscript in preparation
  • Geographic analysis: Genotype and location data were acquired from USDA partners, deep learning model in progress.
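The statement under Goal 1b that sequence alignments can be formatted as matrices without losing genome position information can be illustrated with a one-hot encoding. The report does not specify which encoding is used for the CNN, so the sketch below shows one common choice only as an assumption. The function name, the base ordering, and the toy alignment are hypothetical.

```python
BASES = "ACGT"  # assumed channel order for the one-hot encoding


def one_hot_alignment(alignment):
    """Encode an alignment as a nested list of shape [n_samples][n_sites][4].

    alignment: list of equal-length strings, one aligned sequence per sample.
    Characters other than A, C, G, T (for example N or a gap) become all zeros.
    """
    if not alignment:
        return []
    n_sites = len(alignment[0])
    if any(len(seq) != n_sites for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    encoded = []
    for seq in alignment:
        encoded.append([[1 if base == b else 0 for b in BASES] for base in seq.upper()])
    return encoded


# Hypothetical SNP alignment for three isolates from one outbreak.
snp_alignment = [
    "ACGT",
    "ACGA",
    "ANGT",  # ambiguous call at the second site
]

matrix = one_hot_alignment(snp_alignment)
print(len(matrix), len(matrix[0]), len(matrix[0][0]))  # 3 4 4
```

Rows correspond to isolates and columns to alignment sites, so the column index preserves genome order. Indel presence/absence data could be appended as additional 0/1 columns in the same layout.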

Publications