Source: MICROBITYPE LLC submitted to NRP
POLYMORPHIC LOCUS SEQUENCE TYPING OF FOODBORNE PATHOGENS
Sponsoring Institution
National Institute of Food and Agriculture
Project Status
COMPLETE
Funding Source
Reporting Frequency
Annual
Accession No.
1012521
Grant No.
2017-33610-26803
Cumulative Award Amt.
$75,000.00
Proposal No.
2017-00400
Multistate No.
(N/A)
Project Start Date
Jun 15, 2017
Project End Date
Sep 14, 2018
Grant Year
2017
Program Code
[8.5]- Food Science & Nutrition
Recipient Organization
MICROBITYPE LLC
5110 CAMPUS DR STE 170
PLYMOUTH MEETING,PA 19462
Performing Department
(N/A)
Non Technical Summary
This SBIR (Small Business Innovation Research) Phase I project is responsive to National Challenge Area: Food Safety, and to USDA Strategic Goal 4.3 - Protect Public Health by Ensuring Food is Safe (www.ocfo.usda.gov/usdasp/usdasp.htm). The detection and investigation of foodborne outbreaks associated with pathogens such as E. coli O157 is highly dependent on strain typing. The typing methods employed must provide sufficient strain resolution to justify the commitment of considerable resources to epidemiological fieldwork. Ideally, the strain typing method is also expedient, since time is a critical factor in tracking pathogens to their food and environmental sources. Finally, if an affordable, user-friendly, outsourced, and confidential strain typing service were available to food processors, it would have considerable potential for tracking down pathogens within the food chain before an outbreak occurs.
Animal Health Component
100%
Research Effort Categories
Basic
(N/A)
Applied
100%
Developmental
(N/A)
Classification

Knowledge Area (KA)    Subject of Investigation (SOI)    Field of Science (FOS)    Percent
712                    4010                              1100                      100%
Goals / Objectives
Strain typing plays a central role in detection and investigation of foodborne outbreaks, particularly those mediated by Salmonella, Listeria, and Shiga toxigenic E. coli (STEC). To a much lesser extent, strain typing is used by food processors to monitor pathogens within their facilities (e.g., introduction of new strains or persistence of established strains), although increasing this practice would undoubtedly reduce outbreaks. The gold standard for strain typing has been pulsed-field gel electrophoresis (PFGE), but its multiple disadvantages have encouraged development of alternative methods, in particular whole genome sequence-based single nucleotide polymorphism analysis (WGS-SNP). Although providing high resolution, WGS-SNP requires major investments in equipment, reagents, and personnel, largely limiting its use to government-supported labs. Neither PFGE nor WGS-SNP is a practical alternative for food processors, and government labs would also benefit from a complementary sequence-based typing approach that is more rapid, less costly, and user-friendly. MicrobiType was founded in 2014 to address this need. Its technology platform, polymorphic locus sequence typing (PLST), is based on standard PCR and dideoxy sequencing, and is hence simple, affordable, and robust. Its novelty is in its targeting of specific genomic loci (patent pending) that have been bioinformatically determined to be the most phylogenetically informative, as a consequence of highly polymorphic tandem repeats. This Phase I proposal will build upon recently published and commercially implemented PLST services for Salmonella and Listeria by developing and evaluating similar services for STEC, along with the relatively neglected but increasingly important foodborne pathogens Campylobacter and Vibrio parahaemolyticus.
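To make the idea of a tandem repeat locus concrete, the following is a minimal sketch of a naive scan for exact tandem repeats. It is not the method used in the project (which relied on an existing tandem repeat database); the function name, the thresholds, and the toy sequence are assumptions chosen only for illustration.

```python
def find_tandem_repeats(seq, min_unit=6, max_unit=60, min_copies=3):
    """Return (start, unit_length, copies) tuples for exact tandem repeats.

    Naive illustration only: real repeat finders also tolerate mismatches
    and indels between copies, which this sketch does not.
    """
    hits = []
    n = len(seq)
    for unit in range(min_unit, max_unit + 1):
        i = 0
        while i + unit * min_copies <= n:
            motif = seq[i:i + unit]
            copies = 1
            # Count how many times the motif repeats back to back.
            while seq[i + copies * unit:i + (copies + 1) * unit] == motif:
                copies += 1
            if copies >= min_copies:
                hits.append((i, unit, copies))
                i += unit * copies  # skip past the run just recorded
            else:
                i += 1
    # Highest copy number first.
    return sorted(hits, key=lambda h: h[2], reverse=True)


# Hypothetical locus: a 6-base unit repeated 4 times between two flanks.
toy_locus = "GATTACAGGCAT" + "ACGTTG" * 4 + "CCATGGTTAACC"
print(find_tandem_repeats(toy_locus))
```

Strains that differ in copy number or carry a substitution inside one copy would yield different sequences at such a locus, which is the kind of polymorphism PLST reads directly by sequencing.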
Project Methods
The approach used to develop and evaluate PLST typing for STEC, C. jejuni/C. coli, and V. parahaemolyticus will be analogous to the approach previously used with Listeria and Salmonella. This general algorithm is outlined below, beginning with the bioinformatic identification of PLST loci and primer design (steps 1-9), followed by laboratory testing (steps 10-16). Finally, specific details relevant to each of the three species/groups are provided.

1. Tandem repeats are identified within the complete genome sequences from 3 representative strains using Tandem Repeats Database (https://tandem.bu.edu). From the 200 or so loci, approximately 20 with highest repeat number and lengths between 6 and 60 bases are downloaded along with 500 bases of flanking sequence.

2. Each repeat locus is used as query in BLASTN searches (https://blast.ncbi.nlm.nih.gov) of the GenBank Nucleotide and Genomes databases, and the locus is evaluated with respect to: (a) presence in all or nearly all strains of that species/group, (b) number of distinct alleles as approximated by the number of different "Max score" values, and (c) overlap, if any, with VNTRs from published MLVA methods and consideration of their diversity index and allele number.

3. For the 5 or so most promising loci based on the above, the full sequences (500 base flanks plus repeat regions) are downloaded from all strains in the Nucleotide and Genomes databases (separate BLASTN searches are used to determine if strains with the same Max score have identical or distinct sequences).

4. Downloaded sequences (reoriented as needed to reverse complement) are aligned with Clustal Omega (http://www.ebi.ac.uk/Tools/msa/clustalo), in PHYLIP format.

5. PHYLIP alignments are analyzed using dnapars (DNA parsimony; default parameters) in the PHYLIP package (version 3.69; http://evolution.genetics.washington.edu/phylip.html), which, importantly, weighs both insertions/deletions and SNPs. Dendrograms are generated using drawgram.

6. Dendrograms are evaluated with respect to (a) allele number, (b) available epidemiological data (e.g., isolates from same outbreak or food/environmental source), and (c) other typing data. The latter typically involves comparisons to other PLST loci, to available serotype data, to available PFGE data, to available WGS-SNP-based phylogenies, and to MLST sequence types (determined by downloading the 7 loci from each strain's genome sequence and submitting to the relevant database; e.g., http://pubmlst.org/campylobacter).

7. To estimate relative strain resolution, the diversity index (Simpson's dominance) is calculated (www.alyoung.com/labs/biodiversity_calculator.html).

8. For the 2 most promising PLST loci selected based on the above, the BLASTN search is repeated against the GenBank WGS database, followed again by sequence downloads, Clustal alignment, and dnapars/dendrogram analysis. For most foodborne pathogens, the WGS database includes hundreds or thousands of strains, and is thus an invaluable resource for this project. On the other hand, there are two common issues with this database: (a) longer repeat regions are often incomplete, since these genomes are more fragmented and Illumina-type sequencing technologies struggle with repeat regions; and (b) WGS genomes often lack annotation; i.e., source, year, and location of strain isolation. Thus, WGS-derived sequences must be carefully selected to avoid bias. Unfortunately, GenBank's short read archive (SRA) database is not useful for analyzing tandem repeat regions.

9. For primer design, Clustal alignments are used to identify conserved sequences within the upstream and downstream flanks, ideally separated by 800 to 900 bases (the limit for clearly readable dideoxynucleotide sequencing). Conserved sequences typically fall within protein-coding regions, so the reading frame is taken into consideration (i.e., avoiding the wobble position of codons at the primer 3' terminus). Candidate primers (3 upstream and 3 downstream, for comparison of amplification efficiency, for PCR versus sequencing, and for nested PCR if required) are further screened by BLASTN searches of the WGS database. Final adjustments are made to yield Tm ≈ 60°C before ordering (Integrated DNA Technologies).

10. DNA templates for PLST are prepared by simple and safe heat lysis in the client's lab (for this project, the "clients" are USDA-ARS consultants/collaborators). Specifically, isolated colonies are suspended in 200 µl Tris/EDTA (10/1 mM); turbidity should approximate McFarland standard 1. Tubes (screw-capped) are incubated in a 100°C heat block (or boiling water bath) for 15 min, with a cover to ensure bacterial killing throughout the tube. Tubes are transported to MicrobiType overnight at ambient temperature and without biohazard packaging (which substantially reduces shipping costs).

11. At MicrobiType, tubes are centrifuged to pellet cell debris. PCR is conducted with Taq polymerase as recommended by the manufacturer (New England BioLabs), with minor modifications. Template (0.5 to 2 µl of lysate) and cycle number (28 to 32) are adjusted according to the turbidity of the bacterial lysate. Template-free controls are included to rule out contamination. (In initial studies with representative templates, all 6 combinations of the 3 upstream and 3 downstream primers are tested; subsequently, the optimum primer pair is used.) (Time ≈ 2.5 h)

12. Aliquots from the PCR tubes are analyzed by conventional agarose gel electrophoresis and SYBR Safe staining (Invitrogen) to assess yield and quality of product. (Time ≈ 2.5 h)

13. For sequencing, PCR products (1 to 3 µl) are treated with ExoSAP-IT as recommended by the manufacturer (Affymetrix), sequencing primer is added to 2 µM, and samples are sent by courier to Genewiz (South Plainfield, NJ). Sequences are typically available online the following morning. (Time ≈ 18 h)

14. DNA sequences are edited as needed based on visual inspection of the chromatograms, and trimmed to common termini. Sequences are analyzed by BLASTN searches of the GenBank Nucleotide, Genomes, and WGS databases to identify any identical matches. Database annotations and dendrogram results (step 8 above) for the matched strain are recorded. (Time ≈ 0.5 h per sequence)

15. Sequences lacking an identical GenBank database match are added to the downloaded sequences (step 8) and analyzed by Clustal and dnapars to generate a dendrogram as described above to show relationship to database strains. (Time ≈ 1 h)

16. All non-confidential sequences are deposited in GenBank with relevant annotation.
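The diversity index in step 7 can be sketched in a few lines. The version below assumes the Hunter-Gaston form of Simpson's index (one minus the probability that two strains drawn without replacement share an allele); the source does not state which variant the online calculator uses, and the allele calls are hypothetical.

```python
from collections import Counter


def simpson_diversity(allele_calls):
    """Simpson's diversity, assumed Hunter-Gaston form:
    1 - sum(n_j * (n_j - 1)) / (N * (N - 1)),
    where n_j is the number of strains with allele j and N the total.
    """
    counts = Counter(allele_calls)
    n_total = sum(counts.values())
    if n_total < 2:
        raise ValueError("need at least two typed strains")
    same_pairs = sum(n * (n - 1) for n in counts.values())
    return 1.0 - same_pairs / (n_total * (n_total - 1))


# Hypothetical allele calls for eight strains (not data from this project).
calls = ["A1", "A1", "A2", "A3", "A3", "A3", "A4", "A5"]
print(round(simpson_diversity(calls), 3))  # 0.857
```

A value near 1 means almost every pair of strains is distinguished by the locus; a value near 0 means most strains share one allele.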

Progress 06/15/17 to 09/14/18

Outputs
Target Audience: The target audience for this project is food processors, and more specifically the personnel within food processing facilities responsible for detecting and tracking down sources of foodborne pathogens, or alternatively, the contract microbiology labs many food processors work with to comply with federal regulations and ensure the safety of their products.
Changes/Problems: Nothing Reported
What opportunities for training and professional development has the project provided? Nothing Reported
How have the results been disseminated to communities of interest? Dissemination to communities of interest has been through: (1) the website www.microbitype.com; (2) poster and oral presentations at the International Association for Food Protection (IAFP) annual meeting in Salt Lake City, July 2018; (3) submission for publication in peer-reviewed scientific journals such as Applied and Environmental Microbiology and Journal of Food Protection.
What do you plan to do during the next reporting period to accomplish the goals? Nothing Reported

Impacts
What was accomplished under these goals? Initial MicrobiType R&D, in collaboration with USDA-ARS researchers, led to the development, evaluation, and commercial implementation of polymorphic locus sequence typing (PLST) services for the foodborne pathogens Listeria monocytogenes, Yersinia enterocolitica, and Salmonella enterica. With support from this USDA-NIFA SBIR Phase I award, PLST services have now been extended to: (1) Shiga-toxigenic E. coli (www.microbitype.com/escherichia-coli; IAFP 2018 poster; manuscript in preparation); (2) Vibrio parahaemolyticus (www.microbitype.com/vibrio-parahaemolyticus; IAFP 2018 poster; manuscript submitted for publication); (3) Campylobacter (www.microbitype.com/campylobacter). Furthermore, preliminary studies completed during this Phase I were integral to the development of two new projects that form the basis for a follow-on USDA-NIFA Phase I application, entitled "Isolate- and Culture-independent Typing of Foodborne Pathogens" (submitted October 24, 2018). Specifically, these studies demonstrated the feasibility of: (1) coupling commercially available environmental sampling tests for Listeria and Salmonella with PLST (EST-PLST); (2) culture-independent PLST (ciPLST) directly from poultry rinses of difficult-to-culture Campylobacter.

Publications


    Progress 06/15/17 to 06/14/18

    Outputs
    Target Audience: Food safety researchers and professionals
    Changes/Problems: Nothing Reported
    What opportunities for training and professional development has the project provided? Nothing Reported
    How have the results been disseminated to communities of interest? The three Phase I Objectives involved extension of polymorphic locus sequence typing (PLST) to Shiga-toxigenic Escherichia coli (STEC), Vibrio parahaemolyticus, and Campylobacter jejuni/coli. For the former two, work has been completed, a poster has been presented (at ASM-Microbe 2017, New Orleans; see attached) or abstract submitted (to IAFP 2018, Salt Lake City; see attached), and manuscripts are in final stages of preparation. For the latter, encouraging progress has been made in an unanticipated direction. For all three, the work has been commercially implemented as typing services on the MicrobiType website (www.microbitype.com) at the standard cost of $50-70 per sample. Key components of this progress are summarized below.
    What do you plan to do during the next reporting period to accomplish the goals? Posters will be presented, and papers submitted and published.

    Impacts
    What was accomplished under these goals?

    STEC: The initial bioinformatic screen identified 2 promising PLST loci for STEC typing. The EcMT1 locus proved to be well suited for pan-E. coli typing, while EcMT2 provided exceptional resolution of O157 strains, the clinically most important serogroup due to its association with hemolytic uremic syndrome. In dendrograms, EcMT1 typing can be used to confirm or presumptively identify O157 strains since they form a distinct cluster. Subsequently, EcMT2 typing can be used for O157 epidemiological analysis, illustrated by resolution of the three 2006 outbreaks. With respect to the infamous spinach outbreak, it is apparent that some strains diverged from the major cluster; this same complex pattern was first described by Eppinger et al. (2011, Proc Natl Acad Sci USA 108:20142-20147) in their WGS-SNP analysis; indeed, the strain resolution of EcMT2 appears comparable to that of WGS-SNP.

    V. parahaemolyticus: Bioinformatic analysis at MicrobiType, and laboratory studies in collaboration with USDA-ARS researcher Gary Richards, identified VpMT1 on chromosome 1 as the most promising PLST candidate for typing this shellfish-associated pathogen. Consistent with this, VpMT1 encompasses a tandem repeat identified as the most polymorphic component of two published MLVA typing protocols, yielding diversity indexes based on length alone of 0.88 to 0.90 (Kimura et al., 2008; Harth-Chu et al., 2009). In comparison, VpMT1 sequence analysis yielded a diversity index of 0.99, with 334 strains resolved into 164 alleles. This increased resolution, reflecting polymorphism in the form of both SNPs and insertions/deletions, is even more notable because the analysis was limited to North American isolates. Most importantly, VpMT1 analysis identified multiple clusters of strains known or likely to be epidemiologically related. V. parahaemolyticus contamination of oysters peaks in the warmer waters of summer months, often exceeding 10^3 CFU/g. This suggested to us that V. parahaemolyticus could be typed directly from oysters without enrichment; i.e., culture-independent PLST (ciPLST). Attempts to do this from the oyster hemolymph were unsuccessful, most likely due to nucleases. However, the "liquor" in which the oyster is immersed in its shell (and generally consumed with the raw oyster) was an excellent substitute, yielding high quality sequence chromatograms following DNA purification and nested PCR. Of 7 oysters analyzed from the U.S. east coast harvested in early September, 3 yielded VpMT1 PCR products with sequences related but non-identical to those from clinical or environmental isolates in GenBank genome databases. Using oysters harvested in mid-October, VpMT1 positives decreased to 2 in 24.

    C. jejuni/coli: The most promising PLST candidates identified bioinformatically in Phase I proved, not too surprisingly, to be the two loci already used for C. jejuni/coli sequence-based typing: porA and flaA (Cody et al., 2009, Microbiology 155:4145-4154; Clark et al., 2012, J Clin Microbiol 50:798-809). By analysis of current genome databases, however, these typing loci were improved by designing new primers that target more highly conserved regions and extend the loci to incorporate additional polymorphism. As an example, the CjcMTporA dendrogram illustrates the distinct clustering of strains from the Walkertown waterborne outbreak, an accidental laboratory infection (human and mouse isolates derived from a lab strain), and isolates associated with Guillain-Barré syndrome. The "unanticipated direction" noted above involves preliminary results demonstrating culture-independent PLST of C. jejuni/coli directly from two chicken part rinses. CjcMTporA typing of these enrichment-free samples confidently clustered these contaminants with either C. jejuni or C. coli.
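The contrast reported above between length-only MLVA typing and sequence-based typing of the same repeat locus can be illustrated with a small sketch. The strain names and sequences are invented for illustration and are not VpMT1 data; the point is only that alleles of equal length but different sequence collapse into one type when length alone is scored.

```python
def allele_ids(values):
    """Assign integer allele ids in order of first appearance."""
    seen = {}
    return [seen.setdefault(v, len(seen) + 1) for v in values]


# Hypothetical repeat-region sequences for five strains.
strains = {
    "s1": "ACGTTGACGTTGACGTTG",  # 3 copies of a 6-base unit
    "s2": "ACGTTGACGTCGACGTTG",  # same length as s1, one SNP in copy 2
    "s3": "ACGTTGACGTTG",        # 2 copies (deletion relative to s1)
    "s4": "ACGTTGACGTTGACGTTG",  # identical to s1
    "s5": "ACGTCGACGTTG",        # same length as s3, one SNP in copy 1
}

names = list(strains)
by_length = dict(zip(names, allele_ids(len(strains[n]) for n in names)))
by_sequence = dict(zip(names, allele_ids(strains[n] for n in names)))

print("length-based alleles:  ", by_length)
print("sequence-based alleles:", by_sequence)
print(len(set(by_length.values())), "length types vs",
      len(set(by_sequence.values())), "sequence types")
```

In this toy set, scoring length gives 2 types while scoring sequence gives 4, with s1 and s4 still correctly grouped as identical.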

    Publications

    • Type: Conference Papers and Presentations Status: Accepted Year Published: 2018 Citation: Sequence-Based Typing for Tracking Foodborne Shiga-Toxigenic Escherichia coli
    • Type: Conference Papers and Presentations Status: Accepted Year Published: 2018 Citation: CbMT Sequence Typing for Identification and Tracking of Foodborne Clostridium Botulinum Outbreaks
    • Type: Conference Papers and Presentations Status: Accepted Year Published: 2018 Citation: Development and Evaluation of Sequence-Based Typing Services for Epidemiological Tracking of Vibrio parahaemolyticus

