Performing Department
Veterinary Pathobiology
Non Technical Summary
Mycoplasmas are important bovine pathogens that significantly impact the US cattle industry. Annual economic losses due to Mycoplasma mastitis and respiratory disease have been estimated to be greater than $100 million. This proposal seeks funding to study genomic variation among mastitis and respiratory isolates. Filling this knowledge gap should facilitate the development of better strategies for disease control and prevention.
Animal Health Component
100%
Research Effort Categories
Basic
80%
Applied
10%
Developmental
10%
Goals / Objectives
Acquire, bank and begin genotypic analysis of approximately 100 Mycoplasma bovis isolates.
Determine five genome sequences (four Mycoplasma bovis isolates and one Mycoplasma californicum strain).
Use PCR to determine distribution of strain variable genomic loci in M. bovis isolates.
Use next generation sequencing to generate additional draft genome sequences of M. bovis.
Project Methods
1) Use PCR to determine distribution of strain variable genomic loci in M. bovis isolates
Initially, 50 mastitis-associated M. bovis isolates and 50 respiratory tract isolates will be studied, together with strain PG45 as a control. Low passage broth cultures of each strain will be grown and total genomic DNA prepared from harvested cells using commercially available kits (Qiagen). Template DNA will then be used in PCR with Phusion High Fidelity DNA polymerase (Fermentas) to test whether specific gene targets are present or absent. For each strain, a positive control reaction will be performed using the M. bovis uvrC gene (55). The targets selected for this study are listed in Table 1 and include 6 genes that are found only in the PG45 genome and 8 genes that are >99% identical between strains HB0801 and Hubei-1 but are absent from PG45. For single genes that are strain variable in presence or absence between PG45 and the Chinese isolates, an amplicon of the appropriate size will be considered a positive result. In cases where the gene is not amplified, a further round of PCR will be performed using lower stringency conditions in case there is a sequence mismatch between the primers and the target sequence in a given isolate. In parallel, PCR between the conserved flanking genes will be performed. This will either confirm the presence of the target gene, disclose the presence of an "empty site" that lacks the target, or reveal the presence of an unexpected gene or IS unit. In the last case, amplicon sequencing will be used to identify gene "x".
For larger regions of genomic difference, PCR will be used to query the presence of four target genes among the 100 isolates. PCR will also be performed between the target genes and the conserved flanking genes. Strains that have neither the gene arrangement of PG45 nor that of HB0801 will be analyzed further by inverse PCR.
This approach allows "chromosome walking" to be performed, starting at a known sequence and permitting sequencing of adjacent "unknown" sequence. This strategy has been used by the PI to sequence three different mobile genetic elements in mycoplasmas that were greater than 20 kb and that had no significant similarity to known sequences. Once a new configuration has been established (if present), specific PCR primers will be designed for the new genes and used to assess their distribution in other M. bovis isolates. Initially, highest priority will be given to the three closely linked regions that represent ~25 kb of strain variable DNA. It is not known whether strains will have the PG45 pattern or the pattern exhibited by both HB0801 and Hubei-1, or whether either pattern will segregate with mastitis or respiratory tract isolates. Depending on the distribution, this study will be extended to include additional isolates, as necessary.
Certain surface proteins were found to be unexpectedly divergent between PG45 and the two Chinese isolates. To determine whether the US isolates have one or the other gene sequence, PCR and sequencing will be performed for each of these genes. In this instance it is expected that each gene will be present in all isolates, since these genes are also present in M. agalactiae and thus appear to be conserved (in presence) within these closely related taxa. Sequencing of each amplicon will be performed in 96 well plate format at the DNA Core facility and the resulting sequences compared using BLAST and ClustalW to delineate sequence diversity within the group. For any gene, it is possible that (i) the gene is absent; (ii) the gene closely matches the PG45 or Chinese isolate sequence; or (iii) the gene contains a distinct genomic signature that might be prevalent within a group of US isolates. Whichever scenario is found, these data will provide useful new information regarding the strain distribution of these highly divergent protein sequences.
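The decision logic for the single-gene loci in Aim 1 (target PCR plus flanking PCR) can be sketched as below. This is only an illustrative sketch, not the proposal's protocol: the function name, the expected amplicon sizes, and the 100-bp size tolerance are all assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LocusResult:
    """PCR observations for one locus in one isolate (hypothetical record)."""
    target_amplified: bool          # target-gene PCR gave a band of the expected size
    flank_size_bp: Optional[int]    # flanking-PCR amplicon size; None if no product


def classify_locus(result: LocusResult,
                   occupied_size_bp: int,
                   empty_size_bp: int,
                   tolerance_bp: int = 100) -> str:
    """Interpret a flanking-PCR amplicon against the two expected sizes.

    occupied_size_bp: expected flank-to-flank size when the target gene is present.
    empty_size_bp:    expected flank-to-flank size when the site is empty.
    All sizes and the tolerance are assumed values for illustration.
    """
    if result.flank_size_bp is None:
        # No flanking product: fall back on the target-gene PCR alone.
        return "target present" if result.target_amplified else "unresolved"
    if abs(result.flank_size_bp - occupied_size_bp) <= tolerance_bp:
        return "target present"
    if abs(result.flank_size_bp - empty_size_bp) <= tolerance_bp:
        return "empty site"
    # Neither expected size: an unexpected gene or IS unit may occupy the site.
    return "unexpected insert (sequence amplicon)"


# Example with made-up sizes: 2,500 bp when occupied, 600 bp when empty.
calls = [
    classify_locus(LocusResult(True, 2480), 2500, 600),
    classify_locus(LocusResult(False, 620), 2500, 600),
    classify_locus(LocusResult(False, 2100), 2500, 600),
    classify_locus(LocusResult(False, None), 2500, 600),
]
```

Isolates classified as "unresolved" or "unexpected insert" would be the candidates for the low-stringency PCR, amplicon sequencing, or inverse PCR steps described above.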
2) Use next generation sequencing to generate additional draft genome sequences of M. bovis
Four M. bovis field isolates will be selected for next generation sequence analysis, based upon the results obtained in Aim 1. Two mastitis and two respiratory tract isolates that reflect the most common genotype(s) in the US, as assessed by the strain variable gene sets characterized in Aim 1, are planned for sequencing. For example, if most US isolates contain the HB0801 signatures, field isolates harboring these will be sequenced to determine whether there is genome-wide similarity between the isolates (both to HB0801 and to other field isolates). By Illumina sequencing, approximately 15 gigabases of 100-bp reads can be generated per channel, representing over 150 million reads per sample. Based on an estimated genome size of 1 Mb, a single Illumina channel can be used to sequence 5 genomes at 200X coverage, which is sufficient to generate a draft genome sequence for each isolate. One limitation of this sequencing methodology is that the read length is too short to sequence across sequence repeats such as IS units (~1.5 kb) and the two rRNA operons. Nevertheless, this approach will enable the draft genome sequences to be assembled into contiguous stretches (contigs) between such repetitive sequences. Assembly of the Illumina reads will be performed using Velvet (56) or NextGENe (57), both of which have been used for Mycoplasma genomes in the PI's research. This will give a complete representation (with high depth of coverage) of the gene coding potential of an isolate genome, which can then be annotated and compared with the three previously published genome sequences. The contigs for each genome will be submitted to the National Center for Biotechnology Information to be passed through the PGAAP auto-annotation pipeline (58). This free service results in genomes that have all ORFs and stable RNAs identified and annotated.
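The sequencing budget quoted above can be checked with a short calculation. The sketch below uses only the figures given in the text (15 gigabases per channel, 100-bp reads, a 1 Mb genome, 5 genomes, 200X coverage); the function name is hypothetical, and the even split of reads between genomes with no loss to quality filtering is a simplifying assumption.

```python
def coverage_budget(channel_bases: int, read_length: int, genome_size: int,
                    n_genomes: int, target_coverage: int):
    """Return (reads per channel, bases needed, maximum coverage per genome).

    Assumes the channel yield is shared evenly between genomes and that
    no reads are discarded by quality filtering.
    """
    reads_per_channel = channel_bases // read_length
    bases_needed = genome_size * target_coverage * n_genomes
    max_coverage_per_genome = channel_bases / (genome_size * n_genomes)
    return reads_per_channel, bases_needed, max_coverage_per_genome


reads, needed, max_cov = coverage_budget(
    channel_bases=15_000_000_000,  # ~15 gigabases per channel
    read_length=100,               # 100-bp reads
    genome_size=1_000_000,         # ~1 Mb estimated genome size
    n_genomes=5,
    target_coverage=200,
)
print(f"{reads:,} reads per channel")
print(f"{needed:,} bases needed for {200}X across 5 genomes")
print(f"up to {max_cov:,.0f}X per genome if the channel is split evenly")
```

Under these figures, five genomes at 200X require about 1 gigabase in total, so the stated plan fits within a single channel with a wide margin.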
The resulting annotated contigs of each genome will then be compared by BLAST to determine which of the reference genomes is the most similar, both in terms of nucleotide similarity for most housekeeping genes and for regions of variation. For ease of comparison between isolates, the contigs will be ordered based on the genome organization of the most similar reference strain. It is not within the scope of this proposal to close each genome, and it is fully appreciated that such an ordering is purely hypothetical, intended to permit more facile strain comparison, and does not take into account structural re-arrangements that may be present (14). As was done for the comparisons of the three M. bovis genomes, regions of variation will be noted, including individual genes, large genetic loci, and variable genes that exhibit considerable sequence diversity between isolates. Any regions of the newly sequenced isolates that appear to encode novel genes (lacking similarity to the gene sets of the three M. bovis genomes) will be analyzed further. Such loci could encode hitherto unrecognized mobile genetic elements, strain variable restriction systems, additional members of multi-gene families or genes that lack similarity to known sequences. Depending on the number and nature of any new genes found in the genomes, PCR will be carried out to assess their distribution among the 100 isolates that were initially characterized.
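One way the reference-based contig ordering described above could be carried out is sketched below. It assumes BLAST results in the standard 12-column tabular format (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore), with contigs as queries against a single reference genome. The function name, the contig names, and the choice to place each contig by its highest-scoring hit are illustrative assumptions rather than the proposal's stated procedure.

```python
def order_contigs(blast_tab_lines):
    """Order contigs by the reference position of their best BLAST hit.

    Returns a list of (contig, reference_position, strand) tuples sorted by
    reference position. Strand is "-" when the hit runs in reverse on the
    reference (sstart > send).
    """
    best = {}  # contig -> (bitscore, reference position, strand)
    for line in blast_tab_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        contig = fields[0]
        sstart, send = int(fields[8]), int(fields[9])
        bitscore = float(fields[11])
        if contig not in best or bitscore > best[contig][0]:
            strand = "+" if sstart <= send else "-"
            best[contig] = (bitscore, min(sstart, send), strand)
    ranked = sorted(best.items(), key=lambda item: item[1][1])
    return [(contig, pos, strand) for contig, (_, pos, strand) in ranked]


# Made-up hits for two hypothetical contigs against a PG45 reference.
example_hits = [
    "contig_2\tPG45\t99.1\t5000\t45\t0\t1\t5000\t400000\t404999\t0.0\t9000",
    "contig_1\tPG45\t98.7\t3000\t39\t0\t1\t3000\t13000\t10001\t0.0\t5400",
    "contig_1\tPG45\t90.0\t500\t50\t0\t1\t500\t700000\t700499\t1e-100\t600",
]
ordered = order_contigs(example_hits)
```

Placing a contig by a single best hit ignores contigs that span a rearrangement, which is consistent with the caveat above that the resulting order is hypothetical.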