Source: COLORADO STATE UNIVERSITY submitted to
COMPLETE GENOME SEQUENCING OF CLAVIBACTER MICHIGANENSIS SUBSP. SEPEDONICUS
Sponsoring Institution
National Institute of Food and Agriculture
Project Status
TERMINATED
Funding Source
Reporting Frequency
Annual
Accession No.
0189953
Grant No.
2001-52100-11428
Project No.
COL0-2001-04720
Proposal No.
2001-04720
Multistate No.
(N/A)
Program Code
(N/A)
Project Start Date
Sep 15, 2001
Project End Date
May 31, 2005
Grant Year
2001
Project Director
Ishimaru, C. A.
Recipient Organization
COLORADO STATE UNIVERSITY
(N/A)
FORT COLLINS,CO 80523
Performing Department
BIOAGRICULTURAL SCIENCES & PEST MANAGEMENT
Non Technical Summary
Bacterial ring rot disease of potato is a significant economic problem in US agriculture. The purpose of this project is to study the molecular basis of pathogenicity in the pathogen, Clavibacter michiganensis subsp. sepedonicus. The objective of this project is to sequence and analyze the complete genome of the type strain of C. m. subsp. sepedonicus to reveal DNA sequences that are involved in pathogenicity.
Animal Health Component
(N/A)
Research Effort Categories
Basic
50%
Applied
50%
Developmental
(N/A)
Classification

Knowledge Area (KA)    Subject of Investigation (SOI)    Field of Science (FOS)    Percent
212                    1310                              1160                      25%
212                    4010                              1040                      50%
212                    4010                              1160                      25%
Goals / Objectives
To sequence, analyze and annotate the genome of one of the major U.S. Clavibacter plant pathogens, Clavibacter michiganensis subsp. sepedonicus, and to compare this genome to those of related bacteria in the Actinomycetales.
Project Methods
Purified total genomic DNA (approximately 100 micrograms) from Cms, type strain ATCC 33113 (= NCPPB 2137 = PDDCC 2535 = LMG 2889), will be produced at Colorado State University and, after ensuring that it is free of live bacteria, will be shipped to the Sanger Centre. The DNA will be sheared to give fragments ranging from 1.5 to 4 kb, and libraries will be constructed by blunt-end cloning into pUC plasmid and M13 phage vectors. In addition, we will construct larger insert libraries (10-20 kb) in lambda and BAC vectors for scaffolding purposes. The ATCC 33113 strain is chosen for sequencing because it is virulent and included in most phylogenetic analyses. The genomic sequence will be generated by a whole-genome shotgun strategy, generating 8-10 fold genome coverage from several libraries in pUC and M13. We will supplement the pUC sequences with single end-sequences from M13 clones, which we have found to give much more representative libraries of high G+C DNA. Sequencing reads will be produced using BigDye terminators on ABI 3700 capillary machines. We will also produce end-sequences from the large insert BAC and lambda libraries (around 3-fold clone coverage of each), thus producing a scaffold on which to order the contigs produced by the shotgun assembly. The initial assemblies of shotgun data will produce a number of contiguous sequences (contigs) separated by sequence gaps (bridged by read-pairs) and physical gaps (not bridged). Gap closure will be achieved initially by finding clones that bridge contigs using forward and reverse read pairs from pUC, lambda and BAC libraries. These clones can then be "walked" or shotgunned to provide sequence to fill the gap. The remaining gaps will be bridged by combinatorial PCR methods, again walking or shotgunning the PCR product to fill the gap. When the sequence is complete it will be analyzed and annotated using our in-house tools.
We will perform a manual annotation, as we believe that to extract the maximum information from the sequence, the analysis must be considered not just at a basic level, but also in the context of the biology of the organism under study. The analysis will include the use of positional base preference methods, codon bias and specially trained Hidden Markov Models to identify protein-coding regions. Ribosomal RNAs, tRNAs and other stable small RNAs will be identified by homology searches and specific search programs such as tRNAscan-SE. FASTA and BLAST analysis will be used to identify homologous sequences at the DNA and protein level, and Pfam/InterPro will be used for protein motif searches. Specific searches will be carried out for repetitive elements. The results of all searches will be displayed in the Artemis program and annotated to produce a feature table for submission to the public databases. Special attention will be given to comparisons with other Actinomycetales, facilitated by the Artemis Comparison Tool, which was developed at the Sanger Centre to give an exhaustive and fully interactive display of the comparison between two genomes.
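The 8-10 fold shotgun coverage target in the methods can be related to expected genome coverage with the standard Poisson (Lander-Waterman) approximation. The sketch below is illustrative only and is not taken from the project's own tools: the function names are hypothetical, and the model assumes reads land uniformly at random and ignores the minimum overlap needed to join two reads.

```python
import math


def fraction_covered(fold_coverage: float) -> float:
    """Expected fraction of the genome hit by at least one read.

    Poisson approximation: a base is missed with probability exp(-c),
    where c is the fold coverage (total read bases / genome size).
    """
    return 1.0 - math.exp(-fold_coverage)


def expected_gaps(num_reads: int, fold_coverage: float) -> float:
    """Approximate expected number of gaps between contigs.

    Uses N * exp(-c); this ignores the minimum overlap required by a
    real assembler, so it is a lower bound rather than a prediction.
    """
    return num_reads * math.exp(-fold_coverage)


if __name__ == "__main__":
    # Coverage range quoted in the methods; the read count is a made-up example.
    for c in (8.0, 10.0):
        print(f"{c:.0f}x coverage: {fraction_covered(c):.4%} of bases expected covered, "
              f"~{expected_gaps(50_000, c):.1f} gaps for 50,000 reads")
```

Real shotgun libraries are not perfectly random (the methods note that high G+C DNA clones unevenly), which is one reason the plan also relies on large-insert scaffolding and directed gap closure rather than on coverage depth alone.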

Progress 09/15/01 to 05/31/05

Outputs
Clavibacter michiganensis subspecies are major phytopathogens impacting the health of several economically important crops. Most species in the group grow slowly in vitro and are difficult to dissect genetically. The objective of this research is to obtain the complete genome sequence of C. michiganensis subsp. sepedonicus (Cms). Cms causes bacterial ring rot of potato, which is considered to be the most important disease of seed potatoes worldwide. The ATCC 33113 strain was chosen for sequencing because it is virulent and accessible to the scientific community. The project was conducted through a collaboration of scientists at Colorado State University, the Sanger Centre, U.K., and Ohio State University, with expertise in Clavibacter biology and high-throughput DNA sequencing and annotation of high-GC gram-positive bacteria. A whole-genome shotgun strategy, generating 8-10 fold genome coverage from several libraries in pUC and M13, was used. The sequencing project website is www.sanger.ac.uk/projects/C_michiganensis. Several libraries were constructed for the project, including small (2-2.8 kb) insert pUC19 libraries and several large insert libraries with 6, 9-10, and 38-42 kb inserts. At present there are 55,801 reads totaling 29.324 Mb and giving a theoretical coverage of 99.98% of the genome. The assembly phase is completed. The finished genome is 3.40 Mb and consists of a circular chromosome (3.25 Mb), the circular plasmid pCS1 (50.35 kb) and the linear plasmid pCSL1 (94.6 kb). The genome is 0.8 Mb (30%) larger than originally predicted. The predicted gene set was obtained, and manual and automated annotation of the genome sequence has been completed. On publication of the analysis and annotation, a fully annotated sequence will be released to the EMBL/GenBank/DDBJ public databases.
Bioinformatic studies revealed the presence in Cms of eleven homologues of pat-1, a gene associated with pathogenicity in the related tomato pathogen, C. michiganensis subsp. michiganensis (Cmm). Chromosomal pat-1 homologues in Cms are not clustered within a single region as in Cmm. Instead, pat-1 homologues are distributed around the chromosome of Cms, though there is some clustering in two regions. Whole genome comparisons between the potato and tomato pathogens were performed. Our analyses so far have shown extremely high sequence similarity between common genes in Cmm and Cms, often translating into greater than 85.2% identity at the nucleotide level. A somewhat surprising finding from these studies is the extensive number of rearrangements in Cms as compared to Cmm. From the available annotation, we have learned that Cms contains about 102 copies of insertion sequence (IS) elements, whose transposases belong to at least 3 different families. In contrast, Cmm does not contain these insertion elements. This is consistent with previous studies showing that the insertion element IS1121 in Cms is not present in Cmm. We are in the process of examining the expansion of IS elements in Cms relative to the chromosomal rearrangements, and whether this expansion is associated with loss of gene function.
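The size and coverage figures quoted in this report can be cross-checked with simple arithmetic. The following sketch uses only numbers stated in the text; the variable names are ours, and the coverage fraction relies on the Poisson approximation 1 - exp(-c), which is an assumption about how the reported "theoretical coverage" was derived rather than something the report states.

```python
import math

# Figures quoted in the report text.
CHROMOSOME_MB = 3.25       # circular chromosome
PCS1_KB = 50.35            # circular plasmid pCS1
PCSL1_KB = 94.6            # linear plasmid pCSL1
READ_COUNT = 55_801
READ_TOTAL_MB = 29.324
EXCESS_MB = 0.8            # amount by which the genome exceeded the prediction

# Total genome size from the three replicons (Mb).
genome_mb = CHROMOSOME_MB + (PCS1_KB + PCSL1_KB) / 1000.0

# Fold coverage and mean read length implied by the read totals.
fold_coverage = READ_TOTAL_MB / genome_mb
mean_read_bp = READ_TOTAL_MB * 1e6 / READ_COUNT

# Originally predicted size and the relative excess over it.
predicted_mb = genome_mb - EXCESS_MB
excess_percent = 100.0 * EXCESS_MB / predicted_mb

# Assumed Poisson model for the fraction of bases covered by at least one read.
theoretical_coverage = 1.0 - math.exp(-fold_coverage)

if __name__ == "__main__":
    print(f"genome size        : {genome_mb:.3f} Mb")
    print(f"fold coverage      : {fold_coverage:.2f}x")
    print(f"mean read length   : {mean_read_bp:.0f} bp")
    print(f"excess vs predicted: {excess_percent:.1f}%")
    print(f"theoretical cover  : {theoretical_coverage:.2%}")
```

Under these assumptions the replicon sizes sum to roughly the reported 3.40 Mb, the reads amount to between 8 and 9 fold coverage (within the planned 8-10 fold range), and the implied excess over the original prediction is close to the stated 30%.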

Impacts
There is a zero tolerance for bacterial ring rot in the seed potato industry. Ensuring that seed is free of the pathogen (Cms) currently depends on positive reactions in a limited number of antigen or DNA-based assays, none of which target DNA sequences important in pathogen virulence. Having the complete genome sequence of Cms can lead to identification of genes required for disease development and provide new targets for pathogen detection.

Publications

  • Laurila, J., M. C. Metzler, C. A. Ishimaru, and V.-M. Rokka. 2003. Infection of plant material derived from Solanum acaule with Clavibacter michiganensis ssp. sepedonicus: temperature as a determining factor in immunity of S. acaule to bacterial ring rot. Plant Pathol. 52:496-504.


Progress 01/01/04 to 12/31/04

Outputs
Clavibacter michiganensis subspecies are major phytopathogens impacting the health of several economically important crops. Most species in the group grow slowly in vitro and are difficult to dissect genetically. The objective of this research is to obtain the complete genome sequence of C. michiganensis subsp. sepedonicus (Cms). Cms causes bacterial ring rot of potato, which is considered to be the most important disease of seed potatoes worldwide. The ATCC 33113 strain was chosen for sequencing because it is virulent and accessible to the scientific community. The project is being conducted through a collaboration of scientists at Colorado State University, the Sanger Centre, U.K., and Ohio State University, with expertise in Clavibacter biology and high-throughput DNA sequencing and annotation of high-GC gram-positive bacteria. A whole-genome shotgun strategy, generating 8-10 fold genome coverage from several libraries in pUC and M13, was used. The shotgun phase is complete, and a database of reads is available for searching at Sanger's Blast Server, or for downloading from their FTP site. The sequencing project website is www.sanger.ac.uk/projects/C_michiganensis. Several libraries were constructed for the project, including small (2-2.8 kb) insert pUC19 libraries and several large insert libraries with 6, 9-10, and 38-42 kb inserts. At present there are 55,801 reads totaling 29.324 Mb and giving a theoretical coverage of 99.98% of the genome. The assembly phase is completed. The finished genome is 3.40 Mb and consists of a circular chromosome (3.25 Mb), the circular plasmid pCS1 (50.35 kb) and the linear plasmid pCSL1 (94.6 kb). The genome is 0.8 Mb (30%) larger than originally predicted. The predicted gene set has been obtained. Manual and automated annotation of the genome sequence is underway. About 50% of the predicted CDS have been annotated.
On completion and publication of the analysis and annotation, a fully annotated sequence will be released to the EMBL/GenBank/DDBJ public databases. Bioinformatic studies revealed the presence in Cms of eleven homologues of pat-1, a gene associated with pathogenicity in the related tomato pathogen, C. michiganensis subsp. michiganensis (Cmm). Chromosomal pat-1 homologues in Cms are not clustered within a single region as in Cmm. Instead, pat-1 homologues are distributed around the chromosome of Cms, though there is some clustering in two regions.

Impacts
There is a zero tolerance world-wide for bacterial ring rot in the seed potato industry. Ensuring that seed is free of the pathogen (Cms) currently depends on positive reactions in a limited number of antigen or DNA-based assays, none of which target DNA sequences important in pathogen virulence. Having the complete genome sequence of Cms can lead to identification of genes required for virulence and provide new targets for pathogen detection.

Publications

  • Ishimaru, C. A., Knudson, D. L., Brown, S. E., Francis, D. M., and Parkhill, J. 2004. Genome sequencing of Clavibacter michiganensis subsp. sepedonicus. Phytopathology 94:S123.


Progress 01/01/03 to 12/31/03

Outputs
Clavibacter michiganensis subspecies are major phytopathogens, impacting the health of several economically important monocots and dicots. Most species in the group grow slowly in vitro and are difficult to dissect genetically. The objective of this research is to obtain the complete genome sequence of C. michiganensis subsp. sepedonicus (Cms). Cms causes bacterial ring rot of potato, which is considered to be the most important disease of seed potatoes worldwide. The ATCC 33113 strain was chosen for sequencing because it is virulent and accessible to the scientific community. The project is being conducted through a collaboration of scientists at Colorado State University, the Sanger Centre, U.K., and The Ohio State University, with expertise in Clavibacter biology and high-throughput DNA sequencing and annotation of high-GC gram-positive bacteria. The complete genomic sequence is being generated at the Sanger Centre by a whole-genome shotgun strategy, generating 8-10 fold genome coverage from several libraries in pUC and M13. End-sequences from large insert BAC and lambda libraries have been generated, providing a scaffold to assemble the sequence contigs produced by the assembly of shotgun data. Gap closure is being achieved through a combination of primer walking, PCR, and shotgun strategies. The completed sequence will be analyzed and annotated using in-house tools available at the Sanger Institute. Library preparation and validation are completed. The shotgun phase is completed. A database of reads is available for searching on the Sanger Blast Server, or for download from the Sanger FTP site. At present there are 55,801 reads, totaling 29.324 Mb and giving a theoretical coverage of 99.98% of the genome. Test assemblies of the present coverage yield 46 contigs > 2 kb with a total size of 3.546 Mb. Closure is anticipated 28 months from start date. Annotation of the sequence is anticipated 28-36 months from start date.
Current assemblies give a genome size 0.8 Mb (30%) larger than originally predicted. The overall %GC content is 72.2%. The sequence of the linear plasmid in Cms has been finished and annotated. Preliminary automated annotation of the genome sequence has begun. On completion and publication of the analysis and annotation, a fully annotated sequence will be released to the EMBL/GenBank/DDBJ public databases. An enhanced set of web pages allowing full access to the sequence and annotation will be available at the Sanger Centre website.
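For readers unfamiliar with how an overall %GC figure such as the one reported here is obtained, the following minimal sketch shows the usual calculation. It is not the Sanger pipeline; the function name and the toy sequence are hypothetical, and ambiguous bases (such as N) are simply excluded from the denominator, which is one common convention among several.

```python
def gc_percent(sequence: str) -> float:
    """Return the percentage of G and C among unambiguous bases (A, C, G, T).

    Case-insensitive. Ambiguity codes such as N are ignored rather than
    counted, so the result is GC / (A + C + G + T) * 100.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / total


if __name__ == "__main__":
    # Toy sequence for illustration only; not Cms data.
    toy = "GCCGGCGACCTGNNGCACGGT"
    print(f"%GC of toy sequence: {gc_percent(toy):.1f}")
```

In practice the function would be applied to the concatenated contigs or to each replicon separately, since plasmids can differ in base composition from the chromosome.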

Impacts
There is a zero tolerance world-wide for bacterial ring rot in the seed potato industry. Ensuring that seed is free of the pathogen (Cms) currently depends on positive reactions in a limited number of antigen or DNA-based assays, none of which target DNA sequences important in pathogen virulence. Having the complete genome sequence of Cms can lead to identification of genes required for virulence and provide new targets for pathogen detection.

Publications

  • Laurila, J., M. C. Metzler, C. A. Ishimaru, and V.-M. Rokka. 2003. Infection of plant material derived from Solanum acaule with Clavibacter michiganensis ssp. sepedonicus: temperature as a determining factor in immunity of S. acaule to bacterial ring rot. Plant Pathol. 52:496-504.
  • Ishimaru, C. A., Parkhill, J., Francis, D. M., and Knudson, D. L. 2003. Complete genome sequence of Clavibacter michiganensis subsp. sepedonicus. 3rd ASM and TIGR Conference on Microbial Genomes. USDA-CSREES Microbial Sequencing Awardee Workshop. New Orleans, Louisiana. January 2003.


Progress 01/01/02 to 12/31/02

Outputs
Clavibacter michiganensis subspecies are major phytopathogens, impacting the health of several economically important monocots and dicots. Most species in the group grow slowly in vitro and are difficult to dissect genetically. The objective of this research is to obtain the complete genome sequence of C. michiganensis subsp. sepedonicus (Cms). Cms causes bacterial ring rot of potato, which is considered to be the most important disease of seed potatoes worldwide. The ATCC 33113 strain was chosen for sequencing because it is virulent and accessible to the scientific community. The project is being conducted through a collaboration of scientists at Colorado State University, the Sanger Centre, U.K., and the Ohio State University, with expertise in Clavibacter biology and high-throughput DNA sequencing and annotation of high-GC gram-positive bacteria. The complete genomic sequence of Clavibacter michiganensis subsp. sepedonicus is being generated by a whole-genome shotgun strategy, generating 8-10 fold genome coverage from several libraries in pUC and M13. Sequencing reads have been produced using BigDye terminators on ABI 3700 capillary machines. End-sequences from large insert BAC and lambda libraries will also be generated, providing a scaffold to assemble the sequence contigs produced by the assembly of shotgun data. Gap closure will be achieved through a combination of primer walking, PCR, and shotgun strategies. The completed sequence will be analyzed and annotated using in-house tools available at the Sanger Centre. Library preparation and validation are completed. The shotgun phase is in progress, and a database of reads is available for searching on the Sanger Blast Server, or for download from the Sanger FTP site. At present there are 45,927 reads, totaling 23.299 Mb and giving 6.8x coverage, which corresponds to a theoretical coverage of 99.89% of the genome. The overall %GC content is 72.2%, a finding that is typical of this group of organisms.
BLASTX using the current Cms sequences against the complete Mycobacterium tuberculosis proteome of known and predicted proteins yielded a 35% hit rate.

Impacts
There is a zero tolerance world-wide for bacterial ring rot in the seed potato industry. Ensuring that seed is free of the pathogen (Cms) currently depends on positive reactions in a limited number of antigen or DNA-based assays, none of which target DNA sequences important in pathogen virulence. Having the complete genome sequence of Cms can lead to identification of genes required for virulence and provide new targets for pathogen detection.

Publications

  • Ishimaru, C. A., Brown, S. E., and Knudson, D. L. 2002. Linear plasmid in Clavibacter michiganensis subsp. sepedonicus. The International Conference on the Status of Plant, Animal, and Microbe Genome Research, Plant, Animal, and Microbe Genomes X. San Diego, CA. P5.
  • Ishimaru, C. A., Parkhill, J., Francis, D. M., Brown, S. E., and Knudson, D. L. 2002. Complete genome sequence of Clavibacter michiganensis subsp. sepedonicus. The International Conference on the Status of Plant, Animal, and Microbe Genome Research, Plant, Animal, and Microbe Genomes X. San Diego, CA. W211.
  • Brown, S. E., Knudson, D. L., and Ishimaru, C. A. 2002. Linear plasmid in the genome of Clavibacter michiganensis subspecies sepedonicus. J. Bacteriol. 184:2841-2844.


Progress 01/01/01 to 12/31/01

Outputs
Plant pathogenic members of the high-GC gram-positive bacteria are largely unexplored. Despite their agricultural importance, the molecular biology of these bacteria has lagged behind that of the gram-negative bacteria. Most species in the group grow slowly in vitro and are recalcitrant to classical genetic manipulation methods. Clavibacter michiganensis subspecies are major phytopathogens, impacting the health of several economically important monocots and dicots. The objective of this research is to obtain the complete genome sequence of C. michiganensis subsp. sepedonicus (Cms). Cms causes bacterial ring rot of potato, which is considered to be the most important disease of seed potatoes. The ATCC 33113 strain was chosen for sequencing because it is virulent and accessible to the scientific community. The project is being conducted by scientists at Colorado State University, the Sanger Centre, U.K., and Ohio State University, with expertise in Clavibacter biology and high-throughput DNA sequencing and annotation of high-GC gram-positives. In this first stage of the project, a culture of ATCC 33113 was obtained from ATCC. Purified total genomic DNA from ATCC 33113 was produced at Colorado State University and, after it was confirmed to be free of live bacteria, was shipped to the Sanger Centre. The next phase of the project will entail DNA library construction, whole genome shotgun sequencing, and gap closing. When the sequence is complete it will be analyzed and annotated using tools available at the Sanger Centre. Once the sequence of Cms is completed and annotated, it will allow the first comprehensive genomic comparisons between animal and plant pathogens in the Actinomycetales.

Impacts
Having the genome sequence of Cms will enable identification of those genetic sequences that are required for pathogenicity. It will also help identify sequences that enable the pathogen to persist for long periods in a latent state. The expectation is that identifying regulatory sequences and genes that are conserved among Cms and other related plant and animal pathogens will provide new avenues for disease control.

Publications

  • No publications reported this period