Source: UNIVERSITY OF KENTUCKY submitted to
A GENOME SEQUENCE FOR THE MODEL HEMIBIOTROPH COLLETOTRICHUM GRAMINICOLA
Sponsoring Institution
National Institute of Food and Agriculture
Project Status
TERMINATED
Funding Source
Reporting Frequency
Annual
Accession No.
0209151
Grant No.
2007-35600-17829
Project No.
KY0VAILLANCOURT
Proposal No.
2006-04989
Multistate No.
(N/A)
Program Code
23.2
Project Start Date
Dec 15, 2006
Project End Date
Dec 14, 2010
Grant Year
2007
Project Director
Vaillancourt, L.
Recipient Organization
UNIVERSITY OF KENTUCKY
500 S LIMESTONE 109 KINKEAD HALL
LEXINGTON,KY 40526-0001
Performing Department
PLANT PATHOLOGY
Non Technical Summary
Fungal diseases cause enormous losses to agriculture worldwide each year, and are major barriers to food sustainability in the developing world. Elucidating the molecular mechanisms that regulate interactions between plants and fungal pathogens will be critical for future disease management. Complete genome sequences of pathogens and their hosts have become a key part of this research effort. The goal of this project is to release to the public a complete genome sequence for the model hemibiotrophic plant-pathogenic fungus Colletotrichum graminicola.
Animal Health Component
(N/A)
Research Effort Categories
Basic
100%
Applied
(N/A)
Developmental
(N/A)
Classification

Knowledge Area (KA)    Subject of Investigation (SOI)    Field of Science (FOS)    Percent
212                    4020                              1080                      100%
Knowledge Area
212 - Pathogens and Nematodes Affecting Plants;

Subject Of Investigation
4020 - Fungi;

Field Of Science
1080 - Genetics;
Goals / Objectives
The objective of the proposed research is to produce and release a complete genome sequence (8X coverage) of the hemibiotrophic plant-pathogenic fungus Colletotrichum graminicola, together with a high quality assembly and annotation. Other objectives are to develop tools for comparative bioinformatics, focused on understanding the functional and evolutionary relationships among different fungal lifestyles, and to integrate all of the research with an established outreach program with a proven track record in increasing the exposure of rural Kentucky and Appalachian high school students to modern biological sciences.
Project Methods
Genome sequencing, assembly, and annotation will be conducted primarily by the Broad Institute in Massachusetts. The Broad has a proven record of accomplishment relevant to the accurate and timely release of fungal genome data, having already published fifteen fungal genomes. The Broad will use their assembly program Arachne and their whole genome annotation program Calhoun to finish the genome. An EST project, consisting of 15,000 sequences, will be used to train Calhoun for annotation. An optical map will be used to validate the genome assembly. The project will be coordinated by the PI and another Colletotrichum researcher, who together have more than 30 years of experience, and more than 100 publications, related to this genus. A database will be developed and housed at Texas A&M University, which will facilitate genomic comparisons of various fungal lifestyles including symbiosis, heterobiotrophy, and necrotrophy.

Progress 12/15/06 to 12/14/10

Outputs
OUTPUTS: We sequenced the genome of Colletotrichum graminicola strain M1.001 via a combination of Sanger and 454 protocols and achieved greater than 8X high quality genome coverage. The assembly covers 51.6 Mb, arranged in 653 scaffolds of 1151 contigs. The assembly has been validated with an optical map, which demonstrated the presence of 13 chromosomes, including 3 "mini-chromosomes" of less than 2 Mb in size. The genome was annotated with the aid of paired-end sequence from 10,000 cDNA clones, and it is predicted to contain approximately 12,000 genes. All the information is currently available on the Broad database, together with reads from a second strain of C. graminicola that was isolated from corn in Brazil. These sequence data were donated by Dupont Nemours Co. In addition, genome data for a second species of Colletotrichum, C. higginsianum, will appear on the same Broad website, which will therefore become a Colletotrichum comparative genome site. C. higginsianum is a pathogen of the model plant Arabidopsis and was sequenced independently by the Max Planck Institute in Cologne, Germany. The PI of that project agreed to have the Broad site host his data, and the Broad Institute agreed to do so. This will be a tremendous resource for the Colletotrichum community since C. graminicola can be considered the model for graminaceous Colletotrichum pathogens, while C. higginsianum can be considered the model for dicotyledonous Colletotrichum pathogens. Having both species on the same website and accessible using the same Broad tools and format will facilitate comparative analyses. Analysis of the genomes will continue, with a plan to develop a joint publication sometime in the next year.
PARTICIPANTS: Graduate Students: Maria Torres, Ester Buiate. Postdoctoral Scholars: Stefan Amyotte. Technical Support Staff: Audrey Law. Collaborators: Richard O'Connell, Max Planck Plant Breeding Institute, Cologne, Germany; Christopher Schardl, University of Kentucky; Mark Farman, University of Kentucky; Li Jun Ma, Broad Institute; Michael Thon, University of Salamanca, Spain; Martin Dickman, Texas A&M University.
TARGET AUDIENCES: Scientific Community, Agricultural Biotechnology Industry.
PROJECT MODIFICATIONS: Nothing significant to report during this reporting period.

Impacts
Fungi cause the vast majority of plant diseases and result in enormous losses to agriculture worldwide every year. There is a continual need to develop safer, more effective ways to combat plant disease. Elucidating molecular mechanisms that regulate interactions between plants and fungi will be critical for future disease management. Complete genome sequences of pathogens and of their hosts have become a key part of this research effort. C. graminicola is a hemibiotrophic plant pathogen, which has aspects of both biotrophy and necrotrophy. Comparative studies of the genome of the model hemibiotroph C. graminicola with the genomes of biotrophic and necrotrophic pathogens may be very useful for elucidating functional and evolutionary relationships among these different pathogenicity types. Our goal for this project was to produce a high quality draft genome sequence of C. graminicola strain M1.001 and release it to the community on the Broad website. This represents the first publicly available genome sequence for a member of this genus.

Publications

  • No publications reported this period


Progress 12/15/08 to 12/14/09

Outputs
OUTPUTS: Fungi cause the vast majority of plant diseases and result in enormous losses to agriculture worldwide every year. There is a continual need to develop safer, more effective ways to combat plant disease. Elucidating molecular mechanisms that regulate interactions between plants and fungi will be critical for future disease management. Complete genome sequences of pathogens and of their hosts have become a key part of this research effort. C. graminicola is a hemibiotrophic plant pathogen, which has aspects of both biotrophy and necrotrophy. Comparative studies of the genome of the model hemibiotroph C. graminicola with the genomes of biotrophic and necrotrophic pathogens may be very useful for elucidating functional and evolutionary relationships among these different pathogenicity types. Our goal for this project is to produce a high quality draft genome sequence of C. graminicola strain M1.001 and release it to the community on the Broad website. This would represent the first genome sequence for a member of this genus.
PARTICIPANTS: P.I.: Lisa Vaillancourt. Collaborators: Lijun Ma, Martin Dickman, Mike Thon.
TARGET AUDIENCES: Research community, industry scientists.
PROJECT MODIFICATIONS: Nothing significant to report during this reporting period.

Impacts
We have completed sequencing via a combination of Sanger and 454 protocols and have achieved approximately 8X genome coverage. The assembly covers 50.8 Mb, arranged in 653 scaffolds of 1151 contigs. The assembly has been validated with an optical map, which demonstrated the presence of 13 chromosomes, including 3 "B-chromosomes" of less than 2 Mb in size. The assembly is currently available on the Broad database, together with reads from a second strain of C. graminicola isolated from corn in Brazil. The latter sequence data were donated to our project by Dupont Nemours Co. The genome annotation is now underway, using a collection of ESTs generated by Sanger reads from both ends of 5000 clones from a normalized cDNA library generated from mycelium growing in complete medium, and 5000 clones from a non-normalized library generated from mycelium growing under nitrogen-starved conditions.

Publications

  • No publications reported this period


Progress 12/15/07 to 12/14/08

Outputs
OUTPUTS: Project goals that have now been completed include the following. An optical map was produced at the University of Wisconsin. Results indicated an unexpectedly large genome size of 57.44 Mb, consisting of 13 contigs (chromosomes) ranging in size from 0.5 Mb to 7.4 Mb. A genomic DNA prep was produced, and three genomic libraries (2 plasmid and 1 fosmid) were produced. Sanger sequencing to 7.5X coverage (based on the revised genome size, above) has been completed, and all data have been deposited in the NCBI trace archives. A preliminary assembly was produced, consisting of 7.1X assembly Q20 coverage, with total contig length of about 48 Mb, 1855 contigs and 297 supercontigs. Since it appeared that we had some cloning bias in our libraries, the data were supplemented by nearly 11X coverage of 454 sequencing. A new assembly incorporating these new data is now completed, consisting of a total contig length of about 50 Mb, 8.74X Q20 coverage, 9.6X assembly base coverage, 761 contigs with N50 length 256,950 bp, and 516 supercontigs with N50 length 545,921 bp. A comparative bioinformatics tool, the Fungal Protein Cluster Database (FPC-DB), has been developed and a beta version is currently being tested. The outreach portion of the project has been successfully completed, in which a bioinformatics unit was developed and delivered to Appalachian high school students. Outcomes analysis focused on high school students is underway, but other completed analyses suggest that the unit results in significant improvements in the understanding of bioinformatics concepts and applications by Appalachian high school teachers.
PARTICIPANTS: Training was provided to one graduate student at Texas A&M who graduated with an MS in 2009. Training was also provided to a graduate student participating in the outreach component of this project, and to 12 Appalachian high school teachers who attended a summer training session for the bioinformatics module.
TARGET AUDIENCES: The PI and Co-PIs provided genome updates to the community at one meeting in 2008 and two in 2009. Updates are also being provided on the Colletotrichum.org community website.
PROJECT MODIFICATIONS: A one-year extension was requested because of the unforeseen complexity of the genome, which was both larger and more difficult to assemble than expected. 454 sequencing was added to the project to address clone bias and improve the genome assembly.
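
For readers unfamiliar with the N50 statistic quoted for the contigs and supercontigs above, the following is a minimal sketch of how it is commonly computed: sort the sequence lengths from longest to shortest and report the length at which the running total first reaches half of the total assembly length. The function name `n50` and the toy lengths are illustrative assumptions; they are not the project's data and this is not the Broad's ARACHNE code.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp).

    N50 is the length L such that sequences of length >= L together
    account for at least half of the total assembled length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy contig lengths, for illustration only.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # total is 300; 80 + 70 reaches 150, so N50 is 70
```

A larger N50 for the same total length indicates that the assembly is made of fewer, longer pieces, which is why the statistic is reported alongside the contig and supercontig counts.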

Impacts
Fungi cause the vast majority of plant diseases and result in enormous losses to agriculture worldwide every year. There is a continual need to develop safer, more effective ways to combat plant disease. Elucidating molecular mechanisms that regulate interactions between plants and fungi will be critical for future disease management. Complete genome sequences of pathogens and of their hosts have become a key part of this research effort.

Publications

  • No publications reported this period


Progress 12/15/06 to 12/14/07

Outputs
OUTPUTS: Fungi cause the vast majority of plant diseases and result in enormous losses to agriculture worldwide every year. There is a continual need to develop safer, more effective ways to combat plant disease. Elucidating molecular mechanisms that regulate interactions between plants and fungi will be critical for future disease management. Complete genome sequences of pathogens and of their hosts have become a key part of this research effort. C. graminicola is a hemibiotrophic plant pathogen, which has aspects of both biotrophy and necrotrophy. Comparative studies of the genome of the model hemibiotroph C. graminicola with the genomes of biotrophic and necrotrophic pathogens may be very useful for elucidating functional and evolutionary relationships among these different pathogenicity types. The goal will be a High Quality Draft assembly using the whole genome shotgun approach. The degree of coverage will be 8X. The assembly will be done using the ARACHNE program and the annotation with the Calhoun program at Broad. The assembly will be independently validated by using an optical map, and the annotation will be aided by an EST project consisting of paired reads of 15,000 cDNA clones. Data will be released to the public on the NCBI website and on the website of the Broad Institute.
PARTICIPANTS: Lisa Vaillancourt (PI); Martin Dickman, Michael Thon, Jeffrey Osborn, Lijun Ma (Co-PIs); Jaehee Jung (graduate student).
TARGET AUDIENCES: Scientific Community

Impacts
To date we have completed the optical map of C. graminicola, which demonstrated that the genome was considerably larger than we had anticipated (approximately 57 Mb). We have completed sequencing to 8X coverage of this larger genome size, and will soon begin the process of assembly. Most of the sequence has already been released to the public on the NCBI trace repository.
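
The relationship between genome size and sequencing coverage mentioned above can be illustrated with a short sketch. It assumes the usual definition of coverage as total sequenced bases divided by genome size; the function names are hypothetical, and the 57 Mb and 8X figures are the approximate values quoted in this report, not exact read totals.

```python
def coverage_from_bases(total_read_bases, genome_size_bp):
    """Average fold coverage: total sequenced bases divided by genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome_size_bp must be positive")
    return total_read_bases / genome_size_bp


def read_bases_for_coverage(genome_size_bp, target_coverage):
    """Total read bases needed to reach a target fold coverage."""
    return genome_size_bp * target_coverage


# Approximate figures from the report: a ~57 Mb genome sequenced to 8X.
genome_size = 57_000_000
needed = read_bases_for_coverage(genome_size, 8)
print(needed)                                    # 456000000 bases
print(coverage_from_bases(needed, genome_size))  # 8.0
```

Because coverage scales inversely with genome size, a genome that turns out larger than anticipated requires proportionally more sequence to reach the same fold coverage.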

Publications

  • No publications reported this period