Source: MASSACHUSETTS INST OF TECH submitted to NRP
GENOMICS OF THE PATHOGENIC FUNGUS FUSARIUM GRAMINEARUM (GIBBERELLA ZEAE)
Sponsoring Institution
National Institute of Food and Agriculture
Project Status
COMPLETE
Funding Source
Reporting Frequency
Annual
Accession No.
0199492
Grant No.
2002-35600-14701
Cumulative Award Amt.
(N/A)
Proposal No.
2004-00648
Multistate No.
(N/A)
Project Start Date
Nov 1, 2003
Project End Date
Sep 14, 2006
Grant Year
2004
Program Code
[23.2]- (N/A)
Recipient Organization
MASSACHUSETTS INST OF TECH
(N/A)
CAMBRIDGE,MA 02139
Performing Department
(N/A)
Non Technical Summary
The Fusarium fungi are arguably the most important group of fungal plant pathogens, causing a variety of blights, root rots or wilts on nearly every species of economically important plants. Among these, Fusarium graminearum, which causes head blight of wheat and barley, has resulted in the largest economic loss to U.S. agriculture in the last decade. In addition to reducing crop yields, the fungus and the toxins that it produces are hazards to food safety, because the infested grain is harmful to both humans and livestock. The PURPOSE of this project is to determine and make public the DNA sequence that encodes all the genetic instructions, the genome, of this fungus. Providing this DNA sequence, along with a preliminary analysis of it, will allow scientists everywhere to use this information to study how the fungus attacks plants and develop effective methods of treatment. In addition, we will train students in the new areas of bioinformatics and genomic analysis using the Fusarium sequence. The project represents a partnership between the Fusarium research community and the Whitehead Institute/MIT Center for Genome Research (WI-CGR), with the sequencing and automated annotation to be performed at the WI-CGR. Information generated in this project will help the F. graminearum community to better understand this important pathogen and benefit scientists studying other fungi for comparative studies on fungal pathogenesis and development.
Animal Health Component
20%
Research Effort Categories
Basic
70%
Applied
20%
Developmental
10%
Classification

Knowledge Area (KA)    Subject of Investigation (SOI)    Field of Science (FOS)    Percent
204                    1540                              1040                      10%
204                    1540                              1060                      10%
204                    1541                              1102                      10%
204                    4020                              1040                      10%
204                    4020                              1060                      10%
204                    4020                              1102                      10%
212                    1540                              1060                      10%
212                    4020                              1040                      10%
712                    4020                              1040                      10%
712                    4020                              1060                      10%
Goals / Objectives
Fungi in the genus Fusarium are arguably the most important group of fungal plant pathogens, causing a variety of blights, root rots or wilts on nearly every species of economically important plants. Fusarium graminearum, the causal agent of head blight of wheat and barley, has resulted in the largest economic loss to U.S. agriculture in the last decade. The fungus and the mycotoxins that it produces are hazards to food safety, as ingestion of infested grain is harmful to both humans and livestock. Our long-term goal is to use genomic approaches to determine molecular mechanisms regulating infection processes and secondary metabolism in this pathogen. The specific objectives for this proposal are to: (1) produce a whole genome shotgun (WGS) assembly of one strain of F. graminearum with >8X coverage in phred 20 bases from paired-reads of plasmids, BACs, Fosmids and cosmids, (2) integrate the assembly with existing EST, genetic and physical mapping resources, (3) perform automated annotation and other genomic analyses on the assembly, (4) train students in bioinformatics and genomic analysis using the Fusarium sequence, and (5) make these data publicly available through web-based tools and interfaces. The project represents a partnership between the Fusarium research community and the Whitehead Institute/MIT Center for Genome Research (WI-CGR), with the sequencing and automated annotation to be performed at the WI-CGR. Information generated in this project will help the F. graminearum community to better understand this important pathogen and benefit scientists studying other fungi for comparative studies on fungal pathogenesis and development.
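The coverage target in objective (1) can be made concrete with a small calculation. The sketch below is illustrative only: the 36 Mb genome size is taken from the assembly reported under Outputs, while the figure of 600 usable Q20 bases per read is an assumption and is not stated in this report. The function names are hypothetical, and the uncovered-fraction estimate uses the idealized Lander-Waterman model, which ignores cloning bias and repeats.

```python
import math


def phred_to_error_probability(q: float) -> float:
    """Convert a phred quality score to a per-base error probability."""
    return 10 ** (-q / 10)


def reads_for_coverage(genome_size_bp: int, coverage: int,
                       usable_bases_per_read: int) -> int:
    """Number of reads needed to reach the requested average coverage."""
    total_bases = genome_size_bp * coverage
    # Integer ceiling division avoids floating-point rounding.
    return -(-total_bases // usable_bases_per_read)


def expected_uncovered_fraction(coverage: float) -> float:
    """Idealized Lander-Waterman estimate of the fraction of bases with no read."""
    return math.exp(-coverage)


if __name__ == "__main__":
    genome_size = 36_000_000   # approximate assembly size from this report
    bases_per_read = 600       # ASSUMPTION: usable Q20 bases per read
    print(phred_to_error_probability(20))                    # error rate at Q20
    print(reads_for_coverage(genome_size, 8, bases_per_read))
    print(expected_uncovered_fraction(8))
```

Under these assumptions, a phred score of 20 corresponds to a 1% chance that a base call is wrong, and 8X coverage of a 36 Mb genome requires on the order of half a million reads.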
Project Methods
The strain chosen for sequencing by the International Gibberella zeae Genomics Consortium (IGGR) is designated PH-1 (NRRL 31084) and is a member of lineage 7 of Fusarium graminearum (Gibberella zeae). Lineage 7 is the predominant population of the wheat and barley scab fungus found in North America and Europe and is distributed worldwide (O'Donnell et al., 2000). For this project we will generate two new F. graminearum whole-genome shotgun libraries using our standard protocols. The majority of the shotgun sequencing will be accomplished using plasmids containing 4 kb inserts prepared using random fragmentation via the HydroShear (Thorstenson et al., 1998) and size fractionation. In addition, we will generate a new Fosmid library containing 40 kb inserts (Kim et al., 1992) from sheared DNA. The reduced copy number of the Fosmid vector and the use of random shearing rather than restriction digestion improve genomic representation of large insert libraries. Whole genome shotgun sequencing is the method of choice for small genomes such as that of F. graminearum because it is rapid, efficient, and effective. The basic requirements for a successful whole genome shotgun approach are: (i) representative shotgun libraries with narrow distributions of insert sizes (described above), (ii) sufficient depth of sequence coverage provided by paired (forward-reverse) sequence reads, (iii) additional links provided by paired-ends from clones with larger inserts, (iv) software and computing capacity to assemble the sequence, and (v) an independent means to verify the structure and quality of the assembly. The WI-CGR has extensive experience with each of these requirements. The WI-CGR has built extensive tools to monitor sequence quality and accuracy. Sequence quality metrics are reported and reviewed daily by the Production Team and Director of Operations and are presented weekly at meetings of both the Production Teams and Sequencing Executive Committee.
The sequence will be assembled using ARACHNE, a software system we developed for the assembly of whole genome shotgun sequence reads (Batzoglou et al. 2002). ARACHNE was designed for large data sets of whole genome reads and makes use of the paired-read information. ARACHNE has several novel features, including an efficient and sensitive procedure for finding read overlaps, a procedure for scoring overlaps which achieves high accuracy by correcting errors first, read merger based on coincidence of sequence inserts, and repeat contig detection by forward-reverse link inconsistency. Automated annotation of the assembly will be performed using CALHOUN, a comprehensive system for supporting genome annotation and analysis, and will include gene prediction, EST alignment, blastx and blastn searches against nr and other fungal genomes, and identification of tRNAs and repeats.
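The idea of forward-reverse link inconsistency can be illustrated with a toy check. The sketch below is not ARACHNE's algorithm; it only shows the general principle that a read pair placed on a contig should face inward and span roughly the library insert size, and that a high fraction of violating pairs hints at a misassembled or repeat-collapsed region. The `PlacedPair` type, the function names, and the 400-base tolerance are all hypothetical; the 4 kb insert size matches the plasmid library described above.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class PlacedPair:
    """Placement of one paired read on a contig (hypothetical structure)."""
    forward_start: int      # leftmost base covered by the forward read
    reverse_end: int        # rightmost base covered by the reverse read
    forward_on_plus: bool   # forward read aligned to the plus strand
    reverse_on_plus: bool   # reverse read aligned to the plus strand


def is_consistent(pair: PlacedPair, expected_insert: int, tolerance: int) -> bool:
    """True if the reads face each other and span about one insert length."""
    facing_inward = pair.forward_on_plus and not pair.reverse_on_plus
    if not facing_inward:
        return False
    span = pair.reverse_end - pair.forward_start
    return abs(span - expected_insert) <= tolerance


def inconsistent_fraction(pairs: Sequence[PlacedPair],
                          expected_insert: int, tolerance: int) -> float:
    """Fraction of placed pairs that violate orientation or distance."""
    if not pairs:
        return 0.0
    bad = sum(1 for p in pairs
              if not is_consistent(p, expected_insert, tolerance))
    return bad / len(pairs)


if __name__ == "__main__":
    # Toy placements, not real data.
    toy_pairs = [
        PlacedPair(0, 4000, True, False),    # consistent with a 4 kb insert
        PlacedPair(0, 9000, True, False),    # spans far too much sequence
        PlacedPair(100, 4100, True, True),   # both reads on the same strand
    ]
    print(inconsistent_fraction(toy_pairs, expected_insert=4000, tolerance=400))
```

A production assembler would estimate the insert-size distribution from the data itself instead of using a fixed tolerance.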

Progress 11/01/03 to 09/14/06

Outputs
During this project we accomplished the specific objectives set out in the Fusarium graminearum project proposal. We generated >10X sequence coverage of F. graminearum strain PH-1 from plasmid, Fosmid, and BAC clones, and submitted all sequence traces to NCBI. We produced an assembly with 8X average sequence depth, which we released on the Broad website and submitted to NCBI (accession AACM00000000) in 2003. The assembly contains 36.1 Mb of sequence in 511 contigs, which are linked by paired-end reads into 43 scaffolds, with total size 36.5 Mb. The assembly displays extraordinary continuity; 99% of the sequence is found in scaffolds at least 735 kb in size. The sequence is of high quality with 99.4% meeting the Q40 standard for finished sequence. The end-sequenced Fosmids were deposited with the Fungal Genetics Stock Center. We identified >10,000 SNPs between the assembly and 0.5X sequence from a second strain, and released these SNPs on our website in 2006. We integrated the assembly with two genetic maps, one of which derived additional markers from the assembly, and released alignments of the maps to the assembly on our website in 2004. Using one of these maps, we genetically anchored 99.5% of the assembly sequence into one of the four F. graminearum chromosomes. We sequenced >24,000 ESTs from two cDNA libraries generated under either carbon-starved or complete media conditions. These sequences were integrated into our website and released in 2005. We predicted 11,640 protein coding genes using an automated annotation process, and validated the accuracy of these gene calls using the EST sequences. All these data were made publicly available on the Broad website in 2003 and by submitting all sequences to NCBI. The Broad website supports a wide range of resources for public use of the assembly and gene set including search, visualization, Blast, and download of sequence sets. Since its release, the F. graminearum home page has been accessed more than 56,000 times. 
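The continuity figure quoted above (99% of the sequence in scaffolds of at least 735 kb) is a length-weighted statistic of the same family as the widely used N50. A minimal sketch of how such a value is computed follows; the function name is hypothetical and the scaffold lengths are toy numbers, not the actual F. graminearum scaffolds.

```python
from typing import Iterable


def nx_length(lengths: Iterable[int], fraction: float) -> int:
    """Smallest scaffold length L such that scaffolds of length >= L
    together contain at least `fraction` of the total sequence."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one scaffold length is required")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in the interval (0, 1]")
    target = fraction * sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if running >= target:
            return length
    return ordered[-1]  # guards against floating-point shortfall


if __name__ == "__main__":
    toy_scaffolds = [50, 30, 10, 5, 5]      # toy lengths, arbitrary units
    print(nx_length(toy_scaffolds, 0.50))   # N50 of the toy set
    print(nx_length(toy_scaffolds, 0.99))   # N99 of the toy set
```

Applied to the real scaffold lengths with `fraction=0.99`, the same calculation would yield the kind of figure reported here.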
Our analysis of the genome sequence revealed that F. graminearum has a highly unusual eukaryotic genome consisting of ~99.9% single copy sequence, with no active transposable elements and few high identity paralogs. We experimentally demonstrated that the fungus mutates repeat sequences by a process similar to repeat induced point mutation (RIP). We also found that the genetic diversity is concentrated in specific regions of the genome, which are enriched in plant associated genes and genes unique to F. graminearum. These findings have implications for understanding how potential diversity at the population level may impact interactions of the pathogen with its hosts. We also found that the genome encodes a large number of secreted proteins with putative roles in pathogenicity, including predicted cutinases, pectate lyases, and necrosis-inducing peptides. We built a multigene phylogeny for F. graminearum and related sequenced fungi. In this analysis we collaborated with an international group of Fusarium researchers and MIPS. As part of this project, students were trained in bioinformatic and genetic analysis in Dr. Kistler's laboratory.

Impacts
By sequencing and annotating the genome of F. graminearum, we have transformed the ability of the research community to study this fungus. Already, studies of global gene expression have been conducted using the gene predictions from this project. Resources produced by the analysis include sets of genes which will be given high priority for functional study; these include species-specific genes, predicted secreted proteins, and genes found in high-diversity regions of the genome. A complete gene catalogue of F. graminearum also enables genome-wide functional studies, including those aimed at understanding pathogenesis.

Publications

  • No publications reported this period