Recipient Organization
MASSACHUSETTS INST OF TECH
(N/A)
CAMBRIDGE, MA 02139
Performing Department
(N/A)
Non Technical Summary
The Fusarium fungi are arguably the most important group of fungal plant pathogens, causing a variety of blights, root rots, or wilts on nearly every economically important plant species. Among these, Fusarium graminearum, which causes head blight of wheat and barley, has caused the largest economic loss to U.S. agriculture in the last decade. In addition to reducing crop yields, the fungus and the toxins it produces are hazards to food safety, because infested grain is harmful to both humans and livestock. The PURPOSE of this project is to determine and make public the DNA sequence that encodes all of the genetic instructions, the genome, of this fungus. Providing this DNA sequence, along with a preliminary analysis of it, will allow scientists everywhere to study how the fungus attacks plants and to develop effective methods of treatment. In addition, we will train students in the new areas of bioinformatics and genomic analysis using the Fusarium sequence. The project represents a partnership between the Fusarium research community and the Whitehead Institute/MIT Center for Genome Research (WI-CGR), with the sequencing and automated annotation to be performed at the WI-CGR. Information generated in this project will help the F. graminearum community better understand this important pathogen and will benefit scientists studying other fungi in comparative studies of fungal pathogenesis and development.
Animal Health Component
20%
Research Effort Categories
Basic
70%
Applied
20%
Developmental
10%
Goals / Objectives
Fungi in the genus Fusarium are arguably the most important group of fungal plant pathogens, causing a variety of blights, root rots, or wilts on nearly every economically important plant species. Fusarium graminearum, the causal agent of head blight of wheat and barley, has caused the largest economic loss to U.S. agriculture in the last decade. The fungus and the mycotoxins that it produces are hazards to food safety, as ingestion of infested grain is harmful to both humans and livestock. Our long-term goal is to use genomic approaches to determine the molecular mechanisms regulating infection processes and secondary metabolism in this pathogen. The specific objectives for this proposal are to: (1) produce a whole genome shotgun (WGS) assembly of one strain of F. graminearum with >8X coverage in phred 20 bases from paired reads of plasmids, BACs, Fosmids and cosmids, (2) integrate the assembly with existing EST, genetic and physical mapping resources, (3) perform automated annotation and other genomic analyses on the assembly, (4) train students in bioinformatics and genomic analysis using the Fusarium sequence, and (5) make these data publicly available through web-based tools and interfaces. The project represents a partnership between the Fusarium research community and the Whitehead Institute/MIT Center for Genome Research (WI-CGR), with the sequencing and automated annotation to be performed at the WI-CGR. Information generated in this project will help the F. graminearum community better understand this important pathogen and will benefit scientists studying other fungi in comparative studies of fungal pathogenesis and development.
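As an illustration of objective (1), the sketch below shows one way the ">8X coverage in phred 20 bases" target could be checked arithmetically: count the bases whose phred quality is at least 20 and divide by the genome size. The function names, the toy quality values, and the genome size used in the example are assumptions made for illustration; they are not taken from the project or from WI-CGR software.

```python
import math


def q20_coverage(read_qualities, genome_size_bp, min_q=20):
    """Return fold coverage counting only bases with phred quality >= min_q.

    read_qualities: iterable of per-read lists of integer phred scores.
    genome_size_bp: assumed genome size in base pairs (hypothetical input).
    """
    if genome_size_bp <= 0:
        raise ValueError("genome_size_bp must be positive")
    q20_bases = sum(1 for quals in read_qualities for q in quals if q >= min_q)
    return q20_bases / genome_size_bp


def reads_needed(target_coverage, genome_size_bp, mean_q20_bases_per_read):
    """Estimate how many reads reach the target coverage in Q20 bases."""
    if mean_q20_bases_per_read <= 0:
        raise ValueError("mean_q20_bases_per_read must be positive")
    return math.ceil(target_coverage * genome_size_bp / mean_q20_bases_per_read)


if __name__ == "__main__":
    # Toy data: two short reads against a 2 bp "genome" (illustrative only).
    toy_reads = [[30, 30, 10, 25], [20, 19, 40, 40]]
    print(q20_coverage(toy_reads, genome_size_bp=2))
    # Reads needed for 8X over a hypothetical 1,000 bp target
    # when each read contributes 500 Q20 bases on average.
    print(reads_needed(8, 1000, 500))
```

In practice the genome size is itself an estimate before assembly, so a calculation like this would give a planning figure rather than an exact read count.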
Project Methods
The strain chosen for sequencing by the International Gibberella zeae Genomics Consortium (IGGR) is designated PH-1 (NRRL 31084) and is a member of lineage 7 of Fusarium graminearum (Gibberella zeae). Lineage 7 is the predominant population of the wheat and barley scab fungus found in North America and Europe and is distributed worldwide (O'Donnell et al., 2000). For this project we will generate two new F. graminearum whole-genome shotgun libraries using our standard protocols. The majority of the shotgun sequencing will be accomplished using plasmids containing 4 kb inserts prepared by random fragmentation with the HydroShear (Thorstenson et al., 1998) followed by size fractionation. In addition, we will generate a new Fosmid library containing 40 kb inserts (Kim et al., 1992) from sheared DNA. The reduced copy number of the Fosmid vector and the use of random shearing rather than restriction digestion improve the genomic representation of large-insert libraries. Whole genome shotgun sequencing is the method of choice for small genomes such as that of F. graminearum because it is rapid, efficient, and effective. The basic requirements for a successful whole genome shotgun approach are: (i) representative shotgun libraries with narrow distributions of insert sizes (described above), (ii) sufficient depth of sequence coverage provided by paired (forward-reverse) sequence reads, (iii) additional links provided by paired ends from clones with larger inserts, (iv) software and computing capacity to assemble the sequence, and (v) an independent means to verify the structure and quality of the assembly. The WI-CGR has extensive experience with each of these topics. The WI-CGR has built extensive tools to monitor sequence quality and accuracy. Sequence quality metrics are reported and reviewed daily by the Production Team and Director of Operations and are presented weekly at meetings of both the Production Teams and Sequencing Executive Committee. The sequence will be assembled using ARACHNE, a software system we developed for the assembly of whole genome shotgun sequence reads (Batzoglou et al., 2002). ARACHNE was designed for large data sets of whole genome reads and makes use of the paired-read information. ARACHNE has several novel features, including an efficient and sensitive procedure for finding read overlaps, a procedure for scoring overlaps that achieves high accuracy by correcting errors first, read merger based on coincidence of sequence inserts, and repeat contig detection by forward-reverse link inconsistency. Automated annotation of the assembly will be performed using CALHOUN, a comprehensive system for supporting genome annotation and analysis, and will include gene prediction, EST alignment, blastx and blastn searches against nr and other fungal genomes, and identification of tRNAs and repeats.
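The idea of forward-reverse link consistency mentioned above can be illustrated with a minimal sketch. This is not ARACHNE's implementation; it only shows, under simplifying assumptions, how a pair of reads from one clone can be tested against the expected insert size and orientation once both reads are placed on the same contig. The `Placement` type, the function names, and the tolerance values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    """Position of one read on a contig (hypothetical representation)."""
    start: int     # leftmost coordinate, 0-based
    end: int       # rightmost coordinate, exclusive
    forward: bool  # True if the read aligns to the contig's plus strand


def link_is_consistent(a, b, expected_insert, tolerance):
    """Check that two reads from one clone face each other at a plausible distance."""
    left, right = (a, b) if a.start <= b.start else (b, a)
    # The left read must point right and the right read must point left.
    if not (left.forward and not right.forward):
        return False
    span = right.end - left.start
    return abs(span - expected_insert) <= tolerance


def inconsistent_fraction(pairs, expected_insert, tolerance):
    """Fraction of read pairs whose placement contradicts the library insert size."""
    if not pairs:
        return 0.0
    bad = sum(
        1 for a, b in pairs
        if not link_is_consistent(a, b, expected_insert, tolerance)
    )
    return bad / len(pairs)


if __name__ == "__main__":
    # Toy pairs from a 4 kb plasmid library with an assumed tolerance of 400 bp.
    good = (Placement(0, 500, True), Placement(3500, 4000, False))
    same_strand = (Placement(0, 500, True), Placement(3500, 4000, True))
    too_far = (Placement(0, 500, True), Placement(9500, 10000, False))
    print(inconsistent_fraction([good, same_strand, too_far], 4000, 400))
```

A contig on which many pairs fail such a check is a candidate for a collapsed repeat or a misassembly, which is the intuition behind using link inconsistency as a signal; the threshold for "many" is left open here.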