Source: J. CRAIG VENTER INSTITUTE submitted to NRP
GENOME SEQUENCE OF THE PARASITIC CILIATE, ICHTHYOPHTHIRIUS MULTIFILIIS
Sponsoring Institution
National Institute of Food and Agriculture
Project Status
COMPLETE
Funding Source
Reporting Frequency
Annual
Accession No.
0211573
Grant No.
2007-35600-18539
Cumulative Award Amt.
(N/A)
Proposal No.
2007-04743
Multistate No.
(N/A)
Project Start Date
Sep 15, 2007
Project End Date
Sep 14, 2010
Grant Year
2007
Program Code
[51.0]- Microbial Genomics
Recipient Organization
J. CRAIG VENTER INSTITUTE
9704 MEDICAL CENTER DRIVE
ROCKVILLE,MD 20850
Performing Department
(N/A)
Non Technical Summary
As the cause of "white-spot" disease in freshwater fish, Ichthyophthirius multifiliis, or Ich, affects a wide range of freshwater fish species and is a major pest within the aquaculture industry in this country and abroad. To develop more effective preventive and treatment options to combat Ich infection, the sequence of the entire genome of this organism will be determined and the genes that encode all its proteins identified. Because Ich is related to other organisms used as models in basic research, comparing their genomes will offer much insight into biological processes of wide interest.
Animal Health Component
(N/A)
Research Effort Categories
Basic
100%
Applied
(N/A)
Developmental
(N/A)
Classification

Knowledge Area (KA) | Subject of Investigation (SOI) | Field of Science (FOS) | Percent
313 | 4050 | 1040 | 100%
Knowledge Area
313 - Internal Parasites in Animals;

Subject Of Investigation
4050 - Protozoa;

Field Of Science
1040 - Molecular biology;
Goals / Objectives
The goals of this project are to obtain a high quality draft genome sequence of the parasitic ciliate Ichthyophthirius multifiliis, perform auto-closure to close as many of the gaps in the draft genome sequence as possible, and perform auto-annotation to identify protein-coding and non-protein-coding genes in the genome. The results will be made freely available to interested researchers through submission to public databases.
Project Methods
Macronuclear DNA will be purified from a cloned culture of parasites, grown on fish. Plasmid shotgun genomic DNA libraries with a variety of insert sizes will be constructed and end-sequenced by standard Sanger sequencing methods. In addition, the pyrosequencing method developed by 454 Life Sciences will be used to generate high sequence coverage. The two types of data will be integrated to generate a high quality draft assembly. Two rounds of an AutoClosure pipeline will be conducted to identify regions of low or no coverage or low quality sequence and correct them by additional targeted sequencing. Based on experience with the related ciliate, Tetrahymena thermophila, automated gene annotation will be performed with ab initio gene finders, trained with Expressed Sequence Tag (EST) data from the previous Ichthyophthirius multifiliis EST project.

Progress 09/15/07 to 09/14/10

Outputs
OUTPUTS: The objectives of this project were to obtain, analyze and disseminate information about the macronuclear (MAC; somatic) genome of the fish parasite Ichthyophthirius multifiliis, or Ich. Over the course of the project, we have sequenced and assembled the Ich genome as well as that of a previously unidentified intracytoplasmic bacterium, and identified and modeled the protein-coding genes and those for principal non-coding RNAs. Major protein-coding gene families have been analyzed in detail, including kinases, membrane transporters, proteases and immobilization antigens. The annotated genome sequence assemblies are in the final steps of submission to the NCBI genome database. They have already been released on a freely accessible FTP site at the J. Craig Venter Institute. Within a few months, the genomic data will also be loaded onto the Tetrahymena Genome Database website, allowing the full use of standard genome database features, such as a genome browser, BLAST server and editable web pages for each predicted gene. The predicted protein-coding genes were also used to map orthology relationships between Ich and other species in the OrthoMCL database and to predict Ich's metabolic pathways in order to identify potential targets for therapeutic intervention against Ich infection. Preliminary results were presented by the Principal Investigator at the FASEB Ciliate Molecular Biology conference in July 2009. A full report will be presented at the same conference in 2011. The PI also presented seminars on this project at three locations in France: the Ecole Normale Superieure in Paris, the Centre Genetique Moleculaire in Gif-sur-Yvette and at a conference on Paramecium genomics in Roscoff. PARTICIPANTS: The PI and co-PIs have coordinated the project, presented the results at seminars and prepared publications. A number of informatics engineers at the JCVI have been involved in genome assembly and annotation. 
Two summer interns (one undergraduate and one Master's level) have assisted in the project and gone on to graduate study. For analysis of the genome, we have collaborated with several groups having expertise in particular gene families or types of genomic analysis. TARGET AUDIENCES: The principal target for dissemination in the form of seminars has so far been the ciliate research community. Publications and genomic resources have been and will be accessible to any interested researcher. PROJECT MODIFICATIONS: Nothing significant to report during this reporting period.

Impacts
Prior to this project, the only genomic knowledge of Ich came from limited EST studies. This project has provided a set of over 8000 complete gene models, from which we can draw conclusions about the metabolic capabilities of Ich, its range of antigenic variation and its regulatory functions. The data will be invaluable in finding targets for drugs that may specifically inhibit Ich growth without accompanying side effects on its vertebrate host. Several potential drug targets are identified in the publication of our results. The genome sequence has also provided a complete catalog of coding sequences for the major target of immune reactivity, the immobilization antigens, as well as many other predicted membrane proteins that may be used to design more effective vaccines. Because Ich is closely related to the eukaryotic model organisms Tetrahymena and Paramecium, the results have also been used to understand the transition from a free-living bacterivorous species to a parasitic one. Because the data are freely available on the NCBI and Tetrahymena Genome Database web sites, any interested investigator will have access to these extensive genomic resources. The impact of this project will therefore be a major and long-lasting facilitation of research on all aspects of Ich biology.

Publications

  • 1. Sun HY, Noe J, Barber J, Coyne RS, Cassidy-Hanley D, Clark TG, Findly RC, Dickerson HW. Endosymbiotic bacteria in the parasitic ciliate Ichthyophthirius multifiliis. Appl Environ Microbiol. 2009 Dec;75(23):7445-52.
  • 2. Coyne RS et al. 2011 Comparative genomics of the pathogenic ciliate Ichthyophthirius multifiliis, its free-living relatives and a host species: insights into adoption of a parasitic lifestyle and prospects for disease control. Submitted to Genome Biology, in review.
  • 3. Michael Lynch, Thomas Doak, Casey L. McGrath, Samuel Miller, B. Franz Lang, Henner Brinkmann, Daniel Brami, Jessica B. Hostetler, Vinita S. Joardar, Diana Radune, Prasanna Kolli, and Robert S. Coyne. 2011. The Genome of a Rickettsiales Bacterium Inhabiting the Parasitic Ciliate Ichthyophthirius multifiliis. Manuscript in preparation.


Progress 09/15/08 to 09/14/09

Outputs
OUTPUTS: Whole genome shotgun sequencing, assembly and partial closure were performed on the ciliate Ichthyophthirius multifiliis and its associated endosymbiotic bacteria. The ciliate genome is approximately 50 Mb in size. An optical map of the ciliate genome was also constructed, revealing 69 complete chromosomes and 4 incomplete chromosomes. The amplified ribosomal DNA gene, which encodes the 24S and 16S rRNAs and is located on a separate chromosome, was assembled separately. One complete circular endosymbiotic genome was constructed. This genome shows the symbiont to be a member of the Rickettsiales. Both the ciliate and symbiont genomes are available for download according to the instructions given here: http://www.jcvi.org/cms/research/projects/ich/overview/. In addition to genome assembly, deep sequencing of a normalized cDNA library was performed using the Illumina platform to assist with structural gene annotation. Automated and manual gene annotation of both genomes is in progress. PARTICIPANTS: Principal Investigator Robert Coyne coordinated efforts on the project and is organizing plans for analysis of the final genome annotation. Co-PIs Ted Clark and Donna Cassidy-Hanley provided materials for genome and cDNA analysis and provided their expert knowledge of the organism in setting an agenda for the focus of our analysis. At the JCVI, the principal scientists and bioinformaticians involved in genome assembly and closure have been Jessica Hostetler, Justin Johnson, and Daniel Brami. Linda Hannick has been responsible for Ich genome annotation. Vinita Joardar has been responsible for endosymbiont annotation. Collaborators have included Harry Dickerson and members of his lab at the University of Georgia and Michael Lynch and members of his lab at Indiana University, who have been primarily interested in analysis of the endosymbionts discovered in Ich. One publication has resulted from these collaborations and a second is in preparation. 
As for training, a summer intern from Carnegie Mellon University, Irtisha Singh, worked on Ich annotation projects in 2009. She is now applying for admission to PhD programs in computational biology. TARGET AUDIENCES: Progress on the project was reported by the PI at the biennial FASEB Ciliate Molecular Biology Conference in July of 2009. The genome assemblies and preliminary annotations have been made freely available on the JCVI website, as announced by a message to the ciliate community listserv. The first publication on Ich endosymbionts was selected as the cover article for that issue of Applied and Environmental Microbiology and was accompanied by a press release that was picked up by several specialized media outlets. PROJECT MODIFICATIONS: As reported in the previous annual report, the discovery of endosymbionts in the Ich cytoplasm led to changes in our plans for genome assembly, but these did not seriously affect the quality and completeness of the Ich genome assembly and provided a nearly complete symbiont assembly. Addition of the optical map provided a much more complete framework for the draft genome assembly and a guide for future closure efforts. Addition of deep cDNA sequencing provided much greater coverage of the transcriptome than previously available, leading to more accurate gene modeling and the discovery of hundreds of otherwise undetectable genes.

Impacts
From the assembled genome sequence and optical map, we were able to confirm that the macronuclear genome size of Ich is approximately 50 Mb and also learned that it most likely contains 72 macronuclear chromosomes. Gene finding and cDNA sequencing revealed the presence of approximately 8300 protein-coding genes. Multiple copies of genes encoding the immobilization antigens, the main target of protective immunity against this parasite, were identified, located in tandem arrays. This discovery will facilitate the identification of parasite serotypes and the production of vaccines. The genome shows a considerable reduction in gene number from the most closely related non-parasitic relatives, Tetrahymena thermophila and Paramecium tetraurelia. When genome annotation is completed within the next few weeks, we will analyze the gene content to reveal potential targets for treatment of parasite infections. We also discovered a previously unknown bacterial endosymbiont within the Ich cytoplasm with a genome of 1.3 Mb and approximately 1300 protein-coding genes. There is evidence for one or two additional symbionts within Ich, but assembly of their genomes was not successful. It is possible that knowledge of the symbionts may lead to novel methods for controlling Ich infection of fish.

Publications

  • Sun HY, Noe J, Barber J, Coyne RS, Cassidy-Hanley D, Clark TG, Findly RC, Dickerson HW. Endosymbiotic bacteria in the parasitic ciliate Ichthyophthirius multifiliis. Appl Environ Microbiol. 2009 Dec;75(23):7445-52.


Progress 09/15/07 to 09/14/08

Outputs
OUTPUTS: The first goal of this project is to perform shotgun sequencing of the macronuclear genome of the parasitic ciliate Ichthyophthirius multifiliis, or Ich. To minimize contamination by DNA sequence from the germline micronuclear genome, genomic DNA was prepared from the parasite at the trophont stage of its life cycle, during which the macronuclear DNA is greatly amplified relative to micronuclear DNA. Because trophonts are attached to the parasitic host fish, they must be manually isolated and washed, which limits the yield of DNA and results in minor contamination from fish DNA. The first sequence results, however, revealed another, unexpected source of contamination. A substantial portion of sequence reads exhibited much higher than expected GC content (Ich is very AT-rich, about 85%) and similarity to sequences of bacterial origin. For a variety of reasons, we suspected the source of this bacterial DNA was one or more intracellular endosymbionts. This has been confirmed by fluorescence in situ hybridization. Possibly three different symbionts have been detected. Attempts to isolate macronuclei free of bacterial contamination were unsuccessful. To avoid further delay and because the sequence of the symbiont(s) would be of interest, we decided to proceed with shotgun sequencing of the mixed sample and separate the reads computationally. This has presented a challenge, but significant progress has been made toward assembly. We have pursued a hybrid Sanger/454 shotgun sequencing strategy. We constructed plasmid libraries for Sanger end-sequencing with inserts in the 2-4 and 4-6 kb size ranges. We also had available a larger insert library, made by Lucigen Corp., with inserts in the 5-9 kb range. The larger insert libraries were much more heavily contaminated with GC-rich bacterial DNA, presumably due to cloning bias against the AT-rich Ich DNA. Therefore, the great majority of Sanger sequencing was performed on the 2-4 kb library. 
Over 297,000 successful reads were obtained. We have also completed five full runs of 454 fragment library sequencing on the FLX instrument. Because the assembly would benefit from still higher sequence coverage, we are now preparing to perform a run using the new 454 Titanium platform, which, because of improvements in read length and number, may produce as much new sequence information as all 5 FLX runs. We have also begun the process of mapping existing EST sequences to the Ich genome assembly and training gene finding algorithms. About 33,000 ESTs in public databases are identified as being of Ich origin. On closer examination, some of these sequences appear to be of bacterial origin. However, many of the ESTs align well to the Ich assembly and we are beginning to be able to model gene structures. As expected, the most closely related genes in sequence databases to the Ich predictions are from Tetrahymena thermophila, a related ciliate with a fully sequenced and annotated genome. PARTICIPANTS: The principal investigator, Robert Coyne, has been responsible for coordination of the project. Co-PIs Ted Clark and Donna Cassidy-Hanley at Cornell University have provided DNA samples for sequencing and consulted with the PI on strategy and interpretation of data. An undergraduate research intern, Cory Smith, worked on the project during a 12-week session in the summer of 2008. He focused on symbiont assembly and closing of assembly gaps. TARGET AUDIENCES: The primary target audience for the data generated by this project is researchers studying Ich and searching for more effective means of controlling Ich infection. Ultimately, this research will lead to improved productivity for aquaculturists with fewer losses due to parasite outbreaks. The project also benefits students studying in the Clark laboratory at Cornell University or in the summer internship program at the J. Craig Venter Institute. 
PROJECT MODIFICATIONS: Nothing significant to report during this reporting period.
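
The Outputs above note that contaminating reads stood out by their higher GC content (Ich being about 85% AT) and that the mixed reads were to be separated computationally. The report does not describe the procedure actually used, so the following is only a minimal sketch of one plausible first-pass filter, binning reads by GC fraction. The function names, the example reads and the 0.30 cutoff are all hypothetical, and a real pipeline would presumably also use the sequence similarity evidence mentioned above.

```python
def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def bin_reads_by_gc(reads, threshold=0.30):
    """Split (name, sequence) pairs into AT-rich and GC-richer bins.

    `threshold` is a hypothetical cutoff chosen for illustration only;
    it is not a value taken from the report.
    """
    putative_ich, putative_other = [], []
    for name, seq in reads:
        if gc_fraction(seq) < threshold:
            putative_ich.append(name)      # AT-rich, consistent with Ich
        else:
            putative_other.append(name)    # GC-richer, possibly bacterial or fish
    return putative_ich, putative_other


# Made-up example reads: one AT-rich, one GC-rich.
example_reads = [
    ("read_1", "ATATATTTAAATATTAAAGA"),
    ("read_2", "GCGCGGCCATGCGGCCGCTA"),
]
ich_names, other_names = bin_reads_by_gc(example_reads)
```

A single composition cutoff like this would misassign reads from regions of atypical composition, which is one reason such a filter could only serve as a starting point.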

Impacts
The assembly of the Ich genome will provide a valuable resource to researchers studying the molecular biology of this organism. Unfortunately, due to the assembly complications described in the Outputs section, we do not yet consider the current assembly ready for public release. Using the data currently available, we have been working to assemble the sequence into contigs and scaffolds and to separate those belonging to the Ich genome from those of bacterial and fish origin. Several strategies have been tried and others are under evaluation. Our current best assembly of the Ich portion of the reads contains about 4000 scaffolds, with a scaffold N50 size of 26 kb. The total span of the scaffolds is 49.6 Mb, which is very close to our preliminary estimate of genome size of 50 Mb, based on EST and hybridization evidence. Coverage of the bacterial genomes is higher, and a number of large scaffolds have been assembled, but as yet no complete chromosomes. Training of a summer intern in genome closure methods occurred during a 12-week session in 2008.
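
For readers unfamiliar with the scaffold N50 statistic quoted above, the following minimal sketch shows how it is conventionally computed: scaffolds are sorted from longest to shortest, and the N50 is the length of the scaffold at which the running total first reaches half of the total assembly span. The function name and the example lengths are hypothetical and are not data from this project.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    The N50 is the largest length L such that sequences of length >= L
    together account for at least half of the total span.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)
    half_span = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_span:
            return length


# Made-up scaffold lengths (in kb) for illustration only.
example_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
example_n50 = n50(example_lengths)
```

With the made-up lengths above, the total span is 32 and the two longest scaffolds already cover half of it, so the N50 equals the length of the second of them.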

Publications

  • No publications reported this period