Source: UNIV OF MARYLAND submitted to NRP
AN EST RESOURCE FOR TILAPIA
Sponsoring Institution
National Institute of Food and Agriculture
Project Status
COMPLETE
Funding Source
Reporting Frequency
Annual
Accession No.
0209762
Grant No.
2007-35205-17948
Cumulative Award Amt.
(N/A)
Proposal No.
2006-04830
Multistate No.
(N/A)
Project Start Date
Jul 1, 2007
Project End Date
Jun 30, 2010
Grant Year
2007
Program Code
[43.0]- (N/A)
Recipient Organization
UNIV OF MARYLAND
(N/A)
COLLEGE PARK,MD 20742
Performing Department
(N/A)
Non Technical Summary
Tilapia are among the most important species in aquaculture today, but selective breeding of this species has barely begun. Only limited information on the sequence of genes is available for this species. This project will develop an extensive catalog of tilapia gene sequences, which will support the identification and characterization of genes involved in growth, disease resistance, stress tolerance and reproduction of this species.
Animal Health Component
(N/A)
Research Effort Categories
Basic
100%
Applied
(N/A)
Developmental
(N/A)
Classification

Knowledge Area (KA) | Subject of Investigation (SOI) | Field of Science (FOS) | Percent
304 | 3714 | 1080 | 100%
Knowledge Area
304 - Animal Genome;

Subject Of Investigation
3714 - Tilapia;

Field Of Science
1080 - Genetics;
Goals / Objectives
This project will fill a remaining gap in the genomic resources for tilapia by developing a comprehensive set of Expressed Sequence Tags (ESTs). We will construct and validate normalized cDNA libraries for several tissues. The libraries will be sequenced to generate ESTs for each tissue. All sequences will be deposited in GenBank immediately, and clusters of overlapping sequences will be identified by bioinformatic analyses.
Project Methods
We will construct normalized cDNA libraries from several tissues using a novel duplex-specific nuclease from the Kamchatka crab. We will develop ~13 new normalized full-length cDNA libraries and will validate each library by measuring the insert size in 96 random clones. We will further validate each library by sequencing a few hundred clones. We will then send the libraries to an outside contractor for high-throughput sequencing. The sequencing of each library will be tracked using a statistical approach that allows us to maximize the overall rate of gene discovery for the project. NCBI and the Dana-Farber Cancer Institute will begin building clusters once we have sequenced 50,000 clones. The HCGS will produce additional bioinformatics products, including 1) mapping of the ESTs onto the genome assemblies of pufferfish and medaka to speed the construction of comparative maps; 2) identification of microsatellites for genetic mapping; and 3) coordination of the design of a universal cichlid microarray.
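As an illustration of the microsatellite identification step, the following is a minimal sketch that scans a sequence for perfect tandem repeats. It is not the project's actual pipeline, which the report does not describe. The function name `find_microsatellites`, the 2 to 6 bp motif range, and the five-repeat minimum are all assumptions chosen for the example.

```python
import re


def find_microsatellites(seq, min_unit=2, max_unit=6, min_repeats=5):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    Thresholds are illustrative assumptions, not values from the project.
    """
    # Capture a short motif (shortest first), then require it to repeat
    # at least (min_repeats - 1) more times immediately afterwards.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        # Skip homopolymer runs such as poly-A tails, which are common in
        # ESTs and would otherwise be reported as "AA" repeats.
        if len(set(motif)) == 1:
            continue
        repeats = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeats))
    return hits


# Hypothetical EST fragment containing a (CA)8 repeat.
example = "TTGAG" + "CA" * 8 + "GTTCA"
example_hits = find_microsatellites(example)
```

A real analysis would normally also handle imperfect repeats and check that enough flanking sequence remains for PCR primer design. This sketch reports only exact repeats.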

Progress 07/01/07 to 06/30/10

Outputs
Target Audience: This project filled a major gap in the genomic resources for tilapia by generating more than 116,000 ESTs for the Nile tilapia, a globally important aquaculture species. The ESTs were immediately useful to the research community studying the genetics and physiology of tilapia and related cichlid fishes. The sequences allowed the design of PCR primers for analyses of gene expression, genetic variation and genetic mapping. Many of the sequences were used in the radiation hybrid mapping of the tilapia genome by Guyon et al. (2012). The ESTs also played an important role in annotation of the first assembly of the tilapia genome by the Broad Institute (Browand et al., in review). The paper describing these ESTs (Lee et al. 2010) has already been cited more than 25 times. We expect these Sanger ESTs will continue to play an important role in training the gene prediction algorithms used to annotate future assemblies of the tilapia genome.
Changes/Problems: We encountered no difficulties in completing the major goals of the research as originally proposed. We exceeded the original target number of ESTs (goal 100,000, published 116,899), and our normalization procedure to remove redundant clones from the libraries was even more successful than we had hoped. We did have a secondary goal of designing a tilapia microarray. This effort was not pursued because new technologies (Illumina RNA-seq) made the microarray obsolete.
What opportunities for training and professional development has the project provided? The project trained two postdocs and three technicians in preparation of cDNA libraries and bioinformatic analysis of the EST sequences.
How have the results been disseminated to communities of interest? Our results were reported in an article published in BMC Genomics in 2010. All of the sequences have been deposited in the NCBI GenBank database. The sequences can be searched via BLAST and viewed in the GBrowse genome viewer on our website, BouillaBase.org.
What do you plan to do during the next reporting period to accomplish the goals? Nothing Reported

Impacts
What was accomplished under these goals? Large collections of expressed sequence tags (ESTs) are a fundamental resource for analysis of gene expression and annotation of genome sequences. We generated 116,899 ESTs from 17 normalized and two non-normalized cDNA libraries representing 16 tissues from tilapia, a cichlid fish widely used in aquaculture and biological research. The ESTs were assembled into 20,190 contigs and 36,028 singletons, for a total of 56,218 unique sequences and a total assembled length of 35,168,415 bp. Over the whole project, one unique sequence was discovered for every 2.079 sequence reads. Of these unique sequences, 17,722 (31.5%) had significant BLAST hits (e-value < 10^-10) to the UniProt database. Normalization of the cDNA pools with duplex-specific nuclease allowed us to efficiently sequence a large collection of ESTs. These sequences are an important resource for studies of gene expression, comparative mapping and annotation of the forthcoming tilapia genome sequence.
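The summary figures above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers reported in this section. The variable names are illustrative, and the mean assembled length per unique sequence is a derived value that the report itself does not state.

```python
# Figures as reported in the Impacts section.
total_reads = 116_899          # ESTs generated
contigs = 20_190
singletons = 36_028
assembled_bp = 35_168_415      # total assembled length in bp
blast_hits = 17_722            # unique sequences with significant UniProt hits

# Unique sequences are contigs plus singletons.
unique_sequences = contigs + singletons

# Reads needed, on average, to discover one unique sequence.
reads_per_unique = total_reads / unique_sequences

# Share of unique sequences with a significant BLAST hit.
hit_percent = 100 * blast_hits / unique_sequences

# Derived value (not stated in the report): mean length per unique sequence.
mean_unique_length_bp = assembled_bp / unique_sequences

print(f"unique sequences: {unique_sequences:,}")
print(f"reads per unique sequence: {reads_per_unique:.3f}")
print(f"with BLAST hits: {hit_percent:.1f}%")
print(f"mean assembled length: {mean_unique_length_bp:.0f} bp")
```

The computed unique-sequence count, reads-per-unique ratio and hit percentage match the reported 56,218, 2.079 and 31.5%.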

Publications

  • Type: Journal Articles; Status: Published; Year Published: 2010; Citation: Lee BY, Howe AE, Conte MA, D'Cotta H, Pepey E, Baroiller JF, di Palma F, Carleton KL, Kocher TD. An EST resource for tilapia based on 17 normalized libraries and assembly of 116,899 sequence tags. BMC Genomics. 2010 Apr 30;11:278. doi: 10.1186/1471-2164-11-278.