Progress 12/15/05 to 12/14/09
Outputs OUTPUTS: Twenty rice varieties were chosen to understand their genome makeup by identifying and comparing the differences among their DNA sequences. The selected varieties span the deep genetic diversity of cultivated rice, including landraces, modern varieties, and improved donors from the temperate and tropical japonica, aromatic, aus, deepwater, and indica types of rice. The lines included Nipponbare as a reference, since a high-quality complete genome sequence was available from the International Rice Genome Sequencing Project (IRGSP). These lines were purified via single seed descent and multiplied prior to preparing high-quality genomic DNA. SNP discovery through hybridization-based re-sequencing, a technology previously proven effective in human, mouse, and Arabidopsis genomes, was done at Perlegen using very high-density arrays of oligonucleotides (short single-stranded probes of DNA attached to silicon wafers). These arrays were designed as part of this project to allow sequence variation across more than 100 Mbp to be interrogated on both DNA strands of the genome sequence. The 100-Mbp fraction of the rice genome was chosen from the gene-rich, nonrepetitive regions of Nipponbare. With the exception of those areas absent from the IRGSP pseudomolecules, the areas interrogated on the arrays evenly covered the genome. Two analytical methods ("model-based" algorithmic and "machine learning" tools) together predicted more than 420,000 SNPs; about 160,000 of these were called by both methods and are hence of the highest quality. Analyses of the OryzaSNP data have offered new insights about how rice diversity is distributed as well as glimpses of rice breeding history. Specific findings: 1. The OryzaSNP project (http://www.oryzasnp.org) discovered more than 160,000 non-redundant, genome-wide SNPs among 20 diverse varieties across the 100 Mb fraction of the Nipponbare rice genome that has little or no repetitive sequence. 2. Annotation relative to the Rice Annotation Project release 2 and TIGR release 5 gene models is available via the OryzaSNP consortium website (http://www.oryzasnp.org). 3. Linkage disequilibrium (LD), a measure that reflects the extent of recombination in rice since its domestication and dispersal and that defines blocks more or less prone to recombination, was found to be highest in japonica (extending about 500 kb) and lowest in indica (about 200 kb) in this sample of germplasm. 4. Introgressions from modern breeding or historical events were observed where blocks of SNPs have been incorporated from one type of rice into another even though today they are geographically isolated. 5. Regions of introgression shared between three or more genomes are not randomly distributed and tend to cluster together. These clusters comprise about 9% of the total genome and are highly correlated with genes/quantitative trait loci for domestication-related functions such as yield and grain quality. 6. Pairwise comparisons among all combinations of the 20 genomes indicated a large block on chromosome 5 where most indica and all japonica varieties share the same SNP patterns.
PARTICIPANTS: From CSU: Rebecca M. Davidson, Ph.D. student; Myron Bruce, Ph.D. student; Amadou Seck, Postdoctoral Fellow; Hiromichi Ishihara, Postdoctoral Fellow. From IRRI: Hei Leung, Plant Pathologist, and Kenneth McNally, Plant Pathologist. From MSU: C. Robin Buell, Professor, and Kevin Childs, Postdoctoral Fellow. In addition: 17 other national and international participants and co-authors not funded by the project.
TARGET AUDIENCES: Plant breeders, rice geneticists, and evolutionary geneticists.
PROJECT MODIFICATIONS: Not relevant to this project.
Impacts The comprehensive SNP data provides a foundation for deep exploration of rice diversity and gene-trait relationships, and their use for future rice improvement. It is the largest collection of SNP data for rice to date. Many projects worldwide are already using the data in their experiments. This extensive collection of SNP data is the first step for developing high-density genotyping platforms to investigate rice diversity, evolution, population genetics, and gene-phenotype relationships. Having SNP data is also advantageous because the bi-allelic nature of SNPs (one base is substituted for another of the same class more than 97% of the time) means that more convenient sets of marker systems can be implemented for marker-assisted breeding programs. Additionally, the 20 OryzaSNP lines are also being used to generate a series of recombinant inbred line populations for detailed functional genomics studies. These populations will be useful for validating gene-phenotype relationships predicted by association genetics studies. This publicly available, multi-varietal SNP data is a powerful resource to investigate the frequency and distribution of molecular variation across the rice genome, assess evolutionary forces shaping the rice genome, and identify candidate genes controlling important traits. In the long term, the information will be used to improve rice and other crop plants, and to enhance plant genomic research.
Publications
- Davidson, R., Manosalva, P. M., Vera Cruz, C. M., Leung, H., Leach, J. E. 2009. Functional analyses of germin-like proteins and oxalate oxidases; Contributors to basal disease resistance in rice. Poster presented at the Plant and Animal Genomes XVII. San Diego, CA. Jan 9-15, 2009. [abstract P687; http://www.intl-pag.org/17/abstracts/P07b_PAGXVII_687.html]
- Mojica, C., Ma. E. Naredo, K. Zhao, K. Wright, M. Thomson, B. Courtois, J.E. Leach, S. McCouch, H. Leung, K. McNally. 2009. Rice genetic diversity assessment of the RiceSNP set using the GoldenGate genotyping assay and VeraCode technology. Poster presentation P8-5 at the 6th International Rice Genetics Symposium, Manila, Philippines, November 16-19. http://ricegenetics.com/index.php.
- McNally, K., Childs, K., Bohnert, R., Davidson, R.M., Zhao, K., Ulat, V.J., Zeller, G., Clark, R.M., Hoen, D., Bureau, T., Stokowski, R., Ballinger, D., Frazer, K., Cox, D., Padhukasahasram, B., Bustamante, C., Weigel, D., Mackill, D., Bruskiewich, R., Ratsch, G., Buell, C. R., Leung, H. and Leach, J. E. 2009. Genome-wide SNP variation reveals relationships among land-races and modern varieties of rice. Proc Nat Acad Sci USA 106:12273-12278.
- Davidson, R., Reeves, P., Manosalva, P., Leach, J.E. 2009. Germins: a diverse protein family important for crop improvement. Plant Science 177:499-510.
- Carrillo, M.G., Goodwin, P.H., Leach, J. E., Leung, H., Vera Cruz, C. M. 2009. Phylogenomic relationships of rice oxalate oxidases to the cupin superfamily and their association with disease resistance QTL. RICE 2:67-79.
- Leung, H. 2008. Stressed genomics-bringing relief to rice fields. Curr. Opin. Plant Biol. 11:201-208.
- Childs, K., John Hamilton, Haining Lin, C. Robin Buell. 2009. Improvements to the MSU Rice Genome Annotation. RAP Workshop at the International Rice Genetics Symposium, Manila, Philippines, November 16-19. http://ricegenetics.com/index.php?option=com_content&view=article&id=137.
- Collard, B.C.Y., Vera Cruz, C.M., McNally, K.L., Virk, P.S., Mackill, D.J. 2008. Rice molecular breeding laboratories in the genomics era: current status and future considerations. Special issue: Genomics of Major Crops and Model Plant Species. Int J Plant Genomics, 25 pp. http://www.hindawi.com doi:10.1155/2008/524847.
|
Progress 12/15/07 to 12/14/08
Outputs OUTPUTS: Rice, the primary source of dietary calories for half of humanity, is the first crop plant for which a high-quality reference genome sequence from a single variety was produced. We used resequencing microarrays to interrogate 100 Mb of the unique fraction of the reference genome for 20 diverse varieties and landraces that capture the impressive genotypic and phenotypic diversity of domesticated rice. Here we report the distribution of 160,000 non-redundant single nucleotide polymorphisms (SNPs) that lead to linkage disequilibrium estimates of 200 and 500 kb for the indica and japonica subgroups, respectively. Introgression patterns of shared SNPs revealed the breeding history and relationships among the 20 varieties; some introgressed regions are associated with agronomic traits that mark major milestones in rice improvement. This comprehensive SNP data provides a foundation for deep exploration of rice diversity and gene-trait relationships, and their use for future rice improvement. Specific Accomplishments: 1. The OryzaSNP project (http://www.oryzasnp.org) has discovered genome-wide SNPs among 20 diverse varieties across the 100 Mb fraction of the Nipponbare rice genome that has little or no repetitive sequence. 2. The diverse varieties included representatives from all variety groups, with Nipponbare included as a control (McNally et al., 2006). 3. SNPs were identified by array-based re-sequencing technology using very high-density oligomer arrays at Perlegen. 4. A combination of model-based and machine-learning algorithms predicted 160,000 non-redundant SNPs at non-repetitive sites. 5. Annotation relative to the Rice Annotation Project release 2 and TIGR release 5 gene models is available via the OryzaSNP consortium website (http://www.oryzasnp.org). 6. Analyses of shared patterns of SNPs between variety types have identified regions indicative of historical introgression or of selection imposed in breeding programs. 7. Linkage disequilibrium (LD) extended about 500 kb in the japonica subgroup and about 200 kb in the indica subgroup. This LD level enables whole-genome association scans of germplasm collections for identification of genomic regions useful for rice improvement. 8. Phenotyping for disease resistance and agronomic traits is underway. 9. Genetic stock development has been initiated.
PARTICIPANTS: Rebecca M. Davidson, Ph.D. student in the Bioagricultural Sciences and Pest Management Department at CSU, and 21 national and international co-authors.
TARGET AUDIENCES: Plant breeders, rice geneticists, and evolutionary geneticists.
PROJECT MODIFICATIONS: We requested a no-cost extension to allow completion of several areas: (I) Continue phenotypic evaluation and assembly of comprehensive phenotype data on the 20 rice cultivars and landraces used for resequencing. Traits in five broad categories, i.e., (1) reproductive and yield-related, (2) morphological, (3) grain quality, (4) abiotic stresses, and (5) biotic stresses, are being evaluated at various sites for the OryzaSNP set. The phenotypic data will be integrated into the International Rice Information System database (http://www.iris.irri.org). (II) Determine the variation in health value within the OryzaSNP set. Rat feeding trials and metabolomics analysis with the OryzaSNP set are underway to identify traits associated with nutritional and health benefits. (III) Continue development of genetic stocks based on the resequenced rice lines to allow exploitation of biologically relevant genetic variation. Production of recombinant inbred lines (RILs) and F1 hybrids is underway from selected pairs of the OryzaSNP set.
Impacts Valuable resources for the molecular genetic and plant breeding community have been developed, including data on genome sequences of 20 comprehensively characterized rice lines, rice genetic stocks and an expansive SNP database. This publicly available, multi-varietal SNP data is a powerful resource to investigate the frequency and distribution of molecular variation across the rice genome, assess evolutionary forces shaping the rice genome, and identify candidate genes controlling important traits. In the long term, the information will be used to improve rice and other crop plants, and to enhance plant genomic research.
Publications
- Bush, D., J.E. Leach. 2007. Translational Genomics for Bioenergy Production: There's Room For More Than One Model. Plant Cell 19: 2971-2973.
- Leach, J.E., Rebecca Davidson, Bin Liu, Patricia Manosalva, Ramil Mauleon, Gay Carrillo, Myron Bruce, Janice Stephens, Maria Genaleen Diaz, Rebecca Nelson, Casiana Vera Cruz, and Hei Leung. 2007. Understanding Broad-Spectrum, Durable Resistance in Rice. Pages 191-209 in Rice Genetics V, D S Brar, D Mackill & B Hardy, eds. World Scientific Publ. Co.
- Leach, J.E., A. Seck, H. Ishihara, C. Zhou, Z. Shen, and B. Zhao. 2008. Achieving disease resistance in rice: more than one way to stop a pathogen. Symposium speaker, 6th International Symposium on Rice Functional Genomics, Jeju, Korea. November 10-12, 2008.
- Jahn, C., J. Stephens, B. Mason, D. R. Bush, H. Leung, J. McKay, J.E. Leach. 2008. Using the OryzaSNP set to characterize genes important to biofuel production. Poster presented at the 6th International Symposium on Rice Functional Genomics, Jeju, Korea. November 10-12, 2008
- Bohnert R, Zeller G, Clark RM, Childs KL, Ulat V, Stokowski R, Ballinger D, Frazer K, Cox D, Bruskiewich R, Buell CR, Leach J, Leung H, McNally KL, Weigel D, Ratsch G. 2008. Revealing sequence variation patterns in rice with machine learning methods. BMC Bioinformatics 9:(Suppl 10):O8 (30 October 2008)
- Seck, A., H. Ishihara, B. Zhao, J. E. Leach, S. H. Hulbert. 2007. Dissection of Rxo1-mediated defense signaling in cereals. Phytopathology 97:S106
- Davidson, R.M., Manosalva, P., Vera Cruz, C., Leung, H., Leach, J.E. 2008. Sequence polymorphisms confer differential allele regulation of germin-like protein gene family members associated with rice blast QTL. Phytopathology 98:S44
|
Progress 12/15/06 to 12/14/07
Outputs OUTPUTS: The goal of the OryzaSNP project (http://www.oryzasnp.org) is to provide rice researchers with access to the maximal information on genetic variation present within and between diverse rice cultivars and landraces, and with the genetic resources to exploit that information. Our first step was genome-wide SNP discovery across the unique fraction of the rice genome for 20 diverse varieties. A repeat-masking pipeline was used to select the 100 Mb of the rice genome (IRGSP release 4) with little or no repetitive sequence for SNP discovery. The diverse varieties included representatives from indica, tropical and temperate japonica, aus, deepwater, and aromatic types of rice, with Nipponbare included as a control. SNPs were identified by array-based re-sequencing technology using very high-density oligomer arrays at Perlegen. Perlegen's model-based algorithms identified 259,721 non-redundant SNP sites in Nipponbare where one or more of the other 19 varieties differed. To improve predictions, the Perlegen model-based (MB) SNP calling approach was combined with a machine learning (ML) method previously used for analysis in the Arabidopsis SNP discovery project. The ML method was superior, recovering 20.9% of all known SNPs at a false discovery rate (FDR) of 8.3%, compared to 13.9% and 7.8%, respectively, for the MB approach. Combined, the methods predicted 158,000 non-redundant SNPs relative to Nipponbare, with a recall rate of 10.7% and an FDR of 2.9%. Annotation relative to the Rice Annotation Project release 2 and TIGR release 5 gene models has been accomplished, and the OryzaSNP annotation database (release 1) is open to the public (http://irfgc.irri.org). A workshop was conducted at the 5th International Symposium of Rice Functional Genomics (Tsukuba, Japan) to train researchers and plant breeders in how to use the OryzaSNP data. Comprehensive phenotyping of the 20 lines in the OryzaSNP set and development of genetic resources (recombinant inbred lines and F1 hybrids) are underway.
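The recall and FDR figures quoted above can be illustrated with a minimal Python sketch. It assumes that "combined" means keeping only the SNP positions called by both the MB and ML methods, which is consistent with the lower recall and lower FDR reported for the combined set but is not spelled out in this report. The function name and the toy positions are hypothetical and are not OryzaSNP data.

```python
from typing import Set, Tuple


def recall_and_fdr(predicted: Set[int], truth: Set[int]) -> Tuple[float, float]:
    """Return (recall, FDR) for predicted SNP positions against known SNPs.

    recall = true positives / all known SNPs
    FDR    = false positives / all predicted SNPs
    """
    true_pos = len(predicted & truth)
    false_pos = len(predicted - truth)
    recall = true_pos / len(truth) if truth else 0.0
    fdr = false_pos / len(predicted) if predicted else 0.0
    return recall, fdr


# Toy example (hypothetical positions, for illustration only).
truth = {10, 20, 30, 40, 50, 60, 70, 80, 90, 100}  # known SNPs in a test region
mb_calls = {10, 20, 30, 999}                        # model-based calls
ml_calls = {10, 20, 40, 50, 888}                    # machine-learning calls

# Assumption: the combined set is the intersection of the two call sets.
combined = mb_calls & ml_calls

for name, calls in [("MB", mb_calls), ("ML", ml_calls), ("combined", combined)]:
    recall, fdr = recall_and_fdr(calls, truth)
    print(f"{name}: recall={recall:.2f}, FDR={fdr:.2f}")
```

In this toy case the intersection discards the false calls unique to each method, so the FDR drops while recall also drops, the same trade-off seen in the reported MB, ML, and combined figures.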
PARTICIPANTS: Training: Kevin Childs, The Institute for Genomic Research/Michigan State University: Kevin is the postdoctoral fellow who developed the OryzaSNP browser and has collaborated in the data quality assessment. Rebecca Davidson, Ph.D. student, CSU: Although not funded on this project, Rebecca has helped in troubleshooting the browser, teaching how to do queries in the workshop, and working with a breeder in China on application of the data. Regina Bohnert, MS student, Universitat Tubingen: Although not funded on this project, Regina performed the machine learning assessment of the data under the direction of Gunar Ratsch. Workshop: OryzaSNP Discovery Workshop held at the 5th International Symposium of Rice Functional Genomics, Oct 16, Tsukuba, Japan. Coordinated by J. Leach with presentations by K. McNally (IRRI), K. Childs (MSU), R. Davidson (CSU), Mashiro Yano (NIAS), and H. Leung (IRRI). The purpose of the 3 h workshop was to introduce the rice research community to the OryzaSNP data set.
TARGET AUDIENCES: Rice geneticists and breeders, Cereal crop geneticists and breeders.
Impacts The OryzaSNP data set provides valuable information for understanding the diversity of rice genotypes, rice genome evolution, and domestication, as well as for breeding applications. Specific resources developed include data on genome sequences of 20 comprehensively characterized rice lines, valuable rice genetic stocks, and an expansive SNP database. In the long term, the information generated and resources developed will be used to improve rice as well as other important cereal crop plants, and to enhance plant genomic research.
Publications
- McNally, K., K. Childs, V. Ulat, R. Clark, R. Bohnert, G. Zeller, G. Ratsch, D. Weigel, D. Hoen, T. Bureau, R. Stokowski, D. Ballinger, K. Frazer, D. Cox, R. Bruskiewich, D. Mackill, C.R. Buell, R. Davidson, J. Leach, and H. Leung. 2007. OryzaSNP Genome-wide SNP discovery in Diverse Rice. The 5th International Symposium of Rice Functional Genomics, Oct 15-17, Tsukuba, Japan. (Abstract).
- Leung, H., McNally, K.L., Mackill, D. 2007. Rice. Pages 335-351. In Genetic Variation: A Laboratory Manual. Edited by Weiner MP, Gabriel SB, Stephens JC. Cold Spring Harbor Laboratory Press.
- McNally, K.L., R. Bruskiewich, D. Mackill, C. R. Buell, J. E. Leach, H. Leung. 2006. Sequencing Multiple and Diverse Rice Varieties: Connecting Whole-Genome Variation with Phenotypes. Plant Physiol. 141:26-31.
|
Progress 12/15/05 to 12/15/06
Outputs Our goal is to identify the maximum information on genetic variation that exists within and between diverse rice cultivars and landraces, and to give rice researchers access to it. This project provides partial support for genome-wide SNP discovery by re-sequencing 100 Mbp of the genomes of 20 diverse rice varieties; other funding is provided through international collaborators. Perlegen Sciences will generate the SNP data through DNA-DNA hybridization on high-density oligonucleotide arrays. Release 4 of the high-quality BAC-by-BAC japonica sequence of Nipponbare was masked for repetitive DNA. After masking, 95 Mb of Nipponbare was found to be unique by the strict criterion that each segment have no significant BLAST hit other than to itself. The same procedure was used to mask the sequence of the indica variety 93-11. The two sequences were aligned to one another, and 3 Mb of 93-11 sequence was chosen from areas with unambiguous global alignment. After masking, 2.5 Mb of unanchored Nipponbare BACs, 0.4 Mb of BACs from FR13A (aus) and Kasalath (indica) distinct from Nipponbare and 93-11, and 0.1 Mb of mitochondrial sequence from Nipponbare were also chosen, giving a total of about 100 Mb of unique sequence. High-quality, CsCl-purified genomic DNA from the 20 lines was tested for long-range (LR) PCR amplification; all templates yielded product at several target loci. A pilot experiment was performed with arrays covering 379 kb of unique sequence within a 684 kb region on the long arm of chromosome 3. All 20 varieties were hybridized to the arrays, and a total of 2,132 SNPs were detected (on average about 1 SNP per 200 bp). The 20 rice lines have undergone one round of single seed descent and are currently being increased to amounts sufficient for phenotyping. F1 intercrosses for a half-diallel mating scheme are being performed in preparation for the production of RILs and other populations. We have initiated a collaboration with Detlef Weigel at the Max Planck Institute for Developmental Biology, Tubingen, Germany, who has recently analyzed SNP data from 20 genotypes of Arabidopsis thaliana. Those data were also derived from Perlegen resequencing, and Dr. Weigel has agreed to share his group's SNP analysis tools, including a machine learning method to identify polymorphic regions that are not recognized by Perlegen's standard SNP calling software. Following discussions on data analysis, we will do a limited amount of dideoxy sequencing of regions that are being resequenced by Perlegen. When the preliminary results are delivered by Perlegen, we will identify regions of the genome that have been successfully resequenced and order primers so that these same regions can be PCR amplified and sequenced at TIGR. These data will serve three purposes: to provide true measures of SNP rates in the 20 genotypes, to enable assessment of the type I and type II error rates of the resequencing, and to train the machine learning algorithm.
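As a quick check on the figures in this paragraph, the following Python sketch adds up the sequence components selected for the arrays and derives the pilot SNP density. All numbers are taken from the text above; the variable names are illustrative only.

```python
# Sequence components selected for the arrays, in Mb (values from the report).
components_mb = {
    "Nipponbare unique sequence (release 4)": 95.0,
    "93-11 (indica), unambiguous alignment": 3.0,
    "unanchored Nipponbare BACs": 2.5,
    "FR13A and Kasalath BACs": 0.4,
    "Nipponbare mitochondrial sequence": 0.1,
}
total_mb = round(sum(components_mb.values()), 1)

# Pilot experiment on the long arm of chromosome 3 (values from the report).
pilot_unique_bp = 379_000   # unique sequence covered by the pilot arrays
pilot_region_bp = 684_000   # size of the region containing that sequence
pilot_snps = 2_132          # SNPs detected across the 20 varieties

bp_per_snp = pilot_unique_bp / pilot_snps
unique_fraction = pilot_unique_bp / pilot_region_bp

print(f"Total selected sequence: {total_mb} Mb")
print(f"Pilot SNP density: 1 SNP per {bp_per_snp:.0f} bp")
print(f"Unique fraction of pilot region: {unique_fraction:.0%}")
```

The listed components sum to 101 Mb, which the report rounds to 100 Mb, and the pilot density works out to roughly 1 SNP per 178 bp of unique sequence, in line with the stated average of about 1 SNP per 200 bp.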
Impacts In achieving our objectives, valuable resources will be developed, including data on genome sequences of 20 comprehensively characterized rice lines, valuable rice genetic stocks and an expansive SNP database. This publicly available, multi-varietal SNP data will be a powerful resource to investigate the frequency and distribution of molecular variation across the rice genome, assess evolutionary forces shaping the rice genome, and identify candidate genes controlling important traits. In the long term, the information will be used to improve rice and other crop plants, and to enhance plant genomic research.
Publications
- No publications reported this period