Source: UNIV OF MINNESOTA submitted to NRP
EVALUATING GENOMIC SELECTION FOR APPLIED PLANT BREEDING
Sponsoring Institution
National Institute of Food and Agriculture
Project Status
COMPLETE
Funding Source
Reporting Frequency
Annual
Accession No.
0219079
Grant No.
2009-65300-05661
Cumulative Award Amt.
(N/A)
Proposal No.
2009-01886
Multistate No.
(N/A)
Project Start Date
Sep 1, 2009
Project End Date
Aug 31, 2013
Grant Year
2009
Program Code
[91610]- Plant Genome, Genetics and Breeding
Recipient Organization
UNIV OF MINNESOTA
(N/A)
ST PAUL,MN 55108
Performing Department
Agronomy & Plant Genetics
Non Technical Summary
Plant breeding is a process designed to generate new plant cultivars with desirable combinations of genes. This is accomplished by selecting parents with desirable characteristics, selectively mating parents, generating families of breeding lines from each mating or cross, and evaluating those progeny for important traits. The most desirable progeny are further evaluated to identify new cultivars or selected as parents to begin the cycle again. For most traits, plant breeders use phenotypic selection (directly measuring the trait(s) of interest) to identify superior individuals. Recently, marker-assisted selection (MAS) has been used to increase the efficiency of breeding programs. MAS involves mapping the location of an important gene or genes and using nearby genetic markers (DNA sequences) to track the gene in progeny of breeding populations. This technology can often increase the efficiency and decrease the cost of conducting selection in a plant breeding program. Traditional MAS strategies have proven useful for simply inherited traits, but have generally failed to improve complex traits controlled by many genes. Genomic selection differs from this traditional approach because it uses many genetic markers distributed across the entire genome to predict the performance of a given breeding line. The most valuable asset of genomic selection is that it can dramatically shorten the breeding cycle: the time from making a cross between two parents to when a progeny of that cross is used as a parent in a new cross. For many programs a breeding cycle will be three to seven years. Using genomic selection, this can be shortened to as little as one year. Genomic selection has been investigated through computer simulation, but has not been empirically tested in a plant breeding program. The goal of this project is to measure the effectiveness of genomic selection in a spring barley and a winter wheat breeding program.
Animal Health Component
80%
Research Effort Categories
Basic
10%
Applied
80%
Developmental
10%
Classification

Knowledge Area (KA)   Subject of Investigation (SOI)   Field of Science (FOS)   Percent
201                   1540                             1080                     50%
201                   1550                             1080                     50%
Goals / Objectives
Genomic selection is an approach to improve quantitative traits that couples the power and relevance of large plant breeding populations with the resolution and accuracy of highly multiplexed marker technology. Our overall goal is to empirically evaluate the potential of genomic selection in a winter wheat and a spring barley breeding program. Our specific objectives are: 1) Compare the effectiveness of genomic selection and phenotypic selection to identify superior progeny in breeding populations; 2) assess the ability of genomic selection to predict a parent's breeding value; 3) determine whether a trained genomic selection model maintains accuracy over cycles of breeding; 4) use simulations to assess scenarios for the introduction and implementation of genomic selection in a breeding program to optimize short- and long-term success over cycles of genomic selection. Important outputs from this work include: genomic selection models for yield and other quantitative traits for a spring barley and winter wheat program; estimates of accuracy of genomic selection models to predict progeny performance and breeding value; assessment of the longevity of a genomic model several breeding cycles beyond the training data set; and a set of guidelines for updating genomic selection models and allocating resources toward phenotyping and genotyping.
Project Methods
The training and testing of genomic selection models will be accomplished using large marker and trait data sets. The participating barley and wheat breeding programs have identified sets of breeding lines that are suitable for the proposed objectives. To test the ability of genomic selection to predict breeding line performance, each program will generate trait and marker data for a set of over 600 breeding lines. Phenotypic data will be a combination of historical data and new data. When unbalanced data are used, best linear unbiased predictions (BLUPs) will be calculated from the data. Genotyping of barley will consist of 1,536 single nucleotide polymorphism (SNP) markers arranged in an oligo pool assay using the Illumina GoldenGate Assay. Wheat breeding lines will be genotyped using Diversity Array Technology (DArT) performed by Triticarte Pty Ltd (http://www.triticarte.com.au/). The wheat DArT array is expected to generate data for nearly 800 markers. Marker effects will be fit using least squares, random-regression BLUP (RR-BLUP), and Bayesian methods (Bayes-A and Bayes-B). From the set of 600 breeding lines, a random set of 100 breeding lines will be identified and evaluated in replicated experiments in years 2 and 3 of the project. We will assess effectiveness by evaluating the correlation between genomic selection model prediction and performance in years 2 and 3. Data from years 2 and 3 will be used to compare phenotypic selection and genomic selection. To test the ability of genomic selection to predict parent breeding value, each breeding program will identify a set of 100 parents that each have at least six progeny in yield trials. In addition to the 100 parents, additional breeding line cohorts will be added to the parent set. This set of lines will be genotyped and phenotyped, and genomic selection models will be used to predict the breeding value of each line. The progeny phenotype data will be used to calculate the true breeding value of each line.
We will assess accuracy by calculating the correlation between genomic estimated breeding value and true breeding value. Once a genomic selection model is generated it is likely that it will need to be re-trained over time to retain accuracy for current breeding lines. We will use a set of 189 barley breeding lines developed from 1999 to 2003 to develop a genomic selection model. We will use this model to predict the genetic values of sets of 96 breeding lines developed and evaluated in each of the six following years (2004-2009). We will use regression analysis to determine if there is a relationship between years since the model was developed and the correlation between predicted and observed performance. Lastly, we will use computer simulation to test various procedures for implementing genomic selection within these two breeding programs. We will simulate and compare various scenarios that differ in breeding cycle and the number of breeding lines to phenotype to update genomic selection models.
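As a rough illustration of the RR-BLUP step and the accuracy measure described above, the sketch below fits marker effects by ridge regression and scores predictions by their Pearson correlation with observed values. It is not the project's code: the function names, the 0/1/2 marker coding, and the shrinkage parameter `lam` (which in a mixed-model analysis would come from estimated variance components) are assumptions made for illustration.

```python
import numpy as np


def fit_rr_blup(markers, phenotypes, lam):
    """Estimate marker effects by ridge regression (RR-BLUP-style shrinkage).

    markers    : (n_lines, n_markers) array of allele counts (assumed 0/1/2)
    phenotypes : (n_lines,) array of trait values or BLUPs
    lam        : shrinkage parameter, assumed known here; in a mixed model it
                 corresponds to residual variance / marker-effect variance
    """
    markers = np.asarray(markers, dtype=float)
    phenotypes = np.asarray(phenotypes, dtype=float)
    marker_means = markers.mean(axis=0)
    centered = markers - marker_means
    intercept = phenotypes.mean()
    # Solve (Z'Z + lam * I) u = Z'(y - mean(y)) for the marker effects u.
    lhs = centered.T @ centered + lam * np.eye(markers.shape[1])
    rhs = centered.T @ (phenotypes - intercept)
    effects = np.linalg.solve(lhs, rhs)
    return intercept, marker_means, effects


def predict(markers, intercept, marker_means, effects):
    """Genomic prediction for new lines from their marker scores."""
    centered = np.asarray(markers, dtype=float) - marker_means
    return intercept + centered @ effects


def prediction_accuracy(predicted, observed):
    """Pearson correlation between predicted and observed performance."""
    return float(np.corrcoef(predicted, observed)[0, 1])
```

In use, `fit_rr_blup` would be called on the training lines, `predict` on the marker scores of the selection candidates, and `prediction_accuracy` on the predictions and the later field observations of those candidates.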

Progress 09/01/09 to 08/31/13

Outputs
Target Audience: The audiences for this work are plant breeders, professors teaching plant breeding, and students of plant breeding.
Changes/Problems: Nothing Reported
What opportunities for training and professional development has the project provided? A PhD candidate at the University of Minnesota, Ahmad Sallam, conducted all of his dissertation research as a part of this project on objectives 1, 2, and 3. He is scheduled to graduate in June of 2014. Four post-docs at USDA-ARS, Ithaca, NY received partial funding from this project and contributed analyses to Objectives 1, 2, and 4: Aaron Lorenz, Hsiao-Pei Yang, Yi Jia, and Jeff Endelman. One graduate student at Cornell University, Nicolas Heslot, contributed to analyses for Objective 1. A PhD candidate at The Ohio State University, Amber Hoffstetter, has received partial funding from this project. She is projected to graduate in the spring of 2015. A post-doctoral researcher at The Ohio State University, Antonio Cabrera, participated in objectives 1 and 2.
How have the results been disseminated to communities of interest? Some of the findings of this research have been disseminated to the research community through peer-reviewed publications listed in this report. Results and lessons learned from this project have also been communicated to researchers in the USDA Triticeae Coordinated Project. Several manuscripts are in preparation or review. Aspects of this work have been presented at many national and international conferences attended by stakeholders including growers and industry representatives from malt companies, brewers, millers, and bakers (examples listed below).
  • Smith, K.P., V. Vikram, A. Sallam, A. Lorenz, J. Jannink, J. Endelman, R. Horsley, S. Chao, and B. Steffenson. 2013. Using Genomic Selection in Barley to Improve Disease Resistance. 2013 Borlaug Global Rust Initiative Technical Workshop, August 19-22, 2013, Taj Palace Hotel, New Delhi, India.
  • Smith, K.P., V. Vikram, A. Lorenz, J. Jannink, S. Chao, and R. Horsley. 2012. Evaluating Genomic Selection for DON in a Collaborative Barley Breeding Effort. 2012 National Fusarium Head Blight Forum, December 4-6, 2012, Wyndham Orlando Resort, Orlando, Florida.
  • Smith, K.P. 2012. Implementing Genomic Selection in Barley. 6th International Crop Science Congress, Bento Gonçalves, Brazil, August 6-10, 2012.
  • Smith, K.P. 2011. Keeping barley competitive through research: Genomics and breeding. Research Panel. Barley Improvement Conference, Old Town San Diego, CA, Jan 12-13, 2011.
  • Clay H. Sneller, et al. "Genomic Selection in Wheat." Presented at University of Science, Agriculture and Veterinary Medicine, Cluj-Napoca, Romania (2013).
  • Clay H. Sneller, et al. "Genomic Selection as a Tool to Meet Future Food Demands." Presented at Sokoine University of Agriculture, Morogoro, Tanzania (2013).
  • Clay H. Sneller, et al. "Assessing and Implementing Genomic Selection in US Soft Winter Wheat." Presented at 1st International Genomic Selection Workshop, Clermont-Ferrand, France (2013).
What do you plan to do during the next reporting period to accomplish the goals? Nothing Reported

Impacts
What was accomplished under these goals? The rapid decrease in the cost of genotyping plants has enabled breeders to easily collect large amounts of genetic marker information on breeding lines and implement genomic selection (GS). Genomic selection uses large numbers of genetic markers dispersed throughout the genome to predict the characteristics or phenotype of an individual. This has particular importance in plant breeding because these genetic predictions can be obtained much earlier in the breeding scheme than empirical phenotypic data can be collected. Thus, genomic selection has the potential to dramatically shorten the breeding cycle time and accelerate breeding progress. A key component of genomic selection strategies is the accuracy of the predictions made using marker data. The overall gain in breeding progress will be a function of the relative accuracy and the length of the breeding cycle for genomic selection compared to phenotypic selection. We assessed the accuracy of these genomic predictions in barley for several traits including yield and disease resistance. Prediction accuracy in barley, evaluated over a number of scenarios, ranged from 0.40 for yield to 0.62 for plant height. We determined that prediction accuracy was maintained over a five-year period in a dynamic breeding program for most traits. We obtained these predictions with training population sizes of around 200 individuals and around 800 SNP markers. These resource parameters are well within reach of most barley breeding programs. In wheat, accuracy was 0.44 to 0.56 for yield, 0.37 for resistance to Fusarium head blight (FHB), and 0.51 and 0.63 for two quality traits. The relative efficiency of GS compared to phenotypic selection on a per-year basis ranged from 2.66 to 4.73, indicating that GS could be quite effective for winter wheat. Using phenotypes obtained from different environments than those used to phenotype our training population, we found that our GS models were moderately predictive of the phenotypes of relatives of the training population, but did not predict the phenotype of other lines that were not highly related to the training population. The results from our work suggest that genomic selection in small grains is quite feasible and should, for many traits, provide predictions with sufficient accuracy to improve genetic gains per unit time. Since this work began, several public wheat and barley breeding programs have begun using genomic selection.
Objective 1: Compare the effectiveness of genomic selection and phenotypic selection to identify superior progeny in breeding populations. In spring barley, we compared selection based on the phenotypic performance of breeding lines with selection based on genomic predictions for FHB resistance, accumulation of the mycotoxin deoxynivalenol (DON), and grain yield. A set of 168 breeding lines was used to train a prediction model. Phenotypic selection of the selection candidates was based on performance in two disease evaluation trials and three yield trials conducted previously. The top ten lines were selected from each of the five sets of 96 selection candidates based on the two selection methods (observed performance and predicted performance). All selected lines were evaluated in five locations for disease resistance and in seven locations for yield. In one of the two comparisons of phenotypic selection to genomic selection for yield, the mean of the phenotypic selection was slightly greater (p=0.04), and in the other case the two selection methods were not different. These results indicate that genomic selection performed similarly to phenotypic selection. In winter wheat, we phenotyped a training population (TP) consisting of 470 soft red winter wheat lines from the OSU breeding program and a second set of 94 lines. The traits evaluated on both sets were yield, FHB resistance, softness equivalent, and flour yield. We assessed the accuracy of GS using cross validation and the TP data. The accuracy of GS ranged from 0.37 to 0.63 and relative efficiency (per breeding cycle) ranged from 0.48 to 0.68. On a per-year basis the relative efficiency of GS to phenotypic selection ranged from 2.66 to 4.73. We demonstrated that GS has the potential to be quite effective and efficient in winter wheat.
Objective 2: Assess the ability of genomic selection to predict a parent's breeding value. In barley, we conducted several experiments that assessed the accuracy of genomic predictions that were based on a training population of 168 breeding lines and five sets of 96 breeding lines that contained progeny from parents included in the training population. Across these five sets of materials, we found the prediction accuracies to be on average highest for plant height (0.62) and lowest for yield (0.40). With breeding cycle times of one and four years for genomic selection and phenotypic selection, respectively, we would expect genomic selection to exceed phenotypic selection in terms of gain per unit time. In winter wheat, we assessed whether the GS model developed from the TP can predict the phenotypes and true breeding value (TBVi = [phenotype of the ith parent's progeny - mean of all progeny]/2) of the parents of the TP and the phenotypes of a set of less related lines. The GS model built from the TP had mixed value in predicting the phenotype or TBV of the 21 available parents of the TP. The genomic estimated breeding value (GEBV) of the parents was significantly correlated to the parental phenotypes and TBVs for two yield traits, but the correlations for FHB and quality traits were quite low.
Objective 3: Determine whether a trained genomic selection model maintains accuracy over cycles of breeding. We used the training population of 168 lines to predict five sets of 96 breeding lines that were progeny of parents in the training population. The five sets of breeding lines were derived over a five-year period so that we could assess prediction accuracy over time. For plant height and Fusarium head blight disease severity, we saw no change in prediction accuracy over time. For yield, prediction accuracy was variable across years but did not consistently increase or decrease over time. For the mycotoxin DON, we saw a decrease in prediction accuracy from 0.60 to 0.25. These results indicate that the need to retrain prediction models over time to maintain accuracy will depend on the trait.
Objective 4: Use simulations to assess scenarios for the introduction and implementation of genomic selection in a breeding program to optimize short- and long-term success over cycles of genomic selection. Research on this objective has led to three publications and one manuscript in preparation. A publication using an analytical framework (Heffner et al. 2010) showed that the decreased cycle time enabled by genomic prediction can increase gain per unit time and cost, even when prediction accuracies are lower than have been observed in barley and wheat. Successful long-term genomic selection requires using markers not just for prediction but also for population management to maintain diversity and the potential for future gains. In simulations, we found that adding weight to loci at which the predicted favorable allele was rare increased long-term gains (Jannink 2010). Another analytical paper on optimal allocation for prediction within biparental populations (Endelman et al. 2014) showed that, for the sake of gain from selection, it is often best to phenotype all individuals that have been genotyped, rather than to rely on predictions for the selection of a subset of individuals. For very long-term selection in which gains rely on the input of mutation, genomic selection may underperform phenotypic selection because the effects of new mutants present in the selection candidates are not observed in the training population. At the transition between phenotypic and genomic selection, however, a marked acceleration of gain per unit time may be observed because of more rapid selection cycles under genomic selection (Jannink, in preparation).
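The link between accuracy, cycle length, and gain per unit time reported above can be made concrete with the breeder's equation. The sketch below is a simplification that assumes equal selection intensity and genetic variance under both methods, so the ratio of gains per year reduces to the accuracy ratio multiplied by the cycle-length ratio. The function names and the phenotypic accuracy of 0.8 used in the example are hypothetical; only the genomic accuracy of 0.40 and the one- and four-year cycle times come from the report.

```python
def relative_gain_per_year(acc_gs, acc_ps, cycle_gs_years, cycle_ps_years):
    """Ratio of expected genetic gain per year, genomic vs. phenotypic selection.

    Assumes equal selection intensity and genetic variance for both methods,
    so gain per cycle is proportional to accuracy and gain per year is gain
    per cycle divided by cycle length.
    """
    if min(acc_ps, cycle_gs_years, cycle_ps_years) <= 0:
        raise ValueError("phenotypic accuracy and cycle lengths must be positive")
    per_cycle = acc_gs / acc_ps
    return per_cycle * (cycle_ps_years / cycle_gs_years)


def break_even_accuracy(acc_ps, cycle_gs_years, cycle_ps_years):
    """Genomic accuracy at which both methods give the same gain per year."""
    return acc_ps * cycle_gs_years / cycle_ps_years


if __name__ == "__main__":
    # 0.40 is the barley yield accuracy from the report; 0.8 is a hypothetical
    # phenotypic selection accuracy chosen only for illustration.
    ratio = relative_gain_per_year(0.40, 0.8, cycle_gs_years=1, cycle_ps_years=4)
    print(f"GS gain per year relative to PS: {ratio:.2f}")
    print(f"Break-even GS accuracy: {break_even_accuracy(0.8, 1, 4):.2f}")
```

Under these assumptions, a per-cycle relative efficiency below 1 can still translate into a per-year efficiency above 1 when the genomic cycle is several times shorter, which is the pattern reported for winter wheat.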

Publications

  • Type: Journal Articles Status: Published Year Published: 2012 Citation: Jia, Y., and J.-L. Jannink. 2012. Multiple Trait Genomic Selection Methods Increase Genetic Value Prediction Accuracy. Genetics 192:1513-1522
  • Type: Journal Articles Status: Published Year Published: 2012 Citation: Lorenz, A.J., Smith, K.P., and Jannink, J.L. 2012. Potential and optimization of genomic selection for Fusarium head blight resistance in six-row barley. Crop Sci. 52:1609-1621.


Progress 09/01/10 to 08/31/11

Outputs
OUTPUTS: There are four objectives: Objective 1) Compare the ability of genomic selection (GS) and phenotypic selection (PS) to predict future performance of lines and select the best lines. We have completed the second year of phenotyping a set of 470 wheat lines (progeny set) for yield (6 environments) and Fusarium head blight (FHB) index (2 environments) and one year of phenotyping for quality traits at two locations. We obtained a set of 1,820 DArT markers for 449 of the lines. Objective 2) Assess the ability of GS to predict a parent's breeding value. In wheat, we have completed the first year of phenotyping the parents of the 470 lines used in objective 1. We will submit these for SNP genotyping during the winter of 2011-12. The second year of field trials for these lines was planted in the fall of 2011. In barley, we have completed the second year of data collection on the parent set of 168 lines for a total of 5 yield trials and 4 FHB trials. Analysis of grain samples for the 2010 trials is complete and analysis of the 2011 samples is in progress. Objective 3) Determine whether a trained GS model maintains accuracy over breeding cycles. All the data have been assembled and this analysis is in progress. Preliminary results indicate that prediction accuracies from cross-validation studies are generally higher than prediction accuracies calculated from progenies of parents in the training population. Objective 4) Use simulations to assess scenarios for the introduction and implementation of GS in a breeding program to optimize short- and long-term success over cycles of GS. From long-term selection experiments in model systems, we know that mutation can play an important role in long-term response. However, GS will not adequately capture contributions from new mutations.
We are currently setting up simulation experiments to assess what impact this issue might have on medium and long-term gain, and whether approaches can be developed to improve GS capture of mutational events. Three graduate students and two post-doctoral scientists participated in research during this period. Three presentations were given to the scientific community and end-users. PARTICIPANTS: Nothing significant to report during this reporting period. TARGET AUDIENCES: Nothing significant to report during this reporting period. PROJECT MODIFICATIONS: Nothing significant to report during this reporting period.

Impacts
Based on results from simulation studies looking at long-term gain, we implemented selection criteria in a separately funded breeding project that weight allelic effects based on frequency, with the idea that favorable alleles currently at low frequency will contribute more to progress in later cycles. Our preliminary observation is that GS prediction accuracies based on cross-validation of a training population are generally higher than what we observe when the training population is used to predict progenies. This suggests that when cross-validation is used to assess the feasibility of GS, prediction accuracies should be viewed as optimistic and interpreted accordingly.
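The two ways of estimating accuracy contrasted above can be sketched side by side. The code below is a minimal illustration, not the project's analysis: it uses plain ridge regression as a stand-in prediction model, and the function names, the fold count, and the shrinkage parameter `lam` are assumptions. `cross_validated_accuracy` holds out random subsets of the training population, whereas `progeny_accuracy` trains on the whole population and predicts a separate, later set of lines.

```python
import numpy as np


def ridge_predict(train_x, train_y, test_x, lam):
    """Fit ridge regression on the training lines and predict the test lines."""
    means = train_x.mean(axis=0)
    centered = train_x - means
    lhs = centered.T @ centered + lam * np.eye(train_x.shape[1])
    effects = np.linalg.solve(lhs, centered.T @ (train_y - train_y.mean()))
    return train_y.mean() + (test_x - means) @ effects


def cross_validated_accuracy(markers, phenotypes, lam, n_folds=5, seed=0):
    """Mean correlation between predicted and observed values over k folds."""
    markers = np.asarray(markers, dtype=float)
    phenotypes = np.asarray(phenotypes, dtype=float)
    order = np.random.default_rng(seed).permutation(len(phenotypes))
    accuracies = []
    for fold in np.array_split(order, n_folds):
        train = np.setdiff1d(order, fold)  # all lines not in the held-out fold
        predicted = ridge_predict(markers[train], phenotypes[train], markers[fold], lam)
        accuracies.append(np.corrcoef(predicted, phenotypes[fold])[0, 1])
    return float(np.mean(accuracies))


def progeny_accuracy(train_x, train_y, progeny_x, progeny_y, lam):
    """Accuracy when the whole training population predicts a later progeny set."""
    predicted = ridge_predict(
        np.asarray(train_x, dtype=float),
        np.asarray(train_y, dtype=float),
        np.asarray(progeny_x, dtype=float),
        lam,
    )
    return float(np.corrcoef(predicted, np.asarray(progeny_y, dtype=float))[0, 1])
```

Comparing the two numbers for the same trait gives a direct sense of how optimistic the cross-validation estimate is for a given training population.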

Publications

  • Lorenz, A. J., Chao, S., Asoro, F.G, Heffner, E.L., Hayashi, T., Iwata, H., Smith, K.P., Sorrells, M.E. and Jannink, J.L. 2011. Genomic Selection in Plant Breeding: Knowledge and Prospects. Advances in Agronomy 110:77-123.
  • Heslot N., Yang H.-P., Sorrells M.E., Jannink J.-L. 2011. Genomic Selection in Plant Breeding: A Comparison of Models. Crop Science, in press. DOI: 10.2135/cropsci2011.06.0297.
  • Heffner E.L., Jannink J.-L., Iwata H., Souza E., Sorrells M.E. 2011. Genomic Selection Accuracy for Grain Quality Traits in Biparental Wheat Populations. Crop Science 51:2597-2606.


Progress 09/01/09 to 08/31/10

Outputs
OUTPUTS: There are four objectives: Objective 1) Compare the ability of genomic selection (GS) and phenotypic selection (PS) to predict future performance of lines and select the best lines. In wheat, we have completed the first year of phenotyping a set of 470 wheat lines (progeny set) for yield (3 locations) and Fusarium head blight index (1 location). Field trials to assess these traits for a second year were planted in the fall of 2010. We obtained a set of 2,293 DArT markers for 449 lines. In barley, we have compiled historical phenotypic data from the progeny set (1204 lines) resulting from crosses involving 100 parents from the breeding program. Objective 2) Assess the ability of GS to predict a parent's breeding value. In wheat, we have assembled yield data from all past trials involving the parents and the lines tested with the parents (parent set). A seed increase is in progress for use in future field trials. In barley, the parent set, which includes the 100 parents mentioned above and an additional 100 lines that are contemporaries of the parents, was evaluated in yield trials (2 locations) and Fusarium head blight trials (2 locations) in 2010. Objective 3) Determine whether a trained GS model maintains accuracy over breeding cycles. We completed data collection from 7 sets of breeding lines spanning a seven-year period. Objective 4) Use simulations to assess scenarios for the introduction and implementation of GS in a breeding program to optimize short- and long-term success over cycles of GS. We have developed platforms to address those questions that have resulted in two publications. One study used an analytical framework to ask at what level of genomic prediction accuracy GS might outperform phenotypic selection, given the accelerated breeding cycle under GS. The other study used a stochastic simulation framework to ask what the impact of GS might be on long-term gain.
PARTICIPANTS: Jean-Luc Jannink and Clay Sneller TARGET AUDIENCES: Barley growers, breeders and geneticists PROJECT MODIFICATIONS: Nothing significant to report during this reporting period.

Impacts
We are completing the data collection for objectives 1-3 and have not yet analyzed data to determine primary outcomes for these objectives or assessed their impacts. In the simulation work of objective four we have several findings. One important finding is that even relatively low GS accuracies can outperform phenotypic selection in terms of gain per unit time given the fact that GS can substantially shorten the breeding cycle. The other important finding is that breeders should use genotypic information not only to predict breeding values, but also to maintain genetic diversity. In this way, GS can outperform phenotypic selection on both a short- and long-term basis. Based on the results of this simulation work, we have developed a method to select parents that takes into account both the magnitude of marker effects and their frequency in the breeding population in a project where we are implementing genomic selection.
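A parent-selection score that accounts for both marker effect size and allele frequency, as described above, might look like the following sketch. The report does not give the weighting formula, so the inverse-square-root weight, the `min_freq` floor, the 0/1/2 genotype coding, and all function names here are assumptions chosen for illustration only.

```python
from math import sqrt


def favorable_allele_frequencies(genotypes, effects):
    """Frequency of the favorable allele at each marker.

    genotypes : list of lines, each a list of allele counts (assumed 0/1/2
                copies of the counted allele)
    effects   : estimated effect of the counted allele at each marker
    """
    n_lines = len(genotypes)
    freqs = []
    for j, effect in enumerate(effects):
        counted = sum(line[j] for line in genotypes) / (2 * n_lines)
        # If the counted allele has a negative effect, the other allele is favorable.
        freqs.append(counted if effect >= 0 else 1.0 - counted)
    return freqs


def weighted_scores(genotypes, effects, min_freq=0.01):
    """Selection score that up-weights markers whose favorable allele is rare.

    The weight 1 / sqrt(frequency) is an illustrative choice, not necessarily
    the weighting used in the project; min_freq (hypothetical) caps the weight.
    """
    freqs = favorable_allele_frequencies(genotypes, effects)
    weights = [1.0 / sqrt(max(f, min_freq)) for f in freqs]
    return [
        sum(w * u * z for w, u, z in zip(weights, effects, line))
        for line in genotypes
    ]
```

With equal marker effects, two lines that tie on an unweighted score are separated by this index in favor of the line carrying the rarer favorable allele, which is the behavior intended to preserve diversity for later cycles.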

Publications

  • Jannink J.-L., Lorenz A.J., Iwata H. (2010) Genomic selection in plant breeding: from theory to practice. Briefings in Functional Genomics 9:166-177. DOI: 10.1093/bfgp/elq001.
  • Heffner E.L., Lorenz A.J., Jannink J.-L., Sorrells M.E. (2010) Plant Breeding with Genomic Selection: Gain per Unit Time and Cost. Crop Science 50:1681-1690. DOI: 10.2135/cropsci2009.11.0662.
  • Jannink J.-L. (2010) Dynamics of long-term genomic selection. Genetics Selection Evolution 42:35.