Progress 04/15/17 to 04/14/21
Outputs Target Audience: There are four primary target audiences for this project: 1) wheat breeders and scientists, 2) wheat growers, 3) wheat grain buyers and milling companies, and 4) wheat consumers. We have presented project objectives and findings to the scientific community at international scientific conferences, including Plant and Animal Genome and the International Gluten Workshop. For dissemination to wheat growers, we are closely connected to the Kansas Wheat Commission (KWC) and the Kansas Association of Wheat Growers. Throughout the year, formal and informal presentations about the scope and progress of the project have been given to these groups at KWC meetings, field days and other venues. We continue to stay attuned to the needs and priorities of growers in a demanding market in order to develop varieties with superior quality profiles. To communicate project activities and goals to wheat grain buyers and milling companies, we have updated this group at the Wheat Quality Council meeting and at industry meetings held in conjunction with KWC meetings and Wheat Genetics Resource Center Industry Advisory Board meetings. These meetings bring together many of the primary flour millers in the U.S. and provide input and feedback on project developments. For the general public of wheat consumers, we maintain a social media presence on Facebook and Twitter to provide project updates and to illustrate the focus of scientists and breeders on developing higher-quality, more nutritious wheat varieties.
Changes/Problems:
Nothing Reported
What opportunities for training and professional development has the project provided? Three postdoctoral scholars, one PhD student and one visiting scientist have received training and experience in wheat breeding and genetics, genomics and advanced breeding methodologies through this project.
How have the results been disseminated to communities of interest? The two primary communities of interest for this project are 1) the scientific community, including wheat breeders, and 2) the larger wheat producer community, including the Wheat Commission and the Growers Association. Throughout the year, presentations and project updates have been made to both of these groups through multiple venues. Multiple presentations that included results developed in this project were made to the Kansas Wheat Commission as well as to private companies, including GrainCraft and Indigo Ag.
What do you plan to do during the next reporting period to accomplish the goals?
Nothing Reported
Impacts What was accomplished under these goals?
Profiling of advanced breeding lines: Quality samples were taken from 990 KSU preliminary yield nursery (PYN) lines that were grown during the 2018-19 growing season at Colby, KS under optimal irrigation conditions. For all of the PYN lines and selected lines from the first-generation yield testing (n~1200), sodium dodecyl sulfate (SDS) sedimentation analysis was completed as a high-throughput phenotype for glutenin quality and bread making. From the first-generation yield nursery samples, over 1200 lines were increased for evaluation. This set of PYN lines, along with an additional set of PYN lines from 2018, was evaluated for thousand kernel weight (all PYN 2018 and 2019, n=1881; completed January 2021). Experimental-scale milling was completed for the full set of PYN samples. Quality traits, including milling yield, ash content and flour protein, were recorded on all 2019 PYN lines grown at the Colby irrigated site (completed January 2021). These milled samples are in freezer storage in preparation for baking with the USDA Grain Quality and Structure Research Unit. The whole grains and milled flour were assessed for mineral content using GC-MS at the Danforth Center (Dr. Ivan Baxter). The ion profiling of all 2018 and 2019 PYN samples grown at the Colby irrigated site was completed as described in Pauli et al. (2018), https://doi.org/10.1534/g3.117.300479
Metabolomic profiling: Using a pipeline developed during the first phases of the project, we performed metabolic profiling of 929 flour samples collected in the 2017 and 2018 field seasons across six locations in Kansas. Metabolite extraction was performed using a chloroform/water mix followed by GC/MS.
After initial data inspection, chromatographic peak detection, retention time correction (based on an algorithm that accounts for non-linear shifts), chromatogram alignment, peak regrouping, and filling in of missing peak data, significant differences in analyte intensities were recorded. The consistency of metabolite detection was validated by incorporating technical replications, which showed R² ranging from 0.83 to 0.99. Metabolites showing the same mass-to-charge ratio and retention times within 5 seconds of each other across different batches were considered the same peak. In total, we identified ~700 metabolite peaks.
Usage of metabolites and SNPs as predictors of wheat quality traits: The analysis of the genetic relationship between wheat quality traits and metabolite profiles was performed for a subset of 238 lines. Within this subset, a total of 141 metabolites were consistently identified in all batches (Nyine et al., 2021). The population was genotyped using sequence-based approaches, resulting in a set of 13K SNPs. Data on quality traits included percent grain protein content, protein quality score (PQS) and amylose content in 2018 and 2019. The grain protein content (GPC) was measured using a Perten grain analyser. The percent solvent retention capacity weight value (SRC) for each sample was measured following the hybrid SDS-SRC method. The PQS was then calculated by dividing the SRC value by the GPC. Amylose content was determined from whole mill flour samples. The GPC varied from 12.6 to 17.40% with a mean of 13.97% in 2018 and from 11.85 to 15.85% with a mean of 13.37% in 2019. The PQS varied from 12.68 to 24.75% (mean PQS = 20.33%) in 2018 and from 14.80 to 24.11% (mean PQS = 19.81%) in 2019. The amylose content varied between 11.78 and 22.32% with an average of 16.49% in the analyzed lines.
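As a small worked illustration of the PQS calculation described above (SRC weight value divided by GPC), the following sketch uses hypothetical line names and values; they are not project data, and the function name is an assumption made for this example.

```python
def protein_quality_score(src_percent, gpc_percent):
    """Return PQS as the SRC weight value divided by grain protein content."""
    if gpc_percent <= 0:
        raise ValueError("grain protein content must be positive")
    return src_percent / gpc_percent


# Hypothetical samples for illustration only (not measured values).
samples = {
    "line_A": {"src": 280.0, "gpc": 14.0},
    "line_B": {"src": 240.0, "gpc": 12.0},
    "line_C": {"src": 225.0, "gpc": 15.0},
}

pqs = {name: protein_quality_score(v["src"], v["gpc"]) for name, v in samples.items()}

# Rank lines from highest to lowest PQS.
for name, score in sorted(pqs.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: PQS = {score:.2f}")
```

Because PQS is a ratio, two lines with different protein contents can reach the same score (line_A and line_B here), which is why the score is read as protein quality rather than protein quantity.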
Two models, ridge regression best linear unbiased prediction (rrBLUP) and Bayesian B (Bayes B), were used to determine the prediction accuracy (PA) for GPC, PQS and amylose content. The predictors included metabolites (MTB), SNPs and a combination of SNPs with MTB (SNP+MTB). The two models performed similarly well in predicting the traits, although the rrBLUP model showed higher prediction accuracy with most predictors and traits. The rrBLUP model provided the highest prediction accuracy for PQS (0.80) and amylose content (0.58) using SNPs and SNP+MTB as predictors. However, the addition of MTB did not have a significant impact on prediction accuracy in rrBLUP. In contrast, for the GPC trait, using the MTB predictor alone or in combination with SNPs slightly improved prediction accuracy compared to SNP-only predictions with both the rrBLUP (0.70 versus 0.68) and Bayes B (0.73 versus 0.70) models, suggesting that metabolites are a good proxy for genetic markers in the prediction of grain protein content.
Develop, test and implement novel prediction models: DNA was extracted from all breeding lines and prepared for sequencing using a standard genotyping-by-sequencing (GBS) protocol (Poland et al., 2012). All sequencing data were then stored and analyzed on the High-Performance Computing (HPC) cluster of Kansas State University. Single nucleotide polymorphisms (SNPs) were discovered using an automated pipeline in TASSEL 5 (Bradbury et al., 2007), which is written in Java. The pipeline first sorted reads by the nucleotide barcodes assigned to each breeding line; the sequences were then trimmed to 64 base pairs in length and SNPs were identified. The sequence reads containing discovered SNPs were aligned against the Chinese Spring reference genome v1.0 (IWGSC, 2018) using Bowtie 2 to locate their physical positions. Only SNP markers with a minor allele frequency of at least 5% (MAF ≥ 0.05), heterozygosity of no more than 20% (Het ≤ 0.20), and less than 50% missing values (NA < 0.50) across genotypes were retained. SNP markers that yielded multi-allelic calls were discarded, while those on unanchored (UN) sequences were kept in a separate group named 'UN'. A total of 22,172 SNPs remained in the genotypic dataset after filtering.
Genome-wide Association Study: In this objective, our focus was to identify genetic variants underlying baking quality traits through genome-wide association studies (GWAS) and to develop improved genomic selection (GS) models for the quality traits. A total of 462 advanced breeding lines from 2015-2017 were genotyped using GBS and profiled for baking quality. The best linear unbiased prediction (BLUP) of each genotype for each trait was calculated and used in the GWAS and GS analyses. GWAS analyses were conducted in R using an enhanced compression of the mixed linear model and the Genome Association and Prediction Integrated Tool (GAPIT) (Lipka et al., 2012; Tang et al., 2016). Principal component analysis was used to check whether genetic structure was present within this panel of breeding lines; none was found. The inclusion of principal components and a kinship matrix as covariates was also tested to improve the significance of SNP-trait associations. The significance of SNP-trait associations was determined by the Bonferroni test at a 5% probability of error.
Because of the stringency of this test, an additional significance threshold based on a false discovery rate of 10% (FDR10%) was also considered. GWAS results from GAPIT were reloaded into R using the package 'qqman' to generate Manhattan plots color-coded by wheat sub-genome. Significant GWAS signals were detected for mixograph mixing time and bake mixing time, most of which were within or in tight linkage to HMWglu-1D, LMWglu-1A, LMWglu-1B, Gli-1A, Gli-1B and Gli-6A and could be used for marker-assisted breeding. Candidate genes for the four new signals are phosphate-dependent decarboxylase and lipid transfer protein genes, which are believed to affect nitrogen metabolism and dough development, respectively.
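The marker quality filters reported for the GBS data in this section (MAF ≥ 0.05, Het ≤ 0.20, missing < 0.50) can be sketched as below. This is a minimal illustration and not the TASSEL pipeline itself. It assumes biallelic calls coded as 1 (homozygous major), -1 (homozygous minor), 0 (heterozygous) and None (missing), and it assumes heterozygosity and allele frequency are computed over non-missing calls only. The marker names and genotype vectors are made up.

```python
def snp_passes(calls, min_maf=0.05, max_het=0.20, max_missing=0.50):
    """Apply the MAF, heterozygosity and missing-data filters to one marker.

    calls: list of 1 (homozygous major), -1 (homozygous minor),
           0 (heterozygous) or None (missing), one entry per line.
    """
    observed = [c for c in calls if c is not None]
    missing_rate = 1 - len(observed) / len(calls)
    if not observed or missing_rate >= max_missing:
        return False
    het_rate = observed.count(0) / len(observed)
    # Copies of the "-1" allele: two per homozygote, one per heterozygote.
    minor_copies = 2 * observed.count(-1) + observed.count(0)
    freq = minor_copies / (2 * len(observed))
    maf = min(freq, 1 - freq)
    return maf >= min_maf and het_rate <= max_het


# Hypothetical markers scored on 20 lines each.
markers = {
    "snp_ok": [1] * 14 + [-1] * 4 + [0] + [None],
    "snp_rare": [1] * 19 + [0],                     # MAF = 0.025
    "snp_het": [1] * 10 + [0] * 6 + [-1] * 4,       # Het = 0.30
    "snp_missing": [1] * 6 + [-1] * 4 + [None] * 10,  # 50% missing
}

kept = [name for name, calls in markers.items() if snp_passes(calls)]
print(kept)
```

Only the first marker passes all three filters; each of the others fails exactly one criterion, which makes it easy to check the thresholds one at a time.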
Publications
- Type:
Journal Articles
Status:
Accepted
Year Published:
2021
Citation:
Shichen Zhang-Biehn, Allan K. Fritz, Guorong Zhang, Byron Evers, Rebecca Regan and Jesse Poland (2021) "Accelerating Wheat Breeding for End-use Quality through Association Mapping and Multivariate Genomic Prediction." The Plant Genome (accepted)
- Type:
Journal Articles
Status:
Other
Year Published:
2021
Citation:
Nyine M, Guo Y, Clinesmith M, Fritz A, Akhunov E. (2021) Genetic analysis of metabolic and grain quality trait variation in winter wheat.
|
Progress 04/15/19 to 04/14/20
Outputs Target Audience: There are four primary target audiences for this project: 1) wheat breeders and scientists, 2) wheat growers, 3) wheat grain buyers and milling companies, and 4) wheat consumers. We have presented project objectives and findings to the scientific community at international scientific conferences, including Plant and Animal Genome and the International Gluten Workshop. For dissemination to wheat growers, we are closely connected to the Kansas Wheat Commission (KWC) and the Kansas Association of Wheat Growers. Throughout the year, formal and informal presentations about the scope and progress of the project have been given to these groups at KWC meetings, field days and other venues. We continue to stay attuned to the needs and priorities of growers in a demanding market in order to develop varieties with superior quality profiles. To communicate project activities and goals to wheat grain buyers and milling companies, we have updated this group at the Wheat Quality Council meeting and at industry meetings held in conjunction with KWC meetings and Wheat Genetics Resource Center Industry Advisory Board meetings. These meetings bring together many of the primary flour millers in the U.S. and provide input and feedback on project developments. For the general public of wheat consumers, we maintain a social media presence on Facebook and Twitter to provide project updates and to illustrate the focus of scientists and breeders on developing higher-quality, more nutritious wheat varieties.
Changes/Problems:
COVID delays: Due to the university shutdown and stay-at-home orders during March and April 2020, we were not able to progress with sample analysis in the quality lab as expected. Samples were stored in a controlled environment, and processing resumed in July 2020. We expect to complete the planned data collection by Fall.
What opportunities for training and professional development has the project provided? Three postdoctoral scholars, one PhD student and one visiting scientist are currently active in the project and are being trained and mentored through it. Workshop and conference attendance and presentations: Emily Delorean attended the National Association of Plant Breeders Annual Meeting in Pine Mountain, GA, and the Plant and Animal Genome Conference Annual Meeting in San Diego, CA.
How have the results been disseminated to communities of interest? The two primary communities of interest for this project are 1) the scientific community, including wheat breeders, and 2) the larger wheat producer community, including the Wheat Commission and the Growers Association. Throughout the year, presentations and project updates have been made to both of these groups through multiple venues. Multiple presentations that included results developed in this project were made to the Kansas Wheat Commission as well as to private companies, including GrainCraft and Indigo Ag.
What do you plan to do during the next reporting period to accomplish the goals? An additional ~1200 PYN lines from the unselected set are being milled, and flours will be provided to the project team for further quality tests. An additional set of 600 samples of wheat flour will be processed to obtain metabolite profiles of advanced breeding lines. A postdoctoral researcher will analyze genotyping and omics datasets to identify genetic loci controlling trait variation. The comparison of metabolite profiles against databases of known metabolites is currently being performed to infer the identity of detected analytes. High-throughput genotyping for glutenins and gliadins will be developed. The k-mer pipeline for allele prediction will be enriched with sequences from Kansas wheat varieties and novel tauschii alleles for Glu-D1. The k-mer pipeline will be extended to the prediction of LMW glutenins and gliadins.
Impacts What was accomplished under these goals?
Profiling of advanced breeding lines: Quality samples were taken from 990 KSU PYN lines that were grown during the 2018-19 growing season at Colby, KS under optimal irrigation conditions. Samples for the preliminary yield nursery (PYN) were combine harvested and processed. Eighty-seven lines were selected in the breeding program for advancement to 3rd-year yield testing (Advanced Yield Nursery, AYN). An additional set of 200 lines not selected for advancement was included to capture the full range of quality profiles in the current breeding materials. For all of the PYN lines and selected lines from the first-generation yield testing (n~1200), sodium dodecyl sulfate (SDS) sedimentation analysis was completed as a high-throughput phenotype for glutenin quality and bread making. From the first-generation yield nursery samples, over 1200 lines were evaluated. From these data we have developed genomic prediction models and completed a genome-wide association study. In 2019, a set of 406 breeding lines from the Preliminary Yield trial Nursery (PYN18), yield-tested in Colby, KS under irrigation, was used for the SDS-SRC sedimentation test. Phenotypic data from these genotypes were later used in association mapping studies and for testing genomic prediction models with resample-based cross-validation. This dataset has also been used as a training set (TS) to predict the end-use quality of breeding materials in the IPSR18, PYN18, IPSR19, and IPSR19 nurseries. SDS sedimentation testing for the remaining 2019 PYN samples is in progress (delayed by COVID-19-related shutdowns). These SDS tests will be completed on the remaining unselected PYN samples, which are also being milled and profiled in the quality lab.
Wheat Quality Lab Testing: In 2019, we continued to evaluate PYN (Preliminary Yield Trials), AYN (Advanced Yield Nursery), and KIN (Kansas Interstate Nursery) lines from the KSU breeding program(s) in the quality lab.
Approximately 470 samples were tested for quality. Samples were prepared, labeled, cleaned, evaluated for moisture and protein, tempered, and milled. Milling yield and ash content were determined. Flour moisture and protein were measured. Mixograph analysis was conducted for all the samples. Farinograph analysis and bread test baking were performed for all the samples except Hays PYN1.
Metabolomic profiling: A previously developed pipeline was used to process metabolite data for 831 flour samples, comprising 333 samples from the 2017 field season and 498 samples from the 2018 field season. Metabolite extraction was performed using a chloroform/water mix followed by GC/MS. The data processing was based on routines implemented in the XCMS Bioconductor package. After chromatographic peak detection, retention time correction was performed for batches processed at different times. Cross-sample comparison was then used to group chromatographic peaks based on their density and presence in all processed batches of samples. In total, we have identified ~728 metabolite peaks, with 575 and 728 metabolite peaks identified in batches of samples from the 2017 and 2018 field seasons, respectively. Currently, we use AMDIS software to screen the NIST database for matches with known metabolites. The analysis of the genetic relationship between wheat quality traits and metabolite profiles is underway. In addition, we set up a high-throughput protocol for measuring amylose content, which influences the milling, rheology and baking properties of wheat flour. Using a replicated subset of samples, we assessed the technical variability of the measurements, which were conducted using a Synergy Hybrid 1 plate reader (ratio A550/A620) with the results analyzed in the Gen5 software. The absorbance ratios of the standards were used to correct for interference from amylopectin in the determination of amylose concentration.
There was significant variation in amylose concentration among the samples (P = 0.02), with no significant effect of technical replication (P = 0.99), at the 95% confidence level. The concentration of amylose varied from ~10% to ~23%.
High-throughput genotyping of glutenin alleles: The different high molecular weight (HMW) glutenin alleles are classically determined by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) profiling of the proteins (Payne, Corfield, & Blackman, 1979; Payne et al., 1981). To examine the molecular structure of the different known alleles, we compared SDS-PAGE haplotypes with sequence variants in the HMW glutenin genes from the newly assembled wheat pan-genome from the 10+ Wheat Genome Project. We found that molecular haplotypes directly corresponded with each of the known HMW alleles in all but two cases. In the first, we were able to differentiate two sequence haplotypes for the 7+8 allele at Glu-B1. In the second, Glu-D1 2.2+12 could not be differentiated from 2+12, supporting the hypothesis that 2.2+12 recently originated from 2+12 (Payne, Holt, & Lawrence, 1983). The overall conservation of sequence haplotypes within HMW glutenin alleles of globally representative wheat genomes indicates that the functional alleles at the important glutenin genes are each of single origin and shared among global breeding programs. The completed assemblies and detailed sequence characterization of the alleles also enable the development of effective high-throughput markers for molecular breeding, as well as the differentiation and prediction of alleles using high-throughput skim sequencing. We then used the extensive sequence information from the 10+ Wheat Genome Project, coupled with whole-genome sequencing of 95 CIMMYT wheat varieties, to capture sequence haplotypes as k-mers.
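To illustrate what capturing sequence haplotypes as k-mers can look like, here is a minimal sketch of the general idea. It is not the project's pipeline. The allele sequences and reads are short made-up strings rather than real glutenin sequences, the function names are hypothetical, and reverse-complement handling and sequencing errors are ignored for brevity.

```python
from collections import Counter


def kmers(seq, k):
    """Return the set of all k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def diagnostic_kmers(allele_seqs, k):
    """For each allele, keep only the k-mers that occur in no other allele."""
    per_allele = {name: kmers(seq, k) for name, seq in allele_seqs.items()}
    counts = Counter(km for kset in per_allele.values() for km in kset)
    return {name: {km for km in kset if counts[km] == 1}
            for name, kset in per_allele.items()}


def call_allele(reads, diag, k):
    """Count diagnostic k-mer hits in the reads and return the best-supported allele."""
    hits = Counter()
    for read in reads:
        for km in kmers(read, k):
            for name, kset in diag.items():
                if km in kset:
                    hits[name] += 1
    best = hits.most_common(1)[0][0] if hits else None
    return best, hits


# Toy alleles differing by a single base (hypothetical sequences).
alleles = {
    "allele_1": "ACGTACGTTAGCCGATAGGCT",
    "allele_2": "ACGTACGTTAGACGATAGGCT",
}
k = 5
diag = diagnostic_kmers(alleles, k)

# Two short "skim" reads drawn from allele_2; only the first spans the variant.
reads = ["GTTAGACGAT", "CGTACGTTAG"]
best, hits = call_allele(reads, diag, k)
print(best, dict(hits))
```

Reads that do not overlap a variant site contribute no diagnostic k-mers, so at low (skim) coverage the call depends on having enough diagnostic k-mers per allele that at least some are sampled.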
We developed a user-friendly bioinformatics pipeline that searches skim sequencing data of breeding lines for these diagnostic k-mers and determines the allelic state at the HMW glutenin loci. This approach has the potential to offer a low-cost, high-throughput, sequence-based alternative to gel methods for gluten genotyping in breeding programs.
Genome-wide Association Study: In this objective, our focus was to identify genetic variants underlying baking quality traits through genome-wide association studies (GWAS) and to develop improved genomic selection (GS) models for the quality traits. A total of 462 advanced breeding lines from 2015-2017 were genotyped using genotyping-by-sequencing (GBS) and profiled for baking quality. The best linear unbiased prediction (BLUP) of each genotype for each trait was calculated and used in the GWAS and GS analyses. GWAS analyses were conducted in R using an enhanced compression of the mixed linear model and the Genome Association and Prediction Integrated Tool (GAPIT). Significant GWAS signals were detected for mixograph mixing time and bake mixing time, most of which were within or in tight linkage to HMWglu-1D, LMWglu-1A, LMWglu-1B, Gli-1A, Gli-1B and Gli-6A and could be used for marker-assisted breeding. Candidate genes for the four new signals are phosphate-dependent decarboxylase and lipid transfer protein genes, which are believed to affect nitrogen metabolism and dough development, respectively.
Genomic Prediction Analysis: Early-generation selection with GS can both shorten the breeding cycle time and significantly increase the number of lines that can be selected for quality traits. For various end-use quality traits, univariate GS models had prediction abilities of 0.25 to 0.55 in cross-validation and 0 to 0.41 in forward prediction. By including secondary traits as additional predictor variables (univariate GS with covariates) or as correlated response variables (multivariate GS), the prediction abilities were increased to 0.30 to 0.83 in cross-validation and 0 to 0.96 in forward prediction. The improved genomic prediction models have great potential to further accelerate wheat breeding for end-use quality.
Publications
- Type:
Conference Papers and Presentations
Status:
Published
Year Published:
2020
Citation:
Delorean, E., Amos, C., Guzman, C., Ibba, M., Pozniak, C., 10+ Wheat Genomes Project, Poland, J. Sequence Based Genotyping for Glutenin Genes of Wheat. International Plant & Animal Genome Conference; January 11-15, 2020; San Diego, CA.
- Type:
Conference Papers and Presentations
Status:
Published
Year Published:
2019
Citation:
Delorean, E., Amos, C., Guzman, C., Pozniak, C., 10+ Wheat Genomes Project, Poland, J. High-throughput Sequence Based Genotyping for Wheat Quality Genes. National Association of Plant Breeders Annual Meeting; August 25-29, 2019; Pine Mountain, GA.
- Type:
Other
Status:
Published
Year Published:
2019
Citation:
Delorean, E., Amos, C., Guzman, C., Ibba, I., Singh, R., Pozniak, C., 10+ Wheat Genomes Project, Poland, J. Genomic Prediction for Quality Traits and Sequence Based Genotyping for Quality Genes. USAID Feed the Future Innovation Lab for Applied Wheat Genomics Annual Meeting, July 29-30, 2019. CIMMYT, El Batan, Mexico.
|
Progress 04/15/18 to 04/14/19
Outputs Target Audience:
Nothing Reported
Changes/Problems:
Nothing Reported
What opportunities for training and professional development has the project provided?
*What was accomplished under these goals (continued):
Genome-wide Association Study: Genome-wide association study (GWAS) analyses were conducted in R using an enhanced compression of the mixed linear model (Li et al., 2014) and the Genome Association and Prediction Integrated Tool (GAPIT) (Lipka et al., 2012; Tang et al., 2016). No imputation of missing alleles was performed prior to the association mapping analysis. Principal component analysis was used to check whether genetic structure was present within this panel of breeding lines; none was found. The inclusion of principal components and a kinship matrix as covariates was also tested to improve the significance of SNP-trait associations. The significance of SNP-trait associations was determined by the Bonferroni test at a 5% probability of error. Because of the stringency of this test, an additional significance threshold based on a false discovery rate of 10% (FDR10%) was also considered. GWAS results from GAPIT were reloaded into R using the package 'qqman' to generate Manhattan plots color-coded by wheat sub-genome. The strongest associations were identified on chromosomes 1AS, 1BS, and 1DS, which likely correspond to the glutenin and gliadin loci and to the 1B-1R translocations, which are known to be associated with the expression of important end-use quality parameters in wheat (Figure 2). A total of 46 SNPs were significantly associated with protein quality score based on the Bonferroni correction at a 5% probability of error. Other significant associations for protein quality score were identified on 2AL, 2BS, 2DS, 4AS, 4AL, 5BL, and 7BS, which could indicate potential novel loci associated with end-use quality or simply spurious associations. Additional analysis will be needed to confirm these associations. Under the FDR10% significance threshold, a total of 371 SNPs across 18 wheat chromosomes were significant.
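The difference between the two significance thresholds described above can be illustrated with a short sketch. It assumes the FDR threshold follows the Benjamini-Hochberg step-up procedure, which is a common choice but is not stated explicitly in this report. The p-values are made up for illustration and are not GWAS results.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value cutoff that controls the family-wise error rate at alpha."""
    return alpha / n_tests


def benjamini_hochberg(pvalues, fdr):
    """Return indices of tests declared significant by the BH step-up procedure."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    # Find the largest rank whose p-value is below its rank-scaled threshold.
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * fdr:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])


# Hypothetical p-values for 10 SNP-trait tests.
pvals = [0.0001, 0.004, 0.012, 0.03, 0.2, 0.5, 0.7, 0.9, 0.04, 0.6]

cutoff = bonferroni_threshold(0.05, len(pvals))
bonf_hits = [i for i, p in enumerate(pvals) if p <= cutoff]
fdr_hits = benjamini_hochberg(pvals, 0.10)

print("Bonferroni (5%):", bonf_hits)
print("FDR (10%):", fdr_hits)
```

In this toy set the Bonferroni cutoff keeps two tests while the FDR criterion keeps five, mirroring in miniature why the less stringent FDR10% threshold reports many more SNPs than the Bonferroni test.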
Comparing the five most significant SNPs on 1BS with the markers tsm0120, scm0009, and Secalin-Paws, which are linked to 1RS, confirmed that the significant associations in this genomic region correspond to the 1B-1R translocations. Although this result may be extrapolated to predict the presence of this wild segment in other genotypes, we are not yet able to differentiate 1A-1R from 1B-1R translocations.
Genomic Prediction Analysis: We started by converting the genotypic data from nucleotide to numeric format using the function 'A.mat' from the "rrBLUP" package (Endelman, 2011), where the values 1, -1, 0, and NA represented the major allele, minor allele, heterozygous, and missing calls, respectively. Imputation of missing alleles was performed using the 'mean' method within the same function. The "GSwGBS" package (Gaynor, 2015) was then used to run four genomic selection (GS) prediction models simultaneously: ridge regression best linear unbiased predictor (RRBLUP; Endelman, 2011), partial least squares regression (PLSR; Mevik & Wehrens, 2007), elastic net (ELNET; Friedman et al., 2010), and random forest (RF; Liaw & Wiener, 2002). Additionally, an average prediction (AVE; Gaynor, 2015) of these four methods was calculated using standardized values to avoid overly weighting the average towards any single prediction. Resampling-based cross-validation was performed by randomly selecting 80% of the dataset (406 breeding lines) to predict the performance of the remaining 20%. This process was set up in a loop and repeated 100 times for each GS model. The broad-sense heritability was calculated using the model for additive genetic variance from the "rrBLUP" package (Endelman, 2011). The predictive ability and its 95% confidence interval were calculated from the Pearson's correlation (r) between the observed quality scores and the genomic estimated breeding values (GEBV) obtained from each GS model.
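The resampling-based cross-validation described above (mean imputation, repeated random 80/20 splits, and Pearson's correlation between observed values and GEBVs) can be sketched as follows. This is a simplified stand-in and not the GSwGBS or rrBLUP code: it uses ordinary ridge regression with a fixed penalty `lam` (an assumption; rrBLUP estimates the shrinkage from variance components), simulated toy data, and hypothetical function names.

```python
import random
from statistics import mean


def impute_mean(column):
    """Replace missing calls (None) with the mean of the observed calls."""
    mu = mean([v for v in column if v is not None])
    return [mu if v is None else v for v in column]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def ridge_fit(X, y, lam):
    """Fit ridge regression on centered markers; return (intercept, effects)."""
    p = len(X[0])
    y_bar = mean(y)
    x_bar = [mean([row[j] for row in X]) for j in range(p)]
    Xc = [[row[j] - x_bar[j] for j in range(p)] for row in X]
    xtx = [[sum(r[j] * r[k] for r in Xc) + (lam if j == k else 0.0)
            for k in range(p)] for j in range(p)]
    xty = [sum(r[j] * (yi - y_bar) for r, yi in zip(Xc, y)) for j in range(p)]
    beta = solve(xtx, xty)
    return y_bar - sum(b * xb for b, xb in zip(beta, x_bar)), beta


def pearson(a, b):
    """Pearson's correlation coefficient between two equal-length lists."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


def resampling_cv(X, y, reps=100, train_frac=0.8, lam=1.0, seed=1):
    """Repeat random train/test splits; return one predictive ability per split."""
    rng = random.Random(seed)
    n_train = int(round(train_frac * len(y)))
    abilities = []
    for _ in range(reps):
        idx = list(range(len(y)))
        rng.shuffle(idx)
        train, test = idx[:n_train], idx[n_train:]
        b0, beta = ridge_fit([X[i] for i in train], [y[i] for i in train], lam)
        gebv = [b0 + sum(b * x for b, x in zip(beta, X[i])) for i in test]
        abilities.append(pearson([y[i] for i in test], gebv))
    return abilities


# Simulated toy data: 60 lines, 10 markers coded 1 / 0 / -1, one missing call each.
sim = random.Random(42)
n_lines, n_markers = 60, 10
columns = []
for _ in range(n_markers):
    col = [sim.choice([1, 1, 0, -1]) for _ in range(n_lines)]
    col[sim.randrange(n_lines)] = None
    columns.append(impute_mean(col))
X = [[columns[j][i] for j in range(n_markers)] for i in range(n_lines)]
effects = [sim.gauss(0, 1) for _ in range(n_markers)]
y = [sum(e * x for e, x in zip(effects, row)) + sim.gauss(0, 0.5) for row in X]

abilities = resampling_cv(X, y)
print(f"mean predictive ability over {len(abilities)} splits: {mean(abilities):.2f}")
```

Because each test set is drawn from the same population as its training set, this scheme tends to give optimistic values compared with predicting a genuinely new nursery.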
Forward cross-validation was performed by randomly selecting 138 individuals from the IPSR18 that had been selected based on grain yield and testing their quality performance via the hybrid SDS-SRC sedimentation method. The SDS parameters were then correlated (Pearson's correlations) with GEBVs of the same traits, with the initial 406 breeding lines from PYN18 used as the training set. No significant difference was identified between GS models for the traits flour protein, weight value of the precipitate, and protein quality score. For grain yield, however, the PLSR model was significantly less accurate than the other models. The average predictive abilities ranged from 0.52 to 0.70, which can be considered moderately high and agrees with other recent literature reports (Battenfield et al., 2016). Dividing these predictive abilities by the square root of heritability gives the prediction accuracies. As the heritability for these traits ranged from 0.7 to 0.8, the actual prediction accuracies would range from 0.75 to 0.80 in our study, which indicates a promising potential for GS models to assist decision-making in breeding programs for end-use quality, since the conventional tests are laborious and expensive and require a relatively large sample for destructive analysis, i.e. milling. For the forward cross-validation analysis, we randomly picked 138 lines from the IPSR18 that had been selected to move to the PYN19 based on their grain yield potential. All of these samples initially had their end-use quality predicted using the 406 lines from the PYN18 as a training set. These materials were then submitted to quality testing using the SDS-SRC sedimentation method (Seabourn et al., 2012). We then estimated Pearson's correlations between predicted (GEBVs) and observed values for these 138 individuals and found correlations of 0.32 and 0.43 for SDS weight and protein quality score, respectively (Figure 4).
Although these forward-validation correlations are much lower than those observed via resample-based cross-validation, they are still significant, providing more evidence for the potential application of GS in the assessment of end-use quality in commercial wheat breeding programs. They also confirm that resample-based cross-validation strategies often overestimate the prediction accuracy of GS models relative to forward cross-validation analyses that try to mimic the workflow of an actual breeding program, corroborating other literature reports.
See Figure 3: https://drive.google.com/file/d/1PGvM3lM1uMmQcfj3EHDMpxxG8kzN6vDf/view?usp=sharing
See Figure 4: https://drive.google.com/file/d/1YzlHHkoe95s7kEMmQJ1fTirXCF9s4MLh/view?usp=sharing
See Figure 5: https://drive.google.com/file/d/1MDaEhV1MjYjTuWBYKDa1bP2e1AapoCed/view?usp=sharing
*Opportunities for training and professional development: Three postdoctoral scholars, one PhD student and one visiting scientist are currently active in the project and are being trained and mentored through it. To provide opportunities for learning and development, Emily Delorean and Shichen Zhang-Biehn attended the Plant and Animal Genome conference in San Diego, CA, and Cristiano Lemes de Silva attended the New Frontiers in Genetic Evaluation Conference in Lincoln, NE and The 2050 Challenge: The Role of Agriculture in Addressing the Global Needs in Manhattan, KS.
How have the results been disseminated to communities of interest?
*What opportunities for training and professional development has the project provided (continued): Workshop and conference attendance and presentations:
Cristiano Lemes de Silva: 2018 Nebraska Plant Breeding Symposium (03/13/18, Lincoln, NE); New Frontiers in Genetic Evaluation Conference (07/25/18 - 07/27/18, Johnston, IA); The 2050 Challenge: The Role of Agriculture in Addressing the Global Needs (09/06/18, Manhattan, KS).
Emily Delorean: Delorean, E., Amos, C., Guzman, C., Pozniak, C., 10+ Wheat Genomes Project & Poland, J. (2018). DNA Forensics to Identify the Bad Players of Wheat Quality. Poster Presentation. Plant & Animal Genome Conference XXVII. San Diego, CA. January 2019.
Shichen Zhang-Biehn: Zhang-Biehn, S. & Poland, J. (2019). Accelerating Wheat Breeding for End-use Quality through Association Mapping and Multivariate Genomic Prediction. Poster Presentation. Plant & Animal Genome Conference XXVII. San Diego, CA. January 2019.
*How have the results been disseminated to communities of interest? The two primary communities of interest for this project are 1) the scientific community, including wheat breeders, and 2) the larger wheat producer community, including the Wheat Commission and the Growers Association. Throughout the year, presentations and project updates have been made to both of these groups through multiple venues. Preliminary results of the k-mer based pipeline were presented as a poster at the 2019 Plant and Animal Genome Conference. Multiple presentations that included results developed in this project, specifically the SDS information, were made to the Kansas Wheat Commission as well as to private companies, including GrainCraft and Indigo Ag.
What do you plan to do during the next reporting period to accomplish the goals? There are currently 800 preliminary yield trial lines being grown under optimal irrigation conditions at the Colby, KS breeding station for inclusion in the 2019 samples.
Conduct SDS sedimentation for the remaining 2018 field material and new 2019 nurseries. For 2018, a total of 1,270 samples that are relevant to the project will be run, including selected lines from the IPSR, PYN, AYN, CKE, and KIN nurseries. A similar set will be analyzed for the 2019 nurseries. An additional set of 600 wheat flour samples will be processed to obtain metabolite profiles of advanced breeding lines. A postdoctoral researcher will analyze genotyping and omics datasets to identify genetic loci controlling trait variation. Metabolite profiles are currently being compared against databases of known metabolites to infer the identity of detected analytes. Development of high-throughput genotyping for glutenins and gliadins: The k-mer pipeline for allele prediction will be enriched with sequences from Kansas wheat varieties. The k-mer pipeline will be extended to the prediction of LMW glutenins and gliadins.
Impacts What was accomplished under these goals?
Profiling of advanced breeding lines: Quality samples from the KSU breeding program were grown in 2017-18 nurseries at Colby, KS under optimal irrigation conditions. Samples from the preliminary yield nursery (PYN) were combine harvested and processed, including 80 advanced lines that were selected in the breeding program for advancement to 3rd-year yield testing (Advanced Yield Nursery, AYN). An additional 200 lines that were not selected for advancement were included to capture the full range of quality profiles in the current breeding materials. For all of the PYN and selected lines from the first-generation yield testing (n~1200), sodium dodecyl sulfate (SDS) sedimentation analysis was completed as a high-throughput phenotype for glutenin quality and bread making. From the first-generation yield nursery samples, over 1,200 lines were evaluated. From these data we have developed genomic prediction models and completed a genome-wide association study.
The hybrid SDS-SRC sedimentation method (Seabourn et al., 2012) was used as a rapid assessment of end-use quality of advanced lines of the K-State Wheat Breeding Program. This method has consistently shown high correlations (≥0.60) with bread loaf volume, grain protein content, and gluten strength. It uses only 1 gram of flour per sample, as detailed in Figure 1, and allows up to 144 samples to be tested per day. Two quality parameters are extracted from this analysis: the weight of the pellet formed in both tubes and the protein quality score, which takes into consideration the grain protein concentration of the flour samples. Protein was measured in the flour samples using NIR equipment. In 2018, a set of 406 breeding lines from the Preliminary Yield Trial Nursery (PYN18), yield-tested at Colby, KS under irrigation, was used for the SDS-SRC sedimentation test.
Phenotypic data from these genotypes were later used in association mapping studies and for testing genomic prediction models with resample-based cross-validation. This dataset has also been used as a training set (TS) to predict the end-use quality of breeding materials in the IPSR18, PYN18, IPSR19, and IPSR19 nurseries.
See Figure 1: https://drive.google.com/open?id=1R-pCPI2kyzZyxMGAWOkNahwRFTV525Hf
Wheat Quality Lab Testing: We evaluated samples of PYN (Preliminary Yield Trials), AYN (Advanced Yield Nursery), CKE (Central Kansas Experimental) and KIN (Kansas Interstate Nursery) lines from the KSU breeding programs in the quality lab. Approximately 500 samples were completed with full milling and baking tests, including samples from the preliminary yield nursery (PYN, ~100 lines each from two locations), the advanced yield nursery (AYN, ~25 lines each from 4-5 locations), and KIN and CKE (~10 lines each from 4-5 locations). Samples were prepared, labeled, cleaned, evaluated for moisture and protein, and milled. Flour moisture and protein were measured. Mixograph tests were conducted for all AYN, CKE, KIN and Hays PYN samples. Farinograph tests and bread test baking were performed for AYN, CKE, and KIN samples.
See Table 1: https://drive.google.com/open?id=1Nc7iPUdoDXZCxm6xSRNnRt6EtR3QMR96
An additional ~340 PYN lines from the unselected set are in process for milling and mixing tests. Fifty of the 340 samples are being milled and tested for flour protein/moisture, mixograph, farinograph, and test baking; the rest are also being milled and will be provided to the NIFA "Omics" team (completion by June 2019).
Metabolomic profiling: During this year, a pipeline for processing GC/MS data has been optimized. This processing pipeline now includes initial data inspection, chromatographic peak detection, retention time correction, chromatogram alignment, peak regrouping, filling in missing peak data, and reporting significant differences in analyte intensities.
Three sets of flour samples from wheat lines grown in 2017 and 2018 have been profiled for metabolite composition. Metabolite extraction was performed using a chloroform/water mix followed by GC/MS. The first set of 48 flour samples (16 wheat lines from 3 locations) was analyzed using two technical replicates per sample. The coefficient of determination for technical replicates ranged from 0.83 to 0.99, with the majority of samples (45 out of 48) showing R² values above 0.9, suggesting that variation in metabolite profiles due to technical factors is limited. Thirty-four analytes showing significant variation among samples were identified (p < 0.05). For example, a metabolite with a mass-to-charge ratio of 119.1 and a retention time of 2016 sec was found in significantly higher quantities in wheat line KS14H180-4-6 grown at location SU than in other lines (Fig. 2).
See Figure 2: https://drive.google.com/open?id=1fbX9MBv_UpFg_7qDLCf-qNm1DX4Wxh-s
The second and third sets of flour samples were prepared from 170 wheat lines from the 2017 field season (285 samples) and 37 wheat lines from the 2018 season (98 samples). Each year wheat lines were grown at six locations: EW, MP, RN, RP, SU, THI. Principal component analysis of metabolite profiles obtained for all three sets of samples showed that wheat lines tend to cluster according to sampling location rather than genotype, suggesting that in this set environmental effects on metabolite profiles might be stronger than genetic effects.
High-throughput genotyping of glutenin alleles: To facilitate molecular breeding for improved quality, we are developing a high-throughput sequence-based allele prediction program to replace SDS-PAGE markers for glutenins (and gliadins) for wheat breeders. A k-mer based pipeline was developed for identifying HMW glutenin alleles from skim sequencing data.
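The report does not describe the internals of the k-mer pipeline, so the following is only a minimal sketch of the general idea, under stated assumptions: each reference allele is reduced to the k-mers found in no other reference allele ("diagnostic" k-mers), and a sample is assigned the allele whose diagnostic k-mers are hit most often by its reads. The function names, the toy sequences and the choice of k are hypothetical and are not taken from the project. Reverse complements and sequencing errors are ignored here for brevity.

```python
from collections import Counter
from typing import Dict, Iterable, Optional, Set


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all k-mers in a sequence (forward strand only)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def diagnostic_kmers(alleles: Dict[str, str], k: int) -> Dict[str, Set[str]]:
    """For each allele, keep only k-mers absent from every other allele."""
    per_allele = {name: kmers(seq, k) for name, seq in alleles.items()}
    counts = Counter(km for kms in per_allele.values() for km in kms)
    return {name: {km for km in kms if counts[km] == 1}
            for name, kms in per_allele.items()}


def call_allele(reads: Iterable[str], alleles: Dict[str, str],
                k: int = 5) -> Optional[str]:
    """Assign the allele with the most diagnostic k-mer hits, or None."""
    diag = diagnostic_kmers(alleles, k)
    # Map each diagnostic k-mer back to the allele it tags.
    lookup = {km: name for name, kms in diag.items() for km in kms}
    hits = Counter({name: 0 for name in diag})
    for read in reads:
        for km in kmers(read, k):  # each k-mer counted once per read
            if km in lookup:
                hits[lookup[km]] += 1
    if not hits:
        return None
    best, n_hits = hits.most_common(1)[0]
    return best if n_hits > 0 else None


# Toy reference alleles (hypothetical sequences, not real glutenin genes).
toy_alleles = {"alleleA": "ACGTACGTTAGC", "alleleB": "ACGTACGATAGC"}
sample_reads = ["TACGTTAG", "GTACGTTA"]
predicted = call_allele(sample_reads, toy_alleles, k=5)
```

Reads that contain only k-mers shared by both references carry no diagnostic signal, and the sketch returns None for them rather than guessing an allele.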
Develop, test and implement novel prediction models: DNA was extracted from all breeding lines and prepared for sequencing using a standard GBS protocol (Poland et al., 2012). All sequencing data were then stored and analyzed on the High-Performance Computing (HPC) cluster of Kansas State University. Single nucleotide polymorphisms (SNPs) were discovered using an automated pipeline in TASSEL 5 (Bradbury et al., 2007) written in Java. This process first parsed reads by the nucleotide barcodes assigned to each breeding line; the sequences were then trimmed to 64 base pairs in length and SNPs were identified. The sequence reads containing discovered SNPs were aligned against the Chinese Spring reference genome v.1.0 (IWGSC, 2018) using Bowtie 2 (Langmead & Salzberg, 2012) to locate their physical positions. Only SNP markers with a minor allele frequency of at least 5% (MAF ≥ 0.05), heterozygosity of at most 20% (Het ≤ 0.20), and less than 50% missing values (NA < 0.50) across genotypes were retained. SNP markers that yielded multi-allelic calls were discarded, while those located on unanchored (UN) sequences were kept in a separate group named 'UN'. A total of 22,172 SNPs remained in the genotypic dataset after filtering. SNP markers were distributed evenly throughout the wheat genome, with lower density in the D genome, as expected and as reported in other studies. Principal component analysis did not reveal any genetic structure among the 406 genotypes from the PYN18.
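The marker filters described above (MAF ≥ 0.05, Het ≤ 0.20, NA < 0.50) can be illustrated with a short sketch. This is not the TASSEL pipeline used in the project. It assumes a hypothetical genotype coding of 0, 1, 2 for the number of alternate alleles per line, with None for a missing call, and the function names are invented for illustration. Removal of multi-allelic markers is assumed to have happened upstream.

```python
from typing import Dict, List, Optional

# Assumed coding: 0 = homozygous reference, 1 = heterozygous,
# 2 = homozygous alternate, None = missing call.
Calls = List[Optional[int]]


def snp_stats(calls: Calls) -> Dict[str, float]:
    """Compute minor allele frequency, heterozygosity and missing rate."""
    n = len(calls)
    observed = [g for g in calls if g is not None]
    missing = (n - len(observed)) / n if n else 1.0
    if not observed:
        return {"maf": 0.0, "het": 0.0, "missing": missing}
    alt_freq = sum(observed) / (2 * len(observed))
    maf = min(alt_freq, 1.0 - alt_freq)
    het = sum(1 for g in observed if g == 1) / len(observed)
    return {"maf": maf, "het": het, "missing": missing}


def passes_filters(calls: Calls, min_maf: float = 0.05,
                   max_het: float = 0.20, max_missing: float = 0.50) -> bool:
    """Apply the three thresholds stated in the text to one marker."""
    s = snp_stats(calls)
    return (s["maf"] >= min_maf
            and s["het"] <= max_het
            and s["missing"] < max_missing)


def filter_snps(markers: Dict[str, Calls]) -> Dict[str, Calls]:
    """Keep only the markers that pass all three filters."""
    return {name: calls for name, calls in markers.items()
            if passes_filters(calls)}


# Toy markers (hypothetical data): only S1 should be retained.
toy_markers = {
    "S1": [0, 0, 2, 2, 0, 2, None, 0],   # common, no hets, few missing
    "S2": [0] * 19 + [None],             # monomorphic, fails MAF
    "S3": [1, 1, 1, 0, 2, 0, 2, 0],      # too heterozygous
    "S4": [0, 2, None, None, None, None],  # too many missing calls
}
kept = filter_snps(toy_markers)
```

Frequencies are computed over non-missing calls only, so a marker with many missing values is judged on its missing rate rather than on a MAF estimate biased by the gaps.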
Publications
- Type:
Journal Articles
Status:
Published
Year Published:
2018
Citation:
Kariyawasam, G. K., Hussain, W., Easterly, A., Guttieri, M., Belamkar, V., Poland, J., Venegas, J., Baenziger, S., Marais, F., Rasmussen, J. B., & Liu, Z. (2018). Identification of quantitative trait loci conferring resistance to tan spot in a biparental population derived from two Nebraska hard red winter wheat cultivars. Molecular Breeding, 38(11), 140. https://doi.org/10.1007/s11032-018-0901-3
- Type:
Journal Articles
Status:
Published
Year Published:
2018
Citation:
Shao, M., Bai, G., Rife, T. W., Poland, J., Lin, M., Liu, S., Chen, H., Kumssa, T., Fritz, A., Trick, H., Li, Y., & Zhang, G. (2018). QTL mapping of pre-harvest sprouting resistance in a white wheat cultivar Danby. Theoretical and Applied Genetics, 115. https://doi.org/10.1007/s00122-018-3107-5
- Type:
Journal Articles
Status:
Published
Year Published:
2018
Citation:
Krause, M. R., Pérez, L. G., Crossa, J., Pérez-Rodríguez, P., Montesinos-López, O. A., Singh, R., & Mondal, S. (2018). Use of Hyperspectral Reflectance-Derived Relationship Matrices for Genomic Prediction of Grain Yield in Wheat. BioRxiv, 389825. https://doi.org/10.1101/389825
- Type:
Journal Articles
Status:
Published
Year Published:
2018
Citation:
International Wheat Genome Sequencing Consortium (IWGSC), IWGSC RefSeq principal investigators: Appels, R., Eversole, K., Feuillet, C., Keller, B., & Uauy, C. (2018). Shifting the limits in wheat research and breeding using a fully annotated reference genome. Science (New York, N.Y.), 361(6403), eaar7191. https://doi.org/10.1126/science.aar7191
- Type:
Conference Papers and Presentations
Status:
Other
Year Published:
2019
Citation:
Zhang-Biehn, S. & Poland, J. (2019). Accelerating Wheat Breeding for End-use Quality through Association Mapping and Multivariate Genomic Prediction. Poster Presentation. Plant & Animal Genome Conference XXVII. San Diego, CA. January 2019
- Type:
Conference Papers and Presentations
Status:
Other
Year Published:
2019
Citation:
Delorean, E., Amos, C., Guzman, C., Pozniak, C., 10+ Wheat Genomes Project & Poland, J. (2018). DNA Forensics to Identify the Bad Players of Wheat Quality. Poster Presentation. Plant & Animal Genome Conference XXVII. San Diego, CA. January 2019.
- Type:
Journal Articles
Status:
Published
Year Published:
2018
Citation:
Battenfield, S. D., J. L. Sheridan, L. D. C. E. Silva, K. J. Miclaus, S. Dreisigacker, R. D. Wolfinger, R. J. Peña, R. P. Singh, E. W. Jackson, A. K. Fritz, C. Guzmán and J. A. Poland (2018) Breeding-assisted genomics: Applying meta-GWAS for milling and baking quality in CIMMYT wheat breeding program. PLOS ONE 13(11): e0204757. DOI: 10.1371/journal.pone.0204757
|
Progress 04/15/17 to 04/14/18
Outputs Target Audience: There are four primary target audiences for this project: 1) wheat breeders and scientists, 2) wheat growers, 3) wheat grain buyers and milling companies and 4) wheat consumers. We have presented project objectives and findings to the scientific community through presentations at international scientific conferences including Plant and Animal Genome and the International Gluten Workshop. For dissemination to wheat growers, we are closely connected to the Kansas Wheat Commission (KWC) and the Kansas Association of Wheat Growers. Throughout the year, formal and informal presentations about the scope and progress of the project have been made to these groups through KWC meetings, field days and other meetings and venues. We continue to stay attuned to the needs and priorities of growers in a demanding market in order to develop varieties with superior quality profiles. To communicate project activities and goals to wheat grain buyers and milling companies, we have updated this group through the Wheat Quality Council meeting and through meetings with industry held in conjunction with KWC meetings and Wheat Genetics Resource Center Industry Advisory Board meetings. These meetings bring together many of the primary flour millers in the U.S. and provide input and feedback on project developments. For the general public of wheat consumers, we maintain a social media presence through Facebook and Twitter to provide updates on the project and to illustrate the focus of scientists and breeders on developing higher quality, more nutritious wheat varieties. Changes/Problems:
Nothing Reported
What opportunities for training and professional development has the project provided? Postdoctoral: During the first year of this project we recruited a new postdoctoral scholar, Shichen Zhang, to focus on model development and quality predictions. Dr. Zhang has received training in new statistical methods and the use of multivariate modeling. Graduate: Emily Delorean, a graduate student in Genetics, is being trained in connection with the quality programs at KSU and CIMMYT through this project. Graduate student training has focused on mentorship in professional development and on technical skills in working directly with complex genome assemblies through high-performance computing, local and online BLAST search methodology, and sequence curation. Ms. Delorean completed focused genomics and statistical training at the Summer Institute for Statistical Genetics, University of Washington, Seattle, WA, in July 2017. Graduate student presentations and meetings included Plant and Animal Genome (San Diego, CA), the Plant Breeding Symposium at the University of Nebraska, Lincoln, and the International Gluten Workshop (Mexico City, Mexico). The full list is included with publications/presentations.
How have the results been disseminated to communities of interest? In addition to formal presentations to the scientific community, we have extensively shared and presented the project and associated findings with the Kansas Wheat Commission and wheat growers, along with individuals and companies of the quality industry (GrainCraft, Ardent, ADM) (Fritz and Poland).
What do you plan to do during the next reporting period to accomplish the goals?
1. Profile advanced breeding lines in the K-State wheat program for milling and baking quality. We will continue with assessment of preliminary yield trial lines (PYN). To increase the size of the training set, we will target complete baking assessment of the full set of ~350 PYN lines.
Samples are growing under optimal irrigated treatment at the Colby, KS research station, and harvest of samples for 2018 analysis will be completed in June/July. We will continue with implementation of high-throughput SRC-SDS evaluations of first-year yield trials, including first-year unreplicated lines (~1500 entries), to 1) assess more lines within the breeding pipeline for initial quality and 2) generate a larger training dataset for genomic predictions. In discussions with the quality industry, farinograph has been identified as a primary quality assessment of wheat grain during marketing to mills and port terminals. To improve quality profile assessment, we will implement farinograph testing across advanced lines and a selected set of PYN lines. We will target testing of 100 lines with farinograph in addition to the full bake tests and determine 1) genetic heritability and genomic prediction accuracy for farinograph and 2) other correlated quality traits that have higher throughput.
2. Generate proteomic, metabolomic and ionomic profiles of parental and advanced breeding lines and genomic profiles of all breeding lines. We will continue to genotype the full set of breeding lines coming through the KSU breeding program for genome-wide genomic profiles. Partnering with the larger wheat community, we are currently exploring new methods for genotyping (to replace enzyme-based genotyping-by-sequencing), with the expectation of implementing improved genotyping in late 2018 for profiling all 2018-2019 breeding lines. Concurrently, we will implement new 'practical haplotype graph' (PHG) approaches for genome-wide, high-density marker imputation from low-coverage genotype data. We will focus on completing the development of a targeted sequencing assay for glutenins and gliadins.
Using the extracted sequences to develop an exome capture chip for all glutenin and gliadin genes, we will identify unique k-mers for glutenin and gliadin alleles in a panel of diverse wheat lines for design of an optimized capture array. Following design, we will first sequence a set of known standards for glutenin and gliadin alleles. We will identify diagnostic k-mers that tag each of the known alleles while evaluating possible differentiated loci between the known alleles (from protein profiles with PAGE separation).
3. Integrative genomics of milling and baking quality traits. A total of 600 wheat flour samples will be processed to obtain metabolite and mineral profiles of advanced breeding lines. A newly recruited postdoctoral researcher will analyze genotyping and omics datasets to identify genetic loci controlling trait variation. Crosses will be initiated to validate identified marker-trait associations.
4. Develop, test and implement novel prediction models using genomic profiles combined with 'omics data as predictor variables and/or correlated predictive phenotypes. We will continue testing the different types of prediction models using secondary traits, confirming the best-performing models on the new generation of breeding lines (2017-18) with cross-validation prediction accuracy.
Impacts What was accomplished under these goals?
Profiling of advanced breeding lines: In August 2017, 353 lines from the Manhattan and Hays breeding programs were initiated for quality assessment. As of March 15, 2018 all 353 lines have been evaluated for milling and baking properties including: wheat moisture, wheat protein, straight grade flour milling yield, flour moisture, flour protein, mixograph properties and bread baking quality and 203 of the lines have been evaluated for farinograph properties. In Nov. 2018 an additional 98 lines specifically for this project were also started to sample a larger range of the breeding lines. These are currently being milled. Small samples of flour from each of the samples have been allocated for ionomics and metabolite profiling. From KSU-Manhattan program, about 200 lines were submitted from the 2017 growing season to the quality lab for mill and bake analysis. Samples were submitted from Sumner County due to damage to the Colby, KS Irrigated nursery from 20 inches of snowfall at heading. SRC-SDS measurements were taken on the samples, as well. Yield trials were planted for the 2018 growing season with the intent of using irrigated plots at Colby, KS for 2018 quality samples. 400 first-year yield trial samples from Manhattan, KS were also evaluated for SRC-SDS in the Fritz lab. Genomic profiling using GBS was performed for 4,100 first-year yield trial entries that are in the field in 2018. Targeted amplicon sequencing of glutenin and gliadin alleles: To make a high-throughput genotyping assay targeted for sequencing the alleles of gluten proteins, we have assembled all known sequences for glutenin and gliadins to develop a sequence capture assay. We will target creation of a capture assay / amplicon assay that can be used for profiling the whole Triticeae. 
Sequences for all known glutenin (Glu-A1, Glu-B1, Glu-D1, Glu-A3, Glu-B2, Glu-B3, Glu-D3) and gliadin (Gli-A1, Gli-A2, Gli-A3, Gli-A5, Gli-A6, Gli-B1, Gli-B2, Gli-B3, Gli-B4, Gli-B5, Gli-D1, Gli-D2, Gli-D3, Gli-D4, Gli-D5) genes were pulled from the NCBI online database. BLAST was used to pull sequences for all known glutenin and gliadin genes from completed bread wheat genome assemblies (Jagger, Stanley, Arina, Chinese Spring, Julius, Landmark, W7984), durum wheat assemblies (Strongfield, Cappelli), the barley genome assembly (Morex), and wheat relative genome assemblies (Aegilops sharonensis, Aegilops speltoides, Aegilops tauschii, Triticum dicoccoides, Triticum urartu, Triticum monococcum, Thinopyrum intermedium). Concurrently, the physical locations were identified for all glutenin and gliadin genes in the bread wheat, durum wheat, barley, and wheat relative genome assemblies.
Metabolite profiling: Procedures for the extraction and analysis of metabolites from wheat flour have been established and validated by analyzing a small set of 21 wheat lines. Two extraction methods were used, one with a chloroform/water mix and one with methanol. The GC-MS analysis of flour extracts showed profiles with substantially overlapping metabolite peaks; a total of 28,478 peaks were identified (about 1,356 per sample) and clustered into 1,355 groups. Metabolite profiles included 45-55 resolvable analytes corresponding to various amino acids (up to 19), sugars, and organic acids. The data analysis pipeline, based on Python and R packages, has been successfully tested for its ability to process metabolite data in a high-throughput fashion. Currently, we are validating the developed pipeline for consistency of identified metabolite peaks by analyzing an additional 48 samples from advanced breeding lines using multiple technical replicates.
Prediction model testing: We are focused on testing and then implementing novel prediction models using genomic profiles combined with secondary traits as predictor variables and/or correlated predictive phenotypes. We are currently pursuing the following specific objectives: 1) develop and test models to predict end-use quality traits with only GBS data, 2) develop and test novel prediction models using GBS data combined with intermediate traits as predictor variables to predict end-use quality traits, 3) develop and test novel prediction models using intermediate traits alone to predict end-use quality traits, and 4) develop and test novel multi-trait genomic prediction models using GBS data to predict end-use quality traits and intermediate traits jointly, in order to increase the accuracy of genomic prediction of end-use quality traits.
Publications
- Type:
Journal Articles
Status:
Published
Year Published:
2017
Citation:
Moore, J. K., Manmathan, H. K., Anderson, V. A., Poland, J. A., Morris, C. F., & Haley, S. D. (2017). Improving Genomic Prediction for Pre-Harvest Sprouting Tolerance in Wheat by Weighting Large-Effect Quantitative Trait Loci. Crop Science, 57(0), 0. https://doi.org/10.2135/cropsci2016.06.0453
- Type:
Journal Articles
Status:
Published
Year Published:
2017
Citation:
Sun, J., Rutkoski, J. E., Poland, J. A., Crossa, J., Jannink, J.-L., & Sorrells, M. E. (2017). Multitrait, Random Regression, or Simple Repeatability Model in High-Throughput Phenotyping Data Improve Genomic Prediction for Wheat Grain Yield. The Plant Genome, 10(2), 0. https://doi.org/10.3835/plantgenome2016.11.0111
|