Progress 05/01/20 to 05/09/24
Outputs Target Audience: Academia, USDA, breed associations, farmers Changes/Problems:
Nothing Reported
What opportunities for training and professional development has the project provided? Training of graduate students, postdocs, and visitors. Training of industry members. How have the results been disseminated to communities of interest? Conference presentations, journal publications, industry meetings. What do you plan to do during the next reporting period to accomplish the goals?
Nothing Reported
Impacts What was accomplished under these goals?
1. Analyses involved more than 47 million lactation records registered between 2000 and 2021 in purebred Holstein and Jersey cattle and their crosses. A total of 27 million animals were included in the analysis, of which 1.4 million were genotyped. Milk, fat, and protein yields were analyzed in a 3-trait repeatability model using BLUP or ssGBLUP. The 2 models were validated using prediction bias and accuracy computed for genotyped cows with no records in the truncated dataset and at least one lactation in the complete dataset. Bias and accuracy were better in the genomic model than in the pedigree-based one, with accuracies for crossbred cows generally higher than those of purebreds. Accurate evaluation of crossbreds requires a crossbred reference population. Genomic accuracies of crossbred cows may be artificially inflated by the validation methodology because genomic predictions for crossbreds also include means of specific crosses (e.g., F1, F2, various reciprocals).
3. Data sets were simulated with different real and effective population sizes, and causative SNPs were detected by p-values. The detection was much more successful in populations with a high effective population size. In a population with a low effective population size, each causative SNP generated a wide SNP response in Manhattan plots, making it difficult to discern adjacent causative SNPs. Small effective population size in farm animals makes the detection of causative SNPs difficult but enables high-accuracy prediction that is based on the estimation of large chromosome segments.
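The validation of the two models described in item 1 can be illustrated with a minimal sketch. The report does not give the exact formulas, so this assumes one common choice: accuracy is approximated by the correlation between adjusted phenotypes and predictions (predictivity), and bias is read from the intercept and slope of the regression of adjusted phenotypes on predictions. The function name `validation_stats` and the toy data are hypothetical.

```python
import numpy as np

def validation_stats(adjusted_phenotypes, predictions):
    """Predictivity and regression coefficients for a set of validation animals.

    Assumption (not from the report): phenotypes have already been adjusted
    for fixed effects, and predictions are (G)EBV from the truncated data.
    """
    y = np.asarray(adjusted_phenotypes, dtype=float)
    g = np.asarray(predictions, dtype=float)
    # Correlation between adjusted phenotype and prediction
    predictivity = float(np.corrcoef(y, g)[0, 1])
    # Regression y = intercept + slope * g; a slope near 1 suggests no inflation
    slope, intercept = np.polyfit(g, y, 1)
    return {
        "predictivity": predictivity,
        "slope": float(slope),
        "intercept": float(intercept),
    }

# Hypothetical toy data standing in for validation cows
rng = np.random.default_rng(0)
gebv = rng.normal(size=500)
adj_pheno = 1.0 + 0.9 * gebv + rng.normal(scale=1.5, size=500)
print(validation_stats(adj_pheno, gebv))
```

Running the same function on pedigree-based EBV and on GEBV for the same validation cows would give a side-by-side comparison of the kind summarized above.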
Publications
- Type:
Journal Articles
Status:
Published
Year Published:
2024
Citation:
Cesarani, Alberto; Lourenco, Daniela; Bermann, Matias; Nicolazzi, Ezequiel; VanRaden, Paul; Misztal, Ignacy. 2024. Single-step genomic predictions for crossbred Holstein and Jersey cattle in the US. J. Dairy Sci. Comm. 5:124-128. doi:10.3168/jdsc.2023-0399
- Type:
Journal Articles
Status:
Published
Year Published:
2023
Citation:
Jang, S., S. Tsuruta, N. G. Leite, I. Misztal, and D. Lourenco. 2023. Dimensionality of genomic information and its impact on genome-wide associations and variant selection for genomic prediction: a simulation study. Genet. Sel. Evol. 55:49. doi:10.1186/s12711-023-00823-0
- Type:
Journal Articles
Status:
Published
Year Published:
2023
Citation:
McWhorter, T., M. Sargolzaei, C. Sattler, M. Utt, S. Tsuruta, I. Misztal, and D. Lourenco. 2023. Single-step genomic predictions for heat tolerance of production yields in U.S. Holsteins and Jerseys. Journal of Dairy Science. https://doi.org/10.3168/jds.2022-23144
|
Progress 05/01/22 to 04/30/23
Outputs Target Audience: Academia, USDA, breed associations, farmers Changes/Problems:
Nothing Reported
What opportunities for training and professional development has the project provided?
Nothing Reported
How have the results been disseminated to communities of interest? Conferences, USDA committee meetings, breed association visits. What do you plan to do during the next reporting period to accomplish the goals? We developed and implemented formulas for computing p-values in ssGBLUP to select sequence variants for genomic prediction; the computations are feasible in very large genotyped populations. Tests included up to 600k genotyped Angus cattle. Tentative results indicate that ssGBLUP allows for the discovery of twice as many statistically significant regions as a classical method. We plan to confirm the results with other data sets and write a refereed-journal paper.
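The formulas used for the p-values are not reproduced in this report. As a minimal sketch, assuming each SNP effect estimate comes with a standard error and is tested against a standard normal distribution, a two-sided p-value can be computed as below. The function names, the Bonferroni threshold, and the example numbers are hypothetical and only illustrate the general idea.

```python
import math

def snp_p_values(effects, std_errors):
    """Two-sided p-values for SNP effects, assuming effect / SE ~ N(0, 1)."""
    p_values = []
    for effect, se in zip(effects, std_errors):
        z = abs(effect) / se
        # 2 * (1 - Phi(z)) equals erfc(z / sqrt(2)) for the standard normal
        p_values.append(math.erfc(z / math.sqrt(2.0)))
    return p_values

def bonferroni_significant(p_values, alpha=0.05):
    """Flag SNPs passing a Bonferroni threshold (an assumed, conservative choice)."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]

# Hypothetical SNP effect estimates and standard errors
effects = [0.02, -0.45, 0.10]
std_errors = [0.05, 0.06, 0.08]
p = snp_p_values(effects, std_errors)
print(p)
print(bonferroni_significant(p))
```

With many SNPs in strong linkage disequilibrium, a Bonferroni correction over all markers is likely too strict; other thresholds could be substituted without changing the p-value step.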
Impacts What was accomplished under these goals?
We developed single-step GBLUP multibreed genomic predictions for multiple dairy breeds: Ayrshire (AY), Brown Swiss (BS), Guernsey (GU), Holstein (HO), and Jersey (JE). A 3-trait model with milk (MY), fat (FY), and protein (PY) yields was applied using about 45 million phenotypes recorded from January 2000 to June 2020. The whole data set included about 29.5 million animals, of which almost 4 million were genotyped. All the effects in the model were breed specific, and breed was also considered as fixed unknown parent groups. Evaluations were done for (1) each single breed separately (single); (2) HO and JE together (HO_JE); (3) AY, BS, and GU together (AY_BS_GU); and (4) all 5 breeds together (5_BREEDS). The genomic relationship matrix was inverted with the APY algorithm, which minimizes computations by applying recursion on a small number of animals called "core". Initially, 15k core animals were used in APY for AY_BS_GU and 5_BREEDS, but larger core sets with more animals from the least represented breeds were also tested. The HO_JE evaluation had a fixed set of 30k core animals, with an equal representation of the 2 breeds, whereas the HO and JE single-breed analyses involved 15k core animals. Validation for cows was based on correlations between adjusted phenotypes and (G)EBV, whereas validation for bulls was based on the regression of daughter yield deviations on (G)EBV. Because breed was correctly considered in the model, BLUP results for single and multibreed analyses were the same. Under ssGBLUP, predictability and reliability for AY, BS, and GU were on average 7% and 2% lower in 5_BREEDS compared with single-breed evaluations, respectively. However, validation parameters for these 3 breeds became better than in the single-breed evaluations when 45k animals were included in the core set for 5_BREEDS. Evaluations for Holsteins were more stable across scenarios because Holsteins had the greatest number of genotyped animals and the largest amount of data.
Combining AY, BS, and GU into one evaluation resulted in predictions similar to the ones from single-breed evaluations, especially when using about 30k core animals in APY. The results showed that single-step large-scale multibreed evaluations are computationally feasible, but fine-tuning is needed to avoid a reduction in reliability when numerically dominant breeds are combined. Having evaluations for AY, BS, and GU separated from HO and JE may reduce inflation of GEBV for the first 3 breeds. The results indirectly suggest that scaling of the genomic relationship matrix for specific breed groups is not critical, but the recursion needs to include a sufficient number of animals of each breed as core animals. Another indirect result is that the genomic predictions for crossbreds are not based on QTLs but on estimating independent chromosome segments specific for each breed or type of crossbred. Therefore, an accurate prediction for a certain breed type requires a reference population of that breed type.
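The role of core animals in APY can be illustrated with a small dense sketch. It assumes the standard APY form of the inverse, in which the core block is inverted directly and each noncore animal contributes only a diagonal term obtained by recursion on the core animals. The function name `apy_inverse` and the toy matrix are hypothetical; production software works with sparse storage and does not build the full matrix as done here.

```python
import numpy as np

def apy_inverse(G, core):
    """Dense sketch of the APY inverse of a genomic relationship matrix G.

    core: indices of core animals; all other animals are treated as noncore.
    """
    G = np.asarray(G, dtype=float)
    n = G.shape[0]
    core = np.asarray(core)
    noncore = np.setdiff1d(np.arange(n), core)

    Gcc_inv = np.linalg.inv(G[np.ix_(core, core)])
    Gcn = G[np.ix_(core, noncore)]
    P = Gcc_inv @ Gcn  # recursion coefficients of noncore on core animals

    # m_ii = g_ii - g_ic Gcc^{-1} g_ci for each noncore animal i
    m = G[noncore, noncore] - np.einsum("ij,ij->j", Gcn, P)
    m_inv = 1.0 / m

    Ginv = np.zeros((n, n))
    Ginv[np.ix_(core, core)] = Gcc_inv + (P * m_inv) @ P.T
    Ginv[np.ix_(core, noncore)] = -P * m_inv
    Ginv[np.ix_(noncore, core)] = (-P * m_inv).T
    Ginv[noncore, noncore] = m_inv  # diagonal block for noncore animals
    return Ginv

# Hypothetical toy example: 6 animals, the first 4 chosen as core
rng = np.random.default_rng(1)
W = rng.normal(size=(6, 50))
G = W @ W.T / 50 + 0.01 * np.eye(6)
print(np.round(apy_inverse(G, core=[0, 1, 2, 3]), 2))
```

Only the core block requires a dense inversion, so the cost is driven by the number of core animals. This is consistent with the observation above that the core set must include enough animals of each breed for the recursion to represent that breed well.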
Publications
- Type:
Journal Articles
Status:
Published
Year Published:
2023
Citation:
Bermann, M., I. Aguilar, D. Lourenco, I. Misztal, and A. Legarra. 2023. Reliabilities of estimated breeding values in models with metafounders. Genet. Sel. Evol. 55:6. https://doi.org/10.1186/s12711-023-00778-2
- Type:
Journal Articles
Status:
Published
Year Published:
2022
Citation:
Cesarani, A., D. Lourenco, S. Tsuruta, A. Legarra, E. L. Nicolazzi, P. M. VanRaden, and I. Misztal. 2022. Multibreed genomic evaluation for production traits of dairy cattle in the US using single-step GBLUP. J. Dairy Sci. 105:5141-5152. doi.org/10.3168/jds.2021-21505
- Type:
Journal Articles
Status:
Published
Year Published:
2022
Citation:
Misztal, I., Y. Stein, and D.A.L. Lourenco. 2022. Genomic evaluation with multibreed and crossbred data. J. Dairy Sci. Comm. 3:156-159. doi.org/10.3168/jdsc.2021-0177
|
Progress 05/01/21 to 04/30/22
Outputs Target Audience: Academia, industry Changes/Problems:
Nothing Reported
What opportunities for training and professional development has the project provided?
Nothing Reported
How have the results been disseminated to communities of interest? The results have been disseminated at 3 scientific meetings and to approximately 10 industry groups. What do you plan to do during the next reporting period to accomplish the goals?
Nothing Reported
Impacts What was accomplished under these goals?
3. Establish a robust approximation of individual theoretical accuracy for very large genotyped populations using APY. Reliability can be calculated as a function of prediction error variances (PEV). We developed an efficient algorithm for calculating PEV for genomic best linear unbiased prediction (GBLUP) models using the Algorithm for Proven and Young (APY). The PEV with APY was calculated by block sparse inversion, efficiently exploiting the sparse structure of the inverse of the genomic relationship matrix with APY. Single-step GBLUP reliabilities were approximated by combining reliabilities with and without genomic information in terms of effective record contributions. Multi-trait reliabilities relied on single-trait results adjusted using the genetic and residual covariance matrices among traits. Tests involved two datasets provided by the American Angus Association. A small dataset (Data1) was used for comparing the approximated reliabilities with the reliabilities obtained by the inversion of the left-hand side of the mixed model equations. A large dataset (Data2) was used for evaluating the computational performance of the algorithm. Analyses with both datasets used single-trait and three-trait models. The number of animals in the pedigree ranged from 167,951 in Data1 to 10,213,401 in Data2, with 50,000 and 20,000 genotyped animals for single-trait and multiple-trait analysis, respectively, in Data1 and 335,325 in Data2. Correlations between estimated and exact reliabilities obtained by inversion ranged from 0.97 to 0.99, whereas the intercept and slope of the regression of the exact on the approximated reliabilities ranged from 0.00 to 0.04 and from 0.93 to 1.05, respectively. For the three-trait model with the largest dataset (Data2), the elapsed time for the reliability estimation was 11 min. The computational complexity of the proposed algorithm increased linearly with the number of genotyped animals and with the number of traits in the model.
This algorithm can efficiently approximate the theoretical reliability of genomic estimated breeding values in ssGBLUP with APY for large numbers of genotyped animals at a low cost.
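The exact reliabilities used as the benchmark in Data1 come from inverting the left-hand side of the mixed model equations. A minimal dense sketch of that benchmark is given below for a single-trait GBLUP model. It is not the block sparse APY algorithm described above. The sketch assumes reliability = 1 - PEV / (g_ii * var_u), where g_ii is the diagonal of the genomic relationship matrix; the function name and the toy inputs are hypothetical.

```python
import numpy as np

def exact_reliabilities(X, Z, G, var_u, var_e):
    """Reliabilities from the inverse of the mixed model equations (dense).

    Model assumed for illustration: y = Xb + Zu + e,
    u ~ N(0, G * var_u), e ~ N(0, I * var_e).
    """
    lam = var_e / var_u
    Ginv = np.linalg.inv(G)
    lhs = np.block([
        [X.T @ X, X.T @ Z],
        [Z.T @ X, Z.T @ Z + Ginv * lam],
    ])
    C = np.linalg.inv(lhs)
    n_fixed = X.shape[1]
    # PEV is the animal-effect diagonal of the inverse times the residual variance
    pev = np.diag(C)[n_fixed:] * var_e
    return 1.0 - pev / (np.diag(G) * var_u)

# Hypothetical toy example: 4 unrelated animals with one record each
X = np.ones((4, 1))
Z = np.eye(4)
G = np.eye(4)
print(exact_reliabilities(X, Z, G, var_u=1.0, var_e=1.0))
```

Direct inversion scales cubically with the number of equations, which is why it is practical only for a small dataset such as Data1 and why an approximation is needed for populations the size of Data2.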
Publications
- Type:
Journal Articles
Status:
Published
Year Published:
2022
Citation:
Matias Bermann, Daniela Lourenco, Ignacy Misztal, Efficient approximation of reliabilities for single-step genomic best linear unbiased predictor models with the Algorithm for Proven and Young, Journal of Animal Science, Volume 100, Issue 1, January 2022, skab353, https://doi.org/10.1093/jas/skab353
|
Progress 05/01/20 to 04/30/21
Outputs Target Audience:Academia, industry Changes/Problems:
Nothing Reported
What opportunities for training and professional development has the project provided?
Nothing Reported
How have the results been disseminated to communities of interest? Conferences, industry meetings, personal communications. What do you plan to do during the next reporting period to accomplish the goals?
Nothing Reported
Impacts What was accomplished under these goals?
We compared 3 different formulations of unknown parent groups in single-step GBLUP. The groups were formulated by QP transformation, with the metafounder concept, and as encapsulated groups. The last two formulations resulted in unbiased genomic estimated breeding values (GEBV). While the metafounder concept is applicable to multibreed populations, the encapsulated option does not require parameter estimation. In another study, we looked at the efficiency of two unknown parent group formulations and data truncation in application to the genomic evaluation of the US Holstein population. The complete data included 80 million records for milk, fat, and protein yields from 31 million cows recorded since 1980. Phenotype-pedigree truncation scenarios included truncation of phenotypes for cows recorded before 1990 and 2000 combined with truncation of pedigree information after 2 or 3 ancestral generations. A total of 861,525 genotyped bulls with progeny and cows with phenotypic records were used in the analyses. Reliability and bias (inflation/deflation) of GEBV were obtained for 2,710 bulls based on deregressed proofs, and for 381,779 cows born after 2014 based on predictivity (adjusted cow phenotypes). GEBV were unbiased with QP-modified unknown parent groups. Eliminating phenotypes recorded before 2000 did not reduce accuracy of GEBV for the youngest animals.
Publications
- Type:
Journal Articles
Status:
Published
Year Published:
2021
Citation:
Bermann, M., D. Lourenco, and I. Misztal. 2021. Automatic scaling in single-step genomic BLUP. Journal of Dairy Science 104(2).
- Type:
Journal Articles
Status:
Published
Year Published:
2021
Citation:
Masuda, Y., S. Tsuruta, M. Bermann, H. L. Bradford, and I. Misztal. 2021. Comparison of models for missing pedigree in single-step genomic prediction. Journal of Animal Science 99(2).
- Type:
Journal Articles
Status:
Published
Year Published:
2021
Citation:
Cesarani, A., Y. Masuda, S. Tsuruta, E. L. Nicolazzi, P. M. VanRaden, D. Lourenco, and I. Misztal. 2021. Genomic predictions for yield traits in US Holsteins with unknown parent groups. Journal of Dairy Science 104(5):5843-5853.
|