Source: UNIV OF WISCONSIN submitted to NRP
INTEGRATING ENVIROMICS, GENOMICS, AND MACHINE LEARNING FOR PRECISION BREEDING OF RESILIENT BEEF CATTLE
Sponsoring Institution
National Institute of Food and Agriculture
Project Status
ACTIVE
Funding Source
Reporting Frequency
Annual
Accession No.
1030269
Grant No.
2023-68014-39816
Cumulative Award Amt.
$1,000,000.00
Proposal No.
2022-10749
Multistate No.
(N/A)
Project Start Date
May 15, 2023
Project End Date
May 14, 2027
Grant Year
2023
Program Code
[A1261]- Inter-Disciplinary Engagement in Animal Systems
Recipient Organization
UNIV OF WISCONSIN
21 N PARK ST STE 6401
MADISON,WI 53715-1218
Performing Department
(N/A)
Non Technical Summary
Animal breeding is one of the main pillars of livestock production. Statistics, computer science, and genomics have transformed the industry's productivity. Another round of breakthroughs is expected to come from harnessing the power of big data and machine learning analytics to address the complex interaction between animal genetics and the environment. Developing methods that enable precision selection decisions for animals in diverse production environments is expected to result in large productivity gains and a reduction in welfare issues, all while mitigating the environmental impact from suboptimal allocation of resources to ill-adapted animals. Due to climate uncertainty, an understanding of these genetic and environmental interactions will be critical for decision-making. We will generate a wide-ranging enviromics data lake to develop and apply novel methods for breeding more adapted and resilient beef cattle for varying environments. Using GIS technology and integrating various sources of environmental information from publicly available databases and satellite imaging, detailed descriptions of soil, climate, forage, and weather conditions will be created for thousands of U.S. farms representing millions of cattle with phenotypic and genotypic data. Farms will be comprehensively described in terms of their facilities and management practices through surveys. Machine learning and artificial intelligence techniques will then be used to predict future animal performance for precision livestock management. The models proposed will also be biologically validated by measuring direct indicators of animal resilience. Following this integrated research-extension project, we will work with producers and industry groups to implement environment-aware approaches into routine genetic evaluations.
Animal Health Component
40%
Research Effort Categories
Basic
50%
Applied
40%
Developmental
10%
Classification

Knowledge Area (KA)    Subject of Investigation (SOI)    Field of Science (FOS)    Percent
303                    3310                              1081                      50%
303                    3310                              2090                      30%
303                    3310                              2080                      10%
303                    3310                              1020                      10%
Goals / Objectives
Animal breeding and genetic improvement are fundamental underpinnings of the large productivity gains that have been achieved by the livestock production sector around the world. For example, in Angus cattle between 1990 and 2020, the genetic gain for the selection index that combines terminal traits increased by 80%, and the one for weaning traits increased by 400%. However, current selection approaches have some limitations, including the assumption that the ranking of the genetic value of animals is constant across environments, which in practice means that their progeny are expected to perform equally well irrespective of their environmental conditions. This is, of course, not the case, since the genetic merit of an animal varies with its environmental conditions. More importantly, because of the interaction between genetics and environment, an animal with high genetic merit in a favorable environment can perform below average in harsher environments, and vice versa.
This genotype by environment interaction (GEI) affects livestock, poultry, and aquaculture production, making it extremely important in the context of the highly diversified geographic and climate landscape of the United States. Unfortunately, there is currently no way to optimally allocate animals to maximize their productivity across the broad range of different environments. We envision that the next breakthrough in livestock productivity will come from novel selection tools and methods that account for GEI. This will enable large production gains since animals can be correctly selected for the environment they will inhabit and perform in, will promote better animal welfare, and will help mitigate environmental impact from the production of underperforming animals that are ill-suited to particular environments. Looking further ahead, the impact of heat stress on livestock is expected to continue worsening as global temperatures and humidity levels increase.
Under heat stress, animals have reduced growth, milk production, and reproductive performance due to counteracting effects such as reduced feed intake. In the US, the annual economic losses caused by heat stress in the livestock industry are already up to $2.36 billion, with $370 million in the beef cattle industry alone. Modern beef cattle have progressed considerably since the 1990s due to selection for growth performance but have difficulty adapting to high temperature and humidity, which results in reduced productive and reproductive performance. Understanding the relationship between genetics and environment, and selecting for more resilient livestock, will be crucial to secure animal protein for the growing world population under a changing climate.
In animal breeding, GEI is generally modeled either using multi-trait models to assess the genetic correlations between a finite set of discrete environments, or using reaction norm models (RNM) with a continuous description of the environmental gradient. However, both methods are suboptimal, as quite often the environmental diversity in which livestock is raised cannot be split or classified into only a few discrete sets of environmental conditions, nor can it be summarized satisfactorily with a single linear environmental gradient variable. Another shortcoming of the traditional mixed models used in animal breeding is that environmental factors, which contribute a significant component of the phenotypic variation, are simply lumped together into contemporary groups (CG), defined as animals born and raised in similar conditions and generally expressed as the combination of farm, year, and management groups. Although the use of CG may satisfactorily correct for differences between environmental conditions, it does not allow for the detection of the specific environmental variables that significantly contribute to the phenotypic expression, nor the estimation of their effects.
Because of that, such approaches do not provide information for accurate prediction of future cattle performance, nor for improving management decision-making.
An alternative to existing methods is to integrate breeding programs and genetic evaluations with comprehensive environmental data. This is now possible by leveraging geoprocessing technologies, such as geographic information systems (GIS), for precision breeding. The collection and processing of spatiotemporal data on weather, water, soil, and yield variables are rapidly increasing due to the societal need for food security and technological advances. This thorough characterization of farm and environmental conditions has been termed enviromics and can provide valuable information for breeding and management decisions in livestock operations. Several types of environmental variables can be used in enviromics analyses, such as temporal climatic information and vegetation indices derived from remote sensing of canopies. Categorical indices can also be used, such as the Köppen climate classification or soil classes. In this project, we postulate that enviromics data can be used in advanced modeling of GEI to breed better adapted and more resilient beef cattle, as well as to better select livestock genetics for specific combinations of environmental conditions.
Using GIS technology and integrating various sources of environmental information from publicly available databases and satellite imaging, we will generate a detailed description of the soil, climate, forage, and weather conditions of US beef cattle farms. In addition, using survey data, farms will be comprehensively described in terms of their facilities and management practices. This enviromics data lake will then be used to expand traditional mixed model techniques. First, enviromics variables will be used to investigate the specific environment/management factors that significantly contribute to variation across contemporary groups (CG).
Second, temporal-spatial enviromics data will be used to more efficiently model GEI. Lastly, the success of genomic selection for improved climatic performance depends on the availability of phenotypes that are heritable, can be measured on many animals, and represent the behavioral and physiological mechanisms of heat stress response in beef cattle raised under extensive production systems. As such, defining these optimal phenotypes is a priority for the beef cattle industry to fully utilize genomic selection to breed more resilient cattle. Therefore, it is of utmost value to investigate the usefulness of phenotypes routinely measured in breeding and commercial beef cattle herds and to define novel indicator traits that might better capture the genetic variability for heat tolerance and overall resilience in US beef cattle. These traits and statistical models need to be biologically validated through comprehensive (in-depth) phenotyping of animals with divergent genetic merit for the traits identified.
Three main integrated research-extension objectives are proposed in this project, which will leverage enviromics and genomics techniques to breed more resilient cattle: 1) Generation of data lake and data processing pipelines for comprehensive environmental characterization of US beef cattle production systems; 2) Comprehensive evaluation of genotype-by-environment interactions and future performance through an enviromics approach; and 3) Definition of novel indicators of animal resilience based on enviromics-derived breeding values and biological validation of the predictions through in-depth phenotyping of genetically divergent animals.
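The re-ranking of animals across environments described above can be illustrated with a minimal sketch. It assumes the simplest linear reaction norm, in which an animal's genetic merit is an intercept plus a slope times a standardized environmental gradient. The animal labels, intercepts, slopes, and gradient values are hypothetical and are not project data or estimates.

```python
# Hypothetical linear reaction norm: merit(x) = intercept + slope * x,
# where x is a standardized environmental gradient (for example, a heat-load index).
from dataclasses import dataclass


@dataclass(frozen=True)
class Animal:
    name: str
    intercept: float  # genetic merit in the average environment (x = 0)
    slope: float      # environmental sensitivity; closer to 0 means more robust


def merit(animal: Animal, x: float) -> float:
    """Genetic merit of an animal at environmental gradient value x."""
    return animal.intercept + animal.slope * x


def rank(animals: list[Animal], x: float) -> list[str]:
    """Animal names ordered from highest to lowest merit at gradient x."""
    ordered = sorted(animals, key=lambda a: merit(a, x), reverse=True)
    return [a.name for a in ordered]


# Illustrative values only, not estimates from any dataset.
animals = [
    Animal("A", intercept=10.0, slope=4.0),  # excels when conditions are favorable
    Animal("B", intercept=9.0, slope=0.5),   # performs similarly everywhere
]

favorable = rank(animals, x=1.0)   # merits: A = 14.0, B = 9.5
harsh = rank(animals, x=-1.0)      # merits: A = 6.0,  B = 8.5
print(favorable, harsh)
```

In this toy case animal A ranks first in the favorable environment and second in the harsh one, which is the kind of re-ranking that a single environment-independent breeding value cannot represent.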
Project Methods
The first objective will be the generation of an integrated enviromics data lake, including information on cattle performance, climatic conditions, and farm management descriptors. Satellite images and vegetation indices, together with detailed information regarding facilities and management of farms, will be added to the current American Angus Association (AAA) database to leverage big data in cattle breeding and production. First, based on the GPS locations and sizes of farms, spatial polygons of farms will be created to download soil, climate, and weather data from various public sources. In addition, based on the spatial polygons of the farms, long-term satellite imagery, indices, and satellite-derived products will be downloaded from various public sources and pre-processed. Lastly, an online survey will be distributed through the AAA to all 9K+ enrolled beef cattle producers for a detailed description of farms' facilities and management practices. All these data will be subjected to machine learning algorithms to select relevant variables across time and space associated with cattle performance and resilience, including growth and fertility traits. As part of the extension component of this first objective, a number of events will be held to gather beef producers' input, along with webinar and extension events on the intersection of climate change and beef cattle production, including mitigation alternatives.
Objective 2 of the project will involve comprehensive evaluation of genotype-by-environment interactions (GEI) and future cattle performance through an enviromics approach. This component of the project will entail extensive data analysis, algorithm development, and simulations to model GEI based on enviromics data from Objective 1. It will involve leveraging the enviromics data to cluster farms into ecoregions so that traditional multi-trait models can be used to analyze the data.
Phenotypes measured in each ecoregion will be considered as a different trait, and the genetic correlation between them will be used to assess GEI and to develop strategies for optimal selection within each ecoregion. Additional modeling techniques will include the use of enviromics data on their natural continuous scales using a multi-dimensional reaction norms approach. In addition, a spatial analysis methodology will be used to estimate breeding values specific to each farm or geographical location. Objective 2 will also involve the development of a predictive model of future cattle performance, and the investigation of genetic trends in the Angus cattle population. Importantly, it will involve various extension activities for an effective translational component of the research.
The third objective will define novel indicators of animal resilience based on enviromics-derived breeding values and biologically validate the predictions through in-depth phenotyping of genetically divergent animals. Furthermore, routinely measured phenotypes including production, reproduction, longevity, health, and welfare traits will be assessed from the AAA database. The datasets generated will be used for validation of the genomic methods and traits proposed in Objectives 1 and 2, and will enable the identification of alternative traits to be measured cost-effectively at large scale in breeding programs aiming to improve animal resilience in US beef cattle.
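The farm-polygon step of Objective 1 (creating spatial polygons from GPS locations and farm sizes) could be sketched as follows. This is only an illustration under strong assumptions: the farm is approximated by a square centered on its GPS point, degrees are converted from meters with a rough constant, and the function name, coordinates, and area are hypothetical. The project's actual GIS workflow is not described in the source and may differ.

```python
import math

# Approximate length of one degree of latitude in meters; adequate for a sketch,
# not for survey-grade work.
METERS_PER_DEGREE_LAT = 111_320.0


def farm_square_polygon(lat: float, lon: float, area_ha: float) -> list[tuple[float, float]]:
    """Closed square ring of (lon, lat) vertices centered on a farm point.

    The square has the same area as the farm (area_ha, in hectares).
    Vertices run counterclockwise and the first vertex is repeated at the end.
    """
    side_m = math.sqrt(area_ha * 10_000.0)  # 1 hectare = 10,000 square meters
    half_lat = (side_m / 2.0) / METERS_PER_DEGREE_LAT
    # A degree of longitude shrinks with the cosine of latitude.
    half_lon = (side_m / 2.0) / (METERS_PER_DEGREE_LAT * math.cos(math.radians(lat)))
    ring = [
        (lon - half_lon, lat - half_lat),  # south-west
        (lon + half_lon, lat - half_lat),  # south-east
        (lon + half_lon, lat + half_lat),  # north-east
        (lon - half_lon, lat + half_lat),  # north-west
    ]
    return ring + [ring[0]]


# Hypothetical farm: 100 ha centered near Madison, WI.
polygon = farm_square_polygon(lat=43.07, lon=-89.40, area_ha=100.0)
for vertex in polygon:
    print(vertex)
```

A polygon of this kind could then serve as the query region when requesting soil, weather, or satellite products. Where real property boundaries are available, they would replace the square approximation.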

Progress 05/15/23 to 05/14/24

Outputs
Target Audience:
During the reporting period, several presentations were delivered at both scientific and technical meetings. The main objective of these presentations was to outline the project goals and encourage producers to contribute data through an upcoming online survey. As a result, the audience during this period consisted primarily of scientists, cattle breeders, and cattle producers.
Changes/Problems:
Nothing Reported
What opportunities for training and professional development has the project provided?
Throughout this first year, a graduate student and a post-doctoral fellow have been actively involved in the project, enhancing their skills in database generation, data management, data cleansing, and descriptive analyses. In addition, through various scientific and technical presentations delivered in meetings, the project has raised awareness among animal breeding scientists and beef cattle producers about the implications of genotype by environment interaction. This awareness is crucial for selecting more resilient and adapted cattle suited to different environments across the US.
How have the results been disseminated to communities of interest?
The project is currently in its initial stages of data generation and organization. Formal data analysis will commence after data cleansing and extensive descriptive analyses are conducted to ensure data accuracy and consistency. Consequently, there are no results to report for the first year of the project. Nevertheless, as outlined under 'Accomplishments,' several scientific and technical presentations were delivered to communicate the project goals to the scientific community and cattle breeders. These presentations aimed to raise awareness about the implications of genotype by environment interaction and the selection of more resilient and adapted cattle suitable for diverse environments across the US.
What do you plan to do during the next reporting period to accomplish the goals?
During the upcoming year of the project, we will continue holding monthly virtual meetings among Co-PIs and personnel from the Angus Association. We also plan to organize two more meetings with the project's Advisory Board to discuss project developments and next steps, and to gather their input and suggestions. There is also ongoing discussion about convening the entire group for an in-person meeting, possibly at the facilities of the Angus Association. In terms of data generation, in the second year of the project, we aim to finalize the online survey and distribute it to Angus producers. Additionally, we will gather weather and soil information based on farm locations and collect the satellite imagery necessary for developing the pasture quality assessment model. Genotype by environment analyses will be conducted, primarily using traditional multi-trait mixed models and reaction norms. Machine learning and artificial intelligence approaches will be implemented later, potentially towards the end of year two or during year three.

Impacts
What was accomplished under these goals?
This project is an integrated research-extension initiative involving five universities and the Angus Association. It has three interdependent goals: database generation, data mining, and extension activities. The first year focused heavily on organizational efforts, including monthly virtual meetings among Co-PIs, their lab members, and personnel from the Angus Association. Additionally, we held a virtual meeting with the project's Advisory Board, which included two beef cattle producers. In terms of achievements, we have recruited a graduate student and a post-doctoral fellow to work on the project. Two more graduate students are expected to join in Years 2 through 4, along with another postdoctoral fellow in Year 4. For Objective 1 (database generation), we have compiled phenotypic, pedigree, and genomic data available from the Angus Association. We have also started developing an online survey, which will be distributed to over 9,000 beef cattle producers associated with the American Angus Association. In this initial phase, project members discussed and decided on relevant survey questions. Selected producers were consulted to gather input on additional questions and refine existing ones for clarity and informativeness. The next step involves finalizing a draft survey for testing with a few producers to gather feedback before the final version is completed. Furthermore, we have begun collecting farm location and property boundary information to support satellite imaging. Throughout the first year, several scientific and technical presentations were delivered, including:
Rosa, G. J. M. Integration of Environomics and Genomics. 46th American Dairy Science Association (ADSA) Discover Conference, Itasca, Illinois. May 6-9, 2024.
Rosa, G. J. M. Combining Big Data Analytics and Omics Techniques to Improve Beef Cattle Selection and Production. XXXI Plant & Animal Genomes (PAG) Conference, San Diego, CA. January 12-17, 2024.
Rosa, G. J. M. The Use of AI and Statistics in Optimising the Use of Livestock Big Data. Conference on Artificial Intelligence and Data Analytics for Improved Veterinary Care. University of Surrey, Guildford, UK. December 11, 2023.
Rosa, G. J. M., Lourenco, D., Rowan, T. N., Brito, L. F., Gondro, C., Huang, J. and Valle de Souza, S. Integrating Enviromics, Genomics, and Machine Learning for Precision Breeding of Resilient Livestock. 2023 ASAS-CSAS-WSASAS Annual Meeting. Albuquerque, NM. July 16-20, 2023.

Publications