Source: UNIVERSITY OF NEBRASKA submitted to
FROM GENE TO GLOBAL HYDROCLIMATIC CONTROLS ON HYBRID PERFORMANCE PREDICTABILITY
Sponsoring Institution
National Institute of Food and Agriculture
Project Status
NEW
Funding Source
Reporting Frequency
Annual
Accession No.
1015252
Grant No.
2018-67013-27594
Project No.
NEB-21-176
Proposal No.
2017-07752
Multistate No.
(N/A)
Program Code
A1141
Project Start Date
Feb 15, 2018
Project End Date
Feb 14, 2021
Grant Year
2018
Project Director
Munoz-Arriola, F.
Recipient Organization
UNIVERSITY OF NEBRASKA
(N/A)
LINCOLN, NE 68583
Performing Department
Biological Systems Engineering
Non Technical Summary
A current challenge for the global community is to secure food provision for the decades to come. The goal of this proposal is to develop a conceptual model to predict hybrid performance in response to hydroclimatic changes. Historically, genetic progress in maize production has come from breeding activities. To pursue successful and sustainable crop production in the future, we propose to develop a Genomics-by-Environment model. To implement this hybrid statistical modeling approach, possible sources of predictability of corn hybrid performance through changing climate forcings will be investigated and implemented in a conceptual framework as follows: (1) Develop a data management test bed to collect, standardize, and integrate data; (2) Characterize spatiotemporal hydroclimatic controls and the associated uncertainties across scales; (3) Develop a conceptual Genetic-, Multi-trait-, and Hydroclimatic-sensitive Model; (4) Perform hydroclimatic-driven hybrid performance forecasts based on (a) the spatial regionalization of phenotypic and environmental data and (b) the temporal influence of extreme hydrometeorological and climate events (EHCEs) on phenotypic expressions under standardized indices and absolute values of environmental variables; and (5) Develop a conceptual framework for operational rapid-response hybrid performance forecasts. Ultimately, "simulated" successful hybrids in response to droughts may be obtained by integrating the geospatial expansion of genes at field scale and the synthesis of global-scale hydroclimatic processes.
Animal Health Component
0%
Research Effort Categories
Basic
20%
Applied
40%
Developmental
40%
Classification

Knowledge Area (KA)    Subject of Investigation (SOI)    Field of Science (FOS)    Percent
202                    7310                              1081                      20%
132                    0430                              1081                      50%
402                    0420                              1081                      30%
Goals / Objectives
Goal: Develop a conceptual model to predict hybrid performance in response to hydroclimatic forcings.
General Objective: Develop and implement a multi-trait, multi-site expression to integrate hydroclimatic controls on the predictability of hybrids in response to EHCEs.
Particular Objectives:
(1) Develop a data management test bed to collect, standardize, and integrate data.
(2) Characterize spatiotemporal hydroclimatic controls and the associated uncertainties across scales.
(3) Develop a conceptual Genetic-, Multi-trait-, and Hydroclimate-sensitive Model.
(4) Perform hydroclimatic-driven hybrid performance forecasts based on (a) the spatial regionalization of phenotypic and environmental data and (b) the temporal influence of EHCEs on phenotypic expressions under standardized indices and absolute values of environmental variables.
(5) Develop a conceptual framework for operational rapid-response hybrid performance forecasts.
Project Methods
I) Develop a data management test-bed to collect, standardize, and integrate data
Phenotypic, genomic, and environmental data will be retrieved from the Genomes to Fields (G2F) consortium. Complementary environmental information will be obtained from complex satellite and field network information systems. Because the G2F project was launched in 2014, three years of information are available at the beginning of this research. On average, close to 28 locations were tested each year, and most of these were observed in all three years. Nearly 1,000 hybrids were tested, most of them in all three years. To allow connectivity between locations, researchers observed an average of 250 hybrids at each location in 2014, whereas in subsequent years they observed close to 500 hybrids at each site. A portable weather station able to record at least eight weather variables was used in each trial. As this research is an ongoing experiment, data generated in coming years will be integrated into the database as well. Gridded and station data from observations as well as remote sensing will be collected and standardized.

Observed Data
The initial dataset includes precipitation, maximum temperature, minimum temperature, and wind speed at 1/16th degree resolution. The dataset is derived from approximately 20,000 cumulative National Climatic Data Center (NCDC) Cooperative Observer (COOP) stations across the United States, as well as stations across Canada and Mexico. Data will be merged with G2F precipitation, minimum temperature, and maximum temperature. The Synagraphic Mapping System (SYMAP) algorithm (Shepard, 1984) will be used for gridding temperature and precipitation at resolutions up to 1 km to test suitability. The wind dataset was taken from the National Centers for Environmental Prediction (NCEP)/National Center for Atmospheric Research (NCAR) reanalysis (Kalnay et al., 1996) and gridded using linear interpolation. Further methodology details can be found in Livneh et al. (2015) and Maurer et al. (2002). Daymet and MODIS data will also be used to calibrate and validate the approach above, as well as for testing purposes.

Data Integration
Three years of G2F environmental (G2F-E), genotypic (G2F-G), and phenotypic (G2F-Y) data will be available by the time this project starts. The proposed series of experiments will allow the group to use a total of six years of data for the spring-summer season from 2014 to 2019. The ultimate goal in this section is to have a series of vectors and matrices with the same dimensions to serve as inputs for Equation 4. The dimensions of the vectors and matrices, as well as the number of trials, will be determined by G2F data availability. The use of linear algebra in our approach will allow us to adapt to the interannual changes that have already occurred. Environmental variables will be disaggregated using a fixed subdaily distribution; for example, COOP station data are available daily and G2F-E is available subdaily. Additional phenotypic data, such as greening and dormancy periods based on MODIS-LAI (Tang et al., 2012), will be tested and used as complements to current G2F-Y. G2F-G will be used as provided by the source, standardized to the same format. Standards for visualization will be developed for all G2F data. This will facilitate data delivery, analytics, and synthesis.

II) Characterize spatiotemporal hydroclimatic controls and the associated uncertainties across scales.
These analytics, represented by metrics of environmental extremes, will be used as G2F-E and, in some cases, as G2F-Y (e.g., greening and dormancy obtained from MODIS-LAI; Figures 2b and 2c). One approach to identifying and assessing extremes is to measure the degree of deviation from normal; however, anomalies in absolute terms reflect different severities in different parts of the domain. These differences may be due to climate, soils, vegetation, or other factors specific to each region.
An alternative approach is to use percentiles, which allow direct comparisons of extreme wet and dry events across the domain and so help identify extremes (Andreadis & Lettenmaier, 2006; Andreadis et al., 2005; Sheffield et al., 2009). The pth percentile of a distribution is the value at or below which approximately p percent of the values in the distribution fall. Monthly and daily percentiles will be calculated for each grid cell and G2F station based on the climatology of the 64-year period (1950-2013) and of the 14-year period (2000-2013), the latter set by MODIS-LAI availability. Total soil moisture was represented as percentiles (using the Gamma distribution) relative to all simulated values for a given grid cell and month. With this method, seasonal variations are removed, extremes can be more easily identified, and the intensity of events is better represented throughout the domain. Additionally, percentiles are convenient because they fall on a common scale from zero to one. The Gamma distribution has been identified as a good fit for the soil moisture data.
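The percentile transformation described above can be illustrated with a minimal Python sketch. It is not the project's implementation: it assumes a method-of-moments Gamma fit and a power-series evaluation of the Gamma cumulative distribution function, since the text does not say how the distribution is fitted. The function names and the sample soil moisture values are hypothetical.

```python
import math
from statistics import fmean, pvariance


def gamma_cdf(x, shape, scale, tol=1e-12, max_terms=10_000):
    """Gamma CDF, i.e. the regularized lower incomplete gamma P(shape, x/scale).

    Uses the power series
        P(a, z) = z**a * exp(-z) / Gamma(a) * sum_{n>=0} z**n / (a (a+1) ... (a+n)).
    Adequate for moderate z; very large z would need a continued fraction instead.
    """
    if x <= 0:
        return 0.0
    z = x / scale
    term = 1.0 / shape              # n = 0 term of the series
    total = term
    for n in range(1, max_terms):
        term *= z / (shape + n)     # ratio between consecutive series terms
        total += term
        if term < tol * total:
            break
    log_prefactor = shape * math.log(z) - z - math.lgamma(shape)
    return min(1.0, total * math.exp(log_prefactor))


def fit_gamma_moments(values):
    """Method-of-moments fit (an assumption): shape = mean^2/var, scale = var/mean.

    Requires positive values that are not all identical.
    """
    mean = fmean(values)
    var = pvariance(values, mu=mean)
    return mean * mean / var, var / mean


def soil_moisture_percentiles(values):
    """Map each value to its percentile (0 to 1) under the fitted Gamma distribution."""
    shape, scale = fit_gamma_moments(values)
    return [gamma_cdf(v, shape, scale) for v in values]


# Hypothetical total soil moisture (mm) for one grid cell and one calendar month,
# one value per year. These numbers are made up for illustration.
july_soil_moisture = [210.0, 250.0, 265.0, 280.0, 300.0, 340.0]
july_percentiles = soil_moisture_percentiles(july_soil_moisture)
for value, pct in zip(july_soil_moisture, july_percentiles):
    print(f"{value:6.1f} mm -> percentile {pct:.2f}")
```

Fitting one distribution per grid cell and calendar month, as in this sketch, is what removes the seasonal cycle: a value is compared only with other values from the same cell and month.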
Additional distribution functions will be tested (e.g., Weibull).

III) Develop a conceptual Genetic-, Multi-trait-, and Hydroclimatic-sensitive Model.
Depending on data availability, genomic selection methodology could be applied to four different problems that breeders face in the field: (i) predict observed lines in observed environments [some lines tested in some environments but not in others]; (ii) predict unobserved lines in observed environments [newly developed lines that have not been observed in any trial]; (iii) predict observed lines in unobserved environments [new locations]; and (iv) predict unobserved lines in unobserved locations [new lines that have never been tested would be tested in new environments].

IV) Perform hydroclimatic-driven hybrid performance forecasts
Spatial Patterns
Data obtained in the section above will be used to address objective IV. Regionalization of the areas of interest will be based on gridded and G2F-E data. This analysis will be implemented for each of the G2F-E variables (see section II) to characterize the sensitivity of hybrid performance to hydroclimatic forcings.
Temporal Patterns
Once the areas under water deficit in soil moisture (SM) and precipitation have been selected (in relative terms, as a function of the Standardized Precipitation Index (SPI), and in absolute terms, as a function of their magnitude), hybrid performance simulations will be run for those lines under each water deficit scenario. Hybrids selected under such conditions will be tested in other domains, based on the regionalization of the previous section, as well as at the original site. The main purpose is to characterize hybrids' performance under dry and drought conditions at other G2F sites.
Hypothetical scenarios will be integrated with the genetics of dry and drought conditions throughout the G2F domain.

V) Develop a conceptual framework for operational rapid-response hybrid performance forecast.
Data-retrieval, experiment, and forecasting conceptual frameworks will be integrated in an architecture. Sources of data and user needs vary across the G2F domain. The development of an interoperability/information system will use an interoperation interface to cope with the diversity and management of data and to pursue wider and more open accessibility of information. The system will collect data from the G2F networks, the ACIS-High Plains Climate Center-COOP climatological and meteorological stations, and remote and proximal sensing products to produce: 1) standardized, multi-source G, E, and Y variables; 2) user-friendly access to historical G, E, and Y data; and ultimately 3) meaningful representations of integrated climate-, water-, and G-E-Y-related information for users. The prototype of the system will be tested against un-integrated experiments. Displaying information in the appropriate forms and formats will be a key part of the system's legacy and of future operational tests and activities.
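The four prediction problems listed in section III differ only in whether the target line and the target environment already appear in the training records. A minimal Python sketch of that classification follows, as an illustration rather than the project's code. The function name, the hybrid labels, and the site-year labels are hypothetical and are not taken from the G2F data.

```python
def prediction_scenario(line, env, observed_pairs):
    """Classify a (line, environment) target into prediction problems (i)-(iv).

    observed_pairs: iterable of (line, environment) tuples that have phenotypic
    records in the training data.
    """
    pairs = set(observed_pairs)
    if (line, env) in pairs:
        return "already observed"   # nothing to predict for this combination
    line_seen = any(l == line for l, _ in pairs)
    env_seen = any(e == env for _, e in pairs)
    if line_seen and env_seen:
        return "i"      # observed line, observed environment, untested combination
    if env_seen:
        return "ii"     # unobserved (new) line in an observed environment
    if line_seen:
        return "iii"    # observed line in an unobserved (new) environment
    return "iv"         # unobserved line in an unobserved environment


# Hypothetical training records (made-up hybrid and site-year labels).
observed = [("H1", "NE_2014"), ("H1", "IA_2014"), ("H2", "NE_2014")]

print(prediction_scenario("H2", "IA_2014", observed))  # i
print(prediction_scenario("H3", "NE_2014", observed))  # ii
print(prediction_scenario("H1", "TX_2015", observed))  # iii
print(prediction_scenario("H3", "TX_2015", observed))  # iv
```

Labeling each target this way makes it straightforward to report predictive accuracy separately per problem, since problems (ii) to (iv) provide progressively less information about the target.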

Progress 02/15/18 to 02/14/19

Outputs
Target Audience: The team reached the weather, climate, and water community at the American Meteorological Society annual meeting in Phoenix, AZ. The purpose was to communicate about breeding programs and opportunities at the intersection of environment and genotype. We also provided perspectives on climate, data, and machine learning techniques to the "OMICS" community at the Genomes to Fields and PHENOME 2019 annual conference in Tucson, AZ. The purpose was to communicate our activities on environmental data collection and improvement. Recruitment activities focused on training "Digital Breeders". We hired two PhD students, a statistician and a biological systems engineer. Our commitment to supporting women and minority groups in STEM education was reflected in this process. We also recruited an undergraduate researcher, who will join the University of California, Davis later this year to pursue her master's degree. We held interdisciplinary group meetings in which statisticians and biological and agricultural engineers created a collaborative community. Two oral presentations and two posters emerged from these collaborative efforts.

Changes/Problems: Nothing Reported

What opportunities for training and professional development has the project provided?
PI Munoz-Arriola taught one formal course (Soil and Water Resources Engineering) to undergraduate and graduate students, who demonstrated the need for non-stationary approaches to address water, food, and energy security through the presentation of sixteen final projects. Support for undergraduate research experience is also evident in the projects developed in the Munoz-Arriola research lab, where two undergraduates (both women in science and engineering) worked on independent research projects; all undergraduates in the lab received funding from our project and from UNL's Institute of Agriculture and Natural Resources in response to competitive grant solicitations.
The PI continued his involvement with the National Science Foundation Research Training program on Resilient Complex Agricultural Landscapes. Four PhD students (two part-time and two fully funded in year one) have been involved in the project so far. Alessandro Amaranto was partially funded for one semester to produce a data-driven model to improve environmental databases. Manas Khan was also partially funded to create the analytics and geospatial aggregations of EHCEs. Parisa Sarzaeim and Denise Bradford have been fully funded and work on database improvement and statistical tool development in the project. Co-PI Diego Jarquin developed and taught multiple sections of a one-week workshop on Statistical Methods for Omics Assisted Breeding at the University of Tokyo in Japan in November 2018. There, he discussed prediction models involving the genotype x environment interaction and computational techniques for integrating data into these models. Diego Jarquin also delivered two 90-minute lectures at the University of Nebraska-Lincoln on the resources of the university's Holland Computing Center, which help address the computational needs of the objectives of this project. He also had the opportunity to spend three months as a visiting scientist in the Laboratory of Biometry and Bioinformatics in the Department of Agricultural and Environmental Biology at the University of Tokyo in Japan, working with scientists on developing prediction models for multi-omics data. Diego Jarquin trained two PhD students in the Department of Statistics to understand the reaction-norm model used for predictions involving the genotype x environment interaction and to use a pipeline developed for simple predictions.

How have the results been disseminated to communities of interest?
PI Munoz-Arriola is a member of the American Meteorological Society Water Resources Committee.
He consolidated the international collaborative research programs on Hydroinformatics and Integrated Hydrology with the Delft Institute for Water Education (The Netherlands) and on software development with the Universidad Autonoma de Baja California (Mexico). The G2F community invited Parisa Sarzaeim (PhD student) and Munoz-Arriola to give a talk on "Environmental Data Generation, Collection, and Storage for Cross-Scale Phenotype Predictability in the G2F Initiative", and at AMS Munoz-Arriola talked about "Weather/climate data collection for large-scale phenotype predictability in the Midwest". Five additional posters were presented at regional and international venues by students involved in the project. Co-PI Diego Jarquin has led and coauthored at least 12 manuscripts in which similar models have been used successfully (four of them in 2018). Some of these results have also been presented at no fewer than 6 international invited conferences. Diego Jarquin had numerous meetings and presentations as a visiting scientist in the Laboratory of Biometry and Bioinformatics in the Department of Agricultural and Environmental Biology at the University of Tokyo in Japan, where he discussed models similar to what is proposed here.

What do you plan to do during the next reporting period to accomplish the goals?
PI Munoz-Arriola expects to consolidate the software developed in the past cycles and publish findings in scientific journals, addressing the complexity of extreme precipitation and drought in agricultural and urban areas. He also expects to continue strengthening communication and collaboration with the G2F initiative, as well as with private companies, exploring new synergies with such sectors. Two PhD students will continue working as part of the work proposed for year 2.
UNL's Agriculture Research Department, the Robert Daugherty Water for Food Global Institute, and the National Science Foundation Research Training graduate program on Resilient Complex Landscapes have expressed interest in the present project. We expect this could translate into funding opportunities for graduate students. Co-PI: Continue model development so that the current model evolves to perform predictions under stress conditions. A student will be continuously trained to use the developed models.

Impacts
What was accomplished under these goals?
MA = Major Activities; DC = Data Collected; SM = Summary Statistics; KO = Key Outcomes

Objective 1.
MA: A pipeline was designed to retrieve phenotypic, genomic, and environmental data from the Genomes to Fields initiative for the years 2014, 2015, and 2016. The team also developed pipelines to collect environmental data from public sources, including remote sensing, gridded, and station-based networks.
DC: The team developed the analytics to correct inconsistencies in the G2F database (data gaps) using data-driven models. The artificial neural network that was used integrated G2F and environmental data for precipitation, minimum and maximum temperature, wind speed, and relative humidity. Some of the sources of data are listed below:
- High Plains Regional Climate Center (station data)
- National Solar Radiation Database
- National Weather Service
- NASA's Prediction of Worldwide Energy Resources (combination of model and remote sensing data)
- Moderate Resolution Imaging Spectroradiometer
- Global Precipitation Measurement
- Tropical Rainfall Measuring Mission
SM: This objective has resulted in 0 peer-reviewed publications, 0 extension publications, 6 conference presentations, 0 extension presentations, 1 undergraduate project, 0 MS projects, and 2 PhD projects. The results of the research have led to 0 additional grant applications.
KO: To date, the greatest impact has been a change in knowledge through the conference presentations. The team has introduced the complexity of, and opportunities for, advancing breeding programs to the weather, climate, and water communities of the American Meteorological Society (AMS, 2019 annual meeting, Phoenix, AZ). The "Omics" community (at the PHENOME 2019 annual meeting, Tucson, AZ), on the other hand, responded positively to our efforts on environmental database development and predictive analytics using machine learning techniques and decision support tools.
We anticipate a change of action in the coming years as we interact with more producers and water managers across the US.

Objective 2.
MA: The team developed analytics to identify extreme hydrometeorological and climate events (EHCEs). Tests have been made to characterize extreme precipitation, drought, and heat waves, and their associated return periods, in space and time.
DC: Algorithms for the metrics of EHCEs have been developed. These algorithms incorporated the World Meteorological Organization libraries for EHCEs. Currently, analytics and data sources are being connected to create spatiotemporal aggregates of environmental data, which will be coupled with phenotype assessments.
SM: In 2018, this objective resulted in 0 peer-reviewed journal articles, 0 extension publications, 1 conference presentation, 0 extension presentations, 0 undergraduate projects, 0 MS projects, and 1 PhD project (partially funded). This objective has resulted in 0 funded internal grants and 0 funded external grants (two proposals are pending with NSF, on coupled natural-human systems and on innovations at the food-energy-water nexus).
KO: To date, the greatest impact has been a change in knowledge through the presentations at the American Society of Agricultural and Biological Engineers (ASABE annual meeting, Detroit, MI).

Objective 3.
MA: A baseline model to perform predictions based on covariance structures has been developed for analyzing current observed environmental data.
DC: No new data collected. However, we have identified areas where the present project and the G2F initiative can improve the collection, storage, and distribution of data to users.
SM: Preliminary results have been analyzed. At the AMS 2019 annual conference, improvements of 16% (2016) to 33% (2016) in environmental data were reported. This represents an addition of 8 to 12 experiments with data gaps, respectively.
KO: An automated pipeline for running the analysis and predictions has been developed.

Objective 4.
MA: Work is in progress, since the data from Objective 2 were obtained during the first year of the project.
DC: No results are expected in year 1.
SM: No results are expected in year 1.
KO: No results are expected in year 1.

Objective 5.
MA: This objective is an integration of Objectives 1-4 and is therefore a work in progress. Some of the prediction models are already developed, and their implementation through computational pipelines and software architectures is in progress.
DC: Some of the phenotypic, genotypic, and environmental data are already collected, but more data will be collected.
SM:
KO: No results are expected in year 1.

Publications