Characterizing the USDA Peanut Core Collection through Genotype and Phenotype information

Sponsoring Institution

National Institute of Food and Agriculture

Project Status

COMPLETE

Funding Source

AFRI COMPETITIVE GRANT

Reporting Frequency

Annual

Accession No.

1015113

Grant No.

2018-67013-28138

Cumulative Award Amt.

$140,140.00

Proposal No.

2017-07777

Multistate No.

(N/A)

Project Start Date

Apr 15, 2018

Project End Date

Apr 14, 2020

Grant Year

2018

Program Code

[A1141]- Plant Health and Production and Plant Products: Plant Breeding for Agricultural Production

Recipient Organization
IOWA STATE UNIVERSITY
2229 Lincoln Way
AMES,IA 50011

Performing Department
Computer Science

Non Technical Summary
Peanut is a major crop in the US, both for domestic consumption and for export. It is also an important food crop internationally, and an important subsistence food in some areas. Peanuts are highly nutritious, with characteristics similar to those of tree nuts, and provide many of the same health benefits.
Although breeding tools that enable rapid development of improved varieties have long been available for crops such as corn and soybean, development of these tools has lagged for peanut. As a consequence, peanut has not seen the dramatic increase in yield that corn and soybean have seen. Current peanut varieties are vulnerable to many diseases, and to drought and other environmental stresses. Further, new diseases emerging in countries outside the U.S. present risks to U.S. peanut production if and when they arrive. Breeding cultivars that are resistant to the new diseases before they devastate U.S. peanut production has the potential to safeguard the U.S. and global crop, and to save significant expenses and environmental damage through reduced chemical applications.
This project aims to generate some basic tools that are necessary to advance peanut breeding to levels similar to what is possible for corn and soybean, using conventional breeding methods assisted by molecular markers, which enable the efficient selection of promising seed for important traits. This allows breeders to focus their efforts on fewer, high-value test plants and to speed up the breeding process.
To identify molecular markers that can be used in breeding, a large collection of peanut varieties must have their DNA sampled (called genotyping), and each variety must be scored for traits of interest, for example, degree of drought tolerance, yield measures, or susceptibility or resistance to diseases. These two types of data (genotype and trait) are analyzed together to identify molecular markers that are associated with a desired trait: increased drought tolerance, increased yield, or increased disease resistance. Breeders can then test seed with a fast and inexpensive method to determine whether those markers are present, and plant out only those seeds carrying the markers associated with the desired trait.
At the USDA germplasm facility in Griffin, GA, a set of about 800 varieties of peanut has been selected to form the USDA core peanut collection. This project will produce genotype data for the core collection, and will identify molecular markers associated with a set of 18 important peanut traits that have been scored in the collection. This information will be provided to all peanut breeders and researchers for further work identifying molecular markers that are associated with important traits.
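The joint analysis of genotype and trait data described above can be illustrated with a minimal sketch. It is not the project's analysis pipeline: it assumes each marker is coded as an allele dosage (0, 1 or 2) per accession, regresses a single trait on that dosage, and ranks markers by the share of trait variance explained. The marker names, dosage coding and trait values are hypothetical, and a real association study would also correct for population structure and relatedness.

```python
def allele_dosage_regression(dosages, trait_values):
    """Least-squares slope, intercept and r^2 of a trait on allele dosage."""
    n = len(dosages)
    if n != len(trait_values) or n < 3:
        raise ValueError("need matching lists with at least 3 observations")
    mean_x = sum(dosages) / n
    mean_y = sum(trait_values) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    syy = sum((y - mean_y) ** 2 for y in trait_values)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(dosages, trait_values))
    if sxx == 0 or syy == 0:
        # The marker or the trait does not vary, so nothing can be estimated.
        return 0.0, mean_y, 0.0
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


def rank_markers(genotypes, trait_values):
    """Rank markers by r^2; genotypes maps marker name -> dosage list."""
    scored = []
    for marker, dosages in genotypes.items():
        slope, _, r_squared = allele_dosage_regression(dosages, trait_values)
        scored.append((marker, slope, r_squared))
    return sorted(scored, key=lambda row: row[2], reverse=True)


# Hypothetical toy data: six accessions, two markers, one trait.
toy_genotypes = {
    "snp_A": [0, 0, 2, 2, 0, 2],
    "snp_B": [0, 2, 0, 2, 0, 2],
}
toy_trait = [10.0, 11.0, 20.0, 21.0, 10.5, 19.5]

ranking = rank_markers(toy_genotypes, toy_trait)
for marker, slope, r_squared in ranking:
    print(f"{marker}: slope={slope:.2f} r^2={r_squared:.2f}")
```

In this toy data set snp_A separates the low and high trait values cleanly, so it ranks first; a breeder would treat such a marker only as a candidate until it is validated.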

Animal Health Component

(N/A)

Research Effort Categories

Basic

(N/A)

Applied

(N/A)

Developmental

100%

Classification

Knowledge Area (KA)	Subject of Investigation (SOI)	Field of Science (FOS)	Percent
202	1830	1081	100%

Knowledge Area
202 - Plant Genetic Resources;

Subject Of Investigation
1830 - Peanut;

Field Of Science
1081 - Breeding;

Keywords

genotype

phenotype

Goals / Objectives
Goal: provide public genotype data for each accession in the USDA core peanut collection, to be used both by this project for initial analysis and for additional analyses in subsequent research. This dataset will be consistent, as all accessions will be genotyped with the same Affymetrix SNP array. An additional set of 14 lines that are important for commercial breeding will also be genotyped.
Samples of each accession in the core collection, and of the 14 commercial lines, will be sent to Iowa State University from the Greg MacDonald lab at the University of Florida.
DNA will be extracted from each sample and applied to Affymetrix SNP arrays, then the arrays will be sent to Affymetrix for processing.
The raw data from Affymetrix will be processed and analyzed for clear SNP calls.
The resulting genotype data for each accession will be provided to the public upon completion of the project and publication of the marker-trait analysis described below.
Goal: test a subset of accessions for heterogeneity. To be useful in breeding programs, accessions must be inbred, displaying minimal heterogeneity. A subset of the USDA core peanut collection is suspected of being heterogeneous.
175 potentially heterogeneous accessions have been identified.
3 plants (replicates) will be sampled for each of these accessions.
Goal: test uniqueness of the USDA core peanut collection. Other crop germplasm collections have been found to contain duplicated accessions, and this may be true for the core peanut collection as well.
All members of the USDA peanut mini-core collection have been genotyped with an older SNP array. The genotype data from the new SNP array will help determine whether some accessions in the mini-core are impure.
A portion of the USDA core peanut collection will be sampled in triplicate to test for purity.
Any duplicated accessions in the USDA core peanut collection will be detected.
Goal: collect and consolidate all available phenotype data for the US core peanut collection.
Collect phenotype data from older phenotyping studies, including results from the recent study of the core collection, which was still in final analysis at the time of this writing.
Merge this data with existing phenotype data available in GRIN. This will involve correlating trait names, measurement methods, and values across the different sets so that they can be used in large-scale Genome Wide Association Studies (GWAS), such as the one in this project.
Data will be available to the public as it is processed and published, or otherwise released to the public.
Goal: provide a set of marker-trait associations for 18 important agronomic traits.
Identify and remove accessions which are duplicates or impure.
Determine population structure and pedigrees.
Conduct GWAS analysis with tools and methods tested on an earlier project with the USDA peanut mini-core collection.
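The triplicate sampling used to test for heterogeneity can be sketched as a pairwise comparison of SNP calls among the replicates of one accession. This is only an illustration under stated assumptions: the genotype call encoding ("AA", "AB", "BB", with None for a missing call), the function names and the 2% discordance threshold are all hypothetical and are not taken from the project.

```python
from itertools import combinations


def discordance(calls_a, calls_b):
    """Fraction of SNPs called in both samples whose calls differ."""
    compared = 0
    differing = 0
    for a, b in zip(calls_a, calls_b):
        if a is None or b is None:
            continue  # skip SNPs with a missing call in either sample
        compared += 1
        if a != b:
            differing += 1
    # With no SNPs in common there is no evidence of a difference.
    return differing / compared if compared else 0.0


def is_heterogeneous(replicates, threshold=0.02):
    """Flag an accession if any pair of replicates exceeds the threshold.

    The default threshold is a hypothetical allowance for genotyping error.
    """
    return any(discordance(a, b) > threshold
               for a, b in combinations(replicates, 2))


# Hypothetical accessions with three replicate plants each.
pure_accession = [
    ["AA", "BB", "AA", "BB"],
    ["AA", "BB", "AA", "BB"],
    ["AA", "BB", None, "BB"],
]
mixed_accession = [
    ["AA", "BB", "AA", "BB"],
    ["AA", "BB", "AA", "BB"],
    ["BB", "BB", "AB", "BB"],
]

print(is_heterogeneous(pure_accession))   # replicates agree wherever called
print(is_heterogeneous(mixed_accession))  # third plant differs at two SNPs
```

Any real threshold would have to be calibrated against the array's technical error rate, for example from repeated genotyping of the same DNA sample.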

Project Methods
Build infrastructure at PeanutBase to hold and provide data produced by this project. This task will be ongoing through the length of the project.
Collect and merge existing phenotype data (from GRIN, UGA, UF). Data will be made public via PeanutBase as it is published or otherwise released as public data.
Select accessions to be tested for heterogeneity. These will be the entire USDA peanut mini-core plus a random 10% of the USDA peanut core collection accessions that are not in the mini-core (68 accessions), for a total of 175 accessions for re-genotyping. We will do 3 replicates of each, a total of 525 samples.
Collect samples for genotyping and send them to the Iowa State University DNA facility for DNA extraction. Seeds will be obtained from the University of Florida. DNA from an earlier extraction of the core (sans mini-core) is still available in Dr. MacDonald's lab and is likely to be in good shape. We will use this, but will be prepared to extract new DNA if it has degraded. Combined with the accessions to be tested for heterogeneity, the total number of samples will be 1332. To cover the possibility of having to re-do some samples, we will allow for up to 1400 samples.
Send DNA to Affymetrix for processing.
Process raw data. Reliable methods for doing this have been worked out and tested in Peggy Ozias-Akins' lab and will be used by this project.
Population study and GWAS analyses. Analysis of the population and pedigree structure will be needed to improve the GWAS analysis. Highly similar, and thus possibly duplicate, accessions will be removed from the set, as will accessions that are highly heterogeneous (not pure). GWAS methods will include TASSEL, GAPIT, and GWASpoly, and any other tools or methods that prove to be useful.
Collect all existing phenotype data, principally from Dr. MacDonald's lab at the University of Florida, and merge it with all existing phenotype data for the USDA core peanut collection.
Ensure all public phenotype data is loaded into GRIN.
Assign DOIs or ARKs to full datasets and deposit them in a permanent repository.
All public data will be freely provided at PeanutBase as it is published or otherwise declared to be public data.
Planned paper(s): describing phenotypic variation and population structure of the core collection and results of the GWAS studies.
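Merging phenotype data from several sources requires mapping each source's trait names and units onto a shared vocabulary. The sketch below shows one way this could be organized; it is not the project's actual procedure. The trait names, unit conversion factors, source labels and accession identifiers are all hypothetical placeholders.

```python
# Hypothetical synonym table: source trait name -> (canonical name, factor
# that converts the source unit to the canonical unit).
TRAIT_SYNONYMS = {
    "pod length (mm)": ("pod_length_cm", 0.1),
    "pod_len_cm": ("pod_length_cm", 1.0),
    "days to flower": ("days_to_flowering", 1.0),
}


def harmonize(records, synonyms):
    """Group observations under canonical trait names.

    records: iterable of (accession, source, trait_name, value) tuples.
    Returns (merged, unmapped) where merged maps (accession, canonical
    trait) to a list of (source, converted value), and unmapped is the
    set of trait names that still need manual curation.
    """
    merged = {}
    unmapped = set()
    for accession, source, trait_name, value in records:
        key = trait_name.strip().lower()
        if key not in synonyms:
            unmapped.add(trait_name)
            continue
        canonical, factor = synonyms[key]
        merged.setdefault((accession, canonical), []).append(
            (source, value * factor))
    return merged, unmapped


# Hypothetical observations from two sources for the same accession.
toy_records = [
    ("ACC_001", "GRIN", "pod_len_cm", 3.2),
    ("ACC_001", "UF", "Pod length (mm)", 30.0),
    ("ACC_001", "UF", "Seed color", 4.0),
]

merged, unmapped = harmonize(toy_records, TRAIT_SYNONYMS)
print(merged)
print(unmapped)
```

Keeping unmapped trait names in a separate set, rather than dropping them silently, leaves a worklist for the manual step of correlating trait terms across studies.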

Progress 04/15/18 to 04/14/20

Outputs
Target Audience: Peanut researchers and breeders.
Changes/Problems: We were unable to combine data from the older Arachis_axiom SNP array (used earlier to genotype the U.S. mini-core collection) with data from the Arachis_axiom2 array used in this study. Combining the two would have enabled additional testing for mixed/heterogeneous accessions within the US peanut mini-core collection.
What opportunities for training and professional development has the project provided? The data produced by this project has been used in research by two Ph.D. students at Iowa State University, Roshan Kulkarni and Paul Otyama, who were supported by separate funding (not by this grant).
How have the results been disseminated to communities of interest? Results were presented in a poster session at the 2019 National Association of Plant Breeders meeting in Pine Mountain, GA. The paper, 'Genotypic characterization of the U.S. peanut core collection', has been submitted and is under review. A preprint is available at bioRxiv (https://doi.org/10.1101/2020.04.17.047019).
What do you plan to do during the next reporting period to accomplish the goals? Nothing Reported

Impacts
What was accomplished under these goals?
The intent of this project was to collect genetic data for each accession (variety) in the US peanut core collection (Holbrook et al., 1993), which contains 812 accessions. Of these, 787 accessions were available for this project. The data produced by this project, combined with trait observations for each accession, can be used to look for genetic markers that are associated with agronomically important traits, enabling peanut breeders to develop new varieties of peanut more quickly. Another objective of the project was to assess the genetic diversity of the US peanut core collection, that is, to address the question: does it include all (or most) of the genetic variation across all varieties of peanut? The project also sought to assess the genetic "purity" or homogeneity within accessions, for example, to determine whether any accessions contained mixtures of genetically distinct seeds, or represented recent crosses with other accessions. Knowledge about genetic homogeneity is important for determining breeding strategies. To check this, 247 accessions were selected for purity testing. The final objective was to carry out preliminary genotype-trait analyses to look for genetic markers for 18 key agronomic traits. This data, and the preliminary analyses, will benefit both peanut breeding work and basic research on peanut, with the goal of producing new varieties of peanut with improved nutrition, agronomic characteristics, and resilience to environmental, disease and pest challenges.
Goal I: generate public genotype data for each accession in the USDA core peanut collection. Twenty seeds from each accession were acquired from Dr. Greg MacDonald (University of Florida) for DNA sampling. The remaining seeds for each accession were transferred to Dr. Shyam Tallury at the USDA Plant Genetic Resources Conservation Unit in Griffin, GA. The accessions for DNA sampling were divided between Dr. Peggy Ozias-Akins' lab at the University of Georgia in Tifton, GA and Dr. Kelly Chamberlin's lab, USDA, Stillwater, OK, where they were sampled, then grown to maturity and seed collected. Leaf samples were sent to Iowa State University from the Ozias-Akins and Chamberlin labs; DNA was extracted by Dr. Michelle Graham's lab and shipped to ThermoFisher/Affymetrix for processing on the Arachis_axiom2 SNP array. The ThermoFisher/Affymetrix analysis of the raw data was used in the core collection genetic assessment. The resulting genotype data is available at PeanutBase (https://peanutbase.org) as a bulk download: https://peanutbase.org/data/public/Arachis_hypogaea/minicore.trt.JWYM/, and can be browsed using Gigwa: https://peanutbase.org/gigwa. The data are also available at the National Agricultural Library's Ag Data Commons: https://doi.org/10.15482/USDA.ADC/1518508. Diversity analysis using this data showed that the collection is genetically very diverse and likely spans most or all of the genetic diversity found in peanut worldwide. Analysis of genetic diversity and geographical location of the accessions indicates that most genetic diversity in peanut arose in South America, before global distribution during colonial trade.
Goal II: test a subset of accessions for heterogeneity. A set of ~400 replicates from 247 accessions were grown to seedling stage at Iowa State University by the project team, sampled, and genotyped. Of these, 44% were found to be genetically impure accessions, meaning that the accessions in the core collection cannot be assumed, in general, to be homogeneous within each accession. This has important implications for breeding work.
Goal III: test uniqueness of the USDA core peanut collection. The project determined that 120 (15%) accessions in the core collection were nearly identical, although it is possible that minor genetic differences still result in distinct phenotypes, so these may still be justified as distinct accessions.
Goal IV: collect and consolidate all available phenotype data for the US core peanut collection. Phenotype data was collected from all available sources: the Germplasm Resource Information Network (GRIN), publications (Anderson et al., 1993, Anderson et al., 1996, Holbrook et al., 1983, Simpson et al., 1992), and a phenotype study carried out in 2013-2015 by Dr. Greg MacDonald at the University of Florida in collaboration with Dr. Noelle Anglin, then curator of the USDA peanut collection. The GRIN data varied in collection method, so only the latter two phenotype data sets were considered in analyses. While studies looking for genetic predictors of traits are still ongoing, data from the 2013-2015 study has been made publicly available, but only for the US Mini Core (Holbrook and Dong, 2005) (https://peanutbase.org/data/public/Arachis_hypogaea/minicore.trt.JWYM/). The entire data set will be publicly available when the phenotype paper has been submitted. This paper is in preparation.
Goal V: provide a set of marker-trait associations for 18 important agronomic traits. This goal has not yet been met, but the paper describing the work and results is in preparation.
References:
Anderson WF, Holbrook CC, Brenneman TB (1993) Resistance to Cercosporidium personatum within peanut germplasm. Peanut Science 20:53-57
Anderson WF, Holbrook CC, Culbreath AK (1996) Screening the peanut core collection for resistance to tomato spotted wilt virus. Peanut Science 23:57-61
Holbrook CC, Anderson WF (1995) Evaluation of a core collection to identify resistance to late leafspot in peanut. Crop Sci 35:1700-1702
Holbrook CC, Anderson WF, Pittman RN (1993) Selection of a core collection from the U.S. germplasm collection of peanut. Crop Sci 33:859-861
Holbrook CC, Dong W (2005) Development and evaluation of a mini core collection for the U.S. peanut germplasm collection. Crop Sci 45:1540-1544
Holbrook CC, Knauft DA, Dickson DW (1983) A technique for screening peanut for resistance to Meloidogyne arenaria. Plant Disease 67(9)
Simpson CE, Higgins DL, Thomas GD, Howard ER (1992) Catalog of passport data and minimum descriptors of Arachis hypogaea L. germplasm collected in South America 1977-1986. Texas Agricultural Experiment Station, Texas A&M University System, College Station, TX
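The search for nearly identical accessions reported under Goal III can be pictured as grouping accessions whose SNP calls agree almost everywhere. The following is a minimal sketch of that idea, not the method used in the study: it computes the fraction of matching calls for every pair and merges pairs above a cutoff with a small union-find structure. The call encoding, accession names and the 98% identity cutoff are hypothetical.

```python
from itertools import combinations


def identity(calls_a, calls_b):
    """Fraction of SNPs, called in both accessions, with identical calls."""
    pairs = [(a, b) for a, b in zip(calls_a, calls_b)
             if a is not None and b is not None]
    if not pairs:
        return 0.0  # nothing comparable, so do not treat as a match
    return sum(a == b for a, b in pairs) / len(pairs)


def near_identical_groups(genotypes, min_identity=0.98):
    """Return sorted groups of accessions linked by high pairwise identity."""
    parent = {name: name for name in genotypes}

    def find(name):
        # Follow parent links to the group representative, compressing paths.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(genotypes, 2):
        if identity(genotypes[a], genotypes[b]) >= min_identity:
            parent[find(a)] = find(b)

    groups = {}
    for name in genotypes:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values()
                  if len(members) > 1)


# Hypothetical accessions: acc_1 and acc_2 share every call.
toy_genotypes = {
    "acc_1": ["AA", "BB", "AB", "AA"],
    "acc_2": ["AA", "BB", "AB", "AA"],
    "acc_3": ["BB", "AA", "AB", "BB"],
}

print(near_identical_groups(toy_genotypes))
```

Because groups are formed by linking pairs, two accessions can end up in the same group through a shared near-identical neighbor even if their own pairwise identity falls just below the cutoff, so flagged groups are candidates for review rather than confirmed duplicates.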

Publications

Type: Journal Articles Status: Submitted Year Published: 2020 Citation: Genotypic characterization of the U.S. peanut core collection. Paul I. Otyama, Roshan Kulkarni, Kelly Chamberlin, Peggy Ozias-Akins, Ye Chu, Lori M. Lincoln, Gregory E. MacDonald, Noelle L. Anglin, Sudhansu Dash, David J. Bertioli, David Fernández-Baca, Michelle A. Graham, Steven B. Cannon, Ethalinda K.S. Cannon.

Progress 04/15/18 to 04/14/19

Outputs
Target Audience: Multiple meetings were held with the peanut breeding community to ensure the data would provide the expected benefits.
Changes/Problems: Discussions with representatives from the peanut breeding community raised the concern that the data would be of limited use unless paired with single-seed descent of the sampled plants. As this could not be managed within the scope of the funded project, two researchers, Dr. Peggy Ozias-Akins at the University of Georgia and Dr. Kelly Chamberlin, USDA, Stillwater, OK, each agreed to grow half of the collection to maturity and collect seed. Neither researcher received funds from this grant. This project was originally planned as a single-year project, but numerous and significant delays have pushed it into a second year. Initially, there was a 5-month delay in receiving the seeds: we hoped to plant in April 2018, but didn't receive the seed until late September 2018. Given the lateness of the season and the cold Iowa fall/winter weather, and in spite of multiple approaches to increasing the warmth and light quality, the seedlings intended for biological replicate sampling were slow to germinate and showed very poor germination (<50%). Replanting of the failed accessions was delayed by the government shutdown. In the second attempt the failed accessions were grown in a growth chamber, which improved the germination rate.
What opportunities for training and professional development has the project provided? The project has thus far provided opportunities for learning greenhouse and growth chamber skills, and for developing skills in the use of seed imaging equipment and automated image analysis.
How have the results been disseminated to communities of interest? Nothing Reported
What do you plan to do during the next reporting period to accomplish the goals? The DNA samples will have been sent to Affymetrix/ThermoFisher by the time this report is received. When the data has been returned, we will complete the remainder of the goals in roughly this order:
filter raw data for clear SNP signals
investigate population structure of the core collection
identify heterogeneous accessions within the core
combine genotype and phenotype data to look for marker-trait associations
prepare publication
provide all data through PeanutBase

Impacts
What was accomplished under these goals?
Goal 1: provide public genotype data for each accession in the USDA core peanut collection. Up to 25 seeds for each accession in the USDA core peanut collection were received from Greg MacDonald, along with an agreement from Dr. MacDonald to send the remaining seed stocks for the collection to Dr. Shyam Tallury at the Plant Genetic Resources Conservation Unit in Griffin, GA. The accessions were divided between two unfunded collaborator labs, which grew the plants to maturity and collected progeny seed. Leaf tissue was sampled at 2-4 weeks after germination. Samples have been collected and DNA extracted.
Goal 2: test a subset of accessions for heterogeneity. 400 biological replicates were selected to test for heterogeneous accessions. Leaf samples were taken from these plants at 2-4 weeks after germination. DNA has been extracted from the samples.
Goal 3: test uniqueness of the USDA core peanut collection. This goal partially overlaps with Goal 2. No data is yet available to carry out the planned analysis of heterogeneous accessions within the US peanut mini core.
Goal 4: collect and consolidate all available phenotype data for the US core peanut collection. All available phenotype data for the US peanut core collection has been merged with GRIN data and consolidated into PeanutBase. Additional phenotype data was collected through image analysis of the seeds before they were planted. Correlating trait terms used by GRIN and the multiple phenotype studies is in progress.
Goal 5: provide a set of marker-trait associations for 18 important agronomic traits. No data is yet available, so this goal has not been addressed.

Publications