Recipient Organization
UNIVERSITY OF GEORGIA
200 D.W. BROOKS DR
ATHENS, GA 30602-5016
Performing Department
CAES-Crop & Soil Sciences
Non Technical Summary
This research addresses the use of naturally occurring variation to enrich genetic diversity and accelerate breeding progress in cotton, one of the most economically important and genetically vulnerable US crops. Naturally occurring day-neutral (breeder-friendly) cottons capture much of the exotic diversity available. Prior funding revealed that many of these exotic cottons confer improvements to elite germplasm. Here, a selection index applied to prior phenotyping prioritized 50 lines for backcrossing to one of two elite cultivars, followed by phenotyping in 3 environments to prioritize 10 crosses for in-depth evaluation. Genetic analysis ('QTL mapping') to assess exotic alleles for contributions to the elite gene pool will be based on 3 environments for 100 progeny per cross (noting that many alleles may segregate in more than 1 cross). Elite alleles that consistently confer improvements against a background of multiple exotic alleles will be identified based on six environments with 500 individuals each, evaluated against backgrounds of 5 (3 environments) or 25 (3 environments) exotics. Comparing the elite cultivars DES 56 and Acala Maxxa (to be used in crosses) may reveal differences in how these key representatives of core US germplasm perform in exotic crosses. The hypothesis that the elite gene pool itself experiences linkage drag that can be mitigated by this exotic germplasm will be tested. A partnership between a plant breeder and a genome biologist will integrate training in field-based breeding with effective use of genomic data and tools toward mitigating genetic vulnerability, in a manner applicable to many other crops.
Animal Health Component
50%
Research Effort Categories
Basic
50%
Applied
50%
Developmental
(N/A)
Goals / Objectives
The long-term goal of the project is to enrich genetic diversity and accelerate breeding progress in the elite gene pool of cotton, one of the most economically important and genetically vulnerable major US crops, by increasing the use of naturally occurring variation from exotic genotypes. Meeting this goal will build upon results of a prior funding period that surveyed a broad range of exotic germplasm for potential contributions to the elite US gene pool, both by multi-environment phenotyping and by genotyping to determine relatedness/population structure and potential marker-trait associations by GWAS. The work will be aided by the prudent use of genomic tools to support breeding of new superior genotypes that provide low-cost intrinsic genetic solutions to the needs of producers, processors, and/or consumers.
Supporting objectives and brief expected outcomes include:
Objective 1. Infusing diverse exotic germplasm into selected elite US cottons.
In our prior funding period (above), 215 exotic cotton accessions identified by one of the proposers as day-neutral were evaluated in three environments, also using association genetics to identify potential marker-trait associations. Here, we have chosen 50 lines for forward genetic analysis, based on a selection index composed of economically important fiber quality traits including upper half mean fiber length, fiber strength, fiber uniformity index, and micronaire. The lines will be subdivided into two groups that have similar attributes (e.g., based on the selection index, the highest-ranking line goes to one group, the second line to the other group, the third line to the first group, etc.).
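The alternating assignment described above can be sketched as follows. This is only an illustration under stated assumptions, not the project's actual procedure: the source does not say how the four traits are standardized or weighted, so the z-score index, the WEIGHTS values (including the negative weight on micronaire), and the function names selection_index and alternate_split are all hypothetical.

```python
from statistics import mean, pstdev

# Hypothetical weights (assumption): the source names the traits but not their weights.
WEIGHTS = {"length": 1.0, "strength": 1.0, "uniformity": 1.0, "micronaire": -1.0}


def selection_index(lines):
    """Return {line_name: index}, where index is a weighted sum of per-trait z-scores.

    `lines` maps a line name to a dict of trait values keyed like WEIGHTS.
    """
    stats = {}
    for trait in WEIGHTS:
        values = [traits[trait] for traits in lines.values()]
        # Fall back to 1.0 when a trait has no variation, to avoid dividing by zero.
        stats[trait] = (mean(values), pstdev(values) or 1.0)
    return {
        name: sum(
            weight * (traits[trait] - stats[trait][0]) / stats[trait][1]
            for trait, weight in WEIGHTS.items()
        )
        for name, traits in lines.items()
    }


def alternate_split(index):
    """Rank lines by index (best first) and deal them alternately into two groups."""
    ranked = sorted(index, key=index.get, reverse=True)
    return ranked[0::2], ranked[1::2]


if __name__ == "__main__":
    # Toy index values for 50 hypothetical lines; real values would come from phenotyping.
    toy_index = {f"line_{i:02d}": 50.0 - i for i in range(50)}
    group_a, group_b = alternate_split(toy_index)
    print(len(group_a), len(group_b))
```

Dealing ranked lines alternately keeps the two groups of 25 close in average index, which is the stated intent of giving each elite parent a comparable sample of exotic lines.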
The resulting two groups of 25 lines will each be crossed and backcrossed once to an elite cultivar chosen to represent different dimensions of the US cotton gene pool, one group to DES 56 and the other to Acala Maxxa. Phenotyping of about 20 BC1 progeny from each of these 50 crosses in three environments will provide foundational data from which to prioritize specific crosses for more detailed analysis. The 1000 individuals will also be subjected to genotyping by sequencing (GBS). QTL analysis will be performed to identify elite alleles that consistently show favorable effects against a background of multiple exotic alleles; to compare DES 56 and Acala Maxxa and discern important differences in how they perform in crosses with broad samples of exotic germplasm; and to explore whether GWAS provides for high-resolution mapping or whether a predominance of rare alleles within the naturally occurring day-neutral exotic cotton panel requires biparental populations to detect QTL and beneficial, rare genetic variants.
Based on preliminary data suggesting a predominance of rare alleles that may be present in as few as a single progeny array of 20 individuals, we will use primarily phenotypic information to select among the 25 crosses per background, identifying the single best cross for each of 5 traits (fiber length, strength, fineness, elongation, and yield components) and thereby advancing 10 crosses (5 per background) to mapping of larger progeny arrays in Objective 2.
Objective 2. Based on field assessments of per se trait performance (Objective 1), from each of the two elite backgrounds we will choose 5 crosses for selfing and more detailed evaluation of the performance of exotic chromosomal segments in BC1F2 progeny of crosses with leading elite cottons.
Here, from the 20 BC1F1 lines from each of the 50 crosses in Objective 1, for the 10 selected crosses we will grow 5 plants per line (100 per cross), providing sufficiently large progeny arrays to undertake QTL mapping in each individual cross. In year 2, single BC1F3 plants will be grown at a single location and subjected to GBS, and in year 3, small BC1F3:4 families will be grown in each of two locations. Thus, for discerning the effects of rare alleles from the 10 exotic lines we will have a total of 3 environments of data for 100 progeny per cross, while for assessing elite alleles that consistently show favorable effects against a background of multiple exotic alleles, we will have six environments for 500 individuals per environment, evaluated against backgrounds of 5 (3 environments) or 25 (3 environments) exotic genotypes.
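The population sizes quoted in the two objectives can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated above; the constant names and the design_counts function are hypothetical, and reading "500 individuals per environment" as a per-elite-background count is an interpretation rather than something the text states outright.

```python
# All numeric values are taken from the objectives above; the names are illustrative.
BACKGROUNDS = 2                 # DES 56 and Acala Maxxa
OBJ1_CROSSES_PER_BACKGROUND = 25
OBJ1_BC1_PER_CROSS = 20
OBJ1_ENVIRONMENTS = 3
OBJ2_CROSSES_PER_BACKGROUND = 5
OBJ2_PLANTS_PER_BC1_LINE = 5
OBJ2_ENVIRONMENTS = 1 + 2       # year 2: one location; year 3: two locations


def design_counts():
    """Return the derived population sizes as a dict."""
    obj1_per_background = OBJ1_CROSSES_PER_BACKGROUND * OBJ1_BC1_PER_CROSS
    obj2_per_cross = OBJ1_BC1_PER_CROSS * OBJ2_PLANTS_PER_BC1_LINE
    obj2_per_background = OBJ2_CROSSES_PER_BACKGROUND * obj2_per_cross
    return {
        "obj1_per_background": obj1_per_background,
        "obj1_total": BACKGROUNDS * obj1_per_background,
        "obj2_per_cross": obj2_per_cross,
        "obj2_per_background": obj2_per_background,
        "obj2_total": BACKGROUNDS * obj2_per_background,
        "total_environments": OBJ1_ENVIRONMENTS + OBJ2_ENVIRONMENTS,
    }


if __name__ == "__main__":
    for key, value in design_counts().items():
        print(f"{key}: {value}")
```

Under this reading, each elite background is represented by 500 individuals per environment in both objectives (25 crosses of 20 in Objective 1, 5 crosses of 100 in Objective 2), across six environments in total.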
Project Methods
Objective 1. Infusing diverse exotic germplasm into selected elite US cottons.
Here, we will phenotype arrays of about 20 BC1 progeny from each of the 50 crosses in three environments, providing foundational data from which to prioritize specific crosses for more detailed analysis. The 1000 individuals will also be subjected to genotyping by sequencing (GBS), using a protocol that has evolved from our past work together as well as additional work in the PI's lab (Kim et al. 2016).
The 1000 BC1 families will be evaluated in three environments. BC1 plants will be grown and selfed in a winter nursery overseen by Cotton Incorporated, in a randomized block design. BC1F2 plots of 1 meter rows containing about 10 individuals will be grown in SC and GA using a single-replicate, augmented randomized complete block design with four repeating checks (Federer and Raghavarao 1975). The four repeating checks (Acala Maxxa, DES 56, and two commercial cultivars) will be randomly assigned to plots within 50 blocks using a randomized complete block design. Each of the 50 blocks will consist of 20 BC1 families and the four repeating checks, such that the total number of plots evaluated is 1200 (50 blocks x 24 plots). Following defoliation of plots at 80% open bolls, bolls will be hand harvested from the entire plant of each plot. Following ginning on a common 10-saw laboratory gin, fuzzy seed and ginned fiber will be weighed to determine lint percent. The collected fiber will be used to determine High Volume Instrument fiber properties, including strength, length, micronaire, elongation, short fiber content, and fiber uniformity.
Our cotton GBS protocol (Kim et al. 2016) is modified from Multiplexed Shotgun Genotyping [MSG (Andolfatto et al. 2011)] combined with the TASSEL-GBS v.2 (Glaubitz et al. 2014) analysis pipeline. Although the PI's Illumina MiSeq yields ~25 million reads of 150 base pair (bp) single-end fragments per run, a NextSeq in the UGA core facility is more economical and is budgeted.
To maximize data quality, we currently use only the first 64 nucleotides (nt) of sequence but may extend this using subsequent versions of TASSEL-GBS. In a 96-multiplexed library, genotypes are tagged with barcodes in the first 6 bases of raw reads; populations with > 96 individuals require multiple libraries. Reads that start with 'N' and reads with no barcode in the first 6 bases are removed. Processed reads are input to the TASSEL-GBS (Glaubitz et al. 2014) reference-based genotype caller, which guides stacking of short reads from the same genomic location by aligning them to a tetraploid genome (Zhang et al. 2015) using BWA (Li and Durbin 2009).
To adequately map individual bi-parental populations, we expect to need at least 1000 and preferably 5000 SNPs per population. The 26 cotton chromosomes often have 4000-6000 cM of recombination in total. Previous studies of intraspecific G. hirsutum populations produced ~1200 to ~4500 markers (Islam et al. 2014), and would yield many more markers on current sequencers. If our standard protocol does not provide enough SNPs, we will include additional restriction enzymes or use GBS in combination with 'Cot' analysis of intergenic DNA, which is more polymorphic than genic DNA (Rong et al. 2013).
QTL analysis. In Objective 1, we will conduct joint analysis of all 25 populations in each background. Because different exotic parents will contribute different SNP alleles and stepwise regression cannot use individuals with missing marker data, flanking markers will be used to impute genotypes at a set of reference nucleotides common across populations. Positions between two markers with the same score will simply be imputed to that score. Positions between two markers with different scores will be imputed to the weighted average of the two, based on genetic map distances.
The refined SNP data obtained from GBS will be used to make linkage maps with the UNIX version of MAPMAKER 3.0, modified to handle large data sets (McMullen et al. 2009b) and locally modified for parallel processing. SNP scores for all DES 56-derived lines will be combined into one file, and all Acala Maxxa-derived lines into another, scoring elite alleles as 'A', exotic alleles as 'B', and heterozygotes as missing data (required by MAPMAKER for RIL). Final marker orders will be re-tested using R/qtl (Arends et al. 2010) and validated against the reference genome sequence used for SNP calls (above). Markers will be used as covariates in stepwise regression (Buckler et al. 2009) to identify QTLs in each population.
Objective 2. Performance of exotic chromosomal segments in BC1F2 progeny of crosses with leading elite cottons.
From the 20 BC1 families from each cross in Objective 1, for 10 selected crosses we will grow 5 plants per line (100 per cross), providing sufficient progeny to undertake QTL mapping in each individual cross. In year 2, single BC1F3 plants will be grown and subjected to GBS, and in year 3, BC1F3:4 families will be grown in two locations.
In year 2, 100 BC1F3 plants from each of the 10 selected BC1 lineages will be evaluated in a single environment (SC) using a single-replicate, augmented incomplete block design by set. Each set will be arranged in a 10 x 10 alpha lattice augmented by including the two parental lines within each incomplete block, such that the total number of plots evaluated is 1200. Following defoliation of plots at 80% open bolls, bolls will be hand harvested from the entire plant of each plot. Following ginning on a common 10-saw laboratory gin, fuzzy seed and ginned fiber will be weighed to determine lint percent. The collected fiber will be used to determine High Volume Instrument fiber properties, including strength, length, micronaire, elongation, short fiber content, and fiber uniformity.
In year 3, the 10 BC1F3:4 populations (in total ~1000 lines) will be phenotyped in two environments (Athens, GA by AP; Florence, SC by TC).
Field trials will use two-replicate, alpha-lattice incomplete block designs arranged in sets according to each of the 10 populations, following the general procedures outlined by Hung et al. (2012). For each location, the field position of the 10 sets will be randomly assigned. Each set, which includes 100 BC1F3:4 progeny, 2 parental lines, and 2 commercial checks (104 entries), will be randomized across locations as a 13 x 8 incomplete block alpha lattice design. Experimental units will consist of single-row plots 1-2 m long with 1 m row spacing.
Following defoliation of plots at 80% open bolls, 25 first-position bolls will be hand harvested from the middle of the fruiting zone for each plot and weighed to estimate boll size. Following ginning on a common 10-saw laboratory gin, fuzzy seed and ginned fiber will be weighed to determine lint percent. The collected fiber will be used to determine High Volume Instrument fiber properties, including strength, length, micronaire, elongation, short fiber content, and fiber uniformity.
Phenotypic data will be analyzed within and across locations using a random effects mixed model analysis of variance. Initially, for each trait, Best Linear Unbiased Predictors (BLUPs) will be estimated for each environment and used in QTL analyses. Upon validation of homogeneous variance, a combined analysis of variance will be conducted across environments to include genotype x environment (G x E) interaction effects.
Genotyping and associated data analysis will use the methods described for Objective 1, except that here we will conduct both joint QTL analysis of the 5 populations in each of the two elite backgrounds and individual analysis of each single population.
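The flanking-marker imputation used in the Objective 1 QTL analysis (same score: copy it; different scores: distance-weighted average) might look like the sketch below. Several details are assumptions not given in the text: the numeric coding (0 = elite allele, 1 = exotic allele, None = missing), linear weighting by map distance in cM, leaving positions outside the outermost scored markers unimputed, and the function names impute_between and impute_reference.

```python
from bisect import bisect_left


def impute_between(pos, left, right):
    """Impute a score at `pos` from flanking (position_cM, score) markers.

    Equal flanking scores are copied; otherwise the two scores are averaged
    with weights proportional to closeness on the genetic map.
    """
    (left_pos, left_score), (right_pos, right_score) = left, right
    if left_score == right_score or right_pos == left_pos:
        return float(left_score)
    frac = (pos - left_pos) / (right_pos - left_pos)
    return left_score * (1.0 - frac) + right_score * frac


def impute_reference(markers, ref_positions):
    """Impute scores for one individual at shared reference positions.

    `markers` is a list of (position_cM, score) with score 0, 1, or None.
    Returns {ref_position: imputed_score or None}.
    """
    # Markers with missing scores cannot serve as flanking markers.
    scored = sorted((p, s) for p, s in markers if s is not None)
    positions = [p for p, _ in scored]
    imputed = {}
    for ref in ref_positions:
        i = bisect_left(positions, ref)
        if i < len(positions) and positions[i] == ref:
            imputed[ref] = float(scored[i][1])   # reference site was scored directly
        elif 0 < i < len(positions):
            imputed[ref] = impute_between(ref, scored[i - 1], scored[i])
        else:
            imputed[ref] = None                  # no flanking pair (assumption: leave missing)
    return imputed


if __name__ == "__main__":
    toy_markers = [(0.0, 0), (10.0, 1), (20.0, 1), (30.0, None), (40.0, 0)]
    print(impute_reference(toy_markers, [5.0, 15.0, 25.0, 45.0]))
```

Imputed values between 0 and 1 can then be used directly as covariates in stepwise regression, which is why a shared set of reference positions lets populations with different segregating SNPs be analyzed jointly.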