Progress 10/01/20 to 09/30/21
Outputs Target Audience: The targeted audience of this report is polyploid plant breeders and geneticists. Because of their very complex genetic structure, which results from the large number of segregating alleles and allelic combinations in families, polyploid species have lagged behind in the application of genomics. High-throughput DNA sequencing technology has brought the opportunity to assess these complex genomes through quantitative reduced representation sequencing (qRRS), SNP-binary dosage markers and newly developed computational tools. My group and collaborators have been developing a pipeline of computational tools for analyzing genomic data in complex autopolyploid species: tools like VCF2SM and SuperMASSA (from our collaborators) for processing raw DNA sequences to call genetic dosage markers (marker identification); MAPpoly for constructing a genetic linkage map from the dosage markers (linkage map construction and haplotype inference); and QTLpoly for locating genes that are important to trait phenotypes by using the linkage map (QTL mapping) and also for performance prediction. These tools have already been used in several populations to construct a complete linkage map and to map genes that affect trait variation in sweetpotato (autohexaploid) and potato (autotetraploid). It is time for polyploids to catch up with diploids in the era of genomics.
Changes/Problems:
Nothing Reported
What opportunities for training and professional development has the project provided?
Dr. Marcelo Mollinari (developer of MAPpoly) and Dr. Guilherme da Silva Pereira (developer of QTLpoly) were postdoctoral research associates under my supervision during the previous reporting periods. In 2021, Dr. Guilherme da Silva Pereira became Assistant Professor in the Department of Agronomy at the Federal University of Viçosa, Brazil, and Dr. Marcelo Mollinari was promoted to Research Assistant Professor in the Department of Horticultural Science and the Bioinformatics Research Center at North Carolina State University. A new postdoc, Gabriel Gesteira, joined the project in 2021 and has taken over responsibility for the further development of QTLpoly.
How have the results been disseminated to communities of interest?
We release the MAPpoly and QTLpoly software freely to promote open scientific exchange and unrestricted use by the scientific community. We are fortunate to have worked with the sweetpotato community through GT4SP (2014-2019) and SweetGAINS (2019-2022) and with the polyploid community at large through the USDA/NIFA/SCRI Tools for Polyploids project (2020-2024). These interactions are essential for us as tool developers and provide us with valuable datasets for tool development. As part of the USDA/NIFA/SCRI project, we participate in its teaching workshops every year to teach our tools to the general polyploid community (https://www.polyploids.org/). The USDA/NIFA/SCRI project has also put us in contact with several polyploid breeders, which could result in fruitful collaborations in the following years.
What do you plan to do during the next reporting period to accomplish the goals?
We have extended MAPpoly and QTLpoly to three connected full-sib families (TB, BT and NKB) for a joint linkage and QTL mapping analysis. This is a significant step towards a multiple-family analysis.
Our next step remains to extend MAPpoly and QTLpoly to a much larger 8 x 8 (16 parents) cross population, called the Mwanga Diversity Population (MDP). The phenotypes of the MDP have been collected over the last few years. However, the DNA data of the MDP have been significantly delayed by the Covid-19 pandemic and by changes in DNA sequencing protocols, and our collaborators still cannot give us a clear timeline for delivery of the data. We applied for, and were awarded, a new USDA/NIFA project (starting 01/01/2022), "A Genetics-Based Data Analysis System for Breeders in Polyploid Breeding Programs". For this project, we have started to develop a further downstream computational tool, called GGSpoly, that takes the results from MAPpoly and QTLpoly and performs genomic selection and practical breeding decision-making processes and exercises.
Impacts What was accomplished under these goals?
We have extended MAPpoly for multiple-family analysis and applied it to three large full-sib families: Beauregard x Tanzania (BT), Tanzania x Beauregard (TB), and New Kawogo x Beauregard (NKB). We completed the task of building a joint map for the three families. We have also extended QTLpoly for multiple-family analysis and applied it to the BT, TB and NKB families for the joint mapping of QTL for beta-carotene and other traits. This is a significant step in moving linkage and QTL analysis from a single full-sib family to multiple families and then to practical breeding populations. The original plan to extend the multiple-family analysis to the 8 x 8 cross population, the Mwanga Diversity Population (MDP), was, however, delayed. Our collaborators encountered technical problems in generating the DNA data with the new protocol, so this work will have to be pushed to the next year.
We carried out an extensive optimization of the MAPpoly and QTLpoly code. This effort was spurred by our new postdoc, Gabriel Gesteira, who has extensive programming experience. He completely overhauled the code of both packages to improve computational efficiency and reduce memory requirements for extremely large DNA sequence datasets. As a result, routine analyses of large datasets with MAPpoly and QTLpoly can potentially be run on a typical PC rather than on a dedicated high-performance computer.
Using MAPpoly and QTLpoly, and jointly with our collaborators, we reported the discovery of a major QTL for root-knot nematode (RKN; Meloidogyne incognita) resistance in cultivated sweetpotato. This QTL was located on linkage group 7, was dominant in nature, and explained 58.3% of the phenotypic variation in RKN counts. Based on the mapping result and the specific SNP allele identified, our collaborators have already begun selecting for the targeted SNP allele in the breeding population.
Also using MAPpoly and QTLpoly, we reported the mapping of a major QTL for resistance to common scab (Streptomyces spp.), a devastating bacterial disease, in two potato populations. The QTL was mapped on linkage group 3 and explained about 22 to 30% of the total variation. The identification of QTL haplotypes and candidate genes contributing to disease resistance can support genomics-assisted breeding approaches in the crop. For this mapping population, we also performed QTL mapping on seven traits over four years (2006-2008 and 2014). Using a multiple-QTL model approach, we detected 21 QTL for 15 of the 27 trait-year combinations. A hotspot on linkage group 5 was identified, with co-located QTL for maturity, plant yield, specific gravity, and internal heat necrosis resistance evaluated over different years. Additional QTL for specific gravity and dry matter were detected with maturity-corrected phenotypes. Among the genes around the QTL peaks on chromosome 5, we found genes previously implicated in maturity (StCDF1) and tuber formation (POTH1). These analyses have the potential to provide insights into the biology and breeding of tetraploid potato and other autopolyploid species.
Publications
- Type:
Journal Articles
Status:
Published
Year Published:
2021
Citation:
Pereira, G.D., M. Mollinari, M.J. Schumann, M.E. Clough, Z.-B. Zeng, G.C. Yencho (2021) The recombination landscape and multiple QTL mapping in a Solanum tuberosum cv. Atlantic-derived F1 population. Heredity 126:817-830. doi: 10.1038/s41437-021-00416-x
- Type:
Journal Articles
Status:
Published
Year Published:
2021
Citation:
Oloka, BM, GS Pereira, VA Amankwaah, M Mollinari, KV Pecota, B Yada, BA Olukolu, Z-B Zeng and GC Yencho (2021) Discovery of a major QTL for root-knot nematode (Meloidogyne incognita) resistance in cultivated sweetpotato (Ipomoea batatas). Theor Appl Genet 134:1945-1955. https://doi.org/10.1007/s00122-021-03797-z
- Type:
Journal Articles
Status:
Published
Year Published:
2021
Citation:
Guilherme da Silva Pereira, Marcelo Mollinari, Xinshun Qu, Christian Thill, Zhao-Bang Zeng, Kathleen Haynes, G Craig Yencho (2021) Quantitative trait locus mapping for common scab resistance in a tetraploid potato full-sib population. Plant Disease 105 (10), 3048-3054. https://doi.org/10.1094/PDIS-10-20-2270-RE
- Type:
Journal Articles
Status:
Published
Year Published:
2021
Citation:
Hai-Bing Xie, Li-Gang Wang, Chen-Yu Fan, Long-Chao Zhang, Adeniyi C Adeola, Xue Yin, Zhao-Bang Zeng, Li-Xian Wang, Ya-Ping Zhang (2021) Genetic Architecture Underlying Nascent Speciation: The Evolution of Eurasian Pigs under Domestication. Molecular Biology and Evolution, msab117, https://doi.org/10.1093/molbev/msab117
Progress 10/01/19 to 09/30/20
Outputs Target Audience: Because of their very complex genetic structure, which results from the large number of segregating alleles and allelic combinations in families, polyploid species have lagged behind in the application of genomics. High-throughput DNA sequencing technology has brought the opportunity to assess these complex genomes through quantitative reduced representation sequencing (qRRS), SNP-binary dosage markers and newly developed computational tools. My group and collaborators have been developing a pipeline of computational tools for analyzing genomic data in complex autopolyploid species: tools like VCF2SM and SuperMASSA (from our collaborators) for processing raw DNA sequences to call genetic dosage markers (marker identification); MAPpoly for constructing a genetic linkage map from the dosage markers (linkage map construction and haplotype inference); and QTLpoly for locating genes that are important to trait phenotypes by using the linkage map (QTL mapping) and also for performance prediction. These tools have already been used in several populations to construct a complete linkage map and to map genes that affect trait variation in sweetpotato (autohexaploid) and potato (autotetraploid). It is time for polyploids to catch up with diploids in the era of genomics. The targeted audience of this report is polyploid plant breeders and geneticists.
Changes/Problems:
Nothing Reported
What opportunities for training and professional development has the project provided?
Both Dr. Marcelo Mollinari (developer of MAPpoly) and Dr. Guilherme da Silva Pereira (developer of QTLpoly) were postdoctoral research associates under my supervision during the reporting period. (As of 2021, Dr. Guilherme da Silva Pereira has become Assistant Professor in the Department of Agronomy at the Federal University of Viçosa, Brazil, and Dr. Marcelo Mollinari has been promoted to Research Assistant Professor in the Department of Horticultural Science and the Bioinformatics Research Center at North Carolina State University.)
How have the results been disseminated to communities of interest?
We release the MAPpoly and QTLpoly software freely to promote open scientific exchange and unrestricted use by the scientific community. We are fortunate to have worked with the sweetpotato community through GT4SP (2014-2019) and SweetGAINS (2019-2022) and with the polyploid community at large through the USDA/NIFA/SCRI Tools for Polyploids project (2020-2024). These interactions are essential for us as tool developers and provide us with valuable datasets for tool development. As part of the USDA/NIFA/SCRI project, we participate in its teaching workshops every year to teach our tools to the general polyploid community (https://www.polyploids.org/workshop/2021/january/info). The USDA/NIFA/SCRI project has also put us in contact with several polyploid breeders, which could result in fruitful collaborations in the following years.
What do you plan to do during the next reporting period to accomplish the goals?
The current tools (MAPpoly and QTLpoly) can only be applied to a single large full-sib family and do not yet apply to the practical breeding populations of polyploid crops. We have built separate linkage maps for three large full-sib families: Beauregard x Tanzania (BT), Tanzania x Beauregard (TB), and New Kawogo x Beauregard (NKB).
However, to analyze the three families jointly and so increase statistical power, we first need to build a joint linkage map and then map QTL across the three families. In the next reporting period, we aim to extend the tools to multiple inter-related families, and will use the genomic data of the BT, TB and NKB families for the computational tool development. In SweetGAINS, we also have access to a large 8 x 8 factorial design population. The crosses were made between two sets of parents belonging to two different heterotic groups, resulting in 64 full-sib families, each with about 30 individuals. This population, called the Mwanga Diversity Population (MDP), would serve as a primary breeding population for future efforts to understand the genetics of complex hexaploid sweetpotato and to help breeders improve their varieties. However, the genomic data of the MDP are not yet available. Computational tool development for the MDP will be the major research effort in the next several years.
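The structure of the 8 x 8 factorial design described above can be sketched in a few lines of Python. This is only an illustration of the design arithmetic: the parent labels (A1..A8, B1..B8) are hypothetical placeholders rather than the actual MDP parents, and the family size of 30 is the approximate figure quoted in the text.

```python
from collections import Counter
from itertools import product

# Hypothetical parent labels for the two heterotic groups (not the real MDP names).
group_a = [f"A{i}" for i in range(1, 9)]
group_b = [f"B{i}" for i in range(1, 9)]

# "About 30 individuals" per family, taken from the report text.
PROGENY_PER_FAMILY = 30


def factorial_crosses(set_one, set_two):
    """Return every pairwise cross between the two parent sets."""
    return list(product(set_one, set_two))


families = factorial_crosses(group_a, group_b)
total_progeny = len(families) * PROGENY_PER_FAMILY

# Count how many full-sib families each parent contributes to.
families_per_parent = Counter(parent for cross in families for parent in cross)

print(len(families), "full-sib families, about", total_progeny, "individuals")
print("families sharing parent A1:", families_per_parent["A1"])
```

Each parent is shared by eight families, so the families are connected through common parents. That connection is what makes a joint, multiple-family analysis both possible and necessary for this population.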
Impacts What was accomplished under these goals?
Building on several years of effort in the Genomic Tools for Sweetpotato Improvement (GT4SP) project (2014-2019) and the Sweetpotato Genetic Advances and Innovative Seed Systems (SweetGAINS) project (2019-2022), both funded by the Bill & Melinda Gates Foundation, 2020 was the year in which this work came to fruition in terms of published research papers and released software products. We developed the theory and methods and implemented the software for linkage and QTL analysis in a bi-parental full-sib population. This effort resulted in the first comprehensive genetic analysis of a hexaploid sweetpotato genome, including the construction of an ultra-dense genetic map and the inference of both parental and progeny haplotypes (Mollinari et al., 2020). This information was used in subsequent efforts to map quantitative trait loci of economic and agronomic importance (Gemenet et al., 2020; Pereira et al., 2020). To make multipoint map construction available to the scientific community, we developed and freely released MAPpoly (https://CRAN.R-project.org/package=mappoly). MAPpoly is an R package for constructing genetic maps in autopolyploids with even ploidy levels. In its current version (0.2.3), MAPpoly can handle even ploidy levels up to 8 when using hidden Markov models (HMM), and up to 12 when using the two-point simplification. It contains functions for every step of the mapping pipeline: loading a variety of dosage-based datasets, including genotype probabilities; filtering; pairwise linkage analysis; clustering of linkage groups; marker ordering; phasing and multipoint map estimation; computation of genotype probabilities for further QTL analysis; and inference of meiotic processes.
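To give a feel for what a dosage-based dataset encodes, the following Python sketch computes the expected segregation of a single biallelic dosage marker in a full-sib family. It is a textbook illustration, not MAPpoly code: it assumes random chromosome pairing with no double reduction and no preferential pairing, so a gamete draws half of a parent's homologs without replacement. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def gamete_dosage_dist(ploidy, parent_dosage):
    """Distribution of the allele dosage carried by one gamete.

    Assumes random pairing and no double reduction, so the gamete samples
    ploidy/2 of the parent's homologs without replacement (hypergeometric).
    """
    if ploidy % 2 != 0 or not 0 <= parent_dosage <= ploidy:
        raise ValueError("ploidy must be even and 0 <= dosage <= ploidy")
    half = ploidy // 2
    total = comb(ploidy, half)
    dist = {}
    for k in range(half + 1):
        ways = comb(parent_dosage, k) * comb(ploidy - parent_dosage, half - k)
        if ways:
            dist[k] = Fraction(ways, total)
    return dist


def progeny_dosage_dist(ploidy, dosage_p1, dosage_p2):
    """Expected dosage distribution in the full-sib progeny of two parents."""
    g1 = gamete_dosage_dist(ploidy, dosage_p1)
    g2 = gamete_dosage_dist(ploidy, dosage_p2)
    out = {}
    for k1, p1 in g1.items():
        for k2, p2 in g2.items():
            out[k1 + k2] = out.get(k1 + k2, Fraction(0)) + p1 * p2
    return dict(sorted(out.items()))


if __name__ == "__main__":
    # Hexaploid (e.g. sweetpotato): simplex x nulliplex marker.
    print(progeny_dosage_dist(6, 1, 0))
    # Tetraploid (e.g. potato): duplex x duplex marker.
    print(progeny_dosage_dist(4, 2, 2))
```

Under these assumptions a simplex x nulliplex marker segregates 1:1 in a hexaploid family, and a duplex x duplex marker in a tetraploid family segregates 1:8:18:8:1 across dosages 0 to 4. Expected ratios of this kind are commonly used to screen dosage markers before map construction.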
It should be emphasized that linkage map construction and haplotype inference in high-level autopolyploids is a highly complex problem in statistical genetics, and we are proud to have made significant contributions in this area. The development of MAPpoly is a game-changing achievement that has opened the door to genomic applications in plant breeding for high-level autopolyploid species.
In QTL mapping, we aim to interpret the genetic basis of quantitative trait variation in a population, both for genetic discovery and for prediction. Because of the potentially large number of alleles at each QTL in polyploid populations, we developed a random QTL-effect model for mapping multiple QTL. The multiple QTL are searched for sequentially. QTL effect parameters are estimated with a mixed-effect model using REML. The test statistic for QTL identification is a score statistic, which allows the p-value to be computed empirically and efficiently. The method is general and flexible and can be readily extended to multiple families. We implemented it as an R package, QTLpoly (https://github.com/guilherme-pereira/QTLpoly), for general QTL mapping analysis in polyploid populations (Pereira et al., 2020). QTLpoly takes as input the haplotype structure inferred by MAPpoly, in the form of the conditional genotype probability distribution at each genomic position for each individual, and combines it with phenotypes to perform a variety of genetic analyses relating genotypes to phenotypes. It can optionally perform genomic selection (GS) and prediction. More importantly, it can build a clearly defined and flexible genetic model that serves the purposes of both genetic discovery and breeding value prediction for selection.
Pereira et al. (2020) reported the mapping of a number of QTL for both qualitative traits and yield traits. Based on the QTL mapping of Pereira et al. (2020), Gemenet et al. (2020) reported an interesting and important comparison of different analysis methods in terms of predictive ability (measured as the correlation between the predicted and observed phenotypes in the validation sample, based on a 10-fold cross-validation). The message is clear: a fuller genetic analysis achieves not only clear genetic discovery (identification of specific QTL in the genome, specific alleles and allelic combinations in terms of parental haplotypes, a genetic model of causal variants, and the importance of QTL effects in terms of heritability), but also better prediction for breeding.
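The predictive-ability measure mentioned above (correlation between predicted and observed phenotypes under 10-fold cross-validation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of Gemenet et al. (2020): a single hypothetical marker dosage with a least-squares line stands in for the QTL-based or genomic prediction model, the data are simulated, and the predictions from all folds are pooled before one correlation is computed.

```python
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    denom = (sxx * syy) ** 0.5
    return sxy / denom if denom else 0.0


def fit_line(x, y):
    """Least-squares slope and intercept (stand-in prediction model)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx if sxx else 0.0
    return slope, my - slope * mx


def kfold_indices(n, k, rng):
    """Shuffle indices 0..n-1 and split them into k disjoint folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def predictive_ability(dosage, pheno, k=10, seed=1):
    """Correlation of held-out predictions with observed phenotypes."""
    rng = random.Random(seed)
    predicted = [0.0] * len(pheno)
    for fold in kfold_indices(len(pheno), k, rng):
        held_out = set(fold)
        train = [i for i in range(len(pheno)) if i not in held_out]
        slope, intercept = fit_line([dosage[i] for i in train],
                                    [pheno[i] for i in train])
        for i in fold:  # predict only individuals not used for fitting
            predicted[i] = intercept + slope * dosage[i]
    return pearson(predicted, pheno)


if __name__ == "__main__":
    # Simulated hexaploid dosages (0..6) with an additive effect plus noise.
    sim = random.Random(42)
    dosage = [sim.randint(0, 6) for _ in range(200)]
    pheno = [2.0 * d + sim.gauss(0.0, 1.0) for d in dosage]
    print(round(predictive_ability(dosage, pheno), 3))
```

The key point is that every individual is predicted from a model fitted without it, so the resulting correlation reflects out-of-sample performance rather than goodness of fit.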
Publications
- Type:
Journal Articles
Status:
Published
Year Published:
2020
Citation:
Mollinari, M., B. Olukolu, G. Pereira, D. Gemenet, C. Yencho, Z.-B. Zeng (2020) Unraveling the hexaploid sweetpotato inheritance using ultra-dense multilocus mapping. G3: Genes, Genomes, Genetics 10:281-292. doi: https://doi.org/10.1534/g3.119.400620
- Type:
Journal Articles
Status:
Published
Year Published:
2020
Citation:
Gemenet, DC, G. Pereira, BD Boeck, JC Wood, M Mollinari, BA Olukolu, F Diaz, V Mosquera, RT Ssali, M David, MN Kitavi, G Burgos, TZ Felde, M Ghislain, E Carey, J Swanckaert, LJM Coin, Z Fei, JP Hamilton, B Yada, GC Yencho, Z-B Zeng, ROM Mwanga, A Khan, WJ Gruneberg, CR Buell (2020) Quantitative trait loci and differential gene expression analyses reveal the genetic basis for negatively associated β-carotene and starch content in hexaploid sweetpotato [Ipomoea batatas (L.) Lam.]. Theoretical and Applied Genetics 133:23-36. doi: https://doi.org/10.1007/s00122-019-03437-7
- Type:
Journal Articles
Status:
Published
Year Published:
2020
Citation:
Pereira, G., D. Gemenet, M. Mollinari, B. Olukolu, F. Diaz, V. Mosquera, W. Gruneberg, A. Khan, C. Yencho and Z.-B. Zeng (2020) Multiple QTL mapping in autopolyploids: a random-effect model approach with application in a hexaploid sweetpotato full-sib population. Genetics 215 (3), 579-595. https://doi.org/10.1534/genetics.120.303080
- Type:
Journal Articles
Status:
Published
Year Published:
2020
Citation:
Gemenet DC; H Lindqvist-Kreuze; BD Boeck; G Pereira; M Mollinari; Z-B Zeng; GC Yencho; H Campos (2020) Sequencing depth and genotype quality: Accuracy and breeding operation considerations for genomic selection applications in autopolyploid crops. Theoretical and Applied Genetics 133(12):3345-3364. https://link.springer.com/article/10.1007/s00122-020-03673-2
- Type:
Journal Articles
Status:
Published
Year Published:
2020
Citation:
Chenxi Zhou, Bode Olukolu, Dorcus C Gemenet, Shan Wu, Wolfgang Gruneberg, Minh Duc Cao, Zhangjun Fei, Zhao-Bang Zeng, Andrew W George, Awais Khan, G Craig Yencho, Lachlan JM Coin (2020) Assembly of whole-chromosome pseudomolecules for polyploid plant genomes using outbred mapping populations. Nature Genetics 52 (11), 1256-1264. https://www.nature.com/articles/s41588-020-00717-7