Progress 10/01/20 to 09/30/21
Outputs Target Audience: Academic scholars, postdocs, and students. Changes/Problems:
Nothing Reported
What opportunities for training and professional development has the project provided? Training opportunities are available to the postdoctoral fellow involved in the project through collaborations with academic, government, and industry groups. The postdoctoral fellow will also receive additional training in project organization and publication by mentoring graduate and undergraduate students working on parts of the project. How have the results been disseminated to communities of interest?
Nothing Reported
What do you plan to do during the next reporting period to accomplish the goals? Much effort in the upcoming year will focus on analysis of the large amount of data we now possess. Integrating these data will take considerable effort because they are heterogeneous: GBS and whole genome skim data generated here, and BUSCO and organelle genome data from Syngenta. To further collaborative goals with the USDA soybean germplasm collection, we will continue to generate and analyze sequence data from DNA samples sent to us from the collection. Planned emphases in this effort include: all available accessions of the two C-genome species, for comparison with accessions representing potential new species thought to belong to that group; additional samples newly accessioned by the USDA germplasm collection, particularly samples lacking definitive identification; and accessions for which we have DNA or seed samples from our nearly 40 years of research on perennial Glycine and for which genome skim data are not yet available. Taxonomic emphasis is likely to be on G. stenophita and on allopolyploid species possessing a G. stenophita genome (G. tabacina, G. pescadrensis), given our focus on white mold resistance in G. stenophita accessions. In general, subsequent steps in the overall project will focus on allopolyploids.
Impacts What was accomplished under these goals?
This year was marked by a shift in methodology from genotyping by sequencing (GBS) to whole genome skim sequencing, prompted both by the decreased cost of DNA sequencing and by the greater genomics capabilities of the new postdoctoral fellow now overseeing the project. The shift was also stimulated by the availability, through collaboration with a private corporation (Syngenta), of Illumina short-read genome skim sequences representing complete organellar genomes and 2500 nuclear genes for over 550 Glycine accessions. The nuclear genes came from the BUSCO set, a set of genes shared across many taxa and commonly used to assess the completeness of genome sequencing. A research highlight was the acquisition, from the Australian National Herbarium (CANB), of dried herbarium material of G. remota, a new species of Glycine from Western Australia described in 2015. A genome skim sequence was obtained from this species and combined with GBS sequences from diploid Glycine species generated earlier in this project, which placed G. remota in the Glycine phylogeny as a member of the I-genome group. A paper describing this finding is in preparation. Phylogenetic analysis of the Syngenta plastid genome (plastome) sequences was conducted, as were analyses of the first 100 BUSCO nuclear genes, in order to develop a pipeline for the planned full analysis of the nuclear genome. Initial findings are very promising. They include practical advances, such as the identification of misclassified accessions, and new insight into phylogenetic relationships, such as incongruence between plastid and nuclear genomes. Another milestone was the establishment of a formal collaboration with the USDA soybean germplasm collection through its new Curator, Adam Mahan. This led to our sequencing a plate of 96 samples sent to us from the collection, including unidentified or provisionally identified samples recently obtained from Australia.
Analysis of these data, as well as of the full Syngenta dataset and our previously generated GBS data, is pending.
Publications
Progress 10/01/19 to 09/30/20
Outputs Target Audience: Seminars and conference presentations were made to academic scholars, postdocs, and students. Changes/Problems:
Nothing Reported
What opportunities for training and professional development has the project provided?
Nothing Reported
How have the results been disseminated to communities of interest? University of Wisconsin, Madison, WI, December, 2019. "A polyploid odyssey." Cornell University, Plant Breeding & Genetics Section, September, 2020. "The impacts of polyploidy, from cells to species." What do you plan to do during the next reporting period to accomplish the goals? To facilitate identification of progenitor taxa and to document the extent of diversity within G. tomentella T4 (= G. tomentella sensu stricto), an allopolyploid comprising subgenomes contributed by an H-genome diploid species (one of six possible taxa) and the G. tomentella D3 diploid (D-genome), we propose to assemble genomes de novo from three genetically different T4 individuals using a hybrid approach with long and short reads. Specifically, we will sequence with Oxford Nanopore to 50x coverage and with Illumina NovaSeq to 50x coverage. We will then use the newly generated genome assemblies as references for mapping our GBS data. Comparing the percentage of reads mapped and the number of SNPs called will provide insights into the likely diploid progenitors of the polyploid accessions. Downstream analyses such as PCA and STRUCTURE will show which of the diploids is genetically closest to each of the polyploid's two homoeologous subgenomes. If resources are available, additional diploid progenitors will be resequenced with Illumina (2 x 250 reads) to 20x coverage to identify regions shared among the different genomes. Ultimately, we would like to look for signatures of selection in the genomes of these plants, which are known to harbor resistances and tolerances to biotic and abiotic stresses.
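The mapping-rate comparison described in these plans can be illustrated with a minimal Python sketch. It assumes that read and SNP counts have already been tallied for each candidate reference. The class name, function name, reference labels, and all numbers are hypothetical placeholders, not project data, and ranking by percentage of reads mapped is only one plausible way to summarize the comparison.

```python
from dataclasses import dataclass


@dataclass
class MappingResult:
    """Summary of mapping one accession's GBS reads to one candidate reference."""
    reference: str     # hypothetical label for a candidate progenitor assembly
    total_reads: int   # reads submitted to the mapper
    mapped_reads: int  # reads that mapped to this reference
    snps_called: int   # SNPs called against this reference

    @property
    def pct_mapped(self) -> float:
        # Guard against division by zero for empty libraries.
        if self.total_reads == 0:
            return 0.0
        return 100.0 * self.mapped_reads / self.total_reads


def rank_references(results):
    """Order candidate references from highest to lowest percentage of reads mapped."""
    return sorted(results, key=lambda r: r.pct_mapped, reverse=True)


# Placeholder counts for illustration only (assumption, not real data).
results = [
    MappingResult("candidate_A", 1_000_000, 820_000, 15_000),
    MappingResult("candidate_B", 1_000_000, 910_000, 9_000),
    MappingResult("candidate_C", 1_000_000, 640_000, 22_000),
]

ranked = rank_references(results)
for r in ranked:
    print(f"{r.reference}\t{r.pct_mapped:.1f}% mapped\t{r.snps_called} SNPs")
```

SNP counts are reported alongside the ranking rather than folded into it, because the number of SNPs called also depends on how many reads mapped in the first place.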
Impacts What was accomplished under these goals?
This past year was marked by difficulties even before the pandemic brought the project to a halt. These included deaths and other challenges in the family of the Research Associate leading the project. Because of her at-risk status, the Research Associate did not feel comfortable returning to campus while students were present, even after the lab received reactivation permission. A major goal had been to determine the identities of allotetraploid Glycine populations in the islands between Taiwan and mainland China. Several different allotetraploid species occur on Taiwan and in the Ryukyu Islands of Japan, and Chinese colleagues who have been studying their island populations were collaborating with us. However, after significant investment of effort and communication, it was determined that the DNA samples proposed for genotyping by sequencing (GBS) analyses were too degraded for that purpose, which was a major setback. Progress was made on morphological studies of perennial Glycine, leading to the return of herbarium specimens on loan from Royal Botanic Gardens, Kew, that had been borrowed years ago by a graduate student who did not remain for a Ph.D. In addition, some progress was made on Bayes factor species delimitation and phylogeny reconstruction of diploid accessions using GBS data. This work, when completed, will likely result in the recognition of several new species. Communication is ongoing with the soybean germplasm collection at Illinois, with the goal of helping to curate the collection based on new information generated as part of this project.
Publications
Progress 10/01/18 to 09/30/19
Outputs Target Audience: Seminars and conference presentations were made to academic scholars, postdocs, and students. Changes/Problems: To facilitate identification of progenitor taxa and to document the extent of diversity within G. tomentella T4 (= G. tomentella sensu stricto), an allopolyploid comprising subgenomes contributed by an H-genome diploid species (one of six possible taxa) and the G. tomentella D3 diploid, we propose to assemble genomes de novo from three genetically different T4 individuals using a hybrid approach with long and short reads. Specifically, we will sequence with Oxford Nanopore to 50x coverage and with Illumina NovaSeq to 100x coverage. We will then use the newly generated reference sequences to map our GBS data. Using PCA and STRUCTURE, we will be able to determine which of the diploids is genetically closest to each of the polyploid's two homoeologous subgenomes. If time and money allow, we will sequence additional diploid progenitors to 50x coverage using Illumina (2 x 250 reads) to identify regions shared among the different genomes. Ultimately, we would like to look for signatures of selection in the genomes of these plants, which are known to harbor resistances and tolerances to biotic and abiotic stresses. What opportunities for training and professional development has the project provided?
Nothing Reported
How have the results been disseminated to communities of interest? The PI has reported findings at the following seminars and conference: West Virginia University, Morgantown, WV, November, 2018. "Polyploidy: Significance and Unanswered Questions." 45th Annual South African Association of Botanists, African Mycological Society, and South African Society for Systematic Botany Joint Congress, Johannesburg, South Africa, January, 2019 (keynote/plenary speaker). "A systematist in Wonderland: Harnessing multi-omics data to understand patterns of plant biodiversity and the processes involved in its generation." College of William & Mary, Williamsburg, VA, March, 2019. "What does polyploidy do?" What do you plan to do during the next reporting period to accomplish the goals? Plans are in place to complete 4-6 more GBS libraries composed mostly of tetraploid individuals, including newly discovered material from China. To help determine progenitor taxa, we propose the additional sequencing described in the Changes/Problems section. We will take an iterative approach to our data analysis of the tetraploids, combining a "big picture" analysis that includes all taxa but fewer individuals with analyses that focus on individual triads (two diploid progenitors and their allopolyploid). Our additional work will focus on a triad that is more diverse than those previously studied. Preliminary analyses are completed; the additional analyses needed for publication will be undertaken.
Impacts What was accomplished under these goals?
We have taken an iterative approach to data analysis. With data from over 600 accessions, we have identified accessions that were misidentified in the USDA/GRIN and/or CSIRO germplasm collections, or that represent new taxa. Work in the past year has focused on diploid species of the G. tomentella complex. In many instances it was important to include tetraploid individuals in initial analyses in order to detect misidentified accessions. We used NeighborNet in the SplitsTree package to cluster accessions. This preliminary step helped identify new taxa. We then used ADMIXTURE and Bayesian phylogenetic analyses to confirm groupings and new taxa. Bayes Factor Delimitation favors the hypothesis of 13 species among diploid G. tomentella, four more than anticipated from previous work with markers having less resolution than our genome-wide GBS approach. Preliminary investigation of the tetraploid data sets indicates that, for four of as many as eight tetraploid species, the progenitor species will not be as easy to determine as once thought. The number of potential diploid progenitor taxa has increased, and their close relationships make it challenging to determine which diploid dataset to work with. We have proposed changes (see Changes/Problems) to facilitate accurate description of the genetic make-up of the tetraploids.
Publications
Progress 10/04/17 to 09/30/18
Outputs Target Audience: Researchers interested in wild relatives of soybean; biologists interested in species delimitation, polyploidy and legume phylogeny; anyone interested in the collection, identification, and maintenance of diversity in germplasm banks. Changes/Problems:
Nothing Reported
What opportunities for training and professional development has the project provided?
Nothing Reported
How have the results been disseminated to communities of interest? Presentations by Jeff Doyle: Reed College, Portland, OR, November 10, 2017. "Perspectives on the Prevalence, Pattern, and Process of Plant Polyploidy." University of Texas, Austin, March, 2018. "Polyploidy: Significance and Unanswered Questions." Michigan State University, March, 2018. "Polyploidy: Significance and Unanswered Questions." 7th International Legume Conference, Sendai, Japan, September, 2018 (symposium co-organizer and speaker, Root to tip legume phylogenomics: Building the Foundation for Next Generation Legume Systematics). "Genomics, transcriptomics, and more: The making of a model non-model legume system, perennial Glycine (Phaseoleae)." West Virginia University, Morgantown, WV, November, 2018. "Polyploidy: Significance and Unanswered Questions." What do you plan to do during the next reporting period to accomplish the goals?
Nothing Reported
Impacts What was accomplished under these goals?
The legume genus Glycine comprises two groups of species (subgenera): 1) the annual subgenus, Soja, consisting of two recognized annual species, the soybean (G. max) and its wild progenitor, G. soja, native to eastern Asia; and 2) the perennial subgenus, Glycine, most of whose species are native to Australia. For decades there were barely a half dozen recognized species in subgenus Glycine. Today there are 29 recognized species in the subgenus, but it has been known that one named species, G. tomentella, actually encompasses a polyploid complex involving a minimum of eight species that require delimitation and description. Additionally, it was known that G. tabacina, a name properly applied only to a tetraploid entity, also includes at least three still unnamed diploids. We are using a "next-generation" sequencing approach (genotyping by sequencing; GBS), which detects variation at many thousands of DNA sites across the entire genome, to sample more accessions than any previous project. This has revealed more species and variation than originally thought. For example, we have discovered more variation among accessions designated "G. tomentella" and are now considering it to harbor up to fourteen undescribed species. A second important contribution of this project will be to alert the USDA germplasm repository to misidentified accessions. So far, 39 of around 542 accessions have been found to carry incorrect species designations. At least a dozen accessions without designations have been placed in the appropriate species groups. Another dozen accessions have not been placed within a known species group. These accessions either represent new species or are awaiting analyses with other groups of taxa. The major emphasis in this portion of the NC 7 project is on the group's second goal, to use molecular methods to characterize the available wild perennial Glycine germplasm.
1) We have sequenced and mapped reads from eight GBS libraries. Reads and SNPs were checked and filtered for quality. SNPs were used in network analyses to confirm diploid and tetraploid accessions, and to confirm previous designations or test hypotheses of new affiliations.
2) We have over 20,000 filtered genome-wide SNPs from 542 accessions of perennial Glycine. These SNPs have been employed in network analyses at several different levels of comparison.
3) The most striking result to date is that what we had considered a single species, based on smaller sample sizes, shows levels of divergence suggesting that it should actually be recognized as two species. Conversely, as we investigate more accessions, a taxon that an older hypothesis divided into two species now appears more likely to be a single species. Our understanding of the depth of divergence between taxa and genome groups has improved. We are now investigating the possibility that there are a total of 14 unrecognized species within G. tomentella.
4) When analyses are completed, we will notify the Soybean Germplasm repository of accessions that have changed species affiliation.
The data that have been collected will be used to name and describe new species. We have found from other projects assessing the wild relatives of soybean that understanding species, and the number of accessions available for each species, is imperative for designing genetic experiments. Too few accessions within a species make tools like genome-wide association mapping unusable. As we develop species identifications and analyze collection patterns, we will be able to inform future collections in Australia.
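The report does not state which criteria were used to filter SNPs for quality. As a hedged illustration of what such a step can look like, the Python sketch below filters SNPs by call rate (fraction of accessions with a genotype) and minor allele frequency. The thresholds, the function names, and the diploid 0/1/2 genotype coding are all assumptions made for this example; tetraploid accessions would need a different coding.

```python
def snp_stats(genotypes):
    """Return (call_rate, minor_allele_frequency) for one SNP.

    genotypes: one entry per accession, giving the count of alternate
    alleles (0, 1, or 2, assuming diploid coding) or None if missing.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0, 0.0
    call_rate = len(called) / len(genotypes)
    alt_freq = sum(called) / (2 * len(called))
    return call_rate, min(alt_freq, 1.0 - alt_freq)


def filter_snps(matrix, min_call_rate=0.8, min_maf=0.05):
    """Return indices of SNPs (rows) that pass both thresholds.

    The default thresholds are illustrative assumptions, not project settings.
    """
    kept = []
    for index, row in enumerate(matrix):
        call_rate, maf = snp_stats(row)
        if call_rate >= min_call_rate and maf >= min_maf:
            kept.append(index)
    return kept


# Toy genotype matrix: rows are SNPs, columns are accessions (made-up values).
snps = [
    [0, 1, 2, 0, 1],           # fully called and polymorphic
    [0, 0, 0, 0, 0],           # monomorphic, fails the MAF threshold
    [1, None, None, None, 2],  # mostly missing, fails the call-rate threshold
    [2, 2, 1, None, 2],        # one missing call, still polymorphic
]

kept = filter_snps(snps)
print(kept)
```

In practice, filters of this kind are usually applied with dedicated variant-handling tools rather than hand-written code; the sketch only shows the logic of the two thresholds.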
Publications