Progress 10/01/20 to 09/30/21
Outputs Target Audience:The target audiences reached by the effort were: Plant biologists working in research to develop crops that require fewer inputs. Graduate students and postdoctoral associates studying the biology of nitrogen fixation in plants. Changes/Problems:
Nothing Reported
What opportunities for training and professional development has the project provided?
Nothing Reported
How have the results been disseminated to communities of interest?Due to COVID, our ability to disseminate knowledge to the community of interest has been limited. Despite this limitation, presentations were made in several venues, including: 1. Oral presentation at the Department of Energy BER Principal Investigator Meeting 2. Several poster presentations at the European Nitrogen Fixation Conference 3. Oral presentation to the Biology Department of the University of South Carolina What do you plan to do during the next reporting period to accomplish the goals?We have completed the identification of candidate gene regulators, based on the current accomplishments of the project. We are now testing these genes in multiple combinations.
Impacts What was accomplished under these goals?
Aim I. Phylogenomic Discovery Of The Genetic Toolkit Of Root Nodule Symbioses The goal of this aim is to uncover the genomic toolkit required for the evolution of root nodule symbioses via a phylogenetic framework, comparing species that possess and lack nodulation ability. The first aim is generating candidate genes that are functionally evaluated in the remainder of the project. a. Construction of a phylogenetic framework of the N-fixing clade We developed a robust phylogeny of the nitrogen-fixing clade to use as a comparative framework to meet the goals of the first aim. We used this framework to make an initial inference of the number and distribution of independent origins and losses of nitrogen-fixing symbioses and the phylogenetic location of the underlying predisposition, and to propose optimal sister-clade pairs for genome comparisons. Genome sequencing and assembly of these taxa are underway. We have also begun comparative sequence analysis of candidate functional genes that were sequenced via targeted sequencing for all specimens in the phylogeny to identify regions associated with nodulation status. b. Comparative analyses of nodulation and detection of the genetic nodulation toolkit Selection and prioritization of genes based on genome comparisons We proposed 21 nodulating/non-nodulating species comparisons based on our phylogeny (13 in legumes, eight in non-legumes) and began genome sequencing for ten of these comparisons. We have acquired tissue for all but one needed species and have completed long-read and short-read genome sequencing for seven; genome assembly for two of these is complete or underway. We are also performing comparisons between nodulators and non-nodulators using the functional genes sequenced in our targeted-sequencing experiment. Selection and prioritization of genes based on gene regulatory modules We continued the interpretation of the Medicago LCO time course RNA-seq and ATAC-seq data sets. 
This effort has centered on defining regulatory programs by integrating RNA-seq and ATAC-seq time course data with the Dynamic Regulatory Module Networks algorithm and the MTG-LASSO approach. The identified regulators of interest in this work included IBM1, and regulators involved in hormone responses associated with the early steps of symbiosis and nodule formation such as ethylene (ERF1, EDN1-3 and EIN3), ABA (ABI4-5), auxin (SHY2) and cytokinin (MtRRB15). Selection and prioritization of genes based on patterns of gene evolution (gene loss or gain) A previous gene orthology analysis of 39 plant species has been expanded to span a deep sampling of available transcriptome data from 93 plant species: 38 nodulating and 45 non-nodulating members of the nitrogen-fixing clade, and 10 outgroup species. The available assemblies and transcriptomes were drawn from a curated list of publicly available data for 107 prospective species, of which the 93 with protein sequence annotations were selected. In total we obtained 3.2 million protein sequences spanning ~100k parsed gene ortho-groups with member genes spanning at least two species. We further analyzed these ortho-groups by examining evolution at the DNA sequence and protein sequence level with HyPhy and validating gene orthology relationships based on synteny. We are currently identifying sequence-based evidence for variation across species that can provide insight into the evolutionary history and origin of nodulation. Selection and prioritization of genes based on patterns of regulatory sequence evolution We selected the genomes of 25 species capable of NFN symbiosis, including the reference M. truncatula, to identify conserved non-coding sequences (CNS) related to nodulation. A group of species outside of the nitrogen-fixing clade was also identified to be used as an outgroup. After pairwise alignment of each species genome to the reference M. 
truncatula, these alignments were merged using a phylogeny-guided approach and according to their classification (NFC or outgroup). We used PhastCons to estimate the conservation score across the species in the NFC and detected 114,173 CNS. Additional filtering based on genome location and orthology resulted in 3,091 CNS remaining for further analysis. Finally, we evaluated the relationship between CNS and chromatin accessibility of the Medicago genome during nodulation, in genes related to this developmental process. A total of 38 of these CNS were located in proximity to 19 genes previously described as being associated with nodule development, including MtCRE1, a histidine kinase cytokinin receptor. Selection and prioritization of genes based on patterns of single-cell root expression We characterized the early transcriptome reprogramming of individual Medicago root cells responding to rhizobia. We isolated nuclei from M. truncatula root infection zone segments in each of two replicated experiments, 24 hrs after spot inoculation with S. meliloti. We performed high-throughput, microfluidic-based single-nuclei RNA sequencing using the 10× Genomics Chromium technology. We captured 18,016 nuclei and obtained on average 68,270 reads per nucleus. We identified the expected cell populations contained in the M. truncatula roots. Pericycle cells are the first to respond mitotically to the presence of rhizobia. To reconstruct the pseudotemporal transcriptional program that gives rise to these differentiated pericycle cells, we performed a pseudotime analysis of the clusters, which identified regulators of auxin biosynthesis (STY) and members of the two-component cytokinin signaling system. Aim II. Verification of Molecular Mechanisms of Root Nodule Development We are verifying the function of targets discovered in the first aim for their effect on root nodule development in M. truncatula (nodulating) and Populus root organ cultures (non-nodulating). 
Below we describe the analysis of some of the most relevant candidate genes discovered in Aim I that resulted in altered nodulation phenotypes when evaluated in Medicago. Evaluation of putative regulatory sequences associated with nodulation Among the conserved non-coding sequences discovered in the previous aim, we detected five sequences in the promoter of MtCRE1. We engineered three versions of this promoter containing deletions of the distal two CNS (Δ2CNS), the proximal three CNS (Δ3CNS), or all five of these CNS (Δ5CNS). We generated composite Medicago truncatula plants in the cre1-1 background, developing transformed roots that expressed CRE1 under the wild-type promoter or one of the engineered promoters, or that did not express CRE1 (empty vector), and inoculated them with Sinorhizobium meliloti. Two weeks after inoculation, we observed a gradual decrease in the number of nodules with the deletions that was statistically significant between the wild-type and Δ5CNS roots, providing evidence that these CNS are required for nodule organogenesis. Evaluation of candidates detected based on patterns of single-cell root expression We uncovered putative regulators of the early differentiation of pericycle cells during nodule formation by isolating and sequencing the transcriptome of Medicago truncatula single nuclei derived from roots of rhizobia-inoculated plants. We identified a homologue of the Arabidopsis STY family proteins, and genes of the two-component system phosphorelay cascade involved in cytokinin signaling. RNAi knock-down of both genes results in a significant reduction in the number of nodules produced by Medicago after inoculation with rhizobia. Aim III. Engineering Nodulation Capability Into The Bioenergy Woody Crop Populus spp. There are several genes/constructs currently under evaluation in Populus, for their effect on the induction of nodule-like structures. 
These include constructs that express STY, HK1, CRE1 and other nodulation genes identified in our current work. Generation and analysis of transgenic plants are ongoing.
Publications
Progress 10/01/19 to 09/30/20
Outputs Target Audience:The target audiences reached by the effort were: Plant biologists working in research to develop crops that require fewer inputs. Graduate students and postdoctoral associates studying the biology of nitrogen fixation in plants. Changes/Problems:
Nothing Reported
What opportunities for training and professional development has the project provided?The project has provided training for four postdoctoral associates in molecular and cellular biology, and genomics. How have the results been disseminated to communities of interest?Due to COVID, our ability to disseminate knowledge to the community of interest has been limited. Nonetheless, we have made presentations at two scientific meetings (Plant and Animal Genome Conference, and Department of Energy Principal Investigator Meeting). What do you plan to do during the next reporting period to accomplish the goals?We further narrowed the range of candidate genes to be tested for introducing nitrogen fixation into poplar. We are now testing these genes in multiple combinations.
Impacts What was accomplished under these goals?
Completed library preparation, sequence-capture and sequencing for ~15,000 species of the Fabales, Fagales, Cucurbitales and Rosales, by far the largest phylogenomic data set ever assembled for a clade of plants. Assembly of the sequencing data and preliminary analysis based on ~10,000 species was used to generate a phylogeny using RAxML and FastTree. Completed de novo assembly of 129 sequenced functional genes involved in nodulation for all 15,000 species of the Fabales, Fagales, Cucurbitales and Rosales, and developed an updated database for the nodulation trait for all sequenced samples. This database has improved resolution at the genus level and includes a measure of uncertainty that describes whether the trait is confirmed (present or absent) or whether it is inferred and by what source. Implemented a novel method (Dynamic Regulatory Module Networks) to model temporal changes in gene expression and in the accessibility of their regulatory motifs. This analysis identified the genes that have the highest regulatory importance in nodulation, based on their weight in the model prediction. The method identified five master regulatory genes involved in hormone responses in symbiosis and nodule formation. Applied comparative genomics methods to detect candidate genes supported by phylogenies. In this analysis we used alternative approaches to detect those genes, while controlling for phylogenetic relatedness and for differences in evolutionary rates among species. The analysis identified several genes previously known to be involved in nodulation (e.g. LysM-type receptor kinase), as well as novel candidates. Developed a cytokinin sensor, expressed in the cortex of Medicago and Populus, for early detection of plants that are undergoing the developmental program that is expected in plants developing nodules. Because both cytokinin and auxin are involved in early nodule development, an auxin sensor is also currently under development. 
Developed methods for single-nuclei transcriptome (snRNA-seq) analysis applicable to a wide range of plants, for use in the snRNA-seq characterization of roots from species that differ in their response to rhizobia and nodule formation. We also optimized sequencing and assembly of plant genomes using a combination of short reads (Illumina) and long reads (Oxford Nanopore). Supported by the phylogenomic data, this optimized approach will be used to sequence the genomes of the most informative species in the Fabales, Fagales, Cucurbitales and Rosales clades. Evaluated the overexpression of PtNIN orthologs in Populus for the development of nodule-like structures, which resulted in an increase in lateral roots. Similarly, the overexpression of Ljsnf2 led to more lateral roots, as well as nodule-like structures. Sections through the nodule-like structures revealed a central vasculature reminiscent of actinorhizal nodules.
Publications
Progress 06/04/19 to 09/30/19
Outputs Target Audience:
Nothing Reported
Changes/Problems:
Nothing Reported
What opportunities for training and professional development has the project provided?This project has supported the training of several postdoctoral associates, research scientists and laboratory technicians working in the laboratories of the investigators: (1) Dr. Daniel Conde (Kirst Lab) is a postdoctoral researcher and overall project coordinator, currently supporting the research related to the generation of genome, transcriptome and chromatin accessibility data. In addition, Dr. Conde coordinates and supervises the generation of transgenic plants in the Kirst lab, and is responsible for the generation of gene constructs used in this process. (2) Dr. Thomas Irving (Ané Lab) is a postdoctoral researcher, developing the research associated with activating the Common Symbiosis Pathway in response to rhizobia in Populus sp. In the past year, Dr. Irving has undergone training in confocal microscopy. (3) Dr. Heather Rose Kates (Soltis Lab) is a postdoctoral researcher, who led the development of a high-throughput extraction process and is the liaison with our service provider. She currently also coordinates the activities related to the analysis of the sequencing data generated for the phylogenomic effort. (4) Dr. Sara Knaack (Roy Lab) is a postdoctoral researcher, executing analysis related to the detection of orthologous genes in nodulating and non-nodulating species, as well as the transcriptome data and chromatin accessibility data. Dr. Knaack has also led the implementation of the comparative genomics, PhastCons analysis, and the development of dynamic regulatory networks that are being applied to the discovery of key nodulation genes. (5) Dr. Lucas Maia (Ané Lab) is a postdoctoral researcher engineering root nodule organogenesis in Populus sp. In the past year, Dr. Maia has undergone training in confocal microscopy. In addition, the project has also provided career development opportunities for a research scientist (Dr. Ryan Folk), a programmer (Dr. 
Raphael Lafrance) and laboratory technicians (Ms. Bella Ruben and Mr. Hank Schmidt). Over 10 undergraduate students have also been trained in molecular techniques at the PI and co-PIs' laboratories. The project also provided training opportunities for visiting scholars. Three graduate students from the lab of Prof. Tingshuang Yi (Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, China), Ms. Shuiyin Liu, Ms. Jiajin Wu, and Ms. Qin Tian, have also visited the Soltis/FLMNH laboratory as part of our collaborative phylogenomic project on Rosales and have learned the lab and computational steps integral to this project and collaboration (their samples of Rosales will be included in our project). Another visiting scholar (Mr. Wendell Pereira) also contributed extensively to the implementation of the CNS pipeline, and will join the Kirst laboratory in the Fall of 2019 as a postdoctoral researcher in the project. How have the results been disseminated to communities of interest?Project progress updates continue to be provided by a Twitter feed (@Nit_Fix), and overall project aims are described in our project webpage (www.nitfix.org). In addition, results have been disseminated in the following events: Kates, Heather, Folk, Ryan, Conde, Daniel, Ruben, Bella, Dervinis, Chris, LaFrance, Rafe, Kirst, Matias, Guralnick, Rob, Soltis, Douglas, Soltis, Pamela S. "Rapid workflows from specimens to sequences: Global-scale phylogenomics from collections." Revolutionizing systematics: Herbaria in the Genomics Age, Botany Conference. July 2018. Rochester, MN. Colloquium Presentation. Kates, Heather, Folk, Ryan, Conde, Daniel, Maia, Lucas, Knaack, Sarah, Irving, Thomas, Dervinis, Chris, LaFrance, Rafe, Balmant, Kelly, Roy, Sushmita, Ane, Jean-Michel, Kirst, Matias, Guralnick, Rob, Soltis, Douglas, Soltis, Pamela S. "Global Scale Phylogenomics of the Nitrogen Fixing Clade". 
Root Nodule Symbiosis: Genetics, Evolution, and Engineering for Future Crops, Plant and Animal Genome Conference. January 2019. San Diego, CA. Workshop Presentation. Kates, Heather. "The More you Look the More you See: Generating New Hypotheses in Plant Evolutionary Genetics using Big Data and New Methods". Oberlin College. April 2019. Departmental Seminar. Knaack et al. Deciphering the regulatory network controlling nitrogen fixation in plants. Poster presented at the Great Lakes Bioinformatics Conference. May 2019. Madison, WI. Poster presentation. Kirst, Matias. " Engineering a novel nitrogen-fixing symbiosis into Poplar". Root Nodule Symbiosis: Genetics, Evolution, and Engineering for Future Crops, Plant and Animal Genome Conference. January 2019. San Diego, CA. Workshop Presentation. Pereira, Wendell; Folk, Ryan; Kirst, Matias. Identification of conserved non-coding sequences in nitrogen fixing plants. Plant and Animal Genome Conference. January 2019. San Diego, CA. Poster Presentation. Finally, four posters were presented in the DOE 2019 Genomic Sciences Program Annual Principal Investigator (PI) Meeting by the PI Kirst, co-PI Folk, project coordinator Conde and postdoctoral associate Maia. What do you plan to do during the next reporting period to accomplish the goals?(1) Phylogenomic discovery of the genetic toolkit of root nodule symbioses The complete data necessary for the phylogenomic analysis is being generated in the last few months of year 2 and will be fully available at the start of year 3. Completion of the analysis of this data will be a major focus of the third year of the project, as it is one of the approaches that will be used for identification of candidate nodulation genes to be tested in Medicago and Populus. We will expand the transcriptome analysis of nodulating and non-nodulating species, to other species beyond Medicago and Populus, focusing initially on those species for which we have already obtained the necessary germplasm. 
The comparative analysis of transcriptomes, between these nodulating and non-nodulating species, and relative to Medicago and Populus, will be completed in the third year of the project. Finally, we have started developing a database that integrates phylogenomic, comparative genomic and transcriptomic data to streamline the discovery of candidate genes for further analysis. While the initial framework of the database is expected to be completed in the last quarter of year 2, we expect that it will be continuously improved in the third year. (2) Verification of molecular mechanisms of root nodule development The second year of the project involved the testing for gain and loss of function of candidate genes for which there was evidence of a role in nodulation, as well as evidence from phylogenetic and comparative genomic/transcriptomic data. In the third year of the project we are continuing the selection and evaluation of genes or regulatory elements uncovered in the first aim, but now expanding to genetic elements for which there may not be previous evidence for a role in nodulation. In the third year, we expect to evaluate 40-50 constructs containing individual genetic targets or combinations of targets. (3) Engineering nodulation capability into Populus With the identification of the first promising candidate genes/constructs in year 2 (LtHK1 under both the 35S promoter and a cortex-specific promoter), part of our effort is now being redirected toward generating whole plants containing these constructs, determining whether the nodule structures observed in hairy roots are analogous to legume nodules, and testing whether these nodules are colonized by N-fixing bacteria. Additional candidates confirmed to have a role in nodulation in Aim II will be evaluated in whole plants here.
Impacts What was accomplished under these goals?
(1) PHYLOGENOMIC DISCOVERY OF THE GENETIC TOOLKIT OF ROOT NODULE SYMBIOSES Phylogenomic - sampling, DNA extraction, library preparation and data analysis We successfully completed processing for all 15,000 species samples and submitted all DNA samples to our service provider for target capture and DNA sequencing. Library preparation and sequencing are complete for ~4,900 species, which represents 41% of our total sequencing goal. We designed a computational method for strategic selection of DNA samples for sequencing that simultaneously integrates DNA quality guidelines and taxonomic sampling goals. We will use it to finalize selections from among our unsequenced DNA samples and to complete library preparation and sequencing before the end of year 2. In the second year of the project we carried out computational analysis to (a) identify conserved regions that are specific to N-fixing species relative to an outgroup of non-N-fixing species, and (b) detect presence and absence of key nodulation genes in species of the N-fixing clade and an outgroup. Genome sequences from 34 species (25 N-fixing and 9 outgroup) were organized into multiple alignments relative to the Medicago v5 genome assembly. From this multiple alignment data for two species sets (N-fixing and outgroup species), two analysis pipelines were applied in parallel to identify conserved regulatory elements: one based on the PHAST pipeline (specifically the phastCons algorithm) and one based on the CNS pipeline. Conserved regions called by phastCons were filtered by removing 1) regions <5 bp in size, and 2) regions identified in the N-fixing species that overlap with conserved regions detected in the outgroup. Future efforts will focus on the functional interpretation of such regions in the context of regulator binding sites, specificity to N-fixing species, and evolution. 
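The two phastCons filtering rules described above (drop regions shorter than 5 bp, and drop N-fixing regions that overlap a conserved region in the outgroup) can be illustrated with a minimal Python sketch. This is not the project's pipeline: the interval representation (0-based, half-open coordinates), the function names, and the example coordinates are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: (chromosome, start, end), 0-based, half-open.
Interval = Tuple[str, int, int]

MIN_LENGTH_BP = 5  # size threshold stated in the report


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two intervals share at least one base on the same chromosome."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def filter_conserved_regions(
    nfix_regions: List[Interval],
    outgroup_regions: List[Interval],
    min_length: int = MIN_LENGTH_BP,
) -> List[Interval]:
    """Keep N-fixing conserved regions that pass both filters."""
    kept = []
    for region in nfix_regions:
        _, start, end = region
        if end - start < min_length:
            continue  # filter 1: region is shorter than the minimum size
        if any(overlaps(region, other) for other in outgroup_regions):
            continue  # filter 2: region is also conserved in the outgroup
        kept.append(region)
    return kept


# Made-up coordinates, for illustration only.
nfix = [("chr1", 100, 103), ("chr1", 200, 260), ("chr2", 50, 90)]
outgroup = [("chr1", 250, 300)]
specific = filter_conserved_regions(nfix, outgroup)
print(specific)
```

The pairwise overlap check is quadratic in the number of regions; at genome scale a sorted sweep or an interval tree would be the more usual choice.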
In the second year of the project we also completed sequence processing, assembly, and preliminary analysis on 4,900 species analyzed by sequence-capture, representing 41% of our phylogenomic goal. We have put in place an easily and rapidly deployable pipeline to complete quality control and assembly of phylogenetic and functional loci for thousands of samples in parallel on the University of Florida supercomputer. We set up programs for multiple sequence alignment, phylogenetic analysis, and testing for correlation between detection and non-detection of functional genes with nodulation state that are easily re-run when the dataset is updated with sequence data for additional samples. Finally, we aligned 100 phylogenetic loci for all sequenced samples using a back-translation method in TranslatorX and MAFFT v. 7, optimized for deep divergences; the alignments are continually updated when new data are available. We built phylogenetic trees using FastTree for an additional quality control check and used RAxML-NG to build a phylogenetic tree at two points in data delivery (1,900 species and 4,900 species). To look at the relationship between nodulation state and detection or non-detection of the 129 functional genes sequenced, we used Pagel's test for evolutionary correlation and visualized these patterns alongside the phylogeny in the R package phytools. Transcriptomics - sampling, RNA extraction, library preparation and data analysis The project transcriptome component involves the comparison of profiles among nodulating and non-nodulating species, to uncover their differences and define potential candidate genes for further analysis in the second aim. We have obtained germplasm for these experiments, including the species Medicago truncatula (nodulating) and Populus tremula x alba (non-nodulating). Additional species will be selected upon completion of the phylogenomic data analysis. 
We completed the processing of RNA samples from Medicago truncatula (nodulating) and Populus tremula x alba (non-nodulating) subjected to treatment with nodulation (Nod) factors, and collected over multiple time points following treatment. The data were normalized and processed with SLEUTH for further analysis, quantile-normalized and log-transformed. Principal components analysis indicates clear separation of the replicates by time point and shows a strong temporal component, with the first two components explaining ~42% of the variation, providing a strong foundation for further analysis. We clustered the genes based on their log-transformed mean-subtracted expression profiles and identified several gene groups exhibiting dynamic patterns of expression. We have developed a novel algorithm, Dynamic Regulatory Module Networks (DRMN), to systematically integrate RNA-seq and ATAC-seq (described below) time course profiles to identify key cis-regulatory elements and transcription factors associated with genes that exhibit dynamic behavior. We have applied this algorithm to the RNA-seq and ATAC-seq time courses measuring transcriptional and accessibility profiles after treatment with Nod factors. The inputs to this algorithm are the RNA-seq time series data, the number of expression modules, and upstream regulatory features for each time point that can be derived from the ATAC-seq time course. The algorithm outputs a set of gene expression modules (states) for each time point and their associated regulatory programs comprising the cis-regulatory elements that best predict the expression of genes in a particular module. These are candidates to be evaluated for their role in nodulation.
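The expression preprocessing mentioned above (quantile normalization, log transformation, and per-gene mean subtraction before clustering) can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the order of the steps, the log2(x + 1) pseudocount, the arbitrary handling of tied values, and the toy matrix are not taken from the project's analysis.

```python
import math
from typing import List

Matrix = List[List[float]]  # rows = genes, columns = samples (time points)


def quantile_normalize(matrix: Matrix) -> Matrix:
    """Give every sample the same distribution: the mean value at each rank."""
    n_genes, n_samples = len(matrix), len(matrix[0])
    sorted_cols = [sorted(row[j] for row in matrix) for j in range(n_samples)]
    rank_means = [
        sum(col[i] for col in sorted_cols) / n_samples for i in range(n_genes)
    ]
    result = [[0.0] * n_samples for _ in range(n_genes)]
    for j in range(n_samples):
        # Ties are broken by gene order here, a simplification.
        order = sorted(range(n_genes), key=lambda i: matrix[i][j])
        for rank, gene in enumerate(order):
            result[gene][j] = rank_means[rank]
    return result


def log_transform(matrix: Matrix) -> Matrix:
    """log2(x + 1); the pseudocount of 1 is an assumption."""
    return [[math.log2(x + 1.0) for x in row] for row in matrix]


def mean_center(matrix: Matrix) -> Matrix:
    """Subtract each gene's mean so profiles describe change over time."""
    centered = []
    for row in matrix:
        mean = sum(row) / len(row)
        centered.append([x - mean for x in row])
    return centered


# Toy expression values for three genes and two samples (made up).
raw = [[5.0, 4.0], [2.0, 1.0], [3.0, 6.0]]
normalized = quantile_normalize(raw)
profiles = mean_center(log_transform(normalized))
print(normalized)
```

Mean-centered profiles like these emphasize the shape of each gene's response over time rather than its absolute expression level, which is why they are a common input for clustering.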
(2) VERIFICATION OF MOLECULAR MECHANISMS OF ROOT NODULE DEVELOPMENT The first year of the project focused on evaluating genes known to be involved in nodulation (NIN and LHK genes) and hypothesized to trigger nodulation in species outside of the N-fixing clade. In the second year of the project, this strategy expanded to other genes that follow a phylogenetic pattern that suggests an association with species that nodulate in the N-fixing clade, and that have previously been identified as involved in nodule organogenesis or lipochitooligosaccharide (a.k.a. Nod factor) detection. Our research strategy involves evaluation of candidate genes for their role in nodulation in Medicago (when transposon insertion lines are available), and in Populus using the hairy-root transformation system. Verification of candidate genes in Medicago We obtained the 14 orthogroups annotated by Aim 1 as showing the most significant association (excepting the orthogroups containing NIN & RPG) with the same phylogenetic pattern as NIN/RPG (i.e. present in nodulating FaFaCuRo and in outgroup dicot species, but not present in non-nodulating FaFaCuRo). We obtained transposon insertion lines in Medicago truncatula (from the Noble Research Institute) for three genes within these orthogroups. These lines were crossed and homozygous knockout lines for each of the three genes were obtained, and we are now working to assess the nodulation phenotype of these lines. Verification of candidate genes in Populus Constructs currently being tested in poplar hairy roots, for their effect on nodule development, include the expression of CCaMK, snf-2+NOOT1, WUSCHEL+IPT3, NOOT1+WUSCHEL under a cortex-specific promoter, as well as NFP under an epidermis-specific promoter. Additionally, constructs for NIN and Lotus japonicus cytokinin receptor (LjHK) have also been tested (see below). 
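The NIN/RPG-like phylogenetic pattern described above (present in nodulating FaFaCuRo species and in outgroup species, absent from non-nodulating FaFaCuRo species) can be expressed as a simple presence/absence filter. The sketch below is a strict boolean simplification: the report ranks orthogroups by statistical association, which this code does not attempt, and all species and orthogroup names are hypothetical placeholders.

```python
from typing import Dict, List, Set


def follows_nin_like_pattern(
    present_in: Set[str],
    nodulating: Set[str],
    non_nodulating: Set[str],
    outgroup: Set[str],
) -> bool:
    """Strict version of the pattern: found in at least one nodulating species
    and one outgroup species, and in no non-nodulating species."""
    return (
        bool(present_in & nodulating)
        and bool(present_in & outgroup)
        and not (present_in & non_nodulating)
    )


def select_orthogroups(
    orthogroups: Dict[str, Set[str]],
    nodulating: Set[str],
    non_nodulating: Set[str],
    outgroup: Set[str],
) -> List[str]:
    """Return the names of orthogroups that match the pattern, sorted."""
    return sorted(
        name
        for name, species in orthogroups.items()
        if follows_nin_like_pattern(species, nodulating, non_nodulating, outgroup)
    )


# Placeholder species sets and orthogroups (not real data).
NODULATING = {"nod_sp1", "nod_sp2"}
NON_NODULATING = {"nonnod_sp1"}
OUTGROUP = {"out_sp1"}
ORTHOGROUPS = {
    "OG_a": {"nod_sp1", "out_sp1"},                # matches the pattern
    "OG_b": {"nod_sp1", "nonnod_sp1", "out_sp1"},  # kept in a non-nodulator
    "OG_c": {"nod_sp2"},                           # absent from the outgroup
}
candidates = select_orthogroups(ORTHOGROUPS, NODULATING, NON_NODULATING, OUTGROUP)
print(candidates)
```

A strict filter like this is sensitive to annotation gaps, since a single missed gene model in one species changes the outcome. That sensitivity is one reason an association-based ranking that tolerates some exceptions can be preferable.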
(3) ENGINEERING NODULATION CAPABILITY INTO POPULUS We have successfully systematized the Agrobacterium rhizogenes-mediated transformation method into Populus tremula × P. alba (clone INRA 717-1-B4) with the regeneration of whole transgenic plants in our lab.
Publications