Source: UNIVERSITY OF FLORIDA submitted to NRP
(MC) IMPROVING NITROGEN-FIXATION OF FOREST TREE SPECIES
Sponsoring Institution
National Institute of Food and Agriculture
Project Status
COMPLETE
Funding Source
Reporting Frequency
Annual
Accession No.
1019662
Grant No.
(N/A)
Cumulative Award Amt.
(N/A)
Proposal No.
(N/A)
Multistate No.
(N/A)
Project Start Date
Jun 4, 2019
Project End Date
May 7, 2024
Grant Year
(N/A)
Program Code
[(N/A)]- (N/A)
Recipient Organization
UNIVERSITY OF FLORIDA
G022 MCCARTY HALL
GAINESVILLE,FL 32611
Performing Department
Forest Resources and Conservation
Non Technical Summary
Nitrogen is a critical nutrient that plants need for normal growth and development. Although nitrogen is a naturally abundant gas component of the atmosphere, plants cannot absorb it directly from the air. Therefore, to achieve high productivity, bioenergy and agricultural crops require intensive fertilization, which is a costly process that damages the environment and is potentially hazardous to humans. Some plants have the capacity to obtain nitrogen through a mutualistic relationship with microbes that convert nitrogen gas into ammonia to be used by the host plant. The genetic mechanism that allows plants to establish this mutualism is absent from most crops and is currently unknown. This project will discover the genes underlying the mutualistic symbiotic relationship between plants and nitrogen-fixing bacteria and introduce this genetic machinery into bioenergy crops. Advanced comparative genomic technologies will be applied to plant families that possess and lack the capacity to interact with nitrogen-fixing microbes in order to identify candidate genes responsible for that interaction. Selected genes will be introduced into the woody perennial crop poplar to engineer nitrogen-fixing capability in this bioenergy crop and minimize artificial fertilization requirements and the consequent environmental impact.
Animal Health Component
40%
Research Effort Categories
Basic
20%
Applied
40%
Developmental
40%
Classification

Knowledge Area (KA) | Subject of Investigation (SOI) | Field of Science (FOS) | Percent
123 | 0699 | 1050 | 100%
Goals / Objectives
Nitrogen (N) availability is critical for high biomass productivity, particularly in marginal lands, yet its application is costly, environmentally damaging, and potentially hazardous to human health. Nitrogen is the most common chemical component of Earth's atmosphere and the mineral nutrient required in the greatest amount by plants because of its role as the primary building block of DNA, RNA, and amino acids. Despite its abundance and critical importance for growth and development, plants cannot access N2 from the atmosphere directly. Instead, plants must absorb available N in the soil as nitrate, ammonium, or amino acids. Intensive fertilization with reactive forms of N is used to compensate for its low availability in forestry and agricultural lands. Over 118 million metric tons of N are used annually, produced from natural gas by the Haber-Bosch process, which releases ~3% of all global carbon emissions and represents up to 50% of agriculture's operational costs. The dependence of fertilizer production on a fossil fuel is worrying for the long-term sustainability of modern forestry and agriculture. Additionally, of the N applied to forestry and agricultural lands, 50-75% is not captured by plants and is instead leached into waterways or released to the atmosphere as N gases. Leached N increases environmental degradation and leads to indirect adverse effects after being naturally converted to different chemical forms, in addition to negatively impacting human health (Gutierrez, 2012). Clearly, more efficient and cost-effective approaches are needed to enable forestry crops to acquire the N necessary to maximize growth while minimizing inputs and environmental impact. Two decades ago, it was discovered that all flowering plant lineages known to undergo root nodule N-fixation with bacterial symbionts occur within a single clade of angiosperms, with a single underlying origin. 
Thus, the improvement of the ability of tree species to acquire N may be achievable by introducing the genetic mechanisms that allow some plant species to obtain N through a mutualistic relationship with bacteria and archaea. Phylogenetic approaches and genomic resources allow the discovery of gene gains or losses linked to the evolution of N-fixing symbioses by comparing close relatives that have maintained or lost this ability (Delaux et al., 2015). Using this approach, we uncovered genomic novelties that were required for the evolution of root nodule symbioses through a series of experiments that compared closely related species that nodulate or do not nodulate. This outcome, and previous work on the characterization of the genes involved in nodulation in N-fixing species, identifies candidate genes to be tested in this project.
Goals:
1. Verification of molecular mechanisms of root nodule development in a forest species
Genetic introduction of the molecular mechanisms that regulate root nodule formation in non-N-fixing forest species may result in the development of these specialized organs in these species. The first objective of this project is to verify the function of genes previously discovered to be associated with N-fixation, using poplar root organ cultures to screen them for a potential role.
2. Introducing nodulation capability into forest species of the genus Populus
In the first objective, the genetic mechanisms leading to the development of nodule structures to support N-fixation will be identified. The last objective of this proposal is to engineer nodule development in entire poplar plants and test the impact of these structures on N-fixation. The ability of these new organs on poplar roots to support N-fixation will be evaluated with a wide range of N-fixing bacteria.
Project Methods
The following methods will be applied in the execution of the two main objectives of this research: Objective 1. Verification of molecular mechanisms of root nodule development in a forest species The first objective is to test the contribution of genetic components related to N-fixation in nodulating species, in root development of non-nodulating species of the genus Populus. Genetic introduction of the molecular mechanisms that regulate root nodule development is expected to result in the activation of the nodule developmental pathway. Among the various genes to be tested, we will evaluate histidine kinase cytokinin receptors (LHK/CRE1) and Nodule INception (NIN) or NIN-like genes (NLPs). We will use GoldenGate technology to assemble multi-gene constructs and express candidate genes. Constructs will be transformed into poplar roots using Agrobacterium rhizogenes-mediated transformation. The transgenic roots can be sub-cultured to give rise to root organ cultures. These transgenic roots and root organ cultures develop lateral roots normally. Therefore, we anticipate being able to detect without any difficulty the presence of nodule-like structures. When such structures are observed, thin sections will be used to test the presence of a vasculature system, a meristem and more generally to describe the developmental process in detail as described previously. Objective 2. Introducing nodulation capability into forest species of the genus Populus While the assessment of gain-of-function in poplar root culture (first aim) will identify genes that contribute to the development of nodule-like structures, it will not provide a rigorous quantification of their effect in the whole-plant phenotype in the presence of N-fixing bacteria, including N assimilation, plant development, and biomass productivity and quality. The objective of this aim is to address this limitation. 
More specifically, we will engineer nodule development genes selected in the previous aim into poplar, evaluate the development of nodules in poplar roots, inoculate with N-fixers, and test their impact on N-fixation and whole-plant properties. The ability of nodule-like structures to support N-fixation will be evaluated with a wide range of bacteria. To engineer nodule development and test the impact of these structures on N fixation and whole-plant properties in poplar, putative nodulating genes will be introduced in the Populus tremula × P. alba genotype INRA 717-1-B4 and phenotyped. Genes will be introduced into poplar using two alternative strategies: (a) A. rhizogenes-mediated transformation, extending the use of transgenic lines derived from the previous aim, or (b) classical transformation using A. tumefaciens. Transgenic poplar plants will be initially grown in a greenhouse, to confirm root nodule phenotypes observed in the first aim. At regular (bi-weekly) intervals after inoculation, we will visually inspect roots for root hair deformations, infection thread formation, and cortical cell divisions. If nodules are detected, they will be analyzed in detail for morphology and expression of standard nodulin genes (ENOD11, ERN1, ENOD40 and NIN), using quantitative RT-PCR. Quantification of N fixation - We will quantify N fixation in stable transgenic lines that present altered root phenotypes that indicate the presence of nodule-like structures. Several alternative methods are currently used to quantify N fixation in plants, all of which have advantages and limitations. The simplest technical approach evaluates nitrogenase activity by acetylene reduction assays (ARA). While ARA provides a simple and rapid screen for N fixation by symbiotic bacteria, it does not quantify the assimilation of the N in the host plant. Methods such as 15N gas enrichment and 15N dilution address this limitation but are more technically difficult and of lower throughput. 
Stable lines expressing selected candidates will be initially screened for nitrogenase activity by ARA at the University of Florida. Lines that display significant nitrogenase activity will be evaluated using the 15N dilution and 15N gas enrichment techniques, providing the link between N fixation and the assimilation of fixed N by the plant. Quantification of biomass under N-fixation conditions - To quantify biomass growth and properties in transgenic plants, as well as whole-plant developmental properties, biological replicates of each transgenic line will be grown in a greenhouse and inoculated with N-fixing bacteria. Plants will be grown for at least eight weeks, in ebb-and-flow benches, alongside negative controls (wild-type and empty vector-transformed INRA 717-1-B4). After eight weeks of growth in the greenhouse, leaf, stem, and roots will be collected and stored in a drying room for four weeks. Biomass of root, stem, and leaves will be determined separately for each plant, and ratios of biomass allocated to leaves, stems, and roots will be estimated individually. Root architecture phenotyping--before drying the root component, they will be scanned on flatbed scanners for root architecture analysis. Image analysis software developed for the analysis of root systems (WinRhizo®, Regent Instruments, Quebec City, Canada) will be used to derive architectural traits from the digitized image of each root system. Biomass composition (lignin analysis)--The dried stems will be pulverized with a Wiley mill to a fine powder for measurement of cell wall chemical compositions. Pyrolysis molecular beam mass spectrometry (pyMBMS) is well suited for rapid quantification of lignin, and analytical methods have been developed and applied previously by us to the analysis of poplar. 
A spectrum ranging from 30 to 450 mass-to-charge (m/z) ratio will be generated for each sample analyzed (two technical replicates for each of six biological replicates) as a service by the Complex Carbohydrate Research Center (CCRC, University of Georgia). Peaks associated with lignin content and monomer composition will be summed as we described previously, dividing the syringyl (S) lignin peak sum by the guaiacyl (G) lignin peak sum to obtain an estimate of lignin S:G ratio for each genotype.
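The S:G calculation described above (sum the syringyl peaks, sum the guaiacyl peaks, divide) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's analysis code: the function names are hypothetical, the m/z values in `S_PEAKS` and `G_PEAKS` are placeholders rather than the peak assignments used by the investigators, and averaging the per-spectrum ratios across replicates is an assumed aggregation that the report does not specify.

```python
from typing import Iterable, Mapping, Sequence

# Placeholder m/z assignments for illustration only; substitute the
# peak list actually used for syringyl (S) and guaiacyl (G) lignin.
S_PEAKS = (154, 168)
G_PEAKS = (124, 138)


def sg_ratio(spectrum: Mapping[int, float],
             s_peaks: Iterable[int] = S_PEAKS,
             g_peaks: Iterable[int] = G_PEAKS) -> float:
    """Return the S peak sum divided by the G peak sum for one spectrum.

    `spectrum` maps an integer m/z value to its (normalized) intensity.
    Peaks missing from the spectrum are treated as zero intensity.
    """
    s_sum = sum(spectrum.get(mz, 0.0) for mz in s_peaks)
    g_sum = sum(spectrum.get(mz, 0.0) for mz in g_peaks)
    if g_sum == 0:
        raise ValueError("guaiacyl peak sum is zero; S:G ratio is undefined")
    return s_sum / g_sum


def mean_sg_ratio(spectra: Sequence[Mapping[int, float]]) -> float:
    """Average the per-spectrum S:G ratios of one genotype (assumed aggregation)."""
    if not spectra:
        raise ValueError("no spectra supplied")
    return sum(sg_ratio(s) for s in spectra) / len(spectra)


# Toy example with made-up intensities for two replicate spectra.
replicate_a = {124: 2.0, 138: 2.0, 154: 3.0, 168: 3.0}
replicate_b = {124: 1.0, 138: 1.0, 154: 1.0, 168: 1.0}
genotype_ratio = mean_sg_ratio([replicate_a, replicate_b])
```

With the toy intensities above, the first spectrum gives 6.0 / 4.0 and the second 2.0 / 2.0, so the genotype estimate is the mean of those two ratios.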

Progress 10/01/20 to 09/30/21

Outputs
Target Audience: The target audiences reached by the effort were plant biologists working in research to develop crops that require fewer inputs, and graduate students and postdoctoral associates studying the biology of nitrogen fixation in plants.
Changes/Problems: Nothing Reported
What opportunities for training and professional development has the project provided? Nothing Reported
How have the results been disseminated to communities of interest? Due to COVID, our ability to disseminate knowledge to the community of interest has been limited. Despite this limitation, presentations were made in several venues, including:
1. Oral presentation at the Department of Energy BER Principal Investigator Meeting
2. Several poster presentations at the European Nitrogen Fixation Conference
3. Oral presentation to the Biology Department of the University of South Carolina
What do you plan to do during the next reporting period to accomplish the goals? We have completed the identification of candidate gene regulators, based on the current accomplishments of the project. We are now testing these genes in multiple combinations.

Impacts
What was accomplished under these goals?
Aim I. Phylogenomic Discovery Of The Genetic Toolkit Of Root Nodule Symbioses
The goal of this aim is to uncover the genomic toolkit required for the evolution of root nodule symbioses via a phylogenetic framework, comparing species that possess and lack nodulation ability. The first aim is generating candidate genes that are functionally evaluated in the remainder of the project.
a. Construction of a phylogenetic framework of the N-fixing clade
We developed a robust phylogeny of the nitrogen-fixing clade to use as a comparative framework to meet the goals of the first aim. We used this framework to make an initial inference of the number and distribution of independent origins and losses of nitrogen-fixing symbioses and the phylogenetic location of the underlying predisposition, and to propose optimal sister clade comparisons for genome comparisons. Genome sequencing and assembly of these taxa are underway. We have also begun comparative sequence analysis of candidate functional genes that were sequenced via targeted sequencing for all specimens in the phylogeny to identify regions associated with nodulation status.
b. Comparative analyses of nodulation and detection of the genetic nodulation toolkit
Selection and prioritization of genes based on genome comparisons
We proposed 21 comparisons between nodulating and non-nodulating species based on our phylogeny (13 in legumes, eight in non-legumes) and began genome sequencing for ten of these comparisons. We have acquired tissue for all but one needed species and have completed long-read and short-read genome sequencing for seven; genome assembly for two of these is complete or underway. We are also performing comparisons between nodulators and non-nodulators using the functional genes sequenced in our targeted-sequencing experiment. 
Selection and prioritization of genes based on gene regulatory modules
We continued the interpretation of the Medicago LCO time course RNA-seq and ATAC-seq data sets. This effort has centered on defining regulatory programs by integrating RNA-seq and ATAC-seq time course data with the Dynamic Regulatory Module Networks algorithm and the MTG-LASSO approach. The identified regulators of interest in this work included IBM1, and regulators involved in hormone responses associated with the early steps of symbiosis and nodule formation such as ethylene (ERF1, EDN1-3 and EIN3), ABA (ABI4-5), auxin (SHY2) and cytokinin (MtRRB15).
Selection and prioritization of genes based on patterns of gene evolution (gene loss or gain)
A previous gene orthology analysis of 39 plant species has been expanded to span a deep sampling of available transcriptome data from 93 plant species: 38 nodulating and 45 non-nodulating members of the nitrogen-fixation clade, and 10 outgroup species. The available assemblies and transcriptomes were selected from a curated list of publicly available data for 107 prospective species, of which the 93 with protein sequence annotations were retained. In total we obtained 3.2 million protein sequences spanning ~100k parsed gene ortho-groups with member genes spanning at least two species. We further analyzed these ortho-groups by examining evolution at the DNA sequence and protein sequence level with HyPhy and validating gene orthology relationships based on synteny. We are currently identifying sequence-based evidence for variation across species that can provide insight into the evolutionary history and origin of nodulation.
Selection and prioritization of genes based on patterns of regulatory sequence evolution
We selected the genomes of 25 species capable of NFN symbiosis, including the reference M. truncatula, to identify conserved non-coding sequences (CNS) related to nodulation. 
A group of species outside of the nitrogen-fixation clade was also identified to be used as an outgroup. After pairwise alignment of each species genome to the reference M. truncatula, these alignments were merged using a phylogeny-guided approach and according to their classification (NFC or outgroup). We used PhastCons to estimate the conservation score across the species in the NFC and detected 114,173 CNS. Additional filtering based on genome location and orthology resulted in 3,091 CNS remaining for further analysis. Finally, we evaluated the relationship between CNS and chromatin accessibility of the Medicago genome during nodulation, in genes related to this developmental process. A total of 38 of these CNS were located in proximity to 19 genes previously described as being associated with nodule development, including MtCRE1, a histidine kinase cytokinin receptor.
Selection and prioritization of genes based on patterns of single-cell root expression
We characterized the early transcriptome reprogramming of individual Medicago root cells responding to rhizobia. We isolated nuclei from M. truncatula root infection zone segments in each of two replicated experiments, 24 hrs after spot inoculation with S. meliloti. We performed high-throughput, microfluidic-based single-nuclei RNA sequencing using the 10× Genomics Chromium technology. We captured 18,016 nuclei and obtained on average 68,270 reads per nucleus. We identified the expected cell populations contained in the M. truncatula roots. Pericycle cells are the first to respond mitotically to the presence of rhizobia. To reconstruct the pseudotemporal transcriptional program that gives rise to these differentiated pericycle cells, we performed a pseudotime analysis of the clusters, which identified regulators of auxin biosynthesis (STY) and members of the two-component cytokinin signaling system.
Aim II. 
Verification of Molecular Mechanisms of Root Nodule Development
We are verifying the function of targets discovered in the first aim for their effect on root nodule development in M. truncatula (nodulating) and Populus root organ cultures (non-nodulating). Below we describe the analysis of some of the most relevant candidate genes discovered in Aim I that resulted in altered nodulation phenotypes when evaluated in Medicago.
Evaluation of putative regulatory sequences associated with nodulation
Among the conserved non-coding sequences discovered in the previous aim, we detected five sequences in the promoter of MtCRE1. We engineered three versions of this promoter containing deletions of the distal two CNS (Δ2CNS), the proximal three CNS (Δ3CNS), or all five of these CNS (Δ5CNS). We generated composite Medicago truncatula plants in the cre1-1 background, developing transformed roots that expressed CRE1 under the wild-type promoter or one of the engineered promoters, or that did not express CRE1 (empty vector), and inoculated them with Sinorhizobium meliloti. Two weeks after inoculation, we observed a gradual decrease in the number of nodules with the deletions that was statistically significant between the wild-type and Δ5CNS roots, providing evidence that these CNS are required for nodule organogenesis.
Evaluation of candidates detected based on patterns of single-cell root expression
We uncovered putative regulators of the early differentiation of pericycle cells during nodule formation by isolating and sequencing the transcriptome of Medicago truncatula single nuclei derived from roots of rhizobia-inoculated plants. We identified a homologue of the Arabidopsis STY family proteins, and genes of the two-component system phosphorelay cascade involved in cytokinin signaling. RNAi knock-down of both genes results in a significant reduction in the number of nodules produced by Medicago after inoculation with rhizobia.
Aim III. 
Engineering Nodulation Capability Into The Bioenergy Woody Crop Populus spp. There are several genes/constructs currently under evaluation in Populus, for their effect on the induction of nodule-like structures. These include constructs that express STY, HK1, CRE1 and other nodulation genes identified in our current work. Generation and analysis of transgenic plants are ongoing.

Publications


    Progress 10/01/19 to 09/30/20

    Outputs
    Target Audience: The target audiences reached by the effort were plant biologists working in research to develop crops that require fewer inputs, and graduate students and postdoctoral associates studying the biology of nitrogen fixation in plants.
    Changes/Problems: Nothing Reported
    What opportunities for training and professional development has the project provided? The project has provided training for four postdoctoral associates in molecular and cellular biology, and genomics.
    How have the results been disseminated to communities of interest? Due to COVID, our ability to disseminate knowledge to the community of interest has been limited. Nonetheless, we have made presentations at two scientific meetings (Plant and Animal Genome Conference, and Department of Energy Principal Investigator Meeting).
    What do you plan to do during the next reporting period to accomplish the goals? We further narrowed the range of candidate genes to be tested to introduce nitrogen fixation in poplar. We are now testing these genes in multiple combinations.

    Impacts
    What was accomplished under these goals? Completed library preparation, sequence capture and sequencing for ~15,000 species of the Fabales, Fagales, Cucurbitales and Rosales, by far the largest phylogenomic data set ever assembled for a clade of plants. Assembly of the sequencing data and preliminary analysis based on ~10,000 species was used to generate a phylogeny using RAxML and FastTree. Completed de novo assembly of 129 sequenced functional genes involved in nodulation for all 15,000 species of the Fabales, Fagales, Cucurbitales and Rosales, and developed an updated database for the nodulation trait for all sequenced samples. This database has improved resolution at the genus level and includes a measure of uncertainty that describes whether the trait is confirmed (present or absent) or whether it is inferred and by what source. Implemented a novel method (Dynamic Regulatory Module Network) to model temporal changes in gene expression and in the accessibility of their regulatory motifs. This analysis identified the genes that have the highest regulatory importance in nodulation, based on their weight in the model prediction. The method identified five master regulatory genes involved in hormone responses in symbiosis and nodule formation. Applied comparative genomics methods to detect candidate genes supported by phylogenies. In this analysis we used alternative approaches to detect those genes, while controlling for phylogenetic relatedness and for differences in evolutionary rates among species. The analysis identified several genes previously known to be involved in nodulation (e.g. LysM-type receptor kinase), as well as novel candidates. Developed a cytokinin sensor, expressed in the cortex of Medicago and Populus, for early detection of plants that are undergoing the developmental program expected in plants developing nodules. Because both cytokinin and auxin are involved in early nodule development, an auxin sensor is also currently under development. 
Developed methods for single-nuclei transcriptome (snRNA-seq) analysis applicable to a wide range of plants, for use in the snRNA-seq characterization of roots from species that differ in their response to rhizobia and nodule formation. We also optimized sequencing and assembly of plant genomes using a combination of short reads (Illumina) and long reads (Oxford Nanopore). Supported by the phylogenomic data, this optimized approach will be used to sequence the genomes of the most informative species in the Fabales, Fagales, Cucurbitales and Rosales clades. Evaluated the overexpression of PtNIN orthologs in Populus for the development of nodule-like structures, which resulted in an increase in lateral roots. Similarly, the overexpression of Ljsnf2 led to more lateral roots, as well as nodule-like structures. Sections through the nodule-like structures revealed a central vasculature reminiscent of actinorhizal nodules.

    Publications


      Progress 06/04/19 to 09/30/19

      Outputs
      Target Audience: Nothing Reported
      Changes/Problems: Nothing Reported
      What opportunities for training and professional development has the project provided? This project has supported the training of several postdoctoral associates, research scientists and laboratory technicians working in the laboratories of the investigators:
      (1) Dr. Daniel Conde (Kirst Lab) is a postdoctoral researcher and overall project coordinator, currently supporting the research related to the generation of genome, transcriptome and chromatin accessibility data. In addition, Dr. Conde coordinates and supervises the generation of transgenic plants in the Kirst lab, and is responsible for the generation of gene constructs used in this process.
      (2) Dr. Thomas Irving (Ané Lab) is a postdoctoral researcher, developing the research associated with activating the Common Symbiosis Pathway in response to rhizobia in Populus sp. In the past year, Dr. Irving has undergone training in confocal microscopy.
      (3) Dr. Heather Rose Kates (Soltis Lab) is a postdoctoral researcher who led the development of a high-throughput extraction process and is the liaison with our service provider. She currently also coordinates the activities related to the analysis of the sequencing data generated for the phylogenomic effort.
      (4) Dr. Sara Knaack (Roy Lab) is a postdoctoral researcher, executing analysis related to the detection of orthologous genes in nodulating and non-nodulating species, as well as the transcriptome data and chromatin accessibility data. Dr. Knaack has also led the implementation of the comparative genomics, PhastCons analysis, and the development of dynamic regulatory networks that are being applied to the discovery of key nodulation genes.
      (5) Dr. Lucas Maia (Ané Lab) is a postdoctoral researcher who is engineering root nodule organogenesis in Populus sp. In the past year, Dr. Maia has undergone training in confocal microscopy. 
In addition, the project has also provided career development opportunities for a research scientist (Dr. Ryan Folk), a programmer (Dr. Raphael Lafrance) and laboratory technicians (Ms. Bella Ruben and Mr. Hank Schmidt). Over 10 undergraduate students have also been trained in molecular techniques at the PI's and co-PIs' laboratories. The project also provided training opportunities for visiting scholars. Three graduate students from the lab of Prof. Tingshuang Yi (Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, China), Ms. Shuiyin Liu, Ms. Jiajin Wu, and Ms. Qin Tian, have also visited the Soltis/FLMNH laboratory as part of our collaborative phylogenomic project on Rosales and have learned the lab and computational steps integral to this project and collaboration (their samples of Rosales will be included in our project). Another visiting scholar (Mr. Wendell Pereira) also contributed extensively to the implementation of the CNS pipeline, and will join the Kirst laboratory in the Fall of 2019 as a postdoctoral researcher in the project.
How have the results been disseminated to communities of interest? Project progress updates continue to be provided by a Twitter feed (@Nit_Fix), and overall project aims are described in our project webpage (www.nitfix.org). In addition, results have been disseminated in the following events:
Kates, Heather, Folk, Ryan, Conde, Daniel, Ruben, Bella, Dervinis, Chris, LaFrance, Rafe, Kirst, Matias, Guralnick, Rob, Soltis, Douglas, Soltis, Pamela S. "Rapid workflows from specimens to sequences: Global-scale phylogenomics from collections." Revolutionizing systematics: Herbaria in the Genomics Age, Botany Conference. July 2018. Rochester, MN. Colloquium Presentation.
Kates, Heather, Folk, Ryan, Conde, Daniel, Maia, Lucas, Knaack, Sarah, Irving, Thomas, Dervinis, Chris, LaFrance, Rafe, Balmant, Kelly, Roy, Sushmita, Ane, Jean-Michel, Kirst, Matias, Guralnick, Rob, Soltis, Douglas, Soltis, Pamela S. "Global Scale Phylogenomics of the Nitrogen Fixing Clade." Root Nodule Symbiosis: Genetics, Evolution, and Engineering for Future Crops, Plant and Animal Genome Conference. January 2019. San Diego, CA. Workshop Presentation.
Kates, Heather. "The More you Look the More you See: Generating New Hypotheses in Plant Evolutionary Genetics using Big Data and New Methods." Oberlin College. April 2019. Departmental Seminar.
Knaack et al. "Deciphering the regulatory network controlling nitrogen fixation in plants." Great Lakes Bioinformatics Conference. May 2019. Madison, WI. Poster Presentation.
Kirst, Matias. "Engineering a novel nitrogen-fixing symbiosis into Poplar." Root Nodule Symbiosis: Genetics, Evolution, and Engineering for Future Crops, Plant and Animal Genome Conference. January 2019. San Diego, CA. Workshop Presentation.
Pereira, Wendell; Folk, Ryan; Kirst, Matias. "Identification of conserved non-coding sequences in nitrogen fixing plants." Plant and Animal Genome Conference. January 2019. San Diego, CA. Poster Presentation.
Finally, four posters were presented at the DOE 2019 Genomic Sciences Program Annual Principal Investigator (PI) Meeting by the PI Kirst, co-PI Folk, project coordinator Conde and postdoctoral associate Maia.
What do you plan to do during the next reporting period to accomplish the goals?
(1) Phylogenomic discovery of the genetic toolkit of root nodule symbioses
The complete data necessary for the phylogenomic analysis is being generated in the last few months of year 2 and will be fully available at the start of year 3. Completion of the analysis of this data will be a major focus of the third year of the project, as it is one of the approaches that will be used for identification of candidate nodulation genes to be tested in Medicago and Populus. 
We will expand the transcriptome analysis of nodulating and non-nodulating species to other species beyond Medicago and Populus, focusing initially on those species for which we have already obtained the necessary germplasm. The comparative analysis of transcriptomes, between these nodulating and non-nodulating species, and relative to Medicago and Populus, will be completed in the third year of the project. Finally, we have started developing a database that integrates phylogenomic, comparative genomic and transcriptomic data to streamline the discovery of candidate genes for further analysis. While the initial framework of the database is expected to be completed in the last quarter of year 2, we expect that it will be continuously improved in the third year.
(2) Verification of molecular mechanisms of root nodule development
The second year of the project involved testing for gain and loss of function of candidate genes for which there was evidence of a role in nodulation, as well as evidence from phylogenetic and comparative genomic/transcriptomic data. In the third year of the project we are continuing the selection and evaluation of genes or regulatory elements uncovered in the first aim, but now expanding to genetic elements for which there may not be previous evidence for a role in nodulation. In the third year, we expect to evaluate 40-50 constructs containing individual genetic targets or combinations of targets.
(3) Engineering nodulation capability into Populus
With the identification of the first promising candidate genes/constructs in year 2 (LtHK1 under both 35S and a cortex-specific promoter), part of our effort is now being redirected towards generating whole plants containing these constructs, determining if the nodule structures observed in hairy roots are analogous to legume nodules, and testing if these nodules are colonized by N-fixing bacteria. 
Additional candidates confirmed to have a role in nodulation in Aim II will be evaluated in whole-plants here.

      Impacts
What was accomplished under these goals? (1) PHYLOGENOMIC DISCOVERY OF THE GENETIC TOOLKIT OF ROOT NODULE SYMBIOSES Phylogenomic - sampling, DNA extraction, library preparation and data analysis We successfully completed processing for all 15,000 species samples and submitted all DNA samples to our service provider for target capture and DNA sequencing. Library preparation and sequencing are complete for ~4,900 species, which represents 41% of our total sequencing goal. We designed a computational method for strategically selecting DNA samples for sequencing that simultaneously integrates DNA quality guidelines and taxonomic sampling goals. We are using it to finalize selections from among our unsequenced DNA samples, and we expect to complete library preparation and sequencing before the end of year 2. In the second year of the project we carried out computational analysis to (a) identify conserved regions that are specific to N-fixing species relative to an outgroup of non-N-fixing species, and (b) detect presence and absence of key nodulation genes in species of the N-fixing clade and an outgroup. Genome sequences from 34 species (25 N-fixing and 9 outgroup) were organized into multiple alignments relative to the Medicago v5 genome assembly. From this multiple alignment data for two species sets (N-fixing and outgroup species), two analysis pipelines were applied in parallel to identify conserved regulatory elements: one based on the PHAST pipeline (specifically the phastCons algorithm) and one based on the CNS pipeline. Conserved regions called by phastCons were filtered by removing 1) regions <5 bp in size, and 2) regions identified in the N-fixing species that overlap with conserved regions detected in the outgroup. Future efforts will focus on the functional interpretation of such regions in the context of regulator binding sites, specificity to N-fixing species, and evolution.
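The two phastCons filtering rules described above (drop regions shorter than 5 bp, then drop N-fixing regions that overlap a conserved region found in the outgroup) can be illustrated with a minimal Python sketch. This is not the project's pipeline: the interval representation (chromosome, start, end in 0-based half-open coordinates), the function names, and the toy coordinates are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed representation: (chromosome, start, end), 0-based and half-open.
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two intervals share at least one base on the same chromosome."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def filter_conserved(nfix: List[Interval],
                     outgroup: List[Interval],
                     min_len: int = 5) -> List[Interval]:
    """Keep N-fixing conserved regions that pass both filters.

    1) Remove regions shorter than min_len bp.
    2) Remove regions overlapping any conserved region from the outgroup.
    """
    kept = []
    for region in nfix:
        if region[2] - region[1] < min_len:
            continue  # filter 1: too short
        if any(overlaps(region, other) for other in outgroup):
            continue  # filter 2: also conserved in the outgroup
        kept.append(region)
    return kept


# Hypothetical toy data, not real phastCons output.
nfix_regions = [("chr1", 100, 103), ("chr1", 200, 260),
                ("chr1", 400, 450), ("chr2", 10, 40)]
outgroup_regions = [("chr1", 240, 300), ("chr2", 500, 520)]

specific = filter_conserved(nfix_regions, outgroup_regions)
print(specific)
```

The pairwise overlap check is quadratic in the number of regions; for genome-scale calls, sorted intervals or an interval tree would be the more practical choice.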
In the second year of the project we also completed sequence processing, assembly, and preliminary analysis on 4,900 species analyzed by sequence-capture, representing 41% of our phylogenomic goal. We have put in place an easily and rapidly deployable pipeline to complete quality control and assembly of phylogenetic and functional loci for thousands of samples in parallel on the University of Florida supercomputer. We set up programs for multiple sequence alignment, phylogenetic analysis, and testing for correlation between detection and non-detection of functional genes with nodulation state that are easily re-run when the dataset is updated with sequence data for additional samples. Finally, we aligned 100 phylogenetic loci for all sequenced samples using a back-translation method in TranslatorX and MAFFT v. 7, optimized for deep divergences; these alignments are continually updated when new data are available. We built phylogenetic trees using FastTree for an additional quality control check and used RAxML-NG to build a phylogenetic tree at two points in data delivery (1,900 species and 4,900 species). To look at the relationship between nodulation state and detection or non-detection of the 129 functional genes sequenced, we used Pagel's test for evolutionary correlation and visualized these patterns alongside the phylogeny in the R package phytools. Transcriptomics - sampling, RNA extraction, library preparation and data analysis The project transcriptome component involves the comparison of profiles among nodulating and non-nodulating species, to uncover their differences and define potential candidate genes for further analysis in the second aim. We have obtained germplasm for these experiments, including the species Medicago truncatula (nodulating) and Populus tremula x alba (non-nodulating). Additional species will be selected upon completion of the phylogenomic data analysis.
We completed the processing of RNA samples from Medicago truncatula (nodulating) and Populus tremula x alba (non-nodulating) subjected to treatment with nodulation (Nod) factors and collected over multiple time points following treatment. The data were normalized and processed with sleuth, then quantile-normalized and log-transformed for further analysis. Principal components analysis shows a strong temporal component: the first two components explain ~42% of the variation and the replicates separate clearly by time point, providing a strong foundation for further analysis. We clustered the genes based on their log-transformed mean-subtracted expression profiles and identified several gene groups exhibiting dynamic patterns of expression. We have developed a novel algorithm, Dynamic Regulatory Module Networks (DRMN), to systematically integrate RNA-seq and ATAC-seq (described below) time course profiles to identify key cis-regulatory elements and transcription factors associated with genes that exhibit dynamic behavior. We have applied this algorithm to the RNA-seq and ATAC-seq time courses measuring transcriptional and accessibility profiles after treatment with Nod factors. The inputs to this algorithm are the RNA-seq time series data, the number of expression modules, and upstream regulatory features for each time point that can be derived from the ATAC-seq time course. The algorithm outputs a set of gene expression modules (states) for each time point and their associated regulatory programs comprising the cis-regulatory elements that best predict the expression of genes in a particular module. These are candidates to be evaluated for their role in nodulation.
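The preprocessing steps named above for the Nod-factor time course (quantile normalization, log transformation, and per-gene mean subtraction before clustering) can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the project's code: the gene-by-sample matrix layout, the pseudocount of 1, the log base 2, the handling of ties by row order, and the toy values are all assumed here.

```python
import math
from typing import List

Matrix = List[List[float]]  # assumed layout: rows are genes, columns are samples


def quantile_normalize(matrix: Matrix) -> Matrix:
    """Give every sample (column) the same distribution of values.

    Each value is replaced by the mean, across samples, of the values
    sharing its within-sample rank. Ties are broken by row order, which
    is a simplification.
    """
    n_genes, n_samples = len(matrix), len(matrix[0])
    sorted_cols = [sorted(matrix[g][s] for g in range(n_genes))
                   for s in range(n_samples)]
    rank_means = [sum(col[r] for col in sorted_cols) / n_samples
                  for r in range(n_genes)]
    result = [[0.0] * n_samples for _ in range(n_genes)]
    for s in range(n_samples):
        order = sorted(range(n_genes), key=lambda g: matrix[g][s])
        for rank, g in enumerate(order):
            result[g][s] = rank_means[rank]
    return result


def log_transform(matrix: Matrix, pseudocount: float = 1.0) -> Matrix:
    """log2(x + pseudocount); the pseudocount is an assumption to avoid log(0)."""
    return [[math.log2(v + pseudocount) for v in row] for row in matrix]


def mean_center_rows(matrix: Matrix) -> Matrix:
    """Subtract each gene's mean so clustering sees the shape of its profile."""
    centered = []
    for row in matrix:
        mu = sum(row) / len(row)
        centered.append([v - mu for v in row])
    return centered


# Hypothetical expression values: 3 genes x 3 samples.
expression = [[7.0, 2.0, 6.0],
              [1.0, 5.0, 9.0],
              [4.0, 8.0, 3.0]]

qn = quantile_normalize(expression)
profiles = mean_center_rows(log_transform(qn))
print(qn)
```

After quantile normalization every column holds the same set of values (here 2, 5, and 8), so differences between samples in overall intensity no longer drive the downstream clustering.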
(2) VERIFICATION OF MOLECULAR MECHANISMS OF ROOT NODULE DEVELOPMENT The first year of the project focused on evaluating genes known to be involved in nodulation (NIN and LHK genes) and hypothesized to trigger nodulation in species outside of the N-fixation clade. In the second year of the project, this strategy expanded to other genes that follow a phylogenetic pattern that suggests an association with species that nodulate in the N-fixing clade, and that have previously been identified as involved in nodule organogenesis or lipochitooligosaccharide (a.k.a. Nod factor) detection. Our research strategy involves evaluation of candidate genes for their role in nodulation in Medicago (when transposon insertion lines are available), and in Populus using the hairy-root transformation system. Verification of candidate genes in Medicago We obtained the 14 orthogroups annotated by Aim 1 as showing the most significant association (excepting the orthogroups containing NIN & RPG) with the same phylogenetic pattern as NIN/RPG (i.e. present in nodulating FaFaCuRo and in outgroup dicot species, but not present in non-nodulating FaFaCuRo). We obtained transposon insertion lines in Medicago truncatula (from the Noble Research Institute) for three genes within these orthogroups. These lines were crossed and homozygous knockout lines for each of the three genes were obtained, and we are now working to assess the nodulation phenotype of these lines. Verification of candidate genes in Populus Constructs currently being tested in poplar hairy roots for their effect on nodule development include the expression of CCaMK, snf-2+NOOT1, WUSCHEL+IPT3, NOOT1+WUSCHEL under a cortex-specific promoter, as well as NFP under an epidermis-specific promoter. Additionally, constructs for NIN and Lotus japonicus cytokinin receptor (LjHK) have also been tested (see below).
(3) ENGINEERING NODULATION CAPABILITY INTO POPULUS We have successfully systematized the Agrobacterium rhizogenes-mediated transformation method into Populus tremula × P. alba (clone INRA 717-1-B4) with the regeneration of whole transgenic plants in our lab.

      Publications