Recipient Organization
UNIV OF HAWAII
3190 MAILE WAY
HONOLULU,HI 96822
Performing Department
Tropical Plant & Soil Sciences
Non Technical Summary
Measuring total soil biodiversity is one of the most basic, yet fundamental concepts that can contribute towards an integrative measure of soil health. However, currently there is no system that could quickly and effectively identify the immense biodiversity in soil. Leveraging available high-throughput sequencing technology and bioinformatic tools, this Seed proposal will 1) develop and test a series of ribosomal marker genes that can be sequenced together to identify major groups of soil organisms including bacteria, archaea, fungi, arbuscular mycorrhizal fungi, unicellular eukaryotes (protists), oomycetes, nematodes, and invertebrates; and 2) develop and streamline a flexible and transferable molecular identification system that will enable rapid identification of total soil biodiversity. These tools will be benchmarked using a set of 66 soil samples that came from different soil types, management systems, and disturbance frequencies. The resulting biodiversity measurements (richness, phylogenetic, and functional) will provide the first comprehensive examples of how biodiversity measures can provide insights into what makes a healthy soil, and provide supporting data for developing work in the field of Soil Health.
Animal Health Component
30%
Research Effort Categories
Basic
30%
Applied
30%
Developmental
40%
Goals / Objectives
The main goal of this project is to develop a molecular identification system that can quickly and efficiently identify all living organisms in any given soil. Successful development will allow rapid identification of total biodiversity in any soil system, which could immediately support incorporation of soil biodiversity into the various developing concepts in soil health. This project will start with two objectives:
Objective 1: Develop a multiplex PCR system that is able to sequence ribosomal RNA genes from major groups of soil organisms (bacteria, archaea, fungi, arbuscular mycorrhizae, unicellular eukaryotes (protists), oomycetes, nematodes, and invertebrates) in a single sequencing run. This includes development and optimization of various protocols to sequence each group of organisms individually and together as a single unit.
Objective 2: Develop bioinformatic tools and workflows to handle large and diverse datasets, and streamline the tools to enable rapid identification of total soil biodiversity.
Project Methods
Soil collection: As part of a project aimed at understanding the components of soil health and developing a soil health metric, a total of 22 field sites were sampled in Hawaii, with three replicates at each field site, for a total of 66 soil samples. These sites span six soil Orders (Andisols, Inceptisols, Mollisols, Oxisols, Ultisols, and Vertisols), two management practices (conventional and organic), and three disturbance histories (pasture land, cultivated land, and native forest). Each of these dimensions is replicated at least three times. Collaborations with local farmers provided a rich environment to study soil health across diverse soil backgrounds and management practices.
Molecular marker genes of interest: A series of ribosomal markers, carefully selected from a literature review for their ability to best capture overall diversity in their respective groups, will be used to sequence major groups of soil organisms (Table 1). Viruses are excluded because they have no conserved markers and are beyond the scope of this proposal. Preliminary data from our lab identifying nematodes from compost samples showed that the commonly used 18S primer pair (NF1 and 18Sr2b) amplified 8 species of nematodes and the ITS primer pair (M2646-SHITS2R) amplified 5 species, with only 1 species overlapping across these primers.
Therefore, we will use three primer pairs to maximize detection of nematode diversity, following the recommendation of Porazinska et al. (2009).
Organism group | rRNA gene | Primers | Reference
Bacteria & Archaea | 16S | 515F-806R | (Parada et al., 2016); (Apprill et al., 2015)
Fungi | ITS | ITS1F-ITS2 | (Gardes and Bruns, 1993); (White et al., 1990)
Arbuscular Mycorrhizae | 18S | WANDA-AML2? | (Lee et al., 2008); (Dumbrell et al., 2011)
Protists | 18S | 1389F-EukBr | (Amaral-Zettler et al., 2009; Stoeck et al., 2010)
Oomycetes | ITS | ITS3oo-ITS4ngs | (Riit et al., 2016)
Metazoa | 18S | SSU_FO4-SSU_R22 | (Fonseca et al., 2010)
Nematodes | 18S | NF1 and 18Sr2b | (Mullin et al., 2003; Porazinska et al., 2009)
Nematodes | 28S | D3A-D3B | (Nunn, 1992; Porazinska et al., 2009)
Nematodes | ITS | M2646F-SHITS2R |
Table 1: A series of primers that will be used in this proposed project.
Molecular identification of soil biodiversity: DNA will be extracted from these 66 soil samples using an established DNA extraction method developed by PI Nguyen that works well for a broad range of soils, including Oxisols and Andisols, which have high DNA binding affinity. The Nguyen Lab's standard amplicon sequencing protocol will be used to amplify, barcode, and sequence each gene across all soil samples (Nguyen, 2019a). Amplified and barcoded products from each primer pair will be combined in equimolar concentrations to form a single "gene library." There will be a total of nine gene libraries. These libraries will be combined at equimolar concentrations into a single master library, which will then be sequenced using 300 bp paired-end Illumina MiSeq.
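The equimolar pooling step described above can be illustrated with a minimal Python sketch. It assumes the common approximation of about 660 g/mol per base pair of double-stranded DNA; the function names, the library names, and any concentrations or amplicon lengths passed in are hypothetical and are not taken from the Nguyen Lab protocol.

```python
# Minimal sketch of equimolar pooling (assumption: ~660 g/mol per bp of dsDNA).
# All names and example values are hypothetical, for illustration only.

def ng_per_ul_to_nM(conc_ng_per_ul: float, amplicon_bp: int) -> float:
    """Convert a dsDNA concentration in ng/uL to nM (nM is the same as fmol/uL)."""
    return conc_ng_per_ul * 1e6 / (660.0 * amplicon_bp)


def equimolar_volumes(libraries: dict[str, tuple[float, int]],
                      fmol_each: float) -> dict[str, float]:
    """Return the volume (uL) of each library that contributes `fmol_each` fmol.

    `libraries` maps a library name to (concentration in ng/uL, mean length in bp).
    """
    volumes = {}
    for name, (conc_ng_per_ul, amplicon_bp) in libraries.items():
        conc_nM = ng_per_ul_to_nM(conc_ng_per_ul, amplicon_bp)
        volumes[name] = fmol_each / conc_nM  # fmol / (fmol/uL) = uL
    return volumes


if __name__ == "__main__":
    # Hypothetical gene libraries: (ng/uL, mean amplicon length in bp)
    example = {"16S": (12.0, 390), "ITS": (8.5, 350), "18S_protist": (5.0, 260)}
    for name, vol in equimolar_volumes(example, fmol_each=50.0).items():
        print(f"{name}: {vol:.2f} uL")
```

Because shorter amplicons contain more molecules per nanogram, pooling by mass alone would over-represent them; converting to molar units first is what makes the pool equimolar.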
As a proof of concept, this protocol has been tested multiple times and has worked consistently well for three ribosomal genes (16S - prokaryotes, 18S - protists, ITS - fungi), yielding an average of 31,000 sequences per sample in a MiSeq run with 250 samples.
Multiplex PCR to assess total biodiversity: In order to streamline identification of soil organisms across many samples, we will use multiplex PCR, in which multiple primers are combined into a single cocktail and used in a single PCR reaction. This method has been used successfully in various systems (Oliveira and de Lencastre, 2002; Felske et al., 2003; Solà et al., 2018), and preliminary data from our lab showed that at least three primers could be multiplexed. However, a complex mixture of nine primers will need to be benchmarked using the 66 soil samples. We will use the same PCR amplification strategy as above, with the only difference being a primer annealing temperature of 64-52 °C (touch-down PCR) to mitigate primer binding and competition biases. We will also optimize other parameters known to affect multiplex PCR (Edwards and Gibbs, 1994; Henegariu et al., 1997). Each of the 66 samples will then be barcoded for identification and sequenced using Illumina MiSeq as above.
Analyses
Sequences will undergo a series of rigorous bioinformatics quality control steps using QIIME2 (Bolyen et al., 2019). A detailed general pipeline for 16S, 18S, and ITS ribosomal markers has been refined by PI Nguyen and is adaptable to any primer set (Nguyen, 2019b). To assess overall biodiversity, we will measure operational taxonomic unit (OTU) richness, Faith's Phylogenetic Diversity for the conserved 16S and 18S genes (ITS is too variable for this metric), Chao2 estimated richness, and Shannon's Index. These diversity metrics will be statistically compared among soil types, management practices, disturbance regimes, and other environmental variables available in our soil dataset.
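Three of the alpha-diversity measures named above (observed richness, Shannon's Index, and Chao2) can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the project's QIIME2 pipeline: the function names are hypothetical, Shannon's Index uses the natural logarithm, and Chao2 uses the bias-corrected incidence-based form, with replicate samples from a site treated as the sampling units.

```python
import math


def observed_richness(counts):
    """Number of OTUs with at least one read in a sample."""
    return sum(1 for c in counts if c > 0)


def shannon_index(counts):
    """Shannon's Index H = -sum(p_i * ln p_i) over OTUs with nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def chao2(sample_rows):
    """Bias-corrected Chao2 richness estimate from several sampling units.

    `sample_rows` is a list of per-sample count lists sharing the same OTU order.
    Q1 and Q2 are the numbers of OTUs detected in exactly one and exactly two
    samples; m is the number of samples.
    """
    m = len(sample_rows)
    n_otus = len(sample_rows[0])
    occurrences = [sum(1 for row in sample_rows if row[j] > 0) for j in range(n_otus)]
    s_obs = sum(1 for k in occurrences if k > 0)
    q1 = sum(1 for k in occurrences if k == 1)
    q2 = sum(1 for k in occurrences if k == 2)
    return s_obs + ((m - 1) / m) * q1 * (q1 - 1) / (2 * (q2 + 1))


if __name__ == "__main__":
    # Hypothetical OTU counts for three replicate samples from one field site
    site = [[5, 0, 1, 0], [3, 2, 0, 0], [4, 0, 0, 0]]
    print([observed_richness(row) for row in site])
    print([round(shannon_index(row), 3) for row in site])
    print(round(chao2(site), 3))
```

Chao2 depends only on presence or absence across sampling units, so it is less sensitive to read-count differences among primer pairs than abundance-based estimators, which may matter when several markers share one sequencing run.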
Community composition will be visualized using PCoA and/or NMDS and statistically compared using PERMANOVA-based analyses.
Functional diversity based on trophic guilds (who eats whom) will be measured and compared as above for fungi (Nguyen et al., 2016), protists (Xiong et al., 2018), nematodes (Sieriebriennikov et al., 2014), mites (Birkhofer et al., 2016; de Groot et al., 2016), and general soil fauna (Ehnes et al., 2011). To identify taxa that occur in one environment but not another (in other words, indicator taxa) and that might provide insights into soil health, we will employ a robust method called Analysis of Composition of Microbiomes (ANCOM), which uses compositional constraints to reduce false discoveries (Mandal et al., 2015).
A framework using co-occurrence network analysis will be used to connect OTUs from all of these different experimental soil datasets (Williams et al., 2014; Freilich et al., 2018). Cross-amplicon (functionally cross-kingdom) networks, although still in their infancy and requiring further development, can serve as a powerful way to connect the richness of organisms in the soil environment. We have developed a method based on Tipton et al. (2018) to produce fungal-bacterial networks that hypothesize relationships among the different taxa in a dataset (Fig 3). Network analyses can integrate very large amounts of data, including taxonomic and functional data, to explore co-occurrence and potential interactions, while network structure, such as network complexity, might indicate the state of a system (e.g., how healthy a soil might be). For instance, natural and stable systems have been associated with more complex soil biodiversity networks than agricultural and less stable systems (Morriën et al., 2017; de Araujo et al., 2018; Banerjee et al., 2019), and our work showed that organic fertilizers create a more stable network than mineral fertilizers.
Sequence data from the multiplexed PCR experiment will be processed as above.
Richness, Phylogenetic Diversity, the Chao2 richness estimator, Shannon's Diversity Index, community similarity, functional diversity, and network integrity will be correlated with the corresponding values from the previous dataset. Correlations among these datasets (from Objective 1~Objective 2) will allow us to measure the robustness of this multiplex PCR dataset, and provide insights into whether results from multiplexed PCRs have enough integrity to measure total soil biodiversity.
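One way to picture the comparison between the two datasets is a per-sample rank correlation of a given metric. The sketch below uses Spearman's rank correlation, which is an assumption on our part since the text does not name a correlation method; the function names and the example values are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical Shannon values for the same five samples under each approach
    separate_pcr = [3.1, 2.4, 4.0, 3.6, 2.9]
    multiplex_pcr = [2.8, 2.5, 3.7, 3.5, 2.6]
    print(round(spearman(separate_pcr, multiplex_pcr), 3))
```

A rank-based measure asks only whether multiplexed PCR orders the samples the same way as the separate reactions, so a systematic loss of diversity in the multiplex data would not by itself lower the correlation; that difference would need to be examined separately.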