Source: PENNSYLVANIA STATE UNIVERSITY submitted to
GENETIC AND GENOMIC RESOURCES FOR RESTORATION AND SUSTAINABILITY OF GREEN ASH AND NORTHERN RED OAK
Sponsoring Institution
National Institute of Food and Agriculture
Project Status
REVISED
Funding Source
Reporting Frequency
Annual
Accession No.
1020603
Grant No.
(N/A)
Project No.
PEN04717
Proposal No.
(N/A)
Multistate No.
(N/A)
Program Code
(N/A)
Project Start Date
Oct 1, 2019
Project End Date
Dec 31, 2021
Grant Year
(N/A)
Project Director
Carlson, John E.
Recipient Organization
PENNSYLVANIA STATE UNIVERSITY
208 MUELLER LABORATORY
UNIVERSITY PARK,PA 16802
Performing Department
Ecosystem Science & Management
Non Technical Summary
Eastern hardwood forests are suffering severe disturbances and poor regeneration that threaten the sustainability of forest-based ecosystems and economies. This is due in large part to major, ongoing mortality events affecting a wide range of forest tree species, caused by exotic pests, diseases, and invasive species and accentuated by abiotic stresses including climatic variations. Over the years, good progress has been made with our colleagues in developing genetic and genomic resources for hardwood forest trees, including deep gene sequence databases, DNA markers, genetic maps, and field trials. The genomic resources that our research community has developed are available publicly at the Hardwood Genomics website (www.hardwoodgenomics.org). Tree genetics field trials developed at Penn State and at collaborating public institutions are being used in numerous ongoing projects. This project will focus on completing a platform of genomic and genetic resources specifically for northern red oak and green ash. We will continue to collect important phenotypic data from established field trials, expand collections and field trials, and complete development of genomic resources including whole-genome sequences for northern red oak and green ash. These advancements in data and research platforms will constitute major contributions to the forest genetics, forest genomics, tree improvement, ecological genomics, and plant evolution research communities, as well as to applied forest health initiatives and tree species restoration efforts at universities and in state and national agencies.
Animal Health Component
0%
Research Effort Categories
Basic
75%
Applied
25%
Developmental
0%
Classification

Knowledge Area (KA) | Subject of Investigation (SOI) | Field of Science (FOS) | Percent
201 | 0699 | 1080 | 100%
Goals / Objectives
OUTCOME-BASED OBJECTIVES
The key outcome-based objectives of this study are:

1. Assemble and analyze a draft reference genome for northern red oak.
The outcome of this objective will be near-complete sequences for all 12 chromosome pairs of northern red oak (NRO). Using our NRO RNA sequence data (www.hardwoodgenomics.org), we will identify all of the genes and their locations in the NRO genome. We will annotate the predicted genes (ids) and putative functions based on genes previously identified and annotated in well-studied model plant species. We will also compare the NRO genome to genomes of other tree species to determine what unique features may exist in the structure of the NRO genome.

2. Complete assembly and analysis of the green ash reference genome.
The outcome of this objective is the complete sequences for each of the 23 chromosome pairs of green ash. We will use our RNA sequence data for ash (at www.hardwoodgenomics.org) to identify the genes and their locations in the green ash genome. We will annotate the predicted gene names (ids) and putative functions based on well-studied model plant species. We will also compare the organization of the green ash genome to model tree species and to other Asterid plants to determine what unique features may exist in the overall structure of the ash genome.

3. Collect and phenotype resistant green ash and northern red oak genotypes to develop seed orchards and to identify QTL for phenotypes in genetic mapping families.
The outcomes of this objective will be the identification and propagation of additional stress-resistant green ash and NRO genotypes, the assembly of resistant genotypes to initiate seed orchards, and the location of genome regions (quantitative trait loci, QTL) in green ash and NRO that are genetically associated with these important quantitative phenotypic traits. This will involve visually scoring phenotypes in existing provenance trials and in full-sib mapping populations for green ash (at PSU) and NRO (at Purdue). Traits to be measured include survival, height, diameter, bud break, flowering, incidence and type of insect attack, disease incidence and pathogens, and response to abiotic stresses that may occur. Mapping software will be used to find QTL regions in the green ash and NRO genetic maps responsible for variation in the phenotypes.

4. Demonstrate the utility of the NRO and green ash genomes in addressing important issues in the ecology and protection of these species.
The outcomes of this objective will be:
i. Identification of candidate genes for resistance to EAB in green ash.
ii. Identification of candidate genes for susceptibility to pathogens causing oak decline in NRO.
iii. Identification of candidate genes for response to abiotic stresses in both species.
iv. The genetic map location of QTL for the above traits in NRO and green ash.
v. The use of genome-wide SNPs to identify naturally occurring hybrids among red oak species.
vi. In collaboration with tree genetics colleagues, we will initiate tests of DNA markers to identify hybrids, select resistant trees, and accelerate restoration programs for ash and oak species.

For all 4 objectives, dissemination and technology transfer of results will be achieved through publications in superior peer-reviewed journals, presentations at scientific and technical meetings, and release of results to the public at the www.hardwoodgenomics.org project website.
Project Methods
METHODOLOGICAL APPROACH:
High-quality reference genomes are a key resource for studying the organization of genome structure and function and for applications such as the development of whole-genome selection approaches in breeding programs. These important goals for the research community and for accomplishing broader impacts in the forestry sector cannot be satisfactorily met using proxy genomes from related species that may differ in genome organization and gene content from green ash and NRO. However, the existence of genomes for species related to green ash [European ash genome] and NRO [pedunculate oak genome] provides benchmarks for the quality of the reference genomes that we will construct and the candidate genes we identify for green ash and NRO. New genome sequencing and assembly methods now speed the production of new genomes while reducing their cost. Bioinformatics tools for assembly, annotation and comparison of high-quality genomes are also now more efficient and less expensive to use. Genetic linkage maps with the locations of QTL remain key elements in the validation and correction of genome assemblies and for the location of genes for traits of interest. We propose a stepwise approach to construct reference genomes and identify candidate genes using existing resources and tools. A summary of available information and resources on green ash and NRO is presented in Table 1.

Table 1. Supporting information for reference genomes and important traits.

Species | Northern Red Oak | Green Ash
Reference genome genotype | Parent or F1-offspring of the genetic map family | EAB-resistant parent tree of the genetic map family
Genome size and ploidy | 762 Mb; 2N | 975 Mb; 2N
GC content; heterozygosity | 35-40%; 73-86% | 35%; 68-80%
Available genetic and genomic resources | BAC libraries, transcriptomes (including stress), high density linkage map and full-sib families | Transcriptomes (including EAB-treatment), high density genetic linkage map and full-sib family
Genomes for related species | Quercus robur and Q. lobata | European ash Fraxinus excelsior
Traits of interest | Abiotic stress resistance, oak decline, seedling growth vigor, phenology, height & diameter | EAB resistance; phenology, height & diameter growth; abiotic stress resistance

1. Assemble and analyze a draft reference genome for northern red oak.
We will take an approach similar to that used for the pedunculate oak genome. However, we will use an F1-intercross progeny from our colleague's genetic mapping population. The greater homozygosity in the F1 will improve assembly. The reference genome for NRO will proceed in 6 steps:
1) de novo assembly of contigs from Illumina short sequence reads;
2) hybrid de novo assembly using 5 to 25 Kb PacBio Sequel technology sequence reads with step 1 contigs;
3) chromatin proximity-guided scaffolding of step 2 contigs into chromosome-length sequences using Hi-C chromatin interaction data;
4) gap closing and assembly of repetitive DNAs using 100 Kb to 1 Mb long nanopore single-molecule sequences;
5) validation or correction of the assembly using the high-density NRO genetic map;
6) gene prediction with the MAKER gene-finding algorithm with support from gene expression data, followed by functional annotation of predicted genes by BLAST-to-GO tools.
Whole-genome sequence alignments of the NRO genome to the genomes of Arabidopsis thaliana, Quercus robur, Populus trichocarpa, and Prunus persica will be conducted with the public CoGe comparative genomics platform and the results graphed using the Circos algorithm. We will contract Dovetail Genomics, Inc. to conduct steps 1 to 4 and provide a subaward to the Staton lab at the University of Tennessee to assist us with step 6.

2. Complete the assembly and analysis of the green ash reference genome.
The green ash reference genome assembly is currently at a more advanced stage than the NRO genome. We have produced a good draft genome assembly for the EAB-resistant parent of our genetic mapping family, using steps 1 to 3 above, with the assistance of the Dovetail Genomics company. The de novo assembly produced 11,700 scaffolds of 961 Mb (97% of genome) incorporated into 23 chromosome-scale sequences (n = 23). The assembly aligns well with our genetic linkage map. We propose 5-10X depth of Nanopore long sequence reads, generated at Dovetail Genomics, to close gaps and place missing repeated DNAs within the chromosome sequences. Nanopore reads will also be used to validate the assembly. Genes will be identified and annotated as for NRO by the Staton lab. Cross-species comparisons to the genomes of Arabidopsis thaliana, Populus trichocarpa, Prunus persica, Coffea canephora (an asterid), and Fraxinus excelsior will be conducted as above.

3. Identify QTL in genetic mapping families and collect and phenotype additional resistant genotypes for seed orchards.
Our current genetic linkage map for green ash contains 4,190 DNA markers, with strong synteny (sequence order) to our draft genome for green ash. The genetic linkage map for NRO includes 849 SNP markers (of approximately 78,000 available SNPs). Both mapping populations contain over 400 progeny, which have been phenotyped for 2 years for height, diameter, bud break timing, dormancy period, flowering time, and insect pest and disease incidence. Phenotyping for these traits will be conducted a third time in 2020 in the NRO mapping family (at the University of Missouri and/or Purdue University) and in the green ash mapping family (at Penn State and/or at the US Forest Service). QTLs will be mapped using MapQTL 5.0 software as in previous studies by co-PI Zhebentyayeva. We will work with state and national forest services to propagate current and additional EAB-resistant green ash trees and stress-tolerant NRO trees for establishing seed orchards and new field trials at PSU.

4. Demonstrate the utility of the NRO and green ash genomes in addressing important issues in the ecology and protection of these species.
The oak and ash QTL genome regions mapped in objective 3 will be searched to identify genes with trait-related functional classifications. The functional associations of candidate genes with traits will be assessed by differential gene expression analysis using tissue-specific and stress-specific RNAseq data from our previous studies (www.hardwoodgenomics.org). qRT-PCR assays will also be conducted to quantify the association of candidate genes with traits. The extent of conservation of QTL and candidate genes among trees will be assessed by genome comparisons. We will work with colleagues who hold collections of putative natural hybrids of NRO and inter-species crosses of green ash to evaluate genome-wide SNPs for identifying and validating trees of hybrid origin. In addition, longer-term applications of genome-wide SNPs in tree improvement will be initiated with collaborators.
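The map-based validation step in the methods (checking that marker order along an assembled chromosome agrees with marker order in the genetic linkage map) can be illustrated with a small rank-correlation check. This is only a sketch under stated assumptions, not the procedure used in the project: the function names, the input layout (one list of marker positions per chromosome), and the example marker positions are all hypothetical, and Spearman rank correlation is used here simply as one common way to summarize collinearity.

```python
from typing import List, Sequence, Tuple


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation (Pearson correlation of the ranks)."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("positions must not all be identical")
    return cov / (vx * vy) ** 0.5


def collinearity(markers: List[Tuple[float, int]]) -> float:
    """Markers are (genetic position in cM, physical position in bp).

    The absolute value is returned because the orientation of a
    linkage group relative to the assembled chromosome is arbitrary.
    """
    cm = [m[0] for m in markers]
    bp = [m[1] for m in markers]
    return abs(spearman(cm, bp))


# Hypothetical markers on one chromosome (illustrative values only).
well_ordered = [(0.0, 100), (10.0, 200), (20.0, 300), (30.0, 400), (40.0, 500)]
misplaced = [(0.0, 100), (10.0, 400), (20.0, 300), (30.0, 200), (40.0, 500)]

print(collinearity(well_ordered))
print(collinearity(misplaced))
```

A value near 1 for a chromosome suggests that the assembly and the linkage map agree on marker order, while a clearly lower value flags a region worth inspecting for a misplaced or inverted contig.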

Progress 10/01/19 to 09/30/20

Outputs
Target Audience:
Target audiences reached include the general scientific community, tree-improvement specialists, woody-plant biotechnologists, evolutionary biologists, public research institutes, foundations focused on tree species restoration, tree growers, forest landowners, the general public interested in tree improvement, the USDA Forest Service, and state forest and conservation agencies such as the PA Bureau of Forestry and the PA Department of Conservation and Natural Resources.

Changes/Problems:
1. Several changes to the proposed research timeline occurred during project year 1 due to the coronavirus pandemic. From mid-March to July 2020, sequencing facilities at the university and at the Hudson Alpha Institute were either closed or committed to COVID-19 projects. Undergraduate students were not available to assist with maintaining field trials or to collect phenotypic data during the growing season at Penn State and at collaborators' sites. Thus, wages budgeted for those tasks were applied towards the wages of the research technician who conducted the field work at Penn State. Fortunately, we isolated the genomic DNA for northern red oak as soon as the project started in October, and the first 2 stages of DNA sequencing were thus completed before the pandemic started. Also, our colleague who maintains the northern red oak field trials at Purdue University was able to collect leaf tissues from the reference tree in May at the stage needed and ship them to the Hudson Alpha Institute for us. However, the Hi-C sequencing at the Hudson Alpha Institute was then delayed until September. Thus, detailed results of the Hi-C sequencing and new assembly of the northern red oak genome could not be presented in this report.
2. In the proposed budget, funds were requested for purchased services from the Dovetail Genomics Company to conduct all of the DNA sequencing and the final NRO genome assembly. However, the US Department of Energy's (DOE) Joint Genome Institute (JGI) offered to conduct the Hi-C sequencing part of the project for us, so that our northern red oak genome could be released to the public as the official genome for red oaks at the DOE's Phytozome website. This is quite an honor and will provide the highest possible visibility for the results of this project. Thus, we accepted the offer and the work was conducted at the JGI's lab in the Hudson Alpha Institute. This also meant that we had to do the first 2 stages of DNA sequencing in the Penn State genomics core facility rather than at the Dovetail Company, which worked out well. Thus, part of the $24,700 budgeted for purchased services was spent in year 1 internally at Penn State's core facility. Due to delays caused by the pandemic, however, the Hudson Alpha Institute could not bill Penn State for its costs in this phase of the research within year 1 of the project. That bill will be paid in project year 2.
3. Since the Dovetail Genomics Company produced the green ash genome assembly for us before the start of the project, we had proposed $7,300 as a purchased service to have that company generate Nanopore technology sequences and conduct gap-filling in the green ash genome. However, as stated in the accomplishments, our pre-award assembly of the green ash genome was found upon closer inspection not to have large gaps or mis-assemblies (e.g., contigs placed wrongly) requiring correction. Also, leaf tissues of the reference tree at the stage required would not have been available in Spring 2020 for the Nanopore sequencing due to travel restrictions in place during the pandemic. Because the services of the Dovetail Genomics Company were no longer needed for the gap-closing step, we decided to use the proposed purchased service instead for advanced bioinformatic analyses of the green ash genome, as required for publication.
Unfortunately, the bioinformatic analysis was also delayed until September due to lack of personnel during the pandemic. The analyses now need to be continued into year 2.

What opportunities for training and professional development has the project provided?
This project is providing the opportunity for a recent graduate of the Horticulture B.S. program at Penn State, Maureen Mailander, to develop skills in genetics and genomics research with forest trees, as well as to gain experience with lab management duties that will serve as a major step in her post-graduate professional development. The proposal included wages for undergraduate student assistants who were to assist with field studies. However, undergraduate students were sent home prior to the field season due to the pandemic, so the designated wages were used to support wages for Ms. Mailander. In addition, an approved sub-award in year two supports an early Ph.D. student to train with Professor Staton at the University of Tennessee on advanced bioinformatics approaches in comparative genomics using the data generated in this project.

How have the results been disseminated to communities of interest?
During year 1 of the project, we provided updates on progress in developing the northern red oak genome and in analyzing the green ash genome through emails and video meetings with our colleagues and collaborators. We look forward to presenting our results to a wider community in year 2 through publication of manuscripts on the genomes, and at research conferences including the biennial Schatz Tree Genetics Colloquium hosted by the Project Director, John Carlson.

What do you plan to do during the next reporting period to accomplish the goals?
  • Release the green ash genome and predicted genes at the hardwoodgenomics.org website.
  • Assemble the northern red oak (NRO) genome sequence, identify all genes, and release the reference genome and predicted genes for NRO at our public hardwoodgenomics.org website.
  • Validate (or correct) the northern red oak genome by comparison with the genetic linkage map.
  • Complete comparative genomics studies and manuscripts for publications on the green ash genome and gene families.
  • Complete comparative genomics studies and manuscripts for publications on the NRO genome and gene families.
  • Complete collection of phenotypic data and QTL analysis in green ash and NRO genetic mapping families.
  • Prepare and submit publications on QTLs, traits, and candidate genes in green ash and NRO.
  • Conduct whole-genome characterization of putative red oak hybrids for publication.
  • Complete the final report and distribute it to stakeholders including the USDA, US Forest Service, and tree genetics/genomics groups.

Impacts
What was accomplished under these goals?

Objective 1. (Carlson and Zhebentyayeva) In October 2019, high-quality, high molecular weight (HMW) DNA was isolated from bark and bud tissues of an F2 progeny derived from hybridization of two full-sib trees in an F1 family for which a saturated genetic linkage map was published by our team in 2017. In November 2019, 26 Gb of high-quality 150 bp sequences were produced at Penn State by Illumina sequencing of the F2 tree HMW DNA, giving approximately 33-fold genome coverage. The Illumina sequence quality was validated by alignment to the genome of Q. lobata. Over 69 Gb of PacBio technology sequences (90-fold depth) averaging 22 Kb in length were produced from the F2 tree HMW DNA, and validated by alignment to the Q. lobata genome. In March 2020, we sent the PacBio and Illumina sequences to the Hudson Alpha Institute's plant genomics facility for assistance with full-genome de novo assembly. By May 2020, the Hudson Alpha Institute had assembled the PacBio data into 9,120 consensus sequences (contigs) totaling 1,431 Mb, providing full coverage of the diploid genome (all 12 red oak chromosome pairs). Approximately 94% of the diploid genome was covered in 6,177 contigs over 50 Kb long. The 284 largest contigs, each over 1 Mb, covered 661 Mb, or 85% of the haploid genome. In September 2020, the Hudson Alpha Institute produced high-quality chromatin proximity-guided "Hi-C" sequences, providing over 70-fold coverage of the northern red oak genome. By early October 2020, the Hudson Alpha Institute had obtained a chromosome-level assembly of the 12 red oak chromosomes by combining the Hi-C data with the PacBio contigs.

Objective 2. (Carlson and Zhebentyayeva) Our pre-award assembly of the green ash genome was validated using a high-density genetic linkage map. No large gaps or mis-assemblies (e.g., contigs placed wrongly) were identified. Because gaps were not found in the assembly, we decided that the proposed long Nanopore sequences were not needed.
Instead, we proceeded immediately to analysis of the current genome for preparation of a manuscript. In retrospect, obtaining Nanopore data would not have been possible in year 1, as suitable tissues (young emerging leaves) of the reference tree could not have been collected in Spring 2020 due to travel restrictions in place during the COVID-19 pandemic. The Staton bioinformatics lab at the University of Tennessee conducted the following analyses of the green ash genome during year 1. A gene-finding algorithm search of the green ash genome identified 35,470 gene models after filtering to remove incomplete genes and putative genes that could not be annotated based on known plant gene functions. The OrthoFinder program identified 22,976 orthogroups (known protein families) that were shared among Asterid species with published genomes (olive, mimulus, coffee, tomato, and carrot). Of the predicted green ash genes, 32,239 (91%) were placed in 15,734 of the Asterid orthogroups. Comparisons of our green ash genome to previously published genomes for other species revealed that green ash has the whole-genome duplication shared by other Asterid species. The analysis allowed us to identify positions of duplicated blocks in the Fraxinus pennsylvanica genome that have remained intact since the Asterid genome duplication. We also observed that the 23 chromosomes of green ash could largely be paired up with the 23 olive chromosomes, and structural differences between the genomes were identified. The black, white, and European ash genomes, available from collaborator Richard Buggs as contigs (pieces), were aligned onto our green ash chromosomes. This produced putative synteny-based chromosome assemblies of the black, white, and European ash genomes.

Objective 3. (Steiner and Carlson) The search for additional genotypes of potentially Emerald Ash Borer (EAB)-resistant green ash and wilt-resistant northern red oak in PA was not conducted in year 1 due to the COVID-19 outbreak.
This activity was delayed until year 2 (and will hopefully continue beyond the project). A first step in starting a new seed orchard for green ash from existing EAB-susceptible, -lingering, and -resistant genotypes at Penn State was taken by grafting bud and shoot tissues from root-sucker saplings of several EAB-lingering plants (i.e., trees in the ash provenance trial that escaped death from EAB infestation for several years after the primary mortality event) and known susceptible trees onto potted full-sib green ash seedlings. The grafted seedlings are being maintained in the greenhouse until planting. A third year of phenotype data was collected in Spring 2020 from the full-sib green ash genetic mapping family on the Penn State campus, after students departed campus. Data for mortality, height, diameter, date of spring bud-break, and leaf disease were obtained. The 3 years of phenotype data will be used in Quantitative Trait Locus (QTL) mapping of the traits in year 2 of the project. Collection of phenotypic data was also scheduled in 2020 for the northern red oak genetic mapping family on the Purdue University and University of Missouri campuses. However, data collection was not conducted due to COVID-19 regulations and reductions in staff at those institutions. Our Purdue and Missouri colleagues will resume data collection in 2021.

Objective 4. The year 1 results on the structure of the green ash genome are good examples of the knowledge that will be obtained with these new resources. Additional activities will be conducted in year 2 using the full set of genes discovered in the green ash and northern red oak genomes. The manuscripts on the green ash and northern red oak genomes, for submission to research journals in year 2, will feature discovery of candidate genes for traits such as insect resistance, susceptibility to fungal pathogens, abiotic stress resistance, growth, and phenology.
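The sequencing depth and gene placement figures reported under Objectives 1 and 2 follow from simple ratios, which the short sketch below reproduces. It assumes the 762 Mb northern red oak genome size listed in Table 1 of the methods and treats 1 Gb as 1,000 Mb; the function and variable names are illustrative only and are not part of any project software.

```python
def fold_coverage(sequenced_gb: float, genome_mb: float) -> float:
    """Sequencing depth = total bases sequenced / haploid genome size."""
    return (sequenced_gb * 1000.0) / genome_mb


# Assumed haploid genome size for northern red oak (Table 1), in Mb.
NRO_GENOME_MB = 762.0

# Data volumes reported under Objective 1, in Gb.
illumina_depth = fold_coverage(26.0, NRO_GENOME_MB)
pacbio_depth = fold_coverage(69.0, NRO_GENOME_MB)

# Green ash gene models placed in Asterid orthogroups (Objective 2).
placed_fraction = 32239 / 35470

print(f"Illumina depth: {illumina_depth:.1f}-fold")
print(f"PacBio depth: {pacbio_depth:.1f}-fold")
print(f"Genes placed in orthogroups: {placed_fraction:.1%}")
```

With these inputs the ratios come out at roughly 34-fold and 90.6-fold depth and about 91% of genes placed, close to the approximately 33-fold, 90-fold, and 91% stated in the report; small differences depend on the genome size estimate assumed.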

Publications

  • Type: Journal Articles Status: Published Year Published: 2019 Citation: Steiner, K.C., Graboski, L.E., Knight, K.S., Koch, J.L. and Mason, M.E., 2019. Genetic, spatial, and temporal aspects of decline and mortality in a Fraxinus provenance test following invasion by the emerald ash borer. Biological Invasions, 21(11), pp.3439-3450.
  • Type: Journal Articles Status: Published Year Published: 2020 Citation: Soltani N, Best T, Grace D, Nelms C, Shumaker K, Romero-Severson J, Moses D, Schuster S, Staton M, Carlson J, Gwinn K. 2020. Transcriptome profiles of Quercus rubra responding to increased O3 stress. BMC Genomics, 21(1): 1-18.
  • Type: Conference Papers and Presentations Status: Other Year Published: 2019 Citation: John E. Carlson. Genetic opportunities for restoring green ash and American beech, Natural Areas Conference, Station Square Sheraton, Pittsburgh, PA, October 8, 2019.
  • Type: Conference Papers and Presentations Status: Other Year Published: 2019 Citation: John E. Carlson. Genomics tools and approaches for restoration genetics research with forest trees, Center for Integrated Breeding Research Seminar Series, Georg-August-University of Goettingen, December 20, 2019